Mark Lord [Tue, 6 Mar 2007 09:42:03 +0000 (01:42 -0800)]
[PATCH] Fix 2.6.21 rfcomm lockups
Any attempt to open/use a bluetooth rfcomm device locks up
scheduling completely on my machine.
Interrupts (ping, alt-sysrq) seem to be alive, but nothing else.
This was working fine in 2.6.20, broken now in 2.6.21-rc2-git*
Reverting this change (below) fixes it:
| author Marcel Holtmann <marcel@holtmann.org>
| Sat, 17 Feb 2007 22:58:57 +0000 (23:58 +0100)
| committer David S. Miller <davem@sunset.davemloft.net>
| Mon, 26 Feb 2007 19:42:41 +0000 (11:42 -0800)
| commit c1a3313698895d8ad4760f98642007bf236af2e8
| tree 337a876f727061362b6a169f8759849c105b8f7a tree | snapshot
| parent f5ffd4620aba9e55656483ae1ef5c79ba81f5403 commit | diff
| | [Bluetooth] Make use of device_move() for RFCOMM TTY devices
| | In the case of bound RFCOMM TTY devices the parent is not available
| before its usage. So when opening a RFCOMM TTY device, move it to
| the corresponding ACL device as a child. When closing the device,
| move it back to the virtual device tree.
| Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The simplest fix for this bug is to prevent sysfs_move_dir()
from self-deadlocking when (old_parent == new_parent).
This patch prevents total system lockup when using rfcomm devices.
Signed-off-by: Mark Lord <mlord@pobox.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Tue, 6 Mar 2007 09:42:01 +0000 (01:42 -0800)]
[PATCH] atyfb: fix kconfig error part 2
Fix implicit declarations and missing code in atyfb.
drivers/video/aty/atyfb_base.c:2137: warning: implicit declaration of function 'a
ty_ld_lcd'
drivers/video/aty/atyfb_base.c:2154: warning: implicit declaration of function 'a
ty_st_lcd'
atyfb_base.c:(.text+0x33e5c): undefined reference to `aty_ld_lcd'
atyfb_base.c:(.text+0x33eb2): undefined reference to `aty_st_lcd'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Cc: James Simmons <jsimmons@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jones [Tue, 6 Mar 2007 09:42:00 +0000 (01:42 -0800)]
[PATCH] nvidiafb backlight: Fix implicit declaration in nv_backlight
drivers/video/nvidia/nv_backlight.c: In function 'nvidia_bl_init':
drivers/video/nvidia/nv_backlight.c:103: error: implicit declaration of function 'pmac_has_backlight_type'
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Antonino Daplas <adaplas@gmail.com> Cc: James Simmons <jsimmons@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 6 Mar 2007 09:41:58 +0000 (01:41 -0800)]
[PATCH] fix build with CONFIG_NO_IDLE_HZ=n
arch/i386/kernel/vmi.c: In function 'vmi_safe_halt':
arch/i386/kernel/vmi.c:262: warning: implicit declaration of function 'vmi_stop_hz_timer'
arch/i386/kernel/vmi.c:266: warning: implicit declaration of function 'vmi_account_time_restart_hz_timer'
Thomas Gleixner [Tue, 6 Mar 2007 07:25:42 +0000 (08:25 +0100)]
[PATCH] Save/restore periodic tick information over suspend/resume
The programming of periodic tick devices needs to be saved/restored
across suspend/resume - otherwise we might end up with a system coming
up that relies on getting a PIT (or HPET) interrupt, while those devices
default to 'no interrupts' after powerup. (To confuse things it worked
to a certain degree on some systems because the lapic gets initialized
as a side-effect of SMP bootup.)
This suspend / resume thing was dropped unintentionally during the
last-minute -mm code reshuffling.
Mark Lord [Tue, 6 Mar 2007 12:30:13 +0000 (13:30 +0100)]
sdhci: make isr tolerant of read errors
The interrupt is shared with another device, which resumes earlier than the
sdhci controller, and generates an interrupt.
The sdhci interrupt handler runs, sees 0xffffffff in its own device's
interrupt status, and tries to handle it.. The reason for the 0xffffffff
is that the device is still suspended, and *all* regs are reading back
0xffffffff.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Pierre Ossman [Sun, 18 Feb 2007 11:07:47 +0000 (12:07 +0100)]
mmc: require explicit support for high-speed
The new high-speed timings are similar to each other and the old
system, but not identical. And although things "just work" most of
the time, sometimes it does not. So we need to start marking which
hosts are known to fully comply with the new timings.
Andrew Morton [Tue, 6 Mar 2007 10:41:55 +0000 (02:41 -0800)]
sis900 warning fixes
drivers/net/sis900.c: In function 'sis900_reset_phy':
drivers/net/sis900.c:972: warning: 'status' may be used uninitialized in this function
drivers/net/sis900.c: In function 'sis900_check_mode':
drivers/net/sis900.c:1431: warning: 'status' may be used uninitialized in this function
drivers/net/sis900.c: In function 'sis900_timer':
drivers/net/sis900.c:1467: warning: 'status' may be used uninitialized in this function
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Dale Farnsworth [Sat, 3 Mar 2007 13:40:28 +0000 (06:40 -0700)]
mv643xx_eth: Place explicit port number in mv643xx_eth_platform_data
We were using the platform_device.id field to identify which ethernet
port is used for mv643xx_eth device. This is not generally correct.
It will be incorrect, for example, if a hardware platform uses a single
port but not the first port. Here, we add an explicit port_number field
to struct mv643xx_eth_platform_data.
This makes the mv643xx_eth_platform_data structure required, but that
isn't an issue since all users currently provide it already.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Don Fry [Tue, 6 Mar 2007 02:13:09 +0000 (18:13 -0800)]
pcnet32: Fix PCnet32 performance bug on non-coherent architecutres
The PCnet32 driver always passed the the size of the largest possible packet
to the pci_dma_sync_single_for_cpu and pci_dma_sync_single_for_device.
This results in a fairly large "colateral damage" in the caches and makes
the flush operation itself much slower. On a system with a 40MHz CPU this
patch increases network bandwidth by about 12%.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Don Fry <pcnet32@verizon.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Prarit Bhargava [Tue, 6 Mar 2007 10:42:00 +0000 (02:42 -0800)]
__devinit & __devexit cleanups for de2104x driver
Fixes MODPOST warnings similar to:
WARNING: drivers/net/tulip/de2104x.o - Section mismatch: reference to
.init.text:de_init_one from .data.rel.local after 'de_driver' (at offset 0x20)
WARNING: drivers/net/tulip/de2104x.o - Section mismatch: reference to
.exit.text:de_remove_one from .data.rel.local after 'de_driver' (at offset 0x28)
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Valerie Henson <val_henson@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Commit 7628b0a8c01a02966d2228bdf741ddedb128e8f8 (drivers/net/tulip/dmfe:
support basic carrier detection) breaks networking on my Davicom DM9009.
ethtool always reports there is no link. tcpdump shows incoming packets,
but TX is disabled. Reverting the above patch fixes the problem.
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Valerie Henson <val_henson@linux.intel.com> Cc: Thomas Bachler <thomas@archlinux.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Li Yang [Tue, 6 Mar 2007 08:53:46 +0000 (16:53 +0800)]
ucc_geth: Fix BD processing
Fix broken BD processing code.
Signed-off-by: Michael Barkowski <michael.barkowski@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Sergei Shtylyov [Mon, 5 Mar 2007 20:10:08 +0000 (00:10 +0400)]
natsemi: netpoll fixes
Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq()
enabling interrupts which nobody asks it to do (that has been fixed recently in
a number of drivers) and one unusual, with poll_controller() method possibly
causing loss of interrupts due to the interrupt status register being cleared
by a simple read and the interrpupt handler simply storing it, not accumulating.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 1 Mar 2007 01:03:37 +0000 (17:03 -0800)]
bonding: Improve IGMP join processing
In active-backup mode, the current bonding code duplicates IGMP
traffic to all slaves, so that switches are up to date in case of a
failover from an active to a backup interface. If bonding then fails
back to the original active interface, it is likely that the "active
slave" switch's IGMP forwarding for the port will be out of date until
some event occurs to refresh the switch (e.g., a membership query).
This patch alters the behavior of bonding to no longer flood
IGMP to all ports, and to issue IGMP JOINs to the newly active port at
the time of a failover. This insures that switches are kept up to date
for all cases.
"GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally
reported this problem, and included a patch. His original patch was
modified by Jay Vosburgh to additionally remove the existing IGMP flood
behavior, use RCU, streamline code paths, fix trailing white space, and
adjust for style.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 1 Mar 2007 01:03:20 +0000 (17:03 -0800)]
bonding: fix double dev_add_pack
Bonding can erroneously register the same packet_type to receive
ARPs (for use by ARP validation): once at device open time, and once via
sysfs. Since sysfs can change the validate setting (and thus register
or unregister) at any time, a flag is needed to synchronize with device
open in order to avoid double registrations, and the simplest place is
within the packet_type structure itself. Double unregister is not an
issue.
Bug reported by Ulrich Oelmann <ulrich.oelmann@web.de>.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Takashi Iwai [Thu, 15 Feb 2007 17:56:43 +0000 (18:56 +0100)]
[ALSA] cmipci - Allow to disable integrated FM port
The driver didn't allow to disable the integrated FM port (if available),
and this annoyed people who don't want FM port. Now fm_port=0 disables
the FM port unconditionally. fm_port=1 is used for enabling the integrated
FM port (as default).
Also fixed the documentation about this option.
Fix ALSA bug#2491.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Tobin Davis [Mon, 26 Feb 2007 15:07:42 +0000 (16:07 +0100)]
[ALSA] hda-codec - Fix logic error in headphone mute for Conexant codecs
This patch fixes a logic error introduced in the previous patch.
Without it, speaker automute mutes the speakers when headphones are
removed and unmutes when headphones are plugged in.
This was reported by Gregorio Guidi after getting the earlier patch
off this mailing list.
Signed-off-by: Tobin Davis <tdavis@dsl-only.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Takashi Iwai [Fri, 16 Feb 2007 12:27:18 +0000 (13:27 +0100)]
[ALSA] hda-codec - Define pin configs for MacBooks
Define pin configs for MacBook and MacBook Pro with STAC92xx codecs.
The latter is detected automatically by checking codec SSID now.
Also, fixed the documentation regarding available modeliof sigmatel
codec chips.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
pata_pdc202xx_old: fix data corruption and other problems
Fix wrong "port" calculations in pdc202xx_{configure_piomode,set_dmamode}()
They were broken for all configurations except one (master device on primary
channel, no other devices) and as a result device settings + PIO/DMA timings
were being programmed into the wrong PCI registers. This could result in
a large variety of problems including data corruption, hangs etc. (depending
on devices used and your luck :-).
Also forward port recent fixes from drivers/ide pdc202xx_old driver:
* fix XFER_MW_DMA0 timings (they were overclocked, use the official ones)
* fix bitmasks for clearing bits of register B:
- when programming DMA mode bit 0x10 of register B was cleared which
resulted in overclocked PIO timing setting (iff PIO0 was used)
- when programming PIO mode bits 0x18 weren't cleared so suboptimal
timings were used for PIO1-4 if PIO0 was previously set (bit 0x10)
and for PIO0/3/4 if PIO1/2 was previously set (bit 0x08)
and finally bump driver version.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
pata_legacy fails to detect the disk on my old ISA/VLB 486:
it starts to probe io=0x1f0 ctr=0x3f6 irq=15, complains
loudly about IDENTIFYs timing out, and finally fails.
(Sorry I couldn't capture the kernel's boot messages.)
It turns out that the driver's mapping from io to irq in
legacy_irq[] is wrong: index 0 for io=0x1f0 has irq=15 but
should have irq=14, and index 1 for io=0x170 has irq=14 but
should have irq=15. This is confirmed by a comparison with
include/asm-i386/ide.h:ide_default_irq().
This patch swaps the first two elements in legacy_irq[],
which makes pata_legacy work on my 486.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Cornelia Huck [Mon, 5 Mar 2007 22:36:02 +0000 (23:36 +0100)]
[S390] cio: Call cancel_halt_clear even when actl == 0.
The subchannel may just be status pending, even with actl == 0. We
must go through the cancel_halt_clear procedure to put the subchannel
into a defined state.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cornelia Huck [Mon, 5 Mar 2007 22:35:59 +0000 (23:35 +0100)]
[S390] cio: Use path verification to check for path state.
After I/O has been killed by the common I/O layer, trigger path
verification which will queue cio_device_nopath_notify itself if it
finds a device to be without paths.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Mon, 5 Mar 2007 22:35:54 +0000 (23:35 +0100)]
[S390] Fixed handling of access register mode faults.
Replaced check_user_space() + __check_access_register with the new
check_space(). The old functions made wrong assumptions about kernel
and user space when the kernel and user address spaces are switched
(kernel in home space, user in primary/secondary space).
Secondly the user process can switch to the accress register mode if
it is running in primary or secondary mode. In addition it can load
an arbitrary value to the access registers. If any other value than
0 for primary space or 1 for secondary space is loaded and memory
is accessed using the base register related to the access register,
the program should be terminated with a SIGSEGV. To achieve that the
DUALD pointer in the DUCT and the PSALD pointer in the PASTE need
to point to an array of 8 invalid access-list entries to get a
ALEN-translation exception if an invalid alet is used.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] dasd: Use default recovery for SNSS requests
For extended error reporting we sometimes have to start an
Sense Subsystem Status request (SNSS). When this request needs
to be recovered for some reason, the recovery request will
fail with 'command reject'.
Our usual recovery procedure will retry the failed request by
creating a new request and chaining the failed request from that
one. SNSS requests, though, must not be chained from anything,
so the recovery request will fail permanently.
Use the default recovery for SNSS request, which will just restart
the original request without further ado.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After switching compression on/off with the mt command, tape encryption is no
longer working. The reason for that is, that the modeset_byte is set to
the compression value instead of using bitwise and/or bit operations to
enable/disable the corresponding bit.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 5 Mar 2007 22:35:43 +0000 (23:35 +0100)]
[S390] reipl: move dump_prefix_page out of text section.
Reipl doesn't work on older machines were s390_reset_machine() gets
called. The reason is that the text section is read-only but the
variable dump_prefix_page is there. Since s390_reset_machine() writes
to it we get a protection exception.
Therefore move dump_prefix_page to the bss section.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 5 Mar 2007 22:35:41 +0000 (23:35 +0100)]
[S390] smp: disable preemption in smp_call_function/smp_call_function_on
Avoid sprinkling a _lot_ of preempt_disable/preempt_enable pairs.
This would be necessary for e.g. the iucv driver. Also this way we
are more consistent with other architectures which disable
preemption at least for smp_call_function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The illegal operation handler calls the die notifier with DIE_BPT to
let kprobes pick up its breakpoint. If kprobes does not find its
breakpoint it returns NOTIFY_STOP instead of NOTIFY_DONE.
Since we use stop_machine_run on s390 to arm/disarm the kprobes
breakpoints the race that kprobe_handler tries to solve by checking
for the kprobes breakpoints does not exist. Removing the check makes
BUG_ON working again.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Jan Altenberg [Mon, 5 Mar 2007 21:29:55 +0000 (13:29 -0800)]
[GIANFAR]: Fix compile error in latest git
I recognized a compile error in latest git:
/here/workdir/git/drivers/net/gianfar.c: In function `gfar_vlan_rx_kill_vid':
/here/workdir/git/drivers/net/gianfar.c:1135: error: structure has no member named `vgrp'
[NETFILTER]: ip6_route_me_harder should take into account mark
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix reference counting (memory leak) problem in __nfulnl_send() and callers
related to packet queueing.
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Miroslaw [Sun, 4 Mar 2007 23:59:20 +0000 (15:59 -0800)]
[NETFILTER]: nfnetlink_log: fix possible NULL pointer dereference
Eliminate possible NULL pointer dereference in nfulnl_recv_config().
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Micha Mirosaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Miroslaw [Sun, 4 Mar 2007 23:58:40 +0000 (15:58 -0800)]
[NETFILTER]: nfnetlink_log: fix use after free
Paranoia: instance_put() might have freed the inst pointer when we
spin_unlock_bh().
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Miroslaw [Sun, 4 Mar 2007 23:58:15 +0000 (15:58 -0800)]
[NETFILTER]: nfnetlink_log: fix reference leak
Stop reference leaking in nfulnl_log_packet(). If we start a timer we
are already taking another reference.
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The nf_conntrack_netlink config option is named CONFIG_NF_CT_NETLINK,
but multiple files use CONFIG_IP_NF_CONNTRACK_NETLINK or
CONFIG_NF_CONNTRACK_NETLINK for ifdefs.
Fix this and reformat all CONFIG_NF_CT_NETLINK ifdefs to only use a line.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling:
- unconfirmed entries can not be killed manually, they are removed on
confirmation or final destruction of the conntrack entry, which means
we might iterate forever without making forward progress.
This can happen in combination with the conntrack event cache, which
holds a reference to the conntrack entry, which is only released when
the packet makes it all the way through the stack or a different
packet is handled.
- taking references to an unconfirmed entry and using it outside the
locked section doesn't work, the list entries are not refcounted and
another CPU might already be waiting to destroy the entry
What the code really wants to do is make sure the references of the hash
table to the selected conntrack entries are released, so they will be
destroyed once all references from skbs and the event cache are dropped.
Since unconfirmed entries haven't even entered the hash yet, simply mark
them as dying and skip confirmation based on that.
Reported and tested by Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 5 Mar 2007 04:36:18 +0000 (20:36 -0800)]
[SPARC64]: Fix floppy build failure.
Just define a local {claim,release}_dma_lock() implementation
for the floppy driver to use so we don't need to define and
export to modules the silly dma_spin_lock.
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Mon, 5 Mar 2007 13:46:30 +0000 (14:46 +0100)]
[PATCH] paravirt: re-enable COMPAT_VDSO
CONFIG_PARAVIRT broke old glibc bootup: it silently turned off the
selectability of CONFIG_COMPAT_VDSO and thus rendered distro kernels
unbootable on old-style VDSO glibc setups.
the proper solution is to keep COMPAT_VDSO available - if a hypervisor
needs any modification of that concept then we'll judge those changes in
full context, once those changes are submitted.
Ingo Molnar [Mon, 5 Mar 2007 12:20:11 +0000 (13:20 +0100)]
[PATCH] disable NMI watchdog by default
there's a new NMI watchdog related problem: KVM crashes on certain
bzImages because ... we enable the NMI watchdog by default (even if the
user does not ask for it) , and no other OS on this planet does that so
KVM doesnt have emulation for that yet. So KVM injects a #GP, which
crashes the Linux guest:
general protection fault: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c011a8ae>] Not tainted VLI
EFLAGS: 00000246 (2.6.20-rc5-rt0 #3)
EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3
and no, i did /not/ request an nmi_watchdog on the boot command line!
Solution: turn off that darn thing! It's a debug tool, not a 'make life
harder' tool!!
with this patch the KVM guest boots up just fine.
And with this my laptop (Lenovo T60) also stopped its sporadic hard
hanging (sometimes in acpi_init(), sometimes later during bootup,
sometimes much later during actual use) as well. It hung with both
nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI
injection that is causing problems, not the NMI watchdog variant, nor
any particular bootup code.
[ NMI breaks on some systems, esp in combination with SMM -Arjan ]
Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Mon, 5 Mar 2007 12:15:40 +0000 (13:15 +0100)]
[PATCH] paravirt: let users decide whether they want VMI
do not use default=y for CONFIG_VMI (we do not do that for any driver or
special-hardware feature): the overwhelming majority of Linux users does
not need it, and interested users and distributions can enable it
as-needed.
Ingo Molnar [Mon, 5 Mar 2007 12:13:46 +0000 (13:13 +0100)]
[PATCH] paravirt: clarify VMI description
Clarify the description of the CONFIG_VMI option: describe the reality
that VMI is a VMWare-only interface for now. Once that changes and
another hypervisor adopts the VMI ABI we can change the text.
As can be seen from the Xen paravirtualization patches submitted to lkml
the Xen project has chosen its own, non-VMI interface between Xen and
the para-Linux - so remove Xen from the description.
CT based mach64 cards were reported to hang on sparc64 boxes when
compiled with gcc-4.1.x and later.
Looking at this piece of code, it's no surprise. A critical
delay was implemented as an empty for() loop, and gcc 4.0.x
and previous did not optimize it away, so we did get a delay.
But gcc-4.1.x and later can optimize it away, and we get crashes.
Use a real udelay() to fix this. Fix verified on SunBlade100.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Mon, 5 Mar 2007 08:30:56 +0000 (00:30 -0800)]
[PATCH] arch/i386/kernel/vmi.c must #include <asm/kmap_types.h>
CC arch/i386/kernel/vmi.o
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c: In function 'vmi_map_pt_hook':
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE0' undeclared (first use in this function)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: (Each undeclared identifier is reported only once
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: for each function it appears in.)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE1' undeclared (first use in this function)
make[2]: *** [arch/i386/kernel/vmi.o] Error 1
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a
The same problem exists in kernel/timer.c: migrate_timers().
As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first.
To achieve this we introduce two new spinlock functions double_spin_lock
and double_spin_unlock which lock or unlock two locks in a given order.
Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Christian Borntraeger <cborntra@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch resolves the issue found here:
http://bugme.osdl.org/show_bug.cgi?id=7426
The basic summary is:
Currently we register most of i386/x86_64 clocksources at module_init
time. Then we enable clocksource selection at late_initcall time. This
causes some problems for drivers that use gettimeofday for init
calibration routines (specifically the es1968 driver in this case),
where durring module_init, the only clocksource available is the low-res
jiffies clocksource. This may cause slight calibration errors, due to
the small sampling time used.
It should be noted that drivers that require fine grained time may not
function on architectures that do not have better then jiffies
resolution timekeeping (there are a few). However, this does not
discount the reasonable need for such fine-grained timekeeping at init
time.
Thus the solution here is to register clocksources earlier (ideally when
the hardware is being initialized), and then we enable clocksource
selection at fs_initcall (before device_initcall).
This patch should probably get some testing time in -mm, since
clocksource selection is one of the most important issues for correct
timekeeping, and I've only been able to test this on a few of my own
boxes.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] ipmi: check, if default ports are accessible on PPC
ipmi_si_intf tries to access default ports, if no device could be found
elsewhere. On PPC we have a function to check, if these legacy IO ports
are accessible. This patch adds a check for these ports on PPC. This
patch fixes a breakage of IPMI module on PPC machines without a BMC.
Signed-off-by: Christian Krafft <krafft@de.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Corey Minyard <minyard@acm.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- In fact we don't have to fail if AOP_TRUNCATED_PAGE was returned from
prepare_write or commit_write. It is beter to retry attempt where it
is possible.
- Rearange ecryptfs_get_lower_page() error handling logic, make it more clean.
Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] ecryptfs: lower root result must be adirectory
- Currently after path_lookup succeed we dot't have any guarantie what
it is DIR. This must be explicitly demanded.
- path_lookup can't return negative dentry, So inode check is useless.
Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Mon, 5 Mar 2007 08:30:44 +0000 (00:30 -0800)]
[PATCH] md: fix for raid6 reshape
Recent patch for raid6 reshape had a change missing that showed up in
subsequent review.
Many places in the raid5 code used "conf->raid_disks-1" to mean "number of
data disks". With raid6 that had to be changed to "conf->raid_disk -
conf->max_degraded" or similar. One place was missed.
This bug means that if a raid6 reshape were aborted in the middle the
recorded position would be wrong. On restart it would either fail (as the
position wasn't on an appropriate boundary) or would leave a section of the
array unreshaped, causing data corruption.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zachary Amsden [Mon, 5 Mar 2007 08:30:43 +0000 (00:30 -0800)]
[PATCH] vmi: smp fixes
Critical fixes for SMP.
Fix a couple functions which needed to be __devinit and fix a bogus parameter
to AP startup that just so happened to work because the low virtual mapping of
memory was still established.
Zachary Amsden [Mon, 5 Mar 2007 08:30:41 +0000 (00:30 -0800)]
[PATCH] vmi: apic ops
Use para_fill instead of directly setting the APIC ops to the result of the
vmi_get_function call - this allows one to implement a VMI ROM without
implementing APIC functions, just using the native APIC functions.
While doing this, I realized that there is a lot more cleanup that should have
been done. Basically, we should never assume that the ROM implements a
specific set of functions, and always allow fallback to the native
implementation.
This is critical for future compatibility.
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zachary Amsden [Mon, 5 Mar 2007 08:30:41 +0000 (00:30 -0800)]
[PATCH] vmi: fix nohz compile
More goo from hrtimers integration. We do compile and run properly with NO_HZ
enabled. There was a period when we didn't because of a missing export, but
that was since fixed.
And with the clocksource code now firmly in place, we can get rid of code that
fixes up the wallclock, since this is done in the common infrastructure. This
actually fixes a timer bug as well, that was caused by do_settimeofday no
longer being callable with interrupts disabled due to the use of
on_each_cpu().
Zachary Amsden [Mon, 5 Mar 2007 08:30:39 +0000 (00:30 -0800)]
[PATCH] vmi: pit override
The time_init_hook in paravirt-ops no longer functions in the correct manner
after the integration of the hrtimers code. The problem is that now the call
path for time initialization is:
If this isn't confusing enough, the paravirt case goes through an indirect
function pointer in the paravirt-ops table. The problem is, by the time the
paravirt hook is called, the pit timer is already enabled.
But paravirt guests have their own timer, and don't want to use the PIT.
Rather than intensify the struggle for power going on here, just make it all
nice and simple and just unconditionally do all timer setup in the
late_time_init hook. This also has the advantage of enabling timers in the
same place in all code paths, so everyone has the same bugs and we don't have
outliers who break other code because they turn on timer too early or too
late.
So the paravirt-ops time init function is now by default hpet_time_init, which
is the time init function used for native hardware. Paravirt guests have the
chance to override this when they setup the paravirt-ops table, and should
need no change.
Zachary Amsden [Mon, 5 Mar 2007 08:30:38 +0000 (00:30 -0800)]
[PATCH] vmi: paravirt drop udelay op
Not respecting udelay causes problems with any virtual hardware that is passed
through to real hardware. This can be noticed by any device that interacts
with the real world in real time - like AP startup, which takes real time. Or
keyboard LEDs, which should blink in real-time. Or floppy drives, but only
when passed through to a real floppy controller on OSes which can't
sufficiently buffer the floppy commands to emulate a zero latency floppy. Or
IDE drives, when connecting to a physical CDROM.
This was mostly a hack to get the kernel to boot faster, but it introduced a
number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
against it. We were the only client, and now want to clean up this cruft.
Zachary Amsden [Mon, 5 Mar 2007 08:30:37 +0000 (00:30 -0800)]
[PATCH] vmi: fix highpte
Provide a PT map hook for HIGHPTE kernels to designate where they are mapping
page tables. This information is required so the physical address of PTE
updates can be determined; otherwise, the mm layer would have to carry the
physical address all the way to each PTE modification callsite, which is even
more hideous that the macros required to provide the proper hooks.
So lets not mess up arch neutral code to achieve this, but keep the horror in
an #ifdef HIGHPTE in include/asm-i386/pgtable.h. I had to use macros here
because some types are not yet defined in all the include paths for this
header.
This patch is absolutely required for HIGHPTE kernels to operate properly with
VMI.
Zachary Amsden [Mon, 5 Mar 2007 08:30:36 +0000 (00:30 -0800)]
[PATCH] vmi: cpu cycles fix
In order to share the common code in tsc.c which does CPU Khz calibration, we
need to make an accurate value of CPU speed available to the tsc.c code. This
value loses a lot of precision in a VM because of the timing differences with
real hardware, but we need it to be as precise as possible so the guest can
make accurate time calculations with the cycle counters.
Zachary Amsden [Mon, 5 Mar 2007 08:30:35 +0000 (00:30 -0800)]
[PATCH] vmi: sched clock paravirt op fix
The custom_sched_clock hook is broken. The result from sched_clock needs to
be in nanoseconds, not in CPU cycles. The TSC is insufficient for this
purpose, because TSC is poorly defined in a virtual environment, and mostly
represents real world time instead of scheduled process time (which can be
interrupted without notice when a virtual machine is descheduled).
To make the scheduler consistent, we must expose a different nature of time,
that is scheduled time. So deprecate this custom_sched_clock hack and turn it
into a paravirt-op, as it should have been all along. This allows the tsc.c
code which converts cycles to nanoseconds to be shared by all paravirt-ops
backends.
It is unfortunate to add a new paravirt-op, but this is a very distinct
abstraction which is clearly different for all virtual machine
implementations, and it gets rid of an ugly indirect function which I
ashamedly admit I hacked in to try to get this to work earlier, and then even
got in the wrong units.
Zachary Amsden [Mon, 5 Mar 2007 08:30:34 +0000 (00:30 -0800)]
[PATCH] vmi: timer fixes round two
Critical bugfixes for the VMI-Timer code.
1) Do not setup a one shot alarm if we are keeping the periodic alarm
armed. Additionally, since the periodic alarm can be run at a lower rate
than HZ, let's fixup the guard to the no-idle-hz mode appropriately. This
fixes the bug where the no-idle-hz mode might have a higher interrupt rate
than the non-idle case.
2) The interrupt handler can no longer adjust xtime due to nested lock
acquisition. Drop this. We don't need to check for wallclock time at
every tick, it can be done in userspace instead.
3) Add a bypass to disable noidle operation. This is useful as a last
minute workaround, or testing measure.
4) The code to skip the IO_APIC timer testing (no_timer_check) should be
conditional on IO_APIC, not SMP, since UP kernels can have this configured
in as well.
Signed-off-by: Dan Hecht <dhecht@vmware.com> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we do not check for vma flags if sys_move_pages is called to move
individual pages. If sys_migrate_pages is called to move pages then we
check for vm_flags that indicate a non migratable vma but that still
includes VM_LOCKED and we can migrate mlocked pages.
Extract the vma_migratable check from mm/mempolicy.c, fix it and put it
into migrate.h so that is can be used from both locations.
Problem was spotted by Lee Schermerhorn
Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt [Mon, 5 Mar 2007 08:30:31 +0000 (00:30 -0800)]
[PATCH] fb: sm501fb off-by-1 sysfs store
Currently sm501fb_crtsrc_store() won't allow the routing to be changed via
echos from userspace in to the sysfs file. The reason for this is that the
strnicmp() for both heads uses a sizeof() for the string length, which ends
up being strlen() + 1 (\0 in the normal case, but the echo gives a newline,
which is where the issue occurs), this then causes a mismatch and
subsequently bails with the -EINVAL.
In addition to this, the hardcoded lengths were then used for the store
length that was returned, which ended up being erroneous and resulting in a
write error. There's also no point in returning anything but the full
length since it will -EINVAL out on a mismatch well before then anyways.
sizeof("string") is great for making sure you have space in your buffer,
but rather less so for string comparisons :-)
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Ben Dooks <ben-linux@fluff.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Con Kolivas [Mon, 5 Mar 2007 08:30:29 +0000 (00:30 -0800)]
[PATCH] sched: remove SMT nice
Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
facilitiate nice working properly where cpu power is shared. The idling of
cpus in the presence of runnable tasks is considered too fragile, easy to
break with outside code, and the complexity of managing this system if an
architecture comes along with many logical cores sharing cpu power will be
unworkable.
Remove the associated per_cpu_gain variable in sched_domains used only by
this code.
Also:
The reason is that with dynticks enabled, this code breaks without yet
further tweaks so dynticks brought on the rapid demise of this code. So
either we tweak this code or kill it off entirely. It was Ingo's preference
to kill it off. Either way this needs to happen for 2.6.21 since dynticks
has gone in.
Signed-off-by: Con Kolivas <kernel@kolivas.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>