]> err.no Git - linux-2.6/log
linux-2.6
16 years agoMerge branch 'x86/mpparse' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 09:14:58 +0000 (11:14 +0200)]
Merge branch 'x86/mpparse' into x86/devel

Conflicts:

arch/x86/Kconfig
arch/x86/kernel/io_apic_32.c
arch/x86/kernel/setup_64.c
arch/x86/mm/init_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, ioapic, acpi quirk: disable IRQ 0 through I/O APIC for some HP systems
Matthew Garrett [Tue, 1 Jul 2008 00:12:06 +0000 (01:12 +0100)]
x86, ioapic, acpi quirk: disable IRQ 0 through I/O APIC for some HP systems

Some HP laptops have a problem with their DSDT reporting as
HP/SB400/10000, which includes some code which overrides all temperature
trip points to 16C if the INTIN2 input of the I/O APIC is enabled.  This
input is incorrectly designated the ISA IRQ 0 via an interrupt source
override even though it is wired to the output of the master 8259A and
INTIN0 is not connected at all.  So far two models have been identified,
namely nx6125 and nx6325.

Use a knob provided by the I/O APIC interrupt registration code to
abandon any attempts to route IRQ 0 through the I/O APIC for these
systems.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, ioapic, acpi: add a knob to disable IRQ 0 through I/O APIC
Maciej W. Rozycki [Tue, 1 Jul 2008 00:11:35 +0000 (01:11 +0100)]
x86, ioapic, acpi: add a knob to disable IRQ 0 through I/O APIC

As discovered recently some systems exhibit problems when the 8254 timer
IRQ is routed through the I/O APIC.  These problems do not affect the
timer IRQ itself and therefore cannot be detected when the correctness of
operation of the interrupt is verified in check_timer().  Therefore the
I/O APIC path of the timer IRQ has to be disabled entirely.

This is a change that lets platforms ask for the timer IRQ not to be
registered in the I/O APIC interrupt tables.  The local APIC and ExtINTA
paths are unaffected.  This request is only taken into account for ACPI
platforms as MP table systems seem unaffected so far.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: simplify x86_mpparse dependency check
Yinghai Lu [Fri, 20 Jun 2008 14:33:31 +0000 (07:33 -0700)]
x86: simplify x86_mpparse dependency check

"Maciej W. Rozycki" <macro@linux-mips.org> said:

> Given X86_64 selects X86_LOCAL_APIC I am not sure the redundancy seen
>above does not actually obscure the logic behind...  I think:
>
>       depends on X86_LOCAL_APIC && !X86_VISWS
>
>would be clearer and get the same.

Suggested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRevert parts of "x86: update mptable"
Ingo Molnar [Tue, 8 Jul 2008 08:47:39 +0000 (10:47 +0200)]
Revert parts of "x86: update mptable"

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix compiling when CONFIG_X86_MPPARSE is not set
Yinghai Lu [Thu, 19 Jun 2008 19:15:01 +0000 (12:15 -0700)]
x86: fix compiling when CONFIG_X86_MPPARSE is not set

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: let MPS support be selectable, v2
Yinghai Lu [Thu, 19 Jun 2008 19:13:09 +0000 (12:13 -0700)]
x86: let MPS support be selectable, v2

v2: seperate "fix for compiling when MPPARSE is not set" to another patch
    make X86_MPPARSE to be selectable only when acpi is set and
    X86_MPPARSE will be set if acpi is not set.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: update mptable fix with no ioapic v2
Yinghai Lu [Thu, 19 Jun 2008 00:29:31 +0000 (17:29 -0700)]
x86: update mptable fix with no ioapic v2

if the system doesn't have ioapic, we don't need to store entries for mptable
update

also let mp_config_acpi_gsi not call func in mpparse
so later could decouple mpparse with acpi more easily

Reported-by: Daniel Exner <dex@dragonslave.de>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Daniel Exner <dex@dragonslave.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: cleanup machine_specific_memory_setup, v2
Yinghai Lu [Thu, 19 Jun 2008 00:27:08 +0000 (17:27 -0700)]
x86: cleanup machine_specific_memory_setup, v2

1. let 64bit support 88 and e801 too
2. introduce default_machine_specific_memory_setup, and reuse it
   for voyager

v2: fix 64 bit compiling

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove unused file after numaq etc depends on genericarch
Yinghai Lu [Wed, 18 Jun 2008 18:54:37 +0000 (11:54 -0700)]
x86: remove unused file after numaq etc depends on genericarch

we don't need those mach_mpspec.h files now.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use acpi_numa_init to parse on 32-bit numa
Yinghai Lu [Tue, 17 Jun 2008 22:41:45 +0000 (15:41 -0700)]
x86: use acpi_numa_init to parse on 32-bit numa

seperate SRAT finding and parsing from get_memcfg_from_srat,
and let getmemcfg_from_srat only handle array from previous step.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: Kconfig cleanup with genericarch
Yinghai Lu [Tue, 17 Jun 2008 22:39:01 +0000 (15:39 -0700)]
x86: Kconfig cleanup with genericarch

we already have summit and etc depends on genericarch,
so use genericarch only.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move some function out of setup_bootmem_alloc
Yinghai Lu [Tue, 17 Jun 2008 17:02:45 +0000 (10:02 -0700)]
x86: move some function out of setup_bootmem_alloc

... to make it more like 64-bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: merge setup_memory_map with e820
Yinghai Lu [Tue, 17 Jun 2008 02:58:28 +0000 (19:58 -0700)]
x86: merge setup_memory_map with e820

... and kill e820_32/64.c and e820_32/64.h

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: kill bad_ppro
Yinghai Lu [Mon, 16 Jun 2008 23:11:08 +0000 (16:11 -0700)]
x86: kill bad_ppro

so don't punish all other cpus without that problem when init highmem

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move e820_resource_resources to e820.c
Yinghai Lu [Mon, 16 Jun 2008 20:03:31 +0000 (13:03 -0700)]
x86: move e820_resource_resources to e820.c

and make 32-bit resource registration more like 64 bit.

also move probe_roms back to setup_32.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: pass E820 memory map entries more than 128 via linked list of setup data
Huang, Ying [Wed, 11 Jun 2008 03:33:39 +0000 (11:33 +0800)]
x86 boot: pass E820 memory map entries more than 128 via linked list of setup data

Because of the size limits of struct boot_params (zero page), the
maximum number of E820 memory map entries can be passed to kernel is
128. As pointed by Paul Jackson, there is some machine produced by SGI
with so many nodes that the number of E820 memory map entries is more
than 128. To enabling Linux kernel on these system, a new setup data
type named SETUP_E820_EXT is defined to pass additional memory map
entries to Linux kernel.

This patch is based on x86/auto-latest branch of git-x86 tree and has
been tested on x86_64 and i386 platform.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, mm: use add_highpages_with_active_regions() for high pages init v2
Yinghai Lu [Sun, 15 Jun 2008 01:32:52 +0000 (18:32 -0700)]
x86, mm: use add_highpages_with_active_regions() for high pages init v2

use early_node_map to init high pages, so we can remove page_is_ram() and
page_is_reserved_early() in the big loop with add_one_highpage

also remove page_is_reserved_early(), it is not needed anymore.

v2: fix the build of other platforms

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: rename two e820 related functions
Yinghai Lu [Mon, 16 Jun 2008 01:58:51 +0000 (18:58 -0700)]
x86: rename two e820 related functions

rename update_memory_range to e820_update_range
rename add_memory_region to e820_add_region

to make it more clear that they are about e820 map operations.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use dstapic in mp_config_acpi_legacy_irqs
Yinghai Lu [Mon, 16 Jun 2008 01:19:46 +0000 (18:19 -0700)]
x86: use dstapic in mp_config_acpi_legacy_irqs

so we don't get the same value multiple times.

also make mp_config_acpi_legacy_irqs more readable by moving assignments
together.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: keep MP_intsrc_info untouched if we do not update mptable
Yinghai Lu [Sat, 14 Jun 2008 08:26:41 +0000 (01:26 -0700)]
x86: keep MP_intsrc_info untouched if we do not update mptable

Daniel Exner reported IO-APIC enumeration breakage in linux-next.

Alexey Starikovskiy found out that it might be related to
commit 2944e16b25 "x86: update mptable".

use enable_update_mptable to decide if need check before add mp_irqs array.

Reported-by: Daniel Exner <webmaster@dragonslave.de>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up relocate_initrd
Yinghai Lu [Sat, 14 Jun 2008 03:07:03 +0000 (20:07 -0700)]
x86: clean up relocate_initrd

1. move that before zone_sizes_init ...
2. add free_early for one old one, otherwise it will be be reserved again
   when we init highmem.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: replace shrink_active_range() with remove_active_range()
Yinghai Lu [Sat, 14 Jun 2008 02:08:52 +0000 (19:08 -0700)]
x86: replace shrink_active_range() with remove_active_range()

in case we have kva before ramdisk on a node, we still need to use
those ranges.

v2: reserve_early kva ram area, in case there are holes in highmem, to avoid
    those area could be treat as free high pages.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up reserve_bootmem_generic() and port it to 32-bit
Yinghai Lu [Fri, 13 Jun 2008 09:00:56 +0000 (02:00 -0700)]
x86: clean up reserve_bootmem_generic() and port it to 32-bit

1. add reserve_bootmem_generic for 32bit
2. change len to unsigned long
3. make early_res_to_bootmem to use it

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make generic arch support NUMAQ, fix #2
Yinghai Lu [Tue, 10 Jun 2008 01:11:36 +0000 (18:11 -0700)]
x86: make generic arch support NUMAQ, fix #2

we are checking mptable early for numaq, so don't need to reserve_bootmem
for it. bootmem is not there yet.

do the same thing as 64-bit.

found it on 64g above system from 64-bit kernel kexec to 32 bit kernel with
numaq support.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make generic arch support NUMAQ, fix
Yinghai Lu [Tue, 10 Jun 2008 00:00:15 +0000 (17:00 -0700)]
x86: make generic arch support NUMAQ, fix

fix typo in bigsmp switching.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: e820 merge parsing of the mem=/memmap= boot parameters
Yinghai Lu [Tue, 10 Jun 2008 19:55:54 +0000 (12:55 -0700)]
x86: e820 merge parsing of the mem=/memmap= boot parameters

since we now have 32-bit support for e820_register_active_regions(),
we can merge the parsing of the mem=/memmap= boot parameters.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: unify the reserve_bootmem() behavior of early_res_to_bootmem()
Ingo Molnar [Tue, 10 Jun 2008 14:38:41 +0000 (16:38 +0200)]
x86: unify the reserve_bootmem() behavior of early_res_to_bootmem()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64
Bernhard Walle [Sun, 8 Jun 2008 13:46:31 +0000 (15:46 +0200)]
x86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64

This patch uses reserve_bootmem_generic() instead of reserve_bootmem()
to reserve the crashkernel memory on x86_64. That's necessary for NUMA
machines, see 00212fef814612245ed0261cbac8426d0c9a31a5:

  [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines

  This patch will fix a boot memory reservation bug that trashes memory on
  the ES7000 when loading the kdump crash kernel.

  The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash
  kernel uses the non-numa aware "reserve_bootmem" function instead of the
  NUMA aware "reserve_bootmem_generic".  I checked to make sure that no other
  function was using "reserve_bootmem" and found none, except the ones that
  had NUMA ifdef'ed out.

  I have tested this patch only on an ES7000 with NUMA on and off (numa=off)
  in a single (non-NUMA) and multi-cell (NUMA) configurations.

Signed-off-by: Amul Shah <amul.shah@unisys.com>
Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The switch-back to reserve_bootmem() was accidentally introduced in
5c3391f9f749023a49c64d607da4fb49263690eb when adding the BOOTMEM_EXCLUSIVE
parameter.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add flags parameter to reserve_bootmem_generic()
Bernhard Walle [Sun, 8 Jun 2008 13:46:30 +0000 (15:46 +0200)]
x86: add flags parameter to reserve_bootmem_generic()

This patch adds a 'flags' parameter to reserve_bootmem_generic() like it
already has been added in reserve_bootmem() with commit
72a7fe3967dbf86cb34e24fbf1d957fe24d2f246.

It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively
change the behaviour. Since the change is x86-specific, I don't think it's
necessary to add a new API for migration. There are only 4 users of that
function.

The change is necessary for the next patch, using reserve_bootmem_generic()
for crashkernel reservation.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into tmp.x86.mpparse.new
Ingo Molnar [Tue, 8 Jul 2008 08:32:56 +0000 (10:32 +0200)]
Merge branch 'linus' into tmp.x86.mpparse.new

16 years agoMerge branch 'x86/irq' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 07:53:57 +0000 (09:53 +0200)]
Merge branch 'x86/irq' into x86/devel

Conflicts:

arch/x86/kernel/i8259.c
arch/x86/kernel/irqinit_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug',...
Ingo Molnar [Tue, 8 Jul 2008 07:46:15 +0000 (09:46 +0200)]
Merge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug', 'x86/ptrace' and 'x86/amd-iommu' into x86/devel

16 years agoMerge branch 'x86/setup' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 07:43:01 +0000 (09:43 +0200)]
Merge branch 'x86/setup' into x86/devel

16 years agoMerge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build...
Ingo Molnar [Tue, 8 Jul 2008 07:16:56 +0000 (09:16 +0200)]
Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel

16 years agox86, arch/x86/kernel/io_apic_32.c: use kzalloc instead of kmalloc/memset
Christophe Jaillet [Sun, 22 Jun 2008 20:13:48 +0000 (22:13 +0200)]
x86, arch/x86/kernel/io_apic_32.c: use kzalloc instead of kmalloc/memset

1) replace kmalloc/memset with equivalent kzalloc.

Signed-off-by: Christophe Jaillet <jaillet.christophe@wanadoo.fr>
Cc: cj <jaillet.christophe@wanadoo.fr>
Cc: petero2@telia.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix IO APIC breakage on HP nx6325, v2
Maciej W. Rozycki [Fri, 20 Jun 2008 00:35:07 +0000 (01:35 +0100)]
x86: fix IO APIC breakage on HP nx6325, v2

> That helped a lot, the system seems to work normally now.
>
> Here's the relevant snippet from dmesg:
>
> [    0.108006] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [    0.108006] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [    0.108006] ...trying to set up timer (IRQ0) through the 8259A ... <3>
> [    0.108006] ..... (found apic 0 pin 2) ...<3> failed.
> [    0.108006] ...trying to set up timer as Virtual Wire IRQ...<3> works.
>
> and the whole thing is at: http://www.sisk.pl/kernel/debug/20080618/dmesg-2.log

 Hmm, that only proved the 8259A is indeed wired to the pin #2 of the I/O
APIC.

> I, personally, don't have any and AMD only has SB600 documentation on its
> web page (it's still marked as "AMD confidential" ;-)).

 Well, the IC block is most likely the same as that's not rocket science
and once done there is no need to fiddle with that.  That written, I am
afraid there is nothing useful about the IC in the document, except that
it's there and consists of an I/O APIC providing 24 inputs and the usual
pair of 8259A cores.  Thanks for the reference anyway.

> There is an interrupt controller in there, but I'm not sure if there's any
> 8259A.  The northbridge is on the CPU, actually.

 I will praise the day someone ships an x86 machine without an 8259A core!

 As expressed in another mail I suspect there may actually be a direct
route from the 8254 to INTIN0 in the southbridge -- this is what other
bootstrap logs seen in the Internet suggest.  This would mean this
particular BIOS is buggy (is it the latest version?) and provides an
incorrect IRQ override in its ACPI tables, for example because the
responsible block has been blindly copied from a machine using a commoner
wiring.  This could be moderately easily fixed up with a quirk based on
the PCI ID (after checking it again, we actually used to have a quirk for
ATI in this area, but the way it was done suggests the issue was not
understood well enough).

 Could you please remove the hack sent yesterday and test the patch
provided below?  I do hope it builds, but I have no immediate means to
check it.  Please report the output.  The intent is to test INTIN0
directly before testing INTIN2 through the 8259A.  Thanks.

 Aside of that, what I have gathered from your reports (please correct me
if I have got it wrong) is that when the through-8259A mode is used, then
after a while 8254 timer interrupts stop arriving.  What's interesting,
the "Virtual Wire IRQ" seems to work for you correctly (that's quite an
odd setup where a local APIC input is used in the native mode -- please
post /proc/interrupts for confirmation), which in turn implies the master
8259A drives its INT output as we expect.  Why would the I/O APIC input
have problems then?  Hmm...

[ mingo@elte.hu: revert the "x86: fix IO APIC breakage on HP nx6325"
  version. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix IO APIC breakage on HP nx6325
Maciej W. Rozycki [Wed, 18 Jun 2008 23:39:33 +0000 (00:39 +0100)]
x86: fix IO APIC breakage on HP nx6325

On Thu, 19 Jun 2008, Rafael J. Wysocki wrote:

> >  With such a configuration the "x86: I/O APIC: timer through 8259A
> > second-chance" patch should not matter, because the only change it
> > introduces is an attempt to try the same I/O APIC pin again, but with the
> > IRQ0 line of the master 8259A enabled.  That's not a terribly unusual
> > configuration and nothing should get confused in the system.
>
> But it _does_ get confused, really.

 Something certainly gets confused, but so far I am not sure which bit
exactly it is, are you?

> >  Barring the unlikely possibility of the 8259A actually being wired to
> > INTIN2 of the I/O APIC I can see two possible explanations:
> >
> > 1. The 8259A interrupt actually escapes to the CPU somehow and is handled
> >    as an ExtINTA interrupt.  This would make the code in check_timer()
> >    decide it has found a working configuration, while actually it has been
> >    fooled.
[...]
> Here you go:
>
> [    0.108006] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [    0.108006] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [    0.108006] ...trying to set up timer (IRQ0) through the 8259A ... <3>
> [    0.108006] ..... (found apic 0 pin 2) ...<3> works.
>
> The full dmesg is at: http://www.sisk.pl/kernel/debug/20080618/dmesg-1.log

Thanks.  In this case I suspect the case #1 quoted above happens, that is
the 8259A manages to deliver its interrupt somehow.  Note at this stage it
is meant to be in the AEOI mode, so it can happily resubmit the interrupt
indefinitely with no additional handling as long as it receives INTA
cycles.

Can you please try the patch below on top of "x86: I/O APIC: timer
through 8259A second-chance" to see whether my hypothesis is true?  It
modifies the through-8259A setup path so that the APIC input gets masked,
but the 8259A has the timer interrupt still enabled.  Let me know how the
timer interrupt is routed in this case.

Bisected-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: coding style fixes to arch/x86/kernel/io_apic_32.c
Paolo Ciarrocchi [Sun, 8 Jun 2008 11:07:18 +0000 (13:07 +0200)]
x86: coding style fixes to arch/x86/kernel/io_apic_32.c

Before:
total: 91 errors, 73 warnings, 2850 lines checked

After:
total: 1 errors, 47 warnings, 2848 lines checked

Compile tested:

paolo@paolo-desktop:/tmp$ size io*
   text    data     bss     dec     hex filename
  13836    1756   11104   26696    6848 io_apic_32.o.after
  13836    1756   11104   26696    6848 io_apic_32.o.before

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, io-apic: use predefined names instead of numeric constants
Cyrill Gorcunov [Sat, 7 Jun 2008 15:53:57 +0000 (19:53 +0400)]
x86, io-apic: use predefined names instead of numeric constants

This patch replaces some hard-coded numbers with predefined names.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, io-apic: define names for redirection table entry fields
Cyrill Gorcunov [Sat, 7 Jun 2008 15:53:56 +0000 (19:53 +0400)]
x86, io-apic: define names for redirection table entry fields

Each I/O APIC redirection table entry has a number of fields.
Define names for them to eliminate reference by hard coded
numbers.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/SMP: Downgrade the NMI watchdog for "noapic"
Maciej W. Rozycki [Fri, 6 Jun 2008 02:28:13 +0000 (03:28 +0100)]
x86: APIC/SMP: Downgrade the NMI watchdog for "noapic"

 If configured to use the I/O APIC, the NMI watchdog is deemed to fail if
the chip has been deactivated as a result of "noapic".  Downgrade to the
local APIC watchdog similarly to what is done for the UP case.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/SMP: Downgrade the NMI watchdog for "nosmp"
Maciej W. Rozycki [Fri, 6 Jun 2008 02:28:02 +0000 (03:28 +0100)]
x86: APIC/SMP: Downgrade the NMI watchdog for "nosmp"

 If configured to use the I/O APIC, the NMI watchdog is deemed to fail if
the chip has been deactivated as a result of "nosmp".  Downgrade to the
local APIC watchdog similarly to what is done for the UP case.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/UP: Remove redundant NMI watchdog downgrade
Maciej W. Rozycki [Fri, 6 Jun 2008 02:27:56 +0000 (03:27 +0100)]
x86: APIC/UP: Remove redundant NMI watchdog downgrade

For the UP case the NMI watchdog downgrade is done consistently in
APIC_init_uniprocessor() now.  Remove redundant code used only when
BIOS-disabled local APIC is activated.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/UP: Downgrade the NMI watchdog for no I/O APIC
Maciej W. Rozycki [Fri, 6 Jun 2008 02:27:49 +0000 (03:27 +0100)]
x86: APIC/UP: Downgrade the NMI watchdog for no I/O APIC

 If configured to use the I/O APIC, the NMI watchdog is deemed to fail if
the chip will not be used in the UP configuration, because "noapic" has
been specified or the chip is simply not there.  Downgrade to the local
APIC watchdog to rectify.

The new #ifdef is ugly, I know.  A proper solution is to provide suitable
definitions of smp_found_config, etc. for !CONFIG_X86_IO_APIC in a header.
Likewise the whole if () condition should be moved to a static inline
function.  Such clean-ups are beyond the scope of this change and can be
done once the whole issue of the timer has been sorted out.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: NMI watchdog: Downgrade helper
Maciej W. Rozycki [Fri, 6 Jun 2008 02:27:41 +0000 (03:27 +0100)]
x86: NMI watchdog: Downgrade helper

A downgrade helper for the NMI watchdog to be used in all places where
the I/O APIC watchdog may have been requested, but the I/O APIC is found
not to be there or meant to be left disabled.  This is so that the
reconfiguration is cosistent and defined in a single place only.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRevert "x86: APIC/SMP: downgrade the NMI watchdog for "nosmp""
Ingo Molnar [Sun, 8 Jun 2008 08:14:40 +0000 (10:14 +0200)]
Revert "x86: APIC/SMP: downgrade the NMI watchdog for "nosmp""

This reverts commit 791b93d3dfaf16c23e978bec0cc0a3dd9d855d63.

A better fix from Maciej will be merged.

16 years agoRevert "x86, io-apic: fix nmi_watchdog=1 bootup hang"
Ingo Molnar [Sun, 8 Jun 2008 08:13:33 +0000 (10:13 +0200)]
Revert "x86, io-apic: fix nmi_watchdog=1 bootup hang"

This reverts commit 2229ff84f01746d02fb6b79e156fb5cce48c908f.

A better fix from Maciej will be merged.

16 years agox86, io-apic: fix nmi_watchdog=1 bootup hang
Ingo Molnar [Thu, 5 Jun 2008 09:17:16 +0000 (11:17 +0200)]
x86, io-apic: fix nmi_watchdog=1 bootup hang

nmi_watchdog=1 hangs on 64-bit:

[    0.250000] Detected 12.564 MHz APIC timer.
[    0.254178] APIC timer registered as dummy, due to nmi_watchdog=1!
[    0.260366] Testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)!
[    ...     ]
[    0.470003] calling  genl_init+0x0/0xd0
[  hard hang ]

bisected it down to:

 git-bisect start
 git-bisect good 1beee8dc8cf58e3f605bd7b34d7a39939be7d8d2
 git-bisect bad 11582ece0aaa2d0f94f345c08a4ab9997078a083
 git-bisect bad 5479c623bb44089844022c03d4c0eb16d5b7a15f
 git-bisect bad cfb4c7fabeb499e1c29f9d1878968e37a938e28a
 git-bisect good 246dd412d31e4f5de1d43aa6422a325b785f36e4
 git-bisect bad 3f8237eaff7dc1e35fa791dae095574fd974e671
 git-bisect good 90e23b13ab849e2a11f00c655eb3a2011b4623be
 git-bisect bad 833526a34eeefc117df3191a594c3c3a4f15a9ac
 git-bisect good 791b93d3dfaf16c23e978bec0cc0a3dd9d855d63
 git-bisect bad 65767c64068f2c93e56a1accfed5c78230ac12d7
 git-bisect bad 2abc5c05dd82c188e3bdf6641a274f013348d14b
 git-bisect bad 317e1f2597ffb4d4db940577bbe56dc6e881ef07

317e1f2597ffb4d4db940577bbe56dc6e881ef07 is first bad commit
| commit 317e1f2597ffb4d4db940577bbe56dc6e881ef07
| Author: Maciej W. Rozycki <macro@linux-mips.org>
| Date:   Wed May 21 22:10:22 2008 +0100
|     x86: I/O APIC: clean up the 8259A on a NMI watchdog failure

the problem is that in the dummy-lapic branch we rely on the i8259A
but if the NMI watchdog fails we turn off IRQ 0 - which doesnt work
too well ;-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: IO-APIC - use NMI_NONE instead of numeric constant
Cyrill Gorcunov [Thu, 29 May 2008 18:32:30 +0000 (22:32 +0400)]
x86: IO-APIC - use NMI_NONE instead of numeric constant

Not sure but maybe it is better to use NMI_DISABLED,
will take a look. But for now this patch is not change
anything in logic so it will not hurt/broke the kernel.
For most cases nmi_watchdog assignment is by one of NMI_*
macro so I think there it make sense too.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 build fix:
Ingo Molnar [Sat, 31 May 2008 10:20:10 +0000 (12:20 +0200)]
x86 build fix:

  arch/x86/kernel/io_apic_64.c: In function 'check_timer':
  arch/x86/kernel/io_apic_64.c:1688: error: 'vector' undeclared (first use in this function)
  arch/x86/kernel/io_apic_64.c:1688: error: (Each undeclared identifier is reported only once
  arch/x86/kernel/io_apic_64.c:1688: error: for each function it appears in.)

16 years agox86: apic_64.c fix sparse warnings about shadowed variables
Thomas Gleixner [Mon, 12 May 2008 13:43:35 +0000 (15:43 +0200)]
x86: apic_64.c fix sparse warnings about shadowed variables

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make irq_cfg static
Thomas Gleixner [Mon, 12 May 2008 13:43:36 +0000 (15:43 +0200)]
x86: make irq_cfg static

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move pci_routirq declaration to pci.h
Thomas Gleixner [Mon, 12 May 2008 13:43:37 +0000 (15:43 +0200)]
x86: move pci_routirq declaration to pci.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: timer through 8259A second-chance
Maciej W. Rozycki [Tue, 27 May 2008 20:19:51 +0000 (21:19 +0100)]
x86: I/O APIC: timer through 8259A second-chance

Some systems incorrectly report the ExtINTA pin of the I/O APIC as the
genuine target of the timer interrupt.  Here is a change that copies timer
pin information found to the other pin if one has been found only.  This
way both a direct and a through-8259A route is tested with the pin letting
these problematic systems work well enough.  If no timer pin information
has been found for the I/O APIC, then local APIC variations are tried
only, similarly to what is done without the change (except without the
misleading messages).

Obviously if we try the first-chance path without being told by the BIOS
to do so, we should not complain either, so do not print the message in
this case.

The 64-bit variation should be updated with a call to
replace_pin_at_irq() which can be done with the upcoming merge.  Since
add_pin_to_irq() is now always called in the first-chance path, the
condition to require it in the second-chance path no longer happens.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: keep the timer IRQ masked during set-up
Maciej W. Rozycki [Tue, 27 May 2008 20:19:45 +0000 (21:19 +0100)]
x86: I/O APIC: keep the timer IRQ masked during set-up

Keep the timer interrupt line masked when reconfiguring its interrupt
redirection entry in the I/O APIC.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: unmask the second-chance timer interrupt
Maciej W. Rozycki [Tue, 27 May 2008 20:19:40 +0000 (21:19 +0100)]
x86: I/O APIC: unmask the second-chance timer interrupt

Unmask the timer interrupt line set up in the through-8259A mode
explicitly after setup_timer_IRQ0_pin() has set up the I/O APIC interrupt
redirection entry to let the two operations be unbound from each other.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: rename setup_ExtINT_IRQ0_pin()
Maciej W. Rozycki [Tue, 27 May 2008 20:19:34 +0000 (21:19 +0100)]
x86: I/O APIC: rename setup_ExtINT_IRQ0_pin()

Rename setup_ExtINT_IRQ0_pin() to setup_timer_IRQ0_pin() to better
reflect the upcoming role of a function setting up a (semi-)arbitrary I/O
APIC pin appropriately for the 8254 timer.  By "appropriate" the following
settings are meant: edge-triggered, active-high, all the other settings
per-architecture.  Adjust comments to reflect code appropriately.  No
functional changes.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: remove redundant LVT0 masking
Maciej W. Rozycki [Tue, 27 May 2008 20:19:28 +0000 (21:19 +0100)]
x86: I/O APIC: remove redundant LVT0 masking

The LINT0 line of the local APIC is masked in the LVT0 entry in
check_timer() before this function is ever called.  Removed the
redundant unmasking for better control.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: remove redundant 8259A {,un}masking
Maciej W. Rozycki [Tue, 27 May 2008 20:19:22 +0000 (21:19 +0100)]
x86: I/O APIC: remove redundant 8259A {,un}masking

For a better control the masking and unmasking of the timer interrupt
line in the 8259A operating in the 'Virtual Wire' mode has been moved out
of setup_ExtINT_IRQ0_pin() now, so remove the redundant calls from the
function.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: fix the name of the through-8259A handler
Maciej W. Rozycki [Tue, 27 May 2008 20:19:16 +0000 (21:19 +0100)]
x86: I/O APIC: fix the name of the through-8259A handler

When the through-8259A mode is used for the timer, the call to
set_irq_handler() will register a NULL handler name, resulting in
"IO-APIC-<NULL>" reported.  Fix by calling ioapic_register_intr() as done
for all the other I/O APIC interrupts.

The 64-bit variation calls set_irq_chip_and_handler_name() here
needlessly and should get fixed with the upcoming merge.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: fix the name of the L-APIC IRQ handler
Maciej W. Rozycki [Tue, 27 May 2008 20:19:09 +0000 (21:19 +0100)]
x86: I/O APIC: fix the name of the L-APIC IRQ handler

The local APIC interrupt handler gets registered with
set_irq_chip_and_handler_name(), which results in
"local-APIC-edge-fasteoi" reported as the name of the handler.  Fix by
removing the type of the handler left over from before the generic
handlers were introduced.

The 64-bit variation should get fixed with the upcoming merge.

NB It should really use the "edge" handler and not the "fasteoi" one,
but that's a separate issue.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: clean up the 8259A on a NMI watchdog failure
Maciej W. Rozycki [Wed, 21 May 2008 21:10:22 +0000 (22:10 +0100)]
x86: I/O APIC: clean up the 8259A on a NMI watchdog failure

There is no point in keeping the 8259A enabled if the I/O APIC NMI
watchdog has failed and the 8259A is not used to pass through regular
timer interrupts.  This fixes problems with some systems where some logic
gets confused.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/SMP: downgrade the NMI watchdog for "nosmp"
Maciej W. Rozycki [Wed, 21 May 2008 21:10:16 +0000 (22:10 +0100)]
x86: APIC/SMP: downgrade the NMI watchdog for "nosmp"

If configured to use the I/O APIC, the NMI watchdog is deemed to fail if
the chip has been deactivated as a result of "nosmp".  Downgrade to the
local APIC watchdog similarly to what is done for the UP case.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: APIC/SMP: correct the message for "nosmp"
Maciej W. Rozycki [Wed, 21 May 2008 21:09:43 +0000 (22:09 +0100)]
x86: APIC/SMP: correct the message for "nosmp"

The local APIC is no longer forced off when "nosmp" has been specified.
Correct the message printed.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: keep IRQ off when changing LVT registers
Maciej W. Rozycki [Wed, 21 May 2008 21:09:34 +0000 (22:09 +0100)]
x86: I/O APIC: keep IRQ off when changing LVT registers

Disable the 8259A acting in the "virtual wire" mode to keep the interrupt
line inactive while fiddling with local APIC interrupt vector registers
associated with its destination inputs.  To be on the safe side,
especially concerning flipping the trigger mode.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: clean up after a fasteoi failure
Maciej W. Rozycki [Wed, 21 May 2008 21:09:26 +0000 (22:09 +0100)]
x86: I/O APIC: clean up after a fasteoi failure

Disable the 8259A when routing of the timer interrupt through the chip to
the local APIC of the primary processor has failed.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: remove parameters to fiddle with the 8259A
Maciej W. Rozycki [Wed, 21 May 2008 21:09:19 +0000 (22:09 +0100)]
x86: I/O APIC: remove parameters to fiddle with the 8259A

Remove the "disable_8254_timer" and "enable_8254_timer" kernel
parameters.  Now that AEOI acknowledgements are no longer needed for
correct timer operation, the 8259A can be kept disabled unconditionally
unless interrupts, either timer or watchdog ones, are actually passed
through it.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: I/O APIC: AEOI timer acknowledgement clean-ups
Maciej W. Rozycki [Wed, 21 May 2008 21:09:11 +0000 (22:09 +0100)]
x86: I/O APIC: AEOI timer acknowledgement clean-ups

The code that used to be in do_slow_gettimeoffset() that relied on the
IRR bit of the master 8259A PIC for IRQ0 to check the state of the output
timer 0 of the PIT is no longer there.  As a result, there is no need to
use the POLL command to acknowledge the timer interrupt in the "8259A
Virtual Wire", except for the NMI watchdog when the i82489DX APIC is used
(this is because this particular APIC treats NMIs as level-triggered and
keeping the input asserted would keep motherboard NMI sources held off for
too long).  Remove the unneeded bits and adjust comments accordingly.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRevert "Revert "x86: fix ioapic bug again""
Ingo Molnar [Mon, 16 Jun 2008 10:44:17 +0000 (12:44 +0200)]
Revert "Revert "x86: fix ioapic bug again""

This reverts commit 0b6a39f7ebcb1c82587ce35b401c513eed41ac5c.

The changes in tip/x86/apic solve this better.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86_64: use PAGE_OFFSET in dump_pagetables
Jiri Slaby [Mon, 12 May 2008 13:43:37 +0000 (15:43 +0200)]
x86_64: use PAGE_OFFSET in dump_pagetables

Use PAGE_OFFSET macro instead of using 0xffff810000000000UL directly.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: hpa@zytor.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: add sparse annotations to ioremap
Thomas Gleixner [Mon, 12 May 2008 13:43:35 +0000 (15:43 +0200)]
x86: add sparse annotations to ioremap

arch/x86/mm/ioremap.c:308:11: error: incompatible types in comparison expression (different address spaces)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: janitor CPA statistics patch
Thomas Gleixner [Mon, 5 May 2008 14:35:21 +0000 (16:35 +0200)]
x86: janitor CPA statistics patch

1) Remove __meminit from update_pages_count. It is used inside
split_pages()

2) Make the code depend on PROC_FS. Doing statistics for nothing is
useless and not adding useless code is nice to the Linux tiny folks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, generic: CPA add statistics about state of direct mapping v4
Andi Kleen [Fri, 2 May 2008 09:46:49 +0000 (11:46 +0200)]
x86, generic: CPA add statistics about state of direct mapping v4

Add information about the mapping state of the direct mapping to
/proc/meminfo. I chose /proc/meminfo because that is where all the other
memory statistics are too and it is a generally useful metric even
outside debugging situations. A lot of split kernel pages means the
kernel will run slower.

This way we can see how many large pages are really used for it and how
many are split.

Useful for general insight into the kernel.

v2: Add hotplug locking to 64bit to plug a very obscure theoretical race.
    32bit doesn't need it because it doesn't support hotadd for lowmem.
    Fix some typos
v3: Rename dpages_cnt
    Add CONFIG ifdef for count update as requested by tglx
    Expand description
v4: Fix stupid bugs added in v3
    Move update_page_count to pageattr.c

Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge commit 'v2.6.26-rc9' into x86/cpu
Ingo Molnar [Tue, 8 Jul 2008 05:47:47 +0000 (07:47 +0200)]
Merge commit 'v2.6.26-rc9' into x86/cpu

16 years agox86: make 64bit identify_cpu use cpu_dev v2
Yinghai Lu [Thu, 19 Jun 2008 22:30:31 +0000 (15:30 -0700)]
x86: make 64bit identify_cpu use cpu_dev v2

v2: fix early_panic on this config:

  http://redhat.com/~mingo/misc/config-Thu_Jun_19_14_22_37_CEST_2008.bad

reason : struct cpu_vendor_dev size is 16, need to make table to be 16
         byte alignment

also print out the cpu supported...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make 64-bit identify_cpu use cpu_dev
Yinghai Lu [Fri, 20 Jun 2008 06:18:09 +0000 (08:18 +0200)]
x86: make 64-bit identify_cpu use cpu_dev

we may need to move some functions to common.c later

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: Move PCI IO ECS code to x86/pci
Robert Richter [Thu, 12 Jun 2008 18:19:23 +0000 (20:19 +0200)]
x86: Move PCI IO ECS code to x86/pci

"Form follows function". Code is now where it belongs to.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86/pci: Renaming k8-bus_64.c to amd_bus.c
Robert Richter [Thu, 12 Jun 2008 18:19:22 +0000 (20:19 +0200)]
x86/pci: Renaming k8-bus_64.c to amd_bus.c

The name fits better since this is code not only for K8.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add C1E aware idle function, fix
Thomas Gleixner [Tue, 17 Jun 2008 07:12:03 +0000 (09:12 +0200)]
x86: add C1E aware idle function, fix

On Tue, 17 Jun 2008, Rafael J. Wysocki wrote:
>
> BTW, with the C1E patches reverted I don't get the
> WARNING: at /home/rafael/src/linux-next/kernel/smp.c:215 smp_call_function_single+0x3d/0xa2
> in the log.  Thomas?

The BROADCAST_FORCE notification uses smp_function_call and therefor
must be run with interrupts enabled.

While at it, add a comment for the BROADCAST_EXIT notifier as well.

Reported-and-bisected-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, clockevents: add C1E aware idle function
Thomas Gleixner [Mon, 9 Jun 2008 17:15:00 +0000 (19:15 +0200)]
x86, clockevents: add C1E aware idle function

C1E on AMD machines is like C3 but without control from the OS. Up to
now we disabled the local apic timer for those machines as it stops
when the CPU goes into C1E. This excludes those machines from high
resolution timers / dynamic ticks, which hurts especially X2 based
laptops.

The current boot time C1E detection has another, more serious flaw
as well: some BIOSes do not enable C1E until the ACPI processor module
is loaded. This causes systems to stop working after that point.

To work nicely with C1E enabled machines we use a separate idle
function, which checks on idle entry whether C1E was enabled in the
Interrupt Pending Message MSR. This allows us to do timer broadcasting
for C1E and covers the late enablement of C1E as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up amd_iommu documentation
FUJITA Tomonori [Mon, 7 Jul 2008 07:40:26 +0000 (16:40 +0900)]
x86: clean up amd_iommu documentation

amd_iommu=off was replaced with a common parameter, iommu=off.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: joerg.roedel@amd.com
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Cc: rjw@sisk.pl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Jul 2008 18:16:23 +0000 (11:16 -0700)]
Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm

* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: IOAPIC: Fix level-triggered irq injection hang
  x86: KVM guest: Add memory clobber to hypercalls

16 years agopxamci: fix byte aligned DMA transfers
Philipp Zabel [Sat, 5 Jul 2008 23:15:34 +0000 (01:15 +0200)]
pxamci: fix byte aligned DMA transfers

The pxa27x DMA controller defaults to 64-bit alignment. This caused
the SCR reads to fail (and, depending on card type, error out) when
card->raw_scr was not aligned on a 8-byte boundary.

For performance reasons all scatter-gather addresses passed to
pxamci_request should be aligned on 8-byte boundaries, but if
this can't be guaranteed, byte aligned DMA transfers in the
have to be enabled in the controller to get correct behaviour.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoRevert "USB: don't explicitly reenable root-hub status interrupts"
Linus Torvalds [Sun, 6 Jul 2008 17:27:25 +0000 (10:27 -0700)]
Revert "USB: don't explicitly reenable root-hub status interrupts"

This reverts commit e872154921a6b5256a3c412dd69158ac0b135176.

Andrey Borzenkov reports that it resulted in a totally hung machine for
him when loading the OHCI driver.  Extensive netconsole capture with
SysRq output shows that modprobe gets stuck in ohci_hub_status_data()
when probing and enabling the OHCI controller, see for example

http://lkml.org/lkml/2008/7/5/236

for an analysis.

The problem appears to be an interrupt flood triggered by the commit
that gets reverted, and Andrey confirmed that the revert makes things
work for him again.

Reported-and-tested-by: Andrey Borzenkov <arvidjaar@mail.ru>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <david-b@pacbell.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoKVM: IOAPIC: Fix level-triggered irq injection hang
Mark McLoughlin [Fri, 4 Jul 2008 17:23:15 +0000 (18:23 +0100)]
KVM: IOAPIC: Fix level-triggered irq injection hang

The "remote_irr" variable is used to indicate an interrupt
which has been received by the LAPIC, but not acked.

In our EOI handler, we unset remote_irr and re-inject the
interrupt if the interrupt line is still asserted.

However, we do not set remote_irr here, leading to a
situation where if kvm_ioapic_set_irq() is called, then we go
ahead and call ioapic_service(). This means that IRR is
re-asserted even though the interrupt is currently in service
(i.e. LAPIC IRR is cleared and ISR/TMR set)

The issue with this is that when the currently executing
interrupt handler finishes and writes LAPIC EOI, then TMR is
unset and EOI sent to the IOAPIC. Since IRR is now asserted,
but TMR is not, then when the second interrupt is handled,
no EOI is sent and if there is any pending interrupt, it is
not re-injected.

This fixes a hang only seen while running mke2fs -j on an
8Gb virtio disk backed by a fully sparse raw file, with
aliguori "avoid fragmented virtio-blk transfers by copying"
changes.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agox86: KVM guest: Add memory clobber to hypercalls
Anthony Liguori [Thu, 3 Jul 2008 16:02:36 +0000 (19:02 +0300)]
x86: KVM guest: Add memory clobber to hypercalls

Hypercalls can modify arbitrary regions of memory.  Make sure to indicate this
in the clobber list.  This fixes a hang when using KVM_GUEST kernel built with
GCC 4.3.0.

This was originally spotted and analyzed by Marcelo.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
16 years agoLinux 2.6.26-rc9 v2.6.26-rc9
Linus Torvalds [Sat, 5 Jul 2008 22:53:22 +0000 (15:53 -0700)]
Linux 2.6.26-rc9

16 years agoFix pagemap_read() use of struct mm_walk
Andrew Morton [Sat, 5 Jul 2008 08:02:01 +0000 (01:02 -0700)]
Fix pagemap_read() use of struct mm_walk

Fix some issues in pagemap_read noted by Alexey:

- initialize pagemap_walk.mm to "mm" , so the code starts working as
  advertised

- initialize ->private to "&pm" so it wouldn't immediately oops in
  pagemap_pte_hole()

- unstatic struct pagemap_walk, so two threads won't fsckup each other
  (including those started by root, including flipping ->mm when you don't
  have permissions)

- pagemap_read() contains two calls to ptrace_may_attach(), second one
  looks unneeded.

- avoid possible kmalloc(0) and integer wraparound.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Personally, I'd just remove the functionality entirely  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMove _RET_IP_ and _THIS_IP_ to include/linux/kernel.h
Eduard - Gabriel Munteanu [Sat, 5 Jul 2008 09:14:23 +0000 (12:14 +0300)]
Move _RET_IP_ and _THIS_IP_ to include/linux/kernel.h

These two macros are useful beyond lock debugging. Moved definitions from
include/linux/debug_locks.h to include/linux/kernel.h, so code that needs
them does not have to include the former, which would have been a less
intuitive choice of a header.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 5 Jul 2008 20:09:31 +0000 (13:09 -0700)]
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  softlockup: print a module list on being stuck

16 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 5 Jul 2008 20:08:38 +0000 (13:08 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 ACPI: fix resume from suspend to RAM on uniprocessor x86-64
  x86 ACPI: normalize segment descriptor register on resume

16 years agoFix clear_refs_write() use of struct mm_walk
Andrew Morton [Sat, 5 Jul 2008 19:29:05 +0000 (12:29 -0700)]
Fix clear_refs_write() use of struct mm_walk

Don't use a static entry, so as to prevent races during concurrent use
of this function.

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
Linus Torvalds [Sat, 5 Jul 2008 20:06:19 +0000 (13:06 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: ide_unregister() locking bugfix
  ide: ide_unregister() warm-plug bugfix
  ide: fix hwif->gendev refcounting

16 years agoahci: give another shot at clearing all bits in irq_stat
Tejun Heo [Sat, 5 Jul 2008 04:10:50 +0000 (13:10 +0900)]
ahci: give another shot at clearing all bits in irq_stat

Commit ea0c62f7cf70f13a67830471b613337bd0c9a62e tried to clear all
bits in irq_stat but it didn't actually achieve that as irq_stat was
anded with port_map right after read.  This patch makes ahci driver
always use the unmasked value to clear irq_status.

While at it, add explanation on the peculiarities of ahci IRQ
clearing.

This was spotted by Linus Torvalds.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoide: ide_unregister() locking bugfix
Bartlomiej Zolnierkiewicz [Sat, 5 Jul 2008 18:30:51 +0000 (20:30 +0200)]
ide: ide_unregister() locking bugfix

Holding ide_lock for ide_release_dma_engine() call is unnecessary
and triggers WARN_ON(irqs_disabled()) in dma_free_coherent().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
16 years agoide: ide_unregister() warm-plug bugfix
Bartlomiej Zolnierkiewicz [Sat, 5 Jul 2008 18:30:51 +0000 (20:30 +0200)]
ide: ide_unregister() warm-plug bugfix

Fix ide_unregister() to work for ports with no devices attached to them.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
16 years agoide: fix hwif->gendev refcounting
Bartlomiej Zolnierkiewicz [Sat, 5 Jul 2008 18:30:51 +0000 (20:30 +0200)]
ide: fix hwif->gendev refcounting

class->dev_release is called by device_release() iff dev->release
is not present so ide_port_class_release() is never called and the
last hwif->gendev reference is not dropped.

Fix it by removing ide_port_class_release() and get_device() call
from ide_register_port() (device_create_drvdata() takes a hwif->gendev
reference anyway).

This patch fixes hang on wait_for_completion(&hwif->gendev_rel_comp)
in ide_unregister() reported by Pavel Machek.

Cc: Pavel Machek <pavel@suse.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
16 years agosoftlockup: print a module list on being stuck
Arjan van de Ven [Mon, 16 Jun 2008 22:51:08 +0000 (15:51 -0700)]
softlockup: print a module list on being stuck

Most places in the kernel that go BUG: print a module list
(which is very useful for doing statistics and finding patterns),
however the softlockup detector does not do this yet.

This patch adds the one line change to fix this gap.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/s2ram-fix' into x86/urgent
Ingo Molnar [Sat, 5 Jul 2008 06:42:45 +0000 (08:42 +0200)]
Merge branch 'x86/s2ram-fix' into x86/urgent