]> err.no Git - linux-2.6/log
linux-2.6
17 years agox86_64: Remove outdated comment in boot decompressor Makefile
Andi Kleen [Sun, 22 Jul 2007 09:12:46 +0000 (11:12 +0200)]
x86_64: Remove outdated comment in boot decompressor Makefile

64bit code in there now since some time.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Squash initial_code modpost warnings
Andi Kleen [Sun, 22 Jul 2007 09:12:45 +0000 (11:12 +0200)]
x86_64: Squash initial_code modpost warnings

Get rid of warnings like

WARNING: vmlinux.o(.bootstrap.text+0x1a8): Section mismatch: reference to .init.text:x86_64_start_kernel (between 'initial_code' and 'init_rsp')

- Move initialization code into .text.head like i386 because modpost knows about this already
- Mark initial_code .initdata

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fix section mismatch warning in init.c
Sam Ravnborg [Sun, 22 Jul 2007 09:12:44 +0000 (11:12 +0200)]
x86_64: fix section mismatch warning in init.c

Fix following warning:
WARNING: vmlinux.o(.text+0x188ea): Section mismatch: reference to .init.text:__alloc_bootmem_core (between 'alloc_bootmem_high_node' and 'get_gate_vma')

alloc_bootmem_high_node() is only used from __init scope so declare it __init.
And in addition declare the weak variant __init too.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fix section mismatch warning in hpet.c
Sam Ravnborg [Sun, 22 Jul 2007 09:12:42 +0000 (11:12 +0200)]
x86_64: fix section mismatch warning in hpet.c

Fix following warnings:
WARNING: vmlinux.o(.text+0x945e): Section mismatch: reference to .init.text:__set_fixmap (between 'hpet_arch_init' and 'hpet_mask_rtc_irq_bit')
WARNING: vmlinux.o(.text+0x9474): Section mismatch: reference to .init.text:__set_fixmap (between 'hpet_arch_init' and 'hpet_mask_rtc_irq_bit')

hpet_arch_init is only used from __init context so mark it __init.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Fix the K7 NMI watchdog checkbit
Björn Steinbrink [Sun, 22 Jul 2007 09:12:41 +0000 (11:12 +0200)]
i386: Fix the K7 NMI watchdog checkbit

The performance counters on K7 are only 48 bits wide, so using bit 63 to
check if the counter overflowed is wrong. Let's use bit 47 instead.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Use patchable lock prefix in set_64bit
Andi Kleen [Sun, 22 Jul 2007 09:12:40 +0000 (11:12 +0200)]
i386: Use patchable lock prefix in set_64bit

Previously lock was unconditionally used, but shouldn't be needed on
UP systems.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Handle P6s without performance counters in nmi watchdog
Andi Kleen [Sun, 22 Jul 2007 09:12:39 +0000 (11:12 +0200)]
i386: Handle P6s without performance counters in nmi watchdog

I got an oops while booting a 32bit kernel on KVM because it doesn't
implement performance counters used by the NMI watchdog. Handle this
case.

Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: Replace NSC/Cyrix specific chipset access macros by inlined functions.
Juergen Beisert [Sun, 22 Jul 2007 09:12:38 +0000 (11:12 +0200)]
x86: Replace NSC/Cyrix specific chipset access macros by inlined functions.

Due to index register access ordering problems, when using macros a line
like this fails (and does nothing):

setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);

With inlined functions this line will work as expected.

Note about a side effect: Seems on Geode GX1 based systems the
"suspend on halt power saving feature" was never enabled due to this
wrong macro expansion. With inlined functions it will be enabled, but
this will stop the TSC when the CPU runs into a HLT instruction.
Kernel output something like this:
Clocksource tsc unstable (delta = -472746897 ns)

This is the 3rd version of this patch.

 - Adding missed arch/i386/kernel/cpu/mtrr/state.c
Thanks to Andres Salomon
 - Adding some big fat comments into the new header file
  Suggested by Andi Kleen

AK: fixed x86-64 compilation

Signed-off-by: Juergen Beisert <juergen@kreuzholzen.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: x86_64 - Use non locked version for local_cmpxchg()
Mathieu Desnoyers [Sun, 22 Jul 2007 09:12:37 +0000 (11:12 +0200)]
x86_64: x86_64 - Use non locked version for local_cmpxchg()

local_cmpxchg() should not use any LOCK prefix.  This change probably
got lost in the move to cmpxchg.h.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Do not include other cpus' interrupt 0 in nmi_watchdog
Keith Owens [Sun, 22 Jul 2007 09:12:36 +0000 (11:12 +0200)]
i386: Do not include other cpus' interrupt 0 in nmi_watchdog

kstat_irqs(0) includes the count of interrupt 0 from all cpus, not just
the current cpu.  The updated interrupt 0 on other cpus can stop the
nmi_watchdog from tripping, so only include the current cpu's int 0.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Tune AMD Fam10h/11h like K8
Andi Kleen [Sun, 22 Jul 2007 09:12:35 +0000 (11:12 +0200)]
i386: Tune AMD Fam10h/11h like K8

This mainly changes the nops for alternative, so not very revolutionary.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Set K8 CPUID flag for K8/Fam10h/Fam11h
Andi Kleen [Sun, 22 Jul 2007 09:12:34 +0000 (11:12 +0200)]
x86_64: Set K8 CPUID flag for K8/Fam10h/Fam11h

Previously this flag was only used on 32bit, but some shared code can use
it now.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Fix cpu_llc_id section mismatch warning
Andi Kleen [Sun, 22 Jul 2007 09:12:33 +0000 (11:12 +0200)]
i386: Fix cpu_llc_id section mismatch warning

Fix

WARNING: arch/i386/kernel/built-in.o(.text+0xdd0d): Section mismatch: reference to .init.data:cpu_llc_id (between 'set_cpu_sibling_map' and 'initialize_secondary')
WARNING: arch/i386/kernel/built-in.o(.text+0xdd1b): Section mismatch: reference to .init.data:cpu_llc_id (between 'set_cpu_sibling_map' and 'initialize_secondary')

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: Stop MCEs and NMIs during code patching
Andi Kleen [Sun, 22 Jul 2007 09:12:32 +0000 (11:12 +0200)]
x86: Stop MCEs and NMIs during code patching

When a machine check or NMI occurs while multiple byte code is patched
the CPU could theoretically see an inconsistent instruction and crash.
Prevent this by temporarily disabling MCEs and returning early in the
NMI handler.

Based on discussion with Mathieu Desnoyers.

Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: Fix alternatives and kprobes to remap write-protected kernel text
Andi Kleen [Sun, 22 Jul 2007 09:12:31 +0000 (11:12 +0200)]
x86: Fix alternatives and kprobes to remap write-protected kernel text

Reenable kprobes and alternative patching when the kernel text is write
protected by DEBUG_RODATA

Add a general utility function to change write protected text.  The new
function remaps the code using vmap to write it and takes care of CPU
synchronization.  It also does CLFLUSH to make icache recovery faster.

There are some limitations on when the function can be used, see the
comment.

This is a newer version that also changes the paravirt_ops code.
text_poke also supports multi byte patching now.

Contains bug fixes from Zach Amsden and suggestions from Mathieu
Desnoyers.

Cc: Jan Beulich <jbeulich@novell.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Zach Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Use read and write crX in .c files
Glauber de Oliveira Costa [Sun, 22 Jul 2007 09:12:29 +0000 (11:12 +0200)]
x86_64: Use read and write crX in .c files

This patch uses the read and write functions provided at system.h
for control registers instead of writting raw assembly over and
over again in .c files. Functions to manipulate cr2 and cr8 were
provided, as they were lacking.

Also, removed some extra space after closing brackets

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: i386-show-unhandled-signals-v3
Masoud Asgharifard Sharbiani [Sun, 22 Jul 2007 09:12:28 +0000 (11:12 +0200)]
x86: i386-show-unhandled-signals-v3

This patch makes the i386 behave the same way that x86_64 does when a
segfault happens.  A line gets printed to the kernel log so that tools
that need to check for failures can behave more uniformly between
debug.show_unhandled_signals sysctl variable to 0 (or by doing echo 0 >
/proc/sys/debug/exception-trace)

Also, all of the lines being printed are now using printk_ratelimit() to
deny the ability of DoS from a local user with a program like the
following:

main()
{
       while (1)
               if (!fork()) *(int *)0 = 0;
}

This new revision also includes the fix that Andrew did which got rid of
new sysctl that was added to the system in earlier versions of this.
Also, 'show-unhandled-signals' sysctl has been renamed back to the old
'exception-trace' to avoid breakage of people's scripts.

AK: Enabling by default for i386 will be likely controversal, but let's see what happens
AK: Really folks, before complaining just fix your segfaults
AK: I bet this will find a lot of silent issues

Signed-off-by: Masoud Sharbiani <masouds@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
[ Personally, I've found the complaints useful on x86-64, so I'm all for
  this. That said, I wonder if we could do it more prettily..   -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoFix ppc64 mismerge
Al Viro [Sun, 22 Jul 2007 07:10:35 +0000 (08:10 +0100)]
Fix ppc64 mismerge

Fix a mismerge in commit 8b6f50ef1d5cc86b278eb42bc91630fad455fb10:
"spufs: make signal-notification files readonly for NOSCHED contexts",
where structs got duplicated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Sun, 22 Jul 2007 03:39:59 +0000 (20:39 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Add missing entries to family name tables
  [NET]: Make NETDEVICES depend on NET.
  [IPV6]: endianness bug in ip6_tunnel
  [IrDA]: TOSHIBA_FIR depends on virt_to_bus
  [IrDA]: EP7211 IR driver port to the latest SIR API
  [IrDA] Typo fix in irnetlink.c copyright
  [NET]: Fix loopback crashes when multiqueue is enabled.
  [IPV4]: Fix inetpeer gcc-4.2 warnings

17 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Sun, 22 Jul 2007 03:38:51 +0000 (20:38 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: ERROR: "sys_ioctl" [arch/sparc64/solaris/solaris.ko] undefined!
  [SPARC32]: Make PAGE_SHARED a read-mostly variable.
  [SPARC32]: Take enable_irq/disable_irq out of line.
  [SPARC32]: clean include/asm-sparc/irq.h
  [SPARC32]: Fix rounding errors in ndelay/udelay implementation.

17 years ago[NET]: Add missing entries to family name tables
David Howells [Sun, 22 Jul 2007 02:30:16 +0000 (19:30 -0700)]
[NET]: Add missing entries to family name tables

Add missing entries to af_family_clock_key_strings[].

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC64]: ERROR: "sys_ioctl" [arch/sparc64/solaris/solaris.ko] undefined!
Christoph Hellwig [Sun, 22 Jul 2007 02:22:42 +0000 (19:22 -0700)]
[SPARC64]: ERROR: "sys_ioctl" [arch/sparc64/solaris/solaris.ko] undefined!

From: Christoph Hellwig <hch@infradead.org>

On Fri, Jul 20, 2007 at 09:24:42AM -0400, Horst H. von Brand wrote:
> When building v2.6.22-3478-g275afca on sparc64 (.config attached) I get:
>
>   MODPOST vmlinux
>   Building modules, stage 2.
>   MODPOST 463 modules
> ERROR: "sys_ioctl" [arch/sparc64/solaris/solaris.ko] undefined!

Sorry, my fault.

It looked to me like sparc64 exports sys_ioctl on it's own, but it
only exports compat_sys_ioctl on it's own.

Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC32]: Make PAGE_SHARED a read-mostly variable.
Al Viro [Sun, 22 Jul 2007 02:20:34 +0000 (19:20 -0700)]
[SPARC32]: Make PAGE_SHARED a read-mostly variable.

same scheme as for sparc64, same rationale

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC32]: Take enable_irq/disable_irq out of line.
Al Viro [Sun, 22 Jul 2007 02:19:38 +0000 (19:19 -0700)]
[SPARC32]: Take enable_irq/disable_irq out of line.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC32]: clean include/asm-sparc/irq.h
Al Viro [Sun, 22 Jul 2007 02:18:57 +0000 (19:18 -0700)]
[SPARC32]: clean include/asm-sparc/irq.h

Move stuff used only by arch/sparc/kernel/* into arch/sparc/kernel/irq.h
and into individual files in there (e.g. macros internal to sun4m_irq.c,
etc.)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[SPARC32]: Fix rounding errors in ndelay/udelay implementation.
Mark Fortescue [Sun, 22 Jul 2007 02:17:41 +0000 (19:17 -0700)]
[SPARC32]: Fix rounding errors in ndelay/udelay implementation.

__ndelay and __udelay have not been delayung >= specified time.
The problem with __ndelay has been tacked down to the rounding of the
multiplier constant. By changing this, delays > app 18us are correctly
calculated.
The problem with __udelay has also been tracked down to rounding issues.
Changing the multiplier constant (to match that used in sparc64) corrects
for large delays and adding in a rounding constant corrects for trunctaion
errors in the claculations.
Many short delays will return without looping. This is not an error as there
is the fixed delay of doing all the maths to calculate the loop count.

Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[NET]: Make NETDEVICES depend on NET.
Jan Engelhardt [Sun, 22 Jul 2007 02:11:35 +0000 (19:11 -0700)]
[NET]: Make NETDEVICES depend on NET.

Enabling drivers from "Devices > Networking" (in menuconfig), for
example SLIP and/or PLIP, throws link time errors when CONFIG_NET itself
is =n. Have CONFIG_NETDEVICES depend on CONFIG_NET.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IPV6]: endianness bug in ip6_tunnel
Al Viro [Sun, 22 Jul 2007 02:09:41 +0000 (19:09 -0700)]
[IPV6]: endianness bug in ip6_tunnel

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: TOSHIBA_FIR depends on virt_to_bus
Stephen Rothwell [Sun, 22 Jul 2007 02:08:13 +0000 (19:08 -0700)]
[IrDA]: TOSHIBA_FIR depends on virt_to_bus

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA]: EP7211 IR driver port to the latest SIR API
Samuel Ortiz [Sun, 22 Jul 2007 02:07:33 +0000 (19:07 -0700)]
[IrDA]: EP7211 IR driver port to the latest SIR API

The EP7211 SIR driver was the only one left without a new SIR API port.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years ago[IrDA] Typo fix in irnetlink.c copyright
Samuel Ortiz [Sun, 22 Jul 2007 02:06:53 +0000 (19:06 -0700)]
[IrDA] Typo fix in irnetlink.c copyright

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
17 years agox86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata
Muli Ben-Yehuda [Sat, 21 Jul 2007 21:23:39 +0000 (00:23 +0300)]
x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata

This patch introduces struct pci_sysdata to x86 and x86-64, and
converts the existing two users (NUMA, Calgary) to use it.

This lays the groundwork for having other users of sysdata, such as
the PCI domains work.

The Calgary bits are tested, the NUMA bits just look ok.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: make k8topology multi-core aware
Joachim Deguara [Sat, 21 Jul 2007 15:11:44 +0000 (17:11 +0200)]
x86_64: make k8topology multi-core aware

This makes k8topology multicore aware instead of limited to signle- and
dual-core CPUs.  It uses the CPUID to be more future proof.

Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: remove __smp_alt* sections
Jan Beulich [Sat, 21 Jul 2007 15:11:42 +0000 (17:11 +0200)]
x86_64: remove __smp_alt* sections

Leftovers from the removal of the more general (but abandoned) SMP
alternatives.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Update alignment when 4K stacks are used.
Robert P. J. Day [Sat, 21 Jul 2007 15:11:41 +0000 (17:11 +0200)]
i386: Update alignment when 4K stacks are used.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: remove old IRQ balancing debug cruft
Stefan Richter [Sat, 21 Jul 2007 15:11:40 +0000 (17:11 +0200)]
i386: remove old IRQ balancing debug cruft

Dead or misnamed CONFIG_BALANCED_IRQ_DEBUG found by Robert P. J. Day.
It's not a Kconfig variable.

Since this debug code is ancient, I suggest to get rid of this
misleading CONFIG_ macro by deleting all of this debug code.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Robert P. J. Day" <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: insert HPET firmware resource after PCI enumeration has completed
Aaron Durbin [Sat, 21 Jul 2007 15:11:39 +0000 (17:11 +0200)]
i386: insert HPET firmware resource after PCI enumeration has completed

Insert HPET resources after pci probing has been completed in order to
avoid resource conflicts with PCI resource reservation.  With this change
the HPET firmware resources will be identified, but it should also not
cause issues when the HPET address falls on a BAR in a PCI device, and the
PCI enumeration cannot reserve the resources.

Signed-off-by: Aaron Durbin <adurbin@google.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: basic infrastructure support for AMD geode-class machines
Andres Salomon [Sat, 21 Jul 2007 15:11:38 +0000 (17:11 +0200)]
i386: basic infrastructure support for AMD geode-class machines

This builds upon the existing geode infrastructure, but adds southbridge
support, some GPIO functions, and a header file (asm-i386/geode.h) with some
useful GX/LX detection tests.

The majority of this code was written by Jordan Crouse.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: arch/x86_64/kernel/e820.c lower printk severity
Dan Aloni [Sat, 21 Jul 2007 15:11:37 +0000 (17:11 +0200)]
x86_64: arch/x86_64/kernel/e820.c lower printk severity

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: arch/x86_64/kernel/aperture.c lower printk severity
Dan Aloni [Sat, 21 Jul 2007 15:11:36 +0000 (17:11 +0200)]
x86_64: arch/x86_64/kernel/aperture.c lower printk severity

Users that use kernel log filtering (e.g.  via syslogd or a proprietry method)
wouldn't like to see warning prints that are not really warnings.

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: fix iounmap's use of vm_struct's size field
Jeremy Fitzhardinge [Sat, 21 Jul 2007 15:11:35 +0000 (17:11 +0200)]
i386: fix iounmap's use of vm_struct's size field

get_vm_area always returns an area with an adjacent guard page.  That guard
page is included in vm_struct.size.  iounmap uses vm_struct.size to
determine how much address space needs to have change_page_attr applied to
it, which will BUG if applied to the guard page.

This patch adds a helper function - get_vm_area_size() in linux/vmalloc.h -
to return the actual size of a vm area, and uses it to make iounmap do the
right thing.  There are probably other places which should be using
get_vm_area_size().

Thanks to Dave Young <hidave.darkstar@gmail.com> for debugging the
problem.

[ Andi, it wasn't clear to me whether x86_64 needs the same fix. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: move PIT function declarations and constants to correct header file
Thomas Gleixner [Sat, 21 Jul 2007 15:11:34 +0000 (17:11 +0200)]
i386: move PIT function declarations and constants to correct header file

setup_pit_timer is declared in asm-i386/timer.h.  Move it to the pit header
file, so it can be used by x86_64 as well.

Move also the PIT constants.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: hpet assumes boot cpu is 0
Chris Wright [Sat, 21 Jul 2007 15:11:33 +0000 (17:11 +0200)]
i386: hpet assumes boot cpu is 0

I fixed this in x86_64.  Looks like the kind of thing that will break voyager
on i386.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: remove volatile in apic.c
Thomas Gleixner [Sat, 21 Jul 2007 15:11:32 +0000 (17:11 +0200)]
i386: remove volatile in apic.c

Remove the volatile in apic.  We have a cpu_relax() in the wait loop.  Fix a
coding style issue while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: move iommu declaration from proto to iommu.h
Yinghai Lu [Sat, 21 Jul 2007 15:11:31 +0000 (17:11 +0200)]
x86_64: move iommu declaration from proto to iommu.h

[akpm@linux-foundation.org: build fix]
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: disable srat when numa emulation succeeds
David Rientjes [Sat, 21 Jul 2007 15:11:30 +0000 (17:11 +0200)]
x86_64: disable srat when numa emulation succeeds

When NUMA emulation succeeds, acpi_numa needs to be set to -1 so that
srat_disabled() will always return true.  We won't be calling
acpi_scan_nodes() or registering the true nodes we've found.

[hugh@veritas.com: Fix x86_64 CONFIG_NUMA_EMU build: acpi_numa needs CONFIG_ACPI_NUMA]
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fix e820_hole_size based on address ranges
David Rientjes [Sat, 21 Jul 2007 15:11:29 +0000 (17:11 +0200)]
x86_64: fix e820_hole_size based on address ranges

e820_hole_size() now uses the newly extracted helper function,
e820_find_active_region(), to determine the size of usable RAM in a range of
PFN's.

This was previously broken because of two reasons:

 - The start and end PFN's of each e820 entry were not properly rounded
   prior to excluding those entries in the range, and

 - Entries smaller than a page were not properly excluded from being
   accumulated.

This resulted in emulated nodes being incorrectly mapped to ranges that
were completely reserved and not candidates for being registered as
active ranges.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: disable the GART in shutdown
Yinghai Lu [Sat, 21 Jul 2007 15:11:28 +0000 (17:11 +0200)]
x86_64: disable the GART in shutdown

For K8 system: 4G RAM with memory hole remapping enabled, or more than 4G
RAM installed.  when using kexec to load second kernel.  In the second
kernel, when mem is allocated for GART, it will do the memset for clear, it
will cause restart, because some device still used that for dma.  solution
will be:

in second kernel: disable that at first before we try to allocate mem for
it.  or in the first kernel: do disable that before shutdown.
Andi/Eric/Alan prefer to second one for clean shutdown in first kernel.
Andi also point out need to consider to AGP enable but mem less 4G case
too.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: replace hard-coded constant with appropriate macro from kernel.h
Robert P. J. Day [Sat, 21 Jul 2007 15:11:26 +0000 (17:11 +0200)]
i386: replace hard-coded constant with appropriate macro from kernel.h

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: add cpu_relax() to cmos_lock()
Andreas Mohr [Sat, 21 Jul 2007 15:11:25 +0000 (17:11 +0200)]
i386: add cpu_relax() to cmos_lock()

Add cpu_relax() to cmos_lock() inline function for faster operation on SMT
CPUs and less power consumption on others in case of lock contention (which
probably doesn't happen too often, so admittedly this patch is not too
exciting).

[akpm@linux-foundation.org: Include the header file for cpu_relax()]
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: flush_tlb_kernel_range() warning fix
Andrew Morton [Sat, 21 Jul 2007 15:11:24 +0000 (17:11 +0200)]
x86_64: flush_tlb_kernel_range() warning fix

mm/vmalloc.c: In function 'unmap_kernel_range':
mm/vmalloc.c:75: warning: unused variable 'start'

make it a C function so that the compiler thinks it used its arguments.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: change _map_single to static in pci_gart.c etc
Yinghai Lu [Sat, 21 Jul 2007 15:11:23 +0000 (17:11 +0200)]
x86_64: change _map_single to static in pci_gart.c etc

This function is called via dma_ops->.., so change it to static

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Geode HW Random Number Generator depends on X86_32
Yinghai Lu [Sat, 21 Jul 2007 15:11:22 +0000 (17:11 +0200)]
x86_64: Geode HW Random Number Generator depends on X86_32

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fix wrong comment regarding set_fixmap()
Jiri Kosina [Sat, 21 Jul 2007 15:11:21 +0000 (17:11 +0200)]
x86_64: fix wrong comment regarding set_fixmap()

The function name is set_fixmap(), not fixmap_set() as stated in the comment.

Also fix a typo, punctuation and lower/uppercase a bit.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: lower printk severity
Dan Aloni [Sat, 21 Jul 2007 15:11:20 +0000 (17:11 +0200)]
x86_64: lower printk severity

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fix typo in acpi_pm.c
Alessio Igor Bogani [Sat, 21 Jul 2007 15:11:19 +0000 (17:11 +0200)]
x86_64: fix typo in acpi_pm.c

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: use the global PIT lock
Thomas Gleixner [Sat, 21 Jul 2007 15:11:18 +0000 (17:11 +0200)]
x86_64: use the global PIT lock

Replace the pcspkr private PIT lock by the global PIT lock to serialize the
PIT access all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: During VM oom condition, kill all threads in process group
Will Schmidt [Sat, 21 Jul 2007 15:11:17 +0000 (17:11 +0200)]
x86_64: During VM oom condition, kill all threads in process group

During a VM oom condition, kill all threads in the process group.

We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory condition.

Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow for the
application to restart, or otherwise handled, and makes it very obvious that
something has gone wrong.

This change allows the entire process group to be taken down, rather than just
the one thread.

Signed-off-by: Will <will_schmidt@vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Move functions declarations to header file
Glauber de Oliveira Costa [Sat, 21 Jul 2007 15:11:16 +0000 (17:11 +0200)]
x86_64: Move functions declarations to header file

Some interrupt entry points are currently defined in i8259.c They probably
belong in a header.  Right now, their only user is init_IRQ, justifying
their declaration in-file.  But when virtualization comes in, we may be
interested in using that functions in late initializations.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: move the kernel to 16MB for NUMA-Q
Andy Whitcroft [Sat, 21 Jul 2007 15:11:15 +0000 (17:11 +0200)]
i386: move the kernel to 16MB for NUMA-Q

We are seeing corruption of the decompressed kernel.  It is suspected that
this is platform specific as it has yet to be seen on any other x86.  Move
the kernel to the 16MB boundary.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: Remove unneeded test of 'task' in dump_trace()
Jesper Juhl [Sat, 21 Jul 2007 15:11:14 +0000 (17:11 +0200)]
i386: Remove unneeded test of 'task' in dump_trace()

Remove unneeded test of task != NULL from
arch/i386/kernel/traps.c::dump_trace()

At the start of the function we have this test:
        if (!task)
                task = current;
so further down there's no need to test 'task'.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G
William Lee Irwin III [Sat, 21 Jul 2007 15:11:13 +0000 (17:11 +0200)]
i386: divorce CONFIG_X86_PAE from CONFIG_HIGHMEM64G

PAE is useful for more than supporting more than 4GB RAM.  It supports
expanded swapspace and NX executable protections.  Some users may want NX
or expanded swapspace support without the overhead or instability of
highmem.  For these reasons, the following patch divorces CONFIG_X86_PAE
from CONFIG_HIGHMEM64G.

Cc: Mark Lord <lkml@rtr.ca>
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: HPET, check if the counter works
Thomas Gleixner [Sat, 21 Jul 2007 15:11:12 +0000 (17:11 +0200)]
i386: HPET, check if the counter works

Some systems have a HPET which is not incrementing, which leads to a
complete hang.  Detect it during HPET setup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: DMI_MATCH patch in reboot.c for SFF Dell OptiPlex 745 - fixes hang on reboot
James Jarvis [Sat, 21 Jul 2007 15:11:11 +0000 (17:11 +0200)]
i386: DMI_MATCH patch in reboot.c for SFF Dell OptiPlex 745 - fixes hang on reboot

The following patch enables reboot through BIOS on the Dell Optiplex 745
Small Form Factor base, on which reboot hangs.  The larger form factor does
not require this, hence the match on DMI_BOARD_NAME.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: do not restore reserved memory after hibernation
Rafael J. Wysocki [Sat, 21 Jul 2007 15:11:09 +0000 (17:11 +0200)]
i386: do not restore reserved memory after hibernation

On some systems the ACPI NVS area is located in the first 1 MB of RAM and
it is overwritten by the i386 code during the restore after hibernation.
This confuses the ACPI platform firmware that doesn't update the AC adapter
status appropriately as a result
(http://bugzilla.kernel.org/show_bug.cgi?id=7995).

The solution is to register the reserved memory in the first 1 MB as
'nosave', so that swsusp doesn't touch it during the restore.  Also, this
has been done on x86_64 for a long time now, so this patch makes the i386
restore code behave like the x86_64 one.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: fix section mismatch warning in intel_cacheinfo
Sam Ravnborg [Sat, 21 Jul 2007 15:11:08 +0000 (17:11 +0200)]
i386: fix section mismatch warning in intel_cacheinfo

Fix following warning:
WARNING: arch/i386/kernel/built-in.o(.init.text+0x3818): Section mismatch: reference to .exit.text:cache_remove_dev (between 'cacheinfo_cpu_callback' and 'cache_sysfs_init')

It points out that a function marked __cpuexit is calling a function marked
__cpuinit => oops.

The call happens only in an error-condition which may explain why we have
not seen it before.

The offending function was not used anywhere else - so marked it __cpuexit.

Note: This warning triggers only with a local copy of modpost
      but that version will soon be pushed out.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: pgd_{c,d}tor() static
Adrian Bunk [Sat, 21 Jul 2007 15:11:07 +0000 (17:11 +0200)]
i386: pgd_{c,d}tor() static

pgd_{c,d}tor() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Calgary - fold in redundant functions
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:06 +0000 (17:11 +0200)]
x86_64: Calgary - fold in redundant functions

After the bitmap changes we can get rid of the unlocked versions of
calgary_unmap_sg and iommu_free. Fold __calgary_unmap_sg and
__iommu_free into their calgary_unmap_sg and iommu_free, respectively.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Calgary - change _map_single, etc to static
Yinghai Lu [Sat, 21 Jul 2007 15:11:05 +0000 (17:11 +0200)]
x86_64: Calgary - change _map_single, etc to static

there function are called via dma_ops->.., so change them to static

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Calgary - tighten up the bitmap locking
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:04 +0000 (17:11 +0200)]
x86_64: Calgary - tighten up the bitmap locking

Currently the IOMMU table's lock protects both the bitmap and access
to the hardware's TCE table. Access to the TCE table is synchronized
through the bitmap; therefore, only hold the lock while modifying the
bitmap. This gives a yummy 10-15% reduction in CPU utilization for
netperf on a large SMP machine.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: Calgary - fix few style problems pointed out by checkpatch.pl
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:03 +0000 (17:11 +0200)]
x86_64: Calgary - fix few style problems pointed out by checkpatch.pl

No actual code was harmed in the production of this patch.

Thanks to Andrew Morton <akpm@linux-foundation.org> for telling me
about checkpatch.pl.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: tidy up debug printks
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:02 +0000 (17:11 +0200)]
x86_64: tidy up debug printks

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: only reserve the first 1MB of IO space for CalIOC2
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:01 +0000 (17:11 +0200)]
x86_64: only reserve the first 1MB of IO space for CalIOC2

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: tabify and trim trailing whitespace
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:11:00 +0000 (17:11 +0200)]
x86_64: tabify and trim trailing whitespace

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: cleanup of unneeded macros
Guillaume Thouvenin [Sat, 21 Jul 2007 15:10:59 +0000 (17:10 +0200)]
x86_64: cleanup of unneeded macros

Cleanup unneeded macros used for register space address calculation.
Now we are using the EBDA to find the space address.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: reserve TCEs with the same address as MEM regions
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:58 +0000 (17:10 +0200)]
x86_64: reserve TCEs with the same address as MEM regions

This works around a bug where DMAs that have the same addresses as
some MEM regions do not go through. Not clear yet if this is due to a
mis-configuration or something deeper.

[akpm@linux-foundation.org: coding style fixlet]
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: grab PLSSR too when a DMA error occurs
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:57 +0000 (17:10 +0200)]
x86_64: grab PLSSR too when a DMA error occurs

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: make dump_error_regs a chip op
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:55 +0000 (17:10 +0200)]
x86_64: make dump_error_regs a chip op

Provide seperate versions for Calgary and CalIOC2

Also print out the PCIe Root Complex Status on CalIOC2 errors

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: implement CalIOC2 TCE cache flush sequence
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:54 +0000 (17:10 +0200)]
x86_64: implement CalIOC2 TCE cache flush sequence

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: add chip_ops and a quirk function for CalIOC2
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:53 +0000 (17:10 +0200)]
x86_64: add chip_ops and a quirk function for CalIOC2

[akpm@linux-foundation.org>: make calioc2_chip_ops static]
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: introduce CalIOC2 support
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:52 +0000 (17:10 +0200)]
x86_64: introduce CalIOC2 support

CalIOC2 is a PCI-e implementation of the Calgary logic. Most of the
programming details are the same, but some differ, e.g., TCE cache
flush. This patch introduces CalIOC2 support - detection and various
support routines. It's not expected to work yet (but will with
follow-on patches).

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: abstract how we find the iommu_table for a device
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:51 +0000 (17:10 +0200)]
x86_64: abstract how we find the iommu_table for a device

... in preparation for doing it differently for CalIOC2.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: introduce chipset specific ops
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:50 +0000 (17:10 +0200)]
x86_64: introduce chipset specific ops

Calgary and CalIOC2 share most of the same logic. Introduce struct
cal_chipset_ops for quirks and tce flush logic which are

[akpm@linux-foundation.org: make calgary_chip_ops static]
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: introduce handle_quirks() for various chipset quirks
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:49 +0000 (17:10 +0200)]
x86_64: introduce handle_quirks() for various chipset quirks

Move the aic94xx split completion timeout handling there.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: update copyright notice
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:48 +0000 (17:10 +0200)]
x86_64: update copyright notice

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: generalize calgary_increase_split_completion_timeout
Muli Ben-Yehuda [Sat, 21 Jul 2007 15:10:47 +0000 (17:10 +0200)]
x86_64: generalize calgary_increase_split_completion_timeout

... will be used by CalIOC2 later

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: remove support for the Rise CPU
Adrian Bunk [Sat, 21 Jul 2007 15:10:46 +0000 (17:10 +0200)]
x86: remove support for the Rise CPU

The Rise CPUs were only very short-lived, and there are no reports of
anyone both owning one and running Linux on it.

Googling for the printk string "CPU: Rise iDragon" didn't find any dmesg
available online.

If it turns out that against all expectations there are actually users
reverting this patch would be easy.

This patch will make the kernel images smaller by a few bytes for all
i386 users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: check remote IRR bit before migrating level triggered irq
Eric W. Biederman [Sat, 21 Jul 2007 15:10:45 +0000 (17:10 +0200)]
x86_64: check remote IRR bit before migrating level triggered irq

On x86_64 kernel, level triggered irq migration gets initiated in the
context of that interrupt(after executing the irq handler) and following
steps are followed to do the irq migration.

1. mask IOAPIC RTE entry;     // write to IOAPIC RTE
2. EOI;                       // processor EOI write
3. reprogram IOAPIC RTE entry // write to IOAPIC RTE with new destination and
                              // and interrupt vector due to per cpu vector
                              // allocation.
4. unmask IOAPIC RTE entry;   // write to IOAPIC RTE

Because of the per cpu vector allocation in x86_64 kernels, when the irq
migrates to a different cpu, new vector(corresponding to the new cpu) will
get allocated.

An EOI write to local APIC has a side effect of generating an EOI write for
level trigger interrupts (normally this is a broadcast to all IOAPICs).
The EOI broadcast generated as a side effect of EOI write to processor may
be delayed while the other IOAPIC writes (step 3 and 4) can go through.

Normally, the EOI generated by local APIC for level trigger interrupt
contains vector number.  The IOAPIC will take this vector number and search
the IOAPIC RTE entries for an entry with matching vector number and clear
the remote IRR bit (indicate EOI).  However, if the vector number is
changed (as in step 3) the IOAPIC will not find the RTE entry when the EOI
is received later.  This will cause the remote IRR to get stuck causing the
interrupt hang (no more interrupt from this RTE).

Current x86_64 kernel assumes that remote IRR bit is cleared by the time
IOAPIC RTE is reprogrammed.  Fix this assumption by checking for remote IRR
bit and if it still set, delay the irq migration to the next interrupt
arrival event(hopefully, next time remote IRR bit will get cleared before
the IOAPIC RTE is reprogrammed).

Initial analysis and patch from Nanhai.

Clean up patch from Suresh.

Rewritten to be less intrusive, and to contain a big fat comment by Eric.

[akpm@linux-foundation.org: fix comments]
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nanhai Zou <nanhai.zou@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Keith Packard <keith.packard@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: round_jiffies() for i386 and x86-64 non-critical/corrected MCE polling
Venki Pallipadi [Sat, 21 Jul 2007 15:10:44 +0000 (17:10 +0200)]
x86: round_jiffies() for i386 and x86-64 non-critical/corrected MCE polling

This helps to reduce the frequency at which the CPU must be taken out of a
lower-power state.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Tim Hockin <thockin@hockin.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: add reference to the arguments
Andrew Morton [Sat, 21 Jul 2007 15:10:43 +0000 (17:10 +0200)]
i386: add reference to the arguments

Prevent stuff like this:

mm/vmalloc.c: In function 'unmap_kernel_range':
mm/vmalloc.c:75: warning: unused variable 'start'

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: Make Alt-SysRq-p display the debug register contents
Alan Stern [Sat, 21 Jul 2007 15:10:42 +0000 (17:10 +0200)]
x86: Make Alt-SysRq-p display the debug register contents

This patch (as921) adds code to the show_regs() routine in i386 and x86_64
to print the contents of the debug registers along with all the others.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86: PM_TRACE support
Nigel Cunningham [Sat, 21 Jul 2007 15:10:41 +0000 (17:10 +0200)]
x86: PM_TRACE support

Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: fix section mismatch warnings in mtrr
Sam Ravnborg [Sat, 21 Jul 2007 15:10:39 +0000 (17:10 +0200)]
i386: fix section mismatch warnings in mtrr

Following section mismatch warnings were reported by Andrey Borzenkov:

WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text:amd_init_mtrr from .text between 'mtrr_bp_init' (at offset 0x967a) and 'mtrr_attrib_to_str'
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text:cyrix_init_mtrr from .text between 'mtrr_bp_init' (at offset 0x967f) and 'mtrr_attrib_to_str'
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text:centaur_init_mtrr from .text between 'mtrr_bp_init' (at offset 0x9684) and 'mtrr_attrib_to_str'
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text: from .text between 'get_mtrr_state' (at offset 0xa735) and 'generic_get_mtrr'
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text: from .text between 'get_mtrr_state' (at offset 0xa749) and 'generic_get_mtrr'
WARNING: arch/i386/kernel/built-in.o - Section mismatch: reference to .init.text: from .text between 'get_mtrr_state' (at offset 0xa770) and 'generic_get_mtrr'

It was tracked down to a few functions missing __init tag.
Compile tested only.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: fix machine rebooting
Truxton Fulton [Sat, 21 Jul 2007 15:10:38 +0000 (17:10 +0200)]
i386: fix machine rebooting

Commit 59f4e7d572980a521b7bdba74ab71b21f5995538 fixed machine rebooting
on Truxton's machine (when no keyboard was present).  But it broke it on
Lee's machine.

The patch reinstates the old (pre-59f4e7d572980a521b7bdba74ab71b21f5995538)
code and if that doesn't work out, try the new,
post-59f4e7d572980a521b7bdba74ab71b21f5995538 code instead.

Cc: Lee Garrett <lee-in-berlin@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: mcelog tolerant level cleanup
Tim Hockin [Sat, 21 Jul 2007 15:10:37 +0000 (17:10 +0200)]
x86_64: mcelog tolerant level cleanup

Background:
 The MCE handler has several paths that it can take, depending on various
 conditions of the MCE status and the value of the 'tolerant' knob.  The
 exact semantics are not well defined and the code is a bit twisty.

Description:
 This patch makes the MCE handler's behavior more clear by documenting the
 behavior for various 'tolerant' levels.  It also fixes or enhances
 several small things in the handler.  Specifically:
     * If RIPV is set it is not safe to restart, so set the 'no way out'
       flag rather than the 'kill it' flag.
     * Don't panic() on correctable MCEs.
     * If the _OVER bit is set *and* the _UC bit is set (meaning possibly
       dropped uncorrected errors), set the 'no way out' flag.
     * Use EIPV for testing whether an app can be killed (SIGBUS) rather
       than RIPV.  According to docs, EIPV indicates that the error is
       related to the IP, while RIPV simply means the IP is valid to
       restart from.
     * Don't clear the MCi_STATUS registers until after the panic() path.
       This leaves the status bits set after the panic() so clever BIOSes
       can find them (and dumb BIOSes can do nothing).

 This patch also calls nonseekable_open() in mce_open (as suggested by akpm).

Result:
 Tolerant levels behave almost identically to how they always have, but
 not it's well defined.  There's a slightly higher chance of panic()ing
 when multiple errors happen (a good thing, IMHO).  If you take an MBE and
 panic(), the error status bits are not cleared.

Alternatives:
 None.

Testing:
 I used software to inject correctable and uncorrectable errors.  With
 tolerant = 3, the system usually survives.  With tolerant = 2, the system
 usually panic()s (PCC) but not always.  With tolerant = 1, the system
 always panic()s.  When the system panic()s, the BIOS is able to detect
 that the cause of death was an MC4.  I was not able to reproduce the
 case of a non-PCC error in userspace, with EIPV, with (tolerant < 3).
 That will be rare at best.

Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: support poll() on /dev/mcelog
Tim Hockin [Sat, 21 Jul 2007 15:10:36 +0000 (17:10 +0200)]
x86_64: support poll() on /dev/mcelog

Background:
 /dev/mcelog is typically polled manually.  This is less than optimal for
 situations where accurate accounting of MCEs is important.  Calling
 poll() on /dev/mcelog does not work.

Description:
 This patch adds support for poll() to /dev/mcelog.  This results in
 immediate wakeup of user apps whenever the poller finds MCEs.  Because
 the exception handler can not take any locks, it can not call the wakeup
 itself.  Instead, it uses a thread_info flag (TIF_MCE_NOTIFY) which is
 caught at the next return from interrupt or exit from idle, calling the
 mce_user_notify() routine.  This patch also disables the "fake panic"
 path of the mce_panic(), because it results in printk()s in the exception
 handler and crashy systems.

 This patch also does some small cleanup for essentially unused variables,
 and moves the user notification into the body of the poller, so it is
 only called once per poll, rather than once per CPU.

Result:
 Applications can now poll() on /dev/mcelog.  When an error is logged
 (whether through the poller or through an exception) the applications are
 woken up promptly.  This should not affect any previous behaviors.  If no
 MCEs are being logged, there is no overhead.

Alternatives:
 I considered simply supporting poll() through the poller and not using
 TIF_MCE_NOTIFY at all.  However, the time between an uncorrectable error
 happening and the user application being notified is *the*most* critical
 window for us.  Many uncorrectable errors can be logged to the network if
 given a chance.

 I also considered doing the MCE poll directly from the idle notifier, but
 decided that was overkill.

Testing:
 I used an error-injecting DIMM to create lots of correctable DRAM errors
 and verified that my user app is woken up in sync with the polling interval.
 I also used the northbridge to inject uncorrectable ECC errors, and
 verified (printk() to the rescue) that the notify routine is called and the
 user app does wake up.  I built with PREEMPT on and off, and verified
 that my machine survives MCEs.

[wli@holomorphy.com: build fix]
Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: O_EXCL on /dev/mcelog
Tim Hockin [Sat, 21 Jul 2007 15:10:35 +0000 (17:10 +0200)]
x86_64: O_EXCL on /dev/mcelog

Background:
 /dev/mcelog is a clear-on-read interface.  It is currently possible for
 multiple users to open and read() the device.  Users are protected from
 each other during any one read, but not across reads.

Description:
 This patch adds support for O_EXCL to /dev/mcelog.  If a user opens the
 device with O_EXCL, no other user may open the device (EBUSY).  Likewise,
 any user that tries to open the device with O_EXCL while another user has
 the device will fail (EBUSY).

Result:
 Applications can get exclusive access to /dev/mcelog.  Applications that
 do not care will be unchanged.

Alternatives:
 A simpler choice would be to only allow one open() at all, regardless of
 O_EXCL.

Testing:
 I wrote an application that opens /dev/mcelog with O_EXCL and observed
 that any other app that tried to open /dev/mcelog would fail until the
 exclusive app had closed the device.

Caveats:
 None.

Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agoi386: insert unclaimed MMCONFIG resources
Aaron Durbin [Sat, 21 Jul 2007 15:10:34 +0000 (17:10 +0200)]
i386: insert unclaimed MMCONFIG resources

Insert the unclaimed MMCONFIG resources into the resource tree without the
IORESOURCE_BUSY flag during late initialization.  This allows the MMCONFIG
regions to be visible in the iomem resource tree without interfering with
other system resources that were discovered during PCI initialization.

[akpm@linux-foundation.org: nanofixes]
Signed-off-by: Aaron Durbin <adurbin@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fake apicid_to_node mapping for fake numa
David Rientjes [Sat, 21 Jul 2007 15:10:33 +0000 (17:10 +0200)]
x86_64: fake apicid_to_node mapping for fake numa

When we are in the emulated NUMA case, we need to make sure that all existing
apicid_to_node mappings that point to real node ID's now point to the
equivalent fake node ID's.

If we simply iterate over all apicid_to_node[] members for each node, we risk
remapping an entry if it shares a node ID with a real node.  Since apicid's
may not be consecutive, we're forced to create an automatic array of
apicid_to_node mappings and then copy it over once we have finished remapping
fake to real nodes.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
17 years agox86_64: fake pxm-to-node mapping for fake numa
David Rientjes [Sat, 21 Jul 2007 15:10:32 +0000 (17:10 +0200)]
x86_64: fake pxm-to-node mapping for fake numa

For NUMA emulation, our SLIT should represent the true NUMA topology of the
system but our proximity domain to node ID mapping needs to reflect the
emulated state.

When NUMA emulation has successfully setup fake nodes on the system, a new
function, acpi_fake_nodes() is called.  This function determines the proximity
domain (_PXM) for each true node found on the system.  It then finds which
emulated nodes have been allocated on this true node as determined by its
starting address.  The node ID to PXM mapping is changed so that each fake
node ID points to the PXM of the true node that it is located on.

If the machine failed to register a SLIT, then we assume there is no special
requirement for emulated node affinity so we use the default LOCAL_DISTANCE,
which is newly exported to this code, as our measurement if the emulated nodes
appear in the same PXM.  Otherwise, we use REMOTE_DISTANCE.

PXM_INVAL and NID_INVAL are also exported to the ACPI header file so that we
can compare node_to_pxm() results in generic code (in this case, the SRAT
code).

Cc: Len Brown <lenb@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>