]> err.no Git - linux-2.6/log
linux-2.6
16 years agox86, 64-bit: unify early_ioremap
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:19:03 +0000 (00:19 -0400)]
x86, 64-bit: unify early_ioremap

The 32-bit early_ioremap will work equally well for 64-bit, so just use it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, 64-bit: use p??_populate() to attach pages to pagetable
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:19:02 +0000 (00:19 -0400)]
x86, 64-bit: use p??_populate() to attach pages to pagetable

Use the _populate() functions to attach new pages to a pagetable, to
make sure the right paravirt_ops calls get called.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, 64-bit: use write_gdt_entry in vsyscall_set_cpu
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:19:01 +0000 (00:19 -0400)]
x86, 64-bit: use write_gdt_entry in vsyscall_set_cpu

Use write_gdt_entry to generate the special vgetcpu descriptor in the
vsyscall page.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove open-coded save/load segment operations
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:19:00 +0000 (00:19 -0400)]
x86: remove open-coded save/load segment operations

This removes a pile of buggy open-coded implementations of savesegment
and loadsegment.

(They are buggy because they don't have memory barriers to prevent
them from being reordered with respect to memory accesses.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add memory barriers to wrmsr
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:18:59 +0000 (00:18 -0400)]
x86: add memory barriers to wrmsr

wrmsr is a special instruction which can have arbitrary system-wide
effects.  We don't want the compiler to reorder it with respect to
memory operations, so make it a memory barrier.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add memory clobber to save/loadsegment
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:18:58 +0000 (00:18 -0400)]
x86: add memory clobber to save/loadsegment

Add "memory" clobbers to savesegment and loadsegment, since they can
affect memory accesses and we never want the compiler to reorder them
with respect to memory references.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: asm-x86/pgtable.h: fix compiler warning
Jeremy Fitzhardinge [Wed, 25 Jun 2008 04:18:57 +0000 (00:18 -0400)]
x86: asm-x86/pgtable.h: fix compiler warning

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: nmi_watchdog - introduce nmi_watchdog_active() helper
Cyrill Gorcunov [Tue, 24 Jun 2008 20:52:06 +0000 (22:52 +0200)]
x86: nmi_watchdog - introduce nmi_watchdog_active() helper

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: nmi_watchdog - use NMI_NONE by default
Cyrill Gorcunov [Tue, 24 Jun 2008 20:52:05 +0000 (22:52 +0200)]
x86: nmi_watchdog - use NMI_NONE by default

There is no need to keep NMI_DISABLED definition and use it
for nmi_watchdog by default. Here is the point why:

- IO-APIC and APIC chips are programmed for nmi_watchdog support at very
  early stage of kernel booting and not having nmi_watchdog specified as
  boot option lead only to nmi_watchdog becomes to NMI_NONE anyway
- enable nmi_watchdog thru /proc/sys/kernel/nmi if it was not specified at
  boot is not possible too (even having this sysfs entry)

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: nmi_watchdog - remove useless check
Cyrill Gorcunov [Tue, 24 Jun 2008 20:52:04 +0000 (22:52 +0200)]
x86: nmi_watchdog - remove useless check

Since nmi_watchdog is unsigned variable we may
safely remove the check for negative value.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: nmi_watchdog - use nmi_watchdog variable for printing
Cyrill Gorcunov [Tue, 24 Jun 2008 20:52:04 +0000 (22:52 +0200)]
x86: nmi_watchdog - use nmi_watchdog variable for printing

Since it is possible NMI_ definitions could be changed
one day we better print out real nmi_watchdog value instead
of constant string.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: perfctr-watchdog.c - coding style cleanup
Cyrill Gorcunov [Tue, 24 Jun 2008 20:52:03 +0000 (22:52 +0200)]
x86: perfctr-watchdog.c - coding style cleanup

Just some code beautification. Nothing else.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: more consistently use type int for node ids
Paul Jackson [Sun, 22 Jun 2008 14:22:17 +0000 (07:22 -0700)]
x86 boot: more consistently use type int for node ids

Everywhere I look, node id's are of type 'int', except in this one
case, which has 'unsigned long'.  Change this one to 'int' as well.
There is nothing special about the way this variable 'nid' is used in
this routine to justify using an unusual type here.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: show pfn addresses in hex not decimal in some kernel info printks
Paul Jackson [Sun, 22 Jun 2008 14:22:12 +0000 (07:22 -0700)]
x86 boot: show pfn addresses in hex not decimal in some kernel info printks

Page frame numbers (the portion of physical addresses above the low
order page offsets) are displayed in several kernel debug and info
prints in decimal, not hex.  Decimal addresse are unreadable.  Use hex.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: allow overlapping early reserve memory ranges
Paul Jackson [Sun, 22 Jun 2008 14:22:07 +0000 (07:22 -0700)]
x86 boot: allow overlapping early reserve memory ranges

Add support for overlapping early memory reservations.

In general, they still can't overlap, and will panic
with "Overlapping early reservations" if they do overlap.

But if a memory range is reserved with the new call:
    reserve_early_overlap_ok()
rather than with the usual call:
    reserve_early()
then subsequent early reservations are allowed to overlap.

This new reserve_early_overlap_ok() call is only used in one
place so far, which is the "BIOS reserved" reservation for the
the EBDA region, which out of Paranoia reserves more than what
the BIOS might have specified, and which thus might overlap with
another legitimate early memory reservation (such as, perhaps,
the EFI memmap.)

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: x86_64 efi compiler warning fix
Paul Jackson [Sun, 22 Jun 2008 14:22:02 +0000 (07:22 -0700)]
x86 boot: x86_64 efi compiler warning fix

Fix a compiler warning.  Rather than always casting a u32 to a pointer
(which generates a warning on x86_64 systems) instead separate the
x86_32 and x86_64 assignments entirely with ifdefs.  Until other recent
changes to this code, it used to have x86_64 separated like this.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 boot: e820 code indentation fix
Paul Jackson [Sun, 22 Jun 2008 14:21:57 +0000 (07:21 -0700)]
x86 boot: e820 code indentation fix

Fix indentation.  An earlier code merge got the
indentation of four lines of code off by a tab.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: setup_arch 32bit move kvm_guest_init later
Yinghai Lu [Tue, 24 Jun 2008 02:55:05 +0000 (19:55 -0700)]
x86: setup_arch 32bit move kvm_guest_init later

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: setup_arch 32bit move command line copying early
Yinghai Lu [Tue, 24 Jun 2008 02:54:23 +0000 (19:54 -0700)]
x86: setup_arch 32bit move command line copying early

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: setup_arch 32bit move efi check later
Yinghai Lu [Tue, 24 Jun 2008 02:53:33 +0000 (19:53 -0700)]
x86: setup_arch 32bit move efi check later

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move some func calling from setup_arch to paging_init
Yinghai Lu [Tue, 24 Jun 2008 02:51:10 +0000 (19:51 -0700)]
x86: move some func calling from setup_arch to paging_init

those function depend on paging setup pgtable, so they could access
the ram in bootmem region but just get mapped.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: numa32 pfn print out using hex instead
Yinghai Lu [Mon, 23 Jun 2008 23:41:30 +0000 (16:41 -0700)]
x86: numa32 pfn print out using hex instead

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix compile warning in init_64.c
Yinghai Lu [Mon, 23 Jun 2008 21:02:36 +0000 (14:02 -0700)]
x86: fix compile warning in init_64.c

len is long and ret is only for NUMA

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: build fix
Ingo Molnar [Mon, 23 Jun 2008 20:19:22 +0000 (22:19 +0200)]
x86: build fix

fix:

arch/x86/kernel/setup_32.c:409: error: 'enable_local_apic' undeclared (first use in this function)
arch/x86/kernel/setup_32.c:409: error: (Each undeclared identifier is reported only once
arch/x86/kernel/setup_32.c:409: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up min_low_pfn
Yinghai Lu [Mon, 23 Jun 2008 10:06:14 +0000 (03:06 -0700)]
x86: clean up min_low_pfn

for 32bit
we already had early_res support, so don't need to track min_low_pfn.
keep it to 0 always.

also use init_bootmem_node instead of init_bootmem, so don't touch
min_low_pfn.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up using max_low_pfn on 32-bit
Yinghai Lu [Mon, 23 Jun 2008 10:05:30 +0000 (03:05 -0700)]
x86: clean up using max_low_pfn on 32-bit

so that max_low_pfn is not changed after it is set.
so we can move that early and out of initmem_init.

could call find_low_pfn_range just after max_pfn is set.

also could move reserve_initrd out of setup_bootmem_allocator

so 32bit is more like 64bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move reservetop and vmalloc parsing to pgtable_32.c
Yinghai Lu [Mon, 23 Jun 2008 00:40:10 +0000 (17:40 -0700)]
x86: move reservetop and vmalloc parsing to pgtable_32.c

also change reserve_top_address to __init attibute

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move find_max_low_pfn to init_32.c
Yinghai Lu [Mon, 23 Jun 2008 19:00:45 +0000 (21:00 +0200)]
x86: move find_max_low_pfn to init_32.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move boot_params declaring to setup.c
Yinghai Lu [Mon, 23 Jun 2008 00:37:54 +0000 (17:37 -0700)]
x86: move boot_params declaring to setup.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: introduce reserve_initrd
Yinghai Lu [Sun, 22 Jun 2008 09:46:58 +0000 (02:46 -0700)]
x86: introduce reserve_initrd

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: introduce initmem_init for 32 bit
Yinghai Lu [Sun, 22 Jun 2008 09:45:39 +0000 (02:45 -0700)]
x86: introduce initmem_init for 32 bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: introduce initmem_init for 64 bit
Yinghai Lu [Sun, 22 Jun 2008 09:44:49 +0000 (02:44 -0700)]
x86: introduce initmem_init for 64 bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move elfcorehdr parsing to setup.c
Yinghai Lu [Sun, 22 Jun 2008 04:02:20 +0000 (21:02 -0700)]
x86: move elfcorehdr parsing to setup.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move reserve_standard_io_resource to setup.c
Yinghai Lu [Sun, 22 Jun 2008 03:22:09 +0000 (20:22 -0700)]
x86: move reserve_standard_io_resource to setup.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove two duplicated funcs in setup_32.c
Yinghai Lu [Sun, 22 Jun 2008 02:16:52 +0000 (19:16 -0700)]
x86: remove two duplicated funcs in setup_32.c

early_cpu_init is declared in processor.h
memory_setup is defined in e820.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: merge setup64.c into common_64.c
Yinghai Lu [Sat, 21 Jun 2008 23:25:37 +0000 (16:25 -0700)]
x86: merge setup64.c into common_64.c

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: seperate probe_roms into another file
Yinghai Lu [Sat, 21 Jun 2008 22:39:41 +0000 (15:39 -0700)]
x86: seperate probe_roms into another file

it is only needed for 32bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add e820_remove_range
Yinghai Lu [Sat, 21 Jun 2008 21:48:05 +0000 (14:48 -0700)]
x86: add e820_remove_range

... so could add real hole in e820

agp check is using request_mem_region, and could fail if e820 is reserved...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: change identify_cpu to static
Yinghai Lu [Sat, 21 Jun 2008 10:24:00 +0000 (03:24 -0700)]
x86: change identify_cpu to static

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: seperate funcs from setup_64 to cpu common_64.c
Yinghai Lu [Sat, 21 Jun 2008 10:24:19 +0000 (03:24 -0700)]
x86: seperate funcs from setup_64 to cpu common_64.c

Signed-off-by: Yinghai Lu <yhlu.kernel@mail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove some acpi ifdefs in setup_32/64
Yinghai Lu [Sat, 21 Jun 2008 08:38:41 +0000 (01:38 -0700)]
x86: remove some acpi ifdefs in setup_32/64

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clean up init_amd()
Yinghai Lu [Sat, 21 Jun 2008 08:14:27 +0000 (01:14 -0700)]
x86: clean up init_amd()

1. move out calling of check_enable_amd_mmconf_dmi out of setup_64.c
   put it into init_amd(), so don't need to make extra dmi check for
   system with other cpus.
2. 15 --> 0xf

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: check command line when CONFIG_X86_MPPARSE is not set, v2
Yinghai Lu [Fri, 20 Jun 2008 23:11:20 +0000 (16:11 -0700)]
x86: check command line when CONFIG_X86_MPPARSE is not set, v2

if acpi=off, acpi=noirq and pci=noacpi, we need to disable apic.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoxen: set max_pfn_mapped
Jeremy Fitzhardinge [Mon, 16 Jun 2008 21:54:51 +0000 (14:54 -0700)]
xen: set max_pfn_mapped

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoxen: reserve ISA space in e820 map
Jeremy Fitzhardinge [Mon, 16 Jun 2008 21:54:49 +0000 (14:54 -0700)]
xen: reserve ISA space in e820 map

[ TODO: release the underlying memory back to Xen. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoxen: reserve Xen-specific memory in e820 map
Jeremy Fitzhardinge [Mon, 16 Jun 2008 21:54:46 +0000 (14:54 -0700)]
xen: reserve Xen-specific memory in e820 map

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: the arch/x86 maintainers <x86@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRFC x86: try to remove arch_get_ram_range
Yinghai Lu [Tue, 17 Jun 2008 03:10:55 +0000 (20:10 -0700)]
RFC x86: try to remove arch_get_ram_range

want to remove arch_get_ram_range, and use early_node_map instead.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix sleep.c build error
Ingo Molnar [Fri, 13 Jun 2008 18:31:54 +0000 (20:31 +0200)]
x86: fix sleep.c build error

fix:

arch/x86/kernel/acpi/sleep.c: In function â€˜acpi_save_state_mem':
arch/x86/kernel/acpi/sleep.c:75: error: â€˜stack_start' undeclared (first use in this function)
arch/x86/kernel/acpi/sleep.c:75: error: (Each undeclared identifier is reported
only once
arch/x86/kernel/acpi/sleep.c:75: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: take load_sp0 out of smpboot.c
Glauber Costa [Thu, 5 Jun 2008 06:05:39 +0000 (23:05 -0700)]
x86: take load_sp0 out of smpboot.c

there's no particular reason to do load_sp0 in different
places for i386 and x86_64. They should all be in cpu_init.
Right now, cpu_init itself is not integrated, but with this patch,
the code becomes closer to each other, making in easier to integrate
when the time comes.

Furthermore, although doing it in do_boot_cpu for x86_64 is fine, since it's
only a copy, load_sp0 should be executed in the cpu it refers to anyway.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move cpu_exit_clear to process_32.c
Glauber Costa [Wed, 4 Jun 2008 18:35:03 +0000 (15:35 -0300)]
x86: move cpu_exit_clear to process_32.c

Take it out of smpboot.c, and move it to process_32.c, closer
to its only user.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove cpu from maps
Glauber Costa [Wed, 4 Jun 2008 05:05:03 +0000 (02:05 -0300)]
x86: remove cpu from maps

during cpu disable, take cpus out of all maps in i386, instead
of just the online map.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: change naming to match x86_64
Glauber Costa [Wed, 4 Jun 2008 05:03:07 +0000 (02:03 -0300)]
x86: change naming to match x86_64

Change unmap_cpu_to_logical_apicid to numa_remove_cpu.
Besides being shorter, it is the same name x86_64 uses. We
can save an ifdef in the code this way.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: provide connect_bsp_APIC for x86_64
Glauber Costa [Wed, 28 May 2008 16:38:28 +0000 (13:38 -0300)]
x86: provide connect_bsp_APIC for x86_64

Although it is not really needed, we provide it to get
closer to i386. ifdefs around it are removed in smpboot.c

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: change __setup_vector_irq with setup_vector_irq
Glauber Costa [Thu, 29 May 2008 03:34:19 +0000 (20:34 -0700)]
x86: change __setup_vector_irq with setup_vector_irq

We create a version of it for i386, and then take the CONFIG_X86_64
ifdef out of the game. We could create a __setup_vector_irq for i386,
but it would incur in an unnecessary lock taking. Moreover, it is better
practice to only export setup_vector_irq anyway.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: remove ifdef from stepping
Glauber Costa [Thu, 29 May 2008 03:05:46 +0000 (20:05 -0700)]
x86: remove ifdef from stepping

The stepping won't affect x86_64, since there are not x86_64 k7's
or pentiums. So, although it adds to the binary size, remove the ifdef
for smoother integration

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: clearing io_apic harmless for x86_64
Glauber Costa [Thu, 29 May 2008 00:09:53 +0000 (17:09 -0700)]
x86: clearing io_apic harmless for x86_64

so remove ifdef.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: boot secondary cpus through initial_code
Glauber Costa [Wed, 28 May 2008 16:01:54 +0000 (13:01 -0300)]
x86: boot secondary cpus through initial_code

remove "initialize_secondary". Boot both architectures via
initial_code.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use initial_code for i386
Glauber Costa [Wed, 28 May 2008 15:57:02 +0000 (12:57 -0300)]
x86: use initial_code for i386

x86_64 jumps to whatever is written in "initial_code" symbol,
instead of a fixed address. Do it for i386 too. It will allow us
to integrate more of the smp boot code.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: move x86_64 gdt closer to i386
Glauber Costa [Wed, 28 May 2008 23:19:53 +0000 (16:19 -0700)]
x86: move x86_64 gdt closer to i386

i386 and x86_64 used two different schemes for maintaining the gdt.
With this patch, x86_64 initial gdt table is defined in a .c file,
same way as i386 is now. Also, we call it "gdt_page", and the descriptor,
"early_gdt_descr". This way we achieve common naming, which can allow for
more code integration.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: don't use gdt_page openly.
Glauber Costa [Wed, 28 May 2008 03:14:51 +0000 (20:14 -0700)]
x86: don't use gdt_page openly.

There's a macro available for that.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use stack_start in x86_64
Glauber Costa [Wed, 28 May 2008 01:22:54 +0000 (18:22 -0700)]
x86: use stack_start in x86_64

call x86_64's init_rsp stack_start, just as i386 does.
Put a zeroed stack segment for consistency. With this,
we can eliminate one ugly ifdef in smpboot.c.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agobuild: add __page_aligned_data and __page_aligned_bss
Jeremy Fitzhardinge [Wed, 28 May 2008 14:02:14 +0000 (15:02 +0100)]
build: add __page_aligned_data and __page_aligned_bss

Making a variable page-aligned by using
__attribute__((section(".data.page_aligned"))) is fragile because if
sizeof(variable) is not also a multiple of page size, it leaves
variables in the remainder of the section unaligned.

This patch introduces two new qualifiers, __page_aligned_data and
__page_aligned_bss to set the section *and* the alignment of
variables.  This makes page-aligned variables more robust because the
linker will make sure they're aligned properly.  Unfortunately it
requires *all* page-aligned data to use these macros...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: unify crashkernel reservation for 32 and 64 bit
Bernhard Walle [Fri, 20 Jun 2008 13:38:22 +0000 (15:38 +0200)]
x86: unify crashkernel reservation for 32 and 64 bit

This patch moves the reserve_crashkernel() to setup.c and removes the
architecture-specific version. Both versions were more or less the same.

I tested it on both x86-64 and i386, with CONFIG_KEXEC on and off (so
that it compiles).

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: yhlu.kernel@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/fixmap' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 10:24:29 +0000 (12:24 +0200)]
Merge branch 'x86/fixmap' into x86/devel

Conflicts:

arch/x86/mm/init_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/uv' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 10:24:13 +0000 (12:24 +0200)]
Merge branch 'x86/uv' into x86/devel

16 years agox86, SGI UV: uv_ptc_proc_write fix
Cliff Wickman [Mon, 23 Jun 2008 13:32:25 +0000 (08:32 -0500)]
x86, SGI UV: uv_ptc_proc_write fix

Someone could write 0 bytes to /proc/sgi_uv/ptc_statistics,
causing
  optstr[count - 1] = '\0';
to write to who-knows-where.

(Andi Kleen noticed this need from a patch I sent for
 similar code in the ia64 world (sn2_ptc_proc_write()).)

(count less than zero is not possible here, as count is unsigned)

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, SGI UV: TLB shootdown using broadcast assist unit, v6
Cliff Wickman [Thu, 19 Jun 2008 16:16:24 +0000 (11:16 -0500)]
x86, SGI UV: TLB shootdown using broadcast assist unit, v6

v6: 6/19 close the security hole in uv_ptc_proc_write())

  > Found a potential security hole while doing that:
  > static ssize_t uv_ptc_proc_write(struct file *file, const char __user *user,
  >                              size_t count, loff_t *data)
  >     if (copy_from_user(optstr, user, count))
  >             return -EFAULT;
  >
  > is count guaranteed to never be larger than 64?

is fixed below.

It adds tlb_uv.o to the Makefile.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: mingo@elte.hu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: increase MAX_APICS for very large x86-64 configs
Jack Steiner [Mon, 16 Jun 2008 17:09:10 +0000 (12:09 -0500)]
x86: increase MAX_APICS for very large x86-64 configs

Increase the maximum number of apics when running very large
configurations. This patch has no affect on most systems.

The patch has no effect on any 32-bit kernel. It adds ~4k to the size
of 64-bit kernels but only if NR_CPUS > 255.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix stack overflow for large values of MAX_APICS
Jack Steiner [Fri, 20 Jun 2008 02:51:05 +0000 (21:51 -0500)]
x86: fix stack overflow for large values of MAX_APICS

physid_mask_of_physid() causes a huge stack (12k) to be created if the
number of APICS is large. Replace physid_mask_of_physid() with a
new function that does not create large stacks. This is a problem only
on large x86_64 systems.

this paves the way to increase MAX_APICS.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
Cc: mingo@elte.hu
Cc: tglx@linutronix.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoSGI UV: TLB shootdown using broadcast assist unit, fix
Ingo Molnar [Wed, 18 Jun 2008 12:51:57 +0000 (14:51 +0200)]
SGI UV: TLB shootdown using broadcast assist unit, fix

fix:

arch/x86/kernel/tlb_uv.c: In function â€˜uv_table_bases_init':
arch/x86/kernel/tlb_uv.c:612: error: â€˜bau_tabsp' undeclared (first use in this function)
arch/x86/kernel/tlb_uv.c:612: error: (Each undeclared identifier is reported only once
arch/x86/kernel/tlb_uv.c:612: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoSGI UV: clean up arch/x86/kernel/tlb_uv.c
Ingo Molnar [Wed, 18 Jun 2008 12:28:19 +0000 (14:28 +0200)]
SGI UV: clean up arch/x86/kernel/tlb_uv.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoSGI UV: TLB shootdown using broadcast assist unit
Ingo Molnar [Wed, 18 Jun 2008 12:15:43 +0000 (14:15 +0200)]
SGI UV: TLB shootdown using broadcast assist unit

TLB shootdown for SGI UV.

v5: 6/12 corrections/improvements per Ingo's second review

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoSGI UV: TLB shootdown using broadcast assist unit, cleanups
Cliff Wickman [Thu, 12 Jun 2008 13:23:48 +0000 (08:23 -0500)]
SGI UV: TLB shootdown using broadcast assist unit, cleanups

TLB shootdown for SGI UV.

v1: 6/2 original
v2: 6/3 corrections/improvements per Ingo's review
v3: 6/4 split atomic operations off to a separate patch (Jeremy's review)
v4: 6/12 include <mach_apic.h> rather than <asm/mach-bigsmp/mach_apic.h>
         (fixes a !SMP build problem that Ingo found)
         fix the index on uv_table_bases[blade]

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 atomic operations: atomic_or_long() atomic_inc_short()
Cliff Wickman [Wed, 4 Jun 2008 20:33:17 +0000 (15:33 -0500)]
x86 atomic operations: atomic_or_long() atomic_inc_short()

Provide atomic operations for increment of a 16-bit integer and
logical OR into a 64-bit integer.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, SGI UV: TLB shootdown using broadcast assist unit
Cliff Wickman [Mon, 2 Jun 2008 13:56:14 +0000 (08:56 -0500)]
x86, SGI UV: TLB shootdown using broadcast assist unit

TLB shootdown for SGI UV.

Depends on patch (in tip/x86/irq):
   x86-update-macros-used-by-uv-platform.patch   Jack Steiner May 29

This patch provides the ability to flush TLB's in cpu's that are not on
the local node.  The hardware mechanism for distributing the flush
messages is the UV's "broadcast assist unit".

The hook to intercept TLB shootdown requests is a 2-line change to
native_flush_tlb_others() (arch/x86/kernel/tlb_64.c).

This code has been tested on a hardware simulator. The real hardware
is not yet available.

The shootdown statistics are provided through /proc/sgi_uv/ptc_statistics.
The use of /sys was considered, but would have required the use of
many /sys files.  The debugfs was also considered, but these statistics
should be available on an ongoing basis, not just for debugging.

Issues to be fixed later:
- The IRQ for the messaging interrupt is currently hardcoded as 200
  (see UV_BAU_MESSAGE).  It should be dynamically assigned in the future.
- The use of appropriate udelay()'s is untested, as they are a problem
  in the simulator.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into x86/irq
Ingo Molnar [Tue, 8 Jul 2008 10:23:00 +0000 (12:23 +0200)]
Merge branch 'linus' into x86/irq

16 years agoMerge branch 'x86/nmi' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 10:17:08 +0000 (12:17 +0200)]
Merge branch 'x86/nmi' into x86/devel

Conflicts:

arch/x86/kernel/nmi.c
arch/x86/kernel/nmi_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'x86/numa' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 09:59:23 +0000 (11:59 +0200)]
Merge branch 'x86/numa' into x86/devel

Conflicts:

arch/x86/Kconfig
arch/x86/kernel/e820.c
arch/x86/kernel/efi_64.c
arch/x86/kernel/mpparse.c
arch/x86/kernel/setup.c
arch/x86/kernel/setup_32.c
arch/x86/mm/init_64.c
include/asm-x86/proto.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64
Bernhard Walle [Sun, 8 Jun 2008 13:46:31 +0000 (15:46 +0200)]
x86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64

This patch uses reserve_bootmem_generic() instead of reserve_bootmem()
to reserve the crashkernel memory on x86_64. That's necessary for NUMA
machines, see 00212fef814612245ed0261cbac8426d0c9a31a5:

  [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines

  This patch will fix a boot memory reservation bug that trashes memory on
  the ES7000 when loading the kdump crash kernel.

  The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash
  kernel uses the non-numa aware "reserve_bootmem" function instead of the
  NUMA aware "reserve_bootmem_generic".  I checked to make sure that no other
  function was using "reserve_bootmem" and found none, except the ones that
  had NUMA ifdef'ed out.

  I have tested this patch only on an ES7000 with NUMA on and off (numa=off)
  in a single (non-NUMA) and multi-cell (NUMA) configurations.

Signed-off-by: Amul Shah <amul.shah@unisys.com>
Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The switch-back to reserve_bootmem() was accidentally introduced in
5c3391f9f749023a49c64d607da4fb49263690eb when adding the BOOTMEM_EXCLUSIVE
parameter.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: add flags parameter to reserve_bootmem_generic()
Bernhard Walle [Sun, 8 Jun 2008 13:46:30 +0000 (15:46 +0200)]
x86: add flags parameter to reserve_bootmem_generic()

This patch adds a 'flags' parameter to reserve_bootmem_generic() like it
already has been added in reserve_bootmem() with commit
72a7fe3967dbf86cb34e24fbf1d957fe24d2f246.

It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively
change the behaviour. Since the change is x86-specific, I don't think it's
necessary to add a new API for migration. There are only 4 users of that
function.

The change is necessary for the next patch, using reserve_bootmem_generic()
for crashkernel reservation.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: fix setup.c printk format warning
Randy Dunlap [Thu, 5 Jun 2008 18:10:59 +0000 (11:10 -0700)]
x86: fix setup.c printk format warning

Fix setup.c printk format warning:

linux-next-20080605/arch/x86/kernel/setup.c: In function 'setup_per_cpu_areas':
linux-next-20080605/arch/x86/kernel/setup.c:173: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ssize_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: don't return invalid pointers from node_to_cpumask()
Vegard Nossum [Fri, 6 Jun 2008 14:33:25 +0000 (16:33 +0200)]
x86: don't return invalid pointers from node_to_cpumask()

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched, numa: replace MAX_NUMNODES with nr_node_ids in kernel/sched.c
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
sched, numa: replace MAX_NUMNODES with nr_node_ids in kernel/sched.c

  * Replace usages of MAX_NUMNODES with nr_node_ids in kernel/sched.c,
    where appropriate.  This saves some allocated space as well as many
    wasted cycles going through node entries that are non-existent.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: numa_64.c fix shadowed variable
Thomas Gleixner [Mon, 12 May 2008 13:43:36 +0000 (15:43 +0200)]
x86: numa_64.c fix shadowed variable

sparse mutters:
arch/x86/mm/numa_64.c:195:27: warning: symbol 'end_pfn' shadows an earlier one

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: numa_64.c make local variables static
Thomas Gleixner [Mon, 12 May 2008 13:43:36 +0000 (15:43 +0200)]
x86: numa_64.c make local variables static

plat_node_bdata, cmdline, nodemap_addr, nodemap_size are local to
numa_64.c. Make them static

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: compile error fix for smpboot.c
Jeremy Fitzhardinge [Wed, 21 May 2008 10:21:13 +0000 (11:21 +0100)]
x86: compile error fix for smpboot.c

Without this patch, my link fails with:

arch/x86/kernel/built-in.o(.cpuinit.text+0x3c6e): In function `get_local_pda':
: undefined reference to `_cpu_pda'
arch/x86/kernel/built-in.o(.cpuinit.text+0x3cd1): In function `get_local_pda':
: undefined reference to `after_bootmem'
arch/x86/kernel/built-in.o(.cpuinit.text+0x3cec): In function `get_local_pda':
: undefined reference to `_cpu_pda'
make[2]: *** [.tmp_vmlinux1] Error 1

Caused by commit 766da892634694f795b18b9538407816896fc470
    x86: remove static boot_cpu_pda array v2

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: leave initial __cpu_pda array in place until cpus are booted
Mike Travis [Mon, 12 May 2008 19:21:13 +0000 (21:21 +0200)]
x86: leave initial __cpu_pda array in place until cpus are booted

Ingo Molnar wrote:
...
> they crashed after about 3 randconfig iterations with:
>
>   early res: 4 [8000-afff] PGTABLE
>   early res: 5 [b000-b87f] MEMNODEMAP
> PANIC: early exception 0e rip 10:ffffffff8077a150 error 2 cr2 37
> Pid: 0, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #14
>
> Call Trace:
>  [<ffffffff81466196>] early_idt_handler+0x56/0x6a
>  [<ffffffff8077a150>] ? numa_set_node+0x30/0x60
>  [<ffffffff8077a129>] ? numa_set_node+0x9/0x60
>  [<ffffffff8147a543>] numa_init_array+0x93/0xf0
>  [<ffffffff8147b039>] acpi_scan_nodes+0x3b9/0x3f0
>  [<ffffffff8147a496>] numa_initmem_init+0x136/0x150
>  [<ffffffff8146da5f>] setup_arch+0x48f/0x700
>  [<ffffffff802566ea>] ? clockevents_register_notifier+0x3a/0x50
>  [<ffffffff81466a87>] start_kernel+0xd7/0x440
>  [<ffffffff81466422>] x86_64_start_kernel+0x222/0x280
...
Here's the fixup...  This one should follow the previous patches.

Thanks,
Mike
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: remove static boot_cpu_pda array v2
Mike Travis [Mon, 12 May 2008 19:21:13 +0000 (21:21 +0200)]
x86: remove static boot_cpu_pda array v2

  * Remove the boot_cpu_pda array and pointer table from the data section.
    Allocate the pointer table and array during init.  do_boot_cpu()
    will reallocate the pda in node local memory and if the cpu is being
    brought up before the bootmem array is released (after_bootmem = 0),
    then it will free the initial pda.  This will happen for all cpus
    present at system startup.

    This removes 512k + 32k bytes from the data section.

For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: remove the static 256k node_to_cpumask_map
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
x86: remove the static 256k node_to_cpumask_map

  * Consolidate node_to_cpumask operations and remove the 256k
    byte node_to_cpumask_map.  This is done by allocating the
    node_to_cpumask_map array after the number of possible nodes
    (nr_node_ids) is known.

  * Debug printouts when CONFIG_DEBUG_PER_CPU_MAPS is active have
    been increased.  It now shows faults when calling node_to_cpumask()
    and node_to_cpumask_ptr().

For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: restore pda nodenumber field
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
x86: restore pda nodenumber field

  * Restore the nodenumber field in the x86_64 pda.  This field is slightly
    different than the x86_cpu_to_node_map mainly because it's a static
    indication of which node the cpu is on while the cpu to node map is a
    dyanamic mapping that may get reset if the cpu goes offline.  This also
    simplifies the numa_node_id() macro.

For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: cleanup early per cpu variables/accesses v4
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
x86: cleanup early per cpu variables/accesses v4

  * Introduce a new PER_CPU macro called "EARLY_PER_CPU".  This is
    used by some per_cpu variables that are initialized and accessed
    before there are per_cpu areas allocated.

    ["Early" in respect to per_cpu variables is "earlier than the per_cpu
    areas have been setup".]

    This patchset adds these new macros:

DEFINE_EARLY_PER_CPU(_type, _name, _initvalue)
EXPORT_EARLY_PER_CPU_SYMBOL(_name)
DECLARE_EARLY_PER_CPU(_type, _name)

early_per_cpu_ptr(_name)
early_per_cpu_map(_name, _idx)
early_per_cpu(_name, _cpu)

    The DEFINE macro defines the per_cpu variable as well as the early
    map and pointer.  It also initializes the per_cpu variable and map
    elements to "_initvalue".  The early_* macros provide access to
    the initial map (usually setup during system init) and the early
    pointer.  This pointer is initialized to point to the early map
    but is then NULL'ed when the actual per_cpu areas are setup.  After
    that the per_cpu variable is the correct access to the variable.

    The early_per_cpu() macro is not very efficient but does show how to
    access the variable if you have a function that can be called both
    "early" and "late".  It tests the early ptr to be NULL, and if not
    then it's still valid.  Otherwise, the per_cpu variable is used
    instead:

#define early_per_cpu(_name, _cpu)  \
(early_per_cpu_ptr(_name) ? \
early_per_cpu_ptr(_name)[_cpu] : \
per_cpu(_name, _cpu))

    A better method is to actually check the pointer manually.  In the
    case below, numa_set_node can be called both "early" and "late":

void __cpuinit numa_set_node(int cpu, int node)
{
    int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);

    if (cpu_to_node_map)
    cpu_to_node_map[cpu] = node;
    else
    per_cpu(x86_cpu_to_node_map, cpu) = node;
}

  * Add a flag "arch_provides_topology_pointers" that indicates pointers
    to topology cpumask_t maps are available.  Otherwise, use the function
    returning the cpumask_t value.  This is useful if cpumask_t set size
    is very large to avoid copying data on to/off of the stack.

  * The coverage of CONFIG_DEBUG_PER_CPU_MAPS has been increased while
    the non-debug case has been optimized a bit.

  * Remove an unreferenced compiler warning in drivers/base/topology.c

  * Clean up #ifdef in setup.c

For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: modify Kconfig to allow up to 4096 cpus
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
x86: modify Kconfig to allow up to 4096 cpus

  * Increase the limit of NR_CPUS to 4096 and introduce a boolean
    called "MAXSMP" which when set (e.g. "allyesconfig"), will set
    NR_CPUS = 4096 and NODES_SHIFT = 9 (512).

  * Changed max setting for NODES_SHIFT from 15 to 9 to accurately
    reflect the real limit.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix remove cpu_pda table patch
Mike Travis [Mon, 12 May 2008 19:21:12 +0000 (21:21 +0200)]
x86: fix remove cpu_pda table patch

Mike Travis wrote:
> Ingo Molnar wrote:
>> * Mike Travis <travis@sgi.com> wrote:
>>
>>> [Ingo - please replace "PATCH 07/11" with this one.]
>>>
>>>     * Remove 544k bytes from the kernel by removing the boot_cpu_pda
>>>  array from the data section and allocating it during startup.
>>>
>>>  Fixed panic in setup_per_cpu_areas when HOTPLUG_CPU not set.
>>>
>>> For inclusion into sched-devel/latest tree.
>> sched-devel.git randconfig testing found another crash with your queue:
>>
>> [    0.111060] Brought up 1 CPUs
>> [    0.111986] Total of 1 processors activated (4022.73 BogoMIPS).
>> [    0.112987] Testing NMI watchdog ... <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
>> [    0.114982] IP: [<ffffffff8180d4a0>] check_nmi_watchdog+0xb0/0x210
>> [    0.114982] PGD 0
>> [    0.114982] Oops: 0000 [1] SMP
>> [    0.114982] CPU 0
>> [............]
>>
>>  http://redhat.com/~mingo/misc/config-Mon_Apr_28_23_25_25_CEST_2008.bad
>>  http://redhat.com/~mingo/misc/log-Mon_Apr_28_23_25_25_CEST_2008.bad
>>
>>  Ingo
>
> Hi Ingo,
>
> I need a bit more information on your hardware configuration.  Building a
> kernel with the above config file started up fine on both the Intel and AMD
> boxes.
>
> Based on the above output it looks like it might be a UP machine?
...

Ok, I think I found it.  In check_nmi_watchdog():

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count;

As I mentioned it works fine on both of my systems so could you try it out?

Thanks!
Mike
--

  * Change function check_nmi_watchdog() to use nr_cpu_ids instead of NR_CPUS.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: don't call pxm_to_node again
Yinghai Lu [Sat, 19 Apr 2008 08:30:16 +0000 (01:30 -0700)]
x86: don't call pxm_to_node again

also make bus_numa work even if ACPI_NUMA is not defined.

don't call pxm_to_node again, and use node directly.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: make dev_to_node return online node
Yinghai Lu [Wed, 20 Feb 2008 20:41:52 +0000 (12:41 -0800)]
x86: make dev_to_node return online node

a numa system (with multi HT chains) may return node without ram. Aka it
is not online. Try to get an online node, otherwise return -1.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agoMerge branch 'x86/mpparse' into x86/devel
Ingo Molnar [Tue, 8 Jul 2008 09:14:58 +0000 (11:14 +0200)]
Merge branch 'x86/mpparse' into x86/devel

Conflicts:

arch/x86/Kconfig
arch/x86/kernel/io_apic_32.c
arch/x86/kernel/setup_64.c
arch/x86/mm/init_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, ioapic, acpi quirk: disable IRQ 0 through I/O APIC for some HP systems
Matthew Garrett [Tue, 1 Jul 2008 00:12:06 +0000 (01:12 +0100)]
x86, ioapic, acpi quirk: disable IRQ 0 through I/O APIC for some HP systems

Some HP laptops have a problem with their DSDT reporting as
HP/SB400/10000, which includes some code which overrides all temperature
trip points to 16C if the INTIN2 input of the I/O APIC is enabled.  This
input is incorrectly designated the ISA IRQ 0 via an interrupt source
override even though it is wired to the output of the master 8259A and
INTIN0 is not connected at all.  So far two models have been identified,
namely nx6125 and nx6325.

Use a knob provided by the I/O APIC interrupt registration code to
abandon any attempts to route IRQ 0 through the I/O APIC for these
systems.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, ioapic, acpi: add a knob to disable IRQ 0 through I/O APIC
Maciej W. Rozycki [Tue, 1 Jul 2008 00:11:35 +0000 (01:11 +0100)]
x86, ioapic, acpi: add a knob to disable IRQ 0 through I/O APIC

As discovered recently some systems exhibit problems when the 8254 timer
IRQ is routed through the I/O APIC.  These problems do not affect the
timer IRQ itself and therefore cannot be detected when the correctness of
operation of the interrupt is verified in check_timer().  Therefore the
I/O APIC path of the timer IRQ has to be disabled entirely.

This is a change that lets platforms ask for the timer IRQ not to be
registered in the I/O APIC interrupt tables.  The local APIC and ExtINTA
paths are unaffected.  This request is only taken into account for ACPI
platforms as MP table systems seem unaffected so far.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: simplify x86_mpparse dependency check
Yinghai Lu [Fri, 20 Jun 2008 14:33:31 +0000 (07:33 -0700)]
x86: simplify x86_mpparse dependency check

"Maciej W. Rozycki" <macro@linux-mips.org> said:

> Given X86_64 selects X86_LOCAL_APIC I am not sure the redundancy seen
>above does not actually obscure the logic behind...  I think:
>
>       depends on X86_LOCAL_APIC && !X86_VISWS
>
>would be clearer and get the same.

Suggested-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoRevert parts of "x86: update mptable"
Ingo Molnar [Tue, 8 Jul 2008 08:47:39 +0000 (10:47 +0200)]
Revert parts of "x86: update mptable"

Signed-off-by: Ingo Molnar <mingo@elte.hu>