linux-2.6
16 years ago x86: big ticket locks
Nick Piggin [Wed, 30 Jan 2008 12:33:00 +0000 (13:33 +0100)]
x86: big ticket locks

This implements ticket lock support for more than 255 CPUs on x86. The
code gets switched according to the configured NR_CPUS.
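
As a rough user-space illustration of the idea (not the kernel's x86 assembly implementation), a ticket lock hands out tickets in order and serves them first come, first served. The width of the ticket field bounds how many waiters can be told apart, so the sketch below picks a wider type when the configured CPU count is large. The default `NR_CPUS` value, the threshold, and all type and function names here are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed for illustration; in the kernel NR_CPUS comes from the configuration. */
#ifndef NR_CPUS
#define NR_CPUS 512
#endif

/* Pick a ticket type wide enough for the configured number of CPUs. */
#if NR_CPUS < 256
typedef uint8_t ticket_t;
#else
typedef uint16_t ticket_t;
#endif

struct ticket_lock {
    _Atomic ticket_t next;   /* next ticket to hand out */
    _Atomic ticket_t owner;  /* ticket currently being served */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; the counter wraps around naturally at the type's width. */
    ticket_t me = atomic_fetch_add(&l->next, 1);

    /* Spin until our ticket is being served. */
    while (atomic_load(&l->owner) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Serve the next ticket in line. */
    atomic_fetch_add(&l->owner, 1);
}
```

Building the same file with `-DNR_CPUS=64` would select the 8-bit variant, which mirrors the compile-time switch the commit message describes.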

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: coding style fixes in arch/x86/pci/fixup.c
Paolo Ciarrocchi [Wed, 30 Jan 2008 12:33:00 +0000 (13:33 +0100)]
x86: coding style fixes in arch/x86/pci/fixup.c

Simple coding style fixes.

no code changed:

   text    data     bss     dec     hex filename
   3139     576     194    3909     f45 fixup.o.before
   3139     576     194    3909     f45 fixup.o.after

  md5:
   9a3467057478b2d99962bdd448282eeb  fixup.o.before.asm
   9a3467057478b2d99962bdd448282eeb  fixup.o.after.asm

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: use fixup_exception() in traps_64.c
Harvey Harrison [Wed, 30 Jan 2008 12:32:59 +0000 (13:32 +0100)]
x86: use fixup_exception() in traps_64.c

Use the fixup_exception() helper instead of the open-coded
search_extable() users.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: more users of PF_ constants in fault_32|64.c
Harvey Harrison [Wed, 30 Jan 2008 12:32:59 +0000 (13:32 +0100)]
x86: more users of PF_ constants in fault_32|64.c

Should be the last of the error_code tests that could use
the PF_ defines.  Makes X86_32|64 a little closer.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: introduce __die helper to X86_32
Harvey Harrison [Wed, 30 Jan 2008 12:32:59 +0000 (13:32 +0100)]
x86: introduce __die helper to X86_32

Small step towards unifying traps_32|64.c.  No functional
changes.  Pull out a small helper from an if() statement
in die().

Marked as __kprobes as eventually we will want to call this
from do_page_fault similar to how X86_64 does it.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: change x86 machine check handler to use unlocked_ioctl instead
Nikanth Karthikesan [Wed, 30 Jan 2008 12:32:59 +0000 (13:32 +0100)]
x86: change x86 machine check handler to use unlocked_ioctl instead

The machine check handler registers an ioctl handler that is called
with the BKL held. Change it to register unlocked_ioctl instead.
The mce ioctl handler does not seem to need any lock protection.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: explain constant sign extension problem
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:59 +0000 (13:32 +0100)]
x86/pgtable: explain constant sign extension problem

When the _PAGE_FOO constants are defined as (1ul << _PAGE_BIT_FOO), they
become unsigned longs.  In 32-bit PAE mode, these end up being
implicitly cast to 64-bit types when used to manipulate a pte, and
because they're unsigned the top 32-bits are 0, destroying the upper
bits of the pte.

When _PAGE_FOO constants are given a signed integer type, the cast to
64-bits will sign-extend so that the upper bits are all ones,
preserving the upper pte bits in manipulations.
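
The effect can be reproduced in a small user-space sketch. It assumes a platform where `int` is 32 bits wide, uses `uint32_t` to stand in for `unsigned long` on a 32-bit build, and uses a made-up bit number; the macro and function names are hypothetical, not the kernel's.

```c
#include <stdint.h>

typedef uint64_t pteval_t;            /* 64-bit pte value, as in 32-bit PAE */

#define PAGE_BIT_RW 1                 /* hypothetical bit number for the sketch */

/* 32-bit unsigned constant: stands in for (1ul << bit) on a 32-bit build. */
#define PAGE_RW_UNSIGNED ((uint32_t)1 << PAGE_BIT_RW)
/* 32-bit signed constant with the same bit set. */
#define PAGE_RW_SIGNED   ((int32_t)1 << PAGE_BIT_RW)

/* ~mask is 0xFFFFFFFD (unsigned); widening zero-extends, so the AND wipes bits 32..63. */
static pteval_t clear_rw_unsigned(pteval_t pte)
{
    return pte & ~PAGE_RW_UNSIGNED;
}

/* ~mask is -3 (signed); widening sign-extends to 0xFFFFFFFFFFFFFFFD, keeping bits 32..63. */
static pteval_t clear_rw_signed(pteval_t pte)
{
    return pte & ~PAGE_RW_SIGNED;
}
```

For a pte of `0x8000000123456007`, the unsigned variant yields `0x23456005` (upper half lost), while the signed variant yields `0x8000000123456005`.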

Explain this in a prominent place.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago xen: mask out PWT too
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
xen: mask out PWT too

The hypervisor doesn't allow PCD or PWT to be set on guest ptes, so
make sure they're masked out.  Also, fix up some previous mispatching.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: unify paravirt pagetable accessors
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
x86: unify paravirt pagetable accessors

Put all the defines for mapping pagetable operations to their native
versions (for the non-paravirt case) into one place.  Make the
corresponding changes to paravirt.h.

The tricky part here is that when a pagetable entry can't be updated
atomically (ie, 32-bit PAE), we need special handlers for pte_clear,
set_pte_atomic and set_pte_present.  However, the other two modes
don't need special handling for these, and can use a common
set_pte(_at) path.

[ mingo@elte.hu: fixes ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: unify zero_page definition
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
x86: unify zero_page definition

Move ZERO_PAGE/empty_zero_page to common place.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: fix warning
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
x86: fix warning

&ptep->pte isn't always an unsigned long *, so cast it to avoid a warning.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: pgtable: unify pte accessors
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
x86: pgtable: unify pte accessors

Make various pte accessors common.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/vmi: fix compilation as a result of pte_t changes
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:58 +0000 (13:32 +0100)]
x86/vmi: fix compilation as a result of pte_t changes

Fix various compilation problems as a result of changing pte_t.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: page.h: make pte_t a union to always include
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:57 +0000 (13:32 +0100)]
x86: page.h: make pte_t a union to always include

Make sure pte_t, whatever its definition, has a pte element with type
pteval_t.  This allows common code to access it without needing to be
specifically parameterised on what pagetable mode we're compiling for.
For 32-bit, this means that pte_t becomes a union with "pte" and "{
pte_low, pte_high }" (PAE) or just "pte_low" (non-PAE).
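
A minimal sketch of what such a union might look like for the 32-bit PAE case, assuming a little-endian machine (so that `pte_low` overlays the low half of `pte`) and a C11 compiler for the anonymous struct. The field names follow the description above; the `pte_val()` helper is added only for the example.

```c
#include <stdint.h>

typedef uint64_t pteval_t;    /* PAE: page table entries are 64 bits wide */

typedef union {
    struct {
        uint32_t pte_low;     /* low 32 bits of the entry */
        uint32_t pte_high;    /* high 32 bits of the entry */
    };
    pteval_t pte;             /* the whole entry, always present */
} pte_t;

/* Common code can read the full value without knowing the pagetable mode. */
static pteval_t pte_val(pte_t p)
{
    return p.pte;
}
```

In a non-PAE build the struct would hold only `pte_low` and `pteval_t` would be 32 bits wide, so code written against `.pte` compiles unchanged in both configurations.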

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: fix pte_modify() bug
Ingo Molnar [Wed, 30 Jan 2008 12:32:57 +0000 (13:32 +0100)]
x86: fix pte_modify() bug

fix sign extension bug in PTE_MASK / _PTE_CHG_MASK.

this resolves the following bootup crash on PAE systems:

[   94.710726] init[1]: segfault at 00000004 ip 49471cbb sp bff0c6c0 error 4
[   94.717764] init[1]: segfault at 00000004 ip 49471cbb sp bff0c6c0 error 4
[   94.724772] init[1]: segfault at 00000004 ip 49471cbb sp bff0c6c0 error 4
[   94.731777] init[1]: segfault at 00000004 ip 49471cbb sp bff0c6c0 error 4

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: unify pgtable accessors which use, #2
Ingo Molnar [Wed, 30 Jan 2008 12:32:57 +0000 (13:32 +0100)]
x86: unify pgtable accessors which use, #2

based on:

 Subject: x86: unify pgtable accessors which use supported_pte_mask
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: unify pgtable accessors which use
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:57 +0000 (13:32 +0100)]
x86: unify pgtable accessors which use

Make users of supported_pte_mask common.  This has the side-effect of
introducing the variable for 32-bit non-PAE, but I think it's a pretty
small cost to simplify the code.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: mask NX from pte_pfn
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:57 +0000 (13:32 +0100)]
x86: mask NX from pte_pfn

In 32-bit PAE, mask NX from pte_pfn, since it isn't part of the PFN.
This code is due for unification anyway, but this fixes a latent bug.
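
The fix can be pictured as follows. This is a sketch under stated assumptions (NX in bit 63 of a 64-bit PAE entry, 4 KiB pages); the macro names and the standalone `pte_pfn()` signature are simplified for illustration and are not the kernel's definitions.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                    /* assumed 4 KiB pages */
#define PAGE_NX    ((uint64_t)1 << 63)   /* no-execute bit in a PAE entry */

/* Strip NX before shifting so it does not leak into the frame number. */
static uint64_t pte_pfn(uint64_t pte)
{
    return (pte & ~PAGE_NX) >> PAGE_SHIFT;
}
```

Without the mask, an entry with NX set would report a frame number with bit 51 set, which is far outside any real physical memory.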

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #6
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:56 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #6

Unify functions to test and set bits in pagetable entries.

NOP: only moves existing code around, without any change to it.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #5
Ingo Molnar [Wed, 30 Jan 2008 12:32:56 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #5

reorder. NOP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #4
Ingo Molnar [Wed, 30 Jan 2008 12:32:56 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #4

add new ops to 32-bit.

based on:

 Subject: x86/pgtable: unify pagetable accessors
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #3
Ingo Molnar [Wed, 30 Jan 2008 12:32:56 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #3

change the pte_mk inlines to the unified format. Non-NOP!

based on:

 Subject: x86/pgtable: unify pagetable accessors
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #2
Ingo Molnar [Wed, 30 Jan 2008 12:32:55 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #2

change the pte_dirty/* inlines to the unified format. Non-NOP!

based on:

 Subject: x86/pgtable: unify pagetable accessors
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: unify pagetable accessors, #1
Ingo Molnar [Wed, 30 Jan 2008 12:32:55 +0000 (13:32 +0100)]
x86/pgtable: unify pagetable accessors, #1

based on:

 Subject: x86/pgtable: unify pagetable accessors
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86/pgtable: fix constant sign extension problem
Ingo Molnar [Wed, 30 Jan 2008 12:32:55 +0000 (13:32 +0100)]
x86/pgtable: fix constant sign extension problem

based on:

 Subject: x86/pgtable: fix constant sign extension problem
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: avoid name conflict for Voyager leave_mm
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:55 +0000 (13:32 +0100)]
x86: avoid name conflict for Voyager leave_mm

Avoid a conflict between Voyager's leave_mm and asm-x86/mmu.h's leave_mm.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: move all asm/pgtable constants into one place
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:55 +0000 (13:32 +0100)]
x86: move all asm/pgtable constants into one place

32 and 64-bit use the same flags for pagetable entries, so make them all common.

[ mingo@elte.hu: fixes ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: add PWT to NOCACHE flags
Ingo Molnar [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86: add PWT to NOCACHE flags

add PWT bit to NOCACHE flags. No real difference to CPUs, but needed
later for PAT.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: coding style fixes in arch/x86/ia32/audit.c
Paolo Ciarrocchi [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86: coding style fixes in arch/x86/ia32/audit.c

Fix one error reported by checkpatch,
it now reports:

total: 0 errors, 0 warnings, 42 lines checked

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: make NUMA work on 32-bit again
Mel Gorman [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86: make NUMA work on 32-bit again

On 32-bit NUMA, the memmap representing struct pages on each node is
allocated from node-local memory if possible. As only node-0 has memory from
ZONE_NORMAL, the memmap must be mapped into low memory. This is done by
reserving space in the Kernel Virtual Area (KVA) for the memmap belonging
to other nodes by taking pages from the end of ZONE_NORMAL and remapping
the other nodes' memmap into those virtual addresses. The node boundaries
are then adjusted so that the region of pages is not used and it is marked
as reserved in the bootmem allocator.

This reserved portion of the KVA is PMD aligned although
strictly speaking that requirement could be lifted (see thread at
http://lkml.org/lkml/2007/8/24/220). The problem is that when aligned, there
may be a portion of ZONE_NORMAL at the end that is not used for memmap and
does not have an initialised memmap nor is it marked reserved in the bootmem
allocator. Later in the boot process, these pages are freed and a storm of
Bad page state messages results.

This patch marks the pages that are wasted due to alignment as reserved
in the bootmem allocator so they are not accidentally freed. It is worth
noting that memory from node-0 is wasted where it could have been put into
ZONE_HIGHMEM on NUMA machines. Worse, the KVA is always reserved from the
location of real memory even when there is plenty of spare virtual address
space.

This patch also makes sure that reserve_bootmem() is not called with a
0-length size in numa_kva_reserve().  When this happens, it usually means
that a kernel built for Summit is being booted on a normal machine. The
resulting BUG_ON() is misleading so it is caught here.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86, ptrace: add bts_struct size to status command
Markus Metzger [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86, ptrace: add bts_struct size to status command

Return the size of bts_struct in the PTRACE_BTS_STATUS command.
Change types to u32.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: migration helpers for KVM
Ingo Molnar [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86: migration helpers for KVM

migration helpers for KVM.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: unify arch/x86/kernel/acpi/sleep*.c
Pavel Machek [Wed, 30 Jan 2008 12:32:54 +0000 (13:32 +0100)]
x86: unify arch/x86/kernel/acpi/sleep*.c

Unify arch/x86/kernel/acpi/sleep*.c

Pretty trivial unification; when two functions differed, it was
usually in error handling, and the better of the two was picked.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Looks-okay-to: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: clean up arch/x86/mm/fault_64.c
Ingo Molnar [Wed, 30 Jan 2008 12:32:53 +0000 (13:32 +0100)]
x86: clean up arch/x86/mm/fault_64.c

clean up arch/x86/mm/fault_64.c a bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: kprobes: add kprobes smoke tests that run on boot
Ananth N Mavinakayanahalli [Wed, 30 Jan 2008 12:32:53 +0000 (13:32 +0100)]
x86: kprobes: add kprobes smoke tests that run on boot

Here is a quick and naive smoke test for kprobes. This is intended to
just verify if some unrelated change broke the *probes subsystem. It is
self contained, architecture agnostic and isn't of any great use by itself.

This needs to be built in the kernel and runs a basic set of tests to
verify if kprobes, jprobes and kretprobes run fine on the kernel. In case
of an error, it'll print out a message with a "BUG" prefix.

This is a start; we intend to add more tests to this bucket over time.

Thanks to Jim Keniston and Masami Hiramatsu for comments and suggestions.

Tested on x86 (32/64) and powerpc.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: unify percpu.h
travis@sgi.com [Wed, 30 Jan 2008 12:32:53 +0000 (13:32 +0100)]
x86: unify percpu.h

Form a single percpu.h from percpu_32.h and percpu_64.h. Both are now pretty
small so this is simply adding them together.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: use generic percpu on 64-bit
travis@sgi.com [Wed, 30 Jan 2008 12:32:52 +0000 (13:32 +0100)]
x86: use generic percpu on 64-bit

x86_64 provides an optimized way to determine the local per cpu area
offset through the pda and determines the base by accessing a remote
pda.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86_32: use generic percpu.h
travis@sgi.com [Wed, 30 Jan 2008 12:32:52 +0000 (13:32 +0100)]
x86_32: use generic percpu.h

x86_32 only provides a special way to obtain the local per cpu area offset
via x86_read_percpu. Otherwise it can fully use the generic handling.

Cc: ak@suse.de
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago percpu: make the asm-generic/percpu.h more "generic"
travis@sgi.com [Wed, 30 Jan 2008 12:32:52 +0000 (13:32 +0100)]
percpu: make the asm-generic/percpu.h more "generic"

- add support for PER_CPU_ATTRIBUTES

- fix generic smp percpu_modcopy to use per_cpu_offset() macro.

Add the ability to use generic/percpu even if the arch needs to override
several aspects of its operations. This will enable the use of generic
percpu.h for all arches.

An arch may define:

__per_cpu_offset         Do not use the generic pointer array. Arch must
                         define per_cpu_offset(cpu) (used by x86_64, s390).

__my_cpu_offset          Can be defined to provide an optimized way to determine
                         the offset for variables of the currently executing
                         processor. Used by ia64, x86_64, x86_32, sparc64, s/390.

SHIFT_PTR(ptr, offset)   If an arch defines it then special handling
                         of pointer arithmetic may be implemented. Used
                         by s/390.

(Some of these special percpu arch implementations may be later consolidated
so that there are fewer cases to deal with.)
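
The override scheme might be pictured with the following user-space sketch. It is an assumption-laden model, not the contents of asm-generic/percpu.h: the per-CPU areas are a plain array, `percpu_offset_table`, `struct percpu_area`, `setup_per_cpu_areas()` and `per_cpu_hits()` are names invented for the example, and only `per_cpu_offset()` and `SHIFT_PTR()` correspond to the hooks listed above.

```c
#include <stddef.h>

#define NR_CPUS 4   /* assumed for the sketch */

struct percpu_area {
    long hits;
};

/* areas[0] plays the role of the template; areas[1..NR_CPUS] are the per-CPU copies. */
static struct percpu_area areas[NR_CPUS + 1];

/* Generic default: a table of byte offsets from the template to each CPU's copy. */
static size_t percpu_offset_table[NR_CPUS];

/* An arch could supply its own per_cpu_offset() before this point. */
#ifndef per_cpu_offset
#define per_cpu_offset(cpu) (percpu_offset_table[(cpu)])
#endif

/* An arch with unusual pointer arithmetic could supply its own SHIFT_PTR(). */
#ifndef SHIFT_PTR
#define SHIFT_PTR(ptr, offset) ((void *)((char *)(ptr) + (offset)))
#endif

static void setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        percpu_offset_table[cpu] = (size_t)(cpu + 1) * sizeof(struct percpu_area);
}

/* Return the address of the given CPU's private copy of 'hits'. */
static long *per_cpu_hits(int cpu)
{
    struct percpu_area *a = SHIFT_PTR(areas, per_cpu_offset(cpu));
    return &a->hits;
}
```

Because both hooks are guarded by `#ifndef`, the generic accessors stay the same while an arch replaces only the piece it needs to.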

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago percpu: move arch XX_PER_CPU_XX definitions into linux/percpu.h
travis@sgi.com [Wed, 30 Jan 2008 12:32:52 +0000 (13:32 +0100)]
percpu: move arch XX_PER_CPU_XX definitions into linux/percpu.h

- Special consideration for IA64: Add the ability to specify
  arch specific per cpu flags

- remove .data.percpu attribute from DEFINE_PER_CPU for non-smp case.

The arch definitions are all the same. So move them into linux/percpu.h.

We cannot move DECLARE_PER_CPU since some include files just include
asm/percpu.h to avoid include recursion problems.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago percpu: use a kconfig variable to signal arch specific percpu setup
travis@sgi.com [Wed, 30 Jan 2008 12:32:51 +0000 (13:32 +0100)]
percpu: use a kconfig variable to signal arch specific percpu setup

The use of the __GENERIC_PERCPU is a bit problematic since arches
may want to run their own percpu setup while using the generic
percpu definitions. Replace it with a kconfig variable.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago i386: handle an initrd in highmem (version 2)
H. Peter Anvin [Wed, 30 Jan 2008 12:32:51 +0000 (13:32 +0100)]
i386: handle an initrd in highmem (version 2)

The boot protocol has until now required that the initrd be located in
lowmem, which makes the lowmem/highmem boundary visible to the boot
loader.  This was exported to the bootloader via a compile-time
field.  Unfortunately, the vmalloc= command-line option breaks this
part of the protocol; instead of adding yet another hack that affects
the bootloader, have the kernel relocate the initrd down below the
lowmem boundary inside the kernel itself.

Note that this does not rely on HIGHMEM being enabled in the kernel.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86 boot : export boot_params via debugfs for debugging
Huang, Ying [Wed, 30 Jan 2008 12:32:51 +0000 (13:32 +0100)]
x86 boot : export boot_params via debugfs for debugging

This patch exports the boot parameters via debugfs for debugging.

The files added are as follows:

boot_params/data    :  binary file for struct boot_params
boot_params/version :  boot protocol version

This patch is based on 2.6.24-rc5-mm1 and has been tested on i386 and
x86_64 platforms.

This patch is based on Peter Anvin's proposal.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years ago x86: reboot_{32|64}.c unification
Miguel Boton [Wed, 30 Jan 2008 12:32:51 +0000 (13:32 +0100)]
x86: reboot_{32|64}.c unification

reboot_{32|64}.c unification patch.

This patch unifies the code from the reboot_32.c and reboot_64.c files.

It has been tested in computers with X86_32 and X86_64 kernels and it
looks like all reboot modes work fine (EFI restart system hasn't been
tested yet).

Probably I made some mistakes (like I usually do) so I hope
we can identify and fix them soon.

Signed-off-by: Miguel Boton <mboton@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago debug: add the end-of-trace marker and the module list to
Arjan van de Ven [Wed, 30 Jan 2008 12:32:50 +0000 (13:32 +0100)]
debug: add the end-of-trace marker and the module list to

Unlike oopses, WARN_ON() currently doesn't print the loaded modules list.
This makes it harder to take action on certain bug reports. For example,
recently there was a set of WARN_ON()s reported in the mac80211 stack,
which were just signalling a driver bug. It then takes another round trip
to the bug reporter (if he responds at all) to find out which driver
is at fault.

Another issue is that, unlike oopses, WARN_ON() doesn't currently printk
the helpful "cut here" line, nor the "end of trace" marker.
Now that WARN_ON() is out of line, the size increase due to this is
minimal and it's worth adding.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago debug: move WARN_ON() out of line
Arjan van de Ven [Wed, 30 Jan 2008 12:32:50 +0000 (13:32 +0100)]
debug: move WARN_ON() out of line

A quick grep shows that there are currently 1145 instances of WARN_ON
in the kernel. Currently, WARN_ON is pretty much entirely inlined,
which makes it hard to enhance it without growing the size of the kernel
(and getting Andrew unhappy).

This patch builds on top of Olof's patch that introduces __WARN,
and places the slowpath out of line. It also uses Ingo's suggestion
to not use __FUNCTION__ but to use kallsyms to do the lookup;
this saves a ton of extra space since gcc doesn't need to store the function
string twice now:

3936367  833603  624736 5394706  525112 vmlinux.before
3917508  833603  624736 5375847  520767 vmlinux-slowpath

15Kb savings...
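
The general shape of the change can be sketched in user space as below. This is not the kernel's code: the slow-path function name, its arguments, and the counter are assumptions for the example, and the kallsyms lookup mentioned above is replaced by printing the file and line.

```c
#include <stdio.h>

static int warn_count;   /* bookkeeping for the example only */

/* Slow path, defined once. Only a call to it is emitted at each WARN_ON() site. */
void warn_slowpath(const char *file, int line)
{
    warn_count++;
    fprintf(stderr, "WARNING: at %s:%d\n", file, line);
}

/* The inline part is just the test; the macro evaluates to 1 if the warning fired. */
#define WARN_ON(cond) \
    ((cond) ? (warn_slowpath(__FILE__, __LINE__), 1) : 0)
```

With the reporting code in one place, extending what a warning prints adds to a single function rather than to every one of the call sites.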

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Olof Johansson <olof@lixom.net>
Acked-by: Matt Meckall <mpm@selenic.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago debug: introduce __WARN()
Olof Johansson [Wed, 30 Jan 2008 12:32:50 +0000 (13:32 +0100)]
debug: introduce __WARN()

Introduce __WARN() in the generic case, so the generic WARN_ON()
can use arch-specific code for when the condition is true.

Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: kprobes change kprobe_handler flow
Abhishek Sagar [Wed, 30 Jan 2008 12:32:50 +0000 (13:32 +0100)]
x86: kprobes change kprobe_handler flow

Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Signed-off-by: Quentin Barnes <qbarnes@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: make arch/x86/kernel/acpi/wakeup_32.S use a separate
Eric Dumazet [Wed, 30 Jan 2008 12:32:50 +0000 (13:32 +0100)]
x86: make arch/x86/kernel/acpi/wakeup_32.S use a separate

While examining vmlinux namelist on i386 (nm -v vmlinux) I noticed :

c01021d0 t es7000_rename_gsi
c010221a T es7000_start_cpu
<Big Hole>
c0103000 T thread_saved_pc

and

c0113218 T acpi_restore_state_mem
c0113219 T acpi_save_state_mem
<Big Hole>
c0114000 t wakeup_code

This is because arch/x86/kernel/acpi/wakeup_32.S forces a .text alignment
of 4096 bytes. (I have no idea if it is really needed, since
arch/x86/kernel/acpi/wakeup_64.S uses a 16-byte alignment *only*)

So arch/x86/kernel/built-in.o also has this alignment

arch/x86/kernel/built-in.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00018c94  00000000  00000000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE

But as arch/x86/kernel/acpi/wakeup_32.o is not the first object linked
into arch/x86/kernel/built-in.o, the linker had to build several holes to meet
alignment requirements, because of .o nestings in the kbuild process.

This can be solved by using a special section, .text.page_aligned, so that
no holes are needed.

# size vmlinux.before vmlinux.after
   text    data     bss     dec     hex filename
4619942  422838  458752 5501532  53f25c vmlinux.before
4610534  422838  458752 5492124  53cd9c vmlinux.after

This saves 9408 bytes

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: clean up include/asm-x86/calling.h
Ingo Molnar [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: clean up include/asm-x86/calling.h

clean up include/asm-x86/calling.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: mark memory_setup __init
Andi Kleen [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: mark memory_setup __init

Otherwise

WARNING: vmlinux.o(.text+0x64a9): Section mismatch: reference to .init.text:machine_specific_memory_setup (between 'memory_setup' and 'show_cpuinfo')

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: Set CFQ as default in 32-bit defconfig
Andi Kleen [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: Set CFQ as default in 32-bit defconfig

Someone complained that the 32-bit defconfig contains AS as default IO
scheduler. Change that to CFQ.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: compile apm and voyager module only when selected in Kconfig
Andi Kleen [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: compile apm and voyager module only when selected in Kconfig

Previously the complete files were #ifdef'ed, but now handle that in the
Makefile.

May save a minor bit of compilation time.

[ Stephen Rothwell <sfr@canb.auug.org.au>: build dependency fix ]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: document fdimage/isoimage completely in make help
Andi Kleen [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: document fdimage/isoimage completely in make help

Add missing targets and missing options in x86 make help

[ mingo@elte.hu: more whitespace cleanups ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: remove CPU capabilities printks on 32-bit
Andi Kleen [Wed, 30 Jan 2008 12:32:49 +0000 (13:32 +0100)]
x86: remove CPU capabilities printks on 32-bit

I don't know of any case where they have been useful and they look ugly.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago x86: add /proc/irq/*/spurious to dump the spurious irq debugging state
Andi Kleen [Wed, 30 Jan 2008 12:32:48 +0000 (13:32 +0100)]
x86: add /proc/irq/*/spurious to dump the spurious irq debugging state

This is useful to debug problems with interrupt handlers that sometimes
return IRQ_NONE.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agogenirq: turn irq debugging options into module params
Andi Kleen [Wed, 30 Jan 2008 12:32:48 +0000 (13:32 +0100)]
genirq: turn irq debugging options into module params

This allows changing them at runtime using sysfs. No need to
reboot to set them.

I only added aliases (kernel.noirqdebug etc.) so the old options
still work.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86/efi: fix improper use of lvalue
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:44 +0000 (13:32 +0100)]
x86/efi: fix improper use of lvalue

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199391030 28800
# Node ID 5d35c92fdf0e2c52edbb6fc4ccd06c7f65f25009
# Parent  22f6a5902285b58bfc1fbbd9e183498c9017bd78
x86/efi: fix improper use of lvalue

pgd_val is no longer valid as an lvalue, so don't try to assign to it.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move things back to their own files
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:44 +0000 (13:32 +0100)]
x86: page.h: move things back to their own files

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199321648 28800
# Node ID 22f6a5902285b58bfc1fbbd9e183498c9017bd78
# Parent  bba9287641ff90e836d090d80b5c0a846aab7162
x86: page.h: move things back to their own files

Oops, asm/page.h has turned into an #ifdef hellhole.  Move
32/64-specific things back to their own headers to make it somewhat
comprehensible...

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move remaining bits and pieces
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:44 +0000 (13:32 +0100)]
x86: page.h: move remaining bits and pieces

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199319657 28800
# Node ID bba9287641ff90e836d090d80b5c0a846aab7162
# Parent  d617b72a0cc9d14bde2087d065c36d4ed3265761
x86: page.h: move remaining bits and pieces

Move the remaining odds and ends into page.h.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move pa and va related things
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:43 +0000 (13:32 +0100)]
x86: page.h: move pa and va related things

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199319656 28800
# Node ID d617b72a0cc9d14bde2087d065c36d4ed3265761
# Parent  3bd7db6e85e66e7f3362874802df26a82fcb2d92
x86: page.h: move pa and va related things

Move and unify the virtual<->physical address space conversion
functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #6
Ingo Molnar [Wed, 30 Jan 2008 12:32:43 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #6

based on:

 Subject: x86: page.h: move and unify types for pagetable entry
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:43 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199319654 28800
# Node ID 3bd7db6e85e66e7f3362874802df26a82fcb2d92
# Parent  f7e7db3facd9406545103164f9be8f9ba1a2b549
x86: page.h: move and unify types for pagetable entry definitions

This patch:

1. Defines arch-specific types for the contents of a pagetable entry.
That is, 32-bit entries for 32-bit non-PAE, and 64-bit entries for
32-bit PAE and 64-bit.  However, even though the latter two are the
same size, they're defined with different types in order to retain
compatibility with printk format strings, etc.

2. Defines arch-specific pte_t.  This is different because 32-bit PAE
defines it in two halves, whereas 32-bit non-PAE and 64-bit define it as a
single entry.  All the other pagetable levels can be defined in a
common way.  This also defines arch-specific pte_val/make_pte functions.

3. Define PAGETABLE_LEVELS for each architecture variation, for later use.

4. Define common pagetable entry accessors in a paravirt-compatible
way. (64-bit does not yet use paravirt-ops in any way).

5. Convert a few instances of using a *_val() as an lvalue where it is
no longer a macro.  There are still places in the 64-bit code which
use pte_val() as an lvalue.
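
To illustrate points 2 and 5, here is a minimal userspace sketch, not the kernel's actual headers. It assumes a single-word entry for the 64-bit case and a two-halves entry for 32-bit PAE. The `pae_*` names and `example_pte_set_bits()` are hypothetical and exist only for this illustration. Because `pte_val()` is a function here rather than a macro, its result is not an lvalue, so an update has to rebuild the entry with `make_pte()`.

```c
#include <assert.h>
#include <stdint.h>

/* Single-word entry, as assumed here for 64-bit. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

static inline pteval_t pte_val(pte_t p) { return p.pte; }

static inline pte_t make_pte(pteval_t v)
{
	pte_t p = { .pte = v };
	return p;
}

/* Two-halves entry, as the message describes for 32-bit PAE (hypothetical names). */
typedef struct { uint32_t pte_low, pte_high; } pae_pte_t;

static inline uint64_t pae_pte_val(pae_pte_t p)
{
	return ((uint64_t)p.pte_high << 32) | p.pte_low;
}

static inline pae_pte_t pae_make_pte(uint64_t v)
{
	pae_pte_t p = { .pte_low = (uint32_t)v, .pte_high = (uint32_t)(v >> 32) };
	return p;
}

/* Both layouts hold 64 bits even though their types differ. */
static_assert(sizeof(pae_pte_t) == sizeof(pte_t), "entries differ in size");

/* "pte_val(p) |= bits" no longer compiles; rebuild the entry instead. */
static inline pte_t example_pte_set_bits(pte_t p, pteval_t bits)
{
	return make_pte(pte_val(p) | bits);
}
```

A caller would write `p = example_pte_set_bits(p, 0x1);` where older code assigned through `pte_val(p)` directly.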

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #5
Ingo Molnar [Wed, 30 Jan 2008 12:32:43 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #5

based on:

 Subject: x86: page.h: move and unify types for pagetable entry
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #4
Ingo Molnar [Wed, 30 Jan 2008 12:32:43 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #4

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #3
Ingo Molnar [Wed, 30 Jan 2008 12:32:42 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #3

based on:

 Subject: x86: page.h: move and unify types for pagetable entry
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #2
Ingo Molnar [Wed, 30 Jan 2008 12:32:42 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #2

based on:

 Subject: x86: page.h: move and unify types for pagetable entry
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: move and unify types for pagetable entry, #1
Ingo Molnar [Wed, 30 Jan 2008 12:32:42 +0000 (13:32 +0100)]
x86: page.h: move and unify types for pagetable entry, #1

based on:

 Subject: x86: page.h: move and unify types for pagetable entry
 From: Jeremy Fitzhardinge <jeremy@goop.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: add _AT() macro to conditionally cast
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:42 +0000 (13:32 +0100)]
x86: add _AT() macro to conditionally cast

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199317452 28800
# Node ID f7e7db3facd9406545103164f9be8f9ba1a2b549
# Parent  4d9a413a0f4c1d98dbea704f0366457b5117045d
x86: add _AT() macro to conditionally cast

Define _AT(type, value) to conditionally cast a value when compiling C
code, but not when used in assembler.
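
A minimal sketch of how such a macro can look, assuming `__ASSEMBLY__` is the symbol defined when a header is pulled into an assembler file. The `EXAMPLE_*` names and `example_page_size()` are hypothetical and only illustrate a typical use.

```c
#include <assert.h>

#ifdef __ASSEMBLY__
/* The assembler does not understand C casts, so pass the value through. */
#define _AT(T, X)	X
#else
/* In C, give the constant an explicit type. */
#define _AT(T, X)	((T)(X))
#endif

/* Hypothetical constants, for illustration only. */
#define EXAMPLE_PAGE_SHIFT	12
#define EXAMPLE_PAGE_SIZE	(_AT(unsigned long, 1) << EXAMPLE_PAGE_SHIFT)

/* The cast is applied before the shift, so the result is unsigned long. */
static_assert(sizeof(EXAMPLE_PAGE_SIZE) == sizeof(unsigned long),
	      "constant should have type unsigned long");

static inline unsigned long example_page_size(void)
{
	return EXAMPLE_PAGE_SIZE;
}
```

The same header can then be included from a `.S` file, where `EXAMPLE_PAGE_SIZE` expands to the plain expression `(1 << 12)`.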

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: unify page copying and clearing
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:42 +0000 (13:32 +0100)]
x86: page.h: unify page copying and clearing

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199317362 28800
# Node ID 4d9a413a0f4c1d98dbea704f0366457b5117045d
# Parent  ba0ec40a50a7aef1a3153cea124c35e261f5a2df
x86: page.h: unify page copying and clearing

Move, and to some extent unify, the various page copying and clearing
functions.  The only unification here is that both architectures use
the same function for copying/clearing user and kernel pages.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: page.h: unify constants
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:32:41 +0000 (13:32 +0100)]
x86: page.h: unify constants

# HG changeset patch
# User Jeremy Fitzhardinge <jeremy@xensource.com>
# Date 1199317360 28800
# Node ID ba0ec40a50a7aef1a3153cea124c35e261f5a2df
# Parent  c45c263179cb78284b6b869c574457df088027d1
x86: page.h: unify constants

There are many constants which are shared by 32 and 64-bit.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix detection of CONSTANT_TSC bit for AMD CPUs
Andreas Herrmann [Wed, 30 Jan 2008 12:32:41 +0000 (13:32 +0100)]
x86: fix detection of CONSTANT_TSC bit for AMD CPUs

Commits
 - c52f61fcbdb2aa84f0e4d831ef07f375e6b99b2c
  (x86: allow TSC clock source on AMD Fam10h and some cleanup)
 - e30436f05d456efaff77611e4494f607b14c2782
  (x86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection)

are supposed to fix the detection of constant TSC for AMD CPUs.
Unfortunately on x86_64 it still does not work with current x86/mm.
For a Phenom I still get:

  ...
  TSC calibrated against PM_TIMER
  Marking TSC unstable due to TSCs unsynchronized
  time.c: Detected 2288.366 MHz processor.
  ...

We have to set c->x86_power in early_identify_cpu to properly detect
the CONSTANT_TSC bit in early_init_amd.

The attached patch fixes this issue. The relevant boot messages
with the fix applied are:

  ...
  TSC calibrated against PM_TIMER
  time.c: Detected 2288.279 MHz processor.
  ...
  Initializing CPU#1
  ...
  checking TSC synchronization [CPU#0 -> CPU#1]: passed.
  ...
  Initializing CPU#2
  ...
  checking TSC synchronization [CPU#0 -> CPU#2]: passed.
  ...
  Booting processor 3/4 APIC 0x3
  ...
  checking TSC synchronization [CPU#0 -> CPU#3]: passed.
  Brought up 4 CPUs
  ...

Patch is against x86/mm (v2.6.24-rc8-672-ga9f7faa).
Please apply.

Set c->x86_power in early_identify_cpu. This ensures that
X86_FEATURE_CONSTANT_TSC can properly be set in early_init_amd.
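
The ordering problem can be sketched in plain C. This is a simplified model, not the kernel code: the `example_*` names and the struct are hypothetical, and the CPUID results are passed in as arguments instead of being read from hardware. The one hardware fact relied on is that CPUID leaf 0x80000007 returns the power-management feature flags in EDX, with bit 8 indicating an invariant TSC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of CPUID 0x80000007 EDX: invariant ("constant") TSC. */
#define EXAMPLE_CONSTANT_TSC_BIT	(1u << 8)

/* Hypothetical, reduced stand-in for the per-CPU info structure. */
struct example_cpuinfo {
	uint32_t extended_cpuid_level;
	uint32_t x86_power;
	bool constant_tsc;
};

/* Early identification: record the power flags if the leaf exists. */
static void example_early_identify(struct example_cpuinfo *c,
				   uint32_t max_ext_leaf, uint32_t leaf7_edx)
{
	assert(c);
	c->extended_cpuid_level = max_ext_leaf;
	c->x86_power = 0;
	c->constant_tsc = false;
	if (max_ext_leaf >= 0x80000007u)
		c->x86_power = leaf7_edx;
}

/* Vendor-specific early init: can only see the bit if x86_power was set. */
static void example_early_init_amd(struct example_cpuinfo *c)
{
	assert(c);
	if (c->x86_power & EXAMPLE_CONSTANT_TSC_BIT)
		c->constant_tsc = true;
}
```

If the first step leaves `x86_power` at zero, the second step never sets the flag, which matches the "Marking TSC unstable" symptom described above.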

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: don't disable TSC in any C states on AMD Fam10h
Andi Kleen [Wed, 30 Jan 2008 12:32:41 +0000 (13:32 +0100)]
x86: don't disable TSC in any C states on AMD Fam10h

The ACPI code currently disables TSC use in any C2 and C3
states. But the AMD Fam10h BKDG documents that the TSC
will never stop in any C states when the CONSTANT_TSC bit is
set. Make this disabling conditional on CONSTANT_TSC
not set on AMD.

I actually think this is true on Intel too for C2 states
on CPUs with p-state invariant TSC, but this needs
further discussions with Len to really confirm :-)

So far it is only enabled on AMD.

Cc: lenb@kernel.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: remove explicit C3 TSC check on 64bit
Andi Kleen [Wed, 30 Jan 2008 12:32:41 +0000 (13:32 +0100)]
x86: remove explicit C3 TSC check on 64bit

Trust the ACPI code to disable TSC instead when C3 is used.

AMD Fam10h does not disable TSC in any C states so the
check was incorrect there anyways after the change
to handle this like Intel on AMD too.

This allows using the TSC when C3 is disabled in software
(acpi.max_c_state=2), even though the BIOS supports it.

Match i386 behaviour.

Cc: lenb@kernel.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: allow TSC clock source on AMD Fam10h and some cleanup
Andi Kleen [Wed, 30 Jan 2008 12:32:40 +0000 (13:32 +0100)]
x86: allow TSC clock source on AMD Fam10h and some cleanup

After a lot of discussions with AMD it turns out that TSC
on Fam10h CPUs is synchronized when the CONSTANT_TSC cpuid bit is set.
Or rather that if there are ever systems where that is not
true it would be their BIOS' task to disable the bit.

So finally use TSC gettimeofday on Fam10h by default.

Or rather it is always used now on CPUs where the AMD
specific CONSTANT_TSC bit is set.

This gives a nice speed boost for gettimeofday() on these systems
which tends to be by far the most common v/syscall.

On a Fam10h system here TSC gtod uses about 20% of the CPU time of
acpi_pm based gtod(). This was measured on 32bit, on 64bit
it is even better because TSC gtod() can use a vsyscall
and stay in ring 3, which acpi_pm doesn't.

The Intel check simply checks for CONSTANT_TSC too without hardcoding
Intel vendor. This is equivalent on 64bit because all 64bit capable Intel
CPUs will have CONSTANT_TSC set.

On Intel there is no CPU supplied CONSTANT_TSC bit currently,
but we synthesize one based on hardcoded knowledge which steppings
have p-state invariant TSC.

So the new logic is now: On CPUs which have the AMD specific
CONSTANT_TSC bit set or on Intel CPUs which are new enough
to be known to have p-state invariant TSC always use
TSC based gettimeofday()

Cc: lenb@kernel.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection
Andi Kleen [Wed, 30 Jan 2008 12:32:40 +0000 (13:32 +0100)]
x86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection

Need this in the next patch in time_init and that happens early.

This includes a minor fix on i386 where early_intel_workarounds()
[which is now called early_init_intel] really executes early as
the comments say.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: remove the now unused X86_FEATURE_SYNC_RDTSC
Andi Kleen [Wed, 30 Jan 2008 12:32:40 +0000 (13:32 +0100)]
x86: remove the now unused X86_FEATURE_SYNC_RDTSC

We no longer need to know whether RDTSC is synchronous or not.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix sched_clock()
Ingo Molnar [Wed, 30 Jan 2008 12:32:40 +0000 (13:32 +0100)]
x86: fix sched_clock()

[ andi@firstfloor.org: build fix ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: remove get_cycles_sync
Andi Kleen [Wed, 30 Jan 2008 12:32:39 +0000 (13:32 +0100)]
x86: remove get_cycles_sync

rdtsc is now speculation-safe, so no need for the sync variants of
the APIs.

[ mingo@elte.hu: removed the nsec_barrier() complication. ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: read_tsc sync
Ingo Molnar [Wed, 30 Jan 2008 12:32:39 +0000 (13:32 +0100)]
x86: read_tsc sync

make native_read_tsc() always non-speculative.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: map vsyscalls early enough
Ingo Molnar [Wed, 30 Jan 2008 12:32:39 +0000 (13:32 +0100)]
x86: map vsyscalls early enough

map vsyscalls early enough. This is important if a __vsyscall_fn
function is used by other kernel code too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: move native_read_tsc() offline
Ingo Molnar [Wed, 30 Jan 2008 12:32:39 +0000 (13:32 +0100)]
x86: move native_read_tsc() offline

move native_read_tsc() offline.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: introduce rdtsc_barrier()
Andi Kleen [Wed, 30 Jan 2008 12:32:38 +0000 (13:32 +0100)]
x86: introduce rdtsc_barrier()

rdtsc_barrier() is a new barrier primitive that stops RDTSC speculation
to avoid races with timer interrupts on other CPUs.

It expands either to LFENCE (for Intel CPUs) or MFENCE (for
AMD CPUs) which stops RDTSC on all currently known microarchitectures
that implement SSE. On CPUs without SSE there is generally no RDTSC
speculation.
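
As a rough userspace illustration of the idea, and not the kernel implementation, the sketch below fences a TSC read with LFENCE on both sides. It assumes a GCC or Clang toolchain providing `_mm_lfence()` and `__rdtsc()` through `<x86intrin.h>`. It always uses LFENCE instead of choosing between LFENCE and MFENCE per CPU, and `example_read_tsc_ordered()` is a hypothetical name. On non-x86 builds, or 32-bit builds without SSE2, it simply returns 0.

```c
#include <stdint.h>

#if defined(__x86_64__) || (defined(__i386__) && defined(__SSE2__))
#include <x86intrin.h>
#define EXAMPLE_HAVE_TSC 1
#else
#define EXAMPLE_HAVE_TSC 0
#endif

/* Read the TSC so that the read is not reordered with nearby loads. */
static inline uint64_t example_read_tsc_ordered(void)
{
#if EXAMPLE_HAVE_TSC
	uint64_t t;

	_mm_lfence();	/* earlier loads complete before RDTSC issues */
	t = __rdtsc();
	_mm_lfence();	/* RDTSC completes before later loads issue */
	return t;
#else
	return 0;	/* no TSC available in this build; placeholder value */
#endif
}
```

Choosing the fence instruction at run time, as the message describes for Intel versus AMD, would need a CPU feature check that this sketch leaves out.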

[ mingo@elte.hu: renamed it to rdtsc_barrier() and made it x86-only ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agogit-x86: unbreak UML
WANG Cong [Wed, 30 Jan 2008 12:32:38 +0000 (13:32 +0100)]
git-x86: unbreak UML

Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: move nop declarations into separate include file
Andi Kleen [Wed, 30 Jan 2008 12:32:38 +0000 (13:32 +0100)]
x86: move nop declarations into separate include file

Moving things out of processor.h is always a good thing.

Also needed to avoid include loop in later patch.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: lfence fix
Ingo Molnar [Wed, 30 Jan 2008 12:32:38 +0000 (13:32 +0100)]
x86: lfence fix

LFENCE is available on XMM2 or higher Intel CPUs - not XMM or higher...

this caused boot failures on XMM1 & !XMM1 capable CPUs.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: Implement support to synchronize RDTSC with LFENCE on Intel CPUs
Andi Kleen [Wed, 30 Jan 2008 12:32:37 +0000 (13:32 +0100)]
x86: Implement support to synchronize RDTSC with LFENCE on Intel CPUs

According to Intel RDTSC can be always synchronized with LFENCE
on all current CPUs. Implement the necessary CPUID bit for that.

It is unclear yet if that is true for all future CPUs too,
but if there's another way the kernel can be always updated.

Cc: asit.k.mallick@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: implement support to synchronize RDTSC through MFENCE on AMD CPUs
Andi Kleen [Wed, 30 Jan 2008 12:32:37 +0000 (13:32 +0100)]
x86: implement support to synchronize RDTSC through MFENCE on AMD CPUs

According to AMD RDTSC can be synchronized through MFENCE.
Implement the necessary CPUID bit for that.

Cc: andreas.herrmann3@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: make ptrace.h safe to include from assembler code
Andi Kleen [Wed, 30 Jan 2008 12:32:36 +0000 (13:32 +0100)]
x86: make ptrace.h safe to include from assembler code

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: clean up k8topology.c
Carlos R. Mafra [Wed, 30 Jan 2008 12:32:36 +0000 (13:32 +0100)]
x86: clean up k8topology.c

This patch fixes all errors pointed out by checkpatch.pl.

                                      errors   lines of code   errors/KLOC
arch/x86/mm/k8topology_64.c (before)      72             185         389.1
arch/x86/mm/k8topology_64.c (after)        0             185             0

No code changed.

   text    data     bss     dec     hex filename
   1506       0       0    1506     5e2 k8topology_64.o.after
   1506       0       0    1506     5e2 k8topology_64.o.before

md5sum:

   f9f48331a7eca4fc60d2a03369dc5f53  k8topology_64.o.after
   f9f48331a7eca4fc60d2a03369dc5f53  k8topology_64.o.before

Signed-off-by: Carlos R. Mafra <crmafra@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: clean up apic_32.c, take 2
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:32:36 +0000 (13:32 +0100)]
x86: clean up apic_32.c, take 2

More white space and coding style clean up.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: debug: double-check the empty zero page
Ingo Molnar [Wed, 30 Jan 2008 12:32:36 +0000 (13:32 +0100)]
x86: debug: double-check the empty zero page

temporary debugging - remove before this hits v2.6.25.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: not clear empty_zero_page again
Yinghai Lu [Wed, 30 Jan 2008 12:32:36 +0000 (13:32 +0100)]
x86: not clear empty_zero_page again

empty_zero_page is in .bss section, and it is cleared in clear_bss by
x86_64_start_kernel(). So don't clear that again in mem_init

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: clean up apic_32/64.c
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:32:35 +0000 (13:32 +0100)]
x86: clean up apic_32/64.c

White space and coding style clean up.
Make apic_32/64.c similar.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: introduce force_sig_info_fault helper to X86_64
Harvey Harrison [Wed, 30 Jan 2008 12:32:35 +0000 (13:32 +0100)]
x86: introduce force_sig_info_fault helper to X86_64

Use the force_sig_info_fault helper from X86_32 in X86_64.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: begin fault_{32|64}.c unification
Harvey Harrison [Wed, 30 Jan 2008 12:32:35 +0000 (13:32 +0100)]
x86: begin fault_{32|64}.c unification

Move X86_32 only get_segment_eip to X86_64
Move X86_64 only is_errata93 to X86_32

Change X86_32 loop in is_prefetch to highlight the differences
between them.  Fold the logic from __is_prefetch in as well on
X86_32.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fault_32.c cleanup
Harvey Harrison [Wed, 30 Jan 2008 12:32:34 +0000 (13:32 +0100)]
x86: fault_32.c cleanup

We get die() from kdebug.h, no need for forward declaration.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix style errors in nmi_int.c
Carlos R. Mafra [Wed, 30 Jan 2008 12:32:33 +0000 (13:32 +0100)]
x86: fix style errors in nmi_int.c

This patch fixes most errors detected by checkpatch.pl.

                                     errors   lines of code   errors/KLOC
arch/x86/oprofile/nmi_int.c (after)       1             461           2.1
arch/x86/oprofile/nmi_int.c (before)     60             477         125.7

No code changed.

size:
   text    data     bss     dec     hex filename
   2675     264     472    3411     d53 nmi_int.o.after
   2675     264     472    3411     d53 nmi_int.o.before

md5sum:
  847aea0cc68fe1a2b5e7019439f3b4dd  nmi_int.o.after
  847aea0cc68fe1a2b5e7019439f3b4dd  nmi_int.o.before

Signed-off-by: Carlos R. Mafra <crmafra@gmail.com>
Reviewed-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: code clarification patch to Kprobes arch code
Quentin Barnes [Wed, 30 Jan 2008 12:32:32 +0000 (13:32 +0100)]
x86: code clarification patch to Kprobes arch code

When developing the Kprobes arch code for ARM, I ran across some code
found in x86 and s390 Kprobes arch code which I didn't consider as
good as it could be.

Once I figured out what the code was doing, I changed the code
for ARM Kprobes to work the way I felt was more appropriate.
I've tested the code this way in ARM for about a year and would
like to push the same change to the other affected architectures.

The code in question is in kprobe_exceptions_notify() which
does:
====
          /* kprobe_running() needs smp_processor_id() */
          preempt_disable();
          if (kprobe_running() &&
              kprobe_fault_handler(args->regs, args->trapnr))
                  ret = NOTIFY_STOP;
          preempt_enable();
====

For the moment, ignore the code having the preempt_disable()/
preempt_enable() pair in it.

The problem is that kprobe_running() needs to call smp_processor_id()
which will assert if preemption is enabled.  That sanity check by
smp_processor_id() makes perfect sense since calling it with preemption
enabled would return an unreliable result.

But the function kprobe_exceptions_notify() can be called from a
context where preemption could be enabled.  If that happens, the
assertion in smp_processor_id() happens and we're dead.  So what
the original author did (speculation on my part!) is put in the
preempt_disable()/preempt_enable() pair to simply defeat the check.

Once I figured out what was going on, I considered this an
inappropriate approach.  If kprobe_exceptions_notify() is called
from a preemptible context, we can't be in a kprobe processing
context at that time anyways since kprobes requires preemption to
already be disabled, so just check for preemption enabled, and if
so, blow out before ever calling kprobe_running().  I wrote the ARM
kprobe code like this:
====
          /* To be potentially processing a kprobe fault and to
           * trust the result from kprobe_running(), we have
           * to be non-preemptible. */
          if (!preemptible() && kprobe_running() &&
              kprobe_fault_handler(args->regs, args->trapnr))
                  ret = NOTIFY_STOP;
====

The above code has been working fine for ARM Kprobes for a year.
So I changed the x86 code (2.6.24-rc6) to be the same way and ran
the Systemtap tests on that kernel.  As on ARM, Systemtap on x86
comes up with the same test results either way, so it's a neutral
external functional change (as expected).

This issue has been discussed previously on linux-arm-kernel and the
Systemtap mailing lists.  Pointers to the base of the two
discussions:
http://lists.arm.linux.org.uk/lurker/message/20071219.223225.1f5c2a5e.en.html
http://sourceware.org/ml/systemtap/2007-q1/msg00251.html

Signed-off-by: Quentin Barnes <qbarnes@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
Acked-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
16 years agox86: get rid of checkpatch.pl complaints on apm_32.c
Cyrill Gorcunov [Wed, 30 Jan 2008 12:32:32 +0000 (13:32 +0100)]
x86: get rid of checkpatch.pl complaints on apm_32.c

This patch eliminates most of the code-style errors
discovered by checkpatch.pl on arch/x86/kernel/apm_32.c

no code changed:

      text    data     bss     dec     hex filename
     12142    1837      84   14063    36ef apm_32.o.before
     12142    1837      84   14063    36ef apm_32.o.after

   md5:
       2676b881ad55e387da4a995e8b9ee372  apm_32.o.before.asm
       2676b881ad55e387da4a995e8b9ee372  apm_32.o.after.asm

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>