This patch moves the common code in x86 and x86-64's semaphore.c into a
single file in lib/semaphore-sleepers.c. The arch specific asm stubs are
left in the arch tree (in semaphore.c for i386 and in the asm for x86-64).
There should be no changes in code/functionality with this patch.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zwane Mwaikambo [Sat, 3 Sep 2005 22:56:51 +0000 (15:56 -0700)]
[PATCH] i386 boottime for_each_cpu broken
for_each_cpu walks through all processors in cpu_possible_map, which is
defined as cpu_callout_map on i386 and isn't initialised until all
processors have been booted. This breaks things which do for_each_cpu
iterations early during boot. So, define cpu_possible_map as a bitmap with
NR_CPUS bits populated. This was triggered by a patch i'm working on which
does alloc_percpu before bringing up secondary processors.
The SMP version of __alloc_percpu checks the cpu_possible_map before
allocating memory for a certain cpu. With the above patches the BSP cpuid
is never set in cpu_possible_map which breaks CONFIG_SMP on uniprocessor
machines (as soon as someone tries to dereference something allocated via
__alloc_percpu, which in fact is never allocated since the cpu is not set
in cpu_possible_map).
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Alexander Nyberg <alexn@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This helps complete the encapsulation of updates to page tables (or pages
about to become page tables) into accessor functions rather than using
memcpy() to duplicate them. This is both generally good for consistency
and also necessary for running in a hypervisor which requires explicit
updates to page table entries.
The new function is:
clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
dst - pointer to pgd range anwhere on a pgd page
src - ""
count - the number of pgds to copy.
dst and src can be on the same page, but the range must not overlap
and must not cross a page boundary.
Note that I ommitted using this call to copy pgd entries into the
software suspend page root, since this is not technically a live paging
structure, rather it is used on resume from suspend. CC'ing Pavel in case
he has any feedback on this.
Thanks to Chris Wright for noticing that this could be more optimal in
PAE compiles by eliminating the memset.
George Anzinger [Sat, 3 Sep 2005 22:56:48 +0000 (15:56 -0700)]
[PATCH] x86 NMI: better support for debuggers
This patch adds a notify to the die_nmi notify that the system is about to
be taken down. If the notify is handled with a NOTIFY_STOP return, the
system is given a new lease on life.
We also change the nmi watchdog to carry on if die_nmi returns.
This give debug code a chance to a) catch watchdog timeouts and b) possibly
allow the system to continue, realizing that the time out may be due to
debugger activities such as single stepping which is usually done with
"other" cpus held.
Signed-off-by: George Anzinger<george@mvista.com> Cc: Keith Owens <kaos@ocs.com.au> Signed-off-by: George Anzinger <george@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Note that (0x89 & ~0xfd) == 0; i.e, set_tss_desc(cpu,t) has already stored
the type field in the GDT with the busy bit clear.
Eliminating redundant and obscure code is always a good thing; in fact, I
pointed out this same optimization many moons ago in arch/i386/setup.c,
back when it used to be called that.
The pushf/popf in switch_to are ONLY used to switch IOPL. Making this
explicit in C code is more clear. This pushf/popf pair was added as a
bugfix for leaking IOPL to unprivileged processes when using
sysenter/sysexit based system calls (sysexit does not restore flags).
When requesting an IOPL change in sys_iopl(), it is just as easy to change
the current flags and the flags in the stack image (in case an IRET is
required), but there is no reason to force an IRET if we came in from the
SYSENTER path.
This change is the minimal solution for supporting a paravirtualized Linux
kernel that allows user processes to run with I/O privilege. Other
solutions require radical rewrites of part of the low level fault / system
call handling code, or do not fully support sysenter based system calls.
Unfortunately, this added one field to the thread_struct. But as a bonus,
on P4, the fastest time measured for switch_to() went from 312 to 260
cycles, a win of about 17% in the fast case through this performance
critical path.
Privilege checking cleanup. Originally, these diffs were much greater, but
recent cleanups in Linux have already done much of the cleanup. I added
some explanatory comments in places where the reasoning behind certain
tests is rather subtle.
Also, in traps.c, we can skip the user_mode check in handle_BUG(). The
reason is, there are only two call chains - one via die_if_kernel() and one
via do_page_fault(), both entering from die(). Both of these paths already
ensure that a kernel mode failure has happened. Also, the original check
here, if (user_mode(regs)) was insufficient anyways, since it would not
rule out BUG faults from V8086 mode execution.
Saving the %ss segment in show_regs() rather than assuming a fixed value
also gives better information about the current kernel state in the
register dump.
[PATCH] i386: use set_pte macros in a couple places where they were missing
Also, setting PDPEs in PAE mode does not require atomic operations, since the
PDPEs are cached by the processor, and only reloaded on an explicit or
implicit reload of CR3.
Since the four PDPEs must always be present in an active root, and the kernel
PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
task gates (which reload CR3). Actually, much of this is moot, since the user
PDPEs are never updated either, and the only usage of task gates is by the
doublefault handler. It appears the only place PGDs get updated in PAE mode
is in init_low_mappings() / zap_low_mapping() for initial page table creation
and recovery from ACPI sleep state, and these sites are safe by inspection.
Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
P4.
Subtle fix: load_TLS has been moved after saving %fs and %gs segments to avoid
creating non-reversible segments. This could conceivably cause a bug if the
kernel ever needed to save and restore fs/gs from the NMI handler. It
currently does not, but this is the safest approach to avoiding fs/gs
corruption. SMIs are safe, since SMI saves the descriptor hidden state.
[PATCH] i386: inline assembler: cleanup and encapsulate descriptor and task register management
i386 inline assembler cleanup.
This change encapsulates descriptor and task register management. Also,
it is possible to improve assembler generation in two cases; savesegment
may store the value in a register instead of a memory location, which
allows GCC to optimize stack variables into registers, and MOV MEM, SEG
is always a 16-bit write to memory, making the casting in math-emu
unnecessary.
i386 arch cleanup. Introduce the serialize macro to serialize processor
state. Why the microcode update needs it I am not quite sure, since wrmsr()
is already a serializing instruction, but it is a microcode update, so I will
keep the semantic the same, since this could be a timing workaround. As far
as I can tell, this has always been there since the original microcode update
source.
i386 Inline asm cleanup. Use cr/dr accessor functions.
Also, a potential bugfix. Also, some CR accessors really should be volatile.
Reads from CR0 (numeric state may change in an exception handler), writes to
CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction
re-ordering. I did not add memory clobber to CR3 / CR4 / CR0 updates, as it
was not there to begin with, and in no case should kernel memory be clobbered,
except when doing a TLB flush, which already has memory clobber.
I noticed that page invalidation does not have a memory clobber. I can't find
a bug as a result, but there is definitely a potential for a bug here:
Roland McGrath [Sat, 3 Sep 2005 22:56:35 +0000 (15:56 -0700)]
[PATCH] i386: clean up vDSO alignment padding
This makes the vDSO use nops for all its padding around instructions,
rather than sometimes zeros, and nop-pads the end of the area containing
instructions to a 32-byte cache line, to keep text and data in separate
lines.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is subarch update for ES7000. I've modified platform check code and
removed unnecessary OEM table parsing for newer systems that don't use OEM
information during boot. Parsing the table in fact is causing problems,
and the platform doesn't get recognized. The patch only affects the ES7000
subach.
Signed-off-by: <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The VIA VT8237's IOAPIC sends 'APIC De-Assert Messages' by default, causing
another CPU interrupt when the IRQ pin is de-asserted. This feature is
switched off by the patch to get rid of doubled ioapic level interrupt
rates.
[PATCH] x86: sutomatically enable bigsmp when we have more than 8 CPUs
i386 generic subarchitecture requires explicit dmi strings or command line
to enable bigsmp mode. The patch below removes that restriction, and uses
bigsmp as soon as it finds more than 8 logical CPUs, Intel processors and
xAPIC support.
[PATCH] kdump: Save parameter segment in protected mode (x86)
o With introduction of kexec as boot-loader, the assumption that parameter
segment will always be loaded at lower address than kernel and will be
addressable by early bootup page tables is no longer valid. In kexec on
panic case parameter segment might well be loaded beyond kernel image and
might not be addressable by early boot page tables.
o This case might hit in the scenario where user has reserved a chunk of
memory for second kernel, for example 16MB to 64MB, and has also built
second kernel for physical memory location 16MB. In this case kexec has no
choice but to load the parameter segment at a higher address than new kernel
image at safe location where new kernel does not stomp it.
o Though problem should automatically go away once relocatable kernel for i386
is in place and kexec can determine the location of new kernel at run time
and load parameter segment at lower address than kernel image. But till then
this patch can go in (assuming it does not break something else).
o This patch moves up the boot parameter saving code. Now boot parameters
are copied out in protected mode before page tables are initialized. This
will ensure that parameter segment is always addressable irrespective of
its physical location.
Petr Tesarik [Sat, 3 Sep 2005 22:56:28 +0000 (15:56 -0700)]
[PATCH] vm86: Honor TF bit when emulating an instruction
If the virtual 86 machine reaches an instruction which raises a General
Protection Fault (such as CLI or STI), the instruction is emulated (in
handle_vm86_fault). However, the emulation ignored the TF bit, so the
hardware debug interrupt was not invoked after such an emulated instruction
(and the DOS debugger missed it).
This patch fixes the problem by emulating the hardware debug interrupt as
the last action before control is returned to the VM86 program.
Signed-off-by: Petr Tesarik <kernel@tesarici.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matt Tolentino [Sat, 3 Sep 2005 22:56:27 +0000 (15:56 -0700)]
[PATCH] x86: fix EFI memory map parsing
The memory descriptors that comprise the EFI memory map are not fixed in
stone such that the size could change in the future. This uses the memory
descriptor size obtained from EFI to iterate over the memory map entries
during boot. This enables the removal of an x86 specific pad (and ifdef)
in the EFI header. I also couldn't stomach the broken up nature of the
function to put EFI runtime calls into virtual mode any longer so I fixed
that up a bit as well.
For reference, this patch only impacts x86.
Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] hpet: use read_timer_tsc only when CPU has TSC
Only use read_timer_tsc only when CPU has TSC. Thanks to Andrea for
pointing this out. Should not be issue on any platforms as all recent
systems that has HPET also has CPUs that supports TSC. The patch is still
required for correctness.
[PATCH] x86: compress the stack layout of do_page_fault()
This patch pushes the creation of a rare signal frame (SIGBUS or SIGSEGV)
into a separate function, thus saving stackspace in the main
do_page_fault() stackframe. The effect is 132 bytes less of stack used by
the typical do_page_fault() invocation - resulting in a denser
cache-layout.
(Another minor effect is that in case of kernel crashes that come from a
pagefault, we add less space to the already existing frame, giving the
crash functions a slightly higher chance to do their stuff without
overflowing the stack.)
(The changes also result in slightly cleaner code.)
argument bugfix from "Guillaume C." <guichaz@gmail.com>
arch/mips/kernel/genex.S:250:5: warning: "CONFIG_64BIT" is not defined
arch/mips/math-emu/cp1emu.c:1128:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1206:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:1270:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:323:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:808:5: warning: "__mips64" is not defined
arch/mips/math-emu/cp1emu.c:953:5: warning: "__mips64" is not defined
arch/mips/mm/tlbex.c:519:5: warning: "CONFIG_64BIT" is not defined
include/asm/reg.h:73:5: warning: "CONFIG_64BIT" is not defined
[PATCH] mips: add more SYS_SUPPORT_*_KERNEL and CPU_SUPPORTS_*_KERNEL
The addtion of SYS_SUPPORTS_*_KERNEL and CPU_SUPPORTS_*_KERNEL is halfway.
This patch has added more SYS_SUPPORTS_*_KERNEL and CPU_SUPPORTS_*_KERNEL
to arch/mips/Kconfig. Please apply.
[PATCH] fix warning of TANBAC_TB0219 in drivers/char/Kconfig
$ make menuconfig
scripts/kconfig/mconf arch/i386/Kconfig
drivers/char/Kconfig:847:warning: 'select' used by config symbol
'TANBAC_TB0219' refer to undefined symbol 'PCI_VR41XX'
There seem to be no more users or interest in the NEC Osprey evaluation
system for the NEC VR4181 SOC which is an old part anyway, so remove the
code. More information on the Osprey can be found at
http://www.linux-mips.org/wiki/Osprey.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ppc64: replace schedule_timeout() with msleep_interruptible()
Use msleep_interruptible() instead of schedule_timeout() in ppc64-specific
code to cleanup/simplify the sleeping logic. Change the units of the
parameter of do_event_scan_all_cpus() to milliseconds from jiffies. The
return value of rtas_extended_busy_delay_time() was incorrectly being used
as a jiffies value (it is actually milliseconds), which is fixed by using
the value as a parameter to msleep_interruptible(). Also, use
rtas_extended_busy_delay_time() in another case where similar logic is
duplicated.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olof Johansson [Sat, 3 Sep 2005 22:55:59 +0000 (15:55 -0700)]
[PATCH] ppc64: Add VMX save flag to VPA
We need to indicate to the hypervisor that it needs to save our VMX
registers when switching partitions on a shared-processor system, just as
it needs to for FP and PMC registers.
This could be made to be on-demand when VMX is used, but we don't do that
for FP nor PMC right now either so let's not overcomplicate things.
Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: <engebret@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mark A. Greer [Sat, 3 Sep 2005 22:55:57 +0000 (15:55 -0700)]
[PATCH] ppc32: katana updates
Update the katana platform support code:
- if booted as zImage, pass mem size in via bi_req from bootwrapper
- if booted as uImage, get mem size from bd_info passed in from u-boot
- add support for 82544 present on katana 752i's
- set cacheline size on pci devices
- some minor fixups
Signed-off-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mark A. Greer [Sat, 3 Sep 2005 22:55:56 +0000 (15:55 -0700)]
[PATCH] ppc32: mv64x60 updates & enhancements
Updates and enhancement to the ppc32 mv64x60 code:
- move code to get mem size from mem ctlr to bootwrapper
- address some errata in the mv64360 pic code
- some minor cleanups
- export one of the bridge's regs via sysfs so user daemon can watch for
extraction events
Signed-off-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add declaration and cacheable_memcpy(). I'll be needing this function in
new 4xx EMAC driver I'm going to submit to netdev soon.
IMHO, the better place for the declaration would be asm-powerpc/string.h,
unfortunately, ppc64 doesn't have this function, so asm-ppc/system.h is the
next best place.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Arthur Othieno [Sat, 3 Sep 2005 22:55:51 +0000 (15:55 -0700)]
[PATCH] ppc32: Re-order cputable for 750CXe DD2.4 entry
"745/755" (pvr_value:0x00083000) is a catch-all entry.
Since arch/ppc/kernel/misc.S:identify_cpu() returns on first match,
move this lower in the table so 750CXe DD2.4 (pvr_value:0x00083214)
may be correctly enumerated.
Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:50 +0000 (15:55 -0700)]
[PATCH] ppc32: Added PCI support MPC83xx
Adds support for the two PCI busses on MPC83xx and the MPC834x SYS/PIBS
reference board.
The code initializes PCI inbound/outbound windows, allocates and registers
PCI memory/io space. Be aware that setup of the PCI buses on the PIBs
board is expected to be done by the firmware.
Signed-off-by: Tony Li <tony.li@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Lee Nicks [Sat, 3 Sep 2005 22:55:48 +0000 (15:55 -0700)]
[PATCH] ppc32: add support for Marvell EV64360BP board
This patch adds support for Marvell EV64360BP board. So far, it supports
mpsc serial console, gigabit ethernet, jffs2 root filesystem, etc. Other
device support, like watchdog, RTC, will be added later.
Signed-off-by: Lee Nicks <allinux@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:47 +0000 (15:55 -0700)]
[PATCH] ppc32: add CONFIG_HZ
While ppc32 has the CONFIG_HZ Kconfig option, it wasnt actually being used.
Connect it up and set all platforms to 250Hz. This pretty much mimics the
ppc64 patch from Anton Blanchard.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:46 +0000 (15:55 -0700)]
[PATCH] ppc32: ppc_sys system on chip identification additions
Add the ability to identify an SOC by a name and id. There are cases in
which the integer identifier is not sufficient to specify a specific SOC.
In these cases we can use a string to further qualify the match.
Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Roland Dreier [Sat, 3 Sep 2005 22:55:43 +0000 (15:55 -0700)]
[PATCH] ppc32: Don't sleep in flush_dcache_icache_page()
flush_dcache_icache_page() will be called on an instruction page fault. We
can't sleep in the fault handler, so use kmap_atomic() instead of just
kmap() for the Book-E case.
Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:39 +0000 (15:55 -0700)]
[PATCH] ppc32: Cleaned up global namespace of Book-E watchdog variables
Renamed global variables used to convey if the watchdog is enabled and
periodicity of the timer and moved the declarations into a header for these
variables
Signed-off-by: Matt McClintock <msm@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:36 +0000 (15:55 -0700)]
[PATCH] cpm_uart: Fix 2nd serial port on MPC8560 ADS
The 2nd serial port on the MPC8560 ADS was not being configured correctly
and thus could not be used as a console. Updated the defconfig for the
board to configure the proper SCC channel for the 2nd serial port.
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matt Porter [Sat, 3 Sep 2005 22:55:35 +0000 (15:55 -0700)]
[PATCH] ppc32: add phy excluded features to ocp_func_emac_data
This patch adds a field to struct ocp_func_emac_data that allows
platform-specific unsupported PHY features to be passed in to the ibm_emac
ethernet driver.
This patch also adds some logic for the Bamboo eval board to populate this
field based on the dip switches on the board. This is a workaround for the
improperly biased RJ-45 sockets on the Rev. 0 Bamboo.
Signed-off-by: Wade Farnsworth <wfarnsworth@mvista.com> Signed-off-by: Matt Porter <mporter@kernel.crashing.org> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:34 +0000 (15:55 -0700)]
[PATCH] ppc32: Add ppc_sys descriptions for PowerQUICC II devices
Added ppc_sys device and system definitions for PowerQUICC II devices.
This will allow drivers for PQ2 to be proper platform device drivers.
Which can be shared on PQ3 processors with the same peripherals.
Signed-off-by: Matt McClintock <msm@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kumar Gala [Sat, 3 Sep 2005 22:55:33 +0000 (15:55 -0700)]
[PATCH] ppc32: Added support for the Book-E style Watchdog Timer
PowerPC 40x and Book-E processors support a watchdog timer at the processor
core level. The timer has implementation dependent timeout frequencies
that can be configured by software.
One the first Watchdog timeout we get a critical exception. It is left to
board specific code to determine what should happen at this point. If
nothing is done and another timeout period expires the processor may
attempt to reset the machine.