Jon Mason [Mon, 26 Jun 2006 11:58:14 +0000 (13:58 +0200)]
[PATCH] x86_64: Calgary IOMMU - Calgary specific bits
This patch hooks Calgary into the build, the x86-64 IOMMU
initialization paths, and introduces the Calgary specific bits. The
implementation draws inspiration from both PPC (which has support for
the same chip but requires firmware support which we don't have on
x86-64) and gart. Calgary is different from gart in that it support a
translation table per PHB, as opposed to the single gart aperture.
Changes from previous version:
* Addition of boot-time disablement for bus-level translation/isolation
(e.g, enable userspace DMA for things like X)
* Usage of newer IOMMU abstraction functions
This patch creates a new interface for IOMMUs by adding a centralized
location for IOMMU allocation (for translation tables/apertures) and
IOMMU initialization. In creating these, code was moved around for
abstraction, uniformity, and consiceness.
Take note of the move of the iommu_setup bootarg parsing code to
__setup. This is enabled by moving back the location of the aperture
allocation/detection to mem init (which while ugly, was already the
location of the swiotlb_init).
While a slight departure from the previous patch, I belive this provides
the true intention of the previous versions of the patch which changed
this code. It also makes the addition of the upcoming calgary code much
cleaner than previous patches.
[AK: Removed one broken change. iommu_setup still has to be called
early]
swiotlb relies on the gart specific iommu_aperture variable to know if
we discovered a hardware IOMMU before swiotlb initialization. Introduce
iommu_detected to do the same thing, but in a HW IOMMU neutral manner,
in preparation for adding the Calgary HW IOMMU.
Jan Beulich [Mon, 26 Jun 2006 11:57:41 +0000 (13:57 +0200)]
[PATCH] i386: reliable stack trace support (i386)
These are the i386-specific pieces to enable reliable stack traces. This is
going to be even more useful once CFI annotations get added to he assembly
code, namely to entry.S.
Jan Beulich [Mon, 26 Jun 2006 11:57:38 +0000 (13:57 +0200)]
[PATCH] x86_64: reliable stack trace support (x86-64 syscall
Adjust the CFA offset for 64- and 32-bit syscall entries so that the five
slots pre-subtracted from the stack pointer do not appear to reside outside
of the current frame.
Jan Beulich [Mon, 26 Jun 2006 11:57:32 +0000 (13:57 +0200)]
[PATCH] x86_64: reliable stack trace support (x86-64)
These are the x86_64-specific pieces to enable reliable stack traces. The
only restriction with this is that it currently cannot unwind across the
interrupt->normal stack boundary, as that transition is lacking proper
annotation.
Jan Beulich [Mon, 26 Jun 2006 11:57:28 +0000 (13:57 +0200)]
[PATCH] x86_64: reliable stack trace support
These are the generic bits needed to enable reliable stack traces based
on Dwarf2-like (.eh_frame) unwind information. Subsequent patches will
enable x86-64 and i386 to make use of this.
Thanks to Andi Kleen and Ingo Molnar, who pointed out several possibilities
for improvement.
bibo,mao [Mon, 26 Jun 2006 11:57:25 +0000 (13:57 +0200)]
[PATCH] x86_64: x86_86 msi miss one entry handler
In x86_64 architecture, if device driver with msi function
gets 0xee vector by assign_irq_vector() function, system will
crash if this interrupt happens. It is because 0xee interrupt
entry is empty. This patch modifies this. This patch is based
on 2.6.17-rc6.
Andi Kleen [Mon, 26 Jun 2006 11:57:22 +0000 (13:57 +0200)]
[PATCH] x86_64: Rename IOMMU option, fix help and mark option embedded.
- Rename the GART_IOMMU option to IOMMU to make clear it's not
just for AMD
- Rewrite the help text to better emphatise this fact
- Make it an embedded option because too many people get it wrong.
To my astonishment I discovered the aacraid driver tests this
symbol directly. This looks quite broken to me - it's an internal
implementation detail of the PCI DMA API. Can the maintainer
please clarify what this test was intended to do?
Ingo Molnar [Mon, 26 Jun 2006 11:57:16 +0000 (13:57 +0200)]
[PATCH] x86_64: fix vector_lock deadlock in io_apic.c
Fix a potential deadlock scenario introduced by io_apic.c's new vector_lock
on i386 and x86_64.
Found by the locking correctness validator. The patch was boot-tested on
x86. For details of the deadlock scenario, see the validator output:
======================================================
[ BUG: hard-safe -> hard-unsafe lock order detected! ]
------------------------------------------------------
idle/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
(msi_lock){....}, at: [<c04ff8d2>] startup_msi_irq_wo_maskbit+0x10/0x35
and this task is already holding:
(&irq_desc[i].lock){++..}, at: [<c015b924>] probe_irq_on+0x36/0x107
which would create a new lock dependency:
(&irq_desc[i].lock){++..} -> (msi_lock){....}
but this new dependency connects a hard-irq-safe lock:
(&irq_desc[i].lock){++..}
... which became hard-irq-safe at:
[<c01468c4>] lockdep_acquire+0x68/0x84
[<c10485e9>] _spin_lock+0x21/0x2f
[<c015aff5>] __do_IRQ+0x3d/0x113
[<c01062d3>] do_IRQ+0x8c/0xad
to a hard-irq-unsafe lock:
(vector_lock){--..}
... which became hard-irq-unsafe at:
... [<c01468c4>] lockdep_acquire+0x68/0x84
[<c10485e9>] _spin_lock+0x21/0x2f
[<c011b5e8>] assign_irq_vector+0x34/0xc8
[<c1aa82fa>] setup_IO_APIC+0x45a/0xcff
[<c1aa56e3>] smp_prepare_cpus+0x5ea/0x8aa
[<c010033f>] init+0x32/0x2cb
[<c0102005>] kernel_thread_helper+0x5/0xb
which could potentially lead to deadlocks!
other info that might help us debug this:
3 locks held by idle/1:
#0: (port_mutex){--..}, at: [<c067070d>] uart_add_one_port+0x61/0x289
#1: (&state->mutex){--..}, at: [<c067071f>] uart_add_one_port+0x73/0x289
#2: (&irq_desc[i].lock){++..}, at: [<c015b924>] probe_irq_on+0x36/0x107
Jon Mason [Mon, 26 Jun 2006 11:57:13 +0000 (13:57 +0200)]
[PATCH] x86_64: remove unused gart header file
include/asm-x86_64/gart-mapping.h is only ever used in
arch/x86_64/kernel/setup.c and none of its contents are referenced.
Looks to be leftover cruft not removed in the dma_ops patch.
Andi Kleen [Mon, 26 Jun 2006 11:57:07 +0000 (13:57 +0200)]
[PATCH] x86_64: Remove ia32_sys_call_table export
It was originally added for 2.4 oprofile, but 2.6 oprofile doesn't
need that anymore. Shouldn't be any use in tree anymore and it doesn't
make much sense to export the ia32 syscalls when the main syscalls
are not exported.
I think Adrian Bunk asked for removing it several times.
Also included hunk from Adrian to remove the .globl ia32_sys_call_table
Andi Kleen [Mon, 26 Jun 2006 11:56:52 +0000 (13:56 +0200)]
[PATCH] x86_64: Add compat_printk and sysctl to turn off compat layer warnings
Sometimes e.g. with crashme the compat layer warnings can be noisy.
Add a way to turn them off by gating all output through compat_printk
that checks a global sysctl. The default is not changed.
Andi Kleen [Mon, 26 Jun 2006 11:56:40 +0000 (13:56 +0200)]
[PATCH] x86_64: Clean and enhance up K8 northbridge access code
- Factor out the duplicated access/cache code into a single file
* Shared between i386/x86-64.
- Share flush code between AGP and IOMMU
* Fix a bug: AGP didn't wait for end of flush before
- Drop 8 northbridges limit and allocate dynamically
- Add lock to serialize AGP and IOMMU GART flushes
- Add PCI ID for next AMD northbridge
- Random related cleanups
The old K8 NUMA discovery code is unchanged. New systems
should all use SRAT for this.
Cc: "Navin Boppuri" <navin.boppuri@newisys.com> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jon Mason [Mon, 26 Jun 2006 11:56:37 +0000 (13:56 +0200)]
[PATCH] x86_64: trivial gart clean-up
A trivial change to have gart_unmap_sg call gart_unmap_single directly,
instead of bouncing through the dma_unmap_single wrapper in
dma-mapping.h.
This change required moving the gart_unmap_single above gart_unmap_sg,
and under gart_map_single (which seems a more logical place that its
current location IMHO).
Ingo Molnar [Mon, 26 Jun 2006 11:56:25 +0000 (13:56 +0200)]
[PATCH] x86_64: x86_64-enable-large-bzImage.patch
enable large bzImages on x86_64. (fix is from x86's build.c) Using this
patch i have successfully built and booted an allyesconfig 13MB+ bzImage
on x86_64 too:
Gerd Hoffmann [Mon, 26 Jun 2006 11:56:16 +0000 (13:56 +0200)]
[PATCH] x86_64: x86_64 version of the smp alternative patch.
Changes are largely identical to the i386 version:
* alternative #define are moved to the new alternative.h file.
* one new elf section with pointers to the lock prefixes which can be
nop'ed out for non-smp.
* two new elf sections simliar to the "classic" alternatives to
replace SMP code with simpler UP code.
* fixup headers to use alternative.h instead of defining their own
LOCK / LOCK_PREFIX macros.
The patch reuses the i386 version of the alternatives code to avoid code
duplication. The code in alternatives.c was shuffled around a bit to
reduce the number of #ifdefs needed. It also got some tweaks needed for
x86_64 (vsyscall page handling) and new features (noreplacement option
which was x86_64 only up to now). Debug printk's are changed from
compile-time to runtime.
Loosely based on a early version from Bastian Blank <waldi@debian.org>
Andi Kleen [Mon, 26 Jun 2006 11:56:13 +0000 (13:56 +0200)]
[PATCH] i386/x86-64: Emulate CPUID4 on AMD
Intel systems report the cache level data from CPUID 4 in sysfs.
Add a CPUID 4 emulation for AMD CPUs to report the same
information for them. This allows programs to read this
information in a uniform way.
The AMD way to report this is less flexible so some assumptions
are hardcoded (e.g. no L3)
Andi Kleen [Mon, 26 Jun 2006 11:56:10 +0000 (13:56 +0200)]
[PATCH] i386/x86-64: Use new official CPUID to get APICID/core split on AMD platforms
Previously the apicid<->coreid split was computed based on the max
number of cores. Now use a new CPUID AMD defined for that. On most
systems right now it should be 0 and the old method will be used.
[PATCH] x86_64: Use local APIC ID from local APIC instead of CPUID
vSMPowered systems use apic_cluster too. Forcing apic_physflat works
on these systems too, but only if we change phys_pkg_id to use
hard_smp_prcoessor_id() instead of cpuid_ebx. I am guessing other
multichassi cluster systems would need this too.
Malcolm Parsons [Mon, 26 Jun 2006 01:49:41 +0000 (11:49 +1000)]
[PATCH] uclinux: use PER_LINUX_32BIT in binfmt_flat
binfmt_flat.c calls set_personality with PER_LINUX as the personality.
On the arm architecture this results in the program running in 26bit
usermode. PER_LINUX_32BIT should be used instead. This doesn't affect
other architectures that use binfmt_flat.
[PATCH] m68knommu: improve syscall entry and fix strace
Here is a patch to the system call handling for 5307/5272/etc to:
- fix the strace support (one tested the wrong bit)
- make all system calls a little bit faster by inlining set_esp0 and
supporting ENOSYS out of the critical path.
- remove extraneous spaces
Greg Ungerer [Mon, 26 Jun 2006 01:01:32 +0000 (11:01 +1000)]
[PATCH] m68knommu: force stack alignment on ColdFire
This patch solve a bug triggered by execvp (this function use calloc to
store the argument list and gcc 3.4.x align the stack to word, not to dword).
This situation aren't related to signal handling and all 2.6.x have the bug.
On ColdFire targets we must force the stack to be aligned.
Original patch from Andrea Tarani <andrea.tarani@gilbarco.com>,
Greg Ungerer [Mon, 26 Jun 2006 00:58:09 +0000 (10:58 +1000)]
[PATCH] m68knommu: configurable frequency selection header
Remove list of fixed clock frequency options used for configuring master
clock, and make field an int. Much more flexible this way, no need to add
more options for every new used freqency.
Greg Ungerer [Mon, 26 Jun 2006 00:55:36 +0000 (10:55 +1000)]
[PATCH] m68knommu: configurable frequency selection
Remove list of fixed clock frequency options used for configuring master
clock, and make field an int. Much more flexible this way, no need to add
more options for every new used freqency.
Greg Ungerer [Mon, 26 Jun 2006 00:33:10 +0000 (10:33 +1000)]
[PATCH] m68knommu: cleanup setup.c
A cleanup of m68knommu/kernel/setup.c :
- No need to initialize global pointers to NULL, they will have that value
automatically, and they eat up space in my data segment image in FLASH.
- Remove get_cpuinfo. It has been replaced by show_cpuinfo.
Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Greg Ungerer [Mon, 26 Jun 2006 00:33:10 +0000 (10:33 +1000)]
[PATCH] m68knommu: switch arch config name to CONFIG_M68K
Switch to naming the architecture config options for the m68knommu branch
as "M68K", dropping "M68KNOMMU". The CONFIG_MMU separates the 2 now, and
the m68knommu branch is still strictly speaking an M68K (including the
ColdFire parts).
Linus Torvalds [Sun, 25 Jun 2006 23:07:58 +0000 (16:07 -0700)]
Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IB/iser: iSER Kconfig and Makefile
IB/iser: iSER handling of memory for RDMA
IB/iser: iSER RDMA CM (CMA) and IB verbs interaction
IB/iser: iSER initiator iSCSI PDU and TX/RX
IB/iser: iSCSI iSER transport provider high level code
IB/iser: iSCSI iSER transport provider header file
IB/uverbs: Remove unnecessary list_del()s
IB/uverbs: Don't free wr list when it's known to be empty
Björn Steinbrink [Sun, 25 Jun 2006 14:24:40 +0000 (16:24 +0200)]
[PATCH] i386: Fix softirq accounting with 4K stacks
Copy the softirq bits in preempt_count from the current context into the
hardirq context when using 4K stacks to make the softirq_count macro work
correctly and thereby fix softirq cpu time accounting.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sun, 25 Jun 2006 17:54:14 +0000 (10:54 -0700)]
Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits)
nfs: remove nfs_put_link()
nfs-build-fix-99
git-nfs-build-fixes
Merge branch 'odirect'
NFS: alloc nfs_read/write_data as direct I/O is scheduled
NFS: Eliminate nfs_get_user_pages()
NFS: refactor nfs_direct_free_user_pages
NFS: remove user_addr, user_count, and pos from nfs_direct_req
NFS: "open code" the NFS direct write rescheduler
NFS: Separate functions for counting outstanding NFS direct I/Os
NLM: Fix reclaim races
NLM: sem to mutex conversion
locks.c: add the fl_owner to nlm_compare_locks
NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
NFS: Split fs/nfs/inode.c
NFS: Fix typo in nfs_do_clone_mount()
NFS: Fix compile errors introduced by referrals patches
NFSv4: Ensure that referral mounts bind to a reserved port
NFSv4: A root pathname is sent as a zero component4
NFSv4: Follow a referral
...
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (244 commits)
V4L/DVB (4210b): git-dvb: tea575x-tuner build fix
V4L/DVB (4210a): git-dvb versus matroxfb
V4L/DVB (4209): Added some BTTV PCI IDs for newer boards
Fixes some sync issues between V4L/DVB development and GIT
V4L/DVB (4206): Cx88-blackbird: always set encoder height based on tvnorm->id
V4L/DVB (4205): Merge tda9887 module into tuner.
V4L/DVB (4203): Explicitly set the enum values.
V4L/DVB (4202): allow selecting CX2341x port mode
V4L/DVB (4200): Disable bitrate_mode when encoding mpeg-1.
V4L/DVB (4199): Add cx2341x-specific control array to cx2341x.c
V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API
V4L/DVB (4197): Port new MPEG API to saa7134-empress with saa6752hs
V4L/DVB (4196): Port cx88-blackbird to the new MPEG API.
V4L/DVB (4193): Update cx2341x fw encoding API doc.
V4L/DVB (4192): Use control helpers for saa7115, cx25840, msp3400.
V4L/DVB (4191): Add CX2341X MPEG encoder module.
V4L/DVB (4190): Add helper functions for control processing to v4l2-common.
V4L/DVB (4189): Add videodev support for VIDIOC_S/G/TRY_EXT_CTRLS.
V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h
V4L/DVB (4186): Add support for the DNTV Live! mini DVB-T card.
...
Amul Shah [Sun, 25 Jun 2006 12:49:31 +0000 (05:49 -0700)]
[PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines
This patch will fix a boot memory reservation bug that trashes memory on
the ES7000 when loading the kdump crash kernel.
The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash
kernel uses the non-numa aware "reserve_bootmem" function instead of the
NUMA aware "reserve_bootmem_generic". I checked to make sure that no other
function was using "reserve_bootmem" and found none, except the ones that
had NUMA ifdef'ed out.
I have tested this patch only on an ES7000 with NUMA on and off (numa=off)
in a single (non-NUMA) and multi-cell (NUMA) configurations.
Signed-off-by: Amul Shah <amul.shah@unisys.com> Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Heiko Carstens [Sun, 25 Jun 2006 12:49:30 +0000 (05:49 -0700)]
[PATCH] s390: setup.c cleanup + build fix
Cleanup & fix 31 bit compilation:
CC arch/s390/kernel/setup.o
arch/s390/kernel/setup.c:83: error: initializer element is not computable at
load time
arch/s390/kernel/setup.c:83: error: (near initialization for
'code_resource.start')
Not sure which patch in the -mm tree breaks this, but since this can be
considered a cleanup it can be merged anyway.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
KaiGai Kohei [Sun, 25 Jun 2006 12:49:26 +0000 (05:49 -0700)]
[PATCH] pacct: none-delayed process accounting accumulation
In current 2.6.17 implementation, signal_struct refered from task_struct is
used for per-process data structure. The pacct facility also uses it as a
per-process data structure to store stime, utime, minflt, majflt. But those
members are saved in __exit_signal(). It's too late.
For example, if some threads exits at same time, pacct facility has a
possibility to drop accountings for a part of those threads. (see, the
following 'The results of original 2.6.17 kernel') I think accounting
information should be completely collected into the per-process data structure
before writing out an accounting record.
This patch fixes this matter. Accumulation of stime, utime, minflt and majflt
are done before generating accounting record.
[mingo@elte.hu: fix acct_collect() siglock bug found by lockdep] Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
KaiGai Kohei [Sun, 25 Jun 2006 12:49:25 +0000 (05:49 -0700)]
[PATCH] pacct: avoidance to refer the last thread as a representation of the process
When pacct facility generate an 'ac_flag' field in accounting record, it
refers a task_struct of the thread which died last in the process. But any
other task_structs are ignored.
Therefore, pacct facility drops ASU flag even if root-privilege operations are
used by any other threads except the last one. In addition, AFORK flag is
always set when the thread of group-leader didn't die last, although this
process has called execve() after fork().
We have a same matter in ac_exitcode. The recorded ac_exitcode is an exit
code of the last thread in the process. There is a possibility this exitcode
is not the group leader's one.
KaiGai Kohei [Sun, 25 Jun 2006 12:49:24 +0000 (05:49 -0700)]
[PATCH] pacct: add pacct_struct to fix some pacct bugs.
The pacct facility need an i/o operation when an accounting record is
generated. There is a possibility to wake OOM killer up. If OOM killer is
activated, it kills some processes to make them release process memory
regions.
But acct_process() is called in the killed processes context before calling
exit_mm(), so those processes cannot release own memory. In the results, any
processes stop in this point and it finally cause a system stall.
Paul Fulghum [Sun, 25 Jun 2006 12:49:20 +0000 (05:49 -0700)]
[PATCH] add synclink_gt custom hdlc idle
Add custom HDLC idle pattern feature.
It allows the user to specify an arbitrary 8 or 16 bit repeating pattern on
the transmit data pin between HDLC frames.
In most cases the idle pattern is continuous ones or flags as supported by off
the shelf synchronous controllers and defined in the ISO3309 standard. Some
applications (radio/satellite modems, connections to legacy military hardware)
require non-standard patterns.
Signed-off-by: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sun, 25 Jun 2006 12:49:19 +0000 (05:49 -0700)]
[PATCH] irda-usb printk fix
drivers/net/irda/irda-usb.c: In function 'stir421x_patch_device':
drivers/net/irda/irda-usb.c:1108: warning: format '%u' expects type 'unsigned int', but argument 4 has type 'size_t'
Cc: Greg KH <greg@kroah.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>