Joe Korty [Sun, 1 May 2005 15:59:06 +0000 (08:59 -0700)]
[PATCH] add EOWNERDEAD and ENOTRECOVERABLE version 2
Add EOWNERDEAD and ENOTRECOVERABLE to all architectures. This is to
support the upcoming patches for robust mutexes.
We normally don't reserve parts of the name/number space for external
patches, but robust mutexes are sufficiently popular and important to
justify it in this case.
Signed-off-by: Joe Korty <joe.korty@ccur.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jens Axboe [Sun, 1 May 2005 15:59:06 +0000 (08:59 -0700)]
[PATCH] noop-iosched: kill O(N) merge scan
Profiling hit rates on merging shows that the last merge hint works
extremely well for most work loads. So lets kill the linear merge scan in
noop-iosched, so it provides O(1) run time for any operation.
Peter Missel [Sun, 1 May 2005 15:59:05 +0000 (08:59 -0700)]
[PATCH] LifeView FlyTV Platinum FM: GPIO usage
This is take two of a patch that should have appeared two days ago, before
yesterday's "remote control" patch for the same card.
This patch sets unconnected GPIO to Output to keep them from floating (just
good driver writing practice, being nice to the chip), and uses GPIO16 to
switch TV vs. FM - this pin switches inputs onto the tuner, as well as the
audio output from the tuner into the 7135 SIF input. Consequently, FM
radio support is being un-commented because it's now working (sort of, see
below).
These two patches get the card almost fully operational; there appears to
be a bug in tda8290.c remaining that puts an offset onto the tuned
frequency in FM radio mode. We're investigating.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Missel [Sun, 1 May 2005 15:59:05 +0000 (08:59 -0700)]
[PATCH] LifeView FlyTV Platinum FM: Remote Control support
Subject says it ... this card's IR microcontroller design and attachment
are compatible to the company's previous designs, so the patch was as
simple as it gets.
DESC
LifeView FlyTV Platinum FM: GPIO usage
EDESC
From: Peter Missel <peter.missel@onlinehome.de>
This is take two of a patch that should have appeared two days ago, before
yesterday's "remote control" patch for the same card.
This patch sets unconnected GPIO to Output to keep them from floating (just
good driver writing practice, being nice to the chip), and uses GPIO16 to
switch TV vs. FM - this pin switches inputs onto the tuner, as well as the
audio output from the tuner into the 7135 SIF input. Consequently, FM
radio support is being un-commented because it's now working (sort of, see
below).
These two patches get the card almost fully operational; there appears to
be a bug in tda8290.c remaining that puts an offset onto the tuned
frequency in FM radio mode. We're investigating.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] reiserfs: make resize option auto-get new device size
It's trivial for the resize option to auto-get the underlying device size,
while it's harder for the user. I've copied the code from jfs.
Since of the different reiserfs option parser (which does not use the
superior match_token used by almost every other filesystem), I've had to
use the "resize=auto" and not "resize" option to specify this behaviour.
Changing the option parser to the kernel one wouldn't be bad but I've no
time to do this cleanup in this moment.
Btw, the mount(8) man page should be updated to include this option. Cc
the relevant people, please (I hope I cc'ed the right people).
Cc: <reiserfs-dev@namesys.com> Cc: <reiserfs-list@namesys.com> Cc: <mtk-manpages@gmx.net> Cc: Alex Zarochentsev <zam@namesys.com> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Change synchronize_kernel to _rcu and _sched
This patch changes calls to synchronize_kernel(), deprecated in the earlier
"Deprecate synchronize_kernel, GPL replacement" patch to instead call the new
synchronize_rcu() and synchronize_sched() APIs.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The synchronize_kernel() primitive is used for quite a few different purposes:
waiting for RCU readers, waiting for NMIs, waiting for interrupts, and so on.
This makes RCU code harder to read, since synchronize_kernel() might or might
not have matching rcu_read_lock()s. This patch creates a new
synchronize_rcu() that is to be used for RCU readers and a new
synchronize_sched() that is used for the rest. These two new primitives
currently have the same implementation, but this is might well change with
additional real-time support. Both new primitives are GPL-only, the old
primitive is deprecated.
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] kernel/rcupdate.c: make the exports EXPORT_SYMBOL_GPL
The gpl exports need to be put back. Moving them to GPL -- but in a
measured manner, as I proposed on this list some months ago -- is fine.
Changing these particular exports precipitously is most definitely -not-
fine. Here is my earlier proposal:
See below for a patch that puts the exports back, along with an updated
version of my earlier patch that starts the process of moving them to GPL.
I will also be following this message with RFC patches that introduce two
(EXPORT_SYMBOL_GPL) interfaces to replace synchronize_kernel(), which then
becomes deprecated.
Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Daniel Drake [Sun, 1 May 2005 15:59:03 +0000 (08:59 -0700)]
[PATCH] procfs: Fix hardlink counts for /proc/<PID>/task
The current logic assumes that a /proc/<PID>/task directory should have a
hardlink count of 3, probably counting ".", "..", and a directory for a
single child task.
It's fairly obvious that this doesn't work out correctly when a PID has
more than one child task, which is quite often the case.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Daniel Drake [Sun, 1 May 2005 15:59:03 +0000 (08:59 -0700)]
[PATCH] procfs: Fix hardlink counts
The pid directories in /proc/ currently return the wrong hardlink count - 3,
when there are actually 4 : ".", "..", "fd", and "task".
This is easy to notice using find(1):
cd /proc/<pid>
find
In the output, you'll see a message similar to:
find: WARNING: Hard link count is wrong for .: this may be a bug in your
filesystem driver. Automatically turning on find's -noleaf option.
Earlier results may have failed to include directories that should have
been searched.
http://bugs.gentoo.org/show_bug.cgi?id=86031
I also noticed that CONFIG_SECURITY can add a 5th: attr, and performed a
similar fix on the task directories too.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stas Sergeev [Sun, 1 May 2005 15:59:02 +0000 (08:59 -0700)]
[PATCH] move SA_xxx defines to linux/signal.h
The attached patch moves the IRQ-related SA_xxx flags (namely, SA_PROBE,
SA_SAMPLE_RANDOM and SA_SHIRQ) from all the arch-specific headers to
linux/signal.h. This looks like a left-over after the irq-handling code
was consolidated. The code was moved to kernel/irq/*, but the flags are
still left per-arch.
Right now, adding a new IRQ flag to the arch-specific header, like this
patch does:
http://cvs.sourceforge.net/viewcvs.py/*checkout*/alsa/alsa-driver/utils/patches/pcsp-kernel-2.6.10-03.diff?rev=1.1
no longer works, it breaks the compilation for all other arches, unless you
add that flag to all the other arch-specific headers too. So I think such
a clean-up makes sense.
Matt Mackall [Sun, 1 May 2005 15:59:00 +0000 (08:59 -0700)]
[PATCH] nice and rt-prio rlimits
Add a pair of rlimits for allowing non-root tasks to raise nice and rt
priorities. Defaults to traditional behavior. Originally written by
Chris Wright.
The patch implements a simple rlimit ceiling for the RT (and nice) priorities
a task can set. The rlimit defaults to 0, meaning no change in behavior by
default. A value of 50 means RT priority levels 1-50 are allowed. A value of
100 means all 99 privilege levels from 1 to 99 are allowed. CAP_SYS_NICE is
blanket permission.
(akpm: see http://www.uwsg.iu.edu/hypermail/linux/kernel/0503.1/1921.html for
tips on integrating this with PAM).
Signed-off-by: Matt Mackall <mpm@selenic.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Horst Hummel [Sun, 1 May 2005 15:58:59 +0000 (08:58 -0700)]
[PATCH] s390: don't pad cdl blocks for write requests
The first blocks on a cdl formatted dasd device are smaller than the blocksize
of the device. Read requests are padded with a 'e5' pattern. Write requests
should not pad the (user) buffer with 'e5' because a write request is not
allowed to modify the buffer.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] s390: enable write barriers in the dasd driver
The DASD device driver never reorders the I/O requests and relies on the
hardware to write all data to nonvolatile storage before signaling a
successful write. Hence, the only thing we have to do to support write
barriers is to set the queue ordered flag.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Horst Hummel [Sun, 1 May 2005 15:58:59 +0000 (08:58 -0700)]
[PATCH] s390: dasd readonly attribute
The independent read-only flags in devmap, dasd_device and gendisk are not
kept in sync. Use one bit per feature in the dasd driver and keep that bit in
sync with the gendisk bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
An arbitrary guest must not be allowed to trigger cmm actions. Only one
specific guest namely the one that serves as the resource monitor may send cmm
messages. Add a parameter that allows to specify the guest that may send
messages. z/VMs resource manager has the name 'VMRMSVM' which is the default.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Heiko Carstens [Sun, 1 May 2005 15:58:57 +0000 (08:58 -0700)]
[PATCH] s390: fix memory holes and cleanup setup_arch
The memory setup didn't take care of memory holes and this makes the memory
management think there would be more memory available than there is in
reality. That causes the OOM killer to kill processes even if there is enough
memory left that can be written to the swap space.
The patch fixes this by using free_area_init_node with an array of memory
holes instead of free_area_init. Further the patch cleans up the code in
setup.c by splitting setup_arch into smaller pieces.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix overflow in calculation of the new tod value in stop_hz_timer and fix
wrong virtual timer list idle time in case the virtual timer is already
expired in stop_cpu_timer.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the set_disk_ro() API when the backing file is read-only, to mark the disk
read-only, during the ->open(). The current hack does not work when doing a
mount -o remount.
Also, mark explicitly the code paths which should no more be triggerable (I've
removed the WARN_ON(1) things). They should actually become BUG()s probably
but I'll avoid that since I'm not so sure the change works so well. I gave it
only some limited testing.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use this:
.set_page_dirty = __set_page_dirty_nobuffers,
We already dropped the inclusion of <linux/buffer_head.h>, and we don't have a
backing block device for this FS.
"Without having looked at it, I'm sure that hostfs does not use buffer_heads.
So setting your ->set_page_dirty a_op to point at __set_page_dirty_nobuffers()
is a reasonable thing to do - it'll provide a slight speedup."
This speedup is one less spinlock held and one less conditional branch, which
isn't bad.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix some console locking problems (including scheduling in atomic) and various
reorderings and cleanup in that code. Not yet ready for 2.6.12 probably.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] uml: fix syscall table by including $(SUBARCH)'s one, for i386
Split the i386 entry.S files into entry.S and syscall_table.S which is
included in the previous one (so actually there is no difference between them)
and use the syscall_table.S in the UML build, instead of tracking by hand the
syscall table changes (which is inherently error-prone).
We must only insert the right #defines to inject the changes we need from the
i386 syscall table (for instance some different function names); also, we
don't implement some i386 syscalls, as ioperm(), nor some TLS-related ones
(yet to provide).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GCC 2.95 uses __va_copy instead of va_copy. Handle it inside compiler.h
instead of in a casual file, and avoid the risk that this breaks with a newer
compiler (which it could do).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rob Landley [Sun, 1 May 2005 15:58:54 +0000 (08:58 -0700)]
[PATCH] uml: workaround old problematic sed behaviour
Old versions of sed from 1998 (predating the first release of gcc 2.95, but
still in use by debian stable) don't understand the single-line version of the
sed append command. Since newer versions of sed still understand the...
ahem, "vintage" form of the command, change our code to use that.
Signed-off-by: Rob Landley <rob@landley.net> Acked-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Sun, 1 May 2005 15:58:53 +0000 (08:58 -0700)]
[PATCH] uml: fix oops related to exception table
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Prevent the kernel from oopsing during the extable sorting, as it can do
now, because the extable is in the readonly section of the binary.
Jeff says: The exception table turned RO in 2.6.11-rc3-mm1 for some reason.
Moving it causes it to land in the writable data section of the binary.
Paolo says: This patch fixes a oops on startup, which can be easily
triggered by compiling with CONFIG_MODE_TT disabled, and STATIC_LINK either
disabled or enabled. The resulting kernel will always Oops on startup,
after printing this simple output:
I've verified, by binary search on the BitKeeper repository (synced up as
of 2.6.12-rc2), starting from the range 2.6.11-2.6.12-rc1, that this bug
shows up on BitKeeper revisions in the range [@1.1994.11.168,+inf), i.e.
starting from this:
[PATCH] lib/sort: Replace insertion sort in exception tables
Since UML does not use the exception table, it's likely that insertion sort
didn't happen to write anything on the table.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] x86-64: Handle empty E820 regions correctly
Brings sanitize_e820_map() in x86-64 in sync with that of i386.
x86_64 version was missing the changes from this patch.
http://linux.bkbits.net:8080/linux-2.6/cset@3e5e4083Y3HevldZl5KCy94V4DcZww?nav=index.html|src/|src/arch|src/arch/i386|src/arch/i386/kernel|related/arch/i386/kernel/setup.c
Pavel Pisa [Sun, 1 May 2005 15:58:52 +0000 (08:58 -0700)]
[PATCH] Linux 2.6.x VM86 interrupt emulation fixes
Patch solves VM86 interrupt emulation deadlock on SMP systems. The VM86
interrupt emulation has been heavily tested and works well on UP systems
after last update, but it seems to deadlock when we have used it on SMP/HT
boxes now.
It seems, that disable_irq() cannot be called from interrupts, because it
waits until disabled interrupt handler finishes
(/kernel/irq/manage.c:synchronize_irq():while(IRQ_INPROGRESS);). This
blocks one CPU after another. Solved by use disable_irq_nosync.
There is the second problem. If IRQ source is fast, it is possible, that
interrupt is sometimes processed and re-enabled by the second CPU, before
it is disabled by the first one, but negative IRQ disable depths are not
allowed. The spinlocking and disabling IRQs over call to
disable_irq_nosync/enable_irq is the only solution found reliable till now.
Signed-off-by: Michal Sojka <sojkam1@control.felk.cvut.cz> Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Increase number of e820 entries hard limit from 32 to 128
The specifications that talk about E820 map doesn't have an upper limit on
the number of e820 entries. But, today's kernel has a hard limit of 32.
With increase in memory size, we are seeing the number of E820 entries
reaching close to 32. Patch below bumps the number upto 128.
The patch changes the location of EDDBUF in zero-page (as it comes after E820).
As, EDDBUF is not used by boot loaders, this patch should not have any effect
on bootloader-setup code interface.
Patch covers both i386 and x86-64.
Tested on:
* grub booting bzImage
* lilo booting bzImage with EDID info enabled
* pxeboot of bzImage
Side-effect:
bss increases by ~ 2K and init.data increases by ~7.5K
on all systems, due to increase in size of static arrays.
Zwane Mwaikambo [Sun, 1 May 2005 15:58:51 +0000 (08:58 -0700)]
[PATCH] cpuid x87 bit on AMD falsely marked as PNI
http://bugme.osdl.org/show_bug.cgi?id=4426
vendor_id : AuthenticAMD
cpu family : 6
model : 10
model name : AMD Athlon(tm) XP
stepping : 0
cpu MHz : 2204.807
<snipped>
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips : 4358.14
We're marking bit 0 of extended function 0x80000001 cpuid as PNI support on
AMD processors, when it actually denotes x87 FPU present. Patch for i386
and x86_64 below.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jason Gaston [Sun, 1 May 2005 15:58:50 +0000 (08:58 -0700)]
[PATCH] hda_intel: Intel ESB2 support
This adds the Intel ESB2 HD Audio DID to the hda_intel.c audio driver.
Signed-off-by: Â Jason Gaston <Jason.d.gaston@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jason Gaston [Sun, 1 May 2005 15:58:50 +0000 (08:58 -0700)]
[PATCH] irq and pci_ids for Intel ICH7DH & ICH7-M DH
This patch adds the Intel ICH7DH and ICH7-M DH DID's to the irq.c and
pci_ids.h files.
Signed-off-by: Â Jason Gaston <Jason.d.gaston@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
john stultz [Sun, 1 May 2005 15:58:50 +0000 (08:58 -0700)]
[PATCH] i386: fix hpet for systems that don't support legacy replacement
Currently the i386 HPET code assumes the entire HPET implementation from
the spec is present. This breaks on boxes that do not implement the
optional legacy timer replacement functionality portion of the spec.
This patch, which is very similar to my x86-64 patch for the same issue,
fixes the problem allowing i386 systems that cannot use the HPET for the
timer interrupt and RTC to still use the HPET as a time source. I've
tested this patch on a system systems without HPET, with HPET but without
legacy timer replacement, as well as HPET with legacy timer replacement.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
H. Peter Anvin [Sun, 1 May 2005 15:58:49 +0000 (08:58 -0700)]
[PATCH] CPUID bug and inconsistency fix
The recent support for K8 multicore was misported from x86-64 to i386, due
to an unnecessary inconsistency between the CPUID code. Sure, there is are
no x86-64 VIA chips yet, but it should happen eventually.
This patch fixes the i386 bug as well as makes x86-64 match i386 in the
handing of the CPUID array.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jaya Kumar [Sun, 1 May 2005 15:58:49 +0000 (08:58 -0700)]
[PATCH] x86 reboot: Add reboot fixup for gx1/cs5530a
This patch by Jaya Kumar introduces a generic infrastructure to deal with
x86 chipsets with nonstandard reset sequences, and adds support for the
Geode gx1/cs5530a chipset.
Signed-off-by: Jaya Kumar <jayalk@intworks.biz> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jack F Vogel [Sun, 1 May 2005 15:58:48 +0000 (08:58 -0700)]
[PATCH] check nmi watchdog is broken
A bug against an xSeries system showed up recently noting that the
check_nmi_watchdog() test was failing.
I have been investigating it and discovered in both i386 and x86_64 the
recent change to the routine to use the cpu_callin_map has uncovered a
problem. Prior to that change, on an SMP box, the test was trivally
passing because all cpu's were found to not yet be online, but now with the
callin_map they are discovered, it goes on to test the counter and they
have not yet begun to increment, so it announces a CPU is stuck and bails
out.
On all the systems I have access to test, the announcement of failure is
also bougs... by the time you can login and check /proc/interrupts, the
NMI count is happily incrementing on all CPUs. Its just that the test is
being done too early.
I have tried moving the call to the test around a bit, and it was always
too early. I finally hit on this proposed solution, it delays the routine
via a late_initcall(), seems like the right solution to me.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,
movl (%eax),%ds
movl %ds,(%eax)
To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,
mov (%eax),%ds
mov %ds,(%eax)
should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support
movw (%eax),%ds
movw %ds,(%eax)
without the 0x66 prefix. I am enclosing patches for 2.4 and 2.6 kernels
here. The resulting kernel binaries should be unchanged as before, with
old and new assemblers, if gcc never generates memory access for
Jake Moilanen [Sun, 1 May 2005 15:58:47 +0000 (08:58 -0700)]
[PATCH] ppc64: reverse prediction on spinlock busy loop code
On our raw spinlocks, we currently have an attempt at the lock, and if we do
not get it we enter a spin loop. This spinloop will likely continue for
awhile, and we pridict likely.
Shouldn't we predict that we will get out of the loop so our next instructions
are already prefetched. Even when we miss because the lock is still held, it
won't matter since we are waiting anyways.
I did a couple quick benchmarks, but the results are inconclusive.
16-way 690 running specjbb with original code
# ./specjbb 3000 16 1 1 19 30 120
...
Valid run, Score is 59282
Mispredicting the spinlock busy loop also means we slow down the rate at which
we do the loads which can be good for heavily contended locks.
Note: There are some gcc issues with our default build and branch prediction,
but a CONFIG_POWER4_ONLY build should emit them correctly. I'm working with
Alan Modra on it now.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Anton Blanchard [Sun, 1 May 2005 15:58:46 +0000 (08:58 -0700)]
[PATCH] ppc64: firmware workaround
Recent gcc 4.0 testing uncovered a firmware issue. Some properties are larger
than 31 bytes and due to gcc 4.0s better stack allocation this overflow ran
over non volatile register storage.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Anton Blanchard [Sun, 1 May 2005 15:58:45 +0000 (08:58 -0700)]
[PATCH] ppc64: noexec fixes
There were a few issues with the ppc64 noexec support:
The 64bit ABI has a non executable stack by default. At the moment 64bit apps
require a PT_GNU_STACK section in order to have a non executable stack.
Disable the read implies exec workaround on the 64bit ABI. The 64bit
toolchain has never had problems with incorrect mmap permissions (the 32bit
has, thats why we need to retain the workaround).
With these fixes as well as a gcc fix from Alan Modra (that was recently
committed) 64bit apps work as expected.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olof Johansson [Sun, 1 May 2005 15:58:45 +0000 (08:58 -0700)]
[PATCH] PPC64: Remove hot busy-wait loop in __hash_page
It turns out that our current __hash_page code will do a very hot busy-wait
loop waiting on _PAGE_BUSY to be cleared. It even does ldarx/stdcx in the
loop, which will bounce reservations around like crazy if there's more than
one CPU spinning on the same PTE (or even another PTE in the same
reservation granule). The end result is that each fault takes longer when
there's contention, which in turn increases the chance of another thread
hitting the same fault and also piling up. Not pretty.
There's two options here:
1. Do an out-of-line busy loop a'la spinlocks with just loads (no
reserves)
2. Just bail and refault if needed.
(2) makes sense here: If the PTE is busy, chances are it's in flux anyway
and the other code path making a change might just be ready to hash it.
This fixes a stampede seen on a large-ish system where a multithreaded
HPC app faults in the same text pages on several cpus at the same time.
Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Sun, 1 May 2005 15:58:45 +0000 (08:58 -0700)]
[PATCH] ppc64: tell firmware about kernel capabilities
On pSeries systems, according to the platform architecture specs, we are
supposed to be supplying a structure to firmware that tells firmware about
our capabilities, such as which version of the data structures that
describe available memory we are expecting to see. The way we end up
having to supply this data structure is a bit gross, since it was designed
for AIX and doesn't suit us very well. This patch adds the code to supply
this data structure to the firmware.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Sun, 1 May 2005 15:58:44 +0000 (08:58 -0700)]
[PATCH] ppc64: Fix irq parsing on powermac
When I tried Ben's patches to the powermac sound driver on my G5, I found
that it was taking enormous numbers of sound DMA transmit interrupts. This
turned out to be because it was incorrectly configured as level-sensitive
instead of edge-sensitive, which in turn was because the code that parses
the interrupt tree that Open Firmware gives us was incorrectly assigning
another device the same irq number as the sound DMA transmit interrupt
(i.e. 1).
This patch fixes the problem, in a somewhat quick and dirty way for now,
but one which will work for all the machines we currently run on.
Ultimately Ben and I want to do something more general and robust, but this
should go in for 2.6.12.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch from Roland adds a PT_NOTE section to both 32 and 64 bits vDSOs
to expose the kernel version to glibc, thus avoiding a uname syscall on
every launch. This is equivalent to the patches Roland posted already for
x86 and x86-64.
Note: the 64 bits .note is actually using the 32 bits format. This is
normal. The ELF spec specifies a different format for 64 bits .note, but
for some reason, this was never properly implemented, the core dumps for
example are all using 32 bits format .note, and binutils cannot even read a
64 bits format .note. Talking to our toolchain folks, they think we'd
rather stick to 32 bits format .note everywhere and get the spec fixed some
day ...
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Colin Leroy [Sun, 1 May 2005 15:58:43 +0000 (08:58 -0700)]
[PATCH] pmac: save master volume on sleep
Ben's patch that shutdowns master switch and restores it after resume
("pmac: Improve sleep code of tumbler driver") isn't enough here on an
iBook (snapper chip).
The master switch is correctly saved and restored, but somehow
tumbler_put_master_volume() gets called just after
tumbler_set_master_volume() and sets mix->master_vol[*] to 0. So, on
resuming, the master switch is reenabled, but the volume is set to 0.
Here's a patch that also saves and restores master_vol.
Signed-off-by: Colin Leroy <colin@colino.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch applies on top of my previous g5 related sound patches and adds
support for the Mac Mini to the PowerMac Alsa driver.
However, I haven't found any kind of HW support for volume control on this
machine. If it exist, it's well hidden. That means that you probably want
to make sure you use software with the ability to do soft volume control,
or use Alsa 0.9 pre-release with the softvol plugin.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes a couple more issues with the management of the GPIOs
dealing with headphone and line out mute on the G5. It should fix the
remaining problems of people not getting any sound out of the headphone
jack.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some earlier models of aluminium powerbooks and ibook G4s have a clock chip
that requires some tweaking before and after sleep. It seems that without
that magic incantation to disable and re-enable clock spreading, RAM isn't
properly refreshed during sleep. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andreas Jaggi [Sun, 1 May 2005 15:58:41 +0000 (08:58 -0700)]
[PATCH] macintosh/adbhid.c: adb buttons support for aluminium PowerBook G4
This patch adds support for the special adb buttons of the aluminium
PowerBook G4.
Signed-off-by: Andreas Jaggi <andreas.jaggi@waterwave.ch> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed an occasional crash on wakeup from sleep on my powerbook
(strangly never happened before, probably timing related) that appears to
be due to a dangling interrupt while the chip is put to sleep and beeing
reset on wakeup.
This patch fixes is by disabling the irq in the ide pmac driver while
asleep and only re-enable it after the chip has been fully reset. This is
safe to do so as the interrupt of these apple IDE cells is never shared.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chris Elston [Sun, 1 May 2005 15:58:40 +0000 (08:58 -0700)]
[PATCH] ppc32: fix for misreported SDRAM size on Radstone PPC7D platform
This patch fixes the SDRAM output from /proc/cpuinfo. The previous code
assumed that there was only one bank of SDRAM, and that the size in the memory
configuration register was the total size.
Signed-off-by: Chris Elston <chris.elston@radstone.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Sun, 1 May 2005 15:58:40 +0000 (08:58 -0700)]
[PATCH] ppc32: refactor FPU exception handling
Moved common FPU exception handling code out of head.S so it can be used by
several of the sub-architectures that might of a full PowerPC FPU.
Also, uses new CONFIG_PPC_FPU define to fix alignment exception handling
for floating point load/store instructions to only occur if we have a
hardware FPU.
Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some G3 CPUs can crash in funny way if a store from an FPU register
instruction is executed on a register that has never been initialized since
power on. This patch fixes it by making sure all FP registers have been
properly initialized at kernel boot and when waking from sleep. It also makes
the code that decides wether HID0_BTIC and HID0_DPM are allowed on a given CPU
smarter (it can actually _clear_ them now if they are not allowed instead of
just setting them when they are allowed in case the firmware got them wrong)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
James Morris [Sun, 1 May 2005 15:58:40 +0000 (08:58 -0700)]
[PATCH] SELinux: add finer grained permissions to Netlink audit processing
This patch provides finer grained permissions for the audit family of
Netlink sockets under SELinux.
1. We need a way to differentiate between privileged and unprivileged
reads of kernel data maintained by the audit subsystem. The AUDIT_GET
operation is unprivileged: it returns the current status of the audit
subsystem (e.g. whether it's enabled etc.). The AUDIT_LIST operation
however returns a list of the current audit ruleset, which is considered
privileged by the audit folk. To deal with this, a new SELinux
permission has been implemented and applied to the operation:
nlmsg_readpriv, which can be allocated to appropriately privileged
domains. Unprivileged domains would only be allocated nlmsg_read.
2. There is a requirement for certain domains to generate audit events
from userspace. These events need to be collected by the kernel,
collated and transmitted sequentially back to the audit daemon. An
example is user level login, an auditable event under CAPP, where
login-related domains generate AUDIT_USER messages via PAM which are
relayed back to auditd via the kernel. To prevent handing out
nlmsg_write permissions to such domains, a new permission has been
added, nlmsg_relay, which is intended for this type of purpose: data is
passed via the kernel back to userspace but no privileged information is
written to the kernel.
Also, AUDIT_LOGIN messages are now valid only for kernel->user messaging,
so this value has been removed from the SELinux nlmsgtab (which is only
used to check user->kernel messages).
Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stephen Smalley [Sun, 1 May 2005 15:58:39 +0000 (08:58 -0700)]
[PATCH] SELinux: cleanup ipc_has_perm
This patch removes the sclass argument from ipc_has_perm in the SELinux
module, as it can be obtained from the ipc security structure. The use of
a separate argument was a legacy of the older precondition function
handling in SELinux and is obsolete. Please apply.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
akpm@osdl.org [Sun, 1 May 2005 15:58:39 +0000 (08:58 -0700)]
[PATCH] drop_buffers() oops fix
In rare situations, drop_buffers() can be called for a page which has buffers,
but no ->mapping (it was truncated, but the buffers were left behind because
ext3 was still fiddling with them).
But if there was an I/O error in a buffer_head, drop_buffers() will try to get
at the address_space and will oops.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Manfred Spraul [Sun, 1 May 2005 15:58:38 +0000 (08:58 -0700)]
[PATCH] add kmalloc_node, inline cleanup
The patch makes the following function calls available to allocate memory
on a specific node without changing the basic operation of the slab
allocator:
kmem_cache_alloc_node(kmem_cache_t *cachep, unsigned int flags, int node);
kmalloc_node(size_t size, unsigned int flags, int node);
in a similar way to the existing node-blind functions:
kmem_cache_alloc(kmem_cache_t *cachep, unsigned int flags);
kmalloc(size, flags);
kmem_cache_alloc_node was changed to pass flags and the node information
through the existing layers of the slab allocator (which lead to some minor
rearrangements). The functions at the lowest layer (kmem_getpages,
cache_grow) are already node aware. Also __alloc_percpu can call
kmalloc_node now.
Performance measurements (using the pageset localization patch) yields:
w/o patches:
Tasks jobs/min jti jobs/min/task real cpu
1 484.27 100 484.2736 12.02 1.97 Wed Mar 30 20:50:43 2005
100 25170.83 91 251.7083 23.12 150.10 Wed Mar 30 20:51:06 2005
200 34601.66 84 173.0083 33.64 294.14 Wed Mar 30 20:51:40 2005
300 37154.47 86 123.8482 46.99 436.56 Wed Mar 30 20:52:28 2005
400 39839.82 80 99.5995 58.43 580.46 Wed Mar 30 20:53:27 2005
500 40036.32 79 80.0726 72.68 728.60 Wed Mar 30 20:54:40 2005
600 44074.21 79 73.4570 79.23 872.10 Wed Mar 30 20:55:59 2005
700 44016.60 78 62.8809 92.56 1015.84 Wed Mar 30 20:57:32 2005
800 40411.05 80 50.5138 115.22 1161.13 Wed Mar 30 20:59:28 2005
900 42298.56 79 46.9984 123.83 1303.42 Wed Mar 30 21:01:33 2005
1000 40955.05 80 40.9551 142.11 1441.92 Wed Mar 30 21:03:55 2005
with pageset localization and slab API patches:
Tasks jobs/min jti jobs/min/task real cpu
1 484.19 100 484.1930 12.02 1.98 Wed Mar 30 21:10:18 2005
100 27428.25 92 274.2825 21.22 149.79 Wed Mar 30 21:10:40 2005
200 37228.94 86 186.1447 31.27 293.49 Wed Mar 30 21:11:12 2005
300 41725.42 85 139.0847 41.84 434.10 Wed Mar 30 21:11:54 2005
400 43032.22 82 107.5805 54.10 582.06 Wed Mar 30 21:12:48 2005
500 42211.23 83 84.4225 68.94 722.61 Wed Mar 30 21:13:58 2005
600 40084.49 82 66.8075 87.12 873.11 Wed Mar 30 21:15:25 2005
700 44169.30 79 63.0990 92.24 1008.77 Wed Mar 30 21:16:58 2005
800 43097.94 79 53.8724 108.03 1155.88 Wed Mar 30 21:18:47 2005
900 41846.75 79 46.4964 125.17 1303.38 Wed Mar 30 21:20:52 2005
1000 40247.85 79 40.2478 144.60 1442.21 Wed Mar 30 21:23:17 2005
Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The smp_mb() is becaus sync_page() doesn't have PG_locked while it accesses
page_mapping(page). The comments in the patch (the entire patch is the
addition of this comment) try to explain further how and why smp_mb() is
used.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a patch for counting the number of pages for bounce buffers. It's
shown in /proc/vmstat.
Currently, the number of bounce pages are not counted anywhere. So, if
there are many bounce pages, it seems that there are leaked pages. And
it's difficult for a user to imagine the usage of bounce pages. So, it's
meaningful to show # of bouce pages.
Nick Piggin [Sun, 1 May 2005 15:58:37 +0000 (08:58 -0700)]
[PATCH] mempool: simplify alloc
Mempool is pretty clever. Looks too clever for its own good :) It
shouldn't really know so much about page reclaim internals.
- don't guess about what effective page reclaim might involve.
- don't randomly flush out all dirty data if some unlikely thing
happens (alloc returns NULL). page reclaim can (sort of :P) handle
it.
I think the main motivation is trying to avoid pool->lock at all costs.
However the first allocation is attempted with __GFP_WAIT cleared, so it
will be 'can_try_harder' if it hits the page allocator. So if allocation
still fails, then we can probably afford to hit the pool->lock - and what's
the alternative? Try page reclaim and hit zone->lru_lock?
A nice upshot is that we don't need to do any fancy memory barriers or do
(intentionally) racy access to pool-> fields outside the lock.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Sun, 1 May 2005 15:58:36 +0000 (08:58 -0700)]
[PATCH] mempool: NOMEMALLOC and NORETRY
Mempools have 2 problems.
The first is that mempool_alloc can possibly get stuck in __alloc_pages
when they should opt to fail, and take an element from their reserved pool.
The second is that it will happily eat emergency PF_MEMALLOC reserves
instead of going to their reserved pools.
Fix the first by passing __GFP_NORETRY in the allocation calls in
mempool_alloc. Fix the second by introducing a __GFP_MEMPOOL flag which
directs the page allocator not to allocate from the reserve pool.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Sun, 1 May 2005 15:58:36 +0000 (08:58 -0700)]
[PATCH] mm: pcp use non powers of 2 for batch size
Jack Steiner reported this to have fixed his problem (bad colouring):
"The patches fix both problems that I found - bad
coloring & excessive pages in pagesets."
In most workloads this is not likely to be such a pronounced problem,
however it should help corner cases. And avoiding powers of 2 in these
types of memory operations is always a good idea.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>