[PATCH] PTRACE_SYSEMU is only for i386 and clashes with other ptrace codes of other archs
PTRACE_SYSEMU{,_SINGLESTEP} is actually arch specific, for now, and the
current allocated number clashes with a ptrace code of frv, i.e.
PTRACE_GETFDPIC. I should have submitted this much earlier, anyway we get no
breakage for this.
CC: Daniel Jacobowitz <dan@debian.org> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sun, 8 Jan 2006 09:04:36 +0000 (01:04 -0800)]
[PATCH] shrink struct page
Reduce the size of the pageframe for NR_CPUS>4, CONFIG_PREEMPT back to the
minimal size by unionising both ->private and ->mapping with the pagetable
lock.
It uses an anonymous struct and hence requires gcc-3.x.
[PATCH] aio: reorder kiocb structure elements to make sync iocb setup faster
Reorder members of the kiocb structure to make sync kiocb setup faster. By
setting the elements sequentially, the write combining buffers on the CPU
are able to combine the writes into a single burst, which results in fewer
cache cycles being consumed, freeing them up for other code. This results
in a 10-20KB/s[*] increase on the bw_unix part of LMbench on my test
system.
* The improvement varies based on what other patches are in the system,
as there are a number of bottlenecks, so this number is not absolutely
accurate.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Akinobu Mita [Sun, 8 Jan 2006 09:04:29 +0000 (01:04 -0800)]
[PATCH] modules: mark TAINT_FORCED_RMMOD correctly
Currently TAINT_FORCED_RMMOD is totally unused. Because it is marked as
TAINT_FORCED_MODULE instead when user forced a module unload. This patch
marks it correctly
Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ashutosh Naik [Sun, 8 Jan 2006 09:04:25 +0000 (01:04 -0800)]
[PATCH] modules: prevent overriding of symbols
Ensure that an exported symbol does not already exist in the kernel or in
some other module's exported symbol table. This is done by checking the
symbol tables for the exported symbol at the time of loading the module.
Currently this is done after the relocation of the symbol.
Signed-off-by: Ashutosh Naik <ashutosh.naik@gmail.com> Signed-off-by: Anand Krishnan <anandhkrishnan@yahoo.co.in> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dmitry Torokhov [Sun, 8 Jan 2006 09:04:22 +0000 (01:04 -0800)]
[PATCH] Sonypi: convert to the new platform device interface
Do not use platform_device_register_simple() as it is going away, implement
->probe() and -remove() functions so manual binding and unbinding will work
with this driver.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Cc: Stelian Pop <stelian@popies.net> Cc: Mattia Dongili <malattia@linux.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bjorn Helgaas [Sun, 8 Jan 2006 09:04:13 +0000 (01:04 -0800)]
[PATCH] /dev/mem: validate mmap requests
Add a hook so architectures can validate /dev/mem mmap requests.
This is analogous to validation we already perform in the read/write
paths.
The identity mapping scheme used on ia64 requires that each 16MB or
64MB granule be accessed with exactly one attribute (write-back or
uncacheable). This avoids "attribute aliasing", which can cause a
machine check.
Sample problem scenario:
- Machine supports VGA, so it has uncacheable (UC) MMIO at 640K-768K
- efi_memmap_init() discards any write-back (WB) memory in the first granule
- Application (e.g., "hwinfo") mmaps /dev/mem, offset 0
- hwinfo receives UC mapping (the default, since memmap says "no WB here")
- Machine check abort (on chipsets that don't support UC access to WB
memory, e.g., sx1000)
In the scenario above, the only choices are
- Use WB for hwinfo mmap. Can't do this because it causes attribute
aliasing with the UC mapping for the VGA MMIO space.
- Use UC for hwinfo mmap. Can't do this because the chipset may not
support UC for that region.
- Disallow the hwinfo mmap with -EINVAL. That's what this patch does.
Andrew Morton [Sun, 8 Jan 2006 09:04:09 +0000 (01:04 -0800)]
[PATCH] remove gcc-2 checks
Remove various things which were checking for gcc-1.x and gcc-2.x compilers.
From: Adrian Bunk <bunk@stusta.de>
Some documentation updates and removes some code paths for gcc < 3.2.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Sun, 8 Jan 2006 09:04:07 +0000 (01:04 -0800)]
[PATCH] Abandon gcc-2.95.x
There's one scsi driver which doesn't compile due to weird __VA_ARGS__ tricks
and the rather useful scsi/sd.c is currently getting an ICE. None of the new
SAS code compiles, due to extensive use of anonymous unions. The V4L guys are
very good at exploiting the gcc-2.95.x macro expansion bug (_why_ does each
driver need to implement its own debug macros?) and various people keep on
sneaking in anonymous unions, which are rather nice.
Plus anonymous unions are rather useful.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andy Isaacson [Sun, 8 Jan 2006 09:04:00 +0000 (01:04 -0800)]
[PATCH] block/stat.txt
I couldn't find any docs explaining the contents of /sys/block/<dev>/stat,
so I wrote up the following. I'm not completely sure it's accurate - Jens,
could you give a yea or nay on this?
In particular, the counts of read/write IOs and read/write sectors are
incremented in different places - it looks like they both increment as the
request is being finished, but I'm not completely sure of that.
Oren Laadan [Sun, 8 Jan 2006 09:03:51 +0000 (01:03 -0800)]
[PATCH] fork: fix race in setting child's pgrp and tty
In fork, child should recopy parent's pgrp/tty after it has tasklist_lock.
Otherwise following a setpgid() on the parent, *after* copy_signal(), the
child will own a stale pgrp (which may be reused); (eg. if copy_mm()
sleeps a long while due to memory pressure). Similar issue for the tty.
Signed-off-by: Oren Laadan <orenl@cs.columbia.edu> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mike Miller [Sun, 8 Jan 2006 09:03:50 +0000 (01:03 -0800)]
[PATCH] cciss: adds MSI and MSI-X support
This creates a new function, cciss_interrupt_mode called from
cciss_pci_init. This function determines what type of interrupt vector to
use, i.e., MSI, MSI-X, or IO-APIC.
One noticeable difference is changing the interrupt field of the controller
struct to an array of 4 unsigned ints. The Smart Array HW is capable of
generating 4 distinct interrupts depending on the transport method in use
during operation. These are:
#define DOORBELL_INT 0
Used to notify the contoller of configuration updates. We only use
this feature when in polling mode.
#define PERF_MODE_INT 0
Used when the controller is in Performant Mode.
#define SIMPLE_MODE_INT 2
Used when the controller is in Simple Mode (current Linux implementation).
#define MEMQ_INT_MODE 3
Not used.
When using IO-APIC interrupts these 4 lines are OR'ed together so when any
one fires an interrupt an is generated. In MSI or MSI-X mode this hardware
OR'ing is ignored. We must register for our interrupt depending on what
mode the controller is running. For Linux we use SIMPLE_MODE_INT
exclusively at this time. Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kylene Jo Hall [Sun, 8 Jan 2006 09:03:48 +0000 (01:03 -0800)]
[PATCH] tpmdd: remove global event log
Remove global event log in the tpm bios event measurement log code that
would have caused problems when the code was run concurrently. A log is
now allocated and attached to the seq file upon open and destroyed
appropriately.
Signed-off-by: Kylene Jo Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Don't attempt to power off if power off is not implemented
The problem. It is expected that /sbin/halt -p works exactly like
/sbin/halt, when the kernel does not implement power off functionality.
The kernel can do a lot of work in the reboot notifiers and in
device_shutdown before we even get to machine_power_off. Some of that
shutdown is not safe if you are leaving the power on, and it definitely
gets in the way of using sysrq or pressing ctrl-alt-del. Since the
shutdown happens in generic code there is no way to fix this in
architecture specific code :(
Some machines are kernel oopsing today because of this.
The simple solution is to turn LINUX_REBOOT_CMD_POWER_OFF into
LINUX_REBOOT_CMD_HALT if power_off functionality is not implemented.
This has the unfortunate side effect of disabling the power off
functionality on architectures that leave pm_power_off to null and still
implement something in machine_power_off. And it will break the build on
some architectures that don't have a pm_power_off variable at all.
On both counts I say tough.
For architectures like alpha that don't implement the pm_power_off variable
pm_power_off is declared in linux/pm.h and it is a generic part of our
power management code, and all architectures should implement it.
For architectures like parisc that have a default power off method in
machine_power_off if pm_power_off is not implemented or fails. It is easy
enough to set the pm_power_off variable. And nothing bad happens there,
the machines just stop powering off.
The current semantics are impossible without a flag at the top level so we
can avoid the problem code if a power off is not implemented. pm_power_off
is as good a flag as any with the bonus that it works without modification
on at least x86, x86_64, powerpc, and ppc today.
Andrew can you pick this up and put this in the mm tree. Kernels that
don't compile or don't power off seem saner than kernels that oops or
panic. Until we get the arch specific patches for the problem
architectures this probably isn't smart to push into the stable kernel.
Unfortunately I don't have the time at the moment to walk through every
architecture and make them work. And even if I did I couldn't test it :(
From: Hirokazu Takata <takata@linux-m32r.org>
Add pm_power_off() for build fix of arch/m32r/kernel/process.c.
From: Miklos Szeredi <miklos@szeredi.hu>
UML build fix
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Hayato Fujiwara <fujiwara@linux-m32r.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Extend RCU torture module to test tickless idle CPU
This patch forces RCU torture threads off various CPUs in the system
allowing them to become idle and go tickless. Meant to test support for
such tickless idle CPU in RCU.
Dave Jones [Sun, 8 Jan 2006 09:03:41 +0000 (01:03 -0800)]
[PATCH] Add tainting for proprietary helper modules
Kernels that have had Windows drivers loaded into them are undebuggable.
I've wasted a number of hours chasing bugs filed in Fedora bugzilla only to
find out much later that the user had used such 'helpers', and their
problems were unreproducable without them loaded.
Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Sun, 8 Jan 2006 09:03:40 +0000 (01:03 -0800)]
[PATCH] remove unused blkp field in percpu_data
I found that blkp field was not used in kernel tree.
As most of the times NR_CPUS is a power of two and kmalloc() memory blocks
too, this extra field basically doubles the memory space allocated in
__alloc_percpu() to store the 'struct percpu_data'
(for example, if NR_CPUS=8 on i386, kmalloc(4*8+4) returns a 64 bytes block
instead of a 32 bytes block after this patch)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Gibson [Sun, 8 Jan 2006 09:03:35 +0000 (01:03 -0800)]
[PATCH] Fix handling of ELF segments with zero filesize
mmap() returns -EINVAL if given a zero length, and thus elf_map() in
binfmt_elf.c does likewise if it attempts to map a (page-aligned) ELF
segment with zero filesize. Such a situation never arises with the default
linker scripts, but there's nothing inherently wrong with zero-filesize
(but non-zero memsize) ELF segments. Custom linker scripts can generate
them, and the kernel should be able to map them; this patch makes it so.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David S. Miller [Sun, 8 Jan 2006 09:03:34 +0000 (01:03 -0800)]
[PATCH] drivers/connector/cn_proc.c typos
The parameter to put_cpu_var() is unreferenced by the implementation, and
the compiler doesn't try to comprehend comments, so this wouldn't cause any
problem, but if bugged me enough to post a fix :-)
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Sun, 8 Jan 2006 09:03:32 +0000 (01:03 -0800)]
[PATCH] shrink dentry struct
Some long time ago, dentry struct was carefully tuned so that on 32 bits
UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
of memory cache lines.
Then RCU was added and dentry struct enlarged by two pointers, with nice
results for SMP, but not so good on UP, because breaking the above tuning
(128 + 8 = 136 bytes)
This patch reverts this unwanted side effect, by using an union (d_u),
where d_rcu and d_child are placed so that these two fields can share their
memory needs.
At the time d_free() is called (and d_rcu is really used), d_child is known
to be empty and not touched by the dentry freeing.
Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
the previous content of d_child is not needed if said dentry was unhashed
but still accessed by a CPU because of RCU constraints)
As dentry cache easily contains millions of entries, a size reduction is
worth the extra complexity of the ugly C union.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Ian Kent <raven@themaw.net> Cc: Paul Jackson <pj@sgi.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jorn Dreyer [Sun, 8 Jan 2006 09:03:30 +0000 (01:03 -0800)]
[PATCH] nfsroot: do not silently stop parsing on an unknown option
It would be helpful if the kernel did not silently stop parsing
nfs options, but instead warned about any he does not recognize. The
attached patch adds one printk to do just that.
It took me a couple of hours to find my configuration mistake.
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Sun, 8 Jan 2006 09:03:29 +0000 (01:03 -0800)]
[PATCH] sigio: cleanup, don't take tasklist twice
The only user of send_sigio_to_task() already holds tasklist_lock, so it is
better not to send the signal via send_group_sig_info() (which takes
tasklist recursively) but use group_send_sig_info().
The same change in send_sigurg()->send_sigurg_to_task().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ext3: use sbi instead of EXT3_SB() in resize code.
There are places in the resize code in which EXT3_SB() macro is used after
an statement like sbi = EXT3_SB(sb) is done. Inside the same function,
both sbi and EXT3_SB() are used to reference the super block Altough it is
not wrong, keeping it coherent increases legibility, IMHO.
Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ext3: remove trailing newlines from ext3_warning() calls
Remove the trailing newlines in calls to ext3_warning(). This function
already adds a trailing newline to the end of messages.
Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Dumazet [Sun, 8 Jan 2006 09:03:21 +0000 (01:03 -0800)]
[PATCH] oprofile: Use vmalloc_node() in alloc_cpu_buffers()
Make oprofile alloc_cpu_buffers() function NUMA aware, allocating each CPU
local buffer in its memory node if possible.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Philippe Elie <phil.el@wanadoo.fr> Cc: John Levon <levon@movementarian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kylene Jo Hall [Sun, 8 Jan 2006 09:03:15 +0000 (01:03 -0800)]
[PATCH] tpm: add bios measurement log
According to the TCG specifications measurements or hashes of the BIOS code
and data are extended into TPM PCRS and a log is kept in an ACPI table of
these extensions for later validation if desired. This patch exports the
values in the ACPI table through a security-fs seq_file.
Signed-off-by: Seiji Munetoh <munetoh@jp.ibm.com> Signed-off-by: Stefan Berger <stefanb@us.ibm.com> Signed-off-by: Reiner Sailer <sailer@us.ibm.com> Signed-off-by: Kylene Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Sun, 8 Jan 2006 09:03:13 +0000 (01:03 -0800)]
[PATCH] do_coredump() should reset group_stop_count earlier
__group_complete_signal() sets ->group_stop_count in sig_kernel_coredump()
path and marks the target thread as ->group_exit_task. So any thread
except group_exit_task will go to handle_group_stop()->finish_stop().
However, when group_exit_task actually starts do_coredump(), it sets
SIGNAL_GROUP_EXIT, but does not reset ->group_stop_count while killing
other threads. If we have not yet stopped threads in the same thread
group, they all will spin in kernel mode until group_exit_task sends them
SIGKILL, because ->group_stop_count > 0 means:
recalc_sigpending_tsk() never clears TIF_SIGPENDING
get_signal_to_deliver() goes to handle_group_stop()
handle_group_stop() returns when SIGNAL_GROUP_EXIT set
Andrew Morton [Sun, 8 Jan 2006 09:03:05 +0000 (01:03 -0800)]
[PATCH] fix possible PAGE_CACHE_SHIFT overflows
We've had two instances recently of overflows when doing
64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT)
I did a tree-wide grep of `<<.*PAGE_CACHE_SHIFT' and this is the result.
- afs_rxfs_fetch_descriptor.offset is of type off_t, which seems broken.
- jfs and jffs are limited to 4GB anyway.
- reiserfs map_block_for_writepage() takes an unsigned long for the block -
it should take sector_t. (It'll fail for huge filesystems with
blocksize<PAGE_CACHE_SIZE)
- cramfs_read() needs to use sector_t (I think cramsfs is busted on large
filesystems anyway)
- affs is limited in file size anyway.
- I generally didn't fix 32-bit overflows in directory operations.
- arm's __flush_dcache_page() is peculiar. What if the page lies beyond 4G?
Cc: Oleg Drokin <green@linuxhacker.ru> Cc: David Howells <dhowells@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <reiserfs-dev@namesys.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
HDIO_GETGEO is implemented in most block drivers, and all of them have to
duplicate the code to copy the structure to userspace, as well as getting
the start sector. This patch moves that to common code [1] and adds a
->getgeo method to fill out the raw kernel hd_geometry structure. For many
drivers this means ->ioctl can go away now.
[1] the s390 block drivers are odd in this respect. xpram sets ->start
to 4 always which seems more than odd, and the dasd driver shifts
the start offset around, probably because of it's non-standard
sector size.
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@suse.de> Cc: <mike.miller@hp.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo Giarrusso <blaisorblade@yahoo.it> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
George Anzinger [Sun, 8 Jan 2006 09:02:48 +0000 (01:02 -0800)]
[PATCH] sigaction should clear all signals on SIG_IGN, not just < 32
While rooting aroung in the signal code trying to understand how to fix the
SIG_IGN ploy (set sig handler to SIG_IGN and flood system with high speed
repeating timers) I came across what, I think, is a problem in sigaction()
in that when processing a SIG_IGN request it flushes signals from 1 to
SIGRTMIN and leaves the rest. Attempt to fix this.
Signed-off-by: George Anzinger <george@mvista.com> Cc: Roland McGrath <roland@redhat.com> Cc: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Sun, 8 Jan 2006 09:02:47 +0000 (01:02 -0800)]
[PATCH] keys: Permit running process to instantiate keys
Make it possible for a running process (such as gssapid) to be able to
instantiate a key, as was requested by Trond Myklebust for NFS4.
The patch makes the following changes:
(1) A new, optional key type method has been added. This permits a key type
to intercept requests at the point /sbin/request-key is about to be
spawned and do something else with them - passing them over the
rpc_pipefs files or netlink sockets for instance.
The uninstantiated key, the authorisation key and the intended operation
name are passed to the method.
(2) The callout_info is no longer passed as an argument to /sbin/request-key
to prevent unauthorised viewing of this data using ps or by looking in
/proc/pid/cmdline.
This means that the old /sbin/request-key program will not work with the
patched kernel as it will expect to see an extra argument that is no
longer there.
A revised keyutils package will be made available tomorrow.
(3) The callout_info is now attached to the authorisation key. Reading this
key will retrieve the information.
(4) A new field has been added to the task_struct. This holds the
authorisation key currently active for a thread. Searches now look here
for the caller's set of keys rather than looking for an auth key in the
lowest level of the session keyring.
This permits a thread to be servicing multiple requests at once and to
switch between them. Note that this is per-thread, not per-process, and
so is usable in multithreaded programs.
The setting of this field is inherited across fork and exec.
(5) A new keyctl function (KEYCTL_ASSUME_AUTHORITY) has been added that
permits a thread to assume the authority to deal with an uninstantiated
key. Assumption is only permitted if the authorisation key associated
with the uninstantiated key is somewhere in the thread's keyrings.
This function can also clear the assumption.
(6) A new magic key specifier has been added to refer to the currently
assumed authorisation key (KEY_SPEC_REQKEY_AUTH_KEY).
(7) Instantiation will only proceed if the appropriate authorisation key is
assumed first. The assumed authorisation key is discarded if
instantiation is successful.
(8) key_validate() is moved from the file of request_key functions to the
file of permissions functions.
(9) The documentation is updated.
From: <Valdis.Kletnieks@vt.edu>
Build fix.
Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Sun, 8 Jan 2006 09:02:45 +0000 (01:02 -0800)]
[PATCH] keys: Discard duplicate keys from a keyring on link
Cause any links within a keyring to keys that match a key to be linked into
that keyring to be discarded as a link to the new key is added. The match is
contingent on the type and description strings being the same.
This permits requests, adds and searches to displace negative, expired,
revoked and dead keys easily. After some discussion it was concluded that
duplicate valid keys should probably be discarded also as they would otherwise
hide the new key.
Since request_key() is intended to be the primary method by which keys are
added to a keyring, duplicate valid keys wouldn't be an issue there as that
function would return an existing match in preference to creating a new key.
Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
What's the true meaning of the printk return value? Should it include the
priority prefix length of 3? and what about the timing information? In
both cases it was broken:
NeilBrown [Sun, 8 Jan 2006 09:02:40 +0000 (01:02 -0800)]
[PATCH] Fix overflow tests for compat_sys_fcntl64 locking
When making an fctl locking call through compat_sys_fcntl64 (i.e. a 32bit
app on a 64bit kernel), the syscall can return a locking range that is in
conflict with the queried lock.
If some aspect of this range does not fit in the 32bit structure, something
needs to be done.
The current code is wrong in several respects:
- It returns data to userspace even if no conflict was found
i.e. it should check l_type for F_UNLCK
- It returns -EOVERFLOW too agressively. A lock range covering
the last possible byte of the file (start = COMPAT_OFF_T_MAX,
len = 1) should be possible, but is rejected with the current test.
- A extra-long 'len' should not be a problem. If only that part
of the conflicting lock that would be visible to the 32bit
app needs to be reported to the 32bit app anyway.
This patch addresses those three issues and adds a comment to (hopefully)
record it for posterity.
Note: this patch mainly affects test-cases. Real applications rarely is
ever see the problems.
This patch has been tested (LSB test suite), and works.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@debian.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Sun, 8 Jan 2006 09:02:39 +0000 (01:02 -0800)]
[PATCH] Fix some problems with truncate and mtime semantics.
SUS requires that when truncating a file to the size that it currently
is:
truncate and ftruncate should NOT modify ctime or mtime
O_TRUNC SHOULD modify ctime and mtime.
Currently mtime and ctime are always modified on most local
filesystems (side effect of ->truncate) or never modified (on NFS).
With this patch:
ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when
an update of these times is required whether size changes or not
(via a new argument to do_truncate). This allows NFS to do
the right thing for O_TRUNC.
inode_setattr nolonger forces ATTR_MTIME|ATTR_CTIME when the ATTR_SIZE
sets the size to it's current value. This allows local filesystems
to do the right thing for f?truncate.
Also, the logic in inode_setattr is changed a bit so there are two return
points. One returns the error from vmtruncate if it failed, the other
returns 0 (there can be no other failure).
Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change
requested, we now fall-through and mark_inode_dirty. If a filesystem did
not have a ->truncate function, then vmtruncate will have changed i_size,
without marking the inode as 'dirty', and I think this is wrong.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] vgacon: Workaround for resize bug in some chipsets
Reported from Redhat Bugzilla Bug 170450
"I updated to the development kernel and now during boot only the top of the
text is visable. For example the monitor screen the is the lines and I can
only see text in the asterisk area.
[PATCH] use ptrace_get_task_struct in various places
The ptrace_get_task_struct() helper that I added as part of the ptrace
consolidation is useful in variety of places that currently opencode it.
Switch them to the common helpers.
Add a ptrace_traceme() helper that needs to be explicitly called, and simplify
the ptrace_get_task_struct() interface. We don't need the request argument
now, and we return the task_struct directly, using ERR_PTR() for error
returns. It's a bit more code in the callers, but we have two sane routines
that do one thing well now.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:31 +0000 (01:02 -0800)]
[PATCH] relayfs: cleanup, change relayfs_file_* to relay_file_*
This patch renames relayfs_file_operations to relay_file_operations, and the
file operations themselves from relayfs_XXX to relay_file_XXX, to make it more
clear that they refer to relay files.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:29 +0000 (01:02 -0800)]
[PATCH] relayfs: add support for global relay buffers
This patch adds the optional is_global outparam to the create_buf_file()
callback. This can be used by clients to create a single global relayfs
buffer instead of the default per-cpu buffers. This was suggested as being
useful for certain debugging applications where it's more convenient to be
able to get all the data from a single channel without having to go to the
bother of dealing with per-cpu files.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:28 +0000 (01:02 -0800)]
[PATCH] relayfs: add support for relay files in other filesystems
This patch adds a couple of callback functions that allow a client to hook
into relay_open()/close() and supply the files that will be used to represent
the channel buffers; the default implementation if no callbacks are defined is
to create the files in relayfs. This is to support the creation and use of
relay files in other filesystems such as debugfs, as implied by the fact that
relayfs_file_operations are exported.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:26 +0000 (01:02 -0800)]
[PATCH] relayfs: use generic_ip for private data
Use inode->u.generic_ip instead of relayfs_inode_info to store pointer to user
data. Clients using relayfs_file_create() to create their own files would
probably more expect their data to be stored in generic_ip; we also intend in
the next set of patches to get rid of relayfs-specific stuff in the file
operations, so we might as well do it here.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:24 +0000 (01:02 -0800)]
[PATCH] relayfs: export relayfs_create_file() with fileops param
This patch adds a mandatory fileops param to relayfs_create_file() and exports
that function so that clients can use it to create files defined by their own
set of file operations, in relayfs. The purpose is to allow relayfs
applications to create their own set of 'control' files alongside their relay
files in relayfs rather than having to create them in /proc or debugfs for
instance. relayfs_create_file() is also used by relay_open_buf() to create
the relay files for a channel. In this case, a pointer to
relayfs_file_operations is passed in, along with a pointer to the buffer
associated with the file.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tom Zanussi [Sun, 8 Jan 2006 09:02:23 +0000 (01:02 -0800)]
[PATCH] relayfs: decouple buffer creation from inode creation
The patch series implementa or fixes 3 things that were specifically requested
or suggested by relayfs users:
- support for non-relay files (patches 1-6)
Currently, the relayfs API only supports the creation of directories
(relayfs_create_dir()) and relay files (relay_open()). These patches adds
support for non-relay files (relayfs_create_file()). This is so relayfs
applications can create 'control files' in relayfs itself rather than in /proc
or via a netlink channel, as is currently done in the relay-app examples.
Basically what this amounts to is exporting relayfs_create_file() with an
additional file_ops param that clients can use to supply file operations for
their own special-purpose files in relayfs.
- make exported relay file ops useful (patches 7-8)
The relayfs relay_file_operations have always been exported, the intent being
to make it possible to create relay files in other filesystems such as
debugfs. The problem, though, is that currently the file operations are too
tightly coupled to relayfs to actually be used for this purpose. This patch
fixes that by adding a couple of callback functions that allow a client to
hook into relay_open()/close() and supply the files that will be used to
represent the channel buffers; the default implementation if no callbacks are
defined is to create the files in relayfs.
- add an option to create global relay buffer (patches 9-10) The file creation
callback also supplies an optional param, is_global, that can be used by
clients to create a single global relayfs buffer instead of the default
per-cpu buffers. This was suggested as being useful for certain debugging
applications where it's more convenient to be able to get all the data from a
single channel without having to go to the bother of dealing with per-cpu
files.
- cleanup, some renaming and Documentation updates (patches 11-12)
There were several comments that the use of netlink in the example code was
non-intuitive and in fact the whole relay-app business was needlessly
confusing. Based on that feedback, the example code has been completely
converted over to relayfs control files as supported by this patch, and have
also been made completely self-contained.
The converted examples along with a couple of new examples that demonstrate
using exported relay files can be found in relay-apps tarball:
http://prdownloads.sourceforge.net/relayfs/relay-apps-0.9.tar.gz?download
This patch:
Separate buffer create/destroy from inode create/destroy. We want to be able
to associate other data and not just relay buffers with inodes. Buffer
create/destroy is moved out of inode.c and into relayfs core code.
Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Sun, 8 Jan 2006 09:02:17 +0000 (01:02 -0800)]
[PATCH] kernel/: small cleanups
This patch contains the following cleanups:
- make needlessly global functions static
- every file should include the headers containing the prototypes for
it's global functions
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Sun, 8 Jan 2006 09:02:15 +0000 (01:02 -0800)]
[PATCH] move rtc_interrupt() prototype to rtc.h
This patch moves the rtc_interrupt() prototype to rtc.h and removes the
prototypes from C files.
It also renames static rtc_interrupt() functions in
arch/arm/mach-integrator/time.c and arch/sh64/kernel/time.c to avoid compile
problems.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Paul Gortmaker <p_gortmaker@yahoo.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
OGAWA Hirofumi [Sun, 8 Jan 2006 09:02:14 +0000 (01:02 -0800)]
[PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait)
This patch add EXPORT_SYMBOL(filemap_write_and_wait) and use it.
See mm/filemap.c:
And changes the filemap_write_and_wait() and filemap_write_and_wait_range().
Current filemap_write_and_wait() doesn't wait if filemap_fdatawrite()
returns error. However, even if filemap_fdatawrite() returned an
error, it may have submitted the partially data pages to the device.
(e.g. in the case of -ENOSPC)
<quotation>
Andrew Morton writes,
If filemap_fdatawrite() returns an error, this might be due to some
I/O problem: dead disk, unplugged cable, etc. Given the generally
crappy quality of the kernel's handling of such exceptions, there's a
good chance that the filemap_fdatawait() will get stuck in D state
forever.
</quotation>
So, this patch doesn't wait if filemap_fdatawrite() returns the -EIO.
Trond, could you please review the nfs part? Especially I'm not sure,
nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not.
OGAWA Hirofumi [Sun, 8 Jan 2006 09:02:12 +0000 (01:02 -0800)]
[PATCH] export/change sync_page_range/_nolock()
This exports/changes the sync_page_range/_nolock(). The fatfs needs
sync_page_range/_nolock() for expanding truncate, and changes "size_t count"
to "loff_t count".
OGAWA Hirofumi [Sun, 8 Jan 2006 09:02:11 +0000 (01:02 -0800)]
[PATCH] fat: support ->direct_IO()
This patch add to support of ->direct_IO() for mostly read.
The user of this seems to want to use for streaming read. So, current direct
I/O has limitation, it can only overwrite. (For write operation, mainly we
need to handle the hole etc..)
Russell King [Sun, 8 Jan 2006 09:02:07 +0000 (01:02 -0800)]
[PATCH] IRQ type flags
Some ARM platforms have the ability to program the interrupt controller to
detect various interrupt edges and/or levels. For some platforms, this is
critical to setup correctly, particularly those which the setting is dependent
on the device.
Currently, ARM drivers do (eg) the following:
err = request_irq(irq, ...);
set_irq_type(irq, IRQT_RISING);
However, if the interrupt has previously been programmed to be level sensitive
(for whatever reason) then this will cause an interrupt storm.
Hence, if we combine set_irq_type() with request_irq(), we can then safely set
the type prior to unmasking the interrupt. The unfortunate problem is that in
order to support this, these flags need to be visible outside of the ARM
architecture - drivers such as smc91x need these flags and they're
cross-architecture.
Finally, the SA_TRIGGER_* flag passed to request_irq() should reflect the
property that the device would like. The IRQ controller code should do its
best to select the most appropriate supported mode.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tim Schmielau [Sun, 8 Jan 2006 09:02:05 +0000 (01:02 -0800)]
[PATCH] fix more missing includes
Include fixes for 2.6.14-git11. Should allow to remove sched.h from
module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more
to come since I haven't yet checked the other archs.
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Jackson [Sun, 8 Jan 2006 09:02:04 +0000 (01:02 -0800)]
[PATCH] cpuset: skip rcu check if task is in root cpuset
For systems that aren't using cpusets, but have them CONFIG_CPUSET enabled in
their kernel (eventually this may be most distribution kernels), this patch
removes even the minimal rcu_read_lock() from the memory page allocation path.
Actually, it removes that rcu call for any task that is in the root cpuset
(top_cpuset), which on systems not actively using cpusets, is all tasks.
We don't need the rcu check for tasks in the top_cpuset, because the
top_cpuset is statically allocated, so at no risk of being freed out from
underneath us.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Jackson [Sun, 8 Jan 2006 09:02:03 +0000 (01:02 -0800)]
[PATCH] cpuset: mark number_of_cpusets read_mostly
Mark cpuset global 'number_of_cpusets' as __read_mostly.
This global is accessed everytime a zone is considered in the zonelist loops
beneath __alloc_pages, looking for a free memory page. If number_of_cpusets
is just one, then we can short circuit the mems_allowed check.
Since this global is read alot on a hot path, and written rarely, it is an
excellent candidate for __read_mostly.
Thanks to Christoph Lameter for the suggestion.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>