[PATCH] VFS: change struct file to use struct path
This patch changes struct file to use struct path instead of having
independent pointers to struct dentry and struct vfsmount, and converts all
users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.
Additionally, it adds two #define's to make the transition easier for users of
the f_dentry and f_vfsmnt.
Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] struct path: move struct path from fs/namei.c into include/linux
Moved struct path from fs/namei.c to include/linux/namei.h. This allows many
places in the VFS, as well as any stackable filesystem to easily keep track of
dentry-vfsmount pairs.
Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce several fsstack_copy_* functions which allow stackable filesystems
(such as eCryptfs and Unionfs) to easily copy over (currently only) inode
attributes. This prevents code duplication and allows for code reuse.
[akpm@osdl.org: Remove unneeded wrapper]
[bunk@stusta.de: fs/stack.c should #include <linux/fs_stack.h>] Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Akinobu Mita [Fri, 8 Dec 2006 10:36:25 +0000 (02:36 -0800)]
[PATCH] crc32: replace bitreverse by bitrev32
This patch replaces bitreverse() by bitrev32. The only users of bitreverse()
are crc32 itself and via-velocity.
Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Fri, 8 Dec 2006 10:36:23 +0000 (02:36 -0800)]
[PATCH] UML: add generic BUG support
The BUG changes in -mm3 need some arch support. This patch adds the UML
support needed. For the most part, it was stolen from the underlying
architecture. The exception is the kernel eip < PAGE_OFFSET test, which is
wrong for skas mode UMLs.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The main advantage in using the generic BUG machinery for x86-64 is that
the inlined overhead of BUG is just the ud2a instruction; the file+line
information are no longer inlined into the instruction stream. This
reduces cache pollution.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andi Kleen <ak@muc.de> Cc: Hugh Dickens <hugh@veritas.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes i386 use the generic BUG machinery. There are no functional
changes from the old i386 implementation.
The main advantage in using the generic BUG machinery for i386 is that the
inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
information are no longer inlined into the instruction stream. This reduces
cache pollution, and makes disassembly work properly.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andi Kleen <ak@muc.de> Cc: Hugh Dickens <hugh@veritas.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds common handling for kernel BUGs, for use by architectures as
they wish. The code is derived from arch/powerpc.
The advantages of having common BUG handling are:
- consistent BUG reporting across architectures
- shared implementation of out-of-line file/line data
- implement CONFIG_DEBUG_BUGVERBOSE consistently
This means that in inline impact of BUG is just the illegal instruction
itself, which is an improvement for i386 and x86-64.
A BUG is represented in the instruction stream as an illegal instruction,
which has file/line information associated with it. This extra information is
stored in the __bug_table section in the ELF file.
When the kernel gets an illegal instruction, it first confirms it might
possibly be from a BUG (ie, in kernel mode, the right illegal instruction).
It then calls report_bug(). This searches __bug_table for a matching
instruction pointer, and if found, prints the corresponding file/line
information. If report_bug() determines that it wasn't a BUG which caused the
trap, it returns BUG_TRAP_TYPE_NONE.
Some architectures (powerpc) implement WARN using the same mechanism; if the
illegal instruction was the result of a WARN, then report_bug(Q) returns
CONFIG_DEBUG_BUGVERBOSE; otherwise it returns BUG_TRAP_TYPE_BUG.
lib/bug.c keeps a list of loaded modules which can be searched for __bug_table
entries. The architecture must call
module_bug_finalize()/module_bug_cleanup() from its corresponding
module_finalize/cleanup functions.
Unsetting CONFIG_DEBUG_BUGVERBOSE will reduce the kernel size by some amount.
At the very least, filename and line information will not be recorded for each
but, but architectures may decide to store no extra information per BUG at
all.
Unfortunately, gcc doesn't have a general way to mark an asm() as noreturn, so
architectures will generally have to include an infinite loop (or similar) in
the BUG code, so that gcc knows execution won't continue beyond that point.
gcc does have a __builtin_trap() operator which may be useful to achieve the
same effect, unfortunately it cannot be used to actually implement the BUG
itself, because there's no way to get the instruction's address for use in
generating the __bug_table entry.
[randy.dunlap@oracle.com: Handle BUG=n, GENERIC_BUG=n to prevent build errors]
[bunk@stusta.de: include/linux/bug.h must always #include <linux/module.h] Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Andi Kleen <ak@muc.de> Cc: Hugh Dickens <hugh@veritas.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Zijlstra [Fri, 8 Dec 2006 10:36:18 +0000 (02:36 -0800)]
[PATCH] bdev: fix ->bd_part_count leak
Don't leak a ->bd_part_count when the partition open fails with -ENXIO.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 8 Dec 2006 10:36:17 +0000 (02:36 -0800)]
[PATCH] lockdep: avoid lockdep warning in md
md_open takes ->reconfig_mutex which causes lockdep to complain. This
(normally) doesn't have deadlock potential as the possible conflict is with a
reconfig_mutex in a different device.
I say "normally" because if a loop were created in the array->member hierarchy
a deadlock could happen. However that causes bigger problems than a deadlock
and should be fixed independently.
So we flag the lock in md_open as a nested lock. This requires defining
mutex_lock_interruptible_nested.
Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 8 Dec 2006 10:36:16 +0000 (02:36 -0800)]
[PATCH] lockdep: use mutex_lock_nested for bd_mutex to avoid lockdep warning
Now that the nesting in blkdev_{get,put} is simpler, adding mutex_lock_nested
is trivial.
Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 8 Dec 2006 10:36:16 +0000 (02:36 -0800)]
[PATCH] lockdep: simplify some aspects of bd_mutex nesting
When we open (actually blkdev_get) a partition we need to also open (get) the
whole device that holds the partition. The involves some limited recursion.
This patch tries to simplify some aspects of this.
As well as opening the whole device, we need to increment ->bd_part_count when
a partition is opened (this is used by rescan_partitions to avoid a rescan if
any partition is active, as that would be confusing).
The main change this patch makes is to move the inc/dec of bd_part_count into
blkdev_{get,put} for the whole rather than doing it in blkdev_{get,put} for
the partition.
More specifically, we introduce __blkdev_get and __blkdev_put which do exactly
what blkdev_{get,put} did, only with an extra "for_part" argument
(blkget_{get,put} then call the __ version with a '0' for the extra argument).
If for_part is 1, then the blkdev is being get(put) because a partition is
being opened(closed) for the first(last) time, and so bd_part_count should be
updated (on success). The particular advantage of pushing this function down
is that the bd_mutex lock (which is needed to update bd_part_count) is already
held at the lower level.
Note that this slightly changes the semantics of bd_part_count. Instead of
updating it whenever a partition is opened or released, it is now only updated
on the first open or last release. This is an adequate semantic as it is only
ever tested for "== 0".
Having introduced these functions we remove the current bd_part_count updates
from do_open (which is really the body of blkdev_get) and call
__blkdev_get(... 1). Similarly in blkget_put we remove the old bd_part_count
updates and call __blkget_put(..., 1). This call is moved to the end of
__blkdev_put to avoid nested locks of bd_mutex.
Finally the mutex_lock on whole->bd_mutex in do_open can be removed. It was
only really needed to protect bd_part_count, and that is now managed (and
protected) within the recursive call.
The observation that bd_part_count is central to the locking issues, and the
modifications to create __blkdev_put are from Peter Zijlstra.
Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 8 Dec 2006 10:36:14 +0000 (02:36 -0800)]
[PATCH] lockdep: remove lock_key approach to managing nested bd_mutex locks
The extra call to get_gendisk is not good. It causes a ->probe and possible
module load before it is really appropriate to do this.
Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Zijlstra [Fri, 8 Dec 2006 10:36:14 +0000 (02:36 -0800)]
[PATCH] new bd_mutex lockdep annotation
Use the gendisk partition number to set a lock class.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Zijlstra [Fri, 8 Dec 2006 10:36:13 +0000 (02:36 -0800)]
[PATCH] remove the old bd_mutex lockdep annotation
Remove the old complex and crufty bd_mutex annotation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Thomas Maier [Fri, 8 Dec 2006 10:36:11 +0000 (02:36 -0800)]
[PATCH] pktcdvd: bio write congestion using congestion_wait()
This adds a bio write queue congestion control to the pktcdvd driver with
fixed on/off marks. It prevents that the driver consumes a unlimited
amount of write requests.
[akpm@osdl.org: sync with congestion_wait() renaming] Signed-off-by: Thomas Maier <balagi@justmail.de> Cc: Peter Osterlund <petero2@telia.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Fri, 8 Dec 2006 10:36:09 +0000 (02:36 -0800)]
[PATCH] sys_unshare: remove a broken CLONE_SIGHAND code
sys_unshare(CLONE_SIGHAND) is broken, the code under 'if (new_sigh)' is
never executed but very wrong. Just remove it to avoid a confusion,
task_lock() has nothing to do with ->sighand changing.
Also, change the comment in unshare_sighand(). Yes, CLONE_THREAD implies
CLONE_SIGHAND, but still it looks confusing. Also, we don't need to check
current->sighand != NULL.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Zijlstra [Fri, 8 Dec 2006 10:36:04 +0000 (02:36 -0800)]
[PATCH] tty: ->signal->tty locking
Fix the locking of signal->tty.
Use ->sighand->siglock to protect ->signal->tty; this lock is already used
by most other members of ->signal/->sighand. And unless we are 'current'
or the tasklist_lock is held we need ->siglock to access ->signal anyway.
(NOTE: sys_unshare() is broken wrt ->sighand locking rules)
Note that tty_mutex is held over tty destruction, so while holding
tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys
are governed by their open file handles. This leaves some holes for tty
access from signal->tty (or any other non file related tty access).
It solves the tty SLAB scribbles we were seeing.
(NOTE: the change from group_send_sig_info to __group_send_sig_info needs to
be examined by someone familiar with the security framework, I think
it is safe given the SEND_SIG_PRIV from other __group_send_sig_info
invocations)
[schwidefsky@de.ibm.com: 3270 fix]
[akpm@osdl.org: various post-viro fixes] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Alan Cox <alan@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Roland McGrath <roland@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Herbert Poetzl [Fri, 8 Dec 2006 10:36:00 +0000 (02:36 -0800)]
[PATCH] Fix linux banner utsname information
utsname information is shown in the linux banner, which also is used for
/proc/version (which can have different utsname values inside a uts
namespaces). this patch makes the varying data arguments and changes the
string to a format string, using those arguments.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>