Ivan Kokshaysky [Mon, 14 Jan 2008 22:31:09 +0000 (17:31 -0500)]
PCI x86: always use conf1 to access config space below 256 bytes
Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.
Always using the legacy configuration mechanism for the legacy config space
and the extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows us to get rid of the mmconf fallback code.
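The dispatch rule described in this message can be sketched in user-space C.
This is only an illustration under stated assumptions: the enum, the function
name pick_cfg_mechanism() and its cfg_size parameter are hypothetical
stand-ins and are not taken from the kernel's PCI access code.

```c
#include <assert.h>

/* Hypothetical labels for the two x86 config access mechanisms. */
enum cfg_mech { CFG_CONF1, CFG_MMCONF, CFG_INVALID };

/*
 * Pick the access mechanism for a config-space register offset.
 * cfg_size models dev->cfg_size: 256 for devices limited to the
 * legacy space, 4096 for devices with extended config space.
 */
enum cfg_mech pick_cfg_mechanism(int reg, int cfg_size)
{
	assert(cfg_size == 256 || cfg_size == 4096);

	if (reg < 0 || reg >= cfg_size)
		return CFG_INVALID;	/* outside this device's config space */
	if (reg < 256)
		return CFG_CONF1;	/* legacy space: always conf1 */
	return CFG_MMCONF;		/* extended space: use mmconf */
}
```

A per-device quirk in this model amounts to passing cfg_size = 256, which
makes every offset at or above 256 invalid for that device.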
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bastian Blank [Sun, 10 Feb 2008 14:47:57 +0000 (16:47 +0200)]
splice: fix user pointer access in get_iovec_page_array()
Commit 8811930dc74a503415b35c4a79d14fb0b408a361 ("splice: missing user
pointer access verification") added the proper access_ok() calls to
copy_from_user_mmap_sem() which ensures we can copy the struct iovecs
from userspace to the kernel.
But we also must check whether we can access the actual memory region
pointed to by the struct iovec to fix the access checks properly.
Signed-off-by: Bastian Blank <waldi@debian.org> Acked-by: Oliver Pinter <oliver.pntr@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Sun, 10 Feb 2008 11:48:15 +0000 (03:48 -0800)]
[PKT_SCHED] ematch: Fix build warning.
Commit 954415e33ed6cfa932c13e8c2460bd05e50723b5 ("[PKT_SCHED] ematch:
tcf_em_destroy robustness") removed a cast on em->data when
passing it to kfree(), but em->data is an integer type that can
hold pointers as well as other values so the cast is necessary.
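Why the cast is needed can be shown with a small user-space sketch. The
struct and function names below are hypothetical, uintptr_t stands in for
the kernel's integer field, and free() stands in for kfree(); none of this
is the actual ematch code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an ematch: data is an integer that holds
 * either a plain value or the address of a heap buffer. */
struct em_sketch {
	uintptr_t data;
	int data_is_ptr;
};

/* Store a heap copy of buf in em->data; returns 0 on success, -1 on failure. */
int em_set_buf(struct em_sketch *em, const void *buf, size_t len)
{
	void *copy;

	assert(em != NULL && buf != NULL);
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, buf, len);
	em->data = (uintptr_t)copy;	/* pointer stored as an integer */
	em->data_is_ptr = 1;
	return 0;
}

/* Release the payload. free() takes a pointer, so the integer must be
 * cast back; passing em->data directly draws a compiler diagnostic. */
void em_destroy(struct em_sketch *em)
{
	assert(em != NULL);
	if (em->data_is_ptr)
		free((void *)em->data);
	em->data = 0;
	em->data_is_ptr = 0;
}
```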
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleg Nesterov [Fri, 1 Feb 2008 17:41:30 +0000 (20:41 +0300)]
hrtimer: don't modify restart_block->fn in restart functions
hrtimer_nanosleep_restart() clears/restores restart_block->fn. This is
pointless and complicates its usage. Note that if sys_restart_syscall()
doesn't actually happen, we have a bogus "pending" restart->fn anyway;
this is harmless.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Pavel Emelyanov <xemul@sw.ru> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Toyo Abe <toyoa@mvista.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Also, set ->addr_limit = KERNEL_DS before doing hrtimer_nanosleep(), this func
was changed by the previous patch and now takes the "__user *" parameter.
Thanks to Ingo Molnar for fixing the bug in this patch.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Pavel Emelyanov <xemul@sw.ru> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Toyo Abe <toyoa@mvista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Oleg Nesterov [Fri, 1 Feb 2008 14:29:05 +0000 (17:29 +0300)]
hrtimer: fix *rmtp handling in hrtimer_nanosleep()
Spotted by Pavel Emelyanov and Alexey Dobriyan.
hrtimer_nanosleep() sets restart_block->arg1 = rmtp, but this rmtp points to
the local variable which lives in the caller's stack frame. This means that
if sys_restart_syscall() actually happens and it is interrupted as well, we
don't update the user-space variable, but write into the already dead stack
frame.
Change the callers to pass "__user *rmtp" to hrtimer_nanosleep(), and change
hrtimer_nanosleep() to use copy_to_user() to actually update *rmtp.
A small problem remains: man 2 nanosleep states that *rmtp should be written if
nanosleep() was interrupted (it says nothing about whether it is OK to update *rmtp
if nanosleep() returns 0), but (with or without this patch) we can dirty *rem
even if nanosleep() returns 0.
NOTE: this patch doesn't change compat_sys_nanosleep(), because it has other
bugs. Fixed by the next patch.
clocksource initialization and error accumulation. This corrects a 280ppm
drift seen on some systems using acpi_pm, and affects other clocksources as
well (likely to a lesser degree).
Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jarek Poplawski [Sun, 10 Feb 2008 07:44:00 +0000 (23:44 -0800)]
[NET_SCHED] sch_htb: htb_requeue fix
htb_requeue() enqueues skbs for which htb_classify() returns NULL.
This is wrong because such skbs could be handled by NET_CLS_ACT code,
and the decision could be different than earlier in htb_enqueue().
So htb_requeue() is changed to work and look more like htb_enqueue().
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:42:17 +0000 (23:42 -0800)]
starfire: section fix
gcc-3.4.4 on powerpc:
drivers/net/starfire.c:219: error: version causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:41:40 +0000 (23:41 -0800)]
via-velocity: section fix
From: Andrew Morton <akpm@linux-foundation.org>
gcc-3.4.4 on powerpc:
drivers/net/via-velocity.c:443: error: chip_info_table causes a section type conflict
On this one I had to remove the __devinitdata too. Don't know why.
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:41:08 +0000 (23:41 -0800)]
natsemi: section fix
gcc-3.4.4 on powerpc:
drivers/net/natsemi.c:245: error: natsemi_pci_info causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:40:34 +0000 (23:40 -0800)]
typhoon: section fix
gcc-3.4.4 on powerpc:
drivers/net/typhoon.c:137: error: version causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sun, 10 Feb 2008 07:29:28 +0000 (23:29 -0800)]
isdn: fix section mismatch warning for ISACVer
Fix the following warnings:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x19723): Section mismatch in reference from the function ISACVersion() to the variable .devinit.data:ISACVer
WARNING: drivers/isdn/hisax/built-in.o(.text+0x2005b): Section mismatch in reference from the function setup_avm_a1_pcmcia() to the function .devinit.text:setup_isac()
ISACVer was only used from functions annotated __devinit,
so add the same annotation to ISACVer.
One of the referencing functions missed __devinit, so add it
and kill an additional warning.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Karsten Keil <kkeil@suse.de> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sun, 10 Feb 2008 07:28:50 +0000 (23:28 -0800)]
isdn: fix section mismatch warnings from hisax_cs_setup_card
Fix the following warnings:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x722): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_teles3()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x72c): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_s0box()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x736): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_telespci()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x747): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_avm_pcipnp()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x74e): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_elsa()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x755): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_diva()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x75c): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_sedlbauer()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x763): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_netjet_s()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x76a): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_hfcpci()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x771): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_hfcsx()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x778): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_niccy()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x77f): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_bkm_a4t()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x786): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_sct_quadro()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x78d): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_gazel()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x794): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_w6692()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x79b): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_netjet_u()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x7a2): Section mismatch in reference from the function hisax_cs_setup_card() to the function .devinit.text:setup_enternow_pci()
checkcard() is the only user of hisax_cs_setup_card().
And checkcard() is only used during init or when hot plugging
ISDN devices. So annotate hisax_cs_setup_card() with __devinit.
checkcard() is used by exported functions, so it cannot be
annotated __devinit. Annotate it with __ref so that modpost
ignores references to the __devinit section.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Karsten Keil <kkeil@suse.de> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sam Ravnborg [Sun, 10 Feb 2008 07:28:12 +0000 (23:28 -0800)]
isdn: fix section mismatch warnings in isac.c and isar.c
Fix the following warnings:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x1b276): Section mismatch in reference from the function inithscxisac() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x1b286): Section mismatch in reference from the function inithscxisac() to the function .devinit.text:initisac()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x1fec7): Section mismatch in reference from the function AVM_card_msg() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x21669): Section mismatch in reference from the function AVM_card_msg() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x21671): Section mismatch in reference from the function AVM_card_msg() to the function .devinit.text:initisac()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x2991e): Section mismatch in reference from the function Sedl_card_msg() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x29936): Section mismatch in reference from the function Sedl_card_msg() to the function .devinit.text:initisac()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x2993e): Section mismatch in reference from the function Sedl_card_msg() to the function .devinit.text:initisar()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x2e026): Section mismatch in reference from the function NETjet_S_card_msg() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x2e02e): Section mismatch in reference from the function NETjet_S_card_msg() to the function .devinit.text:initisac()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x37813): Section mismatch in reference from the function BKM_card_msg() to the function .devinit.text:clear_pending_isac_ints()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x37823): Section mismatch in reference from the function BKM_card_msg() to the function .devinit.text:initisac()
initisar(), initisac() and clear_pending_isac_ints()
were all used via a cardmsg function, which may be called
outside __devinit context.
So remove the bogus __devinit annotation of the
above three functions to fix the warnings.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Karsten Keil <kkeil@suse.de> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 10 Feb 2008 07:28:01 +0000 (23:28 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Add new "development flag" to the ext4 filesystem
ext4: Don't panic in case of corrupt bitmap
ext4: allocate struct ext4_allocation_context from a kmem cache
JBD2: Clear buffer_ordered flag for barried IO request on success
ext4: Fix Direct I/O locking
ext4: Fix circular locking dependency with migrate and rm.
allow in-inode EAs on ext4 root inode
ext4: Fix null bh pointer dereference in mballoc
ext4: Don't set EXTENTS_FL flag for fast symlinks
JBD2: Use the incompat macro for testing the incompat feature.
jbd2: Fix reference counting on the journal commit block's buffer head
[PATCH] jbd: Remove useless loop when writing commit record
jbd2: Add error check to journal_wait_on_commit_record to avoid oops
Sam Ravnborg [Sun, 10 Feb 2008 07:27:41 +0000 (23:27 -0800)]
isdn: fix section mismatch warning in hfc_sx.c
Fix the following warning:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x35818): Section mismatch in reference from the function hfcsx_card_msg() to the function .devinit.text:inithfcsx()
hfcsx_card_msg() may be called outside __devinit context.
Following the program logic, it looks like the CARD_INIT branch
will only be taken under __devinit context but to be consistent
remove the __devinit annotation of inithfcsx() so we
do not mix non-__devinit and __devinit code.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Karsten Keil <kkeil@suse.de> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Make the code in tcf_em_tree_destroy more robust and cleaner:
* Don't need to cast pointer to kfree() or avoid passing NULL.
* After freeing the tree, clear the pointer to avoid possible problems
from repeated free.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 10 Feb 2008 07:24:58 +0000 (23:24 -0800)]
[SCTP]: Convert sctp_dbg_objcnt to seq files.
This makes the code use a good proc API and the text ~50 bytes shorter.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 10 Feb 2008 07:23:44 +0000 (23:23 -0800)]
[SCTP]: Use snmp_fold_field instead of a homebrew analogue.
SCTP already depends on INET, so this doesn't create additional
dependencies.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:17:51 +0000 (23:17 -0800)]
pppol2tp: fix printk warnings
drivers/net/pppol2tp.c: In function `pppol2tp_seq_tunnel_show':
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 4)
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 5)
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 6)
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 7)
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 8)
drivers/net/pppol2tp.c:2295: warning: long long unsigned int format, __u64 arg (arg 9)
drivers/net/pppol2tp.c: In function `pppol2tp_seq_session_show':
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 5)
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 6)
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 7)
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 8)
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 9)
drivers/net/pppol2tp.c:2328: warning: long long unsigned int format, __u64 arg (arg 10)
Not all platforms implement u64 as unsigned long long, e.g. powerpc.
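The usual remedy for this class of warning is an explicit cast at the call
site. The sketch below shows the idea in user-space C, with uint64_t and
snprintf() standing in for __u64 and the kernel's seq_printf(); the function
name format_counter() is hypothetical and this is not the pppol2tp patch
itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit counter portably. uint64_t may be "unsigned long" on
 * some platforms and "unsigned long long" on others, so "%llu" only
 * matches the argument type everywhere if the argument is cast.
 */
int format_counter(char *buf, size_t len, uint64_t packets)
{
	int n = snprintf(buf, len, "%llu", (unsigned long long)packets);

	assert(n >= 0);	/* a negative return would indicate an output error */
	return n;
}
```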
Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:17:15 +0000 (23:17 -0800)]
bnx2: section fix
gcc-3.4.4 on powerpc:
drivers/net/bnx2.c:67: error: version causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Sun, 10 Feb 2008 07:16:41 +0000 (23:16 -0800)]
bnx2x: section fix
From: Andrew Morton <akpm@linux-foundation.org>
gcc-3.4.4 on powerpc:
drivers/net/bnx2x.c:73: error: version causes a section type conflict
Cc: Jeff Garzik <jeff@garzik.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Theodore Tso [Sun, 10 Feb 2008 06:11:44 +0000 (01:11 -0500)]
ext4: Add new "development flag" to the ext4 filesystem
This flag is simply a generic "this is a crash/burn test filesystem"
marker. If it is set, then filesystem code which is "in development"
will be allowed to mount the filesystem. Filesystem code which is not
considered ready for prime-time will check for this flag, and if it is
not set, it will refuse to touch the filesystem.
As we start rolling ext4 out to distros like Fedora, et al., this makes
it less likely that a user might accidentally start using ext4 on a
production filesystem; a bad thing, since that will essentially make it
unfsckable until e2fsprogs catches up.
Signed-off-by: Theodore Tso <tytso@MIT.EDU> Signed-off-by: Mingming Cao <cmm@us.ibm.com>
The multiblock allocator calls BUG_ON in many cases if the free and used
block counts obtained by looking at the bitmap differ from what
the allocator internally accounted for. Use ext4_error in such cases
and don't panic the system.
Eric Sandeen [Sun, 10 Feb 2008 06:13:33 +0000 (01:13 -0500)]
ext4: allocate struct ext4_allocation_context from a kmem cache
struct ext4_allocation_context is rather large, and this bloats
the stack of many functions which use it. Allocating it from
a named slab cache will alleviate this.
For example, with this change (on top of the noinline patch sent earlier):
Most of these stack-allocated structs are actually used only for
mballoc history; and in those cases often a smaller struct would do.
So changing that may be another way around it, at least for those
functions, if preferred. For now, in those cases where the ac
is only for history, an allocation failure simply skips the history
recording, and does not cause any other failures.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Dave Kleikamp [Sun, 10 Feb 2008 06:09:32 +0000 (01:09 -0500)]
JBD2: Clear buffer_ordered flag for barried IO request on success
In JBD2 jbd2_journal_write_commit_record(), clear the buffer_ordered
flag for the bh after the barrier IO has succeeded. This ensures that if
the same buffer head is later submitted to an underlying device which has
been reconfigured to not support barrier requests, the JBD2 commit code
can treat it as a normal IO (without a barrier).
This is a port from JBD/ext3 fix from Neil Brown.
More details from Neil:
Some devices - notably dm and md - can change their behaviour in
response to BIO_RW_BARRIER requests. They might start out accepting
such requests but on reconfiguration, they find out that they cannot
any more. JBD2 deals with this by always testing if BIO_RW_BARRIER
requests fail with EOPNOTSUPP, and retrying the write
requests without the barrier (probably after waiting for any pending
writes to complete).
However, there is a bug in the handling of this in JBD2 for ext4.
When ext4/JBD2 submits a BIO_RW_BARRIER request,
it sets the buffer_ordered flag on the buffer head.
If the request completes successfully, the flag STAYS SET.
Other code might then write the same buffer_head after the device has
been reconfigured to not accept barriers. This write will then fail,
but the "other code" is not ready to handle EOPNOTSUPP errors and the
error will be treated as fatal.
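The failure sequence Neil describes can be modelled with a toy state machine
in user-space C. Everything here is a hypothetical simplification: the
struct and function names are invented for illustration, a single boolean
stands in for the buffer_ordered flag, and the clear_on_success parameter
only exists to contrast the old and the fixed behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct bh_sketch { bool ordered; };		/* models the buffer_ordered flag */
struct dev_sketch { bool barriers_ok; };	/* may change on reconfiguration */

/* Toy block layer: an ordered write fails on a device without barriers. */
int submit_write(const struct dev_sketch *dev, const struct bh_sketch *bh)
{
	if (bh->ordered && !dev->barriers_ok)
		return -EOPNOTSUPP;
	return 0;
}

/*
 * Toy commit-record write: try with a barrier and fall back to a plain
 * write on EOPNOTSUPP. With clear_on_success set, the flag is also
 * cleared after a successful barrier write, which is the fix described.
 */
int write_commit_record(const struct dev_sketch *dev, struct bh_sketch *bh,
			bool clear_on_success)
{
	int ret;

	assert(dev != NULL && bh != NULL);
	bh->ordered = true;
	ret = submit_write(dev, bh);
	if (ret == -EOPNOTSUPP) {
		bh->ordered = false;	/* retry without the barrier */
		ret = submit_write(dev, bh);
	} else if (ret == 0 && clear_on_success) {
		bh->ordered = false;	/* do not leave the flag set */
	}
	return ret;
}
```

In this model, a later submit_write() from "other code" after the device has
lost barrier support fails only when the flag was left set.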
Cc: Neil Brown <neilb@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Sun, 10 Feb 2008 06:08:38 +0000 (01:08 -0500)]
ext4: Fix Direct I/O locking
We cannot start transaction in ext4_direct_IO() and just let it last
during the whole write because dio_get_page() acquires mmap_sem which
ranks above transaction start (e.g. because we have dependency chain
mmap_sem->PageLock->journal_start, or because we update atime while
holding mmap_sem) and thus deadlocks could happen. We solve the problem
by starting a transaction separately for each ext4_get_block() call.
We *could* have a problem that we allocate a block and before its data
are written out the machine crashes and thus we expose stale data. But
that does not happen because for hole-filling generic code falls back to
buffered writes and for file extension, we add inode to orphan list and
thus in case of crash, journal replay will truncate inode back to the
original size.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4: Fix circular locking dependency with migrate and rm.
In order to prevent a circular locking dependency when an unlink
operation is racing with an ext4 migration, we delay taking i_data_sem
until just before switching the inode format, and use i_mutex to prevent
writes and truncates during the first part of the migration operation.
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Thomas Gleixner [Sat, 9 Feb 2008 22:24:09 +0000 (23:24 +0100)]
x86: cpa, strict range check in try_preserve_large_page()
Right now, we check only the first 4k page for static required protections.
This does not take overlapping regions into account. So we might end up
setting the wrong permissions/protections for other parts of this large page.
This can be optimized further, but correctness is the important part.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Sat, 9 Feb 2008 22:24:09 +0000 (23:24 +0100)]
x86: introduce page pool in cpa
DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup
hardcoded reliance on PSE pages, and the unrobustness of the runtime
splitup of large pages. The splitup ended in recursive calls to
alloc_pages() when a page for a pte split was requested.
Avoid the recursion with a preallocated page pool, which is used to
split up large mappings and gets refilled in the return path of
kernel_map_pages after the split has been done. The size of the page
pool is adjusted to the available memory.
This part just implements the page pool and the initialization w/o
using it yet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some suspend and hibernation files in arch/x86/power there are
comments referring to arch/x86-64 and arch/i386 . Update them to
reflect the current code layout.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the hibernation-specific code from arch/x86/power/suspend_64.c
to a separate file (hibernate_64.c) and the CPU-handling code to
cpu_64.c (in line with the corresponding 32-bit code).
Simplify arch/x86/power/Makefile .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Rename cpu.c, suspend.c and swsusp.S in arch/x86/power to cpu_32.c,
hibernate_32.c and hibernate_asm_32.S, respectively, and update the
purpose and copyright information in these files.
Update the Makefile in arch/x86/power to reflect the above changes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86 PM: move 64-bit hibernation files to arch/x86/power
Move arch/x86/kernel/suspend_64.c to arch/x86/power .
Move arch/x86/kernel/suspend_asm_64.S to arch/x86/power
as hibernate_asm_64.S .
Update purpose and copyright information in
arch/x86/power/suspend_64.c and
arch/x86/power/hibernate_asm_64.S .
Update the Makefiles in arch/x86, arch/x86/kernel and
arch/x86/power to reflect the above changes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Ian Campbell [Sat, 9 Feb 2008 22:24:09 +0000 (23:24 +0100)]
x86: construct 32-bit boot time page tables in native format.
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled
kernel are in PAE format.
early_ioremap is updated to use the standard page table accessors.
Clear any mappings beyond max_low_pfn from the boot page tables in
native_pagetable_setup_start because the initial mappings can extend
beyond the range of physical memory and into the vmalloc area.
Derived from patches by Eric Biederman and H. Peter Anvin.
[ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ]
Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Harvey Harrison [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: fix sparse warnings in acpi/bus.c
Add function definition and extern variables to asm-x86/acpi.h.
All of these are used in bus.c in ifdef(CONFIG_X86) sections, so are
only added to the x86 include headers. boot.c already includes acpi.h
so no changes are needed there.
Fixes the following:
arch/x86/kernel/acpi/boot.c:83:4: warning: symbol 'acpi_sci_flags' was not declared. Should it be static?
arch/x86/kernel/acpi/boot.c:84:5: warning: symbol 'acpi_sci_override_gsi' was not declared. Should it be static?
arch/x86/kernel/acpi/boot.c:421:13: warning: symbol 'acpi_pic_sci_set_trigger' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jordan Crouse [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE: make sure the right MFGPT timer fired the timer tick
Each AMD Geode MFGPT timer interrupt output is paired with another
timer; essentially, the interrupt is raised if either timer fires. This
is okay, but the handlers need to be aware of this. Make sure in
the timer tick handler that our timer really did expire.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andres Salomon [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE: MFGPT: fix a potential race when disabling a timer
We *really* don't want to be reading MFGPTx_SETUP and writing back those
values. What we want to be doing is clearing CMP1 and CMP2 unconditionally;
otherwise, we have races where CMP1 and/or CMP2 fire after we've read
MFGPTx_SETUP. They can also fire between when we've written ~CNTEN to
the register, and when the new register values get copied to the timer's
version of the register. By clearing both fields, we're okay.
Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jordan Crouse [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE: MFGPT: Use "just-in-time" detection for the MFGPT timers
There isn't much value to always detecting the MFGPT timers on
Geode platforms; detection is only needed when something wants
to use the timers. Move the detection code so that it gets
called the first time a timer is allocated.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andres Salomon [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE: MFGPT: drop module owner usage from MFGPT API
We had planned to use the 'owner' field for allowing re-allocation of
MFGPTs; however, doing it by module owner name isn't flexible enough. So,
drop this for now. If it turns out that we need timers in modules, we'll
need to come up with a scheme that matches the write-once fields of the
MFGPTx_SETUP register, and drops ponies from the sky.
Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Willy Tarreau [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE fix MFGPT input clock value
The GEODE MFGPT code assumed that 32kHz was 32000 Hz while the boards
run on a 32.768 kHz digital watch crystal. In practise, it will not
change the timer's frequency as the skew was only 2.4%, but it
should provide more accurate intervals.
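The 2.4% figure follows from (32768 - 32000) / 32000 = 768 / 32000 = 0.024.
The same arithmetic is sketched below in user-space C; the macro and
function names are hypothetical, and the 100 Hz tick rate mentioned after
the block is only an example value, not something stated in this commit.

```c
#include <assert.h>

#define ASSUMED_HZ 32000U	/* what "32kHz" was taken to mean */
#define CRYSTAL_HZ 32768U	/* actual watch-crystal frequency */

/* Input clock ticks per timer period, rounded to the nearest integer. */
unsigned int ticks_per_period(unsigned int clock_hz, unsigned int period_hz)
{
	assert(period_hz != 0);
	return (clock_hz + period_hz / 2) / period_hz;
}

/* Error of the assumed clock relative to it, in hundredths of a percent. */
unsigned int clock_skew_centipercent(void)
{
	return (CRYSTAL_HZ - ASSUMED_HZ) * 10000U / ASSUMED_HZ;
}
```

With a 100 Hz period, ticks_per_period() gives 320 for the assumed clock
and 328 for the real crystal, and clock_skew_centipercent() gives 240,
i.e. 2.40%.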
Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andres Salomon [Sat, 9 Feb 2008 22:24:08 +0000 (23:24 +0100)]
x86: GEODE: MFGPT: Minor cleanups
- uninline timer functions; the compiler knows better than we do
whether or not to inline these.
- mfgpt_start_timer() had an unused 'clock' argument, drop it.
From both Jordan and myself.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
Kbuild: Fix deb-pkg target to work with kernel versions ending with -<text-without-digit>
ide: introduce HAVE_IDE
kbuild: silence CHK/UPD messages according to $(quiet)
scsi: fix makefile for aic7(3*x)
kbuild/modpost: Use warn() for announcing section mismatches
Add binoffset to gitignore
kbuild/modpost: improve warnings if symbol is unknown
Linus Torvalds [Sat, 9 Feb 2008 19:12:15 +0000 (11:12 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: extend ricoh_mmc to support Ricoh RL5c476
at91_mci: use generic GPIO calls
sdhci: add num index for multi controllers case
MAINTAINERS: remove non-existant URLs
mmc: remove sdhci and mmc_spi experimental markers
mmc: Handle suspend/resume in Ricoh MMC disabler
Alex Dubov [Sat, 9 Feb 2008 18:20:54 +0000 (10:20 -0800)]
memstick: initial commit for Sony MemoryStick support
Sony MemoryStick cards are used in many products manufactured by Sony.
They are available both as storage and as IO expansion cards. Currently,
only MemoryStick Pro storage cards are supported via TI FlashMedia
MemoryStick interface.
[mboton@gmail.com: build fix]
[akpm@linux-foundation.org: build fix] Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: Miguel Boton <mboton@gmail.co> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CC init/main.o
In file included from include2/asm/uaccess.h:8,
from include/linux/poll.h:13,
from include/linux/rtc.h:113,
from include/linux/efi.h:19,
from linux-2.6/init/main.c:43:
include/linux/mm.h:1151:
error: expected declaration specifiers or '...' before 'pgtable_t'
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Sat, 9 Feb 2008 08:10:15 +0000 (00:10 -0800)]
memcontrol: add vm_match_cgroup()
mm_cgroup() is exclusively used to test whether an mm's mem_cgroup pointer
is pointing to a specific cgroup. Instead of returning the pointer, we can
just do the test itself in a new macro:
vm_match_cgroup(mm, cgroup)
returns non-zero if the mm's mem_cgroup points to cgroup. Otherwise it
returns zero.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
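A minimal userspace sketch of the vm_match_cgroup() idea described above. The two structs are hypothetical stand-ins for the kernel types, reduced to the one pointer the message mentions, and count_matching() is an invented helper that only exists to show a caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct mem_cgroup { int id; };
struct mm_struct { struct mem_cgroup *mem_cgroup; };

/*
 * Non-zero if the mm's mem_cgroup points to cgroup, zero otherwise.
 * Callers no longer need to fetch the pointer and compare it themselves.
 */
#define vm_match_cgroup(mm, cgroup) ((mm)->mem_cgroup == (cgroup))

/* Illustrative caller: count the mms that belong to a given cgroup. */
static size_t count_matching(struct mm_struct *mms, size_t n,
                             struct mem_cgroup *cg)
{
    size_t i, hits = 0;

    assert(mms != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (vm_match_cgroup(&mms[i], cg))
            hits++;
    return hits;
}
```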
Jiri Kosina [Sat, 9 Feb 2008 08:10:14 +0000 (00:10 -0800)]
UML: fix hostfs build
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c: In function 'hostfs_show_options':
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c:328: error: dereferencing pointer to incomplete type
We need to include mount.h to get vfsmount.
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Reported-by: Adrian Bunk <bunk@stusta.de> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Mackall [Sat, 9 Feb 2008 08:10:12 +0000 (00:10 -0800)]
Fix compile error on nommu for is_swap_pte
CC mm/vmscan.o
In file included from
/home/bunk/linux/kernel-2.6/git/linux-2.6/mm/vmscan.c:44:
/home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/swapops.h: In function 'is_swap_pte':
/home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/swapops.h:48: error: implicit declaration of function 'pte_none'
/home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/swapops.h:48: error: implicit declaration of function 'pte_present'
Does it ever make sense to ask "is this pte a swap entry?" on a machine
with no MMU? Presumably this also means it has no ptes either, right? In
which case, it's better to comment the whole function out. Then when
someone tries to ask the above meaningless question, they get a compile
error rather than a meaningless answer.
Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Mike Frysinger <vapier@gentoo.org> Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for a different number of page table levels depending
on the highest address used by a process. This will cause a 31 bit
process to use a two level page table instead of the four level page
table that is the default after the pud has been introduced. Likewise
a normal 64 bit process will use three levels instead of four. Only
if a process runs out of the 4 terabytes which can be addressed with
a three level page table is the fourth level dynamically added. Then
the process can use up to 8 petabytes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
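The limits quoted above (31 bit, 4 terabytes, 8 petabytes) follow from the number of address bits each extra table level contributes. The sketch below assumes 4 KB pages, an 8 bit index for the lowest level table and an 11 bit index for each higher level; these widths are assumptions chosen because they reproduce the figures in the message, and the function names are hypothetical.

```c
#include <assert.h>

#define PAGE_SHIFT 12   /* assumed: 4 KB pages */
#define PTE_BITS    8   /* assumed: 256 entries in the lowest level table */
#define TABLE_BITS 11   /* assumed: 2048 entries in each higher level table */

/* Number of address bits covered by a page table with `levels` levels. */
static unsigned int addr_bits(unsigned int levels)
{
    assert(levels >= 2 && levels <= 4);
    return PAGE_SHIFT + PTE_BITS + (levels - 1) * TABLE_BITS;
}

/*
 * Smallest number of levels whose range still contains highest_addr.
 * Addresses beyond the four level range are reported as 4 here; a real
 * implementation would have to reject them.
 */
static unsigned int levels_needed(unsigned long long highest_addr)
{
    unsigned int levels;

    for (levels = 2; levels < 4; levels++)
        if (highest_addr < (1ULL << addr_bits(levels)))
            break;
    return levels;
}
```

Under these assumptions two levels cover 31 bits, three levels cover 42 bits (4 terabytes) and four levels cover 53 bits (8 petabytes).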
There are two problems in the vt220 initialization:
o Currently the vt220 console loses early printk events until
the vt220 tty is registered.
o The console should work if tty_register fails
sclp_vt220_con_init calls __sclp_vt220_init and register_console.
It does not register the driver with the sclp core code via
sclp_register. That results in an sclp_send_mask=0. Therefore,
__sclp_vt220_emit will reject buffers with EIO. Unfortunately
register_console will cause the printk buffer to be sent to the
console and, therefore, every early message gets dropped. The
sclp_send_mask is set later during boot, when sclp_vt220_tty_init
calls sclp_register.
The solution is to move the sclp_register call from sclp_vt220_tty_init
to __sclp_vt220_init. This makes sure that the console is properly
registered with the sclp subsystem before the first log buffer messages
are passed to the vt220 console.
We also adapt the cleanup on error to keep the console alive if
tty_register fails.
Thanks to Peter Oberparleiter and Heiko Carstens for review and ideas
for improvement.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ursula Braun [Sat, 9 Feb 2008 17:24:32 +0000 (18:24 +0100)]
[S390] qdio: avoid hang when establishing qdio queues
If qdio establish runs in parallel with a channel error,
ccw_device_start_timeout may not trigger the qdio_timeout_handler.
In this case neither QDIO_IRQ_STATE_ESTABLISHED nor
QDIO_IRQ_STATE_ERR is reached and the following wait_event hangs
forever.
Solution: do not make use of the timeout option with
ccw_device_start, but add a timeout to the following wait_event.
Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
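The idea of the qdio fix above, bounding the wait itself instead of relying on a separate timeout handler to change the state, can be sketched as a bounded polling loop. This is a single-threaded userspace model, not kernel code: the enum values, the poll callback and the tick budget are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum irq_state { STATE_INACTIVE, STATE_ESTABLISHED, STATE_ERR };

/* Callback reporting the state observed at a given tick. */
typedef enum irq_state (*poll_fn)(unsigned int tick);

/*
 * Wait until the state becomes ESTABLISHED or ERR, but give up after
 * max_ticks so that a lost completion cannot hang the caller forever.
 */
static int wait_for_establish(poll_fn poll, unsigned int max_ticks)
{
    unsigned int tick;

    assert(poll != NULL);
    for (tick = 0; tick < max_ticks; tick++) {
        enum irq_state s = poll(tick);

        if (s == STATE_ESTABLISHED)
            return 0;
        if (s == STATE_ERR)
            return -EIO;
    }
    return -ETIMEDOUT;
}

/* Models the race: neither final state is ever reached. */
static enum irq_state never_completes(unsigned int tick)
{
    (void)tick;
    return STATE_INACTIVE;
}

/* Models the normal case: established after a few ticks. */
static enum irq_state completes_at_3(unsigned int tick)
{
    return tick >= 3 ? STATE_ESTABLISHED : STATE_INACTIVE;
}
```

In the race case the caller gets -ETIMEDOUT back instead of blocking indefinitely.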
With the new space saving spinlock_t and a non-debug configuration
the struct page only has 32 bytes for 31 bit s390. This causes an
overflow in the calculation of VMEM_MAX_PHYS which renders the
kernel unbootable.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The black art of inline assemblies.. The new __ffs_word_loop/
__ffz_word_loop inline assemblies need an early clobber for the
two input/output variables.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>