Al Viro [Fri, 21 Oct 2005 07:20:48 +0000 (03:20 -0400)]
[PATCH] gfp_t: fs/*
- ->releasepage() annotated (s/int/gfp_t), instances updated
- missing gfp_t in fs/* added
- fixed misannotation from the original sweep caught by bitwise checks:
XFS used __nocast both for gfp_t and for flags used by XFS allocator.
The latter left with unsigned int __nocast; we might want to add a
different type for those but for now let's leave them alone. That,
BTW, is a case when __nocast use had been actively confusing - it had
been used in the same code for two different and similar types, with
no way to catch misuses. Switch of gfp_t to bitwise had caught that
immediately...
One tricky bit is left alone to be dealt with later - mapping->flags is
a mix of gfp_t and error indications. Left alone for now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Fri, 21 Oct 2005 06:55:38 +0000 (02:55 -0400)]
[PATCH] gfp_t: infrastructure
Beginning of gfp_t annotations:
- -Wbitwise added to CHECKFLAGS
- old __bitwise renamed to __bitwise__
- __bitwise defined to either __bitwise__ or nothing, depending on
__CHECK_ENDIAN__ being defined
- gfp_t switched from __nocast to __bitwise__
- force cast to gfp_t added to __GFP_... constants
- new helper - gfp_zone(); extracts zone bits out of gfp_t value and casts
the result to int
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trond Myklebust [Fri, 28 Oct 2005 02:12:41 +0000 (22:12 -0400)]
NFS: Add optional post-op getattr instruction to the NFSv4 file close.
"Optional" means that the close call will not fail if the getattr
at the end of the compound fails.
If it does succeed, try to refresh inode attributes.
Chuck Lever [Tue, 25 Oct 2005 15:48:36 +0000 (11:48 -0400)]
NFS: nfs_lookup doesn't need to revalidate the parent directory's inode
nfs_lookup() used to consult a lookup cache before trying an actual wire
lookup operation. The lookup cache would be invalid, of course, if the
parent directory's mtime had changed, so nfs_lookup performed an inode
revalidation on the parent.
Since nfs_lookup() doesn't use a cache anymore, the revalidation is no
longer necessary. There are cases where it will generate a lot of
unnecessary GETATTR traffic.
See http://bugzilla.linux-nfs.org/show_bug.cgi?id=9
Test-plan:
Use lndir and "rm -rf" and watch for excess GETATTR traffic or application
level errors.
Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Oct 2005 02:12:39 +0000 (22:12 -0400)]
NFS: Don't let nfs_end_data_update() clobber attribute update information
Since we almost always call nfs_end_data_update() after we called
nfs_refresh_inode(), we now end up marking the inode metadata
as needing revalidation immediately after having updated it.
This patch rearranges things so that we mark the inode as needing
revalidation _before_ we call nfs_refresh_inode() on those operations
that need it.
Herbert Xu [Thu, 27 Oct 2005 08:47:46 +0000 (18:47 +1000)]
[TCP]: Clear stale pred_flags when snd_wnd changes
This bug is responsible for causing the infamous "Treason uncloaked"
messages that's been popping up everywhere since the printk was added.
It has usually been blamed on foreign operating systems. However,
some of those reports implicate Linux as both systems are running
Linux or the TCP connection is going across the loopback interface.
In fact, there really is a bug in the Linux TCP header prediction code
that's been there since at least 2.1.8. This bug was tracked down with
help from Dale Blount.
The effect of this bug ranges from harmless "Treason uncloaked"
messages to hung/aborted TCP connections. The details of the bug
and fix is as follows.
When snd_wnd is updated, we only update pred_flags if
tcp_fast_path_check succeeds. When it fails (for example,
when our rcvbuf is used up), we will leave pred_flags with
an out-of-date snd_wnd value.
When the out-of-date pred_flags happens to match the next incoming
packet we will again hit the fast path and use the current snd_wnd
which will be wrong.
In the case of the treason messages, it just happens that the snd_wnd
cached in pred_flags is zero while tp->snd_wnd is non-zero. Therefore
when a zero-window packet comes in we incorrectly conclude that the
window is non-zero.
In fact if the peer continues to send us zero-window pure ACKs we
will continue making the same mistake. It's only when the peer
transmits a zero-window packet with data attached that we get a
chance to snap out of it. This is what triggers the treason
message at the next retransmit timeout.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Oleg Nesterov [Wed, 26 Oct 2005 16:26:53 +0000 (20:26 +0400)]
[PATCH] Fix cpu timers expiration time
There's a silly off-by-one error in the code that updates the expiration
of posix CPU timers, causing them to not be properly updated when they
hit exactly on their expiration time (which should be the normal case).
This causes them to then fire immediately again, and only _then_ get
properly updated.
Peter Wainwright [Wed, 26 Oct 2005 08:59:02 +0000 (01:59 -0700)]
[PATCH] Fix HFS+ to free up the space when a file is deleted.
fsck_hfs reveals lots of temporary files accumulating in the hidden
directory "\000\000\000HFS+ Private Data". According to the HFS+
documentation these are files which are unlinked while in use. However,
there may be a bug in the Linux hfsplus implementation which causes this to
happen even when the files are not in use. It looks like the "opencnt"
field is never initialized as (I think) it should be in hfsplus_read_inode.
This means that a file can appear to be still in use when in fact it has
been closed. This patch seems to fix it for me.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Garzik [Wed, 26 Oct 2005 08:59:01 +0000 (01:59 -0700)]
[PATCH] kill massive wireless-related log spam
Although this message is having the intended effect of causing wireless
driver maintainers to upgrade their code, I never should have merged this
patch in its present form. Leading to tons of bug reports and unhappy
users.
Some wireless apps poll for statistics regularly, which leads to a printk()
every single time they ask for stats. That's a little bit _too_ much of a
reminder that the driver is using an old API.
Change this to printing out the message once, per kernel boot.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ppc64: Fix wrong register mapping in mpic driver
The mpic interrupt controller driver (used on G5 and early pSeries among
others) has a bug where it doesn't get the right virtual address for the
timer registers. It causes the driver to poke at the MMIO space of
whatever has been mapped just next to it (ouch !) when initializing and
causes boot failures on some IBM machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Magnus Damm [Wed, 26 Oct 2005 08:58:59 +0000 (01:58 -0700)]
[PATCH] NUMA: broken per cpu pageset counters
The NUMA counters in struct per_cpu_pageset (linux/mmzone.h) are never
cleared today. This works ok for CPU 0 on NUMA machines because
boot_pageset[] is already zero, but for other CPU:s this results in
uninitialized counters.
Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Wed, 26 Oct 2005 08:58:58 +0000 (01:58 -0700)]
[PATCH] md: make sure mdthreads will always respond to kthread_stop
There are still a couple of cases where md threads (the resync/recovery
thread) is not interruptible since the change to use kthreads. All places
there it tests "signal_pending", it should also test kthread_should_stop,
as with this patch.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ian Campbell [Wed, 26 Oct 2005 14:04:21 +0000 (15:04 +0100)]
[ARM] 3032/1: sparse: complains about generic_fls() prototype in asm-arm/bitops.h
Patch from Ian Campbell
Sparse complains about the definition of generic_fls in asm-arm/bitops.h:
CHECK /home/icampbell/devel/kernel/2.6/arch/arm/mach-pxa/viper.c
include2/asm/bitops.h:350:34: error: marked inline, but without a definition
The definition is unnecessary since linux/bitops.h defines generic_fls before including asm/bitops.h and asm/bitops.h should not be included directly. There are still some places where asm/bitops.h is directly included, but I think that code should be fixed. I was a little wary of the patch for this reason but lubbock, mainstone and assabet all build OK and so do my in house boards...
ARM is the only arch with the generic_fls prototype in this way.
Signed-off-by: Ian Campbell <icampbell@arcom.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Wed, 26 Oct 2005 03:40:09 +0000 (20:40 -0700)]
PCI: be more verbose about resource quirks
When reserving an PCI quirk, note that in the kernel bootup messages.
Also, parse the strange PIIX4 device resources - they should get their
own PCI resource quirks, but for now just print out what it finds to
verify that the code does the right thing.
David Engel [Sat, 22 Oct 2005 03:09:16 +0000 (22:09 -0500)]
[IPV4]: Fix setting broadcast for SIOCSIFNETMASK
Fix setting of the broadcast address when the netmask is set via
SIOCSIFNETMASK in Linux 2.6. The code wanted the old value of
ifa->ifa_mask but used it after it had already been overwritten with
the new value.
Signed-off-by: David Engel <gigem@comcast.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Jayachandran C [Thu, 13 Oct 2005 18:43:02 +0000 (11:43 -0700)]
[IPV4]: Remove dead code from ip_output.c
skb_prev is assigned from skb, which cannot be NULL. This patch removes the
unnecessary NULL check.
Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Jayachandran C [Thu, 13 Oct 2005 18:43:05 +0000 (11:43 -0700)]
[NETLINK]: Remove dead code in af_netlink.c
Remove the variable nlk & call to nlk_sk as it does not have any side effect.
Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Herbert Xu [Fri, 14 Oct 2005 23:42:39 +0000 (09:42 +1000)]
[IPV4]: Kill redundant rcu_dereference on fa_info
This patch kills a redundant rcu_dereference on fa->fa_info in fib_trie.c.
As this dereference directly follows a list_for_each_entry_rcu line, we
have already taken a read barrier with respect to getting an entry from
the list.
This read barrier guarantees that all values read out of fa are valid.
In particular, the contents of structure pointed to by fa->fa_info is
initialised before fa->fa_info is actually set (see fn_trie_insert);
the setting of fa->fa_info itself is further separated with a write
barrier from the insertion of fa into the list.
Therefore by taking a read barrier after obtaining fa from the list
(which is given by list_for_each_entry_rcu), we can be sure that
fa->fa_info contains a valid pointer, as well as the fact that the
data pointed to by fa->fa_info is itself valid.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Harald Welte [Sun, 16 Oct 2005 12:22:59 +0000 (14:22 +0200)]
[NETFILTER] ip_conntrack: Make "hashsize" conntrack parameter writable
It's fairly simple to resize the hash table, but currently you need to
remove and reinsert the module. That's bad (we lose connection
state). Harald has even offered to write a daemon which sets this
based on load.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The code to handle the /proc interface can be cleaned up in several places:
* use seq_file for read
* don't need to remember all the filenames separately
* use for_online_cpu's
* don't vmalloc a buffer for small command from user.
Committer note:
This patch clashed with John Hawkes's "[NET]: Wider use of for_each_*cpu()",
so I fixed it up manually.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Fix some cosmetic issues. Indentation, spelling errors, and some whitespace.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
These are cleanup patches for pktgen that can go in 2.6.15
Can use kzalloc in a couple of places.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
pktgen is calling kmalloc GFP_KERNEL and vmalloc with lock held.
The simplest fix is to turn the lock into a semaphore, since the
thread lock is only used for admin control from user context.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
John Hawkes [Thu, 13 Oct 2005 16:30:31 +0000 (09:30 -0700)]
[NET]: Wider use of for_each_*cpu()
In 'net' change the explicit use of for-loops and NR_CPUS into the
general for_each_cpu() or for_each_online_cpu() constructs, as
appropriate. This widens the scope of potential future optimizations
of the general constructs, as well as takes advantage of the existing
optimizations of first_cpu() and next_cpu(), which is advantageous
when the true CPU count is much smaller than NR_CPUS.
Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: Steven Whitehouse <steve@chygwyn.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Andrew Morton [Tue, 25 Oct 2005 18:00:56 +0000 (11:00 -0700)]
[PATCH] qlogic lockup fix
If qla2x00_probe_one()'s call to qla2x00_iospace_config() fails, we call
qla2x00_free_device() to clean up. But because ha->dpc_pid hasn't been set
yet, qla2x00_free_device() tries to stop a kernel thread which hasn't started
yet. It does wait_for_completion() against an uninitialised completion struct
and the kernel hangs up.
Fix it by initialising ha->dpc_pid a bit earlier.
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pavel Machek [Mon, 24 Oct 2005 21:30:10 +0000 (22:30 +0100)]
[ARM] fix sharp zaurus c-3000 compile failure without CONFIG_FB_PXA
This fixes compile problem when CONFIG_FB_PXA is not set.
LD .tmp_vmlinux1
arch/arm/mach-pxa/built-in.o(.text+0x1d74): In function
`spitz_get_hsync_len':
: undefined reference to `pxafb_get_hsync_time'
make: *** [.tmp_vmlinux1] Error 1
3.46user 0.46system 5.10 (0m5.106s) elapsed 77.01%CPU
Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Bjorn Helgaas [Mon, 24 Oct 2005 21:11:57 +0000 (22:11 +0100)]
[SERIAL] support the Exsys EX-4055 4S four-port card
Tested by Wolfgang Denk with this device:
00:0f.0 Network controller: PLX Technology, Inc. PCI <-> IOBus Bridge (rev 01)
Subsystem: Exsys EX-4055 4S(16C550) RS-232
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Interrupt: pin A routed to IRQ 10
Region 0: Memory at 80100000 (32-bit, non-prefetchable) [size=128]
Region 1: I/O ports at 7080 [size=128]
Region 2: I/O ports at 7400 [size=32]
00:0f.0 Class 0280: 10b5:9050 (rev 01)
Subsystem: d84d:4055
Results with this patch:
Serial: 8250/16550 driver $Revision: 1.90 $ 32 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
PCI: Found IRQ 10 for device 0000:00:0f.0
ttyS4 at I/O 0x7400 (irq = 10) is a 16550A
ttyS5 at I/O 0x7408 (irq = 10) is a 16550A
ttyS6 at I/O 0x7410 (irq = 10) is a 16550A
ttyS7 at I/O 0x7418 (irq = 10) is a 16550A
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Oleg Nesterov [Mon, 24 Oct 2005 14:29:58 +0000 (18:29 +0400)]
[PATCH] posix-timers: fix posix_cpu_timer_set() vs run_posix_cpu_timers() race
This might be harmless, but looks like a race from code inspection (I
was unable to trigger it). I must admit, I don't understand why we
can't return TIMER_RETRY after 'spin_unlock(&p->sighand->siglock)'
without doing bump_cpu_timer(), but this is what original code does.
We are probaly deleting the timer from run_posix_cpu_timers's 'firing'
local list_head while run_posix_cpu_timers() does list_for_each_safe.
Various bad things can happen, for example we can just delete this timer
so that list_for_each() will not notice it and run_posix_cpu_timers()
will not reset '->firing' flag. In that case,
....
if (timer->it.cpu.firing) {
read_unlock(&tasklist_lock);
timer->it.cpu.firing = -1;
return TIMER_RETRY;
}
sys_timer_settime() goes to 'retry:', calls posix_cpu_timer_set() again,
it returns TIMER_RETRY ...
Oleg Nesterov [Mon, 24 Oct 2005 10:34:03 +0000 (14:34 +0400)]
[PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers()
do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu to attach the timer to exiting process after that.
After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule() local timer interrupt can find
tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4) interrupted task has ->signal == NULL.
At this moment exiting task has no pending cpu timers, they were cleaned
up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
return from irq.
Oleg Nesterov [Sun, 23 Oct 2005 16:25:39 +0000 (20:25 +0400)]
[PATCH] posix-timers: fix cleanup_timers() and run_posix_cpu_timers() races
1. cleanup_timers() sets timer->task = NULL under tasklist + ->sighand locks.
That means that this code in posix_cpu_timer_del() and posix_cpu_timer_set()
lock_timer(timer);
if (timer->task == NULL)
return;
read_lock(tasklist);
put_task_struct(timer->task)
is racy. With this patch timer->task modified and accounted only under
timer->it_lock. Sadly, this means that dead task_struct won't be freed
until timer deleted or armed.
2. run_posix_cpu_timers() collects expired timers into local list under
tasklist + ->sighand again. That means that posix_cpu_timer_del()
should check timer->it.cpu.firing under these locks too.