Thomas Graf [Tue, 11 Dec 2007 00:53:05 +0000 (16:53 -0800)]
[IPv4] ESP: Discard dummy packets introduced in rfc4303
RFC4303 introduces dummy packets with a nexthdr value of 59
to implement traffic confidentiality. Such packets need to
be dropped silently and the payload may not be attempted to
be parsed as it consists of random chunk.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sat, 8 Dec 2007 07:55:43 +0000 (23:55 -0800)]
[IPV4]: Swap the ifa allocation with the"ipv4_devconf_setall" call
According to Herbert, the ipv4_devconf_setall should be called
only when the ifa is added to the device. However, failed
ifa allocation may bring things into inconsistent state.
Move the call to ipv4_devconf_setall after the ifa allocation.
Fits both net-2.6 (with offsets) and net-2.6.25 (cleanly).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[IPV6] XFRM: Fix auditing rt6i_flags; use RTF_xxx flags instead of RTCF_xxx.
RTCF_xxx flags, defined in include/linux/in_route.h) are available for
IPv4 route (rtable) entries only. Use RTF_xxx flags instead, defined
in include/linux/ipv6_route.h, for IPv6 route entries (rt6_info).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ultimately to implement /proc perfectly we need an implementation of
d_revalidate because files and directories can be removed behind the back
of the VFS, and d_revalidate is the only way we can let the VFS know that
this has happened.
Unfortunately the linux VFS can not cope with anything in the path to a
mount point going away. So a proper d_revalidate method that calls d_drop
also needs to call have_submounts which is moderately expensive, so you
really don't want a d_revalidate method that unconditionally calls it, but
instead only calls it when the backing object has really gone away.
proc generic entries only disappear on module_unload (when not counting the
fledgling network namespace) so it is quite rare that we actually encounter
that case and has not actually caused us real world trouble yet.
So until we get a proper test for keeping dentries in the dcache fix the
current d_revalidate method by completely removing it. This returns us to
the current status quo.
So with CONFIG_NETNS=n things should look as they have always looked.
For CONFIG_NETNS=y things work most of the time but there are a few rare
corner cases that don't behave properly. As the network namespace is
barely present in 2.6.24 this should not be a problem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rini van Zetten [Mon, 10 Dec 2007 23:49:34 +0000 (15:49 -0800)]
atmel_spi: reload RCR before TCR
We have a wifi module connected to the spi bus and got sometimes FIFO
overrun errors on the spi bus.
After some investigation i found that the driver loads the TCR (transmit
count) register before the RCR (receive count). When the transfer list is
not empty the atmel_spi_next_message is called while tx and rx are enabled.
As soon as the TCR is loaded, hardware starts transfer and causes a rx
fifo overrun because the RCR is not loaded yet.
Load the RCR before the TCR. After this patch the fifo overrun disapears
at out setup.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Rini van Zetten <rini@arvoo.nl> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The esp_reset_cleanup() function is called with the host lock held and
invokes starget_for_each_device() which wants to take it too. Here is a
fix along the lines of shost_for_each_device()/__shost_for_each_device()
adding a __starget_for_each_device() counterpart which assumes the lock
has already been taken.
Eventually, I think the driver should get modified so that more work is
done as a softirq rather than in the interrupt context, but for now it
fixes a bug that causes the spinlock debugger to fire.
While at it, it fixes a small number of cosmetic problems with
starget_for_each_device() too.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Neil Brown [Mon, 10 Dec 2007 23:49:30 +0000 (15:49 -0800)]
Fix NULL dereference in umem.c
Fix NULL dereference in umem.c
Signed-off-by: Neil Brown <neilb@suse.de> Tested-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
T *d;
...
for_each_compatible_node(d,...)
{... when != of_node_put(d)
when != e = d
(
return d;
|
+ of_node_put(d);
? return ...;
)
...}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adam Litke [Mon, 10 Dec 2007 23:49:28 +0000 (15:49 -0800)]
hugetlb: handle write-protection faults in follow_hugetlb_page
The follow_hugetlb_page() fix I posted (merged as git commit 5b23dbe8173c212d6a326e35347b038705603d39) missed one case. If the pte is
present, but not writable and write access is requested by the caller to
get_user_pages(), the code will do the wrong thing. Rather than calling
hugetlb_fault to make the pte writable, it notes the presence of the pte
and continues.
This simple one-liner makes sure we also fault on the pte for this case.
Please apply.
Signed-off-by: Adam Litke <agl@us.ibm.com> Acked-by: Dave Kleikamp <shaggy@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Newton [Mon, 10 Dec 2007 23:49:27 +0000 (15:49 -0800)]
spi_imx: fix typo in description
Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Newton [Mon, 10 Dec 2007 23:49:26 +0000 (15:49 -0800)]
spi_bfin5xx: fix typo in description
Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Newton [Mon, 10 Dec 2007 23:49:25 +0000 (15:49 -0800)]
pxa2xx_spi: fix typo in description
Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Mon, 10 Dec 2007 23:49:22 +0000 (15:49 -0800)]
pcmcia: fix kernel-doc comments
Fix kernel-doc comments in drivers/pcmcia/:
- ti113x.h does not contain kernel-doc, so don't use /** to begin a doc
comment
- yenta_socket.c: remove /** on non-kernel-doc comments;
escape the ':' in an "http:" comment so that it won't be treated as a
section heading;
- cs.c: remove /** on non-kernel-doc comments & add function parameter info
- ds.c: fix function parameter info
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Delete refereces to HOSTS_C
- Switch to module_init/module_exit instead of detect/release
- Don't pass around the host template and rename it to adpt_template
- Switch from scsi_register/scsi_unregister to scsi_host_alloc,
scsi_add_host, scsi_scan_host and scsi_host_put.
Because it caused (for unknown reasons) Andres' all-data-reads-as-zeroes
problem, reported at
http://groups.google.com/group/fa.linux.kernel/msg/083a9acff0330234
Cc: Matthew Wilcox <matthew@wil.cx> Cc: "Salyzyn, Mark" <mark_salyzyn@adaptec.com> Cc: James Bottomley <James.Bottomley@SteelEye.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Anders Henke <anders.henke@1und1.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarod Wilson [Mon, 3 Dec 2007 18:43:12 +0000 (13:43 -0500)]
firewire: OHCI 1.0 Isochronous Receive support
Third rendition of FireWire OHCI 1.0 Isochronous Receive support, using a
zer-copy method similar to OHCI 1.1 which puts the IR data payload directly
into the userspace buffer. The zero-copy implementation eliminates the
video artifacts, audio popping, and buffer underrun problems seen with
version 1 of this patch, as well as fixing a regression in OHCI 1.1 support
introduced by version 2 of this patch.
Successfully tested in OHCI 1.1 mode on the following chipsets:
- NEC uPD72847 (rev 01), OHCI 1.1 (PCI)
- Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe)
- Ti TSB41AB2 (rev 01), OHCI 1.1 (PCI on SB Audigy)
- Apple UniNorth 2 (rev 81), OHCI 1.1 (PowerBook G4 onboard)
Successfully tested in OHCI 1.0 mode on the following chipsets:
- Agere FW323 (rev 06), OHCI 1.0 (Mac Mini onboard)
- Agere FW323 (rev 06), OHCI 1.0 (PCI)
- Via VT6306 (rev 46), OHCI 1.0 (PCI)
- NEC OrangeLink (rev 01), OHCI 1.0 (PCI)
- NEC uPD72847 (rev 01), OHCI 1.1 (PCI)
- Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe)
The bulk of testing was done in an x86_64 system, but was also successfully
sanity-tested on other systems, including a PPC(32) PowerBook G4 and an i686
EPIA M10k. Crude benchmarking (watching top during capture) puts the cpu
utilization during capture on the EPIA's 1GHz Via C3 processor around 13%,
which is down from 30% with the v1 code.
Some implementation details:
To maintain the same userspace API as dual-buffer mode, we set up two
descriptors for every incoming packet. The first is an INPUT_MORE descriptor,
pointing to a buffer large enough to hold just the packet's iso headers,
immediately followed by an INPUT_LAST descriptor, pointing to a chunk of the
userspace buffer big enough for the packet's data payload. With this setup,
each incoming packet fills in these two descriptors in a manner that very
closely emulates dual-buffer receive, to the point where the bulk of the
handle_ir_* code is now identical between the two (and probably primed for
some restructuring to share code between them).
The only caveat I have at the moment is that neither of my OHCI 1.0 Via
VT6307-based FireWire controllers work particularly well with this code
for reasons I have yet to figure out.
Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Linus Torvalds [Mon, 10 Dec 2007 18:18:27 +0000 (10:18 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Fix xfs_ichgtime()s broken usage of I_SYNC
[XFS] Make xfsbufd threads freezable
[XFS] revert to double-buffering readdir
[XFS] Fix broken inode cluster setup.
[XFS] Clear XBF_READ_AHEAD flag on I/O completion.
[XFS] Fixed a few bugs in xfs_buf_associate_memory()
[XFS] 971064 Various fixups for xfs_bulkstat().
[XFS] Fix dbflush panic in xfs_qm_sync.
"It seems the whole MIPS resource managment is complicated enough (out
of necessity) that only a few people actually grok it. Ioports being
actually memory mapped on MIPS only makes the confusion worse, sigh."
Requested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Alan Cox <alan@redhat.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PowerMac and CHRP/BriQ platforms have quirks to switch some IDE
controllers from legacy mode to fully native mode. Those quirks
however will not work properly anymore due to a change to the
generic code to better handle legacy IDE resources.
This fixes it by moving those quirk to "early" quirks (so they
run before resources are probed for the devices) and clearing
all BARs after the conversion to force a reallocation of sane
values.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Chinner [Fri, 7 Dec 2007 03:09:11 +0000 (14:09 +1100)]
[XFS] Fix xfs_ichgtime()s broken usage of I_SYNC
The recent I_LOCK->I_SYNC changes mistakenly changed xfs_ichgtime to look
at I_SYNC instead of I_LOCK. This was incorrect and prevents newly created
inodes from moving to the dirty list. Change this to the correct check
which is for I_NEW, not I_LOCK or I_SYNC so that behaviour is correct.
Fix breakage caused by commit 831441862956fffa17b9801db37e6ea1650b0f69
that did not introduce the necessary call to set_freezable() in
xfs/linux-2.6/xfs_buf.c .
The current readdir implementation deadlocks on a btree buffers locks
because nfsd calls back into ->lookup from the filldir callback. The only
short-term fix for this is to revert to the old inefficient
double-buffering scheme.
David Chinner [Fri, 23 Nov 2007 05:30:23 +0000 (16:30 +1100)]
[XFS] Fix broken inode cluster setup.
The radix tree based inode caches did away with the inode cluster hashes,
replacing them with a bunch of masking and gang lookups on the radix tree.
This masking got broken when moving the code to per-ag radix trees and
indexing by agino # rather than straight inode number. The result is
clustered inode writeback does not cluster and things can go extremely
slowly when there are lots of inodes to write.
Fix it up by comparing the agino # of the inode we just looked up to the
index of the cluster we are looking for.
Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com>
SGI-PV: 972915
SGI-Modid: xfs-linux-melb:xfs-kern:30033a
Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Lachlan McIlroy [Tue, 27 Nov 2007 06:01:24 +0000 (17:01 +1100)]
[XFS] Fixed a few bugs in xfs_buf_associate_memory()
- calculation of 'page_count' was incorrect as it did not
consider the offset of 'mem' into the first page. The
logic to bump 'page_count' didn't work if 'len' was <=
PAGE_CACHE_SIZE (ie offset = 3k, len = 2k).
- setting b_buffer_length to 'len' is incorrect if 'offset'
is > 0. Set it to the total length of the buffer.
- I suspect that passing a non-aligned address into
mem_to_page() for the first page may have been causing
issues - don't know but just tidy up that code anyway.
Lachlan McIlroy [Fri, 23 Nov 2007 05:30:32 +0000 (16:30 +1100)]
[XFS] 971064 Various fixups for xfs_bulkstat().
- sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]()
- remove the special case for XFS_IOC_FSBULKSTAT with count == 1. This
special case causes bulkstat to fail because the special case uses
xfs_bulkstat_single() instead of xfs_bulkstat() and the two functions
have different semantics. xfs_bulkstat() will return the next inode
after the one supplied while skipping internal inodes (ie quota inodes).
xfs_bulkstate_single() will only lookup the inode supplied and return
an error if it is an internal inode.
- in xfs_bulkstat(), need to initialise 'lastino' to the inode supplied
so in cases were we return without examining any inodes the scan wont
restart back at zero.
- sanity check for valid *ubcountp values. Cannot sanity check for valid
ubuffer here because some users of xfs_bulkstat() don't supply a buffer.
- checks against 'ubleft' (the space left in the user's buffer) should be
against 'statstruct_size' which is the supplied minimum object size.
The mixture of checks against statstruct_size and 0 was one of the
reasons we were skipping inodes.
- if the formatter function returns BULKSTAT_RV_NOTHING and an error and
the error is not ENOENT or EINVAL then we need to abort the scan. ENOENT
is for inodes that are no longer valid and we just skip them. EINVAL is
returned if we try to lookup an internal inode so we skip them too. For
a DMF scan if the inode and DMF attribute cannot fit into the space left
in the user's buffer it would return ERANGE. We didn't handle this error
and skipped the inode. We would continue to skip inodes until one fitted
into the user's buffer or we completed the scan.
- put back the recalculation of agino (that got removed with the last fix)
at the end of the while loop. This is because the code at the start of
the loop expects agino to be the last inode examined if it is non-zero.
- if we found some inodes but then encountered an error, return success
this time and the error next time. If the formatter aborted with ENOMEM
we will now return this error but only if we couldn't read any inodes.
Previously if we encountered ENOMEM without reading any inodes we
returned a zero count and no error which falsely indicated the scan was
complete.
Donald Douwsma [Fri, 23 Nov 2007 05:27:42 +0000 (16:27 +1100)]
[XFS] Fix dbflush panic in xfs_qm_sync.
The recent behaviour layer removal dropped the check for quotas that have
been requested at mount time but have subsequently been turned off. This
results in a panic when accessing m_quotainfo which has been freed.
This patch adds the check originally made by xfs_qm_syncall() to
xfs_qm_sync().
Linus Torvalds [Sun, 9 Dec 2007 18:14:36 +0000 (10:14 -0800)]
Avoid double memclear() in SLOB/SLUB
Both slob and slub react to __GFP_ZERO by clearing the allocation, which
means that passing the GFP_ZERO bit down to the page allocator is just
wasteful and pointless.
Acked-by: Matt Mackall <mpm@selenic.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
caused certain external modules not to build and
also caused 'make targz-pkg' to fail.
This is a minimal fix so we revert to previous
behaviour - but we do not overwrite the Makefile
in the top-level directory.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Tested-by: Jay Cliburn <jacliburn@bellsouth.net> Cc: Jay Cliburn <jacliburn@bellsouth.net>
Sam Ravnborg [Thu, 6 Dec 2007 21:20:11 +0000 (22:20 +0100)]
kbuild: fix building with redirected output.
Jan Altenberg <jan.altenberg@linutronix.de> reported that
building with redirected input like this failed:
make O=dir oldconfig bzImage < /dev/null
The problem were caused by a make silentoldconfig being
run before oldconfig and with a non-recent .config the build
failed because silentoldconfig requires non-redirected stdin.
Silentoldconfig was run as a side-effect of having the
top-level Makefile re-made by make.
Introducing an empty rule for the top-level Makefile
(and Kbuild.include) fixed the issue.
Ralf Baechle [Fri, 16 Nov 2007 14:54:46 +0000 (14:54 +0000)]
[MIPS] Malta: Enable tickless and highres timers.
Most Malta use an FPGA CPU card which rarely is good for more than 40MHz.
So the performance penalta of the regular timer interrupt, especially
for the VSMP kernel model is significant, even at a mere 100Hz.
Kenji Kaneshige [Fri, 9 Nov 2007 01:51:01 +0000 (10:51 +0900)]
[IA64] Fix iosapic interrupt delivery mode for CPE
If "CPEI Processor Override" bit is not set in "Platform Interrupt
Source Flags" in "Platform Interrupt Sources Structure" in ACPI MADT,
the target processor of CPEI is restricted to a specific CPU. Because
of this, the delivery mode for CPEI should be IOSAPIC_FIXED.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Shaohua Li [Tue, 13 Nov 2007 06:55:20 +0000 (14:55 +0800)]
[IA64] kprobe: make kreturn probe handler stack unwind correct
Restore regs->ccr_iip before kreturn probe handler runs. In this way, if
probe handler does unwind, unwind can correctly get the stack trace.
Fixes: http://sourceware.org/bugzilla/show_bug.cgi?id=5051 Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Li Zefan [Wed, 21 Nov 2007 22:58:26 +0000 (14:58 -0800)]
[IA64] make full use of macro efi_md_size
Macro efi_md_size is defined in efi.c, and here we apply it throughout
efi.c.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
Bernhard Walle [Wed, 21 Nov 2007 22:58:25 +0000 (14:58 -0800)]
[IA64] rename _bss to __bss_start
Rename _bss to __bss_start as on other architectures. That makes it
possible to use the <linux/sections.h> instead of own declarations. Also
add __bss_stop because that symbol exists on other architectures.
Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
Mike Habeck [Mon, 26 Nov 2007 17:19:57 +0000 (11:19 -0600)]
[IA64] SGI Altix : fix bug in sn_io_late_init()
When initializing pci_controller->node to point to the closest node we need
to take into consideration that a PIC PCI Bridge ASIC can be connected to a
headless/memless node just like the TIOCP and TIOCE Bridge ASICs
Signed-off-by: Mike Habeck <habeck@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
This only appears to be a problem with strangely configured
cross-compilation ... native compilers don't have this issue.
But in the interests of helping others at least compile for
ia64, this can go in. -Tony
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
Tejun Heo [Fri, 7 Dec 2007 03:46:23 +0000 (12:46 +0900)]
libata: kill spurious NCQ completion detection
Spurious NCQ completion detection implemented in ahci was incorrect.
On AHCI receving and processing FISes and raising interrupts are not
interlocked and spurious interrupts are expected.
For example, if an interrupt occurs while interrupt handler is running
and the running interrupt handler handles the event the new IRQ
indicated, after IRQ handler finishes, it will be executed again
because IRQ pending bit is set by the new interrupt but there won't be
anything to process.
Please read the following message for more information.
http://article.gmane.org/gmane.linux.ide/26012
This patch...
* Removes all spurious IRQ whining from ahci. Spurious NCQ completion
detection was completely wrong. Spurious D2H Register FIS taught us
that some early drives send spurious D2H Register FIS with I bit set
while NCQ commands are in progress but none of recent drives does
that and even the ones which show such behavior can do NCQ fine.
* Kills all NCQ blacklist entries which were added because of spurious
NCQ completions. I tracked down each commit and verified all
removed ones are actually added because of spurious completions.
WD740ADFD-00NLR1 wasn't deleted but moved upward because the drive
not only had spurious NCQ completions but also is slow on sequential
data transfers if NCQ is enabled.
Maxtor 7V300F0 was added by 0e3dbc01d53940fe10e5a5cfec15ede3e929c918
from Alan Cox. I can only find evidences that the drive only had
troubles with spuruious completions by searching the mailing list.
This entry needs to be verified and removed if it doesn't have other
NCQ related problems.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Thu, 6 Dec 2007 06:09:43 +0000 (15:09 +0900)]
ahci: don't attach if ICH6 is in combined mode
ICH6 R/Ms share PCI ID between piix and ahci modes and we've been
allowing ahci to attach regardless of how BIOS configured it.
However, enabling AHCI mode when the controller is in combined mode
can result in unexpected behavior. Don't attach if the controller is
in combined mode.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Bill Nottingham <notting@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
S2io: Check for register initialization completion before accesing device registers
- Making sure register initialisation is complete before proceeding further.
The driver must wait until initialization is complete before attempting to
access any other device registers.
ibm_newemac: Call dev_set_drvdata() before tah_reset()
The patch moves dev_set_drvdata(&ofdev->dev, dev) up before tah_reset(ofdev)
is called to avoid a NULL pointer dereference, since tah_reset uses drvdata.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hugh Blemings [Wed, 5 Dec 2007 00:14:30 +0000 (11:14 +1100)]
ibm_newemac: Skip EMACs that are marked unused by the firmware
Depending on how the 44x processors are wired, some EMAC cells
might not be useable (and not connected to a PHY). However, some
device-trees may choose to still expose them (since their registers
are present in the MMIO space) but with an "unused" property in them.
Signed-off-by: Hugh Blemings <hugh@blemings.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Cleanup/fix support for STACR register variants
There are a few variants of the STACR register that affect more than
just the "AXON" version of EMAC. Replace the current test of various
chip models with tests for generic properties in the device-tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Cleanup/Fix RGMII MDIO support detection
More than just "AXON" version of EMAC RGMII supports MDIO, so replace
the current test with a generic property in the device-tree that
indicates such support.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Workaround reset timeout when no link
With some PHYs, when the link goes away, the EMAC reset fails due
to the loss of the RX clock I believe.
The old EMAC driver worked around that using some internal chip-specific
clock force bits that are different on various 44x implementations.
This is an attempt at doing it differently, by avoiding the reset when
there is no link, but forcing loopback mode instead. It seems to work
on my Taishan 440GX based board so far.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
When using ZMII for MDIO only (such as 440GX with RGMII for data and ZMII for
MDIO), the ZMII code would fail to properly refcount, thus triggering a
BUG_ON().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stefan Roese [Wed, 5 Dec 2007 00:14:25 +0000 (11:14 +1100)]
ibm_newemac: Add BCM5248 and Marvell 88E1111 PHY support
This patch adds BCM5248 and Marvell 88E1111 PHY support to NEW EMAC driver.
These PHY chips are used on PowerPC 440EPx boards.
The PHY code is based on the previous work by Stefan Roese <sr@denx.de>
Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Eliezer Tamir [Wed, 5 Dec 2007 14:12:39 +0000 (16:12 +0200)]
make bnx2x select ZLIB_INFLATE
The bnx2x module depends on the zlib_inflate functions. The
build will fail if ZLIB_INFLATE has not been selected manually
or by building another module that automatically selects it.
Modify BNX2X config option to 'select ZLIB_INFLATE' like BNX2
and others. This seems to fix it.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Eliezer Tamir <eliezert@broadcom.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Fri, 7 Dec 2007 07:40:35 +0000 (23:40 -0800)]
bonding: Fix race at module unload
Fixes a race condition in module unload. Without this change,
workqueue events may fire while bonding data structures are partially
freed but before bond_close() is invoked by unregister_netdevice().
Update version to 3.2.3.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Fri, 7 Dec 2007 07:40:34 +0000 (23:40 -0800)]
bonding: Add new layer2+3 hash for xor/802.3ad modes
Add new hash for balance-xor and 802.3ad modes. Originally
submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by
Jay Vosburgh to move setting of hash policy out of line, tweak the
documentation update and add version update to 3.2.2.
Glenn's original comment follows:
Included is a patch for a new xmit_hash_policy for the bonding driver
that selects slaves based on MAC and IP information. This is a middle
ground between what currently exists in the layer2 only policy and the
layer3+4 policy. This policy strives to be fully 802.3ad compliant by
transmitting every packet of any particular flow over the same link.
As documented the layer3+4 policy is not fully compliant for extreme
cases such as ip fragmentation, so this policy is a nice compromise
for environments that require full compliance but desire more than the
layer2 only policy.
Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:32 +0000 (23:40 -0800)]
bonding: Allow setting and querying xmit policy regardless of mode
From: Wagner Ferenc <wferi@niif.hu>
For consistency with the behaviour of the arp_ip_target option,
let /sys/class/net/bond0/bonding/xmit_hash_policy accept and report
current policy even if the bonding mode in effect does not use it.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:29 +0000 (23:40 -0800)]
bonding: Return nothing for not applicable values
From: Wagner Ferenc <wferi@niif.hu>
The previous code returned '\n' (that is, a single empty line)
from most files, with one exception (xmit_hash_policy), where
it returned 'NA\n'. This patch consolidates each file to return
nothing at all if not applicable, not even a '\n'.
I find this behaviour more usual, more useful, more efficient
and shorter to code from both sides.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:28 +0000 (23:40 -0800)]
bonding: Remove trailing NULs from sysfs interface.
From: Wagner Ferenc <wferi@niif.hu>
Also remove trailing spaces from multivalued files.
This fixes output like for example:
$ od -c /sys/class/net/bond0/bonding/slaves 0000000 e t h - l e f t e t h - r i g 0000020 h t \n \0 0000025
It mostly entails deleting '+1'-s after sprintf() calls: the return value
of sprintf is the number of characters printed, without the closing NUL,
ie. exactly what the sysfs interface requires. The three multivalue
cases are different, because they also have to swallow back a trailing
space.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
ACPI: move timer broadcast before busmaster disable
clockevents: warn once when program_event() is called with negative expiry
hrtimers: avoid overflow for large relative timeouts
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: enable early use of sched_clock()
lockdep: make cli/sti annotation warnings clearer
Linus Torvalds [Fri, 7 Dec 2007 19:00:31 +0000 (11:00 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
[AVR32] Fix wrong pt_regs in critical exception handler
[AVR32] Fix copy_to_user_page() breakage
[AVR32] Follow the rules when dealing with the OCD system
[AVR32] Clean up OCD register usage
[AVR32] Implement irqflags trace and lockdep support
[AVR32] Implement stacktrace support
[AVR32] Kconfig: Use def_bool instead of bool + default
[AVR32] Fix invalid status register bit definitions in asm/ptrace.h
[AVR32] Add TIF_RESTORE_SIGMASK to the work masks
Linus Torvalds [Fri, 7 Dec 2007 18:59:48 +0000 (10:59 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[AF_RXRPC]: Add a missing goto
[VLAN]: Lost rtnl_unlock() in vlan_ioctl()
[SCTP]: Fix the bind_addr info during migration.
[SCTP]: Add bind hash locking to the migrate code
[IPV4]: Remove prototype of ip_rt_advice
[IPv4]: Reply net unreachable ICMP message
[IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network
[BRIDGE]: Section fix.
[NIU]: Fix link LED handling.
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
ACPI: move timer broadcast before busmaster disable
The timer broadcast code might access HPET, which should not be
accessed after the busmaster disable.
In acpi_idle_enter_simple() this change also prevents, that we modify
the busmaster state without going actually idle. This might leave the
ACPI bm state in a stale state, when we leave the function early in
the need_resched() check.
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
clockevents: warn once when program_event() is called with negative expiry
The hrtimer problem with large relative timeouts resulting in a
negative expiry time went unnoticed as there is no check in the
clockevents_program_event() code. Put a check there with a WARN_ONCE
to avoid such problems in the future.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
hrtimers: avoid overflow for large relative timeouts
Relative hrtimers with a large timeout value might end up as negative
timer values, when the current time is added in hrtimer_start().
This in turn is causing the clockevents_set_next() function to set an
huge timeout and sleep for quite a long time when we have a clock
source which is capable of long sleeps like HPET. With PIT this almost
goes unnoticed as the maximum delta is ~27ms. The non-hrt/nohz code
sorts this out in the next timer interrupt, so we never noticed that
problem which has been there since the first day of hrtimers.
This bug became more apparent in 2.6.24 which activates HPET on more
hardware.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 7 Dec 2007 18:02:47 +0000 (19:02 +0100)]
sched: enable early use of sched_clock()
some platforms have sched_clock() implementations that cannot be called
very early during wakeup. If it's called it might hang or crash in hard
to debug ways. So only call update_rq_clock() [which calls sched_clock()]
if sched_init() has already been called. (rq->idle is NULL before the
scheduler is initialized.)
[AVR32] Fix wrong pt_regs in critical exception handler
It's not like it really matters at this point since the system is
dying anyway, but handle_critical pushes too few registers on the
stack so the register dump, which makes the register dump look a bit
strange. This patch fixes it.
The current implementation of copy_to_user_page() gives "vaddr" to the
cache instruction when trying to sync the icache with the dcache. If
vaddr does not exist in the TLB, the CPU will silently abort the
operation, which may result in the caches staying out of sync.
To fix this, pass the "dst" parameter to flush_icache_range() instead
-- we know this is valid because we just wrote to it.
[AVR32] Follow the rules when dealing with the OCD system
The current debug trap handling code does a number of things that are
illegal according to the AVR32 Architecture manual. Most importantly,
it may try to schedule from Debug Mode, thus clearing the D bit, which
can lead to "undefined behaviour".
It seems like this works in most cases, but several people have
observed somewhat unstable behaviour when debugging programs,
including soft lockups. So there's definitely something which is not
right with the existing code.
The new code will never schedule from Debug mode, it will always exit
Debug mode with a "retd" instruction, and if something not running in
Debug mode needs to do something debug-related (like doing a single
step), it will enter debug mode through a "breakpoint" instruction.
The monitor code will then return directly to user space, bypassing
its own saved registers if necessary (since we don't actually care
about the trapped context, only the one that came before.)
This adds three instructions to the common exception handling code,
including one branch. It does not touch super-hot paths like the TLB
miss handler.
Generate a new set of OCD register definitions in asm/ocd.h and rename
__mfdr() and __mtdr() to ocd_read() and ocd_write() respectively.
The bitfield definitions are a lot more complete now, and they are
entirely based on bit numbers, not masks. This is because OCD
registers are frequently accessed from assembly code, where bit
numbers are a lot more useful (can be fed directly to sbr, bfins,
etc.)
Bitfields that consist of more than one bit have two definitions:
_START, which indicates the number of the first bit, and _SIZE, which
indicates the number of bits. These directly correspond to the
parameters taken by the bfextu, bfexts and bfins instructions.