David Gibson [Mon, 10 Dec 2007 03:28:39 +0000 (14:28 +1100)]
[POWERPC] Use embedded libfdt in the bootwrapper
This incorporates libfdt (from the source embedded in an earlier
commit) into the wrapper.a library used by the bootwrapper. This
includes adding a libfdt_env.h file, which the libfdt sources need in
order to integrate into the bootwrapper environment, and a
libfdt-wrapper.c which provides glue to connect the bootwrapper's
abstract device tree callbacks to the libfdt functions.
In addition, this changes the various wrapper and platform files to
use libfdt functions instead of the older flatdevtree.c library.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Mon, 10 Dec 2007 03:28:39 +0000 (14:28 +1100)]
[POWERPC] Merge libfdt upstream source
This incorporates a copy of dtc libfdt into the kernel source, in
arch/powerpc/boot/libfdt. This only imports the upstream sources
verbatim, later patches are needed to actually link it into the kernel
Makefiles.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
will schmidt [Thu, 6 Dec 2007 21:22:23 +0000 (08:22 +1100)]
[POWERPC] Update xmon slb code
This adds a bit more detail to the xmon SLB output. When the valid
bit is set, this displays the ESID and VSID values, as well as
decoding the segment size -- 1T or 256M -- and displaying the LLP
bits. This supresses the output for any slb entries that contain only
zeros.
Michael Neuling [Thu, 6 Dec 2007 06:24:48 +0000 (17:24 +1100)]
[POWERPC] Use SLB size from the device tree
Currently we hardwire the number of SLBs to 64, but PAPR says we
should use the ibm,slb-size property to obtain the number of SLB
entries. This uses this property instead of assuming 64. If no
property is found, we assume 64 entries as before.
This soft patches the SLB handler, so it shouldn't change performance
at all.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] iommu_free_table doesn't need the device_node
It only needs the iommu_table address. It also makes use of the node
name to print error messages. So just pass it the things it needs.
This reduces the places that know about the pci_dn by one.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 26 Nov 2007 08:03:45 +0000 (19:03 +1100)]
[POWERPC] Add for_each_child_of_node() helper for iterating over child nodes
Add for_each_child_of_node() to encapsulate the common idiom of
iterating over the children of a device_node.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Tejun Heo [Fri, 7 Dec 2007 03:46:23 +0000 (12:46 +0900)]
libata: kill spurious NCQ completion detection
Spurious NCQ completion detection implemented in ahci was incorrect.
On AHCI receving and processing FISes and raising interrupts are not
interlocked and spurious interrupts are expected.
For example, if an interrupt occurs while interrupt handler is running
and the running interrupt handler handles the event the new IRQ
indicated, after IRQ handler finishes, it will be executed again
because IRQ pending bit is set by the new interrupt but there won't be
anything to process.
Please read the following message for more information.
http://article.gmane.org/gmane.linux.ide/26012
This patch...
* Removes all spurious IRQ whining from ahci. Spurious NCQ completion
detection was completely wrong. Spurious D2H Register FIS taught us
that some early drives send spurious D2H Register FIS with I bit set
while NCQ commands are in progress but none of recent drives does
that and even the ones which show such behavior can do NCQ fine.
* Kills all NCQ blacklist entries which were added because of spurious
NCQ completions. I tracked down each commit and verified all
removed ones are actually added because of spurious completions.
WD740ADFD-00NLR1 wasn't deleted but moved upward because the drive
not only had spurious NCQ completions but also is slow on sequential
data transfers if NCQ is enabled.
Maxtor 7V300F0 was added by 0e3dbc01d53940fe10e5a5cfec15ede3e929c918
from Alan Cox. I can only find evidences that the drive only had
troubles with spuruious completions by searching the mailing list.
This entry needs to be verified and removed if it doesn't have other
NCQ related problems.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Thu, 6 Dec 2007 06:09:43 +0000 (15:09 +0900)]
ahci: don't attach if ICH6 is in combined mode
ICH6 R/Ms share PCI ID between piix and ahci modes and we've been
allowing ahci to attach regardless of how BIOS configured it.
However, enabling AHCI mode when the controller is in combined mode
can result in unexpected behavior. Don't attach if the controller is
in combined mode.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Bill Nottingham <notting@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
S2io: Check for register initialization completion before accesing device registers
- Making sure register initialisation is complete before proceeding further.
The driver must wait until initialization is complete before attempting to
access any other device registers.
ibm_newemac: Call dev_set_drvdata() before tah_reset()
The patch moves dev_set_drvdata(&ofdev->dev, dev) up before tah_reset(ofdev)
is called to avoid a NULL pointer dereference, since tah_reset uses drvdata.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hugh Blemings [Wed, 5 Dec 2007 00:14:30 +0000 (11:14 +1100)]
ibm_newemac: Skip EMACs that are marked unused by the firmware
Depending on how the 44x processors are wired, some EMAC cells
might not be useable (and not connected to a PHY). However, some
device-trees may choose to still expose them (since their registers
are present in the MMIO space) but with an "unused" property in them.
Signed-off-by: Hugh Blemings <hugh@blemings.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Cleanup/fix support for STACR register variants
There are a few variants of the STACR register that affect more than
just the "AXON" version of EMAC. Replace the current test of various
chip models with tests for generic properties in the device-tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Cleanup/Fix RGMII MDIO support detection
More than just "AXON" version of EMAC RGMII supports MDIO, so replace
the current test with a generic property in the device-tree that
indicates such support.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
ibm_newemac: Workaround reset timeout when no link
With some PHYs, when the link goes away, the EMAC reset fails due
to the loss of the RX clock I believe.
The old EMAC driver worked around that using some internal chip-specific
clock force bits that are different on various 44x implementations.
This is an attempt at doing it differently, by avoiding the reset when
there is no link, but forcing loopback mode instead. It seems to work
on my Taishan 440GX based board so far.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
When using ZMII for MDIO only (such as 440GX with RGMII for data and ZMII for
MDIO), the ZMII code would fail to properly refcount, thus triggering a
BUG_ON().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stefan Roese [Wed, 5 Dec 2007 00:14:25 +0000 (11:14 +1100)]
ibm_newemac: Add BCM5248 and Marvell 88E1111 PHY support
This patch adds BCM5248 and Marvell 88E1111 PHY support to NEW EMAC driver.
These PHY chips are used on PowerPC 440EPx boards.
The PHY code is based on the previous work by Stefan Roese <sr@denx.de>
Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Eliezer Tamir [Wed, 5 Dec 2007 14:12:39 +0000 (16:12 +0200)]
make bnx2x select ZLIB_INFLATE
The bnx2x module depends on the zlib_inflate functions. The
build will fail if ZLIB_INFLATE has not been selected manually
or by building another module that automatically selects it.
Modify BNX2X config option to 'select ZLIB_INFLATE' like BNX2
and others. This seems to fix it.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Eliezer Tamir <eliezert@broadcom.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Fri, 7 Dec 2007 07:40:35 +0000 (23:40 -0800)]
bonding: Fix race at module unload
Fixes a race condition in module unload. Without this change,
workqueue events may fire while bonding data structures are partially
freed but before bond_close() is invoked by unregister_netdevice().
Update version to 3.2.3.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Fri, 7 Dec 2007 07:40:34 +0000 (23:40 -0800)]
bonding: Add new layer2+3 hash for xor/802.3ad modes
Add new hash for balance-xor and 802.3ad modes. Originally
submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by
Jay Vosburgh to move setting of hash policy out of line, tweak the
documentation update and add version update to 3.2.2.
Glenn's original comment follows:
Included is a patch for a new xmit_hash_policy for the bonding driver
that selects slaves based on MAC and IP information. This is a middle
ground between what currently exists in the layer2 only policy and the
layer3+4 policy. This policy strives to be fully 802.3ad compliant by
transmitting every packet of any particular flow over the same link.
As documented the layer3+4 policy is not fully compliant for extreme
cases such as ip fragmentation, so this policy is a nice compromise
for environments that require full compliance but desire more than the
layer2 only policy.
Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:32 +0000 (23:40 -0800)]
bonding: Allow setting and querying xmit policy regardless of mode
From: Wagner Ferenc <wferi@niif.hu>
For consistency with the behaviour of the arp_ip_target option,
let /sys/class/net/bond0/bonding/xmit_hash_policy accept and report
current policy even if the bonding mode in effect does not use it.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:29 +0000 (23:40 -0800)]
bonding: Return nothing for not applicable values
From: Wagner Ferenc <wferi@niif.hu>
The previous code returned '\n' (that is, a single empty line)
from most files, with one exception (xmit_hash_policy), where
it returned 'NA\n'. This patch consolidates each file to return
nothing at all if not applicable, not even a '\n'.
I find this behaviour more usual, more useful, more efficient
and shorter to code from both sides.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Wagner Ferenc [Fri, 7 Dec 2007 07:40:28 +0000 (23:40 -0800)]
bonding: Remove trailing NULs from sysfs interface.
From: Wagner Ferenc <wferi@niif.hu>
Also remove trailing spaces from multivalued files.
This fixes output like for example:
$ od -c /sys/class/net/bond0/bonding/slaves 0000000 e t h - l e f t e t h - r i g 0000020 h t \n \0 0000025
It mostly entails deleting '+1'-s after sprintf() calls: the return value
of sprintf is the number of characters printed, without the closing NUL,
ie. exactly what the sysfs interface requires. The three multivalue
cases are different, because they also have to swallow back a trailing
space.
Signed-off-by: Ferenc Wagner <wferi@niif.hu> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
ACPI: move timer broadcast before busmaster disable
clockevents: warn once when program_event() is called with negative expiry
hrtimers: avoid overflow for large relative timeouts
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: enable early use of sched_clock()
lockdep: make cli/sti annotation warnings clearer
Linus Torvalds [Fri, 7 Dec 2007 19:00:31 +0000 (11:00 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
[AVR32] Fix wrong pt_regs in critical exception handler
[AVR32] Fix copy_to_user_page() breakage
[AVR32] Follow the rules when dealing with the OCD system
[AVR32] Clean up OCD register usage
[AVR32] Implement irqflags trace and lockdep support
[AVR32] Implement stacktrace support
[AVR32] Kconfig: Use def_bool instead of bool + default
[AVR32] Fix invalid status register bit definitions in asm/ptrace.h
[AVR32] Add TIF_RESTORE_SIGMASK to the work masks
Linus Torvalds [Fri, 7 Dec 2007 18:59:48 +0000 (10:59 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[AF_RXRPC]: Add a missing goto
[VLAN]: Lost rtnl_unlock() in vlan_ioctl()
[SCTP]: Fix the bind_addr info during migration.
[SCTP]: Add bind hash locking to the migrate code
[IPV4]: Remove prototype of ip_rt_advice
[IPv4]: Reply net unreachable ICMP message
[IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network
[BRIDGE]: Section fix.
[NIU]: Fix link LED handling.
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
ACPI: move timer broadcast before busmaster disable
The timer broadcast code might access HPET, which should not be
accessed after the busmaster disable.
In acpi_idle_enter_simple() this change also prevents, that we modify
the busmaster state without going actually idle. This might leave the
ACPI bm state in a stale state, when we leave the function early in
the need_resched() check.
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
clockevents: warn once when program_event() is called with negative expiry
The hrtimer problem with large relative timeouts resulting in a
negative expiry time went unnoticed as there is no check in the
clockevents_program_event() code. Put a check there with a WARN_ONCE
to avoid such problems in the future.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas Gleixner [Fri, 7 Dec 2007 18:16:17 +0000 (19:16 +0100)]
hrtimers: avoid overflow for large relative timeouts
Relative hrtimers with a large timeout value might end up as negative
timer values, when the current time is added in hrtimer_start().
This in turn is causing the clockevents_set_next() function to set an
huge timeout and sleep for quite a long time when we have a clock
source which is capable of long sleeps like HPET. With PIT this almost
goes unnoticed as the maximum delta is ~27ms. The non-hrt/nohz code
sorts this out in the next timer interrupt, so we never noticed that
problem which has been there since the first day of hrtimers.
This bug became more apparent in 2.6.24 which activates HPET on more
hardware.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 7 Dec 2007 18:02:47 +0000 (19:02 +0100)]
sched: enable early use of sched_clock()
some platforms have sched_clock() implementations that cannot be called
very early during wakeup. If it's called it might hang or crash in hard
to debug ways. So only call update_rq_clock() [which calls sched_clock()]
if sched_init() has already been called. (rq->idle is NULL before the
scheduler is initialized.)
[AVR32] Fix wrong pt_regs in critical exception handler
It's not like it really matters at this point since the system is
dying anyway, but handle_critical pushes too few registers on the
stack so the register dump, which makes the register dump look a bit
strange. This patch fixes it.
The current implementation of copy_to_user_page() gives "vaddr" to the
cache instruction when trying to sync the icache with the dcache. If
vaddr does not exist in the TLB, the CPU will silently abort the
operation, which may result in the caches staying out of sync.
To fix this, pass the "dst" parameter to flush_icache_range() instead
-- we know this is valid because we just wrote to it.
[AVR32] Follow the rules when dealing with the OCD system
The current debug trap handling code does a number of things that are
illegal according to the AVR32 Architecture manual. Most importantly,
it may try to schedule from Debug Mode, thus clearing the D bit, which
can lead to "undefined behaviour".
It seems like this works in most cases, but several people have
observed somewhat unstable behaviour when debugging programs,
including soft lockups. So there's definitely something which is not
right with the existing code.
The new code will never schedule from Debug mode, it will always exit
Debug mode with a "retd" instruction, and if something not running in
Debug mode needs to do something debug-related (like doing a single
step), it will enter debug mode through a "breakpoint" instruction.
The monitor code will then return directly to user space, bypassing
its own saved registers if necessary (since we don't actually care
about the trapped context, only the one that came before.)
This adds three instructions to the common exception handling code,
including one branch. It does not touch super-hot paths like the TLB
miss handler.
Generate a new set of OCD register definitions in asm/ocd.h and rename
__mfdr() and __mtdr() to ocd_read() and ocd_write() respectively.
The bitfield definitions are a lot more complete now, and they are
entirely based on bit numbers, not masks. This is because OCD
registers are frequently accessed from assembly code, where bit
numbers are a lot more useful (can be fed directly to sbr, bfins,
etc.)
Bitfields that consist of more than one bit have two definitions:
_START, which indicates the number of the first bit, and _SIZE, which
indicates the number of bits. These directly correspond to the
parameters taken by the bfextu, bfexts and bfins instructions.
We really need to check TIF_RESTORE_SIGMASK before returning to
userspace. The existing code does not necessarily do this.
Define the work masks as a bitwise OR of the respective flags instead
of a hardcoded hex value to make it easier to spot errors like this in
the future.
Vlad Yasevich [Fri, 7 Dec 2007 06:50:54 +0000 (22:50 -0800)]
[SCTP]: Fix the bind_addr info during migration.
During accept/migrate the code attempts to copy the addresses from
the parent endpoint to the new endpoint. However, if the parent
was bound to a wildcard address, then we end up pointlessly copying
all of the current addresses on the system.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 7 Dec 2007 06:50:27 +0000 (22:50 -0800)]
[SCTP]: Add bind hash locking to the migrate code
SCTP accept code tries to add a newliy created socket
to a bind bucket without holding a lock. On a really
busy system, that can causes slab corruptions.
Add a lock around this code.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mitsuru Chinen [Fri, 7 Dec 2007 09:07:24 +0000 (01:07 -0800)]
[IPv4]: Reply net unreachable ICMP message
IPv4 stack doesn't reply any ICMP destination unreachable message
with net unreachable code when IP detagrams are being discarded
because of no route could be found in the forwarding path.
Incidentally, IPv6 stack replies such ICMPv6 message in the similar
situation.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Purdie [Sat, 10 Nov 2007 13:29:04 +0000 (13:29 +0000)]
leds: Fix led trigger locking bugs
Convert part of the led trigger core from rw spinlocks to rw
semaphores. We're calling functions which can sleep from invalid
contexts otherwise. Fixes bug #9264.
Mitsuru Chinen [Thu, 6 Dec 2007 06:31:47 +0000 (22:31 -0800)]
[IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network
IPv6 stack doesn't increment OutNoRoutes counter when IP datagrams
is being discarded because no route could be found to transmit them
to their destination. IPv6 stack should increment the counter.
Incidentally, IPv4 stack increments that counter in such situation.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mirko Lindner [Thu, 6 Dec 2007 05:10:02 +0000 (21:10 -0800)]
[NIU]: Fix link LED handling.
The LED in the current driver will not be controlled correctly. During
a link change the carrier of the link is not available and the LED
will never turn on.
Signed-off-by: David S. Miller <davem@davemloft.net>
Grant Likely [Thu, 6 Dec 2007 19:16:44 +0000 (06:16 +1100)]
[POWERPC] virtex bug fix: Use canonical value for AC97 interrupt xparams
The ml300 and ml403 xparameters.h files use different macros for the
AC97 interrupt pin assignments. This normalizes them to a canonical
value similar to what EDK generates for most other devices. This is
needed to get ml300 support to compile in arch/ppc.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>