Paul Mackerras [Fri, 11 Nov 2005 11:36:34 +0000 (22:36 +1100)]
powerpc: Fix reading and writing SPRs from xmon on 32-bit
When we created the instructions to read/write SPRs in xmon, we were
setting up a ppc64-style procedure descriptor and calling that, which
doesn't work in 32-bit. For 32-bit a function pointer just points
to the instructions of the function. This fixes it to do the right
thing for both 32-bit and 64-bit.
Paul Mackerras [Fri, 11 Nov 2005 11:34:43 +0000 (22:34 +1100)]
powerpc: Initialize secondary CPU setup for 32-bit SMP
32-bit SMP powermacs weren't booting with ARCH=powerpc because the
boot cpu wasn't saving away the state of various control registers,
but the secondary CPUs were loading them from the uninitialized
state. This adds the necessary save-state call.
[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel
This patch moves the vdso's to arch/powerpc, adds support for the 32
bits vdso to the 32 bits kernel, rename systemcfg (finally !), and adds
some new (still untested) routines to both vdso's: clock_gettime() with
support for CLOCK_REALTIME and CLOCK_MONOTONIC, clock_getres() (same
clocks) and get_tbfreq() for glibc to retreive the timebase frequency.
Tom,Steve: The implementation of get_tbfreq() I've done for 32 bits
returns a long long (r3, r4) not a long. This is such that if we ever
add support for >4Ghz timebases on ppc32, the userland interface won't
have to change.
I have tested gettimeofday() using some glibc patches in both ppc32 and
ppc64 kernels using 32 bits userland (I haven't had a chance to test a
64 bits userland yet, but the implementation didn't change and was
tested earlier). I haven't tested yet the new functions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Fri, 11 Nov 2005 05:42:12 +0000 (16:42 +1100)]
[PATCH] powerpc: Move udbg code to arch/powerpc
Since the udbg code in ppc64 has no ppc32 equivalent, move it straight
over into arch/powerpc (and include/asm-powerpc for udbg.h). In time,
we probably want to meld the various bits and pieces of 32-bit early
debugging code into udbg, but for now only include it on
CONFIG_PPC64=y builds. The only change during the move is to
standardise the protecting #ifdef/#define in udbg.h, and move its
banner comment above the initial #ifdef (which seems to be normal
practice).
Built and booted on POWER5 LPAR (ARCH=powerpc and ARCH=ppc64). Built
for 32bit multiplatform (ARCH=powerpc).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 04:02:03 +0000 (15:02 +1100)]
[PATCH] ppc64: Increase sparsemem defaults
The definitions in sparsemem.h arent sufficient. We currently sell
machines with 2TB of RAM, and in order to give us room for a few years
growth lets set it to 16TB.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 03:22:35 +0000 (14:22 +1100)]
[PATCH] ppc64: Convert NUMA to sparsemem (3)
Convert to sparsemem and remove all the discontigmem code in the
process. This has a few advantages:
- The old numa_memory_lookup_table can go away
- All the arch specific discontigmem magic can go away
We also remove the triple pass of memory properties and instead create a
list of per node extents that we iterate through. A final cleanup would
be to change our lmb code to store extents per node, then we can reuse
that information in the numa code.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 02:53:11 +0000 (13:53 +1100)]
[PATCH] ppc64: Quieten lparcfg
If we dont have permission to read some information from the hypervisor,
lparcfg outputs a warning on the console. Now that lparcfg is world
readable this is a problem.
Dont warn in the case of H_Authority, remove some unnecessary function
prototypes and fix whitespace damage in a structure as well.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Kumar Gala [Thu, 10 Nov 2005 16:34:33 +0000 (10:34 -0600)]
[PATCH] ppc32: fix PQ2 PCI DMA interrupt handling
The bit position in the status register corresponding to the
PCI DMA interrupt was incorrect. Additionally, we did not
have a define for the PCI DMA interrupt.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Use "hints" to speed up the SACK processing. Various forms
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Heffner [Fri, 11 Nov 2005 01:11:48 +0000 (17:11 -0800)]
[TCP]: receive buffer growth limiting with mixed MTU
This is a patch for discussion addressing some receive buffer growing issues.
This is partially related to the thread "Possible BUG in IPv4 TCP window
handling..." last week.
Specifically it addresses the problem of an interaction between rcvbuf
moderation (receiver autotuning) and rcv_ssthresh. The problem occurs when
sending small packets to a receiver with a larger MTU. (A very common case I
have is a host with a 1500 byte MTU sending to a host with a 9k MTU.) In
such a case, the rcv_ssthresh code is targeting a window size corresponding
to filling up the current rcvbuf, not taking into account that the new rcvbuf
moderation may increase the rcvbuf size.
One hunk makes rcv_ssthresh use tcp_rmem[2] as the size target rather than
rcvbuf. The other changes the behavior when it overflows its memory bounds
with in-order data so that it tries to grow rcvbuf (the same as with
out-of-order data).
These changes should help my problem of mixed MTUs, and should also help the
case from last week's thread I think. (In both cases though you still need
tcp_rmem[2] to be set much larger than the TCP window.) One question is if
this is too aggressive at trying to increase rcvbuf if it's under memory
stress.
Orignally-from: John Heffner <jheffner@psc.edu> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.
The orignal ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify the code that comuputes microsecond rtt estimate used
by TCP Vegas. Move the callback out of the RTT sampler and into
the end of the ack cleanup.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[TCP]: fix congestion window update when using TSO deferal
TCP peformance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.
The problem is with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window, we hold out to until
we can send a MSS chunk. But, we also don't update the congestion window
unless we have filled, as per RFC2861.
This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Barnes [Wed, 9 Nov 2005 04:13:02 +0000 (20:13 -0800)]
[PATCH] PCI: fix for Toshiba ohci1394 quirk
After much testing and agony, I've discovered that my previous ohci1394
quirk for Toshiba laptops is not 100% reliable. It apparently fails to
do the interrupt line change either correctly or in time, since in about
2 out of 5 boots, the kernel's irqdebug code will *still* disable irq 11
when the ohci1394 driver is loaded (at pci_enable_device time I think).
This patch switches things around a little in the workaround. First, it
removes the mdelay. I didn't see a need for it and my testing has shown
that it's not necessary for the quirk to work.
Secondly, instead of trying to change the interrupt line to what ACPI
tells us it should be, this patch makes the quirk use the value in the
PCI_INTERRUPT_LINE register. On this laptop at least, that seems to be
the right thing to do, though additional testing on other laptops and/or
with actual firewire devices would be appreciated.
Ashok Raj [Wed, 9 Nov 2005 05:42:33 +0000 (21:42 -0800)]
[PATCH] PCI: Change MSI to use physical delivery mode always
MSI hardcoded delivery mode to use logical delivery mode. Recently
x86_64 moved to use physical mode addressing to support physflat mode.
With this mode enabled noticed that my eth with MSI werent working.
msi_address_init() was hardcoded to use logical mode for i386 and x86_64.
So when we switch to use physical mode, things stopped working.
Since anyway we dont use lowest priority delivery with MSI, its always
directed to just a single CPU. Its safe and simpler to use
physical mode always, even when we use logical delivery mode for IPI's
or other ioapic RTE's.
Adrian Bunk [Sun, 6 Nov 2005 00:45:08 +0000 (01:45 +0100)]
[PATCH] PCI: drivers/pci/: small cleanups
This patch contains the following cleanups:
- access.c should #include "pci.h" for getting the prototypes of it's
global functions
- hotplug/shpchp_pci.c: make the needlessly global function
program_fw_provided_values() static
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Grant Coady [Sat, 5 Nov 2005 23:52:51 +0000 (10:52 +1100)]
[PATCH] pci_ids cleanup: fix two additional IDs in bt87x
pci_ids cleanup: fixup bt87x.c: two macro defined IDs missed in prior cleanup.
Caught by Chun-Chung Chen <cjj@u.washington.edu>: "In the patch for bt87x.c,
you seemed have missed the two occurrences of BT_DEVICE on line 897 and
line 898."
Signed-off-by: Grant Coady <gcoady@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
John Rose [Fri, 4 Nov 2005 21:38:50 +0000 (15:38 -0600)]
[PATCH] dlpar regression for ppc64 - probe change
This patch contains the driver bits for enabling DLPAR and PCI Hotplug
for the new OF-based PCI probe. This functionality was regressed when
the new PCI approach was introduced. Please apply if appropriate.
Signed-off-by: John Rose <johnrose@austin.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] pciehp: fix handling of power faults during hotplug
The current pciehp implementation reports a power-fail error
even if the condition has cleared by the time the corresponding
interrupt handling code gets a chance to run. This patch
fixes this problem.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] pciehp: clean-up how we request control of hotplug hardware
This patch further tweaks how we request control of hotplug
controller hardware from BIOS. We first search the ACPI namespace
corresponding to a specific hotplug controller looking for an
_OSC or OSHP method. On failure, we successively move to the
ACPI parent object, till we hit the highest level host bridge
in the hierarchy. This allows for different types of BIOS's
which place the _OSC/OSHP methods at various places in the acpi
namespace, while still not encroaching on the namespace of
some other root level host bridge.
This patch also introduces a new load time option (pciehp_force)
that allows us to bypass all _OSC/OSHP checking. Not supporting
these methods seems to be be the most common ACPI firmware problem
we've run into. This will still _not_ allow the pciehp driver to
work correctly if the BIOS really doesn't support pciehp (i.e. if
it doesn't generate a hotplug interrupt). Use this option with
caution. Some BIOS's may deliberately not build any _OSC/OSHP
methods to make sure it retains control the hotplug hardware.
Using the pciehp_force parameter for such systems can lead to
two separate entities trying to control the same hardware.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] pciehp: request control of each hotplug controller individually
This patch tweaks the way pciehp requests control of the hotplug
hardware from BIOS. It now tries to invoke the ACPI _OSC method
for a specific hotplug controller only, rather than walking the
entire acpi namespace invoking all possible _OSC methods under
all host bridges. This allows us to gain control of each hotplug
controller individually, even if BIOS fails to give us control of
some other hotplug controller in the system.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the number of debug messages generated if pciehp debug is
enabled. I tried to restrict this to removing debug messages that
are either early-driver-debug type messages, or print information
that can be inferred through other debug prints.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
State information is currently stored in per-slot as well as
per-pci-function data structures in pciehp. There's a lot of
overlap in the information kept, and some of it is never used.
This patch consolidates the state information to per-slot and
eliminates unused data structures. The biggest change is to
eliminate the pci_func structure and the code around managing
its lists.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the PCI Express hotplug driver's dependence on ACPI.
We don't walk the acpi namespace anymore to build a list of
bridges and devices. We go to ACPI only to run the _OSC or
_OSHP methods to transition control of hotplug hardware from
system BIOS to the hotplug driver, and to run the _HPP
method to get hotplug device parameters like cache line size,
latency timer and SERR/PERR enable from BIOS.
Note that one of the side effects of this patch is that pciehp
does not automatically enable the hot-added device or its DMA
bus mastering capability now. It expects the device driver to
do that. This may break some drivers and we will have to fix
them as they are reported.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] patch 1/8] pciehp: use the PCI core for hotplug resource management
This patch converts the pci express hotplug controller driver
to use the PCI core for resource management. This eliminates a
lot of duplicated code and integrates pciehp with the system's
normal PCI handling code.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Roland Dreier [Sat, 29 Oct 2005 00:35:34 +0000 (17:35 -0700)]
[PATCH] PCI: add pci_find_next_capability()
Some devices have more than one capability of the same type. For
example, the PCI header for the PathScale InfiniPath looks like:
04:01.0 InfiniBand: Unknown device 1fc1:000d (rev 02)
Subsystem: Unknown device 1fc1:000d
Flags: bus master, fast devsel, latency 0, IRQ 193
Memory at fea00000 (64-bit, non-prefetchable) [size=2M]
Capabilities: [c0] HyperTransport: Slave or Primary Interface
Capabilities: [f8] HyperTransport: Interrupt Discovery and Configuration
There are _two_ HyperTransport capabilities, and the PathScale driver
wants to look at both of them.
The current pci_find_capability() API doesn't work for this, since it
only allows us to get to the first capability of a given type. The
patch below introduces a new pci_find_next_capability(), which can be
used in a loop like
Pavel Roskin [Thu, 10 Nov 2005 21:03:08 +0000 (13:03 -0800)]
[NET]: Annotate h_proto in struct ethhdr
The protocol field in ethernet headers is big-endian and should be
annotated as such. This patch allows detection of missing ntohs() calls
on the ethernet protocol field when sparse is run with __CHECK_ENDIAN__
defined.
This is a revised version that includes <linux/types.h> so that the
userspace programs are not confused by __be16. Thanks to David S.
Miller.
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the patch that introduces the generic skb_checksum_complete
which also checks for hardware RX checksum faults. If that happens,
it'll call netdev_rx_csum_fault which currently prints out a stack
trace with the device name. In future it can turn off RX checksum.
I've converted every spot under net/ that does RX checksum checks to
use skb_checksum_complete or __skb_checksum_complete with the
exceptions of:
* Those places where checksums are done bit by bit. These will call
netdev_rx_csum_fault directly.
* The following have not been completely checked/converted:
ipmr
ip_vs
netfilter
dccp
This patch is based on patches and suggestions from Stephen Hemminger
and David S. Miller.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Deprecating the -imacros fixes the build for me. It does not appear to be a
simple argument overflow problem in trapcpp0, since deprecating all the defines
reproduces the problem as well. Also, switching -imacros to -include fixes the
problem.
Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Dean Nelson [Wed, 9 Nov 2005 20:41:57 +0000 (14:41 -0600)]
[IA64] utilize notify_die() for XPC disengage
XPC (as in arch/ia64/sn/kernel/xp*) has a need to notify other partitions
(SGI Altix) whenever a partition is going down in order to get them to
disengage from accessing the halting partition's memory. If this is not
done before the reset of the hardware, the other partitions can find
themselves encountering MCAs that bring them down.
Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Roland Dreier [Thu, 10 Nov 2005 18:18:23 +0000 (10:18 -0800)]
[IB] umad: further ib_unregister_mad_agent() deadlock fixes
The previous umad deadlock fix left ib_umad_kill_port() still
vulnerable to deadlocking. This patch fixes that by downgrading our
lock to a read lock when we might end up trying to reacquire the lock
for reading.
Roland Dreier [Wed, 9 Nov 2005 20:23:17 +0000 (12:23 -0800)]
[IB] mthca: fix wraparound handling in mthca_cq_clean()
Handle case where prod_index has wrapped around and become less than
cq->cons_index by checking that their difference as a signed int is
positive rather than comparing directly.
Move the computation of QP capabilities (max scatter/gather entries,
max inline data, etc) into the kernel, and have the uverbs module
return the values as part of the create QP response. This keeps
precise knowledge of device limits in the low-level kernel driver.
This requires an ABI bump, so while we're making changes, get rid of
the max_sge parameter for the modify SRQ command -- it's not used and
shouldn't be there.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Tue, 8 Nov 2005 19:10:25 +0000 (11:10 -0800)]
[IB] Have cq_resize() method take an int, not int*
Change the struct ib_device.resize_cq() method to take a plain integer
that holds the new CQ size, rather than a pointer to an integer that
it uses to return the new size. This makes the interface match the
exported ib_resize_cq() signature, and allows the low-level driver to
update the CQ size with proper locking if necessary.
Roland Dreier [Mon, 7 Nov 2005 18:41:29 +0000 (10:41 -0800)]
[IB] umad: avoid potential deadlock when unregistering MAD agents
ib_unregister_mad_agent() completes all pending MAD sends and waits
for the agent's send_handler routine to return. umad's send_handler()
calls queue_packet(), which does down_read() on the port mutex to look
up the agent ID. This means that the port mutex cannot be held for
writing while calling ib_unregister_mad_agent(), or else it will
deadlock. This patch fixes all the calls to ib_unregister_mad_agent()
in the umad module to avoid this deadlock.
Roland Dreier [Mon, 7 Nov 2005 18:33:11 +0000 (10:33 -0800)]
[IPoIB] add path record information in debugfs
Add ibX_path files to debugfs that contain information about the IPoIB
path cache. IPoIB ARP only gives GIDs, which the IPoIB driver must
resolve to real IB paths through the ib_sa module. For debugging,
when the ARP table looks OK but traffic isn't flowing, it's useful to
be able to see if the resolution from GID to path worked.
Also clean up the formatting of the existing _mcg debugfs files.
Liam Girdwood [Thu, 10 Nov 2005 17:45:39 +0000 (17:45 +0000)]
[ARM] 3098/1: pxa2xx disable ssp irq
Patch from Liam Girdwood
This patch allows users of the pxa SSP driver to register their own irq
handlers instead of using the default SSP handler. It also cleans up the
CKEN clock and irq detection as the values are now stored in a table.
This patch replaces 2845/1
Changes:-
o Added flags parameter to ssp_init()
o Added SSP_NO_IRQ flag to disable registering of ssp irq handler (for
drivers that want to register their own handler)
o Cleaned up clock and irq detection, values are now stored in table.
o Added build changes to allow other drivers (e.g audio) to select the
ssp driver.
o corgi_ssp.c changed to use new interface.
Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com> Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Richard Purdie [Thu, 10 Nov 2005 17:42:29 +0000 (17:42 +0000)]
[ARM] 3096/1: Add SharpSL Zaurus power and battery management core driver
Patch from Richard Purdie
This patch adds a power and battery management core driver which with
the addition of the right device files, supports the c7x0 and cxx00
series of Sharp Zaurus handhelds.
The driver is complex for several reasons. Battery charging is manually
monitored and controlled. When suspended, the device needs to
periodically partially resume, check the charging status and then
re-suspend. It does without bothering the higher linux layers as
a full resume and re-suspend is unnecessary. The code is carefully
written to avoid interrupts or calling code outside the module under
these circumstances. It also vets the various wake up sources and
monitors the device's power situation.
Hooks to limit the backlight intensity and to notify the battery
monitoring code of backlight events are connected/added as the
backlight is one of the biggest users of power on the device.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>