err.no Git - linux-2.6/log

[PATCH] x86_64: Don't clobber r8-r11 in int 0x80 handler

When int 0x80 is called from long mode r8-r11 would leak out of the
kernel (or rather they would be filled with some values from
the kernel stack). I don't think it's a security issue because
the values come from the fixed stack frame which should be near
always user registers from a previous interrupt.

Still better fix it.

Longer term the register save macros need to be cleaned up
to avoid such mistakes in the future.

Original analysis from Richard Brunner, fix by me.

Cc: Richard.Brunner@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] i386/x86-64: Add user_mode checks to profile_pc for oprofile

Fixes a obscure user space triggerable crash during oprofiling.

Oprofile calls profile_pc from NMIs even when user_mode(regs) is not true and
the program counter is inside the kernel lock section. This opens
a race - when a user program jumps to a kernel lock address and
a NMI happens before the illegal page fault exception is raised
and the program has a unmapped esp or ebp then the kernel could
oops. NMIs have a higher priority than exceptions so that could
happen.

Add user_mode checks to i386/x86-64 profile_pc to prevent that.

Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] update default configuration
  [S390] duplicate ccw devices in ccwgroup.
  [S390] permanent subchannel busy conditions may cause I/O stall

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SUNLANCE]: fix compilation on sparc-UP
  [SPARC]: Defer clock_probe to fs_initcall()
  [SPARC64]: Fix typo in pgprot_noncached().
  [SPARC64]: Fix quad-float multiply emulation.

Merge git://oss.sgi.com:8090/nathans/xfs-rc-2.6

* git://oss.sgi.com:8090/nathans/xfs-rc-2.6:
  [XFS] Ensure bulkstat from an invalid inode number gets caught always with
  [XFS] Fix a barrier related forced shutdown on mounts with quota enabled.
  [XFS] Fix remount vs no/barrier options by ensuring we clear unwanted
  [XFS] All xfs_disk_dquot_t values are (as the name says) disk endian.

Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block

* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] scsi: kill overeager "not-ready" messages
  [PATCH] it821x: fix ide dma setup bug
  [PATCH] ide: if the id fields looks screwy, disable DMA
  [PATCH] ide: option to disable cache flushes for buggy drives

[PATCH] i386: switch_to(): misplaced parentheses

Recent changes in i386 __switch_to() have a misplaced closing
parenthesis causing an unlikely() to terminate early.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[SUNLANCE]: fix compilation on sparc-UP

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[XFS] Ensure bulkstat from an invalid inode number gets caught always with
EINVAL.

SGI-PV: 953819
SGI-Modid: xfs-linux-melb:xfs-kern:26629a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Fix a barrier related forced shutdown on mounts with quota enabled.

SGI-PV: 912426
SGI-Modid: xfs-linux-melb:xfs-kern:26622a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Fix remount vs no/barrier options by ensuring we clear unwanted
flags from iclog buffers before submitting them for writing.

SGI-PV: 954772
SGI-Modid: xfs-linux-melb:xfs-kern:26605a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] All xfs_disk_dquot_t values are (as the name says) disk endian.
Before putting them into struct statfs they should be endian-swapped.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26550a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>

[PATCH] scsi: kill overeager "not-ready" messages

HAL and friends have a tendency to trigger this one all the time.
It's not really interesting, so kill it. The vendor kernels all do
anyways.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] it821x: fix ide dma setup bug

Only enable dma for a valid speed setting.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] ide: if the id fields looks screwy, disable DMA

It's the safer choice. Originally due to a bug in itx821x, but a
generally sound thing to do.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] ide: option to disable cache flushes for buggy drives

Some drives claim they support cache flushing, but get seriously
confused if you try. Add this option to be able to boot with
barriers enabled by default.

Signed-off-by: Jens Axboe <axboe@suse.de>

[SPARC]: Defer clock_probe to fs_initcall()

From: Bob Breuer <breuerr@mc.net>

That way all the of_driver bits will be ready.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix typo in pgprot_noncached().

The sun4v code sequence was or'ing in the sun4u pte bits by mistake.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix quad-float multiply emulation.

Something is wrong with the 3-multiply (vs. 4-multiply) optimized
version of _FP_MUL_MEAT_2_*(), so just use the slower version
which actually computes correct values.

Noticed by Rene Rebe

Signed-off-by: David S. Miller <davem@davemloft.net>

[S390] update default configuration

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] duplicate ccw devices in ccwgroup.

Fail to create a ccwgroup device if a ccw device is passed in twice.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] permanent subchannel busy conditions may cause I/O stall

In special conditions where a subchannel rejects the HALT I/O-
instruction with a busy indication (cc 2), I/O may stall.
I/O request termination logic retries HALT I/O indefinitely
because it expects HALT I/O to alter the subchannel status which
is not true when cc 2 is returned.
In case of a busy indication, try CLEAR I/O instruction immediately.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[PATCH] fix compile regression for a few scsi drivers

This fixes three drivers to compile again after my patch that removes
the data_cmnd member from struct scsi_cmnd.

The fas216 change is trivial, it should have been using ->cmnd all the
time.

NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing
something odd, it should only have looked at ->cmnd before not the saved
copy that is kept for the error handlers sake. Note that it really
should deal with the sync setting themselves but use the generic domain
validation code that get this right - but that's for later let's push
this simple compile fix for now.

And sorry for the late fix for this, I have been busy with OLS and
associated activities last week.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SCSI] esp: Fix build.
  [SPARC]: Fix SA_STATIC_ALLOC value.
  [SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().
  [IPV4] ipmr: ip multicast route bug fix.
  [TG3]: Update version and reldate
  [TG3]: Handle tg3_init_rings() failures
  [TG3]: Add tg3_restart_hw()
  [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.
  [IPV6]: Clean skb cb on IPv6 input.
  [NETFILTER]: Demote xt_sctp to EXPERIMENTAL
  [NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule
  [NETFILTER]: xt_pkttype: fix mismatches on locally generated packets
  [NETFILTER]: SNMP NAT: fix byteorder confusion
  [NETFILTER]: conntrack: fix SYSCTL=n compile
  [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject
  [NETFILTER]: H.323 helper: fix possible NULL-ptr dereference

[PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizare

The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.

The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
   __cpufreq_driver_target
   __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
   __cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
   lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
   __cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
   the lock_cpu_hotplug() lock.

I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.

The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)

The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().

From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp

The recvmsg() for raw socket seems to return random u16 value
from the kernel stack memory since port field is not initialized.
But I'm not sure this patch is correct.
Does raw socket return any information stored in port field?

[ BSD defines RAW IP recvmsg to return a sin_port value of zero.
This is described in Steven's TCP/IP Illustrated Volume 2 on
page 1055, which is discussing the BSD rip_input() implementation. ]

Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] ipmr: ip multicast route bug fix.

IP multicast route code was reusing an skb which causes use after free
and double free.

From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

Note, it is real skb_clone(), not alloc_skb(). Equeued skb contains
the whole half-prepared netlink message plus room for the rest.
It could be also skb_copy(), if we want to be puristic about mangling
cloned data, but original copy is really not going to be used.

Acked-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Update version and reldate

Update version to 3.63.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Handle tg3_init_rings() failures

Handle dev_alloc_skb() failures when initializing the RX rings.
Without proper handling, the driver will crash when using a partial
ring.

Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and
providing the initial patch.

Howie Xu <howie@vmware.com> also reported the same issue.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Add tg3_restart_hw()

Add tg3_restart_hw() to handle failures when re-initializing the
device.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecs

The CIC_SEEKY() test really wants to use the minimum of either:

- 2 msecs (not jiffies)

- or, the pending slice time

So code it like that.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] blktrace: fix read-ahead bit

It should be toggling the same bit on and off, fix it up.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] cciss: fix stall with softirq handling and CFQ

We need to postpone the queue startup until after the softirq
handler has actually finished some requests, otherwise we could
be racing with cciss_softirq_done() and not actually restart
the queue handling.

Signed-off-by: Jens Axboe <axboe@suse.de>

[IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6]: Clean skb cb on IPv6 input.

Clear the accumulated junk in IP6CB when starting to handle an IPV6
packet.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: Demote xt_sctp to EXPERIMENTAL

After the recent problems with all the SCTP stuff it seems reasonable
to mark this as experimental.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule

Add bridge netfilter deferred output hooks to feature-removal-schedule
and disable them by default. Until their removal they will be
activated by the physdev match when needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: xt_pkttype: fix mismatches on locally generated packets

Locally generated broadcast and multicast packets have pkttype set to
PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This
causes the pkttype match to fail to match packets of either type.

The below patch remedies this by using the daddr as a hint as to
broadcast|multicast. While not pretty, this seems like the only way
to solve the problem short of just noting this as a limitation of the
match.

This resolves netfilter bugzilla #484

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: SNMP NAT: fix byteorder confusion

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: conntrack: fix SYSCTL=n compile

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject

In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts
can happen when userspace is buggy. Reinject the packet in case of NF_STOP,
drop on unknown verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: H.323 helper: fix possible NULL-ptr dereference

An RCF message containing a timeout results in a NULL-ptr dereference if
no RRQ has been seen before.

Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu>
and Isil Dillig <isil@stanford.edu>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCSI] esp: Fix build.

The data_cmd[] member got deleted, so do not use it any more. Scsi
commands do not have their ->cmd[] overwritten temporary to probe for
status after an error before retrying.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix SA_STATIC_ALLOC value.

It alises IRQF_SHARED which causes all kinds of
problems.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.

That way we'll have at least some debugging info even if
the stack dump explodes.

Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Correct dev_alloc_skb kerneldoc

dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers
use it for the latter anyway, but that's a different story)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKB

skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow
architectures to reimplement __dev_alloc_skb. It's not set on any
architecture and now that we have an architecture-overrideable
NET_SKB_PAD there is not point at all to have one either.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[VLAN]: Fix link state propagation

When the queue of the underlying device is stopped at initialization time
or the device is marked "not present", the state will be propagated to the
vlan device and never change. Based on an analysis by Patrick McHardy.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
ACKed-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] xfrm6_tunnel: Delete debugging code.

It doesn't compile, and it's dubious in several regards:

1) is enabled by non-Kconfig controlled CONFIG_* value
(noted by Randy Dunlap)
2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use
3) the debugging messages print object pointer addresses
which have no meaning without context

So let's just get rid of it.

Signed-off-by: David S. Miller <davem@davemloft.net>

[Bluetooth] Enable SCO support for Broadcom HID proxy dongle

The Broadcom dongles with HID proxy support actually support SCO over
HCI if the SCO buffer size values are corrected. So instead of disabling
the SCO support, mark this dongle with the quirk for the Bluetooth core
to correct the wrong buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Add quirk for another broken RTX Telecom based dongle

This patch disables the ISOC transfers for another broken RTX Telecom
based USB dongle. Starting the USB ISOC transfers only ends in a burst
of error messages for invalid SCO packets on connection handle 0.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for Belkin devices

The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip
from Broadcom and their SCO buffer size values are wrong. The Bluetooth
core should correct these values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for another Broadcom chip

The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a
Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver
has to set a quirk to correct the SCO buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct RFCOMM channel MTU for broken implementations

Some Bluetooth RFCOMM implementations try to negotiate a bigger channel
MTU than we can support for a particular session. The maximum MTU for
a RFCOMM session is limited through the L2CAP layer. So if the other
side proposes a channel MTU that is bigger than the underlying L2CAP
MTU, we should reduce it to the L2CAP MTU of the session minus five
bytes for the RFCOMM headers.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.

In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so,
less than USEC_PER_SEC too) then tv_res will be smaller than tv. The
affectation "(tv_res).tv_usec = __delta;" is wrong. The fix is to
revert to the original code before
4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' in
'while'.

[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DCCP]: Fix default sequence window size

When using the default sequence window size (100) I got the following in
my logs:

Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC...
Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC...

I went to alter the default sysctl and it didn't take for new sockets.
Below patch fixes this.

I think the default is too low but it is what the DCCP spec specifies.

As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec
to 3.5. This is still far too slow but it is a step in the right direction.

Compile tested only for IPv6 but not particularly complex change.

Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/mthca: Initialize max_cmds before debug code prints it

Read the max_cmds value from the response to the QUERY_FW command
before printing out the value, so that the real value goes into the
debug output.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix packet loss after hardware address update

The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix oops with ipoib_debug_mcast set

Need to set mcast->ah before debug code dereferences it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mad: Validate MADs for spec compliance

Validate MADs sent by userspace clients for spec compliance with
C13-18.1.1 (prevent duplicate requests and responses sent on the
same port). Without this, RMPP transactions get aborted because
of duplicate packets.

This patch is similar to that provided by Jack Morgenstein.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: ipath_skip_sge() can break if num_sge > 1

ipath_skip_sge() doesn't exactly duplicate the side effects of
ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge.
This could result in the sg_list being accessed out of bounds.
Since ipath_skip_sge() is almost always called with num_sge == 1,
the original "optimization" is almost never used.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix ib_ipath driver to work with SRP

I am still working on a proposal to remove the phys_to_virt() calls
in the ib_ipath driver. In the mean time, this patch allows SRP
to work by fixing the R_Key check and conversion from IB address
to kernel virtual address. It also returns the correct page size
for FMRs.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix a data corruption

This patch fixes a problem where certain error packets are passed
to the InfiniBand layer for processing even though the packet
actually was received with an error.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix SRQ limit event range check

Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot
be set beyond srq->max - 1.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/uverbs: Fix lockdep warnings

Lockdep warns because uverbs is trying to take uobj->mutex when it
already holds that lock. This is because there are really multiple
types of uobjs even though all of their locks are initialized in
common code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/uverbs: Fix unlocking in error paths

ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the
PD's read lock in their error paths, which lead to deadlock when
destroying the PD.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock

Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.

It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.

This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.

This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it.  We don't have any other reproducible test case.

The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.

This is in the cpuset_rmdir() code, where we:
  * take the callback_mutex lock
  * mark the cpuset CS_REMOVED
  * call update_cpu_domains for cpu_exclusive cpusets
  * in that call, take the cpu_hotplug lock if the
    cpuset is marked for removal.

Thanks to Jack Steiner for identifying this deadlock.

The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock.  This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.

I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two.  The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cpu hotplug: simplify and hopefully fix locking

The CPU hotplug locking was quite messy, with a recursive lock to
handle the fact that both the actual up/down sequence wanted to
protect itself from being re-entered, but the callbacks that it
called also tended to want to protect themselves from CPU events.

This splits the lock into two (one to serialize the whole hotplug
sequence, the other to protect against the CPU present bitmaps
changing). The latter still allows recursive usage because some
subsystems (ondemand policy for cpufreq at least) had already gotten
too used to the lax locking, but the locking mistakes are hopefully
now less fundamental, and we now warn about recursive lock usage
when we see it, in the hope that it can be fixed.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[cpufreq] ondemand: make shutdown sequence more robust

Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.

This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [TIPC]: Removing useless casts
  [IPV4]: Fix nexthop realm dumping for multipath routes
  [DUMMY]: Avoid an oops when dummy_init_one() failed
  [IFB] After ifb_init_one() failed, i is increased. Decrease
  [NET]: Fix reversed error test in netif_tx_trylock
  [MAINTAINERS]: Mark LAPB as Oprhan.
  [NET]: Conversions from kmalloc+memset to k(z|c)alloc.
  [NET]: sun happymeal, little pci cleanup
  [IrDA]: Use alloc_skb() in IrDA TX path
  [I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
  [I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
  [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
  [SCTP]: Set chunk->data_accepted only if we are going to accept it.
  [SCTP]: Verify all the paths to a peer via heartbeat before using them.
  [SCTP]: Unhash the endpoint in sctp_endpoint_free().
  [SCTP]: Check for NULL arg to sctp_bucket_destroy().
  [PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
  [WAN]: Converted synclink drivers to use netif_carrier_*()
  [WAN]: Cosmetic changes to N2 and C101 drivers
  [WAN]: Added missing netif_dormant_off() to generic HDLC
  ...

[TIPC]: Removing useless casts

Removing useless casts

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Fix nexthop realm dumping for multipath routes

Routing realms exist per nexthop, but are only returned to userspace
for the first nexthop. This is due to the fact that iproute2 only
allows to set the realm for the first nexthop and the kernel refuses
multipath routes where only a single realm is present.

Dump all realms for multipath routes to enable iproute to correctly
display them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DUMMY]: Avoid an oops when dummy_init_one() failed

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IFB] After ifb_init_one() failed, i is increased. Decrease

It before entering in the loop for freeing the other ifb devices.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix reversed error test in netif_tx_trylock

A non-zero return value indicates success from spin_trylock,
not error.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[MAINTAINERS]: Mark LAPB as Oprhan.

Maintainer email not longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Conversions from kmalloc+memset to k(z|c)alloc.

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: sun happymeal, little pci cleanup

Use pci_register_driver instead of pci_module_init. Use PCI_DEVICE macro.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IrDA]: Use alloc_skb() in IrDA TX path

As pointed out by Christoph Hellwig, dev_alloc_skb() is not intended to be
used for allocating TX sk_buff. The IrDA stack was exclusively calling
dev_alloc_skb() on the TX path, and this patch fixes that.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine

Changes pci_module_init() to pci_register_driver().

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>

Every file should #include the headers containing the prototypes for
its global functions.

Especially in cases like this one where gcc can tell us through a
compile error that the prototype was wrong...

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed

This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Set chunk->data_accepted only if we are going to accept it.

Currently there is a code path in sctp_eat_data() where it is possible
to set this flag even when we are dropping this chunk.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Verify all the paths to a peer via heartbeat before using them.

This patch implements Path Initialization procedure as described in
Sec 2.36 of RFC4460.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Unhash the endpoint in sctp_endpoint_free().

This prevents a race between the close of a socket and receive of an
incoming packet.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Check for NULL arg to sctp_bucket_destroy().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)

CONFIG_DEBUG_SLAB found the following bug:
netem_enqueue() in sch_netem.c gets a pointer inside a slab object:
struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb;
But then, the slab object may be freed:
skb = skb_unshare(skb, GFP_ATOMIC)
cb is still pointing inside the freed skb, so here is a patch to
initialize cb later, and make it clear that initializing it sooner
is a bad idea.

[From Stephen Hemminger: leave cb unitialized in order to let gcc
complain in case of use before initialization]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Converted synclink drivers to use netif_carrier_*()

WAN: Converted synclink drivers to use netif_carrier_*() instead
of hdlc_set_carrier().

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Cosmetic changes to N2 and C101 drivers

WAN: Cosmetic changes to N2 and C101 drivers

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Added missing netif_dormant_off() to generic HDLC

WAN: Fixed a problem with PPP/raw HDLC/X.25 protocols not doing
netif_dormant_off() at startup.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Get rid of redundant IPCB->opts initialisation

Now that we always zero the IPCB->opts in ip_rcv, it is no longer
necessary to do so before calling netif_rx for tunneled packets.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Update defconfig.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix length parameter verification in sys_getdomainname().

Found by scrashme.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunzilog: Fix instance enumeration.

Just do a linear enumeration so that we handle sun4d systems
correctly. As a consequence, eliminate the hard coded keyboard and
mouse channel line values, use the CONS_{KEYB,MS} flags instead.

Also, report the keyboard/mouse Zilog channels just like the uart ones
do.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunzilog: Remove duplicate IRQ registry in zs_probe().

We do it now in sunzilog_init() after all devices have been
probed.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Get sun4d SMP building again.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Do not call sun4m_irq_rotate on sun4d.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Simplify and correct __cpu_find_by()

By using for_each_node_by_type().

Also, correct a spurioud test in check_cpu_node() on sparc64.
It is only called with nodes that have device_type "cpu".

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Initialize iounit spinlock in iounit_init().

Signed-off-by: David S. Miller <davem@davemloft.net>