err.no Git - linux-2.6/log
linux-2.6
16 years ago[NET]: Avoid copying TCP packets unnecessarily
Herbert Xu [Mon, 15 Oct 2007 08:47:15 +0000 (01:47 -0700)]
[NET]: Avoid copying TCP packets unnecessarily

TCP packets all have writable heads, that is, even though it's cloned, it is
writable up to the end of the TCP header.  This patch makes skb_checksum_help
aware of this fact by using skb_clone_writable and avoiding a copy for TCP.

I've also modified the BUG_ON tests to be unsigned.  The only case where this
makes a difference is if csum_start points to a location before skb->data.
Since skb->data should always include the header where the checksum field
is (and all current callers adhere to that), this change is safe and may
uncover bugs later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
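The effect of making the BUG_ON tests unsigned can be sketched in plain C.
This is an illustration only, not the code of skb_checksum_help: both helper
names are hypothetical, and `offset` stands in for the distance of csum_start
from skb->data.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bounds test done with signed arithmetic. */
static bool out_of_range_signed(int offset, int len)
{
    return offset >= len;   /* a negative offset slips through */
}

/* The same test done unsigned, as the commit message describes. */
static bool out_of_range_unsigned(int offset, int len)
{
    /* A negative offset converts to a very large unsigned value. */
    return (unsigned int)offset >= (unsigned int)len;
}
```

With an offset of -4 and a length of 64, the signed test reports nothing
while the unsigned test fires, which is the "may uncover bugs later" case.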
16 years ago[NET]: Fix csum_start update in pskb_expand_head
Herbert Xu [Mon, 15 Oct 2007 08:46:08 +0000 (01:46 -0700)]
[NET]: Fix csum_start update in pskb_expand_head

I got confused by the dual nature of the off variable in the
function pskb_expand_head.  The csum_start offset should use
nhead instead of off which can change depending on whether we
are using offsets or pointers.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[NIU]: getting rid of __ucmpdi2 in niu.o
Al Viro [Mon, 15 Oct 2007 08:42:31 +0000 (01:42 -0700)]
[NIU]: getting rid of __ucmpdi2 in niu.o

By the time we get to that switch by PHY type, we have an 8-bit
value.  No need to keep it in a u64 when a u8 would do.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
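A minimal sketch of the idea, assuming that on some 32-bit targets the
compiler lowers 64-bit comparisons in a switch into calls to the libgcc
helper __ucmpdi2. The PHY codes, the shift parameter and the function name
are hypothetical, not taken from the NIU driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PHY type codes; the real values live in the driver. */
enum { TOY_PHY_A = 0, TOY_PHY_B = 1, TOY_PHY_C = 2 };

static int classify_phy(uint64_t raw, unsigned int shift)
{
    /* Narrow first: only 8 bits of the field are meaningful here. */
    uint8_t type = (uint8_t)((raw >> shift) & 0xff);

    switch (type) {         /* the switch now compares 8-bit values */
    case TOY_PHY_A: return 10;
    case TOY_PHY_B: return 20;
    case TOY_PHY_C: return 30;
    default:        return -1;
    }
}
```

Switching on the narrowed value keeps every comparison within a native
register width, so no 64-bit compare helper is needed.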
16 years ago[NETLINK]: Don't leak 'listeners' in netlink_kernel_create()
Jesper Juhl [Mon, 15 Oct 2007 08:39:12 +0000 (01:39 -0700)]
[NETLINK]: Don't leak 'listeners' in netlink_kernel_create()

The Coverity checker spotted that we'll leak the storage allocated
to 'listeners' in netlink_kernel_create() when the
  if (!nl_table[unit].registered)
check is false.

This patch avoids the leak.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
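The leak pattern can be sketched in userspace C as follows. This is not the
netlink code: struct table_entry and entry_create() are hypothetical
stand-ins for an nl_table[] slot and for the creation path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one table slot. */
struct table_entry {
    bool registered;
    unsigned long *listeners;
};

/* Returns 0 on success, -1 on allocation failure. */
static int entry_create(struct table_entry *e, size_t nwords)
{
    unsigned long *listeners = calloc(nwords, sizeof(*listeners));

    if (!listeners)
        return -1;

    if (!e->registered) {
        e->listeners = listeners;   /* ownership moves to the table */
        e->registered = true;
    } else {
        free(listeners);            /* without this branch the buffer leaks */
    }
    return 0;
}
```

The point is only that every path leaving the function either hands the
allocation to an owner or frees it.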
16 years ago[IPV6] __inet6_csk_dst_store(): fix check-after-use
Adrian Bunk [Mon, 15 Oct 2007 08:37:55 +0000 (01:37 -0700)]
[IPV6] __inet6_csk_dst_store(): fix check-after-use

The Coverity checker spotted that we have already oops'ed if "dst" was
NULL.

Since "dst" being NULL doesn't seem to be possible at this point this
patch removes the NULL check.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
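A check-after-use has roughly the following shape. The types and the
function name are hypothetical; the sketch shows the form after the fix,
with the flagged form kept as a comment.

```c
#include <assert.h>
#include <stddef.h>

struct toy_route { int cookie; };   /* hypothetical stand-in for "dst" */
struct toy_sock  { struct toy_route *cached; int cookie; };

/*
 * Flagged shape (not compiled):
 *     s->cookie = dst->cookie;    dereference comes first ...
 *     if (dst) { ... }            ... so this test can never help
 *
 * Shape after the fix: callers guarantee dst != NULL, so the test is
 * dropped rather than moved.
 */
static void toy_store_route(struct toy_sock *s, struct toy_route *dst)
{
    s->cached = dst;
    s->cookie = dst->cookie;
}
```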
16 years ago[NIU]: Fix write past end of array in niu_pci_probe_sprom().
David S. Miller [Mon, 15 Oct 2007 08:36:24 +0000 (01:36 -0700)]
[NIU]: Fix write past end of array in niu_pci_probe_sprom().

Noticed by Coverity checker and reported by Adrian Bunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[IPV6]: Avoid skb_copy/pskb_copy/skb_realloc_headroom on input
Herbert Xu [Mon, 15 Oct 2007 08:29:10 +0000 (01:29 -0700)]
[IPV6]: Avoid skb_copy/pskb_copy/skb_realloc_headroom on input

This patch replaces unnecessary uses of skb_copy by pskb_expand_head
on the IPv6 input path.

This allows us to remove the double pointers later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[IPV6]: Make ipv6_frag_rcv return the same packet
Herbert Xu [Mon, 15 Oct 2007 08:28:47 +0000 (01:28 -0700)]
[IPV6]: Make ipv6_frag_rcv return the same packet

This patch implements the same change that was done to ip_defrag.  It
makes ipv6_frag_rcv return the last packet received of a train of fragments
rather than the head of that sequence.

This allows us to get rid of the sk_buff ** argument later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[NETFILTER]: Replace sk_buff ** with sk_buff *
Herbert Xu [Mon, 15 Oct 2007 07:53:15 +0000 (00:53 -0700)]
[NETFILTER]: Replace sk_buff ** with sk_buff *

With all the users of the double pointers removed, this patch mops up by
finally replacing all occurrences of sk_buff ** in the netfilter API by
sk_buff *.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[NETFILTER]: Avoid skb_copy/pskb_copy/skb_realloc_headroom
Herbert Xu [Sun, 14 Oct 2007 07:39:55 +0000 (00:39 -0700)]
[NETFILTER]: Avoid skb_copy/pskb_copy/skb_realloc_headroom

This patch replaces unnecessary uses of skb_copy, pskb_copy and
skb_realloc_headroom by functions such as skb_make_writable and
pskb_expand_head.

This allows us to remove the double pointers later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[IPVS]: Replace local version of skb_make_writable
Herbert Xu [Sun, 14 Oct 2007 07:39:33 +0000 (00:39 -0700)]
[IPVS]: Replace local version of skb_make_writable

This patch removes the IPVS-specific version of skb_make_writable and
replaces it with the netfilter one.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[NETFILTER]: Do not copy skb in skb_make_writable
Herbert Xu [Sun, 14 Oct 2007 07:39:18 +0000 (00:39 -0700)]
[NETFILTER]: Do not copy skb in skb_make_writable

Now that all callers of netfilter can guarantee that the skb is not shared,
we no longer have to copy the skb in skb_make_writable.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[BRIDGE]: Unshare skb upon entry
Herbert Xu [Sun, 14 Oct 2007 07:39:01 +0000 (00:39 -0700)]
[BRIDGE]: Unshare skb upon entry

Due to the special location of the bridging hook, it should never see a
shared packet anyway (certainly not with any in-kernel code).  So it
makes sense to unshare the skb there if necessary as that will greatly
simplify the code below it (in particular, netfilter).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[NET]: Avoid unnecessary cloning for ingress filtering
Herbert Xu [Sun, 14 Oct 2007 07:38:47 +0000 (00:38 -0700)]
[NET]: Avoid unnecessary cloning for ingress filtering

As it is we always invoke pt_prev before ing_filter, even if there are no
ingress filters attached.  This can cause unnecessary cloning in pt_prev.

This patch changes it so that we only invoke pt_prev if there are ingress
filters attached.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[IPV4]: Change ip_defrag to return an integer
Herbert Xu [Sun, 14 Oct 2007 07:38:32 +0000 (00:38 -0700)]
[IPV4]: Change ip_defrag to return an integer

Now that ip_defrag always returns the packet given to it on input, we can
change it to return an integer indicating error instead.  This patch does
that and updates all its callers accordingly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[IPV4]: Make ip_defrag return the same packet
Herbert Xu [Sun, 14 Oct 2007 07:38:15 +0000 (00:38 -0700)]
[IPV4]: Make ip_defrag return the same packet

This patch is a bit of a hack.  However it is worth it if you consider that
this is the only reason why we have to carry around the struct sk_buff **
pointers in netfilter.

It makes ip_defrag always return the packet that was given to it on input.
It does this by cloning the packet and replacing its original contents with
the head fragment if necessary.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[SKBUFF]: Add skb_morph
Herbert Xu [Sun, 14 Oct 2007 07:37:52 +0000 (00:37 -0700)]
[SKBUFF]: Add skb_morph

This patch creates a new function skb_morph that's just like skb_clone
except that it lets the caller provide the spare skb that will be
overwritten by the one that's to be cloned.

This will be used by IP fragment reassembly so that we get back the same
skb that went in last (rather than the head skb that we get now which
requires us to carry around double pointers all over the place).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
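As a rough userspace analogy for "clone into a buffer the caller already
holds", consider the sketch below. The toy_* types and functions are
hypothetical and far simpler than struct sk_buff; only the ownership idea
is taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_data { int refcnt; int payload; };       /* shared packet data */
struct toy_skb  { struct toy_data *data; int len; }; /* per-user descriptor */

/* Drop one reference; free the data when the last one goes away. */
static void toy_data_put(struct toy_data *d)
{
    if (d && --d->refcnt == 0)
        free(d);
}

/* Overwrite the caller's spare descriptor so that it shares src's data. */
static void toy_morph(struct toy_skb *dst, const struct toy_skb *src)
{
    src->data->refcnt++;        /* take the new reference first */
    toy_data_put(dst->data);    /* then release whatever dst held */
    *dst = *src;
}
```

Because the caller keeps using the descriptor it passed in, no pointer to
a different descriptor has to be handed back.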
16 years ago[SKBUFF]: Merge common code between copy_skb_header and skb_clone
Herbert Xu [Sun, 14 Oct 2007 07:37:30 +0000 (00:37 -0700)]
[SKBUFF]: Merge common code between copy_skb_header and skb_clone

This patch creates a new function __copy_skb_header to merge the common
code between copy_skb_header and skb_clone.  Having two functions which
are largely the same is a source of wasted labour as well as confusion.

In fact the tc_verd stuff is almost certainly a bug since it's treated
differently in skb_clone compared to the callers of copy_skb_header
(skb_copy/pskb_copy/skb_copy_expand).

I've kept that difference intact with a comment added asking for
clarification.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agolibata: add ST9160821AS / 3.CCD to NCQ blacklist
Tejun Heo [Thu, 11 Oct 2007 01:49:26 +0000 (10:49 +0900)]
libata: add ST9160821AS / 3.CCD to NCQ blacklist

ST9160821AS / 3.CCD does spurious completions too.  Blacklist it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agolibata: fix revalidation issuing after configuration commands
Tejun Heo [Wed, 10 Oct 2007 06:57:44 +0000 (15:57 +0900)]
libata: fix revalidation issuing after configuration commands

After commands which can change device configuration, EH is scheduled
to revalidate and reconfigure the device.  Host link was incorrectly
used unconditionally when scheduling EH action.  This resulted in
bogus revalidation request and mismatched configuration between device
and driver.  Fix it.

This bug was reported by Igor Durdanovic.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Igor Durdanovic <idurdanovic@comcast.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years ago[libata] sata_nv: add SW NCQ support for MCP51/MCP55/MCP61
Kuan Luo [Mon, 15 Oct 2007 19:16:53 +0000 (15:16 -0400)]
[libata] sata_nv: add SW NCQ support for MCP51/MCP55/MCP61

Add software NCQ support to sata_nv.c for the MCP51/MCP55/MCP61 SATA
controllers.  NCQ is disabled by default; you can enable it with
'swncq=1'.  NCQ will be turned off if the drive is a Maxtor on an
MCP51 or MCP55 rev 0xa2 platform.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Kuan Luo <kluo@nvidia.com>
Signed-off-by: Peer Chen <pchen@nvidia.com>
Cc: Zoltan Boszormenyi <zboszor@dunaweb.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years ago[libata] pata_sil680: Add MMIO support
Benjamin Herrenschmidt [Fri, 6 Jul 2007 23:21:22 +0000 (19:21 -0400)]
[libata] pata_sil680: Add MMIO support

This patch adds MMIO support to the pata_sil680 for taskfile IOs,
based on what the old siimage does.

I haven't bothered changing the chip setup stuff from PCI config
cycles to MMIO though (siimage does it), I don't think it matters,
I've only adapted it to use MMIO for taskfile accesses.

I've tested it on a Cell blade and it seems to work fine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoucc_geth: Fix build break introduced by commit 09f75cd7bf13720738e6a196cc0107ce9a5bd5a0
Emil Medve [Mon, 15 Oct 2007 13:43:50 +0000 (08:43 -0500)]
ucc_geth: Fix build break introduced by commit 09f75cd7bf13720738e6a196cc0107ce9a5bd5a0

drivers/net/ucc_geth.c: In function 'ucc_geth_rx':
drivers/net/ucc_geth.c:3483: error: 'dev' undeclared (first use in this function)
drivers/net/ucc_geth.c:3483: error: (Each undeclared identifier is reported only once
drivers/net/ucc_geth.c:3483: error: for each function it appears in.)
make[2]: *** [drivers/net/ucc_geth.o] Error 1

Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agogianfar: Fix regression caused by new napi interface
Li Yang [Mon, 15 Oct 2007 15:01:12 +0000 (23:01 +0800)]
gianfar: Fix regression caused by new napi interface

Protect all new napi function calls with CONFIG_GFAR_NAPI.  Otherwise
the driver will stop working when CONFIG_GFAR_NAPI is disabled.

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agogianfar: Cleanup compile warning caused by 0795af57
Li Yang [Fri, 12 Oct 2007 13:53:53 +0000 (21:53 +0800)]
gianfar: Cleanup compile warning caused by 0795af57

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agogianfar: Fix compile regression caused by bea3348e
Li Yang [Fri, 12 Oct 2007 13:53:51 +0000 (21:53 +0800)]
gianfar: Fix compile regression caused by bea3348e

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoadd new prom.h for AU1x00
Yoichi Yuasa [Mon, 15 Oct 2007 10:11:24 +0000 (19:11 +0900)]
add new prom.h for AU1x00

Add new prom.h for AU1x00.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoupdate AU1000 get_ethernet_addr()
Yoichi Yuasa [Mon, 15 Oct 2007 10:06:20 +0000 (19:06 +0900)]
update AU1000 get_ethernet_addr()

Update AU1000 get_ethernet_addr().
Three functions were brought together in one.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoMIPSsim: General cleanup
Ralf Baechle [Fri, 12 Oct 2007 13:59:56 +0000 (14:59 +0100)]
MIPSsim: General cleanup

General cleanups mostly as suggested by checkpatch plus getting rid of
homebrew version of offsetof().

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoJazzsonic: Fix warning about unused variable.
Ralf Baechle [Mon, 15 Oct 2007 09:58:40 +0000 (10:58 +0100)]
Jazzsonic: Fix warning about unused variable.

Caused by "[NET]: Introduce and use print_mac() and DECLARE_MAC_BUF()"
aka 0795af5729b18218767fab27c44b1384f72dc9ad.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoRemove msic_dcr_read() in axon_msi.c
Michael Ellerman [Mon, 15 Oct 2007 09:34:38 +0000 (19:34 +1000)]
Remove msic_dcr_read() in axon_msi.c

msic_dcr_read() doesn't really do anything useful, just replace it with
direct calls to dcr_read().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoUse dcr_host_t.base in dcr_unmap()
Michael Ellerman [Mon, 15 Oct 2007 09:34:37 +0000 (19:34 +1000)]
Use dcr_host_t.base in dcr_unmap()

With the base stored in dcr_host_t, there's no need for callers to pass
the dcr_n into dcr_unmap(). In fact this removes the possibility of them
passing the incorrect value, which would then be iounmap()'ed.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoAdd dcr_host_t.base in dcr_read()/dcr_write()
Michael Ellerman [Mon, 15 Oct 2007 09:34:36 +0000 (19:34 +1000)]
Add dcr_host_t.base in dcr_read()/dcr_write()

Now that all users of dcr_read()/dcr_write() add the dcr_host_t.base, we
can save them the trouble and do it in dcr_read()/dcr_write().

As some background to why we just went through all this jiggery-pokery,
benh sayeth:

 Initially the goal of the dcr_read/dcr_write routines was to operate like
 mfdcr/mtdcr which take absolute DCR numbers. The reason is that on 4xx
 hardware, indirect DCR access is a pain (goes through a table of
 instructions) and it's useful to have the compiler resolve an absolute DCR
 inline.

 We decided that wasn't worth the API bastardisation since most places
 where absolute DCR values are used are low level 4xx-only code which may
 as well continue using mfdcr/mtdcr, while the new API is designed for
 device "instances" that can exist on 4xx and Axon type platforms and may
 be located at variable DCR offsets.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
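The API shape described above, where the accessor adds the base itself,
might look like this in a toy model. The array stands in for the DCR bus
and every toy_* name is hypothetical; the real dcr_host_t has more to it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_DCR_SPACE 64                     /* size of the toy register file */
static uint32_t toy_dcr_bus[TOY_DCR_SPACE];  /* stands in for the hardware */

typedef struct { unsigned int base; } toy_dcr_host_t;

/* Callers pass an offset relative to the device; the base is added here. */
static uint32_t toy_dcr_read(toy_dcr_host_t host, unsigned int dcr_n)
{
    return toy_dcr_bus[host.base + dcr_n];
}

static void toy_dcr_write(toy_dcr_host_t host, unsigned int dcr_n,
                          uint32_t value)
{
    toy_dcr_bus[host.base + dcr_n] = value;
}
```

A device instance mapped at base 16 then writes its register 2 with
toy_dcr_write(host, 2, value) and never spells out the absolute number 18.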
16 years agoUse dcr_host_t.base in ibm_emac_mal
Michael Ellerman [Mon, 15 Oct 2007 09:34:35 +0000 (19:34 +1000)]
Use dcr_host_t.base in ibm_emac_mal

This requires us to do a sort-of fake dcr_map(), so that base is set
properly. This will be fixed/removed when the device-tree-aware emac driver
is merged.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoUpdate ibm_newemac to use dcr_host_t.base
Michael Ellerman [Mon, 15 Oct 2007 09:34:34 +0000 (19:34 +1000)]
Update ibm_newemac to use dcr_host_t.base

Now that dcr_host_t contains the base address, we can use that in the
ibm_newemac code, rather than storing it separately.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agotehuti: possible leak in bdx_probe
Florin Malita [Sat, 13 Oct 2007 17:03:38 +0000 (13:03 -0400)]
tehuti: possible leak in bdx_probe

If pci_enable_device fails, bdx_probe returns without freeing the
allocated pci_nic structure.

Coverity CID 1908.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoTC35815: Fix build
Ralf Baechle [Sun, 14 Oct 2007 13:40:26 +0000 (14:40 +0100)]
TC35815: Fix build

bea3348eef27e6044b6161fd04c3152215f96411 broke the build of tc35815.c
for the non-NAPI case:

  CC      drivers/net/tc35815.o
drivers/net/tc35815.c: In function 'tc35815_interrupt':
drivers/net/tc35815.c:1464: error: redefinition of 'lp'
drivers/net/tc35815.c:1443: error: previous definition of 'lp' was here

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoSAA9730: Fix build
Ralf Baechle [Sun, 14 Oct 2007 13:13:58 +0000 (14:13 +0100)]
SAA9730: Fix build

Fix build breakage by the recent statistics cleanup in cset
09f75cd7bf13720738e6a196cc0107ce9a5bd5a0.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoAR7 ethernet
Matteo Croce [Sun, 14 Oct 2007 16:10:13 +0000 (18:10 +0200)]
AR7 ethernet

New version which uses less locking and drops old API

Signed-off-by: Matteo Croce <technoboy85@gmail.com>
Signed-off-by: Eugene Konev <ejka@imfi.kspu.ru>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agomyri10ge: update driver version to 1.3.2-1.287
Brice Goglin [Sat, 13 Oct 2007 10:34:36 +0000 (12:34 +0200)]
myri10ge: update driver version to 1.3.2-1.287

The myri10ge driver is now at version 1.3.2-1.287.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agomyri10ge: add IPv6 TSO support
Brice Goglin [Sat, 13 Oct 2007 10:34:01 +0000 (12:34 +0200)]
myri10ge: add IPv6 TSO support

Add support for IPv6 TSO to the myri10ge driver.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoAdd skb_is_gso_v6
Brice Goglin [Sat, 13 Oct 2007 10:33:32 +0000 (12:33 +0200)]
Add skb_is_gso_v6

Add skb_is_gso_v6().

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agomyri10ge: update firmware headers
Brice Goglin [Sat, 13 Oct 2007 10:32:58 +0000 (12:32 +0200)]
myri10ge: update firmware headers

Update myri10ge firmware headers to latest upstream version with
TSO6 and RSS support.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agomyri10ge: fix some indentation, white spaces, and comments
Brice Goglin [Sat, 13 Oct 2007 10:32:21 +0000 (12:32 +0200)]
myri10ge: fix some indentation, white spaces, and comments

Fix one comment in myri10ge.c and update indentation and white spaces
to match the code generated by indent from upstream CVS.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Optionally allow ethernet slaves to keep own MAC
Jay Vosburgh [Wed, 10 Oct 2007 02:57:24 +0000 (19:57 -0700)]
net/bonding: Optionally allow ethernet slaves to keep own MAC

Update the "don't change MAC of slaves" functionality added in
previous changes to be a generic option, rather than something tied to
IB devices, as it's occasionally useful for regular ethernet devices as
well.

Adds "fail_over_mac" option (which is automatically enabled for IB
slaves), applicable only to active-backup mode.

Includes documentation update.

Updates bonding driver version to 3.2.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Destroy bonding master when last slave is gone
Moni Shoua [Wed, 10 Oct 2007 02:43:43 +0000 (19:43 -0700)]
net/bonding: Destroy bonding master when last slave is gone

When bonding enslaves non-Ethernet devices, it takes pointers to functions
in the module that owns the slaves. It then becomes unsafe to keep the
bonding master registered after the last slave was unenslaved, because we
don't know whether the pointers are still valid.  Destroying the bond when
slave_cnt is zero ensures that these functions cannot be used anymore.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Delay sending of gratuitous ARP to avoid failure
Moni Shoua [Wed, 10 Oct 2007 02:43:42 +0000 (19:43 -0700)]
net/bonding: Delay sending of gratuitous ARP to avoid failure

Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev->state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Handle wrong assumptions that slave is always an Ethernet device
Moni Shoua [Wed, 10 Oct 2007 02:43:41 +0000 (19:43 -0700)]
net/bonding: Handle wrong assumptions that slave is always an Ethernet device

bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Enable IP multicast for bonding IPoIB devices
Moni Shoua [Wed, 10 Oct 2007 02:43:40 +0000 (19:43 -0700)]
net/bonding: Enable IP multicast for bonding IPoIB devices

Allow enslaving devices when the bonding device is not up. In the discussion
of the previous posting this seemed to be the cleanest way to go, and it is
not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast
groups (e.g. the all-hosts group 224.0.0.1). Now, since ether_setup() has set
the bonding device type to ARPHRD_ETHER and the address length to ETHER_ALEN,
the net core code computes a wrong multicast link address. This is because
ip_eth_mc_map() is called, whereas for multicast joins taking place after the
enslavement a different ip_xxx_mc_map() is called (e.g. ip_ib_mc_map() when
the bond type is ARPHRD_INFINIBAND).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()
Moni Shoua [Wed, 10 Oct 2007 02:43:39 +0000 (19:43 -0700)]
net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()

This patch allows enslaving netdevices which do not support the
set_mac_address() function. In that case the bond MAC address is that of
the active slave, and remote peers are notified of the MAC address
(neighbour) change by the gratuitous ARP that bonding sends when fail-over
occurs (this is already done by the bonding code).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonet/bonding: Enable bonding to enslave non ARPHRD_ETHER
Moni Shoua [Wed, 10 Oct 2007 02:43:38 +0000 (19:43 -0700)]
net/bonding: Enable bonding to enslave non ARPHRD_ETHER

This patch changes some of the bond netdevice attributes and functions
to be those of the active slave when the enslaved device is not of
ARPHRD_ETHER type. Basically it overrides the settings done by ether_setup(),
which are netdevice **type** dependent and hence might not be appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

An IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a
3-byte IB QP (Queue Pair) number and the 16-byte IB port GID (Global ID) of
the port this IPoIB device is bound to. The QP is a resource created by the
IB HW and the GID is an identifier burned into the HCA (I have omitted here
some details which are not important for the bonding RFC).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
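To see why a 6-byte Ethernet address length does not fit, the address
layout can be sketched as below. The sketch assumes the 20-byte form in
which one leading byte (a detail the message leaves out) precedes the
3-byte QP number and the 16-byte GID; TOY_IPOIB_ALEN and toy_ipoib_addr()
are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_IPOIB_ALEN 20   /* assumed: 1 leading byte + 3 QPN + 16 GID */

static void toy_ipoib_addr(uint8_t out[TOY_IPOIB_ALEN], uint32_t qpn,
                           const uint8_t gid[16])
{
    out[0] = 0;                         /* omitted detail, zeroed here */
    out[1] = (uint8_t)(qpn >> 16);      /* 24-bit QP number, big-endian */
    out[2] = (uint8_t)(qpn >> 8);
    out[3] = (uint8_t)qpn;
    memcpy(out + 4, gid, 16);           /* port GID */
}
```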
16 years agoIB/ipoib: Verify address handle validity on send
Moni Shoua [Wed, 10 Oct 2007 02:43:37 +0000 (19:43 -0700)]
IB/ipoib: Verify address handle validity on send

When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happening.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agoIB/ipoib: Bind the net device to the ipoib_neigh structure
Moni Shoua [Wed, 10 Oct 2007 02:43:36 +0000 (19:43 -0700)]
IB/ipoib: Bind the net device to the ipoib_neigh structure

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on
behalf of the bonding (master) device. On the tx flow the bonding code gets
an skb such that skb->dev points to the master device; it changes this skb
to point to the slave device and calls the slave hard_start_xmit function.

Under this scheme, the assumption in ipoib_neigh_destructor that, for each
struct neighbour it gets, n->dev is an ipoib device, and hence that
netdev_priv(n->dev) can be cast to struct ipoib_dev_priv, is wrong.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n->dev->flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
16 years agonatsemi: Check return value for pci_enable_device()
Mark Brown [Wed, 10 Oct 2007 16:11:12 +0000 (17:11 +0100)]
natsemi: Check return value for pci_enable_device()

pci_enable_device() is __must_check so do that in natsemi_resume().

Signed-off-by: Mark Brown <broonie@sirena.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
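The mechanism behind a must-check annotation can be sketched with the
GCC/Clang warn_unused_result attribute. The toy_* names are hypothetical
and the functions are not the natsemi or PCI code; they only show a resume
path that consumes the return value.

```c
#include <assert.h>

/* GCC/Clang attribute: warn when a caller ignores the return value. */
#define toy_must_check __attribute__((warn_unused_result))

static toy_must_check int toy_enable_device(int present)
{
    return present ? 0 : -1;    /* 0 on success, negative on failure */
}

static int toy_resume(int present)
{
    int ret = toy_enable_device(present);   /* result is consumed */

    if (ret)
        return ret;     /* propagate the failure to the caller */
    /* ... the rest of the resume work would go here ... */
    return 0;
}
```

Calling toy_enable_device(1); as a bare statement would draw a compiler
warning, which is the nudge that leads to checks like the one above.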
16 years agonatsemi: Use round_jiffies() for slow timers
Mark Brown [Wed, 10 Oct 2007 10:05:44 +0000 (11:05 +0100)]
natsemi: Use round_jiffies() for slow timers

Unless we have failed to fill the RX ring the timer used by the natsemi
driver is not particularly urgent and can use round_jiffies() to allow
grouping with other timers.

Signed-off-by: Mark Brown <broonie@sirena.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
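Why rounding helps grouping can be shown with a deliberately simplified
model. This is not the kernel's round_jiffies(), which has its own rounding
policy; toy_round_up_to_second() and TOY_HZ are hypothetical, and the
function merely rounds a tick count up to the next whole second.

```c
#include <assert.h>

#define TOY_HZ 250UL    /* hypothetical tick rate: 250 ticks per second */

/* Round a tick count up to the next multiple of TOY_HZ. */
static unsigned long toy_round_up_to_second(unsigned long j)
{
    unsigned long rem = j % TOY_HZ;

    return rem ? j + (TOY_HZ - rem) : j;
}
```

Two timers due at ticks 1001 and 1249 both land on tick 1250 after
rounding, so a single wakeup can service both.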
16 years agoMerge branch 'pxa' into devel
Russell King [Mon, 15 Oct 2007 17:55:44 +0000 (18:55 +0100)]
Merge branch 'pxa' into devel

16 years ago[ARM] 4578/1: CM-x270: PCMCIA support
Mike Rapoport [Sun, 23 Sep 2007 15:00:20 +0000 (16:00 +0100)]
[ARM] 4578/1: CM-x270: PCMCIA support

This patch provides support for PCMCIA on CM-X270.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] 4577/1: ITE 8152 PCI bridge support
Mike Rapoport [Sun, 23 Sep 2007 14:59:52 +0000 (15:59 +0100)]
[ARM] 4577/1: ITE 8152 PCI bridge support

This patch provides a driver for the ITE 8152 PCI bridge.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] 4576/1: CM-X270 machine support
Mike Rapoport [Sun, 23 Sep 2007 14:59:26 +0000 (15:59 +0100)]
[ARM] 4576/1: CM-X270 machine support

This patch provides core support for the CM-X270 platform.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] pxa: Avoid pxa_gpio_mode() in gpio_direction_{in,out}put()
Russell King [Tue, 2 Oct 2007 13:28:01 +0000 (14:28 +0100)]
[ARM] pxa: Avoid pxa_gpio_mode() in gpio_direction_{in,out}put()

pxa_gpio_mode() is a universal call that fiddles with the GAFR
(gpio alternate function register.)  GAFR does not exist on PXA3
CPUs, but instead the alternate functions are controlled via the
MFP support code.

Platforms are expected to configure the MFP according to their
needs in their platform support code rather than drivers.  We
extend this idea to the GAFR, and make the gpio_direction_*()
functions purely operate on the GPIO level.

This means platform support code is entirely responsible for
configuring the GPIO alternate functions on all PXA CPU types.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] pxa: move pxa_set_mode() from pxa2xx_mainstone.c to mainstone.c
Russell King [Tue, 2 Oct 2007 10:29:02 +0000 (11:29 +0100)]
[ARM] pxa: move pxa_set_mode() from pxa2xx_mainstone.c to mainstone.c

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] pxa: move pxa_set_mode() from pxa2xx_lubbock.c to lubbock.c
Russell King [Tue, 2 Oct 2007 10:28:26 +0000 (11:28 +0100)]
[ARM] pxa: move pxa_set_mode() from pxa2xx_lubbock.c to lubbock.c

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] pxa: Make cpu_is_pxaXXX dependent on configuration symbols
Russell King [Mon, 1 Oct 2007 15:22:24 +0000 (16:22 +0100)]
[ARM] pxa: Make cpu_is_pxaXXX dependent on configuration symbols

Make the cpu_is_pxaXXX() macros define to zero when support for a
particular CPU is disabled.  This allows us to eliminate code for
CPUs which aren't enabled.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
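The technique can be sketched with ordinary preprocessor conditionals. The
TOY_CONFIG_* symbols, the CPU id values and the toy_* names are all
hypothetical; the real macros inspect the processor id differently.

```c
#include <assert.h>

#define TOY_CONFIG_PXA25x 1     /* pretend only PXA25x support is enabled */

static unsigned int toy_cpu_id = 0x25;  /* hypothetical runtime CPU id */

#ifdef TOY_CONFIG_PXA25x
#define toy_cpu_is_pxa25x() (toy_cpu_id == 0x25)
#else
#define toy_cpu_is_pxa25x() (0)
#endif

#ifdef TOY_CONFIG_PXA3xx
#define toy_cpu_is_pxa3xx() (toy_cpu_id == 0x30)
#else
#define toy_cpu_is_pxa3xx() (0)     /* support disabled: constant zero */
#endif

static int toy_init(void)
{
    if (toy_cpu_is_pxa3xx())
        return 3;   /* condition is the constant 0, so this is dead code */
    if (toy_cpu_is_pxa25x())
        return 2;
    return 0;
}
```

Since the disabled test is a compile-time constant, an optimising compiler
is free to discard the branch it guards.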
16 years ago[ARM] pxa: PXA3xx base support
eric miao [Wed, 12 Sep 2007 02:13:17 +0000 (19:13 -0700)]
[ARM] pxa: PXA3xx base support

Signed-off-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[NET] smc91x: fix PXA DMA support code
Russell King [Sat, 1 Sep 2007 20:27:18 +0000 (21:27 +0100)]
[NET] smc91x: fix PXA DMA support code

The PXA DMA support code for smc91x doesn't pass a struct device to
the dma_*map_single() functions, which leads to an oops in the dma
bounce code.  We have a struct device which was used to probe the
SMC chip.  Use it.

(This patch is slightly larger because it requires struct smc_local
to move into the header file.)

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[SERIAL] Fix console initialisation ordering
Russell King [Sat, 1 Sep 2007 20:25:09 +0000 (21:25 +0100)]
[SERIAL] Fix console initialisation ordering

Ensure pm callback is called upon initialisation to place port in
correct power saving state.  Ensure console is initialised prior
to deciding whether to power down the port.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years ago[ARM] pxa: tidy up arch/arm/mach-pxa/Makefile
Russell King [Wed, 19 Sep 2007 08:21:51 +0000 (09:21 +0100)]
[ARM] pxa: tidy up arch/arm/mach-pxa/Makefile

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
16 years agoMerge git://git.linux-nfs.org/pub/linux/nfs-2.6
Linus Torvalds [Mon, 15 Oct 2007 17:46:05 +0000 (10:46 -0700)]
Merge git://git.linux-nfs.org/pub/linux/nfs-2.6

* git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits)
  NFSv4: Fix a typo in nfs_inode_reclaim_delegation
  NFS: Add a boot parameter to disable 64 bit inode numbers
  NFS: nfs_refresh_inode should clear cache_validity flags on success
  NFS: Fix a connectathon regression in NFSv3 and NFSv4
  NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
  SUNRPC: Don't call xprt_release in call refresh
  SUNRPC: Don't call xprt_release() if call_allocate fails
  SUNRPC: Fix buggy UDP transmission
  [23/37] Clean up duplicate includes in
  [2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static
  SUNRPC: Use correct type in buffer length calculations
  SUNRPC: Fix default hostname created in rpc_create()
  nfs: add server port to rpc_pipe info file
  NFS: Get rid of some obsolete macros
  NFS: Simplify filehandle revalidation
  NFS: Ensure that nfs_link() returns a hashed dentry
  NFS: Be strict about dentry revalidation when doing exclusive create
  NFS: Don't zap the readdir caches upon error
  NFS: Remove the redundant nfs_reval_fsid()
  NFSv3: Always use directory post-op attributes in nfs3_proc_lookup
  ...

Fix up trivial conflict due to sock_owned_by_user() cleanup manually in
net/sunrpc/xprtsock.c

16 years agoMerge branch 'v2.6.24-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peter...
Linus Torvalds [Mon, 15 Oct 2007 17:40:41 +0000 (10:40 -0700)]
Merge branch 'v2.6.24-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep

* 'v2.6.24-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep:
  lockdep: annotate dir vs file i_mutex
  lockdep: per filesystem inode lock class
  lockdep: annotate kprobes irq fiddling
  lockdep: annotate rcu_read_{,un}lock{,_bh}
  lockdep: annotate journal_start()
  lockdep: s390: connect the sysexit hook
  lockdep: x86_64: connect the sysexit hook
  lockdep: i386: connect the sysexit hook
  lockdep: syscall exit check
  lockdep: fixup mutex annotations
  lockdep: fix mismatched lockdep_depth/curr_chain_hash
  lockdep: Avoid /proc/lockdep & lock_stat infinite output
  lockdep: maintainers

16 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Mon, 15 Oct 2007 16:57:54 +0000 (09:57 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] update sn2_defconfig
  [IA64] Fix kernel hangup in kdump on INIT
  [IA64] Fix kernel panic in kdump on INIT
  [IA64] Remove vector from ia64_machine_kexec()
  [IA64] Fix race when multiple cpus go through MCA
  [IA64] Remove needless delay in MCA rendezvous
  [IA64] add driver for ACPI methods to call native firmware
  [IA64] abstract SAL_CALL wrapper to allow other firmware entry points
  [IA64] perfmon: Remove exit_pfm_fs()
  [IA64] tree-wide: Misc __cpu{initdata, init, exit} annotations

16 years agoGet rid of unused variable warning in drivers/pci/hotplug/pci_hotplug_core.c
Linus Torvalds [Mon, 15 Oct 2007 16:07:58 +0000 (09:07 -0700)]
Get rid of unused variable warning in drivers/pci/hotplug/pci_hotplug_core.c

Commit 5a7ad7f044941316dc98eda2a087a12a7a50649d removed all uses of
'retval', but didn't remove the variable itself.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago[IA64] update sn2_defconfig
Jes Sorensen [Wed, 19 Sep 2007 09:54:55 +0000 (11:54 +0200)]
[IA64] update sn2_defconfig

Update defconfig file for sn2 to match recent changes in config options.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
Linus Torvalds [Mon, 15 Oct 2007 15:22:16 +0000 (08:22 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched

* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: (140 commits)
  sched: sync wakeups preempt too
  sched: affine sync wakeups
  sched: guest CPU accounting: maintain guest state in KVM
  sched: guest CPU accounting: maintain stats in account_system_time()
  sched: guest CPU accounting: add guest-CPU /proc/<pid>/stat fields
  sched: guest CPU accounting: add guest-CPU /proc/stat field
  sched: domain sysctl fixes: add terminator comment
  sched: domain sysctl fixes: do not crash on allocation failure
  sched: domain sysctl fixes: unregister the sysctl table before domains
  sched: domain sysctl fixes: use for_each_online_cpu()
  sched: domain sysctl fixes: use kcalloc()
  Make scheduler debug file operations const
  sched: enable wake-idle on CONFIG_SCHED_MC=y
  sched: reintroduce topology.h tunings
  sched: allow the immediate migration of cache-cold tasks
  sched: debug, improve migration statistics
  sched: debug: increase width of debug line
  sched: activate task_hot() only on fair-scheduled tasks
  sched: reintroduce cache-hot affinity
  sched: speed up context-switches a bit
  ...

16 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
Linus Torvalds [Mon, 15 Oct 2007 15:19:33 +0000 (08:19 -0700)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (207 commits)
  [SCSI] gdth: fix CONFIG_ISA build failure
  [SCSI] esp_scsi: remove __dev{init,exit}
  [SCSI] gdth: !use_sg cleanup and use of scsi accessors
  [SCSI] gdth: Move members from SCp to gdth_cmndinfo, stage 2
  [SCSI] gdth: Setup proper per-command private data
  [SCSI] gdth: Remove gdth_ctr_tab[]
  [SCSI] gdth: switch to modern scsi host registration
  [SCSI] gdth: gdth_interrupt() gdth_get_status() & gdth_wait() fixes
  [SCSI] gdth: clean up host private data
  [SCSI] gdth: Remove virt hosts
  [SCSI] gdth: Reorder scsi_host_template intitializers
  [SCSI] gdth: kill gdth_{read,write}[bwl] wrappers
  [SCSI] gdth: Remove 2.4.x support, in-kernel changelog
  [SCSI] gdth: split out pci probing
  [SCSI] gdth: split out eisa probing
  [SCSI] gdth: split out isa probing
  gdth: Make one abuse of scsi_cmnd less obvious
  [SCSI] NCR5380: Use scsi_eh API for REQUEST_SENSE invocation
  [SCSI] usb storage: use scsi_eh API in REQUEST_SENSE execution
  [SCSI] scsi_error: Refactoring scsi_error to facilitate in synchronous REQUEST_SENSE
  ...

16 years agoMerge branch 'agp-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Mon, 15 Oct 2007 15:18:44 +0000 (08:18 -0700)]
Merge branch 'agp-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied/agp-2.6

* 'agp-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied/agp-2.6:
  fix use after free in amd create gatt pages
  AGP fix race condition between unmapping and freeing pages

16 years agoMerge branch 'drm-patches' of ssh://master.kernel.org/pub/scm/linux/kernel/git/airlie...
Linus Torvalds [Mon, 15 Oct 2007 15:17:26 +0000 (08:17 -0700)]
Merge branch 'drm-patches' of ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-patches' of ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  via invalid device ids removal
  radeon: Commit the ring after each partial texture upload blit.
  i915: fix vbl swap allocation size.
  drm: Replace DRM_IOCTL_ARGS with (dev, data, file_priv) and remove DRM_DEVICE.
  drm: remove XFREE86_VERSION macros.
  drm: Replace filp in ioctl arguments with drm_file *file_priv.
  drm: Remove DRM_ERR OS macro.

16 years agoMerge branch 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Mon, 15 Oct 2007 15:16:53 +0000 (08:16 -0700)]
Merge branch 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux

* 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux:
  knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME
  knfsd: nfsv4 delegation recall should take reference on client
  knfsd: don't shutdown callbacks until nfsv4 client is freed
  knfsd: let nfsd manage timing out its own leases
  knfsd: Add source address to sunrpc svc errors
  knfsd: 64 bit ino support for NFS server
  svcgss: move init code into separate function
  knfsd: remove code duplication in nfsd4_setclientid()
  nfsd warning fix
  knfsd: fix callback rpc cred
  knfsd: move nfsv4 slab creation/destruction to module init/exit
  knfsd: spawn kernel thread to probe callback channel
  knfsd: nfs4 name->id mapping not correctly parsing negative downcall
  knfsd: demote some printk()s to dprintk()s
  knfsd: cleanup of nfsd4 cmp_* functions
  knfsd: delete code made redundant by map_new_errors
  nfsd: fix horrible indentation in nfsd_setattr
  nfsd: remove unused cache_for_each macro
  nfsd: tone down inaccurate dprintk

16 years agoPS3 system bus add_uevent_var() fallout
Geert Uytterhoeven [Mon, 15 Oct 2007 09:51:03 +0000 (11:51 +0200)]
PS3 system bus add_uevent_var() fallout

Kill unused variables

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoHID: fix HIDIOCGRDESC memory access in hidraw
Jiri Kosina [Mon, 15 Oct 2007 13:17:41 +0000 (15:17 +0200)]
HID: fix HIDIOCGRDESC memory access in hidraw

Fix bogus copying of data into userspace when HIDIOCGRDESC is issued.
HID-transport layer makes sure that dev->hid->rdesc is not larger than
HID_MAX_DESCRIPTOR_SIZE.

Noticed-by: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosched: sync wakeups preempt too
Ingo Molnar [Mon, 15 Oct 2007 15:00:20 +0000 (17:00 +0200)]
sched: sync wakeups preempt too

make sure sync wakeups preempt too - the scheduler will not
overschedule as we've got various throttles against that.
As a result, sync wakeups can be used more widely in the kernel
(to signal wakeup affinity between tasks), and no arbitrary
latencies will be introduced either.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: affine sync wakeups
Ingo Molnar [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: affine sync wakeups

make sync wakeups affine for cache-cold tasks: if a cache-cold task
is woken up by a sync wakeup then use the opportunity to migrate it
straight away. (the two tasks are 'related' because they communicate)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: guest CPU accounting: maintain guest state in KVM
Laurent Vivier [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: guest CPU accounting: maintain guest state in KVM

Modify KVM to update guest time accounting.

[ mingo@elte.hu: ported to 2.6.24 KVM. ]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: guest CPU accounting: maintain stats in account_system_time()
Laurent Vivier [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: guest CPU accounting: maintain stats in account_system_time()

modify account_system_time() to add cputime to cpustat->guest if we are
running a VCPU. We add this cputime to cpustat->user instead of
cpustat->system because this part of KVM code is in fact user code
although it is executed in the kernel. We duplicate VCPU time between
guest and user to allow an unmodified "top(1)" to display the correct value.
A modified "top(1)" can display accurate cpu user time and cpu guest
time by subtracting cpu guest time from cpu user time. Update "gtime" in
task_struct accordingly.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
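
The accounting rule in this message can be sketched as follows. The struct is a cut-down stand-in for cpustat, and the running_vcpu flag is a hypothetical placeholder for whatever check the kernel performs; the "gtime" update in task_struct is left out.

```c
/* Simplified counters; field names follow the commit text. */
struct cpustat {
	unsigned long long user;
	unsigned long long system;
	unsigned long long guest;
};

/*
 * Time spent running a VCPU is added to both 'user' and 'guest';
 * all other time handled here goes to 'system'.
 * 'running_vcpu' is a hypothetical flag for this sketch.
 */
static void account_system_time_sketch(struct cpustat *st, int running_vcpu,
				       unsigned long long cputime)
{
	if (running_vcpu) {
		st->user += cputime;
		st->guest += cputime;
	} else {
		st->system += cputime;
	}
}

/* What a guest-aware top(1) would compute as user time without guest time. */
static unsigned long long user_without_guest(const struct cpustat *st)
{
	return st->user - st->guest;
}
```

An unmodified tool that only reads 'user' still sees the VCPU time, while a guest-aware tool can separate the two by subtraction.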
16 years agosched: guest CPU accounting: add guest-CPU /proc/<pid>/stat fields
Laurent Vivier [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: guest CPU accounting: add guest-CPU /proc/<pid>/stat fields

like for cpustat, introduce the "gtime" (guest time of the task) and
"cgtime" (guest time of the task children) fields for the
tasks. Modify signal_struct and task_struct.

Modify /proc/<pid>/stat to display these new fields.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: guest CPU accounting: add guest-CPU /proc/stat field
Laurent Vivier [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: guest CPU accounting: add guest-CPU /proc/stat field

as recent CPUs introduce a third running state, after "user" and
"system", we need a new field, "guest", in cpustat to store the time
used by the CPU to run a virtual CPU. Modify /proc/stat to display this
new field.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: domain sysctl fixes: add terminator comment
Milton Miller [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: domain sysctl fixes: add terminator comment

we had an incorrect-terminator bug in sd_alloc_ctl_domain_table()
before, so add a comment that documents it.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: domain sysctl fixes: do not crash on allocation failure
Milton Miller [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: domain sysctl fixes: do not crash on allocation failure

Now that we are calling this at runtime, a more relaxed error path is
suggested.  If an allocation fails, we just register the partial table,
which will show empty directories.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: domain sysctl fixes: unregister the sysctl table before domains
Milton Miller [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: domain sysctl fixes: unregister the sysctl table before domains

Unregister and free the sysctl table before destroying domains, then
rebuild and register after creating the new domains.  This prevents the
sysctl table from pointing to freed memory for root to write.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: domain sysctl fixes: use for_each_online_cpu()
Milton Miller [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: domain sysctl fixes: use for_each_online_cpu()

init_sched_domain_sysctl was walking cpus 0-n and referencing per_cpu
variables.  If the cpus_possible mask is not contiguous this will result
in a crash referencing unallocated data.  If the online mask is not
contiguous then we would show offline cpus and miss online ones.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
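
The difference between walking ids 0-n and walking a mask can be sketched in userspace C. NR_CPUS, the macro and both helpers are hypothetical names for this illustration; the kernel's for_each_online_cpu() works on a cpumask, not on a plain integer.

```c
#include <stddef.h>

#define NR_CPUS 8	/* hypothetical limit for this sketch */

/* Visit only the CPUs whose bit is set in mask. */
#define for_each_cpu_in_mask(cpu, mask) \
	for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
		if ((mask) & (1u << (cpu)))

/* Collect the CPUs set in mask into out[]; returns how many were found. */
static size_t collect_cpus(unsigned int mask, int *out)
{
	size_t n = 0;
	int cpu;

	for_each_cpu_in_mask(cpu, mask)
		out[n++] = cpu;
	return n;
}

/* The buggy pattern: assume the first 'count' CPU ids are the ones present. */
static size_t collect_first_n(size_t count, int *out)
{
	size_t i;

	for (i = 0; i < count; i++)
		out[i] = (int)i;
	return count;
}
```

With a mask of 0x0D (CPUs 0, 2 and 3 online), the mask walk yields exactly those three ids, while the 0-n walk yields 0, 1 and 2: it includes offline CPU 1 and misses online CPU 3.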
16 years agosched: domain sysctl fixes: use kcalloc()
Milton Miller [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: domain sysctl fixes: use kcalloc()

kcalloc() checks for overflow of n * sizeof(element) and zeroes the allocation.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
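
A userspace sketch of the two properties mentioned above, the overflow check and the zeroing. The helper name checked_zalloc_array() is made up for this example, and it uses malloc() rather than any kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * kcalloc()-style helper: refuse the request if n * size would
 * overflow size_t, otherwise return zeroed memory.
 */
static void *checked_zalloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

An open-coded malloc(n * size) would silently allocate a too-small buffer when the multiplication wraps; the division-based check rejects that case before allocating.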
16 years agoMake scheduler debug file operations const
Arjan van de Ven [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
Make scheduler debug file operations const

In general, struct file_operations are const in the kernel, to avoid
false cacheline sharing and to catch accidental writes to them as bugs
at compile time. The new scheduler code introduces a new non-const one;
fix this up.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
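
The const pattern can be illustrated with a cut-down operations table. struct file_ops, the dbg_* callbacks and call_open() are hypothetical stand-ins for this sketch, not the kernel's struct file_operations or the scheduler's actual handlers.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for an operations table. */
struct file_ops {
	int (*open)(void);
	int (*release)(void);
};

static int dbg_open(void)    { return 0; }
static int dbg_release(void) { return 0; }

/*
 * Declared const: the table can be placed in read-only data, and an
 * accidental assignment to one of its members fails to compile.
 */
static const struct file_ops sched_debug_fops_sketch = {
	.open    = dbg_open,
	.release = dbg_release,
};

/* Callers only ever need a pointer to const. */
static int call_open(const struct file_ops *ops)
{
	return ops->open ? ops->open() : -1;
}
```

Since the table is never written after initialisation, nothing is lost by marking it const, and the compiler rejects any later attempt to modify it.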
16 years agosched: enable wake-idle on CONFIG_SCHED_MC=y
Ingo Molnar [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: enable wake-idle on CONFIG_SCHED_MC=y

most multicore CPUs today have shared L2 caches, so tune things so
that the spreading amongst cores is more aggressive.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: reintroduce topology.h tunings
Ingo Molnar [Mon, 15 Oct 2007 15:00:19 +0000 (17:00 +0200)]
sched: reintroduce topology.h tunings

reintroduce the 2.6.22 topology.h tunings again - they result in
slightly better balancing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: allow the immediate migration of cache-cold tasks
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: allow the immediate migration of cache-cold tasks

allow the immediate migration of cache-cold tasks.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: debug, improve migration statistics
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: debug, improve migration statistics

add new migration statistics when SCHED_DEBUG and SCHEDSTATS
are enabled. Available in /proc/<PID>/sched.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: debug: increase width of debug line
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: debug: increase width of debug line

increase width of debug line - in preparation of more debugging info.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: activate task_hot() only on fair-scheduled tasks
Peter Zijlstra [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: activate task_hot() only on fair-scheduled tasks

activate task_hot() only for fair-scheduled tasks (i.e. disable it
for RT tasks).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: reintroduce cache-hot affinity
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: reintroduce cache-hot affinity

reintroduce a simplified version of cache-hot/cold scheduling
affinity. This improves performance with certain SMP workloads,
such as sysbench.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: speed up context-switches a bit
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: speed up context-switches a bit

speed up context-switches a bit by not clearing p->exec_start.

(as a side-effect, this also makes p->exec_start a universal timestamp
available to cache-hot estimations.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: do not wakeup-preempt with SCHED_BATCH tasks
Ingo Molnar [Mon, 15 Oct 2007 15:00:18 +0000 (17:00 +0200)]
sched: do not wakeup-preempt with SCHED_BATCH tasks

do not wakeup-preempt with SCHED_BATCH tasks, their preemption
is batched too, driven by the tick.

Signed-off-by: Ingo Molnar <mingo@elte.hu>