err.no Git - linux-2.6/log

[NETFILTER]: nf_nat_pptp: fix expectation removal

When removing the expectation for the opposite direction, the PPTP NAT
helper initializes the tuple for lookup with the addresses of the
opposite direction, which makes the lookup fail.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_nat: fix ICMP translation with statically linked conntrack

When nf_nat/nf_conntrack_ipv4 are linked statically, nf_nat is initialized
before nf_conntrack_ipv4, which makes the nf_ct_l3proto_find_get(AF_INET)
call during nf_nat initialization return the generic l3proto instead of
the AF_INET specific one. This breaks ICMP error translation since the
generic protocol always initializes the IPs in the tuple to 0.

Change the linking order and put nf_conntrack_ipv4 first.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Restore SKB socket owner setting in tcp_transmit_skb().

Revert 931731123a103cfb3f70ac4b7abfc71d94ba1f03

We can't elide the skb_set_owner_w() here because things like certain
netfilter targets (such as owner MATCH) need a socket to be set on the
SKB for correct operation.

Thanks to Jan Engelhardt and other netfilter list members for
pointing this out.

Signed-off-by: David S. Miller <davem@davemloft.net>

[AF_PACKET]: Check device down state before hard header callbacks.

If the device is down, invoking the device hard header callbacks
is not legal, so check it early.

Based upon a shaper OOPS report from Frederik Deweerdt.

Signed-off-by: David S. Miller <davem@davemloft.net>

[DECNET]: Handle a failure in neigh_parms_alloc (take 2)

While enhancing the neighbour code to handle multiple network
namespaces I noticed that decnet is assuming neigh_parms_alloc
will always succeed, which is clearly wrong. So handle the
failure.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[BNX2]: Fix 2nd port's MAC address.

On the 5709, we need to add the proper offset to calculate the shared
memory base address of the 2nd port correctly. Otherwise, the 2nd
port's MAC address and other information will be the same as the 1st
port.

Update version to 1.5.4.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix sorting of SACK blocks.

The sorting of SACK blocks actually munges them rather than sorting them,
causing the TCP stack to ignore some SACK information and breaking the
assumption of ordered SACK blocks after sorting.

The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.
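
As an illustration only (not the kernel code), a swap-based sort that
moves each element through a temporary could look like the sketch below.
The struct sack_block type and the helper names are hypothetical
stand-ins; only the idea of swapping through a temporary comes from the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a SACK block: one sequence range. */
struct sack_block {
    uint32_t start_seq;
    uint32_t end_seq;
};

/* Wraparound-safe "a comes after b" for 32-bit sequence numbers. */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Bubble sort in place by start_seq; every swap goes through tmp,
 * so both elements are always read from their current location. */
static void sort_sack_blocks(struct sack_block *sp, int n)
{
    assert(n >= 0);
    for (int i = n - 1; i > 0; i--) {
        for (int j = 0; j < i; j++) {
            if (seq_after(sp[j].start_seq, sp[j + 1].start_seq)) {
                struct sack_block tmp = sp[j];
                sp[j] = sp[j + 1];
                sp[j + 1] = tmp;
            }
        }
    }
}
```

A bubble sort is used here only because the number of SACK blocks in a
segment is small; any in-place sort with a proper swap would do.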

Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[AF_PACKET]: Fix BPF handling.

This fixes a bug introduced by:

commit fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9
Author: Dmitry Mishin <dim@openvz.org>
Date: Thu Aug 31 15:28:39 2006 -0700

[NET]: Fix sk->sk_filter field access

sk_run_filter() returns either 0 or an unsigned 32-bit
length which says how much of the packet to retain.
If that 32-bit unsigned integer is larger than the packet,
this is fine; we just leave the packet unchanged.

The above commit caused all filter return values which
were negative when interpreted as a signed integer to
indicate a packet drop, which is wrong.
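
A minimal sketch of the two interpretations, assuming the filter result
is available as a plain 32-bit value. The function names keep_len() and
keep_len_buggy() are made up for this example and do not exist in the
kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy reading: treating the result as signed turns every value
 * with the top bit set into a "drop". */
static uint32_t keep_len_buggy(uint32_t filter_res, uint32_t pkt_len)
{
    int32_t res = (int32_t)filter_res;

    if (res <= 0)
        return 0;
    return (uint32_t)res < pkt_len ? (uint32_t)res : pkt_len;
}

/* Intended reading: 0 drops the packet, anything else keeps
 * min(result, packet length) bytes. */
static uint32_t keep_len(uint32_t filter_res, uint32_t pkt_len)
{
    if (filter_res == 0)
        return 0;
    return filter_res < pkt_len ? filter_res : pkt_len;
}
```

With a filter that returns 0xffffffff ("keep everything"), the buggy
variant drops a 100-byte packet while the intended variant keeps all
100 bytes.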

Based upon a report and initial patch by Raivis Bucis.

Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Fix the fib trie iterator to work with a single entry routing tables

In a kernel with trie routing enabled I had a simple routing setup
with only a single route to the outside world and no default
route. "ip route table list main" showed me the route just fine but
/proc/net/route was an empty file.  What was going on?

Thinking it was a bug in something I did, I looked deeper.  Eventually
I set up a second route and everything looked correct, huh?  Finally I
realized that the iterator pair fib_trie_get_first and
fib_trie_get_next just could not handle a routing table with a single
entry.

So to save myself and others further confusion, here is a simple fix for
the fib proc iterator so it works even when there is only a single route
in a routing table.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Fix wrong checksum calculation on 64-bit MIPS
  [MIPS] VPE loader: Initialize lists before they're actually being used ...
  [MIPS] Fix reported amount of freed memory - it's in kB not bytes
  [MIPS] vr41xx: need one more nop with mtc0_tlbw_hazard()
  [MIPS] SMTC: Fix module build by exporting symbol
  [MIPS] SMTC: Fix TLB sizing bug for TLB of 64 >= entries
  [MIPS] Fix APM build
  [MIPS] There is no __GNUC_MAJOR__

[PATCH] NFS: Fix races in nfs_revalidate_mapping()

Prevent the call to invalidate_inode_pages2() from racing with file writes
by taking the inode->i_mutex across the page cache flush and invalidate.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] NFS: Fix Oops in rpc_call_sync()

Fix the Oops in http://bugzilla.linux-nfs.org/show_bug.cgi?id=138
We shouldn't be calling rpc_release_task() for tasks that are not active.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[MIPS] Fix wrong checksum calculation on 64-bit MIPS

The commit 8e3d8433d8c22ca6c42cba4a67d300c39aae7822 ([NET]: MIPS
checksum annotations and cleanups) broke 64-bit MIPS.

The problem is the commit replaces some unsigned long with __be32.  On
64-bit MIPS, a __be32 (i.e. unsigned int) value is represented as a
sign-extended 32-bit value in a 64-bit argument register.  So the
address 192.168.0.1 (0xc0a80001) is passed as 0xffffffffc0a80001 to
csum_tcpudp_nofold() but the asm code in the function expects
0x00000000c0a80001, therefore it returns a wrong checksum.  An explicit
cast to unsigned long is needed to drop the high 32 bits.
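
The effect can be modelled in portable C. This is only a sketch of the
register contents described above, not the MIPS assembly; both helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 32-bit value as it sits, sign-extended, in a 64-bit
 * argument register. */
static uint64_t as_sign_extended_reg(uint32_t v)
{
    return (uint64_t)(int64_t)(int32_t)v;
}

/* Model the fix: keep only the low 32 bits before the value is
 * used in 64-bit arithmetic. */
static uint64_t drop_high_bits(uint64_t reg)
{
    return reg & 0xffffffffULL;
}
```

For 192.168.0.1 the first helper yields 0xffffffffc0a80001 and the
second brings it back to 0x00000000c0a80001, which is the value the
checksum code expects.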

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] VPE loader: Initialize lists before they're actually being used ...

kspd, which due to makefile order happens to be initialized before the
vpe loader, references the vpecontrol lists before they have actually
been initialized.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] Fix reported amount of freed memory - it's in kB not bytes

While at it, change message on DEC for consistency.

Signed-off-by: Thiemo Seufer <ths@networkno.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] vr41xx: need one more nop with mtc0_tlbw_hazard()

NEC VR4111 and VR4121 need one more nop with mtc0_tlbw_hazard().

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] SMTC: Fix module build by exporting symbol

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] SMTC: Fix TLB sizing bug for TLB of 64 >= entries

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] Fix APM build

Definitions for TIF_FREEZE and _TIF_FREEZE were missing.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] There is no __GNUC_MAJOR__

Gcc major version number is in __GNUC__. As a side effect, fix checking
with sparse if sparse was built with gcc 4.1 and the mips cross-compiler
is 3.4.

Sparse will inherit version 4.1, __GNUC__ won't be filtered from
"-dM -E -xc" output, sparse will pick only new major, effectively becoming
gcc version 3.1 which is unsupported.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix oops when Windows server sent bad domain name null terminator
  [CIFS]  cifs sprintf fix
  [CIFS] Remove 2 unneeded kzalloc casts
  [CIFS] Update CIFS version number

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
  [SCTP]: Fix compiler warning.
  [IP] TUNNEL: Fix to be built with user application.
  [IPV6]: Fixed the size of the netlink message notified by inet6_rt_notify().
  [TCP]: rare bad TCP checksum with 2.6.19
  [NET]: Process include/linux/if_{addr,link}.h with unifdef
  [NETFILTER]: Fix iptables ABI breakage on (at least) CRIS
  [IRDA] vlsi_ir.{h,c}: remove kernel 2.4 code
  [TCP]: skb is unexpectedly freed.
  [IPSEC]: Policy list disorder
  [IrDA]: Removed incorrect IRDA_ASSERT()
  [IrDA]: irda-usb TX path optimization (was Re: IrDA spams logfiles - since 2.6.19)
  [X.25]: Add missing sock_put in x25_receive_data
  [SCTP]: Fix SACK sequence during shutdown
  [SCTP]: Correctly handle unexpected INIT-ACK chunk.
  [SCTP]: Verify some mandatory parameters.
  [SCTP]: Set correct error cause value for missing parameters
  [NETFILTER]: fix xt_state compile failure
  [NETFILTER]: ctnetlink: fix leak in ctnetlink_create_conntrack error path
  [SELINUX]: increment flow cache genid
  [IPV6] MCAST: Fix joining all-node multicast group on device initialization.
  ...

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  mv643xx_eth: Fix race condition in mv643xx_eth_free_tx_descs
  s2io bogus memset

Merge branch 'master' into upstream-fixes

libata: Initialize qc->pad_len

Initialize qc->pad_len for each new command. This ensures
that pad_len is not set to a stale value for zero data
length commands.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

libata: Fixup n_elem initialization

Fixup the initialization of qc->n_elem. It currently gets
initialized to 1 for commands that do not transfer any data.
Fix this by initializing n_elem to 0 and only setting to 1
in ata_scsi_qc_new when there is data to transfer. This fixes
some problems seen with SATA devices attached to ipr adapters.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ahci: don't enter slumber on power down

Some ATA/ATAPI devices act weirdly after the link is put into slumber
mode.  Some hang completely requiring physical power removal while
others fail to wake up till the link is hardreset a couple of times.

The addition of slumber on power down was never driven by real need.
It just followed what ahci spec said literally.  The spec itself seems
faulty in that it doesn't consider devices (not controllers) which
don't support link powersaving mode.

Theory never matches reality when it comes to dark alleys of cheap
ATA/ATAPI world.  It's just unrealistic to expect vendors to test
rarely used link powersaving feature rigorously.  This patch makes
ahci more friendly to the coldness of reality.

This shouldn't have any negative effect - when suspend operation
succeeds, we power off the whole machine; otherwise, we wake up
everything.  I can't see any reason to be so elaborate with powering
down the link in the first place.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sata_nv: don't rely on NV_INT_DEV indication with ADMA

Several people reported issues with certain drive commands timing out on
sata_nv controllers running in ADMA mode. The commands in question were
non-DMA-mapped commands, usually FLUSH CACHE or FLUSH CACHE EXT.

From experimentation it appears that the NV_INT_DEV indication isn't
always set when a legitimate command completion interrupt is received on
a legacy-mode command, at least not on these controllers in ADMA mode.
When a command is pending on the port, force the flag on always in the
irq_stat value before calling nv_host_intr so that the drive busy state
is always checked by ata_host_intr.

This also fixes some questionable code in nv_host_intr which called
ata_check_status when a command was pending and ata_host_intr returned
"unhandled". If the device interrupted at just the wrong time this could
cause interrupts to be lost.

Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ahci: make ULi M5288 ignore interface fatal error bit

As with JMicron controllers, ULi M5288 sets interface fatal error bit
on device error including ATAPI CC.  This makes libata hardreset the
port on ATAPI CC thus making it impossible to use.  Ignore interface
fatal error bit on ULi M5288.  This fixes bugzilla bug #7837.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[SCTP]: Fix compiler warning.

> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -462,24 +461,6 @@ sctp_disposition_t sctp_sf_do_5_1C_ack(const struct sctp_endpoint *ep,

> - if (!init_tag) {
> - struct sctp_chunk *reply = sctp_make_abort(asoc, chunk, 0);
> - if (!reply)
> - goto nomem;

This introduced a compiler warning, easily fixed.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IP] TUNNEL: Fix to be built with user application.

include/linux/if_tunnel.h is broken for user applications
because it was changed to use __be32, which requires linux/types.h
to be included first, but the header did not include it.

(This issue is found when building MIPL2 daemon. We are not sure this
is the last header to be fixed about __be32.)

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: TAKAMIYA Noriaki <takamiya@po.ntts.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6]: Fixed the size of the netlink message notified by inet6_rt_notify().

I think the return value of rt6_nlmsg_size() should include the
amount of RTA_METRICS.

Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: rare bad TCP checksum with 2.6.19

The patch "Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE"
changed the code to copy the ip_summed field from the collapsed skb
unconditionally. This patch reverts this change.

The majority of substantial work including heavy testing
and diagnosing by: Michael Tokarev <mjt@tls.msk.ru>
Possible reasons pointed out by: Herbert Xu and Patrick McHardy.

Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Process include/linux/if_{addr,link}.h with unifdef

After commit d3dcc077bf88806201093f86325ec656e4dbfbce,
include/linux/if_{addr,link}.h should be processed with unifdef.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6

[NETFILTER]: Fix iptables ABI breakage on (at least) CRIS

With the introduction of x_tables we accidentally broke compatibility
by defining IPT_TABLE_MAXNAMELEN to XT_FUNCTION_MAXNAMELEN instead of
XT_TABLE_MAXNAMELEN, which is two bytes larger.

On most architectures it doesn't really matter since we don't have
any tables with names that long in the kernel and the structure
layout didn't change because of alignment requirements of following
members. On CRIS however (and other architectures that don't align
data) this changed the structure layout and thus broke compatibility
with old iptables binaries.
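
The layout effect can be reproduced with a toy structure. This is a
sketch under assumptions: the lengths 30 and 32 are chosen to match
"two bytes larger", the struct and field names are hypothetical, and
the GCC/Clang packed attribute stands in for an architecture that does
not align data.

```c
#include <assert.h>
#include <stddef.h>

/* Naturally aligned variants: the member after the name is padded
 * up to its own alignment. */
struct tbl_name30 { char name[30]; unsigned int valid_hooks; };
struct tbl_name32 { char name[32]; unsigned int valid_hooks; };

/* Packed variants: no padding, so the name length directly decides
 * where the next member starts. */
struct tbl_name30_packed {
    char name[30];
    unsigned int valid_hooks;
} __attribute__((packed));

struct tbl_name32_packed {
    char name[32];
    unsigned int valid_hooks;
} __attribute__((packed));

/* Returns nonzero when shrinking the name by two bytes moves the
 * following member in the aligned layout. */
static int aligned_layout_changed(void)
{
    return offsetof(struct tbl_name30, valid_hooks) !=
           offsetof(struct tbl_name32, valid_hooks);
}

/* Same question for the packed layout. */
static int packed_layout_changed(void)
{
    return offsetof(struct tbl_name30_packed, valid_hooks) !=
           offsetof(struct tbl_name32_packed, valid_hooks);
}
```

On a platform where unsigned int is 4-byte aligned, the aligned layout
does not change, while the packed layout does.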

Changing it back will break compatibility with binaries compiled
against recent kernels again, but since the breakage has only been
there for three releases this seems like the better choice.

Spotted by Jonas Berlin <xkr47@outerspace.dyndns.org>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IRDA] vlsi_ir.{h,c}: remove kernel 2.4 code

This patch removes kernel 2.4 compatibility code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: skb is unexpectedly freed.

I encountered a kernel panic with my test program, which is a very
simple IPv6 client-server program.

The server side sets IPV6_RECVPKTINFO on a listening socket, and the
client side just sends a message to the server.  Then the kernel panic
occurs on the server.  (If you need the test program, please let me
know. I can provide it.)

This problem happens because a skb is forcibly freed in
tcp_rcv_state_process().

When a socket in listening state (TCP_LISTEN) receives a SYN packet,
tcp_v6_conn_request() is called from tcp_rcv_state_process().  If
tcp_v6_conn_request() returns successfully, the skb is discarded
by __kfree_skb().

However, for a listening socket that already has IPV6_RECVPKTINFO
set, the address of the skb is stored in treq->pktopts and the ref
count of the skb is incremented in tcp_v6_conn_request().  The skb is
then freed even though it is still in use, and whoever uses the freed
skb afterwards causes the kernel panic.

I suggest using kfree_skb() instead of __kfree_skb().
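
The difference between the two calls can be sketched with a toy
reference-counted buffer. Everything here (struct buf, buf_get(),
buf_put(), buf_force_free()) is hypothetical; "freed" is a flag rather
than a real free so the state stays observable.

```c
#include <assert.h>

/* Toy buffer with a user count. */
struct buf {
    int users;
    int freed;
};

/* Take an extra reference, as the connection request path does
 * when it stores the buffer for later use. */
static void buf_get(struct buf *b)
{
    b->users++;
}

/* kfree_skb()-like: drop one reference, release only on the last. */
static void buf_put(struct buf *b)
{
    assert(b->users > 0);
    if (--b->users == 0)
        b->freed = 1;
}

/* __kfree_skb()-like: release regardless of other users. */
static void buf_force_free(struct buf *b)
{
    b->freed = 1;
}
```

With two users, buf_put() leaves the buffer alive for the remaining
holder, whereas buf_force_free() releases it underneath that holder.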

Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPSEC]: Policy list disorder

The recent hashing introduced an off-by-one bug in policy list insertion.
Instead of adding after the last entry with a lesser or equal priority,
we're adding after the successor of that entry.

This patch fixes this and also adds a warning if we detect a duplicate
entry in the policy list. This should never happen due to this if clause.
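
The intended insertion rule can be sketched on a plain singly linked
list. This is an illustration, not the xfrm code; struct policy and
policy_insert() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy policy entry; a lower priority value sorts first. */
struct policy {
    unsigned int priority;
    int id;
    struct policy *next;
};

/* Insert pol directly after the last entry whose priority is less
 * than or equal to pol->priority, not after that entry's successor. */
static void policy_insert(struct policy **head, struct policy *pol)
{
    struct policy **pp = head;

    assert(head != NULL && pol != NULL);
    while (*pp && (*pp)->priority <= pol->priority)
        pp = &(*pp)->next;
    pol->next = *pp;
    *pp = pol;
}
```

Inserting priorities 10, 20 and then 10 again yields the order
10, 10, 20, with the later of the two equal entries second.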

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IrDA]: Removed incorrect IRDA_ASSERT()

With USB2.0 bulk out MTU can be 512 bytes, so checking it only for 64
bytes is incorrect.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IrDA]: irda-usb TX path optimization (was Re: IrDA spams logfiles - since 2.6.19)

Since we stopped using dev_alloc_skb on the IrDA TX frame, we constantly
run into the case of the skb headroom being 0, and thus we call skb_cow
for every IrDA TX frame.
This patch uses a local buffer and memcpys the skb into it, saving us a
kmalloc for each of those IrDA TX frames.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[X.25]: Add missing sock_put in x25_receive_data

__x25_find_socket does a sock_hold.
This adds a missing sock_put in x25_receive_data.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Fix SACK sequence during shutdown

Currently, when an association enters the SHUTDOWN state, the
implementation will SACK any DATA first and then transmit
the SHUTDOWN chunk. This is against the order required by
2960bis spec. SHUTDOWN must always be first, followed by
SACK. This change forces this order and also enables bundling.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Correctly handle unexpected INIT-ACK chunk.

Consider the chunk as Out-of-the-Blue if we don't have
an endpoint. Otherwise discard it as before.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Verify some mandatory parameters.

Verify init_tag and a_rwnd mandatory parameters in INIT and
INIT-ACK chunks.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Set correct error cause value for missing parameters

sctp_process_missing_param() needs to use the SCTP_ERROR_MISS_PARAM
error cause value.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: fix xt_state compile failure

In file included from net/netfilter/xt_state.c:13:
include/net/netfilter/nf_conntrack_compat.h: In function 'nf_ct_l3proto_try_module_get':
include/net/netfilter/nf_conntrack_compat.h:70: error: 'PF_INET' undeclared (first use in this function)
include/net/netfilter/nf_conntrack_compat.h:70: error: (Each undeclared identifier is reported only once
include/net/netfilter/nf_conntrack_compat.h:70: error: for each function it appears in.)
include/net/netfilter/nf_conntrack_compat.h:71: warning: control reaches end of non-void function
make[2]: *** [net/netfilter/xt_state.o] Error 1
make[1]: *** [net/netfilter] Error 2
make: *** [net] Error 2

A simple fix is to have nf_conntrack_compat.h #include <linux/socket.h>.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: ctnetlink: fix leak in ctnetlink_create_conntrack error path

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SELINUX]: increment flow cache genid

Currently, old flow cache entries remain valid even after
a reload of SELinux policy.

This patch increments the flow cache generation id
on policy (re)loads so that flow cache entries are
revalidated as needed.

Thanks to Herbert Xu for pointing this out. See:
http://marc.theaimsgroup.com/?l=linux-netdev&m=116841378704536&w=2

There's also a general issue as well as a solution proposed
by David Miller for when flow_cache_genid wraps. I might be
submitting a separate patch for that later.

I request that this be applied to 2.6.20 since it's
a security relevant fix.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] MCAST: Fix joining all-node multicast group on device initialization.

Join all-node multicast group after assignment of dev->ip6_ptr
because it must be assigned when ipv6_dev_mc_inc() is called.
This fixes Bug#7817, reported by <gernoth@informatik.uni-erlangen.de>.

Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPSEC] flow: Fix potential memory leak

When old flow cache entries that are not at the head of their chain
trigger a transient security error they get unlinked along with all
the entries preceding them in the chain. The preceding entries are
not freed correctly.

This patch fixes this by simply leaving the entry around. It's based
on a suggestion by Venkat Yekkirala.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

V4L/DVB (5123): Buf_qbuf: fix: videobuf_queue->stream corruption and lockup

We are doing ->buf_prepare(buf) before adding buf to q->stream list. This
means that videobuf_qbuf() should not try to re-add a STATE_PREPARED buffer.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>

Change Linus' email address too

This changes a few mentions of my email address to point to the new one,
leaving things like old copyright messages alone.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] email change for shemminger@osdl.org

Change my email address to reflect OSDL merger.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
[ The irony. Somebody still has his sign-off message hardcoded
in a script or his brainstem ;^]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Clear spurious irq stat information when adding irq handler

Any newly added irq handler may obviously make any old spurious irq
status invalid, since the new handler may well be the thing that is
supposed to handle any interrupts that came in.

So just clear the statistics when adding handlers.

Pointed-out-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mv643xx_eth: Fix race condition in mv643xx_eth_free_tx_descs

This bug was found and isolated by Thibaut VARENE <T-Bone@parisc-linux.org>
and Jarek Poplawski <jarkao2@o2.pl>. This patch is a modification of their
fixes. We acquire and release the lock for each descriptor that is freed
to minimize the time the lock is held.

Signed-off-by: Jeff Garzik <jeff@garzik.org>

s2io bogus memset

memset() after kmalloc() on size * 8 would better be on size * 8, not
just size; fixed by switching to kcalloc() - it's more idiomatic anyway.
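
A userspace analogue of the pattern, with malloc()/calloc() standing in
for kmalloc()/kcalloc(). The function names are hypothetical, and the
0xAA fill only simulates stale heap contents so the effect is visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Buggy pattern: allocates n * 8 bytes but clears only the first n. */
static unsigned char *alloc_buggy(size_t n)
{
    unsigned char *p = malloc(n * 8);

    if (p) {
        memset(p, 0xAA, n * 8); /* simulate stale heap contents */
        memset(p, 0, n);        /* the bug: length should be n * 8 */
    }
    return p;
}

/* calloc() takes count and element size and zeroes the whole
 * allocation, so the two sizes cannot drift apart. */
static unsigned char *alloc_fixed(size_t n)
{
    return calloc(n, 8);
}
```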

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Vr41xx: Fix after GENERIC_HARDIRQS_NO__DO_IRQ change
  [MIPS] SMTC: Instant IPI replay.

[PATCH] correct sys_shmget allocation check

As written, sys_shmget will return ENOSPC when one page is still
available for allocation. This patch corrects the test.
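
The off-by-one can be sketched as a comparison on page counts. The
function and parameter names are hypothetical, and the exact kernel
expression is an assumption here; only the behaviour ("ENOSPC when one
page is still available") comes from the description above.

```c
#include <assert.h>
#include <errno.h>

/* Buggy test: also rejects a request that would exactly fill the
 * limit. */
static int shm_check_buggy(unsigned long tot, unsigned long numpages,
                           unsigned long limit)
{
    return tot + numpages >= limit ? -ENOSPC : 0;
}

/* Corrected test: rejects only requests that would exceed the
 * limit. */
static int shm_check_fixed(unsigned long tot, unsigned long numpages,
                           unsigned long limit)
{
    return tot + numpages > limit ? -ENOSPC : 0;
}
```

With a limit of 10 pages and 9 already in use, a one-page request
fails in the buggy variant and succeeds in the corrected one.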

Signed-off-by: Guy Streeter <guy.streeter+lkml@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
  IB/ehca: Fix mismatched spin_unlock in irq handler
  IB/ehca: Fix improper use of yield() with spinlock held
  IB/srp: Check match_strdup() return

[PATCH] fix prototype of csum_ipv6_magic() (ia64)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] s2io bogus memset

memset() after kmalloc() on size * 8 would better be on size * 8, not
just size; fixed by switching to kcalloc() - it's more idiomatic anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] horizon.c: missing __devinit

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] funsoft: ktermios fix

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] notifiers: fix blocking_notifier_call_chain() scalability

while lock-profiling the -rt kernel i noticed weird contention during
mmap-intense workloads, and the tracer showed the following gem, in one
of our MM hotpaths:

threaded-2771  1....   65us : sys_munmap (sysenter_do_call)
threaded-2771  1....   66us : profile_munmap (sys_munmap)
threaded-2771  1....   66us : blocking_notifier_call_chain (profile_munmap)
threaded-2771  1....   66us : rt_down_read (blocking_notifier_call_chain)

ouch! a global rw-semaphore taken in one of the most performance-
sensitive codepaths of the kernel.  And i dont even have oprofile
enabled! All distro kernels have CONFIG_PROFILING enabled, so this
scalability problem affects the majority of Linux users.

The fix is to enhance blocking_notifier_call_chain() to only take the
lock if there appears to be work on the call-chain.
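
A sketch of that fast path, with a counter standing in for the
rw-semaphore so the effect is observable. All names (struct notifier,
struct chain, chain_call(), count_calls()) are hypothetical, and the
sketch ignores the memory-ordering questions the real code has to
consider.

```c
#include <assert.h>
#include <stddef.h>

/* One registered callback. */
struct notifier {
    int (*call)(void *data);
    struct notifier *next;
};

/* Call chain; lock_acquisitions stands in for down_read()/up_read(). */
struct chain {
    struct notifier *head;
    int lock_acquisitions;
};

/* Sample callback used for demonstration: counts invocations. */
static int count_calls(void *data)
{
    (*(int *)data)++;
    return 0;
}

/* Walk the chain, but skip the lock entirely when the chain looks
 * empty. */
static int chain_call(struct chain *c, void *data)
{
    int ret = 0;

    assert(c != NULL);
    if (c->head == NULL)
        return ret;             /* fast path: nothing registered */

    c->lock_acquisitions++;     /* lock would be taken here */
    for (struct notifier *n = c->head; n; n = n->next)
        ret = n->call(data);
    /* lock would be released here */
    return ret;
}
```

An empty chain is thus called without touching the lock, which is the
common case when no profiler is registered.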

With this patch applied i get nicely saturated system, and much higher
munmap performance, on SMP systems.

And as a bonus this also fixes a similar scalability bottleneck in the
thread-exit codepath: profile_task_exit() ...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa

* 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa:
  [ALSA] Repair snd-usb-usx2y over OHCI

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  NetXen: Use pci_register_driver() instead of pci_module_init() in init_module
  NetXen: Firmware check modifications
  ehea: Fixed possible nullpointer access
  ehea: Added logging off associated errors
  ehea: Improved logging of permission issues
  ehea: New method to determine number of available ports
  ehea: Modified initial autoneg state determination
  ehea: Fixing firmware queue config issue
  ehea: Fixed wrong dereferencation
  PHY: Export phy ethtool helpers
  modify 3c589_cs to be SMP safe

Merge branch 'ftape' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6

* 'ftape' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  more ftape removal

Merge branch 'kill-jffs-prep' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6

* 'kill-jffs-prep' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  Note that JFFS (v1) is to be deleted, in feature-removal-schedule.txt

[PATCH] elevator: move clearing of unplug flag earlier

A flag was recently added to the elevator code to avoid
performing an unplug when requests are being re-queued.
The goal of this flag was to avoid a deep recursion that
can occur when re-queueing requests after a SCSI device/host
reset. See http://lkml.org/lkml/2006/5/17/254

However, that fix added the flag near the bottom of a case
statement, where an earlier break (in an if statement) could
transport one out of the case, without setting the flag.
This patch sets the flag earlier in the case statement.
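
The control-flow problem can be reduced to a few lines. This is a
simplified model, not the elevator code: the enum, the early_exit
parameter and both function names are hypothetical, and the flag is
modelled as unplug_it being cleared.

```c
#include <assert.h>

enum where { WHERE_FRONT, WHERE_REQUEUE };

/* Buggy shape: the flag is cleared near the bottom of the case, so
 * an earlier break leaves the case with the flag still set. */
static int would_unplug_buggy(enum where w, int early_exit)
{
    int unplug_it = 1;

    switch (w) {
    case WHERE_REQUEUE:
        if (early_exit)
            break;          /* skips the clearing below */
        unplug_it = 0;
        break;
    default:
        break;
    }
    return unplug_it;
}

/* Fixed shape: the flag is cleared first, before any early break. */
static int would_unplug_fixed(enum where w, int early_exit)
{
    int unplug_it = 1;

    switch (w) {
    case WHERE_REQUEUE:
        unplug_it = 0;
        if (early_exit)
            break;
        break;
    default:
        break;
    }
    return unplug_it;
}
```

On the requeue path with the early exit taken, the buggy shape still
unplugs while the fixed shape does not.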

I re-discovered the deep recursion recently during testing;
I was told that it was a known problem, and the fix to it was
in the kernel I was testing. Indeed it was ... but it didn't
fix the bug. With the patch below, I no longer see the bug.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[MIPS] Vr41xx: Fix after GENERIC_HARDIRQS_NO__DO_IRQ change

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] SMTC: Instant IPI replay.

SMTC pseudo-interrupts between TCs are deferred and queued if the target
TC is interrupt-inhibited (IXMT). In the first SMTC prototypes, these
queued IPIs were serviced on return to user mode, or on entry into the
kernel idle loop. The INSTANT_REPLAY option dispatches them as part of
local_irq_restore() processing, which adds runtime overhead (hence the
option to turn it off), but ensures that IPIs are handled promptly even
under heavy I/O interrupt load.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[PATCH] acpi: remove "video device notify" message

Seems to be some left-over debug code.

Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] Kdump documentation update: ia64 portion

This patch fills in the portions for ia64 kexec.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: "Zou, Nanhai" <nanhai.zou@intel.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] Kdump documentation update: kexec-tools update

Mohan Kumar suggested making kexec-tools-testing.tar.gz a link to the
latest version. I have done this and this patch updates the documentation
accordingly.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] reiserfs: avoid tail packing if an inode was ever mmapped

This patch fixes a confusion reiserfs has had for a long time.

On the file release operation, reiserfs used to try to pack the file data
stored in the last incomplete page of some files into metadata blocks.
After packing, the page was cleared with clear_page_dirty.  This did not
take into account that the page may be mmapped into another process's
address space.  cancel_dirty_page, the recent replacement for
clear_page_dirty, exposed the confusion with its sanity check that the page
must not be mapped.

The patch fixes the confusion by making reiserfs avoid tail packing if an
inode was ever mmapped.  reiserfs_mmap and reiserfs_file_release are
serialized with a mutex in the reiserfs-specific inode.  reiserfs_mmap locks
the mutex and sets a bit in the reiserfs-specific inode flags.
reiserfs_file_release checks the bit with the mutex held.  If the bit is
set, tail packing is avoided.  This eliminates the possibility that an
mmapped page gets cancel_dirty_page-ed.
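
The locking scheme described above can be sketched in userspace as follows.
This is only an illustration, not the reiserfs code: struct demo_inode, its
fields and the demo_* functions are hypothetical names, and a pthread mutex
stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reiserfs-specific inode data. */
struct demo_inode {
    pthread_mutex_t tailpack_lock; /* serializes mmap and release */
    bool ever_mmapped;             /* set once, never cleared */
};

static void demo_inode_init(struct demo_inode *inode)
{
    int rc = pthread_mutex_init(&inode->tailpack_lock, NULL);

    assert(rc == 0);
    (void)rc;
    inode->ever_mmapped = false;
}

/* mmap path: record under the mutex that the file has been mapped. */
static void demo_mmap(struct demo_inode *inode)
{
    pthread_mutex_lock(&inode->tailpack_lock);
    inode->ever_mmapped = true;
    pthread_mutex_unlock(&inode->tailpack_lock);
}

/* release path: returns true if the tail may be packed, i.e. only when
 * the file was never mapped. The check runs with the mutex held. */
static bool demo_file_release(struct demo_inode *inode)
{
    bool may_pack;

    pthread_mutex_lock(&inode->tailpack_lock);
    may_pack = !inode->ever_mmapped;
    pthread_mutex_unlock(&inode->tailpack_lock);
    return may_pack;
}
```

Because the flag is only ever set, a release that runs after any mmap of
the same inode always sees it and skips packing.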

Signed-off-by: Vladimir Saveliev <vs@namesys.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] mbind: restrict nodes to the currently allowed cpuset

Currently one can specify an arbitrary node mask to mbind that includes
nodes not allowed.  If that is done with an interleave policy then we will
go around all the nodes.  Those outside of the currently allowed cpuset
will be redirected to the border nodes.  Interleave will then create
imbalances at the borders of the cpuset.

This patch restricts the nodes to the currently allowed cpuset.
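
As a rough illustration of the idea (not the kernel code), the restriction
amounts to intersecting the requested node mask with the allowed one before
interleaving. The helpers below are hypothetical and assume at most 64
nodes, one bit per node.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the requested nodes that the cpuset allows. */
static uint64_t demo_restrict_nodes(uint64_t requested, uint64_t allowed)
{
    return requested & allowed;
}

/* Round-robin interleave: next set bit after prev, wrapping around.
 * Pass prev = -1 to get the first node. Returns -1 for an empty mask. */
static int demo_next_node(uint64_t mask, int prev)
{
    assert(prev >= -1 && prev < 64);
    for (int i = 1; i <= 64; i++) {
        int node = (prev + i) % 64;

        if (mask & (UINT64_C(1) << node))
            return node;
    }
    return -1;
}
```

With the mask restricted first, the interleave only ever visits allowed
nodes, so nothing has to be redirected to the border nodes.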

The RFC for this patch was discussed at
http://marc.theaimsgroup.com/?t=116793842100004&r=1&w=2

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] tlclk: bug fix + misc fixes

The following patch fixes a few problems with the tlclk driver.
* bug in the select_amcb1_transmit_clock
* racy read sys call
* racy open sys call
* use of add_timer where mod_timer would be better
* change to the timer data parameter use

Signed-off-by: Mark Gross <mark.gross@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] fix blk_direct_IO bio preparation

For large DIO that needs multiple bios, one full page worth of data was
lost at the boundary of a bio's maximum sector or segment limits.  After a
bio fills up and is submitted, the outer while (nbytes) { ...  } loop
allocates a new bio and simply marches on to index into the next page.  It
forgets about the page that bio_add_page() rejected when the previous bio
was full.  Fix it by putting the rejected page back into pvec so that we
pick it up again for the next bio.
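
The failure mode can be reproduced with a small userspace model. This is a
sketch under simplifying assumptions, not the blk_direct_IO code: the
demo_* names and the four-page limit are hypothetical, and pages are plain
integers.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BIO_MAX_PAGES 4 /* hypothetical per-bio limit */

struct demo_bio {
    int pages[DEMO_BIO_MAX_PAGES];
    size_t count;
};

/* Returns 0 when the page does not fit, similar to how bio_add_page()
 * reports that nothing was added. */
static int demo_bio_add_page(struct demo_bio *bio, int page)
{
    assert(bio != NULL);
    if (bio->count == DEMO_BIO_MAX_PAGES)
        return 0;
    bio->pages[bio->count++] = page;
    return 1;
}

/* Feeds pages 0..npages-1 into bios and returns how many were submitted. */
static size_t demo_submit_all(size_t npages, int put_back_rejected)
{
    struct demo_bio bio = { .count = 0 };
    size_t submitted = 0;
    size_t next = 0;

    while (next < npages) {
        int page = (int)next++;     /* take the next page */

        if (!demo_bio_add_page(&bio, page)) {
            submitted += bio.count; /* bio is full: submit it */
            bio.count = 0;          /* start a fresh bio */
            if (put_back_rejected)
                next--;             /* the fix: retry the rejected page */
            /* otherwise the rejected page is silently skipped */
        }
    }
    submitted += bio.count;         /* submit the final partial bio */
    return submitted;
}
```

In this model, ten pages with the put-back disabled lose one page at each
bio boundary, while enabling it submits all ten.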

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] rtc-sh: act on rtc_wkalrm.enabled when setting an alarm

This fixes the SH rtc driver to correctly act on the "enabled" flag when
setting an alarm.

Signed-off-by: Jamie Lenehan <lenehan@twibble.org>
Cc: David Brownell <david-b@pacbell.net>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] KVM: fix bogus pagefault on writable pages

If a page is marked as dirty in the guest pte, set_pte_common() can set the
writable bit on a newly-instantiated shadow pte. This optimization avoids
a write fault after the initial read fault.

However, if a write fault instantiates the pte, fix_write_pf() incorrectly
reports the fault as a guest page fault, and the guest oopses on what appears
to be a correctly-mapped page.

The fix is to detect the condition and only report a guest page fault on a
user access to a kernel page.

With the fix, a kvm guest can survive a whole night of running the kernel
hacker's screensaver (make -j9 in a loop).

Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] KVM: x86 emulator: fix bit string instructions

The various bit string instructions (bts, btc, etc.) fail to adjust the
address correctly if the bit address is beyond BITS_PER_LONG.

This bug crept in as the emulator originally relied on cr2 to contain the
memory address; however we now decode it from the mod r/m bits, and must
adjust the offset to account for large bit indices.

The patch is rather large because it switches src and dst decoding around, so
that the bit index is available when decoding the memory address.
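
The adjustment can be illustrated with a small helper. This is a sketch of
the arithmetic only, not the emulator code: split_bit_index() and struct
bit_target are hypothetical names. For a signed bit index taken from a
register, the memory operand moves by whole operand-sized words and the
remaining bits select the position within that word.

```c
#include <assert.h>
#include <stdint.h>

struct bit_target {
    int64_t byte_adjust; /* signed offset to add to the decoded address */
    unsigned bit;        /* bit position within the addressed word */
};

/* Split a signed bit index into a byte offset and an in-word bit number,
 * rounding the word index toward negative infinity. */
static struct bit_target split_bit_index(int64_t bit_index, unsigned op_bytes)
{
    struct bit_target t;
    int64_t op_bits = (int64_t)op_bytes * 8;
    int64_t word = bit_index / op_bits;
    int64_t bit = bit_index % op_bits;

    assert(op_bytes == 2 || op_bytes == 4 || op_bytes == 8);
    if (bit < 0) {       /* C division truncates toward zero */
        bit += op_bits;
        word -= 1;
    }
    t.byte_adjust = word * (int64_t)op_bytes;
    t.bit = (unsigned)bit;
    return t;
}
```

For a 4-byte operand, bit index 35 lands on bit 3 of the word 4 bytes above
the decoded address, and bit index -1 lands on bit 31 of the word 4 bytes
below it.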

This fixes workloads like the FC5 installer.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] KVM: fix race between mmio reads and injected interrupts

The kvm mmio read path looks like:

1. guest read faults
2. kvm emulates read, calls emulator_read_emulated()
3. fails as a read requires userspace help
4. exit to userspace
5. userspace emulates read, kvm sets vcpu->mmio_read_completed
6. re-enter guest, fault again
7. kvm emulates read, calls emulator_read_emulated()
8. succeeds as vcpu->mmio_read_completed is set
9. instruction completes and guest is resumed

A problem surfaces if the userspace exit (step 5) also requests an interrupt
injection.  In that case, the guest does not re-execute the original
instruction, but the interrupt handler.  The next time an mmio read is
executed (likely for a different address), step 3 will find
vcpu->mmio_read_completed set and return the value read for the original
instruction.

The problem manifested itself in a few annoying ways:
- little squares appear randomly on console when switching virtual terminals
- ne2000 fails under nfs read load
- rtl8139 complains about "pci errors" even though the device model is
  incapable of issuing them.

Fix by skipping interrupt injection if an mmio read is pending.

A better fix would be to avoid re-entry into the guest and re-emulate
immediately instead.  However, that is a bit more complex.
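
The interaction can be modelled with a tiny state machine. This is a
userspace sketch, not the kvm code: struct demo_vcpu and the demo_*
functions are hypothetical, and only the mmio_read_completed flag
corresponds to a name used above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_vcpu {
    bool mmio_read_completed; /* userspace has supplied read data */
    uint64_t mmio_data;       /* the data supplied by userspace */
    bool irq_requested;       /* userspace asked for an injection */
};

/* Emulated read: succeeds only once userspace has completed it. */
static bool demo_read_emulated(struct demo_vcpu *v, uint64_t *val)
{
    assert(val != NULL);
    if (!v->mmio_read_completed)
        return false;         /* needs userspace help: exit */
    *val = v->mmio_data;
    v->mmio_read_completed = false;
    return true;
}

/* Userspace finishes the read and may also request an interrupt. */
static void demo_userspace_complete(struct demo_vcpu *v, uint64_t data,
                                    bool want_irq)
{
    v->mmio_data = data;
    v->mmio_read_completed = true;
    v->irq_requested = want_irq;
}

/* Guest re-entry: skip the injection while a completed read is still
 * waiting to be consumed. The request stays pending for later. */
static bool demo_try_inject_irq(struct demo_vcpu *v)
{
    if (!v->irq_requested || v->mmio_read_completed)
        return false;
    v->irq_requested = false;
    return true;
}
```

Deferring the injection means the faulting instruction is re-executed
first and consumes its own data, so a later read cannot pick up a stale
value.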

Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] KVM: make sure there is a vcpu context loaded when destroying the mmu

This makes the vmwrite errors on vm shutdown go away.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] paravirt: mark the paravirt_ops export internal

The paravirt subsystem is still in flux so all exports from it are
definitely internal use only. The APIs around this /will/ change.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] SubmitChecklist update

Sing the praises of `gcc -W'. Would have prevented that blockdev direct-IO
bug.

Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] blockdev direct_io: fix signedness bug

size_t is unsigned. IO errors aren't getting through.
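
A minimal illustration of this class of bug, with hypothetical demo_*
helpers rather than the blockdev code: a negative errno stored in a size_t
becomes a huge positive value, so a "< 0" test on it never fires.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Pretend I/O routine: returns a byte count or a negative errno. */
static ssize_t demo_do_io(int fail)
{
    return fail ? -EIO : 4096;
}

/* Buggy: the result is held in an unsigned type, so the error check
 * is always false and the failure is never reported. */
static int demo_failed_unsigned(int fail)
{
    size_t ret = (size_t)demo_do_io(fail);

    return ret < 0;
}

/* Fixed: a signed type preserves the negative error code. */
static int demo_failed_signed(int fail)
{
    ssize_t ret = demo_do_io(fail);

    return ret < 0;
}
```

Compilers can flag the comparison in the buggy variant when extra warnings
are enabled.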

Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] Revert nmi_known_cpu() check during boot option parsing

Commit f2802e7f571c05f9a901b1f5bd144aa730ccc88e and its x86 version
(b7471c6da94d30d3deadc55986cc38d1ff57f9ca) add an nmi_known_cpu() check
while parsing boot options in x86_64 and i386.

With that, "nmi_watchdog=2" stops working for me on an Intel Core 2 CPU
based system.

The problem is, setup_nmi_watchdog is called while parsing the boot
option and identify_cpu is not done yet. So, the return value of
nmi_known_cpu() is not valid at this point.

So revert that check. This should not have any adverse effect as the
nmi_known_cpu() check is done again later in enable_lapic_nmi_watchdog().

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[PATCH] fix "kvm: add vm exit profiling"

Export profile_hits() on !SMP too.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[ALSA] Repair snd-usb-usx2y over OHCI

The previous patch 'Repair snd-usb-usx2y for usb 2.6.18' assumed that
urb->start_frame rolls over beyond MAX_INT for both UHCI & OHCI.
This has not been the case so far (as of kernel 2.6.20).
Fix this by only looking at the frame number range common to OHCI & UHCI.
This is for mainline and stable kernels >= 2.6.18.

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>

NetXen: Use pci_register_driver() instead of pci_module_init() in init_module

This will use pci_register_driver() instead of pci_module_init().

Signed-off-by: Amit S. Kale <amitkale@netxen.com>
Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

NetXen: Firmware check modifications

This patch makes the driver work with multiple minor firmware versions.

Signed-off-by: Amit S. Kale <amitkale@netxen.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Fixed possible NULL pointer access

Fixed possible NULL pointer access in event queue processing

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Added logging of associated errors

Added logging of error events associated with a specific queue pair

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Improved logging of permission issues

Disabled dump of hcall regs on some permission issues and
fixed the corresponding misleading log messages

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: New method to determine number of available ports

Count OFDT nodes to determine the number of available ports
instead of using the possibly outdated value from the hypervisor

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Modified initial autoneg state determination

Logical partitions are not allowed to (try to) set the autonegotiation status.
This patch removes the respective function call from the port setup function.

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Fixing firmware queue config issue

Fix to use exactly one queue for incoming packets in all
firmware configurations

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ehea: Fixed wrong dereference

Not only check the pointer against 0 but also the dereferenced value

Signed-off-by: Thomas Klein <tklein@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

PHY: Export phy ethtool helpers

We need to export phy_ethtool_gset and phy_ethtool_sset to allow drivers that
use these functions to be built as modules.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>