err.no Git - linux-2.6/log

[8390]: Fix build error.

module_init() function reference is wrong.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'upstream-net26' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

Conflicts:

drivers/s390/net/qeth_main.c

[IPV4] route: use read_mostly

The route table parameters are set based on system memory and sysctl
values that almost never change. Also the genid only changes every
10 minutes.

RTprint is defined by never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: sk parameter is unused in ipv4_dst_blackhole.

Just remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: NPROTO is redundant; it's equal to AF_MAX/PF_MAX.

DaveM pointed out NPROTO exposed to userspace, so keep it around,
just make sure it stays in sync.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[RAW]: Add raw_hashinfo member on struct proto.

Sorry for the patch sequence confusion :| but I found that the similar
thing can be done for raw sockets easily too late.

Expand the proto.h union with the raw_hashinfo member and use it in
raw_prot and rawv6_prot. This allows to drop the protocol specific
versions of hash and unhash callbacks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[UDP]: Make full use of proto.h.udp_hash innovation.

After this we have only udp_lib_get_port to get the port and two
stubs for ipv4 and ipv6. No difference in udp and udplite except
for initialized h.udp_hash member.

I tried to find a graceful way to drop the only difference between
udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison
routine), but adding one more callback on the struct proto didn't
appear such :( Maybe later.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SOCK]: Add udp_hash member to struct proto.

Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc
warning about inability to initialize anonymous member this way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Always pass ip_options pointer into ip_options_compile.

This makes code a bit more uniform and straigthforward.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Remove unused ip_options->is_data.

ip_options->is_data is assigned only and never checked. The structure is
not a part of kernel interface to the userspace. So, it is safe to remove
this field.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.

There is the only way to reach ip_options compile with opt != NULL:

ip_options_get_finish
opt->is_data = 1;
ip_options_compile(opt, NULL)

So, checking for is_data inside opt != NULL branch is not needed.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: TCP_DEFER_ACCEPT updates - process as established

Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

- diagnostic state of established matches with the packet traces
   showing completed handshake

- TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack

a socket in LISTEN that had completed its 3 way handshake, but not notified
userspace because of SO_DEFER_ACCEPT, would retransmit the already
acked syn-ack during the time it was waiting for the first data byte
from the peer.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: TCP_DEFER_ACCEPT updates - defer timeout conflicts with max_thresh

timeout associated with SO_DEFER_ACCEPT wasn't being honored if it was
less than the timeout allowed by the maximum syn-recv queue size
algorithm. Fix by using the SO_DEFER_ACCEPT value if the ack has
arrived.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

socket: SOCK_DEBUG type checking

Use the inline trick (same as pr_debug) to get checking of debug
statements even if no code is generated.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: NULL pointer dereference and other nasty things in /proc/net/(tcp|udp)[6]

Commits f40c81 ([NETNS][IPV4] tcp - make proc handle the network
namespaces) and a91275 ([NETNS][IPV6] udp - make proc handle the
network namespace) both introduced bad checks on sockets and tw
buckets to belong to proper net namespace.

I.e. when checking for socket to belong to given net and family the

do {
sk = sk_next(sk);
} while (sk && sk->sk_net != net && sk->sk_family != family);

constructions were used. This is wrong, since as soon as the
sk->sk_net fits the net the socket is immediately returned, even if it
belongs to other family.

As the result four /proc/net/(udp|tcp)[6] entries show wrong info.
The udp6 entry even oopses when dereferencing inet6_sk(sk) pointer:

static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket)
{
...
        struct ipv6_pinfo *np = inet6_sk(sp);
...

        dest  = &np->daddr; /* will be NULL for AF_INET sockets */
...
seq_printf(...
           dest->s6_addr32[0], dest->s6_addr32[1],
                   dest->s6_addr32[2], dest->s6_addr32[3],
...

Fix it by converting && to ||.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: make socket filters work on netlink

Make socket filters work for netlink unicast and notifications.
This is useful for applications like Zebra that get overrun with
messages that are then ignored.

Note: netlink messages are in host byte order, but packet filter
state machine operations are done as network byte order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] tcp6 - make proc per namespace

Make the proc for tcp6 to be per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] udp6 - make proc per namespace

The proc init/exit functions take a new network namespace parameter in
order to register/unregister /proc/net/udp6 for a namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV4] tcp - make proc handle the network namespaces

This patch, like udp proc, makes the proc functions to take care of
which namespace the socket belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] tcp - assign the netns for timewait sockets

Copy the network namespace from the socket to the timewait socket.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] udp - make proc handle the network namespace

This patch makes the common udp proc functions to take care of which
socket they should show taking into account the namespace it belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in

When CONFIG_PROC_FS=no, the out_sock_create label is not used because
the code using it is disabled and that leads to a warning at compile
time.

This patch fix that by making a specific function to initialize proc
for igmp6, and remove the annoying CONFIG_PROC_FS sections in
init/exit function.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Add per-connection option to set max TSO frame size

Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

[NET] ifb: set separate lockdep classes for queue locks

[   10.536424] =======================================================
[   10.536424] [ INFO: possible circular locking dependency detected ]
[   10.536424] 2.6.25-rc3-devel #3
[   10.536424] -------------------------------------------------------
[   10.536424] swapper/0 is trying to acquire lock:
[   10.536424]  (&dev->queue_lock){-+..}, at: [<c0299b4a>]
dev_queue_xmit+0x175/0x2f3
[   10.536424]
[   10.536424] but task is already holding lock:
[   10.536424]  (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178
[act_mirred]
[   10.536424]
[   10.536424] which lock already depends on the new lock.

lockdep warns of locking order while using ifb with sch_ingress and
act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock
is at the beginning). This patch is only to tell lockdep that ifb is
a different device (e.g. from eth) and has its own pair of queue
locks. (This warning is a false-positive in common scenario of using
ifb; yet there are possible situations, when this order could be
dangerous; lockdep should warn in such a case.) (With suggestions by
David S. Miller)

Reported-and-tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] KCONFIG: Fix description about IPV6_TUNNEL.

Based on notice from "Colin" <colins@sjtu.edu.cn>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix shrinking windows with window scaling

When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.

The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: zap_completion_queue: adjust skb->users counter

zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: use time_before() in br_fdb_cleanup()

In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
should be compared using the time_after() macro.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Fix build warning on sparc32.

Sparc MAC address support should be protected consistently
with CONFIG_SPARC, but there was a stray CONFIG_SPARC64
case.

Bump driver version and release date.

Reported by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: bluez-devel is subscribers-only

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

audit: netlink socket can be auto-bound to pid other than current->pid (v2)

From: Pavel Emelyanov <xemul@openvz.org>

This patch is based on the one from Thomas.

The kauditd_thread() calls the netlink_unicast() and passes
the audit_pid to it. The audit_pid, in turn, is received from
the user space and the tool (I've checked the audit v1.6.9)
uses getpid() to pass one in the kernel. Besides, this tool
doesn't bind the netlink socket to this id, but simply creates
it allowing the kernel to auto-bind one.

That's the preamble.

The problem is that netlink_autobind() _does_not_ guarantees
that the socket will be auto-bound to the current pid. Instead
it uses the current pid as a hint to start looking for a free
id. So, in case of conflict, the audit messages can be sent
to a wrong socket. This can happen (it's unlikely, but can be)
in case some task opens more than one netlink sockets and then
the audit one starts - in this case the audit's pid can be busy
and its socket will be bound to another id.

The proposal is to introduce an audit_nlk_pid in audit subsys,
that will point to the netlink socket to send packets to. It
will most often be equal to audit_pid. The socket id can be
got from the skb's netlink CB right in the audit_receive_msg.
The audit_nlk_pid reset to 0 is not required, since all the
decisions are taken based on audit_pid value only.

Later, if the audit tools will bind the socket themselves, the
kernel will have to provide a way to setup the audit_nlk_pid
as well.

A good side effect of this patch is that audit_pid can later
be converted to struct pid, as it is not longer safe to use
pid_t-s in the presence of pid namespaces. But audit code still
uses the tgid from task_struct in the audit_signal_info and in
the audit_filter_syscall.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix permissions of /proc/net

commit e9720ac ([NET]: Make /proc/net a symlink on /proc/self/net (v3))
broke ganglia and probably other applications that read /proc/net/dev.

This is due to the change of permissions of /proc/net that was
introduced in that commit.

Before: dr-xr-xr-x 5 root root 0 Mar 19 11:30 /proc/net
After: dr-xr--r-- 5 root root 0 Mar 19 11:29 /proc/self/net

This patch restores the permissions to the old value which makes
ganglia happy again.

Pavel Emelyanov says:

This also broke the postfix, as it was reported in bug #10286
and described in detail by Benjamin.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Fix a race between module load and protosw access

There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization. Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: ipt_recent: sanity check hit count

If a rule using ipt_recent is created with a hit count greater than
ip_pkt_list_tot, the rule will never match as it cannot keep track
of enough timestamps. This patch makes ipt_recent refuse to create such
rules.

With ip_pkt_list_tot's default value of 20, the following can be used
to reproduce the problem.

nc -u -l 0.0.0.0 1234 &
for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

This limits it to 20 packets:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 20 --name test --rsource -j DROP

While this is unlimited:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 21 --name test --rsource -j DROP

With the patch the second rule-set will throw an EINVAL.

Reported-by: Sean Kennedy <skennedy@vcn.com>
Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()

logical-bitwise & confusion

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Add debugging names to __RW_LOCK_UNLOCKED macros.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:

drivers/net/wireless/rt2x00/rt2x00dev.c
net/8021q/vlan_dev.c

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

[IPV4]: esp_output() misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[8021Q]: vlan_dev misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: ->eth_proto is __be16

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: ipv4_is_lbcast() misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SUNRPC]: net/* NULL noise

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: fix misannotated __sctp_rcv_asconf_lookup()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PKT_SCHED]: annotate cls_u32

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET] endianness noise: INADDR_ANY

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  ahci: Add Marvell 6121 SATA support
  pata_ali: use atapi_cmd_type() to determine cmd type instead of transfer size
  ahci: implement skip_host_reset parameter
  ahci: request all PCI BARs
  devres: implement pcim_iomap_regions_request_all()
  libata-acpi: improve dock event handling

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: fix race in enable_cb
  virtio: Enable netpoll interface for netconsole logging
  virtio: handle > 2 billion page balloon targets
  virtio: Fix sysfs bits to have proper block symlink
  virtio: Use spin_lock_irqsave/restore for virtio-pci

hfs_bnode_find() can fail, resulting in hfs_bnode_split() breakage

oops and fs corruption; the latter can happen even on valid fs in case of oom.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ahci: Add Marvell 6121 SATA support

Signed-off-by: Jose Alberto Reguero <jareguero@telefonica.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

pata_ali: use atapi_cmd_type() to determine cmd type instead of transfer size

pata_ali was using qc->nbytes to determine whether a command is
data transfer type or not. As now qc->nbytes can be extended by
padding and draining buffers, these tests are not useful anymore.

Use atapi_cmd_type() instead.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ahci: implement skip_host_reset parameter

Under certain circumstances (SSP turned off by the BIOS) and for
debugging purposes, skipping global controller reset is helpful. Add
a kernel parameter for it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ahci: request all PCI BARs

ahci is often implemented with accompanying SFF compatible interface
and legacy IDE driver may attach to the legacy IO ports when the
controller is already claimed by ahci and vice-versa. This patch
makes ahci use pcim_iomap_regions_request_all() so that all IO regions
are claimed on attach.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

devres: implement pcim_iomap_regions_request_all()

Some drivers need to reserve all PCI BARs to prevent other drivers
misusing unoccupied BARs. pcim_iomap_regions_request_all() requests
all BARs and iomap specified BARs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

libata-acpi: improve dock event handling

Improve ACPI hotplug handling such that dock event is handled properly.

* Register handlers for dock events.

* Directly detach device on EJECT_REQUEST instead of signaling hotplug
  event.  This prevents libata from accessing severed controller
  and/or device.

* While at it, use named constants for ACPI events and move uevent
  signaling inside host lock.

Original patch and testing by Holger Macht.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Holger Macht <hmacht@suse.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ioc3.c: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
drivers/sn/ioc3.c | 22 +++++++++++-----------
1 files changed, 11 insertions(+), 11 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ucc_geth: use correct thread number for 10/100Mbps link

Use thread number of 1 for 10/100Mbps link instead of 4.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

igb: Correctly get protocol information

We can't look at the socket to get protocol information. We should
instead look directly at the packet, and hope there are no IPv6
option headers.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[IOC3] Fix section missmatch

LD drivers/net/built-in.o
WARNING: drivers/net/built-in.o(.text+0x3468): Section mismatch in reference fro
m the function ioc3_probe() to the function .devinit.text:ioc3_serial_probe()
The function ioc3_probe() references
the function __devinit ioc3_serial_probe().
This is often because ioc3_probe lacks a __devinit
annotation or the annotation of ioc3_serial_probe is wrong.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

2.6.25-rc4 de_stop_rxtx polling wrong

This untested patch _should_ fix:
"(net de2104x) Kernel panic with de2104x tulip driver on boot"
http://bugzilla.kernel.org/show_bug.cgi?id=3156

But the bug submitter isn't responding. Same fix has been applied
to tulip.c (several years ago) and uli526x.c (Feb 2008) drivers.

[ The panic reported in the bug report was removed in a recently
(march 2008) accepted patch from Ondrej Zary. ]

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

forcedeth: limit tx to 16

This is a critical patch which adds a workaround for a HW bug. The patch
will limit the number of outstanding tx packets to 16. Otherwise, the HW
could send out packets with bad checksums.

The driver will still setup the tx packets into the ring, however, will
only set the Valid bit on 16 packets at a time.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

3c501: Further coding style fixes

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

cxgb3: Fix transmit queue stop mechanism

The last change in the Tx queue stop mechanism opens a window
where the Tx queue might be stopped after pending credits
returned.

Tx credits are returned via a control message generated by the HW.
It returns tx credits on demand, triggered by a completion bit
set in selective transmit packet headers.

The current code can lead to the Tx queue stopped
with all pending credits returned, and the current frame
not triggering a credit return. The Tx queue will then never be
awaken.

The driver could alternatively request a completion for packets
that stop the queue. It's however safer at this point to go back
to the pre-existing behaviour.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

NEWEMAC: Add compatible "ibm,tah" to tah matching table

Add "ibm,tah" to the compatible matching table of the ibm_newemac
tah driver. The type "tah" is still preserved for compatibility reasons.
New dts files should use the compatible property though.

Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

rndis_host: fix transfer size negotiation

This patch should resolve a problem that's troubled support for
some RNDIS peripherals. It seems to have boiled down to using a
variable to establish transfer size limits before it was assigned,
which caused those devices to fallback to a default "jumbogram"
mode we don't support. Fix by assigning it earlier for RNDIS.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
[ cleanups ]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

NEWEMAC: fix support for pause packets

Problem Description and Fix
---------------------------
When a pause packet(with destination as reserved Multicast address) is
received by the EMAC hardware to control the flow of frames being
transmitted by it, it is dropped by the hardware unless the reserved
Multicast address is hashed in to the GAHT[1-4] registers. This code fix
adds the default reserved multicast address to the GAHT[1-4] registers
in the EMAC(s) present on the chip. The flow control with Pause packets
will only work if the following register bits are programmed in EMAC:
EMACx_MR1[APP] = 1
EMACx_RMR[BAE] = 1
EMACx_RMR[MAE] = 1

Behavior that may be observed in a running system
-------------------------------------------------
A host transferring data from a PPC based system may send a Pause packet
to the PPC EMAC requesting it to slow down the flow of packets. If the
default reserved multicast MAC address is not programmed into the
GAHT[1-4] registers this Pause packet will be dropped by PPC EMAC and no
Flow Control will be done.

Signed-off-by: Pravin M. Bathija <pbathija@amcc.com>
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

virtio: fix race in enable_cb

There is a race in virtio_net, dealing with disabling/enabling the callback.
I saw the following oops:

kernel BUG at /space/kvm/drivers/virtio/virtio_ring.c:218!
illegal operation: 0001 [#1] SMP
Modules linked in: sunrpc dm_mod
CPU: 2 Not tainted 2.6.25-rc1zlive-host-10623-gd358142-dirty #99
Process swapper (pid: 0, task: 000000000f85a610, ksp: 000000000f873c60)
Krnl PSW : 0404300180000000 00000000002b81a6 (vring_disable_cb+0x16/0x20)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000001 0000000010005800 0000000000000001
           000000000f3a0900 000000000f85a610 0000000000000000 0000000000000000
           0000000000000000 000000000f870000 0000000000000000 0000000000001237
           000000000f3a0920 000000000010ff74 00000000002846f6 000000000fa0bcd8
Krnl Code: 00000000002b819a: a7110001           tmll    %r1,1
           00000000002b819e: a7840004           brc     8,2b81a6
           00000000002b81a2: a7f40001           brc     15,2b81a4
          >00000000002b81a6: a51b0001           oill    %r1,1
           00000000002b81aa: 40102000           sth     %r1,0(%r2)
           00000000002b81ae: 07fe               bcr     15,%r14
           00000000002b81b0: eb7ff0380024       stmg    %r7,%r15,56(%r15)
           00000000002b81b6: a7f13e00           tmll    %r15,15872
Call Trace:
([<000000000fa0bcd0>] 0xfa0bcd0)
[<00000000002b8350>] vring_interrupt+0x5c/0x6c
[<000000000010ab08>] do_extint+0xb8/0xf0
[<0000000000110716>] ext_no_vtime+0x16/0x1a
[<0000000000107e72>] cpu_idle+0x1c2/0x1e0

The problem can be triggered with a high amount of host->guest traffic.
I think its the following race:

poll says netif_rx_complete
poll calls enable_cb
enable_cb opens the interrupt mask
a new packet comes, an interrupt is triggered----\
enable_cb sees that there is more work           |
enable_cb disables the interrupt                 |
       .                                         V
       .                            interrupt is delivered
       .                            skb_recv_done does atomic napi test, ok
some waiting                       disable_cb is called->check fails->bang!
       .
poll would do napi check
poll would do disable_cb

The fix is to let enable_cb not disable the interrupt again, but expect the
caller to do the cleanup if it returns false. In that case, the interrupt is
only disabled, if the napi test_set_bit was successful.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned up doco)

virtio: Enable netpoll interface for netconsole logging

Add a new poll_controller handler that the netpoll interface needs.

This enables netconsole logging from a kvm guest over the virtio
net interface.

Signed-off-by: Amit Shah <amitshah@gmx.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

virtio: handle > 2 billion page balloon targets

If the host asks for a huge target towards_target() can overflow, and
we up oops as we try to release more pages than we have. The simple
fix is to use a 64-bit value.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

virtio: Fix sysfs bits to have proper block symlink

Fix up so that the virtio_blk devices in sysfs link correctly to their
block device. This then allows them to be detected by hal, etc

Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

virtio: Use spin_lock_irqsave/restore for virtio-pci

virtio-pci acquires its spin lock in an interrupt context so it's necessary
to use spin_lock_irqsave/restore variants. This patch fixes guest SMP when
using virtio devices in KVM.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

drivers/net/atl1/atl1_main.c: remove unused variable

The variable update_rx is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
<+... when != i
- i = C;
...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

drivers/net/ipg.c: remove unused variable

The variable gig is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
<+... when != i
- i = C;
...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

epic100 endianness annotations and fixes

* "powerpc or sparc" is not the same as "big-endian", fix the ifdef
* since we tell the card to byteswap the descriptors on big-endian,
we ought to leave them host-endian...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ipg fix

spurious cpu_to_le64()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

more misannotations: ne2k-pci

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

fore2000 - fix misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

wan/farsync: copy_from_user() to iomem is wrong

kmalloc intermediate buffer(), do copy_from_user() + memcpy_toio()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

r6040 endianness fixes

pci_unmap_single() on little-endian address

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ixgbe: Increment version

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ixgbe: Add optional DCA infrastructure

82598 cards and up support DCA, which enables the chipset to warm
up the caches for upcoming payload data. This code makes the
driver plug into the CONFIG_DCA infrastructure that was merged
earlier.

Signed-off-by: Jeb Cramer <cramerj@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ixgbe: Introduce adaptive interrupt moderation

82598 can produce a formidable interrupt rate, and is largely
unusable without some form of moderation. The default behaviour
before this patch is to limit irq's to a reasonable number.
However, just like our other drivers we can reduce latency
for small packet-type traffic considerably by allowing the
irq rate to go up dynamically.

This patch introduces a simple irq moderation algorithm based
on traffic analysis. The driver will use more CPU to service
small packets quicker but will perform the same on bulk traffic
as the old code.

Signed-off-by: Ayyappan Veeraiyan <ayyappan.veeraiyan@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ixgbe: Introduce Multiqueue TX

Now that the irq vector code is in place, we can add the conditional
multiqueue TX code in the driver. This requires the optional
CONFIG_NETDEVICES_MULTIQUEUE=y and will not be enabled without
it.

Signed-off-by: Ayyappan Veeraiyan <ayyappan.veeraiyan@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Waskiewicz Jr, Peter P <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ixgbe: Introduce MSI-X queue vector code

This code abstracts the per-queue MSI-X interrupt vector into
a queue vector layer. This abstraction is needed since there can
be many more queues than available MSI-X vectors in a machine.

The MSI-X irq vectors are remapped to a shared queue vector which
can point to several (both RX and TX) hardware queues. The NAPI
algorithm then cleans the appropriate ring/queues on interrupt
or poll.

The remapping is a delicate and complex calculation to make sure
that we're not unbalancing the irq load, and spreads the irqs
as much as possible, and may combine RX and TX flows onto the
same queue vector.

This effectively enables receive flow hashing across vectors
and helps irq load balance across CPUs.

Signed-off-by: Ayyappan Veeraiyan <ayyappan.veeraiyan@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Waskiewicz Jr, Peter P <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

/drivers/net/atarilance.c replaced init_module&cleanup_module with module_init&module_exit

Replaced init_module and cleanup_module with static functions and module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

/drivers/net/at1700.c replaced init_module&cleanup_module with module_init&module_exit

Replaced init_module and cleanup_module with static functions and module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

/drivers/net/arcnet/com20020.c replaced init_module&cleanup_module with module_init&module_exit

Replaced init_module and cleanup_module with static functions and module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

/drivers/net/appletalk/cops.c replaced init_module&cleanup_module with module_init&module_exit

Replaced init_module and cleanup_module with static functions and module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

/drivers/net/8390.c replaced init_module&cleanup_module with module_init&module_exit

Replaced init_module and cleanup_module with static functions and module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

smc91x: make superh use default config V2

Removes superh board specific configuration from the header file. These boards
will instead be configured using platform data.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

smc91x: add insw/outsw to default config V2

This patch makes sure SMC_insw()/SMC_outsw() are defined for the
default configuration. Without this change BUG()s will be triggered
when using 16-bit only platform data and the default configuration.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

smc91x: introduce platform data flags V2

This patch introduces struct smc91x_platdata and modifies the driver so
bus width is checked during run time using SMC_nBIT() instead of
SMC_CAN_USE_nBIT.

V2 keeps static configuration lean using SMC_DYNAMIC_BUS_CONFIG.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

smc91x: pass along private data V2

Pass a private data pointer to macros and functions. This makes it easy
to later on make run time decisions. This patch does not change any logic.
These changes should be optimized away during compilation.

V2 changes the macro argument name from "priv" to "lp".

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

S2io: Version update for multiqueue and vlan patches

- Updated version number.

Signed-off-by: Surjit Reang <surjit.reang@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

S2io: Support for vlan_rx_kill_vid entry point

- Resubmit #3
- Added s2io_vlan_rx_kill_vid entry point function for unregistering vlan.
- Fix to aggregate vlan packets. IP offset is incremented by
4 bytes if the packet contains vlan header.

Signed-off-by: Surjit Reang <surjit.reang@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

S2io: Multiqueue network device support - FIFO selection based on L4 ports

- Resubmit #2
- Transmit fifo selection based on TCP/UDP ports.
- Added tx_steering_type loadable parameter for transmit fifo selection.
  0x0 NO_STEERING: Default FIFO is selected.
  0x1 TX_PRIORITY_STEERING: FIFO is selected based on skb->priority.
  0x2 TX_DEFAULT_STEERING: FIFO is selected based on L4 Ports.

Signed-off-by: Surjit Reang <surjit.reang@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>