]> err.no Git - linux-2.6/log
linux-2.6
18 years ago[IPX]: Endian bug in ipxrtr_route_packet()
Alexey Dobriyan [Sun, 11 Jun 2006 01:05:35 +0000 (18:05 -0700)]
[IPX]: Endian bug in ipxrtr_route_packet()

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Warn in __skb_trim if skb is paged
Herbert Xu [Fri, 9 Jun 2006 23:13:38 +0000 (16:13 -0700)]
[NET]: Warn in __skb_trim if skb is paged

It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: skb_trim audit
Herbert Xu [Fri, 9 Jun 2006 23:13:01 +0000 (16:13 -0700)]
[NET]: skb_trim audit

I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET] ppp: Remove unnecessary pskb_may_pull
Herbert Xu [Fri, 9 Jun 2006 23:11:27 +0000 (16:11 -0700)]
[NET] ppp: Remove unnecessary pskb_may_pull

In ppp_receive_nonmp_frame, we call pskb_may_pull(skb, skb->len) if the
tailroom is >= 124.  This is pointless because this pskb_may_pull is only
needed if the skb is non-linear.  However, if it is non-linear then the
tailroom would be zero.

So it can be safely removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Clean up skb_linearize
Herbert Xu [Fri, 9 Jun 2006 23:10:40 +0000 (16:10 -0700)]
[NET]: Clean up skb_linearize

The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Add netif_tx_lock
Herbert Xu [Fri, 9 Jun 2006 19:20:56 +0000 (12:20 -0700)]
[NET]: Add netif_tx_lock

Various drivers use xmit_lock internally to synchronise with their
transmission routines.  They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set.  This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire.  So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner.  The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly.  I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond.  It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission.  This is
unsafe as an IRQ can potentially wake up the queue.  So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: hashlimit match: fix random initialization
Patrick McHardy [Fri, 9 Jun 2006 19:18:47 +0000 (12:18 -0700)]
[NETFILTER]: hashlimit match: fix random initialization

hashlimit does:

        if (!ht->rnd)
                get_random_bytes(&ht->rnd, 4);

ignoring that 0 is also a valid random number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: recent match: missing refcnt initialization
Patrick McHardy [Fri, 9 Jun 2006 19:18:17 +0000 (12:18 -0700)]
[NETFILTER]: recent match: missing refcnt initialization

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: recent match: fix "sleeping function called from invalid context"
Patrick McHardy [Fri, 9 Jun 2006 19:17:41 +0000 (12:17 -0700)]
[NETFILTER]: recent match: fix "sleeping function called from invalid context"

create_proc_entry must not be called with locks held. Use a mutex
instead to protect data only changed in user context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add new packet controls to SELinux
James Morris [Fri, 9 Jun 2006 07:33:33 +0000 (00:33 -0700)]
[SECMARK]: Add new packet controls to SELinux

Add new per-packet access controls to SELinux, replacing the old
packet controls.

Packets are labeled with the iptables SECMARK and CONNSECMARK targets,
then security policy for the packets is enforced with these controls.

To allow for a smooth transition to the new controls, the old code is
still present, but not active by default.  To restore previous
behavior, the old controls may be activated at runtime by writing a
'1' to /selinux/compat_net, and also via the kernel boot parameter
selinux_compat_net.  Switching between the network control models
requires the security load_policy permission.  The old controls will
probably eventually be removed and any continued use is discouraged.

With this patch, the new secmark controls for SElinux are disabled by
default, so existing behavior is entirely preserved, and the user is
not affected at all.

It also provides a config option to enable the secmark controls by
default (which can always be overridden at boot and runtime).  It is
also noted in the kconfig help that the user will need updated
userspace if enabling secmark controls for SELinux and that they'll
probably need the SECMARK and CONNMARK targets, and conntrack protocol
helpers, although such decisions are beyond the scope of kernel
configuration.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add CONNSECMARK xtables target
James Morris [Fri, 9 Jun 2006 07:32:39 +0000 (00:32 -0700)]
[SECMARK]: Add CONNSECMARK xtables target

Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copyying security marks back from connections to packets.  This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.

A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNMARK, and then
restore the security mark from the connection to established and
related packets on that connection.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add secmark support to conntrack
James Morris [Fri, 9 Jun 2006 07:31:46 +0000 (00:31 -0700)]
[SECMARK]: Add secmark support to conntrack

Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add xtables SECMARK target
James Morris [Fri, 9 Jun 2006 07:30:57 +0000 (00:30 -0700)]
[SECMARK]: Add xtables SECMARK target

Add a SECMARK target to xtables, allowing the admin to apply security
marks to packets via both iptables and ip6tables.

The target currently handles SELinux security marking, but can be
extended for other purposes as needed.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add secmark support to core networking.
James Morris [Fri, 9 Jun 2006 07:29:17 +0000 (00:29 -0700)]
[SECMARK]: Add secmark support to core networking.

Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets.  This is similar to the nfmark
field, except is intended for implementing security policy, rather than than
networking policy.

This patch was already acked in principle by Dave Miller.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add SELinux exports
James Morris [Fri, 9 Jun 2006 07:28:25 +0000 (00:28 -0700)]
[SECMARK]: Add SELinux exports

Add and export new functions to the in-kernel SELinux API in support of the
new secmark-based packet controls.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SECMARK]: Add new flask definitions to SELinux
James Morris [Fri, 9 Jun 2006 07:27:28 +0000 (00:27 -0700)]
[SECMARK]: Add new flask definitions to SELinux

Secmark implements a new scheme for adding security markings to
packets via iptables, as well as changes to SELinux to use these
markings for security policy enforcement.  The rationale for this
scheme is explained and discussed in detail in the original threads:

 http://thread.gmane.org/gmane.linux.network/34927/
 http://thread.gmane.org/gmane.linux.network/35244/

Examples of policy and rulesets, as well as a full archive of patches
for iptables and SELinux userland, may be found at:

http://people.redhat.com/jmorris/selinux/secmark/

The code has been tested with various compilation options and in
several scenarios, including with 'complicated' protocols such as FTP
and also with the new generic conntrack code with IPv6 connection
tracking.

This patch:

Add support for a new object class ('packet'), and associated
permissions ('send', 'recv', 'relabelto').  These are used to enforce
security policy for network packets labeled with SECMARK, and for
adding labeling rules.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SELINUX]: add security class for appletalk sockets
Christopher J. PeBenito [Fri, 9 Jun 2006 07:25:03 +0000 (00:25 -0700)]
[SELINUX]: add security class for appletalk sockets

Add a security class for appletalk sockets so that they can be
distinguished in SELinux policy.  Please apply.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Fix warnings after LSM-IPSEC changes.
David S. Miller [Fri, 9 Jun 2006 06:58:52 +0000 (23:58 -0700)]
[NET]: Fix warnings after LSM-IPSEC changes.

Assignment used as truth value in xfrm_del_sa()
and xfrm_get_policy().

Wrong argument type declared for security_xfrm_state_delete()
when SELINUX is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: NET_TCPPROBE Kconfig fix
Dave Jones [Fri, 9 Jun 2006 06:42:09 +0000 (23:42 -0700)]
[NET]: NET_TCPPROBE Kconfig fix

Just spotted this typo in a new option.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LSM-IPsec]: SELinux Authorize
Catherine Zhang [Fri, 9 Jun 2006 06:39:49 +0000 (23:39 -0700)]
[LSM-IPsec]: SELinux Authorize

This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations.  In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
assocations with security contexts.  Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleteing policies with security contexts.  To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally.  The hook is called on deletion when no context is
present, which we may want to change.  At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete.  The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts.  The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[CONNECTOR]: Fix warning in cn_queue.c
Andreas Schwab [Tue, 6 Jun 2006 04:21:57 +0000 (21:21 -0700)]
[CONNECTOR]: Fix warning in cn_queue.c

cn_queue.c:130: warning: value computed is not used

There is no point in testing the atomic value if the result is thrown
away.

From Evgeniy:

It was created to put implicit smp barrier, but it is not needed there.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4] icmp: Kill local 'ip' arg in icmp_redirect().
David S. Miller [Tue, 6 Jun 2006 04:19:24 +0000 (21:19 -0700)]
[IPV4] icmp: Kill local 'ip' arg in icmp_redirect().

It is typed wrong, and it's only assigned and used once.
So just pass in iph->daddr directly which fixes both problems.

Based upon a patch by Alexey Dobriyan.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Right prototype of __raw_v4_lookup()
Alexey Dobriyan [Tue, 6 Jun 2006 04:06:41 +0000 (21:06 -0700)]
[IPV4]: Right prototype of __raw_v4_lookup()

All users pass 32-bit values as addresses and internally they're
compared with 32-bit entities. So, change "laddr" and "raddr" types to
__be32.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4] igmp: Fixup struct ip_mc_list::multiaddr type
Alexey Dobriyan [Tue, 6 Jun 2006 04:04:39 +0000 (21:04 -0700)]
[IPV4] igmp: Fixup struct ip_mc_list::multiaddr type

All users except two expect 32-bit big-endian value. One is of

->multiaddr = ->multiaddr

variety. And last one is "%08lX".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Fix compile warning in tcp_probe.c
David S. Miller [Tue, 6 Jun 2006 00:59:20 +0000 (17:59 -0700)]
[TCP]: Fix compile warning in tcp_probe.c

The suseconds_t et al. are not necessarily any particular type on
every platform, so cast to unsigned long so that we can use one printf
format string and avoid warnings across the board

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Limited slow start for Highspeed TCP
Stephen Hemminger [Tue, 6 Jun 2006 00:30:56 +0000 (17:30 -0700)]
[TCP]: Limited slow start for Highspeed TCP

Implementation of RFC3742 limited slow start. Added as part
of the TCP highspeed congestion control module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Probe congestion window tracing
Stephen Hemminger [Tue, 6 Jun 2006 00:30:32 +0000 (17:30 -0700)]
[TCP]: TCP Probe congestion window tracing

This adds a new module for tracking TCP state variables non-intrusively
using kprobes.  It has a simple /proc interface that outputs one line
for each packet received. A sample usage is to collect congestion
window and ssthresh over time graphs.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Minimum congestion window consolidation.
Stephen Hemminger [Tue, 6 Jun 2006 00:30:08 +0000 (17:30 -0700)]
[TCP]: Minimum congestion window consolidation.

Many of the TCP congestion methods all just use ssthresh
as the minimum congestion window on decrease.  Rather than
duplicating the code, just have that be the default if that
handle in the ops structure is not set.

Minor behaviour change to TCP compound.  It probably wants
to use this (ssthresh) as lower bound, rather than ssthresh/2
because the latter causes undershoot on loss.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Compound quad root function
Stephen Hemminger [Tue, 6 Jun 2006 00:29:39 +0000 (17:29 -0700)]
[TCP]: TCP Compound quad root function

The original code did a 64 bit divide directly, which won't work on
32 bit platforms.  Rather than doing a 64 bit square root twice,
just implement a 4th root function in one pass using Newton's method.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Compound congestion control
Angelo P. Castellani [Tue, 6 Jun 2006 00:29:09 +0000 (17:29 -0700)]
[TCP]: TCP Compound congestion control

TCP Compound is a sender-side only change to TCP that uses
a mixed Reno/Vegas approach to calculate the cwnd.

For further details look here:
  ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf

Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Veno congestion control
Bin Zhou [Tue, 6 Jun 2006 00:28:30 +0000 (17:28 -0700)]
[TCP]: TCP Veno congestion control

TCP Veno module is a new congestion control module to improve TCP
performance over wireless networks. The key innovation in TCP Veno is
the enhancement of TCP Reno/Sack congestion control algorithm by using
the estimated state of a connection based on TCP Vegas. This scheme
significantly reduces "blind" reduction of TCP window regardless of
the cause of packet loss.

This work is based on the research paper "TCP Veno: TCP Enhancement
for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew,
IEEE Journal on Selected Areas in Communication, Feb. 2003.

Original paper and many latest research works on veno:
 http://www.ntu.edu.sg/home/ascpfu/veno/veno.html

Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg>
       Cheng Peng Fu <ascpfu@ntu.edu.sg>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Low Priority congestion control
Wong Hoi Sing Edison [Tue, 6 Jun 2006 00:27:58 +0000 (17:27 -0700)]
[TCP]: TCP Low Priority congestion control

 TCP Low Priority is a distributed algorithm whose goal is to utilize only
 the excess network bandwidth as compared to the ``fair share`` of
 bandwidth as targeted by TCP. Available from:
   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

Original Author:
 Aleksandar Kuzmanovic <akuzma@northwestern.edu>

See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitation of the API, we take the following changes from
the original TCP-LP implementation:
 o We use newReno in most core CA handling. Only add some checking
   within cong_avoid.
 o Error correcting in remote HZ, therefore remote HZ will be keeped
   on checking and updating.
 o Handling calculation of One-Way-Delay (OWD) within rtt_sample, sicne
   OWD have a similar meaning as RTT. Also correct the buggy formular.
 o Handle reaction for Early Congestion Indication (ECI) within
   pkts_acked, as mentioned within pseudo code.
 o OWD is handled in relative format, where local time stamp will in
   tcp_time_stamp format.

Port from 2.4.19 to 2.6.16 as module by:
 Wong Hoi Sing Edison <hswong3i@gmail.com>
 Hung Hing Lun <hlhung3i@gmail.com>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: Fix double receive of SKB.
Andrew Morton [Fri, 2 Jun 2006 23:29:20 +0000 (16:29 -0700)]
[LLC]: Fix double receive of SKB.

Oops fix from Stephen: remove duplicate rcv() calls.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type
Alexey Dobriyan [Tue, 30 May 2006 01:27:32 +0000 (18:27 -0700)]
[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type

GRE keys are 16-bit wide.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: Add SIP connection tracking helper
Patrick McHardy [Tue, 30 May 2006 01:27:09 +0000 (18:27 -0700)]
[NETFILTER]: Add SIP connection tracking helper

Add SIP connection tracking helper. Originally written by
Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor
fixes and bidirectional SIP support added by myself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic
Patrick McHardy [Tue, 30 May 2006 01:26:47 +0000 (18:26 -0700)]
[NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic

Call Forwarding doesn't need to create an expectation if both peers can
reach each other without our help. The internal_net_addr parameter
lets the user explicitly specify a single network where this is true,
but is not very flexible and even fails in the common case that calls
will both be forwarded to outside parties and inside parties. Use an
optional heuristic based on routing instead, the assumption is that
if bpth the outgoing device and the gateway are equal, both peers can
reach each other directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: H.323 helper: Add support for Call Forwarding
Jing Min Zhao [Tue, 30 May 2006 01:26:27 +0000 (18:26 -0700)]
[NETFILTER]: H.323 helper: Add support for Call Forwarding

Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: amanda helper: convert to textsearch infrastructure
Patrick McHardy [Tue, 30 May 2006 01:25:58 +0000 (18:25 -0700)]
[NETFILTER]: amanda helper: convert to textsearch infrastructure

When a port number within a packet is replaced by a differently sized
number only the packet is resized, but not the copy of the data.
Following port numbers are rewritten based on their offsets within
the copy, leading to packet corruption.

Convert the amanda helper to the textsearch infrastructure to avoid
the copy entirely.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: FTP helper: search optimization
Patrick McHardy [Tue, 30 May 2006 01:25:38 +0000 (18:25 -0700)]
[NETFILTER]: FTP helper: search optimization

Instead of skipping search entries for the wrong direction simply index
them by direction.

Based on patch by Pablo Neira <pablo@netfilter.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: SNMP helper: fix debug module param type
Patrick McHardy [Tue, 30 May 2006 01:25:14 +0000 (18:25 -0700)]
[NETFILTER]: SNMP helper: fix debug module param type

debug is the debug level, not a bool.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ctnetlink: change table dumping not to require an unique ID
Patrick McHardy [Tue, 30 May 2006 01:24:58 +0000 (18:24 -0700)]
[NETFILTER]: ctnetlink: change table dumping not to require an unique ID

Instead of using the ID to find out where to continue dumping, take a
reference to the last entry dumped and try to continue there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ctnetlink: fix NAT configuration
Patrick McHardy [Tue, 30 May 2006 01:24:39 +0000 (18:24 -0700)]
[NETFILTER]: ctnetlink: fix NAT configuration

The current configuration only allows to configure one manip and overloads
conntrack status flags with netlink semantic.

Signed-off-by: Patrick Mchardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: conntrack: add fixed timeout flag in connection tracking
Eric Leblond [Tue, 30 May 2006 01:24:20 +0000 (18:24 -0700)]
[NETFILTER]: conntrack: add fixed timeout flag in connection tracking

Add a flag in a connection status to have a non updated timeout.
This permits to have connection that automatically die at a given
time.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: conntrack: add sysctl to disable checksumming
Patrick McHardy [Tue, 30 May 2006 01:23:54 +0000 (18:23 -0700)]
[NETFILTER]: conntrack: add sysctl to disable checksumming

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: conntrack: don't call helpers for related ICMP messages
Patrick McHardy [Tue, 30 May 2006 01:21:53 +0000 (18:21 -0700)]
[NETFILTER]: conntrack: don't call helpers for related ICMP messages

None of the existing helpers expects to get called for related ICMP
packets and some even drop them if they can't parse them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: recent match: replace by rewritten version
Patrick McHardy [Tue, 30 May 2006 01:21:34 +0000 (18:21 -0700)]
[NETFILTER]: recent match: replace by rewritten version

Replace the unmaintainable ipt_recent match by a rewritten version that
should be fully compatible.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: x_tables: add statistic match
Patrick McHardy [Tue, 30 May 2006 01:21:00 +0000 (18:21 -0700)]
[NETFILTER]: x_tables: add statistic match

Add statistic match which is a combination of the nth and random matches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: x_tables: add quota match
Patrick McHardy [Tue, 30 May 2006 01:20:32 +0000 (18:20 -0700)]
[NETFILTER]: x_tables: add quota match

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: x_tables: add SCTP/DCCP support where missing
Patrick McHardy [Tue, 30 May 2006 01:19:56 +0000 (18:19 -0700)]
[NETFILTER]: x_tables: add SCTP/DCCP support where missing

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: x_tables: remove some unnecessary casts
Patrick McHardy [Tue, 30 May 2006 01:19:19 +0000 (18:19 -0700)]
[NETFILTER]: x_tables: remove some unnecessary casts

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPSEC] xfrm: Use IPPROTO_MAX instead of 256
Herbert Xu [Sun, 28 May 2006 06:06:33 +0000 (23:06 -0700)]
[IPSEC] xfrm: Use IPPROTO_MAX instead of 256

The size of the type_map array (256) comes from the number of IP protocols,
i.e., IPPROTO_MAX.  This patch is based on a suggestion from Ingo Oeser.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPSEC] proto: Move transport mode input path into xfrm_mode_transport
Herbert Xu [Sun, 28 May 2006 06:06:13 +0000 (23:06 -0700)]
[IPSEC] proto: Move transport mode input path into xfrm_mode_transport

Now that we have xfrm_mode objects we can move the transport mode specific
input decapsulation code into xfrm_mode_transport.  This removes duplicate
code as well as unnecessary header movement in case of tunnel mode SAs
since we will discard the original IP header immediately.

This also fixes a minor bug for transport-mode ESP where the IP payload
length is set to the correct value minus the header length (with extension
headers for IPv6).

Of course the other neat thing is that we no longer have to allocate
temporary buffers to hold the IP headers for ESP and IPComp.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPSEC] xfrm: Abstract out encapsulation modes
Herbert Xu [Sun, 28 May 2006 06:05:54 +0000 (23:05 -0700)]
[IPSEC] xfrm: Abstract out encapsulation modes

This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPSEC] xfrm: Undo afinfo lock proliferation
Herbert Xu [Sun, 28 May 2006 06:03:58 +0000 (23:03 -0700)]
[IPSEC] xfrm: Undo afinfo lock proliferation

The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: update version and reldate
Michael Chan [Sun, 18 Jun 2006 04:28:28 +0000 (21:28 -0700)]
[TG3]: update version and reldate

Update version to 3.60.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: Add recovery logic when MMIOs are re-ordered
Michael Chan [Sat, 27 May 2006 00:48:07 +0000 (17:48 -0700)]
[TG3]: Add recovery logic when MMIOs are re-ordered

Add recovery logic when we suspect that the system is re-ordering
MMIOs. Re-ordered MMIOs to the send mailbox can cause bogus tx
completions and hit BUG_ON() in the tx completion path.

tg3 already has logic to handle re-ordered MMIOs by flushing the MMIOs
that must be strictly ordered (such as the send mailbox).  Determining
when to enable the flush is currently a manual process of adding known
chipsets to a list.

The new code replaces the BUG_ON() in the tx completion path with the
call to tg3_tx_recover(). It will set the TG3_FLAG_MBOX_WRITE_REORDER
flag and reset the chip later in the workqueue to recover and start
flushing MMIOs to the mailbox.

A message to report the problem will be printed. We will then decide
whether or not to add the host bridge to the list of chipsets that do
re-ordering.

We may add some additional code later to print the host bridge's ID so
that the user can report it more easily.

The assumption that re-ordering can only happen on x86 systems is also
removed.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TG3]: Add 5786 PCI ID
Michael Chan [Sat, 27 May 2006 00:44:45 +0000 (17:44 -0700)]
[TG3]: Add 5786 PCI ID

Add PCI ID for BCM5786 which is a variant of 5787.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IRDA]: ali-ircc: using device model power management
Samuel Ortiz [Thu, 25 May 2006 23:21:10 +0000 (16:21 -0700)]
[IRDA]: ali-ircc: using device model power management

This patch gets rid of the old power management code and now uses the
device model for the ali-ircc driver.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IRDA]: stir4200, switching to the kthread API
Christoph Hellwig [Thu, 25 May 2006 23:20:19 +0000 (16:20 -0700)]
[IRDA]: stir4200, switching to the kthread API

stir4200 uses a kernel thread for its TX/RX operations, and it is now
converted to the kernel kthread API.
Tested on an STIR4200 based dongle.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IRDA]: Initial support for MCS7780 based dongles
Samuel Ortiz [Thu, 25 May 2006 23:19:22 +0000 (16:19 -0700)]
[IRDA]: Initial support for MCS7780 based dongles

The MosChip MCS7780 chipset is an IrDA USB bridge that
doesn't conform with the IrDA-USB standard and thus needs
its separate driver.
Tested on an actual MCS7780 based dongle.

Original implementation by Brian Pugh <bpugh@cs.pdx.edu>

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous
David S. Miller [Thu, 25 May 2006 23:11:14 +0000 (16:11 -0700)]
[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous

We only want to take receive RTT mesaurements for data
bearing frames, here in the header prediction fast path
for a pure-sender, we know that we have a pure-ACK and
thus the checks in tcp_rcv_rtt_mesaure_ts() will not pass.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: netlink interface for link management
Stephen Hemminger [Thu, 25 May 2006 23:00:12 +0000 (16:00 -0700)]
[BRIDGE]: netlink interface for link management

Add basic netlink support to the Ethernet bridge. Including:
 * dump interfaces in bridges
 * monitor link status changes
 * change state of bridge port

For some demo programs see:
http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz

These are to allow building a daemon that does alternative
implementations of Spanning Tree Protocol.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: fix module startup error handling
Stephen Hemminger [Thu, 25 May 2006 22:59:33 +0000 (15:59 -0700)]
[BRIDGE]: fix module startup error handling

Return address in use, if some other kernel code has the SAP.
Propogate out error codes from netfilter registration and unwind.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: optimize conditional in forward path
Stephen Hemminger [Thu, 25 May 2006 22:58:54 +0000 (15:58 -0700)]
[BRIDGE]: optimize conditional in forward path

Small optimizations of bridge forwarding path.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: add multicast support for datagrams
Stephen Hemminger [Thu, 25 May 2006 22:10:37 +0000 (15:10 -0700)]
[LLC]: add multicast support for datagrams

Allow mulitcast reception of datagrams (similar to UDP).
All sockets bound to the same SAP receive a clone.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: allow applications to get copy of kernel datagrams
Stephen Hemminger [Thu, 25 May 2006 22:10:02 +0000 (15:10 -0700)]
[LLC]: allow applications to get copy of kernel datagrams

It is legal for an application to bind to a SAP that is also being
used by the kernel. This happens if the bridge module binds to the
STP SAP, and the user wants to have a daemon for STP as well.
It is possible to have kernel doing STP on one bridge, but
let application do RSTP on another bridge.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: use rcu_dereference on receive handler
Stephen Hemminger [Thu, 25 May 2006 22:09:37 +0000 (15:09 -0700)]
[LLC]: use rcu_dereference on receive handler

The receive hander pointer might be modified during network changes
of protocol. So use rcu_dereference (only matters on alpha).

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: allow datagram recvmsg
Stephen Hemminger [Thu, 25 May 2006 22:08:59 +0000 (15:08 -0700)]
[LLC]: allow datagram recvmsg

LLC receive is broken for SOCK_DGRAM.
If an application does recv() on a datagram socket and there
is no data present, don't return "not connected". Instead, just
do normal datagram semantics.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[LLC]: use more efficient ether address routines
Stephen Hemminger [Thu, 25 May 2006 22:08:30 +0000 (15:08 -0700)]
[LLC]: use more efficient ether address routines

Use more cache efficient Ethernet address manipulation functions
in etherdevice.h.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
18 years ago[I/OAT]: Do not use for_each_cpu().
Andrew Morton [Thu, 25 May 2006 20:26:53 +0000 (13:26 -0700)]
[I/OAT]: Do not use for_each_cpu().

for_each_cpu() is going away (and is gone in -mm).

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: TCP recv offload to I/OAT
Chris Leech [Wed, 24 May 2006 01:05:53 +0000 (18:05 -0700)]
[I/OAT]: TCP recv offload to I/OAT

Locks down user pages and sets up for DMA in tcp_recvmsg, then calls
dma_async_try_early_copy in tcp_v4_do_rcv

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold
Chris Leech [Wed, 24 May 2006 01:02:55 +0000 (18:02 -0700)]
[I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold

Any socket recv of less than this ammount will not be offloaded

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Make sk_eat_skb I/OAT aware.
Chris Leech [Wed, 24 May 2006 01:01:28 +0000 (18:01 -0700)]
[I/OAT]: Make sk_eat_skb I/OAT aware.

Add an extra argument to sk_eat_skb, and make it move early copied
packets to the async_wait_queue instead of freeing them.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static
Chris Leech [Wed, 24 May 2006 01:00:16 +0000 (18:00 -0700)]
[I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static

Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Structure changes for TCP recv offload to I/OAT
Chris Leech [Wed, 24 May 2006 00:55:33 +0000 (17:55 -0700)]
[I/OAT]: Structure changes for TCP recv offload to I/OAT

Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Utility functions for offloading sk_buff to iovec copies
Chris Leech [Wed, 24 May 2006 00:50:37 +0000 (17:50 -0700)]
[I/OAT]: Utility functions for offloading sk_buff to iovec copies

Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Setup the networking subsystem as a DMA client
Chris Leech [Sun, 18 Jun 2006 04:24:58 +0000 (21:24 -0700)]
[I/OAT]: Setup the networking subsystem as a DMA client

Attempts to allocate per-CPU DMA channels

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Move PCI_DEVICE_ID_INTEL_IOAT to linux/pci_ids.h
David S. Miller [Wed, 24 May 2006 00:39:49 +0000 (17:39 -0700)]
[I/OAT]: Move PCI_DEVICE_ID_INTEL_IOAT to linux/pci_ids.h

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: ioatdma.c needs linux/dma-mapping.h
David S. Miller [Wed, 24 May 2006 00:37:58 +0000 (17:37 -0700)]
[I/OAT]: ioatdma.c needs linux/dma-mapping.h

For DMA_*_MASK defines.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: Driver for the Intel(R) I/OAT DMA engine
Chris Leech [Wed, 24 May 2006 00:35:34 +0000 (17:35 -0700)]
[I/OAT]: Driver for the Intel(R) I/OAT DMA engine

Adds a new ioatdma driver

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[I/OAT]: DMA memcpy subsystem
Chris Leech [Wed, 24 May 2006 00:18:44 +0000 (17:18 -0700)]
[I/OAT]: DMA memcpy subsystem

Provides an API for offloading memory copies to DMA devices

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoLinux v2.6.17 v2.6.17
Linus Torvalds [Sun, 18 Jun 2006 01:49:35 +0000 (18:49 -0700)]
Linux v2.6.17

Being named "Crazed Snow-Weasel" instills a lot of confidence in this
release, so I'm sure this will be one of the better ones.

18 years ago[PATCH] powerpc: enable CPU_FTR_CI_LARGE_PAGE for cell
Arnd Bergmann [Thu, 15 Jun 2006 13:09:16 +0000 (15:09 +0200)]
[PATCH] powerpc: enable CPU_FTR_CI_LARGE_PAGE for cell

Reflect the fact that the Cell Broadband Engine supports 64k
pages by adding the bit to the CPU features.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] powerpc: Fix 64k pages on non-partitioned machines
Arnd Bergmann [Thu, 15 Jun 2006 11:15:44 +0000 (21:15 +1000)]
[PATCH] powerpc: Fix 64k pages on non-partitioned machines

The page size encoding passed to tlbie is incorrect for new-style
large pages.  This fixes it.  This doesn't affect anything on older
machines because mmu_psize_defs[psize].penc (the page size encoding)
is 0 for 4k and 16M pages (the two are distinguished by a separate "is
a large page" bit).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] arm_timer: remove a racy and obsolete PF_EXITING check
Oleg Nesterov [Thu, 15 Jun 2006 16:12:02 +0000 (20:12 +0400)]
[PATCH] arm_timer: remove a racy and obsolete PF_EXITING check

arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state)
in run_posix_cpu_timers().

However, for some reason it does so only for CPUCLOCK_PERTHREAD
case (which is imho wrong).

Also, this check is not reliable, PF_EXITING could be set on
another cpu without any locks/barriers just after the check,
so it can't prevent from attaching the timer to the exiting
task.

The previous patch makes this check unneeded.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()
Oleg Nesterov [Thu, 15 Jun 2006 16:11:43 +0000 (20:11 +0400)]
[PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()

do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu to attach the timer to exiting process after that.
arm_timer() tries to protect against this race, but the check
is racy.

After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule() local timer interrupt can find
tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4) interrupted task has ->signal == NULL.

At this moment exiting task has no pending cpu timers, they were
cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(),
so we can just return from irq.

John Stultz recently confirmed this bug, see

http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] check_process_timers: fix possible lockup
Oleg Nesterov [Thu, 15 Jun 2006 16:11:15 +0000 (20:11 +0400)]
[PATCH] check_process_timers: fix possible lockup

If the local timer interrupt happens just after do_exit() sets PF_EXITING
(and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call
check_process_timers() with tasklist_lock + ->siglock held and

check_process_timers:

t = tsk;
do {
....

do {
t = next_thread(t);
} while (unlikely(t->flags & PF_EXITING));
} while (t != tsk);

the outer loop will never stop.

Actually, the window is bigger.  Another process can attach the timer
after ->it_xxx_expires was cleared (see the next commit) and the 'if
(PF_EXITING)' check in arm_timer() is racy (see the one after that).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: netconsole suspend/resume interaction
Stephen Hemminger [Fri, 16 Jun 2006 19:10:46 +0000 (12:10 -0700)]
[PATCH] sky2: netconsole suspend/resume interaction

A couple of fixes that should prevent crashes when using netconsole and
suspend/resume. First, netconsole poll routine shouldn't run unless the
device is up; second, the NAPI poll should be disabled during suspend.

This is only an issue on sky2, because it has to have one NAPI poll
routine for both ports on dual port boards. Normal drivers use
netif_rx_schedule_prep and that checks for netif_running.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Fix missing ret assignment in __bio_map_user() error path
Jens Axboe [Fri, 16 Jun 2006 11:02:29 +0000 (13:02 +0200)]
[PATCH] Fix missing ret assignment in __bio_map_user() error path

If get_user_pages() returns less pages than what we asked for, we jump
to out_unmap which will return ERR_PTR(ret).  But ret can contain a
positive number just smaller than local_nr_pages, so be sure to set it
to -EFAULT always.

Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix cdrom open
Jens Axboe [Fri, 16 Jun 2006 05:46:37 +0000 (07:46 +0200)]
[PATCH] fix cdrom open

Some time ago the cdrom open routine was changed so that we call the
driver's open routine before checking to see if it is read only.  However,
if we discovered that a read write open was not possible and the open
flags required a writable open, we just returned -EROFS without calling
the driver's release routine.   This seems to work for most cdrom drivers,
but breaks the Powerpc iSeries virtual cdrom rather badly.

This just inserts the release call in the error path to balance the call
to "->open()" done by "open_for_data()".

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cfq-iosched: fix crash in do_div()
Jens Axboe [Wed, 14 Jun 2006 17:11:57 +0000 (19:11 +0200)]
[PATCH] cfq-iosched: fix crash in do_div()

We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().

Just memset the entire cic instead of filling invididual values
independently.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Return error in case flock_lock_file failure
Kirill Korotaev [Wed, 14 Jun 2006 13:59:35 +0000 (17:59 +0400)]
[PATCH] Return error in case flock_lock_file failure

If flock_lock_file() failed to allocate flock with locks_alloc_lock()
then "error = 0" is returned. Need to return some non-zero.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: stop/start hardware idle timer on suspend/resume
Stephen Hemminger [Tue, 13 Jun 2006 08:17:31 +0000 (17:17 +0900)]
[PATCH] sky2: stop/start hardware idle timer on suspend/resume

The resume bug was caused not by an early interrupt but because the idle
timeout was not being stopped on suspend.  Also disable hardware IRQ's
on suspend.  Will need to revisit this with hotplug?

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: save/restore base hardware irq during suspend/resume
Stephen Hemminger [Tue, 13 Jun 2006 08:17:30 +0000 (17:17 +0900)]
[PATCH] sky2: save/restore base hardware irq during suspend/resume

The hardware should be fully shut off during suspend, and the base
irq mask restored during resume.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: fix hotplug detect during poll
Stephen Hemminger [Tue, 13 Jun 2006 08:17:29 +0000 (17:17 +0900)]
[PATCH] sky2: fix hotplug detect during poll

If the poll routine detects no hardware available, it needs to dequeue
it self from the network poll list. Linus didn't understand NAPI.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: don't hard code number of ports
Stephen Hemminger [Tue, 13 Jun 2006 08:17:28 +0000 (17:17 +0900)]
[PATCH] sky2: don't hard code number of ports

It is cleaner, to not loop over both ports if only one exists.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sky2: set_power_state should be void
Stephen Hemminger [Tue, 13 Jun 2006 08:17:27 +0000 (17:17 +0900)]
[PATCH] sky2: set_power_state should be void

The set power state function is cleaner if it doesn't return anything.
The only caller that could fail is in suspend() and it can check the argument
there.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] alpha: generic hweight build fix
Randy Dunlap [Mon, 12 Jun 2006 22:13:40 +0000 (15:13 -0700)]
[PATCH] alpha: generic hweight build fix

From: Randy Dunlap <rdunlap@xenotime.net>

According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware
hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ernst Herzberg <earny@net4u.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir()
Sergey Vlasov [Mon, 12 Jun 2006 20:53:23 +0000 (21:53 +0100)]
[PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir()

shmem_rmdir() must undo the increment of i_nlink done in
shmem_get_inode() for directories, otherwise at least
IN_DELETE_SELF inotify event generation is broken.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] tmpfs: time granularity fix for [acm]time going backwards
Robin H. Johnson [Mon, 12 Jun 2006 20:50:25 +0000 (21:50 +0100)]
[PATCH] tmpfs: time granularity fix for [acm]time going backwards

I noticed a strange behavior in a tmpfs file system the other day, while
building packages - occasionally, and seemingly at random, make decided to
rebuild a target. However, only on tmpfs.

A file would be created, and if checked, it had a sub-second timestamp.
However, after an utimes related call where sub-seconds should be set, they
were zeroed instead. In the case that a file was created, and utimes(...,NULL)
was used on it in the same second, the timestamp on the file moved backwards.

After some digging, I found that this was being caused by tmpfs not having a
time granularity set, thus inheriting the default 1 second granularity.

Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11.
Unfortunately, the granularity of CURRENT_TIME, often used in filesystems,
does not match the default granularity set by alloc_super.  A few more such
discrepancies have been found, but this is the most important to fix now.

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>