Herbert Xu [Tue, 20 Jun 2006 06:57:59 +0000 (23:57 -0700)]
[NET]: Prevent multiple qdisc runs
Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued. It
appears that this is an unintended consequence of relinquinshing the queue
lock while transmitting. That in turn is needed for devices that spend a
lot of time in their transmit routine.
There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).
Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.
The solution here is to steal a bit from net_device to prevent this.
BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 20 Jun 2006 02:07:12 +0000 (19:07 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (51 commits)
[MIPS] Make timer interrupt frequency configurable from kconfig.
[MIPS] Correct HAL2 Kconfig description
[MIPS] Fix R4K cache macro names
[MIPS] Add Missing R4K Cache Macros to IP27 & IP32
[MIPS] Support for the RM9000-based Basler eXcite smart camera platform.
[MIPS] Support for the R5500-based NEC EMMA2RH Mark-eins board
[MIPS] Support SNI RM200C SNI in big endian mode and R5000 processors.
[MIPS] SN: include asm/sn/types.h for nasid_t.
[MIPS] Random fixes for sb1250
[MIPS] Fix bcm1480 compile
[MIPS] Remove support for NEC DDB5476.
[MIPS] Remove support for NEC DDB5074.
[MIPS] Cleanup memory managment initialization.
[MIPS] SN: Declare bridge_pci_ops.
[MIPS] Remove unused function alloc_pci_controller.
[MIPS] IP27: Extract pci_ops into separate file.
[MIPS] IP27: Use symbolic constants instead of magic numbers.
[MIPS] vr41xx: remove unnecessay items from vr41xx/Kconfig.
[MIPS] IP27: Cleanup N/M mode configuration.
[MIPS] IP27: Throw away old unused hacks.
...
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
[ETHTOOL]: Fix UFO typo
[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
[SCTP]: Send only 1 window update SACK per message.
[SCTP]: Don't do CRC32C checksum over loopback.
[SCTP] Reset rtt_in_progress for the chunk when processing its sack.
[SCTP]: Reject sctp packets with broadcast addresses.
[SCTP]: Limit association max_retrans setting in setsockopt.
[PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
[IPV6]: Sum real space for RTAs.
[IRDA]: Use put_unaligned() in irlmp_do_discovery().
[BRIDGE]: Add support for NETIF_F_HW_CSUM devices
[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
[TG3]: Convert to non-LLTX
[TG3]: Remove unnecessary tx_lock
[TCP]: Add tcp_slow_start_after_idle sysctl.
[BNX2]: Update version and reldate
[BNX2]: Use CPU native page size
[BNX2]: Use compressed firmware
[BNX2]: Add firmware decompression
[BNX2]: Allow WoL settings on new 5708 chips
...
Manual fixup for conflict in drivers/net/tulip/winbond-840.c
Linus Torvalds [Tue, 20 Jun 2006 01:53:20 +0000 (18:53 -0700)]
Merge branch 'i915fb' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/intelfb-2.6
* 'i915fb' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/intelfb-2.6: (25 commits)
intelfb: fixup clock calculation debugging.
Removed hard coded EDID buffer size.
intelfb: use regular modedb table instead of VESA
intelfb: use firmware EDID for mode database
Revert "intelfb driver -- use the regular modedb table instead of the VESA"
intelfb: int option fix
sync modesetting code with X.org
intelfb: align with changes from my X driver.
intelfb driver -- use the regular modedb table instead of the VESA
Adds support for 256MB aperture on 945 chipsets to the intelfb driver
intelfb -- uses stride alignment of 64 on the 9xx chipsets.
intelfb: some cleanups for intelfbhw
intelfb: fixup pitch calculation like X does
intelfb: fixup p calculation
This patch makes a needlessly global struct static.
intelfb: add i945GM support
intelfb: fixup whitespace..
intelfb: add hw cursor support for i9xx
intelfb: make i915 modeset
intelfb: add support for i945G
...
Linus Torvalds [Tue, 20 Jun 2006 01:16:01 +0000 (18:16 -0700)]
Add support for suspending and resuming the whole console subsystem
Trying to suspend/resume with console messages flying all around is
doomed to failure, when the devices that the messages are trying to
go to are being shut down.
Atsushi Nemoto [Mon, 19 Jun 2006 15:19:13 +0000 (00:19 +0900)]
[MIPS] Make timer interrupt frequency configurable from kconfig.
Make HZ configurable. DECSTATION can select 128/256/1024 HZ, JAZZ can
only select 100 HZ, others can select 100/128/250/256/1000/1024 HZ if
not explicitly specified). Also remove all mach-xxx/param.h files and
update all defconfigs according to current HZ value.
Kumba [Sun, 18 Jun 2006 06:17:08 +0000 (02:17 -0400)]
[MIPS] Correct HAL2 Kconfig description
The current HAL2 Kconfig description indicates it is only used on SGI
Indy systems, however all members of the Indigo2 Family have this sound
card as well. Plus a minor grammatical fix is included.
Kumba [Sun, 18 Jun 2006 06:16:53 +0000 (02:16 -0400)]
[MIPS] Add Missing R4K Cache Macros to IP27 & IP32
Keeping in accordance with other machines, IP27 and IP32 lack a few
macros. IP27 lacks cpu_has_4kex & cpu_has_4k_cache macros while IP32
lacks just the cpu_has_4k_cache macro.
Ralf Baechle [Sun, 18 Jun 2006 00:32:22 +0000 (01:32 +0100)]
[MIPS] Cleanup memory managment initialization.
Historically plat_mem_setup did the entire platform initialization. This
was rather impractical because it meant plat_mem_setup had to get away
without any kind of memory allocator. To keep old code from breaking
plat_setup was just renamed to plat_setup and a second platform
initialization hook for anything else was introduced.
Ralf Baechle [Mon, 12 Jun 2006 11:20:09 +0000 (12:20 +0100)]
[MIPS] Drop 0 definition for kern_addr_valid
kern_addr_valid is currently only being used in kmem_ptr_validate which
is making some vague attempt at verfying the validity of an address.
Only IA-64, PARISC and x86-64 actually make some actual effort to verify
the validity of the pointer. Most architecture definitions of
kern_addr_valid() just define it as 1; the Alpha and CONFIG_DISCONTIGMEM
on i386 and MIPS even as 0; the 0-definition will result in
kmem_ptr_validate always failing which in turn will cause d_validate to
always fail. d_validate's only two users are smbfs and ncpfs, so the
0 definition ended breaking those ...
Ralf Baechle [Sun, 11 Jun 2006 22:03:08 +0000 (23:03 +0100)]
[MIPS] Cleanup ARCH_DISCONTIGMEM_ENABLE and NUMA configuration.
IP27 configuration isn't the only NUMA system - it just happens to be
the currently only supported MIPS NUMA system. So move the necessary
options back into the main MIPS Kconfig file.
Sergei Shtylyov [Sat, 27 May 2006 20:04:01 +0000 (00:04 +0400)]
[MIPS] arch/mips/au1000/time.c cleanup
Mark au1xxx_timer_setup() __init, just because it is. Get rid of
unneeded extern's (note that (*do_gettimeoffset)() is already declared by
<asm/time.c>) and an unused variable. Kill some whitespace...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Mon, 15 May 2006 16:26:03 +0000 (01:26 +0900)]
[MIPS] Unify mips_fpu_soft_struct and mips_fpu_hard_structs.
The struct mips_fpu_soft_struct and mips_fpu_hard_struct are
completely same now and the kernel fpu emulator assumes that. This
patch unifies them to mips_fpu_struct and get rid of mips_fpu_union.
Atsushi Nemoto [Sun, 11 Jun 2006 14:25:43 +0000 (23:25 +0900)]
[MIPS] Fix futex_atomic_op_inuser.
I found that NPTL's pthread_cond_signal() does not work properly on
kernels compiled by gcc 4.1.x. I suppose inline asm for
__futex_atomic_op() was wrong. I suppose:
1. "=&r" constraint should be used for oldval.
2. Instead of "r" (uaddr), "=R" (*uaddr) for output and "R" (*uaddr)
for input should be used.
3. "memory" should be added to the clobber list.
[PATCH] Fix BCM1480 doubled process accounting times.
Running a UP kernel on a bcm1480 board, I get nonsensical timing
results, like this:
release@unknown:~/tmp$ time ./a.out
real 0m22.906s
user 0m45.792s
sys 0m0.010s
According to my watch, this program took 23 seconds to run, so the real
time clock is OK. It is process accounting that is broken.
I tracked this down to a problem with the function
bcm1480_timer_interrupt in the file sibyte/bcm1480/time.c. This
function calls ll_timer_interrupt for cpu0, and ll_local_timer_interrupt
for all cpus. However, both of these functions do process accounting.
Thus processes running on cpu0 end up with doubled times. This is very
obvious in a UP kernel where all processes run on cpu0.
The correct way to do this is to only call ll_local_timer interrupt if
this is not cpu0. This can be seen in the mips-board/generic/time.c
file, and also in the sibyte/sb1250/time.c file, both of which handle
this correctly. I fixed the bcm1480/time.c file by copying over the
correct code from the sb1250/time.c file.
With this fix, I now get sensible results.
release@unknown:~/tmp$ time ./a.out
real 0m22.903s
user 0m22.894s
sys 0m0.006s
[MIPS] Malta: Handle byteswapping hardare bug in big endian mode.
The SOC-it system controller running in big endian mode might forget
byteswapping when DMAing to the last word of physical memory. Fixed by
ignoring the last page of memory.
Neil Horman [Sun, 18 Jun 2006 05:59:03 +0000 (22:59 -0700)]
[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving sockets sk_receive_queue is empty(indicating that
the receive buffer is being blocked by a missing tsn).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tsutomu Fujii [Sun, 18 Jun 2006 05:58:28 +0000 (22:58 -0700)]
[SCTP]: Send only 1 window update SACK per message.
Right now, every time we increase our rwnd by more then MTU bytes, we
trigger a SACK. When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sun, 18 Jun 2006 05:56:08 +0000 (22:56 -0700)]
[SCTP] Reset rtt_in_progress for the chunk when processing its sack.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sun, 18 Jun 2006 05:55:35 +0000 (22:55 -0700)]
[SCTP]: Reject sctp packets with broadcast addresses.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sun, 18 Jun 2006 05:54:51 +0000 (22:54 -0700)]
[SCTP]: Limit association max_retrans setting in setsockopt.
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue
pointed out by Patrick McHardy <kaber@trash.net>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 18 Jun 2006 05:06:45 +0000 (22:06 -0700)]
[BRIDGE]: Add support for NETIF_F_HW_CSUM devices
As it is the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM. This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 18 Jun 2006 05:06:05 +0000 (22:06 -0700)]
[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 18 Jun 2006 04:58:45 +0000 (21:58 -0700)]
[TG3]: Convert to non-LLTX
Herbert Xu pointed out that it is unsafe to call netif_tx_disable()
from LLTX drivers because it uses dev->xmit_lock to synchronize
whereas LLTX drivers use private locks.
Convert tg3 to non-LLTX to fix this issue. tg3 is a lockless driver
where hard_start_xmit and tx completion handling can run concurrently
under normal conditions. A tx_lock is only needed to prevent
netif_stop_queue and netif_wake_queue race condtions when the queue
is full.
So whether we use LLTX or non-LLTX, it makes practically no
difference.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sun, 18 Jun 2006 04:55:55 +0000 (21:55 -0700)]
[TG3]: Remove unnecessary tx_lock
Remove tx_lock where it is unnecessary. tg3 runs lockless and so it
requires interrupts to be disabled and sync'ed, netif_queue and NAPI
poll to be stopped before the device can be reconfigured. After
stopping everything, it is no longer necessary to get the tx_lock.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 13 Jun 2006 22:03:47 +0000 (15:03 -0700)]
[BNX2]: Use CPU native page size
Use CPU native page size to determine various ring sizes. This allows
order-0 memory allocations on all systems.
Added check to limit the page size to 16K since that's the maximum rx
ring size that will be used. This will prevent using unnecessarily
large page sizes on some architectures with large page sizes.
[Suggested by David Miller]
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Luca De Cicco [Mon, 12 Jun 2006 06:02:19 +0000 (23:02 -0700)]
[TCP] Westwood: reset RTT min after FRTO
RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Luca De Cicco [Mon, 12 Jun 2006 06:01:59 +0000 (23:01 -0700)]
[TCP] Westwood: bandwidth filter startup
The bandwidth estimate filter is now initialized with the first
sample in order to have better performances in the case of small
file transfers.
Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Luca De Cicco [Mon, 12 Jun 2006 06:01:39 +0000 (23:01 -0700)]
[TCP] Westwood: comment fixes
Cleanup some comments and add more references
Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Fedchik [Mon, 12 Jun 2006 03:56:02 +0000 (20:56 -0700)]
[IRDA]: irda-usb.c: STIR421x cleanups
This cleans the STIR421x part of the irda-usb code. We also no longer
try to load all existing firmwares but only the matching one
(according to the USB id we get from the dongle).
Signed-off-by: Nick Fedchik <nfedchik@atlantic-link.com.ua> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 9 Jun 2006 23:11:27 +0000 (16:11 -0700)]
[NET] ppp: Remove unnecessary pskb_may_pull
In ppp_receive_nonmp_frame, we call pskb_may_pull(skb, skb->len) if the
tailroom is >= 124. This is pointless because this pskb_may_pull is only
needed if the skb is non-linear. However, if it is non-linear then the
tailroom would be zero.
So it can be safely removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 9 Jun 2006 23:10:40 +0000 (16:10 -0700)]
[NET]: Clean up skb_linearize
The linearisation operation doesn't need to be super-optimised. So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.
Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.
Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.
Last but not least, I've removed the gfp argument since nobody uses it
anymore. If it's ever needed we can easily add it back.
Misc bugs fixed by this patch:
* via-velocity error handling (also, no SG => no frags)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>