err.no Git - linux-2.6/log

[PKT_SCHED] CLS_U32: Use ffs() instead of C code on hash mask to get first set bit.

Computing the rank of the first set bit in the hash mask (for using later
in u32_hash_fold()) was done with plain C code. Using ffs() instead makes
the code more readable and improves performance (since ffs() is better
optimized in assembler).

Using the conditional operator on hash mask before applying ntohl() also
saves one ntohl() call if mask is 0.

Signed-off-by: Radu Rendec <radu.rendec@ines.ro>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix skb_truesize_check() assertion

The intent of the assertion in skb_truesize_check() is to check
for skb->truesize being decremented too much by other code,
resulting in a wraparound below zero.

The type of the right side of the comparison causes the compiler to
promote the left side to an unsigned type, despite the presence of an
explicit type cast. This defeats the check for negativity.

Ensure both sides of the comparison are a signed type to prevent the
implicit type conversion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[VLAN]: Allow setting mac address while device is up

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[VLAN]: Don't synchronize addresses while the vlan device is down

While the VLAN device is down, the unicast addresses are not configured
on the underlying device, so we shouldn't attempt to sync them.

Noticed by Dmitry Butskoy <buc@odusz.so-cdu.ru>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[INET]: Cleanup the xfrm4_tunnel_(un)register

Both check for the family to select an appropriate tunnel list.
Consolidate this check and make the for() loop more readable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[INET]: Add missed tunnel64_err handler

The tunnel64_protocol uses the tunnel4_protocol's err_handler and
thus calls the tunnel4_protocol's handlers.

This is not very good, as in case of (icmp) error the wrong error
handlers will be called (e.g. ipip ones instead of sit) and this
won't be noticed at all, because the error is not reported.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPX]: Use existing sock refcnt debugging infrastructure

Just like in the af_packet.c, the ipx_sock_nr variable is used
for debugging purposes.

Switch to using existing infrastructure. Thanks to Arnaldo for
pointing this out.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PACKET]: Use existing sock refcnt debugging infrastructure

The packet_socks_nr variable is used purely for debugging
the number of sockets.

As Arnaldo pointed out, there's already an infrastructure
for this purposes, so switch to using it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix infinite loop in dev_mc_unsync().

From: Joe Perches <joe@perches.com>

Based upon an initial patch and report by Luis R. Rodriguez.

Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Make helper to get dst entry and "use" it

There are many places that get the dst entry, increase the
__use counter and set the "lastuse" time stamp.

Make a helper for this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Remove bugus goto-s from ip_route_input_slow

Both places look like

        if (err == XXX)
               goto yyy;
   done:

while both yyy targets look like

        err = XXX;
        goto done;

so this is ok to remove the above if-s.

yyy labels are used in other places and are not removed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Split SACK FRTO flag clearing (fixes FRTO corner case bug)

In case we run out of mem when fragmenting, the clearing of
FLAG_ONLY_ORIG_SACKED might get missed which then feeds FRTO
with false information. Move clearing outside skb processing
loop so that it will get executed even if the skb loop
terminates prematurely due to out-of-mem.

Besides, now the core of the loop truly deals with a single
skb only, which also enables creation a more self-contained
of tcp_sacktag_one later on.

In addition, small reorganization of if branches was made.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Add unlikely() to sacktag out-of-mem in fragment case

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix reord detection due to snd_una covered holes

Fixes subtle bug like the one with fastpath_cnt_hint happening
due to the way the GSO and hints interact. Because hints are not
reset when just a GSOed skb is partially ACKed, there's no
guarantee that the relevant part of the write queue is going to
be processed in sacktag at all (skbs below snd_una) because
fastpath hint can fast forward the entrypoint.

This was also on the way of future reductions in sacktag's skb
processing. Also future cleanups in sacktag can be made after
this (in 2.6.25).

This may make reordering update in tcp_try_undo_partial
redundant but I'm not too sure so I left it there.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Consider GSO while counting reord in sacktag

Reordering detection fails to take account that the reordered
skb may have pcount larger than 1. In such case the lowest of
them had the largest reordering, the old formula used the
highest of them which is pcount - 1 packets less reordered.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[INET]: Add a missing include <linux/vmalloc.h> to inet_hashtables.h

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS

Add documentation in Kconfig help about the move of /proc/acpi/battery
to /sys/class/power_supply when selecting ACPI_PROCFS. This will impact
a lot of users and should be documented.

Signed-off-by: Jerome Pinot <ngc891@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
Add missing "\n" to log message

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Don't fail device revalidation for bad _GTF methods
  libata: port and host should be stopped before hardware resources are released
  libata: skip 0xff polling for PATA controllers
  libata: pata_platform: Support polling-mode configuration.
  libata: Support PIO polling-only hosts.
  libata sata_qstor conversion to new error handling (EH).
  libata sata_qstor workaround for spurious interrupts
  libata sata_qstor nuke idle state
  nv_hardreset: update dangling reference to bugzilla entry
  ata_piix: add SATELLITE PRO U200 to broken suspend list

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (40 commits)
  r8169: prevent bit sign expansion error in mdio_write
  r8169: revert 7da97ec96a0934319c7fbedd3d38baf533e20640 (bis repetita)
  sky2: new pci id's
  ax88796: add superh to kconfig dependencies
  qla3xxx: bugfix: Fix bad logical operation in link state machine.
  qla3xxx: bugfix: Move link state machine into a worker thread
  pasemi_mac: Fix CRC checks
  pasemi_mac: Don't set replace-source-address descriptor bits
  bonding: don't validate address at device open
  bonding: fix rtnl locking merge error
  sky2: netpoll on port 0 only
  b43: Fix kconfig dependencies for rfkill and leds
  b43legacy: Fix sparse warning
  b43: properly request pcmcia IRQ
  b43legacy: fix shared IRQ race condition
  b43: fix shared IRQ race condition
  b43legacy: add me as maintainer and fix URLs
  b43legacy: fix possible buffer overrun in debugfs
  b43: Rewrite and fix rfkill init
  b43: debugfs SHM read buffer overrun fix
  ...

Revert "[ARM] 4642/2: netX: default config for netx based boards"

This reverts commit f33bac8dd4573428b94c67149c5607be489092d1, which was
totally bogus.

The arm/configs/netx_defconfig file already existed - in the right
place. Namely under "arch".

Noticed-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Add missing "\n" to log message

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

r8169: prevent bit sign expansion error in mdio_write

Oops.

The current code does not like being given an u16 with the highest
bit set as an argument to mdio_write. Let's enforce a correct range of
values for both the register address and value (resp. 5 and 16 bits).

The callers are currently left as-is.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>

r8169: revert 7da97ec96a0934319c7fbedd3d38baf533e20640 (bis repetita)

RTL_GIGA_MAC_VER_17 breaks as well.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>

sky2: new pci id's

Found a couple of more chips in the latest version of the vendor driver.
They are minor variations on existing chips.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

ax88796: add superh to kconfig dependencies

ax88796: add superh to kconfig dependencies

This patch adds sh architecture support to the ax88796 kconfig.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

qla3xxx: bugfix: Fix bad logical operation in link state machine.

Luckily, this wasn't reported or reproduced. The logical operation for
setting duplex had wrong grouping.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

qla3xxx: bugfix: Move link state machine into a worker thread

The link state machine requires access to some resources that
are shared with the iSCSI function on the chip. (See iSCSI
driver at drivers/scsi/qla4xxx) If the interface is being
up/downed at a rapid pace this driver may need to sleep
waiting to get access to the common resources. For this we
are moving the state machine to run as a work thread.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

pasemi_mac: Fix CRC checks

Make sure we don't feed packets with bad CRC up the network stack,
and discount the packet length as reported from the MAC for the CRC
field.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

pasemi_mac: Don't set replace-source-address descriptor bits

Don't use the "replace source address with local MAC address" bits, since
it causes problems on some variations of the hardware due to an erratum.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

bonding: don't validate address at device open

The standard validate_addr handler refuses to accept the all zeroes address
as valid. However, it's common historical practice for the bonding
master to be configured up prior to having any slaves, at which time the
master will have a MAC address of all zeroes.

Resolved by setting the dev->validate_addr to NULL. The master still can't
end up with an invalid address, as the set_mac_address function tests
for validity.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

bonding: fix rtnl locking merge error

Looks like I incorrectly merged one of the rtnl lock changes,
so that one function, bonding_show_active_slave, held rtnl but didn't
release it, and another, bonding_store_active_slave, never held rtnl but
did release it.

Fixed so the first function doesn't mess with rtnl, and the
second correctly acquires and releases rtnl.

Bug reported by Moni Shoua <monis@voltaire.com>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: netpoll on port 0 only

Netpoll will only work on port 0 because of the restrictive
relationship between NAPI and netpoll.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

b43: Fix kconfig dependencies for rfkill and leds

Fix dependencies for built-in b43.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43legacy: Fix sparse warning

Fix a sparse warning about a nonstatic function.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: properly request pcmcia IRQ

PCMCIA needs an additional step to request the IRQ.

No need to add code to release the IRQ here, as that's done
automatically in pcmcia_disable_device().

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43legacy: fix shared IRQ race condition

Fix an IRQ race condition in b43legacy. If we call
b43legacy_wireless_core_stop(), it will set the status of the device to
INITIALIZED and the IRQ handler won't care any longer about IRQs, thus the
kernel will disable the IRQ if it's shared (unless we boot it with the
'irqpoll' option). So we must disable IRQs before changing the device
status.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: fix shared IRQ race condition

Fix an IRQ race condition in b43. If we call b43_stop_wireless_core(), it
will set the status of the device to INITIALIZED and the IRQ handler won't
care any longer about IRQs, thus the kernel will disable the IRQ if it's
shared (unless we boot it with the 'irqpoll' option). So we must disable
IRQs before changing the device status.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43legacy: add me as maintainer and fix URLs

As b43legacy is going to be orphaned, add me as a maintainer. Fix URLs for
the related website and fix my e-mail address in MAINTAINERS file.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Cc: Larry Finger <larry.finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43legacy: fix possible buffer overrun in debugfs

Fix possible buffer overrun.

The patch to b43 by Michael Buesch <mb@bu3sch.de> has been ported to
b43legacy.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: Rewrite and fix rfkill init

The rfkill subsystem doesn't like code like that
rfkill_allocate();
rfkill_register();
rfkill_unregister();
rfkill_register(); /* <- This will crash */

This sequence happens with
modprobe b43
ifconfig wlanX up
ifconfig wlanX down
ifconfig wlanX up

Fix this by always re-allocating the rfkill stuff before register.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: debugfs SHM read buffer overrun fix

Fix possible buffer overrun.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: Fix rfkill callback deadlock

wl->mutex might already be locked on initialization.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43: pcmcia-host initialization bugfixes

Fix the initialization for PCMCIA devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ipw2100: fix postfix decrement errors

If i reaches zero, the loop ends, but the postfix decrement subtracts it to -1.
Testing for 'i == 0', later in the function, will not fulfill its purpose.

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

hostap: fix section mismatch warning

Fix section mismatch warning:

WARNING: vmlinux.o(.data+0x36fcc): Section mismatch: reference to .init.data:prism2_pci_id_table (between 'prism2_pci_drv_id' and 'prism2_pci_funcs')

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

rt2x00: Block adhoc & master mode

rt2x00 is broken when it comes down to adhoc and master mode.
The main problem is the beaconing, which is completely failing.
Untill a solution has been found, both beacon requiring modes
must be disabled to prevent numerous bug reports.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

libertas: properly account for queue commands

Properly account for queue commands, this fixes a problem reported
by Holger Schurig when using the debugfs interface.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

libertas: make if_sdio align packets

Incoming packets have to be aligned or the IP stack becomes upset.
Make sure to shift them two bytes to achieve this.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

libertas: fixes for slow hardware

Fixes for slow hardware.

Signed-off-by: Vitaly V. Bursov <vitalyvb@ukr.net>
Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

hermes: clarify Intel reference in Kconfig help

The Intel device supported by the hermes driver core is the IPW2011. The
"Intel PRO/Wireless" wording suggests the later Centrino devices and may
be confusing to some users.

Signed-off-by: John W. Linville <linville@tuxdriver.com>

r8169: revert 7da97ec96a0934319c7fbedd3d38baf533e20640 (partly)

Various symptoms depending on the .config options:
- the card stops working after some (short) time
- the card does not work at all
- the card disappears (nothing in lspci/dmesg)

A real power-off is needed to recover the card.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>

r8169: do not enable the TBI for the 8168 and the 81x0

The 8168c and the 8100e choke on it. I have not seen an indication
nor received a report that the TBI is being actively used on the
remaining 8168b and 8110. Let's disable it for now until someone
complains.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Matthias Winkler <m.winkler@unicon-ka.de>
Cc: Maarten Vanraes <maarten.vanraes@gmail.com>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>

r8169: add PCI ID for the 8168 in the Abit Fatal1ty F-190HD motherboard

Signed-off-by: Ciaran McCreesh <ciaran.mccreesh@blueyonder.co.uk>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>

add support for smc91x ethernet interface on zylonite

This patch adds LAN91C111 ethernet interface support for zylonite
(a.k.a Marvell's PXA3xx Development Platform) with smc91x driver.

It would be better if a patch would support zylonite along with all
other PXA boards with a single binary of smc91x driver, but it looks
quite difficult for the moment, so ugly #ifdef is still used here.

Signed-off-by: Aleksey Makarov <amakarov@ru.mvista.com>
Acked-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: version 1.20

Version update to 1.20

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: handle advanced error recovery config issues

The PCI AER support may not work for a couple of reasons.
It may not be configured into the kernel or there may be a BIOS
bug that prevents MMCONFIG from working. If MMCONFIG doesn't work
then the PCI registers that control AER will not be accessible via
pci_read_config functions; luckly there is another window to access
PCI space in the device, so use that.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: remove unneeded mask update

The IRQ's is already masked on shutdown, and on startup avoid
touching PHY until after phy_init().

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: dont change LED after autoneg

Don't need to change LED's after auto negotiation, the chip
sets them correctly.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: longer PHY delay

Increse phy delay and handle I/O errors.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: status ring race fix

The D-Link PCI-X board (and maybe others) can lie about status
ring entries. It seems it will update the register for last status
index before completing the DMA for the ring entry. To avoid reading
stale data, zap the old entry and check.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

sky2: enable PCI config writes

On some boards, PCI configuration space access is turned off by default.
The 2.6.24 driver doesn't turn it on, and should have.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

libata: Don't fail device revalidation for bad _GTF methods

Experience suggests that the _GTF method may be bad. We currently fail
device revalidation in that case, which seems excessive.

Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

x86 - 32-bit ptrace emulation mishandles 6th arg

[ jdike - Pushing Chuck's patch - see
http://lkml.org/lkml/2005/9/16/261 for some history and a test
program.  UML is also broken without this patch - its processes get
SIGBUS from the corrupt 6th argument to mmap being interpretted as a
file offset ]

When the 32-bit vDSO is used to make a system call, the %ebp register for
the 6th syscall arg has to be loaded from the user stack (where it's pushed
by the vDSO user code).  The native i386 kernel always does this before
stopping for syscall tracing, so %ebp can be seen and modified via ptrace
to access the 6th syscall argument.  The x86-64 kernel fails to do this,
presenting the stack address to ptrace instead.  This makes the %rbp value
seen by 64-bit ptrace of a 32-bit process, and the %ebp value seen by a
32-bit caller of ptrace, both differ from the native i386 behavior.

This patch fixes the problem by putting the word loaded from the user stack
into %rbp before calling syscall_trace_enter, and reloading the 6th syscall
argument from there afterwards (so ptrace can change it).  This makes the
behavior match that of i386 kernels.

Original-Patch-By: Roland McGrath <roland@redhat.com>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86_64: ia32 ptrace THREAD_AREA fix

The addr argument to PTRACE_GET_THREAD_AREA and PTRACE_SET_THREAD_AREA is
not a magic constant. It's derived from the segment register values being
used, which are computed originally from the index used with set_thread_area.
The value does not need to match what a native i386 kernel would accept.
It needs to match the segment selectors that can actually be in use in this
32-bit process. The 64-bit ptrace support for PTRACE_GET_THREAD_AREA
(normally used only on 32-bit processes) is correct, but the 32-bit emulation
of ptrace is broken.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

voyager: use struct instead of PARAM

Use struct boot_params instead of PARAM + 0xoffsets.
Fixes one of many Voyager build problems.

arch/x86/kernel/setup_32.c:543: error: 'PARAM' undeclared (first use in this function)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

[FUTEX] Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bit clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] IOSAPIC bogus error cleanup
  [IA64] Update printing of feature set bits
  [IA64] Fix IOSAPIC delivery mode setting
  [IA64] XPC heartbeat timer function must run on CPU 0
  [IA64] Clean up /proc/interrupts output
  [IA64] Disable/re-enable CPE interrupts on Altix
  [IA64] Clean-up McKinley Errata message
  [IA64] Add gate.lds to list of files ignored by Git
  [IA64] Fix section mismatch in contig.c version of per_cpu_init()
  [IA64] Wrong args to memset in efi_gettimeofday()
  [IA64] Remove duplicate includes from ia32priv.h
  [IA64] fix number of bytes zeroed by sys_fw_init() in arch/ia64/hp/sim/boot/fw-emu.c
  [IA64] Fix perfmon sysctl directory modes

Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched

* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: proper prototype for kernel/sched.c:migration_init()
  sched: avoid large irq-latencies in smp-balancing
  sched: fix copy_namespace() <-> sched_fork() dependency in do_fork
  sched: clean up the wakeup preempt check, #2
  sched: clean up the wakeup preempt check
  sched: wakeup preemption fix
  sched: remove PREEMPT_RESTRICT
  sched: turn off PREEMPT_RESTRICT
  KVM: fix !SMP build error
  x86: make nmi_cpu_busy() always defined
  x86: make ipi_handler() always defined
  sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC
  sched: reintroduce SMP tunings again
  sched: restore deterministic CPU accounting on powerpc
  sched: fix delay accounting regression
  sched: reintroduce the sched_min_granularity tunable
  sched: documentation: place_entity() comments
  sched: fix vslice

Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (26 commits)
  sh: remove dead config symbols from SH code
  sh: Kill off broken snapgear ds1302 code.
  sh: Add a dummy vga.h.
  rtc: rtc-sh: Zero out tm value for invalid rtc states.
  rtc: sh-rtc: Handle rtc_device_register() failure properly.
  sh: Fix heartbeart on Solution Engine series
  sh: Remove SCI_NPORTS from sh-sci.h
  sh: Fix up PAGE_KERNEL_PCC() for nommu.
  sh: hs7751rvoip: Kill off dead IPR IRQ mappings.
  sh: hs7751rvoip: irq.c needs linux/interrupt.h.
  sh: Kill off __{copy,clear}_user_page().
  sh: Optimized copy_{to,from}_user_page() for SH-4.
  sh: Wire up clear_user_highpage().
  sh: Kill off the remaining ST40 cruft.
  superhyway: Handle device_register() retval properly.
  sh: kgdb sysrq depends on magic sysrq.
  sh: Add -Werror for clean directories.
  sh: Fix up kgdb build with modular sh-sci.
  sh: Export __{s,u}divsi3_i4i on all CPUs.
  sh: Fix up kgdb-on-NMI branch target.
  ...

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] pxa: fix one-shot timer mode
  [ARM] 4645/1: Cyberpro: Trivial fix to restore 16bpp mode.
  [ARM] 4644/2: fix flush_kern_tlb_range() in module space
  [ARM] Allow watchdog drivers to be selected again
  [ARM] 4633/1: omap build fix when FB enabled
  [ARM] 4642/2: netX: default config for netx based boards
  [ARM] 4641/2: netX: fix kobject_name type
  [ARM] Fix iop3xx macro

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[LIB] crc32c: Keep intermediate crc state in cpu order

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Add UNPLUG traces to all appropriate places
  block: fix requeue handling in blk_queue_invalidate_tags()
  mmc: Fix sg helper copy-and-paste error
  pktcdvd: fix BUG caused by sysfs module reference semantics change
  ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting
  cfq_idle_class_timer: add paranoid checks for jiffies overflow
  cfq: fix IOPRIO_CLASS_IDLE delays
  cfq: fix IOPRIO_CLASS_IDLE accounting

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (37 commits)
  [POWERPC] EEH: Make sure warning message is printed
  [POWERPC] Make altivec code in swsusp_32.S depend on CONFIG_ALTIVEC
  [POWERPC] windfarm: Fix windfarm thread freezer interaction
  [POWERPC] Fix si_addr value on low level hash failures
  [POWERPC] Refresh ppc64_defconfig and enable pasemi-related options
  [POWERPC] pasemi: Update defconfig
  [POWERPC] iSeries: Fix ref counting in vio setup
  [POWERPC] ] Fix memset size error
  [POWERPC] Fix link errors for allyesconfig
  [POWERPC] iSeries_init_IRQ non-PCI tidy
  [POWERPC] Change fallocate to match unistd.h on powerpc
  [POWERPC] EEH: Avoid crash on null device
  [POWERPC] EEH: Drivers that need reset trump others
  [POWERPC] EEH: Clean up comments
  [POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2)
  [POWERPC] Fix switch_slb handling of 1T ESID values
  [POWERPC] Fix build failure when CONFIG_VIRT_CPU_ACCOUNTING is not defined
  [POWERPC] Include udbg.h when using udbg_printf
  [POWERPC] Fix cache line vs. block size confusion
  [POWERPC] Fix sysctl table check failure on PowerMac
  ...

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: fix rename vs unlink race
  [PATCH] Fix possibly too long write in o2hb_setup_one_bio()
  ocfs2: fix write() performance regression
  ocfs2: Commit journal on sync writes
  ocfs2: Re-order iput in ocfs2_drop_dentry_lock
  ocfs2: Create locks at initially requested level
  [PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c}
  [2.6 patch] make ocfs2_find_entry_el() static

frv: Remove bogus NO_IRQ = -1 define

The old NO_IRQ define some platforms had was long ago declared obsolete
and wrong. FRV should therefore not be re-introducing this, especially as
IRQs are usually unsigned in the kernel. The "no IRQ" case is defined to be
zero and Linus made this rather clear at the time.

arch/frv shows no dependancy on this but it might show up driver fixes
needing doing I guess

Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Use "is_power_of_2" macro for simplicity.
[SPARC]: Remove duplicate includes.

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (44 commits)
  [NETLINK]: Fix unicast timeouts
  [INET]: Remove per bucket rwlock in tcp/dccp ehash table.
  [IPVS]: Synchronize closing of Connections
  [IPVS]: Bind connections on stanby if the destination exists
  [NET]: Remove Documentation/networking/pt.txt
  [NET]: Remove Documentation/networking/routing.txt
  [NET]: Remove Documentation/networking/ncsa-telnet
  [NET]: Remove comx driver docs.
  [NET]: Remove Documentation/networking/Configurable
  [NET]: Clean proto_(un)register from in-code ifdefs
  [IPSEC]: Fix crypto_alloc_comp error checking
  [VLAN]: Fix SET_VLAN_INGRESS_PRIORITY_CMD ioctl
  [NETNS]: Fix compiler error in net_namespace.c
  [TTY]: Use tty_mode_ioctl() in network drivers.
  [TTY]: Fix network driver interactions with TCGET/SET calls.
  [PKT_SCHED] CLS_U32: Fix endianness problem with u32 classifier hash masks.
  [NET]: Removing duplicit #includes
  [NET]: Let USB_USBNET always select MII.
  [RRUNNER]: Do not muck with sysctl_{r,w}mem_max
  [DLM] lowcomms: Do not muck with sysctl_rmem_max.
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fw-sbp2: fix refcounting

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  SELinux: add more validity checks on policy load
  SELinux: fix bug in new ebitmap code.
  SELinux: suppress a warning for 64k pages.

FRV: Remove the section annotation on free_initmem()

Remove the section annotation on FRV's free_initmem(). It can't be marked
__init, lest it free itself.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched: proper prototype for kernel/sched.c:migration_init()

This patch adds a proper prototype for migration_init() in
include/linux/sched.h

Since there's no point in always returning 0 to a caller that doesn't check
the return value it also changes the function to return void.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: avoid large irq-latencies in smp-balancing

SMP balancing is done with IRQs disabled and can iterate the full rq.
When rqs are large this can cause large irq-latencies. Limit the nr of
iterations on each run.

This fixes a scheduling latency regression reported by the -rt folks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix copy_namespace() <-> sched_fork() dependency in do_fork

Sukadev Bhattiprolu reported a kernel crash with control groups.
There are couple of problems discovered by Suka's test:

- The test requires the cgroup filesystem to be mounted with
  atleast the cpu and ns options (i.e both namespace and cpu
  controllers are active in the same hierarchy).

# mkdir /dev/cpuctl
# mount -t cgroup -ocpu,ns none cpuctl
(or simply)
# mount -t cgroup none cpuctl -> Will activate all controllers
in same hierarchy.

- The test invokes clone() with CLONE_NEWNS set. This causes a a new child
  to be created, also a new group (do_fork->copy_namespaces->ns_cgroup_clone->
  cgroup_clone) and the child is attached to the new group (cgroup_clone->
  attach_task->sched_move_task). At this point in time, the child's scheduler
  related fields are uninitialized (including its on_rq field, which it has
  inherited from parent). As a result sched_move_task thinks its on
  runqueue, when it isn't.

  As a solution to this problem, I moved sched_fork() call, which
  initializes scheduler related fields on a new task, before
  copy_namespaces(). I am not sure though whether moving up will
  cause other side-effects. Do you see any issue?

- The second problem exposed by this test is that task_new_fair()
  assumes that parent and child will be part of the same group (which
  needn't be as this test shows). As a result, cfs_rq->curr can be NULL
  for the child.

  The solution is to test for curr pointer being NULL in
  task_new_fair().

With the patch below, I could run ns_exec() fine w/o a crash.

Reported-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: clean up the wakeup preempt check, #2

clean up the preemption check to not use unnecessary 64-bit
variables. This improves code size:

   text    data     bss     dec     hex filename
  44227    3326      36   47589    b9e5 sched.o.before
  44201    3326      36   47563    b9cb sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: clean up the wakeup preempt check

clean up the wakeup preemption check. No code changed:

   text    data     bss     dec     hex filename
  44227    3326      36   47589    b9e5 sched.o.before
  44227    3326      36   47589    b9e5 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: wakeup preemption fix

wakeup preemption fix: do not make it dependent on p->prio.
Preemption purely depends on ->vruntime.

This improves preemption in mixed-nice-level workloads.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: remove PREEMPT_RESTRICT

remove PREEMPT_RESTRICT. (this is a separate commit so that any
regression related to the removal itself is bisectable)

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: turn off PREEMPT_RESTRICT

PREEMPT_RESTRICT was a method aimed at reducing the amount of wakeup
related preemption. It has a disadvantage though, it can prevent
legitimate wakeups if a task is 'unlucky' to be hit too early by a tick
that clears peer_preempt.

Now that the wakeup preemption has been cleaned up we dont seem to have
excessive preemptions anymore, so this feature can be turned off. (and
removed in the next patch)

Signed-off-by: Ingo Molnar <mingo@elte.hu>

KVM: fix !SMP build error

fix a !SMP build error:

drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs':
drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask'

(and also avoid unused function warning related to up_smp_call_function()
not making use of the 'func' parameter.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make nmi_cpu_busy() always defined

nmi_cpu_busy() must be available on !SMP too.

this is in preparation to a smp_call_function_mask() fix.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: make ipi_handler() always defined

prepare for up_smp_call_function() to ensure that the 'func'
pointer is unused. (which is related to a KVM build fix)

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC

1) hardcoded 1000000000 value is used five times in places where
   NSEC_PER_SEC might be more readable.

2) A conversion from nsec to msec uses the hardcoded 1000000 value,
   which is a candidate for NSEC_PER_MSEC.

no code changed:

    text    data     bss     dec     hex filename
   44359    3326      36   47721    ba69 sched.o.before
   44359    3326      36   47721    ba69 sched.o.after

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: reintroduce SMP tunings again

Yanmin Zhang reported an aim7 regression and bisected it down to:

|  commit 38ad464d410dadceda1563f36bdb0be7fe4c8938
|  Author: Ingo Molnar <mingo@elte.hu>
|  Date:   Mon Oct 15 17:00:02 2007 +0200
|
|     sched: uniform tunings
|
|     use the same defaults on both UP and SMP.

fix this by reintroducing similar SMP tunings again. This resolves
the regression.

(also update the comments to match the ilog2(nr_cpus) tuning effect)

Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: restore deterministic CPU accounting on powerpc

Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().

This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before. If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.

This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().

account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix delay accounting regression

Fix the delay accounting regression introduced by commit
75d4ef16a6aa84f708188bada182315f80aab6fa. rq no longer has sched_info
data associated with it. task_struct sched_info structure is used by delay
accounting to provide back statistics to user space.

also remove direct use of sched_clock() (which is not a valid thing to
do anymore) and use rq->clock instead.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: reintroduce the sched_min_granularity tunable

we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.

no functionality changed.

[ mingo@elte.hu: some fixlets. ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: documentation: place_entity() comments

Add a few comments to place_entity(). No code changed.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix vslice

vslice was missing a factor NICE_0_LOAD, as weight is in
weight*NICE_0_LOAD units.

the effect of this bug was larger initial slices and
thus latency-noisier forks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

[IA64] IOSAPIC bogus error cleanup

On Altix (sn2) machines the "Error parsing MADT" message is
misleading because the lack of IOSAPIC entries is expected.

Since I am sure someone will ask, I have been told that
the chance of this changing anytime soon is close to nil.

Signed-off-by: George Beshers <gbeshers@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>