err.no Git - linux-2.6/log

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: Do not append space to guests kernel command line
  lguest: Revert 1ce70c4fac3c3954bd48c035f448793867592bc0, fix real problem.
  lguest: Sanitize the lguest clock.
  lguest: fix __get_vm_area usage.
  lguest: make sure cpu is initialized before accessing it

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: Fix NULL pointer dereferences in o2net
  ocfs2/dlm: dlm_thread should not sleep while holding the dlm_spinlock
  ocfs2/dlm: Print message showing the recovery master
  ocfs2/dlm: Add missing dlm_lockres_put()s
  ocfs2/dlm: Add missing dlm_lockres_put()s in migration path
  ocfs2/dlm: Add missing dlm_lock_put()s
  ocfs2: Fix an endian bug in online resize.
  [PATCH] [OCFS2]: constify function pointer tables
  ocfs2: Fix endian bug in o2dlm protocol negotiation.
  ocfs2: Use dlm_print_one_lock_resource for lock resource print
  [PATCH] fs/ocfs2/dlm/dlmdomain.c: fix printk warning

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] make watchdog/hpwdt.c:asminline_call() static
  [WATCHDOG] Remove volatiles from watchdog device structures
  [WATCHDOG] replace remaining __FUNCTION__ occurrences
  [WATCHDOG] hpwdt: Use dmi_walk() instead of own copy
  [WATCHDOG] Fix return value warning in hpwdt
  [WATCHDOG] Fix declaration of struct smbios_entry_point in hpwdt
  [WATCHDOG] it8712f_wdt support for 16-bit timeout values, WDIOC_GETSTATUS

fbdev: add BF52x EZkit Display driver

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Cc: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

BF54x LQ043 Framebuffer driver: Update copyright on previously modified files

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

BF54x LQ043 Framebuffer driver: fix bug lcd_device_register API breakage

Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

BF54x LQ043 Framebuffer driver: fix bug NULL for gpio_request label is not allowed

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

stifb: fix crash A1439A CRX (Rattler) graphics card

Fix kernel crash when stifb driver is used with a A1439A CRX (Rattler)
graphics card. (Reference:
http://thread.gmane.org/gmane.linux.ports.hppa/1834)

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mbxfb: fix incorrect argument type

Fix wrong pointer type passed into the dev_dbg() function.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Mike Rapoport <mike@compulab.co.il>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

iov_iter_advance() fix

iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array. Fix this by checking against
i->count before we skip a zero-length iov.

The bug was reproduced with a test program that continually randomly creates
iovs to writev. The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Tested-by: "Kevin Coffman" <kwc@citi.umich.edu>
Cc: "Alexey Dobriyan" <adobriyan@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rcu: move PREEMPT_RCU config option back under PREEMPT

The original preemptible-RCU patch put the choice between classic and
preemptible RCU into kernel/Kconfig.preempt, which resulted in build failures
on machines not supporting CONFIG_PREEMPT.  This choice was therefore moved to
init/Kconfig, which worked, but placed the choice between classic and
preemptible RCU at the top level, a very obtuse choice indeed.

This patch changes from the Kconfig "choice" mechanism to a pair of booleans,
only one of which (CONFIG_PREEMPT_RCU) is user-visible, and is located in
kernel/Kconfig.preempt, where one would expect it to be.  The other
(CONFIG_CLASSIC_RCU) is in init/Kconfig so that it is available to all
architectures, hopefully avoiding build breakage.  Thanks to Roman Zippel for
suggesting this approach.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Josh Triplett <josh@freedesktop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i8042: use SGI_HAS_I8042 to select SGI i8042 handlinig

Use SGI_HAS_I8042 to select SGI i8042 handling

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

modules: warn about suspicious return values from module's ->init() hook

Return value convention of module's init functions is 0/-E.  Sometimes,
e.g.  during forward-porting mistakes happen and buggy module created,
where result of comparison "workqueue != NULL" is propagated all the way up
to sys_init_module.  What happens is that some other module created
workqueue in question, our module created it again and module was
successfully loaded.

Or it could be some other bug.

Let's make such mistakes much more visible.  In retrospective, such
messages would noticeably shorten some of my head-scratching sessions.

Note, that dump_stack() is just a way to get attention from user.  Sample
message:

sys_init_module: 'foo'->init suspiciously returned 1, it should follow 0/-E convention
sys_init_module: loading module anyway...
Pid: 4223, comm: modprobe Not tainted 2.6.24-25f666300625d894ebe04bac2b4b3aadb907c861 #5

Call Trace:
[<ffffffff80254b05>] sys_init_module+0xe5/0x1d0
[<ffffffff8020b39b>] system_call_after_swapgs+0x7b/0x80

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

modules: fix module waiting for dependent modules' init

Commit c9a3ba55 (module: wait for dependent modules doing init.) didn't quite
work because the waiter holds the module lock, meaning that the state of the
module it's waiting for cannot change.

Fortunately, it's fairly simple to update the state outside the lock and do
the wakeup.

Thanks to Jan Glauber for tracking this down and testing (qdio and qeth).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: correct page count for surplus huge pages

Free pages in the hugetlb pool are free and as such have a reference count of
zero.  Regular allocations into the pool from the buddy are "freed" into the
pool which results in their page_count dropping to zero.  However, surplus
pages can be directly utilized by the caller without first being freed to the
pool.  Therefore, a call to put_page_testzero() is in order so that such a
page will be handed to the caller with a correct count.

This has not affected end users because the bad page count is reset before the
page is handed off.  However, under CONFIG_DEBUG_VM this triggers a BUG when
the page count is validated.

Thanks go to Mel for first spotting this issue and providing an initial fix.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio/pca953x bugfix: mark as can_sleep

The pca953x driver is an I2C driver so gpio_chip->can_sleep should be set.
This lets upper layers know they should use the gpio_*_cansleep() calls to
access values, and may not access them from nonsleeping contexts.

Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: "eric miao" <eric.y.miao@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: reduce CPU wastage on idle md array with a write-intent bitmap

Recent patch titled
Reduce CPU wastage on idle md array with a write-intent bitmap.

would sometimes leave the array with dirty bitmap bits that stay dirty. A
subsequent write would sort things out so it isn't a big problem, but should
be fixed nonetheless.

We need to make sure that when the bitmap becomes not "allclean", the
daemon_sleep really does get set to a sensible value.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix formatting error in /proc/mdstat

If an md array is "auto-read-only", then this appears in /proc/mdstat as

   /dev/md0: active(auto-read-only)

whereas if it is truely readonly, it appears as

   /dev/md0: active (read-only)

The difference being a space.

One program known to parse this file expects the space and gets badly
confused.  It will be fixed, but it would be best if what the kernel generates
is more consistent too.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mempolicy: fix reference counting bugs

Address 3 known bugs in the current memory policy reference counting method.
I have a series of patches to rework the reference counting to reduce overhead
in the allocation path.  However, that series will require testing in -mm once
I repost it.

1) alloc_page_vma() does not release the extra reference taken for
   vma/shared mempolicy when the mode == MPOL_INTERLEAVE.  This can result in
   leaking mempolicy structures.  This is probably occurring, but not being
   noticed.

   Fix:  add the conditional release of the reference.

2) hugezonelist unconditionally releases a reference on the mempolicy when
   mode == MPOL_INTERLEAVE.  This can result in decrementing the reference
   count for system default policy [should have no ill effect] or premature
   freeing of task policy.  If this occurred, the next allocation using task
   mempolicy would use the freed structure and probably BUG out.

   Fix:  add the necessary check to the release.

3) The current reference counting method assumes that vma 'get_policy()'
   methods automatically add an extra reference a non-NULL returned mempolicy.
    This is true for shmem_get_policy() used by tmpfs mappings, including
   regular page shm segments.  However, SHM_HUGETLB shm's, backed by
   hugetlbfs, just use the vma policy without the extra reference.  This
   results in freeing of the vma policy on the first allocation, with reuse of
   the freed mempolicy structure on subsequent allocations.

   Fix: Rather than add another condition to the conditional reference
   release, which occur in the allocation path, just add a reference when
   returning the vma policy in shm_get_policy() to match the assumptions.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Cc: <eric.whitney@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Typo in Documentation/scheduler/sched-stats.txt

I have found a very small typo in Documentation/scheduler/sched-stats.txt.
See the end of this mail.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: add support for JMicron jmb38x MemoryStick host controller

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: try harder to recover from unsuccessful interface mode switch

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: fix parsing of "assembly_date" attribute field

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: add support for decoding "specfile" media attributes

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tifm: clear interrupt mask bits before setting them on adapter init

This should improve reliability of detection of cards already in socket on
driver load.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tifm: fix memorystick host initialization code

Instead of assuming that host is powered on only once at card insertion, allow
for the possibility that memstick layer may need to cycle card's power to get
it out from some unhealthy states.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tifm: fix the MemoryStick host fifo handling code

Additional input received from JMicron on MemoryStick host interfaces showed
that some assumtions in fifo handling code were incorrect. This patch also
fixes data corruption used to occure during PIO transfers.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: drop DRIVER_VERSION numbers as meaningless

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: make sure number of command retries is exactly as specified

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: add memstick_suspend/resume_host methods

Bus driver may need to be informed that host is being suspended/resumed.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memstick: introduce correct definitions in the header

Thanks to some input from kind people at JMicron it is now possible to have
more correct definitions of protocol structures and bit field semantics.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tridentfb: fix memory size detection

Fix memory size multiplier during detection.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tridentfb: register should be left in non-locked state

Remove locking registers after they are unlocked during switch to/from MMIO
mode. This fixes regression on the Blade3D (Trident 9880) caused by the
previous patch (probe fixes).

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

of_serial: fix section mismatch warnings

Fix the following section mismatches:

WARNING: drivers/built-in.o(.exit.text+0x5a): Section mismatch in reference from the function of_platform_serial_exit() to the variable .devinit.data:of_platform_serial_driver
The function __exit of_platform_serial_exit() references
a variable __devinitdata of_platform_serial_driver.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rename DECLARE_PCI_DEVICE_TABLE to DEFINE_PCI_DEVICE_TABLE

This macro is used to define tables, not to declare them.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lguest: Do not append space to guests kernel command line

The lguest launcher appends a space to the kernel command line (if kernel
arguments are specified on its command line). This space is unneeded. More
importantly, this appended space will make Red Hat's nash script interpreter
(used in a Fedora style initramfs) add an empty argument to init's command
line. This empty argument will make kernel arguments like "init=/bin/bash"
fail (because the shell will try to execute a script with an empty name).
This could be considered a bug in nash, but is easily fixed in the lguest
launcher too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

lguest: Revert 1ce70c4fac3c3954bd48c035f448793867592bc0, fix real problem.

Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest
bug, and indeed it wasn't.

The bug was that handing a 0 as the address of the toplevel page table
being manipulated can cause the lookup code in find_pgdir() to return
an uninitialized cache entry (we shadow up to 4 top level page tables
for each Guest).

Commit 37cc8d7f963ba2deec29c9b68716944516a3244f introduced this
behaviour in the Guest, uncovering the bug.

The patch which he submitted (which removed the /4 from the index
calculation) simply ensured that these high-indexed entries hit the
early exit path of guest_set_pmd(). But you get lots of segfaults in
guest userspace as the PMDs aren't being updated.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

lguest: Sanitize the lguest clock.

Now the TSC code handles a zero return from calculate_cpu_khz(),
lguest can simply pass through the value it gets from the Host: if
non-zero, all the normal TSC code applies.

Otherwise (or if the Host really doesn't support TSC), the clocksource
code will fall back to the slower but reasonable lguest clock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

lguest: fix __get_vm_area usage.

Robert Bragg's 5dc331852848a38ca00a2817e5b98a1d0561b116 tightened
(ie. fixed) the checking in __get_vm_area, and it broke lguest.

lguest should pass the exact "end" it wants, not some random constant
(it was possible previously that it would actually get an address
different from SWITCHER_ADDR).

Also, Fabio Checconi pointed out that we should make sure we're not
hitting the fixmap area.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Robert Bragg <robert@sixbynine.org>

lguest: make sure cpu is initialized before accessing it

If req is LHREQ_INITIALIZE, and the guest has been initialized before
(unlikely), it will attempt to access cpu->tsk even though cpu is not yet
initialized.

Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

ocfs2: Fix NULL pointer dereferences in o2net

In some situations, ocfs2_set_nn_state might get called with sc = NULL and
valid = 0. If sc = NULL, we can't dereference it to get the o2nm_node
member. Instead, do what o2net_initialize_handshake does and use NULL when
calling o2net_reconnect_delay and o2net_idle_timeout.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2/dlm: dlm_thread should not sleep while holding the dlm_spinlock

This patch addresses the bug in which the dlm_thread could go to sleep
while holding the dlm_spinlock.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2/dlm: Print message showing the recovery master

Knowing the dlm recovery master helps in debugging recovery
issues. This patch prints a message on the recovery master node.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2/dlm: Add missing dlm_lockres_put()s

dlm_master_request_handler() forgot to put a lockres when
dlm_assert_master_worker() failed or was skipped.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2/dlm: Add missing dlm_lockres_put()s in migration path

During migration, the recovery master node may be asked to master a lockres
it may not know about. In that case, it would not only have to create a
lockres and add it to the hash, but also remember to to do the _put_
corresponding to the kref_init in dlm_init_lockres(), as soon as the migration
is completed. Yes, we don't wait for the dlm_purge_lockres() to do that
matching put. Note the ref added for it being in the hash protects the lockres
from being freed prematurely.

This patch adds that missing put, as described above, to plug a memleak.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2/dlm: Add missing dlm_lock_put()s

Normally locks for remote nodes are freed when that node sends an UNLOCK
message to the master. The master node tags an DLM_UNLOCK_FREE_LOCK action
to do an extra put on the lock at the end.

However, there are times when the master node has to free the locks for the
remote nodes forcibly.

Two cases when this happens are:
1. When the master has migrated the lockres plus all locks to another node.
2. When the master is clearing all the locks of a dead node.

It was in the above two conditions that the dlm was missing the extra put.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Fix an endian bug in online resize.

In ocfs2_group_add, 'cr' is a disk field of type 'ocfs2_chain_rec', and we
were putting cpu byteorder values into it. Swap things to the right endian
before storing.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] [OCFS2]: constify function pointer tables

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Fix endian bug in o2dlm protocol negotiation.

struct dlm_query_join_packet is made up of four one-byte fields.  They
are effectively in big-endian order already.  However, little-endian
machines swap them before putting the packet on the wire (because
query_join's response is a status, and that status is treated as a u32
on the wire).  Thus, a big-endian and little-endian machines will
treat this structure differently.

The solution is to have little-endian machines swap the structure when
converting from the structure to the u32 representation.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Use dlm_print_one_lock_resource for lock resource print

__dlm_print_one_lock_resource must be called with spin_lock
the res->spinlock. While in some cases, we use it without this
precondition and lead to the failure of assert_spin_locked.
So call dlm_print_one_lock_resource instead.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] fs/ocfs2/dlm/dlmdomain.c: fix printk warning

fs/ocfs2/dlm/dlmdomain.c: In function 'dlm_send_join_cancels':
fs/ocfs2/dlm/dlmdomain.c:983: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'long unsigned int'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

riscom8: Fix hang on load

This has been around for a while but nobody reported it until recently.
Resubmitting the fix as it's appropriate for 2.6.25

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linux 2.6.25-rc5

Do not include linux/backing-dev.h twice

Don't include linux/backing-dev.h twice in mm/filemap.c, it's pointless.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt

* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
  time: remove obsolete CLOCK_TICK_ADJUST
  time: don't touch an offlined CPU's ts->tick_stopped in tick_cancel_sched_timer()
  time: prevent the loop in timespec_add_ns() from being optimised away
  ntp: use unsigned input for do_div()

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] skcipher: Fix section mismatches

alpha: fix iommu-related boot panic

This fixes a boot panic due to a typo in the recent iommu patchset from
FUJITA Tomonori <tomof@acm.org> - the code used dma_get_max_seg_size()
instead of dma_get_seg_boundary().

It also removes a couple of unnecessary BUG_ON() and ALIGN() macros.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reported-and-tested-by: Bob Tracy <rct@frus.com>
Acked-by: FUJITA Tomonori <tomof@acm.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpu hotplug: adjust root-domain->online span in response to hotplug event

We currently set the root-domain online span automatically when the
domain is added to the cpu if the cpu is already a member of
cpu_online_map.

This was done as a hack/bug-fix for s2ram, but it also causes a problem
with hotplug CPU_DOWN transitioning. The right way to fix the original
problem is to actually respond to CPU_UP events, instead of CPU_ONLINE,
which is already too late.

This solves the hung reboot regression reported by Andrew Morton and
others.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

time: remove obsolete CLOCK_TICK_ADJUST

The first version of the ntp_interval/tick_length inconsistent usage patch was
recently merged as bbe4d18ac2e058c56adb0cd71f49d9ed3216a405

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bbe4d18ac2e058c56adb0cd71f49d9ed3216a405

While the fix did greatly improve the situation, it was correctly pointed out
by Roman that it does have a small bug: If the users change clocksources after
the system has been running and NTP has made corrections, the correctoins made
against the old clocksource will be applied against the new clocksource,
causing error.

The second attempt, which corrects the issue in the NTP_INTERVAL_LENGTH
definition has also made it up-stream as commit
e13a2e61dd5152f5499d2003470acf9c838eab84

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e13a2e61dd5152f5499d2003470acf9c838eab84

Roman has correctly pointed out that CLOCK_TICK_ADJUST is calculated
based on the PIT's frequency, and isn't really relevant to non-PIT
driven clocksources (that is, clocksources other then jiffies and pit).

This patch reverts both of those changes, and simply removes
CLOCK_TICK_ADJUST.

This does remove the granularity error correction for users of PIT and Jiffies
clocksource users, but the granularity error but for the majority of users, it
should be within the 500ppm range NTP can accommodate for.

For systems that have granularity errors greater then 500ppm, the
"ntp_tick_adj=" boot option can be used to compensate.

[johnstul@us.ibm.com: provided changelog]
[mattilinnanvuori@yahoo.com: maek ntp_tick_adj static]
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: john stultz <johnstul@us.ibm.com>
Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

time: don't touch an offlined CPU's ts->tick_stopped in tick_cancel_sched_timer()

Silences WARN_ONs in rcu_enter_nohz() and rcu_exit_nohz(), which appeared
before caused by (repeated) calls to:
$ echo 0 > /sys/devices/system/cpu/cpu1/online
$ echo 1 > /sys/devices/system/cpu/cpu1/online

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Cc: johnstul@us.ibm.com
Cc: Rafael Wysocki <rjw@sisk.pl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

time: prevent the loop in timespec_add_ns() from being optimised away

Since some architectures don't support __udivdi3().

Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ntp: use unsigned input for do_div()

The kernel NTP code shouldn't hand 64-bit *signed* values to do_div(). Make it
instead hand 64-bit unsigned values. This gets rid of a couple of warnings.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Fix waitid si_code regression

In commit ee7c82da830ea860b1f9274f1f0cdf99f206e7c2 ("wait_task_stopped:
simplify and fix races with SIGCONT/SIGKILL/untrace"), the magic (short)
cast when storing si_code was lost in wait_task_stopped. This leaks the
in-kernel CLD_* values that do not match what userland expects.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[CRYPTO] skcipher: Fix section mismatches

The previous patch to move chainiv and eseqiv into blkcipher created
a section mismatch for the chainiv exit function which was also called
from __init. This patch removes the __exit marking on it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

x86_64: make ptrace always sign-extend orig_ax to 64 bits

This makes 64-bit ptrace calls setting the 64-bit orig_ax field for a
32-bit task sign-extend the low 32 bits up to 64.  This matches what a
64-bit debugger expects when tracing a 32-bit task.

This follows on my "x86_64 ia32 syscall restart fix".  This didn't
matter until that was fixed.

The debugger ignores or zeros the high half of every register slot it
sets (including the orig_rax pseudo-register) uniformly.  It expects
that the setting of the low 32 bits always has the same meaning as a
32-bit debugger setting those same 32 bits with native 32-bit
facilities.

This never arose before because the syscall restart check never
matched any -ERESTART* values due to lack of sign extension.  Before
that fix, even 32-bit ptrace setting orig_eax to -1 failed to trigger
the restart check anyway.  So this was never noticed as a regression
of 64-bit debuggers vs 32-bit debuggers on the same 64-bit kernel.

Signed-off-by: Roland McGrath <roland@redhat.com>
[ Changed to just do the sign-extension unconditionally on x86-64,
  since orig_ax is always just a small integer and doesn't need
  the full 64-bit range ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bluetooth: Add another Broadcom device

This adds another Broadcom BCM2045 based device to the blacklist, with
these settings the micro dongle works on my system.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm

* 'slab-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm:
  slub: fix typo in Documentation/vm/slub.txt
  slab: NUMA slab allocator migration bugfix
  slub: Do not cross cacheline boundaries for very small objects
  slab - use angle brackets for include of kmalloc_sizes.h
  slab numa fallback logic: Do not pass unfiltered flags to page allocator
  slub statistics: Fix check for DEACTIVATE_REMOTE_FREES

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: update references to Documentation/ide/ide.txt (v2)
  ide: move ide.txt to Documentation/ide/
  ide: fix buggy code in ide_register_hw()
  ide: fix enabling DMA on it821x in "smart" mode
  ide-cd: mark REQ_TYPE_ATA_PC write requests with REQ_RW flag

ide: update references to Documentation/ide/ide.txt (v2)

Fix all references to Documentation/ide/ide.txt.
Add/update ide/00-INDEX file.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: move ide.txt to Documentation/ide/

Cleanup some of Documentation directory:
Move Documentation/ide.txt to the ide/ sub-directory.
Fix trailing whitespace while there.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: fix buggy code in ide_register_hw()

Relocating the index to come after finding the hwif pointer.

Signed-off-by: Peter Teoh <htmldeveloper@gmail.com>
Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: fix enabling DMA on it821x in "smart" mode

ide_tune_dma() should return '1' if IDE_HFLAG_NO_SET_MODE host flag is set.

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-cd: mark REQ_TYPE_ATA_PC write requests with REQ_RW flag

On Thursday 06 March 2008, walt wrote:

> For me, this commit causes the problem it's intended to fix:
>
> commit 9f10d9ee0ac6d79d7bc8b9a158bf4a29322d84d3
> Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Date:   Tue Feb 26 21:50:35 2008 +0100
>
>      ide-cd: fix 'ireason' handling for REQ_TYPE_ATA_PC requests
>
>      This fixes some hangs caused by not finishing the transfer before ending
>      the request and also makes use of 'ireason == 1' quirk for spurious IRQs.
>
> When I mount a CD there is a long delay, and I see this error message:
>
> hdc: ide_cd_check_ireason: wrong transfer direction!
> cdrom: failed setting lba address space
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hdc: drive not ready for command
> <repeated many times>
>
> When I revert this commit everything works properly again, including
> CD burning.

It turned out that REQ_TYPE_ATA_PC write requests were not marked as such
(the previous commit assumed them to be).

Reported-by: walt <w41ter@gmail.com>
Tested-by: walt <w41ter@gmail.com>
Reviewed-by: Borislav Petkov <petkovbb@googlemail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
  NFS: Fix the fsid revalidation in nfs_update_inode()
  SUNRPC: Fix a nfs4 over rdma transport oops
  NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c

NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings

As long as the directory contents haven't changed, we should just let the
path walk proceed to cross the mountpoint. Apart from being an optimisation
in the case of 'nohide' mountpoint traversals, it also fixes an issue with
referrals: referral inodes don't have valid filehandles, so calling
nfs_revalidate_inode() on them is a bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix the fsid revalidation in nfs_update_inode()

When we detect that we've crossed a mountpoint on the remote server, we
must take care not to use that inode to revalidate the fsid on our
current superblock. To do so, we label the inode as a remote mountpoint,
and check for that in nfs_update_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a nfs4 over rdma transport oops

Prevent an RPC oops when freeing a dynamically allocated RDMA
buffer, used in certain special-case large metadata operations.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c

O_SYNC is stored in filp->f_flags.
Thanks to Al Viro for pointing out the bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

gigaset: fix Oops on module unload regression

The card state mutex was only initialized when a device was connected,
but used during unload unconditionally, leading to an Oops if a driver
was loaded and unloaded again without ever connecting a device.

Fix this by initializing the mutex as soon as the structure is allocated.
Also add a missing mutex unlock revealed in the same execution path.

This fixes a possible Oops in 2.6.25-rc that was introduced by commit
e468c04894f36045cf93d1384183a461014b6840 ("Gigaset: permit module
unload").

Thanks to Roland Kletzing for reporting this problem.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Tested-by: Roland Kletzing <devzero@web.de>
Cc: Hansjoerg Lipp <hjlipp@web.de>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
  sched: don't allow rt_runtime_us to be zero for groups having rt tasks
  sched: rt-group: fixup schedulability constraints calculation
  sched: fix the wrong time slice value for SCHED_FIFO tasks
  sched: export task_nice
  sched: balance RT task resched only on runqueue
  sched: retain vruntime

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86-boot: don't request VBE2 information
  x86: re-add reboot fixups
  x86: fix typo in step.c
  x86: fix merge mistake in i387.c
  x86: clear DF before calling signal handler

drivers/char/esp.c: fix bootup lockup

randconfig testing found a bootup lockup in drivers/char/esp.c because
of a spinlock that wasn't correctly initialized.

I'm not sure why it became more prominent in 2.6.25-rc4, the bug seems
rather old and i've been doing allyesconfig bootups for ages with
CONFIG_ESP enabled.

This fixes this bootup lockup:

PM: Adding info for No Bus:ttyP63
ttyP32 at 0x0240 (irq = 0) is an ESP primary port
BUG: spinlock lockup on CPU#0, swapper/1, f56dd004
Pid: 1, comm: swapper Not tainted 2.6.25-rc4-sched-devel.git-x86-latest.git #402 [<c03ac6f4>] _raw_spin_lock+0x134/0x140
  [<c08649be>] _spin_lock_irqsave+0x5e/0x80
  [<c0b9fbfe>] ? espserial_init+0x2be/0x6e0
  [<c0b9fbfe>] espserial_init+0x2be/0x6e0
  [<c0b877a3>] kernel_init+0x83/0x260
  [<c0b9f940>] ? espserial_init+0x0/0x6e0
  [<c010416a>] ? restore_nocheck_notrace+0x0/0xe
  [<c0b87720>] ? kernel_init+0x0/0x260
  [<c0b87720>] ? kernel_init+0x0/0x260
  [<c0104507>] kernel_thread_helper+0x7/0x10
  =======================

kzalloc() is not the way to initialize spinlocks anymore.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched: don't allow rt_runtime_us to be zero for groups having rt tasks

This patch checks if we can set the rt_runtime_us to 0. If there is a
realtime task in the group, we don't want to set the rt_runtime_us as 0
or bad things will happen. (that task wont get any CPU time despite
being TASK_RUNNNG)

Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: rt-group: fixup schedulability constraints calculation

it was only possible to configure the rt-group scheduling parameters
beyond the default value in a very small range.

that's because div64_64() has a different calling convention than
do_div() :/

fix a few untidies while we are here; sysctl_sched_rt_period may overflow
due to that multiplication, so cast to u64 first. Also that RUNTIME_INF
juggling makes little sense although its an effective NOP.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: fix the wrong time slice value for SCHED_FIFO tasks

Function sys_sched_rr_get_interval returns wrong time slice value for
SCHED_FIFO tasks. The time slice for SCHED_FIFO tasks should be 0.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: export task_nice

The API is trivial, and so is the implementation.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: balance RT task resched only on runqueue

Sripathi Kodi reported a crash in the -rt kernel:

https://bugzilla.redhat.com/show_bug.cgi?id=435674

this is due to a place that can reschedule a task without holding
the tasks runqueue lock. This was caused by the RT balancing code
that pulls RT tasks to the current run queue and will reschedule the
current task.

There's a slight chance that the pulling of the RT tasks will release
the current runqueue's lock and retake it (in the double_lock_balance).
During this time that the runqueue is released, the current task can
migrate to another runqueue.

In the prio_changed_rt code, after the pull, if the current task is of
lesser priority than one of the RT tasks pulled, resched_task is called
on the current task. If the current task had migrated in that small
window, resched_task will be called without holding the runqueue lock
for the runqueue that the task is on.

This race condition also exists in the mainline kernel and this patch
adds a check to make sure the task hasn't migrated before calling
resched_task.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Tested-by: Sripathi Kodi <sripathik@in.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sched: retain vruntime

Kei Tokunaga reported an interactivity problem when moving tasks
between control groups.

Tasks would retain their old vruntime when moved between groups, this
can cause funny lags. Re-set the vruntime on group move to fit within
the new tree.

Reported-by: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86-boot: don't request VBE2 information

The new x86 setup code (4fd06960f120) broke booting on an old P3/500MHz
with an onboard Voodoo3 of mine. After debugging it, it turned out
to be caused by the fact that the vesa probing now asks for VBE2 data.

Disassembing the video BIOS shows that it overflows the vesa_general_info
structure when VBE2 data is requested because the source addresses for the
information strings which get strcpy'ed to the buffer lie outside the 32K
BIOS code (and hence contain long sequences of 0xff's).

E.G.:

get_vbe_controller_info:
00002A9C  60                pushaw
00002A9D  1E                push ds
00002A9E  0E                push cs
00002A9F  1F                pop ds
00002AA0  2BC9              sub cx,cx
00002AA2  6626813D56424532  cmp dword [es:di],0x32454256 ; "VBE2"
00002AAA  7501              jnz .1
00002AAC  41                inc cx
.1:
00002AAD  51                push cx
00002AAE  B91400            mov cx,0x14
00002AB1  BED47F            mov si, controller_header
00002AB4  57                push di
00002AB5  F3A4              rep movsb ; copy vbe1.2 header

00002AB7  B9EC00            mov cx,0xec
00002ABA  2AC0              sub al,al
00002ABC  F3AA              rep stosb ; zero pad remainder

00002ABE  5F                pop di
00002ABF  E8EB0D            call word get_memory
00002AC2  C1E002            shl ax,0x2
00002AC5  26894512          mov [es:di+0x12],ax ; total memory
00002AC9  26C745040003      mov word [es:di+0x4],0x300 ; VBE version
00002ACF  268C4D08          mov [es:di+0x8],cs
00002AD3  268C4D10          mov [es:di+0x10],cs
00002AD7  59                pop cx
00002AD8  E361              jcxz .done ; VBE2 requested?
00002ADA  8D9D0001          lea bx,[di+0x100]
00002ADE  53                push bx
00002ADF  87DF              xchg bx,di ; di now points to 2nd half
00002AE1  26C747140001      mov word [es:bx+0x14],0x100 ; sw rev

00002AE7  26897F06          mov [es:bx+0x6],di ; oem string
00002AEB  268C4708          mov [es:bx+0x8],es
00002AEF  BE5280            mov si,0x8052 ; oem string
00002AF2  E87A1B            call word strcpy

00002AF5  26897F0E          mov [es:bx+0xe],di ; video mode list
00002AF9  268C4710          mov [es:bx+0x10],es
00002AFD  B91E00            mov cx,0x1e
00002B00  BEE87F            mov si,vidmodes
00002B03  F3A5              rep movsw

00002B05  26897F16          mov [es:bx+0x16],di ; oem vendor
00002B09  268C4718          mov [es:bx+0x18],es
00002B0D  BE2480            mov si,0x8024 ; oem vendor
00002B10  E85C1B            call word strcpy

00002B13  26897F1A          mov [es:bx+0x1a],di ; oem product
00002B17  268C471C          mov [es:bx+0x1c],es
00002B1B  BE3880            mov si,0x8038 ; oem product
00002B1E  E84E1B            call word strcpy

00002B21  26897F1E          mov [es:bx+0x1e],di ; oem product rev
00002B25  268C4720          mov [es:bx+0x20],es
00002B29  BE4580            mov si,0x8045 ; oem product rev
00002B2C  E8401B            call word strcpy

00002B2F  58                pop ax
00002B30  B90001            mov cx,0x100
00002B33  2BCF              sub cx,di
00002B35  03C8              add cx,ax
00002B37  2AC0              sub al,al
00002B39  F3AA              rep stosb ; zero pad
.done:
00002B3B  1F                pop ds
00002B3C  61                popaw
00002B3D  B84F00            mov ax,0x4f
00002B40  C3                ret

(The full BIOS can be found at http://peter.korsgaard.com/vgabios.bin
if interested).

The old setup code didn't ask for VBE2 info, and the new code doesn't
actually do anything with the extra information, so the fix is to simply
not request it. Other BIOS'es might have the same problem.

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: re-add reboot fixups

Jan Beulich noticed that the reboot fixups went missing during
reboot.c unification.

(commit 4d022e35fd7e07c522c7863fee6f07e53cf3fc14)

Geode and a few other rare boards with special reboot quirks are
affected.

Reported-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix typo in step.c

TIF_DEBUGCTLMSR has no meaning in the actual MSR...

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: fix merge mistake in i387.c

convert_fxsr_to_user() in 2.6.24's i387_32.c did this, and
convert_to_fxsr() also does the inverse, so I assume it's an oversight
that it is no longer being done.

[ mingo@elte.hu:

  we encode it this way because there's no space for the 'FPU Last
  Instruction Opcode' (->fop) field in the legacy user_i387_ia32_struct
  that PTRACE_GETFPREGS/PTRACE_SETFPREGS uses.

  it's probably pure legacy - i'd be surprised if any user-space relied on
  the FPU Last Opcode in any way. But indeed we used to do it previously
  so the most conservative thing is to preserve that piece of information.
]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: clear DF before calling signal handler

The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.

Linux had this behavior/bug forever, but this becomes a real problem
with gcc version 4.3, which assumes that the direction flag is
correctly cleared at the entry of a function.

This patches changes the setup_frame() functions to clear the
direction before entering the signal handler.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25

* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25:
  sh: Fix up the sh64 build.
  sh: Fix up SH7710 VoIP-GW build.
  sh: Flag PMB support as EXPERIMENTAL.
  sh: Update r7780mp defconfig.
  fb: hitfb: Balance probe/remove section annotations.
  sh: hp6xx: Fix up hp6xx_apm build failure.
  fb: pvr2fb: Fix up remaining section mismatch.
  sh: Fix up section mismatches.
  sh: hp6xx: Correct APM output.
  sh: update se7780 defconfig
  sh: replace remaining __FUNCTION__ occurrences
  sh: export copy-page() to modules
  sh_ksyms_32.c update for gcc 4.3
  sh/mm/pg-sh7705.c must #include <linux/fs.h>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
  [Blackfin] arch: current_l1_stack_save is a pointer, so use NULL rather than 0
  [Blackfin] arch: fix atomic and32/xor32 comments and ENDPROC markings
  [Blackfin] arch: fix bug - allow SDH driver to be used as module
  [Blackfin] arch: to kill syscalls missing warning by adding new timerfd syscalls

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] kprobes arch consolidation build fix
  [IA64] update efi region debugging to use MB, GB and TB as well as KB
  [IA64] use dev_printk in video quirk
  [IA64] remove remaining __FUNCTION__ occurrences
  [IA64] remove unnecessary nfs includes from sys_ia32.c
  [IA64] remove CONFIG_SMP ifdef in ia64_send_ipi()
  [IA64] arch_ptrace() cleanup
  [IA64] remove duplicate code from arch_ptrace()
  [IA64] convert sys_ptrace to arch_ptrace
  [IA64] remove find_thread_for_addr()
  [IA64] do not sync RBS when changing PT_AR_BSP or PT_CFM
  [IA64] access user RBS directly

slub: fix typo in Documentation/vm/slub.txt

slub_debug=,dentry is correct, not dentry_cache.

Signed-off-by: Itaru Kitayama <i-kitayama@ap.jp.nec.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>

slab: NUMA slab allocator migration bugfix

NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller. As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit. When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

kernel BUG at mm/slab.c:2417
cache_alloc_refill+0xe6
kmem_cache_alloc+0xd0
...

This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>

slub: Do not cross cacheline boundaries for very small objects

SLUB should pack even small objects nicely into cachelines if that is what
has been asked for. Use the same algorithm as SLAB for this.

The effect of this patch for a system with a cacheline size of 64
bytes is that the 24 byte sized slab caches will now put exactly
2 objects into a cacheline instead of 3 with some overlap into
the next cacheline. This reduces the object density in a 4k slab
from 170 to 128 objects (same as SLAB).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>

slab - use angle brackets for include of kmalloc_sizes.h

Make them all use angle brackets and the directory name.

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>