err.no Git - linux-2.6/log

[x86] Fix TSC calibration issues

Larry Finger reported at http://lkml.org/lkml/2008/9/1/90:
An ancient laptop of mine started throwing errors from b43legacy when
I started using 2.6.27 on it. This has been bisected to commit bfc0f59
"x86: merge tsc calibration".

The unification of the TSC code adopted mostly the 64bit code, which
prefers PMTIMER/HPET over the PIT calibration.

Larrys system has an AMD K6 CPU. Such systems are known to have
PMTIMER incarnations which run at double speed. This results in a
miscalibration of the TSC by factor 0.5. So the resulting calibrated
CPU/TSC speed is half of the real CPU speed, which means that the TSC
based delay loop will run half the time it should run. That might
explain why the b43legacy driver went berserk.

On the other hand we know about systems, where the PIT based
calibration results in random crap due to heavy SMI/SMM
disturbance. On those systems the PMTIMER/HPET based calibration logic
with SMI detection shows better results.

According to Alok also virtualized systems suffer from the PIT
calibration method.

The solution is to use a more wreckage aware aproach than the current
either/or decision.

1) reimplement the retry loop which was dropped from the 32bit code
during the merge. It repeats the calibration and selects the lowest
frequency value as this is probably the closest estimate to the real
frequency

2) Monitor the delta of the TSC values in the delay loop which waits
for the PIT counter to reach zero. If the maximum value is
significantly different from the minimum, then we have a pretty safe
indicator that the loop was disturbed by an SMI.

3) keep the pmtimer/hpet reference as a backup solution for systems
where the SMI disturbance is a permanent point of failure for PIT
based calibration

4) do the loop iteration for both methods, record the lowest value and
decide after all iterations finished.

5) Set a clear preference to PIT based calibration when the result
makes sense.

The implementation does the reference calibration based on
HPET/PMTIMER around the delay, which is necessary for the PIT anyway,
but keeps separate TSC values to ensure the "independency" of the
resulting calibration values.

Tested on various 32bit/64bit machines including Geode 266Mhz, AMD K6
(affected machine with a double speed pmtimer which I grabbed out of
the dump), Pentium class machines and AMD/Intel 64 bit boxen.

Bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/char/random.c: fix a race which can lead to a bogus BUG()

Fix a bug reported by and diagnosed by Aaron Straus.

This is a regression intruduced into 2.6.26 by

    commit adc782dae6c4c0f6fb679a48a544cfbcd79ae3dc
    Author: Matt Mackall <mpm@selenic.com>
    Date:   Tue Apr 29 01:03:07 2008 -0700

        random: simplify and rename credit_entropy_store

credit_entropy_bits() does:

spin_lock_irqsave(&r->lock, flags);
...
if (r->entropy_count > r->poolinfo->POOLBITS)
r->entropy_count = r->poolinfo->POOLBITS;

so there is a time window in which this BUG_ON():

static size_t account(struct entropy_store *r, size_t nbytes, int min,
      int reserved)
{
unsigned long flags;

BUG_ON(r->entropy_count > r->poolinfo->POOLBITS);

/* Hold lock while accounting */
spin_lock_irqsave(&r->lock, flags);

can trigger.

We could fix this by moving the assertion inside the lock, but it seems
safer and saner to revert to the old behaviour wherein
entropy_store.entropy_count at no time exceeds
entropy_store.poolinfo->POOLBITS.

Reported-by: Aaron Straus <aaron@merfinllc.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <stable@kernel.org> [2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pm_qos_requirement might sleep

Make PM_QOS and CPU_IDLE play nicer when run with the RT-Preempt kernel.

The purpose of the patch is to remove the spin_lock around the read in the
function pm_qos_requirement - since spinlocks can sleep in -rt and this
function is called from idle.

CPU_IDLE polls the target_value's of some of the pm_qos parameters from
the idle loop causing sleeping locking warnings. Changing the
target_value to an atomic avoids this issue.

Remove the spinlock in pm_qos_requirement by making target_value an atomic
type.

Signed-off-by: mark gross <mgross@linux.intel.com>
Signed-off-by: John Kacur <jkacur@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-cmos: wake again from S5

Update rtc-cmos shutdown handling to leave RTC alarms active, resolving
http://bugzilla.kernel.org/show_bug.cgi?id=11411 on several boards. There
are still some systems where the ACPI event handling doesn't cooperate.
(Possibly related to bugid 11312, reporting the spontaneous disabling of
RTC events.)

Bug 11411 reported that changes to work around some ACPI event issues
broke wake-from-S5 handling, as used for DVR applications. (They like to
power off, then wake later to record programs.)

[yakui.zhao@intel.com: add shutdown for PNP devices]
[dbrownell@users.sourceforge.net: update comments]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Stefan Bauer <stefan.bauer@cs.tu-chemnitz.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sysfs: document files in /sys/firmware/sgi_uv/

Document files in /sys/firmware/sgi_uv/.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ibft: fix target info parsing in ibft module

I got this patch through Red Hat's bugzilla from the bug submitter and
patch creator.  I have just fixed it up so it applies without fuzz to
upstream kernels.

Original patch and description from Shyam kumar Iyer:

The issue [ibft module not displaying targets with short names] is because
of an offset calculatation error in the iscsi_ibft.c code.  Due to this
error directory structure for the target in /sys/firmware/ibft does not
get created and so the initiator is unable to connect to the target.

Note that this bug surfaced only with an name that had a short section at
the end.  eg: "iqn.1984-05.com.dell:dell".  It did not surface when the
iqn's had a longer section at the end.  eg:
"iqn.2001-04.com.example:storage.disk2.sys1.xyz"

So, the eot_offset was calculated such that an extra 48 bytes i.e.  the
size of the ibft_header which has already been accounted was subtracted
twice.

This was not evident with longer iqn names because they would overshoot
the total ibft length more than 48 bytes and thus would escape the bug.

Signed-off-by: Shyam Kumar Iyer <shyam_iyer@dell.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Konrad Rzeszutek <konrad@virtualiron.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc_time_to_tm: fix signed/unsigned arithmetic

commit 945185a69daa457c4c5e46e47f4afad7dcea734f ("rtc: rtc_time_to_tm: use
unsigned arithmetic") changed the some types in rtc_time_to_tm() to
unsigned:

void rtc_time_to_tm(unsigned long time, struct rtc_time *tm)
{
- register int days, month, year;
+ unsigned int days, month, year;

This doesn't work for all cases, because days is checked for < 0 later
on:

if (days < 0) {
year -= 1;
days += 365 + LEAP_YEAR(year);
}

I think the correct fix would be to keep days signed and do an appropriate
cast later on.

Signed-off-by: Jan Altenberg <jan.altenberg@linutronix.de>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tdfxfb: fix frame buffer name overrun

If there are more then one graphics card handled by the tdfxfb driver the
name of the frame buffer overruns reserved size.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tdfxfb: fix SDRAM memory size detection

Fix memory detection on Voodoo3 cards with SDRAM memory.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hp-wmi: add proper hotkey support

It turns out that event 0x4 merely indcates that a hotkey has been
pressed, not which one. A further query is required in order to determine
the actual keypress. The following patch adds support for that along with
the known keycodes.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hp-wmi: update to match current rfkill semantics

hp-wmi currently changes the RFKill state by altering the struct members
rather than using the dedicated interface, meaning that update events
won't be pushed to userspace. This patch fixes that, along with fixing
the declared type of the WWAN kill switch. It also ensures that rfkill
interfaces are only registered for hardware that exists.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <ivdoorn@gmail.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc: document the new auto_msgmni proc file

Update Documentation/filesystems/proc.txt: it describes the file
auto_msgmni intoduced to enable/disable msgmni automatic recomputing upon
memory add/remove (see thread http://lkml.org/lkml/2008/7/4/27). Also
added a description for msgmni (this filex is only listed in
Documentation/sysctl/kernel.txt).

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: size of quicklists shouldn't be proportional to the number of CPUs

Quicklists store pages for each CPU as caches.  (Each CPU can cache
node_free_pages/16 pages)

It is used for page table cache.  exit() will increase the cache size,
while fork() consumes it.

So for example if an apache-style application runs (one parent and many
child model), one CPU process will fork() while another CPU will process
the middleware work and exit().

At that time, the CPU on which the parent runs doesn't have page table
cache at all.  Others (on which children runs) have maximum caches.

QList_max = (#ofCPUs - 1) x Free / 16
=> QList_max / (Free + QList_max) = (#ofCPUs - 1) / (16 + #ofCPUs - 1)

So, How much quicklist memory is used in the maximum case?

This is proposional to # of CPUs because the limit of per cpu quicklist
cache doesn't see the number of cpus.

Above calculation mean

Number of CPUs per node            2    4    8   16
==============================  ====================
QList_max / (Free + QList_max)   5.8%  16%  30%  48%

Wow! Quicklist can spend about 50% memory at worst case.

My demonstration program is here
--------------------------------------------------------------------------------
#define _GNU_SOURCE

#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sched.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define BUFFSIZE 512

int max_cpu(void) /* get max number of logical cpus from /proc/cpuinfo */
{
  FILE *fd;
  char *ret, buffer[BUFFSIZE];
  int cpu = 1;

  fd = fopen("/proc/cpuinfo", "r");
  if (fd == NULL) {
    perror("fopen(/proc/cpuinfo)");
    exit(EXIT_FAILURE);
  }
  while (1) {
    ret = fgets(buffer, BUFFSIZE, fd);
    if (ret == NULL)
      break;
    if (!strncmp(buffer, "processor", 9))
      cpu = atoi(strchr(buffer, ':') + 2);
  }
  fclose(fd);
  return cpu;
}

void cpu_bind(int cpu) /* bind current process to one cpu */
{
  cpu_set_t mask;
  int ret;

  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);
  ret = sched_setaffinity(0, sizeof(mask), &mask);
  if (ret == -1) {
    perror("sched_setaffinity()");
    exit(EXIT_FAILURE);
  }
  sched_yield(); /* not necessary */
}

#define MMAP_SIZE (10 * 1024 * 1024) /* 10 MB */
#define FORK_INTERVAL 1 /* 1 second */

main(int argc, char *argv[])
{
  int cpu_max, nextcpu;
  long pagesize;
  pid_t pid;

  /* set max number of logical cpu */
  if (argc > 1)
    cpu_max = atoi(argv[1]) - 1;
  else
    cpu_max = max_cpu();

  /* get the page size */
  pagesize = sysconf(_SC_PAGESIZE);
  if (pagesize == -1) {
    perror("sysconf(_SC_PAGESIZE)");
    exit(EXIT_FAILURE);
  }

  /* prepare parent process */
  cpu_bind(0);
  nextcpu = cpu_max;

loop:

  /* select destination cpu for child process by round-robin rule */
  if (++nextcpu > cpu_max)
    nextcpu = 1;

  pid = fork();

  if (pid == 0) { /* child action */

    char *p;
    int i;

    /* consume page tables */
    p = mmap(0, MMAP_SIZE, PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    i = MMAP_SIZE / pagesize;
    while (i-- > 0) {
      *p = 1;
      p += pagesize;
    }

    /* move to other cpu */
    cpu_bind(nextcpu);
/*
    printf("a child moved to cpu%d after mmap().\n", nextcpu);
    fflush(stdout);
*/

    /* back page tables to pgtable_quicklist */
    exit(0);

  } else if (pid > 0) { /* parent action */

    sleep(FORK_INTERVAL);
    waitpid(pid, NULL, WNOHANG);

  }

  goto loop;
}
----------------------------------------

When above program which does task migration runs, my 8GB box spends
800MB of memory for quicklist.  This is not memory leak but doesn't seem
good.

% cat /proc/meminfo

MemTotal:        7701568 kB
MemFree:         4724672 kB
(snip)
Quicklists:       844800 kB

because

- My machine spec is
number of numa node: 2
number of cpus:      8 (4CPU x2 node)
        total mem:           8GB (4GB x2 node)
        free mem:            about 5GB

- Then, 4.7GB x 16% ~= 880MB.
  So, Quicklist can use 800MB.

So, if following spec machine run that program

   CPUs: 64 (8cpu x 8node)
   Mem:  1TB (128GB x8node)

Then, quicklist can waste 300GB (= 1TB x 30%).  It is too large.

So, I don't like cache policies which is proportional to # of cpus.

My patch changes the number of caches
from:
   per-cpu-cache-amount = memory_on_node / 16
to
   per-cpu-cache-amount = memory_on_node / 16 / number_of_cpus_on_node.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Tested-by: David Miller <davem@davemloft.net>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: show quicklist usage in /proc/meminfo

Quicklists can consume several GB of memory.  We should provide a means of
monitoring this.

After this patch is applied, /proc/meminfo will output the following:

% cat /proc/meminfo

MemTotal:      7715392 kB
MemFree:       5401600 kB
Buffers:         80384 kB
Cached:         300800 kB
SwapCached:          0 kB
Active:         235584 kB
Inactive:       262656 kB
SwapTotal:     2031488 kB
SwapFree:      2031488 kB
Dirty:            3520 kB
Writeback:           0 kB
AnonPages:      117696 kB
Mapped:          38528 kB
Slab:          1589952 kB
SReclaimable:    23104 kB
SUnreclaim:    1566848 kB
PageTables:      14656 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
WritebackTmp:        0 kB
CommitLimit:   5889152 kB
Committed_AS:   393152 kB
VmallocTotal: 17592177655808 kB
VmallocUsed:     29056 kB
VmallocChunk: 17592177626432 kB
Quicklists:     130944 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:    262144 kB

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

devcgroup: fix race against rmdir()

During the use of a dev_cgroup, we should guarantee the corresponding
cgroup won't be deleted (i.e. via rmdir). This can be done through
css_get(&dev_cgroup->css), but here we can just get and use the dev_cgroup
under rcu_read_lock.

And also remove checking NULL dev_cgroup, it won't be NULL since a task
always belongs to a cgroup.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cirrusfb: check_par fixes

1. Check if virtual resolution fits into memory.
   Otherwise, Linux hangs during panning.
2. When selected use all available memory to
    maximize yres_virtual to speed up panning
   (previously also xres_virtual was increased).
3. Simplify memory restriction calculations.

Signed-off-by: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pid_ns: (BUG 11391) change ->child_reaper when init->group_leader exits

We don't change pid_ns->child_reaper when the main thread of the
subnamespace init exits. As Robert Rex <robert.rex@exasol.com> pointed
out this is wrong.

Yes, the re-parenting itself works correctly, but if the reparented task
exits it needs ->parent->nsproxy->pid_ns in do_notify_parent(), and if the
main thread is zombie its ->nsproxy was already cleared by
exit_task_namespaces().

Introduce the new function, find_new_reaper(), which finds the new
->parent for the re-parenting and changes ->child_reaper if needed. Kill
the now unneeded exit_child_reaper().

Also move the changing of ->child_reaper from zap_pid_ns_processes() to
find_new_reaper(), this consolidates the games with ->child_reaper and
makes it stable under tasklist_lock.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=11391

Reported-by: Robert Rex <robert.rex@exasol.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing

zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.

Yes, we have already killed all tasks in this namespace, and sys_wait4()
doesn't see any child.  But this doesn't mean ->children list is empty, we
may have EXIT_DEAD tasks which are not visible to do_wait().  In that case
the subsequent forget_original_parent() will crash the kernel because it
will try to re-parent these tasks to the NULL reaper.

Even if there are no childs, it is not good that forget_original_parent()
uses reaper == NULL.

Change the code to set ->child_reaper = init_pid_ns.child_reaper instead.
We could use pid_ns->parent->child_reaper as well, I think this does not
really matter.  These EXIT_DEAD tasks are not visible to the new ->parent
after re-parenting, they will silently do release_task() eventually.

Note that we must change ->child_reaper, otherwise
forget_original_parent() will use reaper == father, and in that case we
will hit the (correct) BUG_ON(!list_empty(&father->children)).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: don't use coherent dma buffers

At91_mci is abusing dma_free_coherent(), which may not be called with IRQs
disabled.  I saw "mkfs.ext3" on an MMC card objecting voluminously as each
write completed:

WARNING: at arch/arm/mm/consistent.c:368 dma_free_coherent+0x2c/0x224()
[<c002726c>] (dump_stack+0x0/0x14) from [<c00387d4>] (warn_on_slowpath+0x4c/0x68)
[<c0038788>] (warn_on_slowpath+0x0/0x68) from [<c0028768>] (dma_free_coherent+0x2c/0x224)
  r6:00008008 r5:ffc06000 r4:00000000
[<c002873c>] (dma_free_coherent+0x0/0x224) from [<c01918ac>] (at91_mci_irq+0x374/0x420)
[<c0191538>] (at91_mci_irq+0x0/0x420) from [<c0065d9c>] (handle_IRQ_event+0x2c/0x6c)
...

This bug has been around for a LONG time.  The MM warning is from late
2005, but the driver merged a year later ...  so I'm puzzled why nobody
noticed this before now.

The fix involves noting that this buffer shouldn't be DMA-coherent; it's
just used for normal DMA writes.  So replace it with standard kmalloc()
buffering and DMA mapping calls.

This is the quickie fix.  A better one would not rely on allocating large
bounce buffers.  (Note that dma_alloc_coherent could have failed too, but
that case was ignored...  kmalloc is a bit more likely to fail though.)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Pierre Ossman <drzeus-mmc@drzeus.cx>
Cc: Andrew Victor <linux@maxim.org.za>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8250: improve workaround for UARTs that don't re-assert THRE correctly

Recent changes to tighten the check for UARTs that don't correctly
re-assert THRE (01c194d9278efc15d4785ff205643e9c0bdcef53: "serial 8250:
tighten test for using backup timer") caused problems when such a UART was
opened for the second time - the bug could only successfully be detected
at first initialization.  For users of this version of this particular
UART IP it is fatal.

This patch stores the information about the bug in the bugs field of the
port structure when the port is first started up so subsequent opens can
check this bit even if the test for the bug fails.

David Brownell: "My own exposure to this is that the UART on DaVinci
hardware, which TI allegedly derived from its original 16550 logic, has
periodically gone from working to unusable with the mainline 8250.c ...
and back and forth a bunch.  Currently it's "unusable", a regression from
some previous versions.  With this patch from Will, it's usable."

Signed-off-by: Will Newton <will.newton@gmail.com>
Acked-by: Alex Williamson <alex.williamson@hp.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Brownell <david-b@pacbell.net>
Cc: <stable@kernel.org> [2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add a maintainer for the BCM5974 multitouch driver

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/bootmem: silence section mismatch warning - contig_page_data/bootmem_node_data

WARNING: vmlinux.o(.data+0x1f5c0): Section mismatch in reference from the variable contig_page_data to the variable .init.data:bootmem_node_data
The variable contig_page_data references
the variable __initdata bootmem_node_data
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Sean MacLennan <smaclennan@pikatech.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

acer-wmi: remove debugfs entries upon unloading

The exit function neglects to remove debugfs entries, leading to a BUG
on reload.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Russ Dill <Russ.Dill@gmail.com>
Acked-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

VFS: fix dio write returning EIO when try_to_release_page fails

Dio write returns EIO when try_to_release_page fails because bh is
still referenced.

The patch

    commit 3f31fddfa26b7594b44ff2b34f9a04ba409e0f91
    Author: Mingming Cao <cmm@us.ibm.com>
    Date:   Fri Jul 25 01:46:22 2008 -0700

        jbd: fix race between free buffer and commit transaction

was merged into 2.6.27-rc1, but I noticed that this patch is not enough
to fix the race.

I did fsstress test heavily to 2.6.27-rc1, and found that dio write still
sometimes got EIO through this test.

The patch above fixed race between freeing buffer(dio) and committing
transaction(jbd) but I discovered that there is another race, freeing
buffer(dio) and ext3/4_ordered_writepage.

: background_writeout()
     ->write_cache_pages()
       ->ext3_ordered_writepage()
         walk_page_buffers() -> take a bh ref
   block_write_full_page() -> unlock_page
: <- end_page_writeback
                : <- race! (dio write->try_to_release_page fails)
          walk_page_buffers() ->release a bh ref

ext3_ordered_writepage holds bh ref and does unlock_page remaining
taking a bh ref, so this causes the race and failure of
try_to_release_page.

To fix this race, I used the approach of falling back to buffered
writes if try_to_release_page() fails on a page.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: make setup_zone_migrate_reserve() aware of overlapping nodes

I have gotten to the root cause of the hugetlb badness I reported back on
August 15th.  My system has the following memory topology (note the
overlapping node):

            Node 0 Memory: 0x8000000-0x44000000
            Node 1 Memory: 0x0-0x8000000 0x44000000-0x80000000

setup_zone_migrate_reserve() scans the address range 0x0-0x8000000 looking
for a pageblock to move onto the MIGRATE_RESERVE list.  Finding no
candidates, it happily continues the scan into 0x8000000-0x44000000.  When
a pageblock is found, the pages are moved to the MIGRATE_RESERVE list on
the wrong zone.  Oops.

setup_zone_migrate_reserve() should skip pageblocks in overlapping nodes.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

NTFS: update homepage

Update the location of the NTFS homepage in several files.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide/Kconfig: mark ide-scsi as deprecated
  ide-disk: remove stale init_idedisk_capacity() documentation
  palm_bk3710: improve IDE registration
  ide: fix hwif_to_node()
  IDE: palm_bk3710: fix compile warning for unused variable
  IDE: compile fix for sff_dma_ops

ide/Kconfig: mark ide-scsi as deprecated

Mark ide-scsi as deprecated and remove stale/bogus documentation.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-disk: remove stale init_idedisk_capacity() documentation

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

palm_bk3710: improve IDE registration

* fix device tree ... don't forget to set the parent device

* let init/exit code be removed where practical

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
[bart: splitted it from bigger DaVinci patch, s/hw.parent/hw.dev/]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: fix hwif_to_node()

hwif_to_node() incorrectly assumes that hwif->dev always belongs to
a PCI device. This results in ide-cs oopsing in init_irq() after
commit c56c5648a3bd15ff14c50f284b261140cd5b5472 accidentally fixed
device tree registration for ide-cs. Fix it by using dev_to_node().

Thanks to Martin Michlmayr and Larry Finger for help with debugging
the issue.

Reported-by: Martin Michlmayr <tbm@cyrius.com>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

IDE: palm_bk3710: fix compile warning for unused variable

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

IDE: compile fix for sff_dma_ops

The sff_dma_ops struct should be wrapped by BLK_DEV_IDEDMA_SFF instead
of BLK_DEV_IDEDMA_PCI.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Add mic-boost controls to ALC662/663 auto configuration
  ALSA: hda - Fix ALC663 auto-probe
  ALSA: ASoC: fix pxa2xx-i2s clk_get call
  ALSA: hda: Distortion fix for dell_m6_core_init

Merge branch 'audit.b57' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current

* 'audit.b57' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] audit: Moved variable declaration to beginning of function

Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
Fix problem with waiting while holding rcu read lock in md/bitmap.c
Remove invalidate_partition call from do_md_stop.

don't diff generated firmware files

With the new firmware infrastructure in 2.6.27, some files are generated and shouldn't be
diffed; add these 2 to the "dontdiff" file

Signed-off-by: Arjan van de Ven <arjan@Linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ACPI: Fix typo in "Disable MWAIT via DMI on broken Compal board"

This fixes a typo in commit 2a2a64714d9c40f7705c4de1e79a5b855c7211a9 "Disable MWAIT via DMI on broken Compal board".

It allows the nomwait dmi check to actually detect the Acer 5220.

Signed-off-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Dennis Jansen <dennis.jansen@web.de>
Acked-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.infradead.org/users/dwmw2/random-2.6

* git://git.infradead.org/users/dwmw2/random-2.6:
  [MTD] mtdchar.c: Fix regression in MEMGETREGIONINFO ioctl()
  dabusb_fpga_download(): fix a memory leak
  Remove '#include <stddef.h>' from mm/page_isolation.c
  Fix modules_install on RO nfs-exported trees.

Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux

* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix buffer overrun decoding NFSv4 acl
  sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports
  nfsd: fix compound state allocation error handling
  svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet

m68k: atari_keyb_init operator precedence fix

Fix operator precedence bug in atari_keyb_init, which caused a failure on CT60

Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix typo in arch/parisc/hpux/fs.c

A parisc allmodconfig build produces this:

arch/parisc/hpux/fs.c:107: error: 'buffer' undeclared (first use in this function)

Introduced by commit da574983de9f9283ba35662c8723627096e160de ("[PATCH]
fix hpux_getdents()").

Helge Dille also reported this in bugzilla 11461:

http://bugzilla.kernel.org/show_bug.cgi?id=11461

and he posted an identical patch.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

softlockup: minor cleanup, don't check task->state twice

The recent commit 16d9679f33caf7e683471647d1472bfe133d858 changed
check_hung_task() to filter out the TASK_KILLABLE tasks. We can
move this check to the caller which has to test t->state anyway.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/resource.c: fix new kernel-doc warning

Fix kernel-doc warning for new function:

Warning(linux-2.6.27-rc5-git2//kernel/resource.c:448): No description found for parameter 'root'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: downgrade debug message from info to debug.

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Turn off Unicode during session establishment for plaintext authentication
  [CIFS] update cifs change log
  cifs: fix O_APPEND on directio mounts
  [CIFS] Fix plaintext authentication

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: restore original behavior of /proc/partition when there's no partition
remove blk_register_filter and blk_unregister_filter in gendisk

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: setup_valid_addr_bitmap_from_pavail() should be __init
  sparc: Fix resource flags for PCI children in OF device tree.
  sparc32: Implement smp_call_function_single().

Un-break printk strings in x86 PCI probing code

Breaking lines due to some imaginary problem with a long line length is
often stupid and wrong, but never more so when it splits a string that
is printed out into multiple lines. This really ended up making it much
harder to find where some error strings were printed out, because a
simple 'grep' didn't work.

I'm sure there is tons more of this particular idiocy hiding in other
places, but this particular case hit me once more last week. So fix it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ALSA: hda - Add mic-boost controls to ALC662/663 auto configuration

Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda - Fix ALC663 auto-probe

Fix the wrong DAC assignment for NID 0x17 mono-pin on ALC663.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

[MTD] mtdchar.c: Fix regression in MEMGETREGIONINFO ioctl()

The MEMGETREGIONINFO ioctl() in mtdchar.c was clobbering user memory by
overwriting more than intended, due the size of struct mtd_erase_region_info
changing in commit 0ecbc81adfcb9f15f86b05ff576b342ce81bbef8 ('Support
for auto locking flash on power up').

Fix avoids this by copying struct members one by one with put_user(), as there
is no longer a convenient struct to use the size of as the length argument to
copy_to_user().

Signed-off-by: Zev Weiss <zevweiss@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

dabusb_fpga_download(): fix a memory leak

This patch fixes a memory leak in an error path.

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Remove '#include <stddef.h>' from mm/page_isolation.c

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Fix modules_install on RO nfs-exported trees.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=11355 by avoiding a
needless rebuild of the firmware/ihex2fw tool.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

[PATCH] audit: Moved variable declaration to beginning of function

got rid of compilation warning:
ISO C90 forbids mixed declarations and code

Signed-off-by: Cordelia Sam <cordesam@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nfsd: fix buffer overrun decoding NFSv4 acl

The array we kmalloc() here is not large enough.

Thanks to Johann Dahm and David Richter for bug report and testing.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: David Richter <richterd@citi.umich.edu>
Tested-by: Johann Dahm <jdahm@umich.edu>

sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports

Vegard Nossum reported
----------------------
> I noticed that something weird is going on with /proc/sys/sunrpc/transports.
> This file is generated in net/sunrpc/sysctl.c, function proc_do_xprt(). When
> I "cat" this file, I get the expected output:
>    $ cat /proc/sys/sunrpc/transports
>    tcp 1048576
>    udp 32768

> But I think that it does not check the length of the buffer supplied by
> userspace to read(). With my original program, I found that the stack was
> being overwritten by the characters above, even when the length given to
> read() was just 1.

David Wagner added (among other things) that copy_to_user could be
probably used here.

Ingo Oeser suggested to use simple_read_from_buffer() here.

The conclusion is that proc_do_xprt doesn't check for userside buffer
size indeed so fix this by using Ingo's suggestion.

Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
CC: Ingo Oeser <ioe-lkml@rameria.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Greg Banks <gnb@sgi.com>
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

nfsd: fix compound state allocation error handling

Move the cstate_alloc call so that if it fails, the response is setup to
encode the NFS error. The out label now means that the
nfsd4_compound_state has not been allocated.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

block: restore original behavior of /proc/partition when there's no partition

/proc/partitions didn't use to write out the header if there was no
partition. However, recent commit 66c64afe changed the behavior.
This is nothing major but there's no reason to change user visible
behavior without a good rationale. Restore the original behavior.

Note that 2.6.28 has clean up changes scheduled which will replace
this rather hacky implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

ALSA: ASoC: fix pxa2xx-i2s clk_get call

pxa2xx-i2s: probe actual device and use it for clk_get call
thus fixing error during startup hook

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda: Distortion fix for dell_m6_core_init

Added the EQ distortion fix to the dell_m6_core_init.

Signed-off-by: Matthew Ranostay <mranostay@embeddedalley.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Fix problem with waiting while holding rcu read lock in md/bitmap.c

A recent patch to protect the rdev list with rcu locking leaves us
with a problem because we can sleep on memalloc while holding the
rcu lock.

The rcu lock is only needed while walking the linked list as
uninteresting devices (failed or spares) can be removed at any time.

So only take the rcu lock while actually walking the linked list.
Take a refcount on the rdev during the time when we drop the lock
and do the memalloc to start IO.
When we return to the locked code, all the interesting devices
on the list will not have moved, so we can simply use
list_for_each_continue_rcu to pick up where we left off.

Signed-off-by: NeilBrown <neilb@suse.de>

Remove invalidate_partition call from do_md_stop.

When stopping an md array, or just switching to read-only, we
currently call invalidate_partition while holding the mddev lock.
The main reason for this is probably to ensure all dirty buffers
are flushed (invalidate_partition calls fsync_bdev).

However if any dirty buffers are found, it will almost certainly cause
a deadlock as starting writeout will require an update to the
superblock, and performing that updates requires taking the mddev
lock - which is already held.

This deadlock can be demonstrated by running "reboot -f -n" with
a root filesystem on md/raid, and some dirty buffers in memory.

All other calls to stop an array should already happen after a flush.
The normal sequence is to stop using the array (e.g. umount) which
will cause __blkdev_put to call sync_blockdev.  Then open the
array and issue the STOP_ARRAY ioctl while the buffers are all still
clean.

So this invalidate_partition is normally a no-op, except for one case
where it will cause a deadlock.

So remove it.

This patch possibly addresses the regression recored in
   http://bugzilla.kernel.org/show_bug.cgi?id=11460
and
   http://bugzilla.kernel.org/show_bug.cgi?id=11452

though it isn't yet clear how it ever worked.

Signed-off-by: NeilBrown <neilb@suse.de>

drm/radeon: downgrade debug message from info to debug.

If this triggers its bad, however some machines seem to have been
triggering it for ages and we didn't know until we added the debug.

So downgrade the debug now so people don't call this a regression.

Signed-off-by: Dave Airlie <airlied@redhat.com>

sparc64: setup_valid_addr_bitmap_from_pavail() should be __init

Signed-off-by: David S. Miller <davem@davemloft.net>

Resource handling: add 'insert_resource_expand_to_fit()' function

Not used anywhere yet, but this complements the existing plain
'insert_resource()' functionality with a version that can expand the
resource we are adding in order to fix up any conflicts it has with
existing resources.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: oxygen: fix distorted output on AK4396-based cards
Revert "ALSA: hda - Added model selection for iMac 24""

Don't trigger softlockup detector on network fs blocked tasks

Pulling the ethernet cable on a 2.6.27-rc system with NFS mounts
currently leads to an ongoing flood of soft lockup detector backtraces
for all tasks blocked on the NFS mounts when the hickup takes
longer than 120s.

I don't think NFS problems should be all that noisy.

Luckily there's a reasonably easy way to distingush this case.

Don't report task softlockup warnings for tasks in TASK_KILLABLE
state, which is used by the network file systems.

I believe this patch is a 2.6.27 candidate.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3"

This reverts commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd.

It wasn't really right to begin with (there's a better fix for the
problem with e820 reservations clashing with PCI BAR's pending), but it
also actually causes more regressions, so it should be reverted even
before the better fix is finalized.

Rafael reports that this commit broke AHCI detection, and thus causes
the kernel to not boot on his quad core test box.

Reported-and-bisected-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: David Witbrodt <dawitbro@sbcglobal.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ALSA: oxygen: fix distorted output on AK4396-based cards

When changing the sample rate, the CMI8788's master clock output becomes
unstable for a short time. The AK4396 needs the master clock to do SPI
writes, so writing to an AK4396 control register directly after a sample
rate change will garble the value. In our case, this leads to the DACs
being misconfigured to I2S sample format, which results in a wrong
output level and horrible distortions on samples louder than -6 dB.

To fix this, we need to wait until the new master clock signal has
become stable before doing SPI writes.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

remove blk_register_filter and blk_unregister_filter in gendisk

This patch remove blk_register_filter and blk_unregister_filter in
gendisk, and adds them to sd.c, sr.c. and ide-cd.c

The commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 moved cmdfilter
from gendisk to request_queue. It turned out that in some subsystems
multiple gendisks share a single request_queue. So we get:

Using physmap partition information
Creating 3 MTD partitions on "physmap-flash":
0x00000000-0x01c00000 : "User FS"
0x01c00000-0x01c40000 : "booter"
kobject (8511c410): tried to init an initialized object, something is seriously wrong.
Call Trace:
[<8036644c>] dump_stack+0x8/0x34
[<8021f050>] kobject_init+0x50/0xcc
[<8021fa18>] kobject_init_and_add+0x24/0x58
[<8021d20c>] blk_register_filter+0x4c/0x64
[<8021c194>] add_disk+0x78/0xe0
[<8027d14c>] add_mtd_blktrans_dev+0x254/0x278
[<8027c8f0>] blktrans_notify_add+0x40/0x78
[<80279c00>] add_mtd_device+0xd0/0x150
[<8027b090>] add_mtd_partitions+0x568/0x5d8
[<80285458>] physmap_flash_probe+0x2ac/0x334
[<802644f8>] driver_probe_device+0x12c/0x244
[<8026465c>] __driver_attach+0x4c/0x84
[<80263c64>] bus_for_each_dev+0x58/0xac
[<802633ec>] bus_add_driver+0xc4/0x24c
[<802648e0>] driver_register+0xcc/0x184
[<80100460>] _stext+0x60/0x1bc

In the long term, we need to fix such subsystems but we need a quick
fix now. This patch add the command filter support to only sd and sr
though it might be useful for other SG_IO users (such as cciss).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reported-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

sparc: Fix resource flags for PCI children in OF device tree.

When a device is under an EBUS or ISA bus, the resource flags
don't get set properly.

Fix this by re-evaluating the resource flags at each level of
bus as we apply ranges on the way to the root. And let PCI
override any existing flags setting, but don't let the
default flags calculator make such overrides.

Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 2.6.27-rc5

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5226/1: remove unmatched comment end.
  [ARM] Skip memory holes in FLATMEM when reading /proc/pagetypeinfo
  [ARM] use bcd2bin/bin2bcd
  [ARM] use the new byteorder headers
  [ARM] OMAP: Fix 2430 SMC91x ethernet IRQ
  [ARM] OMAP: Add and update OMAP default configuration files
  [ARM] OMAP: Change mailing list for OMAP in MAINTAINERS
  [ARM] S3C2443: Fix the S3C2443 clock register definitions
  [ARM] JIVE: Fix the spi bus numbering
  [ARM] S3C24XX: pwm.c: stop debugging output
  [ARM] S3C24XX: Fix sparse warnings in pwm.c
  [ARM] S3C24XX: Fix spare errors in pwm-clock driver
  [ARM] S3C24XX: Fix sparse warnings in arch/arm/plat-s3c24xx/gpiolib.c
  [ARM] S3C24XX: Fix nor-simtec driver sparse errors
  [ARM] 5225/1: zaurus: Register I2C controller for audio codecs
  [ARM] orion5x: update defconfig to v2.6.27-rc4
  [ARM] Orion: register UART1 on QNAP TS-209 and TS-409
  [ARM] Orion: activate lm75 driver on DNS-323
  [ARM] Orion: fix MAC detection on QNAP TS-209 and TS-409
  [ARM] Orion: Fix boot crash on Kurobox Pro

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
  Blackfin arch: Fix PM building on BF52x: No ROTWE on BF52x, add USBWE
  Blackfin arch: sram: use 'unsigned long' for irqflags
  Blackfin arch: let PCI depend on BROKEN
  Blackfin arch: move include/asm-blackfin header files to arch/blackfin
  Blackfin arch: fix bug - MPU crashes under stress
  Blackfin arch: Fix bug - when to rmmod the L1_module, it stucks and then reboot the board.
  Blackfin arch: dont actually need to muck with EMAC_SYSTAT for BF52x for demuxing
  Blackfin arch: Add MTD Partitions for MTD_DATAFLASH, increase max SPI SCLK

Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  exit signals: use of uninitialized field notify_count
  lockdep: fix invalid list_del_rcu in zap_class
  lockstat: repair erronous contention statistics
  lockstat: fix numerical output rounding error

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: rt-bandwidth accounting fix
sched: fix sched_rt_rq_enqueue() resched idle

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: update defconfigs
  x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
  x86: cpuid: correct return value on partial operations
  x86: msr: correct return value on partial operations
  x86: cpuid: propagate error from smp_call_function_single()
  x86: msr: propagate errors from smp_call_function_single()
  smp: have smp_call_function_single() detect invalid CPUs

Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  i2c: Prevent log spam on some DVB adapters
  i2c: Add missing kerneldoc descriptions
  i2c: Fix device_init_wakeup place

ftrace: disable tracing for hibernation

In accordance with commit f42ac38c59e0a03d6da0c24a63fb211393f484b0
("ftrace: disable tracing for suspend to ram"), disable tracing
around the suspend code in hibernation code paths.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[ARM] 5226/1: remove unmatched comment end.

remove unmatched comment end.

Signed-off-by: Jean-Christophe DUBOIS <jcd@tribudubois.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[CIFS] Turn off Unicode during session establishment for plaintext authentication

LANMAN session setup did not support Unicode (after session setup, unicode can
still be used though).

Fixes samba bug# 5319

CC: Jeff Layton <jlayton@redhat.com>
CC: Stable Kernel <stable@vger.kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>

[CIFS] update cifs change log

Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: fix O_APPEND on directio mounts

The direct I/O write codepath for CIFS is done through
cifs_user_write(). That function does not currently call
generic_write_checks() so the file position isn't being properly set
when the file is opened with O_APPEND. It's also not doing the other
"normal" checks that should be done for a write call.

The problem is currently that when you open a file with O_APPEND on a
mount with the directio mount option, the file position is set to the
beginning of the file. This makes any subsequent writes clobber the data
in the file starting at the beginning.

This seems to fix the problem in cursory testing. It is, however
important to note that NFS disallows the combination of
(O_DIRECT|O_APPEND). If my understanding is correct, the concern is
races with multiple clients appending to a file clobbering each others'
data. Since the write model for CIFS and NFS is pretty similar in this
regard, CIFS is probably subject to the same sort of races. What's
unclear to me is why this is a particular problem with O_DIRECT and not
with buffered writes...

Regardless, disallowing O_APPEND on an entire mount is probably not
reasonable, so we'll probably just have to deal with it and reevaluate
this flag combination when we get proper support for O_DIRECT. In the
meantime this patch at least fixes the existing problem.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>

sched: rt-bandwidth accounting fix

It fixes an accounting bug where we would continue accumulating runtime
even though the bandwidth control is disabled. This would lead to very long
throttle periods once bandwidth control gets turned on again.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Blackfin arch: Fix PM building on BF52x: No ROTWE on BF52x, add USBWE

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: sram: use 'unsigned long' for irqflags

Using just 'unsigned' will make flags an unsigned int. While this is
arguably not an error on blackfin where sizeof(int) == sizeof(long),
the patch is still justified on the grounds of principle.

The patch was generated using the Coccinelle semantic patch framework.

Cc: Julia Lawall <julia@diku.dk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

sched: fix sched_rt_rq_enqueue() resched idle

When sysctl_sched_rt_runtime is set to something other than -1 and the
CONFIG_RT_GROUP_SCHED kernel parameter is NOT enabled, we get into a state
where we see one or more CPUs idling forvever even though there are
real-time
tasks in their rt runqueue that are able to run (no longer throttled).

The sequence is:

- A real-time task is running when the timer sets the rt runqueue
    to throttled, and the rt task is resched_task()ed and switched
    out, and idle is switched in since there are no non-rt tasks to
    run on that cpu.

- Eventually the do_sched_rt_period_timer() runs and un-throttles
    the rt runqueue, but we just exit the timer interrupt and go back
    to executing the idle task in the idle loop forever.

If we change the sched_rt_rq_enqueue() routine to use some of the code
from the CONFIG_RT_GROUP_SCHED enabled version of this same routine and
resched_task() the currently executing task (idle in our case) if it is
a lower priority task than the higher rt task in the now un-throttled
runqueue, the problem is no longer observed.

Signed-off-by: John Blackwood <john.blackwood@ccur.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

i2c: Prevent log spam on some DVB adapters

Some DVB adapters do not support the special I2C transaction that we
use for probing purposes. There's no point in logging this event, as
there's nothing the user can do and in general there is no actual
problem. So, degrade one of these messages to a debug message, and
move the other one around so that it is only printed on bogus drivers.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Uwe Bugla <uwe.bugla@gmx.de>

i2c: Add missing kerneldoc descriptions

Add missing kernel descriptions of struct i2c_driver members.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Brownell <david-b@pacbell.net>

i2c: Fix device_init_wakeup place

device_init_wakeup must be called after device_register.

Signed-off-by: Marc Pignat <marc.pignat@hevs.ch>
Acked-by: David Brownell <david-b@pacbell.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

sparc32: Implement smp_call_function_single().

Reported by Stephen Rothwell.

Needed to fix the build when CONFIG_RELAY is enabled.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits)
  sctp: fix random memory dereference with SCTP_HMAC_IDENT option.
  sctp: correct bounds check in sctp_setsockopt_auth_key
  wan: Missing capability checks in sbni_ioctl()
  e100, fix iomap read
  qeth: preallocated header account offset
  qeth: l2 write unicast list to hardware
  qeth: use -EOPNOTSUPP instead of -ENOTSUPP.
  ibm_newemac: Don't call dev_mc_add() before device is registered
  net: don't grab a mutex within a timer context in gianfar
  forcedeth: fix checksum flag
  net/usb/mcs7830: add set_mac_address
  net/usb/mcs7830: new device IDs
  [netdrvr] smc91x: fix resource removal (null ptr deref)
  ibmveth: fix bad UDP checksums
  [netdrvr] hso: dev_kfree_skb crash fix
  [netdrvr] hso: icon 322 detection fix
  atl1: disable TSO by default
  atl1e: multistatement if missing braces
  igb: remove 82576 quad adapter
  drivers/net/skfp/ess.c: fix compile warnings
  ...

sctp: fix random memory dereference with SCTP_HMAC_IDENT option.

The number of identifiers needs to be checked against the option
length. Also, the identifier index provided needs to be verified
to make sure that it doesn't exceed the bounds of the array.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: correct bounds check in sctp_setsockopt_auth_key

The bonds check to prevent buffer overlflow was not exactly
right. It still allowed overflow of up to 8 bytes which is
sizeof(struct sctp_authkey).

Since optlen is already checked against the size of that struct,
we are guaranteed not to cause interger overflow either.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'omap-rmk'

IB/mlx4: Actually return L_Key and R_Key for fast register MRs

Initialize the L_Key and R_Key for memory regions returned from
mlx4_ib_alloc_fast_reg_mr(). Otherwise callers just get garbage for
the memory keys and can't do anything useful with these MRs.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] removed unused #include <version.h>
  [WATCHDOG] at91rm9200_wdt.c: fix misleading indentation
  [WATCHDOG] mpc8xxx_wdt: fix modular build
  [WATCHDOG] hpwdt.c kdebug support
  [WATCHDOG] Add support for the IDT RC32434 watchdog
  [WATCHDOG] Add support for the built-int RDC R-321x SoC watchdog
  [WATHDOG] delete unused driver mpc8xx_wdt.c
  [WATCHDOG] Fix s3c2410_wdt driver coding style issues
  [WATCHDOG] Clean out header of s3c2410_wdt driver.
  [WATCHDOG] Fix NULL usage in s3c2410_wdt driver.