]> err.no Git - linux-2.6/log
linux-2.6
16 years agosh: add interrupt ack code to sh3
Magnus Damm [Thu, 24 Apr 2008 12:36:34 +0000 (21:36 +0900)]
sh: add interrupt ack code to sh3

This patch adds interrupt acknowledge code for external interrupt
sources on sh3 processors. Only really required for edge triggered
interrupts, but we ack regardless of sense configuration.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: unify external irq pin code for sh3
Magnus Damm [Thu, 24 Apr 2008 12:30:09 +0000 (21:30 +0900)]
sh: unify external irq pin code for sh3

This patch unifies the sh3 external irq pin code. It buys us some
savings with reduced code redundancy, but the main feature with
this change is irq sense selection support for all sh3 processors.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh-sci: avoid writing to nonexistent registers
Magnus Damm [Wed, 23 Apr 2008 12:37:39 +0000 (21:37 +0900)]
sh-sci: avoid writing to nonexistent registers

Only write to hardware in SCI_OUT() if the register size is valid.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh-sci: sh7722 lacks scsptr registers
Magnus Damm [Wed, 23 Apr 2008 12:31:14 +0000 (21:31 +0900)]
sh-sci: sh7722 lacks scsptr registers

The sh7722 serial ports all lack SCSPTR registers, so mark them as
nonexistent in the register table.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh-sci: improve sh7722 support
Magnus Damm [Wed, 23 Apr 2008 12:25:29 +0000 (21:25 +0900)]
sh-sci: improve sh7722 support

Improve sh7722 support for SCIF1 and SCIF2 and separate code
from sh7366 implementation.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: reset hardware from early printk
Magnus Damm [Wed, 23 Apr 2008 12:16:06 +0000 (21:16 +0900)]
sh: reset hardware from early printk

Reset the transmitter and receiver when setting up early printk.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: drain and wait for early printk
Magnus Damm [Wed, 23 Apr 2008 12:05:11 +0000 (21:05 +0900)]
sh: drain and wait for early printk

Drain by waiting for all characters to be sent, and make sure to
wait a little bit after setting up the baud rate.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: use sci_out() for early printk
Magnus Damm [Wed, 23 Apr 2008 12:00:54 +0000 (21:00 +0900)]
sh: use sci_out() for early printk

Use sci_out() instead of ctrl_outw() for early printk setup code.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: add memory resources to /proc/iomem
Magnus Damm [Wed, 23 Apr 2008 11:56:44 +0000 (20:56 +0900)]
sh: add memory resources to /proc/iomem

Add physical memory resources such as System RAM, Kernel code/data/bss
and reserved crash dump area to /proc/iomem. Same strategy as on x86.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: add kernel bss resource
Magnus Damm [Wed, 23 Apr 2008 11:50:27 +0000 (20:50 +0900)]
sh: add kernel bss resource

Do like everyone else and have a struct resource for kernel bss.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: fix sh7705 interrupt vector typo
Magnus Damm [Wed, 23 Apr 2008 11:24:52 +0000 (20:24 +0900)]
sh: fix sh7705 interrupt vector typo

Fix sh7705 interrupt sources for vectors 0xc80 and 0xca0.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: update smc91x platform data for se7722
Magnus Damm [Wed, 23 Apr 2008 11:18:04 +0000 (20:18 +0900)]
sh: update smc91x platform data for se7722

Select smc91x bus width using platform data for se7722 now when the
smc91x header file is in place.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: update smc91x platform data for MigoR
Magnus Damm [Wed, 23 Apr 2008 11:13:59 +0000 (20:13 +0900)]
sh: update smc91x platform data for MigoR

Select smc91x bus width and irg flags using platform data for MigoR
now when the smc91x header file is in place.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: remove -traditional.
Mathieu Desnoyers [Fri, 25 Apr 2008 09:01:17 +0000 (18:01 +0900)]
sh: remove -traditional.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Sam Ravnborg <sam@ravnborg.org>
CC: linux-sh@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agortc: rtc-sh: Fixup for 64-bit resources.
Paul Mundt [Fri, 25 Apr 2008 08:58:42 +0000 (17:58 +0900)]
rtc: rtc-sh: Fixup for 64-bit resources.

ioremap() and friends get the size information right, so force everything
to go through there.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: r7780rp: Kill off unneded ifdefs for irq setup.
Paul Mundt [Fri, 25 Apr 2008 08:58:21 +0000 (17:58 +0900)]
sh: r7780rp: Kill off unneded ifdefs for irq setup.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: rts7751r2d: Kill off unneeded ifdefs.
Paul Mundt [Fri, 25 Apr 2008 07:10:53 +0000 (16:10 +0900)]
sh: rts7751r2d: Kill off unneeded ifdefs.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: intc_sh5 depends on cayman board for IRQ priority table.
Paul Mundt [Fri, 25 Apr 2008 07:08:37 +0000 (16:08 +0900)]
sh: intc_sh5 depends on cayman board for IRQ priority table.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agoinput: i8042: sh64 IRQ definitions depend on cayman board.
Paul Mundt [Fri, 25 Apr 2008 07:07:53 +0000 (16:07 +0900)]
input: i8042: sh64 IRQ definitions depend on cayman board.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: Enable use of the clk fwk on SH-5.
Paul Mundt [Fri, 25 Apr 2008 07:04:20 +0000 (16:04 +0900)]
sh: Enable use of the clk fwk on SH-5.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: export onchip_remap/unmap() too.
Paul Mundt [Fri, 25 Apr 2008 07:03:21 +0000 (16:03 +0900)]
sh64: export onchip_remap/unmap() too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: Some symbol exports to make the allmodconfig happier.
Paul Mundt [Fri, 25 Apr 2008 07:01:38 +0000 (16:01 +0900)]
sh64: Some symbol exports to make the allmodconfig happier.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agomtd: solutionengine flash map depends on solution engine mach group.
Paul Mundt [Fri, 25 Apr 2008 05:27:08 +0000 (14:27 +0900)]
mtd: solutionengine flash map depends on solution engine mach group.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: remove the broken SH_MPC1211 support
Adrian Bunk [Sun, 30 Mar 2008 22:40:17 +0000 (01:40 +0300)]
sh: remove the broken SH_MPC1211 support

SH_MPC1211 has been marked as BROKEN for some time.

Unless someone is working on reviving it now, I'd therefore suggest this
patch to remove it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: Setup I/D-TLB defaults in SH-5 probe path.
Paul Mundt [Fri, 25 Apr 2008 04:05:17 +0000 (13:05 +0900)]
sh64: Setup I/D-TLB defaults in SH-5 probe path.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: kexec and kdump depend on SUPERH32.
Paul Mundt [Fri, 25 Apr 2008 04:04:56 +0000 (13:04 +0900)]
sh: kexec and kdump depend on SUPERH32.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: Fix up compile warning in event tracer.
Paul Mundt [Fri, 25 Apr 2008 03:59:09 +0000 (12:59 +0900)]
sh64: Fix up compile warning in event tracer.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: Fixup the nommu build.
Paul Mundt [Fri, 25 Apr 2008 03:58:40 +0000 (12:58 +0900)]
sh64: Fixup the nommu build.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh64: fixups for xtime_lock seqlock conversion.
Paul Mundt [Fri, 25 Apr 2008 02:54:24 +0000 (11:54 +0900)]
sh64: fixups for xtime_lock seqlock conversion.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agosh: sh-bios depends on SUPERH32.
Paul Mundt [Fri, 25 Apr 2008 02:54:06 +0000 (11:54 +0900)]
sh: sh-bios depends on SUPERH32.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Thu, 8 May 2008 00:04:49 +0000 (17:04 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Fix fork/clone/vfork system call restart.
  sparc: Fix mmap VA span checking.

16 years agosparc: Fix fork/clone/vfork system call restart.
David S. Miller [Wed, 7 May 2008 23:21:28 +0000 (16:21 -0700)]
sparc: Fix fork/clone/vfork system call restart.

We clobber %i1 as well as %i0 for these system calls,
because they give two return values.

Therefore, on error, we have to restore %i1 properly
or else the restart explodes since it uses the wrong
arguments.

This fixes glibc's nptl/tst-eintr1.c testcase.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years ago[MAINTAINERS] New maintainer for Intel ethernet adapters
Auke Kok [Wed, 7 May 2008 20:42:33 +0000 (13:42 -0700)]
[MAINTAINERS] New maintainer for Intel ethernet adapters

I'm handing over maintainership to Jeff Kirsher and moving on
to other Linux/Open Source work within Intel. Good luck to Jeff ;)

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosparc: Fix mmap VA span checking.
David S. Miller [Wed, 7 May 2008 09:24:28 +0000 (02:24 -0700)]
sparc: Fix mmap VA span checking.

We should not conditionalize VA range checks on MAP_FIXED.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Wed, 7 May 2008 01:18:43 +0000 (18:18 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix initrd regression.
  usb: Sparc build fix, make USB_ISP1760_OF depend on PPC_OF
  sparc64: remove online_page()
  sparc64: use compat_sys_utimes instead of home-grown local copy.
  sbus: Fix bpp driver build.
  sparc video: make blank use proper constant
  Revert "[SPARC64]: Wrap SMP IPIs with irq_enter()/irq_exit()."
  sparc: tcx.c remove unnecessary function

16 years agoRevert "uml: fix gcc problem"
Linus Torvalds [Wed, 7 May 2008 00:09:27 +0000 (17:09 -0700)]
Revert "uml: fix gcc problem"

This reverts commit 22eecde2f9034764a3fd095eecfa3adfb8ec9a98.  Uli
reports that it breaks UML on x86-64 with the Fedora 8 gcc (gcc 4.1.2),
causing a crash on startup. See

http://marc.info/?l=linux-kernel&m=121011722806093&w=2

for a trace.

Reported-by: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosparc64: Fix initrd regression.
David S. Miller [Tue, 6 May 2008 22:19:54 +0000 (15:19 -0700)]
sparc64: Fix initrd regression.

We die because we forget to convert initrd_start and
initrd_end to virtual addresses.

Reported by Mikael Pettersson

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agousb: Sparc build fix, make USB_ISP1760_OF depend on PPC_OF
David S. Miller [Tue, 6 May 2008 22:15:12 +0000 (15:15 -0700)]
usb: Sparc build fix, make USB_ISP1760_OF depend on PPC_OF

Sparc doesn't have some of the OF interfaces this driver
wants to use.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoFix bogus warning in sysdev_driver_register()
OGAWA Hirofumi [Tue, 6 May 2008 19:02:53 +0000 (04:02 +0900)]
Fix bogus warning in sysdev_driver_register()

        if ((drv->entry.next != drv->entry.prev) ||
            (drv->entry.next != NULL)) {

warns list_empty(&drv->entry).

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Greg KH <gregkh@suse.de>
Cc: Len Brown <lenb@kernel.org>
[ Version 2 totally redone based on suggestions from Linus & Greg ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoVFS: fix unused variable warning
Linus Torvalds [Tue, 6 May 2008 20:13:37 +0000 (13:13 -0700)]
VFS: fix unused variable warning

Commit 33dcdac2df54e66c447ae03f58c95c7251aa5649 ("kill ->put_inode")
removed the final use of i_op->put_inode, but left the now totally
unused "op" variable in iput().

Get rid of it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agox86: fix PAE pmd_bad bootup warning
Hugh Dickins [Tue, 6 May 2008 19:49:23 +0000 (20:49 +0100)]
x86: fix PAE pmd_bad bootup warning

Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32.

That came from 9fc34113f6880b215cbea4e7017fc818700384c2 x86: debug pmd_bad();
but we understand now that the typecasting was wrong for PAE in the previous
version: pagetable pages above 4GB looked bad and stopped Arjan from booting.

And revert that cded932b75ab0a5f9181ee3da34a0a488d1a14fd x86: fix pmd_bad
and pud_bad to support huge pages.  It was the wrong way round: we shouldn't
weaken every pmd_bad and pud_bad check to let huge pages slip through - in
part they check that we _don't_ have a huge page where it's not expected.

Put the x86 pmd_bad() and pud_bad() definitions back to what they have long
been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking
junk in the upper word is good; and x86_64 should follow x86_32's stricter
comparison, to stop thinking any subset of required bits is good); but that
should be a later patch.

Fix Hans' good observation that follow_page() will never find pmd_huge()
because that would have already failed the pmd_bad test: test pmd_huge in
between the pmd_none and pmd_bad tests.  Tighten x86's pmd_huge() check?
No, once it's a hugepage entry, it can get quite far from a good pmd: for
example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits.

However... though follow_page() contains this and another test for huge
pages, so it's nice to keep it working on them, where does it actually get
called on a huge page?  get_user_pages() checks is_vm_hugetlb_page(vma) to
to call alternative hugetlb processing, as does unmap_vmas() and others.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Tue, 6 May 2008 18:39:57 +0000 (11:39 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] fix SMP ordering hole in fcntl_setlk()
  [PATCH] kill ->put_inode
  [PATCH] fix reservation discarding in affs

16 years ago[PATCH] fix SMP ordering hole in fcntl_setlk()
Al Viro [Tue, 6 May 2008 17:58:34 +0000 (13:58 -0400)]
[PATCH] fix SMP ordering hole in fcntl_setlk()

fcntl_setlk()/close() race prevention has a subtle hole - we need to
make sure that if we *do* have an fcntl/close race on SMP box, the
access to descriptor table and inode->i_flock won't get reordered.

As it is, we get STORE inode->i_flock, LOAD descriptor table entry vs.
STORE descriptor table entry, LOAD inode->i_flock with not a single
lock in common on both sides.  We do have BKL around the first STORE,
but check in locks_remove_posix() is outside of BKL and for a good
reason - we don't want BKL on common path of close(2).

Solution is to hold ->file_lock around fcheck() in there; that orders
us wrt removal from descriptor table that preceded locks_remove_posix()
on close path and we either come first (in which case eviction will be
handled by the close side) or we'll see the effect of close and do
eviction ourselves.  Note that even though it's read-only access,
we do need ->file_lock here - rcu_read_lock() won't be enough to
order the things.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
16 years ago[PATCH] kill ->put_inode
Christoph Hellwig [Tue, 29 Apr 2008 15:46:26 +0000 (17:46 +0200)]
[PATCH] kill ->put_inode

And with that last patch to affs killing the last put_inode instance we
can finally, after many years of transition kill this racy and awkward
interface.

(It's kinda funny that even the description in
Documentation/filesystems/vfs.txt was entirely wrong..)

Also remove a very misleading comment above the defintion of
struct super_operations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
16 years ago[PATCH] fix reservation discarding in affs
Roman Zippel [Tue, 29 Apr 2008 15:02:20 +0000 (17:02 +0200)]
[PATCH] fix reservation discarding in affs

- remove affs_put_inode, so preallocations aren't discared unnecessarily
  often.
- remove affs_drop_inode, it's called with a spinlock held, so it can't
  use a mutex.
- make i_opencnt atomic
- avoid direct b_count manipulations
- a few allocation failure fixes, so that these are more gracefully
  handled now.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
16 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Tue, 6 May 2008 16:17:03 +0000 (09:17 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (27 commits)
  pata_atiixp: Don't disable
  sata_inic162x: update intro comment, up the version and drop EXPERIMENTAL
  sata_inic162x: add cardbus support
  sata_inic162x: kill now unused SFF related stuff
  sata_inic162x: use IDMA for ATAPI commands
  sata_inic162x: use IDMA for non DMA ATA commands
  sata_inic162x: kill now unused bmdma related stuff
  sata_inic162x: use IDMA for ATA_PROT_DMA
  sata_inic162x: update TF read handling
  sata_inic162x: add / update constants
  sata_inic162x: misc clean ups
  sata_mv use hweight16() for bit counting (V2)
  sata_mv NCQ-EH for FIS-based switching
  sata_mv delayed eh handling
  libata: export ata_eh_analyze_ncq_error
  sata_mv new mv_port_intr function
  sata_mv fix mv_host_intr bug for hc_irq_cause
  sata_mv NCQ and SError fixes for mv_err_intr
  sata_mv rearrange mv_config_fbs
  sata_mv errata workaround for sata25 part 1
  ...

16 years agopata_atiixp: Don't disable
Alan Cox [Fri, 2 May 2008 22:13:39 +0000 (15:13 -0700)]
pata_atiixp: Don't disable

A couple of distributions (Fedora, Ubuntu) were having weird problems with the
ATI IXP series PATA controllers being reported as simplex.  At the heart of
the problem is that both distros ignored the recommendations to load pata_acpi
and ata_generic *AFTER* specific host drivers.

The underlying cause however is that if you D3 and then D0 an ATI IXP it
helpfully throws away some configuration and won't let you rewrite it.

Add checks to ata_generic and pata_acpi to pin ATIIXP devices.  Possibly the
real answer here is to quirk them and pin them, but right now we can't do that
before they've been pcim_enable()'d by a driver.

I'm indebted to David Gero for this.  His bug report not only reported the
problem but identified the cause correctly and he had tested the right values
to prove what was going on

[If you backport this for 2.6.24 you will need to pull in the 2.6.25
removal of the bogus WARN_ON() in pcim_enagle]

Signed-off-by: Alan Cox <alan@redhat.com>
Tested-by: David Gero <davidg@havidave.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: update intro comment, up the version and drop EXPERIMENTAL
Tejun Heo [Wed, 30 Apr 2008 07:35:17 +0000 (16:35 +0900)]
sata_inic162x: update intro comment, up the version and drop EXPERIMENTAL

sata_inic162x is now ready for production use.  Bump the version,
explain what's working and what's not and drop EXPERIMENTAL.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: add cardbus support
Tejun Heo [Wed, 30 Apr 2008 07:35:16 +0000 (16:35 +0900)]
sata_inic162x: add cardbus support

When attached to cardbus, mmio region is at BAR 1.  Other than that,
everything else is the same.  Add support for it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: kill now unused SFF related stuff
Tejun Heo [Wed, 30 Apr 2008 07:35:15 +0000 (16:35 +0900)]
sata_inic162x: kill now unused SFF related stuff

sata_inic162x now doesn't use any SFF features.  Remove all SFF
related stuff.

* Mask unsolicited ATA interrupts.  This removes our primary source of
  spurious interrupts and spurious interrupt handling can be tightened
  up.  There's no need to clear ATA interrupts by reading status
  register either.

* Don't dance with IDMA_CTL_ATA_NIEN and simplify accesses to
  IDMA_CTL.

* Inherit from sata_port_ops instead of ata_sff_port_ops.

* Don't initialize or use ioaddr.  There's no need to map BAR0-4
  anymore.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: use IDMA for ATAPI commands
Tejun Heo [Wed, 30 Apr 2008 07:35:14 +0000 (16:35 +0900)]
sata_inic162x: use IDMA for ATAPI commands

Use IDMA for ATAPI commands.  Write and some misc commands time out
when executed using ATAPI_PROT_DMA but ATAPI_PROT_PIO works fine.  As
PIO is driven by DMA too, it doesn't make any noticeable difference
for native SATA devices.  inic_check_atapi_dma() is implemented to
force PIO for those ATAPI commands.

After this change, sata_inic162x issues all commands using IDMA.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: use IDMA for non DMA ATA commands
Tejun Heo [Wed, 30 Apr 2008 07:35:13 +0000 (16:35 +0900)]
sata_inic162x: use IDMA for non DMA ATA commands

Use IDMA for PIO and non-data commands.  This allows sata_inic162x to
safely drive LBA48 devices.  Kill inic_dev_config() which contains
code to reject LBA48 devices.

With this change, status checking in inic_qc_issue() to avoid hard
lock up after hotplug can go away too.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: kill now unused bmdma related stuff
Tejun Heo [Wed, 30 Apr 2008 07:35:12 +0000 (16:35 +0900)]
sata_inic162x: kill now unused bmdma related stuff

sata_inic162x doesn't use BMDMA anymore.  Kill bmdma related stuff.

* prdctl manipulation

* port IRQ mask manipulation

* inherit ATA_BASE_SHT instead of ATA_BMDMA_SHT

* BMDMA methods

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: use IDMA for ATA_PROT_DMA
Tejun Heo [Wed, 30 Apr 2008 07:35:11 +0000 (16:35 +0900)]
sata_inic162x: use IDMA for ATA_PROT_DMA

The modified driver on initio site has enough clue on how to use IDMA.
Use IDMA for ATA_PROT_DMA.

* LBA48 now works as long as it uses DMA (LBA48 devices still aren't
  allowed as it can destroy data if PIO is used for any reason).

* No need to mask IRQs for read DMAs as IDMA_DONE is properly raised
  after transfer to memory is actually completed.  There will be some
  spurious interrupts but host_intr will handle it correctly and
  manipulating port IRQ mask interacts badly with the other port for
  some reason, so command type dependent port IRQ masking is not used
  anymore.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: update TF read handling
Tejun Heo [Thu, 1 May 2008 14:55:58 +0000 (23:55 +0900)]
sata_inic162x: update TF read handling

inic162x can't reliably read back TF or at least we don't know how to
do it yet.  The only values which seem reliable are status and error.
This patch updates access to TF.

* implement inic_tf_read() which reads the TF area in mmio area

* implement custom inic_qc_fill_rtf() which only returns true if
  status indicates device error.  it'll be returning bogus addresses
  for device errors but it'll be able to report why it failed at
  least.

* implement custom inic_check_ready() and use ata_wait_after_reset()
  instead of the SFF version.

* use inic_tf_read() for classification.

This is not perfect but it fixes hotplug detection failure and at
least makes the driver report 0's instead of random garbages while
reporting valid status and error for device errors.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: add / update constants
Tejun Heo [Wed, 30 Apr 2008 07:35:09 +0000 (16:35 +0900)]
sata_inic162x: add / update constants

* add a bunch of constants, most are from the datasheet, a few
  undocumented ones are from initio's modified driver

* HCTL_PWRDWN is bit 12 not 13

This is in preparation of further inic162x updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_inic162x: misc clean ups
Tejun Heo [Wed, 30 Apr 2008 07:35:08 +0000 (16:35 +0900)]
sata_inic162x: misc clean ups

* use larger indents for structure member definitions

* kill unused variable @addr in inic_scr_write()

* kill unnecessary flushes in inic_freeze/thaw()

* kill buggy explicit kfree() on devres managed port private data

This is in preparation of further inic162x updates.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv use hweight16() for bit counting (V2)
Mark Lord [Fri, 2 May 2008 18:02:28 +0000 (14:02 -0400)]
sata_mv use hweight16() for bit counting (V2)

Some tidying as suggested by Grant Grundler.

Nuke local bit-counting function from sata_mv in favour of using hweight16().
Also add a short explanation for the 15msec timeout used when waiting for empty/idle.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv NCQ-EH for FIS-based switching
Mark Lord [Fri, 2 May 2008 06:16:20 +0000 (02:16 -0400)]
sata_mv NCQ-EH for FIS-based switching

Convert sata_mv's EH for FIS-based switching (FBS) over to the
sequence recommended by Marvell.  This enables us to catch/analyze
multiple failed links on a port-multiplier when using NCQ.

To do this, we clear the ERR_DEV bit in the EDMA Halt-Conditions register,
so that the EDMA engine doesn't self-disable on the first NCQ error.

Our EH code sets the MV_PP_FLAG_DELAYED_EH flag to prevent new commands
being queued while we await completion of all outstanding NCQ commands
on all links of the failed PM.

The SATA Test Control register tells us which links have failed,
so we must only wait for any other active links to finish up
before we stop the EDMA and run the .error_handler afterward.

The patch also includes skeleton code for handling of non-NCQ FBS operation.
This is more for documentation purposes right now, as that mode is not yet
enabled in sata_mv.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv delayed eh handling
Mark Lord [Fri, 2 May 2008 06:15:37 +0000 (02:15 -0400)]
sata_mv delayed eh handling

Introduce a new "delayed error handling" mechanism in sata_mv,
to enable us to eventually deal with multiple simultaneous NCQ
failures on a single host link when a PM is present.

This involves a port flag (MV_PP_FLAG_DELAYED_EH) to prevent new
commands being queued, and a pmp bitmap to indicate which pmp links
had NCQ errors.

The new mv_pmp_error_handler() uses those values to invoke
ata_eh_analyze_ncq_error() on each failed link, prior to freezing
the port and passing control to sata_pmp_error_handler().

This is based upon a strategy suggested by Tejun.

For now, we just implement the delayed mechanism.
The next patch in this series will add the multiple-NCQ EH code
to take advantage of it.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agolibata: export ata_eh_analyze_ncq_error
Mark Lord [Fri, 2 May 2008 06:14:53 +0000 (02:14 -0400)]
libata: export ata_eh_analyze_ncq_error

Export ata_eh_analyze_ncq_error() for subsequent use by sata_mv,
as suggested by Tejun.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv new mv_port_intr function
Mark Lord [Fri, 2 May 2008 06:14:02 +0000 (02:14 -0400)]
sata_mv new mv_port_intr function

Separate out the inner loop body of mv_host_intr()
into it's own function called mv_port_intr().

This should help maintainabilty.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv fix mv_host_intr bug for hc_irq_cause
Mark Lord [Fri, 2 May 2008 06:13:27 +0000 (02:13 -0400)]
sata_mv fix mv_host_intr bug for hc_irq_cause

Remove the unwanted reads of hc_irq_cause from mv_host_intr(),
thereby removing a bug whereby we were not always reading it when needed..

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv NCQ and SError fixes for mv_err_intr
Mark Lord [Fri, 2 May 2008 06:12:34 +0000 (02:12 -0400)]
sata_mv NCQ and SError fixes for mv_err_intr

Sigh.  Undo some earlier changes to mv_port_intr(),
so that we now read/clear SError again in all cases.

Arrange the top of the function to be as close as possible
to what we need for a later update (in this series) for ERR_DEV handling.

Fix things so that libata-eh can attempt a READ_LOG_EXT_10H
in response to a failed NCQ command, by just doing a local
mv_eh_freeze() rather than ata_port_freeze().

This will now fully handle NCQ errors much of the time,
but more fixes are needed for FBS/PMP, and for certain chip errata.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv rearrange mv_config_fbs
Mark Lord [Fri, 2 May 2008 06:11:45 +0000 (02:11 -0400)]
sata_mv rearrange mv_config_fbs

Rearrange mv_config_fbs() to more closely follow the (corrected) datasheet
recommendations for NCQ and FIS-based switching (FBS).

Also, maintain a port flag to let us know when FBS is enabled.
We will make more use of that flag later in this patch series.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv errata workaround for sata25 part 1
Mark Lord [Fri, 2 May 2008 06:10:56 +0000 (02:10 -0400)]
sata_mv errata workaround for sata25 part 1

Part 1 of workaround for errata "sata#25" for the 60x1 series
(the second half of this errata workaround is still in development.

Bit22 of the GPIO port has to be set "on" when in NCQ mode.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv new mv_qc_defer method
Mark Lord [Fri, 2 May 2008 06:10:02 +0000 (02:10 -0400)]
sata_mv new mv_qc_defer method

The EDMA engine cannot tolerate a mix of NCQ/non-NCQ commands,
and cannot be used for PIO at all.  So we need to prevent libata
from trying to feed us such mixtures.

Introduce mv_qc_defer() for this purpose, and use it for all chip versions.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv wait for empty+idle
Mark Lord [Fri, 2 May 2008 06:09:14 +0000 (02:09 -0400)]
sata_mv wait for empty+idle

When performing EH, it is recommended to wait for the EDMA engine
to empty out requests-in-progress before disabling EDMA.

Introduce code to poll the EDMA_STATUS register for idle/empty bits
before disabling EDMA.  For non-EH operation, this will normally exit
without delay, other than the register read.

A later series of patches may focus on eliminating this and various
other register reads (when possible) throughout the driver,
but for now we're focussing on solid reliablity.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv pci features
Mark Lord [Fri, 2 May 2008 06:08:32 +0000 (02:08 -0400)]
sata_mv pci features

Some of the GenIIe EDMA optimizations should not be used
for non-PCI (SOC) devices, and nor for certain configurations
of conventional PCI (non PCI-X, PCIe) buses.

Logic taken/simplified from that in the Marvell proprietary driver.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agosata_mv more cosmetic changes
Mark Lord [Fri, 2 May 2008 06:07:51 +0000 (02:07 -0400)]
sata_mv more cosmetic changes

More cosmetic changes; no code changes.

 -- try and improve consistency of naming.
 -- add missing _OFS to tails of register offset definitions.
 -- rename mv_setup_ifctl() to mv_setup_ifcfg(), since that's what it really does.
 -- remove/move some dead comments

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agolibata: Add Intel SCH PATA driver
Alek Du [Tue, 6 May 2008 13:31:41 +0000 (21:31 +0800)]
libata: Add Intel SCH PATA driver

This patch adds Intel SCH chipsets (AF82US15W, AF82US15L, AF82UL11L)
PATA controller support.

Signed-off-by: Alek Du <alek.du@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoata_piix: verify SIDPR access before enabling it
Tejun Heo [Thu, 1 May 2008 01:03:08 +0000 (10:03 +0900)]
ata_piix: verify SIDPR access before enabling it

On certain configurations (certain macbooks), even though all the
conditions for SIDPR access described in the datasheet are met,
actually reading those registers just returns 0 and have no effect on
write.  Verify SIDPR is actually working before enabling it.

This is reported by Ryan Roth in bz#10512.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Ryan Roth <ryan.roth@ch2m.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agolibata: improve post-reset device ready test
Tejun Heo [Thu, 1 May 2008 14:41:41 +0000 (23:41 +0900)]
libata: improve post-reset device ready test

Some controllers (jmb and inic162x) use 0x77 and 0x7f to indicate that
the device isn't ready yet.  It looks like they use 0xff if device
presence is detected but connection isn't established.  0x77 or 0x7f
after connection is established and use the value from signature FIS
after receiving it.

This patch implements ata_check_ready(), which takes TF status value
and determines whether the port is ready or not considering the above
and other conditions, and use it in @check_ready() functions.  This is
safe as both 0x77 and 0x7f aren't valid ready status value even though
they have BSY bit cleared.

This fixes hot plug detection failures which can be triggered with
certain drives if they aren't already spun up when the data connector
is hot plugged.

Tested on sil, sil24, ahci (jmb/ich), piix and inic162x combined with
eight drives from all major vendors.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Tue, 6 May 2008 14:49:20 +0000 (07:49 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net_cls_act: act_simple dont ignore realloc code
  iwlwifi: make IWLWIFI a tristate
  Revert "atm: Do not free already unregistered net device."
  dccp: return -EINVAL on invalid feature length
  irda: fix !PNP support for drivers/net/irda/smsc-ircc2.c
  irda: fix !PNP support in drivers/net/irda/nsc-ircc.c
  net_cls_act: Make act_simple use of netlink policy.
  ip: Use inline function dst_metric() instead of direct access to dst->metric[]
  ip: Make use of the inline function dst_metric_locked()
  atm: Bad locking on br2684_devs modifications.
  atm: Do not free already unregistered net device.
  mac80211: Do not free net device after it is unregistered.
  bridge: Consolidate error paths in br_add_bridge().
  bridge: Net device leak in br_add_bridge().
  niu: Fix probing regression for maramba on-board chips.
  lapbeth: Release ->ethdev when unregistering device.
  xfrm: convert empty xfrm_audit_* macros to functions
  net: Fix useless comment reference loop.
  sch_htb: remove from event queue in htb_parent_to_leaf()

16 years agonet_cls_act: act_simple dont ignore realloc code
Jamal Hadi Salim [Tue, 6 May 2008 07:10:24 +0000 (00:10 -0700)]
net_cls_act: act_simple dont ignore realloc code

reallocation of the policy data was being ignored. It could fail.
Simplify so that there is no need for reallocating.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoiwlwifi: make IWLWIFI a tristate
Adrian Bunk [Tue, 6 May 2008 07:04:47 +0000 (00:04 -0700)]
iwlwifi: make IWLWIFI a tristate

IWLWIFI should be a tristate so that if IWLCORE and/or IWL3945 are m
and none of them is y kbuild doesn't create an empty
drivers/net/wireless/built-in.o

This patch also removes the pointless "default n".

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoRevert "atm: Do not free already unregistered net device."
David S. Miller [Tue, 6 May 2008 07:00:16 +0000 (00:00 -0700)]
Revert "atm: Do not free already unregistered net device."

This reverts commit 65e4113684e50cee75357ce10dc9026b0929e4e9.

Unlike the other cases Pavel fixed, this case did not
setup a netdev->destructor of free_netdev, therefore this
change was not correct.

Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Tue, 6 May 2008 00:31:41 +0000 (17:31 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4_core: Support creation of FMRs with pages smaller than 4K
  IB/ehca: Fix function return types
  RDMA/cxgb3: Bump up the MPA connection setup timeout.
  RDMA/cxgb3: Silently ignore close reply after abort.
  RDMA/cxgb3: QP flush fixes
  IB/ipoib: Fix transmit queue stalling forever
  IB/mlx4: Fix off-by-one errors in calls to mlx4_ib_free_cq_buf()

16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux...
Linus Torvalds [Tue, 6 May 2008 00:31:14 +0000 (17:31 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes:
  sched: default to n for GROUP_SCHED and FAIR_GROUP_SCHED
  sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
  sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
  sched: fix cpu clock
  sched: fair-group: fix a Div0 error of the fair group scheduler
  sched: fix missing locking in sched_domains code
  sched: make clock sync tunable by architecture code
  sched: fix debugging
  sched: fix sched_info_switch not being called according to documentation
  sched: fix hrtick_start_fair and CPU-Hotplug
  sched: fix SCHED_FAIR wake-idle logic error
  sched: fix RT task-wakeup logic
  sched: add statics, don't return void expressions
  sched: add debug checks to idle functions
  sched: remove old sched doc
  sched: make rt_sched_class, idle_sched_class static
  sched: optimize calc_delta_mine()
  sched: fix normalized sleeper

16 years agomlx4_core: Support creation of FMRs with pages smaller than 4K
Oren Duer [Mon, 5 May 2008 22:56:52 +0000 (15:56 -0700)]
mlx4_core: Support creation of FMRs with pages smaller than 4K

Don't hard code a test against a minimum page shift of 12, since the
device may support smaller pages.  Test against the actual smallest
page size from the device capabilities.

Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ehca: Fix function return types
Stefan Roscher [Mon, 5 May 2008 22:51:49 +0000 (15:51 -0700)]
IB/ehca: Fix function return types

Also remove duplicate assignment of local_ca_ack_delay and change
min_t check for local_ca_ack_delay to u8 instead of int.

Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoMerge branch 'powerpc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus...
Linus Torvalds [Mon, 5 May 2008 22:48:53 +0000 (15:48 -0700)]
Merge branch 'powerpc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'powerpc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Assign PDE->data before gluing PDE into /proc tree
  [POWERPC] devres: Add devm_ioremap_prot()
  [POWERPC] macintosh: ADB driver: adb_handler_sem semaphore to mutex
  [POWERPC] macintosh: windfarm_smu_sat: semaphore to mutex
  [POWERPC] macintosh: therm_pm72: driver_lock semaphore to mutex

16 years agodev_name introduction fall out fix
Stephen Rothwell [Mon, 5 May 2008 03:54:19 +0000 (13:54 +1000)]
dev_name introduction fall out fix

Commit 06916639e2fed9ee475efef2747a1b7429f8fe76 ("driver-core: add
dev_name() to help transition away from using bus_id") added a static
inline dev_name() and used it in dev_printk.

Unfortunately, drivers/edac/edac_core.h defines a macro called
dev_name().  Rename the latter.

Diagnosis by Tony Breeds and Michael Ellerman.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosched: default to n for GROUP_SCHED and FAIR_GROUP_SCHED
Parag Warudkar [Sun, 4 May 2008 00:42:34 +0000 (20:42 -0400)]
sched: default to n for GROUP_SCHED and FAIR_GROUP_SCHED

GROUP_SCHED is confirmed to cause unacceptable latencies, see:

   http://lkml.org/lkml/2008/5/2/370.

Mark it EXPERIMENTAL and default to no for now.

Signed-off-by: Parag Warudkar <parag.warudkar@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
Peter Zijlstra [Sat, 3 May 2008 16:29:28 +0000 (18:29 +0200)]
sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

this replaces the rq->clock stuff (and possibly cpu_clock()).

 - architectures that have an 'imperfect' hardware clock can set
   CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

 - the 'jiffie' window might be superfulous when we update tick_gtod
   before the __update_sched_clock() call in sched_clock_tick()

 - cpu_clock() might be implemented as:

     sched_clock_cpu(smp_processor_id())

   if the accuracy proves good enough - how far can TSC drift in a
   single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
Ingo Molnar [Mon, 5 May 2008 21:19:50 +0000 (23:19 +0200)]
sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK

add the HAVE_UNSTABLE_SCHED_CLOCK, for architectures to select.

the next change utilizes it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix cpu clock
Ingo Molnar [Wed, 23 Apr 2008 07:24:06 +0000 (09:24 +0200)]
sched: fix cpu clock

David Miller pointed it out that nothing in cpu_clock() sets
prev_cpu_time. This caused __sync_cpu_clock() to be called
all the time - against the intention of this code.

The result was that in practice we hit a global spinlock every
time cpu_clock() is called - which - even though cpu_clock()
is used for tracing and debugging, is suboptimal.

While at it, also:

- move the irq disabling to the outest layer,
  this should make cpu_clock() warp-free when called with irqs
  enabled.

- use long long instead of cycles_t - for platforms where cycles_t
  is 32-bit.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fair-group: fix a Div0 error of the fair group scheduler
Miao Xie [Mon, 28 Apr 2008 04:54:56 +0000 (12:54 +0800)]
sched: fair-group: fix a Div0 error of the fair group scheduler

When I echoed 0 into the "cpu.shares" file, a Div0 error occured.

We found it is caused by the following calling.

   sched_group_set_shares(tg, shares)
       set_se_shares(tg->se[i], shares/nr_cpu_ids)
           __set_se_shares(se, shares)
               div64_64((1ULL<<32), shares)

When the echoed value was less than the number of processores, the result of the
sentence "shares/nr_cpu_ids" was 0, and then the system called div64() to divide
the result, the Div0 error occured.

It is unnecessary that the shares value is divided by nr_cpu_ids, I think.
Because in the function  __update_group_shares_cpu() and init_tg_cfs_entry(),
the shares value isn't divided by nr_cpu_ids when setting shares of the sched
entity.

This patch fixes this bug. And echoing ULONG_MAX value into cpu.shares also
causes Div0 error, so we set a macro MAX_SHARES to limit the max value of
shares.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix missing locking in sched_domains code
Heiko Carstens [Mon, 28 Apr 2008 09:33:07 +0000 (11:33 +0200)]
sched: fix missing locking in sched_domains code

Concurrent calls to detach_destroy_domains and arch_init_sched_domains
were prevented by the old scheduler subsystem cpu hotplug mutex. When
this got converted to get_online_cpus() the locking got broken.
Unlike before now several processes can concurrently enter the critical
sections that were protected by the old lock.

So use the already present doms_cur_mutex to protect these sections again.

Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: make clock sync tunable by architecture code
Ingo Molnar [Wed, 23 Apr 2008 07:31:35 +0000 (09:31 +0200)]
sched: make clock sync tunable by architecture code

make time_sync_thresh tunable to architecture code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix debugging
Mike Galbraith [Tue, 29 Apr 2008 10:23:09 +0000 (12:23 +0200)]
sched: fix debugging

Revert debugging commit 7ba2e74ab5a0518bc953042952dd165724bc70c9.
print_cfs_rq_tasks() can induce live-lock if a task is dequeued
during list traversal.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix sched_info_switch not being called according to documentation
David Simner [Tue, 29 Apr 2008 09:08:59 +0000 (10:08 +0100)]
sched: fix sched_info_switch not being called according to documentation

http://bugzilla.kernel.org/show_bug.cgi?id=10545

sched_stats.h says that __sched_info_switch is "called when prev !=
next" in the comment.  sched.c should therefore do that.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix hrtick_start_fair and CPU-Hotplug
Peter Zijlstra [Tue, 29 Apr 2008 08:02:46 +0000 (10:02 +0200)]
sched: fix hrtick_start_fair and CPU-Hotplug

Gautham R Shenoy reported:

 > While running the usual CPU-Hotplug stress tests on linux-2.6.25,
 > I noticed the following in the console logs.
 >
 > This is a wee bit difficult to reproduce. In the past 10 runs I hit this
 > only once.
 >
 > ------------[ cut here ]------------
 >
 > WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
 >
 > Just wondering if we are doing a good job at handling the cancellation
 > of any per-cpu scheduler timers during CPU-Hotplug.

This looks like its indeed not cancelled at all and migrates the it to
another cpu. Fix it via a proper hotplug notifier mechanism.

Reported-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix SCHED_FAIR wake-idle logic error
Gregory Haskins [Mon, 28 Apr 2008 16:40:01 +0000 (12:40 -0400)]
sched: fix SCHED_FAIR wake-idle logic error

We currently use an optimization to skip the overhead of wake-idle
processing if more than one task is assigned to a run-queue.  The
assumption is that the system must already be load-balanced or we
wouldnt be overloaded to begin with.

The problem is that we are looking at rq->nr_running, which may include
RT tasks in addition to CFS tasks.  Since the presence of RT tasks
really has no bearing on the balance status of CFS tasks, this throws
the calculation off.

This patch changes the logic to only consider the number of CFS tasks
when making the decision to optimze the wake-idle.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: fix RT task-wakeup logic
Gregory Haskins [Wed, 23 Apr 2008 11:13:29 +0000 (07:13 -0400)]
sched: fix RT task-wakeup logic

Dmitry Adamushko pointed out a logic error in task_wake_up_rt() where we
will always evaluate to "true".  You can find the thread here:

http://lkml.org/lkml/2008/4/22/296

In reality, we only want to try to push tasks away when a wake up request is
not going to preempt the current task.  So lets fix it.

Note: We introduce test_tsk_need_resched() instead of open-coding the flag
check so that the merge-conflict with -rt should help remind us that we
may need to support NEEDS_RESCHED_DELAYED in the future, too.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Dmitry Adamushko <dmitry.adamushko@gmail.com>
CC: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: add statics, don't return void expressions
Harvey Harrison [Fri, 25 Apr 2008 01:17:55 +0000 (18:17 -0700)]
sched: add statics, don't return void expressions

Noticed by sparse:
kernel/sched.c:760:20: warning: symbol 'sched_feat_names' was not declared. Should it be static?
kernel/sched.c:767:5: warning: symbol 'sched_feat_open' was not declared. Should it be static?
kernel/sched_fair.c:845:3: warning: returning void-valued expression
kernel/sched.c:4386:3: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: add debug checks to idle functions
Andrew Morton [Sat, 26 Apr 2008 18:30:34 +0000 (11:30 -0700)]
sched: add debug checks to idle functions

Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: "Justin Mattock" <justinmattock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: remove old sched doc
Ingo Molnar [Mon, 28 Apr 2008 12:05:18 +0000 (14:05 +0200)]
sched: remove old sched doc

Fabio Checconi noticed that Documentation/scheduler/sched-design.txt was
a stale copy of the old scheduler. Remove it.

Reported-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: make rt_sched_class, idle_sched_class static
Harvey Harrison [Fri, 25 Apr 2008 17:53:13 +0000 (10:53 -0700)]
sched: make rt_sched_class, idle_sched_class static

The C files are included directly in sched.c, so they are
effectively static.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: optimize calc_delta_mine()
Peter Zijlstra [Mon, 5 May 2008 21:56:17 +0000 (23:56 +0200)]
sched: optimize calc_delta_mine()

Joel noticed that the !lw->inv_weight contition isn't unlikely anymore so
remove the unlikely annotation. Also, remove the two div64_u64() inv_weight
calculations, which makes them rely on the calc_delta_mine() path as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>