err.no Git - linux-2.6/log
linux-2.6
16 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Fri, 18 Apr 2008 15:20:06 +0000 (08:20 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (104 commits)
  IB/iser: Don't change itt endianness
  IB/mlx4: Update module version and release date
  IPoIB: Handle case when P_Key is deleted and re-added at same index
  IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
  IB/mlx4: Fix incorrect comment
  IB/mlx4: Fix race when detaching a QP from a multicast group
  IB/ehca: Support all ibv_devinfo values in query_device() and query_port()
  RDMA/nes: Free IRQ before killing tasklet
  IB/mthca: Update module version and release date
  IB/mlx4: Update QP state if query QP succeeds
  IB/mthca: Update QP state if query QP succeeds
  RDMA/amso1100: Add check for NULL reply_msg in c2_intr()
  IB/mlx4: Add support for resizing CQs
  IB/mlx4: Add support for modifying CQ moderation parameters
  IPoIB: Support modifying IPoIB CQ event moderation
  IB/core: Add support for modify CQ
  IPoIB: Add basic ethtool support
  mlx4_core: Increase max number of QPs to 128K
  RDMA/amso1100: Add support for "send with invalidate" work requests
  IB/core: Add support for "send with invalidate" work requests
  ...

16 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Fri, 18 Apr 2008 15:19:40 +0000 (08:19 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  security: enhance DEFAULT_MMAP_MIN_ADDR description
  SELinux: add netport.[ch]
  SELinux: Add network port SID cache
  SELinux: turn mount options strings into defines
  selinux/ss/services.c should #include <linux/selinux.h>
  selinux: introduce permissive types
  selinux: remove ptrace_sid
  SELinux: requesting no permissions in avc_has_perm_noaudit is a BUG()
  security: code cleanup
  security: replace remaining __FUNCTION__ occurrences
  SELinux: create new open permission
  selinux: selinux/netlabel.c should #include "netlabel.h"
  SELinux: unify printk messages
  SELinux: remove unused backpointers from security objects
  SELinux: Correct the NetLabel locking for the sk_security_struct

16 years ago Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Fri, 18 Apr 2008 15:19:15 +0000 (08:19 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (36 commits)
  [S390] Remove code duplication from monreader / dcssblk.
  [S390] kernel: show last breaking-event-address on oops
  [S390] lowcore: Change type of lowcores softirq_pending to __u32.
  [S390] zcrypt: Comments and kernel-doc cleanup
  [S390] uaccess: Always access the correct address space.
  [S390] Fix a lot of sparse warnings.
  [S390] Convert s390 to GENERIC_CLOCKEVENTS.
  [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
  [S390] Convert monitor calls to function calls.
  [S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters
  [S390] replace remaining __FUNCTION__ occurrences
  [S390] remove redundant display of free swap space in show_mem()
  [S390] qdio: remove outdated developerworks link.
  [S390] Add debug_register_mode() function to debug feature API
  [S390] crypto: use more descriptive function names for init/exit routines.
  [S390] switch sched_clock to store-clock-extended.
  [S390] zcrypt: add support for large random numbers
  [S390] hw_random: allow rng_dev_read() to return hardware errors.
  [S390] Vertical cpu management.
  [S390] cpu topology support for s390.
  ...

16 years ago Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg...
Linus Torvalds [Fri, 18 Apr 2008 15:19:00 +0000 (08:19 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slub: No need for per node slab counters if !SLUB_DEBUG
  slub: Move map/flag clearing to __free_slab
  slub: Fixes to per cpu stat output in sysfs
  slub: Deal with config variable dependencies
  slub: Reduce #ifdef ZONE_DMA by moving kmalloc_caches_dma near dma logic
  slub: Initialize per-cpu stats

16 years ago ptrace_signal subroutine
Roland McGrath [Fri, 18 Apr 2008 01:44:38 +0000 (18:44 -0700)]
ptrace_signal subroutine

This breaks out the ptrace handling from get_signal_to_deliver into a
new subroutine.  The actual code there doesn't change, and it gets
inlined into nearly identical compiled code.  This makes the function
substantially shorter and thus easier to read, and it nicely isolates
the ptrace magic.

Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago cgroup: fix a race condition in manipulating tsk->cg_list
Li Zefan [Thu, 17 Apr 2008 03:37:15 +0000 (11:37 +0800)]
cgroup: fix a race condition in manipulating tsk->cg_list

When I ran a test program that forks a large number of processes while
running 'cat /cgroup/tasks' at the same time, I got the following oops:

  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:72!
  invalid opcode: 0000 [#1] SMP
  Pid: 4178, comm: a.out Not tainted (2.6.25-rc9 #72)
  ...
  Call Trace:
   [<c044a5f9>] ? cgroup_exit+0x55/0x94
   [<c0427acf>] ? do_exit+0x217/0x5ba
   [<c0427ed7>] ? do_group_exit+0x65/0x7c
   [<c0427efd>] ? sys_exit_group+0xf/0x11
   [<c0404842>] ? syscall_call+0x7/0xb
   [<c05e0000>] ? init_cyrix+0x2fa/0x479
  ...
  EIP: [<c04df671>] list_del+0x35/0x53 SS:ESP 0068:ebc7df4
  ---[ end trace caffb7332252612b ]---
  Fixing recursive fault but reboot is needed!

After digging into the code and debugging, I finally found a race
condition:

do_exit()
  ->cgroup_exit()
    ->if (!list_empty(&tsk->cg_list))
        list_del(&tsk->cg_list);

  cgroup_iter_start()
    ->cgroup_enable_task_cg_list()
      ->list_add(&tsk->cg_list, ..);

In this case the list won't be deleted though the process has exited.
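
A minimal userspace sketch of one way to close this kind of window, not the patch itself: both sides do their check and their list update under one shared lock, and the adding side refuses a task that has already begun to exit. The struct, the flag names and the pthread mutex are assumptions made for illustration; the kernel code uses its own locks and list primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
struct task {
    bool exiting;   /* set once the task has started to exit */
    bool on_list;   /* stands in for !list_empty(&tsk->cg_list) */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Iterator side: only link a task that has not begun exiting. */
static bool enable_task_list(struct task *t)
{
    bool linked = false;

    assert(t != NULL);
    pthread_mutex_lock(&list_lock);
    if (!t->exiting && !t->on_list) {
        t->on_list = true;
        linked = true;
    }
    pthread_mutex_unlock(&list_lock);
    return linked;
}

/* Exit side: mark the task as exiting and unlink it under the same lock. */
static void task_exit(struct task *t)
{
    assert(t != NULL);
    pthread_mutex_lock(&list_lock);
    t->exiting = true;
    if (t->on_list)
        t->on_list = false;
    pthread_mutex_unlock(&list_lock);
}
```

With this ordering, an add that loses the race to the exit path becomes a no-op, so no entry is left behind after the task is gone.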

We got two bug reports in the past, which seem to be the same bug as
this one:
http://lkml.org/lkml/2008/3/5/332
http://lkml.org/lkml/2007/10/17/224

In fact, sometimes the oops was in list_del and sometimes in list_add,
and with small changes to the test program I can trigger other oopses.

The patch has been tested both on x86_32 and x86_64.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago security: enhance DEFAULT_MMAP_MIN_ADDR description
maximilian attems [Wed, 16 Apr 2008 17:36:36 +0000 (19:36 +0200)]
security: enhance DEFAULT_MMAP_MIN_ADDR description

Got burned by setting the proposed default of 65536
across all Debian archs.

Thus proposing to be more specific on which archs you may
set this. Also propose a value for arm and friends that
doesn't break sshd.

Reword to mention working archs ia64 and ppc64 too.

Signed-off-by: maximilian attems <max@stro.at>
Cc: Martin Michlmayr <tbm@cyrius.com>
Cc: Gordon Farquharson <gordonfarquharson@gmail.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: add netport.[ch]
James Morris [Mon, 14 Apr 2008 05:09:53 +0000 (15:09 +1000)]
SELinux: add netport.[ch]

Thank you, git.

Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: Add network port SID cache
Paul Moore [Thu, 10 Apr 2008 14:48:14 +0000 (10:48 -0400)]
SELinux: Add network port SID cache

Much like we added a network node cache, this patch adds a network port
cache. The design is taken almost completely from the network node cache
which in turn was taken from the network interface cache.  The basic idea is
to cache entries in a hash table based on protocol/port information.  The
hash function only takes the port number into account since the number of
different protocols in use at any one time is expected to be relatively
small.
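
The idea of hashing on the port alone can be sketched as follows. This is an illustrative userspace model under stated assumptions, not the code added by the patch: the table size, the struct layout and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_HASH_SIZE 256   /* hypothetical table size, a power of two */

struct port_entry {
    uint8_t  protocol;       /* e.g. 6 for TCP, 17 for UDP */
    uint16_t port;
    uint32_t sid;            /* cached security identifier */
    struct port_entry *next; /* bucket chain */
};

static struct port_entry *port_table[PORT_HASH_SIZE];

/* The bucket index depends on the port number only. */
static unsigned int port_hash(uint16_t port)
{
    return port & (PORT_HASH_SIZE - 1);
}

/* Insert at the head of the bucket chain. */
static void port_cache_add(struct port_entry *e)
{
    unsigned int idx;

    assert(e != NULL);
    idx = port_hash(e->port);
    e->next = port_table[idx];
    port_table[idx] = e;
}

/* The protocol is compared while walking the chain, not in the hash. */
static const struct port_entry *port_cache_find(uint8_t protocol, uint16_t port)
{
    const struct port_entry *e;

    for (e = port_table[port_hash(port)]; e != NULL; e = e->next)
        if (e->port == port && e->protocol == protocol)
            return e;
    return NULL;
}
```

Entries for the same port under different protocols share a bucket, which stays cheap as long as only a few protocols are in use at once.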

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: turn mount options strings into defines
Eric Paris [Tue, 1 Apr 2008 17:24:09 +0000 (13:24 -0400)]
SELinux: turn mount options strings into defines

Convert the strings used for mount options into #defines rather than
retyping the string throughout the SELinux code.
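
As a rough illustration of the pattern, the sketch below defines each option string once and reuses the macro wherever the option is matched. The macro names and the helper are hypothetical and are not taken from the SELinux sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro names, for illustration only. */
#define OPT_CONTEXT_STR   "context="
#define OPT_FSCONTEXT_STR "fscontext="

/* Return the value part if arg starts with the given prefix, else NULL. */
static const char *match_option(const char *arg, const char *prefix)
{
    size_t len;

    assert(arg != NULL && prefix != NULL);
    len = strlen(prefix);
    return strncmp(arg, prefix, len) == 0 ? arg + len : NULL;
}
```

A typo in an option name then becomes a compile error on the macro rather than a silent mismatch in one of several string literals.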

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago selinux/ss/services.c should #include <linux/selinux.h>
Adrian Bunk [Sun, 30 Mar 2008 22:54:02 +0000 (01:54 +0300)]
selinux/ss/services.c should #include <linux/selinux.h>

Every file should include the headers containing the externs for its global
code.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago selinux: introduce permissive types
Eric Paris [Mon, 31 Mar 2008 01:17:33 +0000 (12:17 +1100)]
selinux: introduce permissive types

Introduce the concept of a permissive type.  A new ebitmap is introduced to
the policy database which indicates if a given type has the permissive bit
set or not.  This bit is tested for the scontext of any denial.  The bit is
meaningless on types which only appear as the target of a decision and never
the source.  A domain running with a permissive type will be allowed to
perform any action similarly to when the system is globally set permissive.
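
The decision described above can be modelled with a plain bitmap. This is a simplified sketch: a single 64-bit word stands in for the ebitmap, and every name here is hypothetical rather than taken from the policy database code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TYPES 64  /* hypothetical limit so one word holds the bitmap */

static uint64_t permissive_map;   /* bit n set => type n is permissive */
static bool enforcing = true;     /* global enforcing mode */

static void set_permissive(unsigned int type)
{
    assert(type < MAX_TYPES);
    permissive_map |= UINT64_C(1) << type;
}

/*
 * Returns true if a denial for this source type is actually enforced.
 * Only the source type is consulted; the target type plays no role.
 */
static bool denial_is_enforced(unsigned int source_type)
{
    assert(source_type < MAX_TYPES);
    if (!enforcing)
        return false;
    return !(permissive_map & (UINT64_C(1) << source_type));
}
```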

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago selinux: remove ptrace_sid
Roland McGrath [Wed, 26 Mar 2008 22:46:39 +0000 (15:46 -0700)]
selinux: remove ptrace_sid

This changes checks related to ptrace to get rid of the ptrace_sid tracking.
It's good to disentangle the security model from the ptrace implementation
internals.  It's sufficient to check against the SID of the ptracer at the
time a tracee attempts a transition.

Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: requesting no permissions in avc_has_perm_noaudit is a BUG()
Eric Paris [Tue, 11 Mar 2008 18:19:34 +0000 (14:19 -0400)]
SELinux: requesting no permissions in avc_has_perm_noaudit is a BUG()

This patch turns the case where we have a call into avc_has_perm with no
requested permissions into a BUG_ON.  All callers to this should be in
the kernel and thus should be a function we need to fix if we ever hit
this.  The /selinux/access permission checking is done directly in the
security server and not through the avc, so those requests which we
cannot control from userspace should not be able to trigger this BUG_ON.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago security: code cleanup
Andrew Morton [Wed, 5 Mar 2008 23:05:08 +0000 (10:05 +1100)]
security: code cleanup

ERROR: "(foo*)" should be "(foo *)"
#168: FILE: security/selinux/hooks.c:2656:
+        "%s, rc=%d\n", __func__, (char*)value, -rc);

total: 1 errors, 0 warnings, 195 lines checked

./patches/security-replace-remaining-__function__-occurences.patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago security: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 5 Mar 2008 23:03:59 +0000 (10:03 +1100)]
security: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__
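
A small sketch of the replacement in practice: `__func__` is the C99 predefined identifier for the enclosing function's name. The macro and function names below are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "<function name>: " into buf. */
#define LOG_PREFIX(buf, size) snprintf((buf), (size), "%s: ", __func__)

static int demo_handler(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* The macro expands here, so __func__ names demo_handler. */
    return LOG_PREFIX(buf, size);
}
```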

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: create new open permission
Eric Paris [Thu, 28 Feb 2008 17:58:40 +0000 (12:58 -0500)]
SELinux: create new open permission

Adds a new open permission inside SELinux when 'opening' a file.  The idea
is that opening a file and reading/writing to that file are not the same
thing.  It's different if a program had its stdout redirected to /tmp/output
than if the program tried to directly open /tmp/output. This should allow
policy writers to more liberally give read/write permissions across the
policy while still blocking many design and programming flaws SELinux is so
good at catching today.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago selinux: selinux/netlabel.c should #include "netlabel.h"
Adrian Bunk [Wed, 27 Feb 2008 21:20:42 +0000 (23:20 +0200)]
selinux: selinux/netlabel.c should #include "netlabel.h"

Every file should include the headers containing the externs for its
global code.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: unify printk messages
James Morris [Tue, 26 Feb 2008 09:42:02 +0000 (20:42 +1100)]
SELinux: unify printk messages

Replace "security:" prefixes in printk messages with "SELinux"
to help users identify the source of the messages.  Also fix a
couple of minor formatting issues.

Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: remove unused backpointers from security objects
James Morris [Mon, 25 Feb 2008 22:52:58 +0000 (09:52 +1100)]
SELinux: remove unused backpointers from security objects

Remove unused backpointers from security objects.

Signed-off-by: James Morris <jmorris@namei.org>
16 years ago SELinux: Correct the NetLabel locking for the sk_security_struct
Paul Moore [Mon, 25 Feb 2008 16:40:33 +0000 (11:40 -0500)]
SELinux: Correct the NetLabel locking for the sk_security_struct

The RCU/spinlock locking approach for the nlbl_state in the sk_security_struct
was almost certainly overkill.  This patch removes both the RCU and spinlock
locking, relying on the existing socket locks to handle the case of multiple
writers.  This change also makes several code reductions possible.

Less locking, less code - it's a Good Thing.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
16 years ago [S390] Remove code duplication from monreader / dcssblk.
Martin Schwidefsky [Thu, 17 Apr 2008 05:46:31 +0000 (07:46 +0200)]
[S390] Remove code duplication from monreader / dcssblk.

Move the function that prints the segment warning messages found in the
monreader driver and the dcssblk driver to the extmem base code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] kernel: show last breaking-event-address on oops
Christian Borntraeger [Thu, 17 Apr 2008 05:46:30 +0000 (07:46 +0200)]
[S390] kernel: show last breaking-event-address on oops

Newer s390 models have a breaking-event-address-recording register.
Each time an instruction causes a break in the sequential instruction
execution, the address is saved in that hardware register. On a program
interrupt the address is copied to the lowcore address 272-279, which
makes it software accessible.

This patch changes the program check handler and the stack overflow
checker to copy the value into the pt_regs argument.
The oops output is enhanced to show the last known breaking address.
It might give additional information if the stack trace is corrupted.

The feature is only available on 64 bit.

The new oops output looks like:

[---------snip----------]
Modules linked in: vmcp sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 2 Not tainted 2.6.24zlive-host #8
Process modprobe (pid: 4788, task: 00000000bf3d8718, ksp: 00000000b2b0b8e0)
Krnl PSW : 0704200180000000 000003e000020028 (vmcp_init+0x28/0xe4 [vmcp])
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000004000002 000003e000020000 0000000000000000 0000000000000001
           000000000015734c ffffffffffffffff 000003e0000b3b00 0000000000000000
           000003e00007ca30 00000000b5bb5d40 00000000b5bb5800 000003e0000b3b00
           000003e0000a2000 00000000003ecf50 00000000b2b0bd50 00000000b2b0bcb0
Krnl Code: 000003e000020018c0c000040ff4       larl    %r12,3e0000a2000
           000003e00002001ee3e0f0000024       stg     %r14,0(%r15)
           000003e000020024a7f40001           brc     15,3e000020026
          >000003e000020028e310c0100004       lg      %r1,16(%r12)
           000003e00002002ec020000413dc       larl    %r2,3e0000a27e6
           000003e000020034c0a00004aee6       larl    %r10,3e0000b5e00
           000003e00002003aa7490001           lghi    %r4,1
           000003e00002003ea75900f0           lghi    %r5,240
Call Trace:
([<000000000014b300>] blocking_notifier_call_chain+0x2c/0x40)
 [<000000000015735c>] sys_init_module+0x19d8/0x1b08
 [<0000000000110afc>] sysc_noemu+0x10/0x16
 [<000002000011cda2>] 0x2000011cda2
Last Breaking-Event-Address:
 [<000003e000020024>] vmcp_init+0x24/0xe4 [vmcp]
[---------snip----------]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] lowcore: Change type of lowcores softirq_pending to __u32.
Heiko Carstens [Thu, 17 Apr 2008 05:46:29 +0000 (07:46 +0200)]
[S390] lowcore: Change type of lowcores softirq_pending to __u32.

As noted by akpm:

> kernel/time/tick-sched.c: In function 'tick_nohz_stop_sched_tick':
> kernel/time/tick-sched.c:229: warning: format '%02x' expects type 'unsigned int', but argument 2 has type '__u64'
>
> I don't think the architecture's local_softirq_pending() should return u64.
> This is the sort of thing which should be consistent across architectures.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
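
The warning quoted in the commit above comes from a mismatch between a conversion specifier and the argument type. A minimal userspace sketch of the matching case follows, assuming a 32-bit mask; the variable and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-cpu pending mask, stored as 32 bits. */
static uint32_t pending_mask = 0x0a;

static int format_pending(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    /* The cast makes the argument match %02x however uint32_t is defined. */
    return snprintf(buf, size, "%02x", (unsigned int)pending_mask);
}
```

Passing a 64-bit value to `%02x` instead would need a different specifier, which is why a consistent type across architectures keeps generic format strings valid.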
16 years ago [S390] zcrypt: Comments and kernel-doc cleanup
Felix Beck [Thu, 17 Apr 2008 05:46:28 +0000 (07:46 +0200)]
[S390] zcrypt: Comments and kernel-doc cleanup

Comments that looked like kernel-doc but were not formatted correctly
have been fixed. Additionally, some minor cleanup of the comments has
been done.

Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] uaccess: Always access the correct address space.
Heiko Carstens [Thu, 17 Apr 2008 05:46:27 +0000 (07:46 +0200)]
[S390] uaccess: Always access the correct address space.

The current uaccess page table walk code assumes at a few places that
any access is a user space access. This is not correct if somebody
has issued a set_fs(KERNEL_DS) in advance.
Add code which checks which address space we are in and with this make
sure we access the correct address space. This way we also get rid of
the dirty
    if (!current->mm)
        return -EFAULT;
hack in futex_atomic_cmpxchg_pt.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Fix a lot of sparse warnings.
Heiko Carstens [Thu, 17 Apr 2008 05:46:26 +0000 (07:46 +0200)]
[S390] Fix a lot of sparse warnings.

The most notable part of this commit is the new local header file entry.h
which contains all the function declarations of functions that get only
called from asm code or are arch internal. That way we can avoid extern
declarations in C files.
This is more or less the same that was done for sparc64.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Convert s390 to GENERIC_CLOCKEVENTS.
Heiko Carstens [Thu, 17 Apr 2008 05:46:25 +0000 (07:46 +0200)]
[S390] Convert s390 to GENERIC_CLOCKEVENTS.

This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
variant instead. In addition we get high resolution timers for free.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
Russell King [Thu, 17 Apr 2008 05:46:24 +0000 (07:46 +0200)]
[S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h

> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.

I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.

However, there's a separate point to be discussed here.

That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.

So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
a rarely used include, since not much touches the stacked parent context
registers).

Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.

[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Convert monitor calls to function calls.
Heiko Carstens [Thu, 17 Apr 2008 05:46:23 +0000 (07:46 +0200)]
[S390] Convert monitor calls to function calls.

Remove the program check generating monitor calls and use function
calls instead. There is no real advantage in using monitor calls,
but they do make debugging harder, because of all the program checks
they generate.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters
Ursula Braun [Thu, 17 Apr 2008 05:46:22 +0000 (07:46 +0200)]
[S390] qdio (new feature): enhancing info-retrieval from QDIO-adapters

The next generation of OSA adapters allows retrieval of further
self-describing information. This is the preparatory infrastructure patch
for further exploitation in the qeth driver.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] replace remaining __FUNCTION__ occurrences
Harvey Harrison [Thu, 17 Apr 2008 05:46:21 +0000 (07:46 +0200)]
[S390] replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] remove redundant display of free swap space in show_mem()
Johannes Weiner [Thu, 17 Apr 2008 05:46:20 +0000 (07:46 +0200)]
[S390] remove redundant display of free swap space in show_mem()

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] qdio: remove outdated developerworks link.
Ursula Braun [Thu, 17 Apr 2008 05:46:19 +0000 (07:46 +0200)]
[S390] qdio: remove outdated developerworks link.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Add debug_register_mode() function to debug feature API
Michael Holzheu [Thu, 17 Apr 2008 05:46:18 +0000 (07:46 +0200)]
[S390] Add debug_register_mode() function to debug feature API

The new function supports setting of permissions for the debugfs files
created by the debug feature. In addition to that, the function provides
uid and gid as parameters for future use. Currently only root is allowed
for uid and gid.

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] crypto: use more descriptive function names for init/exit routines.
Heiko Carstens [Thu, 17 Apr 2008 05:46:17 +0000 (07:46 +0200)]
[S390] crypto: use more descriptive function names for init/exit routines.

Not very helpful when code dies in "init".
See also http://lkml.org/lkml/2008/3/26/557 .

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] switch sched_clock to store-clock-extended.
Jan Glauber [Thu, 17 Apr 2008 05:46:16 +0000 (07:46 +0200)]
[S390] switch sched_clock to store-clock-extended.

Add get_clock_xt to read an 8 byte clock value using store clock
extended (STCKE) and use get_clock_xt for sched_clock. STCKE should
be faster than STCK on newer machines.

Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] zcrypt: add support for large random numbers
Ralph Wuerthner [Thu, 17 Apr 2008 05:46:15 +0000 (07:46 +0200)]
[S390] zcrypt: add support for large random numbers

This patch allows user space applications to access large amounts of
truly random data. The random data source is the built-in hardware
random number generator on the CEX2C cards.

Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] hw_random: allow rng_dev_read() to return hardware errors.
Ralph Wuerthner [Thu, 17 Apr 2008 05:46:14 +0000 (07:46 +0200)]
[S390] hw_random: allow rng_dev_read() to return hardware errors.

The api for hardware random number generators is currently limited to
devices that never fail. If the hardware is registered as a source for
random numbers it has to work. This prevents the use of i/o based
random number devices where the i/o might fail.

Add a check for errors after the read from a hardware random number device.
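
The error check can be sketched in userspace as below. This is a simplified model, not the hw_random code: the callback type, the fill loop and the two fake devices are assumptions made for illustration, with a negative errno value signalling a hardware failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device callback: returns bytes read, or a negative errno. */
typedef int (*rng_read_fn)(unsigned char *buf, size_t len);

/* Fill out[] from the device, stopping at the first hardware error. */
static int rng_fill(rng_read_fn read_fn, unsigned char *out, size_t len)
{
    size_t done = 0;

    assert(read_fn != NULL && out != NULL);
    while (done < len) {
        int n = read_fn(out + done, len - done);

        if (n < 0)
            return n;   /* propagate the error instead of assuming success */
        if (n == 0)
            break;      /* no data available right now */
        done += (size_t)n;
    }
    return (int)done;
}

/* Fake device that delivers at most 4 bytes per call. */
static int good_device(unsigned char *buf, size_t len)
{
    size_t n = len < 4 ? len : 4;

    memset(buf, 0x5a, n);
    return (int)n;
}

/* Fake device whose I/O always fails. */
static int failing_device(unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -EIO;
}
```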

This patch is required to support large random numbers retrieved
from the CEX2C cards on System z.

Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Vertical cpu management.
Heiko Carstens [Thu, 17 Apr 2008 05:46:13 +0000 (07:46 +0200)]
[S390] Vertical cpu management.

If vertical cpu polarization is active then the hypervisor will
dispatch certain cpus for a longer time than other cpus for maximum
performance. For example if a guest would have three virtual cpus,
each of them with a share of 33 percent, then in case of vertical
cpu polarization all of the processing time would be combined to a
single cpu which would run all the time, while the other two cpus
would get nearly no cpu time.

There are three different types of vertical cpus: high, medium and
low. Low cpus hardly get any real cpu time, while high cpus get a
full real cpu. Medium cpus get something in between.

In order to switch between the two possible modes (default is
horizontal) a 0 for horizontal polarization or a 1 for vertical
polarization must be written to the dispatching sysfs attribute:

/sys/devices/system/cpu/dispatching

The polarization of each single cpu can be figured out by the
polarization sysfs attribute of each cpu:

/sys/devices/system/cpu/cpuX/polarization

horizontal, vertical:high, vertical:medium, vertical:low or unknown.

When switching polarization the polarization attribute may contain
the value unknown until the configuration change is done and the
kernel has figured out the new polarization of each cpu.
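
A hedged sketch of how a userspace tool might drive these attributes with ordinary file I/O. The helper names are hypothetical, and the paths are passed in as parameters so that the helpers can be tried against a regular file on a machine without these sysfs entries.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "0" (horizontal) or "1" (vertical) to the dispatching attribute. */
static int set_dispatching(const char *path, int vertical)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(vertical ? "1" : "0", f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read one line from an attribute file into buf, stripping the newline. */
static int read_attr(const char *path, char *buf, size_t size)
{
    FILE *f;

    assert(path != NULL && buf != NULL && size > 0);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)size, f) == NULL)
        buf[0] = '\0';
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

On a system with this patch, the first helper would be given /sys/devices/system/cpu/dispatching and the second a per-cpu polarization file; since the value may read as unknown for a while after a switch, a caller would likely need to poll.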

Note that running a system with different types of vertical cpus may
result in significant performance regressions. If possible only one
type of vertical cpus should be used. All other cpus should be
offlined.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] cpu topology support for s390.
Heiko Carstens [Thu, 17 Apr 2008 05:46:12 +0000 (07:46 +0200)]
[S390] cpu topology support for s390.

Add s390 backend so we can give the scheduler some hints about the
cpu topology.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Export stfle.
Heiko Carstens [Thu, 17 Apr 2008 05:46:11 +0000 (07:46 +0200)]
[S390] Export stfle.

Make stfle visible so other code can call this.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] Add new fields for System z10 to /proc/sysinfo
Martin Schwidefsky [Thu, 17 Apr 2008 05:46:10 +0000 (07:46 +0200)]
[S390] Add new fields for System z10 to /proc/sysinfo

Add permanent and temporary model capacity and the corresponding
capacity value fields for the three capacity identifiers to the
output of /proc/sysinfo.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] KVM preparation: split sysinfo definitions for kvm use
Christian Borntraeger [Thu, 17 Apr 2008 05:46:09 +0000 (07:46 +0200)]
[S390] KVM preparation: split sysinfo definitions for kvm use

drivers/s390/sysinfo.c uses the store system information instruction to query
the system about information of the machine, the LPAR and additional
hypervisors. KVM has to implement the host part for this instruction.

To avoid code duplication, this patch splits the common definitions from
sysinfo.c into a separate header file include/asm-s390/sysinfo.h for KVM use.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago [S390] dasd: add sim handling.
Stefan Haberland [Thu, 17 Apr 2008 05:46:08 +0000 (07:46 +0200)]
[S390] dasd: add sim handling.

Now the system reports system information messages (SIM) to the user.
The System Reference Code (SRC) which is reported to the user makes
it possible to look up the reason for the SIM online in the
documentation of the storage server.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] exec_protect: Fix incorrect extern declarations.
Heiko Carstens [Thu, 17 Apr 2008 05:46:07 +0000 (07:46 +0200)]
[S390] exec_protect: Fix incorrect extern declarations.

sys_sigreturn and sys_rt_sigreturn don't take any arguments. So luckily
this resulted only in unneeded instead of incorrect code.
But still this clearly shows why one should not put extern declarations
in C files (will be fixed with a larger sparse patch).

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] vmur: Use wait queue instead of mutex to serialize open
Frank Munzert [Thu, 17 Apr 2008 05:46:06 +0000 (07:46 +0200)]
[S390] vmur: Use wait queue instead of mutex to serialize open

If user space opens a unit record device node then vmur is leaving the kernel
with lock open_mutex still held to prevent other processes from opening the
device simultaneously. This causes lockdep to complain about a lock held when
returning to user space.
Now the mutex is replaced by a wait queue to serialize device open.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] tape: duplicate sysfs filename when setting tape device online
Michael Holzheu [Thu, 17 Apr 2008 05:46:05 +0000 (07:46 +0200)]
[S390] tape: duplicate sysfs filename when setting tape device online

When a tape device is set online, offline and online again, the following
error message is printed on the console: "sysfs: duplicate filename
'non-rewinding' can not be created". The reason is that when setting a
device online, the tape driver creates a sysfs symlink from the tape device
to the tape class device. Unfortunately the symlink is not removed
correctly when the device is set offline: instead of passing the
tape device object to sysfs_remove_link, the class device object is used.
This patch fixes this problem and uses the correct tape device object now.

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] dasd: use GFP_DMA for fba private data allocation
Stefan Haberland [Thu, 17 Apr 2008 05:46:04 +0000 (07:46 +0200)]
[S390] dasd: use GFP_DMA for fba private data allocation

Allocating dasd_fba_private without GFP_DMA results in an I/O error
while reading the device characteristics of an FBA disk.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] qdio: Unrecognized inbound traffic if many FCP devices are online
Ursula Braun [Thu, 17 Apr 2008 05:46:03 +0000 (07:46 +0200)]
[S390] qdio: Unrecognized inbound traffic if many FCP devices are online

Problem:
Usually every FCP device has its own indicator field the adapter
uses to signal outstanding work. Once a certain limit of devices
is reached, a common indicator field is used. In certain scenarios
qdio resets this common indicator field, but handles only part of
the FCP-devices sharing the common indicator field. Thus inbound
traffic on the non-processed shared FCP-devices is not recognized
immediately.

Solution:
Make sure the common indicator field is reset only if all FCP devices
sharing the indicator have been processed.
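
The rule in the solution can be sketched as a simple predicate. This is only an illustration under assumed names (struct demo_fcp_dev and demo_may_reset_indicator are hypothetical and are not taken from the qdio driver):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the qdio driver's data structures. */
struct demo_fcp_dev {
    bool shares_indicator;  /* device uses the common indicator field */
    bool processed;         /* inbound queues were scanned in this pass */
};

/*
 * The shared indicator may be cleared only when no device that shares
 * it is still waiting to be processed.
 */
static bool demo_may_reset_indicator(const struct demo_fcp_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].shares_indicator && !devs[i].processed)
            return false;
    }
    return true;
}
```

Devices that have their own indicator field are ignored by the check, since resetting the common field cannot hide their work.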

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] sclp: Get rid of in_atomic() use.
Heiko Carstens [Thu, 17 Apr 2008 05:46:02 +0000 (07:46 +0200)]
[S390] sclp: Get rid of in_atomic() use.

Reintroduce the in_interrupt() check in the sclp_tty code. Add a may_schedule
parameter to the vt220 write function, so we can let the write function
know whether it may schedule. Scheduling is disallowed for all
console calls and may be allowed for tty calls.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] cio: fix parallel cm_enable processing.
Michael Ernst [Thu, 17 Apr 2008 05:46:01 +0000 (07:46 +0200)]
[S390] cio: fix parallel cm_enable processing.

It is now possible to trigger cm_enable processing several times in
parallel without causing a kernel panic.

Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] cio: Trigger verification on device/path not operational.
Cornelia Huck [Thu, 17 Apr 2008 05:46:00 +0000 (07:46 +0200)]
[S390] cio: Trigger verification on device/path not operational.

Currently, we don't do much on no path or no device situations during
normal user I/O, since we rely on reports regarding those events by
the machine. If we trigger a path verification to bring our device
state up-to-date, we (a) may recover from path failures earlier and
(b) better handle situations where the hardware/hypervisor doesn't
give us enough notifications.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] cio: Fix race for "fast" path gone/path back situations.
Cornelia Huck [Thu, 17 Apr 2008 05:45:59 +0000 (07:45 +0200)]
[S390] cio: Fix race for "fast" path gone/path back situations.

Make sure we wait for previous evaluations triggered by path state
changes to have settled before we manipulate path states again.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] allnoconfig build error.
Martin Schwidefsky [Thu, 17 Apr 2008 05:45:58 +0000 (07:45 +0200)]
[S390] allnoconfig build error.

Fix the following link error with allnoconfig:

vmem.c:(.text+0x175c): undefined reference to `smp_ptlb_all'
vmem.c:(.text+0x1b24): undefined reference to `smp_ptlb_all'
fork.c:(.text+0x4190): undefined reference to `smp_ptlb_all'
: undefined reference to `smp_ptlb_all'
: undefined reference to `smp_ptlb_all'
mm/built-in.o:: more undefined references to `smp_ptlb_all' follow
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [sub-make] Error 2

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] Protect against sigaltstack wraparound.
Heiko Carstens [Thu, 17 Apr 2008 05:45:57 +0000 (07:45 +0200)]
[S390] Protect against sigaltstack wraparound.

This is just a port of 83bd01024b1fdfc41d9b758e5669e80fca72df66
"x86: protect against sigaltstack wraparound".

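A wraparound check of this kind can be sketched in user space as follows. This is an assumption about the general shape of such a check, not the s390 or x86 patch itself; the struct, the function names and the use of 0 as the failure value are all hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of an alternate signal stack. */
struct demo_altstack {
    uintptr_t base;  /* lowest address of the stack area */
    size_t size;     /* size of the area in bytes */
};

/* True if sp lies inside [base, base + size). */
static bool demo_on_sig_stack(const struct demo_altstack *ss, uintptr_t sp)
{
    return sp - ss->base < ss->size;
}

/*
 * Pick the address for a new signal frame below sp. If we are already
 * on the alternate stack and the frame would end up below its base,
 * report failure (0 in this sketch) instead of writing past the stack.
 */
static uintptr_t demo_get_sigframe(const struct demo_altstack *ss,
                                   uintptr_t sp, size_t frame_size)
{
    if (demo_on_sig_stack(ss, sp) && !demo_on_sig_stack(ss, sp - frame_size))
        return 0;
    return sp - frame_size;
}
```

Without the check, a handler that keeps receiving signals while running on the alternate stack could push frames past the bottom of that stack.
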
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years ago[S390] dasd: fix double elevator_exit call when deadline iosched fails to load
Josef 'Jeff' Sipek [Thu, 17 Apr 2008 05:45:56 +0000 (07:45 +0200)]
[S390] dasd: fix double elevator_exit call when deadline iosched fails to load

I compiled the kernel without deadline, and the dasd code exits the old
scheduler (CFQ), fails to load the new one (deadline), and then things just
hang - with one of these (sorry about the weird chars - I copy & pasted it
from a 3270 console):

dasd(eckd): 0.0.0151: 3390/0A(CU:3990/01) Cyl:3338 Head:15 Sec:224
------------ cut here ------------
Badness at kernel/mutex.c:134
Modules linked in: dasd_eckd_mod dasd_mod
CPU: 0 Not tainted 2.6.25-rc3 #9
Process exe (pid: 538, task: 000000000d172000, ksp: 000000000d21ef88)
Krnl PSW : 0404000180000000 000000000022fb5c (mutex_lock_nested+0x2a4/0x2cc)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000024218 000000000076fc78 0000000000000000 000000000000000f
           000000000022f92e 0000000000449898 000000000f921c00 000003e000162590
           00000000001539c4 000000000d172000 070000007fffffff 000000000d21f400
           000000000f8f2560 00000000002413f8 000000000022fb44 000000000d21f400
Krnl Code: 000000000022fb50: bf2f1000           icm     %r2,15,0(%r1)
           000000000022fb54: a774fef6           brc     7,22f940
           000000000022fb58: a7f40001           brc     15,22fb5a
          >000000000022fb5c: a7f4fef2           brc     15,22f940
           000000000022fb60: c0e5fffa112a       brasl   %r14,171db4
           000000000022fb66: 1222               ltr     %r2,%r2
           000000000022fb68: a784fedb           brc     8,22f91e
           000000000022fb6c: c010002a0086       larl    %r1,76fc78
Call Trace:
(<000000000022f92e> mutex_lock_nested+0x76/0x2cc)
 <00000000001539c4> elevator_exit+0x38/0x80
 <0000000000156ffe> blk_cleanup_queue+0x62/0x7c
 <000003e0001d5414> dasd_change_state+0xe0/0x8ec
 <000003e0001d5cae> dasd_set_target_state+0x8e/0x9c
 <000003e0001d5f74> dasd_generic_set_online+0x160/0x284
 <000003e00011e83a> dasd_eckd_set_online+0x2e/0x40
 <0000000000199bf4> ccw_device_set_online+0x170/0x2c0
 <0000000000199d9e> online_store_recog_and_online+0x5a/0x14c
 <000000000019a08a> online_store+0xbe/0x2ec
 <000000000018456c> dev_attr_store+0x38/0x58
 <000000000010efbc> sysfs_write_file+0x130/0x190
 <00000000000af582> vfs_write+0xb2/0x160
 <00000000000afc7c> sys_write+0x54/0x9c
 <0000000000025e16> sys32_write+0x2e/0x50
 <0000000000024218> sysc_noemu+0x10/0x16
 <0000000077e82bd2> 0x77e82bd2

Set elevator pointer to NULL in order to avoid double elevator_exit
calls when elevator_init call for deadline iosched fails.
Also make sure the dasd device driver depends on IOSCHED_DEADLINE so
the default IO scheduler of the dasd driver is present.
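
The pointer-clearing part of the fix can be modelled in a few lines of user-space C. All names below are hypothetical stand-ins; the sketch only shows why clearing the pointer makes the later cleanup path safe:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the block layer objects. */
struct demo_elevator { int id; };
struct demo_queue { struct demo_elevator *elevator; };

static int demo_exit_calls;  /* counts calls to demo_elevator_exit() */

static void demo_elevator_exit(struct demo_elevator *e)
{
    (void)e;
    demo_exit_calls++;
}

/* Replace the current scheduler; new_sched == NULL models a failed load. */
static int demo_switch_scheduler(struct demo_queue *q,
                                 struct demo_elevator *new_sched)
{
    demo_elevator_exit(q->elevator);
    q->elevator = NULL;          /* key step: nothing left to exit later */
    if (!new_sched)
        return -1;
    q->elevator = new_sched;
    return 0;
}

/* Cleanup path: exits the scheduler only if one is still attached. */
static void demo_cleanup_queue(struct demo_queue *q)
{
    if (q->elevator) {
        demo_elevator_exit(q->elevator);
        q->elevator = NULL;
    }
}

/* Scenario from the report: the switch fails, then the queue is cleaned up. */
static int demo_failed_switch_exit_count(void)
{
    struct demo_elevator old_sched = { 1 };
    struct demo_queue q = { &old_sched };

    demo_exit_calls = 0;
    if (demo_switch_scheduler(&q, NULL) != 0)
        demo_cleanup_queue(&q);
    return demo_exit_calls;
}
```

If the pointer were left untouched after the first exit, the cleanup path would exit the same scheduler a second time.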

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
16 years agoIB/iser: Don't change itt endianness
Erez Zilber [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/iser: Don't change itt endianness

The itt field in struct iscsi_data is not defined with any particular
endianness.  open-iscsi should use it as-is without byte-swapping it.
This fixes sparse warnings coming from doing ntohl(hdr->itt).

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Update module version and release date
Jack Morgenstein [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/mlx4: Update module version and release date

The mlx4_ib driver is stable enough for production use, so bump the
version number to 1.0 to indicate this.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIPoIB: Handle case when P_Key is deleted and re-added at same index
Roland Dreier [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IPoIB: Handle case when P_Key is deleted and re-added at same index

If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.
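
The decision described above can be sketched as a small state machine. This is an illustration with hypothetical names (demo_iface, demo_pkey_event), not the code of __ipoib_ib_dev_flush():

```c
#include <stdbool.h>

enum demo_action { DEMO_NONE, DEMO_STOP, DEMO_RESTART };

/* Hypothetical per-interface state. */
struct demo_iface {
    bool pkey_assigned;       /* was the P_Key present at the last event? */
    unsigned int pkey_index;  /* index the interface currently uses */
};

/*
 * Handle a P_Key change event. 'found' says whether the P_Key is in the
 * table now, 'new_index' where it was found.
 */
static enum demo_action demo_pkey_event(struct demo_iface *ifp, bool found,
                                        unsigned int new_index)
{
    bool was_assigned;

    if (!found) {
        ifp->pkey_assigned = false;
        return DEMO_STOP;
    }
    was_assigned = ifp->pkey_assigned;
    ifp->pkey_assigned = true;
    /* Skip the restart only if the key was present AND the index is unchanged. */
    if (was_assigned && new_index == ifp->pkey_index)
        return DEMO_NONE;
    ifp->pkey_index = new_index;
    return DEMO_RESTART;
}

/* Delete the P_Key, then re-add it at the same index. */
static enum demo_action demo_delete_then_readd(void)
{
    struct demo_iface ifp = { .pkey_assigned = true, .pkey_index = 3 };

    demo_pkey_event(&ifp, false, 0);
    return demo_pkey_event(&ifp, true, 3);
}
```

Comparing the index alone would return DEMO_NONE for the second event, which is the behaviour the paragraph above describes as the bug.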

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually.  This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this
problem and testing this fix.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event
Erez Zilber [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/iser: Release connection resources on RDMA_CM_EVENT_DEVICE_REMOVAL event

When a RDMA_CM_EVENT_DEVICE_REMOVAL event is raised, iSER should
release the connection resources.

This is necessary when the IB HCA module is unloaded while open-iscsi
is still running.  Currently, iSER just BUG()s.

Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Fix incorrect comment
Eli Cohen [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/mlx4: Fix incorrect comment

mlx4 hardware does not support external DDR memory.  Moreover, UAR
area (BAR 2) can change depending on FW version.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Fix race when detaching a QP from a multicast group
Eli Cohen [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/mlx4: Fix race when detaching a QP from a multicast group

When detaching the last QP from an MCG entry, we need to make
sure that at any time, there will be no entry with zero number of
QPs which is linked to the list of the MCGs of the corresponding
hash index.  So don't write back the MCG entry if we are removing the
last QP; just unlink the entry.

Also, remove an unnecessary MCG read when attaching a QP requires
allocation of a new entry in the AMGM.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ehca: Support all ibv_devinfo values in query_device() and query_port()
Stefan Roscher [Thu, 17 Apr 2008 04:09:35 +0000 (21:09 -0700)]
IB/ehca: Support all ibv_devinfo values in query_device() and query_port()

Also, introduce a few inline helper functions to make the code more readable.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoRDMA/nes: Free IRQ before killing tasklet
Roland Dreier [Thu, 17 Apr 2008 04:09:34 +0000 (21:09 -0700)]
RDMA/nes: Free IRQ before killing tasklet

Move the free_irq() call in nes_remove() to before the tasklet_kill();
otherwise there is a window after tasklet_kill() where a new interrupt
can be handled and reschedule the tasklet, leading to a use-after-free
crash.

Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mthca: Update module version and release date
Jack Morgenstein [Thu, 17 Apr 2008 04:09:34 +0000 (21:09 -0700)]
IB/mthca: Update module version and release date

The ib_mthca driver has been stable for a while, so bump the version
number to 1.0 to indicate this.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Update QP state if query QP succeeds
Dotan Barak [Thu, 17 Apr 2008 04:09:34 +0000 (21:09 -0700)]
IB/mlx4: Update QP state if query QP succeeds

If the QP was moved to another state (such as SQE) by the hardware,
then after this change the user won't have to set the IBV_QP_CUR_STATE
mask when executing modify QP to recover from this state.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mthca: Update QP state if query QP succeeds
Dotan Barak [Thu, 17 Apr 2008 04:09:34 +0000 (21:09 -0700)]
IB/mthca: Update QP state if query QP succeeds

If the QP was moved to another state (such as SQE) by the hardware,
then after this change the user won't have to set the IBV_QP_CUR_STATE
mask when executing modify QP to recover from this state.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoRDMA/amso1100: Add check for NULL reply_msg in c2_intr()
Tom Tucker [Thu, 17 Apr 2008 04:09:34 +0000 (21:09 -0700)]
RDMA/amso1100: Add check for NULL reply_msg in c2_intr()

Fix a place where we might dereference a NULL pointer; this fixes
Coverity CID 1392.  On inspection I also found a place where we could
attempt to kmem_cache_free() a NULL pointer, so fix this too.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Add support for resizing CQs
Vladimir Sokolovsky [Thu, 17 Apr 2008 04:09:33 +0000 (21:09 -0700)]
IB/mlx4: Add support for resizing CQs

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Add support for modifying CQ moderation parameters
Eli Cohen [Thu, 17 Apr 2008 04:09:33 +0000 (21:09 -0700)]
IB/mlx4: Add support for modifying CQ moderation parameters

Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIPoIB: Support modifying IPoIB CQ event moderation
Eli Cohen [Thu, 17 Apr 2008 04:09:33 +0000 (21:09 -0700)]
IPoIB: Support modifying IPoIB CQ event moderation

This can be used to tune at run time the parameters controlling the
event (interrupt) generation rate and thus reduce the overhead
incurred by handling interrupts resulting in better throughput.  Since
IPoIB uses a single CQ for both RX and TX, RX is chosen to dictate
configuration for both RX and TX.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/core: Add support for modify CQ
Eli Cohen [Thu, 17 Apr 2008 04:09:33 +0000 (21:09 -0700)]
IB/core: Add support for modify CQ

Add support for modifying CQ parameters for controlling event
generation moderation.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIPoIB: Add basic ethtool support
Eli Cohen [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
IPoIB: Add basic ethtool support

Just add the infrastructure so we can add functionality later.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agomlx4_core: Increase max number of QPs to 128K
Jack Morgenstein [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
mlx4_core: Increase max number of QPs to 128K

With the advent of large clusters that use multicore hosts, 64K QPs
are not enough.  We should increase the default maximum for QPs to 128K.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoRDMA/amso1100: Add support for "send with invalidate" work requests
Roland Dreier [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
RDMA/amso1100: Add support for "send with invalidate" work requests

Handle IB_WR_SEND_WITH_INV work requests.

This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/core: Add support for "send with invalidate" work requests
Roland Dreier [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
IB/core: Add support for "send with invalidate" work requests

Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
"send with invalidate" work request as defined in the iWARP verbs and
the InfiniBand base memory management extensions.  Also put "imm_data"
and a new "invalidate_rkey" member in a new "ex" union in struct
ib_send_wr. The invalidate_rkey member can be used to pass in an
R_Key/STag to be invalidated.  Add this new union to struct
ib_uverbs_send_wr.  Add code to copy the invalidate_rkey field in
ib_uverbs_post_send().
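
A simplified picture of the layout described above, using hypothetical demo_ names and plain uint32_t fields (the real struct ib_send_wr has many more members and its own field types):

```c
#include <stdint.h>

/* Simplified opcode list; only the values needed for the sketch. */
enum demo_wr_opcode {
    DEMO_WR_SEND,
    DEMO_WR_SEND_WITH_IMM,
    DEMO_WR_SEND_WITH_INV
};

/* Cut-down work request: the two extra values share storage in "ex". */
struct demo_send_wr {
    enum demo_wr_opcode opcode;
    union {
        uint32_t imm_data;         /* used with DEMO_WR_SEND_WITH_IMM */
        uint32_t invalidate_rkey;  /* used with DEMO_WR_SEND_WITH_INV */
    } ex;
};

/* Build a "send with invalidate" request for the given R_Key/STag. */
static struct demo_send_wr demo_make_send_with_inv(uint32_t rkey)
{
    struct demo_send_wr wr = { .opcode = DEMO_WR_SEND_WITH_INV };

    wr.ex.invalidate_rkey = rkey;
    return wr;
}
```

Since a single request carries either immediate data or a key to invalidate, the union adds the new member without growing the structure.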

Fix up low-level drivers to deal with the change to struct ib_send_wr,
and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
since that code never does any send with immediate operations.

Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
the iWARP drivers currently in the tree set the bit.  The amso1100
driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
if passed in as part of userspace send requests (since it does not
implement kernel bypass work request queueing).  Remove the flag from
all existing drivers that set it until we know which ones are OK.

The value chosen for the new flag is not consecutive to avoid clashing
with flags defined in the XRC patches, which are not merged yet but
which are already in use and are likely to be merged soon.

This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Update copyright dates for files changed in 2008
Ralph Campbell [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
IB/ipath: Update copyright dates for files changed in 2008

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: add calls to new 7220 code and enable in build
Dave Olson [Thu, 17 Apr 2008 04:09:32 +0000 (21:09 -0700)]
IB/ipath: add calls to new 7220 code and enable in build

This patch adds the initialization calls into the new 7220 HCA files,
changes the Makefile to compile and link the new files, and adds code to
handle send DMA.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Misc changes to prepare for IB7220 introduction
Arthur Jones [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: Misc changes to prepare for IB7220 introduction

The patch adds a number of minor changes to support newer HCAs:
 - New send buffer control bits
 - New error condition bits
 - Locking and initialization changes
 - More send buffers

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: User mode send DMA
Arthur Jones [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: User mode send DMA

This patch adds a new file which allows the IBA7220 send DMA engine to be
used from userland.  The routines here are not linked in yet; that will
happen in a follow-on patch.

Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: User mode send DMA header file
Arthur Jones [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: User mode send DMA header file

This patch adds a new header file which allows the IBA7220 send DMA engine
to be used from userland.  The definitions here are not used yet; that will
happen in a follow-on patch.

Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Add code for IBA7220 send DMA
John Gregor [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: Add code for IBA7220 send DMA

The IBA7220 HCA has a new feature to DMA data to the on chip send
buffers instead of or in addition to the host CPU doing the data
transfer.  This patch adds code to support the send DMA queue.

Signed-off-by: John Gregor <john.gregor@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Add IBA7220-specific SERDES initialization data
Ralph Campbell [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: Add IBA7220-specific SERDES initialization data

This patch adds binary data to initialize the IB SERDES.

Signed-off-by: Michael Albaugh <Michael.Albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Support for SerDes portion of IBA7220
Michael Albaugh [Thu, 17 Apr 2008 04:09:31 +0000 (21:09 -0700)]
IB/ipath: Support for SerDes portion of IBA7220

The control and initialization of the SerDes blocks of the IBA7220 is
sufficiently complex to merit a separate file.

Signed-off-by: Michael Albaugh <Michael.Albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: HCA-specific code to support IBA7220
Ralph Campbell [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: HCA-specific code to support IBA7220

This patch adds the HCA-specific code for the IBA7220 HCA.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Isolate 7220-specific content
Michael Albaugh [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: Isolate 7220-specific content

This patch adds a new ASIC-specific header file for the HCAs using the IBA7220.

Signed-off-by: Michael Albaugh <Michael.Albaugh@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Header file changes to support IBA7220
Ralph Campbell [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: Header file changes to support IBA7220

This is part of a patch series to add support for a new HCA.  This patch
adds new fields to the header files.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Fix up error handling
Ralph Campbell [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: Fix up error handling

This patch makes chip reset more robust and reduces lock contention
between user and kernel TID register updates.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Fix check for no interrupts to reliably fallback to INTx
Dave Olson [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: Fix check for no interrupts to reliably fallback to INTx

Newer HCAs support MSI interrupts and also INTx interrupts.  Fix the
code so that INTx can be reliably enabled if MSI interrupts are not
working.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Enable reduced PIO update for HCAs that support it.
Dave Olson [Thu, 17 Apr 2008 04:09:30 +0000 (21:09 -0700)]
IB/ipath: Enable reduced PIO update for HCAs that support it.

Newer HCAs have a threshold counter to reduce the number of DMAs the
chip makes to update the PIO buffer availability status bits.  This
patch enables the feature.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Set LID filtering for HCAs that support it.
Dave Olson [Thu, 17 Apr 2008 04:09:29 +0000 (21:09 -0700)]
IB/ipath: Set LID filtering for HCAs that support it.

Whenever the LID is set, notify the HCA specific code so that the
appropriate HW registers can be updated. Also log the info on the
console at low priority.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Add support for IBTA 1.2 Heartbeat
Dave Olson [Thu, 17 Apr 2008 04:09:29 +0000 (21:09 -0700)]
IB/ipath: Add support for IBTA 1.2 Heartbeat

This patch adds code to enable/disable the IBTA 1.2 heartbeat for testing
if the HCA supports it.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Make link state transition code ignore (transient) link recovery
Dave Olson [Thu, 17 Apr 2008 04:09:29 +0000 (21:09 -0700)]
IB/ipath: Make link state transition code ignore (transient) link recovery

The hardware-based recovery doesn't need any intervention, and in a few
cases we can get a bit confused about state and skip steps such as
turning off the link state LED when we consider recovery to be "down".
So ignore this transition, and either we recover in hardware, or we
transition to down, and will handle it then.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Add support for 7220 receive queue changes
Ralph Campbell [Thu, 17 Apr 2008 04:09:29 +0000 (21:09 -0700)]
IB/ipath: Add support for 7220 receive queue changes

Newer HCAs have a HW option to write a sequence number to each receive
queue entry and avoid a separate DMA of the tail register to memory.
This patch adds support for these changes.
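
A rough user-space model of the idea, not the ipath code: the consumer decides whether an entry is new by looking at a sequence number stored in the entry itself, so it never has to read a separately updated tail value. The ring size, the names and the choice of 0 as the "never written" value are assumptions made for the sketch:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4u

/* One receive queue entry; seq == 0 means "never written" in this sketch. */
struct demo_rx_entry {
    uint32_t seq;
    uint32_t payload;
};

struct demo_rx_ring {
    struct demo_rx_entry entries[DEMO_RING_SIZE];
    uint32_t head;    /* next slot the consumer will inspect */
    uint32_t expect;  /* sequence number expected in that slot */
};

/* Consume one entry if the producer has written it; no tail read needed. */
static bool demo_rx_poll(struct demo_rx_ring *r, uint32_t *payload)
{
    struct demo_rx_entry *e = &r->entries[r->head];

    if (e->seq != r->expect)
        return false;
    *payload = e->payload;
    r->head = (r->head + 1) % DEMO_RING_SIZE;
    if (++r->expect == 0)  /* skip the reserved "never written" value */
        r->expect = 1;
    return true;
}

/* Producer stand-in writes two entries; returns how many were consumed. */
static int demo_rx_scenario(void)
{
    struct demo_rx_ring r = { .expect = 1 };
    uint32_t v = 0;
    int got = 0;

    if (demo_rx_poll(&r, &v))
        return -1;  /* nothing has been written yet */
    r.entries[0] = (struct demo_rx_entry){ .seq = 1, .payload = 42 };
    r.entries[1] = (struct demo_rx_entry){ .seq = 2, .payload = 43 };
    while (demo_rx_poll(&r, &v))
        got++;
    return got;
}
```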

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Fix some white space and code style issues
Ralph Campbell [Thu, 17 Apr 2008 04:09:29 +0000 (21:09 -0700)]
IB/ipath: Fix some white space and code style issues

This patch makes some white space changes and minor non-functional
changes to more closely match the code in OFED-1.3.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Allow old and new diagnostic packet formats
Michael Albaugh [Thu, 17 Apr 2008 04:09:28 +0000 (21:09 -0700)]
IB/ipath: Allow old and new diagnostic packet formats

This patch checks for old and new format writes to send a packet via the
diagnostic interface.

Signed-off-by: Michael Albaugh <Michael.Albaugh@Qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/core: Check optional verbs before using them
Dotan Barak [Thu, 17 Apr 2008 04:09:28 +0000 (21:09 -0700)]
IB/core: Check optional verbs before using them

Make sure that a device implements the modify_srq and reg_phys_mr
optional methods before calling them.
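
The pattern is a NULL check on the function pointer before the indirect call. A minimal sketch with hypothetical names; returning -ENOSYS for a missing method is an assumption of this example:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device with one optional method. */
struct demo_device {
    int (*modify_srq)(struct demo_device *dev, int limit);  /* may be NULL */
};

/* Wrapper used by callers: never dereferences a missing method. */
static int demo_modify_srq(struct demo_device *dev, int limit)
{
    if (!dev->modify_srq)
        return -ENOSYS;
    return dev->modify_srq(dev, limit);
}

/* Example driver implementation that accepts any limit. */
static int demo_driver_modify_srq(struct demo_device *dev, int limit)
{
    (void)dev;
    (void)limit;
    return 0;
}
```

A driver that leaves the pointer unset then produces an error return instead of a NULL pointer dereference.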

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/ipath: Fix time comparison to use time_after_eq()
Robert P. J. Day [Thu, 17 Apr 2008 04:09:28 +0000 (21:09 -0700)]
IB/ipath: Fix time comparison to use time_after_eq()

Raw comparison against jiffies will fail if jiffies wraps, although
since ipath currently only supports 64-bit architectures, this is rather
far-fetched.  Still, it's better to use time_after_eq().
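
The difference is easiest to see with a narrow counter. The sketch below uses a hypothetical 32-bit tick type in user space; wrap_safe_after_eq() follows the same idea as time_after_eq(), namely comparing the signed difference rather than the raw values:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit tick counter standing in for jiffies. */
typedef uint32_t tick_t;

/* Naive comparison: gives the wrong answer once the counter wraps. */
static bool naive_after_eq(tick_t a, tick_t b)
{
    return a >= b;
}

/*
 * Wrap-safe comparison: the unsigned difference is reinterpreted as a
 * signed value, which is correct as long as a and b are less than half
 * the counter range apart.
 */
static bool wrap_safe_after_eq(tick_t a, tick_t b)
{
    return (int32_t)(a - b) >= 0;
}
```

With a deadline of 0xFFFFFFFA and a current tick of 4 (ten ticks later, after the wrap), the naive test reports that the deadline has not been reached, while the wrap-safe test reports that it has.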

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
16 years agoIB/mlx4: Micro-optimize mlx4_ib_post_send()
Roland Dreier [Thu, 17 Apr 2008 04:09:28 +0000 (21:09 -0700)]
IB/mlx4: Micro-optimize mlx4_ib_post_send()

Rather than have build_mlx_header() return a negative value on failure
and the length of the segments it builds on success, add a pointer
parameter to return the length and return 0 on success.  This matches
the calling convention used for build_lso_seg() and generates slightly
smaller code -- eg, on 64-bit x86:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-22 (-22)
function                                     old     new   delta
mlx4_ib_post_send                           2023    2001     -22
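
The two calling conventions can be compared side by side. The functions below are hypothetical stand-ins, not build_mlx_header() itself, and the 64-byte header length is made up for the example:

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_HDR_LEN 64u  /* hypothetical header size in bytes */

/* Old convention: negative errno on failure, segment length on success. */
static int demo_build_header_old(size_t avail)
{
    if (avail < DEMO_HDR_LEN)
        return -EINVAL;
    return (int)DEMO_HDR_LEN;
}

/* New convention: 0 on success, length returned through *len. */
static int demo_build_header_new(size_t avail, unsigned int *len)
{
    if (avail < DEMO_HDR_LEN)
        return -EINVAL;
    *len = DEMO_HDR_LEN;
    return 0;
}

/* Caller with the new convention: a plain "if (err)" test is enough. */
static int demo_post(size_t avail, unsigned int *total)
{
    unsigned int seglen;
    int err = demo_build_header_new(avail, &seglen);

    if (err)
        return err;
    *total += seglen;
    return 0;
}
```

With the old form the caller has to tell a negative error apart from a positive length using the same return value; the new form keeps the error test and the length in separate variables.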

Signed-off-by: Roland Dreier <rolandd@cisco.com>