linux-2.6
18 years ago[PATCH] generic_file_buffered_write(): deadlock on vectored write
Vladimir V. Saveliev [Tue, 27 Jun 2006 09:53:57 +0000 (02:53 -0700)]
[PATCH] generic_file_buffered_write(): deadlock on vectored write

generic_file_buffered_write() prefaults user pages in order to avoid a
deadlock when copying from the same page that the write goes to.

However, there appears to be a problem when the write is vectored:
fault_in_pages_readable() brings in the current segment or part of it
(maxlen).  OTOH, filemap_copy_from_user_iovec() is called to copy a number
of bytes (bytes) which may exceed the current segment, so
filemap_copy_from_user_iovec() switches to the next segment, which has not
been brought in yet, and a pagefault is generated.  That causes a deadlock
if the pagefault is for the same page the write goes to: the page being
written is locked and not uptodate, so the pagefault will deadlock trying
to lock the already-locked page.

[akpm@osdl.org: somewhat rewritten]
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
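
As an illustration of the problem described above, here is a minimal
userspace sketch, not the kernel code.  The helper name clamp_to_segment()
and its parameters are hypothetical.  It shows one way to keep a copy step
inside the segment that was prefaulted: limit the byte count to what is
left of the current iovec segment.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical helper (not a kernel function): return how many bytes one
 * copy step may take when only the current segment has been prefaulted.
 *
 *   seg     - current segment of the vectored write
 *   seg_off - bytes of that segment already consumed
 *   want    - bytes the caller would like to copy in this step
 */
size_t clamp_to_segment(const struct iovec *seg, size_t seg_off, size_t want)
{
    assert(seg != NULL && seg_off <= seg->iov_len);

    size_t left = seg->iov_len - seg_off;   /* bytes left in this segment */

    return want < left ? want : left;
}
```

With a 100-byte segment of which 60 bytes are consumed, a request for a
full page is cut down to 40 bytes, so the copy never reaches into the next
segment.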
18 years ago[PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>
David Woodhouse [Tue, 27 Jun 2006 09:53:56 +0000 (02:53 -0700)]
[PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>

We include config.h on the compiler command line. There's no need for it
to be included again.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] spin/rwlock init cleanups
Ingo Molnar [Tue, 27 Jun 2006 09:53:55 +0000 (02:53 -0700)]
[PATCH] spin/rwlock init cleanups

locking init cleanups:

 - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
 - convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

 - cleanliness
 - lockdep needs control of lock initialization, which the open-coded
   variants do not give
 - it's also useful for -rt and for lock debugging in general

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
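
The motivation above can be illustrated with a userspace mock.  Nothing
below is the kernel's spinlock API: mock_spinlock_t,
MOCK_SPIN_LOCK_UNLOCKED, DEFINE_MOCK_SPINLOCK() and mock_spin_lock_init()
are hypothetical stand-ins.  The sketch only shows why an initializer
function or a defining macro gives debugging code a hook that a bare
"= UNLOCKED" assignment does not.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock, not the kernel's spinlock_t. */
typedef struct {
    int locked;
    const char *name;   /* debug info a lock validator could use */
} mock_spinlock_t;

/* Open-coded style: a bare value, no name and no registration. */
#define MOCK_SPIN_LOCK_UNLOCKED ((mock_spinlock_t){ 0, NULL })

/* Defining macro: the lock's name is recorded at definition time. */
#define DEFINE_MOCK_SPINLOCK(x) mock_spinlock_t x = { 0, #x }

/* Counts the locks that went through the run-time init hook. */
int mock_locks_registered;

/* Run-time initializer: the one place where debugging code can hook in. */
void mock_spin_lock_init_named(mock_spinlock_t *lock, const char *name)
{
    assert(lock != NULL);
    lock->locked = 0;
    lock->name = name;
    mock_locks_registered++;
}

/* Convenience wrapper that passes the argument text as the lock name. */
#define mock_spin_lock_init(lock) mock_spin_lock_init_named((lock), #lock)
```

A lock set up with MOCK_SPIN_LOCK_UNLOCKED carries no name and is never
counted, while the other two forms leave a trace that a validator could
use.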
18 years ago[PATCH] fs/buffer.c: cleanups
Adrian Bunk [Tue, 27 Jun 2006 09:53:54 +0000 (02:53 -0700)]
[PATCH] fs/buffer.c: cleanups

- add a proper prototype for the following global function:
  - buffer_init()

- make the following needlessly global function static:
  - end_buffer_async_write()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] poison: add & use more constants
Randy Dunlap [Tue, 27 Jun 2006 09:53:54 +0000 (02:53 -0700)]
[PATCH] poison: add & use more constants

Add more poison values to include/linux/poison.h.  It's not clear to me
whether some others should be added or not, so I haven't added any of
these:

./include/linux/libata.h:#define ATA_TAG_POISON 0xfafbfcfdU
./arch/ppc/8260_io/fcc_enet.c:1918: memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429: memset(mem, 0xe5, sizeof(struct mon_event_text));
./drivers/char/ftape/lowlevel/ftape-ctl.c:738: memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] update two drivers for poison.h
Randy Dunlap [Tue, 27 Jun 2006 09:53:53 +0000 (02:53 -0700)]
[PATCH] update two drivers for poison.h

Update two drivers to use poison.h.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] add poison.h and patch primary users
Randy Dunlap [Tue, 27 Jun 2006 09:53:52 +0000 (02:53 -0700)]
[PATCH] add poison.h and patch primary users

Localize poison values into one header file for better documentation and
easier/quicker debugging and so that the same values won't be used for
multiple purposes.

Use these constants in core arch., mm, driver, and fs code.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
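
To illustrate what a poison value is for, here is a small userspace
sketch.  The helper names are hypothetical, and the byte 0x6b is assumed
to match POISON_FREE in include/linux/poison.h.  Freed memory is filled
with the poison byte, and a later check detects whether anything wrote to
it afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Poison byte for freed memory (assumed to match the kernel's
 * POISON_FREE; treat the value as illustrative). */
#define POISON_FREE 0x6b

/* Overwrite a buffer with the poison byte when it is released. */
void poison_buffer(void *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, POISON_FREE, len);
}

/* Return 1 if every byte still holds the poison value, i.e. nobody wrote
 * to the buffer after it was released; return 0 otherwise. */
int poison_intact(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++)
        if (p[i] != POISON_FREE)
            return 0;
    return 1;
}
```

Keeping each such value in one header means a pattern seen in a crash dump
can be traced back to a single purpose.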
18 years ago[PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Ingo Molnar [Tue, 27 Jun 2006 09:53:50 +0000 (02:53 -0700)]
[PATCH] vdso: randomize the i386 vDSO by moving it into a vma

Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revert MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] voyager: fix compile after setup rework
James Bottomley [Tue, 27 Jun 2006 09:53:50 +0000 (02:53 -0700)]
[PATCH] voyager: fix compile after setup rework

The following

[PATCH] Clean up and refactor i386 sub-architecture setup

Doesn't quite work, since it leaves out an include of asm/io.h, without
which the use of inb/outb in the setup file won't work.  This corrects that
and also removes a spurious acpi reference that apparently crept in ages
ago but should never have been there.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix subarchitecture breakage with CONFIG_SCHED_SMT
James Bottomley [Tue, 27 Jun 2006 09:53:49 +0000 (02:53 -0700)]
[PATCH] fix subarchitecture breakage with CONFIG_SCHED_SMT

Commit 1e9f28fa1eb9773bf65bae08288c6a0a38eef4a7 ("[PATCH] sched: new
sched domain for representing multi-core") incorrectly made SCHED_SMT
and some of the structures it uses dependent on SMP.

However, this is wrong: the structures are only defined if X86_HT is set, so
SCHED_SMT has to depend on that as well.

The patch broke voyager, since it doesn't provide any of the multi-core
or hyperthreading structures.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix broken vm86 interrupt/signal handling
Aleksey Gorelov [Tue, 27 Jun 2006 09:53:48 +0000 (02:53 -0700)]
[PATCH] fix broken vm86 interrupt/signal handling

Commit c3ff8ec31c1249d268cd11390649768a12bec1b9 ("[PATCH] i386: Don't
miss pending signals returning to user mode after signal processing")
meant that vm86 interrupt/signal handling got broken for the case when
vm86 is called from kernel space.

In this scenario, if a signal is pending because of a vm86 interrupt,
do_notify_resume/do_signal exits immediately due to the user_mode() check,
without processing any signals.  Thus, the resume_userspace handler spins
in a tight loop with a signal pending and TIF_SIGPENDING set.  Previously
everything worked OK.

No in-tree usage of vm86() from kernel space exists, but I've heard
about a number of projects out there which use vm86 calls from kernel,
one of them being this, for instance:

http://dev.gentoo.org/~spock/projects/vesafb-tng/

The following patch fixes the issue.

Signed-off-by: Aleksey Gorelov <aleksey_gorelov@phoenix.com>
Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] i386: use C code for current_thread_info()
Chuck Ebbert [Tue, 27 Jun 2006 09:53:47 +0000 (02:53 -0700)]
[PATCH] i386: use C code for current_thread_info()

Using C code for current_thread_info() lets the compiler optimize it.
With gcc 4.0.2, kernel is smaller:

    text           data     bss     dec     hex filename
 3645212         555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
 3647276         555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
 -------
   -2064

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
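
The computation that the compiler gets to optimize can be sketched in
plain C.  This is a userspace illustration under the assumption that the
per-thread structure sits at the start of a power-of-two sized, naturally
aligned stack block; SKETCH_THREAD_SIZE and stack_base() are hypothetical
names, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack block size for the sketch; must be a power of two. */
#define SKETCH_THREAD_SIZE 8192u

/* Round a stack address down to the start of its aligned block, which is
 * where the per-thread structure is assumed to live. */
uintptr_t stack_base(uintptr_t sp)
{
    /* The mask trick below only works for power-of-two sizes. */
    assert((SKETCH_THREAD_SIZE & (SKETCH_THREAD_SIZE - 1)) == 0);

    return sp & ~((uintptr_t)SKETCH_THREAD_SIZE - 1);
}
```

Because this is an ordinary expression rather than an opaque asm
statement, the compiler is free to combine or reuse the masked value
across nearby uses.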
18 years ago[PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86
Rohit Seth [Tue, 27 Jun 2006 09:53:46 +0000 (02:53 -0700)]
[PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86

Move phys_proc_id and cpu_core_id to the cpuinfo_x86 structure.  A similar
patch for x86_64 was already accepted by Andi earlier this week.

[akpm@osdl.org: fix warning]
Signed-off-by: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: constify some parts of arch/i386/kernel/cpu/
Andreas Mohr [Tue, 27 Jun 2006 09:53:45 +0000 (02:53 -0700)]
[PATCH] x86: constify some parts of arch/i386/kernel/cpu/

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: increase interrupt vector range
Rusty Russell [Tue, 27 Jun 2006 09:53:44 +0000 (02:53 -0700)]
[PATCH] x86: increase interrupt vector range

Remove the limit of 256 interrupt vectors by changing the value stored in
orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
needs to be < 0 to allow the signal code to distinguish between return from
interrupt and return from syscall.  With this change applied, NR_IRQS can
be > 256.

Xen extends the IRQ numbering space to include room for dynamically
allocated virtual interrupts (in the range 256-511), which requires a more
permissive interface to do_IRQ.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
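
The encoding can be checked with a few lines of userspace C.  This is a
sketch of the arithmetic only; encode_vector() and decode_vector() are
hypothetical names, not functions from the patch.  Complementing a
non-negative vector always gives a negative value, whatever its size, and
complementing again recovers the vector.

```c
#include <assert.h>

/* Encode an interrupt vector so that the stored value is always < 0. */
long encode_vector(unsigned int vector)
{
    assert(vector <= 0x7fffffffu);   /* must fit in a non-negative long */
    return ~(long)vector;
}

/* Recover the vector from the stored (negative) value. */
unsigned int decode_vector(long stored)
{
    assert(stored < 0);
    return (unsigned int)~stored;
}
```

For example, vector 0 is stored as -1 and vector 300 as -301, so vectors
in the 256-511 range stay distinguishable from syscall numbers.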
18 years ago[PATCH] x86: cpu_init(): avoid GFP_KERNEL allocation while atomic
Shaohua Li [Tue, 27 Jun 2006 09:53:43 +0000 (02:53 -0700)]
[PATCH] x86: cpu_init(): avoid GFP_KERNEL allocation while atomic

The patch fixes two issues:

1.  cpu_init() is called with interrupts disabled, so allocating the gdt
   table there with GFP_KERNEL at runtime is not safe.

2.  The gdt table page is leaked in the CPU hotplug case.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] selinux: inherit /proc/self/attr/keycreate across fork
Michael LeMay [Tue, 27 Jun 2006 09:53:42 +0000 (02:53 -0700)]
[PATCH] selinux: inherit /proc/self/attr/keycreate across fork

Update SELinux to cause the keycreate process attribute held in
/proc/self/attr/keycreate to be inherited across a fork and reset upon
execve.  This is consistent with the handling of the other process
attributes provided by SELinux and also makes it simpler to adapt logon
programs to properly handle the keycreate attribute.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] node hotplug: register cpu: remove node struct
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:41 +0000 (02:53 -0700)]
[PATCH] node hotplug: register cpu: remove node struct

With Goto-san's patch, we can add a new pgdat/node at runtime.  I'm now
considering node-hot-add with cpu + memory on ACPI.

I found that the ACPI container, which describes a node, can evaluate cpu
before memory.  This means cpu-hot-add can occur before memory hot-add.

For the most part, cpu-hot-add doesn't depend on node hot-add.  But
register_cpu(), which creates a symbolic link from node to cpu, requires
that the node be onlined before register_cpu() is called.  When a node is
onlined, its pgdat should be there.

This patch-set holds off creating the symbolic link from node to cpu
until the node is onlined.

This removes the node argument from register_cpu().

Currently, register_cpu() requires a 'struct node' as its argument.  But the
array of struct node is now unified in drivers/base/node.c (by Goto's node
hotplug patch), so we can get the struct node in a generic way and this
argument is no longer necessary.

This patch also guarantees that a cpu is added under a node only when the
node is onlined.  This is necessary for the node-hot-add vs. cpu-hot-add
patch following this one.

Moreover, register_cpu() calculates cpu->node_id by cpu_to_node() without
regard to its 'struct node *root' argument.  This patch removes it.

Also modify the callers of register_cpu()/unregister_cpu(), whose args are
changed by the register-cpu-remove-node-struct patch.

[Brice.Goglin@ens-lyon.org: fix it]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate pgdat and...
Yasunori Goto [Tue, 27 Jun 2006 09:53:40 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate pgdat and per node data

This is a patch to allocate pgdat and per node data area for ia64.  The size
for them can be calculated by compute_pernodesize().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat address...
Yasunori Goto [Tue, 27 Jun 2006 09:53:39 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat address array

This refreshes the node_data[] array for ia64.  As I mentioned in previous
patches, ia64 keeps copies of the pgdat address array on each node as
per node data.

In v2 of node_add, this function used stop_machine_run() to update them.  (I
wished for them to be copied as safely as possible.)  But, in this patch,
the arrays are simply copied, and the node_online_map bit is set after
pgdat initialization completes.

So, the kernel must touch the NODE_DATA() macro only after checking
node_online_map.  (Current code already does this.)  This is a simpler way
for just hot-add.

Note : This will be a problem when hot-remove occurs,
       because, even if the online_map bit is set, the kernel may
       touch NODE_DATA() due to a race condition. :-(

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: hold pgdat address...
Yasunori Goto [Tue, 27 Jun 2006 09:53:38 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: hold pgdat address at system running

This is a preparatory patch to share the code that updates ia64's
NODE_DATA() between boot time and hotplug.

Current code remembers the pgdat address in mem_data, which is used only at
boot time.  But this information can also be used at hotplug time by moving
it to a global variable.  The next patch uses this array.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Register sysfs file for hotplugged new node
Yasunori Goto [Tue, 27 Jun 2006 09:53:38 +0000 (02:53 -0700)]
[PATCH] Register sysfs file for hotplugged new node

When a new node is enabled by hot-add, a new sysfs file must be created for
it.  So, if a new node is enabled by add_memory(), register_one_node() is
called to create it.  In addition, i386's arch_register_node() and a part of
powerpc's register_nodes() are consolidated into register_one_node() as
generic code.

This was tested on Tiger4 (IPF) with node hot-plug emulation.

Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sparc64: support sparsemem and !memory hotplug
Yasunori Goto [Tue, 27 Jun 2006 09:53:37 +0000 (02:53 -0700)]
[PATCH] sparc64: support sparsemem and !memory hotplug

Fix "undefined reference to `arch_add_memory'" on sparc64 allmodconfig.

sparc64 doesn't support memory hotplug.  But we want it to support
sparsemem.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] catch valid mem range at onlining memory
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:36 +0000 (02:53 -0700)]
[PATCH] catch valid mem range at onlining memory

This patch allows hot-adding memory which is not aligned to a section.

Currently, hot-added memory has to be aligned to the section size.  On
archs with big sections, this is not useful.

When hot-added memory is registered as an iomem resource by the iomem
resource patch, we can make use of that information to detect the valid
memory range.

Note: With this, non-aligned memory can be registered. To allow hot-adding
      memory with holes, we have to do more work around add_memory().
      (It doesn't allow adding memory to an already existing mem section.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] register hot-added memory to iomem resource
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:35 +0000 (02:53 -0700)]
[PATCH] register hot-added memory to iomem resource

Register hot-added memory to iomem_resource.  With this, /proc/iomem can
show hot-added memory.

Note: kdump uses /proc/iomem to catch the memory range when it is installed.
      So, kdump should be re-installed after /proc/iomem changes.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (call pgdat allocation)
Yasunori Goto [Tue, 27 Jun 2006 09:53:34 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (call pgdat allocation)

Add node-hot-add support to add_memory().

node hotadd uses this sequence.
1. allocate pgdat.
2. refresh NODE_DATA()
3. call free_area_init_node() to initialize
4. create sysfs entry
5. add memory (old add_memory())
6. set node online
7. run kswapd for new node.
(8). update zonelist after pages are onlined. (This is already merged in -mm
   because its update phase is different.)

Note:
  To share as much common code as possible,
  there are 2 changes from v2.
    - The old add_memory(), which is defined by each arch,
      is renamed to arch_add_memory(). The new add_memory() becomes
      the common-code caller of the arch-dependent function.

    - This patch changes add_memory()'s interface
        From: add_memory(start, end)
        TO  : add_memory(nid, start, end).
      The reason is that each arch's old add_memory() contained similar
      code for finding the node id from a physical address.

      In addition, the acpi memory hotplug driver can find the node id
      more easily.  In v2, it had to walk the DSDT's _CRS, matching the
      physical address, to get the handle of its memory device, and then
      get _PXM and the node id, because the input was just a physical
      address.  However, in v3, the acpi driver can use the handle to get
      _PXM and the node id for the new memory device, and can pass just the
      node id to add_memory().

The fix for the interface of arch_add_memory() is in the next patch.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (export kswapd start func)
Yasunori Goto [Tue, 27 Jun 2006 09:53:33 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (export kswapd start func)

When a node is hot-added, kswapd for the node should start.  This exports
the kswapd start function as kswapd_run() for use in add_memory().

[akpm@osdl.org: daemonize() isn't needed when using the kthread API]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (refresh node_data[])
Yasunori Goto [Tue, 27 Jun 2006 09:53:33 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (refresh node_data[])

Refresh NODE_DATA() for generic archs.  In this case, NODE_DATA(nid) ==
node_data[nid].  node_data[] is an array of pgdat addresses, so the refresh
is quite simple.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (generic alloc node_data)
Yasunori Goto [Tue, 27 Jun 2006 09:53:32 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (generic alloc node_data)

For node hotplug, basically we have to allocate a new pgdat.  But there are
several types of pgdat implementation.

1. Allocate only pgdat.
   This style allocates only the pgdat area,
   and its address is recorded in node_data[].
   It is the most popular style.

2. Static array of pgdat
   In this case, all pgdats are in a static array.
   Some archs use this style.

3. Allocate not only pgdat, but also per node data.
   To increase performance, each node has a copy of some data as
   per node data, so this area must be allocated too.

   Ia64 uses this style. Ia64 has copies of the node_data[] array
   in each node's per node data to increase performance.

In this series of patches, (1) is treated as the generic arch.

Generic archs can use the generic function. (2) and (3) should have
their own if necessary.

This patch defines the pgdat allocator.
Updating the NODE_DATA() macro function is in another patch.

Signed-off-by: Yasonori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (get node id by acpi)
Yasunori Goto [Tue, 27 Jun 2006 09:53:31 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (get node id by acpi)

This finds the node id from the acpi handle of the memory_device in the
DSDT.  _PXM for the new node can be found by acpi_get_pxm() using the new
memory's handle, so the node id can be found via pxm_to_nid_map[].

  This patch is simpler than in v2 of the node hot-add patch,
  because the old add_memory() function had no node id parameter,
  so the kernel had to find the handle again by physical address via the
  DSDT.  In v3, the node id is simply passed to add_memory().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (specify node id)
Yasunori Goto [Tue, 27 Jun 2006 09:53:30 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (specify node id)

Rename the old add_memory() to arch_add_memory(), and use the node id to
get the pgdat for the node via NODE_DATA().

Note: Powerpc's old add_memory() is defined as __devinit. However,
      add_memory() is usually called only after bootup, so
      I suppose it may be redundant. But I don't know powerpc well,
      so I kept it. (__meminit would be better, at least.)

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Catch notification of memory add event of ACPI via container driver. (avoid...
Yasunori Goto [Tue, 27 Jun 2006 09:53:29 +0000 (02:53 -0700)]
[PATCH] Catch notification of memory add event of ACPI via container driver. (avoid redundant call add_memory)

When acpi_memory_device_init() is called at boot time to register struct
memory acpi_memory_device, acpi_bus_add() is called via
acpi_driver_attach().

But it also calls the ops->start() function, even if the memory blocks
were already initialized early at boot time.  In this case add_memory()
returns -EEXIST, and the memory blocks are put into the INVALID state even
though they are fine.

This patch avoids calling add_memory() for memory that is already available.

[akpm@osdl.org: coding cleanups]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Catch notification of memory add event of ACPI via container driver. (registe...
Yasunori Goto [Tue, 27 Jun 2006 09:53:28 +0000 (02:53 -0700)]
[PATCH] Catch notification of memory add event of ACPI via container driver. (register start func for memory device)

This is a patch to call add_memory() when a notify arrives for a new node's
add event.

When a new node is added, the ACPI notify reaches the container device that
represents the node.

The container device driver calls acpi_bus_scan() to find and add the
devices belonging to it (cpu, memory and so on).  That function calls the
add and start functions of those devices' drivers.

However, the current memory hotplug driver registers only an add function,
which creates the sysfs file for its memory.  acpi_memory_enable_device()
is not called, because it only handles the case where the notify reaches
the memory device directly.  So, if the notify reaches the container
device, nothing calls add_memory().

This patch creates a start function which calls add_memory(), so that
add_memory() can be called when the notify reaches the container device.

[akpm@osdl.org: coding cleanups]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] acpi memory hotplug cannot manage _CRS with plural resoureces
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:27 +0000 (02:53 -0700)]
[PATCH] acpi memory hotplug cannot manage _CRS with plural resoureces

Current acpi memory hotplug just looks at the first entry of the resources
in _CRS.  But _CRS can contain multiple resources, and if it does, acpi
memory hot-add cannot add all of the memory.

With this patch, acpi memory hotplug can deal with a Memory Device whose
_CRS contains multiple resources.

Tested on an ia64 memory hotplug test environment (not emulation; it uses
alpha version firmware which supports dynamic reconfiguration of NUMA).

Note: Microsoft's Windows Server 2003 requires big (>4G) resources to be
      divided into small (<4G) resources. This looks crazy, but it is not
      invalid.
      (See http://www.microsoft.com/whdc/system/pnppwr/hotadd/hotaddmem.mspx)
      For this reason, a firmware vendor who supports Windows writes multiple
      resources in a _CRS even if they are contiguous.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
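
The difference between looking at the first _CRS entry and walking all of
them can be sketched in userspace C.  struct mem_range and both functions
are hypothetical stand-ins; the real driver works on ACPI resource
descriptors, which are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory resource reported by _CRS. */
struct mem_range {
    uint64_t start;
    uint64_t length;
};

/* Old behaviour in this sketch: only the first entry is considered. */
uint64_t bytes_from_first(const struct mem_range *r, size_t n)
{
    return n ? r[0].length : 0;
}

/* New behaviour: walk every entry so that no range is left out. */
uint64_t bytes_from_all(const struct mem_range *r, size_t n)
{
    uint64_t total = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += r[i].length;
    return total;
}
```

For a 4G device described as two contiguous 2G entries, the first variant
accounts for only 2G, while the second covers the whole device.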
18 years ago[PATCH] pm_trace is dangerous
Andrew Morton [Tue, 27 Jun 2006 09:53:26 +0000 (02:53 -0700)]
[PATCH] pm_trace is dangerous

CONFIG_PM_TRACE scrogs your RTC.  Mark it as experimental, and defaulting to
`off'.

Also beef up the help message a bit.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] zlib inflate: fix function definitions
Randy Dunlap [Tue, 27 Jun 2006 09:53:26 +0000 (02:53 -0700)]
[PATCH] zlib inflate: fix function definitions

Fix function definitions to be ANSI-compliant:
lib/zlib_inflate/inffast.c:68:1: warning: non-ANSI definition of function 'inflate_fast'
lib/zlib_inflate/inftrees.c:33:1: warning: non-ANSI definition of function 'zlib_inflate_table'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] kernel/acct: fix function definition
Randy Dunlap [Tue, 27 Jun 2006 09:53:25 +0000 (02:53 -0700)]
[PATCH] kernel/acct: fix function definition

kernel/acct.c:579:19: warning: non-ANSI function declaration of function 'acct_process'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix static linking of NFS
David Brownell [Tue, 27 Jun 2006 19:59:15 +0000 (12:59 -0700)]
[PATCH] fix static linking of NFS

Builds on ARM report link problems with common configurations like
statically linked NFS (for nfsroot).  The symptom is that __init
section code references __exit section code; that won't work since
the exit sections are discarded (since they can never be called).

The best fix for these particular cases would be an "__init_or_exit"
section annotation.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoInput: fix resetting name, phys and uniq when unregistering device
Dmitry Torokhov [Tue, 27 Jun 2006 12:30:31 +0000 (08:30 -0400)]
Input: fix resetting name, phys and uniq when unregistering device

It should be done before calling class_device_unregister() because
it will destroy the device and free memory if there are no other
references to the device.
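
The ordering issue can be sketched in plain C with a hypothetical
refcounted object (struct fake_dev and its helpers are illustrative
stand-ins, not the real input or class_device API): any field must be
reset while a reference is still held, because dropping the last
reference may free the memory.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device; not struct input_dev. */
struct fake_dev {
	int refcount;
	const char *name;
};

/* Drop one reference; frees the object when the last reference goes away. */
static int fake_put(struct fake_dev *dev)
{
	if (--dev->refcount == 0) {
		free(dev);
		return 1;	/* object is gone, caller must not touch it */
	}
	return 0;
}

/* Correct ordering: reset the fields first, then drop the reference. */
static int fake_unregister(struct fake_dev *dev)
{
	dev->name = NULL;	/* safe: we still hold a reference here */
	return fake_put(dev);	/* may free dev; nothing touches it afterwards */
}
```

Swapping the two statements in fake_unregister() would write to freed
memory whenever the caller held the last reference.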

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoRevert "kbuild: fix make -rR breakage"
Linus Torvalds [Mon, 26 Jun 2006 23:59:26 +0000 (16:59 -0700)]
Revert "kbuild: fix make -rR breakage"

This reverts commit e5c44fd88c146755da6941d047de4d97651404a9.

Thanks to Daniel Ritz and Michal Piotrowski for noticing the problem.

Daniel says:

  "[The] reason is a recent change that made modules always shows as
   module.mod.  it breaks modprobe and probably many scripts..besides
   lsmod looking horrible

   stuff like this in modprobe.conf:
        install pcmcia_core /sbin/modprobe --ignore-install pcmcia_core; /sbin/modprobe pcmcia
   makes modprobe fork/exec endlessly calling itself...until oom
   interrupts it"

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Mon, 26 Jun 2006 23:06:08 +0000 (16:06 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (56 commits)
  [PATCH] fs/ocfs2/dlm/: cleanups
  ocfs2: fix compiler warnings in dlm_convert_lock_handler()
  ocfs2: dlm_print_one_mle() needs to be defined
  ocfs2: remove whitespace in dlmunlock.c
  ocfs2: move dlm work to a private work queue
  ocfs2: fix incorrect error returns
  ocfs2: tune down some noisy messages during dlm recovery
  ocfs2: display message before waiting for recovery to complete
  ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR
  ocfs2: retry operations when a lock is marked in recovery
  ocfs2: use cond_resched() in dlm_thread()
  ocfs2: use GFP_NOFS in some dlm operations
  ocfs2: wait for recovery when starting lock mastery
  ocfs2: continue recovery when a dead node is encountered
  ocfs2: remove unnecessary spin_unlock() in dlm_remaster_locks()
  ocfs2: dlm_remaster_locks() should never exit without completing
  ocfs2: special case recovery lock in dlmlock_remote()
  ocfs2: pending mastery asserts and migrations should block each other
  ocfs2: temporarily disable automatic lock migration
  ocfs2: do not unconditionally purge the lockres in dlmlock_remote()
  ...

18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Mon, 26 Jun 2006 22:01:05 +0000 (15:01 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 3657/1: S3C24XX: Documentation update of Overview.txt
  [ARM] Update mach-types
  [ARM] 3656/1: S3C2412: Add S3C2412 and S3C2413 documentation
  [ARM] 3654/1: add ajeco 1arm sbc support
  [ARM] fix drivers/mfd/ucb1x00-core.c IRQ probing bug
  [ARM] 3651/1: S3C24XX: Make arch list more detailed
  [ARM] 3650/1: S3C2412: Update s3c2410_defconfig
  [ARM] 3649/1: S3C24XX: Fix capitalisation of CPU on SMDK2440
  [ARM] 3612/1: make pci bus optional for ixp4xx platform
  [ARM] Remove MODE_(SVC|IRQ|FIQ|USR) and DEFAULT_FIQ
  [ARM] Remove save_lr/restore_pc macros
  [ARM] Remove partial non-v6 binutils compatibility
  [ARM] Remove LOADREGS macro
  [ARM] Remove RETINSTR macro

18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Mon, 26 Jun 2006 22:00:33 +0000 (15:00 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial

* master.kernel.org:/home/rmk/linux-2.6-serial:
  [SERIAL] 8250_pnp: add support for other Wacom tablets

18 years ago[ARM] 3657/1: S3C24XX: Documentation update of Overview.txt
Ben Dooks [Mon, 26 Jun 2006 21:51:08 +0000 (22:51 +0100)]
[ARM] 3657/1: S3C24XX: Documentation update of Overview.txt

Patch from Ben Dooks

Update the list of supported devices, and remove the
changelog. Add SMDK2413 information.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18 years ago[ARM] Update mach-types
Russell King [Mon, 26 Jun 2006 21:50:21 +0000 (22:50 +0100)]
[ARM] Update mach-types

Usual mach-types update.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18 years ago[PATCH] fs/ocfs2/dlm/: cleanups
Adrian Bunk [Tue, 16 May 2006 15:26:41 +0000 (17:26 +0200)]
[PATCH] fs/ocfs2/dlm/: cleanups

This patch #if 0's the no longer used dlm_dump_lock_resources().

Since this makes dlmdebug.h empty, this patch also removes this header.

Additionally, the needlessly global dlm_is_node_recovered() is made
static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: fix compiler warnings in dlm_convert_lock_handler()
Mark Fasheh [Mon, 1 May 2006 21:56:57 +0000 (14:56 -0700)]
ocfs2: fix compiler warnings in dlm_convert_lock_handler()

We need to cast to unsigned long long.
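
A minimal sketch of the general pattern, assuming the warning came from
printing a 64-bit value with "%llu" (the helper name and the "cookie"
label are hypothetical, not taken from the dlm code):

```c
#include <stdint.h>
#include <stdio.h>

/*
 * uint64_t may be "unsigned long" or "unsigned long long" depending on
 * the architecture, so "%llu" only matches everywhere when the argument
 * is cast explicitly.
 */
static int format_cookie(char *buf, size_t len, uint64_t cookie)
{
	return snprintf(buf, len, "cookie=%llu", (unsigned long long)cookie);
}
```

Without the cast, compilers on platforms where the 64-bit type is plain
"unsigned long" report a format mismatch.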

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dlm_print_one_mle() needs to be defined
Mark Fasheh [Mon, 1 May 2006 21:55:10 +0000 (14:55 -0700)]
ocfs2: dlm_print_one_mle() needs to be defined

Fixes compile breakage.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: remove whitespace in dlmunlock.c
Kurt Hackel [Mon, 1 May 2006 21:39:57 +0000 (14:39 -0700)]
ocfs2: remove whitespace in dlmunlock.c

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: move dlm work to a private work queue
Kurt Hackel [Mon, 1 May 2006 21:39:29 +0000 (14:39 -0700)]
ocfs2: move dlm work to a private work queue

The work that is done can block for long periods of time and so is not
appropriate for keventd.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: fix incorrect error returns
Kurt Hackel [Mon, 1 May 2006 21:34:08 +0000 (14:34 -0700)]
ocfs2: fix incorrect error returns

Use DLM_REJECTED instead of DLM_RECOVERING.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: tune down some noisy messages during dlm recovery
Kurt Hackel [Mon, 1 May 2006 21:31:37 +0000 (14:31 -0700)]
ocfs2: tune down some noisy messages during dlm recovery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: display message before waiting for recovery to complete
Kurt Hackel [Mon, 1 May 2006 21:30:39 +0000 (14:30 -0700)]
ocfs2: display message before waiting for recovery to complete

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR
Kurt Hackel [Mon, 1 May 2006 21:29:59 +0000 (14:29 -0700)]
ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: retry operations when a lock is marked in recovery
Kurt Hackel [Mon, 1 May 2006 21:29:28 +0000 (14:29 -0700)]
ocfs2: retry operations when a lock is marked in recovery

Before checking for a nonexistent lock, make sure the lockres is not marked
RECOVERING. The caller will just retry and the state should be fixed up when
recovery completes.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: use cond_resched() in dlm_thread()
Kurt Hackel [Mon, 1 May 2006 21:27:41 +0000 (14:27 -0700)]
ocfs2: use cond_resched() in dlm_thread()

yield() does not yield.  cond_resched() does.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: use GFP_NOFS in some dlm operations
Kurt Hackel [Mon, 1 May 2006 21:25:21 +0000 (14:25 -0700)]
ocfs2: use GFP_NOFS in some dlm operations

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: wait for recovery when starting lock mastery
Kurt Hackel [Mon, 1 May 2006 20:54:07 +0000 (13:54 -0700)]
ocfs2: wait for recovery when starting lock mastery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: continue recovery when a dead node is encountered
Kurt Hackel [Mon, 1 May 2006 20:51:49 +0000 (13:51 -0700)]
ocfs2: continue recovery when a dead node is encountered

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: remove unnecessary spin_unlock() in dlm_remaster_locks()
Kurt Hackel [Mon, 1 May 2006 20:50:12 +0000 (13:50 -0700)]
ocfs2: remove unnecessary spin_unlock() in dlm_remaster_locks()

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dlm_remaster_locks() should never exit without completing
Kurt Hackel [Mon, 1 May 2006 20:49:20 +0000 (13:49 -0700)]
ocfs2: dlm_remaster_locks() should never exit without completing

We cannot restart recovery. Once we begin to recover a node, keep the state
of the recovery intact and follow through, regardless of any other node
deaths that may occur.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: special case recovery lock in dlmlock_remote()
Kurt Hackel [Mon, 1 May 2006 20:47:50 +0000 (13:47 -0700)]
ocfs2: special case recovery lock in dlmlock_remote()

If the previous master of the recovery lock dies, let calc_usage take it
down completely and let the caller completely redo the dlmlock() call.
Otherwise, there will never be an opportunity to re-master the lockres and
recovery won't be able to progress.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: pending mastery asserts and migrations should block each other
Kurt Hackel [Mon, 1 May 2006 20:32:27 +0000 (13:32 -0700)]
ocfs2: pending mastery asserts and migrations should block each other

Use the existing structure for blocking migrations when ASTs are pending to
achieve the same result. If we can catch the assert before it goes on the
wire, just cancel it and let the migration continue.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: temporarily disable automatic lock migration
Kurt Hackel [Mon, 1 May 2006 20:30:49 +0000 (13:30 -0700)]
ocfs2: temporarily disable automatic lock migration

Now we never change the owner of a lock resource until unmount or node
death. This will be re-enabled once some issues in the algorithm used have
been resolved.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: do not unconditionally purge the lockres in dlmlock_remote()
Kurt Hackel [Mon, 1 May 2006 20:27:10 +0000 (13:27 -0700)]
ocfs2: do not unconditionally purge the lockres in dlmlock_remote()

In dlmlock_remote(), do not call purge_lockres until the lock resource
actually changes. Otherwise, the mastery info on the lockres will go away
underneath the caller.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: increase backoff before waiting for recovery
Kurt Hackel [Mon, 1 May 2006 19:02:07 +0000 (12:02 -0700)]
ocfs2: increase backoff before waiting for recovery

When mastering non-recovery lock resources, additional time was frequently
needed to allow the disk heartbeat to catch up with the network timeout. The
recovery lock resource is time critical and avoids this path.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: have dlm_pre_master_reco_lockres() ignore dead nodes
Kurt Hackel [Mon, 1 May 2006 18:53:33 +0000 (11:53 -0700)]
ocfs2: have dlm_pre_master_reco_lockres() ignore dead nodes

Recovery will spin in dlm_pre_master_reco_lockres if we do not ignore
timed-out network responses from dead nodes.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: give the dlm dirty list a reference on the lockres
Kurt Hackel [Mon, 1 May 2006 18:51:45 +0000 (11:51 -0700)]
ocfs2: give the dlm dirty list a reference on the lockres

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: teach dlm_restart_lock_mastery() to wait on recovery
Kurt Hackel [Mon, 1 May 2006 18:49:52 +0000 (11:49 -0700)]
ocfs2: teach dlm_restart_lock_mastery() to wait on recovery

Change behavior of dlm_restart_lock_mastery() when a node goes down.  Dump
all responses that have been collected and start over.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: gracefully handle stale create_lock messages.
Kurt Hackel [Mon, 1 May 2006 18:46:59 +0000 (11:46 -0700)]
ocfs2: gracefully handle stale create_lock messages.

This is an error on the sending side, so gracefully error out on the
receiving end.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: update lvb immediately during recovery
Kurt Hackel [Mon, 1 May 2006 18:32:14 +0000 (11:32 -0700)]
ocfs2: update lvb immediately during recovery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: do not send master requests to localhost
Kurt Hackel [Mon, 1 May 2006 18:22:06 +0000 (11:22 -0700)]
ocfs2: do not send master requests to localhost

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: purge lockres' sooner
Kurt Hackel [Mon, 1 May 2006 18:16:45 +0000 (11:16 -0700)]
ocfs2: purge lockres' sooner

Immediately purge a lockres that the local node is not the master of.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dump mismatching migrated lvbs before BUG()
Kurt Hackel [Mon, 1 May 2006 18:15:04 +0000 (11:15 -0700)]
ocfs2: dump mismatching migrated lvbs before BUG()

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: make dlm recovery finalization 2 stage
Kurt Hackel [Mon, 1 May 2006 18:11:13 +0000 (11:11 -0700)]
ocfs2: make dlm recovery finalization 2 stage

Makes it easier for the recovery process to deal with node death.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dlm recovery / lockres reference count fix
Kurt Hackel [Mon, 1 May 2006 17:57:51 +0000 (10:57 -0700)]
ocfs2: dlm recovery / lockres reference count fix

Take a reference on lockres structures while they are on the recovery list.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: better error handling during assert master message
Kurt Hackel [Fri, 28 Apr 2006 02:26:15 +0000 (19:26 -0700)]
ocfs2: better error handling during assert master message

Handle errors during lock assert master by killing either self or the other
node.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dump lockres info before we BUG() on a bad reference
Kurt Hackel [Fri, 28 Apr 2006 02:24:21 +0000 (19:24 -0700)]
ocfs2: dump lockres info before we BUG() on a bad reference

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: do LVB puts in place
Mark Fasheh [Fri, 28 Apr 2006 02:07:45 +0000 (19:07 -0700)]
ocfs2: do LVB puts in place

Don't wait until the AST is fired to do the LVB copy into the lock
resource.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: mle ref count debugging
Kurt Hackel [Fri, 28 Apr 2006 02:04:49 +0000 (19:04 -0700)]
ocfs2: mle ref count debugging

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: allow for an assert message during lock mastery
Kurt Hackel [Fri, 28 Apr 2006 02:03:18 +0000 (19:03 -0700)]
ocfs2: allow for an assert message during lock mastery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: take mle reference during migration
Kurt Hackel [Fri, 28 Apr 2006 02:01:35 +0000 (19:01 -0700)]
ocfs2: take mle reference during migration

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: properly initialize the mle structure
Kurt Hackel [Fri, 28 Apr 2006 02:00:26 +0000 (19:00 -0700)]
ocfs2: properly initialize the mle structure

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: detach mle from heartbeat events
Kurt Hackel [Fri, 28 Apr 2006 01:53:04 +0000 (18:53 -0700)]
ocfs2: detach mle from heartbeat events

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: mle ref counting fixes
Kurt Hackel [Fri, 28 Apr 2006 01:51:26 +0000 (18:51 -0700)]
ocfs2: mle ref counting fixes

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: better mle debugging
Kurt Hackel [Fri, 28 Apr 2006 01:47:41 +0000 (18:47 -0700)]
ocfs2: better mle debugging

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: clean up recovery related messages
Kurt Hackel [Fri, 28 Apr 2006 01:08:51 +0000 (18:08 -0700)]
ocfs2: clean up recovery related messages

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: handle network errors during recovery
Kurt Hackel [Fri, 28 Apr 2006 01:06:58 +0000 (18:06 -0700)]
ocfs2: handle network errors during recovery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: only recover one dead node at a time
Kurt Hackel [Fri, 28 Apr 2006 01:05:41 +0000 (18:05 -0700)]
ocfs2: only recover one dead node at a time

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: Better tracking for recovery state changes
Kurt Hackel [Fri, 28 Apr 2006 01:03:49 +0000 (18:03 -0700)]
ocfs2: Better tracking for recovery state changes

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: Fix empty lvb check
Kurt Hackel [Fri, 28 Apr 2006 01:02:10 +0000 (18:02 -0700)]
ocfs2: Fix empty lvb check

The check for an empty lvb should check the entire buffer not just the first
byte.
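
The difference between the two checks can be sketched as follows. The
function names and the LVB_LEN value are assumptions made for this
example and are not copied from the dlm sources:

```c
#include <stddef.h>

#define LVB_LEN 64	/* assumed buffer length for this sketch */

/* Buggy variant: looks only at the first byte. */
static int lvb_is_empty_first_byte(const unsigned char *lvb)
{
	return lvb[0] == 0;
}

/* Fixed variant: the buffer is empty only if every byte is zero. */
static int lvb_is_empty(const unsigned char *lvb, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (lvb[i] != 0)
			return 0;
	return 1;
}
```

A buffer whose first byte happens to be zero but which carries data
further in is reported as empty by the first variant and as non-empty by
the second.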

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: fix inverted logic in dlm_is_node_dead
Kurt Hackel [Fri, 28 Apr 2006 01:00:21 +0000 (18:00 -0700)]
ocfs2: fix inverted logic in dlm_is_node_dead

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: recheck lockres master before sending an unlock request.
Kurt Hackel [Fri, 28 Apr 2006 00:59:46 +0000 (17:59 -0700)]
ocfs2: recheck lockres master before sending an unlock request.

Recovery may have happened and it may now be mastered locally.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: add a small delay after a failed migration
Kurt Hackel [Fri, 28 Apr 2006 00:58:23 +0000 (17:58 -0700)]
ocfs2: add a small delay after a failed migration

Otherwise we risk starving other threads.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: silence a compile warning in dlm_alloc_pagevec()
Mark Fasheh [Thu, 23 Mar 2006 19:23:29 +0000 (11:23 -0800)]
ocfs2: silence a compile warning in dlm_alloc_pagevec()

Reported by Andrew Morton.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years ago[PATCH] ocfs2: Alloc at least a page for the DLM hash
Joel Becker [Fri, 17 Mar 2006 01:40:37 +0000 (17:40 -0800)]
[PATCH] ocfs2: Alloc at least a page for the DLM hash

The OCFS2 DLM allocates a number of pages for a hash to look up locks.
There was a bug where a PAGE_SIZE bigger than the hash size (eg, 64K
pages) would result in zero pages allocated.
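
The arithmetic behind the bug can be sketched like this (helper names
and the sizes used in the note below are illustrative assumptions, not
the values used by the dlm code):

```c
/* Truncating division: yields 0 whenever page_size exceeds hash_bytes. */
static unsigned long pages_truncating(unsigned long hash_bytes,
				      unsigned long page_size)
{
	return hash_bytes / page_size;
}

/* Same computation, clamped so that at least one page is allocated. */
static unsigned long pages_at_least_one(unsigned long hash_bytes,
					unsigned long page_size)
{
	unsigned long pages = hash_bytes / page_size;

	return pages ? pages : 1;
}
```

With a 16K hash and 64K pages the first helper returns 0 and the second
returns 1; with 4K pages both return 4.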

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: allocate lockres hash pages in an array
Daniel Phillips [Sat, 11 Mar 2006 02:08:16 +0000 (18:08 -0800)]
ocfs2: allocate lockres hash pages in an array

This allows us to have a hash table greater than a single page which greatly
improves dlm performance on some tests.
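
One way to picture a hash table spread over an array of pages is the
userspace sketch below. The structure, the names and the 4096-byte page
size are assumptions for illustration; the real dlm layout may differ.

```c
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL
#define BUCKETS_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(void *))

struct paged_hash {
	void ***pages;		/* array of page-sized bucket arrays */
	unsigned long nr_pages;	/* must be at least 1 */
};

/* Allocate nr_pages separate bucket arrays instead of one big block. */
static int paged_hash_init(struct paged_hash *h, unsigned long nr_pages)
{
	unsigned long i;

	h->pages = calloc(nr_pages, sizeof(*h->pages));
	if (!h->pages)
		return -1;
	for (i = 0; i < nr_pages; i++) {
		h->pages[i] = calloc(BUCKETS_PER_PAGE, sizeof(void *));
		if (!h->pages[i]) {
			while (i--)
				free(h->pages[i]);
			free(h->pages);
			return -1;
		}
	}
	h->nr_pages = nr_pages;
	return 0;
}

/* Map a hash value to its bucket: pick the page, then the slot in it. */
static void **paged_hash_bucket(struct paged_hash *h, unsigned long hash)
{
	unsigned long bucket = hash % (h->nr_pages * BUCKETS_PER_PAGE);

	return &h->pages[bucket / BUCKETS_PER_PAGE][bucket % BUCKETS_PER_PAGE];
}

static void paged_hash_free(struct paged_hash *h)
{
	unsigned long i;

	for (i = 0; i < h->nr_pages; i++)
		free(h->pages[i]);
	free(h->pages);
}
```

Because each page is allocated on its own, the table can grow past one
page without needing a single large contiguous allocation.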

Signed-off-by: Daniel Phillips <phillips@google.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: inline dlm_lockres_get()
Mark Fasheh [Fri, 10 Mar 2006 21:44:00 +0000 (13:44 -0800)]
ocfs2: inline dlm_lockres_get()

It's called on every lookup so this might help performance a bit.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years ago[PATCH] Clean up ocfs2 hash probe and make it faster
Daniel Phillips [Fri, 10 Mar 2006 21:31:47 +0000 (13:31 -0800)]
[PATCH] Clean up ocfs2 hash probe and make it faster

Signed-off-by: Daniel Phillips <phillips@google.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: calculate lockid hash values outside of the spinlock
Mark Fasheh [Fri, 10 Mar 2006 01:55:56 +0000 (17:55 -0800)]
ocfs2: calculate lockid hash values outside of the spinlock

Fixes a performance bug - pointed out by Andrew.
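
The idea can be sketched in userspace C, with a pthread mutex standing
in for the spinlock. The hash function and all names here are
hypothetical; the point is only that the hash is computed before the
lock is taken, so the critical section stays short.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int last_bucket;	/* shared state protected by table_lock */

/* Simple string hash, used here only as a placeholder. */
static unsigned int name_hash(const char *name, size_t len)
{
	unsigned int h = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)name[i];
	return h;
}

static unsigned int lookup_bucket(const char *name, size_t len,
				  unsigned int nr_buckets)
{
	/* The hash depends only on the name, so no lock is needed for it. */
	unsigned int hash = name_hash(name, len);
	unsigned int bucket;

	pthread_mutex_lock(&table_lock);
	bucket = hash % nr_buckets;	/* only shared-state work under the lock */
	last_bucket = bucket;
	pthread_mutex_unlock(&table_lock);
	return bucket;
}
```

Other threads contending for the lock then wait only for the table
access, not for the hashing of the name.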

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>