Jun'ichi Nomura [Thu, 13 Dec 2007 14:15:25 +0000 (14:15 +0000)]
dm: table detect io beyond device
This patch fixes a panic on shrinking a DM device if there is
outstanding I/O to the part of the device that is being removed.
(Normally this doesn't happen - a filesystem would be resized first,
for example.)
The bug is that __clone_and_map() assumes dm_table_find_target()
always returns a valid pointer. It may fail if a bio arrives from the
block layer but its target sector is no longer included in the DM
btree.
This patch appends an empty entry to table->targets[] which will
be returned by a lookup beyond the end of the device.
After calling dm_table_find_target(), __clone_and_map() and target_message()
check for this condition using
dm_target_is_valid().
Ivan Kokshaysky [Thu, 20 Dec 2007 08:47:07 +0000 (11:47 +0300)]
mm: fix exit_mmap BUG() on a.out binary exit
The problem was introduced by commit "mm: variable length argument
support" (b6a2fea39318e43fee84fa7b0b90d68bed92d2ba)
as it didn't update fs/binfmt_aout.c like other binfmt's.
I noticed that on alpha when accidentally launched old OSF/1
Acrobat Reader binary. Obviously, other architectures are affected
as well.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ollie Wild <aaw@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjan van de Ven [Thu, 20 Dec 2007 14:01:17 +0000 (15:01 +0100)]
debug: add end-of-oops marker
Right now it's nearly impossible for parsers that collect kernel crashes
from logs or emails (such as www.kerneloops.org) to detect the
end-of-oops condition. In addition, it's not currently possible to
detect whether or not 2 oopses that look alike are actually the same
oops reported twice, or are truly two unique oopses.
This patch adds an end-of-oops marker, and makes the end marker include
a very simple 64-bit random ID to be able to detect duplicate reports.
Normally, this ID is calculated as a late_initcall() (in the hope that
at that time there is enough entropy to get a unique enough ID); however
for early oopses the oops_exit() function needs to generate the ID on
the fly.
We do this all at the _end_ of an oops printout, so this does not impact
our ability to get the most important portions of a crash out to the
console first.
[ Sidenote: the already existing oopses-since-bootup counter we print
during crashes serves as the differentiator between multiple oopses
that trigger during the same bootup. ]
Tested on 32-bit and 64-bit x86. Artificially injected very early
crashes as well, as expected they result in this constant ID after
multiple bootups:
because the random pools are still all zero. But it all still works
fine and causes no additional problems (which is the main goal of
instrumentation code).
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
T *d;
...
for_each_node_by_type(d,...)
{... when != of_node_put(d)
when != e = d
(
return d;
|
+ of_node_put(d);
? return ...;
)
...}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Christian Krafft <krafft@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Erb <djerb@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Julia Lawall [Thu, 13 Dec 2007 23:56:15 +0000 (15:56 -0800)]
[POWERPC] arch/powerpc: Add missing of_node_put
There should be an of_node_put when breaking out of a loop that iterates
over calls to of_find_all_nodes, as this function does an of_node_get on
the value it returns.
This was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
type T;
identifier d;
expression e;
@@
T *d;
...
for (d = NULL; (d = of_find_all_nodes(d)) != NULL; )
{... when != of_node_put(d)
when != e = d
(
return d;
|
+ of_node_put(d);
? return ...;
)
...}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Julia Lawall [Thu, 13 Dec 2007 23:56:07 +0000 (15:56 -0800)]
[POWERPC] arch/ppc: Remove an unnecessary pci_dev_put
Remove an unnecessary pci_dev_put. pci_dev_put is called implicitly
by the subsequent call to pci_get_device.
The problem was detected using the following semantic patch, and
corrected by hand.
@@
expression dev;
expression E;
@@
- pci_dev_put(dev)
... when != dev = E
- pci_get_device(...,dev)
Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Lucas Woods [Thu, 13 Dec 2007 23:56:06 +0000 (15:56 -0800)]
[POWERPC] arch/ppc: Remove duplicate includes
Signed-off-by: Lucas Woods <woodzy@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Lucas Woods [Thu, 13 Dec 2007 23:56:06 +0000 (15:56 -0800)]
[POWERPC] arch/powerpc: Remove duplicate includes
Signed-off-by: Lucas Woods <woodzy@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Li Zefan [Thu, 6 Dec 2007 09:33:14 +0000 (20:33 +1100)]
[POWERPC] Don't cast a struct pointer to list_head pointer
The casting is safe only when the list_head member is the first member
of the structure, and even then it is better to use the address of the
list_head structure member.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Stop the TOC overflowing for large builds
We were using -mno-minimal-toc on everything in arch/powerpc/kernel,
which means that all the functions in there were putting all their
TOC entries in the top-level TOC, and it was overflowing on an
allyesconfig build. For various reasons, prom_init.c does need
-mno-minimal-toc, but the other .c files in there can use sub-TOCs
quite happily. This change is sufficient for now to stop the TOC
overflowing; other directories under arch/powerpc also use
-mno-minimal-toc and could also be changed later if necessary.
Lmbench runs with and without this patch showed no significant speed
differences.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Fix PCI IRQ fallback code to not map IRQ 0
The PCI IRQ code has a fallback when the device-tree parsing fails, that
tries to map the interrupt indicated by PCI_INTERRUPT_LINE if the firmware
set something in there. This is a bit fragile but has proven useful in some
cases so far. However, it's causing us to incorrectly try to map interrupt 0
on various setups, so let's prevent that case, as none of the cases where
the fallback is legit should have an IRQ 0.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Disable PCI IO/Mem on a device when resources can't be allocated
This patch changes the PowerPC PCI code to disable IO and/or Memory
decoding on a PCI device when a resource of that type failed to be
allocated. This is done to avoid having unallocated dangling BARs
enabled that might try to decode on top of other devices.
If a proper resource is assigned later on, then pci_enable_device()
will take care of re-enabling decoding.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Fixup skipping of PowerMac PCI<->PCI bridge "closed" resources
Apple firmware has a strange way to "close" bridge resources by setting
them to some bogus values that overlap RAM (strangely, I haven't seen it
conflicting with DMA so far...). This explicitely closes them to avoid
problems. Previously, they would be closed as a consequence of failing
to be allocated, but this makes it more explicit, and thus the log
message is more explicit too.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Improve resource setup of PowerMac G5 HT bridge
The device node for the HT bridge on G5s doesn't contain useful ranges.
We used to give it a bunch of the known PCI space and then punch a "hole"
in it based on where the AGP or PCIe region was. This reworks it to
use the actual register in the bridge that controls the decoding instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Various fixes to pcibios_enable_device()
Our implementation of pcibios_enable_device() has a couple of problems.
One is that it should not check IORESOURCE_UNSET, as this might be
left dangling after resource assignment (shouldn't but there are
bugs), but instead, we make it check resource->parent which should
be a reliable indication that the resource has been successfully
claimed (it's in the resource tree).
Then, we also need to skip ROM resources that haven't been enabled
as x86 does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Powermac's use of the pcibios_enable_device_hook() got slightly
broken by the recent PCI merge in that it won't be called for
the "initial" case of assigning resources to a previously
unassigned device. This was an abuse of that hook anyway, so
instead we now use a header quirk.
While at it, we move a #ifdef CONFIG_PPC32 to enclose more code
that is only ever used on 32 bits.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Merge 32 and 64 bits pcibios_enable_device
This merge the two implementations, based on the previously
fixed up 32 bits one. The pcibios_enable_device_hook in ppc_md
is now available for ppc64 use. Also remove the new unused
"initial" parameter from it and fixup users.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Updates/fixes to 32 bits pcibios_enable_device()
Our implementation of pcibios_enable_device() incorrectly ignores
the mask argument and always checks that all resources have been
allocated, which isn't the right thing to do anymore.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The way iSeries manages PCI IO and Memory resources is a bit strange
and is based on overriding the content of those resources with home
cooked ones afterward.
This changes it a bit to better integrate with the new resource handling
so that the "virtual" tokens that iSeries replaces resources with are
done from the proper per-device fixup hook, and bridge resources are
set to enclose that token space. This fixes various things such as
the output of /proc/iomem & ioports, among others. This also fixes up
various boot messages as well.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32 bits PCI code now uses the generic code for assigning unassigned
resources and an algorithm similar to x86 for claiming existing ones.
This works far better than the 64 bits code which basically can only
claim existing ones (pci_probe_only=1) or would fall apart completely.
This merges them so that the new 32 bits implementation is used for both.
64 bits now gets the new PCI flags for controlling the behaviour, though
the old pci_probe_only global is still there for now to be cleared if you
want to.
I kept a pcibios_claim_one_bus() function mostly based on the old 64
bits code for use by the DLPAR hotplug. This will have to be cleaned
up, thought I hope it will work in the meantime.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The PCI code in 32 and 64 bits fixes up resources differently.
32 bits uses a header quirk plus handles bridges in pcibios_fixup_bus()
while 64 bits does things in various places depending on whether you
are using OF probing, using PCI hotplug, etc...
This merges those by basically using the 32 bits approach for both,
with various tweaks to make 64 bits work with the new approach.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] pci32: Remove obsolete PowerMac bus number hack
The 32 bits PCI code carries an old hack that was only useful for G5
machines. Nowdays, the 32 bits kernel doesn't support any of those
machines anymore so the hack is basically never used, so remove it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] pci32: Add flags modifying the PCI code behaviour
This adds to the 32 bits PCI code some flags, replacing the old
pci_assign_all_busses global, that allow us to control various
aspects of the PCI probing, such as whether to re-assign all
resources or not, or to not try to assign anything at all.
This also adds the flag x86 already has to avoid ISA alignment
on bridges that don't have ISA forwarding enabled (no legacy
devices on the top level bus) and sets it for PowerMacs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32 bits PowerPC PCI code has a hack for use by some PowerMacs
to try to re-open PCI<->PCI bridge IO resources that were closed
by the firmware. This is no longer necessary as the generic code
will now do that for us.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] pci32: Use generic pci_assign_unassign_resources
This makes the 32 bits PowerPC PCI code use the generic code to assign
resources to devices that had unassigned or conflicting resources.
This allow us to remove the local implementation that was incomplete and
could not assign for example a PCI<->PCI bridge from scratch, which is
needed on various embedded platforms.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Tue, 18 Dec 2007 04:07:20 +0000 (15:07 +1100)]
[POWERPC] Use embedded dtc in kernel builds
This patch alters the kernel makefiles to build dtc from the sources
embedded in the previous patch. It also changes the
arch/powerpc/boot/wrapper script to use the embedded dtc, rather than
expecting a copy of dtc already installed on the system.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Tue, 18 Dec 2007 04:06:42 +0000 (15:06 +1100)]
[POWERPC] Merge dtc upstream source
This incorporates a copy of dtc into the kernel source, in
arch/powerpc/boot/dtc-src. This commit only imports the upstream
sources verbatim, a later commit will actually link it into the kernel
Makefiles and use the embedded code during the kernel build.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 17 Dec 2007 06:35:53 +0000 (17:35 +1100)]
[POWERPC] Implement pci_set_dma_mask() in terms of the dma_ops
PowerPC currently doesn't implement pci_set_dma_mask(), which means drivers
calling it will get the generic version in drivers/pci/pci.c.
The powerpc dma mapping ops include a dma_set_mask() hook, which luckily is
not implemented by anyone - so there is no bug in the fact that the hook
is currently never called.
However in future we'll add implementation(s) of dma_set_mask(), and so we
need pci_set_dma_mask() to call the hook.
To save adding a hook to the dma mapping ops, pci-set_consistent_dma_mask()
simply calls the dma_set_mask() hook and then copies the new mask into
dev.coherenet_dma_mask.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Milton Miller [Fri, 14 Dec 2007 04:52:20 +0000 (15:52 +1100)]
[POWERPC] Optimize account_system_vtime
We have multiple calls to has_feature being inlined, but gcc can't
be sure that the store via get_paca() doesn't alias the path to
cur_cpu_spec->feature.
Reorder to put the calls to read_purr and read_spurr adjacent to each
other. To add a sense of consistency, reorder the remaining lines to
perform parallel steps on purr and scaled purr of each line instead of
calculating and then using one value before going on to the next.
In addition, we can tell gcc that no SPURR means no PURR. The test is
completely hidden in the PURR case, and in the !PURR case the second test
is eliminated resulting in the simple register copy in the out-of-line
branch.
Further, gcc sees get_paca()->system_time referenced several times and
allocates a register to address it (shadowing r13) instead of caching its
value. Reading into a local varable saves the shadow of r13 and removes
a potentially duplicate load (between the nested if and its parent).
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Milton Miller [Fri, 14 Dec 2007 04:52:19 +0000 (15:52 +1100)]
[POWERPC] Depend on ->initialized in calc_steal_time
If CPU_FTR_PURR is not set, we will never set cpu_purr_data->initialized.
Checking via __get_cpu_var on 64 bit avoids one dependent load compared
to cpu_has_feature in the not-present case, and is always required when
it is present. The code is under CONFIG_VIRT_CPU_ACCOUNTING so 32 bit
will not be affected.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Milton Miller [Fri, 14 Dec 2007 04:52:15 +0000 (15:52 +1100)]
[POWERPC] Timer interrupt: use a struct for two per_cpu varables
timer_interrupt() was calculating per_cpu_offset several times, having to
start from the toc because of potential aliasing issues.
Placing both decrementer per_cpu varables in a struct and calculating
the address once with __get_cpu_var results in better code on both 32
and 64 bit.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Milton Miller [Fri, 14 Dec 2007 04:52:10 +0000 (15:52 +1100)]
[POWERPC] Use __get_cpu_var in time.c
Use __get_cpu_var(x) instead of per_cpu(x, smp_processor_id()), as it
is optimized on ppc64 to access the current cpu's per-cpu offset directly;
it's local_paca.offset instead of TOC->paca[local_paca->processor_id].offset.
This is the trivial portion, two functions with one use each.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Milton Miller [Fri, 14 Dec 2007 04:52:09 +0000 (15:52 +1100)]
[POWERPC] Push down or eliminate smp_processor_id calls in xics code
The per-processor interrupt request register and current processor
priority register are only accessed on the current cpu. In fact the
hypervisor doesn't even let us choose which cpu's registers to access.
The only function to use cpu twice is xics_migrate_irqs_away, not a fast
path. But we can cache the result of get_hard_processor_id() instead of
calling get_hard_smp_processor_id(cpu) in a loop across the call to rtas.
Years ago the irq code passed smp_processor_id into get_irq, I thought
we might initialize the CPPR third party at boot as an extra measure of
saftey, and it made the code symmetric with the qirr (queued interrupt
for software generated interrupts), but now it is just extra and
sometimes unneeded work to pass it down.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Ishizaki Kou [Thu, 13 Dec 2007 10:13:14 +0000 (21:13 +1100)]
[POWERPC] celleb: Split machine definition
This splits the machine definition for celleb into two definitions,
one for celleb_beat, and the other for celleb_native. Though this
looks complex because of sorting some functions, there are no
more semantic changes than that for the splitting.
Olof Johansson [Wed, 12 Dec 2007 06:44:46 +0000 (17:44 +1100)]
[POWERPC] pasemi: Implement MSI support
Implement MSI support for PA Semi PWRficient platforms. MSI is done
through a special range of sources on the openpic controller, and they're
unfortunately breaking the usual concepts of how sources are programmed:
* The source is calculated as 512 + the value written into the MSI
register
* The vector for this source is added to the source and reported
through IACK
This means that for simplicity, it makes much more sense to just set the
vector to 0 for the source, since that's really the vector we expect to
see from IACK.
Also, the affinity/priority registers will affect 16 sources at a
time. To avoid most (simple) users from being limited by this, allocate
16 sources per device but use only one. This means that there's a total
of 32 sources.
If we get usage scenarions that need more sources, the allocator should
probably be revised to take an alignment argument and size, not just do
natural alignment.
Finally, since I'm already touching the MPIC names on pasemi, rename
the base one from the somewhat odd " PAS-OPIC " to "PASEMI-OPIC".
Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit fbd568a3e61a7decb8a754ad952aaa5b5c82e9e5 ("Change
synchronize_kernel to _rcu and _sched") changed the deprecated
synchronize_kernel() in HvLpEvent_unregisterHandler() to
synchronize_rcu(). It turns out that it should have been
synchronize_sched().
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Scott Wood [Tue, 11 Dec 2007 21:23:05 +0000 (08:23 +1100)]
[POWERPC] wrapper: Treat NULL as root node in devp_offset; add devp_offset_find()
Many operations, as currently used in the wrapper, assume they can
pass NULL and have it be treated as the root node. However, libfdt-wrapper
converts NULL to -1, which is only appropriate when searching for nodes,
and will cause an error otherwise.
Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Scott Wood [Tue, 11 Dec 2007 21:23:04 +0000 (08:23 +1100)]
[POWERPC] wrapper: Rename offset in offset_devp()
fdt_wrapper_create_node passes a variable called offset to offset_devp(),
which uses said parameter to initialize a local variable called offset.
Due to one of the odder aspects of the C language, the result is an
undefined variable, with no error or warning.
Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Balbir Singh [Fri, 7 Dec 2007 22:37:14 +0000 (09:37 +1100)]
[POWERPC] Fake NUMA emulation for PowerPC
Here's a dumb simple implementation of fake NUMA nodes for PowerPC.
Fake NUMA nodes can be specified using the following command line option
numa=fake=<node range>
node range is of the format <range1>,<range2>,...<rangeN>
Each of the rangeX parameters is passed using memparse(). I find this
useful for fake NUMA emulation on my simple PowerPC machine. I've
tested it on a non-numa box with the following arguments:
Olof Johansson [Fri, 19 Oct 2007 23:49:50 +0000 (09:49 +1000)]
[POWERPC] MPIC: Minor optimization of ipi handler
Optimize MPIC IPIs, by passing in the IPI number as the argument to the
handler, since all we did was translate it back based on which mpic
the interrupt came though on (and that was always the primary mpic).
Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Linus Torvalds [Wed, 19 Dec 2007 22:29:23 +0000 (14:29 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Adjust CMCI mask on CPU hotplug
[IA64] make flush_tlb_kernel_range() an inline function
[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
[IA64] Fix Altix BTE error return status
[IA64] Remove assembler warnings on head.S
[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
[IA64] set_thread_area fails in IA32 chroot
[IA64] print kernel release in OOPS to make kerneloops.org happy
[IA64] Two trivial spelling fixes
[IA64] Avoid unnecessary TLB flushes when allocating memory
[IA64] ia32 nopage
[IA64] signal: remove redundant code in setup_sigcontext()
IA64: Slim down __clear_bit_unlock
As of PS3 firmware version 2.10, the GPU command buffer size must be at least 2
MiB large. Since we use only a small part of the GPU command buffer and don't
want to waste precious XDR memory, move the GPU command buffer back to the
start of the XDR memory reserved for ps3fb and let the unused part overlap with
the actual frame buffer.
Linus Torvalds [Wed, 19 Dec 2007 22:05:13 +0000 (14:05 -0800)]
Do dirty page accounting when removing a page from the page cache
Krzysztof Oledzki noticed a dirty page accounting leak on some of his
machines, causing the machine to eventually lock up when the kernel
decided that there was too much dirty data, but nobody could actually
write anything out to fix it.
The culprit turns out to be filesystems (cough ext3 with data=journal
cough) that re-dirty the page when the "->invalidatepage()" callback is
called.
Fix it up by doing a final dirty page accounting check when we actually
remove the page from the page cache.
This fixes bugzilla entry 9182:
http://bugzilla.kernel.org/show_bug.cgi?id=9182
Tested-by: Ingo Molnar <mingo@elte.hu> Tested-by: Krzysztof Oledzki <olel@ans.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Horman [Mon, 12 Nov 2007 04:55:21 +0000 (13:55 +0900)]
[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
Access to elfcorehdr_addr needs to be guarded by #if CONFIG_PROC_FS
as well as the existing #if guards.
Fixes the following build problem:
arch/ia64/hp/common/built-in.o: In function
`sba_init':arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
:arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Tony Luck <tony.luck@intel.com>
Russ Anderson [Tue, 21 Aug 2007 21:45:12 +0000 (16:45 -0500)]
[IA64] Fix Altix BTE error return status
The Altix shub2 BTE error detail bits are in a different location
than on shub1. The current code does not take this into account
resulting in all shub2 BTE failures mapping to "unknown".
This patch reads the error detail bits from the proper location,
so the correct BTE failure reason is returned for both shub1
and shub2.
Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Hidetoshi Seto [Wed, 12 Dec 2007 07:28:52 +0000 (16:28 +0900)]
[IA64] Remove assembler warnings on head.S
This patch removes the following assembler warning messages.
AS arch/ia64/kernel/head.o
arch/ia64/kernel/head.S: Assembler messages:
arch/ia64/kernel/head.S:1179: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1179: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
arch/ia64/kernel/head.S:1180: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1180: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
:
arch/ia64/kernel/head.S:1213: Warning: Use of 'ldf.fill.nta' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1213: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Kenji Kaneshige [Wed, 22 Aug 2007 10:28:36 +0000 (19:28 +0900)]
[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
This patch removes the following compiler warning messages.
CC arch/ia64/kernel/irq_ia64.o
arch/ia64/kernel/irq_ia64.c: In function 'create_irq':
arch/ia64/kernel/irq_ia64.c:343: warning: 'domain.bits[0u]' may be used uninitialized in this function
arch/ia64/kernel/irq_ia64.c: In function 'assign_irq_vector':
arch/ia64/kernel/irq_ia64.c:203: warning: 'domain.bits[0u]' may be used uninitialized in this function
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Ian Wienand [Tue, 20 Nov 2007 03:12:46 +0000 (14:12 +1100)]
[IA64] set_thread_area fails in IA32 chroot
I tried to upgrade an IA32 chroot on my IA64 to a new glibc with TLS.
It kept dying because set_thread_area was returning -ESRCH
(bugs.debian.org/451939).
I instrumented arch/ia64/ia32/sys_ia32.c:get_free_idx() and ended up
seeing output like
Johannes Berg [Tue, 11 Dec 2007 14:25:59 +0000 (01:25 +1100)]
[POWERPC] powermac: Use generic suspend code
This adds platform_suspend_ops for PMU based machines, directly in
the PMU driver. This allows suspending via /sys/power/state
on powerbooks.
The patch also replaces the PMU ioctl with a simple call to
pm_suspend(PM_SUSPEND_MEM).
Additionally, it cleans up some debug code.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Wed, 19 Dec 2007 11:45:31 +0000 (22:45 +1100)]
[POWERPC] Fix sleep on powerbook 3400
Sleep on the powerbook 3400 has been broken since the change that made
powerbook_sleep_3400 call pmac_suspend_devices(), which disables
interrupts. There are a couple of loops in powerbook_sleep_3400 that
depend on interrupts being enabled, and in fact it has to have
interrupts enabled at the point of going to sleep since it is an
interrupt from the PMU that wakes it up.
This fixes it by using pmu_wait_complete() instead of a spinloop, and
by explicitly enabling interrupts before putting the CPU into sleep
mode (which is OK since all interrupts except the PMU interrupt have
been disabled at the interrupt controller by this stage).
This changes the logic so that it keeps putting the CPU into sleep mode
until the completion of the interrupt transaction from the PMU that
signals the end of sleep. Also, we now call pmu_unlock() before sleep
so that the via_pmu_interrupt() code can process the interrupt event
from the PMU properly.
Now that generic code saves and restores PCI state, it is no longer
necessary to do that here. Thus pbook_pci_save/restore and related
functions are no longer necessary, so this removes them.
Lastly, this moves the ioremap of the memory controller to init code
rather than doing it on every sleep/wakeup cycle.
Paul Mackerras [Thu, 13 Dec 2007 04:54:45 +0000 (15:54 +1100)]
[POWERPC] Convert therm_pm72.c to use the kthread API
This converts the therm_pm72.c driver to use the kthread API. I
thought about making it use kthread_stop() instead of the `state'
variable and the `ctrl_complete' completion, but that isn't simple and
will require changing the way that `state' is used.
Paul Mackerras [Thu, 13 Dec 2007 04:11:22 +0000 (15:11 +1100)]
[POWERPC] Convert adb.c to use kthread API and not spin on ADB requests
This converts adb.c to use the kthread API.
It also changes adb_request so that if the ADBREQ_SYNC flag is
specified, we now sleep waiting for the request to finish using an
on-stack completion rather than spinning. To implement this, we now
require that if the ADBREQ_SYNC flag is set, the `done' parameter must
be NULL. All of the existing callers of adb_request that pass
ADBREQ_SYNC appear to be in process context and have done == NULL.
Doing this allows us to get rid of an awful hack in adb_request()
where we used to test whether the request was coming from the adb
probe task and use a completion if it was, and otherwise spin.
This also gets rid of a static request block that was used if the req
parameter to adb_request was NULL. None of the callers do that any
more, so the static request block is no longer necessary.
This kills off the remnants of the old sleep notifiers now that they
are no longer used.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Johannes Berg [Tue, 11 Dec 2007 14:21:25 +0000 (01:21 +1100)]
[POWERPC] adb: Replace sleep notifier with platform driver suspend/resume hooks
This replaces the pmu sleep notifier that adb had with suspend/resume
hooks in a new platform driver/device.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Luck, Tony [Tue, 18 Dec 2007 19:46:38 +0000 (11:46 -0800)]
[IA64] print kernel release in OOPS to make kerneloops.org happy
The ia64 oops message doesn't include the kernel version, which
makes it hard to automatically categorize oops messages scraped
from mailing lists and bug databases.
[IA64] Avoid unnecessary TLB flushes when allocating memory
Improve performance of memory allocations on ia64 by avoiding a global TLB
purge to purge a single page from the file cache. This happens whenever we
evict a page from the buffer cache to make room for some other allocation.
Test case: Run 'find /usr -type f | xargs cat > /dev/null' in the
background to fill the buffer cache, then run something that uses memory,
e.g. 'gmake -j50 install'. Instrumentation showed that the number of
global TLB purges went from a few millions down to about 170 over a 12
hours run of the above.
The performance impact is particularly noticeable under virtualization,
because a virtual TLB is generally both larger and slower to purge than
a physical one.
Signed-off-by: Christophe de Dinechin <ddd@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Shi Weihua [Thu, 13 Dec 2007 23:58:26 +0000 (15:58 -0800)]
[IA64] signal: remove redundant code in setup_sigcontext()
This patch removes some redundant code in the function setup_sigcontext().
The registers ar.ccv,b7,r14,ar.csd,ar.ssd,r2-r3 and r16-r31 are not
restored in restore_sigcontext() when (flags & IA64_SC_FLAG_IN_SYSCALL) is
true. So we don't need to zero those variables in setup_sigcontext().
Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
__clear_bit_unlock does not need to perform atomic operations on the
variable. Avoid a cmpxchg and simply do a store with release semantics.
Add a barrier to be safe that the compiler does not do funky things.
Tony: Use intrinsic rather than inline assembler
Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
Jeremy Kerr [Wed, 5 Dec 2007 02:49:31 +0000 (13:49 +1100)]
[POWERPC] cell: handle SPE kernel mappings that cross segment boundaries
Currently, we have a possibilty that the SLBs setup during context
switch don't cover the entirety of the necessary lscsa and code
regions, if these regions cross a segment boundary.
This change checks the start and end of each region, and inserts a SLB
entry for each, if unique. We also remove the assumption that the
spu_save_code and spu_restore_code reside in the same segment, by using
the specific code array for save and restore.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Jeremy Kerr [Wed, 5 Dec 2007 02:49:31 +0000 (13:49 +1100)]
[POWERPC] cell: handle kernel SLB setup in spu_base.c
Currently, the SPU context switch code (spufs/switch.c) sets up the
SPU's SLBs directly, which requires some low-level mm stuff.
This change moves the kernel SLB setup to spu_base.c, by exposing
a function spu_setup_kernel_slbs() to do this setup. This allows us
to remove the low-level mm code from switch.c, making it possible
to later move switch.c to the spufs module.
Also, add a struct spu_slb for the cases where we need to deal with
SLB entries.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Andre Detsch [Wed, 5 Dec 2007 02:49:31 +0000 (13:49 +1100)]
[POWERPC] cell: safer of_has_vicinity routine
This patch changes the way we check for the existence of
vicinity property in spe device nodes.
The new implementation does not depend on having an initialized
cbe_spu_info[0].spus, and checks for presence of vicinity in all
nodes, not only in the first one.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Bob Nelson [Fri, 17 Aug 2007 16:06:09 +0000 (11:06 -0500)]
[POWERPC] OProfile: fix cbe pm signal routing problem
Fix debug_bus_control and group_control PMU register values set up in
set_pm_event(). Initialize variables before calling set_pm_event().
Delete unused static array and code that initialized it.
Rename constant to better reflect usage.
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Kevin Corry [Tue, 11 Dec 2007 12:49:17 +0000 (13:49 +0100)]
[POWERPC] perfmon2: make pm_interval register read/write
The pm_interval register in the Cell PMU is read/write, but was implemented in
the kernel as write-only. Previously, the written value was saved in a "shadow"
copy so calls to cbe_read_pm() could return the value.
Perfmon2 needs to be able to read the current values of pm_interval, so change
cbe_read_pm() to read the actual register instead of the "shadow" copy. There
is currently no code in the kernel that tries to read the pm_interval register
with cbe_read_pm() (expecting to receive the "shadow" value), so this should
not break any existing code.
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com> Signed-off-by: Carl Love <carll@us.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Boaz Harrosh [Mon, 17 Dec 2007 16:08:59 +0000 (18:08 +0200)]
[SCSI] initio: bugfix for accessors patch
patch: [SCSI] initio: convert to use the data buffer accessors had a
small but fatal bug in that it didn't increment the pointer into the
initio scatterlist descriptors as it looped over the block generated
ones. Fixed here.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>