Mark Nutter [Wed, 4 Oct 2006 15:26:12 +0000 (17:26 +0200)]
[POWERPC] spufs: scheduler support for NUMA.
This patch adds NUMA support to the the spufs scheduler.
The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
simplified, in an attempt to reduce complexity while adding
support for NUMA scheduler domains. SPUs are allocated starting
from the calling thread's node, moving to others as supported by
current->cpus_allowed. Preemption is gone as it was buggy, but
should be re-enabled in another patch when stable.
The new arch/powerpc/platforms/cell/spu_base.c maintains idle
lists on a per-node basis, and allows caller to specify which
node(s) an SPU should be allocated from, while passing -1 tells
spu_alloc() that any node is allowed.
Since the patch removes the currently implemented preemptive
scheduling, it is technically a regression, but practically
all users have since migrated to this version, as it is
part of the IBM SDK and the yellowdog distribution, so there
is not much point holding it back while the new preemptive
scheduling patch gets delayed further.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] spufs: cell spu problem state mapping updates
This patch adds a new "psmap" file to spufs that allows mmap of all of
the problem state mapping of SPEs. It is compatible with 64k pages. In
addition, it removes mmap ability of individual files when using 64k
pages, with the exception of signal1 and signal2 which will both map the
entire 64k page holding both registers. It also removes
CONFIG_SPUFS_MMAP as there is no point in not building mmap support in
spufs.
It goes along a separate patch to libspe implementing usage of that new
file to access problem state registers.
Another patch will follow up to fix races opened up by accessing
the 'runcntl' register directly, which is made possible with this
patch.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
The include of linux/smp.h needs to be done before the #if that
checks for the compiler version. Seems like fallout from the
inline assembly cleanup patch vs. the directed yield patch.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix new restore_sigregs function. It copies the user space copy of the
old psw without correcting the psw.mask and the psw.addr high order bit.
While we are at it, simplify save_sigregs a bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Wed, 4 Oct 2006 17:44:01 +0000 (10:44 -0700)]
Merge branch 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] Document bi_sector and sector_t
[PATCH] helper function for retrieving scsi_cmd given host based block layer tag
[PATCH] helper function for retrieving scsi_cmd given host based block layer tag
This was necessitated by the need for a function to get back
to a scsi_cmnd, when an hba the posts its (corresponding) completion
interrupt with a block layer tag as its reference.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (4712): Fix warning when compiling on x86_i64
V4L/DVB (4711): Radio: No need to return void
V4L/DVB (4708): Add tveeprom support for Philips FM1236/FM1216ME MK5
V4L/DVB (4707): 4linux: complete conversion to hotplug safe PCI API
V4L/DVB (4706): Do not enable VIDEO_V4L2 unconditionally
V4L/DVB (4704): SAA713x: fixed compile warning in SECAM fixup
V4L/DVB (4703): Add support for the ASUS EUROPA2 OEM board
V4L/DVB (4702): Fix: set antenna input for DVB-T for Asus P7131 Dual hybrid
V4L/DVB (4701): Saa713x audio fixes
V4L/DVB (4676a): Remove Kconfig item for DiB7000M support
[PATCH] atmel_serial: Fix roundoff error in atmel_console_get_options
The atmel_console_get_options() function initializes the baud,
parity and bits settings from the actual hardware setup, in
case it has been initialized by a e.g. boot loader.
The baud rate, however, is not necessarily exactly equal to one of
the standard baud rates (115200, etc.) This means that the baud rate
calculated by this function may be slightly higher or slightly lower
than one of the standard baud rates.
If the baud rate is slightly lower than the target, this causes
problems when uart_set_option() tries to match the detected baud rate
against the standard baud rate, as it will always select a baud rate
that is lower or equal to the target rate. For example if the
detected baud rate is slightly lower than 115200, usart_set_options()
will select 57600.
This patch fixes the problem by subtracting 1 from the value in BRGR
when calculating the baud rate. The detected baud rate will thus
always be higher than the nearest standard baud rate, and
uart_set_options() will end up doing the right thing.
Tested on ATSTK1000 and AT91RM9200-EK boards. Both are broken without
this patch.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Andrew Victor <andrew@sanpeople.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] AVR32: Allow renumbering of serial devices
Allow the board to remap actual USART peripheral devices to serial
devices by calling at32_map_usart(hw_id, serial_line). This ensures
that even though ATSTK1002 uses USART1 as the first serial port, it
will still have a ttyS0 device.
This also adds a board-specific early setup hook and moves the
at32_setup_serial_console() call there from the platform code.
[PATCH] atmel_serial: Pass fixed register mappings through platform_data
In order to initialize the serial console early, the atmel_serial
driver had to do a hack where it compared the physical address of the
port with an address known to be permanently mapped, and used it as a
virtual address. This got around the limitation that ioremap() isn't
always available when the console is being initalized.
This patch removes that hack and replaces it with a new "regs" field
in struct atmel_uart_data that the board-specific code can initialize
to a fixed virtual mapping for platform devices where this is possible.
It also initializes the DBGU's regs field with the address the driver
used to check against.
On AVR32, the "regs" field is initialized from the physical base
address when this it can be accessed through a permanently 1:1 mapped
segment, i.e. the P4 segment.
If regs is NULL, the console initialization is delayed until the "real"
driver is up and running and ioremap() can be used.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Andrew Victor <andrew@sanpeople.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The at91_serial driver can be used with both AT32 and AT91 devices
from Atmel and has therefore been renamed atmel_serial. The only
thing left is to rename PORT_AT91 PORT_ATMEL.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Andrew Victor <andrew@sanpeople.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Vitaly Wool [Wed, 4 Oct 2006 15:19:58 +0000 (19:19 +0400)]
[MIPS] PNX8550 fixups
This patch fixes the compilation errors on PNX8550 and hard-to-track
bug in interrupt handling.
It also corresponds to the latest changes in PNX8550 serial driver.
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[XFRM]: BEET mode
[TCP]: Kill warning in tcp_clean_rtx_queue().
[NET_SCHED]: Remove old estimator implementation
[ATM]: [zatm] always *pcr in alloc_shaper()
[ATM]: [ambassador] Change the return type to reflect reality
[ATM]: kmalloc to kzalloc patches for drivers/atm
[TIPC]: fix printk warning
[XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect.
[XFRM] STATE: Use destination address for src hash.
[NEIGH]: always use hash_mask under tbl lock
[UDP]: Fix MSG_PROBE crash
[UDP6]: Fix flowi clobbering
[NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE"
[NETFILTER]: ebt_mark: add or/and/xor action support to mark target
[NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
[NETFILTER]: Honour source routing for LVS-NAT
[NETFILTER]: add type parameter to ip_route_me_harder
[NETFILTER]: Kconfig: fix xt_physdev dependencies
* master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6: (41 commits)
[PARISC] Kill wall_jiffies use
[PARISC] Honour "panic_on_oops" sysctl
[PARISC] Fix fs/binfmt_som.c
[PARISC] Export clear_user_page to modules
[PARISC] Make DMA routines more stubby
[PARISC] Define pci_get_legacy_ide_irq
[PARISC] Fix CONFIG_DEBUG_SPINLOCK
[PARISC] Fix HPUX compat compile with current GCC
[PARISC] Fix iounmap compile warning
[PARISC] Add support for Quicksilver AGPGART
[PARISC] Move LBA and SBA register defines to the common ropes.h
[PARISC] Create shared <asm/ropes.h> header
[PARISC] Stash the lba_device in its struct device drvdata
[PARISC] Generalize IS_ASTRO et al to take a parisc_device like
[PARISC] Pretty print the name of the lba type on kernel boot
[PARISC] Remove some obsolete comments and I checked that Reo is similar to Ike
[PARISC] Add hardware found in the rp8400
[PARISC] Allow nested interrupts
[PARISC] Further updates to timer_interrupt()
[PARISC] remove halftick and copy clocktick to local var (gcc can optimize usage)
...
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (25 commits)
[POWERPC] Add support for the mpc832x mds board
[POWERPC] Add initial support for the e300c2 core
[POWERPC] Add MPC8360EMDS default dts file
[POWERPC] Add MPC8360EMDS board support
[POWERPC] Add QUICC Engine (QE) infrastructure
[POWERPC] Add QE device tree node definition
[POWERPC] Don't try to just continue if xmon has no input device
[POWERPC] Fix a printk in pseries_mpic_init_IRQ
[POWERPC] Get default baud rate in udbg_scc
[POWERPC] Fix zImage.coff on oldworld PowerMac
[POWERPC] Fix xmon=off and cleanup xmon initialisation
[POWERPC] Cleanup include/asm-powerpc/xmon.h
[POWERPC] Update swim3 printk after blkdev.h change
[POWERPC] Cell interrupt rework
POWERPC: mpc82xx merge: board-specific/platform stuff(resend)
POWERPC: 8272ads merge to powerpc: common stuff
POWERPC: Added devicetree for mpc8272ads board
[POWERPC] iSeries has no legacy I/O
[POWERPC] implement BEGIN/END_FW_FTR_SECTION
[POWERPC] iSeries does not need pcibios_fixup_resources
...
Linus Torvalds [Wed, 4 Oct 2006 15:15:55 +0000 (08:15 -0700)]
Merge branch 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] message types updated
[PATCH] name_count array overrun
[PATCH] PPID filtering fix
[PATCH] arch filter lists with < or > should not be accepted
Alan Cox [Wed, 4 Oct 2006 11:47:14 +0000 (12:47 +0100)]
[PATCH] pata: teach ali about rev C8, keep pcmcia driver in sync
This fixes support for rev c8 of the ALi/ULi PATA, and keeps pcmcia in
sync so ide_cs and pata_pcmcia are interchangable, both are only changes
to constants.
Right now rev 0xC8 and higher don't work with libata but 0xc8 is in the
field now.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Wed, 4 Oct 2006 15:06:16 +0000 (08:06 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] pata_artop: kill gcc warning
[PATCH] libata: turn off NCQ if queue depth is adjusted to 1
[PATCH] libata: cosmetic changes to constants
[libata] DocBook minor updates, fixes
[libata] PCI ID table cleanup in various drivers
[libata] Print out Status register, if a BSY-sleep takes too long
[libata] init probe_ent->private_data in a common location
[libata] minor PCI IDE probe fixes and cleanups
[libata] Use new PCI_VDEVICE() macro to dramatically shorten ID lists
[PATCH] Fix reference of uninitialised memory in ata_device_add()
If you lose the x bit (eg: by using patch(1)), powerpc won't build. Be
defensive about it...
Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Wed, 4 Oct 2006 09:17:22 +0000 (02:17 -0700)]
[PATCH] The scheduled removal of some OSS drivers
This patch contains the scheduled removal of OSS drivers that:
- have ALSA drivers for the same hardware without known regressions and
- whose Kconfig options have been removed in 2.6.17.
[michal.k.k.piotrowski@gmail.com: build fix] Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:21 +0000 (02:17 -0700)]
[PATCH] RCU: CREDITS and MAINTAINERS
Add MAINTAINERS entry for Read-Copy Update (RCU), listing Dipankar Sarma as
maintainer, and giving the URL for Paul McKenney's RCU site. Add
MAINTAINERS entry for rcutorture, listing myself as maintainer. Add
CREDITS entries for developers of RCU, RCU variants, and rcutorture. Use
Paul McKenney's preferred email address in include/linux/rcupdate.h .
Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Paul McKenney <paulmck@us.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Wed, 4 Oct 2006 09:17:17 +0000 (02:17 -0700)]
[PATCH] rcu: simplify/improve batch tuning
Kill a hard-to-calculate 'rsinterval' boot parameter and per-cpu
rcu_data.last_rs_qlen. Instead, it adds adds a flag rcu_ctrlblk.signaled,
which records the fact that one of CPUs has sent a resched IPI since the
last rcu_start_batch().
Roughly speaking, we need two rcu_start_batch()s in order to move callbacks
from ->nxtlist to ->donelist. This means that when ->qlen exceeds qhimark
and continues to grow, we should send a resched IPI, and then do it again
after we gone through a quiescent state.
On the other hand, if it was already sent, we don't need to do it again
when another CPU detects overflow of the queue.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:16 +0000 (02:17 -0700)]
[PATCH] rcu: add rcu_bh_sync torture type to rcutorture
Use the newly-generic synchronous deferred free function to implement torture
testing for rcu_bh using synchronize_rcu_bh rather than the asynchronous
call_rcu_bh.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:15 +0000 (02:17 -0700)]
[PATCH] rcu: add rcu_sync torture type to rcutorture
Use the newly-generic synchronous deferred free function to implement torture
testing for RCU using synchronize_rcu rather than the asynchronous call_rcu.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:14 +0000 (02:17 -0700)]
[PATCH] rcu: refactor srcu_torture_deferred_free to work for any implementation
Make srcu_torture_deferred_free use cur_ops->sync() so it will work for any
implementation. Move and rename it in preparation for use in the ops of other
implementations.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:13 +0000 (02:17 -0700)]
[PATCH] RCU: add fake writers to rcutorture
rcutorture currently has one writer and an arbitrary number of readers. To
better exercise some of the code paths in RCU implementations, add fake
writer threads which call the synchronize function for the RCU variant in a
loop, with a delay between calls to arrange for different numbers of
writers running in parallel.
[bunk@stusta.de: cleanup] Acked-by: Paul McKenney <paulmck@us.ibm.com> Cc: Dipkanar Sarma <dipankar@in.ibm.com> Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:12 +0000 (02:17 -0700)]
[PATCH] rcu: Fix sign bug making rcu_random always return the same sequence
rcu_random uses a counter rrs_count to occasionally mix data from
get_random_bytes into the state of its pseudorandom generator. However,
the rrs_counter gets declared as an unsigned long, and rcu_random checks
for --rrs_count < 0, so this code will never mix any real random data into
the state, and will thus always return the same sequence of random numbers.
Also, change the return value of rcu_random from long to unsigned long, to
avoid potential issues caused by the use of the % operator, which can
return negative values for negative left operands.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Josh Triplett [Wed, 4 Oct 2006 09:17:11 +0000 (02:17 -0700)]
[PATCH] rcu: Avoid kthread_stop on invalid pointer if rcutorture reader startup fails
rcu_torture_init kmallocs the array of reader threads, then creates each
one with kthread_run, cleaning up with rcu_torture_cleanup if this fails.
rcu_torture_cleanup calls kthread_stop on any non-NULL pointer in the
array; however, any readers after the one that failed to start up will have
invalid pointers, not null pointers. Avoid this by using kzalloc instead.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Stern [Wed, 4 Oct 2006 09:17:06 +0000 (02:17 -0700)]
[PATCH] cpufreq: make the transition_notifier chain use SRCU
This patch (as762) changes the cpufreq_transition_notifier_list from a
blocking_notifier_head to an srcu_notifier_head. This will prevent errors
caused attempting to call down_read() to access the notifier chain at a
time when interrupts must remain disabled, during system suspend.
It's not clear to me whether this is really necessary; perhaps the chain
could be made into an atomic_notifier. However a couple of the callout
routines do use blocking operations, so this approach seems safer.
The head of the notifier chain needs to be initialized before use; this is
done by an __init routine at core_initcall time. If this turns out not to
be a good choice, it can easily be changed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Jesse Brandeburg <jesse.brandeburg@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Stern [Wed, 4 Oct 2006 09:17:05 +0000 (02:17 -0700)]
[PATCH] SRCU: report out-of-memory errors
Currently the init_srcu_struct() routine has no way to report out-of-memory
errors. This patch (as761) makes it return -ENOMEM when the per-cpu data
allocation fails.
The patch also makes srcu_init_notifier_head() report a BUG if a notifier
head can't be initialized. Perhaps it should return -ENOMEM instead, but
in the most likely cases where this might occur I don't think any recovery
is possible. Notifier chains generally are not created dynamically.
[akpm@osdl.org: avoid statement-with-side-effect in macro] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Stern [Wed, 4 Oct 2006 09:17:04 +0000 (02:17 -0700)]
[PATCH] Add SRCU-based notifier chains
This patch (as751) adds a new type of notifier chain, based on the SRCU
(Sleepable Read-Copy Update) primitives recently added to the kernel. An
SRCU notifier chain is much like a blocking notifier chain, in that it must
be called in process context and its callout routines are allowed to sleep.
The difference is that the chain's links are protected by the SRCU
mechanism rather than by an rw-semaphore, so calling the chain has
extremely low overhead: no memory barriers and no cache-line bouncing. On
the other hand, unregistering from the chain is expensive and the chain
head requires special runtime initialization (plus cleanup if it is to be
deallocated).
SRCU notifiers are appropriate for notifiers that will be called very
frequently and for which unregistration occurs very seldom. The proposed
"task notifier" scheme qualifies, as may some of the network notifiers.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adds SRCU operations to rcutorture and updates rcutorture documentation.
Also increases the stress imposed by the rcutorture test.
[bunk@stusta.de: make needlessly global code static] Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections. SRCU is as follows:
o Each use of SRCU creates its own srcu_struct, and each
srcu_struct has its own set of grace periods. This is
critical, as it prevents one subsystem with a blocking
reader from holding up SRCU grace periods for other
subsystems.
o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
and synchronize_srcu()) all take a pointer to a srcu_struct.
o The SRCU primitives must be called from process context.
o srcu_read_lock() returns an int that must be passed to
the matching srcu_read_unlock(). Realtime RCU avoids the
need for this by storing the state in the task struct,
but SRCU needs to allow a given code path to pass through
multiple SRCU domains -- storing state in the task struct
would therefore require either arbitrary space in the
task struct or arbitrary limits on SRCU nesting. So I
kicked the state-storage problem up to the caller.
Of course, it is not permitted to call synchronize_srcu()
while in an SRCU read-side critical section.
o There is no call_srcu(). It would not be hard to implement
one, but it seems like too easy a way to OOM the system.
(Hey, we have enough trouble with call_rcu(), which does
-not- permit readers to sleep!!!) So, if you want it,
please tell me why...
[josht@us.ibm.com: sparse notation] Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the declarations for the architecture helpers into
include/linux/htirq.h from the generic include/linux/pci.h. Hopefully this
will make this distinction clearer.
htirq.h is included where it is needed.
The dependency on the msi code is fixed and removed.
The Makefile is tidied up.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] msi: refactor and move the msi irq_chip into the arch code
It turns out msi_ops was simply not enough to abstract the architecture
specific details of msi. So I have moved the resposibility of constructing
the struct irq_chip to the architectures, and have two architecture specific
functions arch_setup_msi_irq, and arch_teardown_msi_irq.
For simple architectures those functions can do all of the work. For
architectures with platform dependencies they can call into the appropriate
platform code.
With this msi.c is finally free of assuming you have an apic, and this
actually takes less code.
The helpers for the architecture specific code are declared in the linux/msi.h
to keep them separate from the msi functions used by drivers in linux/pci.h
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] msi: only use a single irq_chip for msi interrupts
The logic works like this.
Since we no longer track the state logic by hand in msi.c startup and shutdown
are no longer needed.
By updating msi_set_mask_bit to work on msi devices that do not implement a
mask bit we can always call the mask/unmask functions.
What we really have are mask and unmask so we use them to implement the .mask
and .unmask functions instead of .enable and .disable.
By switching to the handle_edge_irq handler we only need an ack function that
moves the irq if necessary. Which removes the old end and ack functions and
their peculiar logic of sometimes disabling an irq.
This removes the reliance on pre genirq irq handling methods.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] msi: simplify msi sanity checks by adding with generic irq code
Currently msi.c is doing sanity checks that make certain before an irq is
destroyed it has no more users.
By adding irq_has_action I can perform the test is a generic way, instead of
relying on a msi specific data structure.
By performing the core check in dynamic_irq_cleanup I ensure every user of
dynamic irqs has a test present and we don't free resources that are in use.
In msi.c this allows me to kill the attrib.state member of msi_desc and all of
the assciated code to maintain it.
To keep from freeing data structures when irq cleanup code is called to soon
changing dyanamic_irq_cleanup is insufficient because there are msi specific
data structures that are also not safe to free.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Initial generic hypertransport interrupt support
This patch implements two functions ht_create_irq and ht_destroy_irq for
use by drivers. Several other functions are implemented as helpers for
arch specific irq_chip handlers.
The driver for the card I tested this on isn't yet ready to be merged.
However this code is and hypertransport irqs are in use in a few other
places in the kernel. Not that any of this will get merged before 2.6.19
Because the ipath-ht400 is slightly out of spec this code will need to be
generalized to work there.
I think all of the powerpc uses are for a plain interrupt controller in a
chipset so support for native hypertransport devices is a little less
interesting.
However I think this is a half way decent model on how to separate arch
specific and generic helper code, and I think this is a functional model of
how to get the architecture dependencies out of the msi code.
[akpm@osdl.org: Kconfig fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: x86_64 irq: make vector_irq per cpu
This refactors the irq handling code to make the vectors a per cpu resource so
the same vector number can be simultaneously used on multiple cpus for
different irqs.
This should make systems that were hitting limits on the total number of irqs
much more livable.
[akpm@osdl.org: build fix]
[akpm@osdl.org: __target_IO_APIC_irq is unneeded on UP] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: x86_64 irq: Make the external irq handlers report their vector, not the irq number
This is a small pessimization but it paves the way for making this information
per cpu. Which allows the the maximum number of IRQS to become NR_CPUS*224.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: irq: generalize the check for HARDIRQ_BITS
This patch adds support for systems that cannot receive every interrupt on a
single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS.
MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardare
generated interrupts per cpu.
On architectures that support per cpu interrupt delivery this can be a
significant space savings and scalability bonus.
This patch adds support for systems that cannot receive every interrupt on
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with
set_irq_info and set_native_irq_info, with move_irq and move_native_irq. Both
functions did the same thing but they were built and called under different
circumstances. Now that the msi hacks are gone we can kill move_irq and
set_irq_info.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: i386 irq: Remove the msi assumption that irq == vector
This patch removes the change in behavior of the irq allocation code when
CONFIG_PCI_MSI is defined. Removing all instances of the assumption that irq
== vector.
create_irq is rewritten to first allocate a free irq and then to assign that
irq a vector.
assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an
vector not bound to an irq is removed.
The ioapic vector methods are removed, and everything now works with irqs.
The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI
[akpm@osdl.org: cleanup] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: x86_64 irq: Remove the msi assumption that irq == vector
This patch removes the change in behavior of the irq allocation code when
CONFIG_PCI_MSI is defined. Removing all instances of the assumption that irq
== vector.
create_irq is rewritten to first allocate a free irq and then to assign that
irq a vector.
assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an
vector not bound to an irq is removed.
The ioapic vector methods are removed, and everything now works with irqs.
The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: i386 irq: Move msi message composition into io_apic.c
This removes the hardcoded assumption that irq == vector in the msi
composition code, and it allows the msi message composition to setup logical
mode, or lowest priorirty delivery mode as we do for other apic interrupts,
and with the same selection criteria.
Basically this moves the problem of what is in the msi message into the
architecture irq management code where it belongs. Not in a generic layer
that doesn't have enough information to compose msi messages properly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: x86_64 irq: Move msi message composition into io_apic.c
This removes the hardcoded assumption that irq == vector in the msi
composition code, and it allows the msi message composition to setup logical
mode, or lowest priorirty delivery mode as we do for other apic interrupts,
and with the same selection criteria.
Basically this moves the problem of what is in the msi message into the
architecture irq management code where it belongs. Not in a generic layer
that doesn't have enough information to compose msi messages properly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: msi: make the msi code irq based and not vector based
The msi currently allocates irqs backwards. First it allocates a platform
dependent routing value for an interrupt the ``vector'' and then it figures
out from the vector which irq you are on.
For ia64 this is fine. For x86 and x86_64 this is complete nonsense and makes
an enourmous mess of the irq handling code and prevents some pretty
significant cleanups in the code for handling large numbers of irqs.
This patch refactors msi.c to work in terms of irqs and create_irq/destroy_irq
for dynamically managing irqs.
Hopefully this is finally a version of msi.c that is useful on more than just
x86 derivatives.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current implementation of create_irq() is a hack but it is the current
hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
this hack. Thus we are this hack of assuming irq == vector until the
depencencies in the generic irq code are removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current implementation of create_irq() is a hack but it is the current
hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
this hack. Thus we are stuck this hack of assuming irq == vector until the
depencencies in the generic msi code are removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: irq: add a dynamic irq creation API
With the msi support comes a new concept in irq handling, irqs that are
created dynamically at run time.
Currently the msi code allocates irqs backwards. First it allocates a
platform dependent routing value for an interrupt the ``vector'' and then it
figures out from the vector which irq you are on.
This msi backwards allocator suffers from two basic problems. The allocator
suffers because it is trying to do something that is architecture specific in
a generic way making it brittle, inflexible, and tied to tightly to the
architecture implementation. The alloctor also suffers from it's very
backwards nature as it has tied things together that should have no
dependencies.
To solve the basic dynamic irq allocation problem two new architecture
specific functions are added: create_irq and destroy_irq.
create_irq takes no input and returns an unused irq number, that won't be
reused until it is returned to the free poll with destroy_irq. The irq then
can be used for any purpose although the only initial consumer is the msi
code.
destroy_irq takes an irq number allocated with create_irq and returns it to
the free pool.
Making this functionality per architecture increases the simplicity of the irq
allocation code and increases it's flexibility.
dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
irq_desc initializtion that should happen for dynamic irqs.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: msi: simplify the msi irq limit policy
Currently we attempt to predict how many irqs we will be able to allocate with
msi using pci_vector_resources and some complicated accounting, and then we
only allow each device as many irqs as we think are available on average.
Only the s2io driver even takes advantage of this feature all other drivers
have a fixed number of irqs they need and bail if they can't get them.
pci_vector_resources is inaccurate if anyone ever frees an irq. The whole
implmentation is racy. The current irq limit policy does not appear to make
sense with current drivers. So I have simplified things. We can revisit this
we we need a more sophisticated policy.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current msi_ops are short sighted in a number of ways, this patch attempts
to fix the glaring deficiences.
- Report in msi_ops if a 64bit address is needed in the msi message, so we
can fail 32bit only msi structures.
- Send and receive a full struct msi_msg in both setup and target. This is
a little cleaner and allows for architectures that need to modify the data
to retarget the msi interrupt to a different cpu.
- In target pass in the full cpu mask instead of just the first cpu in case
we can make use of the full cpu mask.
- Operate in terms of irqs and not vectors, currently there is still a 1-1
relationship but on architectures other than ia64 I expect this will change.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msg
In support of this I also add a struct msi_msg that captures the the two
address and one data field ina typical msi message, and I remember the pos and
if the address is 64bit in struct msi_desc.
This makes the code a little more readable and easier to maintain, and paves
the way to further simplfications.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: msi: make the msi boolean tests return either 0 or 1
This allows the output of the msi tests to be stored directly in a bit field.
If you don't do this a value greater than one will be truncated and become 0.
Changing true to false with bizare consequences.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: msi: simplify msi enable and disable
The problem. Because the disable routines leave the msi interrupts in all
sorts of half enabled states the enable routines become impossible to
implement correctly, and almost impossible to understand.
Simplifing this allows me to simply kill the buggy reroute_msix_table, and
generally makes the code more maintainable.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Rajesh Shah <rajesh.shah@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: x86_64 irq: Reenable migrating irqs to other cpus
In the latest changes the code for migrating x86_64 irqs was dropped. This
reads it in a fashion that will work even if we change the vector on level
triggered irqs when we migrate them.
[akpm@osdl.org: build fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently move_native_irq disables and renables the irq we are migrating to
ensure we don't take that irq when we are actually doing the migration
operation. Disabling the irq needs to happen but sometimes doing the work is
move_native_irq is too late.
On x86 with ioapics the irq move sequences needs to be:
edge_triggered:
mask irq.
move irq.
unmask irq.
ack irq.
level_triggered:
mask irq.
ack irq.
move irq.
unmask irq.
We can easily perform the edge triggered sequence, with the current defintion
of move_native_irq. However the level triggered case does not map well. For
that I have added move_masked_irq, to allow me to disable the irqs around both
the ack and the move.
Q: Why have we not seen this problem earlier?
A: The only symptom I have been able to reproduce is that if we change
the vector before acknowleding an irq the wrong irq is acknowledged.
Since we currently are not reprogramming the irq vector during
migration no problems show up.
We have to mask the irq before we acknowledge the irq or else we could
hit a window where an irq is asserted just before we acknowledge it.
Edge triggered irqs do not have this problem because acknowledgements
do not propogate in the same way.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] genirq: irq: convert the move_irq flag from a 32bit word to a single bit
The primary aim of this patchset is to remove maintenances problems caused by
the irq infrastructure. The two big issues I address are an artificially
small cap on the number of irqs, and that MSI assumes vector == irq. My
primary focus is on x86_64 but I have touched other architectures where
necessary to keep them from breaking.
- To increase the number of irqs I modify the code to look at the (cpu,
vector) pair instead of just looking at the vector.
With a large number of irqs available systems with a large irq count no
longer need to compress their irq numbers to fit. Removing a lot of brittle
special cases.
For acpi guys the result is that irq == gsi.
- Addressing the fact that MSI assumes irq == vector takes a few more
patches. But suffice it to say when I am done none of the generic irq code
even knows what a vector is.
In quick testing on a large Unisys x86_64 machine we stumbled over at least
one driver that assumed that NR_IRQS could always fit into an 8 bit number.
This driver is clearly buggy today. But this has become a class of bugs that
it is now much easier to hit.
This patch:
This is a minor space optimization. In practice I don't think this has any
affect because of our alignment constraints and the other fields but there is
not point in chewing up an uncessary word and since we already read the flag
field this should improve the cache hit ratio of the irq handler.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rajesh Shah <rajesh.shah@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Wed, 4 Oct 2006 09:16:26 +0000 (02:16 -0700)]
[PATCH] genirq: convert the i386 architecture to irq-chips
This patch converts all the i386 PIC controllers (except VisWS and Voyager,
which I could not test - but which should still work as old-style IRQ layers)
to the new and simpler irq-chip interrupt handling layer.
[akpm@osdl.org: build fix]
[mingo@elte.hu: enable fasteoi handler for i386 level-triggered IO-APIC irqs] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Wed, 4 Oct 2006 09:16:25 +0000 (02:16 -0700)]
[PATCH] genirq: convert the x86_64 architecture to irq-chips
This patch converts all the x86_64 PIC controllers layers to the new and
simpler irq-chip interrupt handling layer.
[mingo@elte.hu: The patch also enables the fasteoi handler for x86_64] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 4 Oct 2006 09:16:24 +0000 (02:16 -0700)]
[PATCH] fbdev: riva warning fix
drivers/video/riva/fbdev.c: In function `riva_get_EDID_OF':
drivers/video/riva/fbdev.c:1846: warning: assignment discards qualifiers from pointer target type
This code is being bad: copying a pointer to read-only OF data into a
non-const pointer.
Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michael Halcrow [Wed, 4 Oct 2006 09:16:22 +0000 (02:16 -0700)]
[PATCH] ecryptfs: fs/Makefile and fs/Kconfig
eCryptfs is a stacked cryptographic filesystem for Linux. It is derived from
Erez Zadok's Cryptfs, implemented through the FiST framework for generating
stacked filesystems. eCryptfs extends Cryptfs to provide advanced key
management and policy features. eCryptfs stores cryptographic metadata in the
header of each file written, so that encrypted files can be copied between
hosts; the file will be decryptable with the proper key, and there is no need
to keep track of any additional information aside from what is already in the
encrypted file itself.
[akpm@osdl.org: updates for ongoing API changes]
[bunk@stusta.de: cleanups]
[akpm@osdl.org: alpha build fix]
[akpm@osdl.org: cleanups]
[tytso@mit.edu: inode-diet updates]
[pbadari@us.ibm.com: generic_file_*_read/write() interface updates]
[rdunlap@xenotime.net: printk format fixes]
[akpm@osdl.org: make slab creation and teardown table-driven] Signed-off-by: Phillip Hellewell <phillip@hellewell.homeip.net> Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Fix linux/nfsd/const.h for make headers_check
make headers_check fails on linux/nfsd/const.h.
Since linux/sunrpc/msg_prot.h does not seem to export anything interesting
for userspace, this patch moves it in the __KERNEL__ protected section.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 4 Oct 2006 09:16:20 +0000 (02:16 -0700)]
[PATCH] knfsd: nfsd4: actually use all the pieces to implement referrals
Use all the pieces set up so far to implement referral support, allowing
return of NFS4ERR_MOVED and fs_locations attribute.
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 4 Oct 2006 09:16:19 +0000 (02:16 -0700)]
[PATCH] knfsd: nfsd4: xdr encoding for fs_locations
Encode fs_locations attribute.
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Manoj Naik [Wed, 4 Oct 2006 09:16:18 +0000 (02:16 -0700)]
[PATCH] knfsd: nfsd4: fslocations data structures
Define FS locations structures, some functions to manipulate them, and add
code to parse FS locations in downcall and add to the exports structure.
[bfields@fieldses.org: bunch of fixes and cleanups] Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
J.Bruce Fields [Wed, 4 Oct 2006 09:16:17 +0000 (02:16 -0700)]
[PATCH] knfsd: nfsd: store export path in export
Store the export path in the svc_export structure instead of storing only the
dentry. This will prevent the need for additional d_path calls to provide
NFSv4 fs_locations support.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>