Roland Dreier [Fri, 18 Nov 2005 22:18:26 +0000 (14:18 -0800)]
IB/umad: make sure write()s have sufficient data
Make sure that userspace passes in enough data when sending a MAD. We
always copy at least sizeof (struct ib_user_mad) + IB_MGMT_RMPP_HDR
bytes from userspace, so anything less is definitely invalid. Also,
if the length is less than this limit, it's possible for the second
copy_from_user() to get a negative length and trigger a BUG().
Calculation of QP capabilities still isn't exactly right in mthca:
max_send_sge/max_recv_sge fields returned in create_qp can exceed the
handware supported limits.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
mikem [Fri, 18 Nov 2005 21:05:36 +0000 (22:05 +0100)]
[PATCH 3/3] cciss: add put_disk into cleanup routines
Jeff Garzik pointed me to his code to see how to remove a disk from
the system _properly_. Well, here it is...
Every place we remove disks we are now testing before calling del_gendisk
or blk_cleanup_queue and then call put_disk.
Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 18 Nov 2005 21:02:44 +0000 (22:02 +0100)]
[PATCH 2/3] cciss: bug fix for BIG_PASS_THRU
Applications using CCISS_BIG_PASSTHRU complained that the data written
was zeros. The problem is that the buffer is being cleared after the
user copy, unless the user copy has failed... Correct that logic.
Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <axboe@suse.de>
mikem [Fri, 18 Nov 2005 21:00:17 +0000 (22:00 +0100)]
[PATCH 1/3] cciss: bug fix for hpacucli
This patch fixes a bug that breaks hpacucli, a command line interface
for the HP Array Config Utility. Without this fix the utility will
not detect any controllers in the system. I thought I had already fixed
this, but I guess not.
Thanks to all who reported the issue. Please consider this this inclusion.
Signed-off-by: Mike Miller <mikem@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <axboe@suse.de>
James Ketrenos [Sat, 12 Nov 2005 18:50:12 +0000 (12:50 -0600)]
[PATCH] ipw2100: Fix 'Driver using old /proc/net/wireless...' message
ipw2100: Fix 'Driver using old /proc/net/wireless...' message
Wireless extensions moved the get_wireless_stats handler from being
in net_device into wireless_handler.
A prior instance of this patch resolved the issue for the ipw2200.
This one fixes it for the ipw2100.
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: James Ketrenos <jketreno@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
[PATCH] fec_8xx: make CONFIG_FEC_8XX depend on CONFIG_8xx
Change CONFIG_FEC_8XX to depend on CONFIG_8xx instead of CONFIG_FEC.
CONFIG_FEC depends on ColdFire CPUs, which does not apply for the
PPC 8xx processors.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[PATCH] Add HOWTO do kernel development document to the Documentation directory
Here's a document that describes the process and procedures of how to do Linux
kernel development. It has gone through a number of rounds of review on the
linux-kernel mailing list, and contains contributions and help from Paolo
Ciarrocchi, Randy Dunlap, Gerrit Huizenga, Pat Mochel, Hanna Linder, Kay
Sievers, Vojtech Pavlik, Jan Kara, Josh Boyer, Kees Cook, Andrew Morton, Andi
Kleen, Vadim Lobanov, Jesper Juhl, Adrian Bunk, Keri Harris, Frans Pop, David
A. Wheeler, Junio Hamano, Michael Kerrisk, and Alex Shepard.
[PATCH] drivers/net/wireless/hermes.c unsigned int comparision
hermas_bap_pread, hermes_bap_pwrite, and hermes_bap_pwrite_pad all have a parameter "len" that is declared unsigned,
but checked for a value less than zero. Auditing the callers, it is possible for len to be passed a negative value, so len should be an int.
Thanks to LinuxICC (http://linuxicc.sf.net)
Signed-off-by: Gabriel A. Devenyi <ace@staticwave.ca> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Jesse Brandeburg [Mon, 14 Nov 2005 21:15:49 +0000 (13:15 -0800)]
[PATCH] e100: re-enable microcode with more useful defaults
For the four versions of hardware that we (currently) support microcode
download on, the default configuration of our receive interrupt mitigation
microcode was too aggressive, and caused unnecessary delays when pinging,
and low(er) throughput on single connection latency sensitive performance
tests.
This code adds microcode support, and sets the defaults to more reasonable
settings. It also explains the functionality in the code in more detail.
Compile and load tested, shows expected behavior for slight delay of ping
packets (1-2ms) when ucode is loaded, and decent interrupt moderation for
small packets, while maintaining good throughput.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Tejun Heo [Fri, 18 Nov 2005 05:22:03 +0000 (14:22 +0900)]
[PATCH] sil24: make error_intr less verbose
sil24_error_intr logs all error interrupts. ATAPI devices generates
many harmless errors which can be ignored and all serious ones are
reported via sense data by SCSI layer. Don't log device errors from
ATAPI devices.
Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Tejun Heo [Fri, 18 Nov 2005 05:14:01 +0000 (14:14 +0900)]
[PATCH] sil24: use SRST for phy_reset
There seems to be no way to obtain device signature from sil24 after
SATA phy reset and SRST is needed anyway for later port multiplier
suppport. This patch converts sil24_phy_reset to use SRST instaed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
--
Jeff, I didn't remove the 10ms sleep just to be on the safe side. I
think we can live with 10ms sleep on SRST. Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Tejun Heo [Fri, 18 Nov 2005 05:09:05 +0000 (14:09 +0900)]
[PATCH] sil24: add sil24_restart_controller
When an error condition is raised by device via D2H FIS or SDB. sil24
controller should be restarted by setting PORT_CS_INIT and waiting
until PORT_CS_RDY is asserted instead of resetting the controller.
This patch implements sil24_restart_controller for those cases. This
patch also makes sure that PORT_CS_RDY is asserted on
sil24_reset_controller completion.
Signed-off-by: Tejun Heo <htejun@gmail.com>
--
Jeff, delay is reduced to 1us and cnt increased to 10k. My sil3124
turns on PORT_CS_RDY on the second iteration even without any delay.
I think 10k * 1us should be more than enough.
I tried to convert both restart and reset to use msleep's with work
queue, but if we do that, host_set lock should be released after
initiating restart or reset, leading to race condition among
reset/restart, other interrupts and timeout. Implementing
synchronization among those in low-level driver doesn't seem right.
Well, reduced timeout should work for the time being.
Thanks. Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Russell King [Fri, 18 Nov 2005 17:57:55 +0000 (12:57 -0500)]
[PATCH] smc91x: fix bank mismatch
The smc91x driver relies upon register bank 2 being selected whenever
the interrupt handler is called. This isn't always so, especially if
we have a link change event during PHY configuration.
This results in register bank 0 being selected when the interrupt
handler is called, causing the wrong registers to be read for the
IRQ mask and status. In turn, this causes us to spin with a
permanently asserted IRQ.
The patch ensures that smc_phy_configure always exits with register
bank 2 selected.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Nicolas Pitre [Thu, 17 Nov 2005 19:02:48 +0000 (14:02 -0500)]
[PATCH] smc91x: fix one source of spurious interrupts
Not only SMC_ACK_INT(IM_TX_EMPTY_INT) in in smc_hardware_send_pkt)
appears to be unnecessary (tested with an SMC91C94 and SMC91C111), but
it seems to trigger spurious interrupts on some machines as well.
Removed.
While at it, let's log any remaining spurious interrupts if any (and
clean usage of the max IRQ loop count value).
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Andy Whitcroft [Fri, 18 Nov 2005 09:11:02 +0000 (01:11 -0800)]
[PATCH] ppc64 need HPAGE_SHIFT when huge pages disabled
With the new powerpc architecture we don't seem to be able to disable huge
pages anymore.
mm/built-in.o(.toc1+0xae0): undefined reference to `HPAGE_SHIFT'
make: *** [.tmp_vmlinux1] Error 1
We seem to need to define HPAGE_SHIFT to something when HUGETLB_PAGE isn't
defined. This patch defines it to PAGE_SHIFT when we have no support.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 18 Nov 2005 09:11:01 +0000 (01:11 -0800)]
[PATCH] md: fix is_mddev_idle calculation now that disk/sector accounting happens when request completes
md needs to monitor the rate of requests to its devices when doing
resync/recovery so that it can back-off when there is non-resync IO. It
does this by comparing resync IO, which it counts, with total IO which is
taken from disk_stats.
disk_stats were recently changed to account sectors when a request
completes instead of when it is queued. This upsets md's calculations.
We could do the sync_io accounting at the end of requests too, but that has
problems. If an underlying device is an md array, the accounting will
still be done when the request is submitted. This could be changed for
some raid levels, but it cannot be changed for raid0 or linear without
substantial code changes.
So instead, we increase the error that is_mddev_idle allows, up to the
maximum amount of resync IO that can be in flight at any time. The
calculation is current fragile as each personality as different limits for
in-flight resync. This should be fixed up.
For now, this simple patch fixes the problem.
Increasing the error margin decreases the sensitivity to non-resync IO. To
partially compensate for this, the time to wait when non-resync IO is
detected is increased so that less steady IO is required to keep the resync
at bay.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matt Domsch [Fri, 18 Nov 2005 09:10:54 +0000 (01:10 -0800)]
[PATCH] ipmi: missing NULL test for kthread
On IPMI systems with BT interfaces, we don't start the kernel thread, so
smi_info->thread is NULL. Test for NULL when stopping the thread, because
kthread_stop() doesn't, and an oops ensues otherwise.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul E. McKenney [Fri, 18 Nov 2005 09:10:50 +0000 (01:10 -0800)]
[PATCH] add success/failure indication to RCU torture test
One issue with the RCU torture test is that the current error flagging can
be lost in dmesg. This patch adds a "SUCCESS"/"FAILURE" string to the line
that flags the end of the test, where it can easily be seen with "dmesg |
tail" at the end of the test. Also adds tests of architecture-specific
memory barriers -- or, more likely, of the RCU torture test itself.
Cc: <vatsa@in.ibm.com> Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Fri, 18 Nov 2005 15:29:51 +0000 (07:29 -0800)]
Fix ACPI processor power block initialization
Properly clear the memory, and set "pr->flags.power" only if a C2 or
deeper state is valid (to make the code match both the comment and
previous behaviour).
This fixes a boot-time lockup reported by Maneesh Soni when using
"maxcpus=1".
Acked-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Fri, 18 Nov 2005 05:39:08 +0000 (16:39 +1100)]
powerpc: Move defconfig over and remove remaining arch/ppc64 files
make defconfig will now use arch/powerpc/configs/ppc64_defconfig
if running on a ppc64 system. I need to add an
arch/powerpc/configs/ppc_defconfig sometime.
Paul Mackerras [Fri, 18 Nov 2005 04:52:38 +0000 (15:52 +1100)]
powerpc: time-of-day fixes for 32-bit CHRP systems
This makes 32-bit CHRP systems use the RTAS time-of-day routines if
available. It fixes a bug in the RTAS time-of-day routines where they
were storing a 64-bit timebase value in an unsigned long by making
those variables u64. Also, the direct-access time-of-day routines
had the wrong convention for the month and year in the struct rtc_time.
Paul Mackerras [Fri, 18 Nov 2005 04:47:18 +0000 (15:47 +1100)]
powerpc: Fix compile error on pSeries arising from delay.h changes
pseries_dedicated_idle() was using __get_tb which used to be defined
in asm/delay.h. Change it to use get_tb from asm/time.h, which is
in fact exactly the same thing.
Paul Mackerras [Fri, 18 Nov 2005 04:43:34 +0000 (15:43 +1100)]
powerpc: Move remaining .c files from arch/ppc64 to arch/powerpc
This also deletes the now-unused Makefiles under arch/ppc64.
Both of the files moved over could use some merging, but for now I
have moved them as-is and arranged for them to be used only in 64-bit
kernels. For 32-bit kernels we still use arch/ppc/kernel/idle.c and
drivers/char/generic_nvram.c as before.
Mike Kravetz [Tue, 15 Nov 2005 00:12:49 +0000 (16:12 -0800)]
[PATCH] Remove SPAN_OTHER_NODES config definition
The config option SPAN_OTHER_NODES was created so that we could
make pSeries numa layouts work within the DISCONTIG memory model.
Now that DISCONTIG has been replaced by SPARSEMEM, we can eliminate
this option.
I'll be sending a separate patch to Andrew to remove the arch
independent code as pSeries was the only arch that needed this.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch merges align.c, the result isn't quite what was in ppc64 nor
what was in ppc32 :) It should implement all the functionalities of both
though. Kumar, since you played with that in the past, I suppose you
have some test cases for verifying that it works properly before I dig
out the 601 machine ? :)
Since it's likely that I won't be able to test all scenario, code
inspection is much welcome.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Fri, 18 Nov 2005 02:44:17 +0000 (13:44 +1100)]
powerpc: Fix delay functions for 601 processors
My earlier merge of delay.h introduced a timebase-based udelay for
32-bit machines but also broke the 601, which doesn't have the
timebase register. This fixes it by using the 601's RTC register on
the 601, and also moves __delay() and udelay() to be out-of-line in
arch/powerpc/kernel/time.c. These functions aren't really performance
critical, after all.
Michael Ellerman [Thu, 17 Nov 2005 09:34:35 +0000 (20:34 +1100)]
[PATCH] powerpc: Fix typo in topology.h
The fix to topology.h (5cfccd7f132432dd4705444a44b51d12ef88a85f) seems to have
a typeo, struct sched_domain has an idle_idx member but not an idle_id
member. I assume this is the fix.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Roman Zippel [Thu, 17 Nov 2005 23:22:39 +0000 (15:22 -0800)]
[NET]: Sanitize NET_SCHED protection in /net/sched/Kconfig
On Thu, 17 Nov 2005, David Gómez wrote:
> I found out that if i select NET_CLS_ROUTE4, save my changes and exit
> menuconfig, execute again make menuconfig and go to QoS options, then the new
> available options are visible. So menuconfig has some problem refreshing
> contents :?
No, they were there before too, but you have to go up one level to see
them.
It's better in 2.6.15-rc1-git5, but the menu structure is still a little
messed up, the patch below properly indents all menu entries.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 17 Nov 2005 23:17:42 +0000 (15:17 -0800)]
[LLC]: Fix compiler warnings introduced by TX window scaling changes.
Noticed by Olaf Hering.
The comparisons want a u8 here (the data type on the left-hand branch
is a u8 structure member, and the constant on the right-hand branch is
"~((u8) 128)"), but C turns it into an integer so we get:
net/llc/llc_c_ac.c: In function `llc_conn_ac_inc_npta_value':
net/llc/llc_c_ac.c:998: warning: comparison is always true due to limited range of data type
net/llc/llc_c_ac.c:999: warning: large integer implicitly truncated to unsigned type
Fix this up by explicitly recasting the right-hand branch constant
into a "u8" once more.
Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Thu, 17 Nov 2005 23:06:47 +0000 (15:06 -0800)]
[NETFILTER] ip_conntrack: fix ftp/irc/tftp helpers on ports >= 32768
Since we've converted the ftp/irc/tftp helpers to use the new
module_parm_array() some time ago, we ware accidentially using signed data
types - thus preventing those modules from being used on ports >= 32768.
This patch fixes it by using 'ushort' module parameters.
Thanks to Jan Nijs for reporting this bug.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Kyle McMartin [Thu, 17 Nov 2005 21:44:57 +0000 (16:44 -0500)]
[PARISC] Make superio.c initialize before any driver needs it
Convert superio_init to use PCI_FIXUP_FINAL as ohci_pci being called
before superio_probe really makes a mess. superio_init will then fail
to register irq 20 (the "SuperIO" irq) and BUG() because ohci_pci has
stolen it before superio_fixup_irq can be moved USB to irq 1.
Matthew Wilcox [Thu, 17 Nov 2005 21:44:14 +0000 (16:44 -0500)]
[PARISC] Always spinlock tlb flush operations to ensure preempt safety
Since taking a spinlock disables preempt, and we need to spinlock tlb flush
on SMP for N class, we might as well just spinlock on uniprocessor machines
too.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Ryan Bradetich [Thu, 17 Nov 2005 21:38:28 +0000 (16:38 -0500)]
[PARISC] Define port->timeout to fix a long msleep in mux.c
This commit is in response to a bug reported by Vesa on the irc channel
a couple of weeks ago.
The bug was that the console would apparently hang (not return) while
using the mux console.
The root cause of this bug is that bash (with readline support) makes a
call to the tcsetattr() glibc function with the argument TCSADRAIN. This
causes the serial core in the kernel use the uart_wait_until_sent() to be
called. This function verifies the mux transmit queue is empty or calls the
msleep_interruptable() with a calculated timeout value that is dependant
upon the port->timeout variable.
The real problem here is that the port->timeout was not defined so it
was defaulted to 0 and the timeout calculation performs the following
calculation:
where char_time is an unsigned long. Since the serial Mux does not use
interrupts, the msleep_interruptable() function waits until the timeout
has been reached ... and when the port->timeout < HZ/50 this timeout will
be a long time. (I have validated that the console will eventually
return ... but it takes quite a while for this to happen).
This patch simply sets the port->timeout on the Mux to HZ/50 to avoid
this long timeout period.
Signed-off-by: Ryan Bradetich <rbrad@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
James Bottomley [Thu, 17 Nov 2005 21:35:09 +0000 (16:35 -0500)]
[PARISC] Fix our spinlock implementation
We actually have two separate bad bugs
1. The read_lock implementation spins with disabled interrupts. This is
completely wrong
2. Our spin_lock_irqsave should check to see if interrupts were enabled
before the call and re-enable interrupts around the inner spin loop.
The problem is that if we spin with interrupts off, we can't receive
IPIs. This has resulted in a bug where SMP machines suddenly spit
smp_call_function timeout messages and hang.
The scenario I've caught is
CPU0 does a flush_tlb_all holding the vmlist_lock for write.
CPU1 tries a cat of /proc/meminfo which tries to acquire vmlist_lock for
read
CPU1 is now spinning with interrupts disabled
CPU0 tries to execute a smp_call_function to flush the local tlb caches
This is now a deadlock because CPU1 is spinning with interrupts disabled
and can never receive the IPI
Signed-off-by: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Matthew Wilcox [Thu, 17 Nov 2005 21:33:29 +0000 (16:33 -0500)]
[PARISC] Return PDC_OK when alloc_pa_dev fails to enumerate all devices
Return PDC_OK when device registration fails so that we enumerate all
subsequent devices, even when we get two devices with the same hardware
path (which should never happen, but does with at least one revision of
rp8400 firmware).
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Carlos O'Donell [Thu, 17 Nov 2005 21:32:46 +0000 (16:32 -0500)]
[PARISC] Document some register usages in assembly files
Document clobbers and args in entry.S and syscall.S.
entry.S: Add comment to indicate that cr27 may recycle and EDEADLOCK
detection is not 100% correct. Since this is only enabled when using
ENABLE_LWS_DEBUG, the user is warned by the comment.
Signed-off-by: Carlos O'Donell <carlos@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
James Bottomley [Thu, 17 Nov 2005 21:28:37 +0000 (16:28 -0500)]
[PARISC] Add IRQ affinities
This really only adds them for the machines I can check SMP on, which
is CPU interrupts and IOSAPIC (so not any of the GSC based machines).
With this patch, irqbalanced can be used to maintain irq balancing.
Unfortunately, irqbalanced is a bit x86 centric, so it doesn't do an
incredibly good job, but it does work.
Signed-off-by: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Kyle McMartin [Thu, 17 Nov 2005 21:27:44 +0000 (16:27 -0500)]
[PARISC] Fix uniprocessor build by dummying smp_send_all_nop()
Since irq.c uses smp_send_all_nop, we must define it for UP builds
as well. Make it a static inline so it gets optimized away. This forces
irq.c to include <asm/smp.h> though.
James Bottomley [Thu, 17 Nov 2005 21:27:02 +0000 (16:27 -0500)]
[PARISC] Fix our interrupts not to use smp_call_function
Fix our interrupts not to use smp_call_function
On K and D class smp, the generic code calls this under an irq
spinlock, which causes the WARN_ON() message in smp_call_function()
(and is also illegal because it could deadlock).
The fix is to use a new scheme based on the IPI_NOP.
Signed-off-by: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Grant Grundler [Thu, 17 Nov 2005 21:26:20 +0000 (16:26 -0500)]
[PARISC] Disable nesting of interrupts
Disable nesting of interrupts - still has holes
The offending sequence starts out like this:
1) take external interrupt
2) set_eiem() to only allow TIMER_IRQ; local interrupts still disabled
3) read the EIRR to get a "list" of pending interrupts
4) clear EIRR of pending interrupts we intend to handle
5) call __do_IRQ() to handle IRQ.
6) handle_IRQ_event() enables local interrupts (I-Bit)
7) take a timer interrupt
8) read EIRR to get a new list of pending interrupts
9) clear EIRR of pending interrupts we just read
10) handle pending interrupts found in (8)
11) set_eiem(cpu_eiem) and return
[ TROUBLE! all enabled CPU IRQs are unmasked. }
12) handle remaining interrupts pending from (3)
e.g. call __do_IRQ() -> handle_IRQ_event()..etc
[ TROUBLE! call to handle_IRQ_event() can now enable *any* IRQ. }
13) set_eiem(cpu_eiem) and return
The problem is we now get into ugly race conditions with Timer and IPI
interrupts at this point. I'm not exactly sure what happens when
things go wrong (perhaps nest calls to IPI or timer interrupt?).
But I'm certain it's not good.
This sequence will break sooner if (10) would accidentally leave
interrupts enabled.
I'm pretty sure the right answer is now to make cpu_eiem
a per CPU variable since all external interrupts on parisc
are per CPU. This means we will NOT need to send an IPI to
every CPU in the system when enabling or disabling an IRQ
since only one CPU needs to change it's EIEM.
Thanks to James Bottomley for (once again) pointing out the problem.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
James Bottomley [Thu, 17 Nov 2005 21:24:52 +0000 (16:24 -0500)]
[PARISC] Make sure timer and IPI execute with interrupts disabled
Fix a longstanding smp bug
The problem is that both the timer and ipi interrupts are being called
with interrupts enabled, which isn't what anyone is expecting.
The IPI issue has just started to show up by causing a BUG_ON in the
slab debugging code. The timer issue never shows up because there's an
eiem work around in our irq.c
The fix is to label both these as SA_INTERRUPT which causes the generic
irq code not to enable interrupts.
I also suspect the smp_call_function timeouts we're seeing might be
connected with the fact that we disable IPIs when handling any other
type of interrupt. I've put a WARN_ON in the code for executing
smp_call_function() with IPIs disabled.
Signed-off-by: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Jens Axboe [Thu, 17 Nov 2005 20:35:02 +0000 (21:35 +0100)]
[PATCH] VM: fix zone list restart in page allocatate
We must reassign z before looping through the zones kicking kswapd,
since it will be NULL if we hit an OOM condition and jump back to the
beginning again. 'z' is initially assigned before the restart: label. So
move the restart label up a little.