Dmitry Adamushko [Thu, 15 Nov 2007 19:57:40 +0000 (20:57 +0100)]
sched: remove activate_idle_task()
cpu_down() code is ok wrt sched_idle_next() placing the 'idle' task not
at the beginning of the queue.
So get rid of activate_idle_task() and make use of activate_task() instead.
It is the same as activate_task(), except for the update_rq_clock(rq) call
that is redundant.
Code size goes down:
text data bss dec hex filename
47853 3934 336 52123 cb9b sched.o.before
47828 3934 336 52098 cb82 sched.o.after
Dmitry Adamushko [Thu, 15 Nov 2007 19:57:40 +0000 (20:57 +0100)]
sched: fix __set_task_cpu() SMP race
Grant Wilson has reported rare SCHED_FAIR_USER crashes on his quad-core
system; these crashes can only be explained by runqueue corruption.
There is a narrow SMP race in __set_task_cpu(): after ->cpu is set up to
a new value, task_rq_lock(p, ...) can be successfully executed on another
CPU. We must ensure that updates of per-task data have been completed by
this moment.
This bug has been hiding in the Linux scheduler for an eternity (we never
had any explicit barrier for task->cpu in set_task_cpu() - so the bug was
introduced in 2.5.1), but only became visible via set_task_cfs_rq() being
accidentally put after the task->cpu update. It also probably needs a
sufficiently out-of-order CPU to trigger.
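The ordering requirement can be illustrated in user space with C11 atomics. This is only a sketch under stated assumptions, not the kernel code: struct task, set_task_cpu_sketch() and task_cpu_sketch() are hypothetical names, and a release store stands in for whatever barrier the kernel places before the ->cpu update.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-task scheduler data. */
struct task {
	int cfs_rq_id;   /* per-task data that depends on the CPU */
	atomic_int cpu;  /* published last */
};

/* Write the per-task data first, then publish the new CPU number.
 * The release store ensures that a reader which observes the new
 * cpu value through an acquire load also observes cfs_rq_id. */
static void set_task_cpu_sketch(struct task *p, int cpu)
{
	assert(cpu >= 0);
	p->cfs_rq_id = cpu;  /* assumption: one cfs_rq per CPU */
	atomic_store_explicit(&p->cpu, cpu, memory_order_release);
}

static int task_cpu_sketch(struct task *p)
{
	return atomic_load_explicit(&p->cpu, memory_order_acquire);
}
```

Doing the two stores in the opposite order, or without any ordering between them, is the pattern the commit describes as racy.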
Reported-by: Grant Wilson <grant.wilson@zen.co.uk> Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Oleg Nesterov [Thu, 15 Nov 2007 19:57:40 +0000 (20:57 +0100)]
sched: fix SCHED_FIFO tasks & FAIR_GROUP_SCHED
Suppose that the SCHED_FIFO task does
	switch_uid(new_user);
Now, p->se.cfs_rq and p->se.parent both point into the old
user_struct->tg because sched_move_task() doesn't call set_task_cfs_rq()
for !fair_sched_class case.
Suppose that old user_struct/task_group is freed/reused, and the task
does
	sched_setscheduler(SCHED_NORMAL);
__setscheduler() sets fair_sched_class, but doesn't update
->se.cfs_rq/parent which point to the freed memory.
This means that check_preempt_wakeup() doing
	while (!is_same_group(se, pse)) {
		se = parent_entity(se);
		pse = parent_entity(pse);
	}
may OOPS in a similar way if rq->curr or p did something like above.
Perhaps we need something like the patch below, note that
__setscheduler() can't do set_task_cfs_rq().
sched: fix accounting of interrupts during guest execution on s390
Currently the scheduler checks for PF_VCPU to decide if this timeslice
has to be accounted as guest time. On s390 host interrupts are not
disabled during guest execution. This causes these interrupts to be
accounted as guest time if CONFIG_VIRT_CPU_ACCOUNTING is set. Solution
is to check if an interrupt triggered account_system_time. As the tick
is timer interrupt based, we have to subtract hardirq_offset.
I tested the patch on s390 with CONFIG_VIRT_CPU_ACCOUNTING and on
x86_64. Seems to work.
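The check described above can be sketched as a small predicate. The flag value and the name is_guest_time() are hypothetical; only the shape of the test (vCPU flag set, and no interrupt pending beyond the accounting tick itself) is taken from the description.

```c
#include <assert.h>
#include <stdbool.h>

#define PF_VCPU_SKETCH 0x10u  /* hypothetical flag value */

/* Account the slice as guest time only if the task is a vCPU and no
 * interrupt other than the accounting tick itself is in progress. */
static bool is_guest_time(unsigned int task_flags,
			  unsigned long irq_count,
			  unsigned long hardirq_offset)
{
	/* sketch precondition: the tick's own offset is part of the count */
	assert(irq_count >= hardirq_offset);
	return (task_flags & PF_VCPU_SKETCH) &&
	       (irq_count - hardirq_offset == 0);
}
```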
CC: Avi Kivity <avi@qumranet.com> CC: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Thu, 15 Nov 2007 19:01:07 +0000 (11:01 -0800)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c/eeprom: Recognize VGN as a valid Sony Vaio name prefix
i2c/eeprom: Hide Sony Vaio serial numbers
i2c-pasemi: Fix NACK detection
i2c-pasemi: Replace obsolete "driverfs" reference with "sysfs"
i2c: Make i2c_check_addr static
i2c-dev: Unbound new-style i2c clients aren't busy
i2c-dev: "how does it work" comments
Jean Delvare [Thu, 15 Nov 2007 18:24:03 +0000 (19:24 +0100)]
i2c/eeprom: Hide Sony Vaio serial numbers
The sysfs interface to DMI data takes care to not make the system
serial number and UUID world-readable, presumably due to privacy
concerns. For consistency, we should not let the eeprom driver
export these same strings to the world on Sony Vaio laptops.
Instead, only make them readable by root, as we already do for BIOS
passwords.
Let i2c-dev deal properly with new-style i2c clients. Instead of
considering them always busy, it needs to check whether a driver is
bound to them or not.
This is still not completely correct, as the client could become
busy later, but the same problem already existed before new-style
clients were introduced. We'll want to fix it someday.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: David Brownell <dbrownell@users.sourceforge.net>
David Brownell [Thu, 15 Nov 2007 18:24:01 +0000 (19:24 +0100)]
i2c-dev: "how does it work" comments
This adds some "how does this work" comments to the i2c-dev driver,
plus separators between the three main components:
- The parallel list of i2c_adapters ("i2c_dev_list"), each of which
gets a "struct i2c_dev" and a /dev/i2c-X character special file.
- An i2cdev_driver gets adapter add/remove notifications, which are
used to maintain that list of adapters.
- Special file operations, which let userspace talk either directly to
the adapter (for i2c_msg operations) or through cached addressing info
using an anonymous i2c_client (never registered anywhere).
Plus there's the usual module load/unload record keeping.
After making sense of this code, I think that the anonymous i2c_client
is pretty shady. But since it's never registered, using this code with
a system set up for "new style" I2C drivers is no more complicated than
always using the I2C_SLAVE_FORCE ioctl (instead of I2C_SLAVE).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Linus Torvalds [Thu, 15 Nov 2007 16:37:09 +0000 (08:37 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
[AVR32] Export intc_get_pending symbol
[AVR32] Add missing bit in PCCR sysreg
[AVR32] Fix duplicate clock index in at32ap machine code
[AVR32] remove UID16 option
[AVR32] Turn off debugging in SMC driver
Extend I/O resource for wdt0 for at32ap7000 devices
[AVR32] pcmcia ioaddr_t should be 32 bits on AVR32
Nick Piggin [Thu, 15 Nov 2007 11:32:04 +0000 (12:32 +0100)]
slob: fix memory corruption
Previously, it would be possible for prev->next to point to
&free_slob_pages, and thus we would try to move a list onto itself, and
bad things would happen.
It seems a bit hairy to be doing list operations with the list marker as
an entry, rather than a head, but...
This resolves the following crash:
http://bugzilla.kernel.org/show_bug.cgi?id=9379
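Why moving a list marker onto itself is harmful can be seen with a minimal user-space list. This is a sketch with hypothetical helper names, not the slob code; the guard in rotate_head_before() plays the role of checking that prev->next is not &free_slob_pages.

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

/* Insert n immediately before head. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Move the marker `head` so that it sits just before `pos`. When
 * `pos` is the marker itself the move is skipped: unlinking a node
 * and re-adding it relative to itself would leave it pointing only
 * at itself and drop every other entry from the list. */
static void rotate_head_before(struct list_head *head, struct list_head *pos)
{
	assert(head && pos);
	if (pos == head)
		return;
	list_del(head);
	list_add_tail(head, pos);
}
```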
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Wed, 14 Nov 2007 06:11:50 +0000 (22:11 -0800)]
wait_task_stopped: Check p->exit_state instead of TASK_TRACED
The original meaning of the old test (p->state > TASK_STOPPED) was
"not dead", since it was before TASK_TRACED existed and before the
state/exit_state split. It was a wrong correction in commit 14bf01bb0599c89fc7f426d20353b76e12555308 to make this test for
TASK_TRACED instead. It should have been changed when TASK_TRACED
was introduced and again when exit_state was introduced.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Kees Cook <kees@ubuntu.com> Acked-by: Scott James Remnant <scott@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[AVR32] Fix duplicate clock index in at32ap machine code
There's a duplicate clock index between USART0 and USART1 which may be
causing system crashes when USART0 is used. Change the USART0 index
to '3', indicating the clock that is actually used by USART0.
Signed-off-by: Ben Nizette <ben@niasdigital.com> Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Extend I/O resource for wdt0 for at32ap7000 devices
This patch extends the I/O resource to 0xfff000cf which will enable the
watchdog driver to access the reset cause (RCAUSE) register, making it
capable of reporting boot status.
Linus Torvalds [Thu, 15 Nov 2007 02:53:49 +0000 (18:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4_core: Fix thinko in QP destroy (incorrect bitmap_free)
RDMA/cxgb3: Set the max_qp_init_rd_atom attribute in query_device
IB/ehca: Fix static rate calculation
IB/ehca: Return physical link information in query_port()
IB/ipath: Fix race with ACK retry timeout list management
IB/ipath: Fix memory leak in ipath_resize_cq() if copy_to_user() fails
mlx4_core: Fix possible bad free in mlx4_buf_free()
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86:
x86: enable "make ARCH=x86"
x86: do not use $(ARCH) when not needed
kconfig: use $K64BIT to set 64BIT with all*config targets
kconfig: add helper to set config symbol from environment variable
kconfig: factor out code in confdata.c
x86: move the rest of the menu's to Kconfig
x86: move all simple arch settings to Kconfig
x86: copy x86_64 specific Kconfig symbols to Kconfig.i386
x86: add X86_64 dependency to x86_64 specific symbols in Kconfig.x86_64
x86: add X86_32 dependency to i386 specific symbols in Kconfig.i386
x86: arch/x86/Kconfig.cpu unification
x86: start unification of arch/x86/Kconfig.*
x86: unification of cfufreq/Kconfig
Linus Torvalds [Thu, 15 Nov 2007 02:51:48 +0000 (18:51 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: rt_check_expire() can take a long time, add a cond_resched()
[ISDN] sc: Really, really fix warning
[ISDN] sc: Fix sndpkt to have the correct number of arguments
[TCP] FRTO: Clear frto_highmark only after process_frto that uses it
[NET]: Remove notifier block from chain when register_netdevice_notifier fails
[FS_ENET]: Fix module build.
[TCP]: Make sure write_queue_from does not begin with NULL ptr
[TCP]: Fix size calculation in sk_stream_alloc_pskb
[S2IO]: Fixed memory leak when MSI-X vector allocation fails
[BONDING]: Fix resource use after free
[SYSCTL]: Fix warning for token-ring from sysctl checker
[NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR
[IWLWIFI]: Not correctly dealing with hotunplug.
[TCP] FRTO: Plug potential LOST-bit leak
[TCP] FRTO: Limit snd_cwnd if TCP was application limited
[E1000]: Fix schedule while atomic when called from mii-tool.
[NETX]: Fix build failure added by 2.6.24 statistics cleanup.
[EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
[PKT_SCHED]: Check subqueue status before calling hard_start_xmit
Jesper Nilsson [Thu, 15 Nov 2007 01:01:33 +0000 (17:01 -0800)]
CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
Scrap the local __INLINE__ macro, and rename timeval_cmp to fasttime_cmp.
The __INLINE__ macro was completely unnecessary since it was defined
locally to be inline.
timeval_cmp was inaccurately named since it does comparison on
struct fasttimer_t and not on struct timeval.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <mikael.starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:32 +0000 (17:01 -0800)]
CRISv10 memset library add lineendings to asm
Add \n\ at end of lines inside asm statement to avoid warning.
No change except adding \n\ to end of line and correcting
whitespace has been done.
Removes warning about multi-line string literals when compiling
arch/cris/arch-v10/lib/memset.c
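The same line-ending convention can be shown on an ordinary string literal, since an asm statement body is a string literal too. The array name and its contents are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Each source line ends in "\n" followed by a backslash: the "\n"
 * puts a newline into the string, and the trailing backslash splices
 * the next source line, so no raw line break remains in the literal. */
static const char two_lines[] = "\
first line\n\
second line\n\
";

/* Count the newline characters in a string. */
static size_t count_lines(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s; s++)
		if (*s == '\n')
			n++;
	return n;
}
```

Note that the continuation lines start in column one; any indentation after a spliced line would become part of the string.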
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <mikael.starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:31 +0000 (17:01 -0800)]
CRISv10 string library add lineendings to asm
Add \n\ at end of lines inside asm statement to avoid warning.
No change except adding \n\ to end of line and correcting
whitespace has been done.
Removes warning about multi-line string literals when compiling
arch/cris/arch-v10/lib/string.c
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <mikael.starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:27 +0000 (17:01 -0800)]
CRISv32: add cache flush operations
These are needed due to a cache bug, and can be used to make sure that the
DMA descriptors are flushed to memory and can be safely handled by DMA.
flush_dma_descr - Flush one DMA descriptor.
flush_dma_list - Flush a complete list of DMA descriptors.
cris_flush_cache - Flush the complete cache.
cris_flush_cache_range - Flush only the specified range
Jesper Nilsson [Thu, 15 Nov 2007 01:01:23 +0000 (17:01 -0800)]
CRISv10 improve and bugfix fasttimer
Improve and bugfix CRIS v10 fast timers.
- irq_handler_t now only takes two arguments.
- Keep interrupts disabled as long as we have a reference to the
fasttimer list and only enable them while doing the callback.
del_fast_timer may be called from other interrupt context.
- Fix bug where debug code could return without calling local_irq_restore.
- Use jiffies instead of usec (change from struct timeval to fasttime_t).
- Don't initialize static variables to zero.
- Remove obsolete #ifndef DECLARE_WAITQUEUE code.
- fast_timer_init should be __initcall.
- Change status/debug variables to unsigned.
- Remove CVS log and CVS id.
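The early-return bug mentioned in the list is a common pattern. A user-space sketch with mock save/restore helpers (all names hypothetical) shows the single-exit shape that avoids it:

```c
#include <assert.h>

/* User-space stand-ins for local_irq_save()/local_irq_restore():
 * a counter records how many "disable" calls are still unmatched. */
static int irq_disable_depth;

static void mock_irq_save(void)
{
	irq_disable_depth++;
}

static void mock_irq_restore(void)
{
	assert(irq_disable_depth > 0);
	irq_disable_depth--;
}

/* Returns 0 on success, -1 when the (hypothetical) debug check fails.
 * Both paths leave through `out`, so the restore cannot be skipped. */
static int del_timer_sketch(int timer_is_valid)
{
	int ret = 0;

	mock_irq_save();
	if (!timer_is_valid) {
		ret = -1;
		goto out;  /* the buggy pattern would return directly here */
	}
	/* ... unlink the timer from the list ... */
out:
	mock_irq_restore();
	return ret;
}
```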
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <mikael.starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:15 +0000 (17:01 -0800)]
CRISv10 serial driver rewrite
New and improved serial driver for CRISv10, take three, with improvements
suggested by Jiri Slaby.
- Call wait_event_interruptible with a _correct_ and sensible condition.
- Removed superfluous test of info->flags & ASYNC_CLOSING, since that is done
by wait_event_interruptible.
- Moved common code for deregistering DMA and IRQ to deinit_port function.
- Use setup_timer when initializing flush_timer.
- Convert bit-field for uses_dma_in and uses_dma_out to regular bytes.
- Removed CVS tags.
- Removed defines and comments for CRIS_BUF_SIZE and TTY_THRESHOLD_THROTTLE
(no longer used).
- Cleaned up code to pass checkpatch.
- Add crisv10.h header file.
- Merge of CRISv10 from Axis internal CVS.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Reviewed-by: Jiri Slaby <jirislaby@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:13 +0000 (17:01 -0800)]
CRIS don't include bitops.h in posix_types.h
In file included from include/asm/byteorder.h:23,
from include/asm-generic/bitops/le.h:5,
from include/asm-generic/bitops/ext2-non-atomic.h:4,
from include/asm/bitops.h:163,
from include/linux/bitops.h:17,
from include/asm/posix_types.h:55,
from include/linux/posix_types.h:47,
from include/linux/types.h:11,
from include/linux/capability.h:16,
from include/linux/sched.h:49,
from arch/cris/kernel/asm-offsets.c:1:
include/linux/byteorder/little_endian.h:43: parse error before "__cpu_to_le64p"
include/linux/byteorder/little_endian.h:44: warning: return type defaults to `int'
include/linux/byteorder/little_endian.h: In function `__cpu_to_le64p':
include/linux/byteorder/little_endian.h:45: `__le64' undeclared (first use in this function)
Remove include of asm/bitops.h, not needed here, corrects compilation error
(__le64 undeclared).
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:01:00 +0000 (17:01 -0800)]
cris build fixes: fixes in arch/cris/kernel/time.c
- Remove debug print.
- Change #if to #ifdef to avoid compile time warning if CONFIG_PROFILING
isn't set.
- Number of parameters to profile_tick has changed, drop the regs parameter.
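The #if versus #ifdef point can be sketched as follows. CONFIG_PROFILING_SKETCH is a hypothetical stand-in for the Kconfig symbol and is deliberately left undefined.

```c
#include <assert.h>

/* With the macro undefined, `#if CONFIG_PROFILING_SKETCH` would
 * evaluate it as 0 and can draw a warning (for example with GCC's
 * -Wundef); `#ifdef` only asks whether the macro is defined at all. */
static int profiling_enabled(void)
{
#ifdef CONFIG_PROFILING_SKETCH
	return 1;
#else
	return 0;
#endif
}
```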
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jesper Nilsson [Thu, 15 Nov 2007 01:00:59 +0000 (17:00 -0800)]
cris build fixes: corrected and improved NMI and IRQ handling
Corrects compile errors and the following:
- Remove oldset parameter from do_signal and do_notify_resume.
- Modified to fit new consolidated IRQ handling code.
- Reverse check order between external nmi and watchdog nmi to avoid false
watchdog oops in case of a glitch on the nmi pin.
- Return from a pin-generated NMI the same way as for other interrupts.
- Moved blocking of ethernet rx/tx irq from ethernet interrupt handler to
low-level asm interrupt handlers. Fixed in the multiple interrupt
handler also.
- Add space for thread local storage in thread_info struct.
- Add NO_DMA to Kconfig, and include arch specific Kconfig using arch
independent path. Include subsystem Kconfigs for pcmcia, usb, i2c,
rtc and pci.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New (updated) version of ethernet driver for cris v10.
- First steps to simplify and make the MII code more similar
between the etrax100 and etraxfs ports.
- Start the transmit queue before enabling tx interrupts
to avoid race with the first frame.
- Flip the comparison statement to stick to physical addresses
to avoid phys_to_virt mapping a potential null pointer.
This was not an error but the change simplifies debugging
of address-space mappings.
- Made myPrevRxDesc local to e100_rx since it was only used there.
Fixed out of memory handling in e100_rx. If dev_alloc_skb() fails
persistently the system is hosed anyway but at least it won't
loop in an interrupt handler.
- Correct some code formatting issues.
- Add defines SET_ETH_ENABLE_LEDS, SET_ETH_DISABLE_LEDS
and SET_ETH_AUTONEG used in new cris v10 ethernet driver.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Mikael Starvik <starvik@axis.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 15 Nov 2007 01:00:45 +0000 (17:00 -0800)]
aic94xx_sds: rename FLASH_SIZE
arm:
drivers/scsi/aic94xx/aic94xx_sds.c:381:1: warning: "FLASH_SIZE" redefined
In file included from include/asm/arch/irqs.h:22,
from include/asm/irq.h:4,
from include/asm/hardirq.h:6,
from include/linux/hardirq.h:7,
from include/asm-generic/local.h:5,
from include/asm/local.h:1,
from include/linux/module.h:19,
from include/linux/device.h:21,
from include/linux/pci.h:52,
from drivers/scsi/aic94xx/aic94xx_sds.c:28:
include/asm/arch/platform.h:444:1: warning: this is the location of the previous definition
Cc: Gilbert Wu <gilbert_wu@adaptec.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A CPU which was not brought up during boot (using maxcpus and
additional_cpus parameters) couldn't be onlined anymore. For such a CPU it
seemed that MCE was not supported during CPU_UP_PREPARE-time which caused
mce_cpu_callback to return NOTIFY_BAD to notifier_call_chain. To fix this
we:
- call mce_create_device for CPU_ONLINE event (instead of CPU_UP_PREPARE),
- avoid mce_remove_device() for the CPU that is not correctly initialized
by mce_create_device() failure,
- make mce_cpu_callback always return NOTIFY_OK for CPU_ONLINE event.
Because CPU_ONLINE callback return value is always ignored.
[akinobu.mita@gmail.com: avoid mce_remove_device() for not initialized device]
[akinobu.mita@gmail.com: make mce_cpu_callback always return NOTIFY_OK] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 15 Nov 2007 01:00:41 +0000 (17:00 -0800)]
x86: disable preemption in delay_tsc()
Marin Mitov points out that delay_tsc() can misbehave if it is preempted and
rescheduled on a different CPU which has a skewed TSC. Fix it by disabling
preemption.
(I assume that the worst-case behaviour here is a stall of 2^32 cycles)
Andrey Borzenkov [Thu, 15 Nov 2007 01:00:37 +0000 (17:00 -0800)]
make /proc/acpi/ac_adapter dependent on ACPI_PROCFS
Do not provide /proc/acpi/ac_adapter if ACPI_PROCFS is not defined. This
eliminates duplicated power adapters in HAL and makes it consistent with
the battery module.
Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru> Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow SIGCONT to be sent to a process with greater capabilities if it is in
the same session. Otherwise, a shell from which I've started a root shell
and done 'suspend' can't be restarted by the parent shell.
Also don't do file-capabilities signaling checks when uids for the
processes don't match, since the standard check_kill_permission will have
done those checks.
[akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: Andrew Morgan <morgan@kernel.org> Cc: Chris Wright <chrisw@sous-sol.org> Tested-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Thu, 15 Nov 2007 01:00:31 +0000 (17:00 -0800)]
uml: fix build for !CONFIG_PRINTK
Handle the case of CONFIG_PRINTK being disabled. This requires a do-nothing
stub to be present in arch/um/include/user.h so that we don't get references
to printk from libc code.
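A do-nothing stub of this kind might look like the sketch below. The names printk_sketch and CONFIG_PRINTK_SKETCH are hypothetical; the config macro is left undefined so that the stub branch is the one compiled.

```c
#include <assert.h>

#ifdef CONFIG_PRINTK_SKETCH
extern int printk_sketch(const char *fmt, ...);
#else
/* Do-nothing stub: accepts the same arguments, emits nothing, and
 * leaves no external reference for the linker to resolve. */
static inline int printk_sketch(const char *fmt, ...)
{
	(void)fmt;
	return 0;
}
#endif
```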
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Thu, 15 Nov 2007 01:00:28 +0000 (17:00 -0800)]
uml: fix build for !CONFIG_TCP
Make UML build in the absence of CONFIG_INET by making the inetaddr_notifier
registration depend on it.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Thu, 15 Nov 2007 01:00:27 +0000 (17:00 -0800)]
uml: remove last include of libc asm/page.h
asm/page.h is disappearing from the libc headers and we don't need it anyway.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Thu, 15 Nov 2007 01:00:23 +0000 (17:00 -0800)]
uml: fix spurious IRQ testing
The spurious IRQ testing in request_irq is mishandled in um_request_irq, which
sets the incoming file descriptors non-blocking only after request_irq
succeeds. This results in the spurious irq calling read on a blocking
descriptor, and a hang.
Fixed by reversing the O_NONBLOCK setting and the request_irq call.
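The effect of the ordering can be reproduced in user space with a pipe. In this sketch, set_nonblocking() and probe_read() are hypothetical helpers; probe_read() stands in for the read issued by the spurious IRQ.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put the descriptor into non-blocking mode. Returns 0 on success. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	assert(fd >= 0);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read one byte from a descriptor that may have no data. Returns the
 * errno set by read(), or 0 if read() did not fail. */
static int probe_read(int fd)
{
	char c;

	errno = 0;
	return read(fd, &c, 1) < 0 ? errno : 0;
}
```

With O_NONBLOCK set first, a read on an empty pipe fails with EAGAIN instead of blocking; calling probe_read() before set_nonblocking() would hang, which is the behaviour the commit describes.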
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Thu, 15 Nov 2007 01:00:19 +0000 (17:00 -0800)]
Fix 64KB blocksize in ext3 directories
With 64KB blocksize, a directory entry can have size 64KB which does not
fit into 16 bits we have for entry length. So we store 0xffff instead and
convert value when read from / written to disk. The patch also converts
some places to use ext3_next_entry() when we are changing them anyway.
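The conversion can be sketched as a pair of helpers. The names are hypothetical, byte-order conversion is left out, and the sketch assumes that a real entry length is never exactly 0xffff, so the marker is unambiguous.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REC_LEN_SKETCH 0xffffu  /* on-disk marker for a 64KB entry */

/* On-disk 16-bit value to in-memory length. */
static unsigned int rec_len_from_disk(uint16_t dlen)
{
	if (dlen == MAX_REC_LEN_SKETCH)
		return 1u << 16;
	return dlen;
}

/* In-memory length to on-disk 16-bit value. */
static uint16_t rec_len_to_disk(unsigned int len)
{
	assert(len <= (1u << 16));  /* larger entries cannot exist */
	if (len == (1u << 16))
		return MAX_REC_LEN_SKETCH;
	return (uint16_t)len;
}
```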
[akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Berg [Thu, 15 Nov 2007 01:00:16 +0000 (17:00 -0800)]
hibernate: fix lockdep report
Lockdep reports a circular locking dependency in the hibernate code
because
- during system boot hibernate code (from an initcall) locks pm_mutex
and then a sysfs buffer mutex via name_to_dev_t
- during regular operation hibernate code locks pm_mutex under a
sysfs buffer mutex because it's called from sysfs methods.
The deadlock can never happen because during initcall invocation nothing
can write to sysfs yet. This removes the lockdep report by marking the
initcall locking as being in a different class.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Alan Stern <stern@rowland.harvard.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Russ Anderson [Thu, 15 Nov 2007 01:00:15 +0000 (17:00 -0800)]
__do_IRQ does not check IRQ_DISABLED when IRQ_PER_CPU is set
In __do_IRQ(), the normal case is that IRQ_DISABLED is checked and if set
the handler (handle_IRQ_event()) is not called.
Earlier in __do_IRQ(), if IRQ_PER_CPU is set the code does not check
IRQ_DISABLED and calls the handler even though IRQ_DISABLED is set. This
behavior seems unintentional.
One user encountering this behavior is the CPE handler (in
arch/ia64/kernel/mca.c). When the CPE handler encounters too many CPEs
(such as a solid single bit error), it sets up a polling timer and disables
the CPE interrupt (to avoid excessive overhead logging the stream of single
bit errors). disable_irq_nosync() is called which sets IRQ_DISABLED. The
IRQ_PER_CPU flag was previously set (in ia64_mca_late_init()). The net
result is the CPE handler gets called even though it is marked disabled.
If the behavior of not checking IRQ_DISABLED when IRQ_PER_CPU is set is
intentional, it would be worthy of a comment describing the intended
behavior. disable_irq_nosync() does call chip->disable() to provide a
chipset specific interface for disabling the interrupt, which avoids this
issue when used.
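The difference in behaviour can be sketched with two predicates over a status word. The flag values and function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define IRQ_DISABLED_SKETCH 0x1u
#define IRQ_PER_CPU_SKETCH  0x2u

/* Behaviour described as unintentional: the per-CPU path decides to
 * run the handler without ever looking at the disabled flag. */
static bool should_call_handler_old(unsigned int status)
{
	if (status & IRQ_PER_CPU_SKETCH)
		return true;
	return !(status & IRQ_DISABLED_SKETCH);
}

/* Behaviour after the change: a disabled interrupt is skipped
 * whether or not it is marked per-CPU. */
static bool should_call_handler_new(unsigned int status)
{
	return !(status & IRQ_DISABLED_SKETCH);
}
```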
Signed-off-by: Russ Anderson <rja@sgi.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is my trivial patch to swat innumerable little bugs with a single
blow.
After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much too simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.
Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.
This allows us the option of fixing any ABI or other bugs we find, as long
as they are minor.
It also allows users of the kernel to avoid those bugs simply by ensuring
their kernel does not have support for multiple pid namespaces.
[akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Adrian Bunk <bunk@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Kir Kolyshkin <kir@swsoft.com> Cc: Kirill Korotaev <dev@sw.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjan van de Ven [Thu, 15 Nov 2007 01:00:10 +0000 (17:00 -0800)]
mark sys_open/sys_read exports unused
sys_open / sys_read were used in the early 1.2 days to load firmware from
disk inside drivers. Since 2.0 or so this was deprecated behavior, but
several drivers still were using this. For a few years now we have had a
request_firmware() API that implements this in a nice, consistent way.
Only some old ISA sound drivers (pre-ALSA) still straggled along for some
time.... however with commit c2b1239a9f22f19c53543b460b24507d0e21ea0c the
last user is now gone.
This is a good thing, since using sys_open / sys_read etc for firmware is a
very buggy to dangerous thing to do; these operations put an fd in the
process file descriptor table.... which then can be tampered with from
other threads for example. For those who don't want the firmware loader,
filp_open()/vfs_read are the better APIs to use, without this security
issue.
The patch below marks sys_open and sys_read unused now that they're
really not used anymore, and for deletion in the 2.6.25 timeframe.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kiszka [Thu, 15 Nov 2007 01:00:08 +0000 (17:00 -0800)]
fix param_sysfs_builtin name length check
Commit faf8c714f4508207a9c81cc94dafc76ed6680b44 caused a regression:
parameter names longer than MAX_KBUILD_MODNAME will now be rejected,
although we just need to keep the module name part that short. This patch
restores the old behaviour while still avoiding that memchr is called with
its length parameter larger than the total string length.
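The length handling can be sketched as below. MAX_MODNAME_SKETCH and modname_len() are hypothetical; the limit of 8 is chosen only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODNAME_SKETCH 8  /* hypothetical stand-in for MAX_KBUILD_MODNAME */

/* Return the length of the module-name prefix of "module.param", or
 * -1 if there is no dot within the first MAX_MODNAME_SKETCH bytes.
 * The search length is clamped to the string length, so memchr() is
 * never asked to look past the terminating NUL, while the parameter
 * part after the dot may be arbitrarily long. */
static int modname_len(const char *name)
{
	size_t len, max;
	const char *dot;

	assert(name != NULL);
	len = strlen(name);
	max = len < MAX_MODNAME_SKETCH ? len : MAX_MODNAME_SKETCH;
	dot = memchr(name, '.', max);
	return dot ? (int)(dot - name) : -1;
}
```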
Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Cc: Dave Young <hidave.darkstar@gmail.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we special case when we have only the initial pid namespace.
Unfortunately in doing so the copied case for the other namespaces was
broken so we don't properly flush the thread directories :(
So this patch removes the unnecessary special case (removing a usage of
proc_mnt) and corrects the flushing of the thread directories.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have seen ramdisk based install systems, where some pages of mapped
libraries and programs were suddenly zeroed under memory pressure. This
should not happen, as the ramdisk avoids freeing its pages by keeping them
dirty all the time.
It turns out that there is a case, where the VM makes a ramdisk page clean,
without telling the ramdisk driver. On memory pressure shrink_zone runs
and it starts to run shrink_active_list. There is a check for
buffer_heads_over_limit, and if true, pagevec_strip is called.
pagevec_strip calls try_to_release_page. If the mapping has no releasepage
callback, try_to_free_buffers is called. try_to_free_buffers now has
special logic for some file systems to make a dirty page clean, if all
buffers are clean. That's what happened in our test case.
The simplest solution is to provide a noop-releasepage callback for the
ramdisk driver. This avoids try_to_free_buffers for ramdisk pages.
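The dispatch described above can be modelled in userspace as follows. This is only a sketch under simplifying assumptions: every sketch_* name is hypothetical, the page is reduced to a single dirty flag, and the fallback unconditionally cleans the page to stand in for the "all buffers are clean" case. It shows why installing a callback that does nothing keeps the fallback from ever touching the page.

```c
/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct sketch_page {
	int dirty;
};

struct sketch_aops {
	int (*releasepage)(struct sketch_page *page);
};

/* Models the fallback path, which may turn a dirty page clean. */
int sketch_free_buffers(struct sketch_page *page)
{
	page->dirty = 0;
	return 1;
}

/* No-op callback: refuses to release and leaves the page dirty. */
int sketch_noop_releasepage(struct sketch_page *page)
{
	(void)page;
	return 0;
}

/* Use the mapping's callback if it has one, else take the fallback. */
int sketch_try_release(const struct sketch_aops *ops, struct sketch_page *page)
{
	if (ops && ops->releasepage)
		return ops->releasepage(page);
	return sketch_free_buffers(page);
}
```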
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Nick Piggin <npiggin@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Thu, 15 Nov 2007 01:00:01 +0000 (17:00 -0800)]
fix mm/util.c:krealloc()
Commit ef8b4520bd9f8294ffce9abd6158085bde5dc902 added one NULL check for
"p" in krealloc(), but that doesn't seem to be enough since there
doesn't seem to be any guarantee that memcpy(ret, NULL, 0) works
(spotted by the Coverity checker).
For making it clearer what happens this patch also removes the pointless
min().
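A userspace sketch of the shape of this fix, assuming malloc()/free() in place of the kernel allocator. sketch_krealloc() is a hypothetical name, and old_size is passed in explicitly because userspace has no portable equivalent of ksize(). The copy is skipped entirely when there is no source buffer, and no min() is needed because the copy only happens when the old size is smaller than the new one.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a krealloc()-like helper (not the kernel implementation).
 * old_size stands in for what the kernel would get from ksize(p).
 */
void *sketch_krealloc(void *p, size_t old_size, size_t new_size)
{
	void *ret;

	/* The existing buffer is already large enough: reuse it. */
	if (p && old_size >= new_size)
		return p;

	ret = malloc(new_size);
	/* Copy only when there is a source buffer; never memcpy() from NULL. */
	if (ret && p) {
		/* old_size < new_size on this path, so no min() is needed */
		memcpy(ret, p, old_size);
		free(p);
	}
	/* On allocation failure p is left untouched, as with realloc(). */
	return ret;
}
```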
Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Thu, 15 Nov 2007 01:00:00 +0000 (17:00 -0800)]
sunrpc/xprtrdma/transport.c: fix use-after-free
Fix an obvious use-after-free spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Thu, 15 Nov 2007 00:59:59 +0000 (16:59 -0800)]
serial: only use PNP IRQ if it's valid
"Luming Yu" <luming.yu@gmail.com> says:
There is a "ttyS1 irq is -1" problem observed on tiger4 which causes the
serial port to break.
This is because there is __no__ ACPI IRQ resource assigned for the
serial port, so the IRQ value for the port never changes from its
initial value of -1.
If PNP supplies a valid IRQ, use it. Otherwise, leave port.irq == 0, which
means "no IRQ" to the serial core.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Yu Luming <luming.yu@intel.com> Acked-by: Matthew Wilcox <matthew@wil.cx> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Darrick J. Wong [Thu, 15 Nov 2007 00:59:58 +0000 (16:59 -0800)]
i5000_edac: no need to __stringify() KBUILD_BASENAME
The i5000_edac driver's PCI registration structure has the name
""i5000_edac"" (with extra set of double-quotes) which is probably not
intentional. Get rid of __stringify.
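The effect is easy to reproduce outside the kernel. The sketch below assumes, as the message implies, that the base name is already defined as a string literal; STRINGIFY and BASENAME are hypothetical stand-ins for __stringify and KBUILD_BASENAME. Stringifying something that is already a string literal embeds the quote characters in the result.

```c
#include <string.h>

/* Two-level stringification, similar in shape to the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Assumption: the build system already supplies a quoted string literal. */
#define BASENAME "i5000_edac"

/* Yields the 10-character name i5000_edac. */
const char *plain_name(void)
{
	return BASENAME;
}

/* Yields 12 characters: the name wrapped in literal double quotes. */
const char *quoted_name(void)
{
	return STRINGIFY(BASENAME);
}
```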
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Thu, 15 Nov 2007 00:59:57 +0000 (16:59 -0800)]
rtc: fall back to requesting only the ports we actually use
Firmware like PNPBIOS or ACPI can report the address space consumed by the
RTC. The actual space consumed may be less than the size (RTC_IO_EXTENT)
assumed by the RTC driver.
The PNP core doesn't request resources yet, but I'd like to make it do so.
If/when it does, the RTC_IO_EXTENT request may fail, which prevents the RTC
driver from loading.
Since we only use the RTC index and data registers at RTC_PORT(0) and
RTC_PORT(1), we can fall back to requesting just enough space for those.
If the PNP core requests resources, this results in typical I/O port usage
like this:
0070-0073 : 00:06 <-- PNP device 00:06 responds to 70-73
0070-0071 : rtc <-- RTC driver uses only 70-71
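The fallback can be sketched as "try the full extent, then retry with only the two ports in use". Everything named sketch_* or SKETCH_* below is hypothetical: sketch_request() merely models a parent PNP device that owns ports 0x70-0x73, as in the listing above, and is not the kernel's request_region().

```c
/* Hypothetical stand-ins for RTC_PORT(0), RTC_IO_EXTENT and the used range. */
#define SKETCH_RTC_BASE   0x70UL
#define SKETCH_IO_EXTENT  8UL   /* size the driver assumes */
#define SKETCH_USED_PORTS 2UL   /* index and data registers only */

/* Models a parent PNP device that owns only ports 0x70-0x73. */
int sketch_request(unsigned long start, unsigned long len)
{
	return start >= 0x70UL && start + len <= 0x74UL;
}

/* Returns the number of ports claimed, or 0 if both attempts fail. */
unsigned long sketch_claim_rtc(int (*request)(unsigned long, unsigned long))
{
	/* First try the full extent the driver has always asked for. */
	if (request(SKETCH_RTC_BASE, SKETCH_IO_EXTENT))
		return SKETCH_IO_EXTENT;
	/* Fall back to just the index and data registers. */
	if (request(SKETCH_RTC_BASE, SKETCH_USED_PORTS))
		return SKETCH_USED_PORTS;
	return 0;
}
```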
Bjorn Helgaas [Thu, 15 Nov 2007 00:59:56 +0000 (16:59 -0800)]
rtc: release correct region in error path
The misc_register() error path always released an I/O port region,
even if the region was memory-mapped (only mips uses memory-mapped RTC,
as far as I can see).
Fengguang Wu [Thu, 15 Nov 2007 00:59:54 +0000 (16:59 -0800)]
reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file
This is not a new problem in 2.6.23-git17. 2.6.22 and 2.6.23 are buggy in
the same way.
Reiserfs could accumulate dirty sub-page-size files until umount time.
They cannot be synced to disk by pdflush routines or explicit `sync'
commands. Only `umount' can do the trick.
Shannon Nelson [Thu, 15 Nov 2007 00:59:51 +0000 (16:59 -0800)]
I/OAT: Add support for version 2 of ioatdma device
Add support for version 2 of the ioatdma device. This device handles
the descriptor chain and DCA services slightly differently:
- Instead of moving the dma descriptors between a busy and an idle chain,
this new version uses a single circular chain so that we don't have
to rewrite the next_descriptor pointers as we add new requests, and the
device doesn't need to re-read the last descriptor.
- The new device has the DCA tags defined internally instead of needing
them defined statically.
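The single circular chain from the first point can be illustrated with a generic ring of descriptors. This is a plain data-structure sketch, not the ioatdma driver: struct desc, struct ring, RING_SIZE and the ring_* functions are all hypothetical. The next pointers are linked once at initialisation and never rewritten; submitting and completing requests only advances the head and tail.

```c
#include <stddef.h>

#define RING_SIZE 4   /* hypothetical ring size for illustration */

struct desc {
	struct desc *next;   /* set once at init; never rewritten afterwards */
	int payload;
	int busy;
};

struct ring {
	struct desc descs[RING_SIZE];
	struct desc *head;   /* next descriptor to hand out */
	struct desc *tail;   /* oldest descriptor still in flight */
};

void ring_init(struct ring *r)
{
	for (size_t i = 0; i < RING_SIZE; i++) {
		r->descs[i].next = &r->descs[(i + 1) % RING_SIZE];
		r->descs[i].payload = 0;
		r->descs[i].busy = 0;
	}
	r->head = r->tail = &r->descs[0];
}

/* Queue a request; returns 0 on success, -1 if the ring is full. */
int ring_submit(struct ring *r, int payload)
{
	if (r->head->busy)
		return -1;
	r->head->payload = payload;
	r->head->busy = 1;
	r->head = r->head->next;
	return 0;
}

/* Retire the oldest request; returns 0 and stores its payload, or -1 if idle. */
int ring_complete(struct ring *r, int *payload)
{
	if (!r->tail->busy)
		return -1;
	*payload = r->tail->payload;
	r->tail->busy = 0;
	r->tail = r->tail->next;
	return 0;
}
```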
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Cc: "Williams, Dan J" <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linux Kernel Markers: fix marker mutex not taken upon module load
Upon module load, we must take the markers mutex. This requires changing the
nesting order: the marker mutex now nests inside the module mutex. Make the
necessary changes to reverse the order in which the mutexes are taken.
Includes some cleanup from Dave Hansen <haveblue@us.ibm.com>.
Dmitri Vorobiev [Thu, 15 Nov 2007 00:59:47 +0000 (16:59 -0800)]
Fixes to the BFS filesystem driver
I found a few bugs in the BFS driver. Detailed descriptions of the bugs, as
well as the steps to reproduce them, are given in the kernel bugzilla.
Please follow these links for more information:
This patch fixes the bugs described above. Besides, the patch introduces
coding style changes to make the BFS driver conform to the requirements
specified for Linux kernel code. Finally, I made a few cosmetic changes
such as removal of trivial debug output.
Also, the patch removes the fields `si_lf_ioff' and `si_lf_sblk' of the
in-core superblock structure. These fields are initialized but never
actually used.
If you are wondering why I need BFS, here is the answer: I am using this
driver in the context of Linux kernel classes I am teaching at Moscow
State University and at the International Institute of Information
Technology in Pune, India.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Cc: Tigran Aivazian <tigran@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>