When we get a notification that cpu topology changed, we schedule a
work struct which just calls arch_reinit_sched_domains. This function
in turn calls get_online_cpus() which results int the lockdep warning
below.
After all it turnded out that it's not legal to call get_online_cpus()
from the context of a multi-threaded work queue.
It could deadlock this way:
process 0 (events/cpu-x):
-> run_workqueue
-> removes my work_struct from the work queue
-> calls work_struct->fn
-> get_online_cpus()
-> locks on cpu_hotplug.lock since process 1 below is doing cpu hotplug
process 1:
-> cpu_down (for cpu-x)
-> cpu_hotplug_begin (holds cpu_hotplug.lock now)
-> cpu-x dead
-> notifier_call_chain with CPU_DEAD
-> cleanup_workqueue_thread
-> flush_cpu_workqueue (succeeds)
-> kthread_stop for events/cpu-x
-> now kthread_stop waits for my work_struct to complete from within
process 0. -> dead.
A single threaded workqueue wouldn't have such problems, however there is
no such common queue available and it's not worth to create one for the
very rare calls to arch_reinit_sched_domains.
So we just create a kernel thread from our work struct which calls
arch_reinit_sched_domains and are done with it.
Thanks to Oleg Nesterov and Peter Zijlstra for helping me figuring out
that this isn't a false positive lockdep warning:
=======================================================
[ INFO: possible circular locking dependency detected ] 2.6.25-03562-g3dc5063-dirty #12
-------------------------------------------------------
events/3/14 is trying to acquire lock:
(&cpu_hotplug.lock){--..}, at: [<0000000000076094>] get_online_cpus+0x50/0x78
but task is already holding lock:
(topology_work){--..}, at: [<0000000000059cde>] run_workqueue+0x106/0x278
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
Introduce an ->isc field in the subchannel to store the desired
interruption subclass, since sch->schib.pmcw.isc may be overwritten
by the hardware on stsch() after machine checks.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On some smp sysfs store attributes get_online_cpus() may block on
cpu_hotplug.lock, but we hold already smp_cpu_state_mutex. Since the
locking order on cpu hotplug via arch_update_cpu_topology is inverse
this might lead to deadlocks.
So make sure locking order is always the same.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (7798): tuners/Kconfig: Change config name and help to reflect dynamic load for tuners
V4L/DVB (7794): cx88: Fix a warning
V4L/DVB (7792): ivtv: correct misspelled "HIMEM4G" to "HIGHMEM4G" in error message
V4L/DVB (7791): ivtv: POLLHUP must be returned on eof
V4L/DVB (7789b): Fix merge conflicts
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (48 commits)
ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta
ext4: fix test ext_generic_write_end() copied return value
ext3: fix test ext_generic_write_end() copied return value
ext4: Move mballoc headers/structures to a seperate header file mballoc.h
ext4: cleanup for compiling mballoc with verification and debugging #defines
ext4: don't use ext4_error in ext4_check_descriptors
ext4: mark inode dirty after initializing the extent tree
ext4: update ctime and mtime for truncate with extents.
ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group
ext4: move headers out of include/linux
ext4: fix wrong gfp type under transaction
ext4: Fix hang on umount with quotas when journal is aborted
ext4: Fix update of mtime and ctime on rename
jdb2: replace remaining __FUNCTION__ occurrences
ext4: replace remaining __FUNCTION__ occurrences
jbd2: only create debugfs and stats entries if init is successful
jbd2: fix kernel-doc notation
jbd2: replace potentially false assertion with if block
jbd2: eliminate duplicated code in revocation table init/destroy functions
jbd2: tidy up revoke cache initialisation and destruction
...
Robert P. J. Day [Mon, 28 Apr 2008 23:16:20 +0000 (20:16 -0300)]
V4L/DVB (7792): ivtv: correct misspelled "HIMEM4G" to "HIGHMEM4G" in error message
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Stephen Rothwell [Wed, 30 Apr 2008 01:16:16 +0000 (11:16 +1000)]
pasemi_edac needs to include linux/edac.h
Commit c3c52bce6993c6d37af2c2de9b482a7013d646a7 ("edac: fix module
initialization on several modules 2nd time") added a call to opstate_init
but did not include linux/edac.h that declares it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta
In ext4_mb_init_backend() 'i' is of type ext4_group_t. Since unsigned, i
>= 0 is always true, so fix hot spins after err_freebuddy: and -meta:
and prevent decrements when zero.
It fixes:
Compilation errors:
fs/ext4/mballoc.c: In function '__mb_check_buddy':
fs/ext4/mballoc.c:605: error: 'struct ext4_prealloc_space' has no member named 'group_list'
fs/ext4/mballoc.c:606: error: 'struct ext4_prealloc_space' has no member named 'pstart'
fs/ext4/mballoc.c:608: error: 'struct ext4_prealloc_space' has no member named 'len'
Compilation warnings:
fs/ext4/mballoc.c: In function 'ext4_mb_normalize_group_request':
fs/ext4/mballoc.c:2863: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int'
fs/ext4/mballoc.c: In function 'ext4_mb_use_inode_pa':
fs/ext4/mballoc.c:3103: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int'
Sparse check:
fs/ext4/mballoc.c:3818:2: warning: context imbalance in 'ext4_mb_show_ac' - different lock contexts for basic block
Josef Bacik [Wed, 30 Apr 2008 02:00:28 +0000 (22:00 -0400)]
ext4: don't use ext4_error in ext4_check_descriptors
Because ext4_check_descriptors is called at mount time you can't use ext4_error
as it calls ext4_commit_sb, which since the sb isn't all the way initialized
causes bad things to happen (ie a panic). This patch changes the ext4_error's
to printk's to keep this problem from happening. Thanks much,
Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4: update ctime and mtime for truncate with extents.
The recently announced "Linux POSIX file system test suite"
caught a truncate issue when using extents:
mtime and ctime are not updated when truncate is successful.
This is the single issue caught with "default" ext4 (mkfs and mount
with minimal options).
The testsuite does not report failure with -o noextents.
With the following patch, all tests of the testsuite pass.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
Smack: Integrate Smack with Audit
Security: Typecast CAP_*_SET macros
Security: Make secctx_to_secid() take const secdata
Ahmed S. Darwish [Tue, 29 Apr 2008 22:34:10 +0000 (08:34 +1000)]
Smack: Integrate Smack with Audit
Setup the new Audit hooks for Smack. SELinux Audit rule fields are recycled
to avoid `auditd' userspace modifications. Currently only equality testing
is supported on labels acting as a subject (AUDIT_SUBJ_USER) or as an object
(AUDIT_OBJ_USER).
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] linux/libata.h: reorganize ata_device struct members a bit
ahci: SB600 ahci can't do MSI, blacklist that capability
libata: More TSSTcorp pain, keep in sync with legacy IDE
pata_via: Fix 6410 misdetect
[libata] pata_atiixp: fix PIO timing data misprogramming
Josef Bacik [Wed, 30 Apr 2008 02:02:02 +0000 (22:02 -0400)]
ext4: fix wrong gfp type under transaction
This fixes the allocations with GFP_KERNEL while under a transaction problems
in ext4. This patch is the same as its ext3 counterpart, just switches these
to GFP_NOFS.
Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Wed, 30 Apr 2008 02:02:07 +0000 (22:02 -0400)]
ext4: Fix hang on umount with quotas when journal is aborted
Call dquot_drop() from ext4_dquot_drop() even if we fail to start a
transaction. Otherwise we never get to dropping references to quota structures
from the inode and umount will hang indefinitely. Thanks to Payphone LIOU for
spotting the problem.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> CC: Payphone LIOU <lioupayphone@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Alex Chiang [Tue, 29 Apr 2008 22:05:29 +0000 (15:05 -0700)]
[IA64] Provide ACPI fixup for /proc/cpuinfo/physical_id
Legacy HP ia64 platforms currently cannot provide
/proc/cpuinfo/physical_id due to legacy SAL/PAL implementations.
However, that physical topology information can be obtained
via ACPI.
Provide an interface that gives ACPI one last chance to provide
physical_id for these legacy platforms. This logic only comes
into play iff:
- ACPI actually provides slot information for the CPU
- we lack a valid socket_id
Otherwise, we don't do anything.
Since x86 uses the ACPI processor driver as well, we provide a nop
stub function for arch_fix_phys_package_id() in asm-x86/topology.h
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (28 commits)
V4L-DVB(7789a): cx18: fix symbol conflict with ivtv driver
V4L/DVB (7789): tuner: remove static dependencies on analog tuner sub-modules
V4L/DVB (7785): [2.6 patch] make mt9{m001,v022}_controls[] static
V4L/DVB (7786): cx18: new driver for the Conexant CX23418 MPEG encoder chip
V4L/DVB (7783): drivers/media/dvb/frontends/s5h1420.c: printk fix
V4L/DVB (7782): pvrusb2: Driver is no longer experimental
V4L/DVB (7781): pvrusb2-dvb: include dvb support by default and update Kconfig help text
V4L/DVB (7780): pvrusb2: always enable support for OnAir Creator / HDTV USB2
V4L/DVB (7779): pvrusb2-dvb: quiet down noise in kernel log for feed debug
Rename common tuner Kconfig names to use the same
Fix V4L/DVB core help messages
V4L/DVB (7769): Move other terrestrial tuners to common/tuners
V4L/DVB (7768): reorganize some DVB-S Kconfig items
V4L/DVB(7767): Move tuners to common/tuners
V4L/DVB (7766): saa7134: add another PCI ID for Beholder M6
V4L/DVB (7765): Add support for Beholder BeholdTV H6
V4L/DVB (7763): ivtv: add tuner support for the AverMedia M116
V4L/DVB (7762): ivtv: fix tuner detection for PAL-N/Nc
V4L/DVB (7761): ivtv: increase the DMA timeout from 100 to 300 ms
V4L/DVB (7759): ivtv: increase version number to 1.2.1
...
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Convert most new-style drivers to use module aliasing
i2c: Add support for device alias names
i2c-amd756-s4882: Fix an error path
i2c: Drop unused RTC driver IDs
i2c/tps65010: Add missing intialization of client data
i2c-sis5595: Minor cleanups in sis5595_access
i2c-piix4: Minor cleanups
i2c: Spelling fix (successful)
i2c-stub: No newline in parameter description
V4L-DVB(7789a): cx18: fix symbol conflict with ivtv driver
LD drivers/media/video/built-in.o
drivers/media/video/cx18/built-in.o: In function `get_service_set':
/home/v4l/tokernel/git/drivers/media/video/cx18/cx18-ioctl.c:118: multiple definition of `get_service_set'
drivers/media/video/ivtv/built-in.o:/home/v4l/tokernel/git/drivers/media/video/ivtv/ivtv-ioctl.c:119: first defined here
drivers/media/video/cx18/built-in.o: In function `expand_service_set':
/home/v4l/tokernel/git/drivers/media/video/cx18/cx18-ioctl.c:92: multiple definition of `expand_service_set'
drivers/media/video/ivtv/built-in.o:/home/v4l/tokernel/git/drivers/media/video/ivtv/ivtv-ioctl.c:92: first defined here
drivers/media/video/cx18/built-in.o: In function `service2vbi':
/home/v4l/tokernel/git/drivers/media/video/cx18/cx18-ioctl.c:44: multiple definition of `service2vbi'
drivers/media/video/ivtv/built-in.o:/home/v4l/tokernel/git/drivers/media/video/ivtv/ivtv-ioctl.c:42: first defined here
make[2]: ** [drivers/media/video/built-in.o] Erro 1
make[1]: ** [drivers/media/video] Erro 2
make: ** [drivers/media/] Erro 2
Hans Verkuil [Mon, 28 Apr 2008 23:24:33 +0000 (20:24 -0300)]
V4L/DVB (7786): cx18: new driver for the Conexant CX23418 MPEG encoder chip
Many thanks to Steve Toth from Hauppauge and Nattu Dakshinamurthy from
Conexant for their support. I am in particular thankful to Hauppauge
since without their help this driver would not exist. It should also
be noted that Steve did the work to get the DVB part up and running.
Thank you!
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Steven Toth <stoth@hauppauge.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: G. Andrew Walls <awalls@radix.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
drivers/media/dvb/frontends/s5h1420.c: In function `s5h1420_setsymbolrate':
drivers/media/dvb/frontends/s5h1420.c:484: warning: long long unsigned int format, u64 arg (arg 2)
We do not know what type the architecture uses for u64.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Mike Isely [Mon, 28 Apr 2008 00:37:33 +0000 (21:37 -0300)]
V4L/DVB (7782): pvrusb2: Driver is no longer experimental
This driver has been in-kernel and reasonably stable for well over a
year. It is in a stable form and is known to work well. Remove its
experimental status.
Signed-off-by: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Michael Krufky [Sun, 27 Apr 2008 22:12:29 +0000 (19:12 -0300)]
V4L/DVB (7780): pvrusb2: always enable support for OnAir Creator / HDTV USB2
This was a build option in the past, to avoid conflicts with the cxusb module
for digital televsion support. Now that dtv mode support has been merged into
pvrusb2, the OnAir devices are fully supported by this single module. This no
longer should be a build option.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
V4L/DVB (7769): Move other terrestrial tuners to common/tuners
Those tuners are currently used only under media/dvb. However,
they can support also analog TV. Better to move them to the same place
as the other hybrid tuners. This would make easier to use those tuners also
by analog drivers.
There were several issues in the past, caused by the hybrid tuner design, since
now, the same tuner can be used by drivers/media/dvb and drivers/media/video.
Kconfig items were rearranged, to split V4L/DVB core from their drivers.
Tuner setup were happening during i2c attach callback. This means that it would
happen on two conditions:
1) if tuner module weren't load, it will happen at request_module("tuner");
2) if tuner is not compiled as a module, or it is already loaded
(for example, on setups with more than one tuner), it will happen
when saa7134 registers I2C bus.
Due to that, if tuner were loaded, tuner setup will happen _before_ reading
the proper values at tuner eeprom. Since set_addr refuses to change for a tuner
that were previously defined (except if the tuner_addr is set), this were
making eeprom tuner detection useless.
This patch removes tuner type setup from saa7134-i2c, moving it to the proper
place, after taking eeprom into account.
Reviewed-by: Hermann Pitton <hermann-pitton@arcor.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Tuner setup were happening during i2c attach callback. This means that it would
happen on two conditions:
1) if tuner module weren't load, it will happen at request_module("tuner");
2) if tuner is not compiled as a module, or it is already loaded
(for example, on setups with more than one tuner), it will happen
when cx88 registers I2C bus.
Due to that, if tuner were loaded, tuner setup will happen _before_ reading
the proper values at tuner eeprom. Since set_addr refuses to change for a tuner
that were previously defined (except if the tuner_addr is set), this were making
eeprom tuner detection useless.
This patch removes tuner type setup from cx88-i2c, moving it to the proper
place, after taking eeprom into account.
Reviewed-by: Gert Vervoort <gert.vervoort@hccnet.nl> Reviewed-by: Ian Pickworth <ian@pickworth.me.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Alan Cox [Tue, 29 Apr 2008 13:10:57 +0000 (14:10 +0100)]
pata_via: Fix 6410 misdetect
The discrete VIA ATA chips don't have 0x40 enable bits. We check that
properly in one location but not another. This causes some users 6410
RAID cards to be incorrectly skipped.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Jean Delvare [Tue, 29 Apr 2008 21:11:40 +0000 (23:11 +0200)]
i2c: Convert most new-style drivers to use module aliasing
Based on earlier work by Jon Smirl and Jochen Friedrich.
Update most new-style i2c drivers to use standard module aliasing
instead of the old driver_name/type driver matching scheme. I've
left the video drivers apart (except for SoC camera drivers) as
they're a bit more diffcult to deal with, they'll have their own
patch later.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Jon Smirl <jonsmirl@gmail.com> Cc: Jochen Friedrich <jochen@scram.de>
Jean Delvare [Tue, 29 Apr 2008 21:11:39 +0000 (23:11 +0200)]
i2c: Add support for device alias names
Based on earlier work by Jon Smirl and Jochen Friedrich.
This patch allows new-style i2c chip drivers to have alias names using
the official kernel aliasing system and MODULE_DEVICE_TABLE(). At this
point, the old i2c driver binding scheme (driver_name/type) is still
supported.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Jochen Friedrich <jochen@scram.de> Cc: Jon Smirl <jonsmirl@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/nes: Formatting cleanup
RDMA/nes: Add support for SFP+ PHY
RDMA/nes: Use LRO
IPoIB: Copy child MTU from parent
IB/mthca: Avoid changing userspace ABI to handle DMA write barrier attribute
IB/mthca: Avoid recycling old FMR R_Keys too soon
mlx4_core: Avoid recycling old FMR R_Keys too soon
IB/ehca: Allocate event queue size depending on max number of CQs and QPs
IPoIB: Use separate CQ for UD send completions
IB/iser: Count FMR alignment violations per session
IB/iser: Move high-volume debug output to higher debug level
IB/ehca: handle negative return value from ibmebus_request_irq() properly
RDMA/cxgb3: Support peer-2-peer connection setup
RDMA/cxgb3: Set the max_mr_size device attribute correctly
RDMA/cxgb3: Correctly serialize peer abort path
mlx4_core: Add a way to set the "collapsed" CQ flag
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
alim15x3: disable init_hwif_ali15x3 for PowerPC
ide: fix crash at boot with siimage driver
Anton Vorontsov [Tue, 29 Apr 2008 20:57:38 +0000 (22:57 +0200)]
alim15x3: disable init_hwif_ali15x3 for PowerPC
We don't need init_hwif_ali15x3() on the PowerPC systems either.
Before:
ALI15X3: IDE controller (0x10b9:0x5229 rev 0xc8) at PCI slot 0001:03:1f.0
ALI15X3: 100% native mode on irq 19
ide0: BM-DMA at 0x1120-0x1127
ide1: BM-DMA at 0x1128-0x112f
hda: SONY DVD RW AW-Q170A, ATAPI CD/DVD-ROM drive
hda: UDMA/66 mode selected
ide0: Disabled unable to get IRQ 14.
ide0: failed to initialize IDE interface
ide1: Disabled unable to get IRQ 15.
ide1: failed to initialize IDE interface
After:
ALI15X3: IDE controller (0x10b9:0x5229 rev 0xc8) at PCI slot 0001:03:1f.0
ALI15X3: 100% native mode on irq 19
ide0: BM-DMA at 0x1120-0x1127
ide1: BM-DMA at 0x1128-0x112f
hda: SONY DVD RW AW-Q170A, ATAPI CD/DVD-ROM drive
hda: UDMA/66 mode selected
ide0 at 0x1100-0x1107,0x110a on irq 19
ide1 at 0x1110-0x1117,0x111a on irq 19
hda: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache
ide0 works well, though I can't test ide1, it isn't traced out on
the board.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Some change to the IDE layer are causing the siimage driver to crash
at boot with a NULL dereference. This is due to the sil_dma_ops not
containing all the necessary pointers. I suppose it used to just
"override" the defaults while now, it needs to contain everything.
[bart: while at it: sil_dma_ops should be const now (pointed out by Sergei)]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>, Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Alex Chiang [Thu, 24 Apr 2008 18:57:08 +0000 (12:57 -0600)]
[IA64] Remove printk noise on unimplemented SAL_PHYSICAL_ID_INFO
Commit 113134fcbca83619be4c68d0ca66db6093777b5d changed the flow of
control when calling PAL_LOGICAL_TO_PHYSICAL and SAL_PHYSICAL_ID_INFO.
With the change, if a platform did not implement the latter, a useless
printk would appear in the boot log:
ia64_sal_pltid failed with -1
So let's check the return code and only printk on a true error, and do
not print anything in the unimplemented case. While we're in there,
clean up some stylistic issues too.
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Various cleanups:
- Change // to /* .. */
- Place whitespace around binary operators.
- Trim down a few long lines.
- Some minor alignment formatting for better readability.
- Remove some silly tabs.
Signed-off-by: Glenn Streiff <gstreiff@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eli Cohen [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
IPoIB: Copy child MTU from parent
When creating a child interface, copy the MTU information from the
parent. Otherwise when the child's multicast join completes, the MTU
will not be updated since the code does
dev->mtu = min(priv->mcast_mtu, priv->admin_mtu);
and priv->admin_mtu will be set to 0.
Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
IB/mthca: Avoid changing userspace ABI to handle DMA write barrier attribute
Commit cb9fbc5c ("IB: expand ib_umem_get() prototype") changed the
mthca userspace ABI to provide a way for userspace to indicate which
memory regions need the DMA write barrier attribute. However, it is
possible to handle this without breaking existing userspace, by having
the mthca kernel driver recognize whether it is talking to old or new
userspace, depending on the size of the register MR structure passed in.
The only potential drawback of this is that is allows old userspace
(which has a bug with DMA ordering on large SGI Altix systems) to
continue to run on new kernels, but the advantage of allowing old
userspace to continue to work on unaffected systems seems to outweigh
this, and we can print a warning to push people to upgrade their
userspace.
Olaf Kirch [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
IB/mthca: Avoid recycling old FMR R_Keys too soon
When a FMR is unmapped, mthca resets the map count to 0, and clears
the upper part of the R_Key which is used as the sequence counter.
This poses a problem for RDS, which uses ib_fmr_unmap as a fence
operation. RDS assumes that after issuing an unmap, the old R_Keys
will be invalid for a "reasonable" period of time. For instance,
Oracle processes uses shared memory buffers allocated from a pool of
buffers. When a process dies, we want to reclaim these buffers -- but
we must make sure there are no pending RDMA operations to/from those
buffers. The only way to achieve that is by using unmap and sync the
TPT.
However, when the sequence count is reset on unmap, there is a high
likelihood that a new mapping will be given the same R_Key that was
issued a few milliseconds ago.
To prevent this, don't reset the sequence count when unmapping a FMR.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Olaf Kirch [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
mlx4_core: Avoid recycling old FMR R_Keys too soon
When a FMR is unmapped, mlx4 resets the map count to 0, and clears the
upper part of the R_Key which is used as the sequence counter.
This poses a problem for RDS, which uses ib_fmr_unmap as a fence
operation. RDS assumes that after issuing an unmap, the old R_Keys
will be invalid for a "reasonable" period of time. For instance,
Oracle processes uses shared memory buffers allocated from a pool of
buffers. When a process dies, we want to reclaim these buffers -- but
we must make sure there are no pending RDMA operations to/from those
buffers. The only way to achieve that is by using unmap and sync the
TPT.
However, when the sequence count is reset on unmap, there is a high
likelihood that a new mapping will be given the same R_Key that was
issued a few milliseconds ago.
To prevent this, don't reset the sequence count when unmapping a FMR.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stefan Roscher [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
IB/ehca: Allocate event queue size depending on max number of CQs and QPs
If a lot of QPs fall into Error state at once and the EQ of the
respective HCA is too small, it might overrun, causing the eHCA driver
to stop processing completion events and calling the application's
completion handlers, effectively causing traffic to stop.
Fix this by limiting available QPs and CQs to a customizable max
count, and determining EQ size based on these counts and a worst-case
assumption.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eli Cohen [Tue, 29 Apr 2008 20:46:53 +0000 (13:46 -0700)]
IPoIB: Use separate CQ for UD send completions
Use a dedicated CQ for UD send completions. Also, do not arm the UD
send CQ, which reduces the number of interrupts generated. This patch
farther reduces overhead by not calling poll CQ for every posted send
WR -- it does polls only when there 16 or more outstanding work requests.
Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>