Linus Torvalds [Sat, 20 Oct 2007 02:59:18 +0000 (19:59 -0700)]
Avoid compile error in fs/nfs/unlink.c
Erez Zadok reports that certain configurations fail to build due to
schedule() TASK_[UN]INTERRUPTIBLE not being declared. Add proper
include files to fix.
Jeff Garzik [Sat, 20 Oct 2007 02:56:44 +0000 (22:56 -0400)]
[libata] sata_sis: use correct S/G table size
sata_sis has the same restrictions as other SFF controllers, and so must
use LIBATA_MAX_PRD to denote that SCSI may only fill ATA_MAX_PRD/2
entries, due to our need to handle IOMMU merging.
sysctl: Don't compile sysctl_check when !CONFIG_SYSCTL
Weird I thought I had written the makefile so this would be handled. Oh
well this should fix it.
Sorry about that.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jonathan Brassow [Fri, 19 Oct 2007 21:47:57 +0000 (22:47 +0100)]
dm log: split suspend
There are now two phases to a suspend in device-mapper -
presuspend and postsuspend. This patch removes the
single 'suspend' in the logging API and replaces it with
'presuspend' and 'postsuspend' functions to align it
better with core device-mapper.
A subsequent patch will make use of 'presuspend'.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Dave Wysochanski [Fri, 19 Oct 2007 21:47:55 +0000 (22:47 +0100)]
dm mpath: hp retry if not ready
This patch adds retries to the hp hardware handler, and utilizes the
MP_RETRY flag of dm-multipath. For now in the hp handler, if we get a
pg_init completed with a check condition we just assume we can retry the
pg_init command. We make this assumption because of incomplete data on
specific check condition code of the HP hardware, and because testing
has shown the HP path initialization command to be idempotent.
The number of times we retry is settable via the "pg_init_retries"
multipath map feature.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Dave Wysochanski [Fri, 19 Oct 2007 21:47:53 +0000 (22:47 +0100)]
dm mpath: add retry pg init
This patch allows a failed path group initialisation command to be retried.
It adds a generic MP_RETRY flag and a "pg_init_retries" feature to
device-mapper multipath which limits the number of retries.
1. A hw handler sends a path initialization command to the storage and
the command completes with an error code indicating the command
should be retried.
2. The hardware handler calls dm_pg_init_complete() with MP_RETRY
set in err_flags to ask the dm multipath core to retry.
3. If the retry limit has not been exceeded, pg_init() is retried.
Otherwise fail_path() is called.
If you are using the userspace multipath-tools or device-mapper-multipath
package, you can set pg_init_retries in the 'device' section of your
/etc/multipath.conf file. For example:
features "2 pg_init_retries 7"
The number of PG retries attempted is reported in the 'dmsetup status' output.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Milan Broz [Fri, 19 Oct 2007 21:38:58 +0000 (22:38 +0100)]
dm crypt: add post processing queue
Add post-processing queue (per crypt device) for read operations.
Current implementation uses only one queue for all operations
and this can lead to starvation caused by many requests waiting
for memory allocation. But the needed memory-releasing operation
is queued after these requests (in the same queue).
Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Jesper Juhl [Fri, 19 Oct 2007 21:38:54 +0000 (22:38 +0100)]
dm io:ctl remove vmalloc void cast
In drivers/md/dm-ioctl.c::copy_params() there's a call to vmalloc()
where we currently cast the return value, but that's pretty pointless
given that vmalloc() returns "void *".
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Kcopyd uses a semaphore as mutex. Use the mutex API instead of the (binary)
semaphore,
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jun'ichi Nomura [Fri, 19 Oct 2007 21:38:43 +0000 (22:38 +0100)]
dm: fix thaw_bdev
This patch fixes a bd_mount_sem counter corruption bug in device-mapper.
thaw_bdev() should be called only when freeze_bdev() was called for the
device.
Otherwise, thaw_bdev() will up bd_mount_sem and corrupt the semaphore counter.
struct block_device with the corrupted semaphore may remain in slab cache
and be reused later.
Attached patch will fix it by calling unlock_fs() instead.
unlock_fs() will determine whether it should call thaw_bdev()
by checking the device is frozen or not.
Easy reproducer is:
#!/bin/sh
while [ 1 ]; do
dmsetup --notable create a
dmsetup --nolockfs suspend a
dmsetup remove a
done
It's not easy to see the effect of corrupted semaphore.
So I have tested with putting printk below in bdev_alloc_inode():
if (atomic_read(&ei->bdev.bd_mount_sem.count) != 1)
printk(KERN_DEBUG "Incorrect semaphore count = %d (%p)\n",
atomic_read(&ei->bdev.bd_mount_sem.count),
&ei->bdev);
Without the patch, I saw something like:
Incorrect semaphore count = 17 (f2ab91c0)
With the patch, the message didn't appear.
The bug was introduced in 2.6.16 with this bug fix:
Need to unfreeze and release bdev otherwise the bdev inode with
inconsistent state is reused later and cause problem.
and backported to 2.6.15.5.
It occurs only in free_dev(), which is called only when the dm device is
removed. The buggy code is executed only if md->suspended_bdev is
non-NULL and that can happen only when the device was suspended without
noflush.
Milan Broz [Fri, 19 Oct 2007 21:38:36 +0000 (22:38 +0100)]
dm io:ctl use constant struct size
Make size of dm_ioctl struct always 312 bytes on all supported
architectures.
This change retains compatibility with already-compiled code because
it uses an embedded offset to locate the payload that follows the
structure.
On 64-bit architectures there is no change at all; on 32-bit
we are increasing the size of dm-ioctl from 308 to 312 bytes.
Currently with 32-bit userspace / 64-bit kernel on x86_64
some ioctls (including rename, message) are incorrectly rejected
by the comparison against 'param + 1'. This breaks userspace
lvrename and multipath 'fail_if_no_path' changes, for example.
(BTW Device-mapper uses its own versioning and ignores the ioctl
size bits. Only the generic ioctl compat code on mixed arches
checks them, and that will continue to accept both sizes for now,
but we intend to list 308 as deprecated and eventually remove it.)
Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Guido Guenther <agx@sigxcpu.org> Cc: Kevin Corry <kevcorry@us.ibm.com> Cc: stable@kernel.org
Bryn M. Reeves [Fri, 19 Oct 2007 21:29:32 +0000 (22:29 +0100)]
dm mpath: rdac fix init race
Re-order the initialisation of dm-rdac to avoid registering the hw
handler before the workqueue has been initialised. Closes a race
that would potentially give an oops.
Signed-off-by: Bryn M. Reeves <breeves@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
It should be TASKFILE_NO_DATA, not TASKFILE_IN. Luckily ATM ->data_phase is
unused if ->command_type == IDE_DRIVE_TASK_NO_DATA but this may change in the
future.
* Set hwif->dma_base only if allocation of extra ports succeeds.
While at it:
* Move setting of hwif->dma_{base,master} from ide_{mapped_mmio,iomio}_dma()
to ide_setup_dma().
* Rename 'dma_base' argument to 'base' in ide_setup_dma() (to make the code
obey 80-columns limit and increase its readability).
* Remove stale ide_setup_dma() comment.
v2:
* Change to allocate hwif->dmatable_cpu before reserving I/O ports missed
teardown code (spotted by Sergei). On the second thought this change is
actually unnecessary so revert it in v2.
* Make ide_release_dma_engine() void and remove needless comment.
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide: fix ide_register_hw() to check hwif->io_ports[]
hwif->hw.io_ports[] and hwif->io_ports[] should be the same but "4drives"
support and scc_pata host driver set only hwif->io_ports[].
To compensate for this check hwif->io_ports[] instead of hwif->hw.io_ports[]
in ide_register_hw() (instead of fixing "4drives" and scc_pata because hwif->hw
is to be removed).
* Add ide_device_add() helper and convert host drivers to use it
instead of open-coded variants.
* Make ide_pci_setup_ports() and do_ide_setup_pci_device()
take 'u8 *idx' argument instead of 'ata_index_t *index'.
* Remove no longer needed ata_index_t.
* Unexport probe_hwif_init() and make it static.
* Unexport ide_proc_register_port().
There should be no functionality changes caused by this patch
(sgiioc4.c: ide_proc_register_port() requires hwif->present
to be set and it won't be set if probe_hwif_init() fails).
* Split off sil_sata_udma_filter() from sil_udma_filter()
and rename sil_udma_filter() to sil_pata_udma_filter().
* Rename siimage_busproc() to sil_sata_busproc().
* Rename siimage_reset_poll() to sil_sata_reset_poll()
and in init_hwif_siimage() set ->reset_poll method only
for SATA controllers.
* Rename siimage_pre_reset() to sil_sata_pre_reset(),
in init_hwif_siimage() set ->pre_reset method only for
SATA controllers and remove redundant is_sata() call.
* Add CONFIG_BLK_DEV_IDE_SATA #ifdef/#endif to pdev_is_sata()
so compiler will know to throw out unused SATA code for
CONFIG_BLK_DEV_IDE_SATA=n case (830 bytes saved on x86-32).
* Bump driver version.
Some minor cleanups while at it:
* Convert sil_{pata,sata}_udma_filter() to use ATA_UDMA* defines.
* In siimage_mmio_ide_dma_test_irq() move 'base' variable
under 'if (SATA_ERROR_REG)' block.
alim15x3: fix CD_ROM DMA and PIO FIFO settings setup
* Setup CD_ROM DMA and PIO FIFO settings in init_chipset_ali15x3() instead
of ata66_ali15x3(). The latter is called from init_hwif_common_ali15x3()
only if DMA base exists (which insists m5529_revision > 0x20).
This changes makes CD_ROM DMA / PIO FIFO bits being set only once
and also when "idex=ata66" kernel parameter is used.
* While at it move also chip_is_1543c_e setup from ata66_ali15x3() to
init_chipset_ali15x3() and check if isa_dev exists before accessing it.
Add IDE_HFLAG_{IO_32BIT,UNMASK_IRQS} host flag to tell ide_pci_setup_ports()
to set drive->{io_32bit,unmask} for both drives on the interface. Convert
amd74xx, sl82c105 and via82cxxx host drivers to use these new host flags.
Add IDE_HFLAG_FORCE_LEGACY_IRQS host flag to tell ide_pci_setup_ports()
to always set hwif->irq to legacy IRQ 14/15 and convert generic IDE PCI
and via82cxxx host drivers to use it.
While at it:
* Add IDE_HFLAGS_UMC define (generic IDE PCI host driver).
* Remove no longer needed init_hwif_generic() (generic IDE PCI host driver).
* Set d->udma_mask instead of hwif->ultra_mask (via82cxxx host driver).
Add ->chipset field to ide_pci_device_t and use it in ide_hwif_configure()
to set hwif->chipset. Convert cmd64x, cy82c693, rz1000 and trm290 host
drivers to use this new ability.
While at it define hwif_chipset_t as u8 to save some space in hw_regs_t,
ide_hwif_t and ide_pci_device_t instances.
Add hwif_register_devices() helper to fix code duplication between
probe_hwif_init_with_fixup() and ideprobe_init(). Also remove stale
comment while at it.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide: fix disabled ports reporting for PCI controllers
Report all disabled ports in ide_pci_setup_ports() (prevents the bogus
warning when ide_hwif_configure()->ide_match_hwif() fails to find free
ide_hwifs[] slots).
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (33 commits)
x86: convert cpuinfo_x86 array to a per_cpu array
x86: introduce frame_pointer() and stack_pointer()
x86 & generic: change to __builtin_prefetch()
i386: do not BUG_ON() when MSR is unknown
x86: acpi use cpu_physical_id
x86: convert cpu_llc_id to be a per cpu variable
x86: convert cpu_to_apicid to be a per cpu variable
i386: introduce "used_vectors" bitmap which can be used to reserve vectors.
x86: use raw locks during oopses
x86: honor _PAGE_PSE bit on page walks
i386: do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.
x86: implement missing x86_64 function smp_call_function_mask()
x86: use descriptor's functions instead of inline assembly
i386: consolidate show_regs and show_registers for i386
i386: make callgraph use dump_trace() on i386/x86_64
x86: enable iommu_merge by default
i386: i386 add AMD64 Barcelona PMU MSR definitions to msr.h
x86: Unify i386 and x86-64 early quirks
x86: enable HPET on ICH3 and ICH4
x86: force enable HPET on VT8235/8237 chipsets
...
Manually fix trivial conflict with task pid container helper changes in
arch/x86/kernel/process_32.c
Linus Torvalds [Fri, 19 Oct 2007 21:31:42 +0000 (14:31 -0700)]
Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
NFSv4: Fix an rpc_cred reference leakage in fs/nfs/delegation.c
NFSv4: Ensure that we wait for the CLOSE request to complete
NFS: Fix a race in sillyrename
NFS: Fix a writeback race...
Trond Myklebust [Mon, 15 Oct 2007 22:17:53 +0000 (18:17 -0400)]
NFS: Fix a race in sillyrename
lookup() and sillyrename() can race one another because the sillyrename()
completion cannot take the parent directory's inode->i_mutex since the
latter may be held by whoever is calling dput().
We therefore have little option but to add extra locking to ensure that
nfs_lookup() and nfs_atomic_open() do not race with the sillyrename
completion.
If somebody has looked up the sillyrenamed file in the meantime, we just
transfer the sillydelete information to the new dentry.
Please refer to the bug-report at
http://bugzilla.linux-nfs.org/show_bug.cgi?id=150
We cannot zero the user page in nfs_mark_uptodate() any more, since
a) We'd be modifying the page without holding the page lock
b) We can race with other updates of the page, most notably
because of the call to nfs_wb_page() in nfs_writepage_setup().
Instead, we do the zeroing in nfs_update_request() if we see that we're
creating a request that might potentially be marked as up to date.
Thanks to Olivier Paquet for reporting the bug and providing a test-case.
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
kbuild: fix first module build
kconfig: update kconfig-language text
kbuild: introduce cc-cross-prefix
kbuild: disable depmod in cross-compile kernel build
kbuild: make deb-pkg - add 'Provides:' line
kconfig: comment typo in scripts/kconfig/Makefile.
kbuild: stop docproc segfaulting when SRCTREE isn't set.
kbuild: modpost problem when symbols move from one module to another
kbuild: cscope - filter out .tmp_* in find_sources
kbuild: mailing list has moved
kbuild: check asm symlink when building a kernel
Sam Ravnborg [Fri, 19 Oct 2007 20:20:02 +0000 (22:20 +0200)]
kbuild: fix first module build
When building a specific module before doing a total kernel
build it failed because $(MORVERDIR) were missing.
Creating the MODVERDIR explicit (independent of KBUILD_MODULES)
fixed this. As a side-effect the MODVERDIR will be created
also for a non-module build - but no harm done by that.
Linus Torvalds [Fri, 19 Oct 2007 20:12:46 +0000 (13:12 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (41 commits)
ACPICA: hw: Don't carry spinlock over suspend
ACPICA: hw: remove use_lock flag from acpi_hw_register_{read, write}
ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle
ACPI: clean up acpi_enter_sleep_state_prep
Hibernation: Make sure that ACPI is enabled in acpi_hibernation_finish
ACPI: suppress uninitialized var warning
cpuidle: consolidate 2.6.22 cpuidle branch into one patch
ACPI: thinkpad-acpi: skip blanks before the data when parsing sysfs
ACPI: AC: Add sysfs interface
ACPI: SBS: Add sysfs alarm
ACPI: SBS: Add ACPI_PROCFS around procfs handling code.
ACPI: SBS: Add support for power_supply class (and sysfs)
ACPI: SBS: Make SBS reads table-driven.
ACPI: SBS: Simplify data structures in SBS
ACPI: SBS: Split host controller (ACPI0001) from SBS driver (ACPI0002)
ACPI: EC: Add new query handler to list head.
ACPI: Add acpi_bus_generate_event4() function
ACPI: Battery: add sysfs alarm
ACPI: Battery: Add sysfs support
ACPI: Battery: Misc clean-ups, no functional changes
...
Fix up conflicts in drivers/misc/thinkpad_acpi.[ch] manually
Randy Dunlap [Fri, 19 Oct 2007 17:53:48 +0000 (10:53 -0700)]
kconfig: update kconfig-language text
Add kconfig-language docs for mainmenu, def_bool, and def_tristate.
Remove "requires" as a synonym of "depends on" since it was removed
from the parser in commit 247537b9a270b52cc0375adcb0fb2605a160cba5.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>