Linus Torvalds [Tue, 30 Jan 2007 00:37:38 +0000 (16:37 -0800)]
Fix balance_dirty_page() calculations with CONFIG_HIGHMEM
This makes balance_dirty_page() always base its calculations on the
amount of non-highmem memory in the machine, rather than try to base it
on total memory and then falling back on non-highmem memory if the
mapping it was writing wasn't highmem capable.
This not only fixes a situation where two different writers can have
wildly different notions about what is a "balanced" dirty state, but it
also means that people with highmem machines don't run into an OOM
situation when regular memory fills up with dirty pages.
We used to try to handle the latter case by scaling down the dirty_ratio
if the machine had a lot of highmem pages in page_writeback_init(), but
it wasn't aggressive enough for some situations, and since basing the
dirty ratio on highmem memory was broken in the first place, let's just
stop doing so.
(A variation of this theme fixed Justin Piszcz's OOM problem when
copying an 18GB file on a RAID setup).
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Justin Piszcz <jpiszcz@lucidpixels.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Neil Brown <neilb@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Fri, 26 Jan 2007 11:40:31 +0000 (12:40 +0100)]
[PATCH] ALSA: Fix sysfs breakage
The recent change for a new sysfs tree with card* object breaks the
/sys/class/sound tree if CONFIG_SYSFS_DEPRECATED is enabled.
The device in each entry doesn't point the correct device object:
Also, this change breaks some drivers (like sound/arm/*) referring
card->dev directly to obtain the device object for memory handling.
This patch reverts the semantics of card->dev to the former version,
which points to a real device object. The card* object is stored in a
new card->card_dev field, instead. The device parent is chosen either
card->dev or card->card_dev according to CONFIG_SYSFS_DEPRECATED to
keep the tree compatibility.
Also, card* isn't created if CONFIG_SYSFS_DEPRECATED is enabled. The
reason of card* object is a root of all beloing devices, and it makes
little sense if each sound device points to the real device object
directly.
Dave Jones [Mon, 29 Jan 2007 05:07:04 +0000 (00:07 -0500)]
[CPUFREQ] Remove unneeded errata workaround from p4-clockmod.
This workaround unnecessarily cripples functionality to work
around an errata that doesn't seem possible to hit due to
us using the automatic clock throttling in the p4 mcheck code.
See http://lkml.org/lkml/2006/10/28/148 for complete reasoning
and lack of disconsent.
A stupid bug has been plaguing the sys_pciconfig_iobase on ppc64. It wasn't
noticed until recently as it seems to not affect G5s but it's been causing
problems running X servers on some other machines recently. The bus number
matching was bogus.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Darrick J. Wong [Fri, 26 Jan 2007 22:08:52 +0000 (14:08 -0800)]
[SCSI] libsas: Handle SCSI commands that complete with failure codes
This patch moves the code that handles SAS failures out of the main EH
function and into a separate function. It also detects commands that have
no sas_task (i.e. they completed, but with error data) and sends them into
scsi_error for processing. This allows us to handle SCSI errors (and
enables auto-spinup as a side effect) instead of dropping them on the
floor and falling into an infinite loop. It also requires the
implementation of a device reset function, which the SAS failure code has
been modified to employ for REQ_DEVICE_RESET.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Darrick J. Wong [Fri, 26 Jan 2007 22:08:43 +0000 (14:08 -0800)]
[SCSI] libsas: Clean up discovery failure handler code
sas_rphy_delete does two things: it removes the sas_rphy from the transport
layer and frees the sas_rphy. This can be broken down into two functions,
sas_rphy_remove and sas_rphy_free; sas_rphy_remove is of interest to
sas_discover_root_expander because it calls functions that require
sas_rphy_add as a prerequisite and can fail (namely sas_discover_expander).
In that case, sas_discover_root_expander needs to be able to undo the effects
of sas_rphy_add yet leave the job of freeing the sas_rphy to the caller of
sas_discover_root_expander.
This patch also removes some unnecessary code from sas_discover_end_dev
to eliminate an unnecessary cycle of sas_notify_lldd_gone/found for SAS
devices, thus eliminating a sas_rphy_remove call (and fixing a race condition
where a SCSI target scan can come in between the gone and found call).
It also moves the sas_rphy_free calls into sas_discover_domain and
sas_ex_discover_end_dev to complement the sas_rphy_allocation via
sas_get_port_device.
This patch does not change the semantics of sas_rphy_delete.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Darrick J. Wong [Fri, 26 Jan 2007 22:08:41 +0000 (14:08 -0800)]
[SCSI] libsas: Fix incorrect sas_port deformation in sas_form_port
Currently, sas_form_port checks the given asd_sas_phy's sas_phy to see if
there's already a port attached. If so, the SAS addresses of the port and
the phy are compared to determine if we need to detach from the port
because the addresses don't match or if we can stop; the SAS address stored
in the sas_port reflects whatever device _was_ attached to the port/phy, and
the SAS address stored in the sas_port reflects whatever device we just
discovered. As written, the code detaches from the port if the addresses
_do_ match, and prints an error if they do _not_ match. I believe this to
be incorrect, as it seems more logical to keep the port if the addresses
match (i.e. the phy was reset but the device didn't change), and detach it
they do not (i.e. the device changed).
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Take the expose_physicals flag and allow the user to select default (physicals
available via /dev/sg), exposed (physicals available via /dev/sd for
experimental reasons) and hidden (physicals blocked from all access). This
expands the functionality of the previous expose_physicals insmod parameter
which was added to support some experimental configurations.
Signed-off-by Mark Haverkamp <markh@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Mark Haverkamp [Tue, 23 Jan 2007 23:00:30 +0000 (15:00 -0800)]
[SCSI] aacraid: rework packet support code
Received from Mark Salyzyn,
Replace all if/else packet formations with platform function calls. This is in
recognition of the proliferation of read and write packet types, and in the
need to migrate to up-and-coming packets for new products.
Signed-off-by Mark Haverkamp <markh@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Mark Haverkamp [Tue, 23 Jan 2007 22:59:20 +0000 (14:59 -0800)]
[SCSI] aacraid: rework communication support code
Received from Mark Salyzyn,
Replace all if/else communication transports with a platform function call.
This is in recognition of the need to migrate to up-and-coming transports.
Currently the Linux driver does not support two available communication
transports provided by our products, these will be added in future patches, and
will expand the platform function set.
Signed-off-by Mark Haverkamp <markh@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Brian King [Tue, 23 Jan 2007 17:25:37 +0000 (11:25 -0600)]
[SCSI] ipr: PCI error recovery fix
Since the pci_block_user_cfg_access API was modified to track
block/unblocks, it was discovered that the ipr driver had a
path through its code (in PCI error recovery) which would unblock
when not previously blocked.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Brian King [Tue, 23 Jan 2007 17:25:23 +0000 (11:25 -0600)]
[SCSI] ipr: Remove usage of pci driver data
Since ipr handles dynamic ids, it must handle driver_data
not being set, so remove the current usage of driver_data
so it can be used for other things in future patches.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Alexis Bruemmer [Tue, 16 Jan 2007 23:36:12 +0000 (15:36 -0800)]
[SCSI] aic94xx: fix typos and update verison number
fix typos and bump version number
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Acked-by: Alexis Bruemmer <alexisb@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Josepch Chan [Sat, 27 Jan 2007 12:47:08 +0000 (13:47 +0100)]
via82cxxx/pata_via: correct PCI_DEVICE_ID_VIA_SATA_EIDE ID and add support for CX700 and 8237S
This patch:
* Corrects the wrong device ID of PCI_DEVICE_ID_VIA_SATA_EIDE
from 0x0581 to 0x5324.
* Adds VIA CX700 and VT8237S support in drivers/ide/pci/via82cxxx.c
* Adds VIA VT8237S support in drivers/ata/pata_via.c
Tejun Heo [Sat, 27 Jan 2007 12:47:02 +0000 (13:47 +0100)]
ide: unregister idepnp driver on unload
idepnp driver is registered as a pnp driver on ide init but doesn't
get unregistered on ide unload causing driver list corruption and
eventually oops. Fix it.
Alan Cox [Sat, 27 Jan 2007 12:46:45 +0000 (13:46 +0100)]
ide/generic: Jmicron has its own drivers now
Drop ide-generic support for Jmicron identifiers as we now trust Jmicron.c for
this with drivers/ide. The code check remains for the all-generic-ide case.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Tejun Heo [Sat, 27 Jan 2007 02:04:26 +0000 (11:04 +0900)]
ahci: port_no should be used when clearing IRQ in ahci_thaw()
ap->id is logcial port ID which is unique among all ATA ports and
doesn't have anything to do with hardware port index. ap->port_no is
the hardware port index and thus should be used when clearing IRQ mask
in ahci_thaw().
This problem has been spotted by Jeff Garzik <jgarzik@pobox.com>.
Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
When main table is just a single leaf this gets printed as belonging to the
local table in /proc/net/fib_trie. A fix is below.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 27 Jan 2007 02:48:16 +0000 (18:48 -0800)]
[SPARC64]: Set g4/g5 properly in sun4v dtlb-prot handling.
Mirror the logic in the sun4u handler, we have to update
both registers even when we branch out to window fault
fixup handling.
The way it works is that if we are in etrap processing a
fault already, g4/g5 holds the original fault information.
If we take a window spill fault while doing etrap, then
we put the window spill fault info into g4/g5 and this is
what the top-level fault handler ends up processing first.
Then we retry the originally faulting instruction, and
process the original fault at that time.
This is all necessary because of how constrained the trap
registers are in these code paths. These cases trigger
very rarely, so even if there is some performance implication
it's doesn't happen very often. In fact the rarity is why
it took so long to trigger and find this particular bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER]: nf_conntrack_pptp: fix NAT setup of expected GRE connections
[NETFILTER]: nf_nat_pptp: fix expectation removal
[NETFILTER]: nf_nat: fix ICMP translation with statically linked conntrack
[TCP]: Restore SKB socket owner setting in tcp_transmit_skb().
[AF_PACKET]: Check device down state before hard header callbacks.
[DECNET]: Handle a failure in neigh_parms_alloc (take 2)
[BNX2]: Fix 2nd port's MAC address.
[TCP]: Fix sorting of SACK blocks.
[AF_PACKET]: Fix BPF handling.
[IPV4]: Fix the fib trie iterator to work with a single entry routing tables
Linus Torvalds [Fri, 26 Jan 2007 22:45:18 +0000 (14:45 -0800)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
Fix Maple PATA IRQ assignment.
ahci: use 0x80 as wait stat value instead of 0xff
sata_via: style clean up, no indirect method call in LLD
ahci: fix endianness in spurious interrupt message
libata-sff: Don't call bmdma_stop on non DMA capable controllers
libata: implement ATA_FLAG_IGN_SIMPLEX and use it in sata_uli
ahci: improve and limit spurious interrupt messages, take#3
sata_via: don't diddle with ATA_NIEN in ->freeze
libata: set_mode, Fix the FIXME
libata hpt3xn: Hopefully sort out the DPLL logic versus the vendor code
libata cmd64x: whack into a shape that looks like the documentation
David Woodhouse [Mon, 1 Jan 2007 19:31:15 +0000 (19:31 +0000)]
Fix Maple PATA IRQ assignment.
On the Maple board, the AMD8111 IDE is in legacy mode... except that it
appears on IRQ 20 instead of IRQ 15. For drivers/ide this was handled by
the architecture's "pci_get_legacy_ide_irq()" function, but in libata we
just hard-code the numbers 14 and 15.
This patch provides asm-powerpc/libata-portmap.h which maps the IRQ as
appropriate, having added a pci_dev argument to the
ATA_{PRIM,SECOND}ARY_IRQ macros.
There's probably a better way to do this -- especially if we observe
that the _only_ case in which this seemingly-generic
"pci_get_legacy_ide_irq()" function returns anything other than 14 and
15 for primary and secondary respectively is the case of the AMD8111 on
the Maple board -- couldn't we handle that with a special case in the
pata_amd driver, or perhaps with a PCI quirk for Maple to switch it into
native mode during early boot and assign resources properly?
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Fri, 26 Jan 2007 06:37:20 +0000 (15:37 +0900)]
ahci: use 0x80 as wait stat value instead of 0xff
Before hardreset, ahci initialized stat part of received FIS area to
0xff to wait for the first D2H Reg FIS which would change the value to
device ready state. This used to work but now libata considers status
value of 0xff as device not present making this wait prone to failure.
This patch makes ahci use 0x80 for the wait stat value instead of
0xff to fix the above problem.
Matt Domsch [Fri, 26 Jan 2007 08:57:18 +0000 (00:57 -0800)]
[PATCH] Fix race in efi variable delete code
Fix race when deleting an EFI variable and issuing another EFI command on
the same variable. The removal of the variable from the efivars_list
should be done in efivar_delete and not delayed until the kobject release.
Furthermore, remove the item from the list at module unload time, and use
list_for_each_entry_safe() rather than list_for_each_safe() for
readability.
Tested on ia64.
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Fri, 26 Jan 2007 08:57:16 +0000 (00:57 -0800)]
[PATCH] core-dumping unreadable binaries via PT_INTERP
Proposed patch to fix #5 in
http://www.isec.pl/vulnerabilities/isec-0017-binfmt_elf.txt
aka
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2004-1073
To reproduce, do
* grab poc at the end of advisory.
* add line "eph.p_memsz = 4096;" after "eph.p_filesz = 4096;"
where first "4096" is something equal to or greater than 4096.
* ./poc /usr/bin/sudo && ls -l
Here I get with 2.6.20-rc5:
-rw------- 1 ad ad 102400 2007-01-15 19:17 core
---s--x--x 2 root root 101820 2007-01-15 19:15 /usr/bin/sudo
NeilBrown [Fri, 26 Jan 2007 08:57:14 +0000 (00:57 -0800)]
[PATCH] md: remove unnecessary printk when raid5 gets an unaligned read.
raid5_mergeable_bvec tries to ensure that raid5 never sees a read request
that does not fit within just one chunk. However as we must always accept
a single-page read, that is not always possible.
So when "in_chunk_boundary" fails, it might be unusual, but it is not a
problem and printing a message every time is a bad idea.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Fri, 26 Jan 2007 08:57:12 +0000 (00:57 -0800)]
[PATCH] Fix UML on non-standard VM split hosts
This fixes UML on hosts with non-standard VM splits. We had changed the
config variable that controls UML behavior on such hosts, but not
propogated the change everywhere. In particular, the values of STUB_CODE
and STUB_DATA relied on the old variable.
I also reformatted the HOST_VMSPLIT_3G help to make it more standard.
Spotted by uml@flonatel.org.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Blaisorblade <blaisorblade@yahoo.it> Cc: Pravin <shindepravin@gmail.com> Cc: <uml@flonatel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:57:10 +0000 (00:57 -0800)]
[PATCH] knfsd: Fix type mismatch with filldir_t used by nfsd
nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t' except
that the first pointer is 'struct readdir_cd *' rather than 'void *'. It
then casts encode_dent_fn points to 'filldir_t' as needed. This hides any
other type mismatches between the two such as the fact that the 'ino' arg
recently changed from ino_t to u64.
So: get rid of 'encode_dent_fn', get rid of the cast of the function type,
change the first arg of various functions from 'struct readdir_cd *' to
'void *', and live with the fact that we have a little less type checking
on the calling of these functions now. Less internal (to nfsd) checking
offset by more external checking, which is more important.
Thanks to Gabriel Paubert <paubert@iram.es> for discovering this and
providing an initial patch.
Signed-off-by: Gabriel Paubert <paubert@iram.es> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert P. J. Day [Fri, 26 Jan 2007 08:57:09 +0000 (00:57 -0800)]
[PATCH] fix various kernel-doc in header files
Fix a number of kernel-doc entries for header files in include/linux by
making sure they begin with the appropriate '/**' notation and use @var
notation.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jun'ichi Nomura [Fri, 26 Jan 2007 08:57:07 +0000 (00:57 -0800)]
[PATCH] dm-multipath: fix stall on noflush suspend/resume
Allow noflush suspend/resume of device-mapper device only for the case
where the device size is unchanged.
Otherwise, dm-multipath devices can stall when resumed if noflush was used
when suspending them, all paths have failed and queue_if_no_path is set.
Explanation:
1. Something is doing fsync() on the block dev,
holding inode->i_sem
2. The fsync write is blocked by all-paths-down and queue_if_no_path
3. Someone requests to suspend the dm device with noflush.
Pending writes are left in queue.
4. In the middle of dm_resume(), __bind() tries to get
inode->i_sem to do __set_size() and waits forever.
'noflush suspend' is a new device-mapper feature introduced in
early 2.6.20. So I hope the fix being included before 2.6.20 is
released.
Example of reproducer:
1. Create a multipath device by dmsetup
2. Fail all paths during mkfs
3. Do dmsetup suspend --noflush and load new map with healthy paths
4. Do dmsetup resume
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] 9p: null terminate error strings for debug print
We weren't properly NULL terminating protocol error strings for our debug
printk resulting in garbage being included in the output when debug was
enabled.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] 9p: fix segfault caused by race condition in meta-data operations
Running dbench multithreaded exposed a race condition where fid structures
were removed while in use. This patch adds semaphores to meta-data operations
to protect the fid structure. Some cleanup of error-case handling in the
inode operations is also included.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] 9p: update documentation regarding server applications
Update the documentation to cover using Inferno as a server for 9p and to
include information about spfs (a stable single-threaded stand-alone 9p
server).
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9p doesn't handle renames between directories -- however, we were returning
EPERM instead of EXDEV when we detected this case.
Signed-off-by: Eric Van Hensbergren <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[PATCH] 9p: fix bogus return code checks during initialization
There is a simple logic error in init_v9fs - the return code checks are
reversed. This patch fixes the return code and adds some messages to prevent
module initialization from failing silently.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:57:03 +0000 (00:57 -0800)]
[PATCH] md: avoid reading past the end of a bitmap file
In most cases we check the size of the bitmap file before reading data from
it. However when reading the superblock, we always read the first PAGE_SIZE
bytes, which might not always be appropriate. So limit that read to the size
of the file if appropriate.
Also, we get the count of available bytes wrong in one place, so that too can
read past the end of the file.
Cc: "yang yin" <yinyang801120@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:57:02 +0000 (00:57 -0800)]
[PATCH] md: make sure the events count in an md array never returns to zero
Now that we sometimes step the array events count backwards (when
transitioning dirty->clean where nothing else interesting has happened - so
that we don't need to write to spares all the time), it is possible for the
event count to return to zero, which is potentially confusing and triggers and
MD_BUG.
We could possibly remove the MD_BUG, but is just as easy, and probably safer,
to make sure we never return to zero.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:57:01 +0000 (00:57 -0800)]
[PATCH] md: make 'repair' actually work for raid1
When 'repair' finds a block that is different one the various parts of the
mirror. it is meant to write a chosen good version to the others. However it
currently writes out the original data to each. The memcpy to make all the
data the same is missing.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Staubach [Fri, 26 Jan 2007 08:57:00 +0000 (00:57 -0800)]
[PATCH] knfsd: Don't mess with the 'mode' when storing a exclusive-create cookie
NFS V3 (and V4) support exclusive create by passing a 'cookie' which can get
stored with the file. If the file exists but has exactly the right cookie
stored, then we assume this is a retransmit and the exclusive create was
successful.
The cookie is 64bits and is traditionally stored in the mtime and atime
fields. This causes a problem with Solaris7 as negative mtime or atime
confuse it. So we moved two bits into the mode word instead.
But inherited ACLs sometimes overwrite the mode word on create, so this is a
problem.
So we give up and just store 62 of the 64 bits and assume that is close
enough.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:56:59 +0000 (00:56 -0800)]
[PATCH] knfsd: replace some warning ins nfsfh.h with BUG_ON or WARN_ON
A couple of the warnings will be followed by an Oops if they ever fire, so may
as well be BUG_ON. Another isn't obviously fatal but has never been known to
fire, so make it a WARN_ON.
Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Fri, 26 Jan 2007 08:56:59 +0000 (00:56 -0800)]
[PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads
NFSd assumes that largest number of pages that will be needed for a
request+response is 2+N where N pages is the size of the largest permitted
read/write request. The '2' are 1 for the non-data part of the request, and 1
for the non-data part of the reply.
However, when a read request is not page-aligned, and we choose to use
->sendfile to send it directly from the page cache, we may need N+1 pages to
hold the whole reply. This can overflow and array and cause an Oops.
This patch increases size of the array for holding pages by one and makes sure
that entry is NULL when it is not in use.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tilman Schmidt [Fri, 26 Jan 2007 08:56:56 +0000 (00:56 -0800)]
[PATCH] Gigaset ISDN driver error handling fixes
Fix several flaws in the error handling of the Siemens Gigaset ISDN driver,
including one that would cause an Oops when connecting more than one device
of the same type.
Ingo Molnar [Fri, 26 Jan 2007 08:56:55 +0000 (00:56 -0800)]
[PATCH] ACPI: fix cpufreq regression
Recently cpufreq support on my laptop (Lenovo T60) broke completely: when
it's plugged into AC it would never go higher than 1 GHz - neither 1.3 GHz
nor 1.83 GHz is possible - no matter which governor (userspace, speed or
ondemand) is used.
After some cpufreq debugging i tracked the regression back to the following
(totally correct) bug-fix commit:
[PATCH] Correct bound checking from the value returned from _PPC method.
This bugfix, which makes other laptops work, made a previously hidden
(BIOS) bug visible on my laptop.
The bug is the following: if the _PPC (Performance Present Capabilities)
optional ACPI object is queried /after/ bootup then the BIOS reports an
incorrect value of '2'.
My laptop (Lenovo T60) has the following performance states supported:
Per ACPI specification, a _PPC value of '0' means that all 3 performance
states are usable. A _PPC value of '1' means states 1 .. 2 are usable, a
value of '2' means only state '2' (slowest) is usable.
now, the _PPC object is optional, and it also comes with notification.
Furthermore, when a CPU object is initialized, the _PPC object is
initialized as well. So the following evaluation of the _PPC object is
superfluous:
And this is the point where my laptop's BIOS returns the incorrect value of
'2'. Note that it has not sent any notification event, so the value is
probably not really intentional (possibly spurious), and Windows likely
doesnt query it after bootup either. Maybe the value is kept at '2'
normally, and is only set to the real value when a true asynchronous event
(such as AC plug event, battery switch, etc.) occurs.
So i /think/ this is a grey area of the ACPI spec: per the letter of the
spec the _PPC value only changes when notified, so there's no reason to
query it after the system has booted up. So in my opinion the best (and
most compatible) strategy would be to do the change below, and to not
evaluate the _PPC object in the acpi_processor_get_performance_info() call,
but only evaluate it if _PPC is present during CPU object init, or if it's
notified during an asynchronous event. This change is more permissive than
the previous logic, so it definitely shouldnt break any existing system.
This also happens to fix my laptop, which is merrily chugging along at
1.83 GHz now. Yay!
Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Jones <davej@redhat.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atsushi Nemoto [Fri, 26 Jan 2007 08:56:54 +0000 (00:56 -0800)]
[PATCH] SPI: alternative fix for spi_busnum_to_master
If a SPI master device exists, udev (udevtrigger) causes kernel crash, due
to wrong kobj pointer in kobject_uevent_env(). This problem was not in
2.6.19.
The backtrace (on MIPS) was:
[<8024db6c>] kobject_uevent_env+0x54c/0x5e8
[<802a8264>] store_uevent+0x1c/0x3c (in drivers/class.c)
[<801cb14c>] subsys_attr_store+0x2c/0x50
[<801cb80c>] flush_write_buffer+0x38/0x5c
[<801cb900>] sysfs_write_file+0xd0/0x190
[<80181444>] vfs_write+0xc4/0x1a0
[<80181cdc>] sys_write+0x54/0xa0
[<8010dae4>] stack_done+0x20/0x3c
flush_write_buffer() passes kobject of spi_master_class.subsys to
subsys_addr_store(), then subsys_addr_store() passes a pointer to a struct
subsystem to store_uevent() which expects a pointer to a struct
class_device. The problem seems subsys_attr_store() called instead of
class_device_attr_store().
This mismatch was caused by commit 3bd0f6943520e459659d10f3282285e43d3990f1, which overrides kset of master
class. This made spi_master_class.subsys.kset.ktype NULL so
subsys_sysfs_ops is used instead of class_dev_sysfs_ops.
The commit was to fix spi_busnum_to_master(). Here is a patch fixes
this function in other way, just searching children list of
class_device.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:52 +0000 (00:56 -0800)]
[PATCH] x86_64 ia32 vDSO: define arch_vma_name
This patch makes x86_64 define arch_vma_name for CONFIG_IA32_EMULATION. This
makes the ia32 vDSO mapping appear in /proc/PID/maps with "[vdso]" for ia32
processes, as it does on native i386.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:50 +0000 (00:56 -0800)]
[PATCH] x86_64 ia32 vDSO: use VM_ALWAYSDUMP
This patch fixes ia32 core dumps on x86_64 to include just one phdr for the
vDSO vma. Currently it writes a confused format with two phdrs for the
address, one without contents and one with. This patch removes the
special-case core writing macros for the ia32 vDSO. Instead, it uses
VM_ALWAYSDUMP in the vma. This changes core dumps so they no longer include
the non-PT_LOAD phdrs from the vDSO, consistent with fixed native i386 core
dumps.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:49 +0000 (00:56 -0800)]
[PATCH] i386 vDSO: use VM_ALWAYSDUMP
This patch fixes core dumps to include the vDSO vma, which is left out now.
It removes the special-case core writing macros, which were not doing the
right thing for the vDSO vma anyway. Instead, it uses VM_ALWAYSDUMP in the
vma; there is no need for the fixmap page to be installed. It handles the
CONFIG_COMPAT_VDSO case by making elf_core_dump use the fake vma from
get_gate_vma after real vmas in the same way the /proc/PID/maps code does.
This changes core dumps so they no longer include the non-PT_LOAD phdrs from
the vDSO. I made the change to add them in the first place, but in turned out
that nothing ever wanted them there since the advent of NT_AUXV. It's cleaner
to leave them out, and just let the phdrs inside the vDSO image speak for
themselves.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:48 +0000 (00:56 -0800)]
[PATCH] Add VM_ALWAYSDUMP
This patch adds the VM_ALWAYSDUMP flag for vm_flags in vm_area_struct. This
provides a clean explicit way to have a vma always included in core dumps, as
is needed for vDSO's.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:47 +0000 (00:56 -0800)]
[PATCH] Fix gate_vma.vm_flags
This patch fixes the initialization of gate_vma.vm_flags and
gate_vma.vm_page_prot to reflect reality. This makes the "[vdso]" line in
/proc/PID/maps correctly show r-xp instead of ---p, when gate_vma is used
(CONFIG_COMPAT_VDSO on i386).
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Fri, 26 Jan 2007 08:56:46 +0000 (00:56 -0800)]
[PATCH] Fix CONFIG_COMPAT_VDSO
I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely. But if it's there,
it should work properly. Currently it's quite haphazard: both real vma and
fixmap are mapped, both are put in the two different AT_* slots, sysenter
returns to the vma address rather than the fixmap address, and core dumps yet
are another story.
This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
area consistently. This makes it actually compatible with what the old vdso
implementation did.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Justin Clacherty [Fri, 26 Jan 2007 08:56:44 +0000 (00:56 -0800)]
[PATCH] spi: fix error setting the spi mode in pxa2xx_spi.c
Currently the spi mode can be set to the wrong mode if you are switching
from any mode other than mode 0. This is because the mode is set using a
bitwise or on uncleared bits. The following patch clears the mode bits
before setting the new mode. I've also modified it to use the appropriate
defines from pxa-regs.h for readability.
Signed-off-by: Justin Clacherty <justin@redfish-group.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joerg Roedel [Fri, 26 Jan 2007 08:56:42 +0000 (00:56 -0800)]
[PATCH] KVM: SVM: Propagate cpu shutdown events to userspace
This patch implements forwarding of SHUTDOWN intercepts from the guest on to
userspace on AMD SVM. A SHUTDOWN event occurs when the guest produces a
triple fault (e.g. on reboot). This also fixes the bug that a guest reboot
actually causes a host reboot under some circumstances.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avi Kivity [Fri, 26 Jan 2007 08:56:41 +0000 (00:56 -0800)]
[PATCH] KVM: MMU: Report nx faults to the guest
With the recent guest page fault change, we perform access checks on our
own instead of relying on the cpu. This means we have to perform the nx
checks as well.
Software like the google toolbar on windows appears to rely on this
somehow.
Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avi Kivity [Fri, 26 Jan 2007 08:56:41 +0000 (00:56 -0800)]
[PATCH] KVM: MMU: Perform access checks in walk_addr()
Check pte permission bits in walk_addr(), instead of scattering the checks all
over the code. This has the following benefits:
1. We no longer set the accessed bit for accessed which fail permission checks.
2. Setting the accessed bit is simplified.
3. Under some circumstances, we used to pretend a page fault was fixed when
it would actually fail the access checks. This caused an unnecessary
vmexit.
4. The error code for guest page faults is now correct.
The fix helps netbsd further along booting, and allows kvm to pass the new mmu
testsuite.
Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Leonard Norrgard [Fri, 26 Jan 2007 08:56:38 +0000 (00:56 -0800)]
[PATCH] KVM: SVM: Fix SVM idt confusion
There's an obvious typo in svm_{get,set}_idt, causing it to access the ldt
instead.
Because these functions are only called for save/load on AMD, the bug does not
impact normal operation. With the fix, save/load works as expected on AMD
hosts.
Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 26 Jan 2007 20:53:20 +0000 (12:53 -0800)]
Write back inode data pages even when the inode itself is locked
In __writeback_single_inode(), when we find a locked inode and we're not
doing a data-integrity sync, we used to just skip writing entirely,
since we didn't want to wait for the inode to unlock.
However, there's really no reason to skip writing the data pages, which
are likely to be the the bulk of the dirty state anyway (and the main
reason why writeback was started for the non-data-integrity case, of
course!)
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Andrew Morton <akpm@osdl.org>, Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hugh@veritas.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>