[PATCH] Remove old node based policy interface from mempolicy.c
mempolicy.c contains provisional interface for huge page allocation based on
node numbers. This is in use in SLES9 but was never used (AFAIK) in upstream
versions of Linux.
Huge page allocations now use zonelists to figure out where to allocate pages.
The use of zonelists allows us to find the closest hugepage which was the
consideration of the NUMA distance for huge page allocations.
Remove the obsolete functions.
Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Acked-by: William Lee Irwin III <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The huge_zonelist() function in the memory policy layer provides an list of
zones ordered by NUMA distance. The hugetlb layer will walk that list looking
for a zone that has available huge pages but is also in the nodeset of the
current cpuset.
This patch does not contain the folding of find_or_alloc_huge_page() that was
controversial in the earlier discussion.
Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Acked-by: William Lee Irwin III <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This was discussed at
http://marc.theaimsgroup.com/?l=linux-kernel&m=113166526217117&w=2
This patch changes the dequeueing to select a huge page near the node
executing instead of always beginning to check for free nodes from node 0.
This will result in a placement of the huge pages near the executing
processor improving performance.
The existing implementation can place the huge pages far away from the
executing processor causing significant degradation of performance. The
search starting from zero also means that the lower zones quickly run out
of memory. Selecting a huge page near the process distributed the huge
pages better.
Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Gibson [Fri, 6 Jan 2006 08:10:44 +0000 (00:10 -0800)]
[PATCH] Hugetlb: Copy on Write support
Implement copy-on-write support for hugetlb mappings so MAP_PRIVATE can be
supported. This helps us to safely use hugetlb pages in many more
applications. The patch makes the following changes. If needed, I also have
it broken out according to the following paragraphs.
1. Add a pair of functions to set/clear write access on huge ptes. The
writable check in make_huge_pte is moved out to the caller for use by COW
later.
2. Hugetlb copy-on-write requires special case handling in the following
situations:
- copy_hugetlb_page_range() - Copied pages must be write protected so
a COW fault will be triggered (if necessary) if those pages are written
to.
- find_or_alloc_huge_page() - Only MAP_SHARED pages are added to the
page cache. MAP_PRIVATE pages still need to be locked however.
3. Provide hugetlb_cow() and calls from hugetlb_fault() and
hugetlb_no_page() which handles the COW fault by making the actual copy.
4. Remove the check in hugetlbfs_file_map() so that MAP_PRIVATE mmaps
will be allowed. Make MAP_HUGETLB exempt from the depricated VM_RESERVED
mapping check.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adam Litke [Fri, 6 Jan 2006 08:10:43 +0000 (00:10 -0800)]
[PATCH] Hugetlb: Reorganize hugetlb_fault to prepare for COW
This patch splits the "no_page()" type activity into its own function,
hugetlb_no_page(). hugetlb_fault() becomes the entry point for hugetlb faults
and delegates to the appropriate handler depending on the type of fault.
Right now we still have only hugetlb_no_page() but a later patch introduces a
COW fault.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adam Litke [Fri, 6 Jan 2006 08:10:42 +0000 (00:10 -0800)]
[PATCH] Hugetlb: Rename find_lock_page to find_or_alloc_huge_page
find_lock_huge_page() isn't a great name, since it does extra things not
analagous to find_lock_page(). Rename it find_or_alloc_huge_page() which is
closer to the mark.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adam Litke [Fri, 6 Jan 2006 08:10:40 +0000 (00:10 -0800)]
[PATCH] Hugetlb: Remove duplicate i_size check
cleanup
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store
Here is the patch to implement madvise(MADV_REMOVE) - which frees up a
given range of pages & its associated backing store. Current
implementation supports only shmfs/tmpfs and other filesystems return
-ENOSYS.
"Some app allocates large tmpfs files, then when some task quits and some
client disconnect, some memory can be released. However the only way to
release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli
Databases want to use this feature to drop a section of their bufferpool
(shared memory segments) - without writing back to disk/swap space.
This feature is also useful for supporting hot-plug memory on UML.
Concerns raised by Andrew Morton:
- "We have no plan for holepunching! If we _do_ have such a plan (or
might in the future) then what would the API look like? I think
sys_holepunch(fd, start, len), so we should start out with that."
- Using madvise is very weird, because people will ask "why do I need to
mmap my file before I can stick a hole in it?"
- None of the other madvise operations call into the filesystem in this
manner. A broad question is: is this capability an MM operation or a
filesytem operation? truncate, for example, is a filesystem operation
which sometimes has MM side-effects. madvise is an mm operation and with
this patch, it gains FS side-effects, only they're really, really
significant ones."
Comments:
- Andrea suggested the fs operation too but then it's more efficient to
have it as a mm operation with fs side effects, because they don't
immediatly know fd and physical offset of the range. It's possible to
fixup in userland and to use the fs operation but it's more expensive,
the vmas are already in the kernel and we can use them.
Short term plan & Future Direction:
- We seem to need this interface only for shmfs/tmpfs files in the short
term. We have to add hooks into the filesystem for correctness and
completeness. This is what this patch does.
- In the future, plan is to support both fs and mmap apis also. This
also involves (other) filesystem specific functions to be implemented.
- Current patch doesn't support VM_NONLINEAR - which can be addressed in
the future.
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrea Arcangeli <andrea@suse.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes truncate_inode_pages_range from truncate_inode_pages.
truncate_inode_pages became a one-liner call to truncate_inode_pages_range.
Reiser4 needs truncate_inode_pages_ranges because it tries to keep
correspondence between existences of metadata pointing to data pages and pages
to which those metadata point to. So, when metadata of certain part of file
is removed from filesystem tree, only pages of corresponding range are to be
truncated.
(Needed by the madvise(MADV_REMOVE) patch)
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andy Whitcroft [Fri, 6 Jan 2006 08:10:35 +0000 (00:10 -0800)]
[PATCH] memhotplug: register_memory should be global
register_memory is global and declared so in linux/memory.h. Update the
HOTPLUG specific definition to match. This fixes a compile warning when
HOTPLUG is enabled.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andy Whitcroft [Fri, 6 Jan 2006 08:10:35 +0000 (00:10 -0800)]
[PATCH] memhotplug: register_ and unregister_memory_notifier should be global
Both register_memory_notifer and unregister_memory_notifier are global and
declared so in linux/memory.h. Update the HOTPLUG specific definitions to
match. This fixes a compile warning when HOTPLUG is enabled.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Two changes to the setting of the ALLOC_CPUSET flag in
mm/page_alloc.c:__alloc_pages()
- A bug fix - the "ignoring mins" case should not be honoring ALLOC_CPUSET.
This case of all cases, since it is handling a request that will free up
more memory than is asked for (exiting tasks, e.g.) should be allowed to
escape cpuset constraints when memory is tight.
- A logic change to make it simpler. Honor cpusets even on GFP_ATOMIC
(!wait) requests. With this, cpuset confinement applies to all requests
except ALLOC_NO_WATERMARKS, so that in a subsequent cleanup patch, I can
remove the ALLOC_CPUSET flag entirely. Since I don't know any real reason
this logic has to be either way, I am choosing the path of the simplest
code.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 6 Jan 2006 08:09:49 +0000 (00:09 -0800)]
[PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.
The hash.h hash_long function, when used on a 64 bit machine, ignores many
of the middle-order bits. (The prime chosen it too bit-sparse).
IP addresses for clients of an NFS server are very likely to differ only in
the low-order bits. As addresses are stored in network-byte-order, these
bits become middle-order bits in a little-endian 64bit 'long', and so do
not contribute to the hash. Thus you can have the situation where all
clients appear on one hash chain.
So, until hash_long is fixed (or maybe forever), us a hash function that
works well on IP addresses - xor the bytes together.
Thanks to "Iozone" <capps@iozone.org> for identifying this problem.
Cc: "Iozone" <capps@iozone.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Herbert Xu [Fri, 6 Jan 2006 08:09:47 +0000 (00:09 -0800)]
[PATCH] nbd: fix TX/RX race condition
Janos Haar of First NetCenter Bt. reported numerous crashes involving the
NBD driver. With his help, this was tracked down to bogus bio vectors
which in turn was the result of a race condition between the
receive/transmit routines in the NBD driver.
The bug manifests itself like this:
CPU0 CPU1
do_nbd_request
add req to queuelist
nbd_send_request
send req head
for each bio
kmap
send
nbd_read_stat
nbd_find_request
nbd_end_request
kunmap
When CPU1 finishes nbd_end_request, the request and all its associated
bio's are freed. So when CPU0 calls kunmap whose argument is derived from
the last bio, it may crash.
Under normal circumstances, the race occurs only on the last bio. However,
if an error is encountered on the remote NBD server (such as an incorrect
magic number in the request), or if there were a bug in the server, it is
possible for the nbd_end_request to occur any time after the request's
addition to the queuelist.
The following patch fixes this problem by making sure that requests are not
added to the queuelist until after they have been completed transmission.
In order for the receiving side to be ready for responses involving
requests still being transmitted, the patch introduces the concept of the
active request.
When a response matches the current active request, its processing is
delayed until after the tranmission has come to a stop.
This has been tested by Janos and it has been successful in curing this
race condition.
From: Herbert Xu <herbert@gondor.apana.org.au>
Here is an updated patch which removes the active_req wait in
nbd_clear_queue and the associated memory barrier.
I've also clarified this in the comment.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: <djani22@dynamicweb.hu> Cc: Paul Clements <Paul.Clements@SteelEye.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Joshua Kwan [Fri, 6 Jan 2006 08:09:45 +0000 (00:09 -0800)]
[PATCH] hfsplus oops fix
nls_utf8 is available, and the check in hfsplus_fill_super checks the wrong
pointer for NULLness (it checks the saved nls, not the new one that it
needs to use.)
Signed-off-by: Joshua Kwan <joshk@triplehelix.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chuck Ebbert [Fri, 6 Jan 2006 04:11:29 +0000 (23:11 -0500)]
[PATCH] i386: PTRACE_POKEUSR: allow changing RF bit in EFLAGS register.
Setting RF (resume flag) allows a debugger to resume execution after a
code breakpoint without tripping the breakpoint again. It is reset by
the CPU after execution of one instruction.
Requested by Stephane Eranian:
"I am trying to the user HW debug registers on i386 and I am running
into a problem with ptrace() not allowing access to EFLAGS_RF for
POKEUSER (see FLAG_MASK). [ ... ] It avoids the need to remove the
breakpoint, single step, and reinstall. The equivalent functionality
exists on IA-64 and is allowed by ptrace()"
Export the stored values instead of re-reading everything in the socket
information sysfs files, and make them accessible to all users, not only
to root.
Vitaly Bordug [Thu, 8 Dec 2005 15:56:12 +0000 (13:56 -0200)]
[PATCH] 8xx PCMCIA: support for MPC885ADS and MPC866ADS
This adds PCMCIA support for both MPC885ADS and MPC866ADS.
This is established not together with FADS, because 885 does not have
io_block_mapping() for BCSR area.
Also, some cleanups done both for 885ADS and MBX.
[PATCH] pcmcia: properly handle static mem, but dynamic io sockets
Some PCMCIA sockets have statically mapped memory windows, but dynamically
mapped IO windows. Using the "nonstatic" socket library is inpractical for
them, as they do neither need a resource database (as we can trust the
kernel resource database on m68k and ppc) nor lots of other features of that
library. Let them get a small "iodyn" socket library (105 lines of code)
instead.
[PATCH] pcmcia: unify attach, EVENT_CARD_INSERTION handlers into one probe callback
Unify the EVENT_CARD_INSERTION and "attach" callbacks to one unified
probe() callback. As all in-kernel drivers are changed to this new
callback, there will be no temporary backwards-compatibility. Inside a
probe() function, each driver _must_ set struct pcmcia_device
*p_dev->instance and instance->handle correctly.
With these patches, the basic driver interface for 16-bit PCMCIA drivers
now has the classic four callbacks known also from other buses:
int (*probe) (struct pcmcia_device *dev);
void (*remove) (struct pcmcia_device *dev);
int (*suspend) (struct pcmcia_device *dev);
int (*resume) (struct pcmcia_device *dev);
Move the suspend and resume methods out of the event handler, and into
special functions. Also use these functions for pre- and post-reset, as
almost all drivers already do, and the remaining ones can easily be
converted.
Bugfix to include/pcmcia/ds.c Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Daniel Ritz [Thu, 3 Nov 2005 20:12:14 +0000 (21:12 +0100)]
[PATCH] yenta: make bridge specific init code configurable
Make the bridge specific initialization code config options depending on
CONFIG_EMBEDDED. Config options for TI/EnE, Toshiba, Ricoh and O2Micro are
available. Disabling all of the specific tweaks cuts off more than half
of yenta_socket.ko.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Add a return value to pcmcia_validate_mem. Only if we have enough memory
available to map the CIS, we should proceed in trying to determine information
about the device.
Tony Luck [Thu, 5 Jan 2006 21:30:52 +0000 (13:30 -0800)]
[IA64] Fix compile warnings in setup.c
arch/ia64/kernel/setup.c: In function `show_cpuinfo':
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 12)
arch/ia64/kernel/setup.c:576: warning: long unsigned int format, different type arg (arg 13)
Luis F. Ortiz [Thu, 5 Jan 2006 21:12:41 +0000 (13:12 -0800)]
[ATYFB]: Fix onboard video on SPARC Blade 100 for 2.6.{13,14,15}
I have recently been switching from using 2.4.32 on my trusty
old Sparc Blade 100 to using 2.6.15 . Some of the problems I ran into
were distorted video when the console was active (missing first
character, skipped dots) and when running X windows (colored snow,
stripes, missing pixels). A quick examination of the 2.6 versus 2.4
source for the ATY driver revealed alot of changes.
A closer look at the code/data for the 64GR/XL chip revealed
two minor "typos" that the rewriter(s) of the code made. The first is
a incorrect clock value (230 .vs. 235) and the second is a missing
flag (M64F_SDRAM_MAGIC_PLL). Making both these changes seems to have
fixed my problem. I tend to think the 235 value is the correct one,
as there is a 29.4 Mhz clock crystal close to the video chip and 235.2
(29.4*8) is too close to 235 to make it a coincidence.
The flag for M64F_SDRAM_MAGIC_PLL was dropped during the
changes made by adaplas in file revision 1.72 on the old bitkeeper
repository.
The change relating to the clock rate has been there forever,
at least in the 2.6 tree. I'm not sure where to look for the old 2.5
tree or if anyone cares when it happened.
On SPARC Blades 100's, which use the ATY MACH64GR video chipset, the
clock crystal frequency is 235.2 Mhz, not 230 Mhz. The chipset also
requires the use of M64F_SDRAM_MAGIC_PLL in order to setup the PLL
properly for the DRAM.
Signed-off-by: Luis F. Ortiz <lfo@Polyad.Org> Signed-off-by: David S. Miller <davem@davemloft.net>
CC [M] net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c: In function 'ipv4_refrag':
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:198: error: dereferencing pointer to incomplete type
make[3]: *** [net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o] Error 1
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 5 Jan 2006 20:21:16 +0000 (12:21 -0800)]
[NETFILTER]: make ipv6_find_hdr() find transport protocol header
The original ipv6_find_hdr() finds the specified header in IPv6 packets.
This makes it possible to get transport header so that we can kill similar
loop in ip6_match_packet().
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Thu, 5 Jan 2006 20:20:59 +0000 (12:20 -0800)]
[NETFILTER]: Call POST_ROUTING hook before fragmentation
Call POST_ROUTING hook before fragmentation to get rid of the okfn use
in ip_refrag and save the useless fragmentation/defragmentation step
when NAT is used.
The patch introduces one user-visible change, the POSTROUTING chain
in the mangle table gets entire packets, not fragments, which should
simplify use of the MARK and CLASSIFY targets for queueing as a nice
side-effect.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: Filter dumped entries based on the layer 3 protocol number
Dump entries of a given Layer 3 protocol number.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup: Use 'else if' instead of a ugly 'goto' statement.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: use u_int32_t instead of unsigned int
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: propagate ctnetlink_dump_tuples_proto return value back
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: Add sanity checkings for ICMP
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: remove bogus checks in ICMP protocol at dumping
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Thu, 5 Jan 2006 20:16:16 +0000 (12:16 -0800)]
[NETFILTER]: Decrease number of pointer derefs in nf_conntrack_core.c
Benefits of the patch:
- Fewer pointer dereferences should make the code slightly faster.
- Size of generated code is smaller
- improved readability
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Thu, 5 Jan 2006 20:15:58 +0000 (12:15 -0800)]
[NETFILTER]: Decrease number of pointer derefs in nfnetlink_queue.c
Benefits of the patch:
- Fewer pointer dereferences should make the code slightly faster.
- Size of generated code is smaller
- improved readability
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Thu, 5 Jan 2006 20:14:43 +0000 (12:14 -0800)]
[IPVS]: Fix compilation
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 5 Jan 2006 00:20:40 +0000 (16:20 -0800)]
Relax the rw_verify_area() error checking.
In particular, allow over-large read- or write-requests to be downgraded
to a more reasonable range, rather than considering them outright errors.
We want to protect lower layers from (the sadly all too common) overflow
conditions, but prefer to do so by chopping the requests up, rather than
just refusing them outright.
Cc: Peter Anvin <hpa@zytor.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kay Sievers [Mon, 19 Dec 2005 00:42:56 +0000 (01:42 +0100)]
[PATCH] net: swich device attribute creation to default attrs
Recent udev versions don't longer cover bad sysfs timing with built-in
logic. Explicit rules are required to do that. For net devices, the
following is needed:
ACTION=="add", SUBSYSTEM=="net", WAIT_FOR_SYSFS="address"
to handle access to net device properties from an event handler without
races.
This patch changes the main net attributes to be created by the driver
core, which is done _before_ the event is sent out and will not require
the stat() loop of the WAIT_FOR_SYSFS key.
Signed-off-by: Kay Sievers <kay.sievers@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] Driver core: Make block devices create the proper symlink name
Block devices need to add the block device name to the symlink they put
in the device directory, otherwise multiple symlinks of the same name
can be created. This matches the class system, which works the same
way, we just forgot to convert block at the same time.
Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] Driver core: only all userspace bind/unbind if CONFIG_HOTPLUG is enabled
Thanks to drivers making their id tables __devinit, we can't allow
userspace to bind or unbind drivers from devices manually through sysfs.
So we only allow this if CONFIG_HOTPLUG is enabled.
Rusty Russell [Sat, 10 Dec 2005 11:48:20 +0000 (22:48 +1100)]
[PATCH] Input: fix add modalias support build error
Fix build when scripts/mod/file2alias.c includes linux/input.h, which
tries to include /usr/include/linux/mod_devicetable.h:
In file included from scripts/mod/file2alias.c:40:
include/linux/input.h:21:35: linux/mod_devicetable.h: No such file or directory
make[2]: *** [scripts/mod/file2alias.o] Error 1
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rusty Russell [Wed, 7 Dec 2005 20:40:34 +0000 (21:40 +0100)]
[PATCH] Input: add modalias support
Here's the patch for modalias support for input classes. It uses
comma-separated numbers, and doesn't describe all the potential keys (no
module currently cares, and that would make the strings huge). The
changes to input.h are to move the definitions needed by file2alias
outside __KERNEL__. I chose not to move those definitions to
mod_devicetable.h, because there are so many that it might break compile
of something else in the kernel.
The rest is fairly straightforward.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> CC: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kay Sievers [Mon, 12 Dec 2005 17:03:44 +0000 (18:03 +0100)]
[PATCH] ide: MODALIAS support for autoloading of ide-cd, ide-disk, ...
IDE: MODALIAS support for autoloading of ide-cd, ide-disk, ...
Add MODULE_ALIAS to IDE midlayer modules: ide-disk, ide-cd, ide-floppy and
ide-tape, to autoload these modules depending on the probed media type of
the IDE device.
It is used by udev and replaces the former agent shell script of the hotplug
package, which was required to lookup the media type in the proc filesystem.
Using proc was racy, cause the media file is created after the hotplug event
is sent out.
The module autoloading does not take any effect, until something like the
following udev rule is configured:
SUBSYSTEM=="ide", ACTION=="add", ENV{MODALIAS}=="?*", RUN+="/sbin/modprobe $env{MODALIAS}"
The module ide-scsi will not be autoloaded, cause it requires manual
configuration. It can't be, and never was supported for automatic setup in
the hotplug package. Adding a MODULE_ALIAS to ide-scsi for all supported
media types, would just lead to a default blacklist entry anyway.
$ modinfo ide-disk
filename: /lib/modules/2.6.15-rc4-g1b0997f5/kernel/drivers/ide/ide-disk.ko
description: ATA DISK Driver
alias: ide:*m-disk*
license: GPL
...
It also adds attributes to the IDE device:
$ tree /sys/bus/ide/devices/0.0/
/sys/bus/ide/devices/0.0/
|-- bus -> ../../../../../../../bus/ide
|-- drivename
|-- media
|-- modalias
|-- power
| |-- state
| `-- wakeup
`-- uevent
$ cat /sys/bus/ide/devices/0.0/{modalias,drivename,media}
ide:m-disk
hda
disk
Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
akpm@osdl.org [Wed, 23 Nov 2005 07:36:13 +0000 (23:36 -0800)]
[PATCH] kobject_uevent CONFIG_NET=n fix
lib/lib.a(kobject_uevent.o)(.text+0x25f): In function `kobject_uevent':
: undefined reference to `__alloc_skb'
lib/lib.a(kobject_uevent.o)(.text+0x2a1): In function `kobject_uevent':
: undefined reference to `skb_over_panic'
lib/lib.a(kobject_uevent.o)(.text+0x31d): In function `kobject_uevent':
: undefined reference to `skb_over_panic'
lib/lib.a(kobject_uevent.o)(.text+0x356): In function `kobject_uevent':
: undefined reference to `netlink_broadcast'
lib/lib.a(kobject_uevent.o)(.init.text+0x9): In function `kobject_uevent_init':
: undefined reference to `netlink_kernel_create'
make: *** [.tmp_vmlinux1] Error 1
Netlink is unconditionally enabled if CONFIG_NET, so that's OK.
kobject_uevent.o is compiled even if !CONFIG_HOTPLUG, which is lazy.
Let's compound the sin.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kumar Gala [Mon, 28 Nov 2005 16:15:39 +0000 (10:15 -0600)]
[PATCH] Allow overlapping resources for platform devices
There are cases in which a device's memory mapped registers overlap
with another device's memory mapped registers. On several PowerPC
devices this occurs for the MDIO bus, whose registers tended to overlap
with one of the ethernet controllers.
By switching from request_resource to insert_resource we can register
the MDIO bus as a proper platform device and not hack around how we
handle its memory mapped registers.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Frank Pavlic [Sun, 27 Nov 2005 04:48:40 +0000 (20:48 -0800)]
[PATCH] klist: Fix broken kref counting in find functions
The klist reference counting in the find functions that use
klist_iter_init_node is broken. If the function (for example
driver_find_device) is called with a NULL start object then everything is
fine, the first call to next_device()/klist_next increases the ref-count of
the first node on the list and does nothing for the start object which is
NULL.
If they are called with a valid start object then klist_next will decrement
the ref-count for the start object but nobody has incremented it. Logical
place to fix this would be klist_iter_init_node because the function puts a
reference of the object into the klist_iter struct.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Frank Pavlic <pavlic@de.ibm.com> Cc: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>