this patch makes the pgdir management per-vcpu. The pgdirs pool
is still guest-wide (although it'll probably need to grow when we
are really executing more vcpus), but the pgdidx index is gone,
since it makes no sense anymore. Instead, we use a per-vcpu
index.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lguest uses tasks to control its running behaviour (like sending
breaks, controlling halted state, etc). In a per-vcpu environment,
each vcpu will have its own underlying task. So this patch
makes the infrastructure for that possible
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
emulate_insn() needs to know about current eip, which will be,
in the future, a per-vcpu thing. So in this patch, the function
prototype is modified to receive a vcpu struct
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The switcher needs to be mapped per-vcpu, because different vcpus
will potentially have different page tables (they don't have to,
because threads will share the same).
So our first step is the make the function receive a vcpu struct
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
this patch changes do_hcall() and do_async_hcall() interfaces (and obviously their
callers) to get a vcpu struct. Again, a vcpu services the hypercall, not the whole
guest
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch makes the write() file operation smp aware. Which means, receiving
the vcpu_id value through the offset parameter, and being well aware to which
vcpu we're talking to.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch makes the run_guest() routine use the lg_cpu struct.
This is required since in a smp guest environment, there's no
more the notion of "running the guest", but rather, it is "running the vcpu"
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
this patch initializes the first vcpu in the initialize() routing,
which is responsible for starting the process of putting the guest up.
right now, as much of the fields are still not per-vcpu, it does not
do much.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch makes uses of pread() and pwrite() in lguest launcher
to communicate the vcpu id to the lguest driver. The id is kept in
a thread variable, which means we'll span in the future, vcpus as
threads. But right now, only the infrastructure is out there.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently, lguest module can't be compiled without the PARAVIRT flag being
on. This is a fake dependency, since the module itself shouldn't need any
paravirt override. Reason for that is the reference to pv_info structure
in initial loading tests.
This patch removes it in favour of a more generic error message.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Linus Torvalds [Wed, 30 Jan 2008 08:54:24 +0000 (19:54 +1100)]
Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (118 commits)
NFSv4: Iterate through all nfs_clients when the server recalls a delegation
NFSv4: Deal more correctly with duplicate delegations
NFS: Fix a potential race between umount and nfs_access_cache_shrinker()
NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
nfs: convert NFS_*(inode) helpers to static inline
nfs: obliterate NFS_FLAGS macro
NFS: Address memory leaks in the NFS client mount option parser
nfs4: allow nfsv4 acls on non-regular-files
NFS: Optimise away the sigmask code in aio/dio reads and writes
SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls
SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create()
SUNRPC: Clean up block comment preceding rpcb_getport_sync()
SUNRPC: Use appropriate argument types in rpcb client
SUNRPC: rpcb_getport_sync() should use built-in hostname generator
SUNRPC: Clean up functions that free address_strings array
NFS: NFS version number is unsigned
NLM: Fix a bogus 'return' in nlmclnt_rpc_release
NLM: Introduce an arguments structure for nlmclnt_init()
NLM/NFS: Use cached nlm_host when calling nlmclnt_proc()
NFS: Invoke nlmclnt_init during NFS mount processing
...
Jens Axboe [Tue, 29 Jan 2008 21:25:18 +0000 (22:25 +0100)]
as-iosched: fix double locking bug in as_merged_requests()
If the two requests belong to the same io context, we will attempt
to lock the same lock twice. But swapping contexts is pointless in
that case, so just check for rioc == nioc before doing the double
lock and copy.
Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Trond Myklebust [Thu, 24 Jan 2008 23:14:34 +0000 (18:14 -0500)]
NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
Otherwise, there is a potential deadlock if the last dput() from an NFSv4
close() or other asynchronous operation leads to nfs_clear_inode calling
the synchronous delegreturn.
Chuck Lever [Wed, 16 Jan 2008 21:38:10 +0000 (16:38 -0500)]
NFS: Address memory leaks in the NFS client mount option parser
David Howells noticed that repeating the same mount option twice during an
NFS mount request can result in orphaned memory in certain cases.
Only the client_address and mount_server.hostname strings are initialized
in the mount parsing loop, so those appear to be the only two pointers that
might be written over by repeating a mount option. The strings in the
nfs_server section of the nfs_parsed_mount_data structure are set only once
after the options are parsed, thus these are not susceptible to being
overwritten.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
J. Bruce Fields [Tue, 15 Jan 2008 21:43:19 +0000 (16:43 -0500)]
nfs4: allow nfsv4 acls on non-regular-files
The rfc doesn't give any reason it shouldn't be possible to set an
attribute on a non-regular file. And if the server supports it, then it
shouldn't be up to us to prevent it.
Thanks to Erez for the report and Trond for further analysis.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 15 Jan 2008 19:17:12 +0000 (14:17 -0500)]
NFS: Optimise away the sigmask code in aio/dio reads and writes
There are no interruptible waits for asynchronous RPC tasks, so we don't
need to wrap calls to rpc_run_task() with an
rpc_clnt_sigmask/rpc_clnt_unsigmask pair.
Instead we can wrap the wait_for_completion_interruptible() in
nfs_direct_wait(). This means that we completely optimise away sigmask
setting for the case of non-blocking aio/dio.
Chuck Lever [Mon, 14 Jan 2008 20:11:53 +0000 (15:11 -0500)]
SUNRPC: Use appropriate argument types in rpcb client
Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle
and use "u32" instead of "__u32" for types in definitions that are not
shared with user space.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 14 Jan 2008 20:11:46 +0000 (15:11 -0500)]
SUNRPC: rpcb_getport_sync() should use built-in hostname generator
rpc_create() can already fill in the hostname with a string representation
of the server's IP address, so remove redundant logic in in
rpcb_getport_sync() that does that.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 14 Jan 2008 17:32:20 +0000 (12:32 -0500)]
SUNRPC: Clean up functions that free address_strings array
Clean up: document the rule (kfree) and the exceptions
(RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in
a transport's address_strings array.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 11 Jan 2008 22:09:59 +0000 (17:09 -0500)]
NLM/NFS: Use cached nlm_host when calling nlmclnt_proc()
Now that each NFS mount point caches its own nlm_host structure, it can be
passed to nlmclnt_proc() for each lock request. By pinning an nlm_host for
each mount point, we trade the overhead of looking up or creating a fresh
nlm_host struct during every NLM procedure call for a little extra memory.
We also restrict the nlmclnt_proc symbol to limit the use of this call to
in-tree modules.
Note that nlm_lookup_host() (just removed from the client's per-request
NLM processing) could also trigger an nlm_host garbage collection. Now
client-side nlm_host garbage collection occurs only during NFS mount
processing. Since the NFS client now holds a reference on these nlm_host
structures, they wouldn't have been affected by garbage collection
anyway.
Given that nlm_lookup_host() reorders the global nlm_host chain after
every successful lookup, and that a garbage collection could be triggered
during the call, we've removed a significant amount of per-NLM-request
CPU processing overhead.
Sidebar: there are only a few remaining references to the internals of
NFS inodes in the client-side NLM code. The only references I found are
related to extracting or comparing the inode's file handle via NFS_FH().
One is in nlmclnt_grant(); the other is in nlmclnt_setlockargs().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 11 Jan 2008 22:09:52 +0000 (17:09 -0500)]
NFS: Invoke nlmclnt_init during NFS mount processing
Cache an appropriate nlm_host structure in the NFS client's mount point
metadata for later use.
Note that there is no need to set NFS_MOUNT_NONLM in the error case -- if
nfs_start_lockd() returns a non-zero value, its callers ensure that the
mount request fails outright.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 11 Jan 2008 22:09:44 +0000 (17:09 -0500)]
NLM: Introduce external nlm_host set-up and tear-down functions
We would like to remove the per-lock-operation nlm_lookup_host() call from
nlmclnt_proc().
The new architecture pins an nlm_host structure to each NFS client
superblock that has the "lock" mount option set. The NFS client passes
in the pinned nlm_host structure during each call to nlmclnt_proc(). NFS
client unmount processing "puts" the nlm_host so it can be garbage-
collected later.
This patch introduces externally callable NLM functions that handle
mount-time nlm_host set up and tear-down.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 7 Jan 2008 23:34:48 +0000 (18:34 -0500)]
SUNRPC: fewer conditionals in the format_ip_address routines
Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID. This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.
Tighten up type checking on the address_strings array while we're at it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 20 Dec 2007 19:55:04 +0000 (14:55 -0500)]
NFS: nfs_write_end clean up
Clean up: commit 4899f9c8 added nfs_write_end(), which introduces a
conditional expression that returns an unsigned integer in one arm and
a signed integer in the other.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 20 Dec 2007 19:54:42 +0000 (14:54 -0500)]
NFS: Fix use of copy_to_user() in idmap_pipe_upcall
The idmap_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 20 Dec 2007 19:54:27 +0000 (14:54 -0500)]
SUNRPC: Fix use of copy_to_user() in gss_pipe_upcall()
The gss_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.
Can rpc_pipefs actually retry a partially completed upcall read? If
not, then gss_pipe_upcall() should punt any partial read, just like the
upcall logic in net/sunrpc/cache.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 3 Jan 2008 21:29:06 +0000 (16:29 -0500)]
NFS: Fix the 'proto=' mount option
Currently, if you have a server mounted using networking protocol, you
cannot specify a different value using the 'proto=' option on another
mountpoint.
Trond Myklebust [Thu, 20 Dec 2007 21:03:55 +0000 (16:03 -0500)]
SUNRPC: Add support for per-client timeout values
In order to be able to support setting the timeo and retrans parameters on
a per-mountpoint basis, we move the rpc_timeout structure into the
rpc_clnt.
Chuck Lever [Mon, 10 Dec 2007 19:59:28 +0000 (14:59 -0500)]
NFS: Support non-IPv4 addresses in nfs_parsed_mount_data
Replace the nfs_server and mount_server address fields in the
nfs_parsed_mount_data structure with a "struct sockaddr_storage"
instead of a "struct sockaddr_in".
Chuck Lever [Mon, 10 Dec 2007 19:59:21 +0000 (14:59 -0500)]
NFS: Refactor mount option address parsing into separate function
Refactor the logic to parse incoming text-based IP addresses. Use the
in4_pton() function instead of the older in_aton(), following the lead
of the in-kernel CIFS client.
Later we'll add IPv6 address parsing using the matching in6_pton()
function. For now we can't allow IPv6 address parsing: we must expand
the size of the address storage fields in the nfs_parsed_mount_options
struct before we can parse and store IPv6 addresses.
Chuck Lever [Mon, 10 Dec 2007 19:59:13 +0000 (14:59 -0500)]
NFS: Remove the NIPQUAD from nfs_try_mount
In the name of address family compatibility, we can't have the NIP_FMT and
NIPQUAD macros in nfs_try_mount(). Instead, we can make use of an unused
mount option to display the mount server's hostname.
Chuck Lever [Mon, 10 Dec 2007 19:59:06 +0000 (14:59 -0500)]
NFS: Adjust nfs_clone_mount structure to store "struct sockaddr *"
Change the addr field in the nfs_clone_mount structure to store a "struct
sockaddr *" to support non-IPv4 addresses in the NFS client.
Note this is mostly a cosmetic change, and does not actually allow
referrals using IPv6 addresses. The existing referral code assumes that
the server returns a string that represents an IPv4 address. This code
needs to support hostnames and IPv6 addresses as well as IPv4 addresses,
thus it will need to be reorganized completely (to handle DNS resolution
in user space).
Chuck Lever [Mon, 10 Dec 2007 19:58:59 +0000 (14:58 -0500)]
NFS: Change nfs4_set_client() to accept struct sockaddr *
Adjust the arguments and callers of nfs4_set_client() to pass a "struct
sockaddr *" instead of a "struct sockaddr_in *" to support non-IPv4
addresses in the NFS client.
Chuck Lever [Mon, 10 Dec 2007 19:58:44 +0000 (14:58 -0500)]
NFS: Change nfs_find_client() to take "struct sockaddr *"
Adjust arguments and callers of nfs_find_client() to pass a
"struct sockaddr *" instead of "struct sockaddr_in *" to support non-IPv4
addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Trond: Also fix up protocol version number argument in nfs_find_client() to
use the correct u32 type.
Chuck Lever [Mon, 10 Dec 2007 19:58:15 +0000 (14:58 -0500)]
NFS: Expand server address storage in nfs_client struct
Prepare for managing larger addresses in the NFS client by widening the
nfs_client struct's cl_addr field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
(Modified to work with the new parameters for nfs_alloc_client) Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 10 Dec 2007 19:57:53 +0000 (14:57 -0500)]
NFS: Make setting a port number agostic
We'll need to set the port number of an AF_INET or AF_INET6 address in
several places in fs/nfs/super.c, so introduce a helper that can manage
this for us. We put this helper to immediate use.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 10 Dec 2007 19:57:38 +0000 (14:57 -0500)]
NFS: Add support for AF_INET6 addresses in nfs_compare_super()
Refactor nfs_compare_super() and add AF_INET6 support.
Replace the generic memcmp() to document explicitly what parts of the
addresses must match in this check, and make the comparison independent
of the lengths of both addresses.
A side benefit is both tests are more computationally efficient than a
memcmp().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 10 Dec 2007 19:57:16 +0000 (14:57 -0500)]
NFS: eliminate NIPQUAD(clp->cl_addr.sin_addr)
To ensure the NFS client displays IPv6 addresses properly, replace
address family-specific NIPQUAD() invocations with a call to the RPC
client to get a formatted string representing the remote peer's
address.
Chuck Lever [Mon, 10 Dec 2007 19:57:09 +0000 (14:57 -0500)]
NFS: Enable NFS client to generate CLIENTID strings with IPv6 addresses
We recently added methods to RPC transports that provide string versions of
the remote peer address information. Convert the NFSv4 SETCLIENTID
procedure to use those methods instead of building the client ID out of
whole cloth.
Chuck Lever [Mon, 10 Dec 2007 19:56:54 +0000 (14:56 -0500)]
NFS: Ensure NFSv4 SETCLIENTID send buffer is large enough
Ensure that the RPC buffer size specified for NFSv4 SETCLIENTID procedures
matches what we are encoding into the buffer. See the definition of
struct nfs4_setclientid {} and the encode_setclientid() function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 10 Dec 2007 19:56:46 +0000 (14:56 -0500)]
SUNRPC: Move universal address definitions to global header
Universal addresses are defined in RFC 1833 and clarified in RFC 3530. We
need to use them in several places in the NFS and RPC clients, so move the
relevant definition and block comment to an appropriate global include
file.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 10 Dec 2007 19:56:24 +0000 (14:56 -0500)]
SUNRPC: rpc_create() default hostname should support AF_INET6 addresses
If the ULP doesn't pass a hostname string to rpc_create(), it manufactures
one based on the passed-in address. Be smart enough to handle an AF_INET6
address properly in this case.
Move the default servername logic before the xprt_create_transport() call
to simplify error handling in rpc_create().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 26 Oct 2007 17:32:45 +0000 (13:32 -0400)]
NFS: Clean up address comparison in __nfs_find_client()
The address comparison in the __nfs_find_client() function is deceptive.
It uses a memcmp() to check a pair of u32 fields for equality. Not only is
this inefficient, but usually memcmp() is used for comparing two *whole*
sockaddr_in's (which includes comparisons of the address family and port
number), so it's easy to mistake the comparison here for a whole sockaddr
comparison, which it isn't.
So for clarity and efficiency, we replace the memcmp() with a simple test
for equality between the two s_addr fields. This should have no
behavioral effect.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 26 Oct 2007 17:32:40 +0000 (13:32 -0400)]
NFS: Clean up: copy hostname with kstrndup during mount processing
Clean up: mount option parsing uses kstrndup in several places, rather than
using kzalloc. Replace the few remaining uses of kzalloc with kstrndup,
for consistency.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 26 Oct 2007 17:32:29 +0000 (13:32 -0400)]
NFS: Remove support for the 'mountprog' option
Remove the mount option that allows users to specify an alternate mountd
program number. The client hasn't support setting an alternate mountd
program number for a very long time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 26 Oct 2007 17:32:24 +0000 (13:32 -0400)]
NFS: Remove support for the 'nfsprog' option
Remove the mount option that allows users to specify an alternate NFS
program number. The client hasn't support setting an alternate NFS
program number for a very long time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>