Trond Myklebust [Mon, 1 Oct 2007 17:46:53 +0000 (13:46 -0400)]
NFS: don't cache the verifier across ->lookup() calls
If the ->lookup() call causes the directory verifier to change, then there
is still no need to use the old verifier, since our dentry has been
verified.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Oct 2007 14:00:23 +0000 (10:00 -0400)]
NFS: nfs_mark_for_revalidate don't update cache_change_attribute
Just let the subsequent inode revalidation do the update...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Oct 2007 13:59:15 +0000 (09:59 -0400)]
NFS: nfs_post_op_update_inode don't update cache_change_attribute
If nfs_post_op_update_inode fails because the server didn't return any
attributes, then we let the subsequent inode revalidation update
cache_change_attribute.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Oct 2007 13:56:59 +0000 (09:56 -0400)]
NFS: Don't revalidate dentries on directory size or ctime changes
We only need to look at the mtime changes...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 30 Sep 2007 19:31:19 +0000 (15:31 -0400)]
NFS: Don't set cache_change_attribute in nfs_revalidate_mapping
The attribute revalidation code will already have taken care of resetting
nfsi->cache_change_attribute.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 3 Oct 2007 19:58:38 +0000 (15:58 -0400)]
NFS: Fix a bug in nfs_open_revalidate()
We want to set the verifier when the call to nfs4_open_revalidate()
_succeeds_.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 2 Oct 2007 22:38:53 +0000 (18:38 -0400)]
NFS: Don't hash the negative dentry when optimising for an O_EXCL open
We don't want to leave an unverified hashed negative dentry if the
exclusive create fails to complete.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 2 Oct 2007 01:51:38 +0000 (21:51 -0400)]
NFS: nfs_instantiate() should set the dentry verifier
That will also allow us to remove the calls in mknod and mkdir.
In addition it will ensure that symlinks set it correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 29 Sep 2007 21:41:33 +0000 (17:41 -0400)]
NFS: Ensure nfs_instantiate() invalidates the parent dir on error
Also ensure that it drops the dentry in this case.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 29 Sep 2007 21:15:01 +0000 (17:15 -0400)]
NFS: Fix nfs_verify_change_attribute()
We don't care about whether or not some other process on our client is
changing the directory while we're in nfs_lookup_revalidate(), because the
dcache will take care of ensuring local atomicity.
We can therefore remove the test for nfs_caches_unstable().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 29 Sep 2007 21:14:03 +0000 (17:14 -0400)]
NFS: Fix the sign of the return value of nfs_save_change_attribute()
Also fix up the comments.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 30 Sep 2007 19:21:24 +0000 (15:21 -0400)]
NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 30 Sep 2007 19:13:17 +0000 (15:13 -0400)]
NFS: Remove bogus check of cache_change_attribute in nfs_update_inode
Remove the bogus 'data_stable' check in nfs_update_inode. The
cache_change_attribute tells you if the directory changed on the server,
and should have nothing to do with the file length.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Sep 2007 23:11:33 +0000 (19:11 -0400)]
NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode()
For one thing, the test NFS_ATTRTIMEO() == 0 makes no sense: we're
testing whether or not the cache timeout length is zero, which is totally
unrelated to the issue of whether or not we trust the file staleness.
Secondly, we do not want to retry the GETATTR once a file has been declared
stale by the server: we rather want to discard that inode as soon as
possible, since there are broken servers still in use out there that reuse
filehandles on new files.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Sep 2007 21:20:07 +0000 (17:20 -0400)]
NFS: Fix atime revalidation in read()
NFSv3 will correctly update atime on a read() call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Sep 2007 21:11:45 +0000 (17:11 -0400)]
NFS: Fix atime revalidation in readdir()
NFSv3 will correctly update atime on a readdir call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 30 Sep 2007 22:01:13 +0000 (18:01 -0400)]
NFS: Don't use readdirplus data if the page cache is invalid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 27 Sep 2007 19:57:24 +0000 (15:57 -0400)]
NFSv4: Don't use ctime/mtime for determining when to invalidate the caches
In NFSv4 we should only be looking at the change attribute.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 27 Sep 2007 14:07:31 +0000 (10:07 -0400)]
NFS: Don't force a dcache revalidation if nfs_wcc_update_inode succeeds
The reason is that if the weak cache consistency update was successful,
then we know that our client must be the only one that changed the
directory, and we've already updated the dcache to reflect the change.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 30 Sep 2007 21:03:25 +0000 (17:03 -0400)]
NFS: nfs_wcc_update_inode: directory caches are always invalidated
We must ensure that the readdir data is always invalidated whether or not
the weak cache consistency data update succeeds.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Sep 2007 18:20:33 +0000 (14:20 -0400)]
NFS: Fix dcache revalidation bugs
We don't need to force a dentry lookup just because we're making changes to
the directory.
Don't update nfsi->cache_change_attribute in nfs_end_data_update: that
overrides the NFSv3/v4 weak consistency checking that tells us our update
was the only one, and that tells us the dcache is still valid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 28 Sep 2007 18:20:12 +0000 (14:20 -0400)]
NFS: fix nfs_verify_change_attribute
We always want to check that the verifier and directory
cache_change_attribute match. This also allows us to remove the 'wraparound
hack' for the cache_change_attribute. If we're only checking for equality,
then we don't care about wraparound issues.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
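To illustrate why an equality check sidesteps wraparound, here is a minimal
sketch in C. The function names are hypothetical and are not the kernel's;
verf stands in for the verifier saved in the dentry and dir_attr for the
directory's cache_change_attribute.

```c
#include <stdbool.h>

/* Hypothetical ordering test: "has the directory counter moved past the
 * saved verifier?" This goes wrong once the counter wraps around. */
bool changed_by_ordering(unsigned long verf, unsigned long dir_attr)
{
	return dir_attr > verf;
}

/* Hypothetical equality test: any difference means the directory changed,
 * so a wrapped counter needs no special handling. */
bool changed_by_equality(unsigned long verf, unsigned long dir_attr)
{
	return dir_attr != verf;
}
```

With a verifier saved at the maximum counter value and a directory counter
that has since wrapped to 0, the ordering test reports "unchanged" while the
equality test reports "changed".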
Trond Myklebust [Wed, 15 Aug 2007 16:59:12 +0000 (12:59 -0400)]
NFS: nfs_post_op_update_inode() should call nfs_refresh_inode()
Ensure that we don't clobber the results from a more recent getattr call...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 15 Aug 2007 16:49:17 +0000 (12:49 -0400)]
NFS: Fix over-conservative attribute invalidation in nfs_update_inode()
We should always be declaring the attribute cache as valid after having
updated it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 10 Aug 2007 21:45:11 +0000 (17:45 -0400)]
NFSv4: Make NFSv4 ACCESS calls return attributes too...
It doesn't really make sense to cache an access call without also
revalidating the attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 10 Aug 2007 21:45:10 +0000 (17:45 -0400)]
NFSv4: Simplify _nfs4_do_access()
Currently, _nfs4_do_access() is just a copy of nfs_do_access() with added
conversion of the open flags into an access mask. This patch merges the
duplicate functionality.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 10 Aug 2007 21:44:32 +0000 (17:44 -0400)]
NFS: Replace file->private_data with calls to nfs_file_open_context()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 10 Aug 2007 21:44:28 +0000 (17:44 -0400)]
NFS: Add a helper to extract the nfs_open_context from a struct file
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:40:16 +0000 (15:40 -0400)]
NFS: Eliminate nfs_refresh_verifier()
nfs_set_verifier() and nfs_refresh_verifier() do exactly the same thing, so
replace one with the other.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:40:11 +0000 (15:40 -0400)]
NFS: Eliminate nfs_renew_times()
The nfs_renew_times() function plants the current time in jiffies in
dentry->d_time. But a call to nfs_renew_times() is always followed by
another call that overwrites dentry->d_time. Get rid of the
nfs_renew_times() calls.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:40:06 +0000 (15:40 -0400)]
NFS: Don't call nfs_renew_times() in nfs_dentry_iput()
Negative dentries need to be reverified after an asynchronous unlink.
Quoth Trond:
"Unfortunately I don't think that we can avoid revalidating the
resulting negative dentry since the UNLINK call is asynchronous,
and so the new verifier on the directory will only be known a
posteriori."
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:40:00 +0000 (15:40 -0400)]
SUNRPC: Fix bytes-per-op accounting for RPC over UDP
NFS performance metrics reported zero bytes sent per op when mounting with
UDP. The UDP socket transport wasn't properly counting the number of bytes
sent.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:39:55 +0000 (15:39 -0400)]
NFS: Show "nointr" mount option
The default "intr" setting is different for NFS and NFSv4. To avoid
confusion on this issue, don't hide the "nointr" option in /proc/mounts.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 24 Sep 2007 19:39:50 +0000 (15:39 -0400)]
NFS: Verify server address before invoking in-kernel mount client
Re-order mount option sanity checking slightly to ensure we have a valid
server address *before* trying to do the mountd RPC call.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Thu, 20 Sep 2007 21:37:58 +0000 (17:37 -0400)]
SUNRPC: Add RDMA dependency to SUNRPC_XPRT_RDMA
Add a dependency on RDMA before enabling SUNRPC_XPRT_RDMA
Yes, "INFINIBAND" also turns on iWARP and other RDMA support.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:51:18 +0000 (13:51 -0400)]
RPCRDMA: rpc rdma verbs interface implementation
This implements the interface from rpcrdma to the RDMA verbs interface
supported by InfiniBand and iWARP.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:50:42 +0000 (13:50 -0400)]
RPCRDMA: rpc rdma protocol implementation
This implements the marshaling and unmarshaling of the rpcrdma transport
headers. Connection management is also addressed.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:50:12 +0000 (13:50 -0400)]
RPCRDMA: rpc rdma transport switch
This implements the configuration and building of the core transport
switch implementation of the rpcrdma transport. Stubs are provided for
the rpcrdma protocol handling, and the infiniband/iwarp verbs interface.
These are provided in following patches.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:49:41 +0000 (13:49 -0400)]
NFS: support RDMA mounts
Adds hooks to the string-based NFS mount to support an "rdma" protocol option.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:49:15 +0000 (13:49 -0400)]
RPCRDMA: Kconfig and header file with rpcrdma protocol definitions
This file implements the configuration target, protocol template and
constants for the rpcrdma transport framing, for use by the xprtrdma
rpc transport implementation.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:48:47 +0000 (13:48 -0400)]
NFS - print accurate transport protocol
Use the per-transport strings to display the transport protocol accurately.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:48:23 +0000 (13:48 -0400)]
NFS/SUNRPC: use transport protocol naming
Instead of an { address family, raw IP protocol number }-tuple, use the
newly-defined RPC identifier when creating clients in the upper layers.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:47:57 +0000 (13:47 -0400)]
NFS/SUNRPC: support transport protocol naming
To prepare for including non-sockets-based RPC transports, select
RPC transports by an identifier (to be used in following patches).
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:47:31 +0000 (13:47 -0400)]
SUNRPC: rearrange RPC sockets definitions
To prepare for including non-sockets-based RPC transports, move the
sockets-dependent definitions into their own file.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:47:07 +0000 (13:47 -0400)]
SUNRPC: rename the rpc_xprtsock_create structure
To prepare for including non-sockets-based RPC transports, change the
overly suggestive name of the transport creation arguments struct.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:46:39 +0000 (13:46 -0400)]
SUNRPC: Finish API to load RPC transport implementations dynamically
Allow RPC client transport implementations to be loaded as needed, or
as they become available from distributors or third-party vendors.
Note that we leave the IP sockets implementation in sunrpc.o
permanently, as IP functionality is always available in any
kernel that runs NFS.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:46:00 +0000 (13:46 -0400)]
SUNRPC: Provide a new API for registering transport implementations
To allow transport capabilities to be loaded dynamically, provide an API
for registering and unregistering the transports with the RPC client.
Eventually xprt_create_transport() will be changed to search the list of
registered transports when initializing a fresh transport.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
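A registration API of the kind described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: every name
here (example_xprt_class, example_register, example_unregister, example_find)
is hypothetical, the list is a plain singly linked list, and the locking a
real kernel implementation would need is left out.

```c
#include <stddef.h>

/* Hypothetical descriptor for one transport implementation. */
struct example_xprt_class {
	int ident;                        /* identifier used to select it */
	const char *name;                 /* human-readable name */
	struct example_xprt_class *next;  /* registry linkage */
};

/* Head of the (unlocked, illustrative) registry list. */
static struct example_xprt_class *example_transports;

/* Add a transport; refuse a second registration of the same identifier. */
int example_register(struct example_xprt_class *t)
{
	for (struct example_xprt_class *p = example_transports; p; p = p->next)
		if (p->ident == t->ident)
			return -1;
	t->next = example_transports;
	example_transports = t;
	return 0;
}

/* Remove a transport; return -1 if it was never registered. */
int example_unregister(struct example_xprt_class *t)
{
	struct example_xprt_class **pp;

	for (pp = &example_transports; *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Search the registered transports by identifier. */
struct example_xprt_class *example_find(int ident)
{
	for (struct example_xprt_class *p = example_transports; p; p = p->next)
		if (p->ident == ident)
			return p;
	return NULL;
}
```

A transport creation routine could then call something like example_find()
to locate the implementation for a requested identifier before initializing
a fresh transport.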
"Talpey, Thomas" [Mon, 10 Sep 2007 17:45:36 +0000 (13:45 -0400)]
SUNRPC: add EXPORT_SYMBOL_GPL for generic transport functions
As a preface to allowing arbitrary transport modules to be loaded
dynamically, add EXPORT_SYMBOL_GPL for all generic transport functions
that a transport implementation might want to use.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:44:58 +0000 (13:44 -0400)]
SUNRPC: mark bulk read/write data in xdrbuf
Adds a flag word to the xdrbuf struct which indicates any bulk
disposition of the data. This enables RPC transport providers to
marshal it efficiently/appropriately, and may enable other
optimizations.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 21 Sep 2007 00:23:51 +0000 (20:23 -0400)]
NFSv4: Fix a bug in nfs4_validate_mount_data()
The previous patch introduced a bug when copying the server address.
Also clarify a copy into the auth_flavors[] array: currently the two
size calculations are equivalent, but we may decide to change the size
of auth_flavors[] at some point.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:44:33 +0000 (13:44 -0400)]
NFS: use in-kernel mount argument structure for nfsv4 mounts
The user-visible nfs4_mount_data does not contain sufficient data to
describe new mount options, and also is now a legacy structure. Replace
it with the internal nfs_parsed_mount_data for nfsv4 in-kernel use.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:43:56 +0000 (13:43 -0400)]
NFS: use in-kernel mount argument structure for nfsv[23] mounts
The user-visible nfs_mount_data does not contain sufficient data to
describe new mount options, and also is now a legacy structure. Replace
it with the internal nfs_parsed_mount_data for nfsv[23] in-kernel use.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:43:29 +0000 (13:43 -0400)]
NFS: move nfs_parsed_mount_data structure definition
In preparation for rearranging the nfs mount argument passing, make the
nfs_parsed_mount_data struct visible across nfs kernel files.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:43:05 +0000 (13:43 -0400)]
SUNRPC: export per-transport rpcbind netid's
The rpcbind (v3+) netid is provided by each RPC client transport. This fixes
an omission in IPv6 rpcbind client support, and enables future extension.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
"Talpey, Thomas" [Mon, 10 Sep 2007 17:42:38 +0000 (13:42 -0400)]
SUNRPC: move per-transport rpcbind netid's
Move the TCP/UDP rpcbind netid's from the rpcbind client to a global header.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:01:20 +0000 (18:01 -0400)]
NFSD: Convert printk's to dprintk's in NFSD's nfs4xdr
Due to recent edict to remove or replace printk's that can flood the system
log.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:01:15 +0000 (18:01 -0400)]
LOCKD: Convert printk's to dprintk's in lockd XDR routines
Due to recent edict to remove or replace printk's that might flood the
system log.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:01:10 +0000 (18:01 -0400)]
NFS: Convert printk's to dprintk's in fs/nfs/nfs?xdr.c
Due to recent edict to replace or remove printk's that can be triggered en
masse by remote misbehavior. Left a few that only occur just before a BUG.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:01:04 +0000 (18:01 -0400)]
NFS: Add new 'mountaddr=' mount option
I got the 'mounthost=' option wrong - it shouldn't look for an address
value, but rather a hostname value. However, the in-kernel mount client
and NFS client cannot resolve a hostname by themselves; they rely on
user-land to pass in the resolved address.
Create a new mount option that does take an address so that the mount
program's address can be passed in. The mount hostname is now ignored
by the kernel.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
James Lentini [Mon, 24 Sep 2007 21:32:49 +0000 (17:32 -0400)]
[NFS] [PATCH] NFS: initialize default port in kernel mount client
If no mount server port number is specified, the previous change to the
kernel mount client inadvertently allows the NFS server's port number to be
used as the mount server's port number. If the user specifies an NFS
server port (-o port=x), the mount will fail.
The fix below sets the mount server's port to 0 if no mount server
port is specified by the user.
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:58 +0000 (18:00 -0400)]
NFS: Kernel mount client should use async bind
Simplify the in-kernel mount client by using autobind instead of an
explicit call to rpc_getport_sync.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:52 +0000 (18:00 -0400)]
SUNRPC: RPC bind failures should be permanent for NULL requests
The purpose of an RPC ping (a NULL request) is to determine whether the
remote end is operating and supports the RPC program and version of the
request.
If we do an RPC bind and the remote's rpcbind service says "this
program or service isn't supported" then we have our answer already,
and we should give up immediately.
This is good for the kernel mount client, as it will cause the request
to fail, and then allow an immediate retry with different options.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:47 +0000 (18:00 -0400)]
SUNRPC: Split another new rpcbind retry error code from EACCES
Add more new error code processing to the kernel's rpcbind client
and to call_bind_status() to distinguish two cases:
Case 1: the remote has replied that the program/version tuple is not
registered (returns EACCES)
Case 2: retry with a lesser rpcbind version (rpcb now returns EPFNOSUPPORT)
This change allows more specific error processing for each of these two
cases. We now fail case 2 instead of retrying... it's a server
configuration error not to support even rpcbind version 2. And don't
expose this new error code to user land -- convert it to EIO before
failing the RPC.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
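The "don't expose this new error code to user land" step can be pictured as
a small translation at the point where the RPC is failed. The helper below is
a hypothetical sketch, not the code in call_bind_status(); it only shows the
mapping of EPFNOSUPPORT to EIO and passes every other status through.

```c
#include <errno.h>

/* Hypothetical helper: choose the error reported to the caller for a
 * given rpcbind status. Only the internal "no usable rpcbind version"
 * code is rewritten; other codes are returned unchanged. */
int example_bind_status_to_user_error(int status)
{
	switch (status) {
	case -EPFNOSUPPORT:
		return -EIO;	/* keep the internal code out of user land */
	default:
		return status;
	}
}
```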
Chuck Lever [Tue, 11 Sep 2007 22:00:41 +0000 (18:00 -0400)]
SUNRPC: Add a new error code for retry waiting for another binder
Add new error code processing to the kernel's rpcbind client and to
call_bind_status() to distinguish two cases:
Case 1: the remote has replied that the program/version tuple is not
registered (returns -EACCES)
Case 2: another process is already in the middle of binding on this
transport (now returns -EAGAIN)
This change allows more specific retry processing for each of these two
cases.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:36 +0000 (18:00 -0400)]
SUNRPC: Retry bad rpcbind replies
When a server returns a bad rpcbind reply, make rpcbind client recovery logic
retry with an older protocol version. Older versions are more likely to work
correctly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:31 +0000 (18:00 -0400)]
SUNRPC: Make rpcb_decode_getaddr more picky about universal addresses
Add better sanity checking of server replies to the GETVERSADDR reply
decoder. Change the error return code: EIO is what other XDR decoding
routines return if there is a failure while decoding.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:25 +0000 (18:00 -0400)]
SUNRPC: Clean up in rpc_show_tasks
/home/cel/linux/net/sunrpc/clnt.c: In function ‘rpc_show_tasks’:
/home/cel/linux/net/sunrpc/clnt.c:1538: warning:
signed and unsigned type in conditional expression
This points out another case where a conditional expression returns a
signed value in one arm and an unsigned value in the other.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:20 +0000 (18:00 -0400)]
SUNRPC: Make sure server name is reasonable before trying to print it
Check the length of the passed-in server name before trying to print it in
the log.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Tue, 11 Sep 2007 22:00:15 +0000 (18:00 -0400)]
SUNRPC: Use correct argument type in memcpy()
Noticed by Tom Talpey <tmt@netapp.com>:
OBTW, there's a nit on that memcpy, too. The r_addr is an array, so
memcpy(&map->r_addr
is passing the address of the array as a char **. It's the same as
map->r_addr, but technically the wrong type.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
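The nit can be reproduced outside the kernel with a small sketch. The struct
and function names below are hypothetical and the array size is arbitrary.
Strictly speaking, the address of an array member is a pointer to the whole
array rather than a pointer to its first element; both point at the same
byte, which is why the original call worked.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a structure with an embedded address buffer. */
struct example_map {
	char r_addr[16];
};

/* Both expressions yield the same address; only their types differ. */
bool example_same_address(struct example_map *map)
{
	return (void *)&map->r_addr == (void *)map->r_addr;
}

/* Preferred form: pass the array itself, which decays to char *. */
void example_copy_addr(struct example_map *map, const char *src, size_t len)
{
	if (len > sizeof(map->r_addr))
		len = sizeof(map->r_addr);	/* never overrun the buffer */
	memcpy(map->r_addr, src, len);
}
```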
Chuck Lever [Tue, 11 Sep 2007 22:00:09 +0000 (18:00 -0400)]
SUNRPC: fix a signed v. unsigned comparison nit in rpc_bind_new_program
/home/cel/linux/net/sunrpc/clnt.c: In function ‘rpc_bind_new_program’:
/home/cel/linux/net/sunrpc/clnt.c:445: warning:
comparison between signed and unsigned
RPC version numbers are u32, not int.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
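A short sketch of why mixing int and u32 in a comparison deserves a warning.
The function names are hypothetical; the cast makes explicit the conversion
the compiler would otherwise apply silently when a signed value meets a
32-bit unsigned one.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a mixed comparison effectively does: the signed value is converted
 * to unsigned first, so -1 becomes 0xFFFFFFFF and is not "less than 4". */
bool example_in_range_after_conversion(int vers, uint32_t nrvers)
{
	return (uint32_t)vers < nrvers;
}

/* Keeping both operands unsigned makes the intent explicit. */
bool example_in_range_u32(uint32_t vers, uint32_t nrvers)
{
	return vers < nrvers;
}
```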
Chuck Lever [Tue, 11 Sep 2007 22:00:03 +0000 (18:00 -0400)]
SUNRPC: Only one dprintk is needed during client creation
Remove one of two identical dprintk's that occur when an RPC client is
created.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Thu, 16 Aug 2007 20:03:31 +0000 (16:03 -0400)]
SUNRPC: Fix generation of universal addresses for
Fix some problems with rpcbind v3 and v4 queries from the in-kernel rpcbind
client:
1. The r_addr argument must be a full universal address, not just an IP
address, and
2. The universal address in r_addr is the address of the remote rpcbind
server, not the RPC service being requested
This addresses bugzilla.kernel.org report 8891 for 2.6.23-rc and greater.
In addition, if the rpcbind client is unable to start the rpcbind request,
make sure not to leak the xprt.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 16 Aug 2007 20:03:26 +0000 (16:03 -0400)]
SUNRPC: Add support for formatted universal addresses
"Universal addresses" are a string representation of an IP address and
port. They are described fully in RFC 3530, section 2.2. Add support
for generating them in the RPC client's socket transport module.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
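For IPv4, a universal address is the dotted-quad address followed by the
high and low bytes of the port, also in decimal and separated by dots. A
minimal user-space sketch follows; the function name is hypothetical, and for
simplicity it takes the address and port in host byte order, which is not
how the kernel stores them.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical formatter: writes "h1.h2.h3.h4.p1.p2" into buf.
 * addr and port are in host byte order for this sketch. */
int example_format_universal_addr(char *buf, size_t len,
				  uint32_t addr, uint16_t port)
{
	return snprintf(buf, len, "%u.%u.%u.%u.%u.%u",
			(unsigned)((addr >> 24) & 0xff),
			(unsigned)((addr >> 16) & 0xff),
			(unsigned)((addr >> 8) & 0xff),
			(unsigned)(addr & 0xff),
			(unsigned)(port >> 8),	  /* high byte of the port */
			(unsigned)(port & 0xff)); /* low byte of the port */
}
```

For example, 192.0.2.1 with port 2049 (8 * 256 + 1) is rendered as
192.0.2.1.8.1.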
Chuck Lever [Mon, 6 Aug 2007 15:58:04 +0000 (11:58 -0400)]
SUNRPC: Split xs_reclassify_socket into an IPv4 and IPv6 version
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:58 +0000 (11:57 -0400)]
SUNRPC: Add a helper for extracting the address using the correct type
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:53 +0000 (11:57 -0400)]
SUNRPC: Add IPv6 address support to net/sunrpc/xprtsock.c
Finalize support for setting up RPC client transports to remote RPC
services addressed via IPv6.
Based on work done by Gilles Quillard at Bull Open Source.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:48 +0000 (11:57 -0400)]
SUNRPC: create connect workers for IPv6
Clone separate connect worker functions for connecting AF_INET6 sockets.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:43 +0000 (11:57 -0400)]
SUNRPC: Rename IPv4 connect workers
Prepare for introduction of IPv6 versions of same.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:38 +0000 (11:57 -0400)]
SUNRPC: Refactor a part of socket connect logic into a helper function
Finishing a socket connect is the same for IPv4 and IPv6, so split it out
into a helper.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:33 +0000 (11:57 -0400)]
SUNRPC: create an IPv6-savvy mechanism for binding to a reserved port
Clone xs_bindresvport into two functions, one that can handle IPv4
addresses, and one that can handle IPv6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:28 +0000 (11:57 -0400)]
SUNRPC: Rename xs_bind() to prepare for IPv6-specific bind method
Prepare for introduction of IPv6-specific socket bind function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:23 +0000 (11:57 -0400)]
SUNRPC: Introduce support for setting the port number in IPv6 addresses
We could clone xs_set_port, but this is easier overall.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:18 +0000 (11:57 -0400)]
SUNRPC: add support for IPv6 to the kernel's rpcbind client
Prepare for adding IPv6 support to the RPC client by adding IPv6
capabilities to rpcbind. Note that this is support on the query side
only; registering IPv6 addresses with the local portmapper will come
later.
Note we have to take care not to fall back to using version 2 of the
rpcbind protocol if we're dealing with an IPv6 address. Version 2 doesn't
support IPv6 at all.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:12 +0000 (11:57 -0400)]
SUNRPC: add a function to format IPv6 addresses
Clone xs_format_ipv4_peer_addresses into an IPv6 version.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:07 +0000 (11:57 -0400)]
SUNRPC: Rename xs_format_peer_addresses
Prepare to add an IPv6 version of xs_format_peer_addresses by renaming it
to xs_format_ipv4_peer_addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:57:02 +0000 (11:57 -0400)]
SUNRPC: Add hex-formatted address support to rpc_peeraddr2str()
Add support for the NFS client's need to export volume information
with IP addresses formatted in hex instead of decimal.
This isn't used yet, but subsequent patches (not in this series) will
change the NFS client to use this functionality.
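For illustration, hex formatting of an IPv4 address could look like the sketch below. The helper name format_ipv4_hex() is hypothetical, and the eight-digit lowercase layout is an assumption rather than something stated in this commit:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write an IPv4 address as eight hex digits,
 * most significant octet first (192.168.0.1 becomes "c0a80001").
 * Returns 0 on success, -1 if buf is too small. */
static int format_ipv4_hex(const struct sockaddr_in *sin,
                           char *buf, size_t len)
{
    uint32_t addr = ntohl(sin->sin_addr.s_addr);
    int n = snprintf(buf, len, "%08x", (unsigned int)addr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```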
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:56:57 +0000 (11:56 -0400)]
SUNRPC: Free address buffers in a loop
Use more generic logic to free buffers holding formatted addresses. This
makes it less likely a bug will be introduced when adding additional buffer
types in xs_format_peer_addresses().
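The pattern, sketched in userspace C: keep the formatted strings in an array indexed by an enum and free them in one loop, so a newly added format is released automatically. The enum values and struct name here are hypothetical, not the kernel's:

```c
#include <stdlib.h>

/* Hypothetical set of formatted-address kinds; ADDR_FORMAT_MAX must stay
 * last so it always equals the number of entries. */
enum addr_format { ADDR_DISPLAY, ADDR_PORT, ADDR_HEX, ADDR_FORMAT_MAX };

struct peer_strings {
    char *text[ADDR_FORMAT_MAX];
};

/* Free every formatted string. Adding a new enum value needs no change
 * here, which is the point of looping instead of freeing by name. */
static void free_peer_strings(struct peer_strings *p)
{
    for (int i = 0; i < ADDR_FORMAT_MAX; i++) {
        free(p->text[i]);   /* free(NULL) is a no-op */
        p->text[i] = NULL;
    }
}
```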
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:56:52 +0000 (11:56 -0400)]
SUNRPC: Use standard macros for printing IP addresses
include/linux/kernel.h gives us some nice macros for formatting IP
addresses. Use them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:56:42 +0000 (11:56 -0400)]
SUNRPC: Fix a signed v. unsigned comparison in net/sunrpc/xprtsock.c
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 6 Aug 2007 15:56:31 +0000 (11:56 -0400)]
SUNRPC: Fix a signed v. unsigned comparison in rpcbind's XDR routines
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Mon, 30 Jul 2007 12:47:38 +0000 (08:47 -0400)]
[NFS] [PATCH] NFS: show addr=ipaddr in /proc/mounts rather than
A minor thing, but useful when working with a server with multiple
addrs. This looks like it might also be necessary if Miklos' effort
to eliminate /etc/mtab ever comes to fruition.
When displaying mount options in /proc/mounts, the kernel prints
"addr=hostname". This info is redundant since we already have the
hostname displayed as part of the "device" section of the mount. This
patch changes it to display the IP address to which the socket is
connected.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Christoph Hellwig [Fri, 3 Aug 2007 14:20:32 +0000 (16:20 +0200)]
[NFS] [PATCH] nfs: tiny makefile cleanup
no need to set up foo-objs these days.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fabio Olive Leite [Fri, 27 Jul 2007 01:59:00 +0000 (22:59 -0300)]
Re: [NFS] [PATCH] Attribute timeout handling and wrapping u32 jiffies
I would like to discuss the idea that the current checks for attribute
timeout using time_after are inadequate for 32bit architectures, since
time_after works correctly only when the two timestamps being compared
are within 2^31 jiffies of each other. The signed overflow caused by
comparing values more than 2^31 jiffies apart will flip the result,
causing incorrect assumptions of validity.
2^31 jiffies is a fairly long period of time (~25 days) compared
to the lifetime of most kernel data structures, but on long-lived NFS
mounts that can sit idle for months (for example, where autofs cannot
be used for some reason), it is easy to end up comparing inode attribute
timestamps against very disparate or even bogus values (as when jiffies
has wrapped many times, where the comparison doesn't even make sense).
Currently the code tests for attribute timeout by simply adding the
desired number of jiffies to the stored timestamp and using time_after
to compare the result with the current timestamp of the obtained
attribute data. This is incorrect, as it returns true for the desired
timeout period and another full 2^31 range of jiffies.
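The effect can be reproduced in userspace with 32-bit arithmetic. The sketch below uses stand-ins for the kernel's wrap-safe comparisons; the helper names are hypothetical, and the bounded check is shown as one possible remedy rather than as the exact change made by this patch. (The ~25 day figure assumes HZ=1000: 2^31 / 1000 / 86400 is about 24.9 days.)

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit stand-ins for the kernel's jiffies comparisons. The cast to
 * int32_t relies on two's-complement conversion. */
static bool after32(uint32_t a, uint32_t b)    { return (int32_t)(b - a) < 0; }
static bool after_eq32(uint32_t a, uint32_t b) { return (int32_t)(a - b) >= 0; }

/* Open-ended check: "now is after stamp + timeo". Once now is more than
 * 2^31 jiffies past the deadline, the signed difference flips and the
 * stale attributes look fresh again. */
static bool naive_timed_out(uint32_t now, uint32_t stamp, uint32_t timeo)
{
    return after32(now, stamp + timeo);
}

/* Bounded check: attributes are fresh only while now lies inside
 * [stamp, stamp + timeo]; anything outside the window is timed out. */
static bool bounded_timed_out(uint32_t now, uint32_t stamp, uint32_t timeo)
{
    uint32_t end = stamp + timeo;
    bool in_window = after_eq32(now, stamp) && after_eq32(end, now);

    return !in_window;
}
```

Just past the deadline both checks agree, but 2^31 jiffies later only the bounded check still reports a timeout.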
In testing with artificial jumps (several small jumps, not one big
crank) of the jiffies I was able to reproduce a problem found in a
server with very long lived NFS mounts, where attributes would not be
refreshed even after touching files and directories in the server:
Initial uptime:
    03:42:01 up 6 min, 0 users, load average: 0.01, 0.12, 0.07
NFS volume is mounted and time is advanced:
    03:38:09 up 25 days, 2 min, 0 users, load average: 1.22, 1.05, 1.08
    # ls -l /local/A/foo/bar /nfs/A/foo/bar
    -rw-r--r-- 1 root root 0 Dec 17 03:38 /local/A/foo/bar
    -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
    # touch /local/A/foo/bar
    # ls -l /local/A/foo/bar /nfs/A/foo/bar
    -rw-r--r-- 1 root root 0 Dec 17 03:47 /local/A/foo/bar
    -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
We can see the local mtime is updated, but the NFS mount still shows
the old value. The patch below makes it work:
Initial setup...
    07:11:02 up 25 days, 1 min, 0 users, load average: 0.15, 0.03, 0.04
    # ls -l /local/A/foo/bar /nfs/A/foo/bar
    -rw-r--r-- 1 root root 0 Jan 11 07:11 /local/A/foo/bar
    -rw-r--r-- 1 root root 0 Jan 11 07:11 /nfs/A/foo/bar
    # touch /local/A/foo/bar
    # ls -l /local/A/foo/bar /nfs/A/foo/bar
    -rw-r--r-- 1 root root 0 Jan 11 07:14 /local/A/foo/bar
    -rw-r--r-- 1 root root 0 Jan 11 07:14 /nfs/A/foo/bar
Signed-off-by: Fabio Olive Leite <fleite@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Peter Staubach [Fri, 3 Aug 2007 19:07:10 +0000 (15:07 -0400)]
64 bit ino support for NFS client
Hi.
Attached is a patch to modify the NFS client code to support
64 bit ino's, as appropriate for the system and the NFS
protocol version.
The code basically just expands the NFS interfaces for routines
which handle ino's from using ino_t to u64 and then uses the
fileid in the nfs_inode instead of i_ino in the inode. The
code paths that were updated are in the getattr method and
the readdir methods.
This should be no real change on 64 bit platforms. Since
the ino_t is an unsigned long, it would already be 64 bits
wide.
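To illustrate why the full fileid matters where ino_t is only 32 bits wide: any mapping from 64 bits down to 32 must make some distinct fileids collide. The XOR fold below is an assumption chosen for the example, and fold_fileid32() is a hypothetical name:

```c
#include <stdint.h>

/* Hypothetical folding of a 64-bit NFS fileid into a 32-bit inode number
 * by XORing the high and low halves. Distinct fileids can produce the
 * same result, so the folded value cannot identify a file on its own. */
static uint32_t fold_fileid32(uint64_t fileid)
{
    return (uint32_t)fileid ^ (uint32_t)(fileid >> 32);
}
```

For example, fileids 0x0000000100000002 and 0x0000000200000001 both fold to 3, whereas reporting the u64 fileid keeps them distinct.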
Thanx...
ps
Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 8 Jun 2007 02:44:34 +0000 (22:44 -0400)]
SUNRPC: Convert rpc_pipefs to use the generic filesystem notification hooks
This will allow rpc.gssd to use inotify instead of dnotify in order to
locate new rpc upcall pipes.
This also requires the exporting of __audit_inode_child(), which is used by
fsnotify_create() and fsnotify_mkdir(). Ccing David Woodhouse.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 25 Jul 2007 18:09:54 +0000 (14:09 -0400)]
NFS: Fall back to synchronous writes when a background write errors...
This helps prevent huge queues of background writes from building up
whenever the server runs out of disk or quota space, or if someone changes
the file access modes behind our backs.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 25 Jul 2007 18:09:54 +0000 (14:09 -0400)]
NFS: Writeback optimisation
Schedule writes using WB_SYNC_NONE first, then come back for a second pass
using WB_SYNC_ALL.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 20 Jul 2007 17:13:28 +0000 (13:13 -0400)]
NFS: Clean up NFS writeback flush code
The only user of nfs_sync_mapping_range() is nfs_getattr(), which uses it
to flush out the entire inode without sending a commit. We therefore
replace nfs_sync_mapping_range with a more appropriate helper.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 22 Jul 2007 23:27:46 +0000 (19:27 -0400)]
VFS: Remove writeback_control->fs_private
The only user of this field was NFS.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 22 Jul 2007 23:27:32 +0000 (19:27 -0400)]
NFS: Clean up nfs_writepages()
Just call write_cache_pages directly instead of hacking the writeback
control structure in order to find out if we were called from writepages()
or directly from the VM.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>