summaryrefslogtreecommitdiff
path: root/include/linux/lockd/lockd.h
AgeCommit message (Collapse)AuthorFilesLines
2018-01-15lockd: convert nlm_rqst.a_count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_rqst.a_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_rqst.a_count it might make a difference in following places: - nlmclnt_release_call() and nlmsvc_release_call(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nlm_lockowner.count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_lockowner.count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_lockowner.count it might make a difference in following places: - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only provides RELEASE ordering, control dependency on success and holds a spin lock on success vs. fully ordered atomic counterpart. No changes in spin lock guarantees. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nsm_handle.sm_count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nsm_handle.sm_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nsm_handle.sm_count it might make a difference in following places: - nsm_release(): decrement in refcount_dec_and_lock() only provides RELEASE ordering, control dependency on success and holds a spin lock on success vs. fully ordered atomic counterpart. No change for the spin lock guarantees. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nlm_host.h_count from atomic_t to refcount_tElena Reshetova1-1/+2
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_host.h_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_host.h_count it might make a difference in following places: - nlmsvc_release_host(): decrement in refcount_dec() provides RELEASE ordering, while original atomic_dec() was fully unordered. Since the change is for better, it should not matter. - nlmclnt_release_host(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart. It doesn't seem to matter in this case since object freeing happens under mutex lock anyway. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-15sunrpc: mark all struct svc_procinfo instances as constChristoph Hellwig1-2/+2
struct svc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-04-21lockd: Introduce nlmclnt_operationsBenjamin Coddington1-0/+2
NFS would enjoy the ability to modify the behavior of the NLM client's unlock RPC task in order to delay the transmission of the unlock until IO that was submitted under that lock has completed. This ability can ensure that the NLM client will always complete the transmission of an unlock even if the waiting caller has been interrupted with fatal signal. For this purpose, a pointer to a struct nlmclnt_operations can be assigned in a nfs_module's nfs_rpc_ops that will install those nlmclnt_operations on the nlm_host. The struct nlmclnt_operations defines three callback operations that will be used in a following patch: nlmclnt_alloc_call - used to call back after a successful allocation of a struct nlm_rqst in nlmclnt_proc(). nlmclnt_unlock_prepare - used to call back during NLM unlock's rpc_call_prepare. The NLM client defers calling rpc_call_start() until this callback returns false. nlmclnt_release_call - used to call back when the NLM client's struct nlm_rqst is freed. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-02-14nlm: Ensure callback code also checks that the files matchTrond Myklebust1-1/+2
It is not sufficient to just check that the lock pids match when granting a callback, we also need to ensure that we're granting the callback on the right file. Reported-by: Pankaj Singh <psingh.ait@gmail.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2015-10-23lockd: get rid of reference-counted NSM RPC clientsAndrey Ryabinin1-0/+1
Currently we have reference-counted per-net NSM RPC client which created on the first monitor request and destroyed after the last unmonitor request. It's needed because RPC client need to know 'utsname()->nodename', but utsname() might be NULL when nsm_unmonitor() called. So instead of holding the rpc client we could just save nodename in struct nlm_host and pass it to the rpc_create(). Thus ther is no need in keeping rpc client until last unmonitor request. We could create separate RPC clients for each monitor/unmonitor requests. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-10-13lockd: create NSM handles per net namespaceAndrey Ryabinin1-3/+6
Commit cb7323fffa85 ("lockd: create and use per-net NSM RPC clients on MON/UNMON requests") introduced per-net NSM RPC clients. Unfortunately this doesn't make any sense without per-net nsm_handle. E.g. the following scenario could happen Two hosts (X and Y) in different namespaces (A and B) share the same nsm struct. 1. nsm_monitor(host_X) called => NSM rpc client created, nsm->sm_monitored bit set. 2. nsm_mointor(host-Y) called => nsm->sm_monitored already set, we just exit. Thus in namespace B ln->nsm_clnt == NULL. 3. host X destroyed => nsm->sm_count decremented to 1 4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr dereference of *ln->nsm_clnt So this could be fixed by making per-net nsm_handles list, instead of global. Thus different net namespaces will not be able share the same nsm_handle. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-10lockd: rip out deferred lock handling from testlock codepathJeff Layton1-1/+0
As Kinglong points out, the nlm_block->b_fl field is no longer used at all. Also, vfs_test_lock in the generic locking code will only return FILE_LOCK_DEFERRED if FL_SLEEP is set, and it isn't here. The only other place that returns that value is the DLM lock code, but it only does that in dlm_posix_lock, never in dlm_posix_get. Remove all of the deferred locking code from the testlock codepath since it doesn't appear to ever be used anyway. I do have a small concern that this might cause a behavior change in the case where you have a block already sitting on the list when the testlock request comes in, but that looks like it doesn't really work properly anyway. I think it's best to just pass that down to vfs_test_lock and let the filesystem report that instead of trying to infer what's going on with the lock by looking at an existing block. Cc: cluster-devel@redhat.com Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
2014-05-07nfsd: remove <linux/nfsd/nfsfh.h>Christoph Hellwig1-1/+1
The only real user of this header is fs/nfsd/nfsfh.h, so merge the two. Various lockѕ source files used it to indirectly get other sunrpc or nfs headers, so fix those up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-03-01Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-1/+2
Pull nfsd changes from J Bruce Fields: "Miscellaneous bugfixes, plus: - An overhaul of the DRC cache by Jeff Layton. The main effect is just to make it larger. This decreases the chances of intermittent errors especially in the UDP case. But we'll need to watch for any reports of performance regressions. - Containerized nfsd: with some limitations, we now support per-container nfs-service, thanks to extensive work from Stanislav Kinsbursky over the last year." Some notes about conflicts, since there were *two* non-data semantic conflicts here: - idr_remove_all() had been added by a memory leak fix, but has since become deprecated since idr_destroy() does it for us now. - xs_local_connect() had been added by this branch to make AF_LOCAL connections be synchronous, but in the meantime Trond had changed the calling convention in order to avoid a RCU dereference. There were a couple of more obvious actual source-level conflicts due to the hlist traversal changes and one just due to code changes next to each other, but those were trivial. * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits) SUNRPC: make AF_LOCAL connect synchronous nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum svcrpc: fix rpc server shutdown races svcrpc: make svc_age_temp_xprts enqueue under sv_lock lockd: nlmclnt_reclaim(): avoid stack overflow nfsd: enable NFSv4 state in containers nfsd: disable usermode helper client tracker in container nfsd: use proper net while reading "exports" file nfsd: containerize NFSd filesystem nfsd: fix comments on nfsd_cache_lookup SUNRPC: move cache_detail->cache_request callback call to cache_read() SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function SUNRPC: rework cache upcall logic SUNRPC: introduce cache_detail->cache_request callback NFS: simplify and clean cache library NFS: use SUNRPC cache creation and destruction helper for DNS cache nfsd4: free_stid can be static nfsd: keep a checksum of the first 256 bytes of request sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer sunrpc: fix comment in struct xdr_buf definition ...
2013-02-23new helper: file_inode(file)Al Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-15lockd: nlmclnt_reclaim(): avoid stack overflowTim Gardner1-1/+2
Even though nlmclnt_reclaim() is only one call into the stack frame, 928 bytes on the stack seems like a lot. Recode to dynamically allocate the request structure once from within the reclaimer task, then pass this pointer into nlmclnt_reclaim() for reuse on subsequent calls. smatch analysis: fs/lockd/clntproc.c:620 nlmclnt_reclaim() warn: 'reqst' puts 928 bytes on stack Also remove redundant assignment of 0 after memset. Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-28LockD: pass actual network namespace to grace period management functionsStanislav Kinsbursky1-2/+2
Passed network namespace replaced hard-coded init_net Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-28LockD: mark host per network namespace on garbage collectStanislav Kinsbursky1-1/+1
This is required for per-network NLM shutdown and cleanup. This patch passes init_net for a while. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15Lockd: shutdown NLM hosts in network namespace contextStanislav Kinsbursky1-0/+1
Lockd now managed in network namespace context. And this patch introduces network namespace related NLM hosts shutdown in case of releasing per-net Lockd resources. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15LockD: make nlm hosts network namespace awareStanislav Kinsbursky1-1/+3
This object depends on RPC client, and thus on network namespace. So let's make it's allocation and lookup in network namespace context. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01SUNRPC: constify the rpc_programTrond Myklebust1-1/+1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-13module_param: make bool parameters really bool (drivers & misc)Rusty Russell1-1/+1
module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-12-12net: use IS_ENABLED(CONFIG_IPV6)Eric Dumazet1-3/+3
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16lockd: Clean up nlmsvc_lookup_host()Chuck Lever1-1/+1
Clean up. Change nlmsvc_lookup_host() to be purpose-built for server-side nlm_host management. This replaces the generic nlm_lookup_host() helper function, just like on the client side. The lookup logic is specialized for server host lookups. The server side cache also gets its own specialized equivalent of the nlm_release_host() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16lockd: Create client-side nlm_host cacheChuck Lever1-0/+1
NFS clients don't need the garbage collection processing that is performed on nlm_host structures. The client picks up an nlm_host at mount time and holds a reference to it until the file system is unmounted. Servers, on the other hand, don't have a precise way to tell when an nlm_host is no longer being used, so zero refcount nlm_host entries are left to expire in the cache after a time. Basically there's nothing holding a reference to an nlm_host between individual server-side NLM requests, but we can't afford the expense of recreating them for every new NLM request from a client. The nlm_host cache adds some lifetime hysteresis to entries in the cache so the next time a particular nlm_host is needed, it's likely to be discovered by a lookup rather than created from whole cloth. With the new implementation, client nlm_host cache items are no longer garbage collected, and are destroyed directly by a new release function specialized for client entries, nlmclnt_release_host(). They are cached in their own data structure, and have their own lookup logic, simplified and specialized for client nlm_host entries. However, the client nlm_host cache still shares reboot recovery logic with the server nlm_host cache. The NSM "peer rebooted" downcall for clients and servers still come through the same RPC call. This is a legacy formal API that would be difficult to alter, and besides, the user space NSM implementation can't tell the difference between peers that are clients or servers. For this reason, the client cache continues to share the nlm_host_mutex (and reboot recovery logic) with the server cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16lockd: Split nlm_release_call()Chuck Lever1-1/+2
The nlm_release_call() function is invoked from both the server and the client side. We're about to introduce a distinct server- and client-side nlm_release_host(), so nlm_release_call() must first be split into a client-side and a server-side version. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-11-16NLM: Fix a regression in lockdTrond Myklebust1-0/+1
Nick Bowler reports: There are no unusual messages on the client... but I just logged into the server and I see lots of messages of the following form: nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! Bisected to commit 9247685088398cf21bcb513bd2832b4cd42516c4 (SUNRPC: Properly initialize sock_xprt.srcaddr in all cases) Apparently, removing the 'transport->srcaddr.ss_family = family' from xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly initialising the srcaddr family to AF_UNSPEC. Reported-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-09-22Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-43/+0
* 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits) nfsd4: nfsv4 clients should cross mountpoints nfsd: revise 4.1 status documentation sunrpc/cache: avoid variable over-loading in cache_defer_req sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req nfsd: return success for non-NFS4 nfs4_state_start nfsd41: Refactor create_client() nfsd41: modify nfsd4.1 backchannel to use new xprt class nfsd41: Backchannel: Implement cb_recall over NFSv4.1 nfsd41: Backchannel: cb_sequence callback nfsd41: Backchannel: Setup sequence information nfsd41: Backchannel: Server backchannel RPC wait queue nfsd41: Backchannel: Add sequence arguments to callback RPC arguments nfsd41: Backchannel: callback infrastructure nfsd4: use common rpc_cred for all callbacks nfsd4: allow nfs4 state startup to fail SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous nfsd4: fix null dereference creating nfsv4 callback client nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. ...
2009-09-22const: make lock_manager_operations constAlexey Dobriyan1-1/+1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-21sunrpc: add routine for comparing addressesJeff Layton1-43/+0
lockd needs these sort of routines, as does the NFSv4 callback code. Move lockd's routines into common code and rename them so that they can be used by others. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-18lockd: Update NSM state from SM_MON repliesChuck Lever1-1/+1
When rpc.statd starts up in user space at boot time, it attempts to write the latest NSM local state number into /proc/sys/fs/nfs/nsm_local_state. If lockd.ko isn't loaded yet (as is the case in most configurations), that file doesn't exist, thus the kernel's NSM state remains set to its initial value of zero during lockd operation. This is a problem because rpc.statd and lockd use the NSM state number to prevent repeated lock recovery on rebooted hosts. If lockd sends a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state number is received, there is no way for lockd or rpc.statd to distinguish that stale SM_NOTIFY from an actual reboot. Thus lock recovery could be performed after the rebooted host has already started reclaiming locks, and those locks will be lost. We could change /etc/init.d/nfslock so it always modprobes lockd.ko before starting rpc.statd. However, if lockd.ko is ever unloaded and reloaded, we are back at square one, since the NSM state is not preserved across an unload/reload cycle. This may happen frequently on clients that use automounter. A period of NFS inactivity causes lockd.ko to be unloaded, and the kernel loses its NSM state setting. Instead, let's use the fact that rpc.statd plants the local system's NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a synchronous SM_MON upcall to the local rpc.statd _before_ sending its first NLM request to a new remote. This would permit rpc.statd to provide the current NSM state to lockd, even after lockd.ko had been unloaded and reloaded. Note that NLMPROC_LOCK arguments are constructed before the nsm_monitor() call, so we have to rearrange argument construction very slightly to make this all work out. And, the kernel appears to treat NSM state as a u32 (see struct nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure we don't get bogus comparison results. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11NLM: Shrink the IPv4-only version of nlm_cmp_addr()Chuck Lever1-0/+8
Clean up/micro-optimatization: Make the AF_INET-only version of nlm_cmp_addr() smaller. This matches the style of nlm_privileged_requester(), and makes the AF_INET-only version of nlm_cmp_addr() nearly the same size as it was before IPv6 support. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-01-06NLM: Rewrite IPv4 privileged requester's checkChuck Lever1-2/+5
Clean up. For consistency, rewrite the IPv4 check to match the same style as the new IPv6 check. Note that ipv4_is_loopback() is somewhat broader in its interpretation of what is a loopback address than simply "127.0.0.1". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: nlm_privileged_requester() doesn't recognize mapped loopback addressChuck Lever1-2/+8
Commit b85e4676 added the nlm_privileged_requester() helper to check whether an RPC request was sent from a local privileged caller. It recognizes IPv4 privileged callers (from "127.0.0.1"), and IPv6 privileged callers (from "::1"). However, IPV6_ADDR_LOOPBACK is not set for the mapped IPv4 loopback address (::ffff:7f00:0001), so the test breaks when the kernel's RPC service is IPv6-enabled but user space is calling via the IPv4 loopback address. This is actually the most common case for IPv6- enabled RPC services on Linux. Rewrite the IPv6 check to handle the mapped IPv4 loopback address as well as a normal IPv6 loopback address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Move nsm_addr() to fs/lockd/mon.cChuck Lever1-10/+0
Clean up: nsm_addr_in() is no longer used, and nsm_addr() is used only in fs/lockd/mon.c, so move it there. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Remove include/linux/lockd/sm_inter.hChuck Lever1-0/+1
Clean up: The include/linux/lockd/sm_inter.h header is nearly empty now. Remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Remove "create" argument from nsm_find()Chuck Lever1-3/+3
Clean up: nsm_find() now has only one caller, and that caller unconditionally sets the @create argument. Thus the @create argument is no longer needed. Since nsm_find() now has a more specific purpose, pick a more appropriate name for it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Add nsm_lookup() functionChuck Lever1-0/+1
Introduce a new API to fs/lockd/mon.c that allows nlm_host_rebooted() to lookup up nsm_handles via the contents of an nlm_reboot struct. The new function is equivalent to calling nsm_find() with @create set to zero, but it takes a struct nlm_reboot instead of separate arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Change nlm_host_rebooted() to take a single nlm_reboot argumentChuck Lever1-2/+1
Pass the nlm_reboot data structure directly from the NLMPROC_SM_NOTIFY XDR decoders to nlm_host_rebooted(). This eliminates some packing and unpacking of the NLMPROC_SM_NOTIFY results, and prepares for passing these results, including the "priv" cookie, directly to a lookup routine in fs/lockd/mon.c. This patch changes code organization but should not cause any behavioral change. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is createdChuck Lever1-0/+1
Introduce a new data type, used by both the in-kernel NLM and NSM implementations, that is used to manage the opaque "priv" argument for the NSMPROC_MON and NLMPROC_SM_NOTIFY calls. Construct the "priv" cookie when the nsm_handle is created. The nsm_init_private() function may look a little strange, but it is roughly equivalent to how the XDR encoder formed the "priv" argument. It's going to go away soon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Move nsm_find() to fs/lockd/mon.cChuck Lever1-0/+6
The nsm_find() function sets up fresh nsm_handle entries. This is where we will store the "priv" cookie used to lookup nsm_handles during reboot recovery. The cookie will be constructed when nsm_find() creates a new nsm_handle. As much as possible, I would like to keep everything that handles a "priv" cookie in fs/lockd/mon.c so that all the smarts are in one source file. That organization should make it pretty simple to see how all this works. To me, it makes more sense than the current arrangement to keep nsm_find() with nsm_monitor() and nsm_unmonitor(). So, start reorganizing by moving nsm_find() into fs/lockd/mon.c. The nsm_release() function comes along too, since it shares the nsm_lock global variable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Move the public declaration of nsm_unmonitor() to lockd.hChuck Lever1-0/+1
Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h. Add a documenting comment. Bruce observed that nsm_unmonitor()'s only caller doesn't care about its return code, so make nsm_unmonitor() return void. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Release nsmhandle in nlm_destroy_hostChuck Lever1-1/+0
The nsm_handle's reference count is bumped in nlm_lookup_host(). It should be decremented in nlm_destroy_host() to make it easier to see the balance of these two operations. Move the nsm_release() call to fs/lockd/host.c. The h_nsmhandle pointer is set in nlm_lookup_host(), and never cleared. The nlm_destroy_host() function is never called for the same nlm_host twice, so h_nsmhandle won't ever be NULL when nsm_unmonitor() is called. All references to the nlm_host are gone before it is freed. We can skip making h_nsmhandle NULL just before the nlm_host is deallocated. It's also likely we can remove the h_nsmhandle NULL check in nlmsvc_is_client() as well, but we can do that later when rearchitect- ing the nlm_host cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Move the public declaration of nsm_monitor() to lockd.hChuck Lever1-0/+4
Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h with other NSM public function (nsm_release, eg) and global variable declarations. Add a documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Support IPv6 version of mon_nameChuck Lever1-0/+1
The "mon_name" argument of the NSMPROC_MON and NSMPROC_UNMON upcalls is a string that contains the hostname or IP address of the remote peer to be notified when this host has rebooted. The sm-notify command uses this identifier to contact the peer when we reboot, so it must be either a well-qualified DNS hostname or a presentation format IP address string. When the "nsm_use_hostnames" sysctl is set to zero, the kernel's NSM provides a presentation format IP address in the "mon_name" argument. Otherwise, the "caller_name" argument from NLM requests is used, which is usually just the DNS hostname of the peer. To support IPv6 addresses for the mon_name argument, we use the nsm_handle's address eye-catcher, which already contains an appropriate presentation format address string. Using the eye-catcher string obviates the need to use a large buffer on the stack to form the presentation address string for the upcall. This patch also addresses a subtle bug. An NSMPROC_MON request and the subsequent NSMPROC_UNMON request for the same peer are required to use the same value for the "mon_name" argument. Otherwise, rpc.statd's NSMPROC_UNMON processing cannot locate the database entry for that peer and remove it. If the setting of nsm_use_hostnames is changed between the time the kernel sends an NSMPROC_MON request and the time it sends the NSMPROC_UNMON request for the same peer, the "mon_name" argument for these two requests may not be the same. This is because the value of "mon_name" is currently chosen at the moment the call is made based on the setting of nsm_use_hostnames To ensure both requests pass identical contents in the "mon_name" argument, we now select which string to use for the argument in the nsm_monitor() function. A pointer to this string is saved in the nsm_handle so it can be used for a subsequent NSMPROC_UNMON upcall. NB: There are other potential problems, such as how nlm_host_rebooted() might behave if nsm_use_hostnames were changed while hosts are still being monitored. This patch does not attempt to address those problems. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NSM: Use modern style for sm_name field in nsm_handleChuck Lever1-1/+1
Clean up: I'm about to add another "char *" field to the nsm_handle structure. The sm_name field uses an older style of declaring a "char *" field. If I match that style for the new field, checkpatch.pl will complain. So, fix the sm_name field to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Support IPv6 scope IDs in nlm_display_address()Chuck Lever1-1/+9
Scope ID support is needed since the kernel's NSM implementation is about to use these displayed addresses as a mon_name in some cases. When nsm_use_hostnames is zero, without scope ID support NSM will fail to handle peers that contact us via a link-local address. Link-local addresses do not work without an interface ID, which is stored in the sockaddr's sin6_scope_id field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Remove address eye-catcher buffers from nlm_hostChuck Lever1-3/+1
The h_name field in struct nlm_host is a just copy of h_nsmhandle->sm_name. Likewise, the contents of the h_addrbuf field should be identical to the sm_addrbuf field. The h_srcaddrbuf field is used only in one place for debugging. We can live without this until we get %pI formatting for printk(). Currently these buffers are 48 bytes, but we need to support scope IDs in IPv6 presentation addresses, which means making the buffers even larger. Instead, let's find ways to eliminate them to save space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-06NLM: Use modern style for pointer fields in nlm_hostChuck Lever1-3/+3
Clean up: I'm about to add another "char *" field to the nlm_host structure. The h_name field, for example, uses an older style of declaring a "char *" field. If I match that style for the new field, checkpatch.pl will complain. So, fix pointer fields to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-12-23NLM: allow lockd requests from an unprivileged portChuck Lever1-1/+3
If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-04lockd: Add helper to sanity check incoming NOTIFY requestsChuck Lever1-0/+41
lockd accepts SM_NOTIFY calls only from a privileged process on the local system. If lockd uses an AF_INET6 listener, the sender's address (ie the local rpc.statd) will be the IPv6 loopback address, not the IPv4 loopback address. Make sure the privilege test in nlmsvc_proc_sm_notify() and nlm4svc_proc_sm_notify() works for both AF_INET and AF_INET6 family addresses by refactoring the test into a helper and adding support for IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>