summaryrefslogtreecommitdiff
path: root/include/linux/lockd
AgeCommit message (Collapse)AuthorFilesLines
2026-03-30lockd: Make linux/lockd/nlm.h an internal headerChuck Lever2-60/+0
The NLM protocol constants and status codes in nlm.h are needed only by lockd's internal implementation. NFS client code and NFSD interact with lockd through the stable API in bind.h and have no direct use for protocol-level definitions. Exposing these definitions globally via bind.h creates unnecessary coupling between lockd internals and its consumers. Moving nlm.h from include/linux/lockd/ to fs/lockd/ clarifies the API boundary: bind.h provides the lockd service interface, while nlm.h remains available only to code within fs/lockd/ that implements the protocol. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Move xdr.h from include/linux/lockd/ to fs/lockd/Chuck Lever2-116/+2
The lockd subsystem unnecessarily exposes internal NLM XDR type definitions through the global include path. These definitions are not used by any code outside fs/lockd/, making them inappropriate for include/linux/lockd/. Moving xdr.h to fs/lockd/ narrows the API surface and clarifies that these types are internal implementation details. The comment in linux/lockd/bind.h stating xdr.h was needed for "xdr-encoded error codes" is stale: no lockd API consumers use those codes. Forward declarations for struct nfs_fh and struct file_lock are added to bind.h because their definitions were previously pulled in transitively through xdr.h. Additionally, nfs3proc.c and proc.c need explicit includes of filelock.h for FL_CLOSE and for accessing struct file_lock members, respectively. Built and tested with lockd client/server operations. No functional change. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Remove lockd/debug.hChuck Lever1-40/+0
The lockd include structure has unnecessary indirection. The header include/linux/lockd/debug.h is consumed only by fs/lockd/lockd.h, creating an extra compilation dependency and making the code harder to navigate. Fold the debug.h definitions directly into lockd.h and remove the now-redundant header. This reduces the include tree depth and makes the debug-related definitions easier to find when working on lockd internals. Build-tested with lockd built as module and built-in. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Relocate include/linux/lockd/lockd.hChuck Lever1-401/+0
Headers placed in include/linux/ form part of the kernel's internal API and signal to subsystem maintainers that other parts of the kernel may depend on them. By moving lockd.h into fs/lockd/, lockd becomes a more self-contained module whose internal interfaces are clearly distinguished from its public contract with the rest of the kernel. This relocation addresses a long-standing XXX comment in the header itself that acknowledged the file's misplacement. Future changes to lockd internals can now proceed with confidence that external consumers are not inadvertently coupled to implementation details. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Move share.h from include/linux/lockd/ to fs/lockd/Chuck Lever2-32/+2
The share.h header defines struct nlm_share and declares the DOS share management functions used by the NLM server to implement NLM_SHARE and NLM_UNSHARE operations. These interfaces are used exclusively within the lockd subsystem. A git grep search confirms no external code references them. Relocating this header from include/linux/lockd/ to fs/lockd/ narrows the public API surface of the lockd module. Out-of-tree code cannot depend on these internal interfaces after this change. Future refactoring of the share management implementation thus requires no consideration of external consumers. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Move xdr4.h from include/linux/lockd/ to fs/lockd/Chuck Lever3-49/+4
The xdr4.h header declares NLMv4-specific XDR encoder/decoder functions and error codes that are used exclusively within the lockd subsystem. Moving it from include/linux/lockd/ to fs/lockd/ clarifies the intended scope of these declarations and prevents external code from depending on lockd-internal interfaces. This change reduces the public API surface of the lockd module and makes it easier to refactor NLMv4 internals without risk of breaking out-of-tree consumers. The header's contents are implementation details of the NLMv4 wire protocol handling, not a contract with other kernel subsystems. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30NFS: Use nlmclnt_shutdown_rpc_clnt() to safely shut down NLMChuck Lever1-0/+1
A race condition exists in shutdown_store() when writing to the sysfs "shutdown" file concurrently with nlm_shutdown_hosts_net(). Without synchronization, the following sequence can occur: 1. shutdown_store() reads server->nlm_host (non-NULL) 2. nlm_shutdown_hosts_net() acquires nlm_host_mutex, calls rpc_shutdown_client(), sets h_rpcclnt to NULL, and potentially frees the host via nlm_gc_hosts() 3. shutdown_store() dereferences the now-stale or freed host Introduce nlmclnt_shutdown_rpc_clnt(), which acquires nlm_host_mutex before accessing h_rpcclnt. This synchronizes with nlm_shutdown_hosts_net() and ensures the rpc_clnt pointer remains valid during the shutdown operation. This change also improves API layering: NFS client code no longer needs to include the internal lockd header to access nlm_host fields. The new helper resides in bind.h alongside other public lockd interfaces. Reported-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Relocate nlmsvc_unlock API declarationsChuck Lever2-6/+7
The nlmsvc_unlock_all_by_sb() and nlmsvc_unlock_all_by_ip() functions are part of lockd's external API, consumed by other kernel subsystems. Their declarations currently reside in linux/lockd/lockd.h alongside internal implementation details, which blurs the boundary between lockd's public interface and its private internals. Moving these declarations to linux/lockd/bind.h groups them with other external API functions and makes the separation explicit. This clarifies which functions are intended for external use and reduces the risk of internal implementation details leaking into the public API surface. Build-tested with allyesconfig; no functional changes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Have nlm_fopen() return errno valuesChuck Lever2-5/+5
The nlm_fopen() function is part of the API between nfsd and lockd. Currently its return value is an on-the-wire NLM status code. But that forces NFSD to include NLM wire protocol definitions despite having no other dependency on the NLM wire protocol. In addition, a CONFIG_LOCKD_V4 Kconfig symbol appears in the middle of NFSD source code. Refactor: Let's not use on-the-wire values as part of a high-level API between two Linux kernel modules. That's what we have errno for, right? And, instead of simply moving the CONFIG_LOCKD_V4 check, we can get rid of it entirely and let the decision of what actual NLM status code goes on the wire to be left up to NLM version-specific code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Introduce nlm__int__deadlockChuck Lever1-0/+1
The use of CONFIG_LOCKD_V4 in combination with a later cast_status() in the NLMv3 code is difficult to reason about. Instead, replace the use of nlm_deadlock with an implementation-defined status value that version-specific code translates appropriately. The new approach establishes a translation boundary: generic lockd code returns nlm__int__deadlock when posix_lock_file() yields -EDEADLK. Version-specific handlers (svc4proc.c for NLMv4, svcproc.c for NLMv3) translate this internal status to the appropriate wire protocol value. NLMv4 maps to nlm4_deadlock; NLMv3 maps to nlm_lck_denied (since NLMv3 lacks a deadlock-specific status code). Later this modification will also remove the need to include NLMv4 headers in NLMv3 and generic code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-03-30lockd: Relocate and rename nlm_drop_replyChuck Lever2-2/+6
The nlm_drop_reply status code is internal to the kernel's lockd implementation and must never appear on the wire. Its previous location in xdr.h grouped it with legitimate NLM protocol status codes, obscuring this critical distinction. Relocate the definition to lockd.h with a comment block for internal status codes, and rename to nlm__int__drop_reply to make its internal-only nature explicit. This prepares for adding additional internal status codes in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-21lockd: don't allow locking on reexported NFSv2/3Jeff Layton1-1/+8
Since commit 9254c8ae9b81 ("nfsd: disallow file locking and delegations for NFSv4 reexport"), file locking when reexporting an NFS mount via NFSv4 is expressly prohibited by nfsd. Do the same in lockd: Add a new nlmsvc_file_cannot_lock() helper that will test whether file locking is allowed for a given file, and return nlm_lck_denied_nolocks if it isn't. Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-19lockd: Remove unused parameter to nlmsvc_testlock()Chuck Lever1-3/+3
The nlm_cookie parameter has been unused since commit 09802fd2a8ca ("lockd: rip out deferred lock handling from testlock codepath"). Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-19lockd: Remove unused typedefChuck Lever1-2/+0
Clean up: Looks like the last usage of this typedef was removed by commit 026fec7e7c47 ("sunrpc: properly type pc_decode callbacks") in 2017. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-01lockd: discard nlmsvc_timeoutNeilBrown1-1/+1
nlmsvc_timeout always has the same value as (nlm_timeout * HZ), so use that in the one place that nlmsvc_timeout is used. In truth it *might* not always be the same as nlmsvc_timeout is only set when lockd is started while nlm_timeout can be set at anytime via sysctl. I think this difference it not helpful so removing it is good. Also remove the test for nlm_timout being 0. This is not possible - unless a module parameter is used to set the minimum timeout to 0, and if that happens then it probably should be honoured. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-02-05lockd: adapt to breakup of struct file_lockJeff Layton2-5/+4
Most of the existing APIs have remained the same, but subsystems that access file_lock fields directly need to reach into struct file_lock_core now. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-40-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-05filelock: split common fields into struct file_lock_coreJeff Layton1-1/+2
In a future patch, we're going to split file leases into their own structure. Since a lot of the underlying machinery uses the same fields move those into a new file_lock_core, and embed that inside struct file_lock. For now, add some macros to ensure that we can continue to build while the conversion is in progress. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-17-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-16SUNRPC: change how svc threads are asked to exit.NeilBrown1-1/+1
svc threads are currently stopped using kthread_stop(). This requires identifying a specific thread. However we don't care which thread stops, just as long as one does. So instead, set a flag in the svc_pool to say that a thread needs to die, and have each thread check this flag instead of calling kthread_should_stop(). The first thread to find and clear this flag then moves towards exiting. This removes an explicit dependency on sp_all_threads which will make a future patch simpler. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-30SUNRPC: remove timeout arg from svc_recv()NeilBrown1-1/+3
Most svc threads have no interest in a timeout. nfsd sets it to 1 hour, but this is a wart of no significance. lockd uses the timeout so that it can call nlmsvc_retry_blocked(). It also sometimes calls svc_wake_up() to ensure this is called. So change lockd to be consistent and always use svc_wake_up() to trigger nlmsvc_retry_blocked() - using a timer instead of a timeout to svc_recv(). And change svc_recv() to not take a timeout arg. This makes the sp_threads_timedout counter always zero. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-19NFS: add a sysfs link to the lockd rpc_clientBenjamin Coddington1-0/+2
After lockd is started, add a symlink for lockd's rpc_client under NFS' superblock sysfs. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-04-26lockd: fix races in client GRANTED_MSG wait logicJeff Layton1-3/+5
After the wait for a grant is done (for whatever reason), nlmclnt_block updates the status of the nlm_rqst with the status of the block. At the point it does this, however, the block is still queued its status could change at any time. This is particularly a problem when the waiting task is signaled during the wait. We can end up giving up on the lock just before the GRANTED_MSG callback comes in, and accept it even though the lock request gets back an error, leaving a dangling lock on the server. Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on the stack and add functions to allow us to enqueue and dequeue the block. Enqueue it just before the lock/wait loop, and dequeue it just after we exit the loop instead of waiting until the end of the function. Also, scrape the status at the time that we dequeue it to ensure that it's final. Reported-by: Yongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-04-26lockd: move struct nlm_wait to lockd.hJeff Layton1-1/+10
The next patch needs struct nlm_wait in fs/lockd/clntproc.c, so move the definition to a shared header file. As an added clean-up, drop the unused b_reclaim field. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-04-26lockd: remove 2 unused helper functionsJeff Layton1-10/+0
Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-03-14lockd: set file_lock start and end when decoding nlm4 testargsJeff Layton1-0/+1
Commit 6930bcbfb6ce dropped the setting of the file_lock range when decoding a nlm_lock off the wire. This causes the client side grant callback to miss matching blocks and reject the lock, only to rerequest it 30s later. Add a helper function to set the file_lock range from the start and end values that the protocol uses, and have the nlm_lock decoder call that to set up the file_lock args properly. Fixes: 6930bcbfb6ce ("lockd: detect and reject lock arguments that overflow") Reported-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Amir Goldstein <amir73il@gmail.com> Cc: stable@vger.kernel.org #6.0 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-02-23Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds1-2/+2
Pull nfsd updates from Chuck Lever: "Two significant security enhancements are part of this release: - NFSD's RPC header encoding and decoding, including RPCSEC GSS and gssproxy header parsing, has been overhauled to make it more memory-safe. - Support for Kerberos AES-SHA2-based encryption types has been added for both the NFS client and server. This provides a clean path for deprecating and removing insecure encryption types based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with Kerberos may now be used on systems with fips enabled. In addition to these, NFSD is now able to handle crossing into an auto-mounted mount point on an exported NFS mount. A number of fixes have been made to NFSD's server-side copy implementation. RPC metrics have been converted to per-CPU variables. This helps reduce unnecessary cross-CPU and cross-node memory bus traffic, and significantly reduces noise when KCSAN is enabled" * tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (121 commits) NFSD: Clean up nfsd_symlink() NFSD: copy the whole verifier in nfsd_copy_write_verifier nfsd: don't fsync nfsd_files on last close SUNRPC: Fix occasional warning when destroying gss_krb5_enctypes nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open NFSD: fix problems with cleanup on errors in nfsd4_copy nfsd: fix race to check ls_layouts nfsd: don't hand out delegation on setuid files being opened for write SUNRPC: Remove ->xpo_secure_port() SUNRPC: Clean up the svc_xprt_flags() macro nfsd: remove fs/nfsd/fault_inject.c NFSD: fix leaked reference count of nfsd4_ssc_umount_item nfsd: clean up potential nfsd_file refcount leaks in COPY codepath nfsd: zero out pointers after putting nfsd_files on COPY setup error SUNRPC: Fix whitespace damage in svcauth_unix.c nfsd: eliminate __nfs4_get_fd nfsd: add some kerneldoc comments for stateid preprocessing functions nfsd: eliminate find_deleg_file_locked nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS SUNRPC: Add encryption self-tests ...
2023-02-20SUNRPC: Use per-CPU counters to tally server RPC countsChuck Lever1-2/+2
- Improves counting accuracy - Reduces cross-CPU memory traffic Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-01-11fs: remove locks_inodeJeff Layton1-2/+2
locks_inode was turned into a wrapper around file_inode in de2a4a501e71 (Partially revert "locks: fix file locking on overlayfs"). Finish replacing locks_inode invocations everywhere with file_inode. Acked-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-01-11filelock: move file locking definitions to separate header fileJeff Layton1-0/+1
The file locking definitions have lived in fs.h since the dawn of time, but they are only used by a small subset of the source files that include it. Move the file locking definitions to a new header file, and add the appropriate #include directives to the source files that need them. By doing this we trim down fs.h a bit and limit the amount of rebuilding that has to be done when we make changes to the file locking APIs. Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Steve French <stfrench@microsoft.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2022-08-04lockd: detect and reject lock arguments that overflowJeff Layton1-0/+2
lockd doesn't currently vet the start and length in nlm4 requests like it should, and can end up generating lock requests with arguments that overflow when passed to the filesystem. The NLM4 protocol uses unsigned 64-bit arguments for both start and length, whereas struct file_lock tracks the start and end as loff_t values. By the time we get around to calling nlm4svc_retrieve_args, we've lost the information that would allow us to determine if there was an overflow. Start tracking the actual start and len for NLM4 requests in the nlm_lock. In nlm4svc_retrieve_args, vet these values to ensure they won't cause an overflow, and return NLM4_FBIG if they do. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=392 Reported-by: Jan Kasiak <j.kasiak@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> # 5.14+
2022-07-30NLM: Defend against file_lock changes after vfs_test_lock()Benjamin Coddington1-0/+1
Instead of trusting that struct file_lock returns completely unchanged after vfs_test_lock() when there's no conflicting lock, stash away our nlm_lockowner reference so we can properly release it for all cases. This defends against another file_lock implementation overwriting fl_owner when the return type is F_UNLCK. Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-08nfs: block notification on fs with its own ->lockJ. Bruce Fields1-2/+7
NFSv4.1 supports an optional lock notification feature which notifies the client when a lock comes available. (Normally NFSv4 clients just poll for locks if necessary.) To make that work, we need to request a blocking lock from the filesystem. We turned that off for NFS in commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] because it actually blocks the nfsd thread while waiting for the lock. Thanks to Vasily Averin for pointing out that NFS isn't the only filesystem with that problem. Any filesystem that leaves ->lock NULL will use posix_lock_file(), which does the right thing. Simplest is just to assume that any filesystem that defines its own ->lock is not safe to request a blocking lock from. So, this patch mostly reverts commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] and commit b840be2f00c0 ("lockd: don't attempt blocking locks on nfs reexports"), and instead uses a check of ->lock (Vasily's suggestion) to decide whether to support blocking lock notifications on a given filesystem. Also add a little documentation. Perhaps someday we could add back an export flag later to allow filesystems with "good" ->lock methods to support blocking lock notifications. Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [ cel: Description rewritten to address checkpatch nits ] [ cel: Fixed warning when SUNRPC debugging is disabled ] [ cel: Fixed NULL check ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
2021-10-13SUNRPC: Change return value type of .pc_encodeChuck Lever2-8/+8
Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_encode return only true or false. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-13SUNRPC: Replace the "__be32 *p" parameter to .pc_encodeChuck Lever2-8/+8
The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR encoder, and can be removed. Note also that there is a line in each encoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per encoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-13SUNRPC: Change return value type of .pc_decodeChuck Lever2-18/+18
Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_decode return only true or false. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-13SUNRPC: Replace the "__be32 *p" parameter to .pc_decodeChuck Lever2-19/+19
The passed-in value of the "__be32 *p" parameter is now unused in every server-side XDR decoder, and can be removed. Note also that there is a line in each decoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per decoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-08-24Keep read and write fds with each nlm_fileJ. Bruce Fields2-3/+9
We shouldn't really be using a read-only file descriptor to take a write lock. Most filesystems will put up with it. But NFS, for example, won't. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23nlm: minor nlm_lookup_file argument changeJ. Bruce Fields1-1/+1
It'll come in handy to get the whole nlm_lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-07-07lockd: Remove stale commentsChuck Lever2-12/+1
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-12lockd: remove __KERNEL__ ifdefsChristoph Hellwig2-8/+0
Remove the __KERNEL__ ifdefs from the non-UAPI sunrpc headers, as those can't be included from user space programs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-04lockd: Convert NLM service fl_owner to nlm_lockownerBenjamin Coddington1-0/+2
Do as the NLM client: allocate and track a struct nlm_lockowner for use as the fl_owner for locks created by the NLM sever. This allows us to keep the svid within this structure for matching locks, and will allow us to track the pid of lockd in a future patch. It should also allow easier reference of the nlm_host in conflicting locks, and simplify lock hashing and comparison. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [bfields@redhat.com: fix type of some error returns] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-05-16Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-1/+1
Pull nfsd updates from Bruce Fields: "This consists mostly of nfsd container work: Scott Mayhew revived an old api that communicates with a userspace daemon to manage some on-disk state that's used to track clients across server reboots. We've been using a usermode_helper upcall for that, but it's tough to run those with the right namespaces, so a daemon is much friendlier to container use cases. Trond fixed nfsd's handling of user credentials in user namespaces. He also contributed patches that allow containers to support different sets of NFS protocol versions. The only remaining container bug I'm aware of is that the NFS reply cache is shared between all containers. If anyone's aware of other gaps in our container support, let me know. The rest of this is miscellaneous bugfixes" * tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits) nfsd: update callback done processing locks: move checks from locks_free_lock() to locks_release_private() nfsd: fh_drop_write in nfsd_unlink nfsd: allow fh_want_write to be called twice nfsd: knfsd must use the container user namespace SUNRPC: rsi_parse() should use the current user namespace SUNRPC: Fix the server AUTH_UNIX userspace mappings lockd: Pass the user cred from knfsd when starting the lockd server SUNRPC: Temporary sockets should inherit the cred from their parent SUNRPC: Cache the process user cred in the RPC server listener nfsd: Allow containers to set supported nfs versions nfsd: Add custom rpcbind callbacks for knfsd SUNRPC: Allow further customisation of RPC program registration SUNRPC: Clean up generic dispatcher code SUNRPC: Add a callback to initialise server requests SUNRPC/nfs: Fix return value for nfs4_callback_compound() nfsd: handle legacy client tracking records sent by nfsdcld nfsd: re-order client tracking method selection nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld nfsd: un-deprecate nfsdcld ...
2019-04-27lockd: Store the lockd client credential in struct nlm_hostTrond Myklebust2-1/+4
When we create a new lockd client, we want to be able to pass the correct credential of the process that created the struct nlm_host. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-04-24lockd: Pass the user cred from knfsd when starting the lockd serverTrond Myklebust1-1/+2
When starting up a new knfsd server, pass the user cred to the supporting lockd server. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-08-09nfsd: fix leaked file lock with nfs exported overlayfsAmir Goldstein1-2/+2
nfsd and lockd call vfs_lock_file() to lock/unlock the inode returned by locks_inode(file). Many places in nfsd/lockd code use the inode returned by file_inode(file) for lock manipulation. With Overlayfs, file_inode() (the underlying inode) is not the same object as locks_inode() (the overlay inode). This can result in "Leaked POSIX lock" messages and eventually to a kernel crash as reported by Eddie Horng: https://marc.info/?l=linux-unionfs&m=153086643202072&w=2 Fix all the call sites in nfsd/lockd that should use locks_inode(). This is a correctness bug that manifested when overlayfs gained NFS export support in v4.16. Reported-by: Eddie Horng <eddiehorng.tw@gmail.com> Tested-by: Eddie Horng <eddiehorng.tw@gmail.com> Cc: Jeff Layton <jlayton@kernel.org> Fixes: 8383f1748829 ("ovl: wire up NFS export operations") Cc: stable@vger.kernel.org Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-01-15lockd: convert nlm_rqst.a_count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_rqst.a_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_rqst.a_count it might make a difference in following places: - nlmclnt_release_call() and nlmsvc_release_call(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nlm_lockowner.count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_lockowner.count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_lockowner.count it might make a difference in following places: - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only provides RELEASE ordering, control dependency on success and holds a spin lock on success vs. fully ordered atomic counterpart. No changes in spin lock guarantees. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nsm_handle.sm_count from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nsm_handle.sm_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nsm_handle.sm_count it might make a difference in following places: - nsm_release(): decrement in refcount_dec_and_lock() only provides RELEASE ordering, control dependency on success and holds a spin lock on success vs. fully ordered atomic counterpart. No change for the spin lock guarantees. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-01-15lockd: convert nlm_host.h_count from atomic_t to refcount_tElena Reshetova1-1/+2
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nlm_host.h_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the nlm_host.h_count it might make a difference in following places: - nlmsvc_release_host(): decrement in refcount_dec() provides RELEASE ordering, while original atomic_dec() was fully unordered. Since the change is for better, it should not matter. - nlmclnt_release_host(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart. It doesn't seem to matter in this case since object freeing happens under mutex lock anyway. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman7-0/+7
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-15sunrpc: mark all struct svc_procinfo instances as constChristoph Hellwig1-2/+2
struct svc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de>