kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-08-11	nfsd: Remove incorrect check in nfsd4_validate_stateid	Trond Myklebust	1	-2/+0
	commit f75546f58a70da5cfdcec5a45ffc377885ccbee8 upstream. If the client is calling TEST_STATEID, then it is because some event occurred that requires it to check all the stateids for validity and call FREE_STATEID on the ones that have been revoked. In this case, either the stateid exists in the list of stateids associated with that nfs4_client, in which case it should be tested, or it does not. There are no additional conditions to be considered. Reported-by: "Frank Ch. Eigler" <fche@redhat.com> Fixes: 7df302f75ee2 ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-01	nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted	Trond Myklebust	1	-7/+1
	commit c6c7f2a84da459bcc3714044e74a9cb66de31039 upstream. In order to ensure that knfsd threads don't linger once the nfsd pseudofs is unmounted (e.g. when the container is killed) we let nfsd_umount() shut down those threads and wait for them to exit. This also should ensure that we don't need to do a kernel mount of the pseudofs, since the thread lifetime is now limited by the lifetime of the filesystem. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-14	nfsd: don't call nfsd_file_put from client states seqfile display	Jeff Layton	1	-18/+33
	[ Upstream commit e0aa651068bfd520afcd357af8ecd2de005fc83d ] We had a report of this: BUG: sleeping function called from invalid context at fs/nfsd/filecache.c:440 ...with a stack trace showing nfsd_file_put being called from nfs4_show_open. This code has always tried to call fput while holding a spinlock, but we recently changed this to use the filecache, and that started triggering the might_sleep() in nfsd_file_put. states_start takes and holds the cl_lock while iterating over the client's states, and we can't sleep with that held. Have the various nfs4_show_* functions instead hold the fi_lock instead of taking a nfsd_file reference. Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138357 Reported-by: Zhi Li <yieli@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26	NFSD: fix use-after-free on source server when doing inter-server copy	Dai Ngo	1	-0/+5
	[ Upstream commit 019805fea91599b22dfa62ffb29c022f35abeb06 ] Use-after-free occurred when the laundromat tried to free expired cpntf_state entry on the s2s_cp_stateids list after inter-server copy completed. The sc_cp_list that the expired copy state was inserted on was already freed. When COPY completes, the Linux client normally sends LOCKU(lock_state x), FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server. The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state from the s2s_cp_stateids list before freeing the lock state's stid. However, sometimes the CLOSE was sent before the FREE_STATEID request. When this happens, the nfsd4_close_open_stateid call from nfsd4_close frees all lock states on its st_locks list without cleaning up the copy state on the sc_cp_list list. When the time the FREE_STATEID arrives the server returns BAD_STATEID since the lock state was freed. This causes the use-after-free error to occur when the laundromat tries to free the expired cpntf_state. This patch adds a call to nfs4_free_cpntf_statelist in nfsd4_close_open_stateid to clean up the copy state before calling free_ol_stateid_reaplist to free the lock state's stid on the reaplist. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-06	NFSD: Fix possible sleep during nfsd4_release_lockowner()	Chuck Lever	1	-8/+4
	commit ce3c4ad7f4ce5db7b4f08a1e237d8dd94b39180b upstream. nfsd4_release_lockowner() holds clp->cl_lock when it calls check_for_locks(). However, check_for_locks() calls nfsd_file_get() / nfsd_file_put() to access the backing inode's flc_posix list, and nfsd_file_put() can sleep if the inode was recently removed. Let's instead rely on the stateowner's reference count to gate whether the release is permitted. This should be a reliable indication of locks-in-use since file lock operations and ->lm_get_owner take appropriate references, which are released appropriately when file locks are removed. Reported-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08	NFSD: Fix nfsd_breaker_owns_lease() return values	Chuck Lever	1	-2/+10
	[ Upstream commit 50719bf3442dd6cd05159e9c98d020b3919ce978 ] These have been incorrect since the function was introduced. A proper kerneldoc comment is added since this function, though static, is part of an external interface. Reported-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-08	nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.	Dai Ngo	1	-1/+3
	commit ab451ea952fe9d7afefae55ddb28943a148247fe upstream. From RFC 7530 Section 16.34.5: o The server has not recorded an unconfirmed { v, x, c, , } and has recorded a confirmed { v, x, c, *, s }. If the principals of the record and of SETCLIENTID_CONFIRM do not match, the server returns NFS4ERR_CLID_INUSE without removing any relevant leased client state, and without changing recorded callback and callback_ident values for client { x }. The current code intends to do what the spec describes above but it forgot to set 'old' to NULL resulting to the confirmed client to be expired. Fixes: 2b63482185e6 ("nfsd: fix clid_inuse on mount with security change") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14	nfsd: fix use-after-free due to delegation race	J. Bruce Fields	1	-2/+7
	commit 548ec0805c399c65ed66c6641be467f717833ab5 upstream. A delegation break could arrive as soon as we've called vfs_setlease. A delegation break runs a callback which immediately (in nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we then exit nfs4_set_delegation without hashing the delegation, it will be freed as soon as the callback is done with it, without ever being removed from del_recall_lru. Symptoms show up later as use-after-free or list corruption warnings, usually in the laundromat thread. I suspect aba2072f4523 "nfsd: grant read delegations to clients holding writes" made this bug easier to hit, but I looked as far back as v3.0 and it looks to me it already had the same problem. So I'm not sure where the bug was introduced; it may have been there from the beginning. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-09	nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN	Dai Ngo	1	-3/+13
	[ Upstream commit 02579b2ff8b0becfb51d85a975908ac4ab15fba8 ] When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client recovers by sending BIND_CONN_TO_SESSION but the server fails to recover the back channel and leaves it as NFSD4_CB_DOWN. Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel by calling nfsd4_probe_callback. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18	nfsd: fix crash on LOCKT on reexported NFSv3	J. Bruce Fields	1	-2/+3
	[ Upstream commit 0bcc7ca40bd823193224e9f38bafbd8325aaf566 ] Unlike other filesystems, NFSv3 tries to use fl_file in the GETLK case. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15	nfsd4: Fix forced-expiry locking	J. Bruce Fields	1	-2/+2
	[ Upstream commit f7104cc1a9159cd0d3e8526cb638ae0301de4b61 ] This should use the network-namespace-wide client_lock, not the per-client cl_lock. You shouldn't see any bugs unless you're actually using the forced-expiry interface introduced by 89c905beccbb. Fixes: 89c905beccbb "nfsd: allow forced expiration of NFSv4 clients" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20	NFSD: Fix TP_printk() format specifier in nfsd_clid_class	Chuck Lever	1	-3/+0
	[ Upstream commit a948b1142cae66785521a389cab2cce74069b547 ] Since commit 9a6944fee68e ("tracing: Add a verifier to check string pointers for trace events"), which was merged in v5.13-rc1, TP_printk() no longer tacitly supports the "%.*s" format specifier. These are low value tracepoints, so just remove them. Reported-by: David Wysochanski <dwysocha@redhat.com> Fixes: dd5e3fbc1f47 ("NFSD: Add tracepoints to the NFSD state management code") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19	nfsd: ensure new clients break delegations	J. Bruce Fields	1	-5/+19
	[ Upstream commit 217fd6f625af591e2866bebb8cda778cf85bea2e ] If nfsd already has an open file that it plans to use for IO from another, it may not need to do another vfs open, but it still may need to break any delegations in case the existing opens are for another client. Symptoms are that we may incorrectly fail to break a delegation on a write open from a different client, when the delegation-holding client already has a write open. Fixes: 28df3d1539de ("nfsd: clients don't need to break their own delegations") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-25	nfsd: don't abort copies early	J. Bruce Fields	1	-1/+1
	commit bfdd89f232aa2de5a4b3fc985cba894148b830a8 upstream. The typical result of the backwards comparison here is that the source server in a server-to-server copy will return BAD_STATEID within a few seconds of the copy starting, instead of giving the copy a full lease period, so the copy_file_range() call will end up unnecessarily returning a short read. Fixes: 624322f1adc5 "NFSD add COPY_NOTIFY operation" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20	Revert "nfsd4: a client's own opens needn't prevent delegations"	J. Bruce Fields	1	-40/+14
	commit 6ee65a773096ab3f39d9b00311ac983be5bdeb7c upstream. This reverts commit 94415b06eb8aed13481646026dc995f04a3a534a. That commit claimed to allow a client to get a read delegation when it was the only writer. Actually it allowed a client to get a read delegation when any client has a write open! The main problem is that it's depending on nfs4_clnt_odstate structures that are actually only maintained for pnfs exports. This causes clients to miss writes performed by other clients, even when there have been intervening closes and opens, violating close-to-open cache consistency. We can do this a different way, but first we should just revert this. I've added pynfs 4.1 test DELEG19 to test for this, as I should have done originally! Cc: stable@vger.kernel.org Reported-by: Timo Rothenpieler <timo@rothenpieler.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20	Revert "nfsd4: remove check_conflicting_opens warning"	J. Bruce Fields	1	-0/+1
	commit 4aa5e002034f0701c3335379fd6c22d7f3338cce upstream. This reverts commit 50747dd5e47b "nfsd4: remove check_conflicting_opens warning", as a prerequisite for reverting 94415b06eb8a, which has a serious bug. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30	NFSD: Fix 5 seconds delay when doing inter server copy	Dai Ngo	1	-0/+1
	[ Upstream commit ca9364dde50daba93eff711b4b945fd08beafcc2 ] Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5 seconds delay regardless of the size of the copy. The delay is from nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential fails because the seqid in both nfs4_state and nfs4_stateid are 0. Fix by modifying nfs4_init_cp_state to return the stateid with seqid 1 instead of 0. This is also to conform with section 4.8 of RFC 7862. Here is the relevant paragraph from section 4.8 of RFC 7862: A copy offload stateid's seqid MUST NOT be zero. In the context of a copy offload operation, it is inappropriate to indicate "the most recent copy offload operation" using a stateid with a seqid of zero (see Section 8.2.2 of [RFC5661]). It is inappropriate because the stateid refers to internal state in the server and there may be several asynchronous COPY operations being performed in parallel on the same file by the server. Therefore, a copy offload stateid with a seqid of zero MUST be considered invalid. Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-16	nfsd: remove unneeded break	Tom Rix	1	-1/+0
	Because every path through nfs4_find_file()'s switch does an explicit return, the break is not needed. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-26	nfsd: rq_lease_breaker cleanup	J. Bruce Fields	1	-1/+2
	Since only the v4 code cares about it, maybe it's better to leave rq_lease_breaker out of the common dispatch code? Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-26	nfsd4: remove check_conflicting_opens warning	J. Bruce Fields	1	-1/+0
	There are actually rare races where this is possible (e.g. if a new open intervenes between the read of i_writecount and the fi_fds). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-26	nfsd: rename delegation related tracepoints to make them less confusing	Hou Tao	1	-2/+2
	Now when a read delegation is given, two delegation related traces will be printed: nfsd_deleg_open: client 5f45b854:e6058001 stateid 00000030:00000001 nfsd_deleg_none: client 5f45b854:e6058001 stateid 0000002f:00000001 Although the intention is to let developers know two stateid are returned, the traces are confusing about whether or not a read delegation is handled out. So renaming trace_nfsd_deleg_none() to trace_nfsd_open() and trace_nfsd_deleg_open() to trace_nfsd_deleg_read() to make the intension clearer. The patched traces will be: nfsd_deleg_read: client 5f48a967:b55b21cd stateid 00000003:00000001 nfsd_open: client 5f48a967:b55b21cd stateid 00000002:00000001 Suggested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-26	nfsd: give up callbacks on revoked delegations	J. Bruce Fields	1	-1/+2
	The delegation is no longer returnable, so I don't think there's much point retrying the recall. (I think it's worth asking why we even need separate CLOSED_DELEG and REVOKED_DELEG states. But treating them the same would currently cause nfsd4_free_stateid to call list_del_init(&dp->dl_recall_lru) on a delegation that the laundromat had unhashed but not revoked, incorrectly removing it from the laundromat's reaplist or a client's dl_recall_lru.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-26	nfsd: remove fault injection code	J. Bruce Fields	1	-593/+0
	It was an interesting idea but nobody seems to be using it, it's buggy at this point, and nfs4state.c is already complicated enough without it. The new nfsd/clients/ code provides some of the same functionality, and could probably do more if desired. This feature has been deprecated since 9d60d93198c6 ("Deprecate nfsd fault injection"). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-08-26	Merge tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6	Linus Torvalds	1	-0/+2
	Pull nfs server fixes from Chuck Lever: - Eliminate an oops introduced in v5.8 - Remove a duplicate #include added by nfsd-5.9 * tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6: SUNRPC: remove duplicate include nfsd: fix oops on mixed NFSv4/NFSv3 client access
2020-08-24	treewide: Use fallthrough pseudo-keyword	Gustavo A. R. Silva	1	-6/+6
	Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-16	nfsd: fix oops on mixed NFSv4/NFSv3 client access	J. Bruce Fields	1	-0/+2
	If an NFSv2/v3 client breaks an NFSv4 client's delegation, it will hit a NULL dereference in nfsd_breaker_owns_lease(). Easily reproduceable with for example mount -overs=4.2 server:/export /mnt/ sleep 1h </mnt/file & mount -overs=3 server:/export /mnt2/ touch /mnt2/file Reported-by: Robert Dinse <nanook@eskimo.com> Fixes: 28df3d1539de50 ("nfsd: clients don't need to break their own delegations") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208807 Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-08-09	Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6	Linus Torvalds	1	-14/+40
	Pull NFS server updates from Chuck Lever: "Highlights: - Support for user extended attributes on NFS (RFC 8276) - Further reduce unnecessary NFSv4 delegation recalls Notable fixes: - Fix recent krb5p regression - Address a few resource leaks and a rare NULL dereference Other: - De-duplicate RPC/RDMA error handling and other utility functions - Replace storage and display of kernel memory addresses by tracepoints" * tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits) svcrdma: CM event handler clean up svcrdma: Remove transport reference counting svcrdma: Fix another Receive buffer leak SUNRPC: Refresh the show_rqstp_flags() macro nfsd: netns.h: delete a duplicated word SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()") nfsd: avoid a NULL dereference in __cld_pipe_upcall() nfsd4: a client's own opens needn't prevent delegations nfsd: Use seq_putc() in two functions svcrdma: Display chunk completion ID when posting a rw_ctxt svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send() svcrdma: Introduce Send completion IDs svcrdma: Record Receive completion ID in svc_rdma_decode_rqst svcrdma: Introduce Receive completion IDs svcrdma: Introduce infrastructure to support completion IDs svcrdma: Add common XDR encoders for RDMA and Read segments svcrdma: Add common XDR decoders for RDMA and Read segments SUNRPC: Add helpers for decoding list discriminators symbolically svcrdma: Remove declarations for functions long removed svcrdma: Clean up trace_svcrdma_send_failed() tracepoint ...
2020-07-22	nfsd4: fix NULL dereference in nfsd/clients display code	J. Bruce Fields	1	-1/+19
	We hold the cl_lock here, and that's enough to keep stateid's from going away, but it's not enough to prevent the files they point to from going away. Take fi_lock and a reference and check for NULL, as we do in other code. Reported-by: NeilBrown <neilb@suse.de> Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-07-14	nfsd4: a client's own opens needn't prevent delegations	J. Bruce Fields	1	-14/+40
	We recently fixed lease breaking so that a client's actions won't break its own delegations. But we still have an unnecessary self-conflict when granting delegations: a client's own write opens will prevent us from handing out a read delegation even when no other client has the file open for write. Fix that by turning off the checks for conflicting opens under vfs_setlease, and instead performing those checks in the nfsd code. We don't depend much on locks here: instead we acquire the delegation, then check for conflicts, and drop the delegation again if we find any. The check beforehand is an optimization of sorts, just to avoid acquiring the delegation unnecessarily. There's a race where the first check could cause us to deny the delegation when we could have granted it. But, that's OK, delegation grants are optional (and probably not even a good idea in that case). Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-06-29	nfsd4: fix nfsdfs reference count loop	J. Bruce Fields	1	-1/+7
	We don't drop the reference on the nfsdfs filesystem with mntput(nn->nfsd_mnt) until nfsd_exit_net(), but that won't be called until the nfsd module's unloaded, and we can't unload the module as long as there's a reference on nfsdfs. So this prevents module unloading. Fixes: 2c830dd7209b ("nfsd: persist nfsd filesystem across mounts") Reported-and-Tested-by: Luo Xiaogang <lxgrxd@163.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-21	Merge branch 'nfsd-5.8' of git://linux-nfs.org/~cel/cel-2.6 into ↵	J. Bruce Fields	1	-40/+23
	for-5.8-incoming Highlights of this series: * Remove serialization of sending RPC/RDMA Replies * Convert the TCP socket send path to use xdr_buf::bvecs (pre-requisite for RPC-on-TLS) * Fix svcrdma backchannel sendto return code * Convert a number of dprintk call sites to use tracepoints * Fix the "suggest braces around empty body in an 'else' statement" warning
2020-05-21	NFSD: Squash an annoying compiler warning	Chuck Lever	1	-3/+2
	Clean up: Fix gcc empty-body warning when -Wextra is used. ../fs/nfsd/nfs4state.c:3898:3: warning: suggest braces around empty body in an ‘else’ statement [-Wempty-body] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-21	NFSD: Add tracepoints for monitoring NFSD callbacks	Chuck Lever	1	-4/+2
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-21	NFSD: Add tracepoints to the NFSD state management code	Chuck Lever	1	-33/+19
	Capture obvious events and replace dprintk() call sites. Introduce infrastructure so that adding more tracepoints in this code later is simplified. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-09	nfsd: clients don't need to break their own delegations	J. Bruce Fields	1	-0/+14
	We currently revoke read delegations on any write open or any operation that modifies file data or metadata (including rename, link, and unlink). But if the delegation in question is the only read delegation and is held by the client performing the operation, that's not really necessary. It's not always possible to prevent this in the NFSv4.0 case, because there's not always a way to determine which client an NFSv4.0 delegation came from. (In theory we could try to guess this from the transport layer, e.g., by assuming all traffic on a given TCP connection comes from the same client. But that's not really correct.) In the NFSv4.1 case the session layer always tells us the client. This patch should remove such self-conflicts in all cases where we can reliably determine the client from the compound. To do that we need to track "who" is performing a given (possibly lease-breaking) file operation. We're doing that by storing the information in the svc_rqst and using kthread_data() to map the current task back to a svc_rqst. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06	nfsd: handle repeated BIND_CONN_TO_SESSION	J. Bruce Fields	1	-12/+42
	If the client attempts BIND_CONN_TO_SESSION on an already bound connection, it should be either a no-op or an error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06	nfsd4: add filename to states output	Achilles Gaikwad	1	-0/+13
	Add filename to states output for ease of debugging. Signed-off-by: Achilles Gaikwad <agaikwad@redhat.com> Signed-off-by: Kenneth Dsouza <kdsouza@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06	nfsd4: stid display should preserve on-the-wire byte order	J. Bruce Fields	1	-1/+2
	When we decode the stateid we byte-swap si_generation. But for simplicity's sake and ease of comparison with network traces, it's better to display the whole thing in network order. Reported-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06	nfsd4: common stateid-printing code	J. Bruce Fields	1	-4/+17
	There's a problem with how I'm formatting stateids. Before I fix it, I'd like to move the stateid formatting into a common helper. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-04-13	nfsd: memory corruption in nfsd4_lock()	Vasily Averin	1	-0/+2
	New struct nfsd4_blocked_lock allocated in find_or_allocate_block() does not initialized nbl_list and nbl_lru. If conflock allocation fails rollback can call list_del_init() access uninitialized fields and corrupt memory. v2: just initialize nbl_list and nbl_lru right after nbl allocation. Fixes: 76d348fadff5 ("nfsd: have nfsd4_lock use blocking locks for v4.1+ lock") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-19	nfsd4: kill warnings on testing stateids with mismatched clientids	J. Bruce Fields	1	-8/+1
	It's normal for a client to test a stateid from a previous instance, e.g. after a network partition. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16	nfsd: remove read permission bit for ctl sysctl	Petr Vorel	1	-1/+1
	It's meant to be write-only. Fixes: 89c905beccbb ("nfsd: allow forced expiration of NFSv4 clients") Signed-off-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16	nfsd: Don't add locks to closed or closing open stateids	Trond Myklebust	1	-30/+43
	In NFSv4, the lock stateids are tied to the lockowner, and the open stateid, so that the action of closing the file also results in either an automatic loss of the locks, or an error of the form NFS4ERR_LOCKS_HELD. In practice this means we must not add new locks to the open stateid after the close process has been invoked. In fact doing so, can result in the following panic: kernel BUG at lib/list_debug.c:51! invalid opcode: 0000 [#1] SMP NOPTI CPU: 2 PID: 1085 Comm: nfsd Not tainted 5.6.0-rc3+ #2 Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.14410784.B64.1908150010 08/15/2019 RIP: 0010:__list_del_entry_valid.cold+0x31/0x55 Code: 1a 3d 9b e8 74 10 c2 ff 0f 0b 48 c7 c7 f0 1a 3d 9b e8 66 10 c2 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 b0 1a 3d 9b e8 52 10 c2 ff <0f> 0b 48 89 fe 4c 89 c2 48 c7 c7 78 1a 3d 9b e8 3e 10 c2 ff 0f 0b RSP: 0018:ffffb296c1d47d90 EFLAGS: 00010246 RAX: 0000000000000054 RBX: ffff8ba032456ec8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8ba039e99cc8 RDI: ffff8ba039e99cc8 RBP: ffff8ba032456e60 R08: 0000000000000781 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ba009a4abe0 R13: ffff8ba032456e8c R14: 0000000000000000 R15: ffff8ba00adb01d8 FS: 0000000000000000(0000) GS:ffff8ba039e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb213f0b008 CR3: 00000001347de006 CR4: 00000000003606e0 Call Trace: release_lock_stateid+0x2b/0x80 [nfsd] nfsd4_free_stateid+0x1e9/0x210 [nfsd] nfsd4_proc_compound+0x414/0x700 [nfsd] ? nfs4svc_decode_compoundargs+0x407/0x4c0 [nfsd] nfsd_dispatch+0xc1/0x200 [nfsd] svc_process_common+0x476/0x6f0 [sunrpc] ? svc_sock_secure_port+0x12/0x30 [sunrpc] ? svc_recv+0x313/0x9c0 [sunrpc] ? nfsd_svc+0x2d0/0x2d0 [nfsd] svc_process+0xd4/0x110 [sunrpc] nfsd+0xe3/0x140 [nfsd] kthread+0xf9/0x130 ? nfsd_destroy+0x50/0x50 [nfsd] ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x40 The fix is to ensure that lock creation tests for whether or not the open stateid is unhashed, and to fail if that is the case. Fixes: 659aefb68eca ("nfsd: Ensure we don't recognise lock stateids after freeing them") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16	fs: nfsd: nfs4state.c: Use built-in RCU list checking	Madhuparna Bhowmik	1	-1/+2
	list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2019-12-20	nfsd: use ktime_get_real_seconds() in nfs4_verifier	Arnd Bergmann	1	-1/+1
	gen_confirm() generates a unique identifier based on the current time. This overflows in year 2038, but that is harmless since it generally does not lead to duplicates, as long as the time has been initialized by a real-time clock or NTP. Using ktime_get_boottime_seconds() or ktime_get_seconds() would avoid the overflow, but it would be more likely to result in non-unique numbers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-20	nfsd: use boottime for lease expiry calculation	Arnd Bergmann	1	-26/+22
	A couple of time_t variables are only used to track the state of the lease time and its expiration. The code correctly uses the 'time_after()' macro to make this work on 32-bit architectures even beyond year 2038, but the get_seconds() function and the time_t type itself are deprecated as they behave inconsistently between 32-bit and 64-bit architectures and often lead to code that is not y2038 safe. As a minor issue, using get_seconds() leads to problems with concurrent settimeofday() or clock_settime() calls, in the worst case timeout never triggering after the time has been set backwards. Change nfsd to use time64_t and ktime_get_boottime_seconds() here. This is clearly excessive, as boottime by itself means we never go beyond 32 bits, but it does mean we handle this correctly and consistently without having to worry about corner cases and should be no more expensive than the previous implementation on 64-bit architectures. The max_cb_time() function gets changed in order to avoid an expensive 64-bit division operation, but as the lease time is at most one hour, there is no change in behavior. Also do the same for server-to-server copy expiration time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> [bfields@redhat.com: fix up copy expiration] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-20	nfsd: fix jiffies/time_t mixup in LRU list	Arnd Bergmann	1	-1/+1
	The nfsd4_blocked_lock->nbl_time timestamp is recorded in jiffies, but then compared to a CLOCK_REALTIME timestamp later on, which makes no sense. For consistency with the other timestamps, change this to use a time_t. This is a change in behavior, which may cause regressions, but the current code is not sensible. On a system with CONFIG_HZ=1000, the 'time_after((unsigned long)nbl->nbl_time, (unsigned long)cutoff))' check is false for roughly the first 18 days of uptime and then true for the next 49 days. Fixes: 7919d0a27f1e ("nfsd: add a LRU list for blocked locks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-20	nfsd: pass a 64-bit guardtime to nfsd_setattr()	Arnd Bergmann	1	-1/+1
	Guardtime handling in nfs3 differs between 32-bit and 64-bit architectures, and uses the deprecated time_t type. Change it to using time64_t, which behaves the same way on 64-bit and 32-bit architectures, treating the number as an unsigned 32-bit entity with a range of year 1970 to 2106 consistently, and avoiding the y2038 overflow. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-20	nfsd: make 'boot_time' 64-bit wide	Arnd Bergmann	1	-7/+7
	The local boot time variable gets truncated to time_t at the moment, which can lead to slightly odd behavior on 32-bit architectures. Use ktime_get_real_seconds() instead of get_seconds() to always get a 64-bit result, and keep it that way wherever possible. It still gets truncated in a few places: - When assigning to cl_clientid.cl_boot, this is already documented and is only used as a unique identifier. - In clients_still_reclaiming(), the truncation is to 'unsigned long' in order to use the 'time_before() helper. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-20	nfsd: print 64-bit timestamps in client_info_show	Arnd Bergmann	1	-3/+2
	The nii_time field gets truncated to 'time_t' on 32-bit architectures before printing. Remove the use of 'struct timespec' to product the correct output beyond 2038. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>