path: root/fs/nfs
Age | Commit message | Author | Files | Lines
2021-12-01NFSv42: Don't fail clone() unless the OP_CLONE operation failedTrond Myklebust1-2/+1
[ Upstream commit d3c45824ad65aebf765fcf51366d317a29538820 ] The failure to retrieve post-op attributes has no bearing on whether or not the clone operation itself was successful. We must therefore ignore the return value of decode_getfattr() when looking at the success or failure of nfs4_xdr_dec_clone(). Fixes: 36022770de6c ("nfs42: add CLONE xdr functions") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
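The decoding rule described in the clone fix above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct reply`, `decode_clone_op()` and `decode_attrs()` are hypothetical stand-ins for the XDR decoders named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an already parsed compound reply. */
struct reply {
	int clone_status; /* 0 on success, negative errno otherwise */
	int attr_status;  /* result of decoding the post-op attributes */
};

static int decode_clone_op(const struct reply *r) { return r->clone_status; }
static int decode_attrs(const struct reply *r)    { return r->attr_status; }

/* Returns the status of the clone itself; attribute decoding is best effort. */
int decode_clone_reply(const struct reply *r)
{
	int status;

	assert(r != NULL);
	status = decode_clone_op(r);
	if (status)
		return status;
	/* A failure here must not turn a successful clone into an error. */
	(void)decode_attrs(r);
	return 0;
}
```

With this shape, a reply whose clone succeeded but whose attributes could not be decoded still reports success to the caller.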
2021-12-01NFSv42: Fix pagecache invalidation after COPY/CLONEBenjamin Coddington1-1/+3
commit 3f015d89a47cd8855cd92f71fff770095bd885a1 upstream. The mechanism in use to allow the client to see the results of COPY/CLONE is to drop those pages from the pagecache. This forces the client to read those pages once more from the server. However, truncate_pagecache_range() zeros out partial pages instead of dropping them. Let us instead use invalidate_inode_pages2_range() with full-page offsets to ensure the client properly sees the results of COPY/CLONE operations. Cc: <stable@vger.kernel.org> # v4.7+ Fixes: 2e72448b07dc ("NFS: Add COPY nfs operation") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
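As a rough illustration of the "full-page offsets" mentioned above, the following userspace sketch converts a byte range into the inclusive range of page indices that cover it. It assumes 4 KiB pages; `SKETCH_PAGE_SHIFT`, `struct page_range` and `pages_covering()` are names made up for this example, and in the kernel the resulting indices would be what gets handed to invalidate_inode_pages2_range().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

struct page_range {
	uint64_t first; /* index of the first page touched */
	uint64_t last;  /* index of the last page touched (inclusive) */
};

/* Page indices covering bytes [pos, pos + count); count must be non-zero. */
struct page_range pages_covering(uint64_t pos, uint64_t count)
{
	struct page_range r;

	assert(count > 0);
	r.first = pos >> SKETCH_PAGE_SHIFT;
	r.last = (pos + count - 1) >> SKETCH_PAGE_SHIFT;
	return r;
}
```

Because the range is expressed in whole pages, a partially covered page at either end is dropped in full rather than partially zeroed.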
2021-11-18NFSv4: Fix a regression in nfs_set_open_stateid_locked()Trond Myklebust1-7/+8
[ Upstream commit 01d29f87fcfef38d51ce2b473981a5c1e861ac0a ] If we already hold open state on the client, yet the server gives us a completely different stateid to the one we already hold, then we currently treat it as if it were an out-of-sequence update, and wait for 5 seconds for other updates to come in. This commit fixes the behaviour so that we immediately start processing of the new stateid, and then leave it to the call to nfs4_test_and_free_stateid() to decide what to do with the old stateid. Fixes: b4868b44c562 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18Fix user namespace leakAlexey Gladkov1-1/+1
[ Upstream commit d5f458a979650e5ed37212f6134e4ee2b28cb6ed ] Fixes: 61ca2c4afd9d ("NFS: Only reference user namespace from nfs4idmap struct instead of cred") Signed-off-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18NFS: Fix an Oops in pnfs_mark_request_commit()Trond Myklebust1-1/+1
[ Upstream commit f0caea8882a7412a2ad4d8274f0280cdf849c9e2 ] Olga reports seeing the following Oops when doing O_DIRECT writes to a pNFS flexfiles server: Oops: 0000 [#1] SMP PTI CPU: 1 PID: 234186 Comm: kworker/u8:1 Not tainted 5.15.0-rc4+ #4 Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 Workqueue: nfsiod rpc_async_release [sunrpc] RIP: 0010:nfs_mark_request_commit+0x12/0x30 [nfs] Code: ff ff be 03 00 00 00 e8 ac 34 83 eb e9 29 ff ff ff e8 22 bc d7 eb 66 90 0f 1f 44 00 00 48 85 f6 74 16 48 8b 42 10 48 8b 40 18 <48> 8b 40 18 48 85 c0 74 05 e9 70 fc 15 ec 48 89 d6 e9 68 ed ff ff RSP: 0018:ffffa82f0159fe00 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8f3393141880 RCX: 0000000000000000 RDX: ffffa82f0159fe08 RSI: ffff8f3381252500 RDI: ffff8f3393141880 RBP: ffff8f33ac317c00 R08: 0000000000000000 R09: ffff8f3487724cb0 R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000001 R13: ffff8f3485bccee0 R14: ffff8f33ac317c10 R15: ffff8f33ac317cd8 FS: 0000000000000000(0000) GS:ffff8f34fbc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000122120006 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: nfs_direct_write_completion+0x13b/0x250 [nfs] rpc_free_task+0x39/0x60 [sunrpc] rpc_async_release+0x29/0x40 [sunrpc] process_one_work+0x1ce/0x370 worker_thread+0x30/0x380 ? process_one_work+0x370/0x370 kthread+0x11a/0x140 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 9c455a8c1e14 ("NFS/pNFS: Clean up pNFS commit operations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18NFS: Fix up commit deadlocksTrond Myklebust3-6/+7
[ Upstream commit 133a48abf6ecc535d7eddc6da1c3e4c972445882 ] If O_DIRECT bumps the commit_info rpcs_out field, then that could lead to fsync() hangs. The fix is to ensure that O_DIRECT calls nfs_commit_end(). Fixes: 723c921e7dfc ("sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18NFS: Fix deadlocks in nfs_scan_commit_list()Trond Myklebust1-15/+2
[ Upstream commit 64a93dbf25d3a1368bb58ddf0f61d0a92d7479e3 ] Partially revert commit 2ce209c42c01 ("NFS: Wait for requests that are locked on the commit list"), since it can lead to deadlocks between commit requests and nfs_join_page_group(). For now we should assume that any locked requests on the commit list are either about to be removed and committed by another task, or the writes they describe are about to be retransmitted. In either case, we should not need to worry. Fixes: 2ce209c42c01 ("NFS: Wait for requests that are locked on the commit list") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_dsBaptiste Lepers2-4/+4
[ Upstream commit a2915fa06227b056a8f9b0d79b61dca08ad5cfc6 ] _nfs4_pnfs_v3/v4_ds_connect do some work smp_wmb ds->ds_clp = clp; And nfs4_ff_layout_prepare_ds currently does smp_rmb if(ds->ds_clp) ... This patch places the smp_rmb after the if. This ensures that following reads only happen once nfs4_ff_layout_prepare_ds has checked that data has been properly initialized. Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
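The ordering described above can be approximated in userspace with C11 atomics. This is only an analogy under stated assumptions: release and acquire fences stand in for smp_wmb() and smp_rmb(), and `struct client`, `struct data_server`, `ds_publish()` and `ds_read()` are hypothetical names, not the flexfiles code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct client { int ready_data; };
struct data_server { _Atomic(struct client *) clp; };

/* Writer: finish initialising the client, then publish the pointer. */
void ds_publish(struct data_server *ds, struct client *clp, int value)
{
	assert(ds != NULL && clp != NULL);
	clp->ready_data = value;
	atomic_thread_fence(memory_order_release); /* plays the role of smp_wmb() */
	atomic_store_explicit(&ds->clp, clp, memory_order_relaxed);
}

/* Reader: check the pointer first, then fence, then read the data. */
int ds_read(struct data_server *ds, int *out)
{
	struct client *clp;

	assert(ds != NULL && out != NULL);
	clp = atomic_load_explicit(&ds->clp, memory_order_relaxed);
	if (clp == NULL)
		return -1; /* not connected yet */
	atomic_thread_fence(memory_order_acquire); /* plays the role of smp_rmb() */
	*out = clp->ready_data;
	return 0;
}
```

Placing the read-side fence before the pointer check would order nothing useful, since the reads that need protecting are the ones that follow a successful check.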
2021-11-18NFS: Fix dentry verifier racesTrond Myklebust1-4/+3
[ Upstream commit cec08f452a687fce9dfdf47946d00a1d12a8bec5 ] If the directory changed while we were revalidating the dentry, then don't update the dentry verifier. There is no value in setting the verifier to an older value, and we could end up overwriting a more up to date verifier from a parallel revalidation. Fixes: efeda80da38d ("NFSv4: Fix revalidation of dentries with delegations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
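The rule in the dentry verifier fix above can be sketched as a compare-before-store. The types and the function name below are hypothetical, and the sketch ignores the locking a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct dir_sketch { uint64_t change_attr; }; /* changes whenever the directory changes */
struct dentry_sketch { uint64_t verifier; };

/*
 * verf is the directory change attribute sampled before revalidation began.
 * Store it only if the directory still carries that value.
 */
bool dentry_set_verifier(struct dentry_sketch *d,
			 const struct dir_sketch *dir, uint64_t verf)
{
	assert(d != NULL && dir != NULL);
	if (dir->change_attr != verf)
		return false; /* directory changed meanwhile: keep the existing verifier */
	d->verifier = verf;
	return true;
}
```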
2021-11-18NFS: Ignore the directory size when marking for revalidationTrond Myklebust1-1/+1
[ Upstream commit a6a361c4ca3cc3e6f3b39d1b6bca1de90f5f4b11 ] If we want to revalidate the directory, then just mark the change attribute as invalid. Fixes: 13c0b082b6a9 ("NFS: Replace use of NFS_INO_REVAL_PAGECACHE when checking cache validity") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18NFS: Don't set NFS_INO_DATA_INVAL_DEFER and NFS_INO_INVALID_DATATrond Myklebust1-2/+7
[ Upstream commit 488796ec1e39fb9194cc8175f770823d40fbf0ed ] NFS_INO_DATA_INVAL_DEFER and NFS_INO_INVALID_DATA should be considered mutually exclusive. Fixes: 1c341b777501 ("NFS: Add deferred cache invalidation for close-to-open consistency violations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18NFS: Default change_attr_type to NFS4_CHANGE_TYPE_IS_UNDEFINEDTrond Myklebust3-3/+5
[ Upstream commit eea413308f2e6deb00f061f18081a53f3ecc8cc6 ] Both NFSv3 and NFSv2 generate their change attribute from the ctime value that was supplied by the server. However the problem is that there are plenty of servers out there with ctime resolutions of 1ms or worse. In a modern performance system, this is insufficient when trying to decide which is the most recent set of attributes when, for instance, a READ or GETATTR call races with a WRITE or SETATTR. For this reason, let's revert to labelling the NFSv2/v3 change attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect against such races. Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-04Merge tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds11-51/+109
Pull NFS client updates from Anna Schumaker: "New Features: - Better client responsiveness when server isn't replying - Use refcount_t in sunrpc rpc_client refcount tracking - Add srcaddr and dst_port to the sunrpc sysfs info files - Add basic support for connection sharing between servers with multiple NICs Bugfixes and Cleanups: - Sunrpc tracepoint cleanups - Disconnect after ib_post_send() errors to avoid deadlocks - Fix for tearing down rpcrdma_reps - Fix a potential pNFS layoutget livelock loop - pNFS layout barrier fixes - Fix a potential memory corruption in rpc_wake_up_queued_task_set_status() - Fix reconnection locking - Fix return value of get_srcport() - Remove rpcrdma_post_sends() - Remove pNFS dead code - Remove copy size restriction for inter-server copies - Overhaul the NFS callback service - Clean up sunrpc TCP socket shutdowns - Always provide aligned buffers to RPC read layers" * tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (39 commits) NFS: Always provide aligned buffers to the RPC read layers NFSv4.1 add network transport when session trunking is detected SUNRPC enforce creation of no more than max_connect xprts NFSv4 introduce max_connect mount options SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfs SUNRPC keep track of number of transports to unique addresses NFSv3: Delete duplicate judgement in nfs3_async_handle_jukebox SUNRPC: Tweak TCP socket shutdown in the RPC client SUNRPC: Simplify socket shutdown when not reusing TCP ports NFSv4.2: remove restriction of copy size for inter-server copy. 
NFS: Clean up the synopsis of callback process_op() NFS: Extract the xdr_init_encode/decode() calls from decode_compound NFS: Remove unused callback void decoder NFS: Add a private local dispatcher for NFSv4 callback operations SUNRPC: Eliminate the RQ_AUTHERR flag SUNRPC: Set rq_auth_stat in the pg_authenticate() callout SUNRPC: Add svc_rqst::rq_auth_stat SUNRPC: Add dst_port to the sysfs xprt info file SUNRPC: Add srcaddr as a file in sysfs sunrpc: Fix return value of get_srcport() ...
2021-09-02Merge tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfsLinus Torvalds2-2/+5
Pull overlayfs update from Miklos Szeredi: - Copy up immutable/append/sync/noatime attributes (Amir Goldstein) - Improve performance by enabling RCU lookup. - Misc fixes and improvements The reason this touches so many files is that the ->get_acl() method now gets a "bool rcu" argument. The ->get_acl() API was updated based on comments from Al and Linus: Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguQxpd6Wgc0Jd3ks77zcsAv_bn0q17L3VNnnmPKu11t8A@mail.gmail.com/ * tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: enable RCU'd ->get_acl() vfs: add rcu argument to ->get_acl() callback ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup() ovl: use kvalloc in xattr copy-up ovl: update ctime when changing fileattr ovl: skip checking lower file's i_writecount on truncate ovl: relax lookup error on mismatch origin ftype ovl: do not set overlay.opaque for new directories ovl: add ovl_allow_offline_changes() helper ovl: disable decoding null uuid with redirect_dir ovl: consistent behavior for immutable/append-only inodes ovl: copy up sync/noatime fileattr flags ovl: pass ovl_fs to ovl_check_setxattr() fs: add generic helper for filling statx attribute flags
2021-08-31Merge tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds2-1/+4
Pull nfsd updates from Chuck Lever: "New features: - Support for server-side disconnect injection via debugfs - Protocol definitions for new RPC_AUTH_TLS authentication flavor Performance improvements: - Reduce page allocator traffic in the NFSD splice read actor - Reduce CPU utilization in svcrdma's Send completion handler Notable bug fixes: - Stabilize lockd operation when re-exporting NFS mounts - Fix the use of %.*s in NFSD tracepoints - Fix /proc/sys/fs/nfs/nsm_use_hostnames" * tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (31 commits) nfsd: fix crash on LOCKT on reexported NFSv3 nfs: don't allow reexport reclaims lockd: don't attempt blocking locks on nfs reexports nfs: don't atempt blocking locks on nfs reexports Keep read and write fds with each nlm_file lockd: update nlm_lookup_file reexport comment nlm: minor refactoring nlm: minor nlm_lookup_file argument change lockd: lockd server-side shouldn't set fl_ops SUNRPC: Add documentation for the fail_sunrpc/ directory SUNRPC: Server-side disconnect injection SUNRPC: Move client-side disconnect injection SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory svcrdma: xpt_bc_xprt is already clear in __svc_rdma_free() nfsd4: Fix forced-expiry locking rpc: fix gss_svc_init cleanup on failure SUNRPC: Add RPC_AUTH_TLS protocol numbers lockd: change the proc_handler for nsm_use_hostnames sysctl: introduce new proc handler proc_dobool SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency() ...
2021-08-30NFS: Always provide aligned buffers to the RPC read layersTrond Myklebust1-2/+6
Instead of messing around with XDR padding in the RDMA layer, we should just give the RPC layer an aligned buffer. Try to avoid creating extra RPC calls by aligning to the smaller value of ALIGN(len, rsize) and PAGE_SIZE. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
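The length calculation described above, the smaller of ALIGN(len, rsize) and PAGE_SIZE, can be sketched as follows. This assumes a 4 KiB page and a power-of-two `rsize`; `SKETCH_PAGE_SIZE`, `align_up()` and `aligned_read_len()` are illustrative names rather than kernel symbols.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Round len up to a multiple of align; align must be a power of two. */
static unsigned int align_up(unsigned int len, unsigned int align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/* Buffer length to request: min(ALIGN(len, rsize), PAGE_SIZE). */
unsigned int aligned_read_len(unsigned int len, unsigned int rsize)
{
	unsigned int aligned = align_up(len, rsize);

	return aligned < SKETCH_PAGE_SIZE ? aligned : SKETCH_PAGE_SIZE;
}
```

Capping at the page size keeps a large `rsize` from inflating a single-page read into a request that spans more than the page being filled.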
2021-08-27NFSv4.1 add network transport when session trunking is detectedOlga Kornievskaia1-0/+29
After trunking is discovered in nfs4_discover_server_trunking(), add the transport to the old client structure if the allowed limit of transports has not been reached. An example: a client mounts one address of a multi-homed server and some volume, and then does another mount to a different address of the same server and perhaps a different volume. Previously, the client checked that this is a session-trunkable server (same server) and removed the newly created client structure along with its transport. Now, the client adds the connection from the second mount into the xprt switch of the existing client (which leaves it with 2 available connections). Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-27SUNRPC enforce creation of no more than max_connect xprtsOlga Kornievskaia1-0/+1
If we are adding new transports via rpc_clnt_test_and_add_xprt() then check if we've reached the limit. Currently only the pNFS path adds transports via that function, but this is done in preparation for the client adding new transports when session trunking is detected. A warning is logged if the limit is reached. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-27NFSv4 introduce max_connect mount optionsOlga Kornievskaia5-2/+22
This option controls how many xprts the client can establish to the server with a distinct address (that means nconnect connections are not counted towards this new limit). This patch sets up the nfs structures to keep track of the max_connect limit (it does not enforce it). The default value is kept at 1 so that current mounts that don't want any additional connections are not affected. The maximum value is set at 16. Mounts to DS are not limited to the default value of 1 but are instead set to the maximum value of 16 (NFS_MAX_TRANSPORTS). Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
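The max_connect bookkeeping described in the entries above can be sketched as a small counter with a cap. The bounds 1 and 16 come from the commit message; everything else is an assumption made for illustration, including the `struct xprt_budget` type, the choice to reject out-of-range values rather than clamp them, and counting the initial transport against the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TRANSPORTS 16u /* upper bound quoted in the commit message */

struct xprt_budget {
	unsigned int max_connect; /* limit on transports to distinct addresses */
	unsigned int nunique;     /* transports to distinct addresses so far */
};

/* Returns 0 on success, -1 if requested is outside 1..16 (assumed policy). */
int budget_init(struct xprt_budget *b, unsigned int requested)
{
	assert(b != NULL);
	if (requested < 1 || requested > SKETCH_MAX_TRANSPORTS)
		return -1;
	b->max_connect = requested;
	b->nunique = 1; /* the mount's first transport already counts */
	return 0;
}

/* Try to account for one more transport to a new address. */
bool budget_try_add(struct xprt_budget *b)
{
	assert(b != NULL);
	if (b->nunique >= b->max_connect)
		return false; /* limit reached: caller would log a warning */
	b->nunique++;
	return true;
}
```

With the default of 1, `budget_try_add()` always refuses, which matches the stated goal that existing mounts gain no additional connections.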
2021-08-27NFSv3: Delete duplicate judgement in nfs3_async_handle_jukeboxYe Bin1-2/+1
Since commit eb96d5c97b08 ("SUNRPC handle EKEYEXPIRED in call_refreshresult") handles EKEYEXPIRED in call_refreshresult, nfs3_async_handle_jukebox only needs to handle the case where "task->tk_status" is equal to "-EJUKEBOX". Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-26nfs: don't allow reexport reclaimsJ. Bruce Fields1-0/+3
In the reexport case, nfsd is currently passing along locks with the reclaim bit set. The client sends a new lock request, which is granted if there's currently no conflict--even if it's possible a conflicting lock could have been briefly held in the interim. We don't currently have any way to safely grant reclaim, so for now let's just deny them all. I'm doing this by passing the reclaim bit to nfs and letting it fail the call, with the idea that eventually the client might be able to do something more forgiving here. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26nfs: don't atempt blocking locks on nfs reexportsJ. Bruce Fields1-1/+1
NFS implements blocking locks by blocking inside its lock method. In the reexport case, this blocks the nfs server thread, which could lead to deadlocks since an nfs server thread might be required to unlock the conflicting lock. It also causes a crash, since the nfs server thread assumes it can free the lock when its lm_notify lock callback is called. Ideal would be to make the nfs lock method return without blocking in this case, but for now it works just not to attempt blocking locks. The difference is just that the original client will have to poll (as it does in the v4.0 case) instead of getting a callback when the lock's available. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23fs: remove mandatory file locking supportJeff Layton1-4/+0
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it off in Fedora and RHEL8. Several other distros have followed suit. I've heard of one problem in all that time: Someone migrated from an older distro that supported "-o mand" to one that didn't, and the host had a fstab entry with "mand" in it which broke on reboot. They didn't actually _use_ mandatory locking so they just removed the mount option and moved on. This patch rips out mandatory locking support wholesale from the kernel, along with the Kconfig option and the Documentation file. It also changes the mount code to ignore the "mand" mount option instead of erroring out, and to throw a big, ugly warning. Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-18vfs: add rcu argument to ->get_acl() callbackMiklos Szeredi2-2/+5
Add a rcu argument to the ->get_acl() callback to allow get_cached_acl_rcu() to call the ->get_acl() method in the next patch. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-10NFSv4.2: remove restriction of copy size for inter-server copy.Dai Ngo1-6/+4
Currently inter-server copy is allowed only if the copy size is larger than (rsize*14) which is the over-head of the mount operation of the source export. This patch, relying on the delayed unmount feature, removes this restriction since the mount and unmount overhead is now not applicable for every inter-server copy. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10NFS: Clean up the synopsis of callback process_op()Chuck Lever1-10/+9
The xdr_stream and rq_arg and rq_res are already accessible via the @rqstp parameter. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10NFS: Extract the xdr_init_encode/decode() calls from decode_compoundChuck Lever1-13/+9
Clean up: Move the xdr_init_encode() and xdr_init_decode() calls into the dispatcher, just like the NFSD and lockd dispatchers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10NFS: Remove unused callback void decoderChuck Lever1-6/+4
Clean up: The callback RPC dispatcher no longer invokes these call outs, although svc_process_common() relies on seeing a .pc_encode function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10NFS: Add a private local dispatcher for NFSv4 callback operationsChuck Lever1-2/+11
The client's NFSv4 callback service is the only remaining user of svc_generic_dispatch(). Note that the NFSv4 callback service doesn't use the .pc_encode and .pc_decode callouts in any substantial way, so they are removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10SUNRPC: Eliminate the RQ_AUTHERR flagChuck Lever1-1/+2
Now that there is an alternate method for returning an auth_stat value, replace the RQ_AUTHERR flag with use of that new method. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-10SUNRPC: Set rq_auth_stat in the pg_authenticate() calloutChuck Lever1-0/+4
In a few moments, rq_auth_stat will need to be explicitly set to rpc_auth_ok before execution gets to the dispatcher. svc_authenticate() already sets it, but it often gets reset to rpc_autherr_badcred right after that call, even when authentication is successful. Let's ensure that the pg_authenticate callout and svc_set_client() set it properly in every case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-09NFSv4/pnfs: The layout barrier indicate a minimal value for the seqidTrond Myklebust1-1/+1
The intention of the layout barrier is to ensure that we do not update the layout to match an older value than the current expectation. Fix the test in pnfs_layout_stateid_blocked() to reflect that it is legal for the seqid of the stateid to match that of the barrier. Fixes: aa95edf309ef ("NFSv4/pnfs: Fix the layout barrier update") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-09NFSv4/pNFS: Always allow update of a zero valued layout barrierTrond Myklebust1-1/+1
A zero value for the layout barrier indicates that it has been cleared (since seqid '0' is an illegal value), so we should always allow it to be updated. Fixes: d29b468da4f9 ("pNFS/NFSv4: Improve rejection of out-of-order layouts") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
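The two layout barrier fixes above both rest on a wraparound-safe comparison of 32-bit seqids. The sketch below shows one way to express the resulting rules; the function names are illustrative, and the update rule in `barrier_update()` is an assumption inferred from the commit messages rather than a copy of the pNFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe test for "s1 is newer than s2" on 32-bit sequence ids. */
static bool seqid_is_newer(uint32_t s1, uint32_t s2)
{
	return (int32_t)(s1 - s2) > 0;
}

/* Blocked only if a barrier is set and is strictly newer than the seqid. */
bool stateid_blocked(uint32_t barrier, uint32_t seqid)
{
	assert(seqid != 0); /* seqid 0 is not a legal value */
	return barrier != 0 && seqid_is_newer(barrier, seqid);
}

/* A zero (cleared) barrier is always replaced; otherwise only move forward. */
uint32_t barrier_update(uint32_t barrier, uint32_t newseq)
{
	if (barrier == 0 || seqid_is_newer(newseq, barrier))
		return newseq;
	return barrier;
}
```

A seqid equal to the barrier is therefore accepted, and a comparison across the 2^32 wrap still orders the values correctly as long as they are less than 2^31 apart.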
2021-08-09NFSv4/pNFS: Remove dead codeTrond Myklebust1-4/+0
Since commit 2b28a7bee453 ("fs, nfs: convert pnfs_layout_hdr.plh_refcount from atomic_t to refcount_t") it has not been legal to bump a zero refcount, so the code that tries to allow it if the NFS_LSEG_VALID flag is still set would cause trouble. Luckily, NFS_LSEG_VALID has its own refcount so we can never hit this bad code snippet in practice. Remove it to avoid confusion. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-08-09NFSv4/pNFS: Fix a layoutget livelock loopTrond Myklebust1-4/+8
If NFS_LAYOUT_RETURN_REQUESTED is set, but there is no value set for the layout plh_return_seq, we can end up in a livelock loop in which every layout segment retrieved by a new call to layoutget is immediately invalidated by pnfs_layout_need_return(). To get around this, we should just set plh_return_seq to the current value of the layout stateid's seqid. Fixes: d474f96104bd ("NFS: Don't return layout segments that are in use") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-07-09Merge tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds16-224/+347
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - Multiple patches to add support for fcntl() leases over NFSv4. - A sysfs interface to display more information about the various transport connections used by the RPC client - A sysfs interface to allow a suitably privileged user to offline a transport that may no longer point to a valid server - A sysfs interface to allow a suitably privileged user to change the server IP address used by the RPC client Stable fixes: - Two sunrpc fixes for deadlocks involving privileged rpc_wait_queues Bugfixes: - SUNRPC: Avoid a KASAN slab-out-of-bounds bug in xdr_set_page_base() - SUNRPC: prevent port reuse on transports which don't request it. - NFSv3: Fix memory leak in posix_acl_create() - NFS: Various fixes to attribute revalidation timeouts - NFSv4: Fix handling of non-atomic change attribute updates - NFSv4: If a server is down, don't cause mounts to other servers to hang as well - pNFS: Fix an Oops in pnfs_mark_request_commit() when doing O_DIRECT - NFS: Fix mount failures due to incorrect setting of the has_sec_mnt_opts filesystem flag - NFS: Ensure nfs_readpage returns promptly when an internal error occurs - NFS: Fix fscache read from NFS after cache error - pNFS: Various bugfixes around the LAYOUTGET operation" * tag 'nfs-for-5.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits) NFSv4/pNFS: Return an error if _nfs4_pnfs_v3_ds_connect can't load NFSv3 NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times NFSv4/pnfs: Clean up layout get on open NFSv4/pnfs: Fix layoutget behaviour after invalidation NFSv4/pnfs: Fix the layout barrier update NFS: Fix fscache read from NFS after cache error NFS: Ensure nfs_readpage returns promptly when internal error occurs sunrpc: remove an offlined xprt using sysfs sunrpc: provide showing transport's state info in the sysfs directory sunrpc: display xprt's queuelen of assigned tasks via sysfs sunrpc: provide multipath info in the 
sysfs directory NFSv4.1 identify and mark RPC tasks that can move between transports sunrpc: provide transport info in the sysfs directory SUNRPC: take a xprt offline using sysfs sunrpc: add dst_attr attributes to the sysfs xprt directory SUNRPC for TCP display xprt's source port in sysfs xprt_info SUNRPC query transport's source port SUNRPC display xprt's main value in sysfs's xprt_info SUNRPC mark the first transport sunrpc: add add sysfs directory per xprt under each xprt_switch ...
2021-07-08Merge part 2 of branch 'sysfs-devel'Trond Myklebust3-8/+44
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4/pNFS: Return an error if _nfs4_pnfs_v3_ds_connect can't load NFSv3Trond Myklebust1-1/+1
Currently we fail to return an error if the NFSv3 module failed to load when we're trying to connect to a pNFS data server. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple timesTrond Myklebust1-26/+26
After we grab the lock in nfs4_pnfs_ds_connect(), there is no check for whether or not ds->ds_clp has already been initialised, so we can end up adding the same transports multiple times. Fixes: fc821d59209d ("pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4/pnfs: Clean up layout get on openTrond Myklebust2-17/+17
Cache the layout in the arguments so we don't have to keep looking it up from the inode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4/pnfs: Fix layoutget behaviour after invalidationTrond Myklebust1-5/+5
If the layout gets invalidated, we should wait for any outstanding layoutget requests for that layout to complete, and we should resend them only after re-establishing the layout stateid. Fixes: d29b468da4f9 ("pNFS/NFSv4: Improve rejection of out-of-order layouts") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4/pnfs: Fix the layout barrier updateTrond Myklebust1-15/+15
If we have multiple outstanding layoutget requests, the current code to update the layout barrier assumes that the outstanding layout stateids are updated in order. That's not necessarily the case. Instead of using the value of lo->plh_outstanding as a guesstimate for the window of values we need to accept, just wait to update the window until we're processing the last one. The intention here is just to ensure that we don't process 2^31 seqid updates without also updating the barrier. Fixes: 1bcf34fdac5f ("pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFS: Fix fscache read from NFS after cache errorDave Wysochanski2-7/+16
Earlier commits refactored some NFS read code and removed nfs_readpage_async(), but neglected to properly fixup nfs_readpage_from_fscache_complete(). The code path is only hit when something unusual occurs with the cachefiles backing filesystem, such as an IO error or while a cookie is being invalidated. Mark page with PG_checked if fscache IO completes in error, unlock the page, and let the VM decide to re-issue based on PG_uptodate. When the VM reissues the readpage, PG_checked allows us to skip over fscache and read from the server. Link: https://marc.info/?l=linux-nfs&m=162498209518739 Fixes: 1e83b173b266 ("NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()") Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFS: Ensure nfs_readpage returns promptly when internal error occursDave Wysochanski1-3/+3
A previous refactoring of nfs_readpage() might end up calling wait_on_page_locked_killable() even if readpage_async_filler() failed with an internal error and pg_error was non-zero (for example, if nfs_create_request() failed). In the case of an internal error, skip over wait_on_page_locked_killable() as this is only needed when the read is sent and an error occurs during completion handling. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-08NFSv4.1 identify and mark RPC tasks that can move between transportsOlga Kornievskaia3-8/+44
In preparation for when we can re-try a task on a different transport, identify and mark such RPC tasks as moveable. Only 4.1+ operations can be re-tried on a different transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-07-03Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-4/+0
Pull vfs name lookup updates from Al Viro: "Small namei.c patch series, mostly to simplify the rules for nameidata state. It's actually from the previous cycle - but I didn't post it for review in time... Changes visible outside of fs/namei.c: file_open_root() calling conventions change, some freed bits in LOOKUP_... space" * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: namei: make sure nd->depth is always valid teach set_nameidata() to handle setting the root as well take LOOKUP_{ROOT,ROOT_GRABBED,JUMPED} out of LOOKUP_... space switch file_open_root() to struct path
2021-07-03Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Linus Torvalds2-5/+5
Pull tracing updates from Steven Rostedt:
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN*() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug.
- Small clean ups and fixes
* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
  tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
  tracing: Simplify & fix saved_tgids logic
  treewide: Add missing semicolons to __assign_str uses
  tracing: Change variable type as bool for clean-up
  trace/timerlat: Fix indentation on timerlat_main()
  trace/osnoise: Make 'noise' variable s64 in run_osnoise()
  tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
  Documentation: Fix a typo on trace/osnoise-tracer
  trace/osnoise: Fix return value on osnoise_init_hotplug_support
  trace/osnoise: Make interval u64 on osnoise_main
  trace/osnoise: Fix 'no previous prototype' warnings
  tracing: Have osnoise_main() add a quiescent state for task rcu
  seq_buf: Make trace_seq_putmem_hex() support data longer than 8
  seq_buf: Fix overflow in seq_buf_putmem_hex()
  trace/osnoise: Support hotplug operations
  trace/hwlat: Support hotplug operations
  trace/hwlat: Protect kdata->kthread with get/put_online_cpus
  trace: Add timerlat tracer
  trace: Add osnoise tracer
  ...
2021-06-30treewide: Add missing semicolons to __assign_str usesJoe Perches2-5/+5
The __assign_str macro has an unusual ending semicolon but the vast majority of uses of the macro already have semicolon termination.
$ git grep -P '\b__assign_str\b' | wc -l
551
$ git grep -P '\b__assign_str\b.*;' | wc -l
480
Add semicolons to the __assign_str() uses without semicolon termination and all the other uses without semicolon termination via additional defines that are equivalent to __assign_str() with the eventual goal of removing the semicolon from the __assign_str() macro definition.
Link: https://lore.kernel.org/lkml/1e068d21106bb6db05b735b4916bb420e6c9842a.camel@perches.com/
Link: https://lkml.kernel.org/r/48a056adabd8f70444475352f617914cef504a45.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
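The difference between a macro that carries its own semicolon and one that leaves it to the caller can be illustrated with a minimal sketch. The macros `COPY_STR_WITH_SEMI` and `COPY_STR` and both helper functions are hypothetical names made up for this example; they are not the kernel's `__assign_str()` definition, which is considerably more involved.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macros for illustration; not the kernel's __assign_str(). */

/* Old style: the macro definition itself ends with a semicolon. */
#define COPY_STR_WITH_SEMI(dst, src) strcpy((dst), (src));

/* Target style: no semicolon in the definition, so every use supplies one. */
#define COPY_STR(dst, src) strcpy((dst), (src))

/* Copies src into dst (capacity cap) using the semicolon-free macro. */
static const char *copy_name(char *dst, size_t cap, const char *src)
{
    assert(strlen(src) < cap);
    COPY_STR(dst, src);             /* reads like an ordinary statement */
    return dst;
}

/* Same result with the old macro; the call-site semicolon is redundant here. */
static const char *copy_name_old(char *dst, size_t cap, const char *src)
{
    assert(strlen(src) < cap);
    COPY_STR_WITH_SEMI(dst, src);   /* expands to "strcpy(...);;" */
    return dst;
}
```

Both helpers behave the same, which is why call sites can be given explicit semicolons first: once every use is terminated at the call site, dropping the semicolon from the definition changes nothing for callers.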
2021-06-29Merge branch 'leases-devel'Trond Myklebust6-23/+125
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-29NFSv4: setlease should return EAGAIN if locks are not availableTrond Myklebust1-2/+2
Instead of returning ENOLCK when we can't hand out a lease, we should be returning EAGAIN.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>