path: root/fs/nfs
Age | Commit message | Author | Files | Lines
2022-11-26 | NFSv4: Retry LOCK on OLD_STATEID during delegation return | Benjamin Coddington | 1 | -2/+4
[ Upstream commit f5ea16137a3fa2858620dc9084466491c128535f ] There's a small window where a LOCK sent during a delegation return can race with another OPEN on client, but the open stateid has not yet been updated. In this case, the client doesn't handle the OLD_STATEID error from the server and will lose this lock, emitting: "NFS: nfs4_handle_delegation_recall_error: unhandled error -10024". Fix this by sending the task through the nfs4 error handling in nfs4_lock_done() when we may have to reconcile our stateid with what the server believes it to be. For this case, the result is a retry of the LOCK operation with the updated stateid. Reported-by: Gonzalo Siero Humet <gsierohu@redhat.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-10 | nfs4: Fix kmemleak when allocate slot failed | Zhang Xiaoxu | 1 | -0/+1
[ Upstream commit 7e8436728e22181c3f12a5dbabd35ed3a8b8c593 ] If one of the slot allocations fails, all the other allocated slots should be cleaned up; otherwise, the allocated slots will leak:
    unreferenced object 0xffff8881115aa100 (size 64):
      comm "mount.nfs", pid 679, jiffies 4294744957 (age 115.037s)
      hex dump (first 32 bytes):
        00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
        [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
        [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
        [<00000000128486db>] nfs4_init_client+0xce/0x270
        [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
        [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
        [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
        [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
        [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
        [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
        [<000000005d56bdec>] do_syscall_64+0x35/0x80
        [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: abf79bb341bf ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
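The cleanup rule described in this commit message (free every slot already allocated when a later allocation fails) can be illustrated with a small userspace sketch. This is not the kernel code: `struct slot`, `slot_alloc()`, `slot_list_create()`, `fail_at` and `live_slots` are hypothetical names invented for the illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a slot; not the kernel's slot structure. */
struct slot {
    struct slot *next;
    unsigned int nr;
};

/* Illustration-only hooks: which allocation should fail (-1 for none),
 * and how many slots are currently allocated. */
static int fail_at = -1;
static int live_slots = 0;

static struct slot *slot_alloc(unsigned int nr)
{
    struct slot *s;

    if ((int)nr == fail_at)
        return NULL;            /* simulated allocation failure */
    s = calloc(1, sizeof(*s));
    if (s) {
        s->nr = nr;
        live_slots++;
    }
    return s;
}

static void slot_list_free(struct slot *head)
{
    while (head) {
        struct slot *next = head->next;

        free(head);
        live_slots--;
        head = next;
    }
}

/* Build a list of `count` slots. If any allocation fails, free every
 * slot allocated so far so that nothing leaks, then report failure. */
static struct slot *slot_list_create(unsigned int count)
{
    struct slot *head = NULL, **tail = &head;

    for (unsigned int i = 0; i < count; i++) {
        struct slot *s = slot_alloc(i);

        if (!s) {
            slot_list_free(head);
            return NULL;
        }
        *tail = s;
        tail = &s->next;
    }
    return head;
}
```

With `fail_at = 3`, a request for eight slots returns NULL and `live_slots` drops back to zero, which is the behaviour the fix restores.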
2022-11-10 | NFSv4.2: Fixup CLONE dest file size for zero-length count | Benjamin Coddington | 1 | -0/+3
[ Upstream commit 038efb6348ce96228f6828354cb809c22a661681 ] When holding a delegation, the NFS client optimizes away setting the attributes of a file from the GETATTR in the compound after CLONE, and for a zero-length CLONE we will end up setting the inode's size to zero in nfs42_copy_dest_done(). Handle this case by computing the resulting count from the server's reported size after CLONE's GETATTR. Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 94d202d5ca39 ("NFSv42: Copy offload should update the file size when appropriate") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-10 | NFSv4.1: We must always send RECLAIM_COMPLETE after a reboot | Trond Myklebust | 1 | -0/+1
[ Upstream commit e59679f2b7e522ecad99974e5636291ffd47c184 ] Currently, we are only guaranteed to send RECLAIM_COMPLETE if we have open state to recover. Fix the client to always send RECLAIM_COMPLETE after setting up the lease. Fixes: fce5c838e133 ("nfs41: RECLAIM_COMPLETE functionality") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-10 | NFSv4.1: Handle RECLAIM_COMPLETE trunking errors | Trond Myklebust | 1 | -0/+1
[ Upstream commit 5d917cba3201e5c25059df96c29252fd99c4f6a7 ] If RECLAIM_COMPLETE sets the NFS4CLNT_BIND_CONN_TO_SESSION flag, then we need to loop back in order to handle it. Fixes: 0048fdd06614 ("NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-10 | NFSv4: Fix a potential state reclaim deadlock | Trond Myklebust | 1 | -19/+17
[ Upstream commit 1ba04394e028ea8b45d92685cc0d6ab582cf7647 ] If the server reboots while we are engaged in a delegation return, and there is a pNFS layout with return-on-close set, then the current code can end up deadlocking in pnfs_roc() when nfs_inode_set_delegation() tries to return the old delegation. Now that delegreturn actually uses its own copy of the stateid, it should be safe to just always update the delegation stateid in place. Fixes: 078000d02d57 ("pNFS: We want return-on-close to complete when evicting the inode") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-03 | NFSv4: Add an fattr allocation to _nfs4_discover_trunking() | Scott Mayhew | 1 | -7/+12
commit 4f40a5b5544618b096d1611a18219dd91fd57f80 upstream. This was missed in c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") and causes a panic when mounting with '-o trunkdiscovery':
    PID: 1604 TASK: ffff93dac3520000 CPU: 3 COMMAND: "mount.nfs"
    #0 [ffffb79140f738f8] machine_kexec at ffffffffaec64bee
    #1 [ffffb79140f73950] __crash_kexec at ffffffffaeda67fd
    #2 [ffffb79140f73a18] crash_kexec at ffffffffaeda76ed
    #3 [ffffb79140f73a30] oops_end at ffffffffaec2658d
    #4 [ffffb79140f73a50] general_protection at ffffffffaf60111e
       [exception RIP: nfs_fattr_init+0x5]
       RIP: ffffffffc0c18265 RSP: ffffb79140f73b08 RFLAGS: 00010246
       RAX: 0000000000000000 RBX: ffff93dac304a800 RCX: 0000000000000000
       RDX: ffffb79140f73bb0 RSI: ffff93dadc8cbb40 RDI: d03ee11cfaf6bd50
       RBP: ffffb79140f73be8 R8: ffffffffc0691560 R9: 0000000000000006
       R10: ffff93db3ffd3df8 R11: 0000000000000000 R12: ffff93dac4040000
       R13: ffff93dac2848e00 R14: ffffb79140f73b60 R15: ffffb79140f73b30
       ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #5 [ffffb79140f73b08] _nfs41_proc_get_locations at ffffffffc0c73d53 [nfsv4]
    #6 [ffffb79140f73bf0] nfs4_proc_get_locations at ffffffffc0c83e90 [nfsv4]
    #7 [ffffb79140f73c60] nfs4_discover_trunking at ffffffffc0c83fb7 [nfsv4]
    #8 [ffffb79140f73cd8] nfs_probe_fsinfo at ffffffffc0c0f95f [nfs]
    #9 [ffffb79140f73da0] nfs_probe_server at ffffffffc0c1026a [nfs]
       RIP: 00007f6254fce26e RSP: 00007ffc69496ac8 RFLAGS: 00000246
       RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6254fce26e
       RDX: 00005600220a82a0 RSI: 00005600220a64d0 RDI: 00005600220a6520
       RBP: 00007ffc69496c50 R8: 00005600220a8710 R9: 003035322e323231
       R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc69496c50
       R13: 00005600220a8440 R14: 0000000000000010 R15: 0000560020650ef9
       ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
Fixes: c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-03 | NFSv4: Fix free of uninitialized nfs4_label on referral lookup. | Benjamin Coddington | 4 | -13/+24
commit c3ed222745d9ad7b69299b349a64ba533c64a34f upstream. Send along the already-allocated fattr along with nfs4_fs_locations, and drop the memcpy of fattr. We end up growing two more allocations, but this fixes up a crash as:
    PID: 790 TASK: ffff88811b43c000 CPU: 0 COMMAND: "ls"
    #0 [ffffc90000857920] panic at ffffffff81b9bfde
    #1 [ffffc900008579c0] do_trap at ffffffff81023a9b
    #2 [ffffc90000857a10] do_error_trap at ffffffff81023b78
    #3 [ffffc90000857a58] exc_stack_segment at ffffffff81be1f45
    #4 [ffffc90000857a80] asm_exc_stack_segment at ffffffff81c009de
    #5 [ffffc90000857b08] nfs_lookup at ffffffffa0302322 [nfs]
    #6 [ffffc90000857b70] __lookup_slow at ffffffff813a4a5f
    #7 [ffffc90000857c60] walk_component at ffffffff813a86c4
    #8 [ffffc90000857cb8] path_lookupat at ffffffff813a9553
    #9 [ffffc90000857cf0] filename_lookup at ffffffff813ab86b
Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: 9558a007dbc3 ("NFS: Remove the label from the nfs4_lookup_res struct") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-28 | NFSv4: Fixes for nfs4_inode_return_delegation() | Trond Myklebust | 1 | -4/+6
commit 6e176d47160cec8bcaa28d9aa06926d72d54237c upstream. We mustn't call nfs_wb_all() on anything other than a regular file. Furthermore, we can exit early when we don't hold a delegation. Reported-by: David Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-23 | NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 | Trond Myklebust | 1 | -9/+18
[ Upstream commit 2a9d683b48c8a87e61a4215792d44c90bcbbb536 ] The NFSv4.0 protocol only supports open() by name. It cannot therefore be used with open_by_handle() and friends, nor can it be re-exported by knfsd. Reported-by: Chuck Lever III <chuck.lever@oracle.com> Fixes: 20fa19027286 ("nfs: add export operations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15 | NFS: Fix another fsync() issue after a server reboot | Trond Myklebust | 3 | -11/+11
[ Upstream commit 67f4b5dc49913abcdb5cc736e73674e2f352f81d ] Currently, when the writeback code detects a server reboot, it redirties any pages that were not committed to disk, and it sets the flag NFS_CONTEXT_RESEND_WRITES in the nfs_open_context of the file descriptor that dirtied the file. While this allows the file descriptor in question to redrive its own writes, it violates the fsync() requirement that we should be synchronising all writes to disk. While the problem is infrequent, we do see corner cases where an untimely server reboot causes the fsync() call to abandon its attempt to sync data to disk, causing data corruption issues due to missed error conditions or similar. In order to tighten up the client's ability to deal with this situation without introducing livelocks, add a counter that records the number of times pages are redirtied due to a server reboot-like condition, and use that in fsync() to redrive the sync to disk. Fixes: 2197e9b06c22 ("NFS: Fix up fsync() when the server rebooted") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
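The counter-based redrive described above can be sketched in userspace as follows. This is only an illustration of the idea, not the kernel implementation: `struct file_state`, `flush_once()`, `sync_all()` and the simulated reboot counter are all hypothetical names. The sync loop snapshots the redirty counter, flushes, and repeats only if the counter moved during the flush.

```c
/* Hypothetical per-file state used only for this illustration. */
struct file_state {
    unsigned long redirtied;   /* bumped each time pages are redirtied */
    int pending_reboots;       /* simulation: server reboots still to come */
    int flush_calls;           /* how many flush passes were made */
};

/* One flush pass. If the (simulated) server reboots during the pass,
 * the pages are redirtied and the counter is bumped. */
static int flush_once(struct file_state *f)
{
    f->flush_calls++;
    if (f->pending_reboots > 0) {
        f->pending_reboots--;
        f->redirtied++;
    }
    return 0;
}

/* Keep flushing until a pass completes without any page being redirtied.
 * Comparing counter snapshots avoids both giving up too early and
 * spinning when nothing changed. */
static int sync_all(struct file_state *f)
{
    for (;;) {
        unsigned long seen = f->redirtied;
        int ret = flush_once(f);

        if (ret != 0)
            return ret;
        if (f->redirtied == seen)
            return 0;
    }
}
```

With two simulated reboots, `sync_all()` makes three passes: two that observe a changed counter and one clean pass that ends the loop.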
2022-09-15 | NFS: Save some space in the inode | Trond Myklebust | 1 | -8/+18
[ Upstream commit e591b298d7ecb851e200f65946e3d53fe78a3c4f ] Save some space in the nfs_inode by setting up an anonymous union with the fields that are peculiar to a specific type of filesystem object. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15 | NFS: Further optimisations for 'ls -l' | Trond Myklebust | 1 | -5/+11
[ Upstream commit ff81dfb5d721fff87bd516c558847f6effb70031 ] If a user is doing 'ls -l', we have a heuristic in GETATTR that tells the readdir code to try to use READDIRPLUS in order to refresh the inode attributes. In certain circumstances, we also try to invalidate the remaining directory entries in order to ensure this refresh. If there are multiple readers of the directory, we probably should avoid invalidating the page cache, since the heuristic breaks down in that situation anyway. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31 | NFSv4.2 fix problems with __nfs42_ssc_open | Olga Kornievskaia | 1 | -0/+6
[ Upstream commit fcfc8be1e9cf2f12b50dce8b579b3ae54443a014 ] A destination server while doing a COPY shouldn't accept using the passed in filehandle if it's not a regular filehandle. If alloc_file_pseudo() has failed, we need to decrement a reference on the newly created inode, otherwise it leaks. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: ec4b092508982 ("NFS: inter ssc open") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31 | NFS: Don't allocate nfs_fattr on the stack in __nfs42_ssc_open() | Trond Myklebust | 1 | -4/+6
[ Upstream commit 156cd28562a4e8ca454d11b234d9f634a45d6390 ] The preferred behaviour is always to allocate struct nfs_fattr from the slab. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25 | NFSv4/pnfs: Fix a use-after-free bug in open | Trond Myklebust | 1 | -5/+6
commit 2135e5d56278ffdb1c2e6d325dc6b87f669b9dac upstream. If someone cancels the open RPC call, then we must not try to free either the open slot or the layoutget operation arguments, since they are likely still in use by the hung RPC call. Fixes: 6949493884fe ("NFSv4: Don't hold the layoutget locks across multiple RPC calls") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25 | NFSv4.1: RECLAIM_COMPLETE must handle EACCES | Zhang Xianwei | 1 | -0/+3
commit e35a5e782f67ed76a65ad0f23a484444a95f000f upstream. A client should be able to handle getting an EACCES error while doing a mount operation to reclaim state due to NFS4CLNT_RECLAIM_REBOOT being set. If the server returns RPC_AUTH_BADCRED because authentication failed when we execute "exportfs -au", then RECLAIM_COMPLETE takes the wrong path. After the mount succeeds, all OPEN calls will fail due to an NFS4ERR_GRACE error being returned. This patch fixes it by resending the RPC request. Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Fixes: aa5190d0ed7d ("NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25 | NFSv4: Fix races in the legacy idmapper upcall | Trond Myklebust | 1 | -22/+24
commit 51fd2eb52c0ca8275a906eed81878ef50ae94eb0 upstream. nfs_idmap_instantiate() will cause the process that is waiting in request_key_with_auxdata() to wake up and exit. If there is a second process waiting for the idmap->idmap_mutex, then it may wake up and start a new call to request_key_with_auxdata(). If the call to idmap_pipe_downcall() from the first process has not yet finished calling nfs_idmap_complete_pipe_upcall_locked(), then we may end up triggering the WARN_ON_ONCE() in nfs_idmap_prepare_pipe_upcall(). The fix is to ensure that we clear idmap->idmap_upcall_data before calling nfs_idmap_instantiate(). Fixes: e9ab41b620e4 ("NFSv4: Clean up the legacy idmapper upcall") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25 | NFSv4.1: Handle NFS4ERR_DELAY replies to OP_SEQUENCE correctly | Trond Myklebust | 1 | -1/+0
commit 7ccafd4b2b9f34e6d8185f796f151c47424e273e upstream. Don't assume that the NFS4ERR_DELAY means that the server is processing this slot id. Fixes: 3453d5708b33 ("NFSv4.1: Avoid false retries when RPC calls are interrupted") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25 | NFSv4.1: Don't decrease the value of seq_nr_highest_sent | Trond Myklebust | 1 | -3/+2
commit f07a5d2427fc113dc50c5c818eba8929bc27b8ca upstream. When we're trying to figure out what the server may or may not have seen in terms of request numbers, do not assume that requests with a larger number were missed, just because we saw a reply to a request with a smaller number. Fixes: 3453d5708b33 ("NFSv4.1: Avoid false retries when RPC calls are interrupted") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
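The rule in this commit message (a reply to a lower-numbered request must not pull the "highest sent" mark backwards) amounts to a monotonic high-water mark. A minimal sketch follows, assuming 32-bit sequence numbers that may wrap; `struct slot_seq` and `slot_record_sent()` are hypothetical names, and only the field name is taken from the commit title.

```c
#include <stdint.h>

/* Hypothetical per-slot bookkeeping; the field name mirrors the commit title. */
struct slot_seq {
    uint32_t seq_nr_highest_sent;
};

/* Record that `seqnr` was sent. The high-water mark only ever moves
 * forward; the signed difference keeps the comparison meaningful when
 * the 32-bit sequence number wraps around. */
static void slot_record_sent(struct slot_seq *slot, uint32_t seqnr)
{
    if ((int32_t)(seqnr - slot->seq_nr_highest_sent) > 0)
        slot->seq_nr_highest_sent = seqnr;
}
```

Recording 7 and then 6 leaves the mark at 7, so a late reply to request 6 does not imply that request 7 was missed.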
2022-08-17 | pNFS/flexfiles: Report RDMA connection errors to the server | Trond Myklebust | 1 | -0/+4
commit 7836d75467e9d214bdf5c693b32721de729a6e38 upstream. The RPC/RDMA driver will return -EPROTO and -ENODEV as connection errors under certain circumstances. Make sure that we handle them and report them to the server. If not, we can end up cycling forever in a LAYOUTGET/LAYOUTRETURN loop. Fixes: a12f996d3413 ("NFSv4/pNFS: Use connections to a DS that are all of the same protocol family") Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17 | Revert "pNFS: nfs3_set_ds_client should set NFS_CS_NOPING" | Trond Myklebust | 1 | -1/+0
commit 9597152d98840c2517230740952df97cfcc07e2f upstream. This reverts commit c6eb58435b98bd843d3179664a0195ff25adb2c3. If a transport is down, then we want to fail over to other transports if they are listed in the GETDEVICEINFO reply. Fixes: c6eb58435b98 ("pNFS: nfs3_set_ds_client should set NFS_CS_NOPING") Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-22 | pNFS: Avoid a live lock condition in pnfs_update_layout() | Trond Myklebust | 3 | -6/+11
[ Upstream commit 880265c77ac415090090d1fe72a188fee71cb458 ] If we're about to send the first layoutget for an empty layout, we want to make sure that we drain out the existing pending layoutget calls first. The reason is that these layouts may have been already implicitly returned to the server by a recall to which the client gave a NFS4ERR_NOMATCHING_LAYOUT response. The problem is that wait_var_event_killable() could in principle see the plh_outstanding count go back to '1' when the first process to wake up starts sending a new layoutget. If it fails to get a layout, then this loop can continue ad infinitum... Fixes: 0b77f97a7e42 ("NFSv4/pnfs: Fix layoutget behaviour after invalidation") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-22 | pNFS: Don't keep retrying if the server replied NFS4ERR_LAYOUTUNAVAILABLE | Trond Myklebust | 1 | -0/+6
[ Upstream commit fe44fb23d6ccde4c914c44ef74ab8d9d9ba02bea ] If the server tells us that a pNFS layout is not available for a specific file, then we should not keep pounding it with further layoutget requests. Fixes: 183d9e7b112a ("pnfs: rework LAYOUTGET retry handling") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14 | NFSv4: Don't hold the layoutget locks across multiple RPC calls | Trond Myklebust | 1 | -0/+4
[ Upstream commit 6949493884fe88500de4af182588e071cf1544ee ] When doing layoutget as part of the open() compound, we have to be careful to release the layout locks before we can call any further RPC calls, such as setattr(). The reason is that those calls could trigger a recall, which could deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFSv4.1 mark qualified async operations as MOVEABLE tasks | Olga Kornievskaia | 4 | -12/+29
[ Upstream commit 118f09eda21d392e1eeb9f8a4bee044958cccf20 ] Mark async operations such as RENAME, REMOVE, COMMIT MOVEABLE for the nfsv4.1+ sessions. Fixes: 85e39feead948 ("NFSv4.1 identify and mark RPC tasks that can move between transports") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Convert GFP_NOFS to GFP_KERNEL | Trond Myklebust | 4 | -14/+13
[ Upstream commit da48f267f90d9dc9f930fd9a67753643657b404f ] Assume that sections that should not re-enter the filesystem are already protected with memalloc_nofs_save/restore call, so relax those GFP_NOFS instances which might be used by other contexts. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Create a new nfs_alloc_fattr_with_label() function | Anna Schumaker | 3 | -20/+23
[ Upstream commit d755ad8dc752d44545613ea04d660aed674e540d ] For creating fattrs with the label field already allocated for us. I also update nfs_free_fattr() to free the label in the end. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Always initialise fattr->label in nfs_fattr_alloc() | Trond Myklebust | 1 | -1/+3
[ Upstream commit d4a95a7e5a4d3b68b26f70668cf77324a11b5718 ] We're about to add a check in nfs_free_fattr() for whether or not the label is non-zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Further fixes to the writeback error handling | Trond Myklebust | 1 | -21/+18
[ Upstream commit c6fd3511c3397dd9cbc6dc5d105bbedb69bf4061 ] When we handle an error by redirtying the page, we're not corrupting the mapping, so we don't want the error to be recorded in the mapping. If the caller has specified a sync_mode of WB_SYNC_NONE, we can just return AOP_WRITEPAGE_ACTIVATE. However if we're dealing with WB_SYNC_ALL, we need to ensure that retries happen when the errors are non-fatal. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout | Trond Myklebust | 1 | -0/+2
[ Upstream commit 3764a17e31d579cf9b4bd0a69894b577e8d75702 ] Commit 587f03deb69b caused pnfs_update_layout() to stop returning ENOMEM when the memory allocation fails, and hence causes it to fall back to trying to do I/O through the MDS. There is no guarantee that this will fare any better. If we're failing the pNFS layout allocation, then we should just redirty the page and retry later. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 587f03deb69b ("pnfs: refactor send_layoutget") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Don't report errors from nfs_pageio_complete() more than once | Trond Myklebust | 1 | -8/+1
[ Upstream commit c5e483b77cc2edb318da152abe07e33006b975fd ] Since errors from nfs_pageio_complete() are already being reported through nfs_async_write_error(), we should not be returning them to the callers of do_writepages() as well. They will end up being reported through the generic mechanism instead. Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Do not report flush errors in nfs_write_end() | Trond Myklebust | 1 | -5/+2
[ Upstream commit d95b26650e86175e4a97698d89bc1626cd1df0c6 ] If we do flush cached writebacks in nfs_write_end() due to the imminent expiration of an RPCSEC_GSS session, then we should defer reporting any resulting errors until the calls to file_check_and_advance_wb_err() in nfs_file_write() and nfs_file_fsync(). Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Don't report ENOSPC write errors twice | Trond Myklebust | 1 | -20/+14
[ Upstream commit e6005436f6cc9ed13288f936903f0151e5543485 ] Any errors reported by the write() system call need to be cleared from the file descriptor's error tracking. The current call to nfs_wb_all() causes the error to be reported, but since it doesn't call file_check_and_advance_wb_err(), we can end up reporting the same error a second time when the application calls fsync(). Note that since Linux 4.13, the rule is that EIO may be reported for write(), but it must be reported by a subsequent fsync(), so let's just drop reporting it in write. The check for nfs_ctx_key_to_expire() is just a duplicate to the one already in nfs_write_end(), so let's drop that too. Reported-by: ChenXiaoSong <chenxiaosong2@huawei.com> Fixes: ce368536dd61 ("nfs: nfs_file_write() should check for writeback errors") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS | Trond Myklebust | 1 | -4/+5
[ Upstream commit 9641d9bc9b75f11f70646f5c6ee9f5f519a1012e ] If the commit to disk is interrupted, we should still first check for filesystem errors so that we can report them in preference to the error due to the signal. Fixes: 2197e9b06c22 ("NFS: Fix up fsync() when the server rebooted") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 | NFS: Do not report EINTR/ERESTARTSYS as mapping errors | Trond Myklebust | 1 | -1/+1
[ Upstream commit cea9ba7239dcc84175041174304c6cdeae3226e5 ] If the attempt to flush data was interrupted due to a local signal, then just requeue the writes back for I/O. Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-06 | NFS: Memory allocation failures are not server fatal errors | Trond Myklebust | 1 | -0/+1
commit 452284407c18d8a522c3039339b1860afa0025a8 upstream. We need to filter out ENOMEM in nfs_error_is_fatal_on_server(), because running out of memory on our client is not a server error. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 2dc23afffbca ("NFS: ENOMEM should also be a fatal error.") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
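The filtering described here can be sketched as a two-level classification: a general "fatal" predicate, and a narrower "fatal on the server" predicate that first excludes errors originating on the client. The sketch below is a simplified userspace illustration, not the kernel's helpers; the function names are hypothetical and the set of fatal codes is deliberately abbreviated. `ERESTARTSYS` is kernel-internal, so the sketch defines it locally.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef ERESTARTSYS
#define ERESTARTSYS 512   /* kernel-internal value, not exported to userspace */
#endif

/* Abbreviated, illustrative set of errors treated as fatal in general. */
static bool error_is_fatal(int err)
{
    switch (err) {
    case -EIO:
    case -ENOSPC:
    case -EROFS:
    case -ENOMEM:
        return true;
    default:
        return false;
    }
}

/* Errors that are fatal *on the server*. Signals and local memory
 * exhaustion happen on the client, so they are filtered out first. */
static bool error_is_fatal_on_server(int err)
{
    switch (err) {
    case 0:
    case -ERESTARTSYS:
    case -EINTR:
    case -ENOMEM:
        return false;
    }
    return error_is_fatal(err);
}
```

Under this sketch, `-ENOMEM` is still fatal for the request that hit it, but it is not attributed to the server.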
2022-05-18 | nfs: fix broken handling of the softreval mount option | Dan Aloni | 1 | -1/+1
[ Upstream commit 085d16d5f949b64713d5e960d6c9bbf51bc1d511 ] Turns out that ever since this mount option was added, passing `softreval` in NFS mount options cancelled all other flags while not affecting the underlying flag `NFS_MOUNT_SOFTREVAL`. Fixes: c74dfe97c104 ("NFS: Add mount option 'softreval'") Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-12 | NFSv4: Don't invalidate inode attributes on delegation return | Trond Myklebust | 1 | -1/+11
commit 00c94ebec5925593c0377b941289224469e72ac7 upstream. There is no need to declare attributes such as the ctime, mtime and block size invalid when we're just returning a delegation, so it is inappropriate to call nfs_post_op_update_inode_force_wcc(). Instead, just call nfs_refresh_inode() after faking up the change attribute. We know that the GETATTR op occurs before the DELEGRETURN, so we are safe when doing this. Fixes: 0bc2c9b4dca9 ("NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-13 | NFSv4: fix open failure with O_ACCMODE flag | ChenXiaoSong | 3 | -12/+14
[ Upstream commit b243874f6f9568b2daf1a00e9222cacdc15e159c ] A second open() with the O_ACCMODE|O_DIRECT flags will fail. Reproducer: 1. mount -t nfs -o vers=4.2 $server_ip:/ /mnt/ 2. fd = open("/mnt/file", O_ACCMODE|O_DIRECT|O_CREAT) 3. close(fd) 4. fd = open("/mnt/file", O_ACCMODE|O_DIRECT) The server's nfsd4_decode_share_access() fails with error nfserr_bad_xdr when the client uses an incorrect share access mode of 0. Fix this by using the NFS4_SHARE_ACCESS_BOTH share access mode in the client, just as for the first open. Fixes: ce4ef7c0a8a05 ("NFS: Split out NFS v4 file operations") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
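The mapping at issue can be sketched as a small function from open(2) flags to a share access mode that never yields 0. This is an illustration under stated assumptions, not the kernel's code: `share_access_from_flags()` is a hypothetical name, and the enum values 1, 2 and 3 are assumed to follow the NFSv4 protocol's read, write and both share-access values.

```c
#include <fcntl.h>

/* Share access modes; values assumed to match the NFSv4 protocol's
 * READ (1), WRITE (2) and BOTH (3). */
enum {
    NFS4_SHARE_ACCESS_READ  = 1,
    NFS4_SHARE_ACCESS_WRITE = 2,
    NFS4_SHARE_ACCESS_BOTH  = 3
};

/* Map the access-mode bits of open(2) flags to a share access mode.
 * The special Linux case where both access bits are set (flags &
 * O_ACCMODE == O_ACCMODE) falls through to BOTH instead of producing
 * an invalid mode of 0. */
static unsigned int share_access_from_flags(int flags)
{
    switch (flags & O_ACCMODE) {
    case O_RDONLY:
        return NFS4_SHARE_ACCESS_READ;
    case O_WRONLY:
        return NFS4_SHARE_ACCESS_WRITE;
    case O_RDWR:
        return NFS4_SHARE_ACCESS_BOTH;
    default:
        return NFS4_SHARE_ACCESS_BOTH;
    }
}
```

Every input maps to a non-zero mode, so a server that rejects share access 0 would not reject the second open in the reproducer on that ground.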
2022-04-13 | Revert "NFSv4: Handle the special Linux file open access mode" | ChenXiaoSong | 2 | -2/+1
[ Upstream commit ab0fc21bc7105b54bafd85bd8b82742f9e68898a ] This reverts commit 44942b4e457beda00981f616402a1a791e8c616e. After a file is opened a second time with the O_ACCMODE|O_DIRECT flags, nfs4_valid_open_stateid() will dereference a NULL nfs4_state on lseek(). Reproducer: 1. mount -t nfs -o vers=4.2 $server_ip:/ /mnt/ 2. fd = open("/mnt/file", O_ACCMODE|O_DIRECT|O_CREAT) 3. close(fd) 4. fd = open("/mnt/file", O_ACCMODE|O_DIRECT) 5. lseek(fd) Reported-by: Lyu Tao <tao.lyu@epfl.ch> Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13 | NFS: Avoid writeback threads getting stuck in mempool_alloc() | Trond Myklebust | 2 | -7/+13
[ Upstream commit 0bae835b63c53f86cdc524f5962e39409585b22c ] In a low memory situation, allow the NFS writeback code to fail without getting stuck in infinite loops in mempool_alloc(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13 | NFS: nfsiod should not block forever in mempool_alloc() | Trond Myklebust | 3 | -17/+22
[ Upstream commit 515dcdcd48736576c6f5c197814da6f81c60a21e ] The concern is that since nfsiod is sometimes required to kick off a commit, it can get locked up waiting forever in mempool_alloc() instead of failing gracefully and leaving the commit until later. Try to allocate from the slab first, with GFP_KERNEL | __GFP_NORETRY, then fall back to a non-blocking attempt to allocate from the memory pool. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
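The two-stage strategy in this commit message (try the ordinary allocator without retrying, then make a non-blocking attempt on a reserve pool, and fail gracefully if both come up empty) can be mimicked in userspace. The sketch below is an analogy only: `malloc()` stands in for the slab, a fixed static array stands in for the mempool, and `simulate_pressure`, `obj_alloc()` and `obj_free()` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

#define OBJ_SIZE      64
#define RESERVE_SLOTS 2

/* Fixed reserve standing in for a memory pool (illustration only). */
static union {
    max_align_t align;
    unsigned char bytes[OBJ_SIZE];
} reserve[RESERVE_SLOTS];
static int reserve_busy[RESERVE_SLOTS];

/* Illustration-only switch that makes the fast path fail. */
static int simulate_pressure;

/* Fast path: ordinary allocation that may fail instead of waiting. */
static void *fast_alloc(void)
{
    return simulate_pressure ? NULL : malloc(OBJ_SIZE);
}

/* Fallback: take a free reserve slot if one exists; never wait for one. */
static void *reserve_alloc_nowait(void)
{
    for (int i = 0; i < RESERVE_SLOTS; i++) {
        if (!reserve_busy[i]) {
            reserve_busy[i] = 1;
            return reserve[i].bytes;
        }
    }
    return NULL;
}

/* Returns NULL when both stages fail, so the caller can defer the work. */
static void *obj_alloc(void)
{
    void *p = fast_alloc();

    if (!p)
        p = reserve_alloc_nowait();
    return p;
}

static void obj_free(void *p)
{
    for (int i = 0; i < RESERVE_SLOTS; i++) {
        if (p == (void *)reserve[i].bytes) {
            reserve_busy[i] = 0;
            return;
        }
    }
    free(p);
}
```

The point of the design is that the caller sees a NULL return and can postpone the commit, rather than sleeping indefinitely inside the allocator.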
2022-04-13 | NFS: swap-out must always use STABLE writes. | NeilBrown | 1 | -4/+6
[ Upstream commit c265de257f558a05c1859ee9e3fed04883b9ec0e ] The commit handling code is not safe against memory-pressure deadlocks when writing to swap. In particular, nfs_commitdata_alloc() blocks indefinitely waiting for memory, and this can consume all available workqueue threads. swap-out most likely uses STABLE writes anyway as COND_STABLE indicates that a stable write should be used if the write fits in a single request, and it normally does. However if we ever swap with a small wsize, or gather unusually large numbers of pages for a single write, this might change. For safety, make it explicit in the code that direct writes used for swap must always use FLUSH_STABLE. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13 | NFS: swap IO handling is slightly different for O_DIRECT IO | NeilBrown | 2 | -16/+30
[ Upstream commit 64158668ac8b31626a8ce48db4cad08496eb8340 ] 1/ Taking the i_rwsem for swap IO triggers lockdep warnings regarding possible deadlocks with "fs_reclaim". These deadlocks could, I believe, eventuate if a buffered read on the swapfile was attempted. We don't need coherence with the page cache for a swap file, and buffered writes are forbidden anyway. There is no other need for i_rwsem during direct IO. So never take it for swap_rw() 2/ generic_write_checks() explicitly forbids writes to swap, and performs checks that are not needed for swap. So bypass it for swap_rw(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13 | NFSv4: Protect the state recovery thread against direct reclaim | Trond Myklebust | 1 | -0/+12
[ Upstream commit 3e17898aca293a24dae757a440a50aa63ca29671 ] If memory allocation triggers a direct reclaim from the state recovery thread, then we can deadlock. Use memalloc_nofs_save/restore to ensure that doesn't happen. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13 | NFSv4.2: fix reference count leaks in _nfs42_proc_copy_notify() | Xin Xiong | 1 | -3/+6
[ Upstream commit b7f114edd54326f730a754547e7cfb197b5bc132 ] The reference counting issue happens in two error paths in the function _nfs42_proc_copy_notify(). In both error paths, the function simply returns the error code and forgets to balance the refcount of object `ctx`, bumped by get_nfs_open_context() earlier, which may cause refcount leaks. Fix it by balancing refcount of the `ctx` object before the function returns in both error paths. Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
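The shape of this kind of fix can be sketched in userspace C. It is an illustration under assumptions, not the actual patch: `struct ctx`, `ctx_get()`, `ctx_put()` and `copy_notify_sketch()` are hypothetical stand-ins, and the `fail_step` parameter exists only to force each error path. Every exit, success or failure, goes through a single label that drops the reference taken at the top.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted context object. */
struct ctx {
    int refcount;
};

/* Take a reference and return the object, mirroring a "get" helper. */
static struct ctx *ctx_get(struct ctx *c)
{
    c->refcount++;
    return c;
}

/* Drop a reference taken by ctx_get(). */
static void ctx_put(struct ctx *c)
{
    assert(c->refcount > 0);
    c->refcount--;
}

/*
 * fail_step selects which step fails: 0 = none, 1 = first, 2 = second.
 * All paths leave through "out", so the get/put pair always balances.
 */
int copy_notify_sketch(struct ctx *c, int fail_step)
{
    struct ctx *held = ctx_get(c);
    int status = 0;

    if (fail_step == 1) {
        status = -ENOMEM;   /* first error path */
        goto out;
    }
    if (fail_step == 2) {
        status = -EIO;      /* second error path */
        goto out;
    }
    /* The real work would happen here while the reference is held. */
out:
    ctx_put(held);
    return status;
}
```

Returning directly from either `if` branch instead of jumping to `out` would reproduce the leak the commit message describes: the count would grow by one on each failed call.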
2022-04-08 | NFSv4/pNFS: Fix another issue with a list iterator pointing to the head | Trond Myklebust | 3 | -18/+22
[ Upstream commit 7c9d845f0612e5bcd23456a2ec43be8ac43458f1 ] In nfs4_callback_devicenotify(), if we don't find a matching entry for the deviceid, we're left with a pointer to 'struct nfs_server' that actually points to the list of super blocks associated with our struct nfs_client. Furthermore, even if we have a valid pointer, nothing pins the super block, and so the struct nfs_server could end up getting freed while we're using it. Since all we want is a pointer to the struct pnfs_layoutdriver_type, let's skip all the iteration over super blocks, and just use APIs to find the layout driver directly. Reported-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Fixes: 1be5683b03a7 ("pnfs: CB_NOTIFY_DEVICEID") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
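The underlying pitfall, a list cursor that is still used after the walk has come back around to the head, can be shown with a small userspace C sketch. Note that the commit itself avoids the iteration altogether; the code below only illustrates the general safe-search pattern, and `struct list_node`, `struct server`, `node_to_server()`, `list_init()`, `list_add()` and `find_server()` are all hypothetical names. The result is kept in a separate `found` pointer that stays NULL unless a real entry matched, so the head is never mistaken for an entry.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular singly linked list; the head is a bare node, not an entry. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical entry type embedding the list linkage. */
struct server {
    int id;
    struct list_node link;
};

/* Recover the enclosing entry from its embedded node. Never apply to the head. */
#define node_to_server(n) \
    ((struct server *)((char *)(n) - offsetof(struct server, link)))

void list_init(struct list_node *head)
{
    head->next = head;
}

void list_add(struct list_node *head, struct list_node *n)
{
    n->next = head->next;
    head->next = n;
}

/* Returns the matching entry, or NULL if the walk reaches the head again. */
struct server *find_server(struct list_node *head, int id)
{
    struct server *found = NULL;

    for (struct list_node *n = head->next; n != head; n = n->next) {
        struct server *s = node_to_server(n);

        if (s->id == id) {
            found = s;
            break;
        }
    }
    return found;
}
```

Callers test the return value for NULL instead of dereferencing whatever the cursor held when the loop ended.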
2022-04-08 | NFS: Don't loop forever in nfs_do_recoalesce() | Trond Myklebust | 1 | -0/+1
[ Upstream commit d02d81efc7564b4d5446a02e0214a164cf00b1f3 ] If __nfs_pageio_add_request() fails to add the request, it will return with either desc->pg_error < 0, or mirror->pg_recoalesce will be set, so we are guaranteed either to exit the function altogether, or to loop. However if there is nothing left in mirror->pg_list to coalesce, we must exit, so make sure that we clear mirror->pg_recoalesce every time we loop. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 70536bf4eb07 ("NFS: Clean up reset of the mirror accounting variables") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
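A toy model in userspace C may help show why the flag has to be cleared on every pass. Everything here is an assumption made for illustration: `struct mirror_sketch`, `drain()`, the rule that a batch larger than one re-queues half of itself, and the `max_passes` guard that keeps the demonstration from actually spinning. With `clear_flag` false the flag set by an earlier pass is never reset, so the loop only stops at the guard.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a queue plus a "go around again" flag. */
struct mirror_sketch {
    int pending;        /* requests waiting to be coalesced */
    bool recoalesce;    /* set when a pass re-queued something */
};

/*
 * Runs passes until the flag is clear or max_passes is reached and
 * returns the number of passes made. clear_flag selects whether the
 * flag is reset at the top of each pass.
 */
int drain(struct mirror_sketch *m, bool clear_flag, int max_passes)
{
    int passes = 0;

    assert(max_passes > 0);
    do {
        int batch = m->pending;   /* take everything queued so far */

        m->pending = 0;
        if (clear_flag)
            m->recoalesce = false;

        /* Assumed rule: a batch larger than one re-queues half of itself. */
        if (batch > 1) {
            m->pending = batch / 2;
            m->recoalesce = true;
        }
        passes++;
    } while (m->recoalesce && passes < max_passes);

    return passes;
}
```

Starting from four pending requests, the version that clears the flag stops as soon as a pass re-queues nothing, while the version that does not clear it keeps looping over an empty queue.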
2022-04-08 | NFSv4.1: don't retry BIND_CONN_TO_SESSION on session error | Olga Kornievskaia | 1 | -0/+1
[ Upstream commit 1d15d121cc2ad4d016a7dc1493132a9696f91fc5 ] There is no reason to retry the operation if a session error has occurred; in that case the result structure isn't filled out. Fixes: dff58530c4ca ("NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>