summaryrefslogtreecommitdiff
path: root/fs/nfsd
AgeCommit message (Collapse)AuthorFilesLines
2022-04-08NFSD: Fix nfsd_breaker_owns_lease() return valuesChuck Lever1-2/+10
[ Upstream commit 50719bf3442dd6cd05159e9c98d020b3919ce978 ] These have been incorrect since the function was introduced. A proper kerneldoc comment is added since this function, though static, is part of an external interface. Reported-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08nfsd: more robust allocation failure handling in nfsd_file_cache_initAmir Goldstein1-3/+3
[ Upstream commit 4d2eeafecd6c83b4444db3dc0ada201c89b1aa44 ] The nfsd file cache table can be pretty large and its allocation may require as many as 80 contigious pages. Employ the same fix that was employed for similar issue that was reported for the reply cache hash table allocation several years ago by commit 8f97514b423a ("nfsd: more robust allocation failure handling in nfsd_reply_cache_init"). Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") Link: https://lore.kernel.org/linux-nfs/e3cdaeec85a6cfec980e87fc294327c0381c1778.camel@kernel.org/ Suggested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08NFSD: prevent underflow in nfssvc_decode_writeargs()Dan Carpenter2-2/+2
commit 184416d4b98509fb4c3d8fc3d6dc1437896cc159 upstream. Smatch complains: fs/nfsd/nfsxdr.c:341 nfssvc_decode_writeargs() warn: no lower bound on 'args->len' Change the type to unsigned to prevent this issue. Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-08nfsd: fix crash on COPY_NOTIFY with special stateidJ. Bruce Fields1-1/+5
[ Upstream commit 074b07d94e0bb6ddce5690a9b7e2373088e8b33a ] RTM says "If the special ONE stateid is passed to nfs4_preprocess_stateid_op(), it returns status=0 but does not set *cstid. nfsd4_copy_notify() depends on stid being set if status=0, and thus can crash if the client sends the right COPY_NOTIFY RPC." RFC 7862 says "The cna_src_stateid MUST refer to either open or locking states provided earlier by the server. If it is invalid, then the operation MUST fail." The RFC doesn't specify an error, and the choice doesn't matter much as this is clearly illegal client behavior, but bad_stateid seems reasonable. Simplest is just to guarantee that nfs4_preprocess_stateid_op, called with non-NULL cstid, errors out if it can't return a stateid. Reported-by: rtm@csail.mit.edu Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga Kornievskaia <kolga@netapp.com> Tested-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08Revert "nfsd: skip some unnecessary stats in the v4 case"Chuck Lever1-27/+17
[ Upstream commit 58f258f65267542959487dbe8b5641754411843d ] On the wire, I observed NFSv4 OPEN(CREATE) operations sometimes returning a reasonable-looking value in the cinfo.before field and zero in the cinfo.after field. RFC 8881 Section 10.8.1 says: > When a client is making changes to a given directory, it needs to > determine whether there have been changes made to the directory by > other clients. It does this by using the change attribute as > reported before and after the directory operation in the associated > change_info4 value returned for the operation. and > ... The post-operation change > value needs to be saved as the basis for future change_info4 > comparisons. A good quality client implementation therefore saves the zero cinfo.after value. During a subsequent OPEN operation, it will receive a different non-zero value in the cinfo.before field for that directory, and it will incorrectly believe the directory has changed, triggering an undesirable directory cache invalidation. There are filesystem types where fs_supports_change_attribute() returns false, tmpfs being one. On NFSv4 mounts, this means the fh_getattr() call site in fill_pre_wcc() and fill_post_wcc() is never invoked. Subsequently, nfsd4_change_attribute() is invoked with an uninitialized @stat argument. In fill_pre_wcc(), @stat contains stale stack garbage, which is then placed on the wire. In fill_post_wcc(), ->fh_post_wc is all zeroes, so zero is placed on the wire. Both of these values are meaningless. This fix can be applied immediately to stable kernels. Once there are more regression tests in this area, this optimization can be attempted again. Fixes: 428a23d2bf0c ("nfsd: skip some unnecessary stats in the v4 case") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08NFSD: Fix verifier returned in stable WRITEsChuck Lever1-0/+4
[ Upstream commit f11ad7aa653130b71e2e89bed207f387718216d5 ] RFC 8881 explains the purpose of the write verifier this way: > The final portion of the result is the field writeverf. This field > is the write verifier and is a cookie that the client can use to > determine whether a server has changed instance state (e.g., server > restart) between a call to WRITE and a subsequent call to either > WRITE or COMMIT. But then it says: > This cookie MUST be unchanged during a single instance of the > NFSv4.1 server and MUST be unique between instances of the NFSv4.1 > server. If the cookie changes, then the client MUST assume that > any data written with an UNSTABLE4 value for committed and an old > writeverf in the reply has been lost and will need to be > recovered. RFC 1813 has similar language for NFSv3. NFSv2 does not have a write verifier since it doesn't implement the COMMIT procedure. Since commit 19e0663ff9bc ("nfsd: Ensure sampling of the write verifier is atomic with the write"), the Linux NFS server has returned a boot-time-based verifier for UNSTABLE WRITEs, but a zero verifier for FILE_SYNC and DATA_SYNC WRITEs. FILE_SYNC and DATA_SYNC WRITEs are not followed up with a COMMIT, so there's no need for clients to compare verifiers for stable writes. However, by returning a different verifier for stable and unstable writes, the above commit puts the Linux NFS server a step farther out of compliance with the first MUST above. At least one NFS client (FreeBSD) noticed the difference, making this a potential regression. Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Link: https://lore.kernel.org/linux-nfs/YQXPR0101MB096857EEACF04A6DF1FC6D9BDD749@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM/T/ Fixes: 19e0663ff9bc ("nfsd: Ensure sampling of the write verifier is atomic with the write") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08NFSD: Fix zero-length NFSv3 WRITEsChuck Lever2-10/+1
[ Upstream commit 6a2f774424bfdcc2df3e17de0cefe74a4269cad5 ] The Linux NFS server currently responds to a zero-length NFSv3 WRITE request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE with NFS4_OK and count of zero. RFC 1813 says of the WRITE procedure's @count argument: count The number of bytes of data to be written. If count is 0, the WRITE will succeed and return a count of 0, barring errors due to permissions checking. RFC 8881 has similar language for NFSv4, though NFSv4 removed the explicit @count argument because that value is already contained in the opaque payload array. The synthetic client pynfs's WRT4 and WRT15 tests do emit zero- length WRITEs to exercise this spec requirement. Commit fdec6114ee1f ("nfsd4: zero-length WRITE should succeed") addressed the same problem there with the same fix. But interestingly the Linux NFS client does not appear to emit zero- length WRITEs, instead squelching them. I'm not aware of a test that can generate such WRITEs for NFSv3, so I wrote a naive C program to generate a zero-length WRITE and test this fix. Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders") Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()Chuck Lever7-26/+8
[ Upstream commit dae9a6cab8009e526570e7477ce858dcdfeb256e ] Refactor. Now that the NFSv2 and NFSv3 XDR decoders have been converted to use xdr_streams, the WRITE decoder functions can use xdr_stream_subsegment() to extract the WRITE payload into its own xdr_buf, just as the NFSv4 WRITE XDR decoder currently does. That makes it possible to pass the first kvec, pages array + length, page_base, and total payload length via a single function parameter. The payload's page_base is not yet assigned or used, but will be in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-16NFSD: Fix the behavior of READ near OFFSET_MAXChuck Lever3-10/+14
commit 0cb4d23ae08c48f6bf3c29a8e5c4a74b8388b960 upstream. Dan Aloni reports: > Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to > the RPC read layers") on the client, a read of 0xfff is aligned up > to server rsize of 0x1000. > > As a result, in a test where the server has a file of size > 0x7fffffffffffffff, and the client tries to read from the offset > 0x7ffffffffffff000, the read causes loff_t overflow in the server > and it returns an NFS code of EINVAL to the client. The client as > a result indefinitely retries the request. The Linux NFS client does not handle NFS?ERR_INVAL, even though all NFS specifications permit servers to return that status code for a READ. Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed and return a short result. Set the EOF flag in the result to prevent the client from retrying the READ request. This behavior appears to be consistent with Solaris NFS servers. Note that NFSv3 and NFSv4 use u64 offset values on the wire. These must be converted to loff_t internally before use -- an implicit type cast is not adequate for this purpose. Otherwise VFS checks against sb->s_maxbytes do not work properly. Reported-by: Dan Aloni <dan.aloni@vastdata.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16NFSD: Fix offset type in I/O trace pointsChuck Lever1-7/+7
commit 6a4d333d540041d244b2fca29b8417bfde20af81 upstream. NFSv3 and NFSv4 use u64 offset values on the wire. Record these values verbatim without the implicit type case to loff_t. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16NFSD: Clamp WRITE offsetsChuck Lever2-2/+8
commit 6260d9a56ab352b54891ec66ab0eced57d55abc6 upstream. Ensure that a client cannot specify a WRITE range that falls in a byte range outside what the kernel's internal types (such as loff_t, which is signed) can represent. The kiocb iterators, invoked in nfsd_vfs_write(), should properly limit write operations to within the underlying file system's s_maxbytes. Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16NFSD: Fix ia_size underflowChuck Lever1-0/+4
commit e6faac3f58c7c4176b66f63def17a34232a17b0e upstream. iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and NFSv4 both define file size as an unsigned 64-bit type. Thus there is a range of valid file size values an NFS client can send that is already larger than Linux can handle. Currently decode_fattr4() dumps a full u64 value into ia_size. If that value happens to be larger than S64_MAX, then ia_size underflows. I'm about to fix up the NFSv3 behavior as well, so let's catch the underflow in the common code path: nfsd_setattr(). Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-16NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizesChuck Lever1-1/+1
commit a648fdeb7c0e17177a2280344d015dba3fbe3314 upstream. iattr::ia_size is a loff_t, so these NFSv3 procedures must be careful to deal with incoming client size values that are larger than s64_max without corrupting the value. Silently capping the value results in storing a different value than the client passed in which is unexpected behavior, so remove the min_t() check in decode_sattr3(). Note that RFC 1813 permits only the WRITE procedure to return NFS3ERR_FBIG. We believe that NFSv3 reference implementations also return NFS3ERR_FBIG when ia_size is too large. Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.Dai Ngo1-1/+3
commit ab451ea952fe9d7afefae55ddb28943a148247fe upstream. From RFC 7530 Section 16.34.5: o The server has not recorded an unconfirmed { v, x, c, *, * } and has recorded a confirmed { v, x, c, *, s }. If the principals of the record and of SETCLIENTID_CONFIRM do not match, the server returns NFS4ERR_CLID_INUSE without removing any relevant leased client state, and without changing recorded callback and callback_ident values for client { x }. The current code intends to do what the spec describes above but it forgot to set 'old' to NULL resulting to the confirmed client to be expired. Fixes: 2b63482185e6 ("nfsd: fix clid_inuse on mount with security change") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-01fsnotify: fix fsnotify hooks in pseudo filesystemsAmir Goldstein1-2/+3
commit 29044dae2e746949ad4b9cbdbfb248994d1dcdb4 upstream. Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression in pseudo filesystems, convert d_delete() calls to d_drop() (see commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and move the fsnotify hook after d_drop(). Add a missing fsnotify_unlink() hook in nfsdfs that was found during the audit of fsnotify hooks in pseudo filesystems. Note that the fsnotify hooks in simple_recursive_removal() follow d_invalidate(), so they require no change. Link: https://lore.kernel.org/r/20220120215305.282577-2-amir73il@gmail.com Reported-by: Ivan Delalande <colona@arista.com> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-29NFSD: Fix READDIR buffer overflowChuck Lever2-11/+8
commit 53b1119a6e5028b125f431a0116ba73510d82a72 upstream. If a client sends a READDIR count argument that is too small (say, zero), then the buffer size calculation in the new init_dirlist helper functions results in an underflow, allowing the XDR stream functions to write beyond the actual buffer. This calculation has always been suspect. NFSD has never sanity- checked the READDIR count argument, but the old entry encoders managed the problem correctly. With the commits below, entry encoding changed, exposing the underflow to the pointer arithmetic in xdr_reserve_space(). Modern NFS clients attempt to retrieve as much data as possible for each READDIR request. Also, we have no unit tests that exercise the behavior of READDIR at the lower bound of @count values. Thus this case was missed during testing. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream") Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14nfsd: Fix nsfd startup race (again)Alexander Sverdlin2-7/+8
commit b10252c7ae9c9d7c90552f88b544a44ee773af64 upstream. Commit bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") has re-opened rpc_pipefs_event() race against nfsd_net_id registration (register_pernet_subsys()) which has been fixed by commit bb7ffbf29e76 ("nfsd: fix nsfd startup race triggering BUG_ON"). Restore the order of register_pernet_subsys() vs register_cld_notifier(). Add WARN_ON() to prevent a future regression. Crash info: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000012 CPU: 8 PID: 345 Comm: mount Not tainted 5.4.144-... #1 pc : rpc_pipefs_event+0x54/0x120 [nfsd] lr : rpc_pipefs_event+0x48/0x120 [nfsd] Call trace: rpc_pipefs_event+0x54/0x120 [nfsd] blocking_notifier_call_chain rpc_fill_super get_tree_keyed rpc_fs_get_tree vfs_get_tree do_mount ksys_mount __arm64_sys_mount el0_svc_handler el0_svc Fixes: bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") Cc: stable@vger.kernel.org Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14nfsd: fix use-after-free due to delegation raceJ. Bruce Fields1-2/+7
commit 548ec0805c399c65ed66c6641be467f717833ab5 upstream. A delegation break could arrive as soon as we've called vfs_setlease. A delegation break runs a callback which immediately (in nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we then exit nfs4_set_delegation without hashing the delegation, it will be freed as soon as the callback is done with it, without ever being removed from del_recall_lru. Symptoms show up later as use-after-free or list corruption warnings, usually in the laundromat thread. I suspect aba2072f4523 "nfsd: grant read delegations to clients holding writes" made this bug easier to hit, but I looked as far back as v3.0 and it looks to me it already had the same problem. So I'm not sure where the bug was introduced; it may have been there from the beginning. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25NFSD: Fix exposure in nfsd4_decode_bitmap()Chuck Lever1-5/+2
[ Upstream commit c0019b7db1d7ac62c711cda6b357a659d46428fe ] rtm@csail.mit.edu reports: > nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC > directs it to do so. This can cause nfsd4_decode_state_protect4_a() > to write client-supplied data beyond the end of > nfsd4_exchange_id.spo_must_allow[] when called by > nfsd4_decode_exchange_id(). Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond @bmlen. Reported by: rtm@csail.mit.edu Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-08Merge tag 'nfsd-5.15-3' of ↵Linus Torvalds3-11/+17
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Bug fixes for NFSD error handling paths" * tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Keep existing listeners on portlist error SUNRPC: fix sign error causing rpcsec_gss drops nfsd: Fix a warning for nfsd_file_close_inode nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
2021-10-06NFSD: Keep existing listeners on portlist errorBenjamin Coddington1-1/+4
If nfsd has existing listening sockets without any processes, then an error returned from svc_create_xprt() for an additional transport will remove those existing listeners. We're seeing this in practice when userspace attempts to create rpcrdma transports without having the rpcrdma modules present before creating nfsd kernel processes. Fix this by checking for existing sockets before calling nfsd_destroy(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-10-01nfsd: Fix a warning for nfsd_file_close_inodeTrond Myklebust1-1/+1
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-30nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zeroTrond Myklebust1-8/+11
RFC3530 notes that the 'dircount' field may be zero, in which case the recommendation is to ignore it, and only enforce the 'maxcount' field. In RFC5661, this recommendation to ignore a zero valued field becomes a requirement. Fixes: aee377644146 ("nfsd4: fix rd_dircount enforcement") Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-30nfsd: fix error handling of register_pernet_subsys() in init_nfsd()Patrick Ho1-1/+1
init_nfsd() should not unregister pernet subsys if the register fails but should instead unwind from the last successful operation which is register_filesystem(). Unregistering a failed register_pernet_subsys() call can result in a kernel GPF as revealed by programmatically injecting an error in register_pernet_subsys(). Verified the fix handled failure gracefully with no lingering nfsd entry in /proc/filesystems. This change was introduced by the commit bd5ae9288d64 ("nfsd: register pernet ops last, unregister first"), the original error handling logic was correct. Fixes: bd5ae9288d64 ("nfsd: register pernet ops last, unregister first") Cc: stable@vger.kernel.org Signed-off-by: Patrick Ho <Patrick.Ho@netapp.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-22Merge tag 'nfsd-5.15-2' of ↵Linus Torvalds1-3/+13
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Critical bug fixes: - Fix crash in NLM TEST procedure - NFSv4.1+ backchannel not restored after PATH_DOWN" * tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN NLM: Fix svcxdr_encode_owner()
2021-09-17nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWNDai Ngo1-3/+13
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client recovers by sending BIND_CONN_TO_SESSION but the server fails to recover the back channel and leaves it as NFSD4_CB_DOWN. Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel by calling nfsd4_probe_callback. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-03Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+1
Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, qla2xxx, target, smartpqi, lpfc, mpt3sas). The core change causing the most churn was replacing the command request field request with a macro, allowing us to offset map to it and remove the redundant field; the same was also done for the tag field. The most impactful change is the final removal of scsi_ioctl, which has been deprecated for over a decade" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (293 commits) scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1 scsi: ufs: ufs-exynos: Fix static checker warning scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Copyright updates for 14.0.0.1 patches scsi: lpfc: Update lpfc version to 14.0.0.1 scsi: lpfc: Add bsg support for retrieving adapter cmf data scsi: lpfc: Add cmf_info sysfs entry scsi: lpfc: Add debugfs support for cm framework buffers scsi: lpfc: Add support for maintaining the cm statistics buffer scsi: lpfc: Add rx monitoring statistics scsi: lpfc: Add support for the CM framework scsi: lpfc: Add cmfsync WQE support scsi: lpfc: Add support for cm enablement buffer scsi: lpfc: Add cm statistics buffer support scsi: lpfc: Add EDC ELS support scsi: lpfc: Expand FPIN and RDF receive logging scsi: lpfc: Add MIB feature enablement support scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware scsi: fc: Add EDC ELS definition ...
2021-08-31Merge tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds5-34/+33
Pull nfsd updates from Chuck Lever: "New features: - Support for server-side disconnect injection via debugfs - Protocol definitions for new RPC_AUTH_TLS authentication flavor Performance improvements: - Reduce page allocator traffic in the NFSD splice read actor - Reduce CPU utilization in svcrdma's Send completion handler Notable bug fixes: - Stabilize lockd operation when re-exporting NFS mounts - Fix the use of %.*s in NFSD tracepoints - Fix /proc/sys/fs/nfs/nsm_use_hostnames" * tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (31 commits) nfsd: fix crash on LOCKT on reexported NFSv3 nfs: don't allow reexport reclaims lockd: don't attempt blocking locks on nfs reexports nfs: don't atempt blocking locks on nfs reexports Keep read and write fds with each nlm_file lockd: update nlm_lookup_file reexport comment nlm: minor refactoring nlm: minor nlm_lookup_file argument change lockd: lockd server-side shouldn't set fl_ops SUNRPC: Add documentation for the fail_sunrpc/ directory SUNRPC: Server-side disconnect injection SUNRPC: Move client-side disconnect injection SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory svcrdma: xpt_bc_xprt is already clear in __svc_rdma_free() nfsd4: Fix forced-expiry locking rpc: fix gss_svc_init cleanup on failure SUNRPC: Add RPC_AUTH_TLS protocol numbers lockd: change the proc_handler for nsm_use_hostnames sysctl: introduce new proc handler proc_dobool SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency() ...
2021-08-26nfsd: fix crash on LOCKT on reexported NFSv3J. Bruce Fields1-2/+3
Unlike other filesystems, NFSv3 tries to use fl_file in the GETLK case. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26nfs: don't allow reexport reclaimsJ. Bruce Fields2-0/+4
In the reexport case, nfsd is currently passing along locks with the reclaim bit set. The client sends a new lock request, which is granted if there's currently no conflict--even if it's possible a conflicting lock could have been briefly held in the interim. We don't currently have any way to safely grant reclaim, so for now let's just deny them all. I'm doing this by passing the reclaim bit to nfs and letting it fail the call, with the idea that eventually the client might be able to do something more forgiving here. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26nfs: don't atempt blocking locks on nfs reexportsJ. Bruce Fields1-2/+6
NFS implements blocking locks by blocking inside its lock method. In the reexport case, this blocks the nfs server thread, which could lead to deadlocks since an nfs server thread might be required to unlock the conflicting lock. It also causes a crash, since the nfs server thread assumes it can free the lock when its lm_notify lock callback is called. Ideal would be to make the nfs lock method return without blocking in this case, but for now it works just not to attempt blocking locks. The difference is just that the original client will have to poll (as it does in the v4.0 case) instead of getting a callback when the lock's available. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-24Keep read and write fds with each nlm_fileJ. Bruce Fields1-2/+6
We shouldn't really be using a read-only file descriptor to take a write lock. Most filesystems will put up with it. But NFS, for example, won't. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-23fs: remove mandatory file locking supportJeff Layton2-36/+1
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it off in Fedora and RHEL8. Several other distros have followed suit. I've heard of one problem in all that time: Someone migrated from an older distro that supported "-o mand" to one that didn't, and the host had a fstab entry with "mand" in it which broke on reboot. They didn't actually _use_ mandatory locking so they just removed the mount option and moved on. This patch rips out mandatory locking support wholesale from the kernel, along with the Kconfig option and the Documentation file. It also changes the mount code to ignore the "mand" mount option instead of erroring out, and to throw a big, ugly warning. Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-17nfsd4: Fix forced-expiry lockingJ. Bruce Fields1-2/+2
This should use the network-namespace-wide client_lock, not the per-client cl_lock. You shouldn't see any bugs unless you're actually using the forced-expiry interface introduced by 89c905beccbb. Fixes: 89c905beccbb "nfsd: allow forced expiration of NFSv4 clients" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17NFSD: remove vanity commentsNeilBrown1-1/+0
Including one's name in copyright claims is appropriate. Including it in random comments is just vanity. After 2 decades, it is time for these to be gone. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17NFSD: Use new __string_len C macros for nfsd_clid_classChuck Lever1-3/+2
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17NFSD: Use new __string_len C macros for the nfs_dirent tracepointChuck Lever1-7/+5
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17NFSD: Batch release pages during splice readChuck Lever1-7/+2
Large splice reads call put_page() repeatedly. put_page() is relatively expensive to call, so replace it with the new svc_rqst_replace_page() helper to help amortize that cost. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de>
2021-08-17NFSD: Clean up splice actorChuck Lever1-8/+3
A few useful observations: - The value in @size is never modified. - splice_desc.len is an unsigned int, and so is xdr_buf.page_len. An implicit cast to size_t is unnecessary. - The computation of .page_len is the same in all three arms of the "if" statement, so hoist it out to make it clear that the operation is an unconditional invariant. The resulting function is 18 bytes shorter on my system (-Os). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de>
2021-07-29scsi: core: Rename CONFIG_BLK_SCSI_REQUEST to CONFIG_SCSI_COMMONChristoph Hellwig1-1/+1
CONFIG_BLK_SCSI_REQUEST is rather misnamed as it enables building a small amount of code shared by the SCSI initiator, target, and consumers of the scsi_request passthrough API. Rename it and also allow building it as a module. [mkp: add module license] Link: https://lore.kernel.org/r/20210724072033.1284840-20-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-09Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull more block updates from Jens Axboe: "A combination of changes that ended up depending on both the driver and core branch (and/or the IDE removal), and a few late arriving fixes. In detail: - Fix io ticks wrap-around issue (Chunguang) - nvme-tcp sock locking fix (Maurizio) - s390-dasd fixes (Kees, Christoph) - blk_execute_rq polling support (Keith) - blk-cgroup RCU iteration fix (Yu) - nbd backend ID addition (Prasanna) - Partition deletion fix (Yufen) - Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph) - Removal of now dead block request types due to IDE removal (Christoph) - Loop probing and control device cleanups (Christoph) - Device uevent fix (Christoph) - Misc cleanups/fixes (Tetsuo, Christoph)" * tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits) blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs block: fix the problem of io_ticks becoming smaller nvme-tcp: can't set sk_user_data without write_lock loop: remove unused variable in loop_set_status() block: remove the bdgrab in blk_drop_partitions block: grab a device refcount in disk_uevent s390/dasd: Avoid field over-reading memcpy() dasd: unexport dasd_set_target_state block: check disk exist before trying to add partition ubd: remove dead code in ubd_setup_common nvme: use return value from blk_execute_rq() block: return errors from blk_execute_rq() nvme: use blk_execute_rq() for passthrough commands block: support polling through blk_execute_rq block: remove REQ_OP_SCSI_{IN,OUT} block: mark blk_mq_init_queue_data static loop: rewrite loop_exit using idr_for_each_entry loop: split loop_lookup loop: don't allow deleting an unspecified loop device loop: move loop_ctl_mutex locking into loop_add ...
2021-07-07Merge tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linuxLinus Torvalds11-132/+546
Pull nfsd updates from Bruce Fields: - add tracepoints for callbacks and for client creation and destruction - cache the mounts used for server-to-server copies - expose callback information in /proc/fs/nfsd/clients/*/info - don't hold locks unnecessarily while waiting for commits - update NLM to use xdr_stream, as we have for NFSv2/v3/v4 * tag 'nfsd-5.14' of git://linux-nfs.org/~bfields/linux: (69 commits) nfsd: fix NULL dereference in nfs3svc_encode_getaclres NFSD: Prevent a possible oops in the nfs_dirent() tracepoint nfsd: remove redundant assignment to pointer 'this' nfsd: Reduce contention for the nfsd_file nf_rwsem lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream lockd: Update the NLMv4 void results encoder to use struct xdr_stream lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream ...
2021-07-07nfsd: fix NULL dereference in nfs3svc_encode_getaclresJ. Bruce Fields1-1/+2
In error cases the dentry may be NULL. Before 20798dfe249a, the encoder also checked dentry and d_really_is_positive(dentry), but that looks like overkill to me--zero status should be enough to guarantee a positive dentry. This isn't the first time we've seen an error-case NULL dereference hidden in the initialization of a local variable in an xdr encoder. But I went back through the other recent rewrites and didn't spot any similar bugs. Reported-by: JianHong Yin <jiyin@redhat.com> Reviewed-by: Chuck Lever III <chuck.lever@oracle.com> Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder...") Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07NFSD: Prevent a possible oops in the nfs_dirent() tracepointChuck Lever1-1/+0
The double copy of the string is a mistake, plus __assign_str() uses strlen(), which is wrong to do on a string that isn't guaranteed to be NUL-terminated. Fixes: 6019ce0742ca ("NFSD: Add a tracepoint to record directory entry encoding") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07nfsd: remove redundant assignment to pointer 'this'Colin Ian King1-1/+1
The pointer 'this' is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07nfsd: Reduce contention for the nfsd_file nf_rwsemTrond Myklebust1-2/+16
When flushing out the unstable file writes as part of a COMMIT call, try to perform most of of the data writes and waits outside the semaphore. This means that if the client is sending the COMMIT as part of a memory reclaim operation, then it can continue performing I/O, with contention for the lock occurring only once the data sync is finished. Fixes: 5011af4c698a ("nfsd: Fix stable writes") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07nfsd: rpc_peeraddr2str needs rcu lockJ. Bruce Fields1-0/+2
I'm not even sure cl_xprt can change here, but we're getting "suspicious RCU usage" warnings, and other rpc_peeraddr2str callers are taking the rcu lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07NFSD: Fix error return code in nfsd4_interssc_connect()Wei Yongjun1-0/+1
'status' has been overwritten to 0 after nfsd4_ssc_setup_dul(), this cause 0 will be return in vfs_kern_mount() error case. Fix to return nfserr_nodev in this error. Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07nfsd: fix kernel test robot warning in SSC codeDai Ngo2-3/+3
Fix by initializing pointer nfsd4_ssc_umount_item with NULL instead of 0. Replace return value of nfsd4_ssc_setup_dul with __be32 instead of int. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-07-07nfsd4: Expose the callback address and state of each NFS4 clientDave Wysochanski1-0/+17
In addition to the client's address, display the callback channel state and address in the 'info' file. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>