summaryrefslogtreecommitdiff
path: root/net/sunrpc
AgeCommit message (Collapse)AuthorFilesLines
2025-02-02Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-9/+5
Pull misc vfs cleanups from Al Viro: "Two unrelated patches - one is a removal of long-obsolete include in overlayfs (it used to need fs/internal.h, but the extern it wanted has been moved back to include/linux/namei.h) and another introduces convenience helper constructing struct qstr by a NUL-terminated string" * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: add a string-to-qstr constructor fs/overlayfs/namei.c: get rid of include ../internal.h
2025-01-29Merge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2-9/+35
Pull NFS client updates from Anna Schumaker: "New Features: - Enable using direct IO with localio - Added localio related tracepoints Bugfixes: - Sunrpc fixes for working with a very large cl_tasks list - Fix a possible buffer overflow in nfs_sysfs_link_rpc_client() - Fixes for handling reconnections with localio - Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT - Fix COPY_NOTIFY xdr_buf size calculations - pNFS/Flexfiles fix for retrying requesting a layout segment for reads - Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is expired Cleanups: - Various other nfs & nfsd localio cleanups - Prepratory patches for async copy improvements that are under development - Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other xprts - Add netns inum and srcaddr to debugfs rpc_xprt info" * tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (28 commits) SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expired sunrpc: add netns inum and srcaddr to debugfs rpc_xprt info pnfs/flexfiles: retry getting layout segment for reads NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE NFSv4.2: fix COPY_NOTIFY xdr buf size calculation NFS: Rename struct nfs4_offloadcancel_data NFS: Fix typo in OFFLOAD_CANCEL comment NFS: CB_OFFLOAD can return NFS4ERR_DELAY nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it nfs: fix incorrect error handling in LOCALIO nfs: probe for LOCALIO when v3 client reconnects to server nfs: probe for LOCALIO when v4 client reconnects to server nfs/localio: remove redundant code and simplify LOCALIO enablement nfs_common: add nfs_localio trace events nfs_common: track all open nfsd_files per LOCALIO nfs_client nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_ nfsd: update percpu_ref to manage references on nfsd_net ...
2025-01-28Merge tag 'nfsd-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds10-369/+53
Pull nfsd updates from Chuck Lever: "Jeff Layton contributed an implementation of NFSv4.2+ attribute delegation, as described here: https://www.ietf.org/archive/id/draft-ietf-nfsv4-delstid-08.html This interoperates with similar functionality introduced into the Linux NFS client in v6.11. An attribute delegation permits an NFS client to manage a file's mtime, rather than flushing dirty data to the NFS server so that the file's mtime reflects the last write, which is considerably slower. Neil Brown contributed dynamic NFSv4.1 session slot table resizing. This facility enables NFSD to increase or decrease the number of slots per NFS session depending on server memory availability. More session slots means greater parallelism. Chuck Lever fixed a long-standing latent bug where NFSv4 COMPOUND encoding screws up when crossing a page boundary in the encoding buffer. This is a zero-day bug, but hitting it is rare and depends on the NFS client implementation. The Linux NFS client does not happen to trigger this issue. A variety of bug fixes and other incremental improvements fill out the list of commits in this release. Great thanks to all contributors, reviewers, testers, and bug reporters who participated during this development cycle" * tag 'nfsd-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (42 commits) sunrpc: Remove gss_{de,en}crypt_xdr_buf deadcode sunrpc: Remove gss_generic_token deadcode sunrpc: Remove unused xprt_iter_get_xprt Revert "SUNRPC: Reduce thread wake-up rate when receiving large RPC messages" nfsd: implement OPEN_ARGS_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION nfsd: handle delegated timestamps in SETATTR nfsd: add support for delegated timestamps nfsd: rework NFS4_SHARE_WANT_* flag handling nfsd: add support for FATTR4_OPEN_ARGUMENTS nfsd: prepare delegation code for handing out *_ATTRS_DELEG delegations nfsd: rename NFS4_SHARE_WANT_* constants to OPEN4_SHARE_ACCESS_WANT_* nfsd: switch to autogenerated definitions for open_delegation_type4 nfs_common: make include/linux/nfs4.h include generated nfs4_1.h nfsd: fix handling of delegated change attr in CB_GETATTR SUNRPC: Document validity guarantees of the pointer returned by reserve_space NFSD: Insulate nfsd4_encode_fattr4() from page boundaries in the encode buffer NFSD: Insulate nfsd4_encode_secinfo() from page boundaries in the encode buffer NFSD: Refactor nfsd4_do_encode_secinfo() again NFSD: Insulate nfsd4_encode_readlink() from page boundaries in the encode buffer NFSD: Insulate nfsd4_encode_read_plus_data() from page boundaries in the encode buffer ...
2025-01-28add a string-to-qstr constructorAl Viro1-9/+5
Quite a few places want to build a struct qstr by given string; it would be convenient to have a primitive doing that, rather than open-coding it via QSTR_INIT(). The closest approximation was in bcachefs, but that expands to initializer list - {.len = strlen(string), .name = string}. It would be more useful to have it as compound literal - (struct qstr){.len = strlen(string), .name = string}. Unlike initializer list it's a valid expression. What's more, it's a valid lvalue - it's an equivalent of anonymous local variable with such initializer, so the things like path->dentry = d_alloc_pseudo(mnt->mnt_sb, &QSTR(name)); are valid. It can also be used as initializer, with identical effect - struct qstr x = (struct qstr){.name = s, .len = strlen(s)}; is equivalent to struct qstr anon_variable = {.name = s, .len = strlen(s)}; struct qstr x = anon_variable; // anon_variable is never used after that point and any even remotely sane compiler will manage to collapse that into struct qstr x = {.name = s, .len = strlen(s)}; What compound literals can't be used for is initialization of global variables, but those are covered by QSTR_INIT(). This commit lifts definition(s) of QSTR() into linux/dcache.h, converts it to compound literal (all bcachefs users are fine with that) and converts assorted open-coded instances to using that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-01-26mm: alloc_pages_bulk: rename APILuiz Capitulino2-4/+3
The previous commit removed the page_list argument from alloc_pages_bulk_noprof() along with the alloc_pages_bulk_list() function. Now that only the *_array() flavour of the API remains, we can do the following renaming (along with the _noprof() ones): alloc_pages_bulk_array -> alloc_pages_bulk alloc_pages_bulk_array_mempolicy -> alloc_pages_bulk_mempolicy alloc_pages_bulk_array_node -> alloc_pages_bulk_node Link: https://lkml.kernel.org/r/275a3bbc0be20fbe9002297d60045e67ab3d4ada.1734991165.git.luizcap@redhat.com Signed-off-by: Luiz Capitulino <luizcap@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-22SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expiredDai Ngo1-2/+2
When a user TGT ticket expired, gssd returns EKEYEXPIRED to the RPC layer for the upcall to create the security context. The RPC layer then retries the upcall twice before returning the EKEYEXPIRED to the NFS layer. This results in three separate TCP connections to the NFS server being created by gssd for each RPC request. These connections are not used and left in TIME_WAIT state. Note that for RPC call that uses machine credential, gssd automatically renews the ticket. But for a regular user the ticket needs to be renewed by the user before access to the krb5 share is allowed. This patch removes the retries by RPC on EKEYEXPIRED so that these unused TCP connections are not created. Reproducer: $ kinit -l 1m $ sleep 65 $ cd /mnt/krb5share $ netstat -na |grep TIME_WAIT Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-01-22sunrpc: add netns inum and srcaddr to debugfs rpc_xprt infoJeff Layton1-0/+12
The output format should provide a value that matches the one in the /proc/<pid>/ns/net symlink. This makes it simpler to match the rpc_xprt and rpc_clnt to a particular container. Also, when the xprt defines the get_srcaddr operation, use that to display the source address as well. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-01-21sunrpc: Remove gss_{de,en}crypt_xdr_buf deadcodeDr. David Alan Gilbert2-62/+0
Commit ec596aaf9b48 ("SUNRPC: Remove code behind CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED") was the last user of the gss_decrypt_xdr_buf() and gss_encrypt_xdr_buf() functions. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-21sunrpc: Remove gss_generic_token deadcodeDr. David Alan Gilbert3-233/+1
Commit ec596aaf9b48 ("SUNRPC: Remove code behind CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED") was the last user of the routines in gss_generic_token.c. Remove the routines and associated header. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-21sunrpc: Remove unused xprt_iter_get_xprtDr. David Alan Gilbert1-17/+0
xprt_iter_get_xprt() was added by commit 80b14d5e61ca ("SUNRPC: Add a structure to track multiple transports") but is unused. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-21Revert "SUNRPC: Reduce thread wake-up rate when receiving large RPC messages"Chuck Lever1-11/+1
I noticed that a handful of NFSv3 fstests were taking an unexpectedly long time to run. Troubleshooting showed that the server's TCP window closed and never re-opened, which caused the client to trigger an RPC retransmit timeout after 180 seconds. The client's recovery action was to establish a fresh connection and retransmit the timed-out requests. This worked, but it adds a long delay. I tracked the problem to the commit that attempted to reduce the rate at which the network layer delivers TCP socket data_ready callbacks. Under most circumstances this change worked as expected, but for NFSv3, which has no session or other type of throttling, it can overwhelm the receiver on occasion. I'm sure I could tweak the lowat settings, but the small benefit doesn't seem worth the bother. Just revert it. Fixes: 2b877fc53e97 ("SUNRPC: Reduce thread wake-up rate when receiving large RPC messages") Cc: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-13SUNRPC: display total RPC tasks for RPC clientDai Ngo2-2/+8
Display the total number of RPC tasks, including tasks waiting on workqueue and wait queues, for rpc_clnt. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-01-13SUNRPC: only put task on cl_tasks list after the RPC call slot is reserved.Dai Ngo1-5/+13
Under heavy write load, we've seen the cl_tasks list grows to millions of entries. Even though the list is extremely long, the system still runs fine until the user wants to get the information of all active RPC tasks by doing: When this happens, tasks_start acquires the cl_lock to walk the cl_tasks list, returning one entry at a time to the caller. The cl_lock is held until all tasks on this list have been processed. While the cl_lock is held, completed RPC tasks have to spin wait in rpc_task_release_client for the cl_lock. If there are millions of entries in the cl_tasks list it will take a long time before tasks_stop is called and the cl_lock is released. The spin wait tasks can use up all the available CPUs in the system, preventing other jobs to run, this causes the system to temporarily lock up. This patch fixes this problem by delaying inserting the RPC task on the cl_tasks list until the RPC call slot is reserved. This limits the length of the cl_tasks to the number of call slots available in the system. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-01-11SUNRPC: Document validity guarantees of the pointer returned by reserve_spaceChuck Lever1-0/+6
A subtlety of this API is that if the @nbytes region traverses a page boundary, the next __xdr_commit_encode will shift the data item in the XDR encode buffer. This makes the returned pointer point to something else, leading to unexpected behavior. There are a few cases where the caller saves the returned pointer and then later uses it to insert a computed value into an earlier part of the stream. This can be safe only if either: - the data item is guaranteed to be in the XDR buffer's head, and thus is not ever going to be near a page boundary, or - the data item is no larger than 4 octets, since XDR alignment rules require all data items to start on 4-octet boundaries But that safety is only an artifact of the current implementation. It would be less brittle if these "safe" uses were eventually replaced. Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-06SUNRPC: no need get cache ref when protected by rcuYang Erkun1-9/+3
rcu_read_lock/rcu_read_unlock has already provide protection for the pointer we will reference when we call c_show. Therefore, there is no need to obtain a cache reference to help protect cache_head. Additionally, the .put such as expkey_put/svc_export_put will invoke dput, which can sleep and break rcu. Stop get cache reference to fix them all. Fixes: ae74136b4bb6 ("SUNRPC: Allow cache lookups to use RCU protection rather than the r/w spinlock") Suggested-by: NeilBrown <neilb@suse.de> Signed-off-by: Yang Erkun <yangerkun@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-06SUNRPC: introduce cache_check_rcu to help check in rcu contextYang Erkun1-15/+26
This is a prepare patch to add cache_check_rcu, will use it with follow patch. Suggested-by: NeilBrown <neilb@suse.de> Signed-off-by: Yang Erkun <yangerkun@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-06sunrpc: remove all connection limit configurationNeilBrown1-7/+1
Now that the connection limit only apply to unconfirmed connections, there is no need to configure it. So remove all the configuration and fix the number of unconfirmed connections as always 64 - which is now given a name: XPT_MAX_TMP_CONN Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-01-06nfsd: don't use sv_nrthreads in connection limiting calculations.NeilBrown1-16/+16
The heuristic for limiting the number of incoming connections to nfsd currently uses sv_nrthreads - allowing more connections if more threads were configured. A future patch will allow number of threads to grow dynamically so that there will be no need to configure sv_nrthreads. So we need a different solution for limiting connections. It isn't clear what problem is solved by limiting connections (as mentioned in a code comment) but the most likely problem is a connection storm - many connections that are not doing productive work. These will be closed after about 6 minutes already but it might help to slow down a storm. This patch adds a per-connection flag XPT_PEER_VALID which indicates that the peer has presented a filehandle for which it has some sort of access. i.e the peer is known to be trusted in some way. We now only count connections which have NOT been determined to be valid. There should be relative few of these at any given time. If the number of non-validated peer exceed a limit - currently 64 - we close the oldest non-validated peer to avoid having too many of these useless connections. Note that this patch significantly changes the meaning of the various configuration parameters for "max connections". The next patch will remove all of these. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra1-1/+1
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-30Merge tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2-5/+17
Pull NFS client updates from Trond Myklebust: "Bugfixes: - nfs/localio: fix for a memory corruption in nfs_local_read_done - Revert "nfs: don't reuse partially completed requests in nfs_lock_and_join_requests" - nfsv4: - ignore SB_RDONLY when mounting nfs - Fix a use-after-free problem in open() - sunrpc: - clear XPRT_SOCK_UPD_TIMEOUT when reseting the transport - timeout and cancel TLS handshake with -ETIMEDOUT - fix one UAF issue caused by sunrpc kernel tcp socket - Fix a hang in TLS sock_close if sk_write_pending - pNFS/blocklayout: Fix device registration issues Features and cleanups: - localio cleanups from Mike Snitzer - Clean up refcounting on the nfs version modules - __counted_by() annotations - nfs: make processes that are waiting for an I/O lock killable" * tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) fs/nfs/io: make nfs_start_io_*() killable nfs/blocklayout: Limit repeat device registration on failure nfs/blocklayout: Don't attempt unregister for invalid block device sunrpc: fix one UAF issue caused by sunrpc kernel tcp socket SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUT sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport nfs: ignore SB_RDONLY when mounting nfs Revert "nfs: don't reuse partially completed requests in nfs_lock_and_join_requests" Revert "fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private" nfs/localio: must clear res.replen in nfs_local_read_done NFSv4.0: Fix a use-after-free problem in the asynchronous open() NFSv4.0: Fix the wake up of the next waiter in nfs_release_seqid() SUNRPC: Fix a hang in TLS sock_close if sk_write_pending sunrpc: remove newlines from tracepoints nfs: Annotate struct pnfs_commit_array with __counted_by() nfs/localio: eliminate need for nfs_local_fsync_work forward declaration nfs/localio: remove extra indirect nfs_to call to check {read,write}_iter nfs/localio: eliminate unnecessary kref in nfs_local_fsync_ctx nfs/localio: remove redundant suid/sgid handling NFS: Implement get_nfs_version() ...
2024-11-28sunrpc: fix one UAF issue caused by sunrpc kernel tcp socketLiu Jian2-0/+11
BUG: KASAN: slab-use-after-free in tcp_write_timer_handler+0x156/0x3e0 Read of size 1 at addr ffff888111f322cd by task swapper/0/0 CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-rc4-dirty #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 Call Trace: <IRQ> dump_stack_lvl+0x68/0xa0 print_address_description.constprop.0+0x2c/0x3d0 print_report+0xb4/0x270 kasan_report+0xbd/0xf0 tcp_write_timer_handler+0x156/0x3e0 tcp_write_timer+0x66/0x170 call_timer_fn+0xfb/0x1d0 __run_timers+0x3f8/0x480 run_timer_softirq+0x9b/0x100 handle_softirqs+0x153/0x390 __irq_exit_rcu+0x103/0x120 irq_exit_rcu+0xe/0x20 sysvec_apic_timer_interrupt+0x76/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:default_idle+0xf/0x20 Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 33 f8 25 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 RSP: 0018:ffffffffa2007e28 EFLAGS: 00000242 RAX: 00000000000f3b31 RBX: 1ffffffff4400fc7 RCX: ffffffffa09c3196 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9f00590f RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed102360835d R10: ffff88811b041aeb R11: 0000000000000001 R12: 0000000000000000 R13: ffffffffa202d7c0 R14: 0000000000000000 R15: 00000000000147d0 default_idle_call+0x6b/0xa0 cpuidle_idle_call+0x1af/0x1f0 do_idle+0xbc/0x130 cpu_startup_entry+0x33/0x40 rest_init+0x11f/0x210 start_kernel+0x39a/0x420 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x97/0xa0 common_startup_64+0x13e/0x141 </TASK> Allocated by task 595: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x87/0x90 kmem_cache_alloc_noprof+0x12b/0x3f0 copy_net_ns+0x94/0x380 create_new_namespaces+0x24c/0x500 unshare_nsproxy_namespaces+0x75/0xf0 ksys_unshare+0x24e/0x4f0 __x64_sys_unshare+0x1f/0x30 do_syscall_64+0x70/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 100: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x54/0x70 kmem_cache_free+0x156/0x5d0 cleanup_net+0x5d3/0x670 process_one_work+0x776/0xa90 worker_thread+0x2e2/0x560 kthread+0x1a8/0x1f0 ret_from_fork+0x34/0x60 ret_from_fork_asm+0x1a/0x30 Reproduction script: mkdir -p /mnt/nfsshare mkdir -p /mnt/nfs/netns_1 mkfs.ext4 /dev/sdb mount /dev/sdb /mnt/nfsshare systemctl restart nfs-server chmod 777 /mnt/nfsshare exportfs -i -o rw,no_root_squash *:/mnt/nfsshare ip netns add netns_1 ip link add name veth_1_peer type veth peer veth_1 ifconfig veth_1_peer 11.11.0.254 up ip link set veth_1 netns netns_1 ip netns exec netns_1 ifconfig veth_1 11.11.0.1 ip netns exec netns_1 /root/iptables -A OUTPUT -d 11.11.0.254 -p tcp \ --tcp-flags FIN FIN -j DROP (note: In my environment, a DESTROY_CLIENTID operation is always sent immediately, breaking the nfs tcp connection.) ip netns exec netns_1 timeout -s 9 300 mount -t nfs -o proto=tcp,vers=4.1 \ 11.11.0.254:/mnt/nfsshare /mnt/nfs/netns_1 ip netns del netns_1 The reason here is that the tcp socket in netns_1 (nfs side) has been shutdown and closed (done in xs_destroy), but the FIN message (with ack) is discarded, and the nfsd side keeps sending retransmission messages. As a result, when the tcp sock in netns_1 processes the received message, it sends the message (FIN message) in the sending queue, and the tcp timer is re-established. When the network namespace is deleted, the net structure accessed by tcp's timer handler function causes problems. To fix this problem, let's hold netns refcnt for the tcp kernel socket as done in other modules. This is an ugly hack which can easily be backported to earlier kernels. A proper fix which cleans up the interfaces will follow, but may not be so easy to backport. Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Signed-off-by: Liu Jian <liujian56@huawei.com> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUTBenjamin Coddington1-5/+4
We've noticed a situation where an unstable TCP connection can cause the TLS handshake to timeout waiting for userspace to complete it. When this happens, we don't want to return from xs_tls_handshake_sync() with zero, as this will cause the upper xprt to be set CONNECTED, and subsequent attempts to transmit will be returned with -EPIPE. The sunrpc machine does not recover from this situation and will spin attempting to transmit. The return value of tls_handshake_cancel() can be used to detect a race with completion: * tls_handshake_cancel - cancel a pending handshake * Return values: * %true - Uncompleted handshake request was canceled * %false - Handshake request already completed or not found If true, we do not want the upper xprt to be connected, so return -ETIMEDOUT. If false, its possible the handshake request was lost and that may be the reason for our timeout. Again we do not want the upper xprt to be connected, so return -ETIMEDOUT. Ensure that we alway return an error from xs_tls_handshake_sync() if we call tls_handshake_cancel(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transportLiu Jian1-0/+1
Since transport->sock has been set to NULL during reset transport, XPRT_SOCK_UPD_TIMEOUT also needs to be cleared. Otherwise, the xs_tcp_set_socket_timeouts() may be triggered in xs_tcp_send_request() to dereference the transport->sock that has been set to NULL. Fixes: 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-26Merge tag 'nfsd-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds3-7/+24
Pull nfsd updates from Chuck Lever: "Jeff Layton contributed a scalability improvement to NFSD's NFSv4 backchannel session implementation. This improvement is intended to increase the rate at which NFSD can safely recall NFSv4 delegations from clients, to avoid the need to revoke them. Revoking requires a slow state recovery process. A wide variety of bug fixes and other incremental improvements make up the bulk of commits in this series. As always I am grateful to the NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle" * tag 'nfsd-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (72 commits) nfsd: allow for up to 32 callback session slots nfs_common: must not hold RCU while calling nfsd_file_put_local nfsd: get rid of include ../internal.h nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur NFSD: Add nfsd4_copy time-to-live NFSD: Add a laundromat reaper for async copy state NFSD: Block DESTROY_CLIENTID only when there are ongoing async COPY operations NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD NFSD: Free async copy information in nfsd4_cb_offload_release() NFSD: Fix nfsd4_shutdown_copy() NFSD: Add a tracepoint to record canceled async COPY operations nfsd: make nfsd4_session->se_flags a bool nfsd: remove nfsd4_session->se_bchannel nfsd: make use of warning provided by refcount_t nfsd: Don't fail OP_SETCLIENTID when there are too many clients. svcrdma: fix miss destroy percpu_counter in svc_rdma_proc_init() xdrgen: Remove program_stat_to_errno() call sites xdrgen: Update the files included in client-side source code xdrgen: Remove check for "nfs_ok" in C templates xdrgen: Remove tracepoint call site ...
2024-11-19svcrdma: fix miss destroy percpu_counter in svc_rdma_proc_init()Ye Bin1-5/+14
There's issue as follows: RPC: Registered rdma transport module. RPC: Registered rdma backchannel transport module. RPC: Unregistered rdma transport module. RPC: Unregistered rdma backchannel transport module. BUG: unable to handle page fault for address: fffffbfff80c609a PGD 123fee067 P4D 123fee067 PUD 123fea067 PMD 10c624067 PTE 0 Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI RIP: 0010:percpu_counter_destroy_many+0xf7/0x2a0 Call Trace: <TASK> __die+0x1f/0x70 page_fault_oops+0x2cd/0x860 spurious_kernel_fault+0x36/0x450 do_kern_addr_fault+0xca/0x100 exc_page_fault+0x128/0x150 asm_exc_page_fault+0x26/0x30 percpu_counter_destroy_many+0xf7/0x2a0 mmdrop+0x209/0x350 finish_task_switch.isra.0+0x481/0x840 schedule_tail+0xe/0xd0 ret_from_fork+0x23/0x80 ret_from_fork_asm+0x1a/0x30 </TASK> If register_sysctl() return NULL, then svc_rdma_proc_cleanup() will not destroy the percpu counters which init in svc_rdma_proc_init(). If CONFIG_HOTPLUG_CPU is enabled, residual nodes may be in the 'percpu_counters' list. The above issue may occur once the module is removed. If the CONFIG_HOTPLUG_CPU configuration is not enabled, memory leakage occurs. To solve above issue just destroy all percpu counters when register_sysctl() return NULL. Fixes: 1e7e55731628 ("svcrdma: Restore read and write stats") Fixes: 22df5a22462e ("svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter") Fixes: df971cd853c0 ("svcrdma: Convert rdma_stat_recv to a per-CPU counter") Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-19SUNRPC: make sure cache entry active before cache_showYang Erkun1-1/+3
The function `c_show` was called with protection from RCU. This only ensures that `cp` will not be freed. Therefore, the reference count for `cp` can drop to zero, which will trigger a refcount use-after-free warning when `cache_get` is called. To resolve this issue, use `cache_get_rcu` to ensure that `cp` remains active. ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 7 PID: 822 at lib/refcount.c:25 refcount_warn_saturate+0xb1/0x120 CPU: 7 UID: 0 PID: 822 Comm: cat Not tainted 6.12.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 RIP: 0010:refcount_warn_saturate+0xb1/0x120 Call Trace: <TASK> c_show+0x2fc/0x380 [sunrpc] seq_read_iter+0x589/0x770 seq_read+0x1e5/0x270 proc_reg_read+0xe1/0x140 vfs_read+0x125/0x530 ksys_read+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Yang Erkun <yangerkun@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11mm: page_frag: avoid caller accessing 'page_frag_cache' directlyYunsheng Lin1-4/+2
Use appropriate frag_page API instead of caller accessing 'page_frag_cache' directly. CC: Andrew Morton <akpm@linux-foundation.org> CC: Linux-MM <linux-mm@kvack.org> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11svcrdma: Address an integer overflowChuck Lever1-1/+7
Dan Carpenter reports: > Commit 78147ca8b4a9 ("svcrdma: Add a "parsed chunk list" data > structure") from Jun 22, 2020 (linux-next), leads to the following > Smatch static checker warning: > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:498 xdr_check_write_chunk() > warn: potential user controlled sizeof overflow 'segcount * 4 * 4' > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c > 488 static bool xdr_check_write_chunk(struct svc_rdma_recv_ctxt *rctxt) > 489 { > 490 u32 segcount; > 491 __be32 *p; > 492 > 493 if (xdr_stream_decode_u32(&rctxt->rc_stream, &segcount)) > ^^^^^^^^ > > 494 return false; > 495 > 496 /* A bogus segcount causes this buffer overflow check to fail. */ > 497 p = xdr_inline_decode(&rctxt->rc_stream, > --> 498 segcount * rpcrdma_segment_maxsz * sizeof(*p)); > > > segcount is an untrusted u32. On 32bit systems anything >= SIZE_MAX / 16 will > have an integer overflow and some those values will be accepted by > xdr_inline_decode(). Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 78147ca8b4a9 ("svcrdma: Add a "parsed chunk list" data structure") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-08SUNRPC: Fix a hang in TLS sock_close if sk_write_pendingBenjamin Coddington1-0/+1
We've observed an NFS server shrink the TCP window and then reset the TCP connection as part of a HA failover. When the connection has TLS, often the NFS client will hang indefinitely in this stack: wait_woken+0x70/0x80 wait_on_pending_writer+0xe4/0x110 [tls] tls_sk_proto_close+0x368/0x3a0 [tls] inet_release+0x54/0xb0 __sock_release+0x48/0xc8 sock_close+0x20/0x38 __fput+0xe0/0x2f0 __fput_sync+0x58/0x70 xs_reset_transport+0xe8/0x1f8 [sunrpc] xs_tcp_shutdown+0xa4/0x190 [sunrpc] xprt_autoclose+0x68/0x170 [sunrpc] process_one_work+0x180/0x420 worker_thread+0x258/0x368 kthread+0x104/0x118 ret_from_fork+0x10/0x20 This hang prevents the client from closing the socket and reconnecting to the server. Because xs_nospace() elevates sk_write_pending, and sk_sndtimeo is MAX_SCHEDULE_TIMEOUT, tls_sk_proto_close is never able to complete its wait for pending writes to the socket. For this case where we are resetting the transport anyway, we don't expect the socket to ever have write space, so fix this by simply clearing the sock's sndtimeo under the sock's lock. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-04sunrpc: handle -ENOTCONN in xs_tcp_setup_socket()NeilBrown1-0/+1
xs_tcp_finish_connecting() can return -ENOTCONN but the switch statement in xs_tcp_setup_socket() treats that as an unhandled error. If we treat it as a known error it would propagate back to call_connect_status() which does handle that error code. This appears to be the intention of the commit (given below) which added -ENOTCONN as a return status for xs_tcp_finish_connecting(). So add -ENOTCONN to the switch statement as an error to pass through to the caller. Link: https://bugzilla.suse.com/show_bug.cgi?id=1231050 Link: https://access.redhat.com/discussions/3434091 Fixes: 01d37c428ae0 ("SUNRPC: xprt_connect() don't abort the task if the transport isn't bound") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-02Merge tag 'nfsd-6.12-3' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix two async COPY bugs found during NFS bake-a-thon - Fix an svcrdma memory leak * tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: rpcrdma: Always release the rpcrdma_device's xa_array NFSD: Never decrement pending_async_copies on error NFSD: Initialize struct nfsd4_copy earlier
2024-10-30rpcrdma: Always release the rpcrdma_device's xa_arrayChuck Lever1-0/+1
Dai pointed out that the xa_init_flags() in rpcrdma_add_one() needs to have a matching xa_destroy() in rpcrdma_remove_one() to release underlying memory that the xarray might have accrued during operation. Reported-by: Dai Ngo <dai.ngo@oracle.com> Fixes: 7e86845a0346 ("rpcrdma: Implement generic device removal") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-10-12Merge tag 'nfs-for-6.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds1-7/+4
Pull NFS client fixes from Anna Schumaker: "Localio Bugfixes: - remove duplicated include in localio.c - fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put() - fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT - fix nfsd_file tracepoints to handle NULL rqstp pointers Other Bugfixes: - fix program selection loop in svc_process_common - fix integer overflow in decode_rc_list() - prevent NULL-pointer dereference in nfs42_complete_copies() - fix CB_RECALL performance issues when using a large number of delegations" * tag 'nfs-for-6.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: remove revoked delegation from server's delegation list nfsd/localio: fix nfsd_file tracepoints to handle NULL rqstp nfs_common: fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put() NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() SUNRPC: Fix integer overflow in decode_rc_list() sunrpc: fix prog selection loop in svc_process_common nfs: Remove duplicated include in localio.c
2024-10-03move asm/unaligned.h to linux/unaligned.hAl Viro2-2/+2
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-30sunrpc: fix prog selection loop in svc_process_commonNeilBrown1-7/+4
If the rq_prog is not in the list of programs, then we use the last program in the list and we don't get the expected rpc_prog_unavail error as the subsequent tests on 'progp' being NULL are ineffective. We should only assign progp when we find the right program, and we should initialize it to NULL Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 86ab08beb3f0 ("SUNRPC: replace program list with program array") Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-27[tree-wide] finally take no_llseek outAl Viro2-5/+0
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-23SUNRPC: replace program list with program arrayNeilBrown3-31/+42
A service created with svc_create_pooled() can be given a linked list of programs and all of these will be served. Using a linked list makes it cumbersome when there are several programs that can be optionally selected with CONFIG settings. After this patch is applied, API consumers must use only svc_create_pooled() when creating an RPC service that listens for more than one RPC program. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23SUNRPC: add svcauth_map_clnt_to_svc_cred_localWeston Andros Adamson1-0/+28
Add new funtion svcauth_map_clnt_to_svc_cred_local which maps a generic cred to a svc_cred suitable for use in nfsd. This is needed by the localio code to map nfs client creds to nfs server credentials. Following from net/sunrpc/auth_unix.c:unx_marshal() it is clear that ->fsuid and ->fsgid must be used (rather than ->uid and ->gid). In addition, these uid and gid must be translated with from_kuid_munged() so local client uses correct uid and gid when acting as local server. Jeff Layton noted: This is where the magic happens. Since we're working in kuid_t/kgid_t, we don't need to worry about further idmapping. Suggested-by: NeilBrown <neilb@suse.de> # to approximate unx_marshal() Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Co-developed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23SUNRPC: remove call_allocate() BUG_ONsMike Snitzer1-6/+0
Remove BUG_ON if p_arglen=0 to allow RPC with void arg. Remove BUG_ON if p_replen=0 to allow RPC with void return. The former was needed for the first revision of the LOCALIO protocol which had an RPC that took a void arg: /* raw RFC 9562 UUID */ typedef u8 uuid_t<UUID_SIZE>; program NFS_LOCALIO_PROGRAM { version LOCALIO_V1 { void NULL(void) = 0; uuid_t GETUUID(void) = 1; } = 1; } = 400122; The latter is needed for the final revision of the LOCALIO protocol which has a UUID_IS_LOCAL RPC which returns a void: /* raw RFC 9562 UUID */ typedef u8 uuid_t<UUID_SIZE>; program NFS_LOCALIO_PROGRAM { version LOCALIO_V1 { void NULL(void) = 0; void UUID_IS_LOCAL(uuid_t) = 1; } = 1; } = 400122; There is really no value in triggering a BUG_ON in response to either of these previously unsupported conditions. NeilBrown would like the entire 'if (proc->p_proc != 0)' branch removed (not just the one BUG_ON that must be removed for LOCALIO's immediate needs of returning void). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23net/sunrpc: make use of the helper macro LIST_HEAD()Hongbo Li1-7/+3
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Here we can simplify the code. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23SUNRPC: clnt.c: Remove misleading commentSiddh Raman Pant1-5/+0
destroy_wait doesn't store all RPC clients. There was a list named "all_clients" above it, which got moved to struct sunrpc_net in 2012, but the comment was never removed. Fixes: 70abc49b4f4a ("SUNRPC: make SUNPRC clients list per network namespace context") Signed-off-by: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23SUNRPC: Fix -Wformat-truncation warningKunwu Chan1-1/+1
Increase size of the servername array to avoid truncated output warning. net/sunrpc/clnt.c:582:75: error:‘%s’ directive output may be truncated writing up to 107 bytes into a region of size 48 [-Werror=format-truncation=] 582 | snprintf(servername, sizeof(servername), "%s", | ^~ net/sunrpc/clnt.c:582:33: note:‘snprintf’ output between 1 and 108 bytes into a destination of size 48 582 | snprintf(servername, sizeof(servername), "%s", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 583 | sun->sun_path); Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Suggested-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-21sunrpc: xprtrdma: Use ERR_CAST() to returnYan Zhen1-1/+1
Using ERR_CAST() is more reasonable and safer, When it is necessary to convert the type of an error pointer and return it. Signed-off-by: Yan Zhen <yanzhen@vivo.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21svcrdma: Handle device removal outside of the CM event handlerChuck Lever1-1/+15
Synchronously wait for all disconnects to complete to ensure the transports have divested all hardware resources before the underlying RDMA device can safely be removed. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21sunrpc: allow svc threads to fail initialisation cleanlyNeilBrown1-0/+10
If an svc thread needs to perform some initialisation that might fail, it has no good way to handle the failure. Before the thread can exit it must call svc_exit_thread(), but that requires the service mutex to be held. The thread cannot simply take the mutex as that could deadlock if there is a concurrent attempt to shut down all threads (which is unlikely, but not impossible). nfsd currently call svc_exit_thread() unprotected in the unlikely event that unshare_fs_struct() fails. We can clean this up by introducing svc_thread_init_status() by which an svc thread can report whether initialisation has succeeded. If it has, it continues normally into the action loop. If it has not, svc_thread_init_status() immediately aborts the thread. svc_start_kthread() waits for either of these to happen, and calls svc_exit_thread() (under the mutex) if the thread aborted. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21sunrpc: merge svc_rqst_alloc() into svc_prepare_thread()NeilBrown1-18/+7
The only caller of svc_rqst_alloc() is svc_prepare_thread(). So merge the one into the other and simplify. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21sunrpc: don't take ->sv_lock when updating ->sv_nrthreads.NeilBrown1-6/+0
As documented in svc_xprt.c, sv_nrthreads is protected by the service mutex, and it does not need ->sv_lock. (->sv_lock is needed only for sv_permsocks, sv_tempsocks, and sv_tmpcnt). So remove the unnecessary locking. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21sunrpc: change sp_nrthreads from atomic_t to unsigned int.NeilBrown1-20/+11
sp_nrthreads is only ever accessed under the service mutex nlmsvc_mutex nfs_callback_mutex nfsd_mutex so these is no need for it to be an atomic_t. The fact that all code using it is single-threaded means that we can simplify svc_pool_victim and remove the temporary elevation of sp_nrthreads. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-21sunrpc: document locking rules for svc_exit_thread()NeilBrown1-0/+14
The locking required for svc_exit_thread() is not obvious, so document it in a kdoc comment. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-01SUNRPC: make various functions static, or not exported.NeilBrown5-34/+29
Various functions are only used within the sunrpc module, and several are only use in the one file. So clean up: These are marked static, and any EXPORT is removed. svc_rcpb_setup() svc_rqst_alloc() svc_rqst_free() - also moved before first use svc_rpcbind_set_version() svc_drop() - also moved to svc.c These are now not EXPORTed, but are not static. svc_authenticate() svc_sock_update_bufs() Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>