path: root/net/sunrpc
Age | Commit message | Author | Files | Lines
2021-02-23 | Merge tag 'nfsd-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 2 | -21/+20
Pull more nfsd updates from Chuck Lever: "Here are a few additional NFSD commits for the merge window:
Optimization:
- Cork the socket while there are queued replies
Fixes:
- DRC shutdown ordering
- svc_rdma_accept() lockdep splat"
* tag 'nfsd-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Further clean up svc_tcp_sendmsg()
  SUNRPC: Remove redundant socket flags from svc_tcp_sendmsg()
  SUNRPC: Use TCP_CORK to optimise send performance on the server
  svcrdma: Hold private mutex while invoking rdma_accept()
  nfsd: register pernet ops last, unregister first
2021-02-21 | Merge tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 5 | -121/+175
Pull nfsd updates from Chuck Lever: - Update NFSv2 and NFSv3 XDR decoding functions - Further improve support for re-exporting NFS mounts - Convert NFSD stats to per-CPU counters - Add batch Receive posting to the server's RPC/RDMA transport * tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (65 commits) nfsd: skip some unnecessary stats in the v4 case nfs: use change attribute for NFS re-exports NFSv4_2: SSC helper should use its own config. nfsd: cstate->session->se_client -> cstate->clp nfsd: simplify nfsd4_check_open_reclaim nfsd: remove unused set_client argument nfsd: find_cpntf_state cleanup nfsd: refactor set_client nfsd: rename lookup_clientid->set_client nfsd: simplify nfsd_renew nfsd: simplify process_lock nfsd4: simplify process_lookup1 SUNRPC: Correct a comment svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom() svcrdma: Reduce Receive doorbell rate svcrdma: Deprecate stat variables that are no longer used svcrdma: Restore read and write stats svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter svcrdma: Convert rdma_stat_recv to a per-CPU counter svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up() ...
2021-02-16 | SUNRPC: Further clean up svc_tcp_sendmsg() | Chuck Lever | 1 | -8/+7
Clean up: The msghdr is no longer needed in the caller. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-02-16 | SUNRPC: Remove redundant socket flags from svc_tcp_sendmsg() | Trond Myklebust | 1 | -9/+3
Now that the caller controls the TCP_CORK socket option, it is redundant to set MSG_MORE and MSG_SENDPAGE_NOTLAST in the calls to kernel_sendpage(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-02-16 | SUNRPC: Use TCP_CORK to optimise send performance on the server | Trond Myklebust | 1 | -1/+7
Use a counter to keep track of how many requests are queued behind the xprt->xpt_mutex, and keep TCP_CORK set until the queue is empty. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Link: https://lore.kernel.org/linux-nfs/20210213202532.23146-1-trondmy@kernel.org/T/#u Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
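The counting scheme described in this commit can be sketched in user-space C. This is a minimal illustration only, not the kernel code: `struct demo_xprt` and both functions are hypothetical names, and the `corked` flag stands in for the TCP_CORK socket option.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a server transport; not the kernel's svc_xprt. */
struct demo_xprt {
    atomic_int nr_queued;   /* senders waiting for, or holding, the send lock */
    bool corked;            /* models the TCP_CORK socket option */
    int flushes;            /* number of times the cork was released */
};

/* Called before a sender queues up behind the send mutex. */
static void demo_send_begin(struct demo_xprt *x)
{
    atomic_fetch_add(&x->nr_queued, 1);
}

/* Called with the send mutex held, to transmit one reply. */
static void demo_send_reply(struct demo_xprt *x)
{
    x->corked = true;       /* cork before writing the reply */

    /* ... the reply would be written to the socket here ... */

    /* atomic_fetch_sub() returns the previous value: 1 means nobody else is queued. */
    if (atomic_fetch_sub(&x->nr_queued, 1) == 1) {
        x->corked = false;  /* queue is empty: uncork so the data is pushed out */
        x->flushes++;
    }
}
```

With two queued senders, the first reply leaves the socket corked and only the second one releases it, so both replies can share segments.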
2021-02-15 | svcrdma: Hold private mutex while invoking rdma_accept() | Chuck Lever | 1 | -3/+3
RDMA core mutex locking was restructured by commit d114c6feedfe ("RDMA/cma: Add missing locking to rdma_accept()") [Aug 2020]. When lock debugging is enabled, the RPC/RDMA server trips over the new lockdep assertion in rdma_accept() because it doesn't call rdma_accept() from its CM event handler. As a temporary fix, have svc_rdma_accept() take the handler_mutex explicitly. In the meantime, let's consider how to restructure the RPC/RDMA transport to invoke rdma_accept() from the proper context. Calls to svc_rdma_accept() are serialized with calls to svc_rdma_free() by the generic RPC server layer. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/linux-rdma/20210209154014.GO4247@nvidia.com/ Fixes: d114c6feedfe ("RDMA/cma: Add missing locking to rdma_accept()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-02-01 | SUNRPC: Fix NFS READs that start at non-page-aligned offsets | Chuck Lever | 1 | -3/+4
Anj Duvnjak reports that the Kodi.tv NFS client is not able to read video files from a v5.10.11 Linux NFS server. The new sendpage-based TCP sendto logic was not attentive to non- zero page_base values. nfsd_splice_read() sets that field when a READ payload starts in the middle of a page. The Linux NFS client rarely emits an NFS READ that is not page- aligned. All of my testing so far has been with Linux clients, so I missed this one. Reported-by: A. Duvnjak <avian@extremenerds.net> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211471 Fixes: 4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: A. Duvnjak <avian@extremenerds.net>
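To illustrate why a non-zero page_base matters, the sketch below splits a payload that starts in the middle of its first page into per-page pieces. It is a user-space illustration under assumed names (`demo_piece`, `demo_split_pages`, a 4096-byte page), not the kernel's send path.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u    /* assumed page size for the illustration */

struct demo_piece {
    size_t offset;  /* byte offset within the page */
    size_t len;     /* number of bytes to send from that page */
};

/*
 * Split page_len bytes of payload, starting page_base bytes into the first
 * page, into per-page pieces. Returns the number of pieces written to out.
 */
static size_t demo_split_pages(size_t page_base, size_t page_len,
                               struct demo_piece *out, size_t max)
{
    size_t n = 0;
    size_t offset = page_base % DEMO_PAGE_SIZE;

    while (page_len > 0 && n < max) {
        size_t len = DEMO_PAGE_SIZE - offset;

        if (len > page_len)
            len = page_len;
        out[n].offset = offset;
        out[n].len = len;
        n++;
        page_len -= len;
        offset = 0;     /* only the first page starts mid-page */
    }
    return n;
}
```

Ignoring the initial offset would send bytes from the start of the first page instead of from where the READ payload actually begins.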
2021-01-25 | SUNRPC: Handle 0 length opaque XDR object data properly | Dave Wysochanski | 1 | -3/+6
When handling an auth_gss downcall, it's possible to get 0-length opaque object for the acceptor. In the case of a 0-length XDR object, make sure simple_get_netobj() fills in dest->data = NULL, and does not continue to kmemdup() which will set dest->data = ZERO_SIZE_PTR for the acceptor. The trace event code can handle NULL but not ZERO_SIZE_PTR for a string, and so without this patch the rpcgss_context trace event will crash the kernel as follows:
[ 162.887992] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 162.898693] #PF: supervisor read access in kernel mode
[ 162.900830] #PF: error_code(0x0000) - not-present page
[ 162.902940] PGD 0 P4D 0
[ 162.904027] Oops: 0000 [#1] SMP PTI
[ 162.905493] CPU: 4 PID: 4321 Comm: rpc.gssd Kdump: loaded Not tainted 5.10.0 #133
[ 162.908548] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 162.910978] RIP: 0010:strlen+0x0/0x20
[ 162.912505] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[ 162.920101] RSP: 0018:ffffaec900c77d90 EFLAGS: 00010202
[ 162.922263] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000fffde697
[ 162.925158] RDX: 000000000000002f RSI: 0000000000000080 RDI: 0000000000000010
[ 162.928073] RBP: 0000000000000010 R08: 0000000000000e10 R09: 0000000000000000
[ 162.930976] R10: ffff8e698a590cb8 R11: 0000000000000001 R12: 0000000000000e10
[ 162.933883] R13: 00000000fffde697 R14: 000000010034d517 R15: 0000000000070028
[ 162.936777] FS: 00007f1e1eb93700(0000) GS:ffff8e6ab7d00000(0000) knlGS:0000000000000000
[ 162.940067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 162.942417] CR2: 0000000000000010 CR3: 0000000104eba000 CR4: 00000000000406e0
[ 162.945300] Call Trace:
[ 162.946428] trace_event_raw_event_rpcgss_context+0x84/0x140 [auth_rpcgss]
[ 162.949308] ? __kmalloc_track_caller+0x35/0x5a0
[ 162.951224] ? gss_pipe_downcall+0x3a3/0x6a0 [auth_rpcgss]
[ 162.953484] gss_pipe_downcall+0x585/0x6a0 [auth_rpcgss]
[ 162.955953] rpc_pipe_write+0x58/0x70 [sunrpc]
[ 162.957849] vfs_write+0xcb/0x2c0
[ 162.959264] ksys_write+0x68/0xe0
[ 162.960706] do_syscall_64+0x33/0x40
[ 162.962238] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 162.964346] RIP: 0033:0x7f1e1f1e57df
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
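The zero-length rule described above can be sketched as a user-space parser. This is an assumption-laden illustration rather than the kernel's simple_get_netobj(): `demo_netobj` and `demo_get_netobj` are hypothetical names, malloc() replaces kmemdup(), and the length prefix is read in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space analogue of a length-plus-data opaque object. */
struct demo_netobj {
    uint32_t len;
    unsigned char *data;
};

/*
 * Read a length-prefixed opaque object from [p, end).
 * Returns the position just past the object, or NULL on error.
 */
static const unsigned char *demo_get_netobj(const unsigned char *p,
                                            const unsigned char *end,
                                            struct demo_netobj *dest)
{
    uint32_t len;

    if ((size_t)(end - p) < sizeof(len))
        return NULL;
    memcpy(&len, p, sizeof(len));
    p += sizeof(len);
    if ((size_t)(end - p) < len)
        return NULL;

    dest->len = len;
    if (len == 0) {
        /* Zero-length object: store NULL rather than a zero-size allocation. */
        dest->data = NULL;
        return p;
    }

    dest->data = malloc(len);
    if (dest->data == NULL)
        return NULL;
    memcpy(dest->data, p, len);
    return p + len;
}
```

Consumers that test the pointer against NULL then behave the same for an absent object and an empty one.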
2021-01-25 | SUNRPC: Move simple_get_bytes and simple_get_netobj into private header | Dave Wysochanski | 3 | -58/+45
Remove duplicated helper functions to parse opaque XDR objects and place inside new file net/sunrpc/auth_gss/auth_gss_internal.h. In the new file carry the license and copyright from the source file net/sunrpc/auth_gss/auth_gss.c. Finally, update the comment inside include/linux/sunrpc/xdr.h since lockd is not the only user of struct xdr_netobj. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-01-25 | SUNRPC: Correct a comment | Chuck Lever | 1 | -1/+1
Clean up: The rq_argpages field was removed from struct svc_rqst in the pre-git era. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-25 | svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom() | Chuck Lever | 1 | -3/+3
The Receive completion handler doesn't look at the contents of the Receive buffer. The DMA sync isn't terribly expensive but it's one less thing that needs to be done by the Receive completion handler, which is single-threaded (per svc_xprt). This helps scalability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-01-25 | svcrdma: Reduce Receive doorbell rate | Chuck Lever | 1 | -39/+43
This is similar to commit e340c2d6ef2a ("xprtrdma: Reduce the doorbell rate (Receive)") which added Receive batching to the client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-25 | svcrdma: Deprecate stat variables that are no longer used | Chuck Lever | 1 | -57/+27
Clean up. We are not permitted to remove old proc files. Instead, convert these variables to stubs that are only ever allowed to display a value of zero. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-25 | svcrdma: Restore read and write stats | Chuck Lever | 2 | -8/+20
Now that we have an efficient mechanism to update these two stats, let's start maintaining them again. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-25 | svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter | Chuck Lever | 3 | -5/+11
Avoid the overhead of a memory bus lock cycle for counting a value that is hardly ever used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-25 | svcrdma: Convert rdma_stat_recv to a per-CPU counter | Chuck Lever | 2 | -6/+52
Receives are frequent events. Avoid the overhead of a memory bus lock cycle for counting a value that is hardly ever used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
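The idea behind a per-CPU counter can be sketched in plain C: each CPU increments its own slot without a locked bus cycle, and the rarely needed total is computed by summing the slots on read. The names, the fixed CPU count, and the 64-byte cache line below are assumptions for illustration; the kernel has its own per-CPU counter facilities.

```c
#include <stdint.h>

#define DEMO_NR_CPUS    4       /* hypothetical fixed CPU count */
#define DEMO_CACHE_LINE 64      /* assumed cache line size in bytes */

struct demo_percpu_counter {
    struct {
        uint64_t count;
        /* Padding keeps each slot on its own cache line to avoid false sharing. */
        char pad[DEMO_CACHE_LINE - sizeof(uint64_t)];
    } slot[DEMO_NR_CPUS];
};

/* Fast path: a plain increment of the slot owned by the calling CPU. */
static void demo_counter_inc(struct demo_percpu_counter *c, unsigned int cpu)
{
    c->slot[cpu % DEMO_NR_CPUS].count++;
}

/* Slow path: walk every slot when the value is actually displayed. */
static uint64_t demo_counter_sum(const struct demo_percpu_counter *c)
{
    uint64_t sum = 0;

    for (unsigned int i = 0; i < DEMO_NR_CPUS; i++)
        sum += c->slot[i].count;
    return sum;
}
```

The trade-off is that reads cost a loop over all slots, which is acceptable for a statistic that is updated often and displayed rarely.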
2021-01-25 | svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up() | Chuck Lever | 1 | -7/+23
Setting up the proc variables is about to get more complicated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-01-20 | Merge tag 'nfsd-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 1 | -2/+2
Pull nfsd fixes from Chuck Lever:
- Avoid exposing parent of root directory in NFSv3 READDIRPLUS results
- Fix a tracepoint change that went in the initial 5.11 merge
* tag 'nfsd-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Move the svc_xdr_recvfrom tracepoint again
  nfsd4: readdirplus shouldn't return parent of export
2021-01-13 | SUNRPC: Move the svc_xdr_recvfrom tracepoint again | Chuck Lever | 1 | -2/+2
Commit 156708adf2d9 ("SUNRPC: Move the svc_xdr_recvfrom() tracepoint") tried to capture the correct XID in the trace record, but this line in svc_recv: rqstp->rq_xid = svc_getu32(&rqstp->rq_arg.head[0]); alters the size of rq_arg.head[0].iov_len. The tracepoint records the correct XID but an incorrect value for the length of the xdr_buf's head. To keep the trace callsites simple, I've created two trace classes. One assumes the xdr_buf contains a full RPC message, and the XID can be extracted from it. The other assumes the contents of the xdr_buf are arbitrary, and the xid will be provided by the caller. Currently there is only one user of each class, but I expect we will need a few more tracepoints using each class as time goes on. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
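The side effect described here, where extracting the XID shrinks the head's iov_len, can be reproduced in a small user-space sketch. The types and functions below are hypothetical stand-ins, and sampling the length before consuming the word is shown only to illustrate the ordering problem; it is not the approach the commit takes (the commit adds two trace classes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kvec describing the head of an RPC message. */
struct demo_kvec {
    const unsigned char *iov_base;
    size_t iov_len;
};

/* Consume one big-endian 32-bit word; this advances the base and shrinks the length. */
static uint32_t demo_getu32(struct demo_kvec *iov)
{
    const unsigned char *p = iov->iov_base;
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8) | (uint32_t)p[3];

    iov->iov_base = p + 4;
    iov->iov_len -= 4;
    return v;
}

struct demo_trace {
    uint32_t xid;
    size_t head_len;
};

/* Record the head length before the XID is consumed, so both values are right. */
static struct demo_trace demo_record(struct demo_kvec *head)
{
    struct demo_trace t;

    t.head_len = head->iov_len;
    t.xid = demo_getu32(head);
    return t;
}
```

Recording the length after the call to the extraction helper would report a head that is four bytes shorter than what arrived.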
2021-01-12 | Merge tag 'nfs-for-5.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds | 1 | -1/+1
Pull NFS client fixes from Trond Myklebust: "Highlights include: - Fix parsing of link-local IPv6 addresses - Fix confusing logging of mount errors that was introduced by the fsopen() patchset. - Fix a tracing use after free in _nfs4_do_setlk() - Layout return-on-close fixes when called from nfs4_evict_inode() - Layout segments were being leaked in pnfs_generic_clear_request_commit() - Don't leak DS commits in pnfs_generic_retry_commit() - Fix an Oopsable use-after-free when nfs_delegation_find_inode_server() calls iput() on an inode after the super block has gone away" * tag 'nfs-for-5.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: nfs_igrab_and_active must first reference the superblock NFS: nfs_delegation_find_inode_server must first reference the superblock NFS/pNFS: Fix a leak of the layout 'plh_outstanding' counter NFS/pNFS: Don't leak DS commits in pnfs_generic_retry_commit() NFS/pNFS: Don't call pnfs_free_bucket_lseg() before removing the request pNFS: Stricter ordering of layoutget and layoutreturn pNFS: Clean up pnfs_layoutreturn_free_lsegs() pNFS: We want return-on-close to complete when evicting the inode pNFS: Mark layout for return if return-on-close was not sent net: sunrpc: interpret the return value of kstrtou32 correctly NFS: Adjust fs_context error logging NFS4: Fix use-after-free in trace_event_raw_event_nfs4_set_lock
2021-01-11 | Merge tag 'nfsd-5.11-1' of git://git.linux-nfs.org/projects/cel/cel-2.6 | Linus Torvalds | 1 | -1/+85
Pull nfsd fixes from Chuck Lever: - Fix major TCP performance regression - Get NFSv4.2 READ_PLUS regression tests to pass - Improve NFSv4 COMPOUND memory allocation - Fix sparse warning * tag 'nfsd-5.11-1' of git://git.linux-nfs.org/projects/cel/cel-2.6: NFSD: Restore NFSv4 decoding's SAVEMEM functionality SUNRPC: Handle TCP socket sends with kernel_sendpage() again NFSD: Fix sparse warning in nfssvc.c nfsd: Don't set eof on a truncated READ_PLUS nfsd: Fixes for nfsd4_encode_read_plus_data()
2021-01-10 | net: sunrpc: interpret the return value of kstrtou32 correctly | j.nixdorf@avm.de | 1 | -1/+1
A return value of 0 means success. This is documented in lib/kstrtox.c. This was found by trying to mount an NFS share from a link-local IPv6 address with the interface specified by its index: mount("[fe80::1%1]:/srv/nfs", "/mnt", "nfs", 0, "nolock,addr=fe80::1%1") Before this commit this failed with EINVAL and also caused the following message in dmesg: [...] NFS: bad IP address specified: addr=fe80::1%1 The syscall using the same address based on the interface name instead of its index succeeds. Credits for this patch go to my colleague Christian Speich, who traced the origin of this bug to this line of code. Signed-off-by: Johannes Nixdorf <j.nixdorf@avm.de> Fixes: 00cfaa943ec3 ("replace strict_strto calls") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
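The return convention at issue (0 on success, a negative errno on failure) can be sketched in user space. `demo_strtou32()` is a hypothetical stand-in built on strtoull() that mimics that convention; `demo_parse_scope_id()` is likewise an invented name showing the corrected check.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in following the kstrtou32()-style convention: 0 on success, -errno on failure. */
static int demo_strtou32(const char *s, unsigned int base, uint32_t *res)
{
    char *end;
    unsigned long long v;

    if (*s == '\0' || *s == '-')
        return -EINVAL;
    errno = 0;
    v = strtoull(s, &end, (int)base);
    if (end == s || *end != '\0')
        return -EINVAL;
    if (errno == ERANGE || v > UINT32_MAX)
        return -ERANGE;
    *res = (uint32_t)v;
    return 0;
}

/*
 * Parse a numeric IPv6 scope id such as the "1" in "fe80::1%1".
 * Returns 1 if a scope id was parsed, 0 otherwise.
 */
static int demo_parse_scope_id(const char *s, uint32_t *scope_id)
{
    /* Zero means success; treating a non-zero return as success inverts the check. */
    if (demo_strtou32(s, 10, scope_id) == 0)
        return 1;
    return 0;
}
```

With the inverted check, a valid index such as "1" would be rejected while garbage input would be accepted, which matches the EINVAL seen in the report.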
2020-12-18 | SUNRPC: Handle TCP socket sends with kernel_sendpage() again | Chuck Lever | 1 | -1/+85
Daire Byrne reports a ~50% aggregate throughput regression on his Linux NFS server after commit da1661b93bf4 ("SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends"), which replaced kernel_sendpage() calls in NFSD's socket send path with calls to sock_sendmsg() using iov_iter. Investigation showed that tcp_sendmsg() was not using zero-copy to send the xdr_buf's bvec pages, but instead was relying on memcpy. This means copying every byte of a large NFS READ payload. It looks like TLS sockets do indeed support a ->sendpage method, so it's really not necessary to use xprt_sock_sendmsg() to support TLS fully on the server. A mechanical reversion of da1661b93bf4 is not possible at this point, but we can re-implement the server's TCP socket sendmsg path using kernel_sendpage(). Reported-by: Daire Byrne <daire@dneg.com> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209439 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-17 | Merge tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds | 13 | -453/+769
Pull NFS client updates from Trond Myklebust: "Highlights include:
Features:
- NFSv3: Add emulation of lookupp() to improve open_by_filehandle() support
- A series of patches to improve readdir performance, particularly with large directories
- Basic support for using NFS/RDMA with the pNFS files and flexfiles drivers
- Micro-optimisations for RDMA
- RDMA tracing improvements
Bugfixes:
- Fix a long standing bug with xs_read_xdr_buf() when receiving partial pages (Dan Aloni)
- Various fixes for getxattr and listxattr, when used over non-TCP transports
- Fixes for containerised NFS from Sargun Dhillon
- switch nfsiod to be an UNBOUND workqueue (Neil Brown)
- READDIR should not ask for security label information if there is no LSM policy (Olga Kornievskaia)
- Avoid using interval-based rebinding with TCP in lockd (Calum Mackay)
- A series of RPC and NFS layer fixes to support the NFSv4.2 READ_PLUS code
- A couple of fixes for pnfs/flexfiles read failover
Cleanups:
- Various cleanups for the SUNRPC xdr code in conjunction with the READ_PLUS fixes"
* tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits)
  NFS/pNFS: Fix a typo in ff_layout_resend_pnfs_read()
  pNFS/flexfiles: Avoid spurious layout returns in ff_layout_choose_ds_for_read
  NFSv4/pnfs: Add tracing for the deviceid cache
  fs/lockd: convert comma to semicolon
  NFSv4.2: fix error return on memory allocation failure
  NFSv4.2/pnfs: Don't use READ_PLUS with pNFS yet
  NFSv4.2: Deal with potential READ_PLUS data extent buffer overflow
  NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow
  NFSv4.2: Handle hole lengths that exceed the READ_PLUS read buffer
  NFSv4.2: decode_read_plus_hole() needs to check the extent offset
  NFSv4.2: decode_read_plus_data() must skip padding after data segment
  NFSv4.2: Ensure we always reset the result->count in decode_read_plus()
  SUNRPC: When expanding the buffer, we may need grow the sparse pages
  SUNRPC: Cleanup - constify a number of xdr_buf helpers
  SUNRPC: Clean up open coded setting of the xdr_stream 'nwords' field
  SUNRPC: _copy_to/from_pages() now check for zero length
  SUNRPC: Cleanup xdr_shrink_bufhead()
  SUNRPC: Fix xdr_expand_hole()
  SUNRPC: Fixes for xdr_align_data()
  SUNRPC: _shift_data_left/right_pages should check the shift length
  ...
2020-12-16 | Merge tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6 | Linus Torvalds | 14 | -650/+1315
Pull nfsd updates from Chuck Lever: "Several substantial changes this time around: - Previously, exporting an NFS mount via NFSD was considered to be an unsupported feature. With v5.11, the community has attempted to make re-exporting a first-class feature of NFSD. This would enable the Linux in-kernel NFS server to be used as an intermediate cache for a remotely-located primary NFS server, for example, even with other NFS server implementations, like a NetApp filer, as the primary. - A short series of patches brings support for multiple RPC/RDMA data chunks per RPC transaction to the Linux NFS server's RPC/RDMA transport implementation. This is a part of the RPC/RDMA spec that the other premiere NFS/RDMA implementation (Solaris) has had for a very long time, and completes the implementation of RPC/RDMA version 1 in the Linux kernel's NFS server. - Long ago, NFSv4 support was introduced to NFSD using a series of C macros that hid dprintk's and goto's. Over time, the kernel's XDR implementation has been greatly improved, but these C macros have remained and become fallow. A series of patches in this pull request completely replaces those macros with the use of current kernel XDR infrastructure. Benefits include: - More robust input sanitization in NFSD's NFSv4 XDR decoders. - Make it easier to use common kernel library functions that use XDR stream APIs (for example, GSS-API). - Align the structure of the source code with the RFCs so it is easier to learn, verify, and maintain our XDR implementation. - Removal of more than a hundred hidden dprintk() call sites. - Removal of some explicit manipulation of pages to help make the eventual transition to xdr->bvec smoother. - On top of several related fixes in 5.10-rc, there are a few more fixes to get the Linux NFSD implementation of NFSv4.2 inter-server copy up to speed. And as usual, there is a pinch of seasoning in the form of a collection of unrelated minor bug fixes and clean-ups. 
Many thanks to all who contributed this time around!" * tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6: (131 commits) nfsd: Record NFSv4 pre/post-op attributes as non-atomic nfsd: Set PF_LOCAL_THROTTLE on local filesystems only nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE exportfs: Add a function to return the raw output from fh_to_dentry() nfsd: close cached files prior to a REMOVE or RENAME that would replace target nfsd: allow filesystems to opt out of subtree checking nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations Revert "nfsd4: support change_attr_type attribute" nfsd4: don't query change attribute in v2/v3 case nfsd: minor nfsd4_change_attribute cleanup nfsd: simplify nfsd4_change_info nfsd: only call inode_query_iversion in the I_VERSION case nfs_common: need lock during iterate through the list NFSD: Fix 5 seconds delay when doing inter server copy NFSD: Fix sparse warning in nfs4proc.c SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall sunrpc: clean-up cache downcall nfsd: Fix message level for normal termination NFSD: Remove macros that are no longer used NFSD: Replace READ* macros in nfsd4_decode_compound() ...
2020-12-16 | Merge tag 'nfs-rdma-for-5.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs into linux-next | Trond Myklebust | 6 | -75/+90
NFSoRDMA Client updates for Linux 5.11
Cleanups and improvements:
- Remove use of raw kernel memory addresses in tracepoints
- Replace dprintk() call sites in ERR_CHUNK path
- Trace unmap sync calls
- Optimize MR DMA-unmapping
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: When expanding the buffer, we may need grow the sparse pages | Trond Myklebust | 1 | -2/+33
If we're shifting the page data to the right, and this happens to be a sparse page array, then we may need to allocate new pages in order to receive the data. Reported-by: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: Cleanup - constify a number of xdr_buf helpers | Trond Myklebust | 1 | -29/+24
There are a number of xdr helpers for struct xdr_buf that do not change the structure itself. Mark those as taking const pointers for documentation purposes. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: Clean up open coded setting of the xdr_stream 'nwords' field | Trond Myklebust | 1 | -13/+16
Move the setting of the xdr_stream 'nwords' field into the helpers that reset the xdr_stream cursor. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: _copy_to/from_pages() now check for zero length | Trond Myklebust | 1 | -4/+2
Clean up callers of _copy_to/from_pages() that still check for a zero length. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: Cleanup xdr_shrink_bufhead() | Trond Myklebust | 1 | -77/+87
Clean up xdr_shrink_bufhead() to use the new helpers instead of doing its own thing. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: Fix xdr_expand_hole() | Trond Myklebust | 1 | -95/+179
We do want to try to grow the buffer if possible, but if that attempt fails, we still want to move the data and truncate the XDR message. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: Fixes for xdr_align_data() | Trond Myklebust | 1 | -42/+132
The main use case right now for xdr_align_data() is to shift the page data to the left, and in practice shrink the total XDR data buffer. This patch ensures that we fix up the accounting for the buffer length as we shift that data around. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | SUNRPC: _shift_data_left/right_pages should check the shift length | Trond Myklebust | 1 | -0/+12
Exit early if the shift is zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | xprtrdma: Fix XDRBUF_SPARSE_PAGES support | Chuck Lever | 1 | -9/+31
Olga K. observed that rpcrdma_marshal_req() allocates sparse pages only when it has determined that a Reply chunk is necessary. There are plenty of cases where no Reply chunk is needed, but the XDRBUF_SPARSE_PAGES flag is set. The result would be a crash in rpcrdma_inline_fixup() when it tries to copy parts of the received Reply into a missing page. To avoid crashing, handle sparse page allocation up front. Until XATTR support was added, this issue did not appear often because the only SPARSE_PAGES consumer always expected a reply large enough to always require a Reply chunk. Reported-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-14 | sunrpc: fix xs_read_xdr_buf for partial pages receive | Dan Aloni | 1 | -1/+2
When receiving pages data, the return value 'ret', when positive, includes `buf->page_base`, so we should subtract that before it is used for changing `offset` and comparing against `want`. This was discovered in the very rare cases where the server returned a chunk of bytes that, when added to the amount of bytes already received for the pages, happened to match the current `recv.len`, for example in this case:
buf->page_base : 258356
actually received from socket: 1740
ret : 260096
want : 260096
In this case neither of the two 'if ... goto out' checks triggers, and we continue to tail parsing. It is worth mentioning that the ensuing EMSGSIZE from the continued execution of `xs_read_xdr_buf` may be observed by an application due to 4 superfluous bytes being added to the pages data. Fixes: 277e4ab7d530 ("SUNRPC: Simplify TCP receive code by switching to using iterators") Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
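The arithmetic in this example can be checked with a small sketch. The helper names are hypothetical and the comparison is a simplification of the receive logic; only the subtraction of page_base comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* 'ret' is counted from the start of the page array, so it includes page_base. */
static size_t demo_bytes_received(size_t ret, size_t page_base)
{
    return ret - page_base;
}

/* Compare only the newly received bytes against the amount still wanted. */
static bool demo_read_complete(size_t ret, size_t page_base, size_t want)
{
    return demo_bytes_received(ret, page_base) == want;
}
```

With the values quoted above, the raw `ret` of 260096 equals `want` by coincidence, while the corrected count of 1740 bytes does not, so the read is correctly treated as incomplete.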
2020-12-09 | SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall | Chuck Lever | 2 | -6/+10
There's no need to defer allocation of pages for the receive buffer. - This upcall is quite infrequent - gssp_alloc_receive_pages() can allocate the pages with GFP_KERNEL, unlike the transport - gssp_alloc_receive_pages() knows exactly how many pages are needed Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
2020-12-09 | sunrpc: clean-up cache downcall | Roberto Bergantinos Corpas | 1 | -30/+11
We can simplify code around cache_downcall unifying memory allocations using kvmalloc. This has the benefit of getting rid of cache_slow_downcall (and queue_io_mutex), and also matches userland allocation size and limits. Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-12-02 | net: sunrpc: Fix 'snprintf' return value check in 'do_xprt_debugfs' | Fedor Tokarev | 1 | -2/+2
'snprintf' returns the number of characters which would have been written if enough space had been available, excluding the terminating null byte. Thus, the return value of 'sizeof(buf)' means that the last character has been dropped. Signed-off-by: Fedor Tokarev <ftokarev@gmail.com> Fixes: 2f34b8bfae19 ("SUNRPC: add links for all client xprts to debugfs") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
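The truncation rule stated above can be shown with a short example. The function name and the "xprt-%u" format are hypothetical; the `>=` comparison is the point being illustrated.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 0 if the formatted name fit into buf, -1 if it was truncated or failed. */
static int demo_format_name(char *buf, size_t size, unsigned int id)
{
    int len = snprintf(buf, size, "xprt-%u", id);

    /*
     * snprintf() reports the length that would have been written, excluding
     * the terminating null byte, so len == size already means truncation.
     */
    if (len < 0 || (size_t)len >= size)
        return -1;
    return 0;
}
```

A check written as `len > size` would miss the case where exactly one character was dropped to make room for the terminator.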
2020-12-02 | SUNRPC: Fix open coded xdr_stream_remaining() | Trond Myklebust | 1 | -2/+2
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Fix up xdr_set_page() | Trond Myklebust | 1 | -9/+12
While we always want to align to the next page and/or the beginning of the tail buffer when we call xdr_set_next_page(), the functions xdr_align_data() and xdr_expand_hole() really want to align to the next object in that next page or tail. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Clean up the handling of page padding in rpc_prepare_reply_pages() | Trond Myklebust | 2 | -7/+1
rpc_prepare_reply_pages() currently expects the 'hdrsize' argument to contain the length of the data that we expect to want placed in the head kvec plus a count of 1 word of padding that is placed after the page data. This is very confusing when trying to read the code, and sometimes leads to callers adding an arbitrary value of '1' just in order to satisfy the requirement (whether or not the page data actually needs such padding). This patch aims to clarify the code by changing the 'hdrsize' argument to remove that 1 word of padding. This means we need to subtract the padding from all the existing callers. Fixes: 02ef04e432ba ("NFS: Account for XDR pad of buf->pages") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Fix up xdr_read_pages() to take arbitrary object lengths | Trond Myklebust | 1 | -26/+15
Fix up xdr_read_pages() so that it can handle object lengths that are larger than the page length, by simply aligning to the next object in the buffer tail. The function will continue to return the length of the truncated object data that actually fit into the pages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Clean up helpers xdr_set_iov() and xdr_set_page_base() | Trond Myklebust | 1 | -17/+19
Allow xdr_set_iov() to set a base so that we can use it to set the cursor to a specific position in the kvec buffer. If the new base overflows the kvec/pages buffer in either xdr_set_iov() or xdr_set_page_base(), then truncate it so that we point to the end of the buffer. Finally, change both functions to return the number of bytes remaining to read in their buffers. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
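The clamping and return-value behaviour described here can be sketched on a plain buffer length. `demo_set_base()` is a hypothetical name and operates on byte counts only; it does not model the kvec or page structures.

```c
#include <stddef.h>

/*
 * Position a cursor at 'base' within a buffer of buf_len bytes.
 * A base past the end is truncated to the end of the buffer.
 * Returns the number of bytes remaining to read from the new position.
 */
static size_t demo_set_base(size_t buf_len, size_t base, size_t *pos)
{
    if (base > buf_len)
        base = buf_len;
    *pos = base;
    return buf_len - base;
}
```

Returning the remaining byte count lets a caller detect an exhausted buffer from the return value alone, without re-deriving it from the cursor.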
2020-12-02 | SUNRPC: Fix up typo in xdr_init_decode() | Trond Myklebust | 1 | -1/+1
We already know that the head buffer and page are empty, so if there is any data, it is in the tail. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Fix up open coded kmemdup_nul() | Trond Myklebust | 1 | -3/+1
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Remove unused function xprt_load_transport() | Trond Myklebust | 1 | -15/+0
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Add a helper to return the transport identifier given a netid | Trond Myklebust | 1 | -4/+21
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: Close a race with transport setup and module put | Trond Myklebust | 1 | -11/+33
After we've looked up the transport module, we need to ensure it can't go away until we've finished running the transport setup code. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-12-02 | SUNRPC: xprt_load_transport() needs to support the netid "rdma6" | Trond Myklebust | 4 | -16/+55
According to RFC5666, the correct netid for an IPv6 addressed RDMA transport is "rdma6", which we've supported as a mount option since Linux 4.7. The problem is that when we try to load the module "xprtrdma6", the load fails, since there is no module alias of that name. Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
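One way to picture the problem is a lookup table in which several netids resolve to the same transport module. The sketch below is purely illustrative: the table, the module names, and `demo_module_for_netid()` are hypothetical and do not reflect how the kernel registers transports or module aliases.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_NETIDS 4

/* Hypothetical transport class: one module serving several netids. */
struct demo_xprt_class {
    const char *module;
    const char *netids[DEMO_MAX_NETIDS];    /* NULL-terminated when shorter */
};

static const struct demo_xprt_class demo_classes[] = {
    { "demo_rdma_module", { "rdma", "rdma6", NULL } },
    { "demo_tcp_module",  { "tcp", "tcp6", NULL } },
};

/* Return the module that serves the given netid, or NULL if none is known. */
static const char *demo_module_for_netid(const char *netid)
{
    for (size_t i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++) {
        const struct demo_xprt_class *c = &demo_classes[i];

        for (size_t j = 0; j < DEMO_MAX_NETIDS && c->netids[j] != NULL; j++) {
            if (strcmp(c->netids[j], netid) == 0)
                return c->module;
        }
    }
    return NULL;
}
```

Resolving the netid through a table like this avoids deriving a module name from the netid string itself, which is what breaks for "rdma6".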