summaryrefslogtreecommitdiff
path: root/fs/nfsd/filecache.h
AgeCommit message (Collapse)AuthorFilesLines
2022-08-17nfsd: eliminate the NFSD_FILE_BREAK_* flagsJeff Layton1-3/+1
commit 23ba98de6dcec665e15c0ca19244379bb0d30932 upstream. We had a report from the spring Bake-a-thon of data corruption in some nfstest_interop tests. Looking at the traces showed the NFS server allowing a v3 WRITE to proceed while a read delegation was still outstanding. Currently, we only set NFSD_FILE_BREAK_* flags if NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc. NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for COMMIT ops, where we need a writeable filehandle but don't need to break read leases. It doesn't make any sense to consult that flag when allocating a file since the file may be used on subsequent calls where we do want to break the lease (and the usage of it here seems to be reverse from what it should be anyway). Also, after calling nfsd_open_break_lease, we don't want to clear the BREAK_* bits. A lease could end up being set on it later (more than once) and we need to be able to break those leases as well. This means that the NFSD_FILE_BREAK_* flags now just mirror NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just drop those flags and unconditionally call nfsd_open_break_lease every time. Reported-by: Olga Kornieskaia <kolga@netapp.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360 Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd) Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-23NFSD: Instantiate a struct file when creating a regular NFSv4 fileChuck Lever1-0/+2
There have been reports of races that cause NFSv4 OPEN(CREATE) to return an error even though the requested file was created. NFSv4 does not provide a status code for this case. To mitigate some of these problems, reorganize the NFSv4 OPEN(CREATE) logic to allocate resources before the file is actually created, and open the new file while the parent directory is still locked. Two new APIs are added: + Add an API that works like nfsd_file_acquire() but does not open the underlying file. The OPEN(CREATE) path can use this API when it already has an open file. + Add an API that is kin to dentry_open(). NFSD needs to create a file and grab an open "struct file *" atomically. The alloc_empty_file() has to be done before the inode create. If it fails (for example, because the NFS server has exceeded its max_files limit), we avoid creating the file and can still return an error to the NFS client. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: JianHong Yin <jiyin@redhat.com>
2022-01-08nfsd: Replace use of rwsem with errseq_tTrond Myklebust1-1/+0
The nfsd_file nf_rwsem is currently being used to separate file write and commit instances to ensure that we catch errors and apply them to the correct write/commit. We can improve scalability at the expense of a little accuracy (some extra false positives) by replacing the nf_rwsem with more careful use of the errseq_t mechanism to track errors across the different operations. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [ cel: rebased on zero-verifier fix ]
2020-02-06nfsd: convert file cache to use over/underflow safe refcountTrond Myklebust1-2/+2
Use the 'refcount_t' type instead of 'atomic_t' for improved refcounting safety. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-23nfsd: Fix stable writesTrond Myklebust1-0/+1
Strictly speaking, a stable write error needs to reflect the write + the commit of that write (and only that write). To ensure that we don't pick up the write errors from other writebacks, add a rw_semaphore to provide exclusion. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-23nfsd: Containerise filecache laundretteTrond Myklebust1-0/+2
Ensure that if the filecache laundrette gets stuck, it only affects the knfsd instances of one container. The notifier callbacks can be called from various contexts so avoid using synchonous filesystem operations that might deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10nfsd: nfsd_file cache entries should be per net namespaceTrond Myklebust1-1/+2
Ensure that we can safely clear out the file cache entries when the nfs server is shut down on a container. Otherwise, the file cache may end up pinning the mounts. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19nfsd: add a new struct file caching facility to nfsdJeff Layton1-0/+60
Currently, NFSv2/3 reads and writes have to open a file, do the read or write and then close it again for each RPC. This is highly inefficient, especially when the underlying filesystem has a relatively slow open routine. This patch adds a new open file cache to knfsd. Rather than doing an open for each RPC, the read/write handlers can call into this cache to see if there is one already there for the correct filehandle and NFS_MAY_READ/WRITE flags. If there isn't an entry, then we create a new one and attempt to perform the open. If there is, then we wait until the entry is fully instantiated and return it if it is at the end of the wait. If it's not, then we attempt to take over construction. Since the main goal is to speed up NFSv2/3 I/O, we don't want to close these files on last put of these objects. We need to keep them around for a little while since we never know when the next READ/WRITE will come in. Cache entries have a hardcoded 1s timeout, and we have a recurring workqueue job that walks the cache and purges any entries that have expired. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Richard Sharpe <richard.sharpe@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>