summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-04-14io_uring: fix POLL_REMOVE removing apollPavel Begunkov1-12/+11
Don't allow REQ_OP_POLL_REMOVE to kill apoll requests, users should not know about it. Also, remove weird -EACCESS in io_poll_update(), it shouldn't know anything about apoll, and have to work even if happened to have a poll and an async poll'ed request with same user_data. Fixes: b69de288e913 ("io_uring: allow events and user_data update of running poll requests") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-14io_uring: refactor io_ring_exit_work()Pavel Begunkov1-4/+5
Don't reinit io_ring_exit_work()'s exit work/completions on each iteration, that's wasteful. Also add list_rotate_left(), so if we failed to complete the task job, we don't try it again and again but defer it until others are processed. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-14NFSv42: Don't force attribute revalidation of the copy offload sourceTrond Myklebust1-6/+1
When a copy offload is performed, we do not expect the source file to change other than perhaps to see the atime be updated. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv42: Copy offload should update the file size when appropriateTrond Myklebust1-9/+32
If the result of a copy offload or clone operation is to grow the destination file size, then we should update it. The reason is that when a client holds a delegation, it is authoritative for the file size. Fixes: 16abd2a0c124 ("NFSv4.2: fix client's attribute cache management for copy_file_range") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4.2 fix handling of sr_eof in SEEK's replyOlga Kornievskaia1-1/+4
Currently the client ignores the value of the sr_eof of the SEEK operation. According to the spec, if the server didn't find the requested extent and reached the end of the file, the server would return sr_eof=true. In case the request for DATA and no data was found (ie in the middle of the hole), then the lseek expects that ENXIO would be returned. Fixes: 1c6dcbe5ceff8 ("NFS: Implement SEEK") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14pNFS/flexfiles: fix incorrect size check in decode_nfs_fh()Nikola Livic1-1/+1
We (adam zabrocki, alexander matrosov, alexander tereshkin, maksym bazalii) observed the check: if (fh->size > sizeof(struct nfs_fh)) should not use the size of the nfs_fh struct which includes an extra two bytes from the size field. struct nfs_fh { unsigned short size; unsigned char data[NFS_MAXFHSIZE]; } but should determine the size from data[NFS_MAXFHSIZE] so the memcpy will not write 2 bytes beyond destination. The proposed fix is to compare against the NFS_MAXFHSIZE directly, as is done elsewhere in fs code base. Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Nikola Livic <nlivic@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4: Catch and trace server filehandle encoding errorsTrond Myklebust5-6/+14
If the server returns a filehandle with an invalid length, then trace that, and return an EREMOTEIO error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4: Convert nfs_xdr_status tracepoint to an event classTrond Myklebust2-2/+19
We would like the ability to record other XDR errors, particularly those that are due to server bugs. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4: Add tracing for COMPOUND errorsTrond Myklebust2-3/+36
When the server returns a different operation than we expected, then trace that. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFS: Split attribute support out from the server capabilitiesTrond Myklebust3-47/+68
There are lots of attributes, and they are crowding out the bit space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFS: Don't store NFS_INO_REVAL_FORCEDTrond Myklebust4-52/+43
NFS_INO_REVAL_FORCED is intended to tell us that the cache needs revalidation despite the fact that we hold a delegation. We shouldn't need to store it anymore, though. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4: link must update the inode nlink.Trond Myklebust2-1/+10
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14NFSv4: nfs4_inc/dec_nlink_locked should also invalidate ctimeTrond Myklebust1-7/+11
If the nlink changes, then so will the ctime. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-14cuse: simplify refcountMiklos Szeredi1-7/+3
Put extra reference early in cuse_channel_open(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14cuse: prevent cloneMiklos Szeredi1-0/+2
For cloned connections cuse_channel_release() will be called more than once, resulting in use after free. Prevent device cloning for CUSE, which does not make sense at this point, and highly unlikely to be used in real life. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14virtiofs: fix usernsMiklos Szeredi1-2/+1
get_user_ns() is done twice (once in virtio_fs_get_tree() and once in fuse_conn_init()), resulting in a reference leak. Also looks better to use fsc->user_ns (which *should* be the current_user_ns() at this point). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14virtiofs: remove useless functionJiapeng Chong1-5/+0
Fix the following clang warning: fs/fuse/virtio_fs.c:130:35: warning: unused function 'vq_to_fpq' [-Wunused-function]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14virtiofs: split requests that exceed virtqueue sizeConnor Kuehl3-3/+22
If an incoming FUSE request can't fit on the virtqueue, the request is placed onto a workqueue so a worker can try to resubmit it later where there will (hopefully) be space for it next time. This is fine for requests that aren't larger than a virtqueue's maximum capacity. However, if a request's size exceeds the maximum capacity of the virtqueue (even if the virtqueue is empty), it will be doomed to a life of being placed on the workqueue, removed, discovered it won't fit, and placed on the workqueue yet again. Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors) of the virtio spec: "A driver MUST NOT create a descriptor chain longer than the Queue Size of the device." To fix this, limit the number of pages FUSE will use for an overall request. This way, each request can realistically fit on the virtqueue when it is decomposed into a scattergather list and avoid violating section 2.6.5.3.1 of the virtio spec. Signed-off-by: Connor Kuehl <ckuehl@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14virtiofs: fix memory leak in virtio_fs_probe()Luis Henriques1-0/+1
When accidentally passing twice the same tag to qemu, kmemleak ended up reporting a memory leak in virtiofs. Also, looking at the log I saw the following error (that's when I realised the duplicated tag): virtiofs: probe of virtio5 failed with error -17 Here's the kmemleak log for reference: unreferenced object 0xffff888103d47800 (size 1024): comm "systemd-udevd", pid 118, jiffies 4294893780 (age 18.340s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 80 90 02 a0 ff ff ff ff ................ backtrace: [<000000000ebb87c1>] virtio_fs_probe+0x171/0x7ae [virtiofs] [<00000000f8aca419>] virtio_dev_probe+0x15f/0x210 [<000000004d6baf3c>] really_probe+0xea/0x430 [<00000000a6ceeac8>] device_driver_attach+0xa8/0xb0 [<00000000196f47a7>] __driver_attach+0x98/0x140 [<000000000b20601d>] bus_for_each_dev+0x7b/0xc0 [<00000000399c7b7f>] bus_add_driver+0x11b/0x1f0 [<0000000032b09ba7>] driver_register+0x8f/0xe0 [<00000000cdd55998>] 0xffffffffa002c013 [<000000000ea196a2>] do_one_initcall+0x64/0x2e0 [<0000000008f727ce>] do_init_module+0x5c/0x260 [<000000003cdedab6>] __do_sys_finit_module+0xb5/0x120 [<00000000ad2f48c6>] do_syscall_64+0x33/0x40 [<00000000809526b5>] entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques <lhenriques@suse.de> Fixes: a62a8ef9d97d ("virtio-fs: add virtiofs filesystem") Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: invalidate attrs when page writeback completesVivek Goyal1-0/+9
In fuse when a direct/write-through write happens we invalidate attrs because that might have updated mtime/ctime on server and cached mtime/ctime will be stale. What about page writeback path. Looks like we don't invalidate attrs there. To be consistent, invalidate attrs in writeback path as well. Only exception is when writeback_cache is enabled. In that case we strust local mtime/ctime and there is no need to invalidate attrs. Recently users started experiencing failure of xfstests generic/080, geneirc/215 and generic/614 on virtiofs. This happened only newer "stat" utility and not older one. This patch fixes the issue. So what's the root cause of the issue. Here is detailed explanation. generic/080 test does mmap write to a file, closes the file and then checks if mtime has been updated or not. When file is closed, it leads to flushing of dirty pages (and that should update mtime/ctime on server). But we did not explicitly invalidate attrs after writeback finished. Still generic/080 passed so far and reason being that we invalidated atime in fuse_readpages_end(). This is called in fuse_readahead() path and always seems to trigger before mmaped write. So after mmaped write when lstat() is called, it sees that atleast one of the fields being asked for is invalid (atime) and that results in generating GETATTR to server and mtime/ctime also get updated and test passes. But newer /usr/bin/stat seems to have moved to using statx() syscall now (instead of using lstat()). And statx() allows it to query only ctime or mtime (and not rest of the basic stat fields). That means when querying for mtime, fuse_update_get_attr() sees that mtime is not invalid (only atime is invalid). So it does not generate a new GETATTR and fill stat with cached mtime/ctime. And that means updated mtime is not seen by xfstest and tests start failing. Invalidating attrs after writeback completion should solve this problem in a generic manner. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: add a flag FUSE_SETXATTR_ACL_KILL_SGID to kill SGIDVivek Goyal1-1/+6
When posix access ACL is set, it can have an effect on file mode and it can also need to clear SGID if. - None of caller's group/supplementary groups match file owner group. AND - Caller is not priviliged (No CAP_FSETID). As of now fuser server is responsible for changing the file mode as well. But it does not know whether to clear SGID or not. So add a flag FUSE_SETXATTR_ACL_KILL_SGID and send this info with SETXATTR to let file server know that sgid needs to be cleared as well. Reported-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: extend FUSE_SETXATTR requestVivek Goyal4-6/+14
Fuse client needs to send additional information to file server when it calls SETXATTR(system.posix_acl_access), so add extra flags field to the structure. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: fix matching of FUSE_DEV_IOC_CLONE commandAlessio Balsini1-5/+2
With commit f8425c939663 ("fuse: 32-bit user space ioctl compat for fuse device") the matching constraints for the FUSE_DEV_IOC_CLONE ioctl command are relaxed, limited to the testing of command type and number. As Arnd noticed, this is wrong as it wouldn't ensure the correctness of the data size or direction for the received FUSE device ioctl. Fix by bringing back the comparison of the ioctl received by the FUSE device to the originally generated FUSE_DEV_IOC_CLONE. Fixes: f8425c939663 ("fuse: 32-bit user space ioctl compat for fuse device") Reported-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Alessio Balsini <balsini@android.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: fix a typoBhaskar Chowdhury1-1/+1
s/reponsible/responsible/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: don't zero pages twiceMiklos Szeredi1-15/+6
All callers of fuse_short_read already set the .page_zeroing flag, so no need to do the tail zeroing again. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: fix typo for fuse_conn.max_pages commentConnor Kuehl1-1/+1
'Maxmum' -> 'Maximum' Signed-off-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-14fuse: fix write deadlockVivek Goyal2-12/+30
There are two modes for write(2) and friends in fuse: a) write through (update page cache, send sync WRITE request to userspace) b) buffered write (update page cache, async writeout later) The write through method kept all the page cache pages locked that were used for the request. Keeping more than one page locked is deadlock prone and Qian Cai demonstrated this with trinity fuzzing. The reason for keeping the pages locked is that concurrent mapped reads shouldn't try to pull possibly stale data into the page cache. For full page writes, the easy way to fix this is to make the cached page be the authoritative source by marking the page PG_uptodate immediately. After this the page can be safely unlocked, since mapped/cached reads will take the written data from the cache. Concurrent mapped writes will now cause data in the original WRITE request to be updated; this however doesn't cause any data inconsistency and this scenario should be exceedingly rare anyway. If the WRITE request returns with an error in the above case, currently the page is not marked uptodate; this means that a concurrent read will always read consistent data. After this patch the page is uptodate between writing to the cache and receiving the error: there's window where a cached read will read the wrong data. While theoretically this could be a regression, it is unlikely to be one in practice, since this is normal for buffered writes. In case of a partial page write to an already uptodate page the locking is also unnecessary, with the above caveats. Partial write of a not uptodate page still needs to be handled. One way would be to read the complete page before doing the write. This is not possible, since it might break filesystems that don't expect any READ requests when the file was opened O_WRONLY. The other solution is to serialize the synchronous write with reads from the partial pages. The easiest way to do this is to keep the partial pages locked. The problem is that a write() may involve two such pages (one head and one tail). This patch fixes it by only locking the partial tail page. If there's a partial head page as well, then split that off as a separate WRITE request. Reported-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/linux-fsdevel/4794a3fa3742a5e84fb0f934944204b55730829b.camel@lca.pw/ Fixes: ea9b9907b82a ("fuse: implement perform_write") Cc: <stable@vger.kernel.org> # v2.6.26 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-04-13f2fs: fix to avoid NULL pointer dereferenceYi Chen1-1/+4
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 pc : f2fs_put_page+0x1c/0x26c lr : __revoke_inmem_pages+0x544/0x75c f2fs_put_page+0x1c/0x26c __revoke_inmem_pages+0x544/0x75c __f2fs_commit_inmem_pages+0x364/0x3c0 f2fs_commit_inmem_pages+0xc8/0x1a0 f2fs_ioc_commit_atomic_write+0xa4/0x15c f2fs_ioctl+0x5b0/0x1574 file_ioctl+0x154/0x320 do_vfs_ioctl+0x164/0x740 __arm64_sys_ioctl+0x78/0xa4 el0_svc_common+0xbc/0x1d0 el0_svc_handler+0x74/0x98 el0_svc+0x8/0xc In f2fs_put_page, we access page->mapping is NULL. The root cause is: In some cases, the page refcount and ATOMIC_WRITTEN_PAGE flag miss set for page-priavte flag has been set. We add f2fs_bug_on like this: f2fs_register_inmem_page() { ... f2fs_set_page_private(page, ATOMIC_WRITTEN_PAGE); f2fs_bug_on(F2FS_I_SB(inode), !IS_ATOMIC_WRITTEN_PAGE(page)); ... } The bug on stack follow link this: PC is at f2fs_register_inmem_page+0x238/0x2b4 LR is at f2fs_register_inmem_page+0x2a8/0x2b4 f2fs_register_inmem_page+0x238/0x2b4 f2fs_set_data_page_dirty+0x104/0x164 set_page_dirty+0x78/0xc8 f2fs_write_end+0x1b4/0x444 generic_perform_write+0x144/0x1cc __generic_file_write_iter+0xc4/0x174 f2fs_file_write_iter+0x2c0/0x350 __vfs_write+0x104/0x134 vfs_write+0xe8/0x19c SyS_pwrite64+0x78/0xb8 To fix this issue, let's add page refcount add page-priavte flag. The page-private flag is not cleared and needs further analysis. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Ge Qiu <qiuge@huawei.com> Signed-off-by: Dehe Gu <gudehe@huawei.com> Signed-off-by: Yi Chen <chenyi77@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-13f2fs: avoid duplicated codes for cleanupChao Yu1-22/+10
f2fs_segment_has_free_slot() was copied and modified from __next_free_blkoff(), they are almost the same, clean up to reuse common code as much as possible. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-13io_uring: inline io_iopoll_getevents()Pavel Begunkov1-39/+13
io_iopoll_getevents() is of no use to us anymore, io_iopoll_check() handles all the cases. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e50b8917390f38bee4f822c6f4a6a98a27be037.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: skip futile iopoll iterationsPavel Begunkov1-2/+5
The only way to get out of io_iopoll_getevents() and continue iterating is to have empty iopoll_list, otherwise the main loop would just exit. So, instead of the unlock on 8th time heuristic, do that based on iopoll_list. Also, as no one can add new requests to iopoll_list while io_iopoll_check() hold uring_lock, it's useless to spin with the list empty, return in that case. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5b8ebe84f5fff7ffa1f708952dfef7fc78b668e2.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: don't fail overflow on in_idlePavel Begunkov1-24/+20
As CQE overflows are now untied from requests and so don't hold any ref, we don't need to handle exiting/exec'ing cases there anymore. Moreover, it's much nicer in regards to userspace to save overflowed CQEs whenever possible, so remove failing on in_idle. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d873b7dab75c7f3039ead9628a745bea01f2cfd2.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: clean up io_poll_remove_waitqs()Pavel Begunkov1-10/+5
Move some parts of io_poll_remove_waitqs() that are opcode independent. Looks better and stresses that both do __io_poll_remove_one(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bbc717f82117cc335c89cbe67ec8d72608178732.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: refactor hrtimer_try_to_cancel usesPavel Begunkov1-15/+8
Don't save return values of hrtimer_try_to_cancel() in a variable, but use right away. It's in general safer to not have an intermediate variable, which may be reused and passed out wrongly, but it be contracted out. Also clean io_timeout_extract(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d2566ef7ce632e6882dc13e022a26249b3fd30b5.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: add timeout completion_lock annotationPavel Begunkov1-0/+1
Add one more sparse locking annotation for readability in io_kill_timeout(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bdbb22026024eac29203c1aa0045c4954a2488d1.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: split poll and poll update structuresPavel Begunkov1-23/+32
struct io_poll_iocb became pretty nasty combining also update fields. Split them, so we would have more clarity to it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b2f74d64ffebb57a648f791681af086c7211e3a4.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: fix uninit old data for poll event updPavel Begunkov1-9/+9
Both IORING_POLL_UPDATE_EVENTS and IORING_POLL_UPDATE_USER_DATA need old_user_data to find/cancel a poll request, but it's set only for the first one. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ab08fd35b7652e977f9a475f01741b04102297f1.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13io_uring: fix leaking reg files on exitPavel Begunkov1-14/+15
If io_sqe_files_unregister() faults on io_rsrc_ref_quiesce(), it will fail to do unregister leaving files referenced. And that may well happen because of a strayed signal or just because it does allocations inside. In io_ring_ctx_free() do an unsafe version of unregister, as it's guaranteed to not have requests by that point and so quiesce is useless. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e696e9eade571b51997d0dc1d01f144c6d685c05.1618278933.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-13NFS: Another inode revalidation improvementTrond Myklebust1-1/+29
If we're trying to update the inode because a previous update left the cache in a partially unrevalidated state, then allow the update if the change attrs match. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Use information about the change attribute to optimise updatesTrond Myklebust2-21/+110
If the NFSv4.2 server supports the 'change_attr_type' attribute, then allow the client to optimise its attribute cache update strategy. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFSv4: Add support for the NFSv4.2 "change_attr_type" attributeTrond Myklebust5-0/+38
The change_attr_type allows the server to provide a description of how the change attribute will behave. This again will allow the client to optimise its behaviour w.r.t. attribute revalidation. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFSv4: Don't modify the change attribute cached in the inodeTrond Myklebust1-3/+0
When the client is caching data and a write delegation is held, then the server may send a CB_GETATTR to query the attributes. When this happens, the client is supposed to bump the change attribute value that it returns if it holds cached data. However that process uses a value that is stored in the delegation. We do not want to bump the change attribute held in the inode. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFSv4: Fix value of decode_fsinfo_maxszTrond Myklebust1-1/+10
At least two extra fields have been added to fsinfo since this was last updated. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Simplify cache consistency in nfs_check_inode_attributes()Trond Myklebust1-13/+5
We should not be invalidating the access or acl caches in nfs_check_inode_attributes(), since the point is we're unsure about whether the contents of the struct nfs_fattr are fully up to date. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Remove a line of code that has no effect in nfs_update_inode()Trond Myklebust1-1/+0
Commit 0b467264d0db ("NFS: Fix attribute revalidation") changed the way we populate the 'invalid' attribute, and made the line that strips away the NFS_INO_INVALID_ATTR bits redundant. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Fix up handling of outstanding layoutcommit in nfs_update_inode()Trond Myklebust1-1/+5
If there is an outstanding layoutcommit, then the list of attributes whose values are expected to change is not the full set. So let's be explicit about the full list. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Separate tracking of file mode cache validity from the uid/gidTrond Myklebust5-15/+28
chown()/chgrp() and chmod() are separate operations, and in addition, there are mode operations that are performed automatically by the server. So let's track mode validity separately from the file ownership validity. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFS: Separate tracking of file nlinks cache validity from the mode/uid/gidTrond Myklebust4-13/+21
Rename can cause us to revalidate the access cache, so lets track the nlinks separately from the mode/uid/gid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13NFSv4: Fix nfs4_bitmap_copy_adjust()Trond Myklebust1-17/+16
Don't remove flags from the set retrieved from the cache_validity. We do want to retrieve all attributes that are listed as being invalid, whether or not there is a delegation set. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-13fs/locks: remove useless assignment in fcntl_getlkTian Tao1-1/+0
Function parameter 'cmd' is rewritten with unused value at locks.c Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>