Age | Commit message (Collapse) | Author | Files | Lines |
|
WARNING: CPU: 1 PID: 27907 at fs/io_uring.c:7147 io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
CPU: 1 PID: 27907 Comm: iou-sqp-27905 Not tainted 5.12.0-rc4-syzkaller #0
RIP: 0010:io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
Call Trace:
io_ring_ctx_wait_and_kill+0x214/0x700 fs/io_uring.c:8619
io_uring_release+0x3e/0x50 fs/io_uring.c:8646
__fput+0x288/0x920 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:140
io_run_task_work fs/io_uring.c:2238 [inline]
io_run_task_work fs/io_uring.c:2228 [inline]
io_uring_try_cancel_requests+0x8ec/0xc60 fs/io_uring.c:8770
io_uring_cancel_sqpoll+0x1cf/0x290 fs/io_uring.c:8974
io_sqpoll_cancel_cb+0x87/0xb0 fs/io_uring.c:8907
io_run_task_work_head+0x58/0xb0 fs/io_uring.c:1961
io_sq_thread+0x3e2/0x18d0 fs/io_uring.c:6763
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
May happen that last ctx ref is killed in io_uring_cancel_sqpoll(), so
fput callback (i.e. io_uring_release()) is enqueued through task_work,
and run by same cancellation. As it's deeply nested we can't do parking
or taking sqd->lock there, because its state is unclear. So avoid
ctx ejection from sqd list from io_ring_ctx_wait_and_kill() and do it
in a clear context in io_ring_exit_work().
Fixes: f6d54255f423 ("io_uring: halt SQO submission on ctx exit")
Reported-by: syzbot+e3a3f84f5cecf61f0583@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e90df88b8ff2cabb14a7534601d35d62ab4cb8c7.1616496707.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Open-coding this function meant it missed out on the recent bugfix
for waiters being woken by a delayed wake event from a previous
instantiation of the page[1].
[DH: Changed the patch to use vmf->page rather than variable page which
doesn't exist yet upstream]
Fixes: 1cf7a1518aef ("afs: Implement shared-writeable mmap")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-afs@lists.infradead.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-4-willy@infradead.org
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 [1]
|
|
Cachefiles was relying on wait_page_key and wait_bit_key being the
same layout, which is fragile. Now that wait_page_key is exposed in
the pagemap.h header, we can remove that fragility
A comment on the need to maintain structure layout equivalence was added by
Linus[1] and that is no longer applicable.
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-cachefs@redhat.com
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1]
|
|
Fix ~59 single-word typos in the tracing code comments, and fix
the grammar in a handful of places.
Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com
Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Now, fallocate() on a pinned file only allocates blocks which aligns
to segment rather than section, so GC may try to migrate pinned file's
block, and after several times of failure, pinned file's block could
be migrated to other place, however user won't be aware of such
condition, and then old obsolete block address may be readed/written
incorrectly.
To avoid such condition, let's try to allocate pinned file's blocks
with section alignment.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The GD_NEED_PART_SCAN is set by bdev_check_media_change to initiate
a partition scan while removing a block device. It should be cleared
after blk_drop_paritions because blk_drop_paritions could return
-EBUSY and then the consequence __blkdev_get has no chance to do
delete_partition if GD_NEED_PART_SCAN already cleared.
It causes some problems on some card readers. Ex. Realtek card
reader 0bda:0328 and 0bda:0158. The device node of the partition
will not disappear after the memory card removed. Thus the user
applications can not update the device mapping correctly.
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1920874
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323085219.24428-1-chris.chiu@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix kernel-doc warnings in xattr.c:
../fs/xattr.c:257: warning: Function parameter or member 'mnt_userns' not described in '__vfs_setxattr_locked'
../fs/xattr.c:485: warning: Function parameter or member 'mnt_userns' not described in '__vfs_removexattr_locked'
and fix one function whose kernel-doc was not in the correct format.
Link: https://lore.kernel.org/r/20210216042929.8931-4-rdunlap@infradead.org
Fixes: 71bc356f93a1 ("commoncap: handle idmapped mounts")
Fixes: b1ab7e4b2a88 ("VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: David P. Quigley <dpquigl@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Fix kernel-doc warnings in namei.c:
../fs/namei.c:1149: warning: Excess function parameter 'dir_mode' description in 'may_create_in_sticky'
../fs/namei.c:1149: warning: Excess function parameter 'dir_uid' description in 'may_create_in_sticky'
../fs/namei.c:3396: warning: Function parameter or member 'open_flag' not described in 'vfs_tmpfile'
../fs/namei.c:3396: warning: Excess function parameter 'open_flags' description in 'vfs_tmpfile'
../fs/namei.c:4460: warning: Function parameter or member 'rd' not described in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'old_mnt_userns' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'old_dir' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'old_dentry' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'new_mnt_userns' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'new_dir' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'new_dentry' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'delegated_inode' description in 'vfs_rename'
../fs/namei.c:4460: warning: Excess function parameter 'flags' description in 'vfs_rename'
Link: https://lore.kernel.org/r/20210216042929.8931-3-rdunlap@infradead.org
Fixes: 9fe61450972d ("namei: introduce struct renamedata")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Fix kernel-doc warning in libfs.c.
../fs/libfs.c:498: warning: Function parameter or member 'mnt_userns' not described in 'simple_setattr'
Link: https://lore.kernel.org/r/20210216042929.8931-2-rdunlap@infradead.org/
Fixes: 549c7297717c ("fs: make helpers idmap mount aware")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Give filesystem two little helpers that do the right thing when
initializing the i_uid and i_gid fields on idmapped and non-idmapped
mounts. Filesystems shouldn't have to be concerned with too many
details.
Link: https://lore.kernel.org/r/20210320122623.599086-5-christian.brauner@ubuntu.com
Inspired-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Don't open-code the checks and instead move them into a clean little
helper we can call. This also reduces the risk that if we ever change
something we forget to change all locations.
Link: https://lore.kernel.org/r/20210320122623.599086-4-christian.brauner@ubuntu.com
Inspired-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Vivek pointed out that the fs{g,u}id_into_mnt() naming scheme can be
misleading as it could be understood as implying they do the exact same
thing as i_{g,u}id_into_mnt(). The original motivation for this naming
scheme was to signal to callers that the helpers will always take care
to map the k{g,u}id such that the ownership is expressed in terms of the
mnt_users.
Get rid of the confusion by renaming those helpers to something more
sensible. Al suggested mapped_fs{g,u}id() which seems a really good fit.
Usually filesystems don't need to bother with these helpers directly
only in some cases where they allocate objects that carry {g,u}ids which
are either filesystem specific (e.g. xfs quota objects) or don't have a
clean set of helpers as inodes have.
Link: https://lore.kernel.org/r/20210320122623.599086-3-christian.brauner@ubuntu.com
Inspired-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Keep track of whether or not there were LSM security context
options passed during mount (ie creation of the superblock).
Then, while deciding if the superblock can be shared for the new
mount, check if the newly passed in LSM security context options
are compatible with the existing superblock's ones by calling
security_sb_mnt_opts_compat().
Previously, with selinux enabled, NFS wasn't able to do the
following 2mounts:
mount -o vers=4.2,sec=sys,context=system_u:object_r:root_t:s0
<serverip>:/ /mnt
mount -o vers=4.2,sec=sys,context=system_u:object_r:swapfile_t:s0
<serverip>:/scratch /scratch
2nd mount would fail with "mount.nfs: an incorrect mount option was
specified" and var log messages would have:
"SElinux: mount invalid. Same superblock, different security
settings for.."
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
[PM: tweak subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
In nfs_fill_super() passed in nfs_fs_context can never be NULL.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
[PM: tweak subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
This patch replaces a loop with a "tzcnt" instruction.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
mountd can now monitor clients appearing and disappearing in
/proc/fs/nfsd/clients, and will log these events, in liu of the logging
of mount/unmount events for NFSv3.
Currently it cannot distinguish between unconfirmed clients (which might
be transient and totally uninteresting) and confirmed clients.
So add a "status: " line which reports either "confirmed" or
"unconfirmed", and use fsnotify to report that the info file
has been modified.
This requires a bit of infrastructure to keep the dentry for the "info"
file. There is no need to take a counted reference as the dentry must
remain around until the client is removed.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Note size_t is 32-bit on a 32-bit architecture, but cp_count is defined
by the protocol to be 64 bit, so we could be turning a large copy into a
0-length copy here.
Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
>From https://tools.ietf.org/html/rfc7862#page-65
A count of 0 (zero) requests that all bytes from ca_src_offset
through EOF be copied to the destination.
Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Trivial fix.
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
In order to ensure that knfsd threads don't linger once the nfsd
pseudofs is unmounted (e.g. when the container is killed) we let
nfsd_umount() shut down those threads and wait for them to exit.
This also should ensure that we don't need to do a kernel mount of
the pseudofs, since the thread lifetime is now limited by the
lifetime of the filesystem.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
`printk()`, by default, uses the log level warning, which leaves the
user reading
NFSD: Using UMH upcall client tracking operations.
wondering what to do about it (`dmesg --level=warn`).
Several client tracking methods are tried, and expected to fail. That’s
why a message is printed only on success. It might be interesting for
users to know the chosen method, so use info-level instead of
debug-level.
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
We do this same logic repeatedly, and it's easy to get the sense of the
comparison wrong.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
These are no longer needed because there are no dprintk() call sites
in these files.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable watching the progress of directory encoding to capture the
timing of any issues with reading or encoding a directory. The
new tracepoint captures dirent encoding for all NFS versions.
For example, here's what a few NFSv4 directory entries might look
like:
nfsd-989 [002] 468.596265: nfsd_dirent: fh_hash=0x5d162594 ino=2 name=.
nfsd-989 [002] 468.596267: nfsd_dirent: fh_hash=0x5d162594 ino=1 name=..
nfsd-989 [002] 468.596299: nfsd_dirent: fh_hash=0x5d162594 ino=3827 name=zlib.c
nfsd-989 [002] 468.596325: nfsd_dirent: fh_hash=0x5d162594 ino=3811 name=xdiff
nfsd-989 [002] 468.596351: nfsd_dirent: fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h
nfsd-989 [002] 468.596377: nfsd_dirent: fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The SETACL result encoder is exactly the same as the NFSv2
attrstatres decoder.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Clean up: Counting the bytes used by each returned directory entry
seems less brittle to me than trying to measure consumed pages after
the fact.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Refactor: Add helper function similar to nfs3svc_encode_cookie3().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances
rq_next_page to the full size of the client-requested buffer, then
releases all those pages at the end of the request. The next request
to use that nfsd thread has to refill the pages.
NFSD does this even when the dirlist in the reply is small. With
NFSv3 clients that send READDIR operations with large buffer sizes,
that can be 256 put_page/alloc_page pairs per READDIR request, even
though those pages often remain unused.
We can save some work by not releasing dirlist buffer pages that
were not used to form the READDIR Reply. I've left the NFSv2 code
alone since there are never more than three pages involved in an
NFSv2 READDIR Reply.
Eventually we should nail down why these pages need to be released
at all in order to avoid allocating and releasing pages
unnecessarily.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The benefit of the xdr_stream helpers is that they transparently
handle encoding an XDR data item that crosses page boundaries.
Most of the open-coded logic to do that here can be eliminated.
A sub-buffer and sub-stream are set up as a sink buffer for the
directory entry encoder. As an entry is encoded, it is added to
the end of the content in this buffer/stream. The total length of
the directory list is tracked in the buffer's @len field.
When it comes time to encode the Reply, the sub-buffer is merged
into rq_res's page array at the correct place using
xdr_write_pages().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Clean up: Counting the bytes used by each returned directory entry
seems less brittle to me than trying to measure consumed pages after
the fact.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Refactor: De-duplicate identical code that handles encoding of
directory offset cookies across page boundaries.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|