summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-07-20io_uring: remove double poll entry on arm failurePavel Begunkov1-0/+2
__io_queue_proc() can enqueue both poll entries and still fail afterwards, so the callers trying to cancel it should also try to remove the second poll entry (if any). For example, it may leave the request alive referencing a io_uring context but not accessible for cancellation: [ 282.599913][ T1620] task:iou-sqp-23145 state:D stack:28720 pid:23155 ppid: 8844 flags:0x00004004 [ 282.609927][ T1620] Call Trace: [ 282.613711][ T1620] __schedule+0x93a/0x26f0 [ 282.634647][ T1620] schedule+0xd3/0x270 [ 282.638874][ T1620] io_uring_cancel_generic+0x54d/0x890 [ 282.660346][ T1620] io_sq_thread+0xaac/0x1250 [ 282.696394][ T1620] ret_from_fork+0x1f/0x30 Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Reported-and-tested-by: syzbot+ac957324022b7132accf@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0ec1228fc5eda4cb524eeda857da8efdc43c331c.1626774457.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-20io_uring: explicitly count entries for poll reqsPavel Begunkov1-6/+10
If __io_queue_proc() fails to add a second poll entry, e.g. kmalloc() failed, but it goes on with a third waitqueue, it may succeed and overwrite the error status. Count the number of poll entries we added, so we can set pt->error to zero at the beginning and find out when the mentioned scenario happens. Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9d6b9e561f88bcc0163623b74a76c39f712151c3.1626774457.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-20gfs2: Fix memory leak of object lsi on error return pathColin Ian King1-0/+1
In the case where IS_ERR(lsi->si_sc_inode) is true the error exit path to free_local does not kfree the allocated object lsi leading to a memory leak. Fix this by kfree'ing lst before taking the error exit path. Addresses-Coverity: ("Resource leak") Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-07-20f2fs: quota: fix potential deadlockChao Yu1-36/+48
xfstest generic/587 reports a deadlock issue as below: ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc1 #69 Not tainted ------------------------------------------------------ repquota/8606 is trying to acquire lock: ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs] but task is already holding lock: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sbi->quota_sem){.+.+}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_quota_sync+0x59/0x300 [f2fs] f2fs_quota_on+0x48/0x100 [f2fs] do_quotactl+0x5e3/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #1 (&sbi->cp_rwsem){++++}-{3:3}: __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_read+0x3b/0x2a0 f2fs_unlink+0x353/0x670 [f2fs] vfs_unlink+0x1c7/0x380 do_unlinkat+0x413/0x4b0 __x64_sys_unlinkat+0x50/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}: check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->quota_sem); lock(&sbi->cp_rwsem); lock(&sbi->quota_sem); lock(&sb->s_type->i_mutex_key#18); *** DEADLOCK *** 3 locks held by repquota/8606: #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190 #1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs] #2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs] stack backtrace: CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 #69 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack_lvl+0xce/0x134 dump_stack+0x17/0x20 print_circular_bug.isra.0.cold+0x239/0x253 check_noncircular+0x1be/0x1f0 check_prev_add+0xdc/0xb30 validate_chain+0xa67/0xb20 __lock_acquire+0x648/0x10b0 lock_acquire+0x128/0x470 down_write+0x39/0xc0 f2fs_quota_sync+0x207/0x300 [f2fs] do_quotactl+0xaff/0xb30 __x64_sys_quotactl+0x23a/0x4e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f883b0b4efe The root cause is ABBA deadlock of inode lock and cp_rwsem, reorder locks in f2fs_quota_sync() as below to fix this issue: - lock inode - lock cp_rwsem - lock quota_sem Fixes: db6ec53b7e03 ("f2fs: add a rw_sem to cover quota flag changes") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-20f2fs: let's keep writing IOs on SBI_NEED_FSCKJaegeuk Kim2-1/+3
SBI_NEED_FSCK is an indicator that fsck.f2fs needs to be triggered, so it is not fully critical to stop any IO writes. So, let's allow to write data instead of reporting EIO forever given SBI_NEED_FSCK, but do keep OPU. Fixes: 955772787667 ("f2fs: drop inplace IO if fs status is abnormal") Cc: <stable@kernel.org> # v5.13+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-20seq_file: disallow extremely large seq buffer allocationsEric Sandeen1-0/+3
There is no reasonable need for a buffer larger than this, and it avoids int overflow pitfalls. Fixes: 058504edd026 ("fs/seq_file: fallback to vmalloc allocation") Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Qualys Security Advisory <qsa@qualys.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-19f2fs: Revert "f2fs: Fix indefinite loop in f2fs_gc() v1"Jia Yang1-1/+1
This reverts commit 957fa47823dfe449c5a15a944e4e7a299a6601db. The patch "f2fs: Fix indefinite loop in f2fs_gc()" v1 and v4 are all merged. Patch v4 is test info for patch v1. Patch v1 doesn't work and may cause that sbi->cur_victim_sec can't be resetted to NULL_SEGNO, which makes SSR unable to get segment of sbi->cur_victim_sec. So it should be reverted. The mails record: [1] https://lore.kernel.org/linux-f2fs-devel/7288dcd4-b168-7656-d1af-7e2cafa4f720@huawei.com/T/ [2] https://lore.kernel.org/linux-f2fs-devel/20190809153653.GD93481@jaegeuk-macbookpro.roam.corp.google.com/T/ Signed-off-by: Jia Yang <jiayang5@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-19fs: dlm: move receive loop into receive handlerAlexander Aring1-37/+31
This patch moves the kernel_recvmsg() loop call into the receive_from_sock() function instead of doing the loop outside the function and abort the loop over it's return value. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: fix multiple empty writequeue allocAlexander Aring1-0/+21
This patch will add a mutex that a connection can allocate a writequeue entry buffer only at a sleepable context at one time. If multiple caller waits at the writequeue spinlock and the spinlock gets release it could be that multiple new writequeue page buffers were allocated instead of allocate one writequeue page buffer and other waiters will use remaining buffer of it. It will only be the case for sleepable context which is the common case. In non-sleepable contexts like retransmission we just don't care about such behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: generic connect funcAlexander Aring1-193/+150
This patch adds a generic connect function for TCP and SCTP. If the connect functionality differs from each other additional callbacks in dlm_proto_ops were added. The sockopts callback handling will guarantee that sockets created by connect() will use the same options as sockets created by accept(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: auto load sctp moduleAlexander Aring1-5/+15
This patch adds a "for now" better handling of missing SCTP support in the kernel and try to load the sctp module if SCTP is set. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: introduce generic listenAlexander Aring1-115/+113
This patch combines each transport layer listen functionality into one listen function. Per transport layer differences are provided by additional callbacks in dlm_proto_ops. This patch drops silently sock_set_keepalive() for listen tcp sockets only. This socket option is not set at connecting sockets, I also don't see the sense of set keepalive for sockets which are created by accept() only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: move to static proto opsAlexander Aring1-22/+30
This patch moves the per transport socket callbacks to a static const array. We can support only one transport socket for the init namespace which will be determinted by reading the dlm config at lowcomms_start(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: introduce con_next_wq helperAlexander Aring1-22/+35
This patch introduce a function to determine if something is ready to being send in the writequeue. It's not just that the writequeue is not empty additional the first entry need to have a valid length field. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: cleanup and remove _send_rcomAlexander Aring1-18/+11
The _send_rcom() can be removed and we call directly dlm_rcom_out(). As we doing that we removing the struct dlm_ls parameter which isn't used. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: clear CF_APP_LIMITED on closeAlexander Aring1-0/+1
If send_to_sock() sets CF_APP_LIMITED limited bit and it has not been cleared by a waiting lowcomms_write_space() yet and a close_connection() apprears we should clear the CF_APP_LIMITED bit again because the connection starts from a new state again at reconnect. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: fix typo in tlv prefixAlexander Aring1-1/+1
This patch fixes a small typo in a unused struct field. It should named be t_pad instead of o_pad. Came over this as I updated wireshark dissector. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: use READ_ONCE for config varAlexander Aring1-1/+1
This patch will use READ_ONCE to signal the compiler to read this variable only one time. If we don't do that it could be that the compiler read this value more than one time, because some optimizations, from the configure data which might can be changed during this time. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19fs: dlm: use sk->sk_socket instead of con->sockAlexander Aring1-2/+1
Instead of dereference "con->sock" we can get the socket structure over "sk->sk_socket" as well. This patch will switch to this behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2021-07-19ksmbd: move credit charge verification over smb2 request size verificationNamjae Jeon1-6/+6
Move credit charge verification over smb2 request size verification to avoid being skipped. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-19ksmbd: set STATUS_INVALID_PARAMETER error status if credit charge is invalidNamjae Jeon2-12/+17
MS-SMB2 specification describe : If the calculated credit number is greater than the CreditCharge, the server MUST fail the request with the error code STATUS_INVALID_PARAMETER. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-19ksmbd: fix wrong error status return on session setupNamjae Jeon1-20/+12
When user insert wrong password, ksmbd return STATUS_INVALID_PARAMETER error status to client. It will make user confusing whether it is not password problem. This patch change error status to STATUS_LOGON_FAILURE. and return STATUS_INSUFFICIENT_RESOURCES if memory allocation failed on session setup. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-19ksmbd: fix wrong compression context sizeNamjae Jeon1-1/+1
Use smb2_compression_ctx instead of smb2_encryption_neg_context. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-18Merge tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds9-37/+174
Pull xfs fixes from Darrick Wong: "A few fixes for issues in the new online shrink code, additional corrections for my recent bug-hunt w.r.t. extent size hints on realtime, and improved input checking of the GROWFSRT ioctl. IOW, the usual 'I somehow got bored during the merge window and resumed auditing the farther reaches of xfs': - Fix shrink eligibility checking when sparse inode clusters enabled - Reset '..' directory entries when unlinking directories to prevent verifier errors if fs is shrinked later - Don't report unusable extent size hints to FSGETXATTR - Don't warn when extent size hints are unusable because the sysadmin configured them that way - Fix insufficient parameter validation in GROWFSRT ioctl - Fix integer overflow when adding rt volumes to filesystem" * tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: detect misaligned rtinherit directory extent size hints xfs: fix an integer overflow error in xfs_growfs_rt xfs: improve FSGROWFSRT precondition checking xfs: don't expose misaligned extszinherit hints to userspace xfs: correct the narrative around misaligned rtinherit/extszinherit dirs xfs: reset child dir '..' entry when unlinking child xfs: check for sparse inode clusters that cross new EOAG when shrinking
2021-07-18Merge tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-20/+13
Pull iomap fixes from Darrick Wong: "A handful of bugfixes for the iomap code. There's nothing especially exciting here, just fixes for UBSAN (not KASAN as I erroneously wrote in the tag message) warnings about undefined behavior in the SEEK_DATA/SEEK_HOLE code, and some reshuffling of per-page block state info to fix some problems with gfs2. - Fix KASAN warnings due to integer overflow in SEEK_DATA/SEEK_HOLE - Fix assertion errors when using inlinedata files on gfs2" * tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: Don't create iomap_page objects in iomap_page_mkwrite_actor iomap: Don't create iomap_page objects for inline files iomap: Permit pages without an iop to enter writeback iomap: remove the length variable in iomap_seek_hole iomap: remove the length variable in iomap_seek_data
2021-07-17Merge tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds8-20/+124
Pull cifs fixes from Steve French: "Eight cifs/smb3 fixes, including three for stable. Three are DFS related fixes, and two to fix problems pointed out by static checkers" * tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: do not share tcp sessions of dfs connections SMB3.1.1: fix mount failure to some servers when compression enabled cifs: added WARN_ON for all the count decrements cifs: fix missing null session check in mount cifs: handle reconnect of tcon when there is no cached dfs referral cifs: fix the out of range assignment to bit fields in parse_server_interfaces cifs: Do not use the original cruid when following DFS links for multiuser mounts cifs: use the expiry output of dns_query to schedule next resolution
2021-07-16Merge tag 'io_uring-5.14-2021-07-16' of git://git.kernel.dk/linux-blockLinus Torvalds1-3/+5
Pull io_uring fixes from Jens Axboe: "Two small fixes: one fixing the process target of a check, and the other a minor issue with the drain error handling" * tag 'io_uring-5.14-2021-07-16' of git://git.kernel.dk/linux-block: io_uring: fix io_drain_req() io_uring: use right task for exiting checks
2021-07-16Merge tag 'zonefs-5.14-rc2' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fix from Damien Le Moal: "A single patch to remove an unnecessary NULL bio check (from Xianting)" * tag 'zonefs-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: remove redundant null bio check
2021-07-16reiserfs: check directory items on read from diskShreyansh Chouhan1-5/+26
While verifying the leaf item that we read from the disk, reiserfs doesn't check the directory items, this could cause a crash when we read a directory item from the disk that has an invalid deh_location. This patch adds a check to the directory items read from the disk that does a bounds check on deh_location for the directory entries. Any directory entry header with a directory entry offset greater than the item length is considered invalid. Link: https://lore.kernel.org/r/20210709152929.766363-1-chouhan.shreyansh630@gmail.com Reported-by: syzbot+c31a48e6702ccb3d64c9@syzkaller.appspotmail.com Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16fs/ext2: Avoid page_address on pages returned by ext2_get_pageJavier Pello3-9/+10
Commit 782b76d7abdf02b12c46ed6f1e9bf715569027f7 ("fs/ext2: Replace kmap() with kmap_local_page()") replaced the kmap/kunmap calls in ext2_get_page/ext2_put_page with kmap_local_page/kunmap_local for efficiency reasons. As a necessary side change, the commit also made ext2_get_page (and ext2_find_entry and ext2_dotdot) return the mapping address along with the page itself, as it is required for kunmap_local, and converted uses of page_address on such pages to use the newly returned address instead. However, uses of page_address on such pages were missed in ext2_check_page and ext2_delete_entry, which triggers oopses if kmap_local_page happens to return an address from high memory. Fix this now by converting the remaining uses of page_address to use the right address, as returned by kmap_local_page. Link: https://lore.kernel.org/r/20210714185448.8707ac239e9f12b3a7f5b9f9@urjc.es Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Javier Pello <javier.pello@urjc.es> Fixes: 782b76d7abdf ("fs/ext2: Replace kmap() with kmap_local_page()") Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16reiserfs: add check for root_inode in reiserfs_fill_superYu Kuai1-0/+8
Our syzcaller report a NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 116e95067 P4D 116e95067 PUD 1080b5067 PMD 0 Oops: 0010 [#1] SMP KASAN CPU: 7 PID: 592 Comm: a.out Not tainted 5.13.0-next-20210629-dirty #67 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-p4 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffff888114e779b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 1ffff110229cef39 RCX: ffffffffaa67e1aa RDX: 0000000000000000 RSI: ffff88810a58ee00 RDI: ffff8881233180b0 RBP: ffffffffac38e9c0 R08: ffffffffaa67e17e R09: 0000000000000001 R10: ffffffffb91c5557 R11: fffffbfff7238aaa R12: ffff88810a58ee00 R13: ffff888114e77aa0 R14: 0000000000000000 R15: ffff8881233180b0 FS: 00007f946163c480(0000) GS:ffff88839f1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000001099c1000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __lookup_slow+0x116/0x2d0 ? page_put_link+0x120/0x120 ? __d_lookup+0xfc/0x320 ? d_lookup+0x49/0x90 lookup_one_len+0x13c/0x170 ? __lookup_slow+0x2d0/0x2d0 ? reiserfs_schedule_old_flush+0x31/0x130 reiserfs_lookup_privroot+0x64/0x150 reiserfs_fill_super+0x158c/0x1b90 ? finish_unfinished+0xb10/0xb10 ? bprintf+0xe0/0xe0 ? __mutex_lock_slowpath+0x30/0x30 ? __kasan_check_write+0x20/0x30 ? up_write+0x51/0xb0 ? set_blocksize+0x9f/0x1f0 mount_bdev+0x27c/0x2d0 ? finish_unfinished+0xb10/0xb10 ? reiserfs_kill_sb+0x120/0x120 get_super_block+0x19/0x30 legacy_get_tree+0x76/0xf0 vfs_get_tree+0x49/0x160 ? capable+0x1d/0x30 path_mount+0xacc/0x1380 ? putname+0x97/0xd0 ? finish_automount+0x450/0x450 ? kmem_cache_free+0xf8/0x5a0 ? putname+0x97/0xd0 do_mount+0xe2/0x110 ? path_mount+0x1380/0x1380 ? copy_mount_options+0x69/0x140 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae This is because 'root_inode' is initialized with wrong mode, and it's i_op is set to 'reiserfs_special_inode_operations'. Thus add check for 'root_inode' to fix the problem. Link: https://lore.kernel.org/r/20210702040743.1918552-1-yukuai3@huawei.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16cifs: do not share tcp sessions of dfs connectionsPaulo Alcantara2-3/+38
Make sure that we do not share tcp sessions of dfs mounts when mounting regular shares that connect to same server. DFS connections rely on a single instance of tcp in order to do failover properly in cifs_reconnect(). Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16zonefs: remove redundant null bio checkXianting Tian1-3/+0
bio_alloc() with __GFP_DIRECT_RECLAIM, which is included in GFP_NOFS, never fails, see comments in bio_alloc_bioset(). Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-07-16Merge tag 'configfs-5.13-1' of git://git.infradead.org/users/hch/configfsLinus Torvalds1-7/+22
Pull configfs fix from Christoph Hellwig: - fix the read and write iterators (Bart Van Assche) * tag 'configfs-5.13-1' of git://git.infradead.org/users/hch/configfs: configfs: fix the read and write iterators
2021-07-16SMB3.1.1: fix mount failure to some servers when compression enabledSteve French1-0/+1
When sending the compression context to some servers, they rejected the SMB3.1.1 negotiate protocol because they expect the compression context to have a data length of a multiple of 8. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16cifs: added WARN_ON for all the count decrementsShyam Prasad N2-0/+11
We have a few ref counters srv_count, ses_count and tc_count which we use for ref counting. Added a WARN_ON during the decrement of each of these counters to make sure that they don't go below their minimum values. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16cifs: fix missing null session check in mountSteve French1-1/+1
Although it is unlikely to be have ended up with a null session pointer calling cifs_try_adding_channels in cifs_mount. Coverity correctly notes that we are already checking for it earlier (when we return from do_dfs_failover), so at a minimum to clarify the code we should make sure we also check for it when we exit the loop so we don't end up calling cifs_try_adding_channels or mount_setup_tlink with a null ses pointer. Addresses-Coverity: 1505608 ("Derefernce after null check") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16cifs: handle reconnect of tcon when there is no cached dfs referralPaulo Alcantara1-4/+2
When there is no cached DFS referral of tcon->dfs_path, then reconnect to same share. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-15Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of ↵Linus Torvalds2-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo Silva: "This fixes many fall-through warnings when building with Clang and -Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for Clang, globally. It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough, we also want to avoid having more /* fall through */ comments being introduced. Contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this. We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable the warning for Clang" * tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits) Makefile: Enable -Wimplicit-fallthrough for Clang powerpc/smp: Fix fall-through warning for Clang dmaengine: mpc512x: Fix fall-through warning for Clang usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang powerpc/powernv: Fix fall-through warning for Clang MIPS: Fix unreachable code issue MIPS: Fix fall-through warnings for Clang ASoC: Mediatek: MT8183: Fix fall-through warning for Clang power: supply: Fix fall-through warnings for Clang dmaengine: ti: k3-udma: Fix fall-through warning for Clang s390: Fix fall-through warnings for Clang dmaengine: ipu: Fix fall-through warning for Clang iommu/arm-smmu-v3: Fix fall-through warning for Clang mmc: jz4740: Fix fall-through warning for Clang PCI: Fix fall-through warning for Clang scsi: libsas: Fix fall-through warning for Clang video: fbdev: Fix fall-through warning for Clang math-emu: Fix fall-through warning cpufreq: Fix fall-through warning for Clang drm/msm: Fix fall-through warning in msm_gem_new_impl() ...
2021-07-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-11/+45
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: mm (kasan, pagealloc, rmap, hmm, and hugetlb), and hfs" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/hugetlb: fix refs calculation from unaligned @vaddr hfs: add lock nesting notation to hfs_find_init hfs: fix high memory mapping in hfs_bnode_read hfs: add missing clean-up in hfs_fill_super lib/test_hmm: remove set but unused page variable mm: fix the try_to_unmap prototype for !CONFIG_MMU mm/page_alloc: further fix __alloc_pages_bulk() return value mm/page_alloc: correct return value when failing at preparing mm/page_alloc: avoid page allocator recursion with pagesets.lock held Revert "mm/page_alloc: make should_fail_alloc_page() static" kasan: fix build by including kernel.h kasan: add memzero init for unaligned size at DEBUG mm: move helper to check slub_debug_enabled
2021-07-15hfs: add lock nesting notation to hfs_find_initDesmond Cheong Zhi Xi2-1/+20
Syzbot reports a possible recursive lock in [1]. This happens due to missing lock nesting information. From the logs, we see that a call to hfs_fill_super is made to mount the hfs filesystem. While searching for the root inode, the lock on the catalog btree is grabbed. Then, when the parent of the root isn't found, a call to __hfs_bnode_create is made to create the parent of the root. This eventually leads to a call to hfs_ext_read_extent which grabs a lock on the extents btree. Since the order of locking is catalog btree -> extents btree, this lock hierarchy does not lead to a deadlock. To tell lockdep that this locking is safe, we add nesting notation to distinguish between catalog btrees, extents btrees, and attributes btrees (for HFS+). This has already been done in hfsplus. Link: https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585db [1] Link: https://lkml.kernel.org/r/20210701030756.58760-4-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Reported-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com Tested-by: syzbot+b718ec84a87b7e73ade4@syzkaller.appspotmail.com Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15hfs: fix high memory mapping in hfs_bnode_readDesmond Cheong Zhi Xi1-5/+20
Pages that we read in hfs_bnode_read need to be kmapped into kernel address space. However, currently only the 0th page is kmapped. If the given offset + length exceeds this 0th page, then we have an invalid memory access. To fix this, we kmap relevant pages one by one and copy their relevant portions of data. An example of invalid memory access occurring without this fix can be seen in the following crash report: ================================================================== BUG: KASAN: use-after-free in memcpy include/linux/fortify-string.h:191 [inline] BUG: KASAN: use-after-free in hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26 Read of size 2 at addr ffff888125fdcffe by task syz-executor5/4634 CPU: 0 PID: 4634 Comm: syz-executor5 Not tainted 5.13.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x195/0x1f8 lib/dump_stack.c:120 print_address_description.constprop.0+0x1d/0x110 mm/kasan/report.c:233 __kasan_report mm/kasan/report.c:419 [inline] kasan_report.cold+0x7b/0xd4 mm/kasan/report.c:436 check_region_inline mm/kasan/generic.c:180 [inline] kasan_check_range+0x154/0x1b0 mm/kasan/generic.c:186 memcpy+0x24/0x60 mm/kasan/shadow.c:65 memcpy include/linux/fortify-string.h:191 [inline] hfs_bnode_read+0xc4/0xe0 fs/hfs/bnode.c:26 hfs_bnode_read_u16 fs/hfs/bnode.c:34 [inline] hfs_bnode_find+0x880/0xcc0 fs/hfs/bnode.c:365 hfs_brec_find+0x2d8/0x540 fs/hfs/bfind.c:126 hfs_brec_read+0x27/0x120 fs/hfs/bfind.c:165 hfs_cat_find_brec+0x19a/0x3b0 fs/hfs/catalog.c:194 hfs_fill_super+0xc13/0x1460 fs/hfs/super.c:419 mount_bdev+0x331/0x3f0 fs/super.c:1368 hfs_mount+0x35/0x40 fs/hfs/super.c:457 legacy_get_tree+0x10c/0x220 fs/fs_context.c:592 vfs_get_tree+0x93/0x300 fs/super.c:1498 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x13f5/0x20e0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline] __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount fs/namespace.c:3433 [inline] __x64_sys_mount+0x2b8/0x340 fs/namespace.c:3433 do_syscall_64+0x37/0xc0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x45e63a Code: 48 c7 c2 bc ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d2 e8 88 04 00 00 0f 1f 84 00 00 00 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9404d410d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000020000248 RCX: 000000000045e63a RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007f9404d41120 RBP: 00007f9404d41120 R08: 00000000200002c0 R09: 0000000020000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 R13: 0000000000000003 R14: 00000000004ad5d8 R15: 0000000000000000 The buggy address belongs to the page: page:00000000dadbcf3e refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x125fdc flags: 0x2fffc0000000000(node=0|zone=2|lastcpupid=0x3fff) raw: 02fffc0000000000 ffffea000497f748 ffffea000497f6c8 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888125fdce80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888125fdcf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff888125fdcf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff888125fdd000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff888125fdd080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Link: https://lkml.kernel.org/r/20210701030756.58760-3-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15hfs: add missing clean-up in hfs_fill_superDesmond Cheong Zhi Xi1-5/+5
Patch series "hfs: fix various errors", v2. This series ultimately aims to address a lockdep warning in hfs_find_init reported by Syzbot [1]. The work done for this led to the discovery of another bug, and the Syzkaller repro test also reveals an invalid memory access error after clearing the lockdep warning. Hence, this series is broken up into three patches: 1. Add a missing call to hfs_find_exit for an error path in hfs_fill_super 2. Fix memory mapping in hfs_bnode_read by fixing calls to kmap 3. Add lock nesting notation to tell lockdep that the observed locking hierarchy is safe This patch (of 3): Before exiting hfs_fill_super, the struct hfs_find_data used in hfs_find_init should be passed to hfs_find_exit to be cleaned up, and to release the lock held on the btree. The call to hfs_find_exit is missing from an error path. We add it back in by consolidating calls to hfs_find_exit for error paths. Link: https://syzkaller.appspot.com/bug?id=f007ef1d7a31a469e3be7aeb0fde0769b18585db [1] Link: https://lkml.kernel.org/r/20210701030756.58760-1-desmondcheongzx@gmail.com Link: https://lkml.kernel.org/r/20210701030756.58760-2-desmondcheongzx@gmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15xfs: detect misaligned rtinherit directory extent size hintsDarrick J. Wong1-2/+16
If we encounter a directory that has been configured to pass on an extent size hint to a new realtime file and the hint isn't an integer multiple of the rt extent size, we should flag the hint for administrative review because that is a misconfiguration (that other parts of the kernel will fix automatically). Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-07-15xfs: fix an integer overflow error in xfs_growfs_rtDarrick J. Wong1-5/+5
During a realtime grow operation, we run a single transaction for each rt bitmap block added to the filesystem. This means that each step has to be careful to increase sb_rblocks appropriately. Fix the integer overflow error in this calculation that can happen when the extent size is very large. Found by running growfs to add a rt volume to a filesystem formatted with a 1g rt extent size. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-07-15xfs: improve FSGROWFSRT precondition checkingDarrick J. Wong1-7/+32
Improve the checking at the start of a realtime grow operation so that we avoid accidentally set a new extent size that is too large and avoid adding an rt volume to a filesystem with rmap or reflink because we don't support rt rmap or reflink yet. While we're at it, separate the checks so that we're only testing one aspect at a time. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-07-15xfs: don't expose misaligned extszinherit hints to userspaceDarrick J. Wong1-1/+18
Commit 603f000b15f2 changed xfs_ioctl_setattr_check_extsize to reject an attempt to set an EXTSZINHERIT extent size hint on a directory with RTINHERIT set if the hint isn't a multiple of the realtime extent size. However, I have recently discovered that it is possible to change the realtime extent size when adding a rt device to a filesystem, which means that the existence of directories with misaligned inherited hints is not an accident. As a result, it's possible that someone could have set a valid hint and added an rt volume with a different rt extent size, which invalidates the ondisk hints. After such a sequence, FSGETXATTR will report a misaligned hint, which FSSETXATTR will trip over, causing confusion if the user was doing the usual GET/SET sequence to change some other attribute. Change xfs_fill_fsxattr to omit the hint if it isn't aligned properly. Fixes: 603f000b15f2 ("xfs: validate extsz hints against rt extent size when rtinherit is set") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-07-15xfs: correct the narrative around misaligned rtinherit/extszinherit dirsDarrick J. Wong3-22/+24
While auditing the realtime growfs code, I realized that the GROWFSRT ioctl (and by extension xfs_growfs) has always allowed sysadmins to change the realtime extent size when adding a realtime section to the filesystem. Since we also have always allowed sysadmins to set RTINHERIT and EXTSZINHERIT on directories even if there is no realtime device, this invalidates the premise laid out in the comments added in commit 603f000b15f2. In other words, this is not a case of inadequate metadata validation. This is a case of nearly forgotten (and apparently untested) but supported functionality. Update the comments to reflect what we've learned, and remove the log message about correcting the misalignment. Fixes: 603f000b15f2 ("xfs: validate extsz hints against rt extent size when rtinherit is set") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-07-15xfs: reset child dir '..' entry when unlinking childDarrick J. Wong1-0/+13
While running xfs/168, I noticed a second source of post-shrink corruption errors causing shutdowns. Let's say that directory B has a low inode number and is a child of directory A, which has a high number. If B is empty but open, and unlinked from A, B's dotdot link continues to point to A. If A is then unlinked and the filesystem shrunk so that A is no longer a valid inode, a subsequent AIL push of B will trip the inode verifiers because the dotdot entry points outside of the filesystem. To avoid this problem, reset B's dotdot entry to the root directory when unlinking directories, since the root directory cannot be removed. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-07-15xfs: check for sparse inode clusters that cross new EOAG when shrinkingDarrick J. Wong3-0/+66
While running xfs/168, I noticed occasional write verifier shutdowns involving inodes at the very end of the filesystem. Existing inode btree validation code checks that all inode clusters are fully contained within the filesystem. However, due to inadequate checking in the fs shrink code, it's possible that there could be a sparse inode cluster at the end of the filesystem where the upper inodes of the cluster are marked as holes and the corresponding blocks are free. In this case, the last blocks in the AG are listed in the bnobt. This enables the shrink to proceed but results in a filesystem that trips the inode verifiers. Fix this by disallowing the shrink. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>