summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2023-03-23ksmbd: set FILE_NAMED_STREAMS attribute in FS_ATTRIBUTE_INFORMATIONNamjae Jeon1-0/+4
If vfs objects = streams_xattr in ksmbd.conf FILE_NAMED_STREAMS should be set to Attributes in FS_ATTRIBUTE_INFORMATION. MacOS client show "Format: SMB (Unknown)" on faked NTFS and no streams support. Cc: stable@vger.kernel.org Reported-by: Miao Lihua <441884205@qq.com> Tested-by: Miao Lihua <441884205@qq.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-23ksmbd: fix wrong signingkey creation when encryption is AES256Namjae Jeon1-2/+3
MacOS and Win11 support AES256 encrytion and it is included in the cipher array of encryption context. Especially on macOS, The most preferred cipher is AES256. Connecting to ksmbd fails on newer MacOS clients that support AES256 encryption. MacOS send disconnect request after receiving final session setup response from ksmbd. Because final session setup is signed with signing key was generated incorrectly. For signging key, 'L' value should be initialized to 128 if key size is 16bytes. Cc: stable@vger.kernel.org Reported-by: Miao Lihua <441884205@qq.com> Tested-by: Miao Lihua <441884205@qq.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22NFSv4: Fix hangs when recovering open state after a server rebootTrond Myklebust1-3/+2
When we're using a cached open stateid or a delegation in order to avoid sending a CLAIM_PREVIOUS open RPC call to the server, we don't have a new open stateid to present to update_open_stateid(). Instead rely on nfs4_try_open_cached(), just as if we were doing a normal open. Fixes: d2bfda2e7aa0 ("NFSv4: don't reprocess cached open CLAIM_PREVIOUS") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-03-22Merge tag 'nfsd-6.3-3' of ↵Linus Torvalds3-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash during NFS READs from certain client implementations - Address a minor kbuild regression in v6.3 * tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: don't replace page in rq_pages if it's a continuation of last page NFS & NFSD: Update GSS dependencies
2023-03-21Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linuxLinus Torvalds2-18/+19
Pull fsverity fixes from Eric Biggers: "Fix two significant performance issues with fsverity" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY fsverity: Remove WQ_UNBOUND from fsverity read workqueue
2023-03-21Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds2-13/+25
Pull fscrypt fix from Eric Biggers: "Fix a bug where when a filesystem was being unmounted, the fscrypt keyring was destroyed before inodes have been released by the Landlock LSM. This bug was found by syzbot" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref() fscrypt: improve fscrypt_destroy_keyring() documentation fscrypt: destroy keyring after security_sb_delete()
2023-03-21zonefs: Fix error message in zonefs_file_dio_append()Damien Le Moal1-1/+1
Since the expected write location in a sequential file is always at the end of the file (append write), when an invalid write append location is detected in zonefs_file_dio_append(), print the invalid written location instead of the expected write location. Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-21zonefs: Prevent uninitialized symbol 'size' warningDamien Le Moal1-1/+1
In zonefs_file_dio_append(), initialize the variable size to 0 to prevent compilation and static code analizers warning such as: New smatch warnings: fs/zonefs/file.c:441 zonefs_file_dio_append() error: uninitialized symbol 'size'. The warning is a false positive as size is never actually used uninitialized. No functional change. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202303191227.GL8Dprbi-lkp@intel.com/ Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-20Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds4-10/+17
Pull NFS client fixes from Anna Schumaker: - Fix /proc/PID/io read_bytes accounting - Fix setting NLM file_lock start and end during decoding testargs - Fix timing for setting access cache timestamps * tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Correct timing for assigning access cache timestamp lockd: set file_lock start and end when decoding nlm4 testargs NFS: Fix /proc/PID/io read_bytes for buffered reads
2023-03-19xfs: test dir/attr hash when loading moduleDarrick J. Wong4-0/+680
Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in ext4 against generic/454. The cause of this test failure was the unfortunate combination of setting an xattr name containing UTF8 encoded emoji, an xattr hash function that accepted a char pointer with no explicit signedness, signed type extension of those chars to an int, and the 6.2 build tools maintainers deciding to mandate -funsigned-char across the board. As a result, the ondisk extended attribute structure written out by 6.1 and 6.2 were not the same. This discrepancy, in fact, had been noticeable if a filesystem with such an xattr were moved between any two architectures that don't employ the same signedness of a raw "char" declaration. The only reason anyone noticed is that x86 gcc defaults to signed, and no such -funsigned-char update was made to e2fsprogs, so e2fsck immediately started reporting data corruption. After a day and a half of discussing how to handle this use case (xattrs with bit 7 set anywhere in the name) without breaking existing users, Linus merged his own patch and didn't tell the maintainer. None of the ext4 developers realized this until AUTOSEL announced that the commit had been backported to stable. In the end, this problem could have been detected much earlier if there had been any useful tests of hash function(s) in use inside ext4 to make sure that they always produce the same outputs given the same inputs. The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's vulnerable to this problem. However, let's avoid all this drama by adding our own self test to check that the da hash produces the same outputs for a static pile of inputs on various platforms. This enables us to fix any breakage that may result in a controlled fashion. The buffer and test data are identical to the patches submitted to xfsprogs. Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/ Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19xfs: add tracepoints for each of the externally visible allocatorsDarrick J. Wong2-0/+24
There are now five separate space allocator interfaces exposed to the rest of XFS for five different strategies to find space. Add tracepoints for each of them so that I can tell from a trace dump exactly which ones got called and what happened underneath them. Add a sixth so it's more obvious if an allocation actually happened. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_agsDarrick J. Wong1-1/+5
Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag want us to perform a non-blocking scan of the AGs for free space. There are no ordering constraints for non-blocking AGF lock acquisition, so the scan can freely start over at AG 0 even when minimum_agno > 0. This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent pointer patchset applied and the realtime volume enabled. I observed the following sequence as part of an xfs_dir_createname call: 0. Fragment the free space, then allocate nearly all the free space in all AGs except AG 0. 1. Create a directory in AG 2 and let it grow for a while. 2. Try to allocate 2 blocks to expand the dirent part of a directory. The space will be allocated out of AG 0, but the allocation will not be contiguous. This (I think) activates the LOWMODE allocator. 3. The bmapi call decides to convert from extents to bmbt format and tries to allocate 1 block. This allocation request calls xfs_alloc_vextent_start_ag with the inode number, which starts the scan at AG 2. We ignore AG 0 (with all its free space) and instead scrape AG 2 and 3 for more space. We find one block, but this now kicks t_highest_agno to 3. 4. The createname call decides it needs to split the dabtree. It tries to allocate even more space with xfs_alloc_vextent_start_ag, but now we're constrained to AG 3, and we don't find the space. The createname returns ENOSPC and the filesystem shuts down. This change fixes the problem by making the trylock scan wrap around to AG 0 if it doesn't like the AGs that it finds. Since the current transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and we take space from the AG that has plenty. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19Merge tag 'ext4_for_linus_urgent' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fix from Ted Ts'o: "Fix a double unlock bug on an error path in ext4, found by smatch and syzkaller" * tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix possible double unlock when moving a directory
2023-03-19fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()Eric Biggers1-0/+2
It is a bug for fscrypt_put_master_key_activeref() to see a NULL keyring. But it used to be possible due to the bug, now fixed, where fscrypt_destroy_keyring() was called before security_sb_delete(). To be consistent with how fscrypt_destroy_keyring() uses WARN_ON for the same issue, WARN and leak the fscrypt_master_key if the keyring is NULL instead of dereferencing the NULL pointer. This is a robustness improvement, not a fix. Link: https://lore.kernel.org/r/20230313221231.272498-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-19fscrypt: improve fscrypt_destroy_keyring() documentationEric Biggers1-10/+11
Document that fscrypt_destroy_keyring() must be called after all potentially-encrypted inodes have been evicted. Link: https://lore.kernel.org/r/20230313221231.272498-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-18ext4: fix possible double unlock when moving a directoryTheodore Ts'o1-3/+1
Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") Link: https://lore.kernel.org/r/5efbe1b9-ad8b-4a4f-b422-24824d2b775c@kili.mountain Reported-by: Dan Carpenter <error27@gmail.com> Reported-by: syzbot+0c73d1d8b952c5f3d714@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-18nfsd: don't replace page in rq_pages if it's a continuation of last pageJeff Layton1-1/+8
The splice read calls nfsd_splice_actor to put the pages containing file data into the svc_rqst->rq_pages array. It's possible however to get a splice result that only has a partial page at the end, if (e.g.) the filesystem hands back a short read that doesn't cover the whole page. nfsd_splice_actor will plop the partial page into its rq_pages array and return. Then later, when nfsd_splice_actor is called again, the remainder of the page may end up being filled out. At this point, nfsd_splice_actor will put the page into the array _again_ corrupting the reply. If this is done enough times, rq_next_page will overrun the array and corrupt the trailing fields -- the rq_respages and rq_next_page pointers themselves. If we've already added the page to the array in the last pass, don't add it to the array a second time when dealing with a splice continuation. This was originally handled properly in nfsd_splice_actor, but commit 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check for it. Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") Cc: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Dario Lesca <d.lesca@solinos.it> Tested-by: David Critch <dcritch@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-03-17cifs: check only tcon status on tcon related functionsShyam Prasad N4-11/+19
We had a couple of checks for session in cifs_tree_connect and cifs_mark_open_files_invalid, which were unnecessary. And that was done with ses_lock. Changed that to tc_lock too. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-17Merge tag '6.3-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds14-195/+118
Pull cifs client fixes from Steve French: "Seven cifs/smb3 client fixes, all also for stable: - four DFS fixes - multichannel reconnect fix - fix smb1 stats for cancel command - fix for set file size error path" * tag '6.3-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: use DFS root session instead of tcon ses cifs: return DFS root session id in DebugData cifs: fix use-after-free bug in refresh_cache_worker() cifs: set DFS root session in cifs_get_smb_ses() cifs: generate signkey for the channel that's reconnecting cifs: Fix smb2_set_path_size() cifs: Move the in_send statistic to __smb_send_rqst()
2023-03-16xfs: try to idiot-proof the allocatorsDarrick J. Wong1-0/+13
In porting his development branch to 6.3-rc1, yours truly has repeatedly screwed up the args->pag being fed to the xfs_alloc_vextent* functions. Add some debugging assertions to test the preconditions required of the callers. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-16fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITYEric Biggers1-12/+13
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing performance problems and is hindering adoption of fsverity. It was intended to solve a race condition where unverified pages might be left in the pagecache. But actually it doesn't solve it fully. Since the incomplete solution for this race condition has too much performance impact for it to be worth it, let's remove it for now. Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl") Cc: stable@vger.kernel.org Reviewed-by: Victor Hsieh <victorhsieh@google.com> Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-15btrfs: zoned: drop space_info->active_total_bytesNaohiro Aota4-43/+9
The space_info->active_total_bytes is no longer necessary as we now count the region of newly allocated block group as zone_unusable. Drop its usage. Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: zoned: count fresh BG region as zone unusableNaohiro Aota2-6/+26
The naming of space_info->active_total_bytes is misleading. It counts not only active block groups but also full ones which are previously active but now inactive. That confusion results in a bug not counting the full BGs into active_total_bytes on mount time. For a background, there are three kinds of block groups in terms of activation. 1. Block groups never activated 2. Block groups currently active 3. Block groups previously active and currently inactive (due to fully written or zone finish) What we really wanted to exclude from "total_bytes" is the total size of BGs #1. They seem empty and allocatable but since they are not activated, we cannot rely on them to do the space reservation. And, since BGs #1 never get activated, they should have no "used", "reserved" and "pinned" bytes. OTOH, BGs #3 can be counted in the "total", since they are already full we cannot allocate from them anyway. For them, "total_bytes == used + reserved + pinned + zone_unusable" should hold. Tracking #2 and #3 as "active_total_bytes" (current implementation) is confusing. And, tracking #1 and subtract that properly from "total_bytes" every time you need space reservation is cumbersome. Instead, we can count the whole region of a newly allocated block group as zone_unusable. Then, once that block group is activated, release [0 .. zone_capacity] from the zone_unusable counters. With this, we can eliminate the confusing ->active_total_bytes and the code will be common among regular and the zoned mode. Also, no additional counter is needed with this approach. Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: use temporary variable for space_info in btrfs_update_block_groupJosef Bacik1-10/+12
We do cache->space_info->counter += num_bytes; everywhere in here. This is makes the lines longer than they need to be, and will be especially noticeable when we add the active tracking in, so add a temp variable for the space_info so this is cleaner. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: rename BTRFS_FS_NO_OVERCOMMIT to BTRFS_FS_ACTIVE_ZONE_TRACKINGJosef Bacik3-8/+4
This flag only gets set when we're doing active zone tracking, and we're going to need to use this flag for things related to this behavior. Rename the flag to represent what it actually means for the file system so it can be used in other ways and still make sense. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: zoned: fix btrfs_can_activate_zone() to support DUP profileNaohiro Aota1-2/+12
btrfs_can_activate_zone() returns true if at least one device has one zone available for activation. This is OK for the single profile, but not OK for DUP profile. We need two zones to create a DUP block group. Fix it by properly handling the case with the profile flags. Fixes: 265f7237dd25 ("btrfs: zoned: allow DUP on meta-data block groups") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: fix compiler warning on SPARC/PA-RISC handling fscrypt_setup_filenameSweet Tea Dorminy1-1/+6
Commit 1ec49744ba83 ("btrfs: turn on -Wmaybe-uninitialized") exposed that on SPARC and PA-RISC, gcc is unaware that fscrypt_setup_filename() only returns negative error values or 0. This ultimately results in a maybe-uninitialized warning in btrfs_lookup_dentry(). Change to only return negative error values or 0 from fscrypt_setup_filename() at the relevant call site, and assert that no positive error codes are returned (which would have wider implications involving other users). Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/ Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: handle missing chunk mapping more gracefullyQu Wenruo1-1/+2
[BUG] During my scrub rework, I did a stupid thing like this: bio->bi_iter.bi_sector = stripe->logical; btrfs_submit_bio(fs_info, bio, stripe->mirror_num); Above bi_sector assignment is using logical address directly, which lacks ">> SECTOR_SHIFT". This results a read on a range which has no chunk mapping. This results the following crash: BTRFS critical (device dm-1): unable to find logical 11274289152 length 65536 assertion failed: !IS_ERR(em), in fs/btrfs/volumes.c:6387 Sure this is all my fault, but this shows a possible problem in real world, that some bit flip in file extents/tree block can point to unmapped ranges, and trigger above ASSERT(), or if CONFIG_BTRFS_ASSERT is not configured, cause invalid pointer access. [PROBLEMS] In the above call chain, we just don't handle the possible error from btrfs_get_chunk_map() inside __btrfs_map_block(). [FIX] The fix is straightforward, replace the ASSERT() with proper error handling (callers handle errors already). Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15cifs: use DFS root session instead of tcon sesPaulo Alcantara1-0/+1
Use DFS root session whenever possible to get new DFS referrals otherwise we might end up with an IPC tcon (tcon->ses->tcon_ipc) that doesn't respond to them. It should be safe accessing @ses->dfs_root_ses directly in cifs_inval_name_dfs_link_error() as it has same lifetime as of @tcon. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-15cifs: return DFS root session id in DebugDataPaulo Alcantara1-0/+5
Return the DFS root session id in /proc/fs/cifs/DebugData to make it easier to track which IPC tcon was used to get new DFS referrals for a specific connection, and aids in debugging. A simple output of it would be Sessions: 1) Address: 192.168.1.13 Uses: 1 Capability: 0x300067 Session Status: 1 Security type: RawNTLMSSP SessionId: 0xd80000000009 User: 0 Cred User: 0 DFS root session id: 0x128006c000035 Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-15cifs: fix use-after-free bug in refresh_cache_worker()Paulo Alcantara8-164/+67
The UAF bug occurred because we were putting DFS root sessions in cifs_umount() while DFS cache refresher was being executed. Make DFS root sessions have same lifetime as DFS tcons so we can avoid the use-after-free bug is DFS cache refresher and other places that require IPCs to get new DFS referrals on. Also, get rid of mount group handling in DFS cache as we no longer need it. This fixes below use-after-free bug catched by KASAN [ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56 [ 379.948096] [ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku #23 [ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014 [ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs] [ 379.949942] Call Trace: [ 379.950113] <TASK> [ 379.950260] dump_stack_lvl+0x50/0x67 [ 379.950510] print_report+0x16a/0x48e [ 379.950759] ? __virt_addr_valid+0xd8/0x160 [ 379.951040] ? __phys_addr+0x41/0x80 [ 379.951285] kasan_report+0xdb/0x110 [ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs] [ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs] [ 379.953637] ? __pfx___mutex_lock+0x10/0x10 [ 379.953915] ? lock_release+0xb6/0x720 [ 379.954167] ? __pfx_lock_acquire+0x10/0x10 [ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs] [ 379.954960] ? __pfx_wb_workfn+0x10/0x10 [ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs] [ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs] [ 379.956323] ? __pfx_lock_acquired+0x10/0x10 [ 379.956615] ? read_word_at_a_time+0xe/0x20 [ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220 [ 379.957235] process_one_work+0x535/0x990 [ 379.957509] ? __pfx_process_one_work+0x10/0x10 [ 379.957812] ? lock_acquired+0xb7/0x5f0 [ 379.958069] ? __list_add_valid+0x37/0xd0 [ 379.958341] ? __list_add_valid+0x37/0xd0 [ 379.958611] worker_thread+0x8e/0x630 [ 379.958861] ? __pfx_worker_thread+0x10/0x10 [ 379.959148] kthread+0x17d/0x1b0 [ 379.959369] ? __pfx_kthread+0x10/0x10 [ 379.959630] ret_from_fork+0x2c/0x50 [ 379.959879] </TASK> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-15cifs: set DFS root session in cifs_get_smb_ses()Paulo Alcantara6-13/+13
Set the DFS root session pointer earlier when creating a new SMB session to prevent racing with smb2_reconnect(), cifs_reconnect_tcon() and DFS cache refresher. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-15Merge tag 'mm-hotfixes-stable-2023-03-14-16-51' of ↵Linus Torvalds1-2/+17
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Eleven hotfixes. Four of these are cc:stable and the remainder address post-6.2 issues or aren't considered suitable for backporting. Seven of these fixes are for MM" * tag 'mm-hotfixes-stable-2023-03-14-16-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/damon/paddr: fix folio_nr_pages() after folio_put() in damon_pa_mark_accessed_or_deactivate() mm/damon/paddr: fix folio_size() call after folio_put() in damon_pa_young() ocfs2: fix data corruption after failed write migrate_pages: try migrate in batch asynchronously firstly migrate_pages: move split folios processing out of migrate_pages_batch() migrate_pages: fix deadlock in batched migration .mailmap: add Alexandre Ghiti personal email address mailmap: correct Dikshita Agarwal's Qualcomm email address mailmap: updates for Jarkko Sakkinen mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage mm: teach mincore_hugetlb about pte markers
2023-03-15fsverity: Remove WQ_UNBOUND from fsverity read workqueueNathan Huckleberry1-6/+6
WQ_UNBOUND causes significant scheduler latency on ARM64/Android. This is problematic for latency sensitive workloads, like I/O post-processing. Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related scheduler latency and improves app cold startup times by ~30ms. WQ_UNBOUND was also removed from the dm-verity workqueue for the same reason [1]. This code was tested by running Android app startup benchmarks and measuring how long the fsverity workqueue spent in the runnable state. Before Total workqueue scheduler latency: 553800us After Total workqueue scheduler latency: 18962us [1]: https://lore.kernel.org/all/20230202012348.885402-1-nhuck@google.com/ Signed-off-by: Nathan Huckleberry <nhuck@google.com> Fixes: 8a1d0f9cacc9 ("fs-verity: add data verification hooks for ->readpages()") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230310193325.620493-1-nhuck@google.com Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-15cifs: generate signkey for the channel that's reconnectingShyam Prasad N1-1/+1
Before my changes to how multichannel reconnects work, the primary channel was always used to do a non-binding session setup. With my changes, that is not the case anymore. Missed this place where channel at index 0 was forcibly updated with the signing key. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-15cifs: Fix smb2_set_path_size()Volker Lendecke1-7/+24
If cifs_get_writable_path() finds a writable file, smb2_compound_op() must use that file's FID and not the COMPOUND_FID. Cc: stable@vger.kernel.org Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-14NFS: Correct timing for assigning access cache timestampChengen Du1-1/+1
When the user's login time is newer than the cache's timestamp, the original entry in the RB-tree will be replaced by a new entry. Currently, the timestamp is only set if the entry is not found in the RB-tree, which can cause the timestamp to be undefined when the entry exists. This may result in a significant increase in ACCESS operations if the timestamp is set to zero. Signed-off-by: Chengen Du <chengen.du@canonical.com> Fixes: 0eb43812c027 ("NFS: Clear the file access cache upon login”) Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-03-14lockd: set file_lock start and end when decoding nlm4 testargsJeff Layton2-9/+13
Commit 6930bcbfb6ce dropped the setting of the file_lock range when decoding a nlm_lock off the wire. This causes the client side grant callback to miss matching blocks and reject the lock, only to rerequest it 30s later. Add a helper function to set the file_lock range from the start and end values that the protocol uses, and have the nlm_lock decoder call that to set up the file_lock args properly. Fixes: 6930bcbfb6ce ("lockd: detect and reject lock arguments that overflow") Reported-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Amir Goldstein <amir73il@gmail.com> Cc: stable@vger.kernel.org #6.0 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-03-14NFS: Fix /proc/PID/io read_bytes for buffered readsDave Wysochanski1-0/+3
Prior to commit 8786fde8421c ("Convert NFS from readpages to readahead"), nfs_readpages() used the old mm interface read_cache_pages() which called task_io_account_read() for each NFS page read. After this commit, nfs_readpages() is converted to nfs_readahead(), which now uses the new mm interface readahead_page(). The new interface requires callers to call task_io_account_read() themselves. In addition, to nfs_readahead() task_io_account_read() should also be called from nfs_read_folio(). Fixes: 8786fde8421c ("Convert NFS from readpages to readahead") Link: https://lore.kernel.org/linux-nfs/CAPt2mGNEYUk5u8V4abe=5MM5msZqmvzCVrtCP4Qw1n=gCHCnww@mail.gmail.com/ Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-03-14fscrypt: destroy keyring after security_sb_delete()Eric Biggers1-3/+12
fscrypt_destroy_keyring() must be called after all potentially-encrypted inodes were evicted; otherwise it cannot safely destroy the keyring. Since inodes that are in-use by the Landlock LSM don't get evicted until security_sb_delete(), this means that fscrypt_destroy_keyring() must be called *after* security_sb_delete(). This fixes a WARN_ON followed by a NULL dereference, only possible if Landlock was being used on encrypted files. Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") Cc: stable@vger.kernel.org Reported-by: syzbot+93e495f6a4f748827c88@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/00000000000044651705f6ca1e30@google.com Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230313221231.272498-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-12Merge tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-21/+40
Pull xfs fixes from Darrick Wong: - Fix a crash if mount time quotacheck fails when there are inodes queued for garbage collection. - Fix an off by one error when discarding folios after writeback failure. * tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix off-by-one-block in xfs_discard_folio() xfs: quotacheck failure can race with background inode inactivation
2023-03-12Merge tag 'vfs.misc.v6.3-rc2' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs fixes from Christian Brauner: - When allocating pages for a watch queue failed, we didn't return an error causing userspace to proceed even though all subsequent notifcations would be lost. Make sure to return an error. - Fix a misformed tree entry for the idmapping maintainers entry. - When setting file leases from an idmapped mount via generic_setlease() we need to take the idmapping into account otherwise taking a lease would fail from an idmapped mount. - Remove two redundant assignments, one in splice code and the other in locks code, that static checkers complained about. * tag 'vfs.misc.v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: filelocks: use mount idmapping for setlease permission check fs/locks: Remove redundant assignment to cmd splice: Remove redundant assignment to ret MAINTAINERS: repair a malformed T: entry in IDMAPPED MOUNTS watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
2023-03-12Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds11-30/+87
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Bug fixes and regressions for ext4, the most serious of which is a potential deadlock during directory renames that was introduced during the merge window discovered by a combination of syzbot and lockdep" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: zero i_disksize when initializing the bootloader inode ext4: make sure fs error flag setted before clear journal error ext4: commit super block if fs record error when journal record without error ext4, jbd2: add an optimized bmap for the journal inode ext4: fix WARNING in ext4_update_inline_data ext4: move where set the MAY_INLINE_DATA flag is set ext4: Fix deadlock during directory rename ext4: Fix comment about the 64BIT feature docs: ext4: modify the group desc size to 64 ext4: fix another off-by-one fsmap error on 1k block filesystems ext4: fix RENAME_WHITEOUT handling for inline directories ext4: make kobj_type structures constant ext4: fix cgroup writeback accounting with fs-layer encryption
2023-03-11ext4: zero i_disksize when initializing the bootloader inodeZhihao Cheng1-0/+1
If the boot loader inode has never been used before, the EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the i_size to 0. However, if the "never before used" boot loader has a non-zero i_size, then i_disksize will be non-zero, and the inconsistency between i_size and i_disksize can trigger a kernel warning: WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319 CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa RIP: 0010:ext4_file_write_iter+0xbc7/0xd10 Call Trace: vfs_write+0x3b1/0x5c0 ksys_write+0x77/0x160 __x64_sys_write+0x22/0x30 do_syscall_64+0x39/0x80 Reproducer: 1. create corrupted image and mount it: mke2fs -t ext4 /tmp/foo.img 200 debugfs -wR "sif <5> size 25700" /tmp/foo.img mount -t ext4 /tmp/foo.img /mnt cd /mnt echo 123 > file 2. Run the reproducer program: posix_memalign(&buf, 1024, 1024) fd = open("file", O_RDWR | O_DIRECT); ioctl(fd, EXT4_IOC_SWAP_BOOT); write(fd, buf, 1024); Fix this by setting i_disksize as well as i_size to zero when initiaizing the boot loader inode. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159 Cc: stable@kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: make sure fs error flag setted before clear journal errorYe Bin1-2/+4
Now, jounral error number maybe cleared even though ext4_commit_super() failed. This may lead to error flag miss, then fsck will miss to check file system deeply. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-3-yebin@huaweicloud.com
2023-03-11ext4: commit super block if fs record error when journal record without errorYe Bin1-0/+9
Now, 'es->s_state' maybe covered by recover journal. And journal errno maybe not recorded in journal sb as IO error. ext4_update_super() only update error information when 'sbi->s_add_error_count' large than zero. Then 'EXT4_ERROR_FS' flag maybe lost. To solve above issue just recover 'es->s_state' error flag after journal replay like error info. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-2-yebin@huaweicloud.com
2023-03-11ext4, jbd2: add an optimized bmap for the journal inodeTheodore Ts'o2-3/+29
The generic bmap() function exported by the VFS takes locks and does checks that are not necessary for the journal inode. So allow the file system to set a journal-optimized bmap function in journal->j_bmap. Reported-by: syzbot+9543479984ae9e576000@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=e4aaa78795e490421c79f76ec3679006c8ff4cf0 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: fix WARNING in ext4_update_inline_dataYe Bin1-0/+3
Syzbot found the following issue: EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni" fscrypt: AES-256-XTS using implementation "xts-aes-aesni" ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 Modules linked in: CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246 RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000 RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248 RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220 R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40 R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c FS: 0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __alloc_pages_node include/linux/gfp.h:237 [inline] alloc_pages_node include/linux/gfp.h:260 [inline] __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113 __do_kmalloc_node mm/slab_common.c:956 [inline] __kmalloc+0xfe/0x190 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] kzalloc include/linux/slab.h:720 [inline] ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346 ext4_update_inline_dir fs/ext4/inline.c:1115 [inline] ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307 ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385 ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772 ext4_create+0x36c/0x560 fs/ext4/namei.c:2817 lookup_open fs/namei.c:3413 [inline] open_last_lookups fs/namei.c:3481 [inline] path_openat+0x12ac/0x2dd0 fs/namei.c:3711 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_openat fs/open.c:1342 [inline] __se_sys_openat fs/open.c:1337 [inline] __x64_sys_openat+0x243/0x290 fs/open.c:1337 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue happens as follows: ext4_iget ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60 ext4_try_add_inline_entry __ext4_mark_inode_dirty ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44 ext4_xattr_shift_entries ->after shift i_inline_off is incorrect, actually is change to 176 ext4_try_add_inline_entry ext4_update_inline_dir get_max_inline_xattr_value_size if (EXT4_I(inode)->i_inline_off) entry = (struct ext4_xattr_entry *)((void *)raw_inode + EXT4_I(inode)->i_inline_off); free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)); ->As entry is incorrect, then 'free' may be negative ext4_update_inline_data value = kzalloc(len, GFP_NOFS); -> len is unsigned int, maybe very large, then trigger warning when 'kzalloc()' To resolve the above issue we need to update 'i_inline_off' after 'ext4_xattr_shift_entries()'. We do not need to set EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty() already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA when it is needed may trigger a BUG_ON in ext4_writepages(). Reported-by: syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: move where set the MAY_INLINE_DATA flag is setYe Bin2-2/+6
The only caller of ext4_find_inline_data_nolock() that needs setting of EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In ext4_write_inline_data_end() we just need to update inode->i_inline_off. Since we are going to add one more caller that does not need to set EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA out to ext4_iget_extra_inode(). Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-0/+1
Pull misc fixes from Al Viro: "pick_file() speculation fix + fix for alpha mis(merge,cherry-pick) The fs/file.c one is a genuine missing speculation barrier in pick_file() (reachable e.g. via close(2)). The alpha one is strictly speaking not a bug fix, but only because confusion between preempt_enable() and preempt_disable() is harmless on architecture without CONFIG_PREEMPT. Looks like alpha.git picked the wrong version of patch - that braino used to be there in early versions, but it had been fixed quite a while ago..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: prevent out-of-bounds array speculation when closing a file descriptor alpha: fix lazy-FPU mis(merged/applied/whatnot)