Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull NFS client fixes from Trond Myklebust:
- Fix a race in the RPCSEC_GSS upcall code that causes hung RPC calls
- Fix a broken coalescing test in the pNFS file layout driver
- Ensure that the access cache rcu path also applies the login test
- Fix up for a sparse warning
* tag 'nfs-for-6.2-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix up a sparse warning
NFS: Judge the file access cache's timestamp in rcu path
pNFS/filelayout: Fix coalescing test for single DS
SUNRPC: ensure the matching upcall is in-flight upon downcall
|
|
Pull cifs fixes from Steve French:
"cifs/smb3 client fixes:
- two multichannel fixes
- three reconnect fixes
- unmap fix"
* tag '6.2-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix interface count calculation during refresh
cifs: refcount only the selected iface during interface update
cifs: protect access of TCP_Server_Info::{dstaddr,hostname}
cifs: fix race in assemble_neg_contexts()
cifs: ignore ipc reconnect failures during dfs failover
cifs: Fix kmap_local_page() unmapping
|
|
Commit 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check") fixed
a build warning by turning a comment into a WARN_ON(), but it turns out
that syzbot then complains because it can trigger said warning with a
corrupted hfs image.
The warning actually does warn about a bad situation, but we are much
better off just handling it as the error it is. So rather than warn
about us doing bad things, stop doing the bad things and return -EIO.
While at it, also fix a memory leak that was introduced by an earlier
fix for a similar syzbot warning situation, and add a check for one case
that historically wasn't handled at all (ie neither comment nor
subsequent WARN_ON).
Reported-by: syzbot+7bb7cd3595533513a9e7@syzkaller.appspotmail.com
Fixes: 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check")
Fixes: 8d824e69d9f3 ("hfs: fix OOB Read in __hfs_brec_find")
Link: https://lore.kernel.org/lkml/000000000000dbce4e05f170f289@google.com/
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull ceph fixes from Ilya Dryomov:
"Two file locking fixes from Xiubo"
* tag 'ceph-for-6.2-rc3' of https://github.com/ceph/ceph-client:
ceph: avoid use-after-free in ceph_fl_release_lock()
ceph: switch to vfs_inode_has_locks() to fix file lock bug
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes from Jan Kara:
"Two fixups of the UDF changes that went into 6.2-rc1"
* tag 'fixes_for_v6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: initialize newblock to 0
udf: Fix extension of the last extent in the file
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more regression and regular fixes:
- regressions:
- fix assertion condition using = instead of ==
- fix false alert on bad tree level check
- fix off-by-one error in delalloc search during lseek
- fix compat ro feature check at read-write remount
- handle case when read-repair happens with ongoing device replace
- updated error messages"
* tag 'for-6.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix compat_ro checks against remount
btrfs: always report error in run_one_delayed_ref()
btrfs: handle case when repair happens with dev-replace
btrfs: fix off-by-one in delalloc search during lseek
btrfs: fix false alert on bad tree level check
btrfs: add error message for metadata level mismatch
btrfs: fix ASSERT em->len condition in btrfs_get_extent
|
|
The clang build reports this error
fs/udf/inode.c:805:6: error: variable 'newblock' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (*err < 0)
^~~~~~~~
newblock is never set before error handling jump.
Initialize newblock to 0 and remove redundant settings.
Fixes: d8b39db5fab8 ("udf: Handle error when adding extent to a file")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20221230175341.1629734-1-trix@redhat.com>
|
|
When extending the last extent in the file within the last block, we
wrongly computed the length of the last extent. This is mostly a
cosmetical problem since the extent does not contain any data and the
length will be fixed up by following operations but still.
Fixes: 1f3868f06855 ("udf: Fix extending file within last block")
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
The last fix to iface_count did fix the overcounting issue.
However, during each refresh, we could end up undercounting
the iface_count, if a match was found.
Fixing this by doing increments and decrements instead of
setting it to 0 before each parsing of server interfaces.
Fixes: 096bbeec7bd6 ("smb3: interface count displayed incorrectly")
Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When the server interface for a channel is not active anymore,
we have the logic to select an alternative interface. However
this was not breaking out of the loop as soon as a new alternative
was found. As a result, some interfaces may get refcounted unintentionally.
There was also a bug in checking if we found an alternate iface.
Fixed that too.
Fixes: b54034a73baf ("cifs: during reconnect, update interface if necessary")
Cc: stable@vger.kernel.org # 5.19+
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
- fix a null pointer dereference in f2fs_issue_flush, which occurs by
the combination of mount/remount options.
- fix a bug in per-block age-based extent_cache newly introduced in
6.2-rc1, which reported a wrong age information in extent_cache.
- fix a kernel panic if extent_tree was not created, which was caught
by a wrong BUG_ON
* tag 'f2fs-fix-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: let's avoid panic if extent_tree is not created
f2fs: should use a temp extent_info for lookup
f2fs: don't mix to use union values in extent_info
f2fs: initialize extent_cache parameter
f2fs: fix to avoid NULL pointer dereference in f2fs_issue_flush()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix a filecache UAF during NFSD shutdown
- Avoid exposing automounted mounts on NFS re-exports
* tag 'nfsd-6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: fix handling of readdir in v4root vs. mount upcall timeout
nfsd: shut down the NFSv4 state objects before the filecache
|
|
Use the appropriate locks to protect access of hostname and dstaddr
fields in cifs_tree_connect() as they might get changed by other
tasks.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Serialise access of TCP_Server_Info::hostname in
assemble_neg_contexts() by holding the server's mutex otherwise it
might end up accessing an already-freed hostname pointer from
cifs_reconnect() or cifs_resolve_server().
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If it failed to reconnect ipc used for getting referrals, we can just
ignore it as it is not required for reconnecting the share. The worst
case would be not being able to detect or chase nested links as long
as dfs root server is unreachable.
Before patch:
$ mount.cifs //root/dfs/link /mnt -o echo_interval=10,...
-> target share: /fs0/share
disconnect root & fs0
$ ls /mnt
ls: cannot access '/mnt': Host is down
connect fs0
$ ls /mnt
ls: cannot access '/mnt': Resource temporarily unavailable
After patch:
$ mount.cifs //root/dfs/link /mnt -o echo_interval=10,...
-> target share: /fs0/share
disconnect root & fs0
$ ls /mnt
ls: cannot access '/mnt': Host is down
connect fs0
$ ls /mnt
bar.rtf dir1 foo
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
kmap_local_page() requires kunmap_local() to unmap the mapping. In
addition memcpy_page() is provided to perform this common memcpy
pattern.
Replace the kmap_local_page() and broken kunmap() with memcpy_page()
Fixes: d406d26745ab ("cifs: skip alloc when request has no pages")
Reviewed-by: Paulo Alcantara <pc@cjr.nz>
Reviewed-by: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This patch avoids the below panic.
pc : __lookup_extent_tree+0xd8/0x760
lr : f2fs_do_write_data_page+0x104/0x87c
sp : ffffffc010cbb3c0
x29: ffffffc010cbb3e0 x28: 0000000000000000
x27: ffffff8803e7f020 x26: ffffff8803e7ed40
x25: ffffff8803e7f020 x24: ffffffc010cbb460
x23: ffffffc010cbb480 x22: 0000000000000000
x21: 0000000000000000 x20: ffffffff22e90900
x19: 0000000000000000 x18: ffffffc010c5d080
x17: 0000000000000000 x16: 0000000000000020
x15: ffffffdb1acdbb88 x14: ffffff888759e2b0
x13: 0000000000000000 x12: ffffff802da49000
x11: 000000000a001200 x10: ffffff8803e7ed40
x9 : ffffff8023195800 x8 : ffffff802da49078
x7 : 0000000000000001 x6 : 0000000000000000
x5 : 0000000000000006 x4 : ffffffc010cbba28
x3 : 0000000000000000 x2 : ffffffc010cbb480
x1 : 0000000000000000 x0 : ffffff8803e7ed40
Call trace:
__lookup_extent_tree+0xd8/0x760
f2fs_do_write_data_page+0x104/0x87c
f2fs_write_single_data_page+0x420/0xb60
f2fs_write_cache_pages+0x418/0xb1c
__f2fs_write_data_pages+0x428/0x58c
f2fs_write_data_pages+0x30/0x40
do_writepages+0x88/0x190
__writeback_single_inode+0x48/0x448
writeback_sb_inodes+0x468/0x9e8
__writeback_inodes_wb+0xb8/0x2a4
wb_writeback+0x33c/0x740
wb_do_writeback+0x2b4/0x400
wb_workfn+0xe4/0x34c
process_one_work+0x24c/0x5bc
worker_thread+0x3e8/0xa50
kthread+0x150/0x1b4
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Otherwise, __lookup_extent_tree() will override the given extent_info which will
be used by caller.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Let's explicitly use the defined values in block_age case only.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This can avoid confusing tracepoint values.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
With below two cases, it will cause NULL pointer dereference when
accessing SM_I(sbi)->fcc_info in f2fs_issue_flush().
a) If kthread_run() fails in f2fs_create_flush_cmd_control(), it will
release SM_I(sbi)->fcc_info,
- mount -o noflush_merge /dev/vda /mnt/f2fs
- mount -o remount,flush_merge /dev/vda /mnt/f2fs -- kthread_run() fails
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync
b) we will never allocate memory for SM_I(sbi)->fcc_info w/ below
testcase,
- mount -o ro /dev/vda /mnt/f2fs
- mount -o rw,remount /dev/vda /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync
In order to fix this issue, let change as below:
- fix error path handling in f2fs_create_flush_cmd_control().
- allocate SM_I(sbi)->fcc_info even if readonly is on.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
[BUG]
Even with commit 81d5d61454c3 ("btrfs: enhance unsupported compat RO
flags handling"), btrfs can still mount a fs with unsupported compat_ro
flags read-only, then remount it RW:
# btrfs ins dump-super /dev/loop0 | grep compat_ro_flags -A 3
compat_ro_flags 0x403
( FREE_SPACE_TREE |
FREE_SPACE_TREE_VALID |
unknown flag: 0x400 )
# mount /dev/loop0 /mnt/btrfs
mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.
^^^ RW mount failed as expected ^^^
# dmesg -t | tail -n5
loop0: detected capacity change from 0 to 1048576
BTRFS: device fsid cb5b82f5-0fdd-4d81-9b4b-78533c324afa devid 1 transid 7 /dev/loop0 scanned by mount (1146)
BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm
BTRFS info (device loop0): using free space tree
BTRFS error (device loop0): cannot mount read-write because of unknown compat_ro features (0x403)
BTRFS error (device loop0): open_ctree failed
# mount /dev/loop0 -o ro /mnt/btrfs
# mount -o remount,rw /mnt/btrfs
^^^ RW remount succeeded unexpectedly ^^^
[CAUSE]
Currently we use btrfs_check_features() to check compat_ro flags against
our current mount flags.
That function get reused between open_ctree() and btrfs_remount().
But for btrfs_remount(), the super block we passed in still has the old
mount flags, thus btrfs_check_features() still believes we're mounting
read-only.
[FIX]
Replace the existing @sb argument with @is_rw_mount.
As originally we only use @sb to determine if the mount is RW.
Now it's callers' responsibility to determine if the mount is RW, and
since there are only two callers, the check is pretty simple:
- caller in open_ctree()
Just pass !sb_rdonly().
- caller in btrfs_remount()
Pass !(*flags & SB_RDONLY), as our check should be against the new
flags.
Now we can correctly reject the RW remount:
# mount /dev/loop0 -o ro /mnt/btrfs
# mount -o remount,rw /mnt/btrfs
mount: /mnt/btrfs: mount point not mounted or bad option.
dmesg(1) may have more information after failed mount system call.
# dmesg -t | tail -n 1
BTRFS error (device loop0: state M): cannot mount read-write because of unknown compat_ro features (0x403)
Reported-by: Chung-Chiang Cheng <shepjeng@gmail.com>
Fixes: 81d5d61454c3 ("btrfs: enhance unsupported compat RO flags handling")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently we have a btrfs_debug() for run_one_delayed_ref() failure, but
if end users hit such problem, there will be no chance that
btrfs_debug() is enabled. This can lead to very little useful info for
debugging.
This patch will:
- Add extra info for error reporting
Including:
* logical bytenr
* num_bytes
* type
* action
* ref_mod
- Replace the btrfs_debug() with btrfs_err()
- Move the error reporting into run_one_delayed_ref()
This is to avoid use-after-free, the @node can be freed in the caller.
This error should only be triggered at most once.
As if run_one_delayed_ref() failed, we trigger the error message, then
causing the call chain to error out:
btrfs_run_delayed_refs()
`- btrfs_run_delayed_refs()
`- btrfs_run_delayed_refs_for_head()
`- run_one_delayed_ref()
And we will abort the current transaction in btrfs_run_delayed_refs().
If we have to run delayed refs for the abort transaction,
run_one_delayed_ref() will just cleanup the refs and do nothing, thus no
new error messages would be output.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
There is a bug report that a BUG_ON() in btrfs_repair_io_failure()
(originally repair_io_failure() in v6.0 kernel) got triggered when
replacing a unreliable disk:
BTRFS warning (device sda1): csum failed root 257 ino 2397453 off 39624704 csum 0xb0d18c75 expected csum 0x4dae9c5e mirror 3
kernel BUG at fs/btrfs/extent_io.c:2380!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 9 PID: 3614331 Comm: kworker/u257:2 Tainted: G OE 6.0.0-5-amd64 #1 Debian 6.0.10-2
Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.70 07/01/2021
Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
RIP: 0010:repair_io_failure+0x24a/0x260 [btrfs]
Call Trace:
<TASK>
clean_io_failure+0x14d/0x180 [btrfs]
end_bio_extent_readpage+0x412/0x6e0 [btrfs]
? __switch_to+0x106/0x420
process_one_work+0x1c7/0x380
worker_thread+0x4d/0x380
? rescuer_thread+0x3a0/0x3a0
kthread+0xe9/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
[CAUSE]
Before the BUG_ON(), we got some read errors from the replace target
first, note the mirror number (3, which is beyond RAID1 duplication,
thus it's read from the replace target device).
Then at the BUG_ON() location, we are trying to writeback the repaired
sectors back the failed device.
The check looks like this:
ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, logical,
&map_length, &bioc, mirror_num);
if (ret)
goto out_counter_dec;
BUG_ON(mirror_num != bioc->mirror_num);
But inside btrfs_map_block(), we can modify bioc->mirror_num especially
for dev-replace:
if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 &&
!need_full_stripe(op) && dev_replace->tgtdev != NULL) {
ret = get_extra_mirror_from_replace(fs_info, logical, *length,
dev_replace->srcdev->devid,
&mirror_num,
&physical_to_patch_in_first_stripe);
patch_the_first_stripe_for_dev_replace = 1;
}
Thus if we're repairing the replace target device, we're going to
trigger that BUG_ON().
But in reality, the read failure from the replace target device may be
that, our replace hasn't reached the range we're reading, thus we're
reading garbage, but with replace running, the range would be properly
filled later.
Thus in that case, we don't need to do anything but let the replace
routine to handle it.
[FIX]
Instead of a BUG_ON(), just skip the repair if we're repairing the
device replace target device.
Reported-by: 小太 <nospam@kota.moe>
Link: https://lore.kernel.org/linux-btrfs/CACsxjPYyJGQZ+yvjzxA1Nn2LuqkYqTCcUH43S=+wXhyf8S00Ag@mail.gmail.com/
CC: stable@vger.kernel.org # 6.0+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
During lseek, when searching for delalloc in a range that represents a
hole and that range has a length of 1 byte, we end up not doing the actual
delalloc search in the inode's io tree, resulting in not correctly
reporting the offset with data or a hole. This actually only happens when
the start offset is 0 because with any other start offset we round it down
by sector size.
Reproducer:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt/sdc
$ xfs_io -f -c "pwrite -q 0 1" /mnt/sdc/foo
$ xfs_io -c "seek -d 0" /mnt/sdc/foo
Whence Result
DATA EOF
It should have reported an offset of 0 instead of EOF.
Fix this by updating btrfs_find_delalloc_in_range() and count_range_bits()
to deal with inclusive ranges properly. These functions are already
supposed to work with inclusive end offsets, they just got it wrong in a
couple places due to off-by-one mistakes.
A test case for fstests will be added later.
Reported-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/20221223020509.457113-1-joanbrugueram@gmail.com/
Fixes: b6e833567ea1 ("btrfs: make hole and data seeking a lot more efficient")
CC: stable@vger.kernel.org # 6.1
Tested-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
There is a bug report that on a RAID0 NVMe btrfs system, under heavy
write load the filesystem can flip RO randomly.
With extra debugging, it shows some tree blocks failed to pass their
level checks, and if that happens at critical path of a transaction, we
abort the transaction:
BTRFS error (device nvme0n1p3): level verify failed on logical 5446121209856 mirror 1 wanted 0 found 1
BTRFS error (device nvme0n1p3: state A): Transaction aborted (error -5)
BTRFS: error (device nvme0n1p3: state A) in btrfs_finish_ordered_io:3343: errno=-5 IO failure
BTRFS info (device nvme0n1p3: state EA): forced readonly
[CAUSE]
The reporter has already bisected to commit 947a629988f1 ("btrfs: move
tree block parentness check into validate_extent_buffer()").
And with extra debugging, it shows we can have btrfs_tree_parent_check
filled with all zeros in the following call trace:
submit_one_bio+0xd4/0xe0
submit_extent_page+0x142/0x550
read_extent_buffer_pages+0x584/0x9c0
? __pfx_end_bio_extent_readpage+0x10/0x10
? folio_unlock+0x1d/0x50
btrfs_read_extent_buffer+0x98/0x150
read_tree_block+0x43/0xa0
read_block_for_search+0x266/0x370
btrfs_search_slot+0x351/0xd30
? lock_is_held_type+0xe8/0x140
btrfs_lookup_csum+0x63/0x150
btrfs_csum_file_blocks+0x197/0x6c0
? sched_clock_cpu+0x9f/0xc0
? lock_release+0x14b/0x440
? _raw_read_unlock+0x29/0x50
btrfs_finish_ordered_io+0x441/0x860
btrfs_work_helper+0xfe/0x400
? lock_is_held_type+0xe8/0x140
process_one_work+0x294/0x5b0
worker_thread+0x4f/0x3a0
? __pfx_worker_thread+0x10/0x10
kthread+0xf5/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
Currently we only copy the btrfs_tree_parent_check structure into bbio
at read_extent_buffer_pages() after we have assembled the bbio.
But as shown above, submit_extent_page() itself can already submit the
bbio, leaving the bbio->parent_check uninitialized, and cause the false
alert.
[FIX]
Instead of copying @check into bbio after bbio is assembled, we pass
@check in btrfs_bio_ctrl::parent_check, and copy the content of
parent_check in submit_one_bio() for metadata read.
By this we should be able to pass the needed info for metadata endio
verification, and fix the false alert.
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/
Fixes: 947a629988f1 ("btrfs: move tree block parentness check into validate_extent_buffer()")
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
From a recent regression report, we found that after commit 947a629988f1
("btrfs: move tree block parentness check into
validate_extent_buffer()") if we have a level mismatch (false alert
though), there is no error message at all.
This makes later debugging harder. This patch will add the proper error
message for such case.
Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The em->len value is supposed to be verified in the assertion condition
though we expect it to be same as the sectorsize.
Fixes: a196a8944f77 ("btrfs: do not reset extent map members for inline extents read")
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Tanmay Bhushan <007047221b@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"First batch of regression and regular fixes:
- regressions:
- fix error handling after conversion to qstr for paths
- fix raid56/scrub recovery caused by uninitialized variable
after conversion to error bitmaps
- restore qgroup backref lookup behaviour after recent
refactoring
- fix leak of device lists at module exit time
- fix resolving backrefs for inline extent followed by prealloc
- reset defrag ioctl buffer on memory allocation error"
* tag 'for-6.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix fscrypt name leak after failure to join log transaction
btrfs: scrub: fix uninitialized return value in recover_scrub_rbio
btrfs: fix resolving backrefs for inline extent followed by prealloc
btrfs: fix trace event name typo for FLUSH_DELAYED_REFS
btrfs: restore BTRFS_SEQ_LAST when looking up qgroup backref lookup
btrfs: fix leak of fs devices after removing btrfs module
btrfs: fix an error handling path in btrfs_defrag_leaves()
btrfs: fix an error handling path in btrfs_rename()
|
|
syzbot is reporting hung task at do_user_addr_fault() [1], for there is
a silent deadlock between PG_locked bit and ni_lock lock.
Since filemap_update_page() calls filemap_read_folio() after calling
folio_trylock() which will set PG_locked bit, ntfs_truncate() must not
call truncate_setsize() which will wait for PG_locked bit to be cleared
when holding ni_lock lock.
Link: https://lore.kernel.org/all/00000000000060d41f05f139aa44@google.com/
Link: https://syzkaller.appspot.com/bug?extid=bed15dbf10294aa4f2ae [1]
Reported-by: syzbot <syzbot+bed15dbf10294aa4f2ae@syzkaller.appspotmail.com>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Co-developed-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If v4 READDIR operation hits a mountpoint and gets back an error,
then it will include that entry in the reply and set RDATTR_ERROR for it
to the error.
That's fine for "normal" exported filesystems, but on the v4root, we
need to be more careful to only expose the existence of dentries that
lead to exports.
If the mountd upcall times out while checking to see whether a
mountpoint on the v4root is exported, then we have no recourse other
than to fail the whole operation.
Cc: Steve Dickson <steved@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216777
Reported-by: JianHong Yin <yin-jianhong@163.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org>
|
|
When ceph releasing the file_lock it will try to get the inode pointer
from the fl->fl_file, which the memory could already be released by
another thread in filp_close(). Because in VFS layer the fl->fl_file
doesn't increase the file's reference counter.
Will switch to use ceph dedicate lock info to track the inode.
And in ceph_fl_release_lock() we should skip all the operations if the
fl->fl_u.ceph.inode is not set, which should come from the request
file_lock. And we will set fl->fl_u.ceph.inode when inserting it to the
inode lock list, which is when copying the lock.
Link: https://tracker.ceph.com/issues/57986
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
For the POSIX locks they are using the same owner, which is the
thread id. And multiple POSIX locks could be merged into single one,
so when checking whether the 'file' has locks may fail.
For a file where some openers use locking and others don't is a
really odd usage pattern though. Locks are like stoplights -- they
only work if everyone pays attention to them.
Just switch ceph_get_caps() to check whether any locks are set on
the inode. If there are POSIX/OFD/FLOCK locks on the file at the
time, we should set CHECK_FILELOCK, regardless of what fd was used
to set the lock.
Fixes: ff5d913dfc71 ("ceph: return -EIO if read/write against filp that lost file locks")
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
sparse is warning about an incorrect RCU dereference.
fs/nfs/dir.c:2965:56: warning: incorrect type in argument 1 (different address spaces)
fs/nfs/dir.c:2965:56: expected struct cred const *
fs/nfs/dir.c:2965:56: got struct cred const [noderef] __rcu *const cred
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If the user's login time is newer than the cache's timestamp,
we expect the cache may be stale and need to clear.
The stale cache will remain in the list's tail if no other
users operate on that inode.
Once the user accesses the inode, the stale cache will be
returned in rcu path.
Signed-off-by: Chengen Du <chengen.du@canonical.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown". After a timer is set to this state, then it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore fixes from Kees Cook:
- Switch pmsg_lock to an rt_mutex to avoid priority inversion (John
Stultz)
- Correctly assign mem_type property (Luca Stefani)
* tag 'pstore-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: Properly assign mem_type property
pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES
pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion
|
|
Pull 9p updates from Dominique Martinet:
- improve p9_check_errors to check buffer size instead of msize when
possible (e.g. not zero-copy)
- some more syzbot and KCSAN fixes
- minor headers include cleanup
* tag '9p-for-6.2-rc1' of https://github.com/martinetd/linux:
9p/client: fix data race on req->status
net/9p: fix response size check in p9_check_errors()
net/9p: distinguish zero-copy requests
9p/xen: do not memcpy header into req->rc
9p: set req refcount to zero to avoid uninitialized usage
9p/net: Remove unneeded idr.h #include
9p/fs: Remove unneeded idr.h #include
|
|
If mem-type is specified in the device tree
it would end up overriding the record_size
field instead of populating mem_type.
As record_size is currently parsed after the
improper assignment with default size 0 it
continued to work as expected regardless of the
value found in the device tree.
Simply changing the target field of the struct
is enough to get mem-type working as expected.
Fixes: 9d843e8fafc7 ("pstore: Add mem_type property DT parsing support")
Cc: stable@vger.kernel.org
Signed-off-by: Luca Stefani <luca@osomprivacy.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222131049.286288-1-luca@osomprivacy.com
|
|
In commit 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex
to avoid priority inversion") I changed a lock to an rt_mutex.
However, its possible that CONFIG_RT_MUTEXES is not enabled,
which then results in a build failure, as the 0day bot detected:
https://lore.kernel.org/linux-mm/202212211244.TwzWZD3H-lkp@intel.com/
Thus this patch changes CONFIG_PSTORE_PMSG to select
CONFIG_RT_MUTEXES, which ensures the build will not fail.
Cc: Wei Wang <wvw@google.com>
Cc: Midas Chien<midaschieh@google.com>
Cc: Connor O'Brien <connoro@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kernel test robot <lkp@intel.com>
Cc: kernel-team@android.com
Fixes: 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221221051855.15761-1-jstultz@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull afs update from David Howells:
"A fix for a couple of missing resource counter decrements, two small
cleanups of now-unused bits of code and a patch to remove writepage
support from afs"
* tag 'afs-next-20221222' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Stop implementing ->writepage()
afs: remove afs_cache_netfs and afs_zap_permits() declarations
afs: remove variable nr_servers
afs: Fix lost servers_outstanding count
|
|
Currently, we shut down the filecache before trying to clean up the
stateids that depend on it. This leads to the kernel trying to free an
nfsd_file twice, and a refcount overput on the nf_mark.
Change the shutdown procedure to tear down all of the stateids prior
to shutting down the filecache.
Reported-and-tested-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Fixes: 5e113224c17e ("nfsd: nfsd_file cache entries should be per net namespace")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
We're trying to get rid of the ->writepage() hook[1]. Stop afs from using
it by unlocking the page and calling afs_writepages_region() rather than
folio_write_one().
A flag is passed to afs_writepages_region() to indicate that it should only
write a single region so that we don't flush the entire file in
->write_begin(), but do add other dirty data to the region being written to
try and reduce the number of RPC ops.
This requires ->migrate_folio() to be implemented, so point that at
filemap_migrate_folio() for files and also for symlinks and directories.
This can be tested by turning on the afs_folio_dirty tracepoint and then
doing something like:
xfs_io -c "w 2223 7000" -c "w 15000 22222" -c "w 23 7" /afs/my/test/foo
and then looking in the trace to see if the write at position 15000 gets
stored before page 0 gets dirtied for the write at position 23.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221113162902.883850-1-hch@lst.de/ [1]
Link: https://lore.kernel.org/r/166876785552.222254.4403222906022558715.stgit@warthog.procyon.org.uk/ # v1
|
|
afs_zap_permits() has been removed since
commit be080a6f43c4 ("afs: Overhaul permit caching").
afs_cache_netfs has been removed since
commit 523d27cda149 ("afs: Convert afs to use the new fscache API").
so remove the declare for them from header file.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20220909070353.1160228-1-cuigaosheng1@huawei.com/
|
|
Variable nr_servers is no longer being used, the last reference
to it was removed in commit 45df8462730d ("afs: Fix server list handling")
so clean up the code by removing it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221020173923.21342-1-colin.i.king@gmail.com/
|
|
The afs_fs_probe_dispatcher() work function is passed a count on
net->servers_outstanding when it is scheduled (which may come via its
timer). This is passed back to the work_item, passed to the timer or
dropped at the end of the dispatcher function.
But, at the top of the dispatcher function, there are two checks which
skip the rest of the function: if the network namespace is being destroyed
or if there are no fileservers to probe. These two return paths, however,
do not drop the count passed to the dispatcher, and so, sometimes, the
destruction of a network namespace, such as induced by rmmod of the kafs
module, may get stuck in afs_purge_servers(), waiting for
net->servers_outstanding to become zero.
Fix this by adding the missing decrements in afs_fs_probe_dispatcher().
Fixes: f6cbb368bcb0 ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/167164544917.2072364.3759519569649459359.stgit@warthog.procyon.org.uk/
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"cifs/smb3 client fixes, mostly related to reconnect and/or DFS:
- two important reconnect fixes: cases where status of recently
connected IPCs and shares were not being updated leaving them in an
incorrect state
- fix for older Windows servers that would return
STATUS_OBJECT_NAME_INVALID to query info requests on DFS links in a
namespace that contained non-ASCII characters, reducing number of
wasted roundtrips.
- fix for leaked -ENOMEM to userspace when cifs.ko couldn't perform
I/O due to a disconnected server, expired or deleted session.
- removal of all unneeded DFS related mount option string parsing
(now using fs_context for automounts)
- improve clarity/readability, moving various DFS related functions
out of fs/cifs/connect.c (which was getting too big to be readable)
to new file.
- Fix problem when large number of DFS connections. Allow sharing of
DFS connections and fix how the referral paths are matched
- Referral caching fix: Instead of looking up ipc connections to
refresh cached referrals, store direct dfs root server's IPC
pointer in new sessions so it can simply be accessed to either
refresh or create a new referral that such connections belong to.
- Fix to allow dfs root server's connections to also failover
- Optimized reconnect of nested DFS links
- Set correct status of IPC connections marked for reconnect"
* tag '6.2-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
cifs: don't leak -ENOMEM in smb2_open_file()
cifs: use origin fullpath for automounts
cifs: set correct status of tcon ipc when reconnecting
cifs: optimize reconnect of nested links
cifs: fix source pathname comparison of dfs supers
cifs: fix confusing debug message
cifs: don't block in dfs_cache_noreq_update_tgthint()
cifs: refresh root referrals
cifs: fix refresh of cached referrals
cifs: don't refresh cached referrals from unactive mounts
cifs: share dfs connections and supers
cifs: split out ses and tcon retrieval from mount_get_conns()
cifs: set resolved ip in sockaddr
cifs: remove unused smb3_fs_context::mount_options
cifs: get rid of mount options string parsing
cifs: use fs_context for automounts
cifs: reduce roundtrips on create/qinfo requests
cifs: set correct ipc status after initial tree connect
cifs: set correct tcon status after initial tree connect
|
|
https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
- added mount options 'hidedotfiles', 'nocase' and 'windows_names'
- fixed xfstests (tested on x86_64): generic/083 generic/263
generic/307 generic/465
- fix some logic errors
- code refactoring and dead code removal
* tag 'ntfs3_for_6.2' of https://github.com/Paragon-Software-Group/linux-ntfs3: (61 commits)
fs/ntfs3: Make if more readable
fs/ntfs3: Improve checking of bad clusters
fs/ntfs3: Fix wrong if in hdr_first_de
fs/ntfs3: Use ALIGN kernel macro
fs/ntfs3: Fix incorrect if in ntfs_set_acl_ex
fs/ntfs3: Check fields while reading
fs/ntfs3: Correct ntfs_check_for_free_space
fs/ntfs3: Restore correct state after ENOSPC in attr_data_get_block
fs/ntfs3: Changing locking in ntfs_rename
fs/ntfs3: Fixing wrong logic in attr_set_size and ntfs_fallocate
fs/ntfs3: atomic_open implementation
fs/ntfs3: Fix wrong indentations
fs/ntfs3: Change new sparse cluster processing
fs/ntfs3: Fixing work with sparse clusters
fs/ntfs3: Simplify ntfs_update_mftmirr function
fs/ntfs3: Remove unused functions
fs/ntfs3: Fix sparse problems
fs/ntfs3: Add ntfs_bitmap_weight_le function and refactoring
fs/ntfs3: Use _le variants of bitops functions
fs/ntfs3: Add functions to modify LE bitmaps
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull mount propagation fix from Christian Brauner:
"The propagate_mnt() function handles mount propagation when creating
mounts and propagates the source mount tree @source_mnt to all
applicable nodes of the destination propagation mount tree headed by
@dest_mnt.
Unfortunately it contains a bug where it fails to terminate at peers
of @source_mnt when looking up copies of the source mount that become
masters for copies of the source mount tree mounted on top of slaves
in the destination propagation tree causing a NULL dereference.
This fixes that bug (with a long commit message for a seven character
fix but hopefully it'll help us fix issues faster in the future rather
than having to go through the pain of having to relearn everything
once more)"
* tag 'fs.mount.propagation.fix.v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
pnode: terminate at peers of source
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, netfilter and can.
Current release - regressions:
- bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func
- rxrpc:
- fix security setting propagation
- fix null-deref in rxrpc_unuse_local()
- fix switched parameters in peer tracing
Current release - new code bugs:
- rxrpc:
- fix I/O thread startup getting skipped
- fix locking issues in rxrpc_put_peer_locked()
- fix I/O thread stop
- fix uninitialised variable in rxperf server
- fix the return value of rxrpc_new_incoming_call()
- microchip: vcap: fix initialization of value and mask
- nfp: fix unaligned io read of capabilities word
Previous releases - regressions:
- stop in-kernel socket users from corrupting socket's task_frag
- stream: purge sk_error_queue in sk_stream_kill_queues()
- openvswitch: fix flow lookup to use unmasked key
- dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
- devlink:
- hold region lock when flushing snapshots
- protect devlink dump by the instance lock
Previous releases - always broken:
- bpf:
- prevent leak of lsm program after failed attach
- resolve fext program type when checking map compatibility
- skbuff: account for tail adjustment during pull operations
- macsec: fix net device access prior to holding a lock
- bonding: switch back when high prio link up
- netfilter: flowtable: really fix NAT IPv6 offload
- enetc: avoid buffer leaks on xdp_do_redirect() failure
- unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
- dsa: microchip: remove IRQF_TRIGGER_FALLING in
request_threaded_irq"
* tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
net: fec: check the return value of build_skb()
net: simplify sk_page_frag
Treewide: Stop corrupting socket's task_frag
net: Introduce sk_use_task_frag in struct sock.
mctp: Remove device type check at unregister
net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
can: flexcan: avoid unbalanced pm_runtime_enable warning
Documentation: devlink: add missing toc entry for etas_es58x devlink doc
mctp: serial: Fix starting value for frame check sequence
nfp: fix unaligned io read of capabilities word
net: stream: purge sk_error_queue in sk_stream_kill_queues()
myri10ge: Fix an error handling path in myri10ge_probe()
net: microchip: vcap: Fix initialization of value and mask
rxrpc: Fix the return value of rxrpc_new_incoming_call()
rxrpc: rxperf: Fix uninitialised variable
rxrpc: Fix I/O thread stop
rxrpc: Fix switched parameters in peer tracing
rxrpc: Fix locking issues in rxrpc_put_peer_locked()
rxrpc: Fix I/O thread startup getting skipped
...
|