summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-11-20ACPI: resource: Skip IRQ override on ASUS ExpertBook B1402CVAHans de Goede1-0/+7
Like various other ASUS ExpertBook-s, the ASUS ExpertBook B1402CVA has an ACPI DSDT table that describes IRQ 1 as ActiveLow while the kernel overrides it to EdgeHigh. This prevents the keyboard from working. To fix this issue, add this laptop to the skip_override_table so that the kernel does not override IRQ 1. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218114 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20ACPI: video: Use acpi_device_fix_up_power_children()Hans de Goede1-1/+1
Commit 89c290ea7589 ("ACPI: video: Put ACPI video and its child devices into D0 on boot") introduced calling acpi_device_fix_up_power_extended() on the video card for which the ACPI video bus is the companion device. This unnecessarily touches the power-state of the GPU itself, while the issue it tries to address only requires calling _PS0 on the child devices. Touching the power-state of the GPU itself is causing suspend / resume issues on e.g. a Lenovo ThinkPad W530. Instead use acpi_device_fix_up_power_children(), which only touches the child devices, to fix this. Fixes: 89c290ea7589 ("ACPI: video: Put ACPI video and its child devices into D0 on boot") Reported-by: Owen T. Heisler <writer@owenh.net> Closes: https://lore.kernel.org/regressions/9f36fb06-64c4-4264-aaeb-4e1289e764c4@owenh.net/ Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/273 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218124 Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Owen T. Heisler <writer@owenh.net> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: 6.6+ <stable@vger.kernel.org> # 6.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20ACPI: PM: Add acpi_device_fix_up_power_children() functionHans de Goede2-0/+14
In some cases it is necessary to fix-up the power-state of an ACPI device's children without touching the ACPI device itself add a new acpi_device_fix_up_power_children() function for this. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: 6.6+ <stable@vger.kernel.org> # 6.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20ACPI: processor_idle: use raw_safe_halt() in acpi_idle_play_dead()David Woodhouse1-1/+1
Xen HVM guests were observed taking triple-faults when attempting to online a previously offlined vCPU. Investigation showed that the fault was coming from a failing call to lockdep_assert_irqs_disabled(), in load_current_idt() which was too early in the CPU bringup to actually catch the exception and report the failure cleanly. This was a false positive, caused by acpi_idle_play_dead() setting the per-cpu hardirqs_enabled flag by calling safe_halt(). Switch it to use raw_safe_halt() instead, which doesn't do so. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: 6.6+ <stable@vger.kernel.org> # 6.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20bcache: avoid NULL checking to c->root in run_cache_set()Coly Li1-1/+1
In run_cache_set() after c->root returned from bch_btree_node_get(), it is checked by IS_ERR_OR_NULL(). Indeed it is unncessary to check NULL because bch_btree_node_get() will not return NULL pointer to caller. This patch replaces IS_ERR_OR_NULL() by IS_ERR() for the above reason. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-11-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: add code comments for bch_btree_node_get() and __bch_btree_node_alloc()Coly Li1-0/+7
This patch adds code comments to bch_btree_node_get() and __bch_btree_node_alloc() that NULL pointer will not be returned and it is unnecessary to check NULL pointer by the callers of these routines. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-10-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: replace a mistaken IS_ERR() by IS_ERR_OR_NULL() in btree_gc_coalesce()Coly Li1-1/+1
Commit 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") do the following change inside btree_gc_coalesce(), 31 @@ -1340,7 +1340,7 @@ static int btree_gc_coalesce( 32 memset(new_nodes, 0, sizeof(new_nodes)); 33 closure_init_stack(&cl); 34 35 - while (nodes < GC_MERGE_NODES && !IS_ERR_OR_NULL(r[nodes].b)) 36 + while (nodes < GC_MERGE_NODES && !IS_ERR(r[nodes].b)) 37 keys += r[nodes++].keys; 38 39 blocks = btree_default_blocks(b->c) * 2 / 3; At line 35 the original r[nodes].b is not always allocatored from __bch_btree_node_alloc(), and possibly initialized as NULL pointer by caller of btree_gc_coalesce(). Therefore the change at line 36 is not correct. This patch replaces the mistaken IS_ERR() by IS_ERR_OR_NULL() to avoid potential issue. Fixes: 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") Cc: <stable@vger.kernel.org> # 6.5+ Cc: Zheng Wang <zyytlz.wz@163.com> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-9-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: fixup multi-threaded bch_sectors_dirty_init() wake-up raceMingzhe Zou1-1/+2
We get a kernel crash about "unable to handle kernel paging request": ```dmesg [368033.032005] BUG: unable to handle kernel paging request at ffffffffad9ae4b5 [368033.032007] PGD fc3a0d067 P4D fc3a0d067 PUD fc3a0e063 PMD 8000000fc38000e1 [368033.032012] Oops: 0003 [#1] SMP PTI [368033.032015] CPU: 23 PID: 55090 Comm: bch_dirtcnt[0] Kdump: loaded Tainted: G OE --------- - - 4.18.0-147.5.1.es8_24.x86_64 #1 [368033.032017] Hardware name: Tsinghua Tongfang THTF Chaoqiang Server/072T6D, BIOS 2.4.3 01/17/2017 [368033.032027] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0 [368033.032029] Code: 8b 02 48 85 c0 74 f6 48 89 c1 eb d0 c1 e9 12 83 e0 03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 c0 3d 02 00 48 03 04 cd 60 68 93 ad <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 02 [368033.032031] RSP: 0018:ffffbb48852abe00 EFLAGS: 00010082 [368033.032032] RAX: ffffffffad9ae4b5 RBX: 0000000000000246 RCX: 0000000000003bf3 [368033.032033] RDX: ffff97b0ff8e3dc0 RSI: 0000000000600000 RDI: ffffbb4884743c68 [368033.032034] RBP: 0000000000000001 R08: 0000000000000000 R09: 000007ffffffffff [368033.032035] R10: ffffbb486bb01000 R11: 0000000000000001 R12: ffffffffc068da70 [368033.032036] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 [368033.032038] FS: 0000000000000000(0000) GS:ffff97b0ff8c0000(0000) knlGS:0000000000000000 [368033.032039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [368033.032040] CR2: ffffffffad9ae4b5 CR3: 0000000fc3a0a002 CR4: 00000000003626e0 [368033.032042] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [368033.032043] bcache: bch_cached_dev_attach() Caching rbd479 as bcache462 on set 8cff3c36-4a76-4242-afaa-7630206bc70b [368033.032045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [368033.032046] Call Trace: [368033.032054] _raw_spin_lock_irqsave+0x32/0x40 [368033.032061] __wake_up_common_lock+0x63/0xc0 [368033.032073] ? bch_ptr_invalid+0x10/0x10 [bcache] [368033.033502] bch_dirty_init_thread+0x14c/0x160 [bcache] [368033.033511] ? read_dirty_submit+0x60/0x60 [bcache] [368033.033516] kthread+0x112/0x130 [368033.033520] ? kthread_flush_work_fn+0x10/0x10 [368033.034505] ret_from_fork+0x35/0x40 ``` The crash occurred when call wake_up(&state->wait), and then we want to look at the value in the state. However, bch_sectors_dirty_init() is not found in the stack of any task. Since state is allocated on the stack, we guess that bch_sectors_dirty_init() has exited, causing bch_dirty_init_thread() to be unable to handle kernel paging request. In order to verify this idea, we added some printing information during wake_up(&state->wait). We find that "wake up" is printed twice, however we only expect the last thread to wake up once. ```dmesg [ 994.641004] alcache: bch_dirty_init_thread() wake up [ 994.641018] alcache: bch_dirty_init_thread() wake up [ 994.641523] alcache: bch_sectors_dirty_init() init exit ``` There is a race. If bch_sectors_dirty_init() exits after the first wake up, the second wake up will trigger this bug("unable to handle kernel paging request"). Proceed as follows: bch_sectors_dirty_init kthread_run ==============> bch_dirty_init_thread(bch_dirtcnt[0]) ... ... atomic_inc(&state.started) ... ... ... atomic_read(&state.enough) ... ... atomic_set(&state->enough, 1) kthread_run ======================================================> bch_dirty_init_thread(bch_dirtcnt[1]) ... atomic_dec_and_test(&state->started) ... atomic_inc(&state.started) ... ... ... wake_up(&state->wait) ... atomic_read(&state.enough) atomic_dec_and_test(&state->started) ... ... wait_event(state.wait, atomic_read(&state.started) == 0) ... return ... wake_up(&state->wait) We believe it is very common to wake up twice if there is no dirty, but crash is an extremely low probability event. It's hard for us to reproduce this issue. We attached and detached continuously for a week, with a total of more than one million attaches and only one crash. Putting atomic_inc(&state.started) before kthread_run() can avoid waking up twice. Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-8-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: fixup lock c->root errorMingzhe Zou1-3/+11
We had a problem with io hung because it was waiting for c->root to release the lock. crash> cache_set.root -l cache_set.list ffffa03fde4c0050 root = 0xffff802ef454c800 crash> btree -o 0xffff802ef454c800 | grep rw_semaphore [ffff802ef454c858] struct rw_semaphore lock; crash> struct rw_semaphore ffff802ef454c858 struct rw_semaphore { count = { counter = -4294967297 }, wait_list = { next = 0xffff00006786fc28, prev = 0xffff00005d0efac8 }, wait_lock = { raw_lock = { { val = { counter = 0 }, { locked = 0 '\000', pending = 0 '\000' }, { locked_pending = 0, tail = 0 } } } }, osq = { tail = { counter = 0 } }, owner = 0xffffa03fdc586603 } The "counter = -4294967297" means that lock count is -1 and a write lock is being attempted. Then, we found that there is a btree with a counter of 1 in btree_cache_freeable. crash> cache_set -l cache_set.list ffffa03fde4c0050 -o|grep btree_cache [ffffa03fde4c1140] struct list_head btree_cache; [ffffa03fde4c1150] struct list_head btree_cache_freeable; [ffffa03fde4c1160] struct list_head btree_cache_freed; [ffffa03fde4c1170] unsigned int btree_cache_used; [ffffa03fde4c1178] wait_queue_head_t btree_cache_wait; [ffffa03fde4c1190] struct task_struct *btree_cache_alloc_lock; crash> list -H ffffa03fde4c1140|wc -l 973 crash> list -H ffffa03fde4c1150|wc -l 1123 crash> cache_set.btree_cache_used -l cache_set.list ffffa03fde4c0050 btree_cache_used = 2097 crash> list -s btree -l btree.list -H ffffa03fde4c1140|grep -E -A2 "^ lock = {" > btree_cache.txt crash> list -s btree -l btree.list -H ffffa03fde4c1150|grep -E -A2 "^ lock = {" > btree_cache_freeable.txt [root@node-3 127.0.0.1-2023-08-04-16:40:28]# pwd /var/crash/127.0.0.1-2023-08-04-16:40:28 [root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache.txt|grep counter|grep -v "counter = 0" [root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache_freeable.txt|grep counter|grep -v "counter = 0" counter = 1 We found that this is a bug in bch_sectors_dirty_init() when locking c->root: (1). Thread X has locked c->root(A) write. (2). Thread Y failed to lock c->root(A), waiting for the lock(c->root A). (3). Thread X bch_btree_set_root() changes c->root from A to B. (4). Thread X releases the lock(c->root A). (5). Thread Y successfully locks c->root(A). (6). Thread Y releases the lock(c->root B). down_write locked ---(1)----------------------┐ | | | down_read waiting ---(2)----┐ | | | ┌-------------┐ ┌-------------┐ bch_btree_set_root ===(3)========>> | c->root A | | c->root B | | | └-------------┘ └-------------┘ up_write ---(4)---------------------┘ | | | | | down_read locked ---(5)-----------┘ | | | up_read ---(6)-----------------------------┘ Since c->root may change, the correct steps to lock c->root should be the same as bch_root_usage(), compare after locking. static unsigned int bch_root_usage(struct cache_set *c) { unsigned int bytes = 0; struct bkey *k; struct btree *b; struct btree_iter iter; goto lock_root; do { rw_unlock(false, b); lock_root: b = c->root; rw_lock(false, b, b->level); } while (b != c->root); for_each_key_filter(&b->keys, k, &iter, bch_ptr_bad) bytes += bkey_bytes(k); rw_unlock(false, b); return (bytes * 100) / btree_bytes(c); } Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-7-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: fixup init dirty data errorsMingzhe Zou1-1/+4
We found that after long run, the dirty_data of the bcache device will have errors. This error cannot be eliminated unless re-register. We also found that reattach after detach, this error can accumulate. In bch_sectors_dirty_init(), all inode <= d->id keys will be recounted again. This is wrong, we only need to count the keys of the current device. Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded") Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-6-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: prevent potential division by zero errorRand Deeb1-1/+1
In SHOW(), the variable 'n' is of type 'size_t.' While there is a conditional check to verify that 'n' is not equal to zero before executing the 'do_div' macro, concerns arise regarding potential division by zero error in 64-bit environments. The concern arises when 'n' is 64 bits in size, greater than zero, and the lower 32 bits of it are zeros. In such cases, the conditional check passes because 'n' is non-zero, but the 'do_div' macro casts 'n' to 'uint32_t,' effectively truncating it to its lower 32 bits. Consequently, the 'n' value becomes zero. To fix this potential division by zero error and ensure precise division handling, this commit replaces the 'do_div' macro with div64_u64(). div64_u64() is designed to work with 64-bit operands, guaranteeing that division is performed correctly. This change enhances the robustness of the code, ensuring that division operations yield accurate results in all scenarios, eliminating the possibility of division by zero, and improving compatibility across different 64-bit environments. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Rand Deeb <rand.sec96@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-5-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: remove redundant assignment to variable cur_idxColin Ian King1-1/+1
Variable cur_idx is being initialized with a value that is never read, it is being re-assigned later in a while-loop. Remove the redundant assignment. Cleans up clang scan build warning: drivers/md/bcache/writeback.c:916:2: warning: Value stored to 'cur_idx' is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20231120052503.6122-4-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: check return value from btree_node_alloc_replacement()Coly Li1-0/+2
In btree_gc_rewrite_node(), pointer 'n' is not checked after it returns from btree_gc_rewrite_node(). There is potential possibility that 'n' is a non NULL ERR_PTR(), referencing such error code is not permitted in following code. Therefore a return value checking is necessary after 'n' is back from btree_node_alloc_replacement(). Signed-off-by: Coly Li <colyli@suse.de> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20231120052503.6122-3-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20bcache: avoid oversize memory allocation by small stripe_sizeColy Li2-0/+3
Arraies bcache->stripe_sectors_dirty and bcache->full_dirty_stripes are used for dirty data writeback, their sizes are decided by backing device capacity and stripe size. Larger backing device capacity or smaller stripe size make these two arraies occupies more dynamic memory space. Currently bcache->stripe_size is directly inherited from queue->limits.io_opt of underlying storage device. For normal hard drives, its limits.io_opt is 0, and bcache sets the corresponding stripe_size to 1TB (1<<31 sectors), it works fine 10+ years. But for devices do declare value for queue->limits.io_opt, small stripe_size (comparing to 1TB) becomes an issue for oversize memory allocations of bcache->stripe_sectors_dirty and bcache->full_dirty_stripes, while the capacity of hard drives gets much larger in recent decade. For example a raid5 array assembled by three 20TB hardrives, the raid device capacity is 40TB with typical 512KB limits.io_opt. After the math calculation in bcache code, these two arraies will occupy 400MB dynamic memory. Even worse Andrea Tomassetti reports that a 4KB limits.io_opt is declared on a new 2TB hard drive, then these two arraies request 2GB and 512MB dynamic memory from kzalloc(). The result is that bcache device always fails to initialize on his system. To avoid the oversize memory allocation, bcache->stripe_size should not directly inherited by queue->limits.io_opt from the underlying device. This patch defines BCH_MIN_STRIPE_SZ (4MB) as minimal bcache stripe size and set bcache device's stripe size against the declared limits.io_opt value from the underlying storage device, - If the declared limits.io_opt > BCH_MIN_STRIPE_SZ, bcache device will set its stripe size directly by this limits.io_opt value. - If the declared limits.io_opt < BCH_MIN_STRIPE_SZ, bcache device will set its stripe size by a value multiplying limits.io_opt and euqal or large than BCH_MIN_STRIPE_SZ. Then the minimal stripe size of a bcache device will always be >= 4MB. For a 40TB raid5 device with 512KB limits.io_opt, memory occupied by bcache->stripe_sectors_dirty and bcache->full_dirty_stripes will be 50MB in total. For a 2TB hard drive with 4KB limits.io_opt, memory occupied by these two arraies will be 2.5MB in total. Such mount of memory allocated for bcache->stripe_sectors_dirty and bcache->full_dirty_stripes is reasonable for most of storage devices. Reported-by: Andrea Tomassetti <andrea.tomassetti-opensource@devo.com> Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Eric Wheeler <bcache@lists.ewheeler.net> Link: https://lore.kernel.org/r/20231120052503.6122-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20drm/rockchip: vop: Fix color for RGB888/BGR888 format on VOP fullJonas Karlman1-3/+11
Use of DRM_FORMAT_RGB888 and DRM_FORMAT_BGR888 on e.g. RK3288, RK3328 and RK3399 result in wrong colors being displayed. The issue can be observed using modetest: modetest -s <connector_id>@<crtc_id>:1920x1080-60@RG24 modetest -s <connector_id>@<crtc_id>:1920x1080-60@BG24 Vendor 4.4 kernel apply an inverted rb swap for these formats on VOP full framework (IP version 3.x) compared to VOP little framework (2.x). Fix colors by applying different rb swap for VOP full framework (3.x) and VOP little framework (2.x) similar to vendor 4.4 kernel. Fixes: 85a359f25388 ("drm/rockchip: Add BGR formats to VOP") Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Tested-by: Diederik de Haas <didi.debian@cknow.org> Reviewed-by: Christopher Obbard <chris.obbard@collabora.com> Tested-by: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231026191500.2994225-1-jonas@kwiboo.se
2023-11-20io_uring/fs: consider link->flags when getting path for LINKATCharles Mirabile1-1/+1
In order for `AT_EMPTY_PATH` to work as expected, the fact that the user wants that behavior needs to make it to `getname_flags` or it will return ENOENT. Fixes: cf30da90bc3a ("io_uring: add support for IORING_OP_LINKAT") Cc: <stable@vger.kernel.org> Link: https://github.com/axboe/liburing/issues/995 Signed-off-by: Charles Mirabile <cmirabil@redhat.com> Link: https://lore.kernel.org/r/20231120105545.1209530-1-cmirabil@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-20libfs: getdents() should return 0 after reaching EODChuck Lever1-3/+11
The new directory offset helpers don't conform with the convention of getdents() returning no more entries once a directory file descriptor has reached the current end-of-directory. To address this, copy the logic from dcache_readdir() to mark the open directory file descriptor once EOD has been reached. Seeking resets the mark. Reported-by: Tavian Barnes <tavianator@tavianator.com> Closes: https://lore.kernel.org/linux-fsdevel/20231113180616.2831430-1-tavianator@tavianator.com/ Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/170043792492.4628.15646203084646716134.stgit@bazille.1015granger.net Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20xfs: respect the stable writes flag on the RT deviceChristoph Hellwig3-0/+23
Update the per-folio stable writes flag dependening on which device an inode resides on. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231025141020.192413-5-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflagsChristoph Hellwig1-10/+12
Introduce a local boolean variable if FS_XFLAG_REALTIME to make the checks for it more obvious, and de-densify a few of the conditionals using it to make them more readable while at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231025141020.192413-4-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20block: update the stable_writes flag in bdev_addChristoph Hellwig1-0/+2
Propagate the per-queue stable_write flags into each bdev inode in bdev_add. This makes sure devices that require stable writes have it set for I/O on the block device node as well. Note that this doesn't cover the case of a flag changing on a live device yet. We should handle that as well, but I plan to cover it as part of a more general rework of how changing runtime paramters on block devices works. Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag") Reported-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231025141020.192413-3-hch@lst.de Tested-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20filemap: add a per-mapping stable writes flagChristoph Hellwig3-1/+20
folio_wait_stable waits for writeback to finish before modifying the contents of a folio again, e.g. to support check summing of the data in the block integrity code. Currently this behavior is controlled by the SB_I_STABLE_WRITES flag on the super_block, which means it is uniform for the entire file system. This is wrong for the block device pseudofs which is shared by all block devices, or file systems that can use multiple devices like XFS witht the RT subvolume or btrfs (although btrfs currently reimplements folio_wait_stable anyway). Add a per-address_space AS_STABLE_WRITES flag to control the behavior in a more fine grained way. The existing SB_I_STABLE_WRITES is kept to initialize AS_STABLE_WRITES to the existing default which covers most cases. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231025141020.192413-2-hch@lst.de Tested-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20autofs: add: new_inode check in autofs_fill_super()Ian Kent1-35/+21
Add missing NULL check of root_inode in autofs_fill_super(). While we are at it simplify the logic by taking advantage of the VFS cleanup procedures and get rid of the goto error handling, as suggested by Al Viro. Signed-off-by: Ian Kent <raven@themaw.net> Link: https://lore.kernel.org/r/20231119225319.331156-1-raven@themaw.net Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Bill O'Donnell <billodo@redhat.com> Reported-by: <syzbot+662f87a8ef490f45fa64@syzkaller.appspotmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-20spi: axi-spi-engine: add support for any word sizeDavid Lechner1-16/+68
The AXI SPI Engine IP supports any word size from 1 to 32 bits. This adds support for this by setting the bits_per_word_mask and emitting the appropriate instruction to the SPI Engine each time a transfer requires a new word size. The functions that transfer tx/rx buffers from/to the SPI Engine registers (spi_engine_write_{tx,rx}_fifo()) as well as the function that creates the transfer instruction (spi_engine_gen_xfer()) also have to be modified to take into account the word size since xfer->len is the size of the buffers in bytes rather than words. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-14-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: add support for cs_offDavid Lechner1-10/+21
This adds support for the spi_transfer::cs_off flag to the AXI SPI Engine driver. The logic is copied from the generic spi_transfer_one_message() in spi.c. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-13-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: remove struct spi_engine::msgDavid Lechner1-32/+28
In the AXI SPI Engine driver, the struct spi_engine::msg member was used to keep track of the current message being processed. The SPI core is already keeping track of this, so we don't need to duplicate the effort. In most cases, we already have a pointer to the current message, so we can pass it directly to the functions that need it. In the one case where we don't have a pointer to the current message, we can get it from struct spi_controller::cur_msg. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-12-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: remove completed_id from driver stateDavid Lechner1-3/+3
In the AXI SPI Engine driver, the completed_id field in the driver state is only used in one function and the value does not need to persist between function calls. Therefore, it can be removed from the driver state and made a local variable in the function where it is used. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-11-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: use message_prepare/unprepareDavid Lechner1-12/+34
This modifies the AXI SPI Engine driver to make use of the message_prepare and message_unprepare callbacks. This separates the concerns of allocating and freeing the message state from the transfer_one_message callback. The main benfit of this is so that future callers of spi_finalize_current_message() will not have to do manual cleanup of the state. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-10-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: move msg state to new structDavid Lechner1-54/+96
This moves the message state in the AXI SPI Engine driver to a new struct spi_engine_msg_state. Previously, the driver state contained various pointers that pointed to memory owned by a struct spi_message. However, it did not set any of these pointers to NULL when a message was completed. This could lead to use after free bugs. Example of how this could happen: 1. SPI core calls into spi_engine_transfer_one_message() with msg1. 2. Assume something was misconfigured and spi_engine_tx_next() is not called enough times in interrupt callbacks for msg1 such that spi_engine->tx_xfer is never set to NULL before the msg1 completes. 3. SYNC interrupt is received and spi_finalize_current_message() is called for msg1. spi_engine->msg is set to NULL but no other message-specific state is reset. 4. Caller that sent msg1 is notified of the completion and frees msg1 and the associated xfers and tx/rx buffers. 4. SPI core calls into spi_engine_transfer_one_message() with msg2. 5. When spi_engine_tx_next() is called for msg2, spi_engine->tx_xfer is still be pointing to an xfer from msg1, which was already freed. spi_engine_xfer_next() tries to access xfer->transfer_list of one of the freed xfers and we get a segfault or undefined behavior. To avoid issues like this, instead of putting per-message state in the driver state struct, we can make use of the struct spi_message::state field to store a pointer to a new struct spi_engine_msg_state. This way, all of the state that belongs to specific message stays with that message and we don't have to remember to manually reset all aspects of the message state when a message is completed. Rather, a new state is allocated for each message. Most of the changes are just renames where the state is accessed. One place where this wasn't straightforward was the sync_id member. This has been changed to use ida_alloc_range() since we needed to separate the per-message sync_id from the per-controller next available sync_id. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-9-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: check for valid clock rateDavid Lechner1-0/+3
This adds a check for a valid SCLK rate in the axi-spi-engine driver during probe. A valid rate is required to get accurate timing for delays and by not allowing 0 we can avoid divide by zero errors later without additional checks. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-8-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: use devm_spi_register_controller()David Lechner1-9/+1
This replaces spi_register_controller() with devm_spi_register_controller() in the AXI SPI Engine driver. This saves us from having to call spi_unregister_controller() in the remove function. The remove function is also removed since it is no longer needed. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-7-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: use devm_request_irq()David Lechner1-8/+3
This replaces request_irq() with devm_request_irq() in the AXI SPI Engine driver. This simplifies the error path and removes the need to call free_irq() in the remove function. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-6-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: use devm action to reset hw on removeDavid Lechner1-5/+14
This moves the reset of the hardware to a devm action in the AXI SPI Engine driver. This will allow us to use devm on later calls in the probe function while preserving the order during cleanup. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-5-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: use devm_spi_alloc_host()David Lechner1-21/+10
This modifies the AXI SPI Engine driver to use devm_spi_alloc_host() instead of spi_alloc_host() to simplify the code a bit. In addition to simplifying the error paths in the probe function, we can also remove spi_controller_get/put() calls in the remove function since devm_spi_alloc_host() sets a flag to no longer decrement the controller reference count in the spi_unregister_controller() function. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-4-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: axi-spi-engine: simplify driver data allocationDavid Lechner1-6/+2
This simplifies the private data allocation in the AXI SPI Engine driver by making use of the feature built into the spi_alloc_host() function instead of doing it manually. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-3-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20MAINTAINERS: add entry for AXI SPI EngineDavid Lechner1-0/+10
The AXI SPI Engine driver has been in the kernel for many years but has lacked a proper maintainers entry. This adds a new entry for the driver and the devicetree bindings. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-2-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20dt-bindings: spi: axi-spi-engine: convert to yamlDavid Lechner2-31/+66
This converts the axi-spi-engine binding to yaml. There are a few minor fixes in the conversion: * Added maintainers. * Added descriptions for the clocks. * Fixed the double "@" in the example. * Added a comma between the clocks in the example. Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20231117-axi-spi-engine-series-1-v1-1-cc59db999b87@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20spi: ingenic: convert not to use dma_request_slave_channel()Christophe JAILLET1-6/+9
dma_request_slave_channel() is deprecated. dma_request_chan() should be used directly instead. Switch to the preferred function and update the error handling accordingly. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/1c88236b5d6bff0af902492ea9e066c8cb0dfef5.1700391566.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-20drm/i915: do not clean GT table on error pathAndrzej Hajda2-14/+1
The only task of intel_gt_release_all is to zero gt table. Calling it on error path prevents intel_gt_driver_late_release_all (called from i915_driver_late_release) to cleanup GTs, causing leakage. After i915_driver_late_release GT array is not used anymore so it does not need cleaning at all. Sample leak report: BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() ... Object 0xffff888113420040 @offset=64 Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 kmem_cache_alloc+0x25b/0x270 __i915_request_create+0x75/0x610 [i915] i915_request_create+0x109/0x290 [i915] __engines_record_defaults+0xca/0x440 [i915] intel_gt_init+0x275/0x430 [i915] i915_gem_init+0x135/0x2c0 [i915] i915_driver_probe+0x8d1/0xdc0 [i915] v2: removed whole intel_gt_release_all Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 Fixes: bec68cc9ea42 ("drm/i915: Prepare for multiple GTs") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231115-dont_clean_gt_on_error_path-v2-1-54250125470a@intel.com (cherry picked from commit e899505533852bf1da133f2f4c9a9655ff77f7e5) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-20drm/i915/dp_mst: Fix race between connector registration and setupImre Deak1-8/+8
After drm_connector_init() is called the connector is visible to the rest of the kernel via the drm_mode_config::connector_list. Make sure that the DSC AUX device and capabilities are setup by that time. Another race condition is adding the connector to the connector list before drm_connector_helper_add() sets the connector helper functions. That's an unrelated issue, for which the fix is for a follow-up. One solution would be adding the connector to the connector list only during its registration in drm_connector_register(). Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: 808b43fa7e56 ("drm/i915/dp_mst: Set connector DSC capabilities and decompression AUX") Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231030155843.2251023-2-imre.deak@intel.com (cherry picked from commit 560ea72c76eb6d0c59f77580414e64cc09f1093d) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-20md: fix bi_status reporting in md_end_clone_ioSong Liu1-1/+2
md_end_clone_io() may overwrite error status in orig_bio->bi_status with BLK_STS_OK. This could happen when orig_bio has BIO_CHAIN (split by md_submit_bio => bio_split_to_limits, for example). As a result, upper layer may miss error reported from md (or the device) and consider the failed IO was successful. Fix this by only update orig_bio->bi_status when current bio reports error and orig_bio is BLK_STS_OK. This is the same behavior as __bio_chain_endio(). Fixes: 10764815ff47 ("md: add io accounting for raid0 and raid5") Cc: stable@vger.kernel.org # v5.14+ Reported-by: Bhanu Victor DiCara <00bvd0+linux@gmail.com> Closes: https://lore.kernel.org/regressions/5727380.DvuYhMxLoT@bvd0/ Signed-off-by: Song Liu <song@kernel.org> Tested-by: Xiao Ni <xni@redhat.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
2023-11-20ata: pata_isapnp: Add missing error check for devm_ioport_map()Chen Ni1-0/+3
Add missing error return check for devm_ioport_map() and return the error if this function call fails. Fixes: 0d5ff566779f ("libata: convert to iomap") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-11-20Linux 6.7-rc2Linus Torvalds1-1/+1
2023-11-20Merge tag 'kbuild-fixes-v6.7' of ↵Linus Torvalds4-15/+13
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix section mismatch warning messages for riscv and loongarch - Remove CONFIG_IA64 left-over from linux/export-internal.h - Fix the location of the quotes for UIMAGE_NAME - Fix a memory leak bug in Kconfig * tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix memory leak from range properties kbuild: Move the single quotes for image name linux/export: clean up the IA-64 KSYM_FUNC macro modpost: fix section mismatch message for RELA
2023-11-20Merge tag 'irq_urgent_for_v6.7_rc2' of ↵Linus Torvalds1-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Borislav Petkov: - Flush the translation service tables to prevent unpredictable behavior on non-coherent GIC devices * tag 'irq_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Flush ITS tables correctly in non-coherent GIC designs
2023-11-20Merge tag 'x86_urgent_for_v6.7_rc2' of ↵Linus Torvalds4-23/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Ignore invalid x2APIC entries in order to not waste per-CPU data - Fix a back-to-back signals handling scenario when shadow stack is in use - A documentation fix - Add Kirill as TDX maintainer * tag 'x86_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/acpi: Ignore invalid x2APIC entries x86/shstk: Delay signal entry SSP write until after user accesses x86/Documentation: Indent 'note::' directive for protocol version number note MAINTAINERS: Add Intel TDX entry
2023-11-20Merge tag 'timers_urgent_for_v6.7_rc2' of ↵Linus Torvalds4-24/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Borislav Petkov: - Do the push of pending hrtimers away from a CPU which is being offlined earlier in the offlining process in order to prevent a deadlock * tag 'timers_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimers: Push pending hrtimers away from outgoing CPU earlier
2023-11-20Merge tag 'sched_urgent_for_v6.7_rc2' of ↵Linus Torvalds2-38/+135
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Fix virtual runtime calculation when recomputing a sched entity's weights - Fix wrongly rejected unprivileged poll requests to the cgroup psi pressure files - Make sure the load balancing is done by only one CPU * tag 'sched_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix the decision for load balance sched: psi: fix unprivileged polling against cgroups sched/eevdf: Fix vruntime adjustment on reweight
2023-11-20Merge tag 'locking_urgent_for_v6.7_rc2' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: - Fix a hardcoded futex flags case which lead to one robust futex test failure * tag 'locking_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Fix hardcoded flags
2023-11-20Merge tag 'perf_urgent_for_v6.7_rc2' of ↵Linus Torvalds2-5/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Borislav Petkov: - Make sure the context refcount is transferred too when migrating perf events * tag 'perf_urgent_for_v6.7_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix cpuctx refcounting
2023-11-19net: fill in MODULE_DESCRIPTION()s for SOCK_DIAG modulesJakub Kicinski12-0/+12
W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to all the sock diag modules in one fell swoop. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>