summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2025-07-07PM: domains: Detach on device_unbind_cleanup()Claudiu Beznea2-0/+5
The dev_pm_domain_attach() function is typically used in bus code alongside dev_pm_domain_detach(), often following patterns like: static int bus_probe(struct device *_dev) { struct bus_driver *drv = to_bus_driver(dev->driver); struct bus_device *dev = to_bus_device(_dev); int ret; // ... ret = dev_pm_domain_attach(_dev, true); if (ret) return ret; if (drv->probe) ret = drv->probe(dev); // ... } static void bus_remove(struct device *_dev) { struct bus_driver *drv = to_bus_driver(dev->driver); struct bus_device *dev = to_bus_device(_dev); if (drv->remove) drv->remove(dev); dev_pm_domain_detach(_dev); } When the driver's probe function uses devres-managed resources that depend on the power domain state, those resources are released later during device_unbind_cleanup(). Releasing devres-managed resources that depend on the power domain state after detaching the device from its PM domain can cause failures. For example, if the driver uses devm_pm_runtime_enable() in its probe function, and the device's clocks are managed by the PM domain, then during removal the runtime PM is disabled in device_unbind_cleanup() after the clocks have been removed from the PM domain. It may happen that the devm_pm_runtime_enable() action causes the device to be runtime- resumed. If the driver specific runtime PM APIs access registers directly, this will lead to accessing device registers without clocks being enabled. Similar issues may occur with other devres actions that access device registers. Add detach_power_off member to struct dev_pm_info, to be used later in device_unbind_cleanup() as the power_off argument for dev_pm_domain_detach(). This is a preparatory step toward removing dev_pm_domain_detach() calls from bus remove functions. Since the current PM domain detach functions (genpd_dev_pm_detach() and acpi_dev_pm_detach()) already set dev->pm_domain = NULL, there should be no issues with bus drivers that still call dev_pm_domain_detach() in their remove functions. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20250703112708.1621607-3-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07PM: domains: Add flags to specify power on attach/detachClaudiu Beznea12-15/+15
Calling dev_pm_domain_attach()/dev_pm_domain_detach() in bus driver probe/remove functions can affect system behavior when the drivers attached to the bus use devres-managed resources. Since devres actions may need to access device registers, calling dev_pm_domain_detach() too early, i.e., before these actions complete, can cause failures on some systems. One such example is Renesas RZ/G3S SoC-based platforms. If the device clocks are managed via PM domains, invoking dev_pm_domain_detach() in the bus driver's remove function removes the device's clocks from the PM domain, preventing any subsequent pm_runtime_resume*() calls from enabling those clocks. The second argument of dev_pm_domain_attach() specifies whether the PM domain should be powered on during attachment. Likewise, the second argument of dev_pm_domain_detach() indicates whether the domain should be powered off during detachment. Upcoming changes address the issue described above (initially for the platform bus only) by deferring the call to dev_pm_domain_detach() until after devres_release_all() in device_unbind_cleanup(). The detach_power_off field in struct dev_pm_info stores the detach power off info from the second argument of dev_pm_domain_attach(). Because there are cases where the device's PM domain power-on/off behavior must be conditional (e.g., in i2c_device_probe()), the patch introduces PD_FLAG_ATTACH_POWER_ON and PD_FLAG_DETACH_POWER_OFF flags to be passed to dev_pm_domain_attach(). Finally, dev_pm_domain_attach() and its users are updated to use the newly introduced PD_FLAG_ATTACH_POWER_ON and PD_FLAG_DETACH_POWER_OFF macros. This change is preparatory. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> # I2C Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/20250703112708.1621607-2-claudiu.beznea.uj@bp.renesas.com [ rjw: Changelog adjustments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07nbd: fix uaf in nbd_genl_connect() error pathZheng Qixing1-3/+3
There is a use-after-free issue in nbd: block nbd6: Receive control failed (result -104) block nbd6: shutting down sockets ================================================================== BUG: KASAN: slab-use-after-free in recv_work+0x694/0xa80 drivers/block/nbd.c:1022 Write of size 4 at addr ffff8880295de478 by task kworker/u33:0/67 CPU: 2 UID: 0 PID: 67 Comm: kworker/u33:0 Not tainted 6.15.0-rc5-syzkaller-00123-g2c89c1b655c0 #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Workqueue: nbd6-recv recv_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189 instrument_atomic_read_write include/linux/instrumented.h:96 [inline] atomic_dec include/linux/atomic/atomic-instrumented.h:592 [inline] recv_work+0x694/0xa80 drivers/block/nbd.c:1022 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400 kthread+0x3c2/0x780 kernel/kthread.c:464 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> nbd_genl_connect() does not properly stop the device on certain error paths after nbd_start_device() has been called. This causes the error path to put nbd->config while recv_work continue to use the config after putting it, leading to use-after-free in recv_work. This patch moves nbd_start_device() after the backend file creation. Reported-by: syzbot+48240bab47e705c53126@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68227a04.050a0220.f2294.00b5.GAE@google.com/T/ Fixes: 6497ef8df568 ("nbd: provide a way for userspace processes to identify device backends") Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250612132405.364904-1-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-07EDAC/ie31200: Add Intel Raptor Lake-HX SoCs supportQiuxu Zhuo1-0/+4
Intel Raptor Lake-HX SoC shares the same memory controller registers as Raptor Lake-S SoC. Add a compute die ID for Raptor Lake-HX SoCs with Out-of-Band ECC capability for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Laurens SEGHERS <laurens@rale.com> Link: https://lore.kernel.org/r/20250704151609.7833-4-qiuxu.zhuo@intel.com
2025-07-07EDAC/igen6: Add Intel Wildcat Lake SoCs supportLili Li1-0/+15
Intel Wildcat Lake is a mobile derivative of Panther Lake with one memory controller. Wildcat Lake SoCs share the same IBECC registers with Meteor Lake-P SoCs. Add a compute die ID and a new configuration structure for Wildcat Lake SoCs with In-Band ECC capability for EDAC support. Signed-off-by: Lili Li <lili.li@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250704151609.7833-3-qiuxu.zhuo@intel.com
2025-07-07drm/amdgpu: fix use-after-free in amdgpu_userq_suspend+0x51a/0x5a0Vitaly Prosyak1-1/+1
[ +0.000020] BUG: KASAN: slab-use-after-free in amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000817] Read of size 8 at addr ffff88812eec8c58 by task amd_pci_unplug/1733 [ +0.000027] CPU: 10 UID: 0 PID: 1733 Comm: amd_pci_unplug Tainted: G W 6.14.0+ #2 [ +0.000009] Tainted: [W]=WARN [ +0.000003] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000003] dump_stack_lvl+0x76/0xa0 [ +0.000011] print_report+0xce/0x600 [ +0.000009] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? kasan_complete_mode_report_info+0x76/0x200 [ +0.000007] ? kasan_addr_to_slab+0xd/0xb0 [ +0.000006] ? amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000707] kasan_report+0xbe/0x110 [ +0.000006] ? amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000541] __asan_report_load8_noabort+0x14/0x30 [ +0.000005] amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000535] ? stop_cpsch+0x396/0x600 [amdgpu] [ +0.000556] ? stop_cpsch+0x429/0x600 [amdgpu] [ +0.000536] ? __pfx_amdgpu_userq_suspend+0x10/0x10 [amdgpu] [ +0.000536] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? kgd2kfd_suspend+0x132/0x1d0 [amdgpu] [ +0.000542] amdgpu_device_fini_hw+0x581/0xe90 [amdgpu] [ +0.000485] ? down_write+0xbb/0x140 [ +0.000007] ? __mutex_unlock_slowpath.constprop.0+0x317/0x360 [ +0.000005] ? __pfx_amdgpu_device_fini_hw+0x10/0x10 [amdgpu] [ +0.000482] ? __kasan_check_write+0x14/0x30 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? up_write+0x55/0xb0 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? blocking_notifier_chain_unregister+0x6c/0xc0 [ +0.000008] amdgpu_driver_unload_kms+0x69/0x90 [amdgpu] [ +0.000484] amdgpu_pci_remove+0x93/0x130 [amdgpu] [ +0.000482] pci_device_remove+0xae/0x1e0 [ +0.000008] device_remove+0xc7/0x180 [ +0.000008] device_release_driver_internal+0x3d4/0x5a0 [ +0.000007] device_release_driver+0x12/0x20 [ +0.000004] pci_stop_bus_device+0x104/0x150 [ +0.000006] pci_stop_and_remove_bus_device_locked+0x1b/0x40 [ +0.000005] remove_store+0xd7/0xf0 [ +0.000005] ? __pfx_remove_store+0x10/0x10 [ +0.000006] ? __pfx__copy_from_iter+0x10/0x10 [ +0.000006] ? __pfx_dev_attr_store+0x10/0x10 [ +0.000006] dev_attr_store+0x3f/0x80 [ +0.000006] sysfs_kf_write+0x125/0x1d0 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __kasan_check_write+0x14/0x30 [ +0.000005] kernfs_fop_write_iter+0x2ea/0x490 [ +0.000005] ? rw_verify_area+0x70/0x420 [ +0.000005] ? __pfx_kernfs_fop_write_iter+0x10/0x10 [ +0.000006] vfs_write+0x90d/0xe70 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __pfx_vfs_write+0x10/0x10 [ +0.000004] ? local_clock+0x15/0x30 [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_slab_free+0x5f/0x80 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_check_read+0x11/0x20 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? fdget_pos+0x1d3/0x500 [ +0.000007] ksys_write+0x119/0x220 [ +0.000005] ? putname+0x1c/0x30 [ +0.000006] ? __pfx_ksys_write+0x10/0x10 [ +0.000007] __x64_sys_write+0x72/0xc0 [ +0.000006] x64_sys_call+0x18ab/0x26f0 [ +0.000006] do_syscall_64+0x7c/0x170 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __pfx___x64_sys_openat+0x10/0x10 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_check_read+0x11/0x20 [ +0.000003] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? fpregs_assert_state_consistent+0x21/0xb0 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? syscall_exit_to_user_mode+0x4e/0x240 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? do_syscall_64+0x88/0x170 [ +0.000003] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? irqentry_exit+0x43/0x50 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? exc_page_fault+0x7c/0x110 [ +0.000006] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000006] RIP: 0033:0x7480c0b14887 [ +0.000005] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [ +0.000005] RSP: 002b:00007fff142b0058 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ +0.000006] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007480c0b14887 [ +0.000003] RDX: 0000000000000001 RSI: 00007480c0e7365a RDI: 0000000000000004 [ +0.000003] RBP: 00007fff142b0080 R08: 0000563b2e73c170 R09: 0000000000000000 [ +0.000003] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff142b02f8 [ +0.000003] R13: 0000563b159a72a9 R14: 0000563b159a9d48 R15: 00007480c0f19040 [ +0.000008] </TASK> [ +0.000445] Allocated by task 427 on cpu 5 at 29.342331s: [ +0.000011] kasan_save_stack+0x28/0x60 [ +0.000006] kasan_save_track+0x18/0x70 [ +0.000006] kasan_save_alloc_info+0x38/0x60 [ +0.000005] __kasan_kmalloc+0xc1/0xd0 [ +0.000006] __kmalloc_cache_noprof+0x1bd/0x430 [ +0.000007] amdgpu_driver_open_kms+0x172/0x760 [amdgpu] [ +0.000493] drm_file_alloc+0x569/0x9a0 [ +0.000007] drm_client_init+0x1b7/0x410 [ +0.000007] drm_fbdev_client_setup+0x174/0x470 [ +0.000006] drm_client_setup+0x8a/0xf0 [ +0.000006] amdgpu_pci_probe+0x510/0x10c0 [amdgpu] [ +0.000483] local_pci_probe+0xe7/0x1b0 [ +0.000006] pci_device_probe+0x5bf/0x890 [ +0.000006] really_probe+0x1fd/0x950 [ +0.000005] __driver_probe_device+0x307/0x410 [ +0.000006] driver_probe_device+0x4e/0x150 [ +0.000005] __driver_attach+0x223/0x510 [ +0.000006] bus_for_each_dev+0x102/0x1a0 [ +0.000005] driver_attach+0x3d/0x60 [ +0.000006] bus_add_driver+0x309/0x650 [ +0.000005] driver_register+0x13d/0x490 [ +0.000006] __pci_register_driver+0x1ee/0x2b0 [ +0.000006] rfcomm_dlc_clear_state+0x69/0x220 [rfcomm] [ +0.000011] do_one_initcall+0x9c/0x3e0 [ +0.000007] do_init_module+0x29e/0x7f0 [ +0.000006] load_module+0x5c75/0x7c80 [ +0.000006] init_module_from_file+0x106/0x180 [ +0.000006] idempotent_init_module+0x377/0x740 [ +0.000006] __x64_sys_finit_module+0xd7/0x180 [ +0.000006] x64_sys_call+0x1f0b/0x26f0 [ +0.000006] do_syscall_64+0x7c/0x170 [ +0.000005] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000013] Freed by task 1733 on cpu 5 at 59.907086s: [ +0.000011] kasan_save_stack+0x28/0x60 [ +0.000006] kasan_save_track+0x18/0x70 [ +0.000005] kasan_save_free_info+0x3b/0x60 [ +0.000005] __kasan_slab_free+0x54/0x80 [ +0.000006] kfree+0x127/0x470 [ +0.000006] amdgpu_driver_postclose_kms+0x455/0x760 [amdgpu] [ +0.000493] drm_file_free.part.0+0x5b1/0xba0 [ +0.000006] drm_file_free+0x13/0x30 [ +0.000006] drm_client_release+0x1c4/0x2b0 [ +0.000006] drm_fbdev_ttm_fb_destroy+0xd2/0x120 [drm_ttm_helper] [ +0.000007] put_fb_info+0x97/0xe0 [ +0.000007] unregister_framebuffer+0x197/0x380 [ +0.000005] drm_fb_helper_unregister_info+0x94/0x100 [ +0.000005] drm_fbdev_client_unregister+0x3c/0x80 [ +0.000007] drm_client_dev_unregister+0x144/0x330 [ +0.000006] drm_dev_unregister+0x49/0x1b0 [ +0.000006] drm_dev_unplug+0x4c/0xd0 [ +0.000006] amdgpu_pci_remove+0x58/0x130 [amdgpu] [ +0.000484] pci_device_remove+0xae/0x1e0 [ +0.000008] device_remove+0xc7/0x180 [ +0.000007] device_release_driver_internal+0x3d4/0x5a0 [ +0.000006] device_release_driver+0x12/0x20 [ +0.000007] pci_stop_bus_device+0x104/0x150 [ +0.000006] pci_stop_and_remove_bus_device_locked+0x1b/0x40 [ +0.000006] remove_store+0xd7/0xf0 [ +0.000006] dev_attr_store+0x3f/0x80 [ +0.000005] sysfs_kf_write+0x125/0x1d0 [ +0.000006] kernfs_fop_write_iter+0x2ea/0x490 [ +0.000006] vfs_write+0x90d/0xe70 [ +0.000006] ksys_write+0x119/0x220 [ +0.000006] __x64_sys_write+0x72/0xc0 [ +0.000006] x64_sys_call+0x18ab/0x26f0 [ +0.000005] do_syscall_64+0x7c/0x170 [ +0.000006] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000012] The buggy address belongs to the object at ffff88812eec8000 which belongs to the cache kmalloc-rnd-07-4k of size 4096 [ +0.000016] The buggy address is located 3160 bytes inside of freed 4096-byte region [ffff88812eec8000, ffff88812eec9000) [ +0.000023] The buggy address belongs to the physical page: [ +0.000009] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12eec8 [ +0.000007] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ +0.000005] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) [ +0.000007] page_type: f5(slab) [ +0.000008] raw: 0017ffffc0000040 ffff888100054500 dead000000000122 0000000000000000 [ +0.000005] raw: 0000000000000000 0000000080040004 00000000f5000000 0000000000000000 [ +0.000006] head: 0017ffffc0000040 ffff888100054500 dead000000000122 0000000000000000 [ +0.000005] head: 0000000000000000 0000000080040004 00000000f5000000 0000000000000000 [ +0.000006] head: 0017ffffc0000003 ffffea0004bbb201 ffffffffffffffff 0000000000000000 [ +0.000005] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 [ +0.000005] page dumped because: kasan: bad access detected [ +0.000010] Memory state around the buggy address: [ +0.000009] ffff88812eec8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff88812eec8b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] >ffff88812eec8c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ^ [ +0.000010] ffff88812eec8c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ffff88812eec8d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ================================================================== The use-after-free occurs because a delayed work item (`suspend_work`) may still be pending or running when resources it accesses are freed during device removal or file close. The previous code used `flush_work(&fpriv->evf_mgr.suspend_work.work)`, which does not wait for delayed work that has not yet started. As a result, the delayed work could run after its memory was freed, causing a use-after-free. By switching to `flush_delayed_work(&fpriv->evf_mgr.suspend_work)`, we ensure that the kernel waits for both queued and delayed work to finish before freeing memory, closing this race. Fixes: adba0929736a ("drm/amdgpu: Fix Illegal opcode in command stream Error") Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07EDAC/i10nm: Add Intel Granite Rapids-D supportQiuxu Zhuo1-1/+11
The Granite Rapids-D CPU model uses memory controller registers similar to those of the Granite Rapids server CPU but with a different memory controller MMIO base. Add the Granite Rapids-D CPU model ID and use the new memory controller MMIO base for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: VikasX Chougule <vikasx.chougule@intel.com> Link: https://lore.kernel.org/r/20250704151609.7833-2-qiuxu.zhuo@intel.com
2025-07-07Revert "drm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini"Vitaly Prosyak2-4/+15
This reverts commit 5fb90421fa0fbe0a968274912101fe917bf1c47b. The original patch moved `amdgpu_userq_mgr_fini()` to the driver's `postclose` callback, which is called after `drm_gem_release()` in the DRM file cleanup sequence.If a user application crashes or aborts without cleaning up its user queues, 'drm_gem_release()` may free GEM objects that are still referenced by active user queues, leading to use-after-free. By reverting, we ensure that user queues are disabled and cleaned up before any GEM objects are released, preventing this class of bug. However, this reintroduces a race during PCI hot-unplug, where device removal can race with per-file cleanup, leading to use-after-free in suspend/unplug paths. This will be fixed in the next patch. Fixes: 5fb90421fa0f ("drm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini+0x70c") Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amd/display: Use scaling for non-native resolutions on LVDSAlex Deucher1-1/+2
[Why] Common resolutions are added to supported modes to enable compatibility scenarios that compositors may use to do things like clone displays. There is no guarantee however that the panel will natively support these modes. [How] If the compositor hasn't enabled scaling but a non-native resolution has been picked for an LVDS panel turn the scaler on anyway. This will ensure compatibility. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amd/display: Disable common modes for LVDSAlex Deucher1-1/+2
[Why] Common modes are added to LVDS for compatibility in clone mode, but not all panels support them. Non-native modes were disabled in the past but this caused problems because compositors didn't use scaling for non native modes. Now non-native modes on LVDS will enable the scaler by default. [How] Check the connector type. If the connector is LVDS avoid adding common modes. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdgpu/sdma: allow caller to handle kernel rings in engine resetAlex Deucher6-23/+32
Add a parameter to amdgpu_sdma_reset_engine() to let the caller handle the kernel rings. This allows the kernel rings to back up their unprocessed state if the reset comes in via the drm scheduler rather than KFD. Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdgpu/sdma: consolidate engine reset handlingAlex Deucher4-25/+9
Move the force completion handling into the common engine reset function. No need to duplicate it for every IP version. Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdkfd: Avoid queue reset if disabledLijo Lazar1-0/+9
If ring reset is disabled, skip resetting queues. Instead, fall back to device based reset. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdgpu: Add a noverbose flag to psp_wait_forLijo Lazar10-118/+130
For extended wait with retries on a PSP register value, add a noverbose flag to avoid excessive error messages on each timeout. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amd/pm/powerplay/hwmgr/smu_helper: fix order of mask and valueFedor Pchelkin1-1/+1
There is a small typo in phm_wait_on_indirect_register(). Swap mask and value arguments provided to phm_wait_on_register() so that they satisfy the function signature and actual usage scheme. Found by Linux Verification Center (linuxtesting.org) with Svace static analysis tool. In practice this doesn't fix any issues because the only place this function is used uses the same value for the value and mask. Fixes: 3bace3591493 ("drm/amd/powerplay: add hardware manager sub-component") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdgpu/gfx10: fix KGQ reset sequenceAlex Deucher1-5/+15
Need to reinit the ring before remapping it and all of the KIQ handling needs to be within the kiq lock. Fixes: 1741281a157f ("drm/amdgpu/gfx10: add ring reset callbacks") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/amdgpu: Pass adev pointer to functionsLijo Lazar6-63/+57
Pass amdgpu device context instead of drm device context to some amdgpu_device_* functions. DRM device context is not required in those functions. No functional change. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-07drm/xe/ptl: Add HuC FW definition for PTLDaniele Ceraolo Spurio1-0/+1
Add the unversioned define for the PTL HuC FW. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250626182805.1701096-15-daniele.ceraolospurio@intel.com
2025-07-07drm/xe/ptl: Add GuC FW definition for PTLDaniele Ceraolo Spurio1-0/+1
The first official GuC relase for PTL is 70.47.0, which maps to API version 1.22.4. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250626182805.1701096-14-daniele.ceraolospurio@intel.com
2025-07-07drm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2Julia Filipchuk1-3/+3
UAPI compatibility version 1.22.2 Resolves various bugs. Recommend newer version. Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250626182805.1701096-13-daniele.ceraolospurio@intel.com
2025-07-07platform/x86: hfi: Introduce AMD Hardware Feedback Interface DriverPerry Yuan5-0/+184
The AMD Heterogeneous core design and Hardware Feedback Interface (HFI) provide behavioral classification and a dynamically updated ranking table for the scheduler to use when choosing cores for tasks. There are two CPU core types defined: Classic and Dense. Classic cores are the standard performance cores, while Dense cores are optimized for area and efficiency. Heterogeneous compute refers to CPU implementations that are comprised of more than one architectural class, each with two capabilities. This means each CPU reports two separate capabilities: "perf" and "eff". Each capability lists all core ranking numbers between 0 and 255, where a higher number represents a higher capability. Heterogeneous systems can also extend to more than two architectural classes. The purpose of the scheduling feedback mechanism is to provide information to the operating system scheduler in real time, allowing the scheduler to direct threads to the optimal core during task scheduling. All core ranking data are provided by the PMFW via a shared memory ranking table, which the driver reads and uses to update core capabilities to the scheduler. When the hardware updates the table, it generates a platform interrupt to notify the OS to read the new ranking table. Signed-off-by: Perry Yuan <perry.yuan@amd.com> Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Link: https://lore.kernel.org/20250609200518.3616080-5-superm1@kernel.org
2025-07-07ACPI: fan: Update debug message in fan_get_state_acpi4()Sumeet Pawnikar1-1/+1
Update invalid control value returned debug print with appropriate message as no matching fps control value for checking fan fps count condition. Signed-off-by: Sumeet Pawnikar <sumeet4linux@gmail.com> Link: https://patch.msgid.link/20250705110005.4343-1-sumeet4linux@gmail.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07ACPI: PRM: Reduce unnecessary printing to avoid user confusionZhu Qiyu1-2/+24
Commit 088984c8d54c ("ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context") introduced non-essential printing "Failed to find VA for GUID: xxxx, PA: 0x0" which may confuse users to think that something wrong is going on while it is not the case. According to the PRM Spec Section 4.1.2 [1], both static data buffer address and ACPI parameter buffer address may be NULL if they are not needed, so there is no need to print out the "Failed to find VA ... " in those cases. Link: https://uefi.org/sites/default/files/resources/Platform%20Runtime%20Mechanism%20-%20with%20legal%20notice.pdf # [1] Signed-off-by: Zhu Qiyu <qiyuzhu2@amd.com> Link: https://patch.msgid.link/20250704014104.82524-1-qiyuzhu2@amd.com [ rjw: Edits in new comments, subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07ACPI: fan: Replace sprintf() with sysfs_emit()Eslam Khafagy1-1/+1
Replace sprintf() with sysfs_emit() in function show_fine_grain_control() in according to Documentation/filesystems/sysfs.rst. Link: https://lore.kernel.org/all/20250621055200.166361-1-abdelrahmanfekry375@gmail.com/ Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Link: https://patch.msgid.link/20250704004002.70839-1-eslam.medhat1993@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07ACPI: APEI: EINJ: Fix trigger actionsTony Luck1-4/+10
The trigger events are in BIOS memory immediately following the acpi_einj_trigger structure. These were not copied to regular kernel memory for use by apei_exec_ctx_init() so injections in "notrigger=0" mode failed with a message like this: APEI: Invalid action table, unknown instruction type: 123 Fix by allocating a "table_size" block of memory and copying the whole table for use in the rest of the trigger flow. Fixes: 1a35c88302a3 ("ACPI: APEI: EINJ: Fix kernel test sparse warnings") Reported-by: Yi1 Lai <yi1.lai@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250703200421.28012-1-tony.luck@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-07-07wifi: mt76: mt7921s: Introduce SDIO WiFi/BT combo module card resetLeon Yen6-2/+72
Add a hardware reset method to recover from the SDIO bus error that cannot be resolved by the current WiFi/BT subsystem reset. Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250418093740.3814909-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt792x: improve monitor interface handlingMing Yen Hsieh1-0/+1
Enable IEEE80211_HW_NO_VIRTUAL_MONITOR to ensure the driver is notified of all monitor interfaces and their channels. Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250625075611.1407687-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt792x: Limit the concurrent STA and SoftAP to operate on the ↵Leon Yen1-5/+27
same channel Due to the lack of NoA(Notice of Absence) mechanism in SoftAP mode, it is inappropriate to allow concurrent SoftAP and STA to operate on the different channels. This patch restricts the concurrent SoftAP and STA to be setup on the same channel only. Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250625073720.1385210-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7925: Fix null-ptr-deref in mt7925_thermal_init()Henry Martin1-0/+2
devm_kasprintf() returns NULL on error. Currently, mt7925_thermal_init() does not check for this case, which results in a NULL pointer dereference. Add NULL check after devm_kasprintf() to prevent this issue. Fixes: 396e41a74a88 ("wifi: mt76: mt7925: support temperature sensor") Signed-off-by: Henry Martin <bsdhenryma@tencent.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20250625124901.1839832-1-bsdhenryma@tencent.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: Get rid of dma_sync_single_for_device() for MMIO devicesLorenzo Bianconi2-13/+4
Since the page_pool for MT76 MMIO devices are created with PP_FLAG_DMA_SYNC_DEV flag, we do not need to sync_for_device each page received from the pool since it is already done by the page_pool codebase. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250625-mt76-sync-for-device-v1-1-e687e3278e1a@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Move num_sta accounting in mt7996_mac_sta_{add,remove}_linksLorenzo Bianconi1-26/+32
Move phy num_sta accounting in mt7996_mac_sta_add() and mt7996_mac_sta_remove() routines in order to take into account all possibles MLO links. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-9-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Add MLO support to mt7996_tx_check_aggr()Lorenzo Bianconi1-23/+21
Generalize mt7996_tx_check_aggr() and mt7996_txwi_free() routines to introduce MLO support for MT7996 driver. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-8-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Fix valid_links bitmask in mt7996_mac_sta_{add,remove}Lorenzo Bianconi1-2/+2
sta->valid_links bitmask can be set even for non-MLO client. Fixes: dd82a9e02c054 ("wifi: mt76: mt7996: Rely on mt7996_sta_link in sta_add/sta_remove callbacks") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-7-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Fix possible OOB access in mt7996_tx()Lorenzo Bianconi1-5/+12
Fis possible Out-Of-Boundary access in mt7996_tx routine if link_id is set to IEEE80211_LINK_UNSPECIFIED Fixes: 3ce8acb86b661 ("wifi: mt76: mt7996: Update mt7996_tx to MLO support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-6-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Fix mlink lookup in mt7996_tx_prepare_skbLorenzo Bianconi1-2/+2
Use proper link_id in mt7996_tx_prepare_skb routine for mlink lookup. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-5-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Do not set wcid.sta to 1 in mt7996_mac_sta_event()Lorenzo Bianconi1-1/+0
msta_link->wcid.sta is already set to 1 in mt7996_mac_sta_init_link routine. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-4-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Rely on for_each_sta_active_link() in ↵Lorenzo Bianconi1-7/+7
mt7996_mcu_sta_mld_setup_tlv() Reuse for_each_sta_active_link utility macro in mt7996_mcu_sta_mld_setup_tlv routine. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-3-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7996: Fix secondary link lookup in mt7996_mcu_sta_mld_setup_tlv()Lorenzo Bianconi1-2/+1
Use proper link_id value for secondary link lookup in mt7996_mcu_sta_mld_setup_tlv routine. Fixes: 00cef41d9d8f5 ("wifi: mt76: mt7996: Add mt7996_mcu_sta_mld_setup_tlv() and mt7996_mcu_sta_eht_mld_tlv()") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-2-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: fix vif link allocationFelix Fietkau2-3/+6
Reuse the vif deflink for link_id = 0 in order to avoid confusion with vif->bss_conf, which also gets a link id of 0. Link: https://patch.msgid.link/20250704-mt7996-mlo-fixes-v1-1-356456c73f43@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: fix queue assignment for deauth packetsFelix Fietkau1-1/+2
When running in AP mode and deauthenticating a client that's in powersave mode, the disassoc/deauth packet can get stuck in a tx queue along with other buffered frames. This can fill up hardware queues with frames that are only released after the WTBL slot is reused for another client. Fix this by moving deauth packets to the ALTX queue. Reported-by: Chad Monroe <chad.monroe@adtran.com> Link: https://patch.msgid.link/20250707154702.1726-2-nbd@nbd.name Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: add a wrapper for wcid access with validationFelix Fietkau17-68/+41
Several places use rcu_dereference to get a wcid entry without validating if the index exceeds the array boundary. Fix this by using a helper function, which handles validation. Link: https://patch.msgid.link/20250707154702.1726-1-nbd@nbd.name Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07firmware: arm_scmi: power_control: Ensure SCMI_SYSPOWER_IDLE is set early ↵Peng Fan1-5/+17
during resume Fix a race condition where a second suspend notification from another SCMI agent wakes the system before SCMI_SYSPOWER_IDLE is set, leading to ignored suspend requests. This is due to interrupts triggering early execution of `scmi_userspace_notifier()` before the SCMI state is updated. To resolve this, set SCMI_SYSPOWER_IDLE earlier in the device resume path, prior to `thaw_processes()`. This ensures the SCMI state is correct when the notifier runs, allowing the system to suspend again as expected. On some platforms using SCMI, SCP cannot distinguish between CPU idle and suspend since both result in cluster power-off. By explicitly setting the idle state early, the Linux SCMI agent can correctly re-suspend in response to external notifications. Signed-off-by: Peng Fan <peng.fan@nxp.com> Message-Id: <20250704-scmi-pm-v2-2-9316cec2f9cc@nxp.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-07-07firmware: arm_scmi: Add power management operations to SCMI busPeng Fan1-0/+26
Introduce suspend and resume power management callbacks for `scmi_bus_type`, modeled after `platform_pm_ops`. This enables SCMI devices on the bus to implement their own suspend and resume behavior, allowing for more fine-grained power control at the device level. Signed-off-by: Peng Fan <peng.fan@nxp.com> Message-Id: <20250704-scmi-pm-v2-1-9316cec2f9cc@nxp.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-07-07wifi: mt76: mt7921: prevent decap offload config before STA initializationDeren Wu1-0/+3
The decap offload configuration should only be applied after the STA has been successfully initialized. Attempting to configure it earlier can lead to corruption of the MAC configuration in the chip's hardware state. Add an early check for `msta->deflink.wcid.sta` to ensure the station peer is properly initialized before proceeding with decapsulation offload configuration. Cc: stable@vger.kernel.org Fixes: 24299fc869f7 ("mt76: mt7921: enable rx header traslation offload") Signed-off-by: Deren Wu <deren.wu@mediatek.com> Link: https://patch.msgid.link/f23a72ba7a3c1ad38ba9e13bb54ef21d6ef44ffb.1748149855.git.deren.wu@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7925: prevent NULL pointer dereference in ↵Deren Wu1-0/+6
mt7925_sta_set_decap_offload() Add a NULL check for msta->vif before accessing its members to prevent a kernel panic in AP mode deployment. This also fix the issue reported in [1]. The crash occurs when this function is triggered before the station is fully initialized. The call trace shows a page fault at mt7925_sta_set_decap_offload() due to accessing resources when msta->vif is NULL. Fix this by adding an early return if msta->vif is NULL and also check wcid.sta is ready. This ensures we only proceed with decap offload configuration when the station's state is properly initialized. [14739.655703] Unable to handle kernel paging request at virtual address ffffffffffffffa0 [14739.811820] CPU: 0 UID: 0 PID: 895854 Comm: hostapd Tainted: G [14739.821394] Tainted: [C]=CRAP, [O]=OOT_MODULE [14739.825746] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT) [14739.831577] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [14739.838538] pc : mt7925_sta_set_decap_offload+0xc0/0x1b8 [mt7925_common] [14739.845271] lr : mt7925_sta_set_decap_offload+0x58/0x1b8 [mt7925_common] [14739.851985] sp : ffffffc085efb500 [14739.855295] x29: ffffffc085efb500 x28: 0000000000000000 x27: ffffff807803a158 [14739.862436] x26: ffffff8041ececb8 x25: 0000000000000001 x24: 0000000000000001 [14739.869577] x23: 0000000000000001 x22: 0000000000000008 x21: ffffff8041ecea88 [14739.876715] x20: ffffff8041c19ca0 x19: ffffff8078031fe0 x18: 0000000000000000 [14739.883853] x17: 0000000000000000 x16: ffffffe2aeac1110 x15: 000000559da48080 [14739.890991] x14: 0000000000000001 x13: 0000000000000000 x12: 0000000000000000 [14739.898130] x11: 0a10020001008e88 x10: 0000000000001a50 x9 : ffffffe26457bfa0 [14739.905269] x8 : ffffff8042013bb0 x7 : ffffff807fb6cbf8 x6 : dead000000000100 [14739.912407] x5 : dead000000000122 x4 : ffffff80780326c8 x3 : 0000000000000000 [14739.919546] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff8041ececb8 [14739.926686] Call trace: [14739.929130] mt7925_sta_set_decap_offload+0xc0/0x1b8 [mt7925_common] [14739.935505] ieee80211_check_fast_rx+0x19c/0x510 [mac80211] [14739.941344] _sta_info_move_state+0xe4/0x510 [mac80211] [14739.946860] sta_info_move_state+0x1c/0x30 [mac80211] [14739.952116] sta_apply_auth_flags.constprop.0+0x90/0x1b0 [mac80211] [14739.958708] sta_apply_parameters+0x234/0x5e0 [mac80211] [14739.964332] ieee80211_add_station+0xdc/0x190 [mac80211] [14739.969950] nl80211_new_station+0x46c/0x670 [cfg80211] [14739.975516] genl_family_rcv_msg_doit+0xdc/0x150 [14739.980158] genl_rcv_msg+0x218/0x298 [14739.983830] netlink_rcv_skb+0x64/0x138 [14739.987670] genl_rcv+0x40/0x60 [14739.990816] netlink_unicast+0x314/0x380 [14739.994742] netlink_sendmsg+0x198/0x3f0 [14739.998664] __sock_sendmsg+0x64/0xc0 [14740.002324] ____sys_sendmsg+0x260/0x298 [14740.006242] ___sys_sendmsg+0xb4/0x110 Cc: stable@vger.kernel.org Link: https://github.com/morrownr/USB-WiFi/issues/603 [1] Fixes: b859ad65309a ("wifi: mt76: mt7925: add link handling in mt7925_sta_set_decap_offload") Signed-off-by: Deren Wu <deren.wu@mediatek.com> Link: https://patch.msgid.link/35aedbffa050e98939264300407a52ba4e236d52.1748149855.git.deren.wu@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7925: fix incorrect scan probe IE handling for hw_scanMing Yen Hsieh3-19/+63
The IEs should be processed and filled into the command tlv separately according to each band. Cc: stable@vger.kernel.org Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250616063649.1100503-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scanMichael Lo2-4/+4
Update the destination index to use 'n_ssids', which is incremented only when a valid SSID is present. Previously, both mt76_connac_mcu_hw_scan() and mt7925_mcu_hw_scan() used the loop index 'i' for the destination array, potentially leaving gaps if any source SSIDs had zero length. Cc: stable@vger.kernel.org Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Michael Lo <michael.lo@mediatek.com> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250612062046.160598-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: mt7925: fix the wrong config for tx interruptMing Yen Hsieh1-1/+1
MT_INT_TX_DONE_MCU_WM may cause tx interrupt to be mishandled during a reset failure, leading to the reset process failing. By using MT_INT_TX_DONE_MCU instead of MT_INT_TX_DONE_MCU_WM, the handling of tx interrupt is improved. Cc: stable@vger.kernel.org Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Link: https://patch.msgid.link/20250612060931.135635-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: Remove RCU section in mt7996_mac_sta_rc_work()Lorenzo Bianconi1-28/+7
Since mt7996_mcu_add_rate_ctrl() and mt7996_mcu_set_fixed_field() can't run in atomic context, move RCU critical section in mt7996_mcu_add_rate_ctrl() and mt7996_mcu_set_fixed_field(). This patch fixes a 'sleep while atomic' issue in mt7996_mac_sta_rc_work(). Fixes: 0762bdd30279 ("wifi: mt76: mt7996: rework mt7996_mac_sta_rc_work to support MLO") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Tested-by: Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20250605-mt7996-sleep-while-atomic-v1-5-d46d15f9203c@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
2025-07-07wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()Lorenzo Bianconi4-27/+45
Since mt76_mcu_skb_send_msg() routine can't be executed in atomic context, move RCU section in mt7996_mcu_add_rate_ctrl() and execute mt76_mcu_skb_send_msg() in non-atomic context. This is a preliminary patch to fix a 'sleep while atomic' issue in mt7996_mac_sta_rc_work(). Fixes: 0762bdd30279 ("wifi: mt76: mt7996: rework mt7996_mac_sta_rc_work to support MLO") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250605-mt7996-sleep-while-atomic-v1-4-d46d15f9203c@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>