summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2026-03-18drm/amdgpu: Remove dead negative offset check in ↵Srinivasan Shanmugam1-5/+0
amdgpu_virt_init_critical_region() amdgpu_virt_init_critical_region() stores init_hdr_offset as u64. The subsequent check for init_hdr_offset < 0 is therefore always false. Drop the unreachable validation and rely on the existing check_add_overflow() and VRAM end bounds check for offset validation. This resolves the Smatch warning about comparing an unsigned value against zero. drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:953 amdgpu_virt_init_critical_region() warn: unsigned 'init_hdr_offset' is never less than zero. Fixes: 07009df6494d ("drm/amdgpu: Introduce SRIOV critical regions v2 during VF init") Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Ellen Pan <yunru.pan@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Move amdgpu_vm_is_bo_always_valid() before first useSrinivasan Shanmugam1-14/+14
Smatch reports that 'bo' could be NULL in amdgpu_vm_bo_update(), even though amdgpu_vm_is_bo_always_valid() already checks for a NULL BO. Move amdgpu_vm_is_bo_always_valid() earlier in the file so the helper definition appears before its first use. This allows static analysis tools to see the NULL check performed by the helper and avoids the warning. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Drop redundant queue NULL check in hang detect workerSrinivasan Shanmugam1-1/+1
amdgpu_userq_hang_detect_work() retrieves the queue pointer using container_of() from the embedded work item. Since the work structure is part of struct amdgpu_usermode_queue, the returned queue pointer cannot be NULL in normal execution. Remove the redundant !queue check and keep the validation for queue->userq_mgr. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c:159 amdgpu_userq_hang_detect_work() warn: can 'queue' even be NULL? Fixes: 290f46cf5726 ("drm/amdgpu: Implement user queue reset functionality") Cc: Jesse Zhang <Jesse.Zhang@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu : Update psp 13_0_15 ip block supportMangesh Gadre1-1/+6
Included psp_13_0_15 ip block for RAS Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: rework amdgpu_userq_wait_ioctl v4Christian König1-259/+323
Lockdep was complaining about a number of issues here. Especially lock inversion between syncobj, dma_resv and copying things into userspace. Rework the functionality. Split it up into multiple functions, consistenly use memdup_array_user(), fix the lock inversions and a few more bugs in error handling. v2: drop the dma_fence leak fix, turned out that was actually correct, just not well documented. Apply some more cleanup suggestion from Tvrtko. v3: rebase on already done cleanups v4: add missing dma_fence_put() in error path. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: fix adding eviction fenceChristian König3-8/+24
We can't add the eviction fence without validating the BO. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: fix eviction fence and userq manager shutdownChristian König5-1/+16
That is a really complicated dance and wasn't implemented fully correct. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: completely rework eviction fence handling v2Christian König7-206/+110
Well that was broken on multiple levels. First of all a lot of checks were placed at incorrect locations, especially if the resume worker should run or not. Then a bunch of code was just mid-layering because of incorrect assignment who should do what. And finally comments explaining what happens instead of why. Just re-write it from scratch, that should at least fix some of the hangs we are seeing. Use RCU for the eviction fence pointer in the manager, the spinlock usage was mostly incorrect as well. Then finally remove all the nonsense checks and actually add them in the correct locations. v2: some typo fixes and cleanups suggested by Sunil Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/radeon: apply state adjust rules to some additional HAINAN vairantsAlex Deucher1-1/+3
They need a similar workaround. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1839 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: apply state adjust rules to some additional HAINAN vairantsAlex Deucher1-1/+3
They need a similar workaround. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1839 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: rework how we handle TLB fencesAlex Deucher2-1/+8
Add a new VM flag to indicate whether or not we need a TLB fence. Userqs (KFD or KGD) require a TLB fence. A TLB fence is not strictly required for kernel queues, but it shouldn't hurt. That said, enabling this unconditionally should be fine, but it seems to tickle some issues in KIQ/MES. Only enable them for KFD, or when KGD userq queues are enabled (currently via module parameter). Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749 Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update") Cc: Christian König <christian.koenig@amd.com> Cc: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Add JPEG_v5_0_2 IP blockSonny Jiang4-0/+954
Add support for JPEG_5_0_2 v2: comment out RAS for now (Alex) v3: drop some bringup leftovers (Alex) Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Set VCN_5_0_2 DPG modeSonny Jiang1-1/+1
Set DPG flag for VCN_5_0_2 Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Add VCN_5_0_2 codecs capabilities supportSonny Jiang1-0/+34
Support VCN_5_0_2 codec query Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amdgpu: Add VCN v5_0_2Sonny Jiang5-2/+1259
Add support for VCN_5_0_2 v2: squash in RRMT enable bit fix from Sonny (Alex) v3: sqaush in doorbell enablement patch (Alex) v4: drop some bringup leftovers (Alex) Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amd/pm: Add mutex lock for metrics tableAsad Kamal1-0/+1
Add metrics table mutex lock in smu table context struct Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amd/pm: Update pm attributesAsad Kamal1-26/+10
Update pm attributes show/hide for gc_v12_1_0 v2: Use multi-aid check (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amd/pm: Add fru eeprom info supportAsad Kamal1-0/+1
Add fru eeprom info support for smu_v15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18drm/amd/ras: Fix NULL deref in ras_core_get_utc_second_timestamp()Srinivasan Shanmugam1-2/+5
ras_core_get_utc_second_timestamp() retrieves the current UTC timestamp (in seconds since the Unix epoch) through a platform-specific RAS system callback and is used for timestamping RAS error events. The function checks ras_core in the conditional statement before calling the sys_fn callback. However, when the condition fails, the function prints an error message using ras_core->dev. If ras_core is NULL, this can lead to a potential NULL pointer dereference when accessing ras_core->dev. Add an early NULL check for ras_core at the beginning of the function and return 0 when the pointer is not valid. This prevents the dereference and makes the control flow clearer. Fixes: 13c91b5b4378 ("drm/amd/ras: Add rascore unified interface function") Cc: YiPeng Chai <YiPeng.Chai@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-18PCI: rpaphp: Simplify with scoped for each OF child loopKrzysztof Kozlowski1-3/+1
Use scoped for-each loop when iterating over device nodes to make code a bit simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20260317133322.266102-8-krzysztof.kozlowski@oss.qualcomm.com
2026-03-18PCI: pnv_php: Simplify with scoped for each OF child loopKrzysztof Kozlowski1-12/+7
Use scoped for-each loop when iterating over device nodes to make code a bit simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20260317133322.266102-7-krzysztof.kozlowski@oss.qualcomm.com
2026-03-18libie: prevent memleak in fwlog codeMichal Swiatkowski1-13/+36
All cmd_buf buffers are allocated and need to be freed after usage. Add an error unwinding path that properly frees these buffers. The memory leak happens whenever fwlog configuration is changed. For example: $echo 256K > /sys/kernel/debug/ixgbe/0000\:32\:00.0/fwlog/log_size Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-18accel/amdxdna: Support retrieving hardware context debug informationLizhi Hou8-11/+213
The firmware implements the GET_APP_HEALTH command to collect debug information for a specific hardware context. When a command times out, the driver issues this command to collect the relevant debug information. User space tools can also retrieve this information through the hardware context query IOCTL. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260317044906.1513133-1-lizhi.hou@amd.com
2026-03-17Merge tag 'hid-for-linus-2026031701' of ↵Linus Torvalds11-15/+48
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - various fixes dealing with (intentionally) broken devices in HID core, logitech-hidpp and multitouch drivers (Lee Jones) - fix for OOB in wacom driver (Benoît Sevens) - fix for potentialy HID-bpf-induced buffer overflow in () (Benjamin Tissoires) - various other small fixes and device ID / quirk additions * tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: multitouch: Check to ensure report responses match the request HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure HID: bpf: prevent buffer overflow in hid_hw_request selftests/hid: fix compilation when bpf_wq and hid_device are not exported HID: core: Mitigate potential OOB by removing bogus memset() HID: intel-thc-hid: Set HID_PHYS with PCI BDF HID: appletb-kbd: add .resume method in PM HID: logitech-hidpp: Enable MX Master 4 over bluetooth HID: input: Add HID_BATTERY_QUIRK_DYNAMIC for Elan touchscreens HID: input: Drop Asus UX550* touchscreen ignore battery quirks HID: asus: add xg mobile 2022 external hardware support HID: wacom: fix out-of-bounds read in wacom_intuos_bt_irq
2026-03-17iavf: fix VLAN filter lost on add/delete racePetr Oros1-3/+6
When iavf_add_vlan() finds an existing filter in IAVF_VLAN_REMOVE state, it transitions the filter to IAVF_VLAN_ACTIVE assuming the pending delete can simply be cancelled. However, there is no guarantee that iavf_del_vlans() has not already processed the delete AQ request and removed the filter from the PF. In that case the filter remains in the driver's list as IAVF_VLAN_ACTIVE but is no longer programmed on the NIC. Since iavf_add_vlans() only picks up filters in IAVF_VLAN_ADD state, the filter is never re-added, and spoof checking drops all traffic for that VLAN. CPU0 CPU1 Workqueue ---- ---- --------- iavf_del_vlan(vlan 100) f->state = REMOVE schedule AQ_DEL_VLAN iavf_add_vlan(vlan 100) f->state = ACTIVE iavf_del_vlans() f is ACTIVE, skip iavf_add_vlans() f is ACTIVE, skip Filter is ACTIVE in driver but absent from NIC. Transition to IAVF_VLAN_ADD instead and schedule IAVF_FLAG_AQ_ADD_VLAN_FILTER so iavf_add_vlans() re-programs the filter. A duplicate add is idempotent on the PF. Fixes: 0c0da0e95105 ("iavf: refactor VLAN filter states") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-17igc: fix page fault in XDP TX timestamps handlingZdenek Bouska3-0/+42
If an XDP application that requested TX timestamping is shutting down while the link of the interface in use is still up the following kernel splat is reported: [ 883.803618] [ T1554] BUG: unable to handle page fault for address: ffffcfb6200fd008 ... [ 883.803650] [ T1554] Call Trace: [ 883.803652] [ T1554] <TASK> [ 883.803654] [ T1554] igc_ptp_tx_tstamp_event+0xdf/0x160 [igc] [ 883.803660] [ T1554] igc_tsync_interrupt+0x2d5/0x300 [igc] ... During shutdown of the TX ring the xsk_meta pointers are left behind, so that the IRQ handler is trying to touch them. This issue is now being fixed by cleaning up the stale xsk meta data on TX shutdown. TX timestamps on other queues remain unaffected. Fixes: 15fd021bc427 ("igc: Add Tx hardware timestamp request for AF_XDP zero-copy packet") Signed-off-by: Zdenek Bouska <zdenek.bouska@siemens.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-17igc: fix missing update of skb->tail in igc_xmit_frame()Kohei Enju1-5/+2
igc_xmit_frame() misses updating skb->tail when the packet size is shorter than the minimum one. Use skb_put_padto() in alignment with other Intel Ethernet drivers. Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2026-03-17devres: add devres_node_init()Danilo Krummrich1-6/+9
Both alloc_dr() and devres_open_group() initialize devres_node.entry and set devres_node.release. Add a helper, devres_node_init(), for this pattern. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260202235210.55176-4-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17devres: add devres_node_add()Danilo Krummrich1-8/+9
Both devres_add() and devres_open_group() acquire the devres_lock and call add_dr(). Add a helper, devres_node_add(), for this pattern. Use guard(spinlock_irqsave) to avoid the explicit unlock call and local flag variables. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260202235210.55176-3-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17devres: fix missing node debug info in devm_krealloc()Danilo Krummrich1-0/+2
Fix missing call to set_node_dbginfo() for new devres nodes created by devm_krealloc(). Fixes: f82485722e5d ("devres: provide devm_krealloc()") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260202235210.55176-2-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17PCI/CXL: Hide SBR from reset_methods if masked by CXLVidya Sagar1-5/+1
Per CXL r3.1, sec 8.1.5.2, the Secondary Bus Reset (SBR) bit in the Bridge Control register of a CXL port has no effect unless the "Unmask SBR" bit in the Port Control Extensions Register is set. After b1956e2d0713 ("PCI/CXL: Fail bus reset if upstream CXL Port has SBR masked"), Linux checks the "Unmask SBR" bit in pci_reset_bus_function(). But when probe==true, it previously returned 0, incorrectly indicating that SBR is a viable reset method for the device. As a result, "bus" is listed in the device's "reset_method" attribute even though the hardware is incapable of performing it. If a user writes "bus" to "reset_method" or triggers a reset that falls back to SBR, the operation fails with "write error: Inappropriate ioctl for device". If the link is operating in CXL mode (pcie_is_cxl()), return -ENOTTY immediately unless "Unmask SBR" is set, regardless of the probe argument. This ensures that "bus" is not advertised in "reset_methods" when the hardware prevents it, improving clarity for users and aligning the sysfs capability report with actual hardware behavior. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [bhelgaas: commit log, use pcie_is_cxl()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260225133801.30231-1-vidyas@nvidia.com
2026-03-17ACPICA: Update the format of Arg3 of _DSMSaket Dumbre1-1/+1
To get rid of type incompatibility warnings in Linux. Fixes: 81f92cff6d42 ("ACPICA: ACPI_TYPE_ANY does not include the package type") Link: https://github.com/acpica/acpica/commit/4fb74872dcec Signed-off-by: Saket Dumbre <saket.dumbre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/12856643.O9o76ZdvQC@rafael.j.wysocki
2026-03-17driver core: platform: use generic driver_override infrastructureDanilo Krummrich4-40/+10
When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 3d713e0e382e ("driver core: platform: add device binding path 'driver_override'") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260303115720.48783-5-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17hwmon: axi-fan: don't use driver_override as IRQ nameDanilo Krummrich1-1/+1
Do not use driver_override as IRQ name, as it is not guaranteed to point to a valid string; use NULL instead (which makes the devm IRQ helpers use dev_name()). Fixes: 8412b410fa5e ("hwmon: Support ADI Fan Control IP") Reviewed-by: Nuno Sá <nuno.sa@analog.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260303115720.48783-4-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17driver core: generalize driver_override in struct deviceDanilo Krummrich3-1/+104
Currently, there are 12 busses (including platform and PCI) that duplicate the driver_override logic for their individual devices. All of them seem to be prone to the bug described in [1]. While this could be solved for every bus individually using a separate lock, solving this in the driver-core generically results in less (and cleaner) changes overall. Thus, move driver_override to struct device, provide corresponding accessors for busses and handle locking with a separate lock internally. In particular, add device_set_driver_override(), device_has_driver_override(), device_match_driver_override() and generalize the sysfs store() and show() callbacks via a driver_override feature flag in struct bus_type. Until all busses have migrated, keep driver_set_override() in place. Note that we can't use the device lock for the reasons described in [2]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2] Tested-by: Gui-Dong Han <hanguidong02@gmail.com> Co-developed-by: Gui-Dong Han <hanguidong02@gmail.com> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org [ Use dev->bus instead of sp->bus for consistency; fix commit message to refer to the struct bus_type's driver_override feature flag. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17Merge tag 'rust_io-7.1-rc1' into driver-core-nextDanilo Krummrich206-1385/+2080
Register abstraction and I/O infrastructure improvements Introduce the register!() macro to define type-safe I/O register accesses. Refactor the IoCapable trait into a functional trait, which simplifies I/O backends and removes the need for overloaded Io methods. This is a stable tag for other trees to merge. Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17RDMA/efa: Fix possible deadlockEthan Tidmore1-0/+1
In the error path for efa_com_alloc_comp_ctx() the semaphore assigned to &aq->avail_cmds is not released. Detected by Smatch: drivers/infiniband/hw/efa/efa_com.c:662 efa_com_cmd_exec() warn: inconsistent returns '&aq->avail_cmds' Add release for &aq->avail_cmds in efa_com_alloc_comp_ctx() error path. Fixes: ef3b06742c8a2 ("RDMA/efa: Fix use of completion ctx after free") Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Link: https://patch.msgid.link/20260314045730.1143862-1-ethantidmore06@gmail.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ pathChuck Lever1-7/+9
When IOVA-based DMA mapping is unavailable (e.g., IOMMU passthrough mode), rdma_rw_ctx_init_bvec() falls back to checking rdma_rw_io_needs_mr() with the raw bvec count. Unlike the scatterlist path in rdma_rw_ctx_init(), which passes a post-DMA-mapping entry count that reflects coalescing of physically contiguous pages, the bvec path passes the pre-mapping page count. This overstates the number of DMA entries, causing every multi-bvec RDMA READ to consume an MR from the QP's pool. Under NFS WRITE workloads the server performs RDMA READs to pull data from the client. With the inflated MR demand, the pool is rapidly exhausted, ib_mr_pool_get() returns NULL, and rdma_rw_init_one_mr() returns -EAGAIN. svcrdma treats this as a DMA mapping failure, closes the connection, and the client reconnects -- producing a cycle of 71% RPC retransmissions and ~100 reconnections per test run. RDMA WRITEs (NFS READ direction) are unaffected because DMA_TO_DEVICE never triggers the max_sgl_rd check. Remove the rdma_rw_io_needs_mr() gate from the bvec path entirely, so that bvec RDMA operations always use the map_wrs path (direct WR posting without MR allocation). The bvec caller has no post-DMA-coalescing segment count available -- xdr_buf and svc_rqst hold pages as individual pointers, and physical contiguity is discovered only during DMA mapping -- so the raw page count cannot serve as a reliable input to rdma_rw_io_needs_mr(). iWARP devices, which require MRs unconditionally, are handled by an earlier check in rdma_rw_ctx_init_bvec() and are unaffected. Fixes: bea28ac14cab ("RDMA/core: add MR support for bvec-based RDMA operations") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260313194201.5818-3-cel@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17RDMA/rw: Fall back to direct SGE on MR pool exhaustionChuck Lever1-3/+18
When IOMMU passthrough mode is active, ib_dma_map_sgtable_attrs() produces no coalescing: each scatterlist page maps 1:1 to a DMA entry, so sgt.nents equals the raw page count. A 1 MB transfer yields 256 DMA entries. If that count exceeds the device's max_sgl_rd threshold (an optimization hint from mlx5 firmware), rdma_rw_io_needs_mr() steers the operation into the MR registration path. Each such operation consumes one or more MRs from a pool sized at max_rdma_ctxs -- roughly one MR per concurrent context. Under write-intensive workloads that issue many concurrent RDMA READs, the pool is rapidly exhausted, ib_mr_pool_get() returns NULL, and rdma_rw_init_one_mr() returns -EAGAIN. Upper layer protocols treat this as a fatal DMA mapping failure and tear down the connection. The max_sgl_rd check is a performance optimization, not a correctness requirement: the device can handle large SGE counts via direct posting, just less efficiently than with MR registration. When the MR pool cannot satisfy a request, falling back to the direct SGE (map_wrs) path avoids the connection reset while preserving the MR optimization for the common case where pool resources are available. Add a fallback in rdma_rw_ctx_init() so that -EAGAIN from rdma_rw_init_mr_wrs() triggers direct SGE posting instead of propagating the error. iWARP devices, which mandate MR registration for RDMA READs, and force_mr debug mode continue to treat -EAGAIN as terminal. Fixes: 00bd1439f464 ("RDMA/rw: Support threshold for registration vs scattering to local pages") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260313194201.5818-2-cel@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-17regulator: fp9931: Make vin-supply mandatoryMark Brown387-2399/+4139
Robby Cai <robby.cai@nxp.com> says: The FP9931 regulator requires a valid "vin" supply to operate correctly. Therefore, the driver should treat "vin" as a mandatory supply. This patchset updates the binding documentation to mark vin-supply as a required property, and modifies the driver accordingly. As suggested in the reviews from Andreas and Mark, v2 switches to using devm_regulator_get() since the supply is mandatory.
2026-03-17regulator: fp9931: Fix handling of mandatory "vin" supplyRobby Cai1-1/+1
The FP9931 requires a mandatory "vin" power supply to operate. Replace devm_regulator_get_optional() with devm_regulator_get() to enforce this mandatory dependency. Fixes: 12d821bd13d42 ("regulator: Add FP9931/JD9930 driver") Signed-off-by: Robby Cai <robby.cai@nxp.com> Link: https://patch.msgid.link/20260313133102.2749890-3-robby.cai@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-03-17platform/x86: wireless-hotkey: Convert ACPI driver to a platform oneRafael J. Wysocki1-21/+23
In all cases in which a struct acpi_driver is used for binding a driver to an ACPI device object, a corresponding platform device is created by the ACPI core and that device is regarded as a proper representation of underlying hardware. Accordingly, a struct platform_driver should be used by driver code to bind to that device. There are multiple reasons why drivers should not bind directly to ACPI device objects [1]. Overall, it is better to bind drivers to platform devices than to their ACPI companions, so convert the airplane mode button for AMD, HP and Xiaomi laptops driver from an ACPI driver to a platform one. While this is not expected to alter functionality, it changes sysfs layout and so it will be visible to user space. Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/9607409.CDJkKcVGEf@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: wireless-hotkey: Register ACPI notify handler directlyRafael J. Wysocki1-3/+10
To facilitate subsequent conversion of the driver to a platform one, make it install an ACPI notify handler directly instead of using a .notify() callback in struct acpi_driver. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3953848.kQq0lBPeGt@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: topstar-laptop: Convert ACPI driver to a platform oneRafael J. Wysocki1-14/+16
In all cases in which a struct acpi_driver is used for binding a driver to an ACPI device object, a corresponding platform device is created by the ACPI core and that device is regarded as a proper representation of underlying hardware. Accordingly, a struct platform_driver should be used by driver code to bind to that device. There are multiple reasons why drivers should not bind directly to ACPI device objects [1]. Overall, it is better to bind drivers to platform devices than to their ACPI companions, so convert the Topstar Laptop ACPI Extras driver from an ACPI driver to a platform one. While this is not expected to alter functionality, it changes sysfs layout and so it will be visible to user space. Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/10834824.nUPlyArG6x@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: topstar-laptop: Register ACPI notify handler directlyRafael J. Wysocki1-4/+11
To facilitate subsequent conversion of the driver to a platform one, make it install an ACPI notify handler directly instead of using a .notify() callback in struct acpi_driver. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3425557.44csPzL39Z@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: sony-laptop: Convert PIC driver to a platform oneRafael J. Wysocki1-26/+25
In all cases in which a struct acpi_driver is used for binding a driver to an ACPI device object, a corresponding platform device is created by the ACPI core and that device is regarded as a proper representation of underlying hardware. Accordingly, a struct platform_driver should be used by driver code to bind to that device. There are multiple reasons why drivers should not bind directly to ACPI device objects [1]. Overall, it is better to bind drivers to platform devices than to their ACPI companions, so convert the Programmable IO Control (pic) part of the Sony laptop driver from an ACPI driver to a platform one. While this is not expected to alter functionality, it changes sysfs layout and so it will be visible to user space. Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2059875.yKVeVyVuyW@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: sony-laptop: Convert NC driver to a platform oneRafael J. Wysocki1-34/+31
In all cases in which a struct acpi_driver is used for binding a driver to an ACPI device object, a corresponding platform device is created by the ACPI core and that device is regarded as a proper representation of underlying hardware. Accordingly, a struct platform_driver should be used by driver code to bind to that device. There are multiple reasons why drivers should not bind directly to ACPI device objects [1]. Overall, it is better to bind drivers to platform devices than to their ACPI companions, so convert the Notebook Control (nc) part of the Sony laptop driver from an ACPI driver to a platform one. While this is not expected to alter functionality, it changes sysfs layout and so it will be visible to user space. Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/886676729.0ifERbkFSE@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: sony-laptop: Register ACPI notify handler directlyRafael J. Wysocki1-2/+8
To facilitate subsequent conversion of the driver to a platform one, make it install an ACPI notify handler directly instead of using a .notify() callback in struct acpi_driver. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2559802.jE0xQCEvom@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: lg-laptop: Convert ACPI driver to a platform oneRafael J. Wysocki1-32/+13
In all cases in which a struct acpi_driver is used for binding a driver to an ACPI device object, a corresponding platform device is created by the ACPI core and that device is regarded as a proper representation of underlying hardware. Accordingly, a struct platform_driver should be used by driver code to bind to that device. There are multiple reasons why drivers should not bind directly to ACPI device objects [1]. Overall, it is better to bind drivers to platform devices than to their ACPI companions, so convert the LG Gram ACPI features and hotkeys driver from an ACPI driver to a platform one. While this is not expected to alter functionality, it changes sysfs layout and so it will be visible to user space. Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/1868365.VLH7GnMWUR@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2026-03-17platform/x86: lg-laptop: Drop debug-only ACPI notify handlerRafael J. Wysocki1-6/+0
To facilitate subsequent conversion of the driver to using struct platform_driver instead of struct acpi_driver, drop the debug-only notify handler method from the driver. No intentional functional impact beyond debug. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3346280.5fSG56mABF@rafael.j.wysocki Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>