summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2025-08-15zloop: fix KASAN use-after-free of tag setShin'ichiro Kawasaki1-1/+2
commit 765761851d89c772f482494d452e266795460278 upstream. When a zoned loop device, or zloop device, is removed, KASAN enabled kernel reports "BUG KASAN use-after-free" in blk_mq_free_tag_set(). The BUG happens because zloop_ctl_remove() calls put_disk(), which invokes zloop_free_disk(). The zloop_free_disk() frees the memory allocated for the zlo pointer. However, after the memory is freed, zloop_ctl_remove() calls blk_mq_free_tag_set(&zlo->tag_set), which accesses the freed zlo. Hence the KASAN use-after-free. zloop_ctl_remove() put_disk(zlo->disk) put_device() kobject_put() ... zloop_free_disk() kvfree(zlo) blk_mq_free_tag_set(&zlo->tag_set) To avoid the BUG, move the call to blk_mq_free_tag_set(&zlo->tag_set) from zloop_ctl_remove() into zloop_free_disk(). This ensures that the tag_set is freed before the call to kvfree(zlo). Fixes: eb0570c7df23 ("block: new zoned loop block device driver") CC: stable@vger.kernel.org Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250731110745.165751-1-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15platform/x86/intel/pmt: fix a crashlog NULL pointer accessMichael J. Ruhl2-1/+3
commit 54d5cd4719c5e87f33d271c9ac2e393147d934f8 upstream. Usage of the intel_pmt_read() for binary sysfs, requires a pcidev. The current use of the endpoint value is only valid for telemetry endpoint usage. Without the ep, the crashlog usage causes the following NULL pointer exception: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:intel_pmt_read+0x3b/0x70 [pmt_class] Code: Call Trace: <TASK> ? sysfs_kf_bin_read+0xc0/0xe0 kernfs_fop_read_iter+0xac/0x1a0 vfs_read+0x26d/0x350 ksys_read+0x6b/0xe0 __x64_sys_read+0x1d/0x30 x64_sys_call+0x1bc8/0x1d70 do_syscall_64+0x6d/0x110 Augment struct intel_pmt_entry with a pointer to the pcidev to avoid the NULL pointer exception. Fixes: 045a513040cc ("platform/x86/intel/pmt: Use PMT callbacks") Cc: stable@vger.kernel.org Reviewed-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://lore.kernel.org/r/20250713172943.7335-2-michael.j.ruhl@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15net: usbnet: Fix the wrong netif_carrier_on() callAmmar Faizi1-3/+3
commit 8466d393700f9ccef68134d3349f4e0a087679b9 upstream. The commit referenced in the Fixes tag causes usbnet to malfunction (identified via git bisect). Post-commit, my external RJ45 LAN cable fails to connect. Linus also reported the same issue after pulling that commit. The code has a logic error: netif_carrier_on() is only called when the link is already on. Fix this by moving the netif_carrier_on() call outside the if-statement entirely. This ensures it is always called when EVENT_LINK_CARRIER_ON is set and properly clears it regardless of the link state. Cc: stable@vger.kernel.org Cc: Armando Budianto <sprite@gnuweeb.org> Reviewed-by: Simon Horman <horms@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/all/CAHk-=wjqL4uF0MG_c8+xHX1Vv8==sPYQrtzbdA3kzi96284nuQ@mail.gmail.com Closes: https://lore.kernel.org/netdev/CAHk-=wjKh8X4PT_mU1kD4GQrbjivMfPn-_hXa6han_BTDcXddw@mail.gmail.com Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb.org Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event") Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15net: usbnet: Avoid potential RCU stall on LINK_CHANGE eventJohn Ernberg1-3/+8
commit 0d9cfc9b8cb17dbc29a98792d36ec39a1cf1395f upstream. The Gemalto Cinterion PLS83-W modem (cdc_ether) is emitting confusing link up and down events when the WWAN interface is activated on the modem-side. Interrupt URBs will in consecutive polls grab: * Link Connected * Link Disconnected * Link Connected Where the last Connected is then a stable link state. When the system is under load this may cause the unlink_urbs() work in __handle_link_change() to not complete before the next usbnet_link_change() call turns the carrier on again, allowing rx_submit() to queue new SKBs. In that event the URB queue is filled faster than it can drain, ending up in a RCU stall: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 33108 jiffies s: 201 root: 0x1/. rcu: blocking rcu_node structures (internal RCU debug): Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 Call trace: arch_local_irq_enable+0x4/0x8 local_bh_enable+0x18/0x20 __netdev_alloc_skb+0x18c/0x1cc rx_submit+0x68/0x1f8 [usbnet] rx_alloc_submit+0x4c/0x74 [usbnet] usbnet_bh+0x1d8/0x218 [usbnet] usbnet_bh_tasklet+0x10/0x18 [usbnet] tasklet_action_common+0xa8/0x110 tasklet_action+0x2c/0x34 handle_softirqs+0x2cc/0x3a0 __do_softirq+0x10/0x18 ____do_softirq+0xc/0x14 call_on_irq_stack+0x24/0x34 do_softirq_own_stack+0x18/0x20 __irq_exit_rcu+0xa8/0xb8 irq_exit_rcu+0xc/0x30 el1_interrupt+0x34/0x48 el1h_64_irq_handler+0x14/0x1c el1h_64_irq+0x68/0x6c _raw_spin_unlock_irqrestore+0x38/0x48 xhci_urb_dequeue+0x1ac/0x45c [xhci_hcd] unlink1+0xd4/0xdc [usbcore] usb_hcd_unlink_urb+0x70/0xb0 [usbcore] usb_unlink_urb+0x24/0x44 [usbcore] unlink_urbs.constprop.0.isra.0+0x64/0xa8 [usbnet] __handle_link_change+0x34/0x70 [usbnet] usbnet_deferred_kevent+0x1c0/0x320 [usbnet] process_scheduled_works+0x2d0/0x48c worker_thread+0x150/0x1dc kthread+0xd8/0xe8 ret_from_fork+0x10/0x20 Get around the problem by delaying the carrier on to the scheduled work. This needs a new flag to keep track of the necessary action. The carrier ok check cannot be removed as it remains required for the LINK_RESET event flow. Fixes: 4b49f58fff00 ("usbnet: handle link change") Cc: stable@vger.kernel.org Signed-off-by: John Ernberg <john.ernberg@actia.se> Link: https://patch.msgid.link/20250723102526.1305339-1-john.ernberg@actia.se Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15Bluetooth: btusb: Add USB ID 3625:010b for TP-LINK Archer TX10UB NanoZenm Chen1-0/+4
commit d9da920233ec85af8b9c87154f2721a7dc4623f5 upstream. Add USB ID 3625:010b for TP-LINK Archer TX10UB Nano which is based on a Realtek RTL8851BU chip. The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below: T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 9 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=3625 ProdID=010b Rev= 0.00 S: Manufacturer=Realtek S: Product=802.11ax WLAN Adapter S: SerialNumber=00e04c000001 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 8 Cls=ff(vend.) Sub=ff Prot=ff Driver=rtl8851bu E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Cc: stable@vger.kernel.org Signed-off-by: Zenm Chen <zenmchen@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15USB: serial: option: add Foxconn T99W709Slark Xiao1-0/+2
commit ad1244e1ce18f8c1a5ebad8074bfcf10eacb0311 upstream. T99W709 is designed based on MTK T300(5G redcap) chip. There are 7 serial ports to be enumerated: AP_LOG, GNSS, AP_META, AT, MD_META, NPT, DBG. RSVD(5) for ADB port. test evidence as below: T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e15f Rev=00.01 S: Manufacturer=MediaTek Inc. S: Product=USB DATA CARD S: SerialNumber=355511220000399 C: #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I: If#=0x2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=usbfs I: If#=0x6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option Signed-off-by: Slark Xiao <slark_xiao@163.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15pptp: fix pptp_xmit() error pathEric Dumazet1-3/+4
[ Upstream commit ae633388cae349886f1a3cfb27aa092854b24c1b ] I accidentally added a bug in pptp_xmit() that syzbot caught for us. Only call ip_rt_put() if a route has been allocated. BUG: unable to handle page fault for address: ffffffffffffffdb PGD df3b067 P4D df3b067 PUD df3d067 PMD 0 Oops: Oops: 0002 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 6346 Comm: syz.0.336 Not tainted 6.16.0-next-20250804-syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:arch_atomic_add_return arch/x86/include/asm/atomic.h:85 [inline] RIP: 0010:raw_atomic_sub_return_release include/linux/atomic/atomic-arch-fallback.h:846 [inline] RIP: 0010:atomic_sub_return_release include/linux/atomic/atomic-instrumented.h:327 [inline] RIP: 0010:__rcuref_put include/linux/rcuref.h:109 [inline] RIP: 0010:rcuref_put+0x172/0x210 include/linux/rcuref.h:173 Call Trace: <TASK> dst_release+0x24/0x1b0 net/core/dst.c:167 ip_rt_put include/net/route.h:285 [inline] pptp_xmit+0x14b/0x1a90 drivers/net/ppp/pptp.c:267 __ppp_channel_push+0xf2/0x1c0 drivers/net/ppp/ppp_generic.c:2166 ppp_channel_push+0x123/0x660 drivers/net/ppp/ppp_generic.c:2198 ppp_write+0x2b0/0x400 drivers/net/ppp/ppp_generic.c:544 vfs_write+0x27b/0xb30 fs/read_write.c:684 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: de9c4861fb42 ("pptp: ensure minimal skb length in pptp_xmit()") Reported-by: syzbot+27d7cfbc93457e472e00@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/689095a5.050a0220.1fc43d.0009.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250807142146.2877060-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15nvmet: exit debugfs after discovery subsystem exitsMohamed Khalfella1-1/+1
[ Upstream commit 80f21806b8e34ae1e24c0fc6a0f0dfd9b055e130 ] Commit 528589947c180 ("nvmet: initialize discovery subsys after debugfs is initialized") changed nvmet_init() to initialize nvme discovery after "nvmet" debugfs directory is initialized. The change broke nvmet_exit() because discovery subsystem now depends on debugfs. Debugfs should be destroyed after discovery subsystem. Fix nvmet_exit() to do that. Reported-by: Yi Zhang <yi.zhang@redhat.com> Closes: https://lore.kernel.org/all/CAHj4cs96AfFQpyDKF_MdfJsnOEo=2V7dQgqjFv+k3t7H-=yGhA@mail.gmail.com/ Fixes: 528589947c180 ("nvmet: initialize discovery subsys after debugfs is initialized") Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Link: https://lore.kernel.org/r/20250807053507.2794335-1-mkhalfella@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15irqchip: Build IMX_MU_MSI only on ARMArnd Bergmann1-0/+1
[ Upstream commit 3b6a18f0da8720d612d8a682ea5c55870da068e0 ] Compile-testing IMX_MU_MSI on x86 without PCI_MSI support results in a build failure: drivers/gpio/gpio-sprd.c:8: include/linux/gpio/driver.h:41:33: error: field 'msiinfo' has incomplete type drivers/iommu/iommufd/viommu.c:4: include/linux/msi.h:528:33: error: field 'alloc_info' has incomplete type Tighten the dependency further to only allow compile testing on Arm. This could be refined further to allow certain x86 configs. This was submitted before to address a different build failure, which was fixed differently, but the problem has now returned in a different form. Fixes: 70afdab904d2d1e6 ("irqchip: Add IMX MU MSI controller driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250805160952.4006075-1-arnd@kernel.org Link: https://lore.kernel.org/all/20221215164109.761427-1-arnd@kernel.org/ Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: ti: icssg-prueth: Fix skb handling for XDP_PASSMeghana Malladi1-7/+8
[ Upstream commit d942fe13f72bec92f6c689fbd74c5ec38228c16a ] emac_rx_packet() is a common function for handling traffic for both xdp and non-xdp use cases. Use common logic for handling skb with or without xdp to prevent any incorrect packet processing. This patch fixes ping working with XDP_PASS for icssg driver. Fixes: 62aa3246f4623 ("net: ti: icssg-prueth: Add XDP support") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250803180216.3569139-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15eth: fbnic: Lock the tx_dropped updateMohsin Bashir1-0/+2
[ Upstream commit 53abd9c86fd086d8448ceec4e9ffbd65b6c17a37 ] Wrap copying of drop stats on TX path from fbd->hw_stats by the hw_stats_lock. Currently, it is being performed outside the lock and another thread accessing fbd->hw_stats can lead to inconsistencies. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-3-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15eth: fbnic: Fix tx_dropped reportingMohsin Bashir1-4/+4
[ Upstream commit 2972395d8fad7f4efc8555348f2f988d4941d797 ] Correctly copy the tx_dropped stats from the fbd->hw_stats to the rtnl_link_stats64 struct. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-2-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15eth: fbnic: remove the debugging trick of super high page biasJakub Kicinski2-6/+4
[ Upstream commit e407fceeaf1b2959892b4fc9b584843d3f2bfc05 ] Alex added page bias of LONG_MAX, which is admittedly quite a clever way of catching overflows of the pp ref count. The page pool code was "optimized" to leave the ref at 1 for freed pages so it can't catch basic bugs by itself any more. (Something we should probably address under DEBUG_NET...) Unfortunately for fbnic since commit f7dc3248dcfb ("skbuff: Optimization of SKB coalescing for page pool") core _may_ actually take two extra pp refcounts, if one of them is returned before driver gives up the bias the ret < 0 check in page_pool_unref_netmem() will trigger. While at it add a FBNIC_ to the name of the driver constant. Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free") Link: https://patch.msgid.link/20250801170754.2439577-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15Revert "net: mdio_bus: Use devm for getting reset GPIO"Jakub Kicinski1-2/+2
[ Upstream commit 175811b8f05f0da3e19b7d3124666649ddde3802 ] This reverts commit 3b98c9352511db627b606477fc7944b2fa53a165. Russell says: Using devm_*() [here] is completely wrong, because this is called from mdiobus_register_device(). This is not the probe function for the device, and thus there is no code to trigger the release of the resource on unregistration. Moreover, when the mdiodev is eventually probed, if the driver fails or the driver is unbound, the GPIO will be released, but a reference will be left behind. Using devm* with a struct device that is *not* currently being probed is fundamentally wrong - an abuse of devm. Reported-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/95449490-fa58-41d4-9493-c9213c1f2e7d@sirena.org.uk Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Fixes: 3b98c9352511 ("net: mdio_bus: Use devm for getting reset GPIO") Link: https://patch.msgid.link/20250801212742.2607149-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15benet: fix BUG when creating VFsMichal Schmidt1-1/+1
[ Upstream commit 5a40f8af2ba1b9bdf46e2db10e8c9710538fbc63 ] benet crashes as soon as SRIOV VFs are created: kernel BUG at mm/vmalloc.c:3457! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 4 UID: 0 PID: 7408 Comm: test.sh Kdump: loaded Not tainted 6.16.0+ #1 PREEMPT(voluntary) [...] RIP: 0010:vunmap+0x5f/0x70 [...] Call Trace: <TASK> __iommu_dma_free+0xe8/0x1c0 be_cmd_set_mac_list+0x3fe/0x640 [be2net] be_cmd_set_mac+0xaf/0x110 [be2net] be_vf_eth_addr_config+0x19f/0x330 [be2net] be_vf_setup+0x4f7/0x990 [be2net] be_pci_sriov_configure+0x3a1/0x470 [be2net] sriov_numvfs_store+0x20b/0x380 kernfs_fop_write_iter+0x354/0x530 vfs_write+0x9b9/0xf60 ksys_write+0xf3/0x1d0 do_syscall_64+0x8c/0x3d0 be_cmd_set_mac_list() calls dma_free_coherent() under a spin_lock_bh. Fix it by freeing only after the lock has been released. Fixes: 1a82d19ca2d6 ("be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250801101338.72502-1-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: airoha: npu: Add missing MODULE_FIRMWARE macrosLorenzo Bianconi1-0/+2
[ Upstream commit 4e7e471e2e3f9085fe1dbe821c4dd904a917c66a ] Introduce missing MODULE_FIRMWARE definitions for firmware autoload. Fixes: 23290c7bc190d ("net: airoha: Introduce Airoha NPU support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250801-airoha-npu-missing-module-firmware-v2-1-e860c824d515@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15ipa: fix compile-testing with qcom-mdt=mArnd Bergmann1-1/+1
[ Upstream commit 2df158047d532d0e2a6b39953656c738872151a3 ] There are multiple drivers that use the qualcomm mdt loader, but they have conflicting ideas of how to deal with that dependency when compile-testing for non-qualcomm targets: IPA only enables the MDT loader when the kernel config includes ARCH_QCOM, but the newly added ath12k support always enables it, which leads to a link failure with the combination of IPA=y and ATH12K=m: aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_firmware_load': ipa_main.c:(.text.unlikely+0x134): undefined reference to `qcom_mdt_load The ATH12K method seems more reliable here, so change IPA over to do the same thing. Fixes: 38a4066f593c ("net: ipa: support COMPILE_TEST") Fixes: c0dd3f4f7091 ("wifi: ath12k: enable ath12k AHB support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250731080024.2054904-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15eth: fbnic: unlink NAPIs from queues on error to openJakub Kicinski1-1/+3
[ Upstream commit 4b31bcb025cb497da2b01f87173108ff32d350d2 ] CI hit a UaF in fbnic in the AF_XDP portion of the queues.py test. The UaF is in the __sk_mark_napi_id_once() call in xsk_bind(), NAPI has been freed. Looks like the device failed to open earlier, and we lack clearing the NAPI pointer from the queue. Fixes: 557d02238e05 ("eth: fbnic: centralize the queue count and NAPI<>queue setting") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250728163129.117360-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15drm/xe/pf: Disable PF restart worker on device removalMichal Wajdeczko1-1/+31
[ Upstream commit c286ce6b01f633806b4db3e4ec8e0162928299cd ] We can't let restart worker run once device is removed, since other data that it might want to access could be already released. Explicitly disable worker as part of device cleanup action. Fixes: a4d1c5d0b99b ("drm/xe/pf: Move VFs reprovisioning to worker") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250801142822.180530-2-michal.wajdeczko@intel.com (cherry picked from commit a424353937c24554bb242a6582ed8f018b4a411c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15md: make rdev_addable usable for rcu modeYang Erkun1-1/+7
[ Upstream commit 13017b427118f4311471ee47df74872372ca8482 ] Our testcase trigger panic: BUG: kernel NULL pointer dereference, address: 00000000000000e0 ... Oops: Oops: 0000 [#1] SMP NOPTI CPU: 2 UID: 0 PID: 85 Comm: kworker/2:1 Not tainted 6.16.0+ #94 PREEMPT(none) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 Workqueue: md_misc md_start_sync RIP: 0010:rdev_addable+0x4d/0xf0 ... Call Trace: <TASK> md_start_sync+0x329/0x480 process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x14d/0x180 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: raid10 CR2: 00000000000000e0 ---[ end trace 0000000000000000 ]--- RIP: 0010:rdev_addable+0x4d/0xf0 md_spares_need_change in md_start_sync will call rdev_addable which protected by rcu_read_lock/rcu_read_unlock. This rcu context will help protect rdev won't be released, but rdev->mddev will be set to NULL before we call synchronize_rcu in md_kick_rdev_from_array. Fix this by using READ_ONCE and check does rdev->mddev still alive. Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration") Fixes: 570b9147deb6 ("md: use RCU lock to protect traversal in md_spares_need_change()") Signed-off-by: Yang Erkun <yangerkun@huawei.com> Link: https://lore.kernel.org/linux-raid/20250731114530.776670-1-yangerkun@huawei.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: mdio: mdio-bcm-unimac: Correct rate fallback logicFlorian Fainelli1-3/+2
[ Upstream commit a81649a4efd382497bf3d34a623360263adc6993 ] When the parent clock is a gated clock which has multiple parents, the clock provider (clk-scmi typically) might return a rate of 0 since there is not one of those particular parent clocks that should be chosen for returning a rate. Prior to ee975351cf0c ("net: mdio: mdio-bcm-unimac: Manage clock around I/O accesses"), we would not always be passing a clock reference depending upon how mdio-bcm-unimac was instantiated. In that case, we would take the fallback path where the rate is hard coded to 250MHz. Make sure that we still fallback to using a fixed rate for the divider calculation, otherwise we simply ignore the desired MDIO bus clock frequency which can prevent us from interfacing with Ethernet PHYs properly. Fixes: ee975351cf0c ("net: mdio: mdio-bcm-unimac: Manage clock around I/O accesses") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250730202533.3463529-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net/mlx5: Correctly set gso_segs when LRO is usedChristoph Paasch1-0/+1
[ Upstream commit 77bf1c55b2acc7fa3734b14f4561e3d75aea1a90 ] When gso_segs is left at 0, a number of assumptions will end up being incorrect throughout the stack. For example, in the GRO-path, we set NAPI_GRO_CB()->count to gso_segs. So, if a non-LRO'ed packet followed by an LRO'ed packet is being processed in GRO, the first one will have NAPI_GRO_CB()->count set to 1 and the next one to 0 (in dev_gro_receive()). Since commit 531d0d32de3e ("net/mlx5: Correctly set gso_size when LRO is used") these packets will get merged (as their gso_size now matches). So, we end up in gro_complete() with NAPI_GRO_CB()->count == 1 and thus don't call inet_gro_complete(). Meaning, checksum-validation in tcp_checksum_complete() will fail with a "hw csum failure". Even before the above mentioned commit, incorrect gso_segs means that other things like TCP's accounting of incoming packets (tp->segs_in, data_segs_in, rcv_ooopack) will be incorrect. Which means that if one does bytes_received/data_segs_in, the result will be bigger than the MTU. Fix this by initializing gso_segs correctly when LRO is used. Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Reported-by: Gal Pressman <gal@nvidia.com> Closes: https://lore.kernel.org/netdev/6583783f-f0fb-4fb1-a415-feec8155bc69@nvidia.com/ Signed-off-by: Christoph Paasch <cpaasch@openai.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250729-mlx5_gso_segs-v1-1-b48c480c1c12@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: airoha: Fix PPE table access in airoha_ppe_debugfs_foe_show()Lorenzo Bianconi1-6/+20
[ Upstream commit 38358fa3cc8e16c6862a3e5c5c233f9f652e3a6d ] In order to avoid any possible race we need to hold the ppe_lock spinlock accessing the hw PPE table. airoha_ppe_foe_get_entry routine is always executed holding ppe_lock except in airoha_ppe_debugfs_foe_show routine. Fix the problem introducing airoha_ppe_foe_get_entry_locked routine. Fixes: 3fe15c640f380 ("net: airoha: Introduce PPE debugfs support") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250731-airoha_ppe_foe_get_entry_locked-v2-1-50efbd8c0fd6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15spi: cs42l43: Property entry should be a null-terminated arraySimon Trimmer1-1/+1
[ Upstream commit ffcfd071eec7973e58c4ffff7da4cb0e9ca7b667 ] The software node does not specify a count of property entries, so the array must be null-terminated. When unterminated, this can lead to a fault in the downstream cs35l56 amplifier driver, because the node parse walks off the end of the array into unknown memory. Fixes: 0ca645ab5b15 ("spi: cs42l43: Add speaker id support to the bridge configuration") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220371 Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com> Link: https://patch.msgid.link/20250731160109.1547131-1-simont@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15s390/ap: Unmask SLCF bit in card and queue ap functions sysfsHarald Freudenberger1-1/+1
[ Upstream commit 123b7c7c2ba725daf3bfa5ce421d65b92cb5c075 ] The SLCF bit ("stateless command filtering") introduced with CEX8 cards was because of the function mask's default value suppressed when user space read the ap function for an AP card or queue. Unmask this bit so that user space applications like lszcrypt can evaluate and list this feature. Fixes: d4c53ae8e494 ("s390/ap: store TAPQ hwinfo in struct ap_card") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15nvmet: initialize discovery subsys after debugfs is initializedMohamed Khalfella1-6/+6
[ Upstream commit 528589947c1802b9357c2a9b96d88cc4a11cd88b ] During nvme target initialization discovery subsystem is initialized before "nvmet" debugfs directory is created. This results in discovery subsystem debugfs directory to be created in debugfs root directory. nvmet_init() -> nvmet_init_discovery() -> nvmet_subsys_alloc() -> nvmet_debugfs_subsys_setup() In other words, the codepath above is exeucted before nvmet_debugfs is created. We get /sys/kernel/debug/nqn.2014-08.org.nvmexpress.discovery instead of /sys/kernel/debug/nvmet/nqn.2014-08.org.nvmexpress.discovery. Move nvmet_init_discovery() call after nvmet_init_debugfs() to fix it. Fixes: 649fd41420a8 ("nvmet: add debugfs support") Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15pptp: ensure minimal skb length in pptp_xmit()Eric Dumazet1-6/+9
[ Upstream commit de9c4861fb42f0cd72da844c3c34f692d5895b7b ] Commit aabc6596ffb3 ("net: ppp: Add bound checking for skb data on ppp_sync_txmung") fixed ppp_sync_txmunge() We need a similar fix in pptp_xmit(), otherwise we might read uninit data as reported by syzbot. BUG: KMSAN: uninit-value in pptp_xmit+0xc34/0x2720 drivers/net/ppp/pptp.c:193 pptp_xmit+0xc34/0x2720 drivers/net/ppp/pptp.c:193 ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2290 [inline] ppp_input+0x1d6/0xe60 drivers/net/ppp/ppp_generic.c:2314 pppoe_rcv_core+0x1e8/0x760 drivers/net/ppp/pppoe.c:379 sk_backlog_rcv+0x142/0x420 include/net/sock.h:1148 __release_sock+0x1d3/0x330 net/core/sock.c:3213 release_sock+0x6b/0x270 net/core/sock.c:3767 pppoe_sendmsg+0x15d/0xcb0 drivers/net/ppp/pppoe.c:904 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x330/0x3d0 net/socket.c:727 ____sys_sendmsg+0x893/0xd80 net/socket.c:2566 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2620 __sys_sendmmsg+0x2d9/0x7c0 net/socket.c:2709 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+afad90ffc8645324afe5@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68887d86.a00a0220.b12ec.00cd.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Link: https://patch.msgid.link/20250729080207.1863408-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: mdio_bus: Use devm for getting reset GPIOBence Csókás1-2/+2
[ Upstream commit 3b98c9352511db627b606477fc7944b2fa53a165 ] Commit bafbdd527d56 ("phylib: Add device reset GPIO support") removed devm_gpiod_get_optional() in favor of the non-devres managed fwnode_get_named_gpiod(). When it was kind-of reverted by commit 40ba6a12a548 ("net: mdio: switch to using gpiod_get_optional()"), the devm functionality was not reinstated. Nor was the GPIO unclaimed on device remove. This leads to the GPIO being claimed indefinitely, even when the device and/or the driver gets removed. Fixes: bafbdd527d56 ("phylib: Add device reset GPIO support") Fixes: 40ba6a12a548 ("net: mdio: switch to using gpiod_get_optional()") Cc: Csaba Buday <buday.csaba@prolan.hu> Signed-off-by: Bence Csókás <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250728153455.47190-2-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15net: ipa: add IPA v5.1 and v5.5 to ipa_version_string()Luca Weiss1-1/+5
[ Upstream commit f2aa00e4f65efcf25ff6bc8198e21f031e7b9b1b ] Handle the case for v5.1 and v5.5 instead of returning "0.0". Also reword the comment below since I don't see any evidence of such a check happening, and - since 5.5 has been missing - can happen. Fixes: 3aac8ec1c028 ("net: ipa: add some new IPA versions") Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Alex Elder <elder@riscstar.com> Link: https://patch.msgid.link/20250728-ipa-5-1-5-5-version_string-v1-1-d7a5623d7ece@fairphone.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15phy: mscc: Fix parsing of unicast framesHoratiu Vultur2-0/+2
[ Upstream commit 6fb5ff63b35b7e849cc8510957f25753f87f63d2 ] According to the 1588 standard, it is possible to use both unicast and multicast frames to send the PTP information. It was noticed that if the frames were unicast they were not processed by the analyzer meaning that they were not timestamped. Therefore fix this to match also these unicast frames. Fixes: ab2bf9339357 ("net: phy: mscc: 1588 block initialization") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250726140307.3039694-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15md/md-cluster: handle REMOVE message earlierHeming Zhao1-3/+6
[ Upstream commit 948b1fe12005d39e2b49087b50e5ee55c9a8f76f ] Commit a1fd37f97808 ("md: Don't wait for MD_RECOVERY_NEEDED for HOT_REMOVE_DISK ioctl") introduced a regression in the md_cluster module. (Failed cases 02r1_Manage_re-add & 02r10_Manage_re-add) Consider a 2-node cluster: - node1 set faulty & remove command on a disk. - node2 must correctly update the array metadata. Before a1fd37f97808, on node1, the delay between msg:METADATA_UPDATED (triggered by faulty) and msg:REMOVE was sufficient for node2 to reload the disk info (written by node1). After a1fd37f97808, node1 no longer waits between faulty and remove, causing it to send msg:REMOVE while node2 is still reloading disk info. This often results in node2 failing to remove the faulty disk. == how to trigger == set up a 2-node cluster (node1 & node2) with disks vdc & vdd. on node1: mdadm -CR /dev/md0 -l1 -b clustered -n2 /dev/vdc /dev/vdd --assume-clean ssh node2-ip mdadm -A /dev/md0 /dev/vdc /dev/vdd mdadm --manage /dev/md0 --fail /dev/vdc --remove /dev/vdc check array status on both nodes with "mdadm -D /dev/md0". node1 output: Number Major Minor RaidDevice State - 0 0 0 removed 1 254 48 1 active sync /dev/vdd node2 output: Number Major Minor RaidDevice State - 0 0 0 removed 1 254 48 1 active sync /dev/vdd 0 254 32 - faulty /dev/vdc Fixes: a1fd37f97808 ("md: Don't wait for MD_RECOVERY_NEEDED for HOT_REMOVE_DISK ioctl") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Link: https://lore.kernel.org/linux-raid/20250728042145.9989-1-heming.zhao@suse.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15PCI: pnv_php: Fix surprise plug detection and recoveryTimothy Pearson1-3/+107
[ Upstream commit a2a2a6fc2469524caa713036297c542746d148dc ] The existing PowerNV hotplug code did not handle surprise plug events correctly, leading to a complete failure of the hotplug system after device removal and a required reboot to detect new devices. This comes down to two issues: 1) When a device is surprise removed, often the bridge upstream port will cause a PE freeze on the PHB. If this freeze is not cleared, the MSI interrupts from the bridge hotplug notification logic will not be received by the kernel, stalling all plug events on all slots associated with the PE. 2) When a device is removed from a slot, regardless of surprise or programmatic removal, the associated PHB/PE ls left frozen. If this freeze is not cleared via a fundamental reset, skiboot is unable to clear the freeze and cannot retrain / rescan the slot. This also requires a reboot to clear the freeze and redetect the device in the slot. Issue the appropriate unfreeze and rescan commands on hotplug events, and don't oops on hotplug if pci_bus_to_OF_node() returns NULL. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [bhelgaas: tidy comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/171044224.1359864.1752615546988.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15PCI: pnv_php: Work around switches with broken presence detectionTimothy Pearson1-0/+27
[ Upstream commit 80f9fc2362797538ebd4fd70a1dfa838cc2c2cdb ] The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system was observed to incorrectly assert the Presence Detect Set bit in its capabilities when tested on a Raptor Computing Systems Blackbird system, resulting in the hot insert path never attempting a rescan of the bus and any downstream devices not being re-detected. Work around this by additionally checking whether the PCIe data link is active or not when performing presence detection on downstream switches' ports, similar to the pciehp_hpc.c driver. Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/505981576.1359853.1752615415117.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15PCI: pnv_php: Clean up allocated IRQs on unplugTimothy Pearson1-19/+77
[ Upstream commit 4668619092554e1b95c9a5ac2941ca47ba6d548a ] When the root of a nested PCIe bridge configuration is unplugged, the pnv_php driver leaked the allocated IRQ resources for the child bridges' hotplug event notifications, resulting in a panic. Fix this by walking all child buses and deallocating all its IRQ resources before calling pci_hp_remove_devices(). Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so that it is only destroyed in pnv_php_free_slot(), instead of pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will now be called by workers triggered by hot unplug interrupts, so the workqueue needs to stay allocated. The abridged kernel panic that occurs without this patch is as follows: WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2 Call Trace: msi_device_data_release+0x34/0x9c (unreliable) release_nodes+0x64/0x13c devres_release_all+0xc0/0x140 device_del+0x2d4/0x46c pci_destroy_dev+0x5c/0x194 pci_hp_remove_devices+0x90/0x128 pci_hp_remove_devices+0x44/0x128 pnv_php_disable_slot+0x54/0xd4 power_write_file+0xf8/0x18c pci_slot_attr_store+0x40/0x5c sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x3bc/0x50c ksys_write+0x84/0x140 system_call_exception+0x124/0x230 system_call_vectored_common+0x15c/0x2ec Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [bhelgaas: tidy comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFDJason Gunthorpe9-11/+59
[ Upstream commit 86624ba3b522b6512def25534341da93356c8da4 ] This was missed during the initial implementation. The VFIO PCI encodes the vf_token inside the device name when opening the device from the group FD, something like: "0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3" This is used to control access to a VF unless there is co-ordination with the owner of the PF. Since we no longer have a device name in the cdev path, pass the token directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field indicated by VFIO_DEVICE_BIND_FLAG_TOKEN. Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD") Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15scsi: sd: Make sd shutdown issue START STOP UNIT appropriatelySalomon Dushimirimana1-1/+3
[ Upstream commit 8e48727c26c4d839ff9b4b73d1cae486bea7fe19 ] Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") enabled libata EH to manage device power mode trasitions for system suspend/resume and removed the flag from ata_scsi_dev_config. However, since the sd_shutdown() function still relies on the manage_system_start_stop flag, a spin-down command is not issued to the disk with command "echo 1 > /sys/block/sdb/device/delete" sd_shutdown() can be called for both system/runtime start stop operations, so utilize the manage_run_time_start_stop flag set in the ata_scsi_dev_config and issue a spin-down command during disk removal when the system is running. This is in addition to when the system is powering off and manage_shutdown flag is set. The manage_system_start_stop flag will still be used for drivers that still set the flag. Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20250724214520.112927-1-salomondush@google.com Tested-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15scsi: ufs: core: Use link recovery when h8 exit fails during runtime resumeSeunghui Lee1-1/+9
[ Upstream commit 35dabf4503b94a697bababe94678a8bc989c3223 ] If the h8 exit fails during runtime resume process, the runtime thread enters runtime suspend immediately and the error handler operates at the same time. It becomes stuck and cannot be recovered through the error handler. To fix this, use link recovery instead of the error handler. Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths") Signed-off-by: Seunghui Lee <sh043.lee@samsung.com> Link: https://lore.kernel.org/r/20250717081213.6811-1-sh043.lee@samsung.com Reviewed-by: Bean Huo <beanhuo@micron.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15scsi: Revert "scsi: iscsi: Fix HW conn removal use after free"Li Lingfeng1-0/+2
[ Upstream commit 7bdc68921481c19cd8c85ddf805a834211c19e61 ] This reverts commit c577ab7ba5f3bf9062db8a58b6e89d4fe370447e. The invocation of iscsi_put_conn() in iscsi_iter_destory_conn_fn() is used to free the initial reference counter of iscsi_cls_conn. For non-qla4xxx cases, the ->destroy_conn() callback (e.g., iscsi_conn_teardown) will call iscsi_remove_conn() and iscsi_put_conn() to remove the connection from the children list of session and free the connection at last. However for qla4xxx, it is not the case. The ->destroy_conn() callback of qla4xxx will keep the connection in the session conn_list and doesn't use iscsi_put_conn() to free the initial reference counter. Therefore, it seems necessary to keep the iscsi_put_conn() in the iscsi_iter_destroy_conn_fn(), otherwise, there will be memory leak problem. Link: https://lore.kernel.org/all/88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com/ Fixes: c577ab7ba5f3 ("scsi: iscsi: Fix HW conn removal use after free") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Link: https://lore.kernel.org/r/20250715073926.3529456-1-lilingfeng3@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15scsi: mpt3sas: Fix a fw_event memory leakTomas Henzl1-2/+1
[ Upstream commit 3e90b38781e3bdd651edaf789585687611638862 ] In _mpt3sas_fw_work() the fw_event reference is removed, it should also be freed in all cases. Fixes: 4318c7347847 ("scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.") Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20250723153018.50518-1-thenzl@redhat.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15vfio/pci: Separate SR-IOV VF dev_setAlex Williamson1-1/+1
[ Upstream commit e908f58b6beb337cbe4481d52c3f5c78167b1aab ] In the below noted Fixes commit we introduced a reflck mutex to allow better scaling between devices for open and close. The reflck was based on the hot reset granularity, device level for root bus devices which cannot support hot reset or bus/slot reset otherwise. Overlooked in this were SR-IOV VFs, where there's also no bus reset option, but the default for a non-root-bus, non-slot-based device is bus level reflck granularity. The reflck mutex has since become the dev_set mutex (via commit 2cd8b14aaa66 ("vfio/pci: Move to the device set infrastructure")) and is our defacto serialization for various operations and ioctls. It still seems to be the case though that sets of vfio-pci devices really only need serialization relative to hot resets affecting the entire set, which is not relevant to SR-IOV VFs. As described in the Closes link below, this serialization contributes to startup latency when multiple VFs sharing the same "bus" are opened concurrently. Mark the device itself as the basis of the dev_set for SR-IOV VFs. Reported-by: Aaron Lewis <aaronlewis@google.com> Closes: https://lore.kernel.org/all/20250626180424.632628-1-aaronlewis@google.com Tested-by: Aaron Lewis <aaronlewis@google.com> Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release") Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250626225623.1180952-1-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15vfio/pds: Fix missing detach_ioas opBrett Creeley1-0/+1
[ Upstream commit fe24d5bc635e103a517ec201c3cb571eeab8be2f ] When CONFIG_IOMMUFD is enabled and a device is bound to the pds_vfio_pci driver, the following WARN_ON() trace is seen and probe fails: WARNING: CPU: 0 PID: 5040 at drivers/vfio/vfio_main.c:317 __vfio_register_dev+0x130/0x140 [vfio] <...> pds_vfio_pci 0000:08:00.1: probe with driver pds_vfio_pci failed with error -22 This is because the driver's vfio_device_ops.detach_ioas isn't set. Fix this by using the generic vfio_iommufd_physical_detach_ioas function. Fixes: 38fe3975b4c2 ("vfio/pds: Initial support for pds VFIO driver") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250702163744.69767-1-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15vfio: Prevent open_count decrement to negativeJacob Pan1-1/+2
[ Upstream commit 982ddd59ed97dc7e63efd97ed50273ffb817bd41 ] When vfio_df_close() is called with open_count=0, it triggers a warning in vfio_assert_device_open() but still decrements open_count to -1. This allows a subsequent open to incorrectly pass the open_count == 0 check, leading to unintended behavior, such as setting df->access_granted = true. For example, running an IOMMUFD compat no-IOMMU device with VFIO tests (https://github.com/awilliam/tests/blob/master/vfio-noiommu-pci-device-open.c) results in a warning and a failed VFIO_GROUP_GET_DEVICE_FD ioctl on the first run, but the second run succeeds incorrectly. Add checks to avoid decrementing open_count below zero. Fixes: 05f37e1c03b6 ("vfio: Pass struct vfio_device_file * to vfio_device_open/close()") Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com> Link: https://lore.kernel.org/r/20250618234618.1910456-2-jacob.pan@linux.microsoft.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15vfio: Fix unbalanced vfio_df_close call in no-iommu modeJacob Pan2-4/+7
[ Upstream commit b25e271b377999191b12f0afbe1861edcf57e3fe ] For devices with no-iommu enabled in IOMMUFD VFIO compat mode, the group open path skips vfio_df_open(), leaving open_count at 0. This causes a warning in vfio_assert_device_open(device) when vfio_df_close() is called during group close. The correct behavior is to skip only the IOMMUFD bind in the device open path for no-iommu devices. Commit 6086efe73498 omitted vfio_df_open(), which was too broad. This patch restores the previous behavior, ensuring the vfio_df_open is called in the group open path. Fixes: 6086efe73498 ("vfio-iommufd: Move noiommu compat validation out of vfio_iommufd_bind()") Suggested-by: Alex Williamson <alex.williamson@redhat.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20250618234618.1910456-1-jacob.pan@linux.microsoft.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15i2c: muxes: mule: Fix an error handling path in mule_i2c_mux_probe()Christophe JAILLET1-2/+1
[ Upstream commit 33ac5155891cab165c93b51b0e22e153eacc2ee7 ] If an error occurs in the loop that creates the device adapters, then a reference to 'dev' still needs to be released. Use for_each_child_of_node_scoped() to both fix the issue and save one line of code. Fixes: d0f8e97866bf ("i2c: muxes: add support for tsd,mule-i2c multiplexer") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: rv3028: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit b574acb3cf7591d2513a9f29f8c2021ad55fb881 ] When rv3028_clkout_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: f583c341a515f ("rtc: rv3028: add clkout support") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-6-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: pcf8563: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit 906726a5efeefe0ef0103ccff5312a09080c04ae ] When pcf8563_clkout_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: a39a6405d5f94 ("rtc: pcf8563: add CLKOUT to common clock framework") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-5-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: pcf85063: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit 186ae1869880e58bb3f142d222abdb35ecb4df0f ] When pcf85063_clkout_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: 8c229ab6048b7 ("rtc: pcf85063: Add pcf85063 clkout control to common clock framework") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-4-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: nct3018y: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit 437c59e4b222cd697b4cf95995d933e7d583c5f1 ] When nct3018y_clkout_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: 5adbaed16cc63 ("rtc: Add NCT3018Y real time clock driver") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-3-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: hym8563: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit d0a518eb0a692a2ab8357e844970660c5ea37720 ] When hym8563_clkout_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: dcaf038493525 ("rtc: add hym8563 rtc-driver") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-2-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15rtc: ds1307: fix incorrect maximum clock rate handlingBrian Masney1-1/+1
[ Upstream commit cf6eb547a24af7ad7bbd2abe9c5327f956bbeae8 ] When ds3231_clk_sqw_round_rate() is called with a requested rate higher than the highest supported rate, it currently returns 0, which disables the clock. According to the clk API, round_rate() should instead return the highest supported rate. Update the function to return the maximum supported rate in this case. Fixes: 6c6ff145b3346 ("rtc: ds1307: add clock provider support for DS3231") Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20250710-rtc-clk-round-rate-v1-1-33140bb2278e@redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>