summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2025-08-01net: hns3: default enable tx bounce buffer when smmu enabledJijie Shao2-0/+33
[ Upstream commit 49ade8630f36e9dca2395592cfb0b7deeb07e746 ] The SMMU engine on HIP09 chip has a hardware issue. SMMU pagetable prefetch features may prefetch and use a invalid PTE even the PTE is valid at that time. This will cause the device trigger fake pagefaults. The solution is to avoid prefetching by adding a SYNC command when smmu mapping a iova. But the performance of nic has a sharp drop. Then we do this workaround, always enable tx bounce buffer, avoid mapping/unmapping on TX path. This issue only affects HNS3, so we always enable tx bounce buffer when smmu enabled to improve performance. Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-5-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net: hns3: fixed vf get max channels bugJian Shen1-5/+1
[ Upstream commit b3e75c0bcc53f647311960bc1b0970b9b480ca5a ] Currently, the queried maximum of vf channels is the maximum of channels supported by each TC. However, the actual maximum of channels is the maximum of channels supported by the device. Fixes: 849e46077689 ("net: hns3: add ethtool_ops.get_channels support for VF") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-4-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net: hns3: disable interrupt when ptp init failedYonglong Liu1-3/+6
[ Upstream commit cde304655f25d94a996c45b0f9956e7dcc2bc4c0 ] When ptp init failed, we'd better disable the interrupt and clear the flag, to avoid early report interrupt at next probe. Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-3-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net: hns3: fix concurrent setting vlan filter issueJian Shen1-15/+21
[ Upstream commit 4555f8f8b6aa46940f55feb6a07704c2935b6d6e ] The vport->req_vlan_fltr_en may be changed concurrently by function hclge_sync_vlan_fltr_state() called in periodic work task and function hclge_enable_vport_vlan_filter() called by user configuration. It may cause the user configuration inoperative. Fixes it by protect the vport->req_vlan_fltr by vport_lock. Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722125423.1270673-2-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01s390/ism: fix concurrency management in ism_cmd()Halil Pasic1-0/+3
[ Upstream commit 897e8601b9cff1d054cdd53047f568b0e1995726 ] The s390x ISM device data sheet clearly states that only one request-response sequence is allowable per ISM function at any point in time. Unfortunately as of today the s390/ism driver in Linux does not honor that requirement. This patch aims to rectify that. This problem was discovered based on Aliaksei's bug report which states that for certain workloads the ISM functions end up entering error state (with PEC 2 as seen from the logs) after a while and as a consequence connections handled by the respective function break, and for future connection requests the ISM device is not considered -- given it is in a dysfunctional state. During further debugging PEC 3A was observed as well. A kernel message like [ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a is a reliable indicator of the stated function entering error state with PEC 2. Let me also point out that a kernel message like [ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery is a reliable indicator that the ISM function won't be auto-recovered because the ISM driver currently lacks support for it. On a technical level, without this synchronization, commands (inputs to the FW) may be partially or fully overwritten (corrupted) by another CPU trying to issue commands on the same function. There is hard evidence that this can lead to DMB token values being used as DMB IOVAs, leading to PEC 2 PCI events indicating invalid DMA. But this is only one of the failure modes imaginable. In theory even completely losing one command and executing another one twice and then trying to interpret the outputs as if the command we intended to execute was actually executed and not the other one is also possible. Frankly, I don't feel confident about providing an exhaustive list of possible consequences. Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory") Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()Douglas Anderson1-1/+1
[ Upstream commit 15a7ca747d9538c2ad8b0c81dd4c1261e0736c82 ] As reported by the kernel test robot, a recent patch introduced an unnecessary semicolon. Remove it. Fixes: 55e8ff842051 ("drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506301704.0SBj6ply-lkp@intel.com/ Reviewed-by: Devarsh Thakkar <devarsht@ti.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250714130631.1.I1cfae3222e344a3b3c770d079ee6b6f7f3b5d636@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01can: netlink: can_changelink(): fix NULL pointer deref of struct ↵Marc Kleine-Budde2-3/+21
can_priv::do_set_mode [ Upstream commit c1f3f9797c1f44a762e6f5f72520b2e520537b52 ] Andrei Lalaev reported a NULL pointer deref when a CAN device is restarted from Bus Off and the driver does not implement the struct can_priv::do_set_mode callback. There are 2 code path that call struct can_priv::do_set_mode: - directly by a manual restart from the user space, via can_changelink() - delayed automatic restart after bus off (deactivated by default) To prevent the NULL pointer deference, refuse a manual restart or configure the automatic restart delay in can_changelink() and report the error via extack to user space. As an additional safety measure let can_restart() return an error if can_priv::do_set_mode is not set instead of dereferencing it unchecked. Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com> Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b520097bb96@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01i40e: When removing VF MAC filters, only check PF-set MACJamie Bainbridge1-2/+2
[ Upstream commit 5a0df02999dbe838c3feed54b1d59e9445f68b89 ] When the PF is processing an Admin Queue message to delete a VF's MACs from the MAC filter, we currently check if the PF set the MAC and if the VF is trusted. This results in undesirable behaviour, where if a trusted VF with a PF-set MAC sets itself down (which sends an AQ message to delete the VF's MAC filters) then the VF MAC is erased from the interface. This results in the VF losing its PF-set MAC which should not happen. There is no need to check for trust at all, because an untrusted VF cannot change its own MAC. The only check needed is whether the PF set the MAC. If the PF set the MAC, then don't erase the MAC on link-down. Resolve this by changing the deletion check only for PF-set MAC. (the out-of-tree driver has also intentionally removed the check for VF trust here with OOT driver version 2.26.8, this changes the Linux kernel driver behaviour and comment to match the OOT driver behaviour) Fixes: ea2a1cfc3b201 ("i40e: Fix VF MAC filter removal") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01i40e: report VF tx_dropped with tx_errors instead of tx_discardsDennis Chen1-1/+1
[ Upstream commit 50b2af451597ca6eefe9d4543f8bbf8de8aa00e7 ] Currently the tx_dropped field in VF stats is not updated correctly when reading stats from the PF. This is because it reads from i40e_eth_stats.tx_discards which seems to be unused for per VSI stats, as it is not updated by i40e_update_eth_stats() and the corresponding register, GLV_TDPC, is not implemented[1]. Use i40e_eth_stats.tx_errors instead, which is actually updated by i40e_update_eth_stats() by reading from GLV_TEPC. To test, create a VF and try to send bad packets through it: $ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs $ cat test.py from scapy.all import * vlan_pkt = Ether(dst="ff:ff:ff:ff:ff:ff") / Dot1Q(vlan=999) / IP(dst="192.168.0.1") / ICMP() ttl_pkt = IP(dst="8.8.8.8", ttl=0) / ICMP() print("Send packet with bad VLAN tag") sendp(vlan_pkt, iface="enp2s0f0v0") print("Send packet with TTL=0") sendp(ttl_pkt, iface="enp2s0f0v0") $ ip -s link show dev enp2s0f0 16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 0 $ python test.py Send packet with bad VLAN tag . Sent 1 packets. Send packet with TTL=0 . Sent 1 packets. $ ip -s link show dev enp2s0f0 16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 0 A packet with non-existent VLAN tag and a packet with TTL = 0 are sent, but tx_dropped is not incremented. After patch: $ ip -s link show dev enp2s0f0 19: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 0 0 0 0 0 0 TX: bytes packets errors dropped carrier collsns 0 0 0 0 0 0 vf 0 link/ether 4a:b7:3d:37:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off RX: bytes packets mcast bcast dropped 0 0 0 0 0 TX: bytes packets dropped 0 0 2 Fixes: dc645daef9af5bcbd9c ("i40e: implement VF stats NDO") Signed-off-by: Dennis Chen <dechen@redhat.com> Link: https://www.intel.com/content/www/us/en/content-details/596333/intel-ethernet-controller-x710-tm4-at2-carlsville-datasheet.html Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net/mlx5: E-Switch, Fix peer miss rules to use peer eswitchShahar Shitrit1-54/+54
[ Upstream commit 5b4c56ad4da0aa00b258ab50b1f5775b7d3108c7 ] In the original design, it is assumed local and peer eswitches have the same number of vfs. However, in new firmware, local and peer eswitches can have different number of vfs configured by mlxconfig. In such configuration, it is incorrect to derive the number of vfs from the local device's eswitch. Fix this by updating the peer miss rules add and delete functions to use the peer device's eswitch and vf count instead of the local device's information, ensuring correct behavior regardless of vf configuration differences. Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752753970-261832-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net/mlx5: Fix memory leak in cmd_exec()Chiara Meiohas1-2/+2
[ Upstream commit 3afa3ae3db52e3c216d77bd5907a5a86833806cc ] If cmd_exec() is called with callback and mlx5_cmd_invoke() returns an error, resources allocated in cmd_exec() will not be freed. Fix the code to release the resources if mlx5_cmd_invoke() returns an error. Fixes: f086470122d5 ("net/mlx5: cmdif, Return value improvements") Reported-by: Alex Tereshkin <atereshkin@nvidia.com> Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752753970-261832-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01net: ti: icssg-prueth: Fix buffer allocation for ICSSGHimanshu Mittal5-73/+190
[ Upstream commit 6e86fb73de0fe3ec5cdcd5873ad1d6005f295b64 ] Fixes overlapping buffer allocation for ICSSG peripheral used for storing packets to be received/transmitted. There are 3 buffers: 1. Buffer for Locally Injected Packets 2. Buffer for Forwarding Packets 3. Buffer for Host Egress Packets In existing allocation buffers for 2. and 3. are overlapping causing packet corruption. Packet corruption observations: During tcp iperf testing, due to overlapping buffers the received ack packet overwrites the packet to be transmitted. So, we see packets on wire with the ack packet content inside the content of next TCP packet from sender device. Details for AM64x switch mode: -> Allocation by existing driver: +---------+-------------------------------------------------------------+ | | SLICE 0 | SLICE 1 | | +------+--------------+--------+------+--------------+--------+ | | Slot | Base Address | Size | Slot | Base Address | Size | |---------+------+--------------+--------+------+--------------+--------+ | | 0 | 70000000 | 0x2000 | 0 | 70010000 | 0x2000 | | | 1 | 70002000 | 0x2000 | 1 | 70012000 | 0x2000 | | | 2 | 70004000 | 0x2000 | 2 | 70014000 | 0x2000 | | FWD | 3 | 70006000 | 0x2000 | 3 | 70016000 | 0x2000 | | Buffers | 4 | 70008000 | 0x2000 | 4 | 70018000 | 0x2000 | | | 5 | 7000A000 | 0x2000 | 5 | 7001A000 | 0x2000 | | | 6 | 7000C000 | 0x2000 | 6 | 7001C000 | 0x2000 | | | 7 | 7000E000 | 0x2000 | 7 | 7001E000 | 0x2000 | +---------+------+--------------+--------+------+--------------+--------+ | | 8 | 70020000 | 0x1000 | 8 | 70028000 | 0x1000 | | | 9 | 70021000 | 0x1000 | 9 | 70029000 | 0x1000 | | | 10 | 70022000 | 0x1000 | 10 | 7002A000 | 0x1000 | | Our | 11 | 70023000 | 0x1000 | 11 | 7002B000 | 0x1000 | | LI | 12 | 00000000 | 0x0 | 12 | 00000000 | 0x0 | | Buffers | 13 | 00000000 | 0x0 | 13 | 00000000 | 0x0 | | | 14 | 00000000 | 0x0 | 14 | 00000000 | 0x0 | | | 15 | 00000000 | 0x0 | 15 | 00000000 | 0x0 | +---------+------+--------------+--------+------+--------------+--------+ | | 16 | 70024000 | 0x1000 | 16 | 7002C000 | 0x1000 | | | 17 | 70025000 | 0x1000 | 17 | 7002D000 | 0x1000 | | | 18 | 70026000 | 0x1000 | 18 | 7002E000 | 0x1000 | | Their | 19 | 70027000 | 0x1000 | 19 | 7002F000 | 0x1000 | | LI | 20 | 00000000 | 0x0 | 20 | 00000000 | 0x0 | | Buffers | 21 | 00000000 | 0x0 | 21 | 00000000 | 0x0 | | | 22 | 00000000 | 0x0 | 22 | 00000000 | 0x0 | | | 23 | 00000000 | 0x0 | 23 | 00000000 | 0x0 | +---------+------+--------------+--------+------+--------------+--------+ --> here 16, 17, 18, 19 overlapping with below express buffer +-----+-----------------------------------------------+ | | SLICE 0 | SLICE 1 | | +------------+----------+------------+----------+ | | Start addr | End addr | Start addr | End addr | +-----+------------+----------+------------+----------+ | EXP | 70024000 | 70028000 | 7002C000 | 70030000 | <-- Overlapping | PRE | 70030000 | 70033800 | 70034000 | 70037800 | +-----+------------+----------+------------+----------+ +---------------------+----------+----------+ | | SLICE 0 | SLICE 1 | +---------------------+----------+----------+ | Default Drop Offset | 00000000 | 00000000 | <-- Field not configured +---------------------+----------+----------+ -> Allocation this patch brings: +---------+-------------------------------------------------------------+ | | SLICE 0 | SLICE 1 | | +------+--------------+--------+------+--------------+--------+ | | Slot | Base Address | Size | Slot | Base Address | Size | |---------+------+--------------+--------+------+--------------+--------+ | | 0 | 70000000 | 0x2000 | 0 | 70040000 | 0x2000 | | | 1 | 70002000 | 0x2000 | 1 | 70042000 | 0x2000 | | | 2 | 70004000 | 0x2000 | 2 | 70044000 | 0x2000 | | FWD | 3 | 70006000 | 0x2000 | 3 | 70046000 | 0x2000 | | Buffers | 4 | 70008000 | 0x2000 | 4 | 70048000 | 0x2000 | | | 5 | 7000A000 | 0x2000 | 5 | 7004A000 | 0x2000 | | | 6 | 7000C000 | 0x2000 | 6 | 7004C000 | 0x2000 | | | 7 | 7000E000 | 0x2000 | 7 | 7004E000 | 0x2000 | +---------+------+--------------+--------+------+--------------+--------+ | | 8 | 70010000 | 0x1000 | 8 | 70050000 | 0x1000 | | | 9 | 70011000 | 0x1000 | 9 | 70051000 | 0x1000 | | | 10 | 70012000 | 0x1000 | 10 | 70052000 | 0x1000 | | Our | 11 | 70013000 | 0x1000 | 11 | 70053000 | 0x1000 | | LI | 12 | 00000000 | 0x0 | 12 | 00000000 | 0x0 | | Buffers | 13 | 00000000 | 0x0 | 13 | 00000000 | 0x0 | | | 14 | 00000000 | 0x0 | 14 | 00000000 | 0x0 | | | 15 | 00000000 | 0x0 | 15 | 00000000 | 0x0 | +---------+------+--------------+--------+------+--------------+--------+ | | 16 | 70014000 | 0x1000 | 16 | 70054000 | 0x1000 | | | 17 | 70015000 | 0x1000 | 17 | 70055000 | 0x1000 | | | 18 | 70016000 | 0x1000 | 18 | 70056000 | 0x1000 | | Their | 19 | 70017000 | 0x1000 | 19 | 70057000 | 0x1000 | | LI | 20 | 00000000 | 0x0 | 20 | 00000000 | 0x0 | | Buffers | 21 | 00000000 | 0x0 | 21 | 00000000 | 0x0 | | | 22 | 00000000 | 0x0 | 22 | 00000000 | 0x0 | | | 23 | 00000000 | 0x0 | 23 | 00000000 | 0x0 | +---------+------+--------------+--------+------+--------------+--------+ +-----+-----------------------------------------------+ | | SLICE 0 | SLICE 1 | | +------------+----------+------------+----------+ | | Start addr | End addr | Start addr | End addr | +-----+------------+----------+------------+----------+ | EXP | 70018000 | 7001C000 | 70058000 | 7005C000 | | PRE | 7001C000 | 7001F800 | 7005C000 | 7005F800 | +-----+------------+----------+------------+----------+ +---------------------+----------+----------+ | | SLICE 0 | SLICE 1 | +---------------------+----------+----------+ | Default Drop Offset | 7001F800 | 7005F800 | +---------------------+----------+----------+ Rootcause: missing buffer configuration for Express frames in function: prueth_fw_offload_buffer_setup() Details: Driver implements two distinct buffer configuration functions that are invoked based on the driver state and ICSSG firmware:- - prueth_fw_offload_buffer_setup() - prueth_emac_buffer_setup() During initialization, driver creates standard network interfaces (netdevs) and configures buffers via prueth_emac_buffer_setup(). This function properly allocates and configures all required memory regions including: - LI buffers - Express packet buffers - Preemptible packet buffers However, when the driver transitions to an offload mode (switch/HSR/PRP), buffer reconfiguration is handled by prueth_fw_offload_buffer_setup(). This function does not reconfigure the buffer regions required for Express packets, leading to incorrect buffer allocation. Fixes: abd5576b9c57 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware") Signed-off-by: Himanshu Mittal <h-mittal1@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250717094220.546388-1-h-mittal1@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01staging: vchiq_arm: Make vchiq_shutdown never failStefan Wahren1-2/+1
[ Upstream commit f2b8ebfb867011ddbefbdf7b04ad62626cbc2afd ] Most of the users of vchiq_shutdown ignore the return value, which is bad because this could lead to resource leaks. So instead of changing all calls to vchiq_shutdown, it's easier to make vchiq_shutdown never fail. Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20250715161108.3411-4-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01platform/x86: Fix initialization order for firmware_attributes_classTorsten Hilbrich1-1/+2
[ Upstream commit 2bfe3ae1aa45f8b61cb0dc462114fd0c9636ad32 ] The think-lmi driver uses the firwmare_attributes_class. But this class is registered after think-lmi, causing the "think-lmi" directory in "/sys/class/firmware-attributes" to be missing when the driver is compiled as builtin. Fixes: 55922403807a ("platform/x86: think-lmi: Directly use firmware_attributes_class") Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Link: https://lore.kernel.org/r/7dce5f7f-c348-4350-ac53-d14a8e1e8034@secunet.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01platform/mellanox: mlxbf-pmc: Use kstrtobool() to check 0/1 inputShravan Kumar Ramani1-6/+4
[ Upstream commit 0e2cebd72321caeef84b6ba7084e85be0287fb4b ] For setting the enable value, the input should be 0 or 1 only. Use kstrtobool() in place of kstrtoint() in mlxbf_pmc_enable_store() to accept only valid input. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/2ee618c59976bcf1379d5ddce2fc60ab5014b3a9.1751380187.git.shravankr@nvidia.com [ij: split kstrbool() change to own commit.] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01platform/mellanox: mlxbf-pmc: Validate event/enable inputShravan Kumar Ramani1-1/+4
[ Upstream commit f8c1311769d3b2c82688b294b4ae03e94f1c326d ] Before programming the event info, validate the event number received as input by checking if it exists in the event_list. Also fix a typo in the comment for mlxbf_pmc_get_event_name() to correctly mention that it returns the event name when taking the event number as input, and not the other way round. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/2ee618c59976bcf1379d5ddce2fc60ab5014b3a9.1751380187.git.shravankr@nvidia.com [ij: split kstrbool() change to own commit.] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01platform/mellanox: mlxbf-pmc: Remove newline char from event name inputShravan Kumar Ramani1-1/+9
[ Upstream commit 44e6ca8faeeed12206f3e7189c5ac618b810bb9c ] Since the input string passed via the command line appends a newline char, it needs to be removed before comparison with the event_list. Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/4978c18e33313b48fa2ae7f3aa6dbcfce40877e4.1751380187.git.shravankr@nvidia.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01regmap: fix potential memory leak of regmap_busAbdun Nihaal1-0/+2
[ Upstream commit c871c199accb39d0f4cb941ad0dccabfc21e9214 ] When __regmap_init() is called from __regmap_init_i2c() and __regmap_init_spi() (and their devm versions), the bus argument obtained from regmap_get_i2c_bus() and regmap_get_spi_bus(), may be allocated using kmemdup() to support quirks. In those cases, the bus->free_on_exit field is set to true. However, inside __regmap_init(), buf is not freed on any error path. This could lead to a memory leak of regmap_bus when __regmap_init() fails. Fix that by freeing bus on error path when free_on_exit is set. Fixes: ea030ca68819 ("regmap-i2c: Set regmap max raw r/w from quirks") Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com> Link: https://patch.msgid.link/20250626172823.18725-1-abdun.nihaal@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01iio: adc: ad7949: use spi_is_bpw_supported()David Lechner1-4/+3
[ Upstream commit 7b86482632788acd48d7b9ee1867f5ad3a32ccbb ] Use spi_is_bpw_supported() instead of directly accessing spi->controller ->bits_per_word_mask. bits_per_word_mask may be 0, which implies that 8-bits-per-word is supported. spi_is_bpw_supported() takes this into account while spi_ctrl_mask == SPI_BPW_MASK(8) does not. Fixes: 0b2a740b424e ("iio: adc: ad7949: enable use with non 14/16-bit controllers") Closes: https://lore.kernel.org/linux-spi/c8b8a963-6cef-4c9b-bfef-dab2b7bd0b0f@sirena.org.uk/ Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Link: https://patch.msgid.link/20250611-iio-adc-ad7949-use-spi_is_bpw_supported-v1-1-c4e15bfd326e@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 nodeXilin Wu1-0/+1
[ Upstream commit 886a94f008dd1a1702ee66dd035c266f70fd9e90 ] This allows adding interconnect paths for PCIe 1 in device tree later. Fixes: 46bdcac533cc ("interconnect: qcom: Add SC7280 interconnect provider driver") Signed-off-by: Xilin Wu <sophon@radxa.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250613-sc7280-icc-pcie1-fix-v1-1-0b09813e3b09@radxa.com Signed-off-by: Georgi Djakov <djakov@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01RDMA/core: Rate limit GID cache warning messagesMaor Gottlieb1-2/+2
[ Upstream commit 333e4d79316c9ed5877d7aac8b8ed22efc74e96d ] The GID cache warning messages can flood the kernel log when there are multiple failed attempts to add GIDs. This can happen when creating many virtual interfaces without having enough space for their GIDs in the GID table. Change pr_warn to pr_warn_ratelimited to prevent log flooding while still maintaining visibility of the issue. Link: https://patch.msgid.link/r/fd45ed4a1078e743f498b234c3ae816610ba1b18.1750062357.git.leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CARahul Chandra1-0/+9
[ Upstream commit 7dc6b2d3b5503bcafebbeaf9818112bf367107b4 ] Add a DMI quirk entry for the ASUS Zenbook Duo UX8406CA 2025 model to use the existing zenbook duo keyboard quirk. Signed-off-by: Rahul Chandra <rahul@chandra.net> Link: https://lore.kernel.org/r/20250624073301.602070-1-rahul@chandra.net Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01regulator: core: fix NULL dereference on unbind due to stale coupling dataAlessandro Carminati1-0/+1
[ Upstream commit ca46946a482238b0cdea459fb82fc837fb36260e ] Failing to reset coupling_desc.n_coupled after freeing coupled_rdevs can lead to NULL pointer dereference when regulators are accessed post-unbind. This can happen during runtime PM or other regulator operations that rely on coupling metadata. For example, on ridesx4, unbinding the 'reg-dummy' platform device triggers a panic in regulator_lock_recursive() due to stale coupling state. Ensure n_coupled is set to 0 to prevent access to invalid pointers. Signed-off-by: Alessandro Carminati <acarmina@redhat.com> Link: https://patch.msgid.link/20250626083809.314842-1-acarmina@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01virtio_ring: Fix error reporting in virtqueue_resizeLaurent Vivier1-2/+6
[ Upstream commit 45ebc7e6c125ce93d2ddf82cd5bea20121bb0258 ] The virtqueue_resize() function was not correctly propagating error codes from its internal resize helper functions, specifically virtqueue_resize_packet() and virtqueue_resize_split(). If these helpers returned an error, but the subsequent call to virtqueue_enable_after_reset() succeeded, the original error from the resize operation would be masked. Consequently, virtqueue_resize() could incorrectly report success to its caller despite an underlying resize failure. This change restores the original code behavior: if (vdev->config->enable_vq_after_reset(_vq)) return -EBUSY; return err; Fix: commit ad48d53b5b3f ("virtio_ring: separate the logic of reset/enable from virtqueue_resize") Cc: xuanzhuo@linux.alibaba.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250521092236.661410-2-lvivier@redhat.com Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01virtio_net: Enforce minimum TX ring size for reliabilityLaurent Vivier1-0/+6
[ Upstream commit 24b2f5df86aaebbe7bac40304eaf5a146c02367c ] The `tx_may_stop()` logic stops TX queues if free descriptors (`sq->vq->num_free`) fall below the threshold of (`MAX_SKB_FRAGS` + 2). If the total ring size (`ring_num`) is not strictly greater than this value, queues can become persistently stopped or stop after minimal use, severely degrading performance. A single sk_buff transmission typically requires descriptors for: - The virtio_net_hdr (1 descriptor) - The sk_buff's linear data (head) (1 descriptor) - Paged fragments (up to MAX_SKB_FRAGS descriptors) This patch enforces that the TX ring size ('ring_num') must be strictly greater than (MAX_SKB_FRAGS + 2). This ensures that the ring is always large enough to hold at least one maximally-fragmented packet plus at least one additional slot. Reported-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250521092236.661410-4-lvivier@redhat.com Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01Input: gpio-keys - fix a sleep while atomic with PREEMPT_RTFabrice Gasnier1-2/+2
commit f4a8f561d08e39f7833d4a278ebfb12a41eef15f upstream. When enabling PREEMPT_RT, the gpio_keys_irq_timer() callback runs in hard irq context, but the input_event() takes a spin_lock, which isn't allowed there as it is converted to a rt_spin_lock(). [ 4054.289999] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 4054.290028] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0 ... [ 4054.290195] __might_resched+0x13c/0x1f4 [ 4054.290209] rt_spin_lock+0x54/0x11c [ 4054.290219] input_event+0x48/0x80 [ 4054.290230] gpio_keys_irq_timer+0x4c/0x78 [ 4054.290243] __hrtimer_run_queues+0x1a4/0x438 [ 4054.290257] hrtimer_interrupt+0xe4/0x240 [ 4054.290269] arch_timer_handler_phys+0x2c/0x44 [ 4054.290283] handle_percpu_devid_irq+0x8c/0x14c [ 4054.290297] handle_irq_desc+0x40/0x58 [ 4054.290307] generic_handle_domain_irq+0x1c/0x28 [ 4054.290316] gic_handle_irq+0x44/0xcc Considering the gpio_keys_irq_isr() can run in any context, e.g. it can be threaded, it seems there's no point in requesting the timer isr to run in hard irq context. Relax the hrtimer not to use the hard context. Fixes: 019002f20cb5 ("Input: gpio-keys - use hrtimer for release timer") Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com> Link: https://lore.kernel.org/r/20250528-gpio_keys_preempt_rt-v2-1-3fc55a9c3619@foss.st.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> [ adjusted context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24iommu/vt-d: Fix misplaced domain_attached assignmentBan ZuoXiang1-3/+3
Commit fb5873b779dd ("iommu/vt-d: Restore context entry setup order for aliased devices") was incorrectly backported: the domain_attached assignment was mistakenly placed in device_set_dirty_tracking() instead of original identity_domain_attach_dev(). Fix this by moving the assignment to the correct function as in the original commit. Fixes: fb5873b779dd ("iommu/vt-d: Restore context entry setup order for aliased devices") Closes: https://lore.kernel.org/linux-iommu/721D44AF820A4FEB+722679cb-2226-4287-8835-9251ad69a1ac@bbaa.fun/ Cc: stable@vger.kernel.org Reported-by: Ban ZuoXiang <bbaa@bbaa.fun> Signed-off-by: Ban ZuoXiang <bbaa@bbaa.fun> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24drm/xe: Move page fault init after topology initMatthew Brost1-3/+3
commit 3155ac89251dcb5e35a3ec2f60a74a6ed22c56fd upstream. We need the topology to determine GT page fault queue size, move page fault init after topology init. Cc: stable@vger.kernel.org Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com (cherry picked from commit beb72acb5b38dbe670d8eb752d1ad7a32f9c4119) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24drm/xe/mocs: Initialize MOCS index earlyBalasubramani Vivekanandan1-2/+2
commit 2a58b21adee3df10ca6f4491af965c4890d2d8e3 upstream. MOCS uc_index is used even before it is initialized in the following callstack guc_prepare_xfer() __xe_guc_upload() xe_guc_min_load_for_hwconfig() xe_uc_init_hwconfig() xe_gt_init_hwconfig() Do MOCS index initialization earlier in the device probe. Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com> Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 241cc827c0987d7173714fc5a95a7c8fc9bf15c0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Stable-dep-of: 3155ac89251d ("drm/xe: Move page fault init after topology init") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24i2c: omap: fix deprecated of_property_read_bool() useJohan Hovold1-1/+1
commit e66b0a8f048bc8590eb1047480f946898a3f80c9 upstream. Using of_property_read_bool() for non-boolean properties is deprecated and results in a warning during runtime since commit c141ecc3cecd ("of: Warn when of_property_read_bool() is used on non-boolean properties"). Fixes: b6ef830c60b6 ("i2c: omap: Add support for setting mux") Cc: Jayesh Choudhary <j-choudhary@ti.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com> Link: https://lore.kernel.org/r/20250415075230.16235-1-johan+linaro@kernel.org Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe()Christophe JAILLET1-1/+4
commit a9503a2ecd95e23d7243bcde7138192de8c1c281 upstream. omap_i2c_init() can fail. Handle this error in omap_i2c_probe(). Fixes: 010d442c4a29 ("i2c: New bus driver for TI OMAP boards") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: <stable@vger.kernel.org> # v2.6.19+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/565311abf9bafd7291ca82bcecb48c1fac1e727b.1751701715.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24i2c: omap: Fix an error handling path in omap_i2c_probe()Christophe JAILLET1-2/+5
commit 666c23af755dccca8c25b5d5200ca28153c69a05 upstream. If an error occurs after calling mux_state_select(), mux_state_deselect() should be called as already done in the remove function. Fixes: b6ef830c60b6 ("i2c: omap: Add support for setting mux") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/998542981b6d2435c057dd8b9fe71743927babab.1749913149.git.christophe.jaillet@wanadoo.fr Stable-dep-of: a9503a2ecd95 ("i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24i2c: omap: Add support for setting muxJayesh Choudhary2-0/+23
commit b6ef830c60b6f4adfb72d0780b4363df3a1feb9c upstream. Some SoCs require muxes in the routing for SDA and SCL lines. Therefore, add support for setting the mux by reading the mux-states property from the dt-node. Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com> Link: https://lore.kernel.org/r/20250318103622.29979-3-j-choudhary@ti.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Stable-dep-of: a9503a2ecd95 ("i2c: omap: Handle omap_i2c_init() errors in omap_i2c_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24net: libwx: fix multicast packets received countJiawen Wu1-0/+2
commit 2b30a3d1ec2538a1fd363fde746b9fe1d38abc77 upstream. Multicast good packets received by PF rings that pass ethternet MAC address filtering are counted for rtnl_link_stats64.multicast. The counter is not cleared on read. Fix the duplicate counting on updating statistics. Fixes: 46b92e10d631 ("net: libwx: support hardware statistics") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/DA229A4F58B70E51+20250714015656.91772-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24usb: dwc3: qcom: Don't leave BCR assertedKrishna Kurapati1-6/+2
commit ef8abc0ba49ce717e6bc4124e88e59982671f3b5 upstream. Leaving the USB BCR asserted prevents the associated GDSC to turn on. This blocks any subsequent attempts of probing the device, e.g. after a probe deferral, with the following showing in the log: [ 1.332226] usb30_prim_gdsc status stuck at 'off' Leave the BCR deasserted when exiting the driver to avoid this issue. Cc: stable <stable@kernel.org> Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250709132900.3408752-1-krishna.kurapati@oss.qualcomm.com [ adapted to individual clock management instead of bulk clock operations ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24usb: hub: Don't try to recover devices lost during warm reset.Mathias Nyman1-2/+6
commit 2521106fc732b0b75fd3555c689b1ed1d29d273c upstream. Hub driver warm-resets ports in SS.Inactive or Compliance mode to recover a possible connected device. The port reset code correctly detects if a connection is lost during reset, but hub driver port_event() fails to take this into account in some cases. port_event() ends up using stale values and assumes there is a connected device, and will try all means to recover it, including power-cycling the port. Details: This case was triggered when xHC host was suspended with DbC (Debug Capability) enabled and connected. DbC turns one xHC port into a simple usb debug device, allowing debugging a system with an A-to-A USB debug cable. xhci DbC code disables DbC when xHC is system suspended to D3, and enables it back during resume. We essentially end up with two hosts connected to each other during suspend, and, for a short while during resume, until DbC is enabled back. The suspended xHC host notices some activity on the roothub port, but can't train the link due to being suspended, so xHC hardware sets a CAS (Cold Attach Status) flag for this port to inform xhci host driver that the port needs to be warm reset once xHC resumes. CAS is xHCI specific, and not part of USB specification, so xhci driver tells usb core that the port has a connection and link is in compliance mode. Recovery from complinace mode is similar to CAS recovery. xhci CAS driver support that fakes a compliance mode connection was added in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS") Once xHCI resumes and DbC is enabled back, all activity on the xHC roothub host side port disappears. The hub driver will anyway think port has a connection and link is in compliance mode, and hub driver will try to recover it. The port power-cycle during recovery seems to cause issues to the active DbC connection. Fix this by clearing connect_change flag if hub_port_reset() returns -ENOTCONN, thus avoiding the whole unnecessary port recovery and initialization attempt. Cc: stable@vger.kernel.org Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS") Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250623133947.3144608-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24usb: hub: Fix flushing of delayed work used for post resume purposesMathias Nyman2-13/+9
commit 9bd9c8026341f75f25c53104eb7e656e357ca1a2 upstream. Delayed work that prevents USB3 hubs from runtime-suspending too early needed to be flushed in hub_quiesce() to resolve issues detected on QC SC8280XP CRD board during suspend resume testing. This flushing did however trigger new issues on Raspberry Pi 3B+, which doesn't have USB3 ports, and doesn't queue any post resume delayed work. The flushed 'hub->init_work' item is used for several purposes, and is originally initialized with a 'NULL' work function. The work function is also changed on the fly, which may contribute to the issue. Solve this by creating a dedicated delayed work item for post resume work, and flush that delayed work in hub_quiesce() Cc: stable <stable@kernel.org> Fixes: a49e1e2e785f ("usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/linux-usb/aF5rNp1l0LWITnEB@finisterre.sirena.org.uk Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # SC8280XP CRD Tested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250627164348.3982628-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pmMathias Nyman1-2/+4
commit a49e1e2e785fb3621f2d748581881b23a364998a upstream. Delayed work to prevent USB3 hubs from runtime-suspending immediately after resume was added in commit 8f5b7e2bec1c ("usb: hub: fix detection of high tier USB3 devices behind suspended hubs"). This delayed work needs be flushed if system suspends, or hub needs to be quiesced for other reasons right after resume. Not flushing it triggered issues on QC SC8280XP CRD board during suspend/resume testing. Fix it by flushing the delayed resume work in hub_quiesce() The delayed work item that allow hub runtime suspend is also scheduled just before calling autopm get. Alan pointed out there is a small risk that work is run before autopm get, which would call autopm put before get, and mess up the runtime pm usage order. Swap the order of work sheduling and calling autopm get to solve this. Cc: stable <stable@kernel.org> Fixes: 8f5b7e2bec1c ("usb: hub: fix detection of high tier USB3 devices behind suspended hubs") Reported-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Closes: https://lore.kernel.org/linux-usb/acaaa928-832c-48ca-b0ea-d202d5cd3d6c@oss.qualcomm.com Reported-by: Alan Stern <stern@rowland.harvard.edu> Closes: https://lore.kernel.org/linux-usb/c73fbead-66d7-497a-8fa1-75ea4761090a@rowland.harvard.edu Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250626130102.3639861-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24usb: hub: fix detection of high tier USB3 devices behind suspended hubsMathias Nyman1-1/+32
commit 8f5b7e2bec1c36578fdaa74a6951833541103e27 upstream. USB3 devices connected behind several external suspended hubs may not be detected when plugged in due to aggressive hub runtime pm suspend. The hub driver immediately runtime-suspends hubs if there are no active children or port activity. There is a delay between the wake signal causing hub resume, and driver visible port activity on the hub downstream facing ports. Most of the LFPS handshake, resume signaling and link training done on the downstream ports is not visible to the hub driver until completed, when device then will appear fully enabled and running on the port. This delay between wake signal and detectable port change is even more significant with chained suspended hubs where the wake signal will propagate upstream first. Suspended hubs will only start resuming downstream ports after upstream facing port resumes. The hub driver may resume a USB3 hub, read status of all ports, not yet see any activity, and runtime suspend back the hub before any port activity is visible. This exact case was seen when conncting USB3 devices to a suspended Thunderbolt dock. USB3 specification defines a 100ms tU3WakeupRetryDelay, indicating USB3 devices expect to be resumed within 100ms after signaling wake. if not then device will resend the wake signal. Give the USB3 hubs twice this time (200ms) to detect any port changes after resume, before allowing hub to runtime suspend again. Cc: stable <stable@kernel.org> Fixes: 2839f5bcfcfc ("USB: Turn on auto-suspend for USB 3.0 hubs.") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250611112441.2267883-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24drm/mediatek: only announce AFBC if really supportedIcenowy Zheng7-4/+27
[ Upstream commit 8d121a82fa564e0c8bd86ce4ec56b2a43b9b016e ] Currently even the SoC's OVL does not declare the support of AFBC, AFBC is still announced to the userspace within the IN_FORMATS blob, which breaks modern Wayland compositors like KWin Wayland and others. Gate passing modifiers to drm_universal_plane_init() behind querying the driver of the hardware block for AFBC support. Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: CK Hu <ck.hu@medaitek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250531121140.387661-1-uwu@icenowy.me/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24drm/mediatek: Add wait_event_timeout when disabling planeJason-JH Lin3-0/+39
[ Upstream commit d208261e9f7c66960587b10473081dc1cecbe50b ] Our hardware registers are set through GCE, not by the CPU. DRM might assume the hardware is disabled immediately after calling atomic_disable() of drm_plane, but it is only truly disabled after the GCE IRQ is triggered. Additionally, the cursor plane in DRM uses async_commit, so DRM will not wait for vblank and will free the buffer immediately after calling atomic_disable(). To prevent the framebuffer from being freed before the layer disable settings are configured into the hardware, which can cause an IOMMU fault error, a wait_event_timeout has been added to wait for the ddp_cmdq_cb() callback,indicating that the GCE IRQ has been triggered. Fixes: 2f965be7f900 ("drm/mediatek: apply CMDQ control flow") Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250624113223.443274-1-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24virtio-net: fix recursived rtnl_lock() during probe()Zigit Zo1-1/+1
[ Upstream commit be5dcaed694e4255dc02dd0acfe036708c535def ] The deadlock appears in a stack trace like: virtnet_probe() rtnl_lock() virtio_config_changed_work() netdev_notify_peers() rtnl_lock() It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing. The config_work in probe() will get scheduled until virtnet_open() enables the config change notification via virtio_config_driver_enable(). Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 ↵Li Tian1-1/+4
addrconf [ Upstream commit d7501e076d859d2f381d57bd984ff6db13172727 ] Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf. Commit under Fixes added a new flag change that was not made to hv_netvsc resulting in the VF being assinged an IPv6. Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf") Suggested-by: Cathy Avery <cavery@redhat.com> Signed-off-by: Li Tian <litian@redhat.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24drm/xe/pf: Prepare to stop SR-IOV support prior GT resetMichal Wajdeczko3-0/+27
[ Upstream commit 81dccec448d204e448ae83e1fe60e8aaeaadadb8 ] As part of the resume or GT reset, the PF driver schedules work which is then used to complete restarting of the SR-IOV support, including resending to the GuC configurations of provisioned VFs. However, in case of short delay between those two actions, which could be seen by triggering a GT reset on the suspened device: $ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset this PF worker might be still busy, which lead to errors due to just stopped or disabled GuC CTB communication: [ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed [ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe] [ ] xe 0000:00:02.0: [drm] GT0: reset queued [ ] xe 0000:00:02.0: [drm] GT0: reset started [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped [ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled! [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED) [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations [ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed While this VFs reprovisioning will be successful during next spin of the worker, to avoid those errors, make sure to cancel restart worker if we are about to trigger next reset. Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com (cherry picked from commit 9f50b729dd61dfb9f4d7c66900d22a7c7353a8c0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24drm/xe/pf: Move VFs reprovisioning to workerMichal Wajdeczko2-2/+51
[ Upstream commit a4d1c5d0b99b75263a5626d2e52d569db3844b33 ] Since the GuC is reset during GT reset, we need to re-send the entire SR-IOV provisioning configuration to the GuC. But since this whole configuration is protected by the PF master mutex and we can't avoid making allocations under this mutex (like during LMEM provisioning), we can't do this reprovisioning from gt-reset path if we want to be reclaim-safe. Move VFs reprovisioning to a async worker that we will start from the gt-reset path. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250125215505.720-1-michal.wajdeczko@intel.com Stable-dep-of: 81dccec448d2 ("drm/xe/pf: Prepare to stop SR-IOV support prior GT reset") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24drm/xe/pf: Sanitize VF scratch registers on FLRMichal Wajdeczko3-1/+55
[ Upstream commit 13a48a0fa52352f9fe58e2e1927670dcfea64c3a ] Some VF accessible registers (like GuC scratch registers) must be explicitly reset during the FLR. While this is today done by the GuC firmware, according to the design, this should be responsibility of the PF driver, as future platforms may require more registers to be reset. Likewise GuC, the PF can access VFs registers by adding some platform specific offset to the original register address. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240902192953.1792-1-michal.wajdeczko@intel.com Stable-dep-of: 81dccec448d2 ("drm/xe/pf: Prepare to stop SR-IOV support prior GT reset") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24net/mlx5: Correctly set gso_size when LRO is usedChristoph Paasch1-4/+8
[ Upstream commit 531d0d32de3e1b6b77a87bd37de0c2c6e17b496a ] gso_size is expected by the networking stack to be the size of the payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt is the full sized frame (including the headers). Dividing cqe_bcnt by lro_num_seg will then give incorrect results. For example, running a bpftrace higher up in the TCP-stack (tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even though in reality the payload was only 1448 bytes. This can have unintended consequences: - In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss will be 1448 (because tp->advmss is 1448). Thus, we will always recompute scaling_ratio each time an LRO-packet is received. - In tcp_gro_receive(), it will interfere with the decision whether or not to flush and thus potentially result in less gro'ed packets. So, we need to discount the protocol headers from cqe_bcnt so we can actually divide the payload by lro_num_seg to get the real gso_size. v2: - Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len (Tariq Toukan <tariqt@nvidia.com>) - Improve commit-message (Gal Pressman <gal@nvidia.com>) Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Signed-off-by: Christoph Paasch <cpaasch@openai.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant ↵Zijun Hu1-34/+44
without board ID [ Upstream commit 43015955795a619f7ca4ae69b9c0ffc994c82818 ] For GF variant of WCN6855 without board ID programmed btusb_generate_qca_nvm_name() will chose wrong NVM 'qca/nvm_usb_00130201.bin' to download. Fix by choosing right NVM 'qca/nvm_usb_00130201_gf.bin'. Also simplify NVM choice logic of btusb_generate_qca_nvm_name(). Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller") Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24loop: use kiocb helpers to fix lockdep warningMing Lei1-3/+2
[ Upstream commit c4706c5058a7bd7d7c20f3b24a8f523ecad44e83 ] The lockdep tool can report a circular lock dependency warning in the loop driver's AIO read/write path: ``` [ 6540.587728] kworker/u96:5/72779 is trying to acquire lock: [ 6540.593856] ff110001b5968440 (sb_writers#9){.+.+}-{0:0}, at: loop_process_work+0x11a/0xf70 [loop] [ 6540.603786] [ 6540.603786] but task is already holding lock: [ 6540.610291] ff110001b5968440 (sb_writers#9){.+.+}-{0:0}, at: loop_process_work+0x11a/0xf70 [loop] [ 6540.620210] [ 6540.620210] other info that might help us debug this: [ 6540.627499] Possible unsafe locking scenario: [ 6540.627499] [ 6540.634110] CPU0 [ 6540.636841] ---- [ 6540.639574] lock(sb_writers#9); [ 6540.643281] lock(sb_writers#9); [ 6540.646988] [ 6540.646988] *** DEADLOCK *** ``` This patch fixes the issue by using the AIO-specific helpers `kiocb_start_write()` and `kiocb_end_write()`. These functions are designed to be used with a `kiocb` and manage write sequencing correctly for asynchronous I/O without introducing the problematic lock dependency. The `kiocb` is already part of the `loop_cmd` struct, so this change also simplifies the completion function `lo_rw_aio_do_completion()` by using the `iocb` from the `cmd` struct directly, instead of retrieving the loop device from the request queue. Fixes: 39d86db34e41 ("loop: add file_start_write() and file_end_write()") Cc: Changhui Zhong <czhong@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250716114808.3159657-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24usb: net: sierra: check for no status endpointOliver Neukum1-0/+4
[ Upstream commit 4c4ca3c46167518f8534ed70f6e3b4bf86c4d158 ] The driver checks for having three endpoints and having bulk in and out endpoints, but not that the third endpoint is interrupt input. Rectify the omission. Reported-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/686d5a9f.050a0220.1ffab7.0017.GAE@google.com/ Tested-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com Fixes: eb4fd8cd355c8 ("net/usb: add sierra_net.c driver") Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://patch.msgid.link/20250714111326.258378-1-oneukum@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>