summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-05-14net: enetc: fix implicit declaration of function FIELD_PREPWei Fang1-0/+1
The kernel test robot reported the following error: drivers/net/ethernet/freescale/enetc/ntmp.c: In function 'ntmp_fill_request_hdr': drivers/net/ethernet/freescale/enetc/ntmp.c:203:38: error: implicit declaration of function 'FIELD_PREP' [-Wimplicit-function-declaration] 203 | cbd->req_hdr.access_method = FIELD_PREP(NTMP_ACCESS_METHOD, | ^~~~~~~~~~ Therefore, add "bitfield.h" to ntmp_private.h to fix this issue. Fixes: 4701073c3deb ("net: enetc: add initial netc-lib driver to support NTMP") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505101047.NTMcerZE-lkp@intel.com/ Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-14net: wangxun: Correct clerical errors in commentsJiawen Wu2-2/+2
There are wrong "#endif" comments in .h files need to be corrected. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-14phy: starfive: jh7110-usb: Fix USB 2.0 host occasional detection failureHal Feng1-0/+7
JH7110 USB 2.0 host fails to detect USB 2.0 devices occasionally. With a long time of debugging and testing, we found that setting Rx clock gating control signal to normal power consumption mode can solve this problem. Signed-off-by: Hal Feng <hal.feng@starfivetech.com> Link: https://lore.kernel.org/r/20250422101244.51686-1-hal.feng@starfivetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14nvme: multipath: enable BLK_FEAT_ATOMIC_WRITES for multipathingAlan Adamson1-1/+2
A change to QEMU resulted in all nvme controllers (single and multi-controller subsystems) to have its CMIC.MCTRS bit set which indicates the subsystem supports multiple controllers and it is possible a namespace can be shared between those multiple controllers in a multipath configuration. When a namespace of a CMIC.MCTRS enabled subsystem is allocated, a multipath node is created. The queue limits for this node are inherited from the namespace being allocated. When inheriting queue limits, the features being inherited need to be specified. The atomic write feature (BLK_FEAT_ATOMIC_WRITES) was not specified so the atomic queue limits were not inherited by the multipath disk node which resulted in the sysfs atomic write attributes being zeroed. The fix is to include BLK_FEAT_ATOMIC_WRITES in the list of features to be inherited. Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-14xfrm: Sanitize marks before insertPaul Chaignon2-0/+6
Prior to this patch, the mark is sanitized (applying the state's mask to the state's value) only on inserts when checking if a conflicting XFRM state or policy exists. We discovered in Cilium that this same sanitization does not occur in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark is simply compared to the state's value: if ((mark & x->mark.m) != x->mark.v) continue; Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will never match any packet. This commit updates __xfrm_state_insert and xfrm_policy_insert to store the sanitized marks, thus removing this footgun. This has the side effect of changing the ip output, as the returned mark will have the mask applied to it when printed. Fixes: 3d6acfa7641f ("xfrm: SA lookups with mark") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Co-developed-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-05-14qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()Abdun Nihaal1-2/+5
In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is not freed. Fix that by jumping to the error path that frees them, by calling qlcnic_free_mbx_args(). This was found using static analysis. Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation") Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net: phy: remove stub for mdiobus_register_board_infoHeiner Kallweit1-9/+0
The functionality of mdiobus_register_board_info() typically isn't optional for the caller. Therefore remove the stub. Note: Currently we have only one caller of mdiobus_register_board_info(), in a DSA/PHYLINK context. Therefore CONFIG_MDIO_DEVICE is selected anyway. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/410a2222-c4e8-45b0-9091-d49674caeb00@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net: mlxsw: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean4-72/+48
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the mlxsw driver to the new API, so that the ndo_eth_ioctl() path can be removed completely. The UAPI is still ioctl-only, but it's best to remove the "ioctl" mentions from the driver in case a netlink variant appears. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250512154411.848614-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net: ipa: Make the SMEM item ID constantKonrad Dybcio11-21/+11
It can't vary, stop storing the same magic number everywhere. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Alex Elder <elder@kernel.org> Link: https://patch.msgid.link/20250512-topic-ipa_smem-v1-1-302679514a0d@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14docs: networking: timestamping: improve stacked PHC sentenceVladimir Oltean1-5/+3
The first paragraph makes no grammatical sense. I suppose a portion of the intended sentece is missing: "[The challenge with ] stacked PHCs (...) is that they uncover bugs". Rephrase, and at the same time simplify the structure of the sentence a little bit, it is not easy to follow. Fixes: 94d9f78f4d64 ("docs: networking: timestamping: add section for stacked PHC devices") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/20250512131751.320283-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net: enetc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean4-26/+31
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the ENETC driver to the new API, so that the ndo_eth_ioctl() path can be removed completely. Move the enetc_hwtstamp_get() and enetc_hwtstamp_set() calls away from enetc_ioctl() to dedicated net_device_ops for the LS1028A PF and VF (NETC v4 does not yet implement enetc_ioctl()), adapt the prototypes and export these symbols (enetc_ioctl() is also exported). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250512112402.4100618-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net: txgbe: Fix pending interruptJiawen Wu1-6/+1
For unknown reasons, sometimes the value of MISC interrupt is 0 in the IRQ handle function. In this case, wx_intr_enable() is also should be invoked to clear the interrupt. Otherwise, the next interrupt would never be reported. Fixes: a9843689e2de ("net: txgbe: add sriov function support") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/F4F708403CE7090B+20250512100652.139510-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5e: Disable MACsec offload for uplink representor profileCarolina Jubran1-0/+4
MACsec offload is not supported in switchdev mode for uplink representors. When switching to the uplink representor profile, the MACsec offload feature must be cleared from the netdevice's features. If left enabled, attempts to add offloads result in a null pointer dereference, as the uplink representor does not support MACsec offload even though the feature bit remains set. Clear NETIF_F_HW_MACSEC in mlx5e_fix_uplink_rep_features(). Kernel log: Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f] CPU: 29 UID: 0 PID: 4714 Comm: ip Not tainted 6.14.0-rc4_for_upstream_debug_2025_03_02_17_35 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x128/0x1dd0 Code: d0 7c 08 84 d2 0f 85 ad 15 00 00 8b 35 91 5c fe 03 85 f6 75 29 49 8d 7e 60 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a6 15 00 00 4d 3b 76 60 0f 85 fd 0b 00 00 65 ff RSP: 0018:ffff888147a4f160 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000078 RBP: ffff888147a4f2e0 R08: ffffffffa05d2c19 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: dffffc0000000000 R14: 0000000000000018 R15: ffff888152de0000 FS: 00007f855e27d800(0000) GS:ffff88881ee80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004e5768 CR3: 000000013ae7c005 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 Call Trace: <TASK> ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? __mutex_lock+0x128/0x1dd0 ? lockdep_set_lock_cmp_fn+0x190/0x190 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? mutex_lock_io_nested+0x1ae0/0x1ae0 ? lock_acquire+0x1c2/0x530 ? macsec_upd_offload+0x145/0x380 ? lockdep_hardirqs_on_prepare+0x400/0x400 ? kasan_save_stack+0x30/0x40 ? kasan_save_stack+0x20/0x40 ? kasan_save_track+0x10/0x30 ? __kasan_kmalloc+0x77/0x90 ? __kmalloc_noprof+0x249/0x6b0 ? genl_family_rcv_msg_attrs_parse.constprop.0+0xb5/0x240 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? mlx5e_macsec_add_rxsa+0x11a0/0x11a0 [mlx5_core] macsec_update_offload+0x26c/0x820 ? macsec_set_mac_address+0x4b0/0x4b0 ? lockdep_hardirqs_on_prepare+0x284/0x400 ? _raw_spin_unlock_irqrestore+0x47/0x50 macsec_upd_offload+0x2c8/0x380 ? macsec_update_offload+0x820/0x820 ? __nla_parse+0x22/0x30 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x15e/0x240 genl_family_rcv_msg_doit+0x1cc/0x2a0 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240 ? cap_capable+0xd4/0x330 genl_rcv_msg+0x3ea/0x670 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0 ? lockdep_set_lock_cmp_fn+0x190/0x190 ? macsec_update_offload+0x820/0x820 netlink_rcv_skb+0x12b/0x390 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0 ? netlink_ack+0xd80/0xd80 ? rwsem_down_read_slowpath+0xf90/0xf90 ? netlink_deliver_tap+0xcd/0xac0 ? netlink_deliver_tap+0x155/0xac0 ? _copy_from_iter+0x1bb/0x12c0 genl_rcv+0x24/0x40 netlink_unicast+0x440/0x700 ? netlink_attachskb+0x760/0x760 ? lock_acquire+0x1c2/0x530 ? __might_fault+0xbb/0x170 netlink_sendmsg+0x749/0xc10 ? netlink_unicast+0x700/0x700 ? __might_fault+0xbb/0x170 ? netlink_unicast+0x700/0x700 __sock_sendmsg+0xc5/0x190 ____sys_sendmsg+0x53f/0x760 ? import_iovec+0x7/0x10 ? kernel_sendmsg+0x30/0x30 ? __copy_msghdr+0x3c0/0x3c0 ? filter_irq_stacks+0x90/0x90 ? stack_depot_save_flags+0x28/0xa30 ___sys_sendmsg+0xeb/0x170 ? kasan_save_stack+0x30/0x40 ? copy_msghdr_from_user+0x110/0x110 ? do_syscall_64+0x6d/0x140 ? lock_acquire+0x1c2/0x530 ? __virt_addr_valid+0x116/0x3b0 ? __virt_addr_valid+0x1da/0x3b0 ? lock_downgrade+0x680/0x680 ? __delete_object+0x21/0x50 __sys_sendmsg+0xf7/0x180 ? __sys_sendmsg_sock+0x20/0x20 ? kmem_cache_free+0x14c/0x4e0 ? __x64_sys_close+0x78/0xd0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f855e113367 Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffd15e90c88 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f855e113367 RDX: 0000000000000000 RSI: 00007ffd15e90cf0 RDI: 0000000000000004 RBP: 00007ffd15e90dbc R08: 0000000000000028 R09: 000000000045d100 R10: 00007f855e011dd8 R11: 0000000000000246 R12: 0000000000000019 R13: 0000000067c6b785 R14: 00000000004a1e80 R15: 0000000000000000 </TASK> Modules linked in: 8021q garp mrp sch_ingress openvswitch nsh mlx5_ib mlx5_fwctl mlx5_dpll mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core] ---[ end trace 0000000000000000 ]--- Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1746958552-561295-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14Merge branch 'net-mlx5-hws-complex-matchers-and-rehash-mechanism-fixes'Jakub Kicinski13-179/+2178
Tariq Toukan says: ==================== net/mlx5: HWS, Complex Matchers and rehash mechanism fixes Motivation: ---------- A matcher can match a certain set of match parameters. However, the number and size of match params for a single matcher are limited — all the parameters must fit within a single definer. A common example of this limitation is IPv6 address matching, where matching both source and destination IPs requires more bits than a single definer can support. SW Steering addresses this limitation by chaining multiple Steering Table Entries (STEs) within the same matcher, where each STE matches on a subset of the parameters. In HW Steering, such chaining is not possible — the matcher's STEs are managed in a hash table, and a single definer is used to calculate the hash index for STEs. Overview: -------- To address this limitation in HW Steering, we introduce *Complex Matchers*, which consist of two chained matchers. This allows matching on twice as many parameters. Complex Matchers are filled with *Complex Rules* — rules that are split into two parts and inserted into their respective matchers. The first half of the Complex Matcher is a regular matcher and points to the second half, which is an *Isolated Matcher*. An Isolated Matcher has its own isolated table and is accessible only by traffic coming from the first half of the Complex Matcher. This splitting of matchers/rules into multiple parts is transparent to users. It is hidden behind the BWC HWS API. It becomes visible only when dumping steering debug information, where the Complex Matcher appears as two separate matchers: one in the user-created table and another in its isolated table. Implementation Details: ---------------------- All user actions are performed on the second part of the rules only. The first part handles matching and applies two actions: modify header (set metadata, see details below) and go-to-table (directing traffic to the isolated table containing the isolated matcher). Rule updates (updating rule actions) are applied to the second part of the rule since user-provided actions are not executed in the first matcher. We use REG_C_6 metadata register to set and match on unique per-rule tag (see details below). Splitting rules into two parts introduces new challenges: 1. Invalid Combinations Consider two rules with different matching values: - Rule 1: A+B - Rule 2: C+D Let's split the rules into two parts as follows: |-----Complex Matcher-------| | | | 1st matcher 2nd matcher | | |---| |---| | | | A | | B | | | |---| -----> |---| | | | C | | D | | | |---| |---| | | | |---------------------------| Splitting these rules results in invalid combinations: A+D and C+B: any packet that matched on A will be forwarded to the 2nd matcher, where it will try to match on B (which is legal, and it is what the user asked for), but it will also try to match on D (which is not what the user asked for). To resolve this, we assign unique tags to each rule on the first matcher and match on these tags on the second matcher: |----------| |---------| | A | | B, TagA | | action: | | | | set TagA | | | |----------| --> |---------| | C | | D, TagB | | action: | | | | set TagB | | | |----------| |---------| 2. Duplicated Entries: Consider two rules with overlapping values: - Rule 1: A+B - Rule 2: A+D Let's split the rules into two parts as follows: |---| |---| | A | | B | |---| --> |---| | | | D | |---| |---| This leads to the duplicated entries on the first matcher, which HWS doesn't allow: subsequent delete of either of the rules will delete the only entry in the first matcher, leaving the remaining rule broken. To address this, we use a reference count for entries in the first matcher and delete STEs only when their refcount reaches zero. Both challenges are resolved by having a per-matcher data structure (implemented with rhashtable) that manages refcounts for the first part of the rules and holds unique tags (managed via IDA) for these rules to set and to match on the second matcher. Limitations: ----------- We utilize metadata register REG_C_6 in this implementation, so its usage anywhere along the flow that might include the need for Complex Matcher is prohibited. The number and size of match parameters remain limited — now constrained by what can be represented by two definers instead of one. This architectural limitation arises from the structure of Complex Matchers. If future requirements demand more parameters, Complex Matchers can be extended beyond two matchers. Additionally, there is an implementation limit of 32 match parameters per matcher (disregarding parameter size). This limit can be lifted if needed. Patches: ------- - Patches 1-3: small additions/refactoring in preparation for Complex Matcher: exposed mlx5hws_table_ft_set_next_ft() in header, added definer function to convert field name enum to string, expose the polling function mlx5hws_bwc_queue_poll() in a header. - Patch 4: in preparation for Complex Matcher, this patch adds support for Isolated Matcher. - Patch 5: the main patch - Complex Matchers implementation. [2] Patch 6: fixing the usecase where rule insertion was failing, but rehash couldn't be initiated if the number of rules in the table is below the rehash threshold. Patch 7: fixing the usecase where many rules in parallel would require rehash, due to the way the counting of rules was done. Patch 8: fixing the case where rules were requiring action template extension in parallel, leading to unneeded extensions with the same templates. Patch 9: refactor and simplify the rehash loop. Patch 10: dump error completion details, which helps a lot in trying to understand what went wrong, especially during rehash. ==================== Link: https://patch.msgid.link/1746992290-568936-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, dump bad completion detailsYevgeny Kliteynik2-3/+120
Failing to insert/delete a rule should not happen. If it does happen, it would be good to know at which stage it happened and what was the failure. This patch adds printing of bad CQE details. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-11-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, rework rehash loopYevgeny Kliteynik1-75/+52
Reworking the rehash loop - simplifying the code and making it less error prone: - Instead of doing round-robin on all the queues with batch of rules in each cycle, just go over all the queues and move all the rules that belong to this queue. - If at some stage of moving the rule we get a failure (which should not happen), this can't be rolled back. So instead of aborting rehash and leaving the matcher in a broken state, allow the loop to continue: attempt to move the rest of the rules and delete the old matcher. A rule that failed to move to a new matcher will loose its match STE once the rehash is completed and the old matcher is deleted, so the rule won't match any traffic any more. This rule's packets will fall back to the steering pipeline w/o HW offload. Rehash procedure will return an error, which will cause the rule insertion to fail for the rule that started this whole rehash. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-10-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, fix redundant extension of action templatesYevgeny Kliteynik1-51/+54
When a rule is inserted into a matcher, we search for the suitable action template. If such template is not found, action template array is extended with the new template. However, when several threads are performing this in parallel, there is a race - we can end up with extending the action templates array with the same template. This patch is doing the following: - refactor the code to find action template index in rule create and update, have the common code in an auxiliary function - after locking all the queues, check again if the action template array still needs to be extended Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-9-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, fix counting of rules in the matcherYevgeny Kliteynik1-5/+5
Currently the counter that counts number of rules in a matcher is increased only when rule insertion is completed. In a multi-threaded usecase this can lead to a scenario that many rules can be in process of insertion in the same matcher, while none of them has completed the insertion and the rule counter is not updated. This results in a rule insertion failure for many of them at first attempt, which leads to all of them requiring rehash and requiring locking of all the queue locks. This patch fixes the case by increasing the rule counter in the beginning of insertion process and decreasing in case of any failure. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-8-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, force rehash when rule insertion failedYevgeny Kliteynik2-2/+7
Rules are inserted into hash table in accordance with their hash index. When a certain number of rules is reached, the table is rehashed: a bigger new table is allocated and all the rules are moved there. But sometimes a new rule can't be inserted into the hash table because its index is full, even though the number of rules in the table is well below the threshold. The hash function is not perfect, so such cases are not rare. When that happens, we want to do the same rehash, in order to increase the table size and lower the probability for such cases. This patch fixes the usecase where rule insertion was failing, but rehash couldn't be initiated due to low number of rules: it adds flag that denotes that rehash is required, even if the number of rules in the table is below the rehash threshold. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-7-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, support complex matchersYevgeny Kliteynik4-27/+1410
This patch adds support for Complex Matchers/Rules Overview: -------- A matcher can match on a certain set of match parameters. However, the number and size of match params for a single matcher are limited: all the parameters must fit within a single definer. A common example of this limitation is IPv6 address matching, where matching both source and destination IPs requires more bits than a single definer can support. SW Steering addresses this limitation by chaining multiple Steering Table Entries (STEs) within the same matcher, where each STE matches on a subset of the parameters. In HW Steering, such chaining is not possible — the matcher's STEs are managed in a hash table, and a single definer is used to calculate the hash index for STEs. To address this limitation in HW Steering, we introduce Complex Matchers, which consist of two chained matchers. This allows matching on twice as many parameters. Complex Matchers are filled with Complex Rules — rules that are split into two parts and inserted into their respective matchers. The first half of the Complex Matcher is a regular matcher and points to the second half, which is an Isolated Matcher. An Isolated Matcher has its own isolated table and is accessible only by traffic coming from the first half of the Complex Matcher. This splitting of matchers/rules into multiple parts is transparent to users. It is hidden under the BWC HWS API. It becomes visible only when dumping steering debug information, where the Complex Matcher appears as two separate matchers: one in the user-created table and another in its isolated table. Some implementation details: --------------------------- All user actions are performed on the second part of the rules only. The first part handles matching and applies two actions: modify header (set metadata, see details below) and go-to-table (directing traffic to the isolated table containing the isolated matcher). Rule updates (updating rule actions) are applied to the second part of the rule since user-provided actions are not executed in the first matcher. We use REG_C_6 metadata register to set and match on unique per-rule tag (see details below). Splitting rules into two parts introduces new challenges: 1. Invalid Combinations Consider two rules with different matching values: - Rule 1: A+B - Rule 2: C+D Let's split the rules into two parts as follows: |---| |---| | A | | B | |---| --> |---| | C | | D | |---| |---| Splitting these rules results in invalid combinations like A+D and C+B. To resolve this, we assign unique tags to each rule on the first matcher and match these tags on the second matcher (the tag is implemented through modify_hdr action that sets value to metadata register REG_C_6): |----------| |---------| | A | | B, TagA | | action: | | | | set TagA | | | |----------| --> |---------| | C | | D, TagB | | action: | | | | set TagB | | | |----------| |---------| 2. Duplicated Entries: Consider two rules with overlapping values: - Rule 1: A+B - Rule 2: A+D Let's split the rules into two parts as follows: |---| |---| | A | | B | |---| --> |---| | | | D | |---| |---| This leads to the duplicated entries on the first matcher, which HWS doesn't allow: subsequent delete of either of the rules will delete the only entry in the first matcher, leaving the remaining rule broken. To address this, we use a reference count for entries in the first matcher and delete STEs only when their refcount reaches zero. Both challenges are resolved by having a per-matcher data structure (implemented with rhashtable) that manages refcounts for the first part of the rules and holds unique tags (managed via IDA) for these rules to set and to match on the second matcher. Limitations: ----------- We utilize metadata register REG_C_6 in this implementation, so its usage anywhere along the steering of the flow that might include the need for Complex Matcher is prohibited. The number and size of match parameters remain limited — now it is constrained by what can be represented by two definers instead of one. This architectural limitation arises from the structure of Complex Matchers. If future requirements demand more parameters, Complex Matchers can be extended beyond two matchers. Additionally, there is an implementation limit of 32 match parameters per rule (disregarding parameter size). This limit can be lifted if needed. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-6-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, introduce isolated matchersYevgeny Kliteynik3-1/+288
In preparation for complex matcher support, introduce the isolated matcher. Isolated matcher is a matcher that has its own isolated table. It is used as the second half of the complex matcher: when the rule is split into two parts (complex rule), then matching on the first part will send the packet to the isolated matcher that will try to match on the second part. In case of miss, the packet goes back to the matcher's end flow table. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, expose polling function in header fileYevgeny Kliteynik2-13/+21
In preparation for complex matcher, expose the function that is polling queue for completion (mlx5hws_bwc_queue_poll) in header file, so that it will be used by complex matcher code. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, add definer function to get field name strYevgeny Kliteynik2-0/+214
In preparation for complex matcher support, add function for converting definer fname to str, which will be used in following patches. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14net/mlx5: HWS, expose function mlx5hws_table_ft_set_next_ft in headerYevgeny Kliteynik2-8/+13
In preparation for complex matcher support, make function mlx5hws_table_ft_set_next_ft() non-static and expose it in header. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1746992290-568936-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14vsock/test: Fix occasional failure in SIOCOUTQ testsKonstantin Shkolnyy1-12/+16
These tests: "SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes" "SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes" output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)". They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data have been received by the other side. However, sometimes there is a delay in updating this "unsent bytes" counter, and the test fails even though the counter properly goes to 0 several milliseconds later. The delay occurs in the kernel because the used buffer notification callback virtio_vsock_tx_done(), called upon receipt of the data by the other side, doesn't update the counter itself. It delegates that to a kernel thread (via vsock->tx_work). Sometimes that thread is delayed more than the test expects. Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs. Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test") Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14tools: ynl-gen: Allow multi-attr without nested-attributes againLukas Wunner1-4/+3
Since commit ce6cb8113c84 ("tools: ynl-gen: individually free previous values on double set"), specifying the "multi-attr" property raises an error unless the "nested-attributes" property is specified as well: File "tools/net/ynl/./pyynl/ynl_gen_c.py", line 1147, in _load_nested_sets child = self.pure_nested_structs.get(nested) ^^^^^^ UnboundLocalError: cannot access local variable 'nested' where it is not associated with a value This appears to be a bug since there are existing specs which omit "nested-attributes" on "multi-attr" attributes. Also, according to Documentation/userspace-api/netlink/specs.rst, multi-attr "is the recommended way of implementing arrays (no extra nesting)", suggesting that nesting should even be avoided in favor of multi-attr. Fix the indentation of the if-block introduced by the commit to avoid the error. Fixes: ce6cb8113c84 ("tools: ynl-gen: individually free previous values on double set") Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patch.msgid.link/d6b58684b7e5bfb628f7313e6893d0097904e1d1.1746940107.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-14x86/its: Fix build errors when CONFIG_MODULES=nEric Biggers1-0/+6
Fix several build errors when CONFIG_MODULES=n, including the following: ../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module' 195 | for (int i = 0; i < mod->its_num_pages; i++) { Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-13x86/CPU/AMD: Add X86_FEATURE_ZEN6Yazen Ghannam2-1/+6
Add a synthetic feature flag for Zen6. [ bp: Move the feature flag to a free slot and avoid future merge conflicts from incoming stuff. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250513204857.3376577-1-yazen.ghannam@amd.com
2025-05-13drm/xe: Fix the gem shrinker nameThomas Hellström1-1/+1
The xe buffer object shrinker name is visible in the <debugfs>/shrinker directory and most if not all other shinkers follow a naming convention that looks like <subsystem>-<driver>_<objects>:<unique> Follow the same convention for xe, changing the name to drm-xe_gem:<unique>. Other shrinkers typically use the device node for <unique> but since drm drivers typically don't have a single unique device- node, instead use the unique name in the drm device. Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 243bf99e2fe75edf8df1711c1377b6fc020b806c) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-13Merge tag 'probes-fixes-v6.15-rc6' of ↵Linus Torvalds7-4/+32
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - fprobe: Fix RCU warning message in list traversal fprobe_module_callback() using hlist_for_each_entry_rcu() traverse the fprobe list but it locks fprobe_mutex() instead of rcu lock because it is enough. So add lockdep_is_held() to avoid warning. - tracing: eprobe: Add missing trace_probe_log_clear for eprobe __trace_eprobe_create() uses trace_probe_log but forgot to clear it at exit. Add trace_probe_log_clear() calls. - tracing: probes: Fix possible race in trace_probe_log APIs trace_probe_log APIs are used in probe event (dynamic_events, kprobe_events and uprobe_events) creation. Only dynamic_events uses the dyn_event_ops_mutex mutex to serialize it. This makes kprobe and uprobe events to lock the same mutex to serialize its creation to avoid race in trace_probe_log APIs. * tag 'probes-fixes-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: probes: Fix a possible race in trace_probe_log APIs tracing: add missing trace_probe_log_clear for eprobes tracing: fprobe: Fix RCU warning message in list traversal
2025-05-13drm/amd/display: Avoid flooding unnecessary info messagesWayne Lin1-3/+3
It's expected that we'll encounter temporary exceptions during aux transactions. Adjust logging from drm_info to drm_dbg_dp to prevent flooding with unnecessary log messages. Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9a9c3e1fe5256da14a0a307dff0478f90c55fc8c) Cc: stable@vger.kernel.org
2025-05-13drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dppMelissa Wen1-3/+3
Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null pointer dereference on dcn20_update_dchubp_dpp. This is the same function hooked for update_dchubp_dpp in dcn401, with the same issue. Fix possible null pointer deference on dcn401_program_pipe too. Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup") Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d8d47f739752227957d8efc0cb894761bfe1d879)
2025-05-13drm/amd/display: check stream id dml21 wrapper to get plane_idAurabindo Pillai1-9/+11
[Why & How] Fix a false positive warning which occurs due to lack of correct checks when querying plane_id in DML21. This fixes the warning when performing a mode1 reset (cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover): [ 35.751250] WARNING: CPU: 11 PID: 326 at /tmp/amd.PHpyAl7v/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.c:91 dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751434] Modules linked in: amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec rc_core i2c_algo_bit rfcomm qrtr cmac algif_hash algif_skcipher af_alg bnep amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul polyval_clmulni polyval_generic btusb ghash_clmulni_intel sha256_ssse3 btrtl sha1_ssse3 snd_seq btintel aesni_intel btbcm btmtk snd_seq_device crypto_simd sunrpc cryptd bluetooth snd_timer ccp binfmt_misc rapl snd i2c_piix4 wmi_bmof gigabyte_wmi k10temp i2c_smbus soundcore gpio_amdpt mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igc ahci xhci_pci libahci xhci_pci_renesas video wmi [ 35.751501] CPU: 11 UID: 0 PID: 326 Comm: kworker/u64:9 Tainted: G OE 6.11.0-21-generic #21~24.04.1-Ubuntu [ 35.751504] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 35.751505] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F30 05/22/2024 [ 35.751506] Workqueue: amdgpu-reset-dev amdgpu_debugfs_reset_work [amdgpu] [ 35.751638] RIP: 0010:dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751794] Code: 6d 0c 00 00 8b 84 24 88 00 00 00 41 3b 44 9c 20 0f 84 fc 07 00 00 48 83 c3 01 48 83 fb 06 75 b3 4c 8b 64 24 68 4c 8b 6c 24 40 <0f> 0b b8 06 00 00 00 49 8b 94 24 a0 49 00 00 89 c3 83 f8 07 0f 87 [ 35.751796] RSP: 0018:ffffbfa3805d7680 EFLAGS: 00010246 [ 35.751798] RAX: 0000000000010000 RBX: 0000000000000006 RCX: 0000000000000000 [ 35.751799] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 [ 35.751800] RBP: ffffbfa3805d78f0 R08: 0000000000000000 R09: 0000000000000000 [ 35.751801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbfa383249000 [ 35.751802] R13: ffffa0e68f280000 R14: ffffbfa383249658 R15: 0000000000000000 [ 35.751803] FS: 0000000000000000(0000) GS:ffffa0edbe580000(0000) knlGS:0000000000000000 [ 35.751804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.751805] CR2: 00005d847ef96c58 CR3: 000000041de3e000 CR4: 0000000000f50ef0 [ 35.751806] PKRU: 55555554 [ 35.751807] Call Trace: [ 35.751810] <TASK> [ 35.751816] ? show_regs+0x6c/0x80 [ 35.751820] ? __warn+0x88/0x140 [ 35.751822] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.751964] ? report_bug+0x182/0x1b0 [ 35.751969] ? handle_bug+0x6e/0xb0 [ 35.751972] ? exc_invalid_op+0x18/0x80 [ 35.751974] ? asm_exc_invalid_op+0x1b/0x20 [ 35.751978] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu] [ 35.752117] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752256] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752260] ? math_pow+0x48/0xa0 [amdgpu] [ 35.752400] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752403] ? math_pow+0x11/0xa0 [amdgpu] [ 35.752524] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752526] ? core_dcn4_mode_programming+0xe4d/0x20d0 [amdgpu] [ 35.752663] ? srso_alias_return_thunk+0x5/0xfbef5 [ 35.752669] dml21_validate+0x3d4/0x980 [amdgpu] Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f8ad62c0a93e5dd94243e10f1b742232e4d6411e)
2025-05-13drm/amd/display: fix link_set_dpms_off multi-display MST corner caseGeorge Shen1-2/+11
[Why & How] When MST config is unplugged/replugged too quickly, it can potentially result in a scenario where previous DC state has not been reset before the HPD link detection sequence begins. In this case, driver will disable the streams/link prior to re-enabling the link for link training. There is a bug in the current logic that does not account for the fact that current_state can be released and cleared prior to swapping to a new state (resulting in the pipe_ctx stream pointers to be cleared) in between disabling streams. To resolve this, cache the original streams prior to committing any stream updates. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1561782686ccc36af844d55d31b44c938dd412dc)
2025-05-13drm/amd/display: Defer BW-optimization-blocked DRR adjustmentsJohn Olender2-3/+9
[Why & How] Instead of dropping DRR updates, defer them. This fixes issues where monitor continues to see incorrect refresh rate after VRR was turned off by userspace. Fixes: 32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations pending") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3546 Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: John Olender <john.olender@gmail.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 53761b7ecd83e6fbb9f2206f8c980a6aa308c844) Cc: stable@vger.kernel.org
2025-05-13Revert: "drm/amd/display: Enable urgent latency adjustment on DCN35"Gabe Teeger1-2/+2
This reverts commit 756c85e4d0dd ("drm/amd/display: Enable urgent latency adjustment on DCN35") Reason for revert: Negative power impact. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9334c491cd8f388232b9a187bf0ddb728482bd6f)
2025-05-13drm/amd/display: Correct the reply value when AUX write incompleteWayne Lin2-3/+10
[Why] Now forcing aux->transfer to return 0 when incomplete AUX write is inappropriate. It should return bytes have been transferred. [How] aux->transfer is asked not to change original msg except reply field of drm_dp_aux_msg structure. Copy the msg->buffer when it's write request, and overwrite the first byte when sink reply 1 byte indicating partially written byte number. Then we can return the correct value without changing the original msg. Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ac37f0dcd2e0b729fa7b5513908dc8ab802b540) Cc: stable@vger.kernel.org
2025-05-13drm/amdgpu: fix incorrect MALL size for GFX1151Tim Huang1-0/+12
On GFX1151, the reported MALL cache size reflects only half of its actual size; this adjustment corrects the discrepancy. Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0a5c060b593ad152318f89e5564bfdfcff8a6ac0) Cc: stable@vger.kernel.org
2025-05-13drm/amdgpu: csa unmap use uninterruptible lockPhilip Yang1-1/+1
After process exit to unmap csa and free GPU vm, if signal is accepted and then waiting to take vm lock is interrupted and return, it causes memory leaking and below warning backtrace. Change to use uninterruptible wait lock fix the issue. WARNING: CPU: 69 PID: 167800 at amd/amdgpu/amdgpu_kms.c:1525 amdgpu_driver_postclose_kms+0x294/0x2a0 [amdgpu] Call Trace: <TASK> drm_file_free.part.0+0x1da/0x230 [drm] drm_close_helper.isra.0+0x65/0x70 [drm] drm_release+0x6a/0x120 [drm] amdgpu_drm_release+0x51/0x60 [amdgpu] __fput+0x9f/0x280 ____fput+0xe/0x20 task_work_run+0x67/0xa0 do_exit+0x217/0x3c0 do_group_exit+0x3b/0xb0 get_signal+0x14a/0x8d0 arch_do_signal_or_restart+0xde/0x100 exit_to_user_mode_loop+0xc1/0x1a0 exit_to_user_mode_prepare+0xf4/0x100 syscall_exit_to_user_mode+0x17/0x40 do_syscall_64+0x69/0xc0 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7dbbfb3c171a6f63b01165958629c9c26abf38ab) Cc: stable@vger.kernel.org
2025-05-13x86/sev: Make sure pages are not skipped during kdumpAshish Kalra1-4/+7
When shared pages are being converted to private during kdump, additional checks are performed. They include handling the case of a GHCB page being contained within a huge page. Currently, this check incorrectly skips a page just below the GHCB page from being transitioned back to private during kdump preparation. This skipped page causes a 0x404 #VC exception when it is accessed later while dumping guest memory for vmcore generation. Correct the range to be checked for GHCB contained in a huge page. Also, ensure that the skipped huge page containing the GHCB page is transitioned back to private by applying the correct address mask later when changing GHCBs to private at end of kdump preparation. [ bp: Massage commit message. ] Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec") Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250506183529.289549-1-Ashish.Kalra@amd.com
2025-05-13x86/sev: Do not touch VMSA pages during SNP guest memory kdumpAshish Kalra1-86/+158
When kdump is running makedumpfile to generate vmcore and dump SNP guest memory it touches the VMSA page of the vCPU executing kdump. It then results in unrecoverable #NPF/RMP faults as the VMSA page is marked busy/in-use when the vCPU is running and subsequently a causes guest softlockup/hang. Additionally, other APs may be halted in guest mode and their VMSA pages are marked busy and touching these VMSA pages during guest memory dump will also cause #NPF. Issue AP_DESTROY GHCB calls on other APs to ensure they are kicked out of guest mode and then clear the VMSA bit on their VMSA pages. If the vCPU running kdump is an AP, mark it's VMSA page as offline to ensure that makedumpfile excludes that page while dumping guest memory. Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec") Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250428214151.155464-1-Ashish.Kalra@amd.com
2025-05-13clk: sunxi-ng: d1: Add missing divider for MMC mod clocksAndre Przywara2-19/+47
The D1/R528/T113 SoCs have a hidden divider of 2 in the MMC mod clocks, just as other recent SoCs. So far we did not describe that, which led to the resulting MMC clock rate to be only half of its intended value. Use a macro that allows to describe a fixed post-divider, to compensate for that divisor. This brings the MMC performance on those SoCs to its expected level, so about 23 MB/s for SD cards, instead of the 11 MB/s measured so far. Fixes: 35b97bb94111 ("clk: sunxi-ng: Add support for the D1 SoC clocks") Reported-by: Kuba Szczodrzyński <kuba@szczodrzynski.pl> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Link: https://patch.msgid.link/20250501120631.837186-1-andre.przywara@arm.com Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2025-05-13remoteproc: qcom_wcnss: Fix on platforms without fallback regulatorsMatti Lehtimäki1-1/+2
Recent change to handle platforms with only single power domain broke pronto-v3 which requires power domains and doesn't have fallback voltage regulators in case power domains are missing. Add a check to verify the number of fallback voltage regulators before using the code which handles single power domain situation. Fixes: 65991ea8a6d1 ("remoteproc: qcom_wcnss: Handle platforms with only single power domain") Signed-off-by: Matti Lehtimäki <matti.lehtimaki@gmail.com> Tested-by: Luca Weiss <luca.weiss@fairphone.com> # sdm632-fairphone-fp3 Link: https://lore.kernel.org/r/20250511234026.94735-1-matti.lehtimaki@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-05-13HID: bpf: abort dispatch if device destroyedRong Zhang1-0/+9
The current HID bpf implementation assumes no output report/request will go through it after hid_bpf_destroy_device() has been called. This leads to a bug that unplugging certain types of HID devices causes a cleaned- up SRCU to be accessed. The bug was previously a hidden failure until a recent x86 percpu change [1] made it access not-present pages. The bug will be triggered if the conditions below are met: A) a device under the driver has some LEDs on B) hid_ll_driver->request() is uninplemented (e.g., logitech-djreceiver) If condition A is met, hidinput_led_worker() is always scheduled *after* hid_bpf_destroy_device(). hid_destroy_device ` hid_bpf_destroy_device ` cleanup_srcu_struct(&hdev->bpf.srcu) ` hid_remove_device ` ... ` led_classdev_unregister ` led_trigger_set(led_cdev, NULL) ` led_set_brightness(led_cdev, LED_OFF) ` ... ` input_inject_event ` input_event_dispose ` hidinput_input_event ` schedule_work(&hid->led_work) [hidinput_led_worker] This is fine when condition B is not met, where hidinput_led_worker() calls hid_ll_driver->request(). This is the case for most HID drivers, which implement it or use the generic one from usbhid. The driver itself or an underlying driver will then abort processing the request. Otherwise, hidinput_led_worker() tries hid_hw_output_report() and leads to the bug. hidinput_led_worker ` hid_hw_output_report ` dispatch_hid_bpf_output_report ` srcu_read_lock(&hdev->bpf.srcu) ` srcu_read_unlock(&hdev->bpf.srcu, idx) The bug has existed since the introduction [2] of dispatch_hid_bpf_output_report(). However, the same bug also exists in dispatch_hid_bpf_raw_requests(), and I've reproduced (no visible effect because of the lack of [1], but confirmed bpf.destroyed == 1) the bug against the commit (i.e., the Fixes:) introducing the function. This is because hidinput_led_worker() falls back to hid_hw_raw_request() when hid_ll_driver->output_report() is uninplemented (e.g., logitech- djreceiver). hidinput_led_worker ` hid_hw_output_report: -ENOSYS ` hid_hw_raw_request ` dispatch_hid_bpf_raw_requests ` srcu_read_lock(&hdev->bpf.srcu) ` srcu_read_unlock(&hdev->bpf.srcu, idx) Fix the issue by returning early in the two mentioned functions if hid_bpf has been marked as destroyed. Though dispatch_hid_bpf_device_event() handles input events, and there is no evidence that it may be called after the destruction, the same check, as a safety net, is also added to it to maintain the consistency among all dispatch functions. The impact of the bug on other architectures is unclear. Even if it acts as a hidden failure, this is still dangerous because it corrupts whatever is on the address calculated by SRCU. Thus, CC'ing the stable list. [1]: commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets") [2]: commit 9286675a2aed ("HID: bpf: add HID-BPF hooks for hid_hw_output_report") Closes: https://lore.kernel.org/all/20250506145548.GGaBoi9Jzp3aeJizTR@fat_crate.local/ Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests") Cc: stable@vger.kernel.org Signed-off-by: Rong Zhang <i@rong.moe> Tested-by: Petr Tesarik <petr@tesarici.cz> Link: https://patch.msgid.link/20250512152420.87441-1-i@rong.moe Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-05-13net: dsa: b53: prevent standalone from trying to forward to other portsJonas Gorski2-0/+47
When bridged ports and standalone ports share a VLAN, e.g. via VLAN uppers, or untagged traffic with a vlan unaware bridge, the ASIC will still try to forward traffic to known FDB entries on standalone ports. But since the port VLAN masks prevent forwarding to bridged ports, this traffic will be dropped. This e.g. can be observed in the bridge_vlan_unaware ping tests, where this breaks pinging with learning on. Work around this by enabling the simplified EAP mode on switches supporting it for standalone ports, which causes the ASIC to redirect traffic of unknown source MAC addresses to the CPU port. Since standalone ports do not learn, there are no known source MAC addresses, so effectively this redirects all incoming traffic to the CPU port. Fixes: ff39c2d68679 ("net: dsa: b53: Add bridge support") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20250508091424.26870-1-jonas.gorski@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13tracing: probes: Fix a possible race in trace_probe_log APIsMasami Hiramatsu (Google)5-3/+27
Since the shared trace_probe_log variable can be accessed and modified via probe event create operation of kprobe_events, uprobe_events, and dynamic_events, it should be protected. In the dynamic_events, all operations are serialized by `dyn_event_ops_mutex`. But kprobe_events and uprobe_events interfaces are not serialized. To solve this issue, introduces dyn_event_create(), which runs create() operation under the mutex, for kprobe_events and uprobe_events. This also uses lockdep to check the mutex is held when using trace_probe_log* APIs. Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@devnote2/ Reported-by: Paul Cacheux <paulcacheux@gmail.com> Closes: https://lore.kernel.org/all/20250510074456.805a16872b591e2971a4d221@kernel.org/ Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-05-13Merge branch 'amd-xgbe-add-support-for-amd-renoir'Paolo Abeni5-57/+227
Raju Rangoju says: ==================== amd-xgbe: add support for AMD Renoir Add support for a new AMD Ethernet device called "Renoir". It has a new PCI ID, add this to the current list of supported devices in the amd-xgbe devices. Also, the BAR1 addresses cannot be used to access the PCS registers on Renoir platform, use the indirect addressing via SMN instead. ==================== Link: https://patch.msgid.link/20250509155325.720499-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13amd-xgbe: add support for new pci device id 0x1641Raju Rangoju1-0/+18
Add support for new pci device id 0x1641 to register Renoir device with PCIe. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250509155325.720499-6-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13amd-xgbe: Add XGBE_XPCS_ACCESS_V3 support to xgbe_pci_probe()Raju Rangoju3-7/+33
A new version of XPCS access routines have been introduced, add the support to xgbe_pci_probe() to use these routines. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250509155325.720499-5-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13amd-xgbe: add support for new XPCS routinesRaju Rangoju3-0/+122
Add the necessary support to enable Renoir ethernet device. Since the BAR1 address cannot be used to access the XPCS registers on Renoir, use the smn functions. Some of the ethernet add-in-cards have dual PHY but share a single MDIO line (between the ports). In such cases, link inconsistencies are noticed during the heavy traffic and during reboot stress tests. Using smn calls helps avoid such race conditions. Suggested-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250509155325.720499-4-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>