From 3197cce4d48c1e5f159f8f24a4e3b9bd83385c14 Mon Sep 17 00:00:00 2001 From: Gabriel Goller Date: Mon, 23 Feb 2026 11:12:54 +0100 Subject: docs: net: document neigh gc_stale_time sysctl Add missing documentation for a neighbor table garbage collector sysctl parameter in ip-sysctl.rst: neigh/default/gc_stale_time: controls how long an unused neighbor entry is kept before becoming eligible for garbage collection (default: 60 seconds) Signed-off-by: Gabriel Goller Link: https://patch.msgid.link/20260223101257.47563-1-g.goller@proxmox.com Signed-off-by: Jakub Kicinski --- Documentation/networking/ip-sysctl.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 6921d8594b84..9c90333530fa 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -202,6 +202,17 @@ neigh/default/gc_thresh3 - INTEGER Default: 1024 +neigh/default/gc_stale_time - INTEGER + Determines how long a neighbor entry can remain unused before it is + considered stale and eligible for garbage collection. Entries that have + not been used for longer than this time will be removed by the garbage + collector, unless they have active references, are marked as PERMANENT, + or carry the NTF_EXT_LEARNED or NTF_EXT_VALIDATED flag. Stale entries + are only removed by the periodic GC when there are at least gc_thresh1 + neighbors in the table. + + Default: 60 seconds + neigh/default/unres_qlen_bytes - INTEGER The maximum number of bytes which may be used by packets queued for each unresolved address by other network layers. -- cgit v1.2.3 From 64db5933c7adcdc4dd8f5ef6506cc998ecbe63ac Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 23 Feb 2026 16:17:42 +0000 Subject: icmp: increase net.ipv4.icmp_msgs_{per_sec,burst} These sysctls were added in 4cdf507d5452 ("icmp: add a global rate limitation") and their default values might be too small. 
Some network tools send probes to closed UDP ports from many hosts to estimate proportion of packet drops on a particular target. This patch sets both sysctls to 10000. Note the per-peer rate-limit (as described in RFC 4443 2.4 (f)) intent is still enforced. This also increases security, see b38e7819cae9 ("icmp: randomize the global rate limiter") for reference. Signed-off-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20260223161742.929830-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- Documentation/networking/ip-sysctl.rst | 6 +++--- net/ipv4/icmp.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 9c90333530fa..d1eeb5323af0 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1758,14 +1758,14 @@ icmp_msgs_per_sec - INTEGER controlled by this limit. For security reasons, the precise count of messages per second is randomized. - Default: 1000 + Default: 10000 icmp_msgs_burst - INTEGER icmp_msgs_per_sec controls number of ICMP packets sent per second, - while icmp_msgs_burst controls the burst size of these packets. + while icmp_msgs_burst controls the token bucket size. For security reasons, the precise burst size is randomized. - Default: 50 + Default: 10000 icmp_ratemask - INTEGER Mask made of ICMP types for which rates are being limited. 
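Note on the documentation hunk above: it describes the global limiter as a token bucket, with icmp_msgs_per_sec as the refill rate and icmp_msgs_burst as the bucket size. The following is a minimal Python sketch of a generic token bucket, assuming nothing about the kernel's actual implementation (which additionally randomizes both values). The names TokenBucket and count_allowed are hypothetical and exist only for illustration. It shows how many of 200 back-to-back probes would pass under the old and new defaults.

```python
class TokenBucket:
    """Generic token bucket; an illustration, not the kernel's limiter."""

    def __init__(self, rate_per_sec, burst, now=0.0):
        self.rate = rate_per_sec      # tokens added per second
        self.burst = burst            # bucket capacity
        self.tokens = float(burst)    # start with a full bucket
        self.last = now               # time of the last refill

    def allow(self, now):
        # Refill for the elapsed time, capped at the bucket size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        # Spend one token per message if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_allowed(bucket, n, now):
    """Number of messages allowed out of n sent at the same instant."""
    return sum(bucket.allow(now) for _ in range(n))


# Old defaults (1000 msgs/sec, burst 50) against the new ones (10000, 10000).
print(count_allowed(TokenBucket(1000, 50), 200, 0.0))      # prints 50
print(count_allowed(TokenBucket(10000, 10000), 200, 0.0))  # prints 200
```

Under these assumptions the old burst size drops three quarters of a 200-probe burst, which matches the motivation given in the commit message for tools that probe one target from many hosts.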
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index a62b4c4033cc..1cf9e391aa0c 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -1727,8 +1727,8 @@ static int __net_init icmp_sk_init(struct net *net) net->ipv4.sysctl_icmp_ratemask = 0x1818; net->ipv4.sysctl_icmp_errors_use_inbound_ifaddr = 0; net->ipv4.sysctl_icmp_errors_extension_mask = 0; - net->ipv4.sysctl_icmp_msgs_per_sec = 1000; - net->ipv4.sysctl_icmp_msgs_burst = 50; + net->ipv4.sysctl_icmp_msgs_per_sec = 10000; + net->ipv4.sysctl_icmp_msgs_burst = 10000; return 0; } -- cgit v1.2.3 From d68d21ea6b29f87f1c334f007e845ca5fb678c80 Mon Sep 17 00:00:00 2001 From: Gabriel Goller Date: Wed, 25 Feb 2026 10:58:10 +0100 Subject: docs: net: document neigh gc_interval sysctl Add entry for the neigh/default/gc_interval sysctl. This sysctl is unused since kernel v2.6.8. Suggested-by: Jakub Kicinski Signed-off-by: Gabriel Goller Link: https://patch.msgid.link/20260225095822.44050-1-g.goller@proxmox.com Signed-off-by: Jakub Kicinski --- Documentation/networking/ip-sysctl.rst | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index d1eeb5323af0..265158534cda 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -202,6 +202,13 @@ neigh/default/gc_thresh3 - INTEGER Default: 1024 +neigh/default/gc_interval - INTEGER + Specifies how often the garbage collector for neighbor entries + should run. This value applies to the entire table, not + individual entries. Unused since kernel v2.6.8. + + Default: 30 seconds + neigh/default/gc_stale_time - INTEGER Determines how long a neighbor entry can remain unused before it is considered stale and eligible for garbage collection. 
Entries that have -- cgit v1.2.3 From 5cf47393d96f211836ce98a22cebdf0eb8555413 Mon Sep 17 00:00:00 2001 From: Yohei Kojima Date: Thu, 26 Feb 2026 00:12:09 +0900 Subject: docs: ethtool: clarify the bit-by-bit bitset format description Clarify the bit-by-bit bitset format's behavior around mandatory attributes and bit identification. More specifically, the following changes are made: * Rephrase a misleading sentence which implies name and index are mutually exclusive * Describe that ETHTOOL_A_BITSET_BITS nest is mandatory * Describe that a request fails if inconsistent identifiers are given Signed-off-by: Yohei Kojima Reviewed-by: Jakub Kicinski Reviewed-by: Simon Horman Link: https://patch.msgid.link/ef90a56965ca66e57aa177929ce3e10c5ca815fa.1772031974.git.yk@y-koj.net Signed-off-by: Jakub Kicinski --- Documentation/networking/ethtool-netlink.rst | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index af56c304cef4..32179168eb73 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -96,7 +96,7 @@ For short bitmaps of (reasonably) fixed length, standard ``NLA_BITFIELD32`` type is used. For arbitrary length bitmaps, ethtool netlink uses a nested attribute with contents of one of two forms: compact (two binary bitmaps representing bit values and mask of affected bits) and bit-by-bit (list of -bits identified by either index or name). +bits identified by index or name). 
Verbose (bit-by-bit) bitsets allow sending symbolic names for bits together with their values which saves a round trip (when the bitset is passed in a @@ -156,12 +156,16 @@ Bit-by-bit form: nested (bitset) attribute contents: | | | ``ETHTOOL_A_BITSET_BIT_VALUE`` | flag | present if bit is set | +-+-+--------------------------------+--------+-----------------------------+ -Bit size is optional for bit-by-bit form. ``ETHTOOL_A_BITSET_BITS`` nest can +For bit-by-bit form, ``ETHTOOL_A_BITSET_SIZE`` is optional, and +``ETHTOOL_A_BITSET_BITS`` is mandatory. ``ETHTOOL_A_BITSET_BITS`` nest can only contain ``ETHTOOL_A_BITSET_BITS_BIT`` attributes but there can be an arbitrary number of them. A bit may be identified by its index or by its name. When used in requests, listed bits are set to 0 or 1 according to -``ETHTOOL_A_BITSET_BIT_VALUE``, the rest is preserved. A request fails if -index exceeds kernel bit length or if name is not recognized. +``ETHTOOL_A_BITSET_BIT_VALUE``, the rest is preserved. + +A request fails if index exceeds kernel bit length or if name is not +recognized. If both name and index are set, the request will fail if they +point to different bits. When ``ETHTOOL_A_BITSET_NOMASK`` flag is present, bitset is interpreted as a simple bitmap. ``ETHTOOL_A_BITSET_BIT_VALUE`` attributes are not used in -- cgit v1.2.3 From 57cc8ab3e9f2c460204bc8facb7932b9b53be878 Mon Sep 17 00:00:00 2001 From: Leon Kral Date: Fri, 27 Feb 2026 00:07:47 +0000 Subject: net/handshake: Fixed grammar mistake The word "a" was used instead of "an" which is grammatically incorrect. Fixed by changing from "a" to "an". This improves readability of the documentation. 
Signed-off-by: Leon Kral Reviewed-by: Alistair Francis Link: https://patch.msgid.link/20260227001151.41610-1-leon.j.kral@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/networking/tls-handshake.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst index 6f5ea1646a47..4f7bc1087df9 100644 --- a/Documentation/networking/tls-handshake.rst +++ b/Documentation/networking/tls-handshake.rst @@ -7,7 +7,7 @@ In-Kernel TLS Handshake Overview ======== -Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs +Transport Layer Security (TLS) is an Upper Layer Protocol (ULP) that runs over TCP. TLS provides end-to-end data integrity and confidentiality in addition to peer authentication. -- cgit v1.2.3 From ca4c7771a059fb2665fecc7d5954672d09475089 Mon Sep 17 00:00:00 2001 From: Konrad Dybcio Date: Mon, 2 Mar 2026 16:58:43 +0100 Subject: dt-bindings: sram: qcom,imem: Allow modem-tables subnode The IP Accelerator hardware/firmware owns a sizeable region within the IMEM, named 'modem-tables', containing various packet processing configuration data. It's not actually accessed by the OS, although we have to IOMMU-map it with the IPA device, so that presumably the firmware can act upon it. Allow it as a subnode of IMEM. 
Reviewed-by: Krzysztof Kozlowski Reviewed-by: Alex Elder Signed-off-by: Konrad Dybcio Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-1-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/sram/qcom,imem.yaml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/sram/qcom,imem.yaml b/Documentation/devicetree/bindings/sram/qcom,imem.yaml index 6a627c57ae2f..c63026904061 100644 --- a/Documentation/devicetree/bindings/sram/qcom,imem.yaml +++ b/Documentation/devicetree/bindings/sram/qcom,imem.yaml @@ -67,6 +67,20 @@ properties: $ref: /schemas/power/reset/syscon-reboot-mode.yaml# patternProperties: + "^modem-tables@[0-9a-f]+$": + type: object + description: + Region containing packet processing configuration for the IP Accelerator. + + properties: + reg: + maxItems: 1 + + required: + - reg + + additionalProperties: false + "^pil-reloc@[0-9a-f]+$": $ref: /schemas/remoteproc/qcom,pil-info.yaml# description: Peripheral image loader relocation region -- cgit v1.2.3 From f5a598abfdd9dc73a9537b9628d4f26a3069c707 Mon Sep 17 00:00:00 2001 From: Konrad Dybcio Date: Mon, 2 Mar 2026 16:58:44 +0100 Subject: dt-bindings: net: qcom,ipa: Add sram property for describing IMEM slice The IPA driver currently grabs a slice of IMEM through hardcoded addresses. Not only is that ugly and against the principles of DT, but it also creates a situation where two distinct platforms implementing the same version of IPA would need to be hardcoded together and matched at runtime. Instead, do the sane thing and accept a handle to said region directly. Don't make it required on purpose, as it's not there on ancient implementations (currently unsupported) and we're not yet done with filling the data across all DTs.
Reviewed-by: Alex Elder Reviewed-by: Krzysztof Kozlowski Signed-off-by: Konrad Dybcio Link: https://patch.msgid.link/20260302-topic-ipa_imem-v6-2-c0ebbf3eae9f@oss.qualcomm.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/qcom,ipa.yaml | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml index c7f5f2ef7452..4237e74041ef 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml @@ -165,6 +165,13 @@ properties: initializing IPA hardware. Optional, and only used when Trust Zone performs early initialization. + sram: + maxItems: 1 + description: + A reference to an additional region residing in IMEM (special + on-chip SRAM), which is accessed by the IPA firmware and needs + to be IOMMU-mapped from the OS. + required: - compatible - iommus -- cgit v1.2.3 From 8838bb185ef3c801ea4641a95bdace6210cd4b4e Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Tue, 3 Mar 2026 03:16:23 +0000 Subject: dt-bindings: net: dsa: maxlinear,mxl862xx: remove port label The ports in the example device tree should not have a 'label' property. Labels for all user ports have been removed from an earlier submission, but this was overlooked in the case of the CPU port. Remove 'cpu' port label from the example. 
Suggested-by: Vladimir Oltean Signed-off-by: Daniel Golle Reviewed-by: Andrew Lunn Acked-by: Conor Dooley Link: https://patch.msgid.link/61579de297eb636ec5f1e6c97d453e26abb0625d.1772507210.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/dsa/maxlinear,mxl862xx.yaml | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/maxlinear,mxl862xx.yaml b/Documentation/devicetree/bindings/net/dsa/maxlinear,mxl862xx.yaml index f1d667f7a055..2f19c19c60f3 100644 --- a/Documentation/devicetree/bindings/net/dsa/maxlinear,mxl862xx.yaml +++ b/Documentation/devicetree/bindings/net/dsa/maxlinear,mxl862xx.yaml @@ -110,7 +110,6 @@ examples: port@9 { reg = <9>; - label = "cpu"; ethernet = <&gmac0>; phy-mode = "usxgmii"; -- cgit v1.2.3 From dd378109d20ff6789091fa3558607c1d242d80ad Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 2 Mar 2026 18:14:29 +0000 Subject: net-sysfs: use rps_tag_ptr and remove metadata from rps_sock_flow_table Instead of storing the @mask at the beginning of rps_sock_flow_table, use 5 low order bits of the rps_tag_ptr to store the log of the size. This removes a potential cache line miss to fetch @mask. More importantly, we can switch to vmalloc_huge() without wasting memory. 
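Note on the encoding described above: the diff builds the tagged value as `(rps_tag_ptr)sock_table | log` and recovers the size as `rps_tag_to_mask(tag) + 1`. The definitions of rps_tag_to_mask() and rps_tag_to_table() are not part of this excerpt, so the decoding below is an assumption inferred from those two uses, not the kernel's code. A minimal Python sketch with hypothetical helper names (make_tag_ptr, tag_to_mask, tag_to_table), using plain integers in place of pointers:

```python
TAG_BITS = 5                      # low-order bits reserved for log2(size)
TAG_MASK = (1 << TAG_BITS) - 1    # 0x1f


def make_tag_ptr(table_addr, size):
    """Pack a table address and log2(size) into one integer."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a power of two")
    if table_addr & TAG_MASK:
        raise ValueError("table address must have its low 5 bits clear")
    log = size.bit_length() - 1   # log2 of a power of two
    return table_addr | log


def tag_to_mask(tag_ptr):
    """Assumed decoding: index mask is (1 << log) - 1."""
    return (1 << (tag_ptr & TAG_MASK)) - 1


def tag_to_table(tag_ptr):
    """Assumed decoding: clear the tag bits to get the address back."""
    return tag_ptr & ~TAG_MASK


# A made-up, suitably aligned address and the table size used in the test below.
tag = make_tag_ptr(0xffff888000100000, 4194304)
print(tag & TAG_MASK)             # prints 22, since 4194304 == 1 << 22
print(hex(tag_to_mask(tag)))      # prints 0x3fffff
```

Because the sysctl handler caps the size at 1<<29, the logarithm never exceeds 29 and always fits in the five tag bits. The mask is then derived from the pointer value itself, which is what lets the patch drop the `_mask` field from the table.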
Tested with: numactl --interleave=all bash -c "echo 4194304 >/proc/sys/net/core/rps_sock_flow_entries" Signed-off-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20260302181432.1836150-5-edumazet@google.com Signed-off-by: Jakub Kicinski --- Documentation/networking/scaling.rst | 13 ++++-- include/net/hotdata.h | 5 +- include/net/rps.h | 42 ++++++++--------- net/core/dev.c | 12 +++-- net/core/sysctl_net_core.c | 89 +++++++++++++++++++----------------- 5 files changed, 86 insertions(+), 75 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst index 0023afa530ec..6c261eb48845 100644 --- a/Documentation/networking/scaling.rst +++ b/Documentation/networking/scaling.rst @@ -403,16 +403,21 @@ Both of these need to be set before RFS is enabled for a receive queue. Values for both are rounded up to the nearest power of two. The suggested flow count depends on the expected number of active connections at any given time, which may be significantly less than the number of open -connections. We have found that a value of 32768 for rps_sock_flow_entries -works fairly well on a moderately loaded server. +connections. We have found that a value of 65536 for rps_sock_flow_entries +works fairly well on a moderately loaded server. Big servers might +need 1048576 or even higher values. + +On a NUMA host it is advisable to spread rps_sock_flow_entries on all nodes. + +numactl --interleave=all bash -c "echo 1048576 >/proc/sys/net/core/rps_sock_flow_entries" For a single queue device, the rps_flow_cnt value for the single queue would normally be configured to the same value as rps_sock_flow_entries. For a multi-queue device, the rps_flow_cnt for each queue might be configured as rps_sock_flow_entries / N, where N is the number of -queues. So for instance, if rps_sock_flow_entries is set to 32768 and there +queues. 
So for instance, if rps_sock_flow_entries is set to 131072 and there are 16 configured receive queues, rps_flow_cnt for each queue might be -configured as 2048. +configured as 8192. Accelerated RFS diff --git a/include/net/hotdata.h b/include/net/hotdata.h index 6632b1aa7584..62534d1f3c70 100644 --- a/include/net/hotdata.h +++ b/include/net/hotdata.h @@ -6,6 +6,9 @@ #include #include #include +#ifdef CONFIG_RPS +#include +#endif struct skb_defer_node { struct llist_head defer_list; @@ -33,7 +36,7 @@ struct net_hotdata { struct kmem_cache *skbuff_fclone_cache; struct kmem_cache *skb_small_head_cache; #ifdef CONFIG_RPS - struct rps_sock_flow_table __rcu *rps_sock_flow_table; + rps_tag_ptr rps_sock_flow_table; u32 rps_cpu_mask; #endif struct skb_defer_node __percpu *skb_defer_nodes; diff --git a/include/net/rps.h b/include/net/rps.h index 82cdffdf3e6b..dee930d9dd38 100644 --- a/include/net/rps.h +++ b/include/net/rps.h @@ -8,6 +8,7 @@ #include #ifdef CONFIG_RPS +#include extern struct static_key_false rps_needed; extern struct static_key_false rfs_needed; @@ -60,45 +61,38 @@ struct rps_dev_flow_table { * meaning we use 32-6=26 bits for the hash. 
*/ struct rps_sock_flow_table { - u32 _mask; - - u32 ents[] ____cacheline_aligned_in_smp; + u32 ent; }; -#define RPS_SOCK_FLOW_TABLE_SIZE(_num) (offsetof(struct rps_sock_flow_table, ents[_num])) - -static inline u32 rps_sock_flow_table_mask(const struct rps_sock_flow_table *table) -{ - return table->_mask; -} #define RPS_NO_CPU 0xffff -static inline void rps_record_sock_flow(struct rps_sock_flow_table *table, - u32 hash) +static inline void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash) { - unsigned int index = hash & rps_sock_flow_table_mask(table); + unsigned int index = hash & rps_tag_to_mask(tag_ptr); u32 val = hash & ~net_hotdata.rps_cpu_mask; + struct rps_sock_flow_table *table; /* We only give a hint, preemption can change CPU under us */ val |= raw_smp_processor_id(); + table = rps_tag_to_table(tag_ptr); /* The following WRITE_ONCE() is paired with the READ_ONCE() * here, and another one in get_rps_cpu(). */ - if (READ_ONCE(table->ents[index]) != val) - WRITE_ONCE(table->ents[index], val); + if (READ_ONCE(table[index].ent) != val) + WRITE_ONCE(table[index].ent, val); } static inline void _sock_rps_record_flow_hash(__u32 hash) { - struct rps_sock_flow_table *sock_flow_table; + rps_tag_ptr tag_ptr; if (!hash) return; rcu_read_lock(); - sock_flow_table = rcu_dereference(net_hotdata.rps_sock_flow_table); - if (sock_flow_table) - rps_record_sock_flow(sock_flow_table, hash); + tag_ptr = READ_ONCE(net_hotdata.rps_sock_flow_table); + if (tag_ptr) + rps_record_sock_flow(tag_ptr, hash); rcu_read_unlock(); } @@ -125,6 +119,7 @@ static inline void _sock_rps_record_flow(const struct sock *sk) static inline void _sock_rps_delete_flow(const struct sock *sk) { struct rps_sock_flow_table *table; + rps_tag_ptr tag_ptr; u32 hash, index; hash = READ_ONCE(sk->sk_rxhash); @@ -132,11 +127,12 @@ static inline void _sock_rps_delete_flow(const struct sock *sk) return; rcu_read_lock(); - table = rcu_dereference(net_hotdata.rps_sock_flow_table); - if (table) { - index = hash & 
rps_sock_flow_table_mask(table); - if (READ_ONCE(table->ents[index]) != RPS_NO_CPU) - WRITE_ONCE(table->ents[index], RPS_NO_CPU); + tag_ptr = READ_ONCE(net_hotdata.rps_sock_flow_table); + if (tag_ptr) { + index = hash & rps_tag_to_mask(tag_ptr); + table = rps_tag_to_table(tag_ptr); + if (READ_ONCE(table[index].ent) != RPS_NO_CPU) + WRITE_ONCE(table[index].ent, RPS_NO_CPU); } rcu_read_unlock(); } diff --git a/net/core/dev.c b/net/core/dev.c index 92f8eeac8de3..7ae87be81afc 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5075,9 +5075,9 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb, static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb, struct rps_dev_flow **rflowp) { - const struct rps_sock_flow_table *sock_flow_table; struct netdev_rx_queue *rxqueue = dev->_rx; struct rps_dev_flow_table *flow_table; + rps_tag_ptr global_tag_ptr; struct rps_map *map; int cpu = -1; u32 tcpu; @@ -5108,8 +5108,9 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb, if (!hash) goto done; - sock_flow_table = rcu_dereference(net_hotdata.rps_sock_flow_table); - if (flow_table && sock_flow_table) { + global_tag_ptr = READ_ONCE(net_hotdata.rps_sock_flow_table); + if (flow_table && global_tag_ptr) { + struct rps_sock_flow_table *sock_flow_table; struct rps_dev_flow *rflow; u32 next_cpu; u32 flow_id; @@ -5118,8 +5119,9 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb, /* First check into global flow table if there is a match. * This READ_ONCE() pairs with WRITE_ONCE() from rps_record_sock_flow(). 
*/ - flow_id = hash & rps_sock_flow_table_mask(sock_flow_table); - ident = READ_ONCE(sock_flow_table->ents[flow_id]); + flow_id = hash & rps_tag_to_mask(global_tag_ptr); + sock_flow_table = rps_tag_to_table(global_tag_ptr); + ident = READ_ONCE(sock_flow_table[flow_id].ent); if ((ident ^ hash) & ~net_hotdata.rps_cpu_mask) goto try_rps; diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index cfbe798493b5..502705e04649 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -138,68 +138,73 @@ done: static int rps_sock_flow_sysctl(const struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { + struct rps_sock_flow_table *o_sock_table, *sock_table; + static DEFINE_MUTEX(sock_flow_mutex); + rps_tag_ptr o_tag_ptr, tag_ptr; unsigned int orig_size, size; - int ret, i; struct ctl_table tmp = { .data = &size, .maxlen = sizeof(size), .mode = table->mode }; - struct rps_sock_flow_table *o_sock_table, *sock_table; - static DEFINE_MUTEX(sock_flow_mutex); void *tofree = NULL; + int ret, i; + u8 log; mutex_lock(&sock_flow_mutex); - o_sock_table = rcu_dereference_protected( - net_hotdata.rps_sock_flow_table, - lockdep_is_held(&sock_flow_mutex)); - size = o_sock_table ? rps_sock_flow_table_mask(o_sock_table) + 1 : 0; + o_tag_ptr = tag_ptr = net_hotdata.rps_sock_flow_table; + + size = o_tag_ptr ? 
rps_tag_to_mask(o_tag_ptr) + 1 : 0; + o_sock_table = rps_tag_to_table(o_tag_ptr); orig_size = size; ret = proc_dointvec(&tmp, write, buffer, lenp, ppos); - if (write) { - if (size) { - if (size > 1<<29) { - /* Enforce limit to prevent overflow */ + if (!write) + goto unlock; + + if (size) { + if (size > 1<<29) { + /* Enforce limit to prevent overflow */ + mutex_unlock(&sock_flow_mutex); + return -EINVAL; + } + sock_table = o_sock_table; + size = roundup_pow_of_two(size); + if (size != orig_size) { + sock_table = vmalloc_huge(size * sizeof(*sock_table), + GFP_KERNEL); + if (!sock_table) { mutex_unlock(&sock_flow_mutex); - return -EINVAL; - } - sock_table = o_sock_table; - size = roundup_pow_of_two(size); - if (size != orig_size) { - sock_table = - vmalloc(RPS_SOCK_FLOW_TABLE_SIZE(size)); - if (!sock_table) { - mutex_unlock(&sock_flow_mutex); - return -ENOMEM; - } - net_hotdata.rps_cpu_mask = - roundup_pow_of_two(nr_cpu_ids) - 1; - sock_table->_mask = size - 1; + return -ENOMEM; } + net_hotdata.rps_cpu_mask = + roundup_pow_of_two(nr_cpu_ids) - 1; + log = ilog2(size); + tag_ptr = (rps_tag_ptr)sock_table | log; + } - for (i = 0; i < size; i++) - sock_table->ents[i] = RPS_NO_CPU; - } else - sock_table = NULL; - - if (sock_table != o_sock_table) { - rcu_assign_pointer(net_hotdata.rps_sock_flow_table, - sock_table); - if (sock_table) { - static_branch_inc(&rps_needed); - static_branch_inc(&rfs_needed); - } - if (o_sock_table) { - static_branch_dec(&rps_needed); - static_branch_dec(&rfs_needed); - tofree = o_sock_table; - } + for (i = 0; i < size; i++) + sock_table[i].ent = RPS_NO_CPU; + } else { + sock_table = NULL; + tag_ptr = 0UL; + } + if (tag_ptr != o_tag_ptr) { + smp_store_release(&net_hotdata.rps_sock_flow_table, tag_ptr); + if (sock_table) { + static_branch_inc(&rps_needed); + static_branch_inc(&rfs_needed); + } + if (o_sock_table) { + static_branch_dec(&rps_needed); + static_branch_dec(&rfs_needed); + tofree = o_sock_table; } } +unlock: 
mutex_unlock(&sock_flow_mutex); kvfree_rcu_mightsleep(tofree); -- cgit v1.2.3 From cc39325f927850473d3a84b029ae6f9b508e9bd1 Mon Sep 17 00:00:00 2001 From: Mohsin Bashir Date: Mon, 2 Mar 2026 15:01:45 -0800 Subject: net: ethtool: Track pause storm events With TX pause enabled, if a device is unable to pass packets up to the stack (e.g., the CPU is hung), the device can cause a pause storm. Given that devices can have native support to protect the neighbor from such flooding, such events need some tracking. This support is to track TX pause storm events for better observability. Reviewed-by: Oleksij Rempel Signed-off-by: Jakub Kicinski Signed-off-by: Mohsin Bashir Link: https://patch.msgid.link/20260302230149.1580195-2-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni --- Documentation/netlink/specs/ethtool.yaml | 13 +++++++++++++ include/linux/ethtool.h | 2 ++ include/uapi/linux/ethtool_netlink_generated.h | 1 + net/ethtool/pause.c | 4 +++- 4 files changed, 19 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 0a2d2343f79a..4707063af3b4 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -879,6 +879,19 @@ attribute-sets: - name: rx-frames type: u64 + - + name: tx-pause-storm-events + type: u64 + doc: >- + TX pause storm event count. Increments each time device + detects that its pause assertion condition has been true + for too long for normal operation. As a result, the device + has temporarily disabled its own Pause TX function to + protect the network from itself. + This counter should never increment under normal overload + conditions; it indicates catastrophic failure like an OS + crash. The rate of incrementing is implementation specific.
+ - name: pause attr-cnt-name: __ethtool-a-pause-cnt diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 798abec67a1b..83c375840835 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -512,12 +512,14 @@ struct ethtool_eth_ctrl_stats { * * Equivalent to `30.3.4.3 aPAUSEMACCtrlFramesReceived` * from the standard. + * @tx_pause_storm_events: TX pause storm event count (see ethtool.yaml). */ struct ethtool_pause_stats { enum ethtool_mac_stats_src src; struct_group(stats, u64 tx_pause_frames; u64 rx_pause_frames; + u64 tx_pause_storm_events; ); }; diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h index 556a0c834df5..114b83017297 100644 --- a/include/uapi/linux/ethtool_netlink_generated.h +++ b/include/uapi/linux/ethtool_netlink_generated.h @@ -381,6 +381,7 @@ enum { ETHTOOL_A_PAUSE_STAT_PAD, ETHTOOL_A_PAUSE_STAT_TX_FRAMES, ETHTOOL_A_PAUSE_STAT_RX_FRAMES, + ETHTOOL_A_PAUSE_STAT_TX_PAUSE_STORM_EVENTS, __ETHTOOL_A_PAUSE_STAT_CNT, ETHTOOL_A_PAUSE_STAT_MAX = (__ETHTOOL_A_PAUSE_STAT_CNT - 1) diff --git a/net/ethtool/pause.c b/net/ethtool/pause.c index 0f9af1e66548..5d28f642764c 100644 --- a/net/ethtool/pause.c +++ b/net/ethtool/pause.c @@ -130,7 +130,9 @@ static int pause_put_stats(struct sk_buff *skb, if (ethtool_put_stat(skb, pause_stats->tx_pause_frames, ETHTOOL_A_PAUSE_STAT_TX_FRAMES, pad) || ethtool_put_stat(skb, pause_stats->rx_pause_frames, - ETHTOOL_A_PAUSE_STAT_RX_FRAMES, pad)) + ETHTOOL_A_PAUSE_STAT_RX_FRAMES, pad) || + ethtool_put_stat(skb, pause_stats->tx_pause_storm_events, + ETHTOOL_A_PAUSE_STAT_TX_PAUSE_STORM_EVENTS, pad)) goto err_cancel; nla_nest_end(skb, nest); -- cgit v1.2.3 From bf5a54bc0e3d8962474fcd611f1266eba036232f Mon Sep 17 00:00:00 2001 From: "Remy D. Farley" Date: Tue, 3 Mar 2026 19:57:41 +0000 Subject: doc/netlink: netlink-raw: Add max check Add definitions for max check and len-or-limit type, the same as in other specifications. 
Suggested-by: Donald Hunter Reviewed-by: Donald Hunter Signed-off-by: Remy D. Farley Link: https://patch.msgid.link/20260303195638.381642-2-one-d-wide@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/netlink-raw.yaml | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/netlink/netlink-raw.yaml b/Documentation/netlink/netlink-raw.yaml index 0166a7e4afbb..dd98dda55bd0 100644 --- a/Documentation/netlink/netlink-raw.yaml +++ b/Documentation/netlink/netlink-raw.yaml @@ -19,6 +19,12 @@ $defs: type: [ string, integer ] pattern: ^[0-9A-Za-z_-]+( - 1)?$ minimum: 0 + len-or-limit: + # literal int, const name, or limit based on fixed-width type + # e.g. u8-min, u16-max, etc. + type: [ string, integer ] + pattern: ^[0-9A-Za-z_-]+$ + minimum: 0 # Schema for specs title: Protocol @@ -270,7 +276,10 @@ properties: type: string min: description: Min value for an integer attribute. - type: integer + $ref: '#/$defs/len-or-limit' + max: + description: Max value for an integer attribute. + $ref: '#/$defs/len-or-limit' min-len: description: Min length for a binary attribute. $ref: '#/$defs/len-or-define' -- cgit v1.2.3 From a3a54ba4ef2b2f788c45caf02255efd457fbadd0 Mon Sep 17 00:00:00 2001 From: "Remy D. Farley" Date: Tue, 3 Mar 2026 19:58:13 +0000 Subject: doc/netlink: nftables: Add definitions New enums/flags: - payload-base - range-ops - registers - numgen-types - log-level - log-flags Added missing enumerations: - bitwise-ops Annotated doc comment or associated enum: - bitwise-ops Reviewed-by: Donald Hunter Signed-off-by: Remy D. 
Farley Link: https://patch.msgid.link/20260303195638.381642-3-one-d-wide@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/nftables.yaml | 181 +++++++++++++++++++++++++++++- 1 file changed, 178 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index 17ad707fa0d5..f15f825cb3a1 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -66,9 +66,21 @@ definitions: name: bitwise-ops type: enum entries: - - bool - - lshift - - rshift + - + name: mask-xor # aka bool (old name) + doc: >- + mask-and-xor operation used to implement NOT, AND, OR and XOR boolean + operations + - + name: lshift + - + name: rshift + - + name: and + - + name: or + - + name: xor - name: cmp-ops type: enum @@ -132,6 +144,12 @@ definitions: - object - concat - expr + - + name: set-elem-flags + type: flags + entries: + - interval-end + - catchall - name: lookup-flags type: flags @@ -225,6 +243,147 @@ definitions: - icmp-unreach - tcp-rst - icmpx-unreach + - + name: reject-inet-code + doc: These codes are mapped to real ICMP and ICMPv6 codes. + type: enum + entries: + - icmpx-no-route + - icmpx-port-unreach + - icmpx-host-unreach + - icmpx-admin-prohibited + - + name: payload-base + type: enum + entries: + - link-layer-header + - network-header + - transport-header + - inner-header + - tun-header + - + name: range-ops + doc: Range operator + type: enum + entries: + - eq + - neq + - + name: registers + doc: | + nf_tables registers. + nf_tables used to have five registers: a verdict register and four data + registers of size 16. The data registers have been changed to 16 registers + of size 4. For compatibility reasons, the NFT_REG_[1-4] registers still + map to areas of size 16, the 4 byte registers are addressed using + NFT_REG32_00 - NFT_REG32_15. 
+ type: enum + entries: + - + name: reg-verdict + - + name: reg-1 + - + name: reg-2 + - + name: reg-3 + - + name: reg-4 + - + name: reg32-00 + value: 8 + - + name: reg32-01 + - + name: reg32-02 + - + name: reg32-03 + - + name: reg32-04 + - + name: reg32-05 + - + name: reg32-06 + - + name: reg32-07 + - + name: reg32-08 + - + name: reg32-09 + - + name: reg32-10 + - + name: reg32-11 + - + name: reg32-12 + - + name: reg32-13 + - + name: reg32-14 + - + name: reg32-15 + - + name: numgen-types + type: enum + entries: + - incremental + - random + - + name: log-level + doc: nf_tables log levels + type: enum + entries: + - + name: emerg + doc: system is unusable + - + name: alert + doc: action must be taken immediately + - + name: crit + doc: critical conditions + - + name: err + doc: error conditions + - + name: warning + doc: warning conditions + - + name: notice + doc: normal but significant condition + - + name: info + doc: informational + - + name: debug + doc: debug-level messages + - + name: audit + doc: enabling audit logging + - + name: log-flags + doc: nf_tables log flags + header: linux/netfilter/nf_log.h + type: flags + entries: + - + name: tcpseq + doc: Log TCP sequence numbers + - + name: tcpopt + doc: Log TCP options + - + name: ipopt + doc: Log IP options + - + name: uid + doc: Log UID owning local socket + - + name: nflog + doc: Unsupported, don't reuse + - + name: macdecode + doc: Decode MAC header attribute-sets: - @@ -767,6 +926,22 @@ attribute-sets: nested-attributes: hook-dev-attrs - name: expr-bitwise-attrs + doc: | + The bitwise expression supports boolean and shift operations. It + implements the boolean operations by performing the following + operation:: + + dreg = (sreg & mask) ^ xor + + with these mask and xor values: + + op mask xor + ---- ---- --- + NOT: 1 1 + OR: ~x x + XOR: 1 x + AND: x 0 + attributes: - name: sreg -- cgit v1.2.3 From 482da27d5274dfe5f2828d150981b966b8f9f3b1 Mon Sep 17 00:00:00 2001 From: "Remy D. 
Farley" Date: Tue, 3 Mar 2026 19:58:52 +0000 Subject: doc/netlink: nftables: Update attribute sets New attribute sets: - log-attrs - numgen-attrs - range-attrs - compat-target-attrs - compat-match-attrs - compat-attrs Added missing attributes: - table-attrs (pad, owner) - set-attrs (type, count) Added missing checks: - range-attrs - expr-bitwise-attrs - compat-target-attrs - compat-match-attrs - compat-attrs Annotated doc comment or associated enum: - batch-attrs - verdict-attrs - expr-payload-attrs Fixed byte order: - nft-counter-attrs - expr-counter-attrs - rule-compat-attrs Reviewed-by: Donald Hunter Signed-off-by: Remy D. Farley Link: https://patch.msgid.link/20260303195638.381642-4-one-d-wide@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/nftables.yaml | 208 +++++++++++++++++++++++++++++- 1 file changed, 204 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index f15f825cb3a1..ba62adfb4c63 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -387,16 +387,100 @@ definitions: attribute-sets: - - name: empty-attrs + name: log-attrs + doc: log expression netlink attributes attributes: + # Mentioned in nft_log_init() - - name: name + name: group + doc: netlink group to send messages to + type: u16 + byte-order: big-endian + - + name: prefix + doc: prefix to prepend to log messages type: string + - + name: snaplen + doc: length of payload to include in netlink message + type: u32 + byte-order: big-endian + - + name: qthreshold + doc: queue threshold + type: u16 + byte-order: big-endian + - + name: level + doc: log level + type: u32 + enum: log-level + byte-order: big-endian + - + name: flags + doc: logging flags + type: u32 + enum: log-flags + byte-order: big-endian + - + name: numgen-attrs + doc: nf_tables number generator expression netlink attributes + attributes: + - + name: 
dreg + doc: destination register + type: u32 + enum: registers + - + name: modulus + doc: maximum counter value + type: u32 + byte-order: big-endian + - + name: type + doc: operation type + type: u32 + byte-order: big-endian + enum: numgen-types + - + name: offset + doc: offset to be added to the counter + type: u32 + byte-order: big-endian + - + name: range-attrs + attributes: + # Mentioned in net/netfilter/nft_range.c + - + name: sreg + doc: source register of data to compare + type: u32 + byte-order: big-endian + enum: registers + - + name: op + doc: cmp operation + type: u32 + byte-order: big-endian + enum: range-ops + checks: + max: 255 + - + name: from-data + doc: data range from + type: nest + nested-attributes: data-attrs + - + name: to-data + doc: data range to + type: nest + nested-attributes: data-attrs - name: batch-attrs attributes: - name: genid + doc: generation ID for this changeset type: u32 byte-order: big-endian - @@ -423,10 +507,18 @@ attribute-sets: type: u64 byte-order: big-endian doc: numeric handle of the table + - + name: pad + type: pad - name: userdata type: binary doc: user data + - + name: owner + type: u32 + byte-order: big-endian + doc: owner of this table through netlink portID - name: chain-attrs attributes: @@ -530,9 +622,11 @@ attribute-sets: - name: bytes type: u64 + byte-order: big-endian - name: packets type: u64 + byte-order: big-endian - name: rule-attrs attributes: @@ -602,15 +696,18 @@ attribute-sets: selector: name doc: type specific data - + # Mentioned in nft_parse_compat() in net/netfilter/nft_compat.c name: rule-compat-attrs attributes: - name: proto - type: binary + type: u32 + byte-order: big-endian doc: numeric value of the handled protocol - name: flags - type: binary + type: u32 + byte-order: big-endian doc: bitmask of flags - name: set-attrs @@ -699,6 +796,15 @@ attribute-sets: type: nest nested-attributes: set-list-attrs doc: list of expressions + - + name: type + type: string + doc: set backend type + - + name: 
count + type: u32 + byte-order: big-endian + doc: number of set elements - name: set-desc-attrs attributes: @@ -968,6 +1074,8 @@ attribute-sets: type: u32 byte-order: big-endian enum: bitwise-ops + checks: + max: 255 - name: data type: nest @@ -1004,25 +1112,31 @@ attribute-sets: attributes: - name: code + doc: nf_tables verdict type: u32 byte-order: big-endian enum: verdict-code - name: chain + doc: jump target chain name type: string - name: chain-id + doc: jump target chain ID type: u32 + byte-order: big-endian - name: expr-counter-attrs attributes: - name: bytes type: u64 + byte-order: big-endian doc: Number of bytes - name: packets type: u64 + byte-order: big-endian doc: Number of packets - name: pad @@ -1107,6 +1221,25 @@ attribute-sets: type: u32 byte-order: big-endian enum: lookup-flags + - + name: expr-masq-attrs + attributes: + - + name: flags + type: u32 + byte-order: big-endian + enum: nat-range-flags + enum-as-flags: true + - + name: reg-proto-min + type: u32 + byte-order: big-endian + enum: registers + - + name: reg-proto-max + type: u32 + byte-order: big-endian + enum: registers - name: expr-meta-attrs attributes: @@ -1158,37 +1291,49 @@ attribute-sets: enum-as-flags: true - name: expr-payload-attrs + doc: nf_tables payload expression netlink attributes attributes: - name: dreg + doc: destination register to load data into type: u32 byte-order: big-endian + enum: registers - name: base + doc: payload base type: u32 + enum: payload-base byte-order: big-endian - name: offset + doc: payload offset relative to base type: u32 byte-order: big-endian - name: len + doc: payload length type: u32 byte-order: big-endian - name: sreg + doc: source register to load data from type: u32 byte-order: big-endian + enum: registers - name: csum-type + doc: checksum type type: u32 byte-order: big-endian - name: csum-offset + doc: checksum offset relative to base type: u32 byte-order: big-endian - name: csum-flags + doc: checksum flags type: u32 byte-order: big-endian - 
@@ -1254,6 +1399,61 @@ attribute-sets: type: u32 byte-order: big-endian doc: id of object map + - + name: compat-target-attrs + header: linux/netfilter/nf_tables_compat.h + attributes: + - + name: name + type: string + checks: + max-len: 32 + - + name: rev + type: u32 + byte-order: big-endian + checks: + max: 255 + - + name: info + type: binary + - + name: compat-match-attrs + header: linux/netfilter/nf_tables_compat.h + attributes: + - + name: name + type: string + checks: + max-len: 32 + - + name: rev + type: u32 + byte-order: big-endian + checks: + max: 255 + - + name: info + type: binary + - + name: compat-attrs + header: linux/netfilter/nf_tables_compat.h + attributes: + - + name: name + type: string + checks: + max-len: 32 + - + name: rev + type: u32 + byte-order: big-endian + checks: + max: 255 + - + name: type + type: u32 + byte-order: big-endian sub-messages: - -- cgit v1.2.3 From 27c7ee6d26ddd1a769bdcfb22862ef443da461e8 Mon Sep 17 00:00:00 2001 From: "Remy D. Farley" Date: Tue, 3 Mar 2026 19:59:19 +0000 Subject: doc/netlink: nftables: Add sub-messages New sub-messages: - log - match - numgen - range Reviewed-by: Donald Hunter Signed-off-by: Remy D.
Farley Link: https://patch.msgid.link/20260303195638.381642-5-one-d-wide@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/nftables.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index ba62adfb4c63..086b16b12b0f 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -1480,15 +1480,24 @@ sub-messages: - value: immediate attribute-set: expr-immediate-attrs + - + value: log + attribute-set: log-attrs - value: lookup attribute-set: expr-lookup-attrs + - + value: match + attribute-set: compat-match-attrs - value: meta attribute-set: expr-meta-attrs - value: nat attribute-set: expr-nat-attrs + - + value: numgen + attribute-set: numgen-attrs - value: objref attribute-set: expr-objref-attrs @@ -1498,6 +1507,9 @@ sub-messages: - value: quota attribute-set: quota-attrs + - + value: range + attribute-set: range-attrs - value: reject attribute-set: expr-reject-attrs @@ -1507,6 +1519,9 @@ sub-messages: - value: tproxy attribute-set: expr-tproxy-attrs + # There're more sub-messages to go: + # grep -A10 nft_expr_type + # and look for .name\s*=\s*"..." - name: obj-data formats: -- cgit v1.2.3 From 568b370f128ce328cf3750eda5a84080043f97d6 Mon Sep 17 00:00:00 2001 From: "Remy D. Farley" Date: Tue, 3 Mar 2026 20:00:12 +0000 Subject: doc/netlink: nftables: Fill out operation attributes Filled out operation attributes: - newtable - gettable - deltable - destroytable - newchain - getchain - delchain - destroychain - newrule - getrule - getrule-reset - delrule - destroyrule - newset - getset - delset - destroyset - newsetelem - getsetelem - getsetelem-reset - delsetelem - destroysetelem - getgen - newobj - getobj - delobj - destroyobj - newflowtable - getflowtable - delflowtable - destroyflowtable Signed-off-by: Remy D. 
Farley Link: https://patch.msgid.link/20260303195638.381642-6-one-d-wide@protonmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/nftables.yaml | 285 +++++++++++++++++++++++++----- 1 file changed, 243 insertions(+), 42 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml index 086b16b12b0f..21edf3d25f34 100644 --- a/Documentation/netlink/specs/nftables.yaml +++ b/Documentation/netlink/specs/nftables.yaml @@ -1568,7 +1568,10 @@ operations: request: value: 0xa00 attributes: + # Mentioned in nf_tables_newtable() - name + - flags + - userdata - name: gettable doc: Get / dump tables. @@ -1578,11 +1581,21 @@ operations: request: value: 0xa01 attributes: + # Mentioned in nf_tables_gettable() - name reply: value: 0xa00 - attributes: + attributes: &get-table + # Mentioned in nf_tables_fill_table_info() - name + - use + - handle + - flags + - owner + - userdata + dump: + reply: + attributes: *get-table - name: deltable doc: Delete an existing table. @@ -1591,8 +1604,10 @@ operations: do: request: value: 0xa02 - attributes: + attributes: &del-table + # Mentioned in nf_tables_deltable() - name + - handle - name: destroytable doc: | @@ -1603,8 +1618,7 @@ operations: do: request: value: 0xa1a - attributes: - - name + attributes: *del-table - name: newchain doc: Create a new chain. @@ -1614,7 +1628,19 @@ operations: request: value: 0xa03 attributes: + # Mentioned in nf_tables_newchain() + - table + - handle + - policy + - flags + # Mentioned in nf_tables_updchain() + - hook - name + - counters + # Mentioned in nf_tables_addchain() + - userdata + # Mentioned in nft_chain_parse_hook() + - type - name: getchain doc: Get / dump chains. 
@@ -1624,11 +1650,27 @@ operations: request: value: 0xa04 attributes: + # Mentioned in nf_tables_getchain() + - table - name reply: value: 0xa03 - attributes: + attributes: &get-chain + # Mentioned in nf_tables_fill_chain_info() + - table - name + - handle + - hook + - policy + - type + - flags + - counters + - id + - use + - userdata + dump: + reply: + attributes: *get-chain - name: delchain doc: Delete an existing chain. @@ -1637,8 +1679,12 @@ operations: do: request: value: 0xa05 - attributes: + attributes: &del-chain + # Mentioned in nf_tables_delchain() + - table + - handle - name + - hook - name: destroychain doc: | @@ -1649,8 +1695,7 @@ operations: do: request: value: 0xa1b - attributes: - - name + attributes: *del-chain - name: newrule doc: Create a new rule. @@ -1660,7 +1705,16 @@ operations: request: value: 0xa06 attributes: - - name + # Mentioned in nf_tables_newrule() + - table + - chain + - chain-id + - handle + - position + - position-id + - expressions + - userdata + - compat - name: getrule doc: Get / dump rules. @@ -1669,12 +1723,30 @@ operations: do: request: value: 0xa07 - attributes: - - name + attributes: &get-rule-request + # Mentioned in nf_tables_getrule_single() + - table + - chain + - handle reply: value: 0xa06 + attributes: &get-rule + # Mentioned in nf_tables_fill_rule_info() + - table + - chain + - handle + - position + - expressions + - userdata + dump: + request: attributes: - - name + # Mentioned in nf_tables_dump_rules_start() + - table + - chain + reply: + attributes: *get-rule + - name: getrule-reset doc: Get / dump rules and reset stateful expressions. @@ -1683,12 +1755,15 @@ operations: do: request: value: 0xa19 - attributes: - - name + attributes: *get-rule-request reply: value: 0xa06 - attributes: - - name + attributes: *get-rule + dump: + request: + attributes: *get-rule-request + reply: + attributes: *get-rule - name: delrule doc: Delete an existing rule. 
@@ -1697,8 +1772,11 @@ operations: do: request: value: 0xa08 - attributes: - - name + attributes: &del-rule + - table + - chain + - handle + - id - name: destroyrule doc: | @@ -1708,8 +1786,7 @@ operations: do: request: value: 0xa1c - attributes: - - name + attributes: *del-rule - name: newset doc: Create a new set. @@ -1719,7 +1796,21 @@ operations: request: value: 0xa09 attributes: + # Mentioned in nf_tables_newset() + - table - name + - key-len + - id + - key-type + - flags + - data-type + - data-len + - obj-type + - timeout + - gc-interval + - policy + - desc + - userdata - name: getset doc: Get / dump sets. @@ -1729,11 +1820,35 @@ operations: request: value: 0xa0a attributes: + # Mentioned in nf_tables_getset() + - table - name reply: value: 0xa09 - attributes: + attributes: &get-set + # Mentioned in nf_tables_fill_set() + - table - name + - handle + - flags + - key-len + - key-type + - data-type + - data-len + - obj-type + - gc-interval + - policy + - userdata + - desc + - expr + - expressions + dump: + request: + attributes: + # Mentioned in nf_tables_getset() + - table + reply: + attributes: *get-set - name: delset doc: Delete an existing set. @@ -1742,7 +1857,10 @@ operations: do: request: value: 0xa0b - attributes: + attributes: &del-set + # Mentioned in nf_tables_delset() + - table + - handle - name - name: destroyset @@ -1753,8 +1871,7 @@ operations: do: request: value: 0xa1d - attributes: - - name + attributes: *del-set - name: newsetelem doc: Create a new set element. @@ -1764,7 +1881,11 @@ operations: request: value: 0xa0c attributes: - - name + # Mentioned in nf_tables_newsetelem() + - table + - set + - set-id + - elements - name: getsetelem doc: Get / dump set elements. 
@@ -1774,11 +1895,27 @@ operations: request: value: 0xa0d attributes: - - name + # Mentioned in nf_tables_getsetelem() + - table + - set + - elements reply: value: 0xa0c attributes: - - name + # Mentioned in nf_tables_fill_setelem_info() + - elements + dump: + request: + attributes: &dump-set-request + # Mentioned in nft_set_dump_ctx_init() + - table + - set + reply: + attributes: &dump-set + # Mentioned in nf_tables_dump_set() + - table + - set + - elements - name: getsetelem-reset doc: Get / dump set elements and reset stateful expressions. @@ -1788,11 +1925,20 @@ operations: request: value: 0xa21 attributes: - - name + # Mentioned in nf_tables_getsetelem_reset() + - elements reply: value: 0xa0c attributes: - - name + # Mentioned in nf_tables_dumpreset_set() + - table + - set + - elements + dump: + request: + attributes: *dump-set-request + reply: + attributes: *dump-set - name: delsetelem doc: Delete an existing set element. @@ -1801,8 +1947,11 @@ operations: do: request: value: 0xa0e - attributes: - - name + attributes: &del-setelem + # Mentioned in nf_tables_delsetelem() + - table + - set + - elements - name: destroysetelem doc: Delete an existing set element with destroy semantics. @@ -1811,8 +1960,7 @@ operations: do: request: value: 0xa1e - attributes: - - name + attributes: *del-setelem - name: getgen doc: Get / dump rule-set generation. @@ -1821,12 +1969,16 @@ operations: do: request: value: 0xa10 - attributes: - - name reply: value: 0xa0f - attributes: - - name + attributes: &get-gen + # Mentioned in nf_tables_fill_gen_info() + - id + - proc-pid + - proc-name + dump: + reply: + attributes: *get-gen - name: newobj doc: Create a new stateful object. @@ -1836,7 +1988,12 @@ operations: request: value: 0xa12 attributes: + # Mentioned in nf_tables_newobj() + - type - name + - data + - table + - userdata - name: getobj doc: Get / dump stateful objects. 
@@ -1846,11 +2003,29 @@ operations: request: value: 0xa13 attributes: + # Mentioned in nf_tables_getobj_single() - name + - type + - table reply: value: 0xa12 - attributes: + attributes: &obj-info + # Mentioned in nf_tables_fill_obj_info() + - table - name + - type + - handle + - use + - data + - userdata + dump: + request: + attributes: + # Mentioned in nf_tables_dump_obj_start() + - table + - type + reply: + attributes: *obj-info - name: delobj doc: Delete an existing stateful object. @@ -1860,7 +2035,11 @@ operations: request: value: 0xa14 attributes: + # Mentioned in nf_tables_delobj() + - table - name + - type + - handle - name: destroyobj doc: Delete an existing stateful object with destroy semantics. @@ -1870,7 +2049,11 @@ operations: request: value: 0xa1f attributes: + # Mentioned in nf_tables_delobj() + - table - name + - type + - handle - name: newflowtable doc: Create a new flow table. @@ -1880,7 +2063,11 @@ operations: request: value: 0xa16 attributes: + # Mentioned in nf_tables_newflowtable() + - table - name + - hook + - flags - name: getflowtable doc: Get / dump flow tables. @@ -1890,11 +2077,22 @@ operations: request: value: 0xa17 attributes: + # Mentioned in nf_tables_getflowtable() - name + - table reply: value: 0xa16 - attributes: + attributes: &flowtable-info + # Mentioned in nf_tables_fill_flowtable_info() + - table - name + - handle + - use + - flags + - hook + dump: + reply: + attributes: *flowtable-info - name: delflowtable doc: Delete an existing flow table. @@ -1903,8 +2101,12 @@ operations: do: request: value: 0xa18 - attributes: + attributes: &del-flowtable + # Mentioned in nf_tables_delflowtable() + - table - name + - handle + - hook - name: destroyflowtable doc: Delete an existing flow table with destroy semantics. 
@@ -1913,8 +2115,7 @@ operations: do: request: value: 0xa20 - attributes: - - name + attributes: *del-flowtable mcast-groups: list: -- cgit v1.2.3 From e5e09233e8a9179b260460fdb28df5c24bcfbed6 Mon Sep 17 00:00:00 2001 From: Antonio Quartulli Date: Wed, 4 Mar 2026 15:10:09 +0100 Subject: tools: ynl: add uns-admin-perm to genetlink GENL_UNS_ADMIN_PERM may be required by protocols using the `genetlink` family, however, this flag is currently only allowed in `genetlink-legacy`. Add it to the list of possible values in genetlink.yaml too. Cc: Simon Horman Cc: Donald Hunter Link: https://github.com/OpenVPN/ovpn-net-next/issues/33 Suggested-by: Ralf Lici Signed-off-by: Antonio Quartulli Link: https://patch.msgid.link/20260304141020.23270-1-antonio@openvpn.net Signed-off-by: Jakub Kicinski --- Documentation/netlink/genetlink.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml index b020a537d8ac..a1194d5d93fc 100644 --- a/Documentation/netlink/genetlink.yaml +++ b/Documentation/netlink/genetlink.yaml @@ -262,7 +262,7 @@ properties: description: Command flags. type: array items: - enum: [ admin-perm ] + enum: [ admin-perm, uns-admin-perm ] dont-validate: description: Kernel attribute validation flags. type: array -- cgit v1.2.3 From 8e235bc43326f47adfd77d926c7d6ad39077569a Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Wed, 4 Mar 2026 07:16:46 -0800 Subject: docs: netdev: refine netdevsim testing guidance The library to create tests for both NIC HW and netdevsim has existed for almost a year. netdevsim-only tests we get increasingly feel like a waste, we should try to write tests that work both on netdevsim and real HW. Refine the guidance accordingly. 
Reviewed-by: Simon Horman Link: https://patch.msgid.link/20260304151647.2770466-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- Documentation/process/maintainer-netdev.rst | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst index 6bce4507d5d3..3aa13bc2405d 100644 --- a/Documentation/process/maintainer-netdev.rst +++ b/Documentation/process/maintainer-netdev.rst @@ -479,8 +479,14 @@ netdevsim ``netdevsim`` is a test driver which can be used to exercise driver configuration APIs without requiring capable hardware. -Mock-ups and tests based on ``netdevsim`` are strongly encouraged when -adding new APIs, but ``netdevsim`` in itself is **not** considered +Mock-ups and tests based on ``netdevsim`` are encouraged when +adding new APIs with complex logic in the stack. The tests should +be written so that they can run both against ``netdevsim`` and a real +device (see ``tools/testing/selftests/drivers/net/README.rst``). +``netdevsim``-only tests should focus on testing corner cases +and failure paths in the core which are hard to exercise with a real driver. + +``netdevsim`` in itself is **not** considered a use case/user. You must also implement the new APIs in a real driver. We give no guarantees that ``netdevsim`` won't change in the future -- cgit v1.2.3 From 4a51ac9056c17e8216b245cd16152eefbd21abfb Mon Sep 17 00:00:00 2001 From: Kyoji Ogasawara Date: Mon, 9 Mar 2026 21:45:39 +0900 Subject: net/smc: fix indentation in smcr_buf_type section smcr_buf_type section used inconsistent indentation compared with the rest of this document. Signed-off-by: Kyoji Ogasawara Reviewed-by: D. 
Wythe Link: https://patch.msgid.link/20260309124541.22723-2-sawara04.o@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/networking/smc-sysctl.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/smc-sysctl.rst b/Documentation/networking/smc-sysctl.rst index 904a910f198e..17b8314c0e5e 100644 --- a/Documentation/networking/smc-sysctl.rst +++ b/Documentation/networking/smc-sysctl.rst @@ -23,17 +23,17 @@ autocorking_size - INTEGER Default: 64K smcr_buf_type - INTEGER - Controls which type of sndbufs and RMBs to use in later newly created - SMC-R link group. Only for SMC-R. + Controls which type of sndbufs and RMBs to use in later newly created + SMC-R link group. Only for SMC-R. - Default: 0 (physically contiguous sndbufs and RMBs) + Default: 0 (physically contiguous sndbufs and RMBs) - Possible values: + Possible values: - - 0 - Use physically contiguous buffers - - 1 - Use virtually contiguous buffers - - 2 - Mixed use of the two types. Try physically contiguous buffers first. - If not available, use virtually contiguous buffers then. + - 0 - Use physically contiguous buffers + - 1 - Use virtually contiguous buffers + - 2 - Mixed use of the two types. Try physically contiguous buffers first. + If not available, use virtually contiguous buffers then. smcr_testlink_time - INTEGER How frequently SMC-R link sends out TEST_LINK LLC messages to confirm -- cgit v1.2.3 From aa5ec9d03b9c4459e528ecd75d84f6ef98fb2f5a Mon Sep 17 00:00:00 2001 From: Kyoji Ogasawara Date: Mon, 9 Mar 2026 21:45:40 +0900 Subject: net/smc: Add documentation for limit_smc_hs and hs_ctrl Document missing SMC sysctl parameters limit_smc_hs and hs_ctrl Signed-off-by: Kyoji Ogasawara Reviewed-by: D. 
Wythe Link: https://patch.msgid.link/20260309124541.22723-3-sawara04.o@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/networking/smc-sysctl.rst | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/smc-sysctl.rst b/Documentation/networking/smc-sysctl.rst index 17b8314c0e5e..a8b4f357174e 100644 --- a/Documentation/networking/smc-sysctl.rst +++ b/Documentation/networking/smc-sysctl.rst @@ -111,3 +111,30 @@ smcr_max_recv_wr - INTEGER like before having this control. Default: 48 + +limit_smc_hs - INTEGER + Whether to limit SMC handshake for newly created sockets. + + When enabled, SMC listen path applies handshake limitation based on + handshake worker congestion and queued SMC handshake load. + + Possible values: + + - 0 - Disable handshake limitation + - 1 - Enable handshake limitation + + Default: 0 (disable) + +hs_ctrl - STRING + Select the SMC handshake control profile by name. + + This string refers to the name of a user-implemented + BPF struct_ops instance of type smc_hs_ctrl. + + The selected profile controls whether SMC options are advertised + during TCP SYN/SYN-ACK handshake. + + Only available when CONFIG_SMC_HS_CTRL_BPF is enabled. + Write an empty string to clear the current profile. + + Default: empty string -- cgit v1.2.3 From 7da62262ec96a4b345d207b6bcd2ddf5231b7f7d Mon Sep 17 00:00:00 2001 From: Fernando Fernandez Mancera Date: Mon, 9 Mar 2026 03:39:45 +0100 Subject: inet: add ip_local_port_step_width sysctl to improve port usage distribution With the current port selection algorithm, ports after a reserved port range or long time used port are used more often than others [1]. This causes an uneven port usage distribution. 
This combines with cloud environments blocking connections between the application server and the database server if there was a previous connection with the same source port, leading to connectivity problems between applications on cloud environments. The real issue here is that these firewalls cannot cope with standards-compliant port reuse. This is a workaround for such situations and an improvement on the distribution of ports selected. The proposed solution is to implement a variant of RFC 6056 Algorithm 5. The step size is selected randomly on every connect() call ensuring it is a coprime with respect to the size of the range of ports we want to scan. This way, we can ensure that all ports within the range are scanned before returning an error. To enable this algorithm, the user must configure the new sysctl option "net.ipv4.ip_local_port_step_width". In addition, on graphs generated we can observe that the distribution of source ports is more even with the proposed approach. [2] [1] https://0xffsoftware.com/port_graph_current_alg.html [2] https://0xffsoftware.com/port_graph_random_step_alg.html Reviewed-by: Eric Dumazet Signed-off-by: Fernando Fernandez Mancera Link: https://patch.msgid.link/20260309023946.5473-2-fmancera@suse.de Signed-off-by: Jakub Kicinski --- Documentation/networking/ip-sysctl.rst | 16 +++++++++++++ .../net_cachelines/netns_ipv4_sysctl.rst | 1 + include/net/netns/ipv4.h | 1 + net/ipv4/inet_hashtables.c | 28 +++++++++++++++++++--- net/ipv4/sysctl_net_ipv4.c | 7 ++++++ 5 files changed, 50 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 265158534cda..2e3a746fcc6d 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1630,6 +1630,22 @@ ip_local_reserved_ports - list of comma separated ranges Default: Empty +ip_local_port_step_width - INTEGER + Defines the numerical maximum increment 
between successive port + allocations within the ephemeral port range when an unavailable port is + reached. This can be used to mitigate accumulated nodes in port + distribution when reserved ports have been configured. Please note that + port collisions may be more frequent in a system with a very high load. + + It is recommended to set this value strictly larger than the largest + contiguous block of ports configured in ip_local_reserved_ports. For + large reserved port ranges, setting this to 3x or 4x the size of the + largest block is advised. Using a value equal to or greater than the local + port range size completely solves the uneven port distribution problem, + but it can degrade performance under port exhaustion situations. + + Default: 0 (disabled) + ip_unprivileged_port_start - INTEGER This is a per-namespace sysctl. It defines the first unprivileged port in the network namespace. Privileged ports diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst index beaf1880a19b..cf284263e69b 100644 --- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst +++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst @@ -52,6 +52,7 @@ u8 sysctl_ip_fwd_update_priority u8 sysctl_ip_nonlocal_bind u8 sysctl_ip_autobind_reuse u8 sysctl_ip_dynaddr +u32 sysctl_ip_local_port_step_width u8 sysctl_ip_early_demux read_mostly ip(6)_rcv_finish_core u8 sysctl_raw_l3mdev_accept u8 sysctl_tcp_early_demux read_mostly ip(6)_rcv_finish_core diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 38624beff9b3..80ccd4dda8e0 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -166,6 +166,7 @@ struct netns_ipv4 { u8 sysctl_ip_autobind_reuse; /* Shall we try to damage output packets if routing dev changes?
*/ u8 sysctl_ip_dynaddr; + u32 sysctl_ip_local_port_step_width; #ifdef CONFIG_NET_L3_MASTER_DEV u8 sysctl_raw_l3mdev_accept; #endif diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index ac7b67c603b5..13310c72b0bf 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include @@ -1057,12 +1058,12 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, struct net *net = sock_net(sk); struct inet_bind2_bucket *tb2; struct inet_bind_bucket *tb; + int step, scan_step, l3mdev; + u32 index, max_rand_step; bool tb_created = false; u32 remaining, offset; int ret, i, low, high; bool local_ports; - int step, l3mdev; - u32 index; if (port) { local_bh_disable(); @@ -1076,6 +1077,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, local_ports = inet_sk_get_local_port_range(sk, &low, &high); step = local_ports ? 1 : 2; + scan_step = step; + max_rand_step = READ_ONCE(net->ipv4.sysctl_ip_local_port_step_width); high++; /* [32768, 60999] -> [32768, 61000[ */ remaining = high - low; @@ -1094,9 +1097,28 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, */ if (!local_ports) offset &= ~1U; + + if (max_rand_step && remaining > 1) { + u32 range = remaining / step; + u32 upper_bound; + + upper_bound = min(range, max_rand_step); + scan_step = get_random_u32_inclusive(1, upper_bound); + while (gcd(scan_step, range) != 1) { + scan_step++; + /* if both scan_step and range are even gcd won't be 1 */ + if (!(scan_step & 1) && !(range & 1)) + scan_step++; + if (unlikely(scan_step > upper_bound)) { + scan_step = 1; + break; + } + } + scan_step *= step; + } other_parity_scan: port = low + offset; - for (i = 0; i < remaining; i += step, port += step) { + for (i = 0; i < remaining; i += step, port += scan_step) { if (unlikely(port >= high)) port -= remaining; if (inet_is_local_reserved_port(net, port)) diff --git a/net/ipv4/sysctl_net_ipv4.c 
b/net/ipv4/sysctl_net_ipv4.c index 5654cc9c8a0b..d8bdb1bdbff1 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -823,6 +823,13 @@ static struct ctl_table ipv4_net_table[] = { .mode = 0644, .proc_handler = ipv4_local_port_range, }, + { + .procname = "ip_local_port_step_width", + .maxlen = sizeof(u32), + .data = &init_net.ipv4.sysctl_ip_local_port_step_width, + .mode = 0644, + .proc_handler = proc_douintvec, + }, { .procname = "ip_local_reserved_ports", .data = &init_net.ipv4.sysctl_local_reserved_ports, -- cgit v1.2.3 From 410593fec752f15c97062d2ded5e3a9dd654dcb2 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 10 Mar 2026 07:38:55 +0000 Subject: tcp: add sysctl_tcp_shrink_window to netns_ipv4_sysctl.rst Add missing entry for sysctl_tcp_shrink_window. Signed-off-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20260310073855.564927-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst index cf284263e69b..6dbd97d435e9 100644 --- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst +++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst @@ -105,6 +105,7 @@ u8 sysctl_tcp_nometrics_save u8 sysctl_tcp_no_ssthresh_metrics_save TCP_LAST_ACK/tcp_(update/init)_metrics u8 sysctl_tcp_moderate_rcvbuf read_mostly tcp_rcvbuf_grow() u32 sysctl_tcp_rcvbuf_low_rtt read_mostly tcp_rcvbuf_grow() +u8 sysctl_tcp_shrink_window read_mostly read_mostly __tcp_select_window() u8 sysctl_tcp_tso_win_divisor read_mostly tcp_tso_should_defer(tcp_write_xmit) u8 sysctl_tcp_workaround_signed_windows tcp_select_window int sysctl_tcp_limit_output_bytes read_mostly tcp_small_queue_check(tcp_write_xmit) -- cgit v1.2.3 From 
0de607dc4fd80ede3b2a35e8a72f99c7a0bbc321 Mon Sep 17 00:00:00 2001
From: Alexander Graf
Date: Wed, 4 Mar 2026 23:00:27 +0000
Subject: vsock: add G2H fallback for CIDs not owned by H2G transport

When no H2G transport is loaded, vsock currently routes all CIDs to the G2H transport (commit 65b422d9b61b ("vsock: forward all packets to the host when no H2G is registered")). Extend that existing behavior: when an H2G transport is loaded but does not claim a given CID, the connection falls back to G2H in the same way.

This matters in environments like Nitro Enclaves, where an instance may run nested VMs via vhost-vsock (H2G) while also needing to reach sibling enclaves at higher CIDs through virtio-vsock-pci (G2H). With the old code, any CID > 2 was unconditionally routed to H2G when vhost was loaded, making those enclaves unreachable without setting VMADDR_FLAG_TO_HOST explicitly on every connect.

Requiring every application to set VMADDR_FLAG_TO_HOST creates friction: tools like socat, iperf, and others would all need to learn about it. The flag was introduced 6 years ago and I am still not aware of any tool that supports it. Even if there was support, it would be cumbersome to use. The most natural experience is a single CID address space where H2G only wins for CIDs it actually owns, and everything else falls through to G2H, extending the behavior that already exists when H2G is absent.

To give user space at least a hint that the kernel applied this logic, automatically set the VMADDR_FLAG_TO_HOST on the remote address so it can determine the path taken via getpeername().

Add a per-network namespace sysctl net.vsock.g2h_fallback (default 1). At 0 it forces strict routing: H2G always wins for CID > VMADDR_CID_HOST, or ENODEV if H2G is not loaded.
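
For illustration only, the routing order described above can be modeled in plain userspace C. This is a sketch under stated assumptions, not the kernel implementation: `struct routing_state`, `pick_route()` and `demo_h2g_owns_cid()` are hypothetical names, the local-transport case is left out, and the G2H side is assumed to accept any CID (as a hypervisor-backed transport might).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CID_HOST	2u	/* mirrors VMADDR_CID_HOST */
#define FLAG_TO_HOST	0x01u	/* mirrors VMADDR_FLAG_TO_HOST */

enum route { ROUTE_NONE, ROUTE_G2H, ROUTE_H2G };

/* Hypothetical model of what the routing decision depends on. */
struct routing_state {
	bool h2g_loaded;
	bool g2h_loaded;
	bool g2h_fallback;			/* the new sysctl, default on */
	bool (*h2g_owns_cid)(uint32_t cid);	/* NULL: H2G claims every CID */
};

/* Hypothetical H2G ownership check: only guest CID 3 is registered. */
static bool demo_h2g_owns_cid(uint32_t cid)
{
	return cid == 3;
}

/* Pick a route for a remote CID; may set FLAG_TO_HOST in *flags. */
static enum route pick_route(const struct routing_state *st, uint32_t cid,
			     uint32_t *flags)
{
	/* Host-or-below CIDs and explicit TO_HOST requests use G2H. */
	if (cid <= CID_HOST || (*flags & FLAG_TO_HOST))
		return st->g2h_loaded ? ROUTE_G2H : ROUTE_NONE;

	/* H2G wins only for CIDs it actually claims. */
	if (st->h2g_loaded &&
	    (!st->h2g_owns_cid || st->h2g_owns_cid(cid)))
		return ROUTE_H2G;

	/* Fallback: route via G2H and leave a hint for getpeername(). */
	if (st->g2h_fallback && st->g2h_loaded) {
		*flags |= FLAG_TO_HOST;
		return ROUTE_G2H;
	}

	/* Strict routing: H2G if present, otherwise no transport. */
	return st->h2g_loaded ? ROUTE_H2G : ROUTE_NONE;
}
```

With the fallback enabled, a CID that the modeled H2G side does not own ends up on G2H with the flag set; with the fallback disabled, the same CID stays on H2G.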
Signed-off-by: Alexander Graf Tested-by: syzbot@syzkaller.appspotmail.com Reviewed-by: Stefano Garzarella Link: https://patch.msgid.link/20260304230027.59857-1-graf@amazon.com Signed-off-by: Paolo Abeni --- Documentation/admin-guide/sysctl/net.rst | 28 +++++++++++++++++++++++++ drivers/vhost/vsock.c | 13 ++++++++++++ include/net/af_vsock.h | 9 ++++++++ include/net/netns/vsock.h | 2 ++ net/vmw_vsock/af_vsock.c | 35 +++++++++++++++++++++++++++----- net/vmw_vsock/virtio_transport.c | 7 +++++++ 6 files changed, 89 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst index 3b2ad61995d4..0724a793798f 100644 --- a/Documentation/admin-guide/sysctl/net.rst +++ b/Documentation/admin-guide/sysctl/net.rst @@ -602,3 +602,31 @@ it does not modify the current namespace or any existing children. A namespace with ``ns_mode`` set to ``local`` cannot change ``child_ns_mode`` to ``global`` (returns ``-EPERM``). + +g2h_fallback +------------ + +Controls whether connections to CIDs not owned by the host-to-guest (H2G) +transport automatically fall back to the guest-to-host (G2H) transport. + +When enabled, if a connect targets a CID that the H2G transport (e.g. +vhost-vsock) does not serve, or if no H2G transport is loaded at all, the +connection is routed via the G2H transport (e.g. virtio-vsock) instead. This +allows a host running both nested VMs (via vhost-vsock) and sibling VMs +reachable through the hypervisor (e.g. Nitro Enclaves) to address both using +a single CID space, without requiring applications to set +``VMADDR_FLAG_TO_HOST``. + +When the fallback is taken, ``VMADDR_FLAG_TO_HOST`` is automatically set on +the remote address so that userspace can determine the path via +``getpeername()``. + +Note: With this sysctl enabled, user space that attempts to talk to a guest +CID which is not implemented by the H2G transport will create host vsock +traffic. 
Environments that rely on H2G-only isolation should set it to 0. + +Values: + + - 0 - Connections to CIDs <= 2 or with VMADDR_FLAG_TO_HOST use G2H; + all others use H2G (or fail with ENODEV if H2G is not loaded). + - 1 - Connections to CIDs not owned by H2G fall back to G2H. (default) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 054f7a718f50..1d8ec6bed53e 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -91,6 +91,18 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid, struct net *net) return NULL; } +static bool vhost_transport_has_remote_cid(struct vsock_sock *vsk, u32 cid) +{ + struct sock *sk = sk_vsock(vsk); + struct net *net = sock_net(sk); + bool found; + + rcu_read_lock(); + found = !!vhost_vsock_get(cid, net); + rcu_read_unlock(); + return found; +} + static void vhost_transport_do_send_pkt(struct vhost_vsock *vsock, struct vhost_virtqueue *vq) @@ -424,6 +436,7 @@ static struct virtio_transport vhost_transport = { .module = THIS_MODULE, .get_local_cid = vhost_transport_get_local_cid, + .has_remote_cid = vhost_transport_has_remote_cid, .init = virtio_transport_do_socket_init, .destruct = virtio_transport_destruct, diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 533d8e75f7bb..4e40063adab4 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -179,6 +179,15 @@ struct vsock_transport { /* Addressing. */ u32 (*get_local_cid)(void); + /* Check if this transport serves a specific remote CID. + * For H2G transports: return true if the CID belongs to a registered + * guest. If not implemented, all CIDs > VMADDR_CID_HOST go to H2G. + * For G2H transports: return true if the transport can reach arbitrary + * CIDs via the hypervisor (i.e. supports the fallback overlay). VMCI + * does not implement this as it only serves CIDs 0 and 2. 
+ */ + bool (*has_remote_cid)(struct vsock_sock *vsk, u32 remote_cid); + /* Read a single skb */ int (*read_skb)(struct vsock_sock *, skb_read_actor_t); diff --git a/include/net/netns/vsock.h b/include/net/netns/vsock.h index dc8cbe45f406..7f84aad92f57 100644 --- a/include/net/netns/vsock.h +++ b/include/net/netns/vsock.h @@ -20,5 +20,7 @@ struct netns_vsock { /* 0 = unlocked, 1 = locked to global, 2 = locked to local */ int child_ns_mode_locked; + + int g2h_fallback; }; #endif /* __NET_NET_NAMESPACE_VSOCK_H */ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f0ab2f13e9db..cc4b225250b9 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -545,9 +545,13 @@ static void vsock_deassign_transport(struct vsock_sock *vsk) * The vsk->remote_addr is used to decide which transport to use: * - remote CID == VMADDR_CID_LOCAL or g2h->local_cid or VMADDR_CID_HOST if * g2h is not loaded, will use local transport; - * - remote CID <= VMADDR_CID_HOST or h2g is not loaded or remote flags field - * includes VMADDR_FLAG_TO_HOST flag value, will use guest->host transport; - * - remote CID > VMADDR_CID_HOST will use host->guest transport; + * - remote CID <= VMADDR_CID_HOST or remote flags field includes + * VMADDR_FLAG_TO_HOST, will use guest->host transport; + * - remote CID > VMADDR_CID_HOST and h2g is loaded and h2g claims that CID, + * will use host->guest transport; + * - h2g not loaded or h2g does not claim that CID and g2h claims the CID via + * has_remote_cid, will use guest->host transport (when g2h_fallback=1) + * - anything else goes to h2g or returns -ENODEV if no h2g is available */ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) { @@ -581,11 +585,21 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) case SOCK_SEQPACKET: if (vsock_use_local_transport(remote_cid)) new_transport = transport_local; - else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g || + else if (remote_cid 
<= VMADDR_CID_HOST || (remote_flags & VMADDR_FLAG_TO_HOST)) new_transport = transport_g2h; - else + else if (transport_h2g && + (!transport_h2g->has_remote_cid || + transport_h2g->has_remote_cid(vsk, remote_cid))) + new_transport = transport_h2g; + else if (sock_net(sk)->vsock.g2h_fallback && + transport_g2h && transport_g2h->has_remote_cid && + transport_g2h->has_remote_cid(vsk, remote_cid)) { + vsk->remote_addr.svm_flags |= VMADDR_FLAG_TO_HOST; + new_transport = transport_g2h; + } else { new_transport = transport_h2g; + } break; default: ret = -ESOCKTNOSUPPORT; @@ -2879,6 +2893,15 @@ static struct ctl_table vsock_table[] = { .mode = 0644, .proc_handler = vsock_net_child_mode_string }, + { + .procname = "g2h_fallback", + .data = &init_net.vsock.g2h_fallback, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_ONE, + }, }; static int __net_init vsock_sysctl_register(struct net *net) @@ -2894,6 +2917,7 @@ static int __net_init vsock_sysctl_register(struct net *net) table[0].data = &net->vsock.mode; table[1].data = &net->vsock.child_ns_mode; + table[2].data = &net->vsock.g2h_fallback; } net->vsock.sysctl_hdr = register_net_sysctl_sz(net, "net/vsock", table, @@ -2928,6 +2952,7 @@ static void vsock_net_init(struct net *net) net->vsock.mode = vsock_net_child_mode(current->nsproxy->net_ns); net->vsock.child_ns_mode = net->vsock.mode; + net->vsock.g2h_fallback = 1; } static __net_init int vsock_sysctl_init_net(struct net *net) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 77fe5b7b066c..57f2d6ec3ffc 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -547,11 +547,18 @@ bool virtio_transport_stream_allow(struct vsock_sock *vsk, u32 cid, u32 port) static bool virtio_transport_seqpacket_allow(struct vsock_sock *vsk, u32 remote_cid); +static bool virtio_transport_has_remote_cid(struct vsock_sock *vsk, u32 cid) +{ + /* The CID 
could be implemented by the host. Always assume it is. */ + return true; +} + static struct virtio_transport virtio_transport = { .transport = { .module = THIS_MODULE, .get_local_cid = virtio_transport_get_local_cid, + .has_remote_cid = virtio_transport_has_remote_cid, .init = virtio_transport_do_socket_init, .destruct = virtio_transport_destruct, -- cgit v1.2.3 From 4320f1f111c587b8c6c9abc06f43e25bc10b670c Mon Sep 17 00:00:00 2001 From: Wojciech Slenska Date: Tue, 10 Mar 2026 12:22:30 +0100 Subject: dt-bindings: net: qcom,ipa: document qcm2290 compatible Document that ipa on qcm2290 uses version 4.2, the same as sc7180. Acked-by: Krzysztof Kozlowski Signed-off-by: Wojciech Slenska Link: https://patch.msgid.link/20260310112309.79261-2-wojciech.slenska@gmail.com Signed-off-by: Paolo Abeni --- Documentation/devicetree/bindings/net/qcom,ipa.yaml | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml index 4237e74041ef..e4bb627e1757 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml @@ -53,6 +53,10 @@ properties: - qcom,sm6350-ipa - qcom,sm8350-ipa - qcom,sm8550-ipa + - items: + - enum: + - qcom,qcm2290-ipa + - const: qcom,sc7180-ipa - items: - enum: - qcom,sm8650-ipa -- cgit v1.2.3 From 00699d944836576d2789eb0cf02d989b975375d7 Mon Sep 17 00:00:00 2001 From: ShravyaPanchagiri Date: Tue, 10 Mar 2026 22:04:50 -0500 Subject: docs: octeontx2: fix typo in documentation Fix spelling mistake "Crate" to "Create" in the documentation. 
Signed-off-by: ShravyaPanchagiri Reviewed-by: Simon Horman Link: https://patch.msgid.link/20260311030450.8461-1-shravy112@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst index a52850602cd8..c31c6c197cdb 100644 --- a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst +++ b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst @@ -323,7 +323,7 @@ Setup HTB offload # ethtool -K hw-tc-offload on -2. Crate htb root:: +2. Create htb root:: # tc qdisc add dev clsact # tc qdisc replace dev root handle 1: htb offload -- cgit v1.2.3 From 7c52f407f28d2e3746a931f88869b0169a09ed6a Mon Sep 17 00:00:00 2001 From: Hangbin Liu Date: Wed, 11 Mar 2026 11:22:29 +0800 Subject: ynl: ethtool: remove duplicated unspec entry There is a duplicated unspec entry. Remove it. No user impact expected, found by inspection. 
Signed-off-by: Hangbin Liu Link: https://patch.msgid.link/20260311-b4-drop_dup_unspec-v1-1-e0dfa47b5981@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/ethtool.yaml | 4 ---- 1 file changed, 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 4707063af3b4..a05b5425b76a 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -306,10 +306,6 @@ attribute-sets: name: strings attr-cnt-name: __ethtool-a-strings-cnt attributes: - - - name: unspec - type: unused - value: 0 - name: unspec type: unused -- cgit v1.2.3 From 0e24d17bd9668f9dad78ede6a0e8f13dab176682 Mon Sep 17 00:00:00 2001 From: Simon Baatz Date: Mon, 9 Mar 2026 09:02:26 +0100 Subject: tcp: implement RFC 7323 window retraction receiver requirements By default, the Linux TCP implementation does not shrink the advertised window (RFC 7323 calls this "window retraction") with the following exceptions: - When an incoming segment cannot be added due to the receive buffer running out of memory. Since commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze") a zero window will be advertised in this case. It turns out that reaching the required memory pressure is easy when window scaling is in use. In the simplest case, sending a sufficient number of segments smaller than the scale factor to a receiver that does not read data is enough. - Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by allowing the tcp window to shrink") addressed the "eating memory" problem by introducing a sysctl knob that allows shrinking the window before running out of memory. However, RFC 7323 does not only state that shrinking the window is necessary in some cases, it also formulates requirements for TCP implementations when doing so (Section 2.4). 
This commit addresses the receiver-side requirements: After retracting the window, the peer may have a snd_nxt that lies within a previously advertised window but is now beyond the retracted window. This means that all incoming segments (including pure ACKs) will be rejected until the application happens to read enough data to let the peer's snd_nxt be in window again (which may be never).

To comply with RFC 7323, the receiver MUST honor any segment that would have been in window for any ACK sent by the receiver and, when window scaling is in effect, SHOULD track the maximum window sequence number it has advertised.

This patch tracks that maximum window sequence number rcv_mwnd_seq throughout the connection and uses it in tcp_sequence() when deciding whether a segment is acceptable. rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in tcp_select_window().

If we count tcp_sequence() as fast path, it is read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's cacheline group.

The logic for handling received data in tcp_data_queue() is already sufficient and does not need to be updated.
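
As a rough illustration of the tracking described above, the following userspace sketch models a receiver that advertises a window, retracts it, and still accepts a segment inside the largest window it ever advertised. It is a simplified model, not the kernel code: `struct rx_state`, `advertise()` and the two `within_*()` helpers are hypothetical names, and only the right-edge check is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced receiver state. */
struct rx_state {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_wup;	/* rcv_nxt at the last window update */
	uint32_t rcv_wnd;	/* currently advertised window */
	uint32_t rcv_mwnd_seq;	/* highest right edge ever advertised */
};

/* Wraparound-safe "a is after b" on 32-bit sequence numbers. */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Advertise a new window; the maximum right edge only moves forward. */
static void advertise(struct rx_state *rx, uint32_t new_wnd)
{
	uint32_t right_edge;

	rx->rcv_wnd = new_wnd;
	rx->rcv_wup = rx->rcv_nxt;
	right_edge = rx->rcv_wup + rx->rcv_wnd;
	if (seq_after(right_edge, rx->rcv_mwnd_seq))
		rx->rcv_mwnd_seq = right_edge;
}

/* Window size relative to rcv_nxt for a given right edge, never negative. */
static uint32_t window_to(const struct rx_state *rx, uint32_t right_edge)
{
	int32_t win = (int32_t)(right_edge - rx->rcv_nxt);

	return win < 0 ? 0 : (uint32_t)win;
}

/* Check against the currently advertised window only. */
static bool within_current_window(const struct rx_state *rx, uint32_t end_seq)
{
	uint32_t limit = rx->rcv_nxt + window_to(rx, rx->rcv_wup + rx->rcv_wnd);

	return !seq_after(end_seq, limit);
}

/* Check against the maximum window ever advertised. */
static bool within_max_window(const struct rx_state *rx, uint32_t end_seq)
{
	uint32_t limit = rx->rcv_nxt + window_to(rx, rx->rcv_mwnd_seq);

	return !seq_after(end_seq, limit);
}
```

In this model, after advertising a window up to sequence 6000 and then shrinking it to zero at 3000, a segment ending at 5000 fails the current-window check but passes the maximum-window check.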
Signed-off-by: Simon Baatz Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski --- .../networking/net_cachelines/tcp_sock.rst | 1 + include/linux/tcp.h | 3 +++ include/net/tcp.h | 22 ++++++++++++++++++++++ net/ipv4/tcp.c | 2 ++ net/ipv4/tcp_fastopen.c | 1 + net/ipv4/tcp_input.c | 10 +++++----- net/ipv4/tcp_minisocks.c | 1 + net/ipv4/tcp_output.c | 3 +++ .../net/packetdrill/tcp_rcv_big_endseq.pkt | 2 +- 9 files changed, 39 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/net_cachelines/tcp_sock.rst b/Documentation/networking/net_cachelines/tcp_sock.rst index 563daea10d6c..fecf61166a54 100644 --- a/Documentation/networking/net_cachelines/tcp_sock.rst +++ b/Documentation/networking/net_cachelines/tcp_sock.rst @@ -121,6 +121,7 @@ u64 delivered_mstamp read_write u32 rate_delivered read_mostly tcp_rate_gen u32 rate_interval_us read_mostly rate_delivered,rate_app_limited u32 rcv_wnd read_write read_mostly tcp_select_window,tcp_receive_window,tcp_fast_path_check +u32 rcv_mwnd_seq read_write tcp_select_window u32 write_seq read_write tcp_rate_check_app_limited,tcp_write_queue_empty,tcp_skb_entail,forced_push,tcp_mark_push u32 notsent_lowat read_mostly tcp_stream_memory_free u32 pushed_seq read_write tcp_mark_push,forced_push diff --git a/include/linux/tcp.h b/include/linux/tcp.h index bcebc4f07532..6982f10e826b 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -316,6 +316,9 @@ struct tcp_sock { */ u32 app_limited; /* limited until "delivered" reaches this val */ u32 rcv_wnd; /* Current receiver window */ + u32 rcv_mwnd_seq; /* Maximum window sequence number (RFC 7323, + * section 2.4, receiver requirements) + */ u32 rcv_tstamp; /* timestamp of last received ACK (for keepalives) */ /* * Options received (usually on last packet, some only on SYN packets). 
diff --git a/include/net/tcp.h b/include/net/tcp.h index 48dffcca0a71..f87bdacb5a69 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -934,6 +934,28 @@ static inline u32 tcp_receive_window(const struct tcp_sock *tp) return (u32) win; } +/* Compute the maximum receive window we ever advertised. + * Rcv_nxt can be after the window if our peer push more data + * than the offered window. + */ +static inline u32 tcp_max_receive_window(const struct tcp_sock *tp) +{ + s32 win = tp->rcv_mwnd_seq - tp->rcv_nxt; + + if (win < 0) + win = 0; + return (u32) win; +} + +/* Check if we need to update the maximum receive window sequence number */ +static inline void tcp_update_max_rcv_wnd_seq(struct tcp_sock *tp) +{ + u32 wre = tp->rcv_wup + tp->rcv_wnd; + + if (after(wre, tp->rcv_mwnd_seq)) + tp->rcv_mwnd_seq = wre; +} + /* Choose a new window, without checks for shrinking, and without * scaling applied to the result. The caller does these things * if necessary. This is a "raw" window selection. 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index ed6f6712f060..516087c622ad 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3561,6 +3561,7 @@ static int tcp_repair_set_window(struct tcp_sock *tp, sockptr_t optbuf, int len) tp->rcv_wnd = opt.rcv_wnd; tp->rcv_wup = opt.rcv_wup; + tp->rcv_mwnd_seq = opt.rcv_wup + opt.rcv_wnd; return 0; } @@ -5275,6 +5276,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_mwnd_seq); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_tstamp); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt); diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index 9fdc19accafd..4e389d609f91 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -377,6 +377,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk, tcp_rsk(req)->rcv_nxt = tp->rcv_nxt; tp->rcv_wup = tp->rcv_nxt; + tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd; /* tcp_conn_request() is sending the SYNACK, * and queues the child into listener accept queue. 
*/ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 71ac69b7b75e..2e1b23760815 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4808,20 +4808,18 @@ static enum skb_drop_reason tcp_sequence(const struct sock *sk, const struct tcphdr *th) { const struct tcp_sock *tp = tcp_sk(sk); - u32 seq_limit; if (before(end_seq, tp->rcv_wup)) return SKB_DROP_REASON_TCP_OLD_SEQUENCE; - seq_limit = tp->rcv_nxt + tcp_receive_window(tp); - if (unlikely(after(end_seq, seq_limit))) { + if (unlikely(after(end_seq, tp->rcv_nxt + tcp_max_receive_window(tp)))) { /* Some stacks are known to handle FIN incorrectly; allow the * FIN to extend beyond the window and check it in detail later. */ - if (!after(end_seq - th->fin, seq_limit)) + if (!after(end_seq - th->fin, tp->rcv_nxt + tcp_receive_window(tp))) return SKB_NOT_DROPPED_YET; - if (after(seq, seq_limit)) + if (after(seq, tp->rcv_nxt + tcp_max_receive_window(tp))) return SKB_DROP_REASON_TCP_INVALID_SEQUENCE; /* Only accept this packet if receive queue is empty. */ @@ -6903,6 +6901,7 @@ consume: */ WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1); tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1; + tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd; /* RFC1323: The window in SYN & SYN/ACK segments is * never scaled. @@ -7015,6 +7014,7 @@ consume: WRITE_ONCE(tp->rcv_nxt, TCP_SKB_CB(skb)->seq + 1); WRITE_ONCE(tp->copied_seq, tp->rcv_nxt); tp->rcv_wup = TCP_SKB_CB(skb)->seq + 1; + tp->rcv_mwnd_seq = tp->rcv_wup + tp->rcv_wnd; /* RFC1323: The window in SYN & SYN/ACK segments is * never scaled. 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index dafb63b923d0..d350d794a959 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -604,6 +604,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->window_clamp = req->rsk_window_clamp; newtp->rcv_ssthresh = req->rsk_rcv_wnd; newtp->rcv_wnd = req->rsk_rcv_wnd; + newtp->rcv_mwnd_seq = newtp->rcv_wup + req->rsk_rcv_wnd; newtp->rx_opt.wscale_ok = ireq->wscale_ok; if (newtp->rx_opt.wscale_ok) { newtp->rx_opt.snd_wscale = ireq->snd_wscale; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 34a25ef61006..35c3b0ab5a0c 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -293,6 +293,7 @@ static u16 tcp_select_window(struct sock *sk) tp->pred_flags = 0; tp->rcv_wnd = 0; tp->rcv_wup = tp->rcv_nxt; + tcp_update_max_rcv_wnd_seq(tp); return 0; } @@ -316,6 +317,7 @@ static u16 tcp_select_window(struct sock *sk) tp->rcv_wnd = new_win; tp->rcv_wup = tp->rcv_nxt; + tcp_update_max_rcv_wnd_seq(tp); /* Make sure we do not exceed the maximum possible * scaled window. @@ -4165,6 +4167,7 @@ static void tcp_connect_init(struct sock *sk) else tp->rcv_tstamp = tcp_jiffies32; tp->rcv_wup = tp->rcv_nxt; + tp->rcv_mwnd_seq = tp->rcv_nxt + tp->rcv_wnd; WRITE_ONCE(tp->copied_seq, tp->rcv_nxt); inet_csk(sk)->icsk_rto = tcp_timeout_init(sk); diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt index 6c0f32c40f19..12882be10f2e 100644 --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt @@ -36,7 +36,7 @@ +0 read(4, ..., 100000) = 4000 -// If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd +// If queue is empty, accept a packet even if its end_seq is above rcv_mwnd_seq +0 < P. 4001:54001(50000) ack 1 win 257 * > . 
1:1(0) ack 54001 win 0 -- cgit v1.2.3 From 68deca0f0f4bba8c5278340c7c142500171d5f9b Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Thu, 12 Mar 2026 11:03:55 +0100 Subject: devlink: expose devlink instance index over netlink Each devlink instance has an internally assigned index used for xarray storage. Expose it as a new DEVLINK_ATTR_INDEX uint attribute alongside the existing bus_name and dev_name handle. Signed-off-by: Jiri Pirko Link: https://patch.msgid.link/20260312100407.551173-2-jiri@resnulli.us Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/devlink.yaml | 5 +++++ include/uapi/linux/devlink.h | 2 ++ net/devlink/devl_internal.h | 2 ++ net/devlink/port.c | 1 + 4 files changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index 837112da6738..1bed67a0eefb 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -867,6 +867,10 @@ attribute-sets: type: flag doc: Request restoring parameter to its default value. value: 183 + - + name: index + type: uint + doc: Unique devlink instance index. - name: dl-dev-stats subset-of: devlink @@ -1311,6 +1315,7 @@ operations: attributes: - bus-name - dev-name + - index - reload-failed - dev-stats dump: diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index e7d6b6d13470..1ba3436db4ae 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -642,6 +642,8 @@ enum devlink_attr { DEVLINK_ATTR_PARAM_VALUE_DEFAULT, /* dynamic */ DEVLINK_ATTR_PARAM_RESET_DEFAULT, /* flag */ + DEVLINK_ATTR_INDEX, /* uint */ + /* Add new attributes above here, update the spec in * Documentation/netlink/specs/devlink.yaml and re-generate * net/devlink/netlink_gen.c. 
diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h index 1377864383bc..31fa98af418e 100644 --- a/net/devlink/devl_internal.h +++ b/net/devlink/devl_internal.h @@ -178,6 +178,8 @@ devlink_nl_put_handle(struct sk_buff *msg, struct devlink *devlink) return -EMSGSIZE; if (nla_put_string(msg, DEVLINK_ATTR_DEV_NAME, dev_name(devlink->dev))) return -EMSGSIZE; + if (nla_put_uint(msg, DEVLINK_ATTR_INDEX, devlink->index)) + return -EMSGSIZE; return 0; } diff --git a/net/devlink/port.c b/net/devlink/port.c index 93d8a25bb920..1ff609571ea4 100644 --- a/net/devlink/port.c +++ b/net/devlink/port.c @@ -222,6 +222,7 @@ size_t devlink_nl_port_handle_size(struct devlink_port *devlink_port) return nla_total_size(strlen(devlink->dev->bus->name) + 1) /* DEVLINK_ATTR_BUS_NAME */ + nla_total_size(strlen(dev_name(devlink->dev)) + 1) /* DEVLINK_ATTR_DEV_NAME */ + + nla_total_size(8) /* DEVLINK_ATTR_INDEX */ + nla_total_size(4); /* DEVLINK_ATTR_PORT_INDEX */ } -- cgit v1.2.3 From d85a8af57da871964c60eb5b9ab121a6c31e3bd3 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Thu, 12 Mar 2026 11:03:58 +0100 Subject: devlink: allow to use devlink index as a command handle Currently devlink instances are addressed by a bus_name/dev_name tuple. Allow the newly introduced DEVLINK_ATTR_INDEX to be used as an alternative handle for all devlink commands. When DEVLINK_ATTR_INDEX is present in the request, use it for a direct xarray lookup instead of iterating over all instances comparing bus_name/dev_name strings.
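
The two handle styles can be pictured with a small userspace model. This is only a sketch under assumptions: a plain array of slots stands in for the xarray, and `struct instance`, `struct handle` and `resolve()` are hypothetical names, not devlink APIs. It treats an index combined with a name attribute as invalid, resolves an index by direct slot access, and resolves a name pair by scanning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered instance. */
struct instance {
	const char *bus_name;
	const char *dev_name;
};

/* Hypothetical request handle; a NULL member means "attribute absent". */
struct handle {
	const unsigned int *index;
	const char *bus_name;
	const char *dev_name;
};

/* Resolve a handle against a slot table indexed by instance index. */
static int resolve(const struct instance *const *slots, size_t nslots,
		   const struct handle *h, const struct instance **out)
{
	size_t i;

	if (h->index) {
		/* The index is an alternative handle, not an extra filter. */
		if (h->bus_name || h->dev_name)
			return -EINVAL;
		/* Direct lookup: no string comparisons needed. */
		if (*h->index >= nslots || !slots[*h->index])
			return -ENODEV;
		*out = slots[*h->index];
		return 0;
	}

	if (!h->bus_name || !h->dev_name)
		return -EINVAL;

	/* Name-based handle: walk all instances and compare both names. */
	for (i = 0; i < nslots; i++) {
		if (slots[i] &&
		    !strcmp(slots[i]->bus_name, h->bus_name) &&
		    !strcmp(slots[i]->dev_name, h->dev_name)) {
			*out = slots[i];
			return 0;
		}
	}
	return -ENODEV;
}
```

The index path touches a single slot regardless of how many instances exist, while the name path has to compare strings for each populated slot.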
Signed-off-by: Jiri Pirko Link: https://patch.msgid.link/20260312100407.551173-5-jiri@resnulli.us Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/devlink.yaml | 53 +++++ net/devlink/core.c | 16 +- net/devlink/devl_internal.h | 1 + net/devlink/netlink.c | 14 +- net/devlink/netlink_gen.c | 355 +++++++++++++++++++------------ 5 files changed, 296 insertions(+), 143 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index 1bed67a0eefb..b495d56b9137 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -871,6 +871,8 @@ attribute-sets: name: index type: uint doc: Unique devlink instance index. + checks: + max: u32-max - name: dl-dev-stats subset-of: devlink @@ -1310,6 +1312,7 @@ operations: attributes: &dev-id-attrs - bus-name - dev-name + - index reply: &get-reply value: 3 attributes: @@ -1334,6 +1337,7 @@ operations: attributes: &port-id-attrs - bus-name - dev-name + - index - port-index reply: value: 7 @@ -1358,6 +1362,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - port-type - port-function @@ -1375,6 +1380,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - port-flavour - port-pci-pf-number @@ -1409,6 +1415,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - port-split-count @@ -1437,6 +1444,7 @@ operations: attributes: &sb-id-attrs - bus-name - dev-name + - index - sb-index reply: &sb-get-reply value: 13 @@ -1459,6 +1467,7 @@ operations: attributes: &sb-pool-id-attrs - bus-name - dev-name + - index - sb-index - sb-pool-index reply: &sb-pool-get-reply @@ -1482,6 +1491,7 @@ operations: attributes: - bus-name - dev-name + - index - sb-index - sb-pool-index - sb-pool-threshold-type @@ -1500,6 +1510,7 @@ operations: attributes: &sb-port-pool-id-attrs - bus-name - dev-name + - index - port-index - sb-index - sb-pool-index @@ -1524,6 +1535,7 @@ 
operations: attributes: - bus-name - dev-name + - index - port-index - sb-index - sb-pool-index @@ -1542,6 +1554,7 @@ operations: attributes: &sb-tc-pool-bind-id-attrs - bus-name - dev-name + - index - port-index - sb-index - sb-pool-type @@ -1567,6 +1580,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - sb-index - sb-pool-index @@ -1588,6 +1602,7 @@ operations: attributes: - bus-name - dev-name + - index - sb-index - @@ -1603,6 +1618,7 @@ operations: attributes: - bus-name - dev-name + - index - sb-index - @@ -1621,6 +1637,7 @@ operations: attributes: &eswitch-attrs - bus-name - dev-name + - index - eswitch-mode - eswitch-inline-mode - eswitch-encap-mode @@ -1649,12 +1666,14 @@ operations: attributes: - bus-name - dev-name + - index - dpipe-table-name reply: value: 31 attributes: - bus-name - dev-name + - index - dpipe-tables - @@ -1669,11 +1688,13 @@ operations: attributes: - bus-name - dev-name + - index - dpipe-table-name reply: attributes: - bus-name - dev-name + - index - dpipe-entries - @@ -1688,10 +1709,12 @@ operations: attributes: - bus-name - dev-name + - index reply: attributes: - bus-name - dev-name + - index - dpipe-headers - @@ -1707,6 +1730,7 @@ operations: attributes: - bus-name - dev-name + - index - dpipe-table-name - dpipe-table-counters-enabled @@ -1723,6 +1747,7 @@ operations: attributes: - bus-name - dev-name + - index - resource-id - resource-size @@ -1738,11 +1763,13 @@ operations: attributes: - bus-name - dev-name + - index reply: value: 36 attributes: - bus-name - dev-name + - index - resource-list - @@ -1758,6 +1785,7 @@ operations: attributes: - bus-name - dev-name + - index - reload-action - reload-limits - netns-pid @@ -1767,6 +1795,7 @@ operations: attributes: - bus-name - dev-name + - index - reload-actions-performed - @@ -1781,6 +1810,7 @@ operations: attributes: &param-id-attrs - bus-name - dev-name + - index - param-name reply: &param-get-reply attributes: *param-id-attrs @@ -1802,6 +1832,7 @@ operations:
attributes: - bus-name - dev-name + - index - param-name - param-type # param-value-data is missing here as the type is variable @@ -1821,6 +1852,7 @@ operations: attributes: &region-id-attrs - bus-name - dev-name + - index - port-index - region-name reply: &region-get-reply @@ -1845,6 +1877,7 @@ operations: attributes: &region-snapshot-id-attrs - bus-name - dev-name + - index - port-index - region-name - region-snapshot-id @@ -1875,6 +1908,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - region-name - region-snapshot-id @@ -1886,6 +1920,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - region-name @@ -1935,6 +1970,7 @@ operations: attributes: - bus-name - dev-name + - index - info-driver-name - info-serial-number - info-version-fixed @@ -1956,6 +1992,7 @@ operations: attributes: &health-reporter-id-attrs - bus-name - dev-name + - index - port-index - health-reporter-name reply: &health-reporter-get-reply @@ -1978,6 +2015,7 @@ operations: attributes: - bus-name - dev-name + - index - port-index - health-reporter-name - health-reporter-graceful-period @@ -2048,6 +2086,7 @@ operations: attributes: - bus-name - dev-name + - index - flash-update-file-name - flash-update-component - flash-update-overwrite-mask @@ -2065,6 +2104,7 @@ operations: attributes: &trap-id-attrs - bus-name - dev-name + - index - trap-name reply: &trap-get-reply value: 63 @@ -2087,6 +2127,7 @@ operations: attributes: - bus-name - dev-name + - index - trap-name - trap-action @@ -2103,6 +2144,7 @@ operations: attributes: &trap-group-id-attrs - bus-name - dev-name + - index - trap-group-name reply: &trap-group-get-reply value: 67 @@ -2125,6 +2167,7 @@ operations: attributes: - bus-name - dev-name + - index - trap-group-name - trap-action - trap-policer-id @@ -2142,6 +2185,7 @@ operations: attributes: &trap-policer-id-attrs - bus-name - dev-name + - index - trap-policer-id reply: &trap-policer-get-reply value: 71 @@ -2164,6 +2208,7 @@ operations:
attributes: - bus-name - dev-name + - index - trap-policer-id - trap-policer-rate - trap-policer-burst @@ -2194,6 +2239,7 @@ operations: attributes: &rate-id-attrs - bus-name - dev-name + - index - port-index - rate-node-name reply: &rate-get-reply @@ -2217,6 +2263,7 @@ operations: attributes: - bus-name - dev-name + - index - rate-node-name - rate-tx-share - rate-tx-max @@ -2238,6 +2285,7 @@ operations: attributes: - bus-name - dev-name + - index - rate-node-name - rate-tx-share - rate-tx-max @@ -2259,6 +2307,7 @@ operations: attributes: - bus-name - dev-name + - index - rate-node-name - @@ -2274,6 +2323,7 @@ operations: attributes: &linecard-id-attrs - bus-name - dev-name + - index - linecard-index reply: &linecard-get-reply value: 80 @@ -2296,6 +2346,7 @@ operations: attributes: - bus-name - dev-name + - index - linecard-index - linecard-type @@ -2329,6 +2380,7 @@ operations: attributes: - bus-name - dev-name + - index - selftests - @@ -2340,4 +2392,5 @@ operations: attributes: - bus-name - dev-name + - index - port-index diff --git a/net/devlink/core.c b/net/devlink/core.c index 63709c132a7c..237558abcd63 100644 --- a/net/devlink/core.c +++ b/net/devlink/core.c @@ -333,13 +333,15 @@ void devlink_put(struct devlink *devlink) queue_rcu_work(system_percpu_wq, &devlink->rwork); } -struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp) +static struct devlink *__devlinks_xa_find_get(struct net *net, + unsigned long *indexp, + unsigned long end) { struct devlink *devlink = NULL; rcu_read_lock(); retry: - devlink = xa_find(&devlinks, indexp, ULONG_MAX, DEVLINK_REGISTERED); + devlink = xa_find(&devlinks, indexp, end, DEVLINK_REGISTERED); if (!devlink) goto unlock; @@ -358,6 +360,16 @@ next: goto retry; } +struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp) +{ + return __devlinks_xa_find_get(net, indexp, ULONG_MAX); +} + +struct devlink *devlinks_xa_lookup_get(struct net *net, unsigned long index) +{ + return 
__devlinks_xa_find_get(net, &index, index); +} + /** * devl_register - Register devlink instance * @devlink: devlink diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h index 1b770de0313e..395832ed4477 100644 --- a/net/devlink/devl_internal.h +++ b/net/devlink/devl_internal.h @@ -90,6 +90,7 @@ extern struct genl_family devlink_nl_family; for (index = 0; (devlink = devlinks_xa_find_get(net, &index)); index++) struct devlink *devlinks_xa_find_get(struct net *net, unsigned long *indexp); +struct devlink *devlinks_xa_lookup_get(struct net *net, unsigned long index); static inline bool __devl_is_registered(struct devlink *devlink) { diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c index 7b205f677b7a..9cba40285de4 100644 --- a/net/devlink/netlink.c +++ b/net/devlink/netlink.c @@ -186,6 +186,17 @@ devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs, char *busname; char *devname; + if (attrs[DEVLINK_ATTR_INDEX]) { + if (attrs[DEVLINK_ATTR_BUS_NAME] || + attrs[DEVLINK_ATTR_DEV_NAME]) + return ERR_PTR(-EINVAL); + index = nla_get_u32(attrs[DEVLINK_ATTR_INDEX]); + devlink = devlinks_xa_lookup_get(net, index); + if (!devlink) + return ERR_PTR(-ENODEV); + goto found; + } + if (!attrs[DEVLINK_ATTR_BUS_NAME] || !attrs[DEVLINK_ATTR_DEV_NAME]) return ERR_PTR(-EINVAL); @@ -356,7 +367,8 @@ int devlink_nl_dumpit(struct sk_buff *msg, struct netlink_callback *cb, int flags = NLM_F_MULTI; if (attrs && - (attrs[DEVLINK_ATTR_BUS_NAME] || attrs[DEVLINK_ATTR_DEV_NAME])) + (attrs[DEVLINK_ATTR_BUS_NAME] || attrs[DEVLINK_ATTR_DEV_NAME] || + attrs[DEVLINK_ATTR_INDEX])) return devlink_nl_inst_single_dumpit(msg, cb, flags, dump_one, attrs); else diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c index f4c61c2b4f22..eb35e80e01d1 100644 --- a/net/devlink/netlink_gen.c +++ b/net/devlink/netlink_gen.c @@ -11,6 +11,11 @@ #include +/* Integer value ranges */ +static const struct netlink_range_validation devlink_attr_index_range = { + .max = 
U32_MAX, +}; + /* Sparse enums validation callbacks */ static int devlink_attr_param_type_validate(const struct nlattr *attr, @@ -56,37 +61,42 @@ const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_I }; /* DEVLINK_CMD_GET - do */ -static const struct nla_policy devlink_get_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_PORT_GET - do */ -static const struct nla_policy devlink_port_get_do_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_port_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_PORT_GET - dump */ -static const struct nla_policy devlink_port_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_port_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_PORT_SET - do */ -static const struct nla_policy devlink_port_set_nl_policy[DEVLINK_ATTR_PORT_FUNCTION + 1] = { +static const struct nla_policy devlink_port_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_PORT_TYPE] = 
NLA_POLICY_MAX(NLA_U16, 3), [DEVLINK_ATTR_PORT_FUNCTION] = NLA_POLICY_NESTED(devlink_dl_port_function_nl_policy), }; /* DEVLINK_CMD_PORT_NEW - do */ -static const struct nla_policy devlink_port_new_nl_policy[DEVLINK_ATTR_PORT_PCI_SF_NUMBER + 1] = { +static const struct nla_policy devlink_port_new_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_PORT_FLAVOUR] = NLA_POLICY_MAX(NLA_U16, 7), [DEVLINK_ATTR_PORT_PCI_PF_NUMBER] = { .type = NLA_U16, }, @@ -95,58 +105,66 @@ static const struct nla_policy devlink_port_new_nl_policy[DEVLINK_ATTR_PORT_PCI_ }; /* DEVLINK_CMD_PORT_DEL - do */ -static const struct nla_policy devlink_port_del_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_port_del_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_PORT_SPLIT - do */ -static const struct nla_policy devlink_port_split_nl_policy[DEVLINK_ATTR_PORT_SPLIT_COUNT + 1] = { +static const struct nla_policy devlink_port_split_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_PORT_SPLIT_COUNT] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_PORT_UNSPLIT - do */ -static const struct nla_policy devlink_port_unsplit_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy 
devlink_port_unsplit_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_SB_GET - do */ -static const struct nla_policy devlink_sb_get_do_nl_policy[DEVLINK_ATTR_SB_INDEX + 1] = { +static const struct nla_policy devlink_sb_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_SB_GET - dump */ -static const struct nla_policy devlink_sb_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_sb_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_SB_POOL_GET - do */ -static const struct nla_policy devlink_sb_pool_get_do_nl_policy[DEVLINK_ATTR_SB_POOL_INDEX + 1] = { +static const struct nla_policy devlink_sb_pool_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16, }, }; /* DEVLINK_CMD_SB_POOL_GET - dump */ -static const struct nla_policy devlink_sb_pool_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_sb_pool_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, 
[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_SB_POOL_SET - do */ -static const struct nla_policy devlink_sb_pool_set_nl_policy[DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE + 1] = { +static const struct nla_policy devlink_sb_pool_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16, }, [DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE] = NLA_POLICY_MAX(NLA_U8, 1), @@ -154,24 +172,27 @@ static const struct nla_policy devlink_sb_pool_set_nl_policy[DEVLINK_ATTR_SB_POO }; /* DEVLINK_CMD_SB_PORT_POOL_GET - do */ -static const struct nla_policy devlink_sb_port_pool_get_do_nl_policy[DEVLINK_ATTR_SB_POOL_INDEX + 1] = { +static const struct nla_policy devlink_sb_port_pool_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16, }, }; /* DEVLINK_CMD_SB_PORT_POOL_GET - dump */ -static const struct nla_policy devlink_sb_port_pool_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_sb_port_pool_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_SB_PORT_POOL_SET - do */ -static const struct nla_policy 
devlink_sb_port_pool_set_nl_policy[DEVLINK_ATTR_SB_THRESHOLD + 1] = { +static const struct nla_policy devlink_sb_port_pool_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16, }, @@ -179,9 +200,10 @@ static const struct nla_policy devlink_sb_port_pool_set_nl_policy[DEVLINK_ATTR_S }; /* DEVLINK_CMD_SB_TC_POOL_BIND_GET - do */ -static const struct nla_policy devlink_sb_tc_pool_bind_get_do_nl_policy[DEVLINK_ATTR_SB_TC_INDEX + 1] = { +static const struct nla_policy devlink_sb_tc_pool_bind_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_TYPE] = NLA_POLICY_MAX(NLA_U8, 1), @@ -189,15 +211,17 @@ static const struct nla_policy devlink_sb_tc_pool_bind_get_do_nl_policy[DEVLINK_ }; /* DEVLINK_CMD_SB_TC_POOL_BIND_GET - dump */ -static const struct nla_policy devlink_sb_tc_pool_bind_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_sb_tc_pool_bind_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_SB_TC_POOL_BIND_SET - do */ -static const struct nla_policy devlink_sb_tc_pool_bind_set_nl_policy[DEVLINK_ATTR_SB_TC_INDEX + 1] = { +static const struct nla_policy 
devlink_sb_tc_pool_bind_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_SB_POOL_INDEX] = { .type = NLA_U16, }, @@ -207,80 +231,91 @@ static const struct nla_policy devlink_sb_tc_pool_bind_set_nl_policy[DEVLINK_ATT }; /* DEVLINK_CMD_SB_OCC_SNAPSHOT - do */ -static const struct nla_policy devlink_sb_occ_snapshot_nl_policy[DEVLINK_ATTR_SB_INDEX + 1] = { +static const struct nla_policy devlink_sb_occ_snapshot_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_SB_OCC_MAX_CLEAR - do */ -static const struct nla_policy devlink_sb_occ_max_clear_nl_policy[DEVLINK_ATTR_SB_INDEX + 1] = { +static const struct nla_policy devlink_sb_occ_max_clear_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SB_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_ESWITCH_GET - do */ -static const struct nla_policy devlink_eswitch_get_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_eswitch_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_ESWITCH_SET - do */ -static const struct nla_policy 
devlink_eswitch_set_nl_policy[DEVLINK_ATTR_ESWITCH_ENCAP_MODE + 1] = { +static const struct nla_policy devlink_eswitch_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_ESWITCH_MODE] = NLA_POLICY_MAX(NLA_U16, 2), [DEVLINK_ATTR_ESWITCH_INLINE_MODE] = NLA_POLICY_MAX(NLA_U8, 3), [DEVLINK_ATTR_ESWITCH_ENCAP_MODE] = NLA_POLICY_MAX(NLA_U8, 1), }; /* DEVLINK_CMD_DPIPE_TABLE_GET - do */ -static const struct nla_policy devlink_dpipe_table_get_nl_policy[DEVLINK_ATTR_DPIPE_TABLE_NAME + 1] = { +static const struct nla_policy devlink_dpipe_table_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_DPIPE_TABLE_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_DPIPE_ENTRIES_GET - do */ -static const struct nla_policy devlink_dpipe_entries_get_nl_policy[DEVLINK_ATTR_DPIPE_TABLE_NAME + 1] = { +static const struct nla_policy devlink_dpipe_entries_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_DPIPE_TABLE_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_DPIPE_HEADERS_GET - do */ -static const struct nla_policy devlink_dpipe_headers_get_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_dpipe_headers_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* 
DEVLINK_CMD_DPIPE_TABLE_COUNTERS_SET - do */ -static const struct nla_policy devlink_dpipe_table_counters_set_nl_policy[DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED + 1] = { +static const struct nla_policy devlink_dpipe_table_counters_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_DPIPE_TABLE_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED] = { .type = NLA_U8, }, }; /* DEVLINK_CMD_RESOURCE_SET - do */ -static const struct nla_policy devlink_resource_set_nl_policy[DEVLINK_ATTR_RESOURCE_SIZE + 1] = { +static const struct nla_policy devlink_resource_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_RESOURCE_ID] = { .type = NLA_U64, }, [DEVLINK_ATTR_RESOURCE_SIZE] = { .type = NLA_U64, }, }; /* DEVLINK_CMD_RESOURCE_DUMP - do */ -static const struct nla_policy devlink_resource_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_resource_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_RELOAD - do */ -static const struct nla_policy devlink_reload_nl_policy[DEVLINK_ATTR_RELOAD_LIMITS + 1] = { +static const struct nla_policy devlink_reload_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_RELOAD_ACTION] = 
NLA_POLICY_RANGE(NLA_U8, 1, 2), [DEVLINK_ATTR_RELOAD_LIMITS] = NLA_POLICY_BITFIELD32(6), [DEVLINK_ATTR_NETNS_PID] = { .type = NLA_U32, }, @@ -289,22 +324,25 @@ static const struct nla_policy devlink_reload_nl_policy[DEVLINK_ATTR_RELOAD_LIMI }; /* DEVLINK_CMD_PARAM_GET - do */ -static const struct nla_policy devlink_param_get_do_nl_policy[DEVLINK_ATTR_PARAM_NAME + 1] = { +static const struct nla_policy devlink_param_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PARAM_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_PARAM_GET - dump */ -static const struct nla_policy devlink_param_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_param_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_PARAM_SET - do */ -static const struct nla_policy devlink_param_set_nl_policy[DEVLINK_ATTR_PARAM_RESET_DEFAULT + 1] = { +static const struct nla_policy devlink_param_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PARAM_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_PARAM_TYPE] = NLA_POLICY_VALIDATE_FN(NLA_U8, &devlink_attr_param_type_validate), [DEVLINK_ATTR_PARAM_VALUE_CMODE] = NLA_POLICY_MAX(NLA_U8, 2), @@ -312,41 +350,46 @@ static const struct nla_policy devlink_param_set_nl_policy[DEVLINK_ATTR_PARAM_RE }; /* DEVLINK_CMD_REGION_GET - do */ -static const struct nla_policy 
devlink_region_get_do_nl_policy[DEVLINK_ATTR_REGION_NAME + 1] = { +static const struct nla_policy devlink_region_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_REGION_GET - dump */ -static const struct nla_policy devlink_region_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_region_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_REGION_NEW - do */ -static const struct nla_policy devlink_region_new_nl_policy[DEVLINK_ATTR_REGION_SNAPSHOT_ID + 1] = { +static const struct nla_policy devlink_region_new_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_REGION_DEL - do */ -static const struct nla_policy devlink_region_del_nl_policy[DEVLINK_ATTR_REGION_SNAPSHOT_ID + 1] = { +static const struct nla_policy devlink_region_del_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_REGION_NAME] = { .type = 
NLA_NUL_STRING, }, [DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_REGION_READ - dump */ -static const struct nla_policy devlink_region_read_nl_policy[DEVLINK_ATTR_REGION_DIRECT + 1] = { +static const struct nla_policy devlink_region_read_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_REGION_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_REGION_SNAPSHOT_ID] = { .type = NLA_U32, }, @@ -356,44 +399,50 @@ static const struct nla_policy devlink_region_read_nl_policy[DEVLINK_ATTR_REGION }; /* DEVLINK_CMD_PORT_PARAM_GET - do */ -static const struct nla_policy devlink_port_param_get_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_port_param_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_PORT_PARAM_SET - do */ -static const struct nla_policy devlink_port_param_set_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_port_param_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_INFO_GET - do */ -static const struct nla_policy devlink_info_get_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_info_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, 
[DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_HEALTH_REPORTER_GET - do */ -static const struct nla_policy devlink_health_reporter_get_do_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy devlink_health_reporter_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_GET - dump */ -static const struct nla_policy devlink_health_reporter_get_dump_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_health_reporter_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_SET - do */ -static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_BURST_PERIOD + 1] = { +static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD] = { .type = NLA_U64, }, @@ -403,137 +452,155 @@ static const struct nla_policy devlink_health_reporter_set_nl_policy[DEVLINK_ATT }; /* 
DEVLINK_CMD_HEALTH_REPORTER_RECOVER - do */ -static const struct nla_policy devlink_health_reporter_recover_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy devlink_health_reporter_recover_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE - do */ -static const struct nla_policy devlink_health_reporter_diagnose_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy devlink_health_reporter_diagnose_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET - dump */ -static const struct nla_policy devlink_health_reporter_dump_get_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy devlink_health_reporter_dump_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR - do */ -static const struct nla_policy devlink_health_reporter_dump_clear_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy 
devlink_health_reporter_dump_clear_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_FLASH_UPDATE - do */ -static const struct nla_policy devlink_flash_update_nl_policy[DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK + 1] = { +static const struct nla_policy devlink_flash_update_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_FLASH_UPDATE_COMPONENT] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK] = NLA_POLICY_BITFIELD32(3), }; /* DEVLINK_CMD_TRAP_GET - do */ -static const struct nla_policy devlink_trap_get_do_nl_policy[DEVLINK_ATTR_TRAP_NAME + 1] = { +static const struct nla_policy devlink_trap_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_TRAP_GET - dump */ -static const struct nla_policy devlink_trap_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_trap_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_TRAP_SET - do */ -static const struct nla_policy 
devlink_trap_set_nl_policy[DEVLINK_ATTR_TRAP_ACTION + 1] = { +static const struct nla_policy devlink_trap_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_TRAP_ACTION] = NLA_POLICY_MAX(NLA_U8, 2), }; /* DEVLINK_CMD_TRAP_GROUP_GET - do */ -static const struct nla_policy devlink_trap_group_get_do_nl_policy[DEVLINK_ATTR_TRAP_GROUP_NAME + 1] = { +static const struct nla_policy devlink_trap_group_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_GROUP_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_TRAP_GROUP_GET - dump */ -static const struct nla_policy devlink_trap_group_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_trap_group_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_TRAP_GROUP_SET - do */ -static const struct nla_policy devlink_trap_group_set_nl_policy[DEVLINK_ATTR_TRAP_POLICER_ID + 1] = { +static const struct nla_policy devlink_trap_group_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_GROUP_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_TRAP_ACTION] = NLA_POLICY_MAX(NLA_U8, 2), [DEVLINK_ATTR_TRAP_POLICER_ID] = { .type = NLA_U32, }, }; 
/* DEVLINK_CMD_TRAP_POLICER_GET - do */ -static const struct nla_policy devlink_trap_policer_get_do_nl_policy[DEVLINK_ATTR_TRAP_POLICER_ID + 1] = { +static const struct nla_policy devlink_trap_policer_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_POLICER_ID] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_TRAP_POLICER_GET - dump */ -static const struct nla_policy devlink_trap_policer_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_trap_policer_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_TRAP_POLICER_SET - do */ -static const struct nla_policy devlink_trap_policer_set_nl_policy[DEVLINK_ATTR_TRAP_POLICER_BURST + 1] = { +static const struct nla_policy devlink_trap_policer_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_TRAP_POLICER_ID] = { .type = NLA_U32, }, [DEVLINK_ATTR_TRAP_POLICER_RATE] = { .type = NLA_U64, }, [DEVLINK_ATTR_TRAP_POLICER_BURST] = { .type = NLA_U64, }, }; /* DEVLINK_CMD_HEALTH_REPORTER_TEST - do */ -static const struct nla_policy devlink_health_reporter_test_nl_policy[DEVLINK_ATTR_HEALTH_REPORTER_NAME + 1] = { +static const struct nla_policy devlink_health_reporter_test_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, 
&devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_HEALTH_REPORTER_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_RATE_GET - do */ -static const struct nla_policy devlink_rate_get_do_nl_policy[DEVLINK_ATTR_RATE_NODE_NAME + 1] = { +static const struct nla_policy devlink_rate_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_RATE_NODE_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_RATE_GET - dump */ -static const struct nla_policy devlink_rate_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_rate_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_RATE_SET - do */ -static const struct nla_policy devlink_rate_set_nl_policy[DEVLINK_ATTR_RATE_TC_BWS + 1] = { +static const struct nla_policy devlink_rate_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_RATE_NODE_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_RATE_TX_SHARE] = { .type = NLA_U64, }, [DEVLINK_ATTR_RATE_TX_MAX] = { .type = NLA_U64, }, @@ -544,9 +611,10 @@ static const struct nla_policy devlink_rate_set_nl_policy[DEVLINK_ATTR_RATE_TC_B }; /* DEVLINK_CMD_RATE_NEW - do */ -static const struct nla_policy devlink_rate_new_nl_policy[DEVLINK_ATTR_RATE_TC_BWS + 1] = { +static const struct nla_policy devlink_rate_new_nl_policy[DEVLINK_ATTR_INDEX + 1] = { 
[DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_RATE_NODE_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_RATE_TX_SHARE] = { .type = NLA_U64, }, [DEVLINK_ATTR_RATE_TX_MAX] = { .type = NLA_U64, }, @@ -557,50 +625,57 @@ static const struct nla_policy devlink_rate_new_nl_policy[DEVLINK_ATTR_RATE_TC_B }; /* DEVLINK_CMD_RATE_DEL - do */ -static const struct nla_policy devlink_rate_del_nl_policy[DEVLINK_ATTR_RATE_NODE_NAME + 1] = { +static const struct nla_policy devlink_rate_del_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_RATE_NODE_NAME] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_LINECARD_GET - do */ -static const struct nla_policy devlink_linecard_get_do_nl_policy[DEVLINK_ATTR_LINECARD_INDEX + 1] = { +static const struct nla_policy devlink_linecard_get_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_LINECARD_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_LINECARD_GET - dump */ -static const struct nla_policy devlink_linecard_get_dump_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_linecard_get_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_LINECARD_SET - do */ -static const struct nla_policy devlink_linecard_set_nl_policy[DEVLINK_ATTR_LINECARD_TYPE + 1] = { +static 
const struct nla_policy devlink_linecard_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_LINECARD_INDEX] = { .type = NLA_U32, }, [DEVLINK_ATTR_LINECARD_TYPE] = { .type = NLA_NUL_STRING, }, }; /* DEVLINK_CMD_SELFTESTS_GET - do */ -static const struct nla_policy devlink_selftests_get_nl_policy[DEVLINK_ATTR_DEV_NAME + 1] = { +static const struct nla_policy devlink_selftests_get_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), }; /* DEVLINK_CMD_SELFTESTS_RUN - do */ -static const struct nla_policy devlink_selftests_run_nl_policy[DEVLINK_ATTR_SELFTESTS + 1] = { +static const struct nla_policy devlink_selftests_run_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_SELFTESTS] = NLA_POLICY_NESTED(devlink_dl_selftest_id_nl_policy), }; /* DEVLINK_CMD_NOTIFY_FILTER_SET - do */ -static const struct nla_policy devlink_notify_filter_set_nl_policy[DEVLINK_ATTR_PORT_INDEX + 1] = { +static const struct nla_policy devlink_notify_filter_set_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; @@ -613,7 +688,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_get_nl_policy, - .maxattr = 
DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -629,14 +704,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_PORT_GET, .dumpit = devlink_nl_port_get_dumpit, .policy = devlink_port_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -646,7 +721,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_set_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_FUNCTION, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -656,7 +731,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_new_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_new_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_PCI_SF_NUMBER, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -666,7 +741,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_del_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_del_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -676,7 +751,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_split_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_split_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_SPLIT_COUNT, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -686,7 +761,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_unsplit_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_unsplit_nl_policy, - .maxattr = 
DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -696,14 +771,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_SB_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_SB_GET, .dumpit = devlink_nl_sb_get_dumpit, .policy = devlink_sb_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -713,14 +788,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_pool_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_pool_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_SB_POOL_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_SB_POOL_GET, .dumpit = devlink_nl_sb_pool_get_dumpit, .policy = devlink_sb_pool_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -730,7 +805,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_pool_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_pool_set_nl_policy, - .maxattr = DEVLINK_ATTR_SB_POOL_THRESHOLD_TYPE, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -740,14 +815,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_port_pool_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_port_pool_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_SB_POOL_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_SB_PORT_POOL_GET, .dumpit = devlink_nl_sb_port_pool_get_dumpit, .policy = devlink_sb_port_pool_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -757,7 +832,7 @@ const struct 
genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_port_pool_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_port_pool_set_nl_policy, - .maxattr = DEVLINK_ATTR_SB_THRESHOLD, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -767,14 +842,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_tc_pool_bind_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_tc_pool_bind_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_SB_TC_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_SB_TC_POOL_BIND_GET, .dumpit = devlink_nl_sb_tc_pool_bind_get_dumpit, .policy = devlink_sb_tc_pool_bind_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -784,7 +859,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_tc_pool_bind_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_tc_pool_bind_set_nl_policy, - .maxattr = DEVLINK_ATTR_SB_TC_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -794,7 +869,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_occ_snapshot_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_occ_snapshot_nl_policy, - .maxattr = DEVLINK_ATTR_SB_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -804,7 +879,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_sb_occ_max_clear_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_sb_occ_max_clear_nl_policy, - .maxattr = DEVLINK_ATTR_SB_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -814,7 +889,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_eswitch_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_eswitch_get_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, 
+ .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -824,7 +899,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_eswitch_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_eswitch_set_nl_policy, - .maxattr = DEVLINK_ATTR_ESWITCH_ENCAP_MODE, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -834,7 +909,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_dpipe_table_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_dpipe_table_get_nl_policy, - .maxattr = DEVLINK_ATTR_DPIPE_TABLE_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -844,7 +919,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_dpipe_entries_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_dpipe_entries_get_nl_policy, - .maxattr = DEVLINK_ATTR_DPIPE_TABLE_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -854,7 +929,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_dpipe_headers_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_dpipe_headers_get_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -864,7 +939,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_dpipe_table_counters_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_dpipe_table_counters_set_nl_policy, - .maxattr = DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -874,7 +949,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_resource_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_resource_set_nl_policy, - .maxattr = DEVLINK_ATTR_RESOURCE_SIZE, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -884,7 +959,7 @@ const struct genl_split_ops 
devlink_nl_ops[74] = { .doit = devlink_nl_resource_dump_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_resource_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -894,7 +969,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_reload_doit, .post_doit = devlink_nl_post_doit_dev_lock, .policy = devlink_reload_nl_policy, - .maxattr = DEVLINK_ATTR_RELOAD_LIMITS, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -904,14 +979,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_param_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_param_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_PARAM_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_PARAM_GET, .dumpit = devlink_nl_param_get_dumpit, .policy = devlink_param_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -921,7 +996,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_param_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_param_set_nl_policy, - .maxattr = DEVLINK_ATTR_PARAM_RESET_DEFAULT, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -931,14 +1006,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_region_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_region_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_REGION_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_REGION_GET, .dumpit = devlink_nl_region_get_dumpit, .policy = devlink_region_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -948,7 +1023,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_region_new_doit, .post_doit = 
devlink_nl_post_doit, .policy = devlink_region_new_nl_policy, - .maxattr = DEVLINK_ATTR_REGION_SNAPSHOT_ID, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -958,7 +1033,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_region_del_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_region_del_nl_policy, - .maxattr = DEVLINK_ATTR_REGION_SNAPSHOT_ID, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -966,7 +1041,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .validate = GENL_DONT_VALIDATE_DUMP_STRICT, .dumpit = devlink_nl_region_read_dumpit, .policy = devlink_region_read_nl_policy, - .maxattr = DEVLINK_ATTR_REGION_DIRECT, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DUMP, }, { @@ -976,7 +1051,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_param_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_param_get_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -992,7 +1067,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_port_param_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_port_param_set_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1002,7 +1077,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_info_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_info_get_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -1018,14 +1093,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, 
.flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_HEALTH_REPORTER_GET, .dumpit = devlink_nl_health_reporter_get_dumpit, .policy = devlink_health_reporter_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1035,7 +1110,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_set_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_BURST_PERIOD, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1045,7 +1120,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_recover_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_recover_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1055,7 +1130,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_diagnose_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_diagnose_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1063,7 +1138,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .validate = GENL_DONT_VALIDATE_DUMP_STRICT, .dumpit = devlink_nl_health_reporter_dump_get_dumpit, .policy = devlink_health_reporter_dump_get_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DUMP, }, { @@ -1073,7 +1148,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_dump_clear_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_dump_clear_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, 
}, { @@ -1083,7 +1158,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_flash_update_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_flash_update_nl_policy, - .maxattr = DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1093,14 +1168,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_TRAP_GET, .dumpit = devlink_nl_trap_get_dumpit, .policy = devlink_trap_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1110,7 +1185,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_set_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_ACTION, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1120,14 +1195,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_group_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_group_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_GROUP_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_TRAP_GROUP_GET, .dumpit = devlink_nl_trap_group_get_dumpit, .policy = devlink_trap_group_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1137,7 +1212,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_group_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_group_set_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_POLICER_ID, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1147,14 
+1222,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_policer_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_policer_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_POLICER_ID, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_TRAP_POLICER_GET, .dumpit = devlink_nl_trap_policer_get_dumpit, .policy = devlink_trap_policer_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1164,7 +1239,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_trap_policer_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_trap_policer_set_nl_policy, - .maxattr = DEVLINK_ATTR_TRAP_POLICER_BURST, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1174,7 +1249,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_health_reporter_test_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_health_reporter_test_nl_policy, - .maxattr = DEVLINK_ATTR_HEALTH_REPORTER_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1184,14 +1259,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_rate_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_rate_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_RATE_NODE_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_RATE_GET, .dumpit = devlink_nl_rate_get_dumpit, .policy = devlink_rate_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1201,7 +1276,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_rate_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_rate_set_nl_policy, - .maxattr = DEVLINK_ATTR_RATE_TC_BWS, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ 
-1211,7 +1286,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_rate_new_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_rate_new_nl_policy, - .maxattr = DEVLINK_ATTR_RATE_TC_BWS, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1221,7 +1296,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_rate_del_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_rate_del_nl_policy, - .maxattr = DEVLINK_ATTR_RATE_NODE_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1231,14 +1306,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_linecard_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_linecard_get_do_nl_policy, - .maxattr = DEVLINK_ATTR_LINECARD_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_LINECARD_GET, .dumpit = devlink_nl_linecard_get_dumpit, .policy = devlink_linecard_get_dump_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DUMP, }, { @@ -1248,7 +1323,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_linecard_set_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_linecard_set_nl_policy, - .maxattr = DEVLINK_ATTR_LINECARD_TYPE, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { @@ -1258,7 +1333,7 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_selftests_get_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_selftests_get_nl_policy, - .maxattr = DEVLINK_ATTR_DEV_NAME, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, { @@ -1274,14 +1349,14 @@ const struct genl_split_ops devlink_nl_ops[74] = { .doit = devlink_nl_selftests_run_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_selftests_run_nl_policy, - .maxattr = DEVLINK_ATTR_SELFTESTS, + .maxattr = DEVLINK_ATTR_INDEX, .flags = 
GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { .cmd = DEVLINK_CMD_NOTIFY_FILTER_SET, .doit = devlink_nl_notify_filter_set_doit, .policy = devlink_notify_filter_set_nl_policy, - .maxattr = DEVLINK_ATTR_PORT_INDEX, + .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, }; -- cgit v1.2.3 From 63fff8c0f7025a838f1afab5fc8e6ad989e4823e Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Thu, 12 Mar 2026 11:04:06 +0100 Subject: documentation: networking: add shared devlink documentation Document shared devlink instances for multiple PFs on the same chip. Signed-off-by: Jiri Pirko Link: https://patch.msgid.link/20260312100407.551173-13-jiri@resnulli.us Signed-off-by: Jakub Kicinski --- .../networking/devlink/devlink-shared.rst | 97 ++++++++++++++++++++++ Documentation/networking/devlink/index.rst | 1 + 2 files changed, 98 insertions(+) create mode 100644 Documentation/networking/devlink/devlink-shared.rst (limited to 'Documentation') diff --git a/Documentation/networking/devlink/devlink-shared.rst b/Documentation/networking/devlink/devlink-shared.rst new file mode 100644 index 000000000000..16bf6a7d25d9 --- /dev/null +++ b/Documentation/networking/devlink/devlink-shared.rst @@ -0,0 +1,97 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================== +Devlink Shared Instances +======================== + +Overview +======== + +Shared devlink instances allow multiple physical functions (PFs) on the same +chip to share a devlink instance for chip-wide operations. + +Multiple PFs may reside on the same physical chip, running a single firmware. +Some of the resources and configurations may be shared among these PFs. The +shared devlink instance provides an object to pin configuration knobs on. + +There are two possible usage models: + +1. The shared devlink instance is used alongside individual PF devlink + instances, providing chip-wide configuration in addition to per-PF + configuration. +2. The shared devlink instance is the only devlink instance, without + per-PF instances. 
+ +It is up to the driver to decide which usage model to use. + +The shared devlink instance is not backed by any struct *device*. + +Implementation +============== + +Architecture +------------ + +The implementation uses: + +* **Chip identification**: PFs are grouped by chip using a driver-specific identifier +* **Shared instance management**: Global list of shared instances with reference counting + +API Functions +------------- + +The following functions are provided for managing shared devlink instances: + +* ``devlink_shd_get()``: Get or create a shared devlink instance identified by a string ID +* ``devlink_shd_put()``: Release a reference on a shared devlink instance +* ``devlink_shd_get_priv()``: Get private data from shared devlink instance + +Initialization Flow +------------------- + +1. **PF calls shared devlink init** during driver probe +2. **Chip identification** using driver-specific method to determine device identity +3. **Get or create shared instance** using ``devlink_shd_get()``: + + * The function looks up existing instance by identifier + * If none exists, creates new instance: + - Allocates and registers devlink instance + - Adds to global shared instances list + - Increments reference count + +4. **Set nested devlink instance** for the PF devlink instance using + ``devl_nested_devlink_set()`` before registering the PF devlink instance + +Cleanup Flow +------------ + +1. **Cleanup** when PF is removed +2. **Call** ``devlink_shd_put()`` to release reference (decrements reference count) +3. **Shared instance is automatically destroyed** when the last PF removes (reference count reaches zero) + +Chip Identification +------------------- + +PFs belonging to the same chip are identified using a driver-specific method. +The driver is free to choose any identifier that is suitable for determining +whether two PFs are part of the same device. 
Examples include: + +* **PCI VPD serial numbers**: Extract from PCI VPD +* **Device tree properties**: Read chip identifier from device tree +* **Other hardware-specific identifiers**: Any unique identifier that groups PFs by chip + +Locking +------- + +A global mutex (``shd_mutex``) protects the shared instances list during registration/deregistration. + +Similarly to other nested devlink instance relationships, devlink lock of +the shared instance should be always taken after the devlink lock of PF. + +Reference Counting +------------------ + +Each shared devlink instance maintains a reference count (``refcount_t refcount``). +The reference count is incremented when ``devlink_shd_get()`` is called and decremented +when ``devlink_shd_put()`` is called. When the reference count reaches zero, the shared +instance is automatically destroyed. diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index 35b12a2bfeba..f7ba7dcf477d 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -68,6 +68,7 @@ general. devlink-resource devlink-selftests devlink-trap + devlink-shared Driver-specific documentation ----------------------------- -- cgit v1.2.3 From cc7a3435dfadb7469082c6b09018fb2b9cbdeda1 Mon Sep 17 00:00:00 2001 From: "Jan Petrous (OSS)" Date: Fri, 13 Mar 2026 08:13:34 +0100 Subject: dt-bindings: net: nxp,s32-dwmac: Declare per-queue interrupts The DWMAC IP on NXP S32G/R SoCs has connected queue-based IRQ lines, set them to allow using Multi-IRQ mode. 
Reviewed-by: Matthias Brugger Acked-by: Conor Dooley Signed-off-by: Jan Petrous (OSS) Link: https://patch.msgid.link/20260313-dwmac_multi_irq-v12-3-b5c9d0aa13d6@oss.nxp.com Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/net/nxp,s32-dwmac.yaml | 47 +++++++++++++++++++--- 1 file changed, 42 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/nxp,s32-dwmac.yaml b/Documentation/devicetree/bindings/net/nxp,s32-dwmac.yaml index 1b2934f3c87c..753a04941659 100644 --- a/Documentation/devicetree/bindings/net/nxp,s32-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/nxp,s32-dwmac.yaml @@ -1,5 +1,5 @@ # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) -# Copyright 2021-2024 NXP +# Copyright 2021-2026 NXP %YAML 1.2 --- $id: http://devicetree.org/schemas/net/nxp,s32-dwmac.yaml# @@ -16,6 +16,8 @@ description: the SoC S32R45 has two instances. The devices can use RGMII/RMII/MII interface over Pinctrl device or the output can be routed to the embedded SerDes for SGMII connectivity. + The DWMAC instances have connected all RX/TX queues interrupts, + enabling load balancing of data traffic across all CPU cores. properties: compatible: @@ -45,10 +47,25 @@ properties: FlexTimer Modules connect to GMAC_0. 
interrupts: - maxItems: 1 + minItems: 1 + maxItems: 11 interrupt-names: - const: macirq + oneOf: + - items: + - const: macirq + - items: + - const: macirq + - const: tx-queue-0 + - const: rx-queue-0 + - const: tx-queue-1 + - const: rx-queue-1 + - const: tx-queue-2 + - const: rx-queue-2 + - const: tx-queue-3 + - const: rx-queue-3 + - const: tx-queue-4 + - const: rx-queue-4 clocks: items: @@ -88,8 +105,28 @@ examples: <0x0 0x4007c004 0x0 0x4>; /* GMAC_0_CTRL_STS */ nxp,phy-sel = <&gpr 0x4>; interrupt-parent = <&gic>; - interrupts = ; - interrupt-names = "macirq"; + interrupts = , + /* CHN 0: tx, rx */ + , + , + /* CHN 1: tx, rx */ + , + , + /* CHN 2: tx, rx */ + , + , + /* CHN 3: tx, rx */ + , + , + /* CHN 4: tx, rx */ + , + ; + interrupt-names = "macirq", + "tx-queue-0", "rx-queue-0", + "tx-queue-1", "rx-queue-1", + "tx-queue-2", "rx-queue-2", + "tx-queue-3", "rx-queue-3", + "tx-queue-4", "rx-queue-4"; snps,mtl-rx-config = <&mtl_rx_setup>; snps,mtl-tx-config = <&mtl_tx_setup>; clocks = <&clks 24>, <&clks 17>, <&clks 16>, <&clks 15>; -- cgit v1.2.3 From c841b676da98638f5ed8d3f2f449ddd02d9921aa Mon Sep 17 00:00:00 2001 From: Ralf Lici Date: Fri, 14 Nov 2025 11:39:40 +0100 Subject: ovpn: notify userspace on client float event Send a netlink notification when a client updates its remote UDP endpoint. The notification includes the new IP address, port, and scope ID (for IPv6). 
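A userspace listener that handles this notification ends up with the same three pieces of information, split by address family. The sketch below is a standalone illustration and not ovpn code: `struct float_endpoint` and `float_endpoint_from_ss()` are hypothetical names, and it assumes only the POSIX socket headers.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical container for the fields the notification carries. */
struct float_endpoint {
	char addr[INET6_ADDRSTRLEN];	/* textual IPv4 or IPv6 address */
	uint16_t port;			/* port in host byte order */
	uint32_t scope_id;		/* only meaningful for IPv6 */
};

/*
 * Fill @ep from @ss. Returns 0 on success, -1 for an unsupported
 * address family or a failed address conversion.
 */
static int float_endpoint_from_ss(const struct sockaddr_storage *ss,
				  struct float_endpoint *ep)
{
	memset(ep, 0, sizeof(*ep));

	if (ss->ss_family == AF_INET) {
		const struct sockaddr_in *sa = (const struct sockaddr_in *)ss;

		if (!inet_ntop(AF_INET, &sa->sin_addr, ep->addr, sizeof(ep->addr)))
			return -1;
		ep->port = ntohs(sa->sin_port);
		return 0;
	}

	if (ss->ss_family == AF_INET6) {
		const struct sockaddr_in6 *sa6 = (const struct sockaddr_in6 *)ss;

		if (!inet_ntop(AF_INET6, &sa6->sin6_addr, ep->addr, sizeof(ep->addr)))
			return -1;
		ep->port = ntohs(sa6->sin6_port);
		ep->scope_id = sa6->sin6_scope_id;	/* scope ID exists only for IPv6 */
		return 0;
	}

	return -1;	/* neither AF_INET nor AF_INET6 */
}
```

The scope ID is left at zero for IPv4 endpoints, which matches the commit text: it is reported for IPv6 only.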
Cc: linux-kselftest@vger.kernel.org Cc: horms@kernel.org Cc: shuah@kernel.org Cc: donald.hunter@gmail.com Signed-off-by: Ralf Lici Signed-off-by: Antonio Quartulli Reviewed-by: Sabrina Dubroca --- Documentation/netlink/specs/ovpn.yaml | 6 +++ drivers/net/ovpn/netlink.c | 82 +++++++++++++++++++++++++++++ drivers/net/ovpn/netlink.h | 2 + drivers/net/ovpn/peer.c | 2 + include/uapi/linux/ovpn.h | 1 + tools/testing/selftests/net/ovpn/ovpn-cli.c | 3 ++ 6 files changed, 96 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/ovpn.yaml b/Documentation/netlink/specs/ovpn.yaml index 1b91045cee2e..0d0c028bf96f 100644 --- a/Documentation/netlink/specs/ovpn.yaml +++ b/Documentation/netlink/specs/ovpn.yaml @@ -502,6 +502,12 @@ operations: - ifindex - keyconf + - + name: peer-float-ntf + doc: Notification about a peer floating (changing its remote UDP endpoint) + notify: peer-get + mcgrp: peers + mcast-groups: list: - diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c index fed0e46b32a3..e10d7f9a28f5 100644 --- a/drivers/net/ovpn/netlink.c +++ b/drivers/net/ovpn/netlink.c @@ -1203,6 +1203,88 @@ err_free_msg: return ret; } +/** + * ovpn_nl_peer_float_notify - notify userspace about peer floating + * @peer: the floated peer + * @ss: sockaddr representing the new remote endpoint + * + * Return: 0 on success or a negative error code otherwise + */ +int ovpn_nl_peer_float_notify(struct ovpn_peer *peer, + const struct sockaddr_storage *ss) +{ + struct ovpn_socket *sock; + struct sockaddr_in6 *sa6; + struct sockaddr_in *sa; + struct sk_buff *msg; + struct nlattr *attr; + int ret = -EMSGSIZE; + void *hdr; + + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); + if (!msg) + return -ENOMEM; + + hdr = genlmsg_put(msg, 0, 0, &ovpn_nl_family, 0, + OVPN_CMD_PEER_FLOAT_NTF); + if (!hdr) { + ret = -ENOBUFS; + goto err_free_msg; + } + + if (nla_put_u32(msg, OVPN_A_IFINDEX, peer->ovpn->dev->ifindex)) + goto err_cancel_msg; + + attr = 
nla_nest_start(msg, OVPN_A_PEER); + if (!attr) + goto err_cancel_msg; + + if (nla_put_u32(msg, OVPN_A_PEER_ID, peer->id)) + goto err_cancel_msg; + + if (ss->ss_family == AF_INET) { + sa = (struct sockaddr_in *)ss; + if (nla_put_in_addr(msg, OVPN_A_PEER_REMOTE_IPV4, + sa->sin_addr.s_addr) || + nla_put_net16(msg, OVPN_A_PEER_REMOTE_PORT, sa->sin_port)) + goto err_cancel_msg; + } else if (ss->ss_family == AF_INET6) { + sa6 = (struct sockaddr_in6 *)ss; + if (nla_put_in6_addr(msg, OVPN_A_PEER_REMOTE_IPV6, + &sa6->sin6_addr) || + nla_put_u32(msg, OVPN_A_PEER_REMOTE_IPV6_SCOPE_ID, + sa6->sin6_scope_id) || + nla_put_net16(msg, OVPN_A_PEER_REMOTE_PORT, sa6->sin6_port)) + goto err_cancel_msg; + } else { + ret = -EAFNOSUPPORT; + goto err_cancel_msg; + } + + nla_nest_end(msg, attr); + genlmsg_end(msg, hdr); + + rcu_read_lock(); + sock = rcu_dereference(peer->sock); + if (!sock) { + ret = -EINVAL; + goto err_unlock; + } + genlmsg_multicast_netns(&ovpn_nl_family, sock_net(sock->sk), msg, + 0, OVPN_NLGRP_PEERS, GFP_ATOMIC); + rcu_read_unlock(); + + return 0; + +err_unlock: + rcu_read_unlock(); +err_cancel_msg: + genlmsg_cancel(msg, hdr); +err_free_msg: + nlmsg_free(msg); + return ret; +} + /** * ovpn_nl_key_swap_notify - notify userspace peer's key must be renewed * @peer: the peer whose key needs to be renewed diff --git a/drivers/net/ovpn/netlink.h b/drivers/net/ovpn/netlink.h index 8615dfc3c472..11ee7c681885 100644 --- a/drivers/net/ovpn/netlink.h +++ b/drivers/net/ovpn/netlink.h @@ -13,6 +13,8 @@ int ovpn_nl_register(void); void ovpn_nl_unregister(void); int ovpn_nl_peer_del_notify(struct ovpn_peer *peer); +int ovpn_nl_peer_float_notify(struct ovpn_peer *peer, + const struct sockaddr_storage *ss); int ovpn_nl_key_swap_notify(struct ovpn_peer *peer, u8 key_id); #endif /* _NET_OVPN_NETLINK_H_ */ diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c index 3716a1d82801..4e145b4497e6 100644 --- a/drivers/net/ovpn/peer.c +++ b/drivers/net/ovpn/peer.c @@ -287,6 +287,8 @@ 
void ovpn_peer_endpoints_update(struct ovpn_peer *peer, struct sk_buff *skb) spin_unlock_bh(&peer->lock); + ovpn_nl_peer_float_notify(peer, &ss); + /* rehashing is required only in MP mode as P2P has one peer * only and thus there is no hashtable */ diff --git a/include/uapi/linux/ovpn.h b/include/uapi/linux/ovpn.h index 959b41def61f..0cce0d58b830 100644 --- a/include/uapi/linux/ovpn.h +++ b/include/uapi/linux/ovpn.h @@ -100,6 +100,7 @@ enum { OVPN_CMD_KEY_SWAP, OVPN_CMD_KEY_SWAP_NTF, OVPN_CMD_KEY_DEL, + OVPN_CMD_PEER_FLOAT_NTF, __OVPN_CMD_MAX, OVPN_CMD_MAX = (__OVPN_CMD_MAX - 1) diff --git a/tools/testing/selftests/net/ovpn/ovpn-cli.c b/tools/testing/selftests/net/ovpn/ovpn-cli.c index 0f3babf19fd0..7178abae1b2f 100644 --- a/tools/testing/selftests/net/ovpn/ovpn-cli.c +++ b/tools/testing/selftests/net/ovpn/ovpn-cli.c @@ -1516,6 +1516,9 @@ static int ovpn_handle_msg(struct nl_msg *msg, void *arg) case OVPN_CMD_PEER_DEL_NTF: fprintf(stdout, "received CMD_PEER_DEL_NTF\n"); break; + case OVPN_CMD_PEER_FLOAT_NTF: + fprintf(stdout, "received CMD_PEER_FLOAT_NTF\n"); + break; case OVPN_CMD_KEY_SWAP_NTF: fprintf(stdout, "received CMD_KEY_SWAP_NTF\n"); break; -- cgit v1.2.3 From 2e570a51408839b2079f3cb7e3944bf9b1184ee0 Mon Sep 17 00:00:00 2001 From: Ralf Lici Date: Wed, 9 Jul 2025 17:21:25 +0200 Subject: ovpn: add support for asymmetric peer IDs In order to support the multipeer architecture, upon connection setup each side of a tunnel advertises a unique ID that the other side must include in packets sent to them. Therefore when transmitting a packet, a peer inserts the recipient's advertised ID for that specific tunnel into the peer ID field. When receiving a packet, a peer expects to find its own unique receive ID for that specific tunnel in the peer ID field. Add support for the TX peer ID and embed it into transmitting packets. 
If no TX peer ID is specified, fallback to using the same peer ID both for RX and TX in order to be compatible with the non-multipeer compliant peers. Cc: horms@kernel.org Cc: donald.hunter@gmail.com Signed-off-by: Ralf Lici Signed-off-by: Antonio Quartulli Reviewed-by: Sabrina Dubroca --- Documentation/netlink/specs/ovpn.yaml | 17 ++++++++++++++++- drivers/net/ovpn/crypto_aead.c | 2 +- drivers/net/ovpn/netlink-gen.c | 13 ++++++++++--- drivers/net/ovpn/netlink-gen.h | 6 +++--- drivers/net/ovpn/netlink.c | 14 ++++++++++++-- drivers/net/ovpn/peer.c | 4 ++++ drivers/net/ovpn/peer.h | 4 +++- include/uapi/linux/ovpn.h | 1 + 8 files changed, 50 insertions(+), 11 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/ovpn.yaml b/Documentation/netlink/specs/ovpn.yaml index 0d0c028bf96f..b0c782e59a32 100644 --- a/Documentation/netlink/specs/ovpn.yaml +++ b/Documentation/netlink/specs/ovpn.yaml @@ -43,7 +43,8 @@ attribute-sets: type: u32 doc: >- The unique ID of the peer in the device context. To be used to - identify peers during operations for a specific device + identify peers during operations for a specific device. + Also used to match packets received from this peer. checks: max: 0xFFFFFF - @@ -160,6 +161,16 @@ attribute-sets: name: link-tx-packets type: uint doc: Number of packets transmitted at the transport level + - + name: tx-id + type: u32 + doc: >- + The ID value used when transmitting packets to this peer. This + way outgoing packets can have a different ID than incoming ones. + Useful in multipeer-to-multipeer connections, where each peer + will advertise the tx-id to be used on the link. 
+ checks: + max: 0xFFFFFF - name: peer-new-input subset-of: peer @@ -188,6 +199,8 @@ attribute-sets: name: keepalive-interval - name: keepalive-timeout + - + name: tx-id - name: peer-set-input subset-of: peer @@ -214,6 +227,8 @@ attribute-sets: name: keepalive-interval - name: keepalive-timeout + - + name: tx-id - name: peer-del-input subset-of: peer diff --git a/drivers/net/ovpn/crypto_aead.c b/drivers/net/ovpn/crypto_aead.c index 77be0942a269..59848c41b7b2 100644 --- a/drivers/net/ovpn/crypto_aead.c +++ b/drivers/net/ovpn/crypto_aead.c @@ -122,7 +122,7 @@ int ovpn_aead_encrypt(struct ovpn_peer *peer, struct ovpn_crypto_key_slot *ks, memcpy(skb->data, iv, OVPN_NONCE_WIRE_SIZE); /* add packet op as head of additional data */ - op = ovpn_opcode_compose(OVPN_DATA_V2, ks->key_id, peer->id); + op = ovpn_opcode_compose(OVPN_DATA_V2, ks->key_id, peer->tx_id); __skb_push(skb, OVPN_OPCODE_SIZE); BUILD_BUG_ON(sizeof(op) != OVPN_OPCODE_SIZE); *((__force __be32 *)skb->data) = htonl(op); diff --git a/drivers/net/ovpn/netlink-gen.c b/drivers/net/ovpn/netlink-gen.c index ecbe9dcf4f7d..2147cec7c2c5 100644 --- a/drivers/net/ovpn/netlink-gen.c +++ b/drivers/net/ovpn/netlink-gen.c @@ -16,6 +16,10 @@ static const struct netlink_range_validation ovpn_a_peer_id_range = { .max = 16777215ULL, }; +static const struct netlink_range_validation ovpn_a_peer_tx_id_range = { + .max = 16777215ULL, +}; + static const struct netlink_range_validation ovpn_a_keyconf_peer_id_range = { .max = 16777215ULL, }; @@ -51,7 +55,7 @@ const struct nla_policy ovpn_keydir_nl_policy[OVPN_A_KEYDIR_NONCE_TAIL + 1] = { [OVPN_A_KEYDIR_NONCE_TAIL] = NLA_POLICY_EXACT_LEN(OVPN_NONCE_TAIL_SIZE), }; -const struct nla_policy ovpn_peer_nl_policy[OVPN_A_PEER_LINK_TX_PACKETS + 1] = { +const struct nla_policy ovpn_peer_nl_policy[OVPN_A_PEER_TX_ID + 1] = { [OVPN_A_PEER_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_id_range), [OVPN_A_PEER_REMOTE_IPV4] = { .type = NLA_BE32, }, [OVPN_A_PEER_REMOTE_IPV6] = 
NLA_POLICY_EXACT_LEN(16), @@ -75,13 +79,14 @@ const struct nla_policy ovpn_peer_nl_policy[OVPN_A_PEER_LINK_TX_PACKETS + 1] = { [OVPN_A_PEER_LINK_TX_BYTES] = { .type = NLA_UINT, }, [OVPN_A_PEER_LINK_RX_PACKETS] = { .type = NLA_UINT, }, [OVPN_A_PEER_LINK_TX_PACKETS] = { .type = NLA_UINT, }, + [OVPN_A_PEER_TX_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_tx_id_range), }; const struct nla_policy ovpn_peer_del_input_nl_policy[OVPN_A_PEER_ID + 1] = { [OVPN_A_PEER_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_id_range), }; -const struct nla_policy ovpn_peer_new_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIMEOUT + 1] = { +const struct nla_policy ovpn_peer_new_input_nl_policy[OVPN_A_PEER_TX_ID + 1] = { [OVPN_A_PEER_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_id_range), [OVPN_A_PEER_REMOTE_IPV4] = { .type = NLA_BE32, }, [OVPN_A_PEER_REMOTE_IPV6] = NLA_POLICY_EXACT_LEN(16), @@ -94,9 +99,10 @@ const struct nla_policy ovpn_peer_new_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIME [OVPN_A_PEER_LOCAL_IPV6] = NLA_POLICY_EXACT_LEN(16), [OVPN_A_PEER_KEEPALIVE_INTERVAL] = { .type = NLA_U32, }, [OVPN_A_PEER_KEEPALIVE_TIMEOUT] = { .type = NLA_U32, }, + [OVPN_A_PEER_TX_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_tx_id_range), }; -const struct nla_policy ovpn_peer_set_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIMEOUT + 1] = { +const struct nla_policy ovpn_peer_set_input_nl_policy[OVPN_A_PEER_TX_ID + 1] = { [OVPN_A_PEER_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_id_range), [OVPN_A_PEER_REMOTE_IPV4] = { .type = NLA_BE32, }, [OVPN_A_PEER_REMOTE_IPV6] = NLA_POLICY_EXACT_LEN(16), @@ -108,6 +114,7 @@ const struct nla_policy ovpn_peer_set_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIME [OVPN_A_PEER_LOCAL_IPV6] = NLA_POLICY_EXACT_LEN(16), [OVPN_A_PEER_KEEPALIVE_INTERVAL] = { .type = NLA_U32, }, [OVPN_A_PEER_KEEPALIVE_TIMEOUT] = { .type = NLA_U32, }, + [OVPN_A_PEER_TX_ID] = NLA_POLICY_FULL_RANGE(NLA_U32, &ovpn_a_peer_tx_id_range), }; /* OVPN_CMD_PEER_NEW - do */ diff --git 
a/drivers/net/ovpn/netlink-gen.h b/drivers/net/ovpn/netlink-gen.h index b2301580770f..67cd85f86173 100644 --- a/drivers/net/ovpn/netlink-gen.h +++ b/drivers/net/ovpn/netlink-gen.h @@ -18,10 +18,10 @@ extern const struct nla_policy ovpn_keyconf_del_input_nl_policy[OVPN_A_KEYCONF_S extern const struct nla_policy ovpn_keyconf_get_nl_policy[OVPN_A_KEYCONF_CIPHER_ALG + 1]; extern const struct nla_policy ovpn_keyconf_swap_input_nl_policy[OVPN_A_KEYCONF_PEER_ID + 1]; extern const struct nla_policy ovpn_keydir_nl_policy[OVPN_A_KEYDIR_NONCE_TAIL + 1]; -extern const struct nla_policy ovpn_peer_nl_policy[OVPN_A_PEER_LINK_TX_PACKETS + 1]; +extern const struct nla_policy ovpn_peer_nl_policy[OVPN_A_PEER_TX_ID + 1]; extern const struct nla_policy ovpn_peer_del_input_nl_policy[OVPN_A_PEER_ID + 1]; -extern const struct nla_policy ovpn_peer_new_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIMEOUT + 1]; -extern const struct nla_policy ovpn_peer_set_input_nl_policy[OVPN_A_PEER_KEEPALIVE_TIMEOUT + 1]; +extern const struct nla_policy ovpn_peer_new_input_nl_policy[OVPN_A_PEER_TX_ID + 1]; +extern const struct nla_policy ovpn_peer_set_input_nl_policy[OVPN_A_PEER_TX_ID + 1]; int ovpn_nl_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c index e10d7f9a28f5..291e2e5bb450 100644 --- a/drivers/net/ovpn/netlink.c +++ b/drivers/net/ovpn/netlink.c @@ -305,6 +305,12 @@ static int ovpn_nl_peer_modify(struct ovpn_peer *peer, struct genl_info *info, dst_cache_reset(&peer->dst_cache); } + /* In a multipeer-to-multipeer setup we may have asymmetric peer IDs, + * that is peer->id might be different from peer->tx_id. 
+ */ + if (attrs[OVPN_A_PEER_TX_ID]) + peer->tx_id = nla_get_u32(attrs[OVPN_A_PEER_TX_ID]); + if (attrs[OVPN_A_PEER_VPN_IPV4]) { rehash = true; peer->vpn_addrs.ipv4.s_addr = @@ -326,8 +332,8 @@ static int ovpn_nl_peer_modify(struct ovpn_peer *peer, struct genl_info *info, } netdev_dbg(peer->ovpn->dev, - "modify peer id=%u endpoint=%pIScp VPN-IPv4=%pI4 VPN-IPv6=%pI6c\n", - peer->id, &ss, + "modify peer id=%u tx_id=%u endpoint=%pIScp VPN-IPv4=%pI4 VPN-IPv6=%pI6c\n", + peer->id, peer->tx_id, &ss, &peer->vpn_addrs.ipv4.s_addr, &peer->vpn_addrs.ipv6); spin_unlock_bh(&peer->lock); @@ -373,6 +379,7 @@ int ovpn_nl_peer_new_doit(struct sk_buff *skb, struct genl_info *info) } peer_id = nla_get_u32(attrs[OVPN_A_PEER_ID]); + peer = ovpn_peer_new(ovpn, peer_id); if (IS_ERR(peer)) { NL_SET_ERR_MSG_FMT_MOD(info->extack, @@ -572,6 +579,9 @@ static int ovpn_nl_send_peer(struct sk_buff *skb, const struct genl_info *info, if (nla_put_u32(skb, OVPN_A_PEER_ID, peer->id)) goto err; + if (nla_put_u32(skb, OVPN_A_PEER_TX_ID, peer->tx_id)) + goto err; + if (peer->vpn_addrs.ipv4.s_addr != htonl(INADDR_ANY)) if (nla_put_in_addr(skb, OVPN_A_PEER_VPN_IPV4, peer->vpn_addrs.ipv4.s_addr)) diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c index 4e145b4497e6..26b55d813f0e 100644 --- a/drivers/net/ovpn/peer.c +++ b/drivers/net/ovpn/peer.c @@ -99,7 +99,11 @@ struct ovpn_peer *ovpn_peer_new(struct ovpn_priv *ovpn, u32 id) if (!peer) return ERR_PTR(-ENOMEM); + /* in the default case TX and RX IDs are the same. 
+ * the user may set a different TX ID via netlink + */ peer->id = id; + peer->tx_id = id; peer->ovpn = ovpn; peer->vpn_addrs.ipv4.s_addr = htonl(INADDR_ANY); diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h index a1423f2b09e0..328401570cba 100644 --- a/drivers/net/ovpn/peer.h +++ b/drivers/net/ovpn/peer.h @@ -21,7 +21,8 @@ * struct ovpn_peer - the main remote peer object * @ovpn: main openvpn instance this peer belongs to * @dev_tracker: reference tracker for associated dev - * @id: unique identifier + * @id: unique identifier, used to match incoming packets + * @tx_id: identifier to be used in TX packets * @vpn_addrs: IP addresses assigned over the tunnel * @vpn_addrs.ipv4: IPv4 assigned to peer on the tunnel * @vpn_addrs.ipv6: IPv6 assigned to peer on the tunnel @@ -64,6 +65,7 @@ struct ovpn_peer { struct ovpn_priv *ovpn; netdevice_tracker dev_tracker; u32 id; + u32 tx_id; struct { struct in_addr ipv4; struct in6_addr ipv6; diff --git a/include/uapi/linux/ovpn.h b/include/uapi/linux/ovpn.h index 0cce0d58b830..06690090a1a9 100644 --- a/include/uapi/linux/ovpn.h +++ b/include/uapi/linux/ovpn.h @@ -55,6 +55,7 @@ enum { OVPN_A_PEER_LINK_TX_BYTES, OVPN_A_PEER_LINK_RX_PACKETS, OVPN_A_PEER_LINK_TX_PACKETS, + OVPN_A_PEER_TX_ID, __OVPN_A_PEER_MAX, OVPN_A_PEER_MAX = (__OVPN_A_PEER_MAX - 1) -- cgit v1.2.3 From 63e9d434ddc8b2f7b8b6a6d31611736981edc9ca Mon Sep 17 00:00:00 2001 From: Charles Perry Date: Fri, 13 Mar 2026 07:06:08 -0700 Subject: dt-bindings: net: cdns,macb: add a compatible for Microchip pic64hpsc Add "microchip,pic64hpsc-gem" for "PIC64-HPSC" and "microchip,pic64hx-gem" for "PIC64HX", compatible with the former. The generic compatible "cdns,gem" works but offers limited features. Keep it as a fallback. The GEM IPs within pic64hpsc have their MDIO controllers unconnected from any physical pin. Add a check to prevent adding PHYs under the GEM node. 
Signed-off-by: Charles Perry Acked-by: Conor Dooley Link: https://patch.msgid.link/20260313140610.3681752-2-charles.perry@microchip.com Signed-off-by: Paolo Abeni --- Documentation/devicetree/bindings/net/cdns,macb.yaml | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml index cb14c35ba996..feb168385837 100644 --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml @@ -70,6 +70,14 @@ properties: - microchip,sama7d65-gem # Microchip SAMA7D65 gigabit ethernet interface - const: microchip,sama7g5-gem # Microchip SAMA7G5 gigabit ethernet interface + - items: + - const: microchip,pic64hpsc-gem # Microchip PIC64-HPSC + - const: cdns,gem + - items: + - const: microchip,pic64hx-gem # Microchip PIC64HX + - const: microchip,pic64hpsc-gem # Microchip PIC64-HPSC + - const: cdns,gem + reg: minItems: 1 items: @@ -196,6 +204,17 @@ allOf: required: - phys + - if: + properties: + compatible: + contains: + const: microchip,pic64hpsc-gem + then: + patternProperties: + "^ethernet-phy@[0-9a-f]$": false + properties: + mdio: false + unevaluatedProperties: false examples: -- cgit v1.2.3 From bb30400a566c7a6a9355873344ec63e2c6310e2c Mon Sep 17 00:00:00 2001 From: Inochi Amaoto Date: Mon, 16 Mar 2026 09:00:37 +0800 Subject: dt-bindings: net: Add support for Spacemit K3 dwmac The GMAC IP on Spacemit K3 is almost a standard Synopsys DesignWare MAC (version 5.40a) with some extra clock. Add necessary compatible string for this device. 
Signed-off-by: Inochi Amaoto Reviewed-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260316010041.164360-2-inochiama@gmail.com Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/net/snps,dwmac.yaml | 2 + .../devicetree/bindings/net/spacemit,k3-dwmac.yaml | 102 +++++++++++++++++++++ 2 files changed, 104 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/spacemit,k3-dwmac.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 38bc34dc4f09..98ebb6276bc6 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -109,6 +109,7 @@ properties: - snps,dwmac-5.10a - snps,dwmac-5.20 - snps,dwmac-5.30a + - snps,dwmac-5.40a - snps,dwxgmac - snps,dwxgmac-2.10 - sophgo,sg2042-dwmac @@ -656,6 +657,7 @@ allOf: - snps,dwmac-5.10a - snps,dwmac-5.20 - snps,dwmac-5.30a + - snps,dwmac-5.40a - snps,dwxgmac - snps,dwxgmac-2.10 - st,spear600-gmac diff --git a/Documentation/devicetree/bindings/net/spacemit,k3-dwmac.yaml b/Documentation/devicetree/bindings/net/spacemit,k3-dwmac.yaml new file mode 100644 index 000000000000..678eccf044f9 --- /dev/null +++ b/Documentation/devicetree/bindings/net/spacemit,k3-dwmac.yaml @@ -0,0 +1,102 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/spacemit,k3-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Spacemit K3 DWMAC glue layer + +maintainers: + - Inochi Amaoto + +select: + properties: + compatible: + contains: + const: spacemit,k3-dwmac + required: + - compatible + +properties: + compatible: + items: + - const: spacemit,k3-dwmac + - const: snps,dwmac-5.40a + + reg: + maxItems: 1 + + clocks: + items: + - description: GMAC application clock + - description: PTP clock + - description: TX clock + + clock-names: + items: + - const: stmmaceth + - 
const: ptp_ref + - const: tx + + interrupts: + minItems: 1 + items: + - description: MAC interrupt + - description: MAC wake interrupt + + interrupt-names: + minItems: 1 + items: + - const: macirq + - const: eth_wake_irq + + resets: + maxItems: 1 + + reset-names: + const: stmmaceth + + spacemit,apmu: + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: phandle to the syscon node which control the glue register + - description: offset of the control register + - description: offset of the dline register + description: + A phandle to syscon with offset to control registers for this MAC + +required: + - compatible + - reg + - clocks + - clock-names + - interrupts + - interrupt-names + - resets + - reset-names + - spacemit,apmu + +allOf: + - $ref: snps,dwmac.yaml# + +unevaluatedProperties: false + +examples: + - | + #include + + ethernet@cac80000 { + compatible = "spacemit,k3-dwmac", "snps,dwmac-5.40a"; + reg = <0xcac80000 0x2000>; + clocks = <&syscon_apmu 66>, <&syscon_apmu 68>, + <&syscon_apmu 69>; + clock-names = "stmmaceth", "ptp_ref", "tx"; + interrupts = <131 IRQ_TYPE_LEVEL_HIGH>, <276 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "macirq", "eth_wake_irq"; + phy-mode = "rgmii-id"; + phy-handle = <&phy0>; + resets = <&syscon_apmu 67>; + reset-names = "stmmaceth"; + spacemit,apmu = <&syscon_apmu 0x384 0x38c>; + }; -- cgit v1.2.3 From 46906242fe6103c81af59c541be2ab2902d7a335 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Tue, 17 Mar 2026 09:02:45 +0100 Subject: dt-bindings: net: micrel: Sort lists Sort lists of PHY models and compatible values alphabetically. 
Signed-off-by: Geert Uytterhoeven Reviewed-by: Stefan Eichenberger Acked-by: Conor Dooley Link: https://patch.msgid.link/3877a8bca7e4c13119387870d10b0758274fa6a0.1773734298.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/micrel.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/micrel.yaml b/Documentation/devicetree/bindings/net/micrel.yaml index ecc00169ef80..5d25f0d0a508 100644 --- a/Documentation/devicetree/bindings/net/micrel.yaml +++ b/Documentation/devicetree/bindings/net/micrel.yaml @@ -51,9 +51,9 @@ properties: bits that are currently supported: KSZ8001: register 0x1e, bits 15..14 - KSZ8041: register 0x1e, bits 15..14 KSZ8021: register 0x1f, bits 5..4 KSZ8031: register 0x1f, bits 5..4 + KSZ8041: register 0x1e, bits 15..14 KSZ8051: register 0x1f, bits 5..4 KSZ8081: register 0x1f, bits 5..4 KSZ8091: register 0x1f, bits 5..4 @@ -80,9 +80,9 @@ allOf: contains: enum: - ethernet-phy-id0022.1510 + - ethernet-phy-id0022.1550 - ethernet-phy-id0022.1555 - ethernet-phy-id0022.1556 - - ethernet-phy-id0022.1550 - ethernet-phy-id0022.1560 - ethernet-phy-id0022.161a then: -- cgit v1.2.3 From 22214fb2fad04d23c40598ec9be21ea638d75841 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Tue, 17 Mar 2026 09:02:46 +0100 Subject: dt-bindings: net: micrel: KSZ8041RNLI supports LED mode Micrel KSZ8041RNLI supports LED mode, just like KSZ8041. This fixes (a.o.) 
the following "make dtbs_check" warning: arch/arm/boot/dts/renesas/r8a7791-koelsch.dtb: ethernet-phy@1 (ethernet-phy-id0022.1537): False schema does not allow 1 from schema $id: http://devicetree.org/schemas/net/micrel.yaml Signed-off-by: Geert Uytterhoeven Reviewed-by: Stefan Eichenberger Acked-by: Conor Dooley Link: https://patch.msgid.link/efad6c7e024b3a9aa2882db65909ee5bbbcbdc45.1773734298.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/micrel.yaml | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/micrel.yaml b/Documentation/devicetree/bindings/net/micrel.yaml index 5d25f0d0a508..6fa568057b92 100644 --- a/Documentation/devicetree/bindings/net/micrel.yaml +++ b/Documentation/devicetree/bindings/net/micrel.yaml @@ -54,6 +54,7 @@ properties: KSZ8021: register 0x1f, bits 5..4 KSZ8031: register 0x1f, bits 5..4 KSZ8041: register 0x1e, bits 15..14 + KSZ8041RNLI: register 0x1e, bits 15..14 KSZ8051: register 0x1f, bits 5..4 KSZ8081: register 0x1f, bits 5..4 KSZ8091: register 0x1f, bits 5..4 @@ -80,6 +81,7 @@ allOf: contains: enum: - ethernet-phy-id0022.1510 + - ethernet-phy-id0022.1537 - ethernet-phy-id0022.1550 - ethernet-phy-id0022.1555 - ethernet-phy-id0022.1556 -- cgit v1.2.3 From dc3d720e12f602059490c1ab2bfee84a7465998f Mon Sep 17 00:00:00 2001 From: Haiyang Zhang Date: Tue, 17 Mar 2026 12:18:05 -0700 Subject: net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS Add two parameters for drivers supporting Rx CQE coalescing / descriptor writeback. ETHTOOL_A_COALESCE_RX_CQE_FRAMES: Maximum number of frames that can be coalesced into a CQE or writeback. ETHTOOL_A_COALESCE_RX_CQE_NSECS: Max time in nanoseconds after the first packet arrival in a coalesced CQE or writeback to be sent. 
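How a device combines the two limits is left to the driver and hardware. One plausible reading is that a pending CQE is written back as soon as either limit is reached. The sketch below models that reading only; `struct cqe_coalesce_state` and `cqe_should_flush()` are hypothetical and not part of the ethtool API, and treating 0 as "limit disabled" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state for one pending, not yet written CQE. */
struct cqe_coalesce_state {
	uint32_t rx_cqe_frames;	/* configured frame limit (assumed: 0 = no limit) */
	uint32_t rx_cqe_nsecs;	/* configured time limit (assumed: 0 = no limit) */
	uint32_t pending;	/* frames coalesced so far */
	uint64_t first_ns;	/* arrival time of the first coalesced frame */
};

/* Return true when the pending CQE should be written back at time @now_ns. */
static bool cqe_should_flush(const struct cqe_coalesce_state *s, uint64_t now_ns)
{
	if (s->pending == 0)
		return false;	/* nothing coalesced yet */

	if (s->rx_cqe_frames && s->pending >= s->rx_cqe_frames)
		return true;	/* frame limit reached */

	if (s->rx_cqe_nsecs && now_ns - s->first_ns >= s->rx_cqe_nsecs)
		return true;	/* first frame has waited long enough */

	return false;
}
```

The timer is measured from the first frame rather than the most recent one, following the "after the first packet arrival" wording above.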
Signed-off-by: Haiyang Zhang Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/ethtool.yaml | 8 ++++++++ Documentation/networking/ethtool-netlink.rst | 11 +++++++++++ include/linux/ethtool.h | 6 +++++- include/uapi/linux/ethtool_netlink_generated.h | 2 ++ net/ethtool/coalesce.c | 14 +++++++++++++- 5 files changed, 39 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index a05b5425b76a..5dd4d1b5d94b 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -857,6 +857,12 @@ attribute-sets: name: tx-profile type: nest nested-attributes: profile + - + name: rx-cqe-frames + type: u32 + - + name: rx-cqe-nsecs + type: u32 - name: pause-stat @@ -2253,6 +2259,8 @@ operations: - tx-aggr-time-usecs - rx-profile - tx-profile + - rx-cqe-frames + - rx-cqe-nsecs dump: *coalesce-get-op - name: coalesce-set diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 32179168eb73..e92abf45faf5 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -1076,6 +1076,8 @@ Kernel response contents: ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS`` u32 time (us), aggr, Tx ``ETHTOOL_A_COALESCE_RX_PROFILE`` nested profile of DIM, Rx ``ETHTOOL_A_COALESCE_TX_PROFILE`` nested profile of DIM, Tx + ``ETHTOOL_A_COALESCE_RX_CQE_FRAMES`` u32 max packets, Rx CQE + ``ETHTOOL_A_COALESCE_RX_CQE_NSECS`` u32 delay (ns), Rx CQE =========================================== ====== ======================= Attributes are only included in reply if their value is not zero or the @@ -1109,6 +1111,13 @@ well with frequent small-sized URBs transmissions. to DIM parameters, see `Generic Network Dynamic Interrupt Moderation (Net DIM) `_. 
+Rx CQE coalescing allows multiple received packets to be coalesced into a +single Completion Queue Entry (CQE) or descriptor writeback. +``ETHTOOL_A_COALESCE_RX_CQE_FRAMES`` describes the maximum number of +frames that can be coalesced into a CQE or writeback. +``ETHTOOL_A_COALESCE_RX_CQE_NSECS`` describes max time in nanoseconds after +the first packet arrival in a coalesced CQE or writeback to be sent. + COALESCE_SET ============ @@ -1147,6 +1156,8 @@ Request contents: ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS`` u32 time (us), aggr, Tx ``ETHTOOL_A_COALESCE_RX_PROFILE`` nested profile of DIM, Rx ``ETHTOOL_A_COALESCE_TX_PROFILE`` nested profile of DIM, Tx + ``ETHTOOL_A_COALESCE_RX_CQE_FRAMES`` u32 max packets, Rx CQE + ``ETHTOOL_A_COALESCE_RX_CQE_NSECS`` u32 delay (ns), Rx CQE =========================================== ====== ======================= Request is rejected if it attributes declared as unsupported by driver (i.e. diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 83c375840835..656d465bcd06 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -332,6 +332,8 @@ struct kernel_ethtool_coalesce { u32 tx_aggr_max_bytes; u32 tx_aggr_max_frames; u32 tx_aggr_time_usecs; + u32 rx_cqe_frames; + u32 rx_cqe_nsecs; }; /** @@ -380,7 +382,9 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32, #define ETHTOOL_COALESCE_TX_AGGR_TIME_USECS BIT(26) #define ETHTOOL_COALESCE_RX_PROFILE BIT(27) #define ETHTOOL_COALESCE_TX_PROFILE BIT(28) -#define ETHTOOL_COALESCE_ALL_PARAMS GENMASK(28, 0) +#define ETHTOOL_COALESCE_RX_CQE_FRAMES BIT(29) +#define ETHTOOL_COALESCE_RX_CQE_NSECS BIT(30) +#define ETHTOOL_COALESCE_ALL_PARAMS GENMASK(30, 0) #define ETHTOOL_COALESCE_USECS \ (ETHTOOL_COALESCE_RX_USECS | ETHTOOL_COALESCE_TX_USECS) diff --git a/include/uapi/linux/ethtool_netlink_generated.h b/include/uapi/linux/ethtool_netlink_generated.h index 114b83017297..8134baf7860f 100644 --- a/include/uapi/linux/ethtool_netlink_generated.h +++ 
b/include/uapi/linux/ethtool_netlink_generated.h @@ -371,6 +371,8 @@ enum { ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS, ETHTOOL_A_COALESCE_RX_PROFILE, ETHTOOL_A_COALESCE_TX_PROFILE, + ETHTOOL_A_COALESCE_RX_CQE_FRAMES, + ETHTOOL_A_COALESCE_RX_CQE_NSECS, __ETHTOOL_A_COALESCE_CNT, ETHTOOL_A_COALESCE_MAX = (__ETHTOOL_A_COALESCE_CNT - 1) diff --git a/net/ethtool/coalesce.c b/net/ethtool/coalesce.c index 3e18ca1ccc5e..349bb02c517a 100644 --- a/net/ethtool/coalesce.c +++ b/net/ethtool/coalesce.c @@ -118,6 +118,8 @@ static int coalesce_reply_size(const struct ethnl_req_info *req_base, nla_total_size(sizeof(u32)) + /* _TX_AGGR_MAX_BYTES */ nla_total_size(sizeof(u32)) + /* _TX_AGGR_MAX_FRAMES */ nla_total_size(sizeof(u32)) + /* _TX_AGGR_TIME_USECS */ + nla_total_size(sizeof(u32)) + /* _RX_CQE_FRAMES */ + nla_total_size(sizeof(u32)) + /* _RX_CQE_NSECS */ total_modersz * 2; /* _{R,T}X_PROFILE */ } @@ -269,7 +271,11 @@ static int coalesce_fill_reply(struct sk_buff *skb, coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES, kcoal->tx_aggr_max_frames, supported) || coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS, - kcoal->tx_aggr_time_usecs, supported)) + kcoal->tx_aggr_time_usecs, supported) || + coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_CQE_FRAMES, + kcoal->rx_cqe_frames, supported) || + coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_CQE_NSECS, + kcoal->rx_cqe_nsecs, supported)) return -EMSGSIZE; if (!req_base->dev || !req_base->dev->irq_moder) @@ -338,6 +344,8 @@ const struct nla_policy ethnl_coalesce_set_policy[] = { [ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES] = { .type = NLA_U32 }, [ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES] = { .type = NLA_U32 }, [ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS] = { .type = NLA_U32 }, + [ETHTOOL_A_COALESCE_RX_CQE_FRAMES] = { .type = NLA_U32 }, + [ETHTOOL_A_COALESCE_RX_CQE_NSECS] = { .type = NLA_U32 }, [ETHTOOL_A_COALESCE_RX_PROFILE] = NLA_POLICY_NESTED(coalesce_profile_policy), [ETHTOOL_A_COALESCE_TX_PROFILE] = @@ -570,6 +578,10 @@ 
__ethnl_set_coalesce(struct ethnl_req_info *req_info, struct genl_info *info, tb[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES], &mod); ethnl_update_u32(&kernel_coalesce.tx_aggr_time_usecs, tb[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS], &mod); + ethnl_update_u32(&kernel_coalesce.rx_cqe_frames, + tb[ETHTOOL_A_COALESCE_RX_CQE_FRAMES], &mod); + ethnl_update_u32(&kernel_coalesce.rx_cqe_nsecs, + tb[ETHTOOL_A_COALESCE_RX_CQE_NSECS], &mod); if (dev->irq_moder && dev->irq_moder->profile_flags & DIM_PROFILE_RX) { ret = ethnl_update_profile(dev, &dev->irq_moder->rx_profile, -- cgit v1.2.3 From c1887257a81bf62f48178d3b9d31e23520d67b2c Mon Sep 17 00:00:00 2001 From: Damien Dejean Date: Wed, 18 Mar 2026 22:54:58 +0100 Subject: dt-bindings: net: ethernet-phy: add property enet-phy-pair-order Add property enet-phy-pair-order to the device tree bindings to define the pair order of the PHY. To simplify PCB design, some manufacturers allow the pairs to be wired in reverse order and the order to be changed in software. The property can be set to 0 to force the normal pair order (ABCD), or 1 to force the reverse pair order (DCBA). Signed-off-by: Damien Dejean Reviewed-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260318215502.106528-2-dam.dejean@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/ethernet-phy.yaml | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml index 58634fee9fc4..4a27547f7d7a 100644 --- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml @@ -126,6 +126,12 @@ properties: e.g. wrong bootstrap configuration caused by issues in PCB layout design. + enet-phy-pair-order: + $ref: /schemas/types.yaml#/definitions/uint32 + enum: [0, 1] + description: + For normal (0) or reverse (1) order of the pairs (ABCD -> DCBA). 
+ eee-broken-100tx: $ref: /schemas/types.yaml#/definitions/flag description: -- cgit v1.2.3 From 58ffb5910f32e5b387d4af31ee21851c40eb31b5 Mon Sep 17 00:00:00 2001 From: Damien Dejean Date: Wed, 18 Mar 2026 22:55:00 +0100 Subject: dt-bindings: net: ethernet-phy: add property enet-phy-pair-polarity Add the property enet-phy-pair-polarity to describe the polarity of the PHY pairs. To ease PCB design, some manufacturers allow the pairs to be wired with reverse polarity and provide a way to configure it. The property 'enet-phy-pair-polarity' sets the polarity of each pair. Bits 0 to 3 configure the polarity of pairs A to D; if a bit is set to 1, the polarity of the corresponding pair is reversed. Signed-off-by: Damien Dejean Reviewed-by: Rob Herring (Arm) Link: https://patch.msgid.link/20260318215502.106528-4-dam.dejean@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/ethernet-phy.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml index 4a27547f7d7a..21a1a63506f0 100644 --- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml @@ -132,6 +132,14 @@ properties: description: For normal (0) or reverse (1) order of the pairs (ABCD -> DCBA). + enet-phy-pair-polarity: + $ref: /schemas/types.yaml#/definitions/uint32 + maximum: 0xf + description: + A bitmap to describe pair polarity swap. Bit 0 to swap polarity of pair A, + bit 1 to swap polarity of pair B, bit 2 to swap polarity of pair C and bit + 3 to swap polarity of pair D. 
+ eee-broken-100tx: $ref: /schemas/types.yaml#/definitions/flag description: -- cgit v1.2.3 From 8454478ef9abd4b4acb657160d9587a94cdae180 Mon Sep 17 00:00:00 2001 From: Joey Lu Date: Mon, 23 Mar 2026 18:17:54 +0800 Subject: dt-bindings: net: nuvoton: Add schema for Nuvoton MA35 family GMAC Create initial schema for Nuvoton MA35 family Gigabit MAC. Reviewed-by: Rob Herring (Arm) Signed-off-by: Joey Lu Link: https://patch.msgid.link/20260323101756.81849-2-a0987203069@gmail.com Signed-off-by: Jakub Kicinski --- .../bindings/net/nuvoton,ma35d1-dwmac.yaml | 140 +++++++++++++++++++++ .../devicetree/bindings/net/snps,dwmac.yaml | 1 + 2 files changed, 141 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/nuvoton,ma35d1-dwmac.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/nuvoton,ma35d1-dwmac.yaml b/Documentation/devicetree/bindings/net/nuvoton,ma35d1-dwmac.yaml new file mode 100644 index 000000000000..ab18702e53f9 --- /dev/null +++ b/Documentation/devicetree/bindings/net/nuvoton,ma35d1-dwmac.yaml @@ -0,0 +1,140 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/nuvoton,ma35d1-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Nuvoton DWMAC glue layer controller + +maintainers: + - Joey Lu + +description: + Nuvoton 10/100/1000Mbps Gigabit Ethernet MAC Controller is based on + Synopsys DesignWare MAC (version 3.73a). + +select: + properties: + compatible: + contains: + enum: + - nuvoton,ma35d1-dwmac + required: + - compatible + +allOf: + - $ref: snps,dwmac.yaml# + +properties: + compatible: + items: + - const: nuvoton,ma35d1-dwmac + - const: snps,dwmac-3.70a + + reg: + maxItems: 1 + description: + Register range should be one of the GMAC interface. 
+ + interrupts: + maxItems: 1 + + clocks: + items: + - description: MAC clock + - description: PTP clock + + clock-names: + items: + - const: stmmaceth + - const: ptp_ref + + nuvoton,sys: + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: phandle to access syscon registers. + - description: GMAC interface ID. + enum: + - 0 + - 1 + description: + A phandle to the syscon with one argument that configures system registers + for MA35D1's two GMACs. The argument specifies the GMAC interface ID. + + resets: + maxItems: 1 + + reset-names: + items: + - const: stmmaceth + + phy-mode: + enum: + - rmii + - rgmii + - rgmii-id + - rgmii-txid + - rgmii-rxid + + tx-internal-delay-ps: + default: 0 + minimum: 0 + maximum: 2000 + description: + RGMII TX path delay used only when PHY operates in RGMII mode with + internal delay (phy-mode is 'rgmii-id' or 'rgmii-txid') in pico-seconds. + Allowed values are from 0 to 2000. + + rx-internal-delay-ps: + default: 0 + minimum: 0 + maximum: 2000 + description: + RGMII RX path delay used only when PHY operates in RGMII mode with + internal delay (phy-mode is 'rgmii-id' or 'rgmii-rxid') in pico-seconds. + Allowed values are from 0 to 2000. 
+ +required: + - clocks + - clock-names + - nuvoton,sys + - resets + - reset-names + - phy-mode + +unevaluatedProperties: false + +examples: + - | + #include + #include + #include + ethernet@40120000 { + compatible = "nuvoton,ma35d1-dwmac", "snps,dwmac-3.70a"; + reg = <0x40120000 0x10000>; + interrupts = ; + interrupt-names = "macirq"; + clocks = <&clk EMAC0_GATE>, <&clk EPLL_DIV8>; + clock-names = "stmmaceth", "ptp_ref"; + + nuvoton,sys = <&sys 0>; + resets = <&sys MA35D1_RESET_GMAC0>; + reset-names = "stmmaceth"; + snps,multicast-filter-bins = <0>; + snps,perfect-filter-entries = <8>; + rx-fifo-depth = <4096>; + tx-fifo-depth = <2048>; + + phy-mode = "rgmii-id"; + phy-handle = <&eth_phy0>; + mdio { + compatible = "snps,dwmac-mdio"; + #address-cells = <1>; + #size-cells = <0>; + + eth_phy0: ethernet-phy@0 { + reg = <0>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 98ebb6276bc6..c25903c74484 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -69,6 +69,7 @@ properties: - ingenic,x2000-mac - loongson,ls2k-dwmac - loongson,ls7a-dwmac + - nuvoton,ma35d1-dwmac - nxp,s32g2-dwmac - qcom,qcs404-ethqos - qcom,sa8775p-ethqos -- cgit v1.2.3 From ed8edcd47529d9ad6558ca2c00dccf21fc0abc08 Mon Sep 17 00:00:00 2001 From: Ryohei Kinugawa Date: Tue, 24 Mar 2026 14:34:10 +0900 Subject: docs/mlx5: Fix typo subfuction Fix two typos: - 'Subfunctons' -> 'Subfunctions' - 'subfuction' -> 'subfunction' Reviewed-by: Joe Damato Signed-off-by: Ryohei Kinugawa Reviewed-by: Tariq Toukan Acked-by: Randy Dunlap Link: https://patch.msgid.link/20260324053416.70166-1-ryohei.kinugawa@gmail.com Signed-off-by: Jakub Kicinski --- .../networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git 
a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst index 34e911480108..b45d6871492c 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst @@ -114,13 +114,13 @@ Enabling the driver and kconfig options **CONFIG_MLX5_SF=(y/n)** | Build support for subfunction. -| Subfunctons are more light weight than PCI SRIOV VFs. Choosing this option +| Subfunctions are more light weight than PCI SRIOV VFs. Choosing this option | will enable support for creating subfunction devices. **CONFIG_MLX5_SF_MANAGER=(y/n)** -| Build support for subfuction port in the NIC. A Mellanox subfunction +| Build support for subfunction port in the NIC. A Mellanox subfunction | port is managed through devlink. A subfunction supports RDMA, netdevice | and vdpa device. It is similar to a SRIOV VF but it doesn't require | SRIOV support. -- cgit v1.2.3 From af0331e1ac51f48d0cc6bb09547b8c8ae467a019 Mon Sep 17 00:00:00 2001 From: "Russell King (Oracle)" Date: Tue, 24 Mar 2026 10:05:45 +0000 Subject: dt-bindings: remove unimplemented AXI snps,kbbe snps,mb and snps,rb Remove the AXI snps,kbbe snps,mb and snps,rb properties as they have not been used, and although the driver parses these, the code hasn't ever used the parsed result. This parsing has now been removed. These were introduced by commit afea03656add ("stmmac: rework DMA bus setting and introduce new platform AXI structure"). 
Signed-off-by: Russell King (Oracle) Acked-by: Conor Dooley Link: https://patch.msgid.link/E1w4ydt-0000000Dlph-3WvI@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/snps,dwmac.yaml | 18 ------------------ 1 file changed, 18 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index c25903c74484..2449311c6d28 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -204,11 +204,8 @@ properties: * snps,xit_frm, unlock on WoL * snps,wr_osr_lmt, max write outstanding req. limit * snps,rd_osr_lmt, max read outstanding req. limit - * snps,kbbe, do not cross 1KiB boundary. * snps,blen, this is a vector of supported burst length. * snps,fb, fixed-burst - * snps,mb, mixed-burst - * snps,rb, rebuild INCRx Burst snps,mtl-rx-config: $ref: /schemas/types.yaml#/definitions/phandle @@ -588,11 +585,6 @@ properties: description: max read outstanding req. limit - snps,kbbe: - $ref: /schemas/types.yaml#/definitions/flag - description: - do not cross 1KiB boundary. - snps,blen: $ref: /schemas/types.yaml#/definitions/uint32-array description: @@ -605,16 +597,6 @@ properties: description: fixed-burst - snps,mb: - $ref: /schemas/types.yaml#/definitions/flag - description: - mixed-burst - - snps,rb: - $ref: /schemas/types.yaml#/definitions/flag - description: - rebuild INCRx Burst - required: - compatible - reg -- cgit v1.2.3 From dfa36d7e860c2f008d92d91903f2d41ed0c88f3d Mon Sep 17 00:00:00 2001 From: Conor Dooley Date: Wed, 25 Mar 2026 16:28:08 +0000 Subject: dt-bindings: net: cdns,macb: replace cdns,refclk-ext with cdns,refclk-source Ryan added cdns,refclk-ext with the intent of decoupling the source of the reference clock on sama7g5 (and related platforms) from the compatible. 
Unfortunately, the default for sama7g5-emac is an external reference clock, so this property had no effect there, so that compatibility with older devicetrees is preserved. Replace cdns,refclk-ext with one that supports both default states and therefore is usable for sama7g5-emac. For now, limit it to only the platforms that have USRIO controlled reference clock selection, but this could be generalised in the future. The existing property only works on devices that are compatible with sama7g5-gem, so mark it deprecated, and limit its use to that specific scenario. Signed-off-by: Conor Dooley Link: https://patch.msgid.link/20260325-savior-untainted-03057ee0a917@spud Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/net/cdns,macb.yaml | 56 ++++++++++++++++++++-- 1 file changed, 53 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml index feb168385837..50490acbb6fc 100644 --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml @@ -130,10 +130,23 @@ properties: cdns,refclk-ext: type: boolean + deprecated: true + description: | + This selects if the REFCLK for RMII is provided by an external source. + For RGMII mode this selects if the 125MHz REF clock is provided by an external + source. + + This property has been replaced by cdns,refclk-source, as it only works + for devices that use an internal reference clock by default. + + cdns,refclk-source: + $ref: /schemas/types.yaml#/definitions/string + enum: + - internal + - external description: - This selects if the REFCLK for RMII is provided by an external source. - For RGMII mode this selects if the 125MHz REF clock is provided by an external - source. + Select whether or not the refclk for RGMII or RMII is provided by an + internal or external source. The default is device specific. 
cdns,rx-watermark: $ref: /schemas/types.yaml#/definitions/uint32 @@ -215,6 +228,43 @@ allOf: properties: mdio: false + - if: + not: + properties: + compatible: + contains: + enum: + - microchip,sama7g5-gem + - microchip,sama7g5-emac + then: + properties: + cdns,refclk-source: false + + - if: + not: + properties: + compatible: + contains: + const: microchip,sama7g5-gem + then: + properties: + cdns,refclk-ext: false + + - if: + properties: + compatible: + contains: + enum: + - microchip,sama7g5-emac + then: + properties: + cdns,refclk-source: + default: external + else: + properties: + cdns,refclk-source: + default: internal + unevaluatedProperties: false examples: -- cgit v1.2.3 From 09a6164a4f1d1b5364069bd290dba1d808718016 Mon Sep 17 00:00:00 2001 From: Conor Dooley Date: Wed, 25 Mar 2026 16:28:14 +0000 Subject: dt-bindings: net: macb: add property indicating timer adjust mode The GEM IP has two methods for modifying the ptp timer. The first of these, named "increment mode", relies on software controlling the timer by setting tsu_timer_incr and tsu_timer_incr_sub_nsec and performing once-off adjustments via the tsu_timer_adjust register. This is what the macb driver uses. The second mechanism, "timer adjust mode" uses the gem_tsu_inc_ctrl and gem_tsu_ms signals to control the timer. These modes are not intended to be used in parallel, but both can be possible on the same device and which mode is used cannot be determined from the compatible on all devices, because some users of the GEM IP are SoC FPGAs that permit configuring how the IP is wired up. Add a property to indicate that gem_tsu_inc_ctrl and gem_tsu_ms are wired up for timer adjust mode. 
Reviewed-by: Krzysztof Kozlowski Signed-off-by: Conor Dooley Link: https://patch.msgid.link/20260325-daily-entitle-3640f7254da4@spud Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/cdns,macb.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml index 50490acbb6fc..2c8c080a3d88 100644 --- a/Documentation/devicetree/bindings/net/cdns,macb.yaml +++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml @@ -158,6 +158,12 @@ properties: that need to be filled, before the forwarding process is activated. Width of the SRAM is platform dependent, and can be 4, 8 or 16 bytes. + cdns,timer-adjust: + type: boolean + description: + Set when the hardware is operating in timer-adjust mode, where the timer + is controlled by the gem_tsu_inc_ctrl and gem_tsu_ms inputs. + '#address-cells': const: 1 @@ -207,6 +213,15 @@ allOf: properties: reg: maxItems: 1 + - if: + not: + properties: + compatible: + contains: + const: microchip,mpfs-macb + then: + properties: + cdns,timer-adjust: false - if: properties: -- cgit v1.2.3 From 2f41d7867800a78f339fbec3ab3a64147c33a9f1 Mon Sep 17 00:00:00 2001 From: Viken Dadhaniya Date: Sat, 21 Mar 2026 19:20:30 +0530 Subject: dt-bindings: can: mcp251xfd: add microchip,xstbyen property Add the boolean property 'microchip,xstbyen' to enable the dedicated transceiver standby control function on the INT0/GPIO0/XSTBY pin of the MCP251xFD family. 
Signed-off-by: Viken Dadhaniya Acked-by: Conor Dooley Link: https://patch.msgid.link/20260321135031.3107408-2-viken.dadhaniya@oss.qualcomm.com Signed-off-by: Marc Kleine-Budde --- .../devicetree/bindings/net/can/microchip,mcp251xfd.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml index 2d13638ebc6a..28e494262cd9 100644 --- a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml +++ b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml @@ -44,6 +44,14 @@ properties: signals a pending RX interrupt. maxItems: 1 + microchip,xstbyen: + type: boolean + description: + If present, configure the INT0/GPIO0/XSTBY pin as transceiver standby + control. The pin is driven low when the controller is active and high + when it enters Sleep mode, allowing automatic standby control of an + external CAN transceiver connected to this pin. + spi-max-frequency: description: Must be half or less of "clocks" frequency. -- cgit v1.2.3 From 3fdea79c09d169b6ea172b8d36232c3773f39973 Mon Sep 17 00:00:00 2001 From: Ivan Vecera Date: Thu, 2 Apr 2026 20:40:55 +0200 Subject: dpll: add frequency monitoring to netlink spec Add DPLL_A_FREQUENCY_MONITOR device attribute to allow control over the frequency monitor feature. The attribute uses the existing dpll_feature_state enum (enable/disable) and is present in both device-get reply and device-set request. Add DPLL_A_PIN_MEASURED_FREQUENCY pin attribute to expose the measured input frequency in millihertz (mHz). The attribute is present in the pin-get reply. Add DPLL_PIN_MEASURED_FREQUENCY_DIVIDER constant to allow userspace to extract integer and fractional parts. 
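The divider arithmetic described above can be illustrated with a small userspace sketch. This is not code from the series: it assumes a tool has already read the u64 value of DPLL_A_PIN_MEASURED_FREQUENCY from a pin-get reply, and the struct and helper names (measured_freq, split_measured_freq, format_measured_freq) are hypothetical. Only the divider value of 1000 is taken from the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the constant this patch adds to include/uapi/linux/dpll.h;
 * defined locally so the sketch builds without kernel headers. */
#ifndef DPLL_PIN_MEASURED_FREQUENCY_DIVIDER
#define DPLL_PIN_MEASURED_FREQUENCY_DIVIDER 1000
#endif

/* Hypothetical helper type, not part of the uAPI. */
struct measured_freq {
	uint64_t hz;       /* integer part, in Hz */
	uint32_t millihz;  /* fractional part, in millihertz (0..999) */
};

/* Split a raw attribute value (in mHz) into integer and fractional parts. */
static struct measured_freq split_measured_freq(uint64_t raw_millihz)
{
	struct measured_freq f;

	f.hz = raw_millihz / DPLL_PIN_MEASURED_FREQUENCY_DIVIDER;
	f.millihz = (uint32_t)(raw_millihz % DPLL_PIN_MEASURED_FREQUENCY_DIVIDER);
	return f;
}

/* Render the value as "<Hz>.<mHz> Hz", zero-padding the fraction to 3 digits. */
static int format_measured_freq(char *buf, size_t len, uint64_t raw_millihz)
{
	struct measured_freq f = split_measured_freq(raw_millihz);

	return snprintf(buf, len, "%" PRIu64 ".%03" PRIu32 " Hz", f.hz, f.millihz);
}
```

With these helpers, a raw value of 10000000123 would be reported as 10000000.123 Hz. The fraction needs the zero padding: a remainder of 7 means 0.007 Hz, not 0.7 Hz.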
Reviewed-by: Vadim Fedorenko Signed-off-by: Ivan Vecera Link: https://patch.msgid.link/20260402184057.1890514-2-ivecera@redhat.com Signed-off-by: Jakub Kicinski --- Documentation/driver-api/dpll.rst | 20 ++++++++++++++++++++ Documentation/netlink/specs/dpll.yaml | 35 +++++++++++++++++++++++++++++++++++ drivers/dpll/dpll_nl.c | 5 +++-- include/uapi/linux/dpll.h | 5 ++++- 4 files changed, 62 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/driver-api/dpll.rst b/Documentation/driver-api/dpll.rst index 83118c728ed9..93c191b2d089 100644 --- a/Documentation/driver-api/dpll.rst +++ b/Documentation/driver-api/dpll.rst @@ -250,6 +250,24 @@ in the ``DPLL_A_PIN_PHASE_OFFSET`` attribute. ``DPLL_A_PHASE_OFFSET_MONITOR`` attr state of a feature =============================== ======================== +Frequency monitor +================= + +Some DPLL devices may offer the capability to measure the actual +frequency of all available input pins. The attribute and current feature state +shall be included in the response message of the ``DPLL_CMD_DEVICE_GET`` +command for supported DPLL devices. In such cases, users can also control +the feature using the ``DPLL_CMD_DEVICE_SET`` command by setting the +``enum dpll_feature_state`` values for the attribute. +Once enabled the measured input frequency for each input pin shall be +returned in the ``DPLL_A_PIN_MEASURED_FREQUENCY`` attribute. The value +is in millihertz (mHz), using ``DPLL_PIN_MEASURED_FREQUENCY_DIVIDER`` +as the divider. + + =============================== ======================== + ``DPLL_A_FREQUENCY_MONITOR`` attr state of a feature + =============================== ======================== + Embedded SYNC ============= @@ -411,6 +429,8 @@ according to attribute purpose. 
``DPLL_A_PIN_STATE`` attr state of pin on the parent pin ``DPLL_A_PIN_CAPABILITIES`` attr bitmask of pin capabilities + ``DPLL_A_PIN_MEASURED_FREQUENCY`` attr measured frequency of + an input pin in mHz ==================================== ================================== ==================================== ================================= diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml index 3dd48a32f783..40465a3d7fc2 100644 --- a/Documentation/netlink/specs/dpll.yaml +++ b/Documentation/netlink/specs/dpll.yaml @@ -240,6 +240,20 @@ definitions: integer part of a measured phase offset value. Value of (DPLL_A_PHASE_OFFSET % DPLL_PHASE_OFFSET_DIVIDER) is a fractional part of a measured phase offset value. + - + type: const + name: pin-measured-frequency-divider + value: 1000 + doc: | + pin measured frequency divider allows userspace to calculate + a value of measured input frequency as a fractional value with + three digit decimal precision (millihertz). + Value of (DPLL_A_PIN_MEASURED_FREQUENCY / + DPLL_PIN_MEASURED_FREQUENCY_DIVIDER) is an integer part of + a measured frequency value. + Value of (DPLL_A_PIN_MEASURED_FREQUENCY % + DPLL_PIN_MEASURED_FREQUENCY_DIVIDER) is a fractional part of + a measured frequency value. - type: enum name: feature-state @@ -319,6 +333,13 @@ attribute-sets: name: phase-offset-avg-factor type: u32 doc: Averaging factor applied to calculation of reported phase offset. + - + name: frequency-monitor + type: u32 + enum: feature-state + doc: Current or desired state of the frequency monitor feature. + If enabled, dpll device shall measure all currently available + inputs for their actual input frequency. - name: pin enum-name: dpll_a_pin @@ -456,6 +477,17 @@ attribute-sets: Value is in PPT (parts per trillion, 10^-12). Note: This attribute provides higher resolution than the standard fractional-frequency-offset (which is in PPM). 
+ - + name: measured-frequency + type: u64 + doc: | + The measured frequency of the input pin in millihertz (mHz). + Value of (DPLL_A_PIN_MEASURED_FREQUENCY / + DPLL_PIN_MEASURED_FREQUENCY_DIVIDER) is an integer part (Hz) + of a measured frequency value. + Value of (DPLL_A_PIN_MEASURED_FREQUENCY % + DPLL_PIN_MEASURED_FREQUENCY_DIVIDER) is a fractional part + of a measured frequency value. - name: pin-parent-device @@ -544,6 +576,7 @@ operations: - type - phase-offset-monitor - phase-offset-avg-factor + - frequency-monitor dump: reply: *dev-attrs @@ -563,6 +596,7 @@ operations: - mode - phase-offset-monitor - phase-offset-avg-factor + - frequency-monitor - name: device-create-ntf doc: Notification about device appearing @@ -643,6 +677,7 @@ operations: - esync-frequency-supported - esync-pulse - reference-sync + - measured-frequency dump: request: diff --git a/drivers/dpll/dpll_nl.c b/drivers/dpll/dpll_nl.c index a2b22d492114..1e652340a5d7 100644 --- a/drivers/dpll/dpll_nl.c +++ b/drivers/dpll/dpll_nl.c @@ -43,11 +43,12 @@ static const struct nla_policy dpll_device_get_nl_policy[DPLL_A_ID + 1] = { }; /* DPLL_CMD_DEVICE_SET - do */ -static const struct nla_policy dpll_device_set_nl_policy[DPLL_A_PHASE_OFFSET_AVG_FACTOR + 1] = { +static const struct nla_policy dpll_device_set_nl_policy[DPLL_A_FREQUENCY_MONITOR + 1] = { [DPLL_A_ID] = { .type = NLA_U32, }, [DPLL_A_MODE] = NLA_POLICY_RANGE(NLA_U32, 1, 2), [DPLL_A_PHASE_OFFSET_MONITOR] = NLA_POLICY_MAX(NLA_U32, 1), [DPLL_A_PHASE_OFFSET_AVG_FACTOR] = { .type = NLA_U32, }, + [DPLL_A_FREQUENCY_MONITOR] = NLA_POLICY_MAX(NLA_U32, 1), }; /* DPLL_CMD_PIN_ID_GET - do */ @@ -115,7 +116,7 @@ static const struct genl_split_ops dpll_nl_ops[] = { .doit = dpll_nl_device_set_doit, .post_doit = dpll_post_doit, .policy = dpll_device_set_nl_policy, - .maxattr = DPLL_A_PHASE_OFFSET_AVG_FACTOR, + .maxattr = DPLL_A_FREQUENCY_MONITOR, .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, }, { diff --git a/include/uapi/linux/dpll.h 
b/include/uapi/linux/dpll.h index de0005f28e5c..871685f7c353 100644 --- a/include/uapi/linux/dpll.h +++ b/include/uapi/linux/dpll.h @@ -191,7 +191,8 @@ enum dpll_pin_capabilities { DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE = 4, }; -#define DPLL_PHASE_OFFSET_DIVIDER 1000 +#define DPLL_PHASE_OFFSET_DIVIDER 1000 +#define DPLL_PIN_MEASURED_FREQUENCY_DIVIDER 1000 /** * enum dpll_feature_state - Allow control (enable/disable) and status checking @@ -218,6 +219,7 @@ enum dpll_a { DPLL_A_CLOCK_QUALITY_LEVEL, DPLL_A_PHASE_OFFSET_MONITOR, DPLL_A_PHASE_OFFSET_AVG_FACTOR, + DPLL_A_FREQUENCY_MONITOR, __DPLL_A_MAX, DPLL_A_MAX = (__DPLL_A_MAX - 1) @@ -254,6 +256,7 @@ enum dpll_a_pin { DPLL_A_PIN_REFERENCE_SYNC, DPLL_A_PIN_PHASE_ADJUST_GRAN, DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET_PPT, + DPLL_A_PIN_MEASURED_FREQUENCY, __DPLL_A_PIN_MAX, DPLL_A_PIN_MAX = (__DPLL_A_PIN_MAX - 1) -- cgit v1.2.3 From c8eee00c0fef5f709b9114be432d7b3afebb4c0a Mon Sep 17 00:00:00 2001 From: Daniel Zahka Date: Fri, 3 Apr 2026 03:26:31 -0700 Subject: psp: add missing device stats to get-stats reply attributes Commit f05d26198cf2 ("psp: add stats from psp spec to driver facing api") added device statistics (rx-packets, rx-bytes, rx-auth-fail, rx-error, rx-bad, tx-packets, tx-bytes, tx-error) to the stats attribute-set but did not add them to the get-stats operation reply attributes. The kernel reports these attributes in the reply, so list them in the spec to match. 
Signed-off-by: Daniel Zahka Acked-by: Willem de Bruijn Link: https://patch.msgid.link/20260403-psp-yaml-fix-v1-1-dacee0663903@gmail.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/psp.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/psp.yaml b/Documentation/netlink/specs/psp.yaml index f3a57782d2cf..100c36cda8e5 100644 --- a/Documentation/netlink/specs/psp.yaml +++ b/Documentation/netlink/specs/psp.yaml @@ -267,6 +267,14 @@ operations: - dev-id - key-rotations - stale-events + - rx-packets + - rx-bytes + - rx-auth-fail + - rx-error + - rx-bad + - tx-packets + - tx-bytes + - tx-error pre: psp-device-get-locked post: psp-device-unlock dump: -- cgit v1.2.3 From e72058a4bed070193893410e5f6545cc4f99cd24 Mon Sep 17 00:00:00 2001 From: David Heidelberg Date: Fri, 3 Apr 2026 15:58:46 +0200 Subject: dt-bindings: nfc: nxp,nci: Document PN557 compatible The PN557 uses the same hardware as the PN553 but ships with firmware compliant with NCI 2.0. Document PN557 as a compatible device. 
Signed-off-by: David Heidelberg Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-1-fbdce57d63c1@ixit.cz Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml b/Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml index 364b36151180..4f3847f64983 100644 --- a/Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml +++ b/Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml @@ -18,6 +18,7 @@ properties: - nxp,nq310 - nxp,pn547 - nxp,pn553 + - nxp,pn557 - const: nxp,nxp-nci-i2c enable-gpios: -- cgit v1.2.3 From 72b18625ba8e5c28acbc923822a8b9ca7c5b3931 Mon Sep 17 00:00:00 2001 From: Ronald Claveau Date: Fri, 27 Mar 2026 10:36:26 +0100 Subject: dt-bindings: net: wireless: brcm: Add compatible for bcm43752 Add bcm43752 compatible with its bcm4329 compatible fallback. Acked-by: Conor Dooley Signed-off-by: Ronald Claveau Acked-by: Arend van Spriel Link: https://patch.msgid.link/20260327-add-bcm43752-compatible-v2-1-5b28e6637101@aliel.fr Signed-off-by: Johannes Berg --- Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml b/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml index 3be757678764..81fd3e37452a 100644 --- a/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml +++ b/Documentation/devicetree/bindings/net/wireless/brcm,bcm4329-fmac.yaml @@ -42,6 +42,7 @@ properties: - brcm,bcm4356-fmac - brcm,bcm4359-fmac - brcm,bcm4366-fmac + - brcm,bcm43752-fmac - cypress,cyw4373-fmac - cypress,cyw43012-fmac - infineon,cyw43439-fmac -- cgit v1.2.3 From 3ebaf730b5832319726e12ebe634a7679eaf2e9b Mon Sep 17 00:00:00 2001 From: Raj Kumar Bhagat Date: Tue, 7 Apr 2026 10:56:28 +0530 
Subject: dt-bindings: net: wireless: add ath12k wifi device IPQ5424 Add the device-tree bindings for the ATH12K AHB wifi device IPQ5424. Signed-off-by: Raj Kumar Bhagat Acked-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260407-ath12k-ipq5424-v5-1-8e96aa660ec4@oss.qualcomm.com Signed-off-by: Jeff Johnson --- Documentation/devicetree/bindings/net/wireless/qcom,ipq5332-wifi.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ipq5332-wifi.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ipq5332-wifi.yaml index 363a0ecb6ad9..37d8a0da7780 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ipq5332-wifi.yaml +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ipq5332-wifi.yaml @@ -17,6 +17,7 @@ properties: compatible: enum: - qcom,ipq5332-wifi + - qcom,ipq5424-wifi reg: maxItems: 1 -- cgit v1.2.3 From 3d7640b6c371a1795e6d9580695d20caf16be9a4 Mon Sep 17 00:00:00 2001 From: Amit Pundir Date: Tue, 7 Apr 2026 08:43:54 +0200 Subject: dt-bindings: wireless: ath10k: Add quirk to skip host cap QMI requests Some firmware versions do not support the host-capability QMI request. Since this request occurs before firmware and board files are loaded, the quirk cannot be expressed in the firmware itself and must be described in the device tree. 
Signed-off-by: Amit Pundir Co-developed-by: David Heidelberg Signed-off-by: David Heidelberg Reviewed-by: Krzysztof Kozlowski Link: https://patch.msgid.link/20260407-skip-host-cam-qmi-req-v5-1-dfa8a05c6538@ixit.cz Signed-off-by: Jeff Johnson --- .../devicetree/bindings/net/wireless/qcom,ath10k.yaml | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml index f2440d39b7eb..c21d66c7cd55 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml @@ -171,6 +171,12 @@ properties: Quirk specifying that the firmware expects the 8bit version of the host capability QMI request + qcom,snoc-host-cap-skip-quirk: + type: boolean + description: + Quirk specifying that the firmware wants to skip the host + capability QMI request + qcom,xo-cal-data: $ref: /schemas/types.yaml#/definitions/uint32 description: @@ -292,6 +298,11 @@ allOf: required: - interrupts + - not: + required: + - qcom,snoc-host-cap-8bit-quirk + - qcom,snoc-host-cap-skip-quirk + examples: # SNoC - | -- cgit v1.2.3 From bd5c24e4001d39136e1084fc47335c93e4ebbb0d Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Mon, 6 Apr 2026 10:53:34 -0700 Subject: docs: netdev: improve wording of reviewer guidance Reword the reviewer guidance based on behavior we see on the list. Steer folks: - towards sending tags - away from process issues. 
Reviewed-by: Joe Damato Reviewed-by: Nicolai Buchwitz Link: https://patch.msgid.link/20260406175334.3153451-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- Documentation/process/maintainer-netdev.rst | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst index 3aa13bc2405d..bda93b459a05 100644 --- a/Documentation/process/maintainer-netdev.rst +++ b/Documentation/process/maintainer-netdev.rst @@ -551,10 +551,12 @@ helpful tips please see :ref:`development_advancedtopics_reviews`. It's safe to assume that netdev maintainers know the community and the level of expertise of the reviewers. The reviewers should not be concerned about -their comments impeding or derailing the patch flow. +their comments impeding or derailing the patch flow. A Reviewed-by tag +is understood to mean "I have reviewed this code to the best of my ability" +rather than "I can attest this code is correct". -Less experienced reviewers are highly encouraged to do more in-depth -review of submissions and not focus exclusively on trivial or subjective +Reviewers are highly encouraged to do more in-depth review of submissions +and not focus exclusively on process issues, trivial or subjective matters like code formatting, tags etc. Testimonials / feedback -- cgit v1.2.3 From b773b9935239e9bec86b96ce91b6ba2252c20b44 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Tue, 7 Apr 2026 00:21:55 +0300 Subject: net: dsa: remove struct platform_data This is not used anywhere in the kernel. 
Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20260406212158.721806-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski --- Documentation/networking/dsa/dsa.rst | 5 ----- include/linux/platform_data/dsa.h | 17 ----------------- 2 files changed, 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst index 5c79740a533b..fd3c254ced1d 100644 --- a/Documentation/networking/dsa/dsa.rst +++ b/Documentation/networking/dsa/dsa.rst @@ -383,11 +383,6 @@ DSA data structures are defined in ``include/net/dsa.h`` as well as well as various properties of its ports: names/labels, and finally a routing table indication (when cascading switches) -- ``dsa_platform_data``: platform device configuration data which can reference - a collection of dsa_chip_data structures if multiple switches are cascaded, - the conduit network device this switch tree is attached to needs to be - referenced - - ``dsa_switch_tree``: structure assigned to the conduit network device under ``dsa_ptr``, this structure references a dsa_platform_data structure as well as the tagging protocol supported by the switch tree, and which receive/transmit diff --git a/include/linux/platform_data/dsa.h b/include/linux/platform_data/dsa.h index d4d9bf2060a6..fec1ae5bddb9 100644 --- a/include/linux/platform_data/dsa.h +++ b/include/linux/platform_data/dsa.h @@ -48,21 +48,4 @@ struct dsa_chip_data { s8 rtable[DSA_MAX_SWITCHES]; }; -struct dsa_platform_data { - /* - * Reference to a Linux network interface that connects - * to the root switch chip of the tree. - */ - struct device *netdev; - struct net_device *of_netdev; - - /* - * Info structs describing each of the switch chips - * connected via this network interface. 
- */ - int nr_chips; - struct dsa_chip_data *chip; -}; - - #endif /* __DSA_PDATA_H */ -- cgit v1.2.3 From 11636b550eea9e480fe4ceec8ab7bd8f24b363da Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Tue, 7 Apr 2026 22:41:00 +0300 Subject: devlink: Add dump support for device-level resources Add dumpit handler for resource-dump command to iterate over all devlink devices and show their resources. $ devlink resource show pci/0000:08:00.0: name local_max_SFs size 508 unit entry name external_max_SFs size 508 unit entry pci/0000:08:00.1: name local_max_SFs size 508 unit entry name external_max_SFs size 508 unit entry Signed-off-by: Or Har-Toov Reviewed-by: Shay Drori Reviewed-by: Moshe Shemesh Reviewed-by: Jiri Pirko Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20260407194107.148063-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/devlink.yaml | 6 ++- net/devlink/netlink_gen.c | 20 +++++++-- net/devlink/netlink_gen.h | 4 +- net/devlink/resource.c | 77 ++++++++++++++++++++++++++++++++ 4 files changed, 102 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index b495d56b9137..c423e049c7bd 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -1764,13 +1764,17 @@ operations: - bus-name - dev-name - index - reply: + reply: &resource-dump-reply value: 36 attributes: - bus-name - dev-name - index - resource-list + dump: + request: + attributes: *dev-id-attrs + reply: *resource-dump-reply - name: reload diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c index eb35e80e01d1..a5a47a4c6de8 100644 --- a/net/devlink/netlink_gen.c +++ b/net/devlink/netlink_gen.c @@ -305,7 +305,14 @@ static const struct nla_policy devlink_resource_set_nl_policy[DEVLINK_ATTR_INDEX }; /* DEVLINK_CMD_RESOURCE_DUMP - do */ -static const struct nla_policy 
devlink_resource_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { +static const struct nla_policy devlink_resource_dump_do_nl_policy[DEVLINK_ATTR_INDEX + 1] = { + [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, + [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), +}; + +/* DEVLINK_CMD_RESOURCE_DUMP - dump */ +static const struct nla_policy devlink_resource_dump_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), @@ -680,7 +687,7 @@ static const struct nla_policy devlink_notify_filter_set_nl_policy[DEVLINK_ATTR_ }; /* Ops table for devlink */ -const struct genl_split_ops devlink_nl_ops[74] = { +const struct genl_split_ops devlink_nl_ops[75] = { { .cmd = DEVLINK_CMD_GET, .validate = GENL_DONT_VALIDATE_STRICT, @@ -958,10 +965,17 @@ const struct genl_split_ops devlink_nl_ops[74] = { .pre_doit = devlink_nl_pre_doit, .doit = devlink_nl_resource_dump_doit, .post_doit = devlink_nl_post_doit, - .policy = devlink_resource_dump_nl_policy, + .policy = devlink_resource_dump_do_nl_policy, .maxattr = DEVLINK_ATTR_INDEX, .flags = GENL_CMD_CAP_DO, }, + { + .cmd = DEVLINK_CMD_RESOURCE_DUMP, + .dumpit = devlink_nl_resource_dump_dumpit, + .policy = devlink_resource_dump_dump_nl_policy, + .maxattr = DEVLINK_ATTR_INDEX, + .flags = GENL_CMD_CAP_DUMP, + }, { .cmd = DEVLINK_CMD_RELOAD, .validate = GENL_DONT_VALIDATE_STRICT, diff --git a/net/devlink/netlink_gen.h b/net/devlink/netlink_gen.h index 2817d53a0eba..d79f6a0888f6 100644 --- a/net/devlink/netlink_gen.h +++ b/net/devlink/netlink_gen.h @@ -18,7 +18,7 @@ extern const struct nla_policy devlink_dl_rate_tc_bws_nl_policy[DEVLINK_RATE_TC_ extern const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_ID_FLASH + 1]; /* Ops table for devlink */ 
-extern const struct genl_split_ops devlink_nl_ops[74]; +extern const struct genl_split_ops devlink_nl_ops[75]; int devlink_nl_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); @@ -80,6 +80,8 @@ int devlink_nl_dpipe_table_counters_set_doit(struct sk_buff *skb, struct genl_info *info); int devlink_nl_resource_set_doit(struct sk_buff *skb, struct genl_info *info); int devlink_nl_resource_dump_doit(struct sk_buff *skb, struct genl_info *info); +int devlink_nl_resource_dump_dumpit(struct sk_buff *skb, + struct netlink_callback *cb); int devlink_nl_reload_doit(struct sk_buff *skb, struct genl_info *info); int devlink_nl_param_get_doit(struct sk_buff *skb, struct genl_info *info); int devlink_nl_param_get_dumpit(struct sk_buff *skb, diff --git a/net/devlink/resource.c b/net/devlink/resource.c index f3014ec425c4..02fb36e25c52 100644 --- a/net/devlink/resource.c +++ b/net/devlink/resource.c @@ -223,6 +223,31 @@ nla_put_failure: return -EMSGSIZE; } +static int devlink_resource_list_fill(struct sk_buff *skb, + struct devlink *devlink, + struct list_head *resource_list_head, + int *idx) +{ + struct devlink_resource *resource; + int i = 0; + int err; + + list_for_each_entry(resource, resource_list_head, list) { + if (i < *idx) { + i++; + continue; + } + err = devlink_resource_put(devlink, skb, resource); + if (err) { + *idx = i; + return err; + } + i++; + } + *idx = 0; + return 0; +} + static int devlink_resource_fill(struct genl_info *info, enum devlink_command cmd, int flags) { @@ -302,6 +327,58 @@ int devlink_nl_resource_dump_doit(struct sk_buff *skb, struct genl_info *info) return devlink_resource_fill(info, DEVLINK_CMD_RESOURCE_DUMP, 0); } +static int +devlink_nl_resource_dump_one(struct sk_buff *skb, struct devlink *devlink, + struct netlink_callback *cb, int flags) +{ + struct devlink_nl_dump_state *state = devlink_dump_state(cb); + struct nlattr *resources_attr; + int start_idx = state->idx; + void *hdr; + int err; + + if 
(list_empty(&devlink->resource_list)) + return 0; + + err = -EMSGSIZE; + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, + &devlink_nl_family, flags, DEVLINK_CMD_RESOURCE_DUMP); + if (!hdr) + return err; + + if (devlink_nl_put_handle(skb, devlink)) + goto nla_put_failure; + + resources_attr = nla_nest_start_noflag(skb, DEVLINK_ATTR_RESOURCE_LIST); + if (!resources_attr) + goto nla_put_failure; + + err = devlink_resource_list_fill(skb, devlink, + &devlink->resource_list, &state->idx); + if (err) { + if (state->idx == start_idx) + goto resource_list_cancel; + nla_nest_end(skb, resources_attr); + genlmsg_end(skb, hdr); + return err; + } + nla_nest_end(skb, resources_attr); + genlmsg_end(skb, hdr); + return 0; + +resource_list_cancel: + nla_nest_cancel(skb, resources_attr); +nla_put_failure: + genlmsg_cancel(skb, hdr); + return err; +} + +int devlink_nl_resource_dump_dumpit(struct sk_buff *skb, + struct netlink_callback *cb) +{ + return devlink_nl_dumpit(skb, cb, devlink_nl_resource_dump_one); +} + int devlink_resources_validate(struct devlink *devlink, struct devlink_resource *resource, struct genl_info *info) -- cgit v1.2.3 From 7511ff14f30d264691ec493b09111404aed4aaf4 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Tue, 7 Apr 2026 22:41:02 +0300 Subject: devlink: Add port-specific option to resource dump doit Allow querying devlink resources per-port via the resource-dump doit handler. When a port-index attribute is provided, only that port's resources are returned. When no port-index is given, only device-level resources are returned, preserving backward compatibility. 
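The selection rule described above can be sketched in a few lines of userspace C. This is only an illustrative model, not the kernel code: the `sketch_*` types and functions are hypothetical stand-ins, the resource lists are reduced to a simple count, and returning the number of resources on success is an assumption made purely so the result can be inspected. The two error codes mirror the ones used by the doit handler in this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the real layout). */
struct sketch_res_list {
    int nr_resources;
};

struct sketch_port {
    unsigned int index;
    struct sketch_res_list resources;
};

struct sketch_dev {
    struct sketch_res_list resources;
};

/* Use the port's list when a port was resolved, else the device's list. */
const struct sketch_res_list *
sketch_select_resource_list(const struct sketch_dev *dev,
                            const struct sketch_port *port)
{
    return port ? &port->resources : &dev->resources;
}

/*
 * Model of the doit path: a port-index that does not resolve to a port
 * fails with -ENODEV, an empty list fails with -EOPNOTSUPP, otherwise
 * the number of resources that would be dumped is returned (assumption).
 */
int sketch_resource_dump_do(const struct sketch_dev *dev,
                            const struct sketch_port *port,
                            bool port_index_given)
{
    const struct sketch_res_list *list;

    if (port_index_given && !port)
        return -ENODEV;

    list = sketch_select_resource_list(dev, port);
    if (list->nr_resources == 0)
        return -EOPNOTSUPP;

    return list->nr_resources;
}
```

Because a request without port-index never looks at any port list, existing users of the doit handler keep seeing device-level resources only.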
Signed-off-by: Or Har-Toov Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20260407194107.148063-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/devlink.yaml | 4 +++- net/devlink/netlink_gen.c | 3 ++- net/devlink/netlink_gen.h | 4 ++-- net/devlink/resource.c | 20 +++++++++++++++++--- 4 files changed, 24 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index c423e049c7bd..34aa81ba689e 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -1757,19 +1757,21 @@ operations: attribute-set: devlink dont-validate: [strict] do: - pre: devlink-nl-pre-doit + pre: devlink-nl-pre-doit-port-optional post: devlink-nl-post-doit request: attributes: - bus-name - dev-name - index + - port-index reply: &resource-dump-reply value: 36 attributes: - bus-name - dev-name - index + - port-index - resource-list dump: request: diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c index a5a47a4c6de8..9cc372d9ee41 100644 --- a/net/devlink/netlink_gen.c +++ b/net/devlink/netlink_gen.c @@ -309,6 +309,7 @@ static const struct nla_policy devlink_resource_dump_do_nl_policy[DEVLINK_ATTR_I [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), + [DEVLINK_ATTR_PORT_INDEX] = { .type = NLA_U32, }, }; /* DEVLINK_CMD_RESOURCE_DUMP - dump */ @@ -962,7 +963,7 @@ const struct genl_split_ops devlink_nl_ops[75] = { { .cmd = DEVLINK_CMD_RESOURCE_DUMP, .validate = GENL_DONT_VALIDATE_STRICT, - .pre_doit = devlink_nl_pre_doit, + .pre_doit = devlink_nl_pre_doit_port_optional, .doit = devlink_nl_resource_dump_doit, .post_doit = devlink_nl_post_doit, .policy = devlink_resource_dump_do_nl_policy, diff --git a/net/devlink/netlink_gen.h 
b/net/devlink/netlink_gen.h index d79f6a0888f6..20034b0929a8 100644 --- a/net/devlink/netlink_gen.h +++ b/net/devlink/netlink_gen.h @@ -24,11 +24,11 @@ int devlink_nl_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); int devlink_nl_pre_doit_port(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); -int devlink_nl_pre_doit_dev_lock(const struct genl_split_ops *ops, - struct sk_buff *skb, struct genl_info *info); int devlink_nl_pre_doit_port_optional(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); +int devlink_nl_pre_doit_dev_lock(const struct genl_split_ops *ops, + struct sk_buff *skb, struct genl_info *info); void devlink_nl_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb, struct genl_info *info); diff --git a/net/devlink/resource.c b/net/devlink/resource.c index 7984eda63eb6..bf5221fb3e64 100644 --- a/net/devlink/resource.c +++ b/net/devlink/resource.c @@ -251,8 +251,10 @@ static int devlink_resource_list_fill(struct sk_buff *skb, static int devlink_resource_fill(struct genl_info *info, enum devlink_command cmd, int flags) { + struct devlink_port *devlink_port = info->user_ptr[1]; struct devlink *devlink = info->user_ptr[0]; struct devlink_resource *resource; + struct list_head *resource_list; struct nlattr *resources_attr; struct sk_buff *skb = NULL; struct nlmsghdr *nlh; @@ -261,7 +263,9 @@ static int devlink_resource_fill(struct genl_info *info, int i; int err; - resource = list_first_entry(&devlink->resource_list, + resource_list = devlink_port ? 
+ &devlink_port->resource_list : &devlink->resource_list; + resource = list_first_entry(resource_list, struct devlink_resource, list); start_again: err = devlink_nl_msg_reply_and_new(&skb, info); @@ -277,6 +281,9 @@ start_again: if (devlink_nl_put_handle(skb, devlink)) goto nla_put_failure; + if (devlink_port && + nla_put_u32(skb, DEVLINK_ATTR_PORT_INDEX, devlink_port->index)) + goto nla_put_failure; resources_attr = nla_nest_start_noflag(skb, DEVLINK_ATTR_RESOURCE_LIST); @@ -285,7 +292,7 @@ start_again: incomplete = false; i = 0; - list_for_each_entry_from(resource, &devlink->resource_list, list) { + list_for_each_entry_from(resource, resource_list, list) { err = devlink_resource_put(devlink, skb, resource); if (err) { if (!i) @@ -319,9 +326,16 @@ err_resource_put: int devlink_nl_resource_dump_doit(struct sk_buff *skb, struct genl_info *info) { + struct devlink_port *devlink_port = info->user_ptr[1]; struct devlink *devlink = info->user_ptr[0]; + struct list_head *resource_list; + + if (info->attrs[DEVLINK_ATTR_PORT_INDEX] && !devlink_port) + return -ENODEV; - if (list_empty(&devlink->resource_list)) + resource_list = devlink_port ? + &devlink_port->resource_list : &devlink->resource_list; + if (list_empty(resource_list)) return -EOPNOTSUPP; return devlink_resource_fill(info, DEVLINK_CMD_RESOURCE_DUMP, 0); -- cgit v1.2.3 From 170e160a0e7c6396f74279d922bee90902bd16f6 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Tue, 7 Apr 2026 22:41:04 +0300 Subject: devlink: Document port-level resources and full dump Document the port-level resource support and the option to dump all resources, including both device-level and port-level entries. 
Signed-off-by: Or Har-Toov Reviewed-by: Shay Drori Reviewed-by: Moshe Shemesh Reviewed-by: Jiri Pirko Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20260407194107.148063-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- .../networking/devlink/devlink-resource.rst | 35 ++++++++++++++++++++++ 1 file changed, 35 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/devlink/devlink-resource.rst b/Documentation/networking/devlink/devlink-resource.rst index 3d5ae51e65a2..9839c1661315 100644 --- a/Documentation/networking/devlink/devlink-resource.rst +++ b/Documentation/networking/devlink/devlink-resource.rst @@ -74,3 +74,38 @@ attribute, which represents the pending change in size. For example: Note that changes in resource size may require a device reload to properly take effect. + +Port-level Resources and Full Dump +================================== + +In addition to device-level resources, ``devlink`` also supports port-level +resources. These resources are associated with a specific devlink port rather +than the device as a whole. + +To list resources for all devlink devices and ports: + +.. code:: shell + + $ devlink resource show + pci/0000:03:00.0: + name max_local_SFs size 128 unit entry dpipe_tables none + name max_external_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.0/196608: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.0/196609: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1: + name max_local_SFs size 128 unit entry dpipe_tables none + name max_external_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1/196708: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1/196709: + name max_SFs size 128 unit entry dpipe_tables none + +To show resources for a specific port: + +.. 
code:: shell + + $ devlink resource show pci/0000:03:00.0/196608 + pci/0000:03:00.0/196608: + name max_SFs size 128 unit entry dpipe_tables none -- cgit v1.2.3 From 1bc45341a6ea4009ee9f2fbca9096b33a9ef71a2 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Tue, 7 Apr 2026 22:41:05 +0300 Subject: devlink: Add resource scope filtering to resource dump Allow filtering the resource dump to device-level or port-level resources using the 'scope' option. Example - dump only device-level resources: $ devlink resource show scope dev pci/0000:03:00.0: name max_local_SFs size 128 unit entry dpipe_tables none name max_external_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1: name max_local_SFs size 128 unit entry dpipe_tables none name max_external_SFs size 128 unit entry dpipe_tables none Example - dump only port-level resources: $ devlink resource show scope port pci/0000:03:00.0/196608: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.0/196609: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1/196708: name max_SFs size 128 unit entry dpipe_tables none pci/0000:03:00.1/196709: name max_SFs size 128 unit entry dpipe_tables none Signed-off-by: Or Har-Toov Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20260407194107.148063-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/devlink.yaml | 24 +++++++++++++++++++++++- include/uapi/linux/devlink.h | 11 +++++++++++ net/devlink/netlink_gen.c | 5 +++-- net/devlink/resource.c | 19 ++++++++++++++++++- 4 files changed, 55 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml index 34aa81ba689e..247b147d689f 100644 --- a/Documentation/netlink/specs/devlink.yaml +++ b/Documentation/netlink/specs/devlink.yaml @@ -157,6 +157,14 @@ definitions: entries: - name: entry + - + type: enum + name: resource-scope + entries: + - + name: 
dev + - + name: port - type: enum name: reload-action @@ -873,6 +881,16 @@ attribute-sets: doc: Unique devlink instance index. checks: max: u32-max + - + name: resource-scope-mask + type: u32 + enum: resource-scope + enum-as-flags: true + doc: | + Bitmask selecting which resource classes to include in a + resource-dump response. Bit 0 (dev) selects device-level + resources; bit 1 (port) selects port-level resources. + When absent all classes are returned. - name: dl-dev-stats subset-of: devlink @@ -1775,7 +1793,11 @@ operations: - resource-list dump: request: - attributes: *dev-id-attrs + attributes: + - bus-name + - dev-name + - index + - resource-scope-mask reply: *resource-dump-reply - diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index 7de2d8cc862f..0b165eac7619 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -645,6 +645,7 @@ enum devlink_attr { DEVLINK_ATTR_PARAM_RESET_DEFAULT, /* flag */ DEVLINK_ATTR_INDEX, /* uint */ + DEVLINK_ATTR_RESOURCE_SCOPE_MASK, /* u32 */ /* Add new attributes above here, update the spec in * Documentation/netlink/specs/devlink.yaml and re-generate @@ -704,6 +705,16 @@ enum devlink_resource_unit { DEVLINK_RESOURCE_UNIT_ENTRY, }; +enum devlink_resource_scope { + DEVLINK_RESOURCE_SCOPE_DEV_BIT, + DEVLINK_RESOURCE_SCOPE_PORT_BIT, +}; + +#define DEVLINK_RESOURCE_SCOPE_DEV \ + _BITUL(DEVLINK_RESOURCE_SCOPE_DEV_BIT) +#define DEVLINK_RESOURCE_SCOPE_PORT \ + _BITUL(DEVLINK_RESOURCE_SCOPE_PORT_BIT) + enum devlink_port_fn_attr_cap { DEVLINK_PORT_FN_ATTR_CAP_ROCE_BIT, DEVLINK_PORT_FN_ATTR_CAP_MIGRATABLE_BIT, diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c index 9cc372d9ee41..81899786fd98 100644 --- a/net/devlink/netlink_gen.c +++ b/net/devlink/netlink_gen.c @@ -313,10 +313,11 @@ static const struct nla_policy devlink_resource_dump_do_nl_policy[DEVLINK_ATTR_I }; /* DEVLINK_CMD_RESOURCE_DUMP - dump */ -static const struct nla_policy 
devlink_resource_dump_dump_nl_policy[DEVLINK_ATTR_INDEX + 1] = { +static const struct nla_policy devlink_resource_dump_dump_nl_policy[DEVLINK_ATTR_RESOURCE_SCOPE_MASK + 1] = { [DEVLINK_ATTR_BUS_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING, }, [DEVLINK_ATTR_INDEX] = NLA_POLICY_FULL_RANGE(NLA_UINT, &devlink_attr_index_range), + [DEVLINK_ATTR_RESOURCE_SCOPE_MASK] = NLA_POLICY_MASK(NLA_U32, 0x3), }; /* DEVLINK_CMD_RELOAD - do */ @@ -974,7 +975,7 @@ const struct genl_split_ops devlink_nl_ops[75] = { .cmd = DEVLINK_CMD_RESOURCE_DUMP, .dumpit = devlink_nl_resource_dump_dumpit, .policy = devlink_resource_dump_dump_nl_policy, - .maxattr = DEVLINK_ATTR_INDEX, + .maxattr = DEVLINK_ATTR_RESOURCE_SCOPE_MASK, .flags = GENL_CMD_CAP_DUMP, }, { diff --git a/net/devlink/resource.c b/net/devlink/resource.c index bf5221fb3e64..3d2f42bc2fb5 100644 --- a/net/devlink/resource.c +++ b/net/devlink/resource.c @@ -398,11 +398,25 @@ devlink_nl_resource_dump_one(struct sk_buff *skb, struct devlink *devlink, struct netlink_callback *cb, int flags) { struct devlink_nl_dump_state *state = devlink_dump_state(cb); + const struct genl_info *info = genl_info_dump(cb); struct devlink_port *devlink_port; + struct nlattr *scope_attr = NULL; unsigned long port_idx; + u32 scope = 0; int err; - if (!state->port_ctx.index_valid) { + if (info->attrs && info->attrs[DEVLINK_ATTR_RESOURCE_SCOPE_MASK]) { + scope_attr = info->attrs[DEVLINK_ATTR_RESOURCE_SCOPE_MASK]; + scope = nla_get_u32(scope_attr); + if (!scope) { + NL_SET_ERR_MSG_ATTR(info->extack, scope_attr, + "empty resource scope selection"); + return -EINVAL; + } + } + + if (!state->port_ctx.index_valid && + (!scope || (scope & DEVLINK_RESOURCE_SCOPE_DEV))) { err = devlink_resource_dump_fill_one(skb, devlink, NULL, cb, flags, &state->idx); if (err) @@ -410,6 +424,8 @@ devlink_nl_resource_dump_one(struct sk_buff *skb, struct devlink *devlink, state->idx = 0; } + if (scope && !(scope & DEVLINK_RESOURCE_SCOPE_PORT)) 
+ goto out; /* Check in case port was removed between dump callbacks. */ if (state->port_ctx.index_valid && !xa_load(&devlink->ports, state->port_ctx.index)) @@ -425,6 +441,7 @@ devlink_nl_resource_dump_one(struct sk_buff *skb, struct devlink *devlink, } state->idx = 0; } +out: state->port_ctx.index_valid = false; state->port_ctx.index = 0; return 0; -- cgit v1.2.3 From 78c327c1728de377020672d429250c80a8a13723 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Tue, 7 Apr 2026 22:41:07 +0300 Subject: devlink: Document resource scope filtering Document the scope parameter for devlink resource show, which allows filtering the dump to device-level or port-level resources only. Signed-off-by: Or Har-Toov Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20260407194107.148063-13-tariqt@nvidia.com Signed-off-by: Jakub Kicinski --- .../networking/devlink/devlink-resource.rst | 35 ++++++++++++++++++++++ 1 file changed, 35 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/devlink/devlink-resource.rst b/Documentation/networking/devlink/devlink-resource.rst index 9839c1661315..47eec8f875b4 100644 --- a/Documentation/networking/devlink/devlink-resource.rst +++ b/Documentation/networking/devlink/devlink-resource.rst @@ -109,3 +109,38 @@ To show resources for a specific port: $ devlink resource show pci/0000:03:00.0/196608 pci/0000:03:00.0/196608: name max_SFs size 128 unit entry dpipe_tables none + +Resource Scope Filtering +======================== + +When dumping resources for all devices, ``devlink resource show`` accepts +an optional ``scope`` parameter to restrict the response to device-level +resources, port-level resources, or both (the default). + +To dump only device-level resources across all devices: + +.. 
code:: shell + + $ devlink resource show scope dev + pci/0000:03:00.0: + name max_local_SFs size 128 unit entry dpipe_tables none + name max_external_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1: + name max_local_SFs size 128 unit entry dpipe_tables none + name max_external_SFs size 128 unit entry dpipe_tables none + +To dump only port-level resources across all devices: + +.. code:: shell + + $ devlink resource show scope port + pci/0000:03:00.0/196608: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.0/196609: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1/196708: + name max_SFs size 128 unit entry dpipe_tables none + pci/0000:03:00.1/196709: + name max_SFs size 128 unit entry dpipe_tables none + +Note that port-level resources are read-only. -- cgit v1.2.3 From 7789c6bb76acf21539c2c74b0cc869bb57de99e6 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 3 Apr 2026 01:10:18 +0200 Subject: net: Add queue-create operation Add a ynl netdev family operation called queue-create that creates a new queue on a netdevice: name: queue-create attribute-set: queue flags: [admin-perm] do: request: attributes: - ifindex - type - lease reply: &queue-create-op attributes: - id This is a generic operation such that it can be extended for various use cases in future. Right now it is mandatory to specify ifindex, the queue type which is enforced to rx and a lease. The newly created queue id is returned to the caller. A queue from a virtual device can have a lease which refers to another queue from a physical device. This is useful for memory providers and AF_XDP operations which take an ifindex and queue id to allow applications to bind against virtual devices in containers. The lease couples both queues together and allows to proxy the operations from a virtual device in a container to the physical device. 
In the future, the nested lease attribute can be lifted and made optional for other use cases such as dynamic queue creation for physical netdevs. Omitting the lease and specifying a physical device as the ifindex would then imply that a real queue needs to be allocated. Similarly, the queue type enforcement to rx can then be lifted as well to support tx.

An early implementation had only driver-specific integration [0], but so that other virtual devices can reuse it, it makes sense to have this as a generic API in core net.

For leasing queues, the virtual netdev must have real_num_rx_queues less than num_rx_queues at the time of calling queue-create. The queue-type must be rx, as only rx queues are supported for leasing for now. We also enforce that the queue-create ifindex must point to a virtual device, and that the nested lease attribute's ifindex must point to a physical device.

The nested lease attribute set contains an optional netns-id attribute, which can specify a netns-id relative to the caller's netns. It requires CAP_NET_ADMIN, and if the netns-id attribute is not specified, the lease ifindex will be retrieved from the current netns. It is modeled as an s32 type, as done elsewhere in the stack.
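The leasing preconditions listed above can be summarised as a small check function. The following is a userspace sketch under stated assumptions, not the kernel implementation: every `sketch_*` name is hypothetical, the virtual/physical distinction is reduced to a boolean, and returning -EINVAL for each violated rule is an assumption (the actual handler may use different error codes and extack messages).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical queue types for this sketch only. */
enum sketch_queue_type {
    SKETCH_QUEUE_RX,
    SKETCH_QUEUE_TX,
};

/* Hypothetical, heavily reduced view of a netdevice. */
struct sketch_netdev {
    bool is_virtual;
    unsigned int real_num_rx_queues;
    unsigned int num_rx_queues;
};

/*
 * Returns 0 when a queue may be created on 'virt' and leased from
 * 'phys', or -EINVAL (assumed error code) when a rule is violated.
 */
int sketch_queue_create_check(const struct sketch_netdev *virt,
                              const struct sketch_netdev *phys,
                              enum sketch_queue_type type)
{
    /* Only rx queues can be leased for now. */
    if (type != SKETCH_QUEUE_RX)
        return -EINVAL;

    /* The queue-create ifindex must refer to a virtual device. */
    if (!virt->is_virtual)
        return -EINVAL;

    /* The lease ifindex must refer to a physical device. */
    if (phys->is_virtual)
        return -EINVAL;

    /* There must be room for one more real rx queue. */
    if (virt->real_num_rx_queues >= virt->num_rx_queues)
        return -EINVAL;

    return 0;
}
```

Keeping the checks in one place like this would also make it easy to relax them later, for example once tx queues or lease-less creation on physical devices are supported.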
Signed-off-by: Daniel Borkmann Co-developed-by: David Wei Signed-off-by: David Wei Acked-by: Stanislav Fomichev Reviewed-by: Nikolay Aleksandrov Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0] Link: https://patch.msgid.link/20260402231031.447597-2-daniel@iogearbox.net Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/netdev.yaml | 46 +++++++++++++++++++++++++++++++++ include/uapi/linux/netdev.h | 11 ++++++++ net/core/netdev-genl-gen.c | 20 ++++++++++++++ net/core/netdev-genl-gen.h | 2 ++ net/core/netdev-genl.c | 5 ++++ tools/include/uapi/linux/netdev.h | 11 ++++++++ 6 files changed, 95 insertions(+) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 596c306ce52b..b93beb247a11 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -339,6 +339,15 @@ attribute-sets: doc: XSK information for this queue, if any. type: nest nested-attributes: xsk-info + - + name: lease + doc: | + A queue from a virtual device can have a lease which refers to + another queue from a physical device. This is useful for memory + providers and AF_XDP operations which take an ifindex and queue id + to allow applications to bind against virtual devices in containers. + type: nest + nested-attributes: lease - name: qstats doc: | @@ -537,6 +546,26 @@ attribute-sets: name: id - name: type + - + name: lease + attributes: + - + name: ifindex + doc: The netdev ifindex to lease the queue from. + type: u32 + checks: + min: 1 + - + name: queue + doc: The netdev queue to lease from. + type: nest + nested-attributes: queue-id + - + name: netns-id + doc: The network namespace id of the netdev. 
+ type: s32 + checks: + min: 0 - name: dmabuf attributes: @@ -686,6 +715,7 @@ operations: - dmabuf - io-uring - xsk + - lease dump: request: attributes: @@ -797,6 +827,22 @@ operations: reply: attributes: - id + - + name: queue-create + doc: | + Create a new queue for the given netdevice. Whether this operation + is supported depends on the device and the driver. + attribute-set: queue + flags: [admin-perm] + do: + request: + attributes: + - ifindex + - type + - lease + reply: &queue-create-op + attributes: + - id kernel-family: headers: ["net/netdev_netlink.h"] diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index e0b579a1df4f..7df1056a35fd 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -160,6 +160,7 @@ enum { NETDEV_A_QUEUE_DMABUF, NETDEV_A_QUEUE_IO_URING, NETDEV_A_QUEUE_XSK, + NETDEV_A_QUEUE_LEASE, __NETDEV_A_QUEUE_MAX, NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1) @@ -202,6 +203,15 @@ enum { NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) }; +enum { + NETDEV_A_LEASE_IFINDEX = 1, + NETDEV_A_LEASE_QUEUE, + NETDEV_A_LEASE_NETNS_ID, + + __NETDEV_A_LEASE_MAX, + NETDEV_A_LEASE_MAX = (__NETDEV_A_LEASE_MAX - 1) +}; + enum { NETDEV_A_DMABUF_IFINDEX = 1, NETDEV_A_DMABUF_QUEUES, @@ -228,6 +238,7 @@ enum { NETDEV_CMD_BIND_RX, NETDEV_CMD_NAPI_SET, NETDEV_CMD_BIND_TX, + NETDEV_CMD_QUEUE_CREATE, __NETDEV_CMD_MAX, NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) diff --git a/net/core/netdev-genl-gen.c b/net/core/netdev-genl-gen.c index ba673e81716f..81aecb5d3bc5 100644 --- a/net/core/netdev-genl-gen.c +++ b/net/core/netdev-genl-gen.c @@ -28,6 +28,12 @@ static const struct netlink_range_validation netdev_a_napi_defer_hard_irqs_range }; /* Common nested types */ +const struct nla_policy netdev_lease_nl_policy[NETDEV_A_LEASE_NETNS_ID + 1] = { + [NETDEV_A_LEASE_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1), + [NETDEV_A_LEASE_QUEUE] = NLA_POLICY_NESTED(netdev_queue_id_nl_policy), + [NETDEV_A_LEASE_NETNS_ID] = NLA_POLICY_MIN(NLA_S32, 0), +}; 
+ const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFINDEX + 1] = { [NETDEV_A_PAGE_POOL_ID] = NLA_POLICY_FULL_RANGE(NLA_UINT, &netdev_a_page_pool_id_range), [NETDEV_A_PAGE_POOL_IFINDEX] = NLA_POLICY_FULL_RANGE(NLA_U32, &netdev_a_page_pool_ifindex_range), @@ -107,6 +113,13 @@ static const struct nla_policy netdev_bind_tx_nl_policy[NETDEV_A_DMABUF_FD + 1] [NETDEV_A_DMABUF_FD] = { .type = NLA_U32, }, }; +/* NETDEV_CMD_QUEUE_CREATE - do */ +static const struct nla_policy netdev_queue_create_nl_policy[NETDEV_A_QUEUE_LEASE + 1] = { + [NETDEV_A_QUEUE_IFINDEX] = NLA_POLICY_MIN(NLA_U32, 1), + [NETDEV_A_QUEUE_TYPE] = NLA_POLICY_MAX(NLA_U32, 1), + [NETDEV_A_QUEUE_LEASE] = NLA_POLICY_NESTED(netdev_lease_nl_policy), +}; + /* Ops table for netdev */ static const struct genl_split_ops netdev_nl_ops[] = { { @@ -205,6 +218,13 @@ static const struct genl_split_ops netdev_nl_ops[] = { .maxattr = NETDEV_A_DMABUF_FD, .flags = GENL_CMD_CAP_DO, }, + { + .cmd = NETDEV_CMD_QUEUE_CREATE, + .doit = netdev_nl_queue_create_doit, + .policy = netdev_queue_create_nl_policy, + .maxattr = NETDEV_A_QUEUE_LEASE, + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, + }, }; static const struct genl_multicast_group netdev_nl_mcgrps[] = { diff --git a/net/core/netdev-genl-gen.h b/net/core/netdev-genl-gen.h index cffc08517a41..d71b435d72c1 100644 --- a/net/core/netdev-genl-gen.h +++ b/net/core/netdev-genl-gen.h @@ -14,6 +14,7 @@ #include /* Common nested types */ +extern const struct nla_policy netdev_lease_nl_policy[NETDEV_A_LEASE_NETNS_ID + 1]; extern const struct nla_policy netdev_page_pool_info_nl_policy[NETDEV_A_PAGE_POOL_IFINDEX + 1]; extern const struct nla_policy netdev_queue_id_nl_policy[NETDEV_A_QUEUE_TYPE + 1]; @@ -36,6 +37,7 @@ int netdev_nl_qstats_get_dumpit(struct sk_buff *skb, int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info); int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info); int netdev_nl_bind_tx_doit(struct sk_buff *skb, 
struct genl_info *info); +int netdev_nl_queue_create_doit(struct sk_buff *skb, struct genl_info *info); enum { NETDEV_NLGRP_MGMT, diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 470fabbeacd9..aae75431858d 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -1120,6 +1120,11 @@ err_genlmsg_free: return err; } +int netdev_nl_queue_create_doit(struct sk_buff *skb, struct genl_info *info) +{ + return -EOPNOTSUPP; +} + void netdev_nl_sock_priv_init(struct netdev_nl_sock *priv) { INIT_LIST_HEAD(&priv->bindings); diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index e0b579a1df4f..7df1056a35fd 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -160,6 +160,7 @@ enum { NETDEV_A_QUEUE_DMABUF, NETDEV_A_QUEUE_IO_URING, NETDEV_A_QUEUE_XSK, + NETDEV_A_QUEUE_LEASE, __NETDEV_A_QUEUE_MAX, NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1) @@ -202,6 +203,15 @@ enum { NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) }; +enum { + NETDEV_A_LEASE_IFINDEX = 1, + NETDEV_A_LEASE_QUEUE, + NETDEV_A_LEASE_NETNS_ID, + + __NETDEV_A_LEASE_MAX, + NETDEV_A_LEASE_MAX = (__NETDEV_A_LEASE_MAX - 1) +}; + enum { NETDEV_A_DMABUF_IFINDEX = 1, NETDEV_A_DMABUF_QUEUES, @@ -228,6 +238,7 @@ enum { NETDEV_CMD_BIND_RX, NETDEV_CMD_NAPI_SET, NETDEV_CMD_BIND_TX, + NETDEV_CMD_QUEUE_CREATE, __NETDEV_CMD_MAX, NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) -- cgit v1.2.3 From d04686d9bc86432ea3008d5f358373d8466d1943 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 3 Apr 2026 01:10:19 +0200 Subject: net: Implement netdev_nl_queue_create_doit Implement netdev_nl_queue_create_doit which creates a new rx queue in a virtual netdev and then leases it to a rx queue in a physical netdev. 
Example with ynl client: # ynl --family netdev --output-json --do queue-create \ --json '{"ifindex": 8, "type": "rx", "lease": {"ifindex": 4, "queue": {"type": "rx", "id": 15}}}' {'id': 1} Note that the netdevice locking order is always from the virtual to the physical device. Signed-off-by: Daniel Borkmann Co-developed-by: David Wei Signed-off-by: David Wei Reviewed-by: Nikolay Aleksandrov Link: https://patch.msgid.link/20260402231031.447597-3-daniel@iogearbox.net Signed-off-by: Jakub Kicinski --- Documentation/networking/netdevices.rst | 6 ++ include/linux/netdevice.h | 9 +- include/net/netdev_queues.h | 19 +++- include/net/netdev_rx_queue.h | 15 ++- net/core/dev.c | 8 ++ net/core/dev.h | 5 + net/core/netdev-genl.c | 164 +++++++++++++++++++++++++++++++- net/core/netdev_queues.c | 62 ++++++++++++ net/core/netdev_rx_queue.c | 46 ++++++++- 9 files changed, 323 insertions(+), 11 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst index 35704d115312..83e28b96884f 100644 --- a/Documentation/networking/netdevices.rst +++ b/Documentation/networking/netdevices.rst @@ -329,6 +329,12 @@ by setting ``request_ops_lock`` to true. Code comments and docs refer to drivers which have ops called under the instance lock as "ops locked". See also the documentation of the ``lock`` member of struct net_device. +There is also a case of taking two per-netdev locks in sequence when netdev +queues are leased, that is, the netdev-scope lock is taken for both the +virtual and the physical device. To prevent deadlocks, the virtual device's +lock must always be acquired before the physical device's (see +``netdev_nl_queue_create_doit``). + In the future, there will be an option for individual drivers to opt out of using ``rtnl_lock`` and instead perform their control operations directly under the netdev instance lock. 
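The ordering rule documented in the netdevices.rst hunk above can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct fake_netdev`, `lease_lock_pair()` and `lease_unlock_pair()` are hypothetical names, and a pthread mutex merely stands in for the per-netdev instance lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct net_device; only the
 * pieces needed to show the ordering rule are modelled.
 */
struct fake_netdev {
	pthread_mutex_t lock;	/* models the per-netdev instance lock */
	bool is_virtual;	/* true for the device that gains the new queue */
};

/* Take both locks in the documented order: virtual device first,
 * physical device second. Returns false, without locking anything,
 * if the arguments are not a (virtual, physical) pair.
 */
static bool lease_lock_pair(struct fake_netdev *virt, struct fake_netdev *phys)
{
	if (!virt->is_virtual || phys->is_virtual)
		return false;

	pthread_mutex_lock(&virt->lock);
	pthread_mutex_lock(&phys->lock);
	return true;
}

/* Release in the reverse order of acquisition: physical first, then
 * virtual.
 */
static void lease_unlock_pair(struct fake_netdev *virt, struct fake_netdev *phys)
{
	pthread_mutex_unlock(&phys->lock);
	pthread_mutex_unlock(&virt->lock);
}
```

Because every caller goes through the same helper, two threads can never hold the two locks in opposite orders, which is the deadlock the documentation text warns about.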
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e15367373f7c..e8aa9cc4075d 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2561,7 +2561,14 @@ struct net_device { * Also protects some fields in: * struct napi_struct, struct netdev_queue, struct netdev_rx_queue * - * Ordering: take after rtnl_lock. + * Ordering: + * + * - take after rtnl_lock + * + * - for the case of netdev queue leasing, the netdev-scope lock is + * taken for both the virtual and the physical device; to prevent + * deadlocks, the virtual device's lock must always be acquired + * before the physical device's (see netdev_nl_queue_create_doit) */ struct mutex lock; diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index 95ed28212f4e..748b70552ed1 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -150,6 +150,11 @@ enum { * When NIC-wide config is changed the callback will * be invoked for all queues. * + * @ndo_queue_create: Create a new RX queue on a virtual device that will + * be paired with a physical device's queue via leasing. + * Return the new queue id on success, negative error + * on failure. + * * @supported_params: Bitmask of supported parameters, see QCFG_*. * * Note that @ndo_queue_mem_alloc and @ndo_queue_mem_free may be called while @@ -178,6 +183,8 @@ struct netdev_queue_mgmt_ops { struct netlink_ext_ack *extack); struct device * (*ndo_queue_get_dma_dev)(struct net_device *dev, int idx); + int (*ndo_queue_create)(struct net_device *dev, + struct netlink_ext_ack *extack); unsigned int supported_params; }; @@ -185,7 +192,7 @@ struct netdev_queue_mgmt_ops { void netdev_queue_config(struct net_device *dev, int rxq, struct netdev_queue_config *qcfg); -bool netif_rxq_has_unreadable_mp(struct net_device *dev, int idx); +bool netif_rxq_has_unreadable_mp(struct net_device *dev, unsigned int rxq_idx); /** * DOC: Lockless queue stopping / waking helpers. 
@@ -374,5 +381,11 @@ static inline unsigned int netif_xmit_timeout_ms(struct netdev_queue *txq) }) struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx); - -#endif +bool netdev_can_create_queue(const struct net_device *dev, + struct netlink_ext_ack *extack); +bool netdev_can_lease_queue(const struct net_device *dev, + struct netlink_ext_ack *extack); +bool netdev_queue_busy(struct net_device *dev, unsigned int idx, + enum netdev_queue_type type, + struct netlink_ext_ack *extack); +#endif /* _LINUX_NET_QUEUES_H */ diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h index 08f81329fc11..1d41c253f0a3 100644 --- a/include/net/netdev_rx_queue.h +++ b/include/net/netdev_rx_queue.h @@ -31,6 +31,14 @@ struct netdev_rx_queue { struct napi_struct *napi; struct netdev_queue_config qcfg; struct pp_memory_provider_params mp_params; + + /* If a queue is leased, then the lease pointer is always + * valid. From the physical device it points to the virtual + * queue, and from the virtual device it points to the + * physical queue. 
+ */ + struct netdev_rx_queue *lease; + netdevice_tracker lease_tracker; } ____cacheline_aligned_in_smp; /* @@ -60,5 +68,8 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) } int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq); - -#endif +void netdev_rx_queue_lease(struct netdev_rx_queue *rxq_dst, + struct netdev_rx_queue *rxq_src); +void netdev_rx_queue_unlease(struct netdev_rx_queue *rxq_dst, + struct netdev_rx_queue *rxq_src); +#endif /* _LINUX_NETDEV_RX_QUEUE_H */ diff --git a/net/core/dev.c b/net/core/dev.c index 5a31f9d2128c..cc7bcac892af 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1121,6 +1121,14 @@ netdev_get_by_index_lock_ops_compat(struct net *net, int ifindex) return __netdev_put_lock_ops_compat(dev, net); } +struct net_device * +netdev_put_lock(struct net_device *dev, struct net *net, + netdevice_tracker *tracker) +{ + netdev_tracker_free(dev, tracker); + return __netdev_put_lock(dev, net); +} + struct net_device * netdev_xa_find_lock(struct net *net, struct net_device *dev, unsigned long *index) diff --git a/net/core/dev.h b/net/core/dev.h index 781619e76b3e..6516ce2b5517 100644 --- a/net/core/dev.h +++ b/net/core/dev.h @@ -31,6 +31,8 @@ netdev_napi_by_id_lock(struct net *net, unsigned int napi_id); struct net_device *dev_get_by_napi_id(unsigned int napi_id); struct net_device *__netdev_put_lock(struct net_device *dev, struct net *net); +struct net_device *netdev_put_lock(struct net_device *dev, struct net *net, + netdevice_tracker *tracker); struct net_device * netdev_xa_find_lock(struct net *net, struct net_device *dev, unsigned long *index); @@ -96,6 +98,9 @@ int netdev_queue_config_validate(struct net_device *dev, int rxq_idx, struct netdev_queue_config *qcfg, struct netlink_ext_ack *extack); +bool netif_rxq_has_mp(struct net_device *dev, unsigned int rxq_idx); +bool netif_rxq_is_leased(struct net_device *dev, unsigned int rxq_idx); + /* netdev management, shared between various uAPI entry points */ struct 
netdev_name_node { struct hlist_node hlist; diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index aae75431858d..5d5e5b9a8af0 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -1122,7 +1122,169 @@ err_genlmsg_free: int netdev_nl_queue_create_doit(struct sk_buff *skb, struct genl_info *info) { - return -EOPNOTSUPP; + const int qmaxtype = ARRAY_SIZE(netdev_queue_id_nl_policy) - 1; + const int lmaxtype = ARRAY_SIZE(netdev_lease_nl_policy) - 1; + int err, ifindex, ifindex_lease, queue_id, queue_id_lease; + struct nlattr *qtb[ARRAY_SIZE(netdev_queue_id_nl_policy)]; + struct nlattr *ltb[ARRAY_SIZE(netdev_lease_nl_policy)]; + struct netdev_rx_queue *rxq, *rxq_lease; + struct net_device *dev, *dev_lease; + netdevice_tracker dev_tracker; + s32 netns_lease = -1; + struct nlattr *nest; + struct sk_buff *rsp; + struct net *net; + void *hdr; + + if (GENL_REQ_ATTR_CHECK(info, NETDEV_A_QUEUE_IFINDEX) || + GENL_REQ_ATTR_CHECK(info, NETDEV_A_QUEUE_TYPE) || + GENL_REQ_ATTR_CHECK(info, NETDEV_A_QUEUE_LEASE)) + return -EINVAL; + if (nla_get_u32(info->attrs[NETDEV_A_QUEUE_TYPE]) != + NETDEV_QUEUE_TYPE_RX) { + NL_SET_BAD_ATTR(info->extack, info->attrs[NETDEV_A_QUEUE_TYPE]); + return -EINVAL; + } + + ifindex = nla_get_u32(info->attrs[NETDEV_A_QUEUE_IFINDEX]); + + nest = info->attrs[NETDEV_A_QUEUE_LEASE]; + err = nla_parse_nested(ltb, lmaxtype, nest, + netdev_lease_nl_policy, info->extack); + if (err < 0) + return err; + if (NL_REQ_ATTR_CHECK(info->extack, nest, ltb, NETDEV_A_LEASE_IFINDEX) || + NL_REQ_ATTR_CHECK(info->extack, nest, ltb, NETDEV_A_LEASE_QUEUE)) + return -EINVAL; + if (ltb[NETDEV_A_LEASE_NETNS_ID]) { + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + netns_lease = nla_get_s32(ltb[NETDEV_A_LEASE_NETNS_ID]); + } + + ifindex_lease = nla_get_u32(ltb[NETDEV_A_LEASE_IFINDEX]); + + nest = ltb[NETDEV_A_LEASE_QUEUE]; + err = nla_parse_nested(qtb, qmaxtype, nest, + netdev_queue_id_nl_policy, info->extack); + if (err < 0) + return err; + if 
(NL_REQ_ATTR_CHECK(info->extack, nest, qtb, NETDEV_A_QUEUE_ID) || + NL_REQ_ATTR_CHECK(info->extack, nest, qtb, NETDEV_A_QUEUE_TYPE)) + return -EINVAL; + if (nla_get_u32(qtb[NETDEV_A_QUEUE_TYPE]) != NETDEV_QUEUE_TYPE_RX) { + NL_SET_BAD_ATTR(info->extack, qtb[NETDEV_A_QUEUE_TYPE]); + return -EINVAL; + } + + queue_id_lease = nla_get_u32(qtb[NETDEV_A_QUEUE_ID]); + + rsp = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!rsp) + return -ENOMEM; + + hdr = genlmsg_iput(rsp, info); + if (!hdr) { + err = -EMSGSIZE; + goto err_genlmsg_free; + } + + /* Locking order is always from the virtual to the physical device + * since this is also the same order when applications open the + * memory provider later on. + */ + dev = netdev_get_by_index_lock(genl_info_net(info), ifindex); + if (!dev) { + err = -ENODEV; + goto err_genlmsg_free; + } + if (!netdev_can_create_queue(dev, info->extack)) { + err = -EINVAL; + goto err_unlock_dev; + } + + net = genl_info_net(info); + if (netns_lease >= 0) { + net = get_net_ns_by_id(net, netns_lease); + if (!net) { + err = -ENONET; + goto err_unlock_dev; + } + } + + dev_lease = netdev_get_by_index(net, ifindex_lease, &dev_tracker, + GFP_KERNEL); + if (!dev_lease) { + err = -ENODEV; + goto err_put_netns; + } + if (!netdev_can_lease_queue(dev_lease, info->extack)) { + netdev_put(dev_lease, &dev_tracker); + err = -EINVAL; + goto err_put_netns; + } + + dev_lease = netdev_put_lock(dev_lease, net, &dev_tracker); + if (!dev_lease) { + err = -ENODEV; + goto err_put_netns; + } + if (queue_id_lease >= dev_lease->real_num_rx_queues) { + err = -ERANGE; + NL_SET_BAD_ATTR(info->extack, qtb[NETDEV_A_QUEUE_ID]); + goto err_unlock_dev_lease; + } + if (netdev_queue_busy(dev_lease, queue_id_lease, NETDEV_QUEUE_TYPE_RX, + info->extack)) { + err = -EBUSY; + goto err_unlock_dev_lease; + } + + rxq_lease = __netif_get_rx_queue(dev_lease, queue_id_lease); + rxq = __netif_get_rx_queue(dev, dev->real_num_rx_queues - 1); + + /* Leasing queues from different physical 
devices is currently + * not supported. Capabilities such as XDP features and DMA + * device may differ between physical devices, and computing + * a correct intersection for the virtual device is not yet + * implemented. + */ + if (rxq->lease && rxq->lease->dev != dev_lease) { + err = -EOPNOTSUPP; + NL_SET_ERR_MSG(info->extack, + "Leasing queues from different devices not supported"); + goto err_unlock_dev_lease; + } + + queue_id = dev->queue_mgmt_ops->ndo_queue_create(dev, info->extack); + if (queue_id < 0) { + err = queue_id; + goto err_unlock_dev_lease; + } + rxq = __netif_get_rx_queue(dev, queue_id); + + netdev_rx_queue_lease(rxq, rxq_lease); + + nla_put_u32(rsp, NETDEV_A_QUEUE_ID, queue_id); + genlmsg_end(rsp, hdr); + + netdev_unlock(dev_lease); + netdev_unlock(dev); + if (netns_lease >= 0) + put_net(net); + + return genlmsg_reply(rsp, info); + +err_unlock_dev_lease: + netdev_unlock(dev_lease); +err_put_netns: + if (netns_lease >= 0) + put_net(net); +err_unlock_dev: + netdev_unlock(dev); +err_genlmsg_free: + nlmsg_free(rsp); + return err; } void netdev_nl_sock_priv_init(struct netdev_nl_sock *priv) diff --git a/net/core/netdev_queues.c b/net/core/netdev_queues.c index 251f27a8307f..177401828e79 100644 --- a/net/core/netdev_queues.c +++ b/net/core/netdev_queues.c @@ -1,6 +1,10 @@ // SPDX-License-Identifier: GPL-2.0-or-later #include +#include +#include + +#include "dev.h" /** * netdev_queue_get_dma_dev() - get dma device for zero-copy operations @@ -25,3 +29,61 @@ struct device *netdev_queue_get_dma_dev(struct net_device *dev, int idx) return dma_dev && dma_dev->dma_mask ? 
dma_dev : NULL; } +bool netdev_can_create_queue(const struct net_device *dev, + struct netlink_ext_ack *extack) +{ + if (dev->dev.parent) { + NL_SET_ERR_MSG(extack, "Device is not a virtual device"); + return false; + } + if (!dev->queue_mgmt_ops || + !dev->queue_mgmt_ops->ndo_queue_create) { + NL_SET_ERR_MSG(extack, "Device does not support queue creation"); + return false; + } + if (dev->real_num_rx_queues < 1 || + dev->real_num_tx_queues < 1) { + NL_SET_ERR_MSG(extack, "Device must have at least one real queue"); + return false; + } + return true; +} + +bool netdev_can_lease_queue(const struct net_device *dev, + struct netlink_ext_ack *extack) +{ + if (!dev->dev.parent) { + NL_SET_ERR_MSG(extack, "Lease device is a virtual device"); + return false; + } + if (!netif_device_present(dev)) { + NL_SET_ERR_MSG(extack, "Lease device has been removed from the system"); + return false; + } + if (!dev->queue_mgmt_ops) { + NL_SET_ERR_MSG(extack, "Lease device does not support queue management operations"); + return false; + } + return true; +} + +bool netdev_queue_busy(struct net_device *dev, unsigned int idx, + enum netdev_queue_type type, + struct netlink_ext_ack *extack) +{ + if (xsk_get_pool_from_qid(dev, idx)) { + NL_SET_ERR_MSG(extack, "Device queue in use by AF_XDP"); + return true; + } + if (type == NETDEV_QUEUE_TYPE_TX) + return false; + if (netif_rxq_is_leased(dev, idx)) { + NL_SET_ERR_MSG(extack, "Device queue in use due to queue leasing"); + return true; + } + if (netif_rxq_has_mp(dev, idx)) { + NL_SET_ERR_MSG(extack, "Device queue in use by memory provider"); + return true; + } + return false; +} diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c index 668a90658f25..a1f23c2c96d4 100644 --- a/net/core/netdev_rx_queue.c +++ b/net/core/netdev_rx_queue.c @@ -10,15 +10,53 @@ #include "dev.h" #include "page_pool_priv.h" -/* See also page_pool_is_unreadable() */ -bool netif_rxq_has_unreadable_mp(struct net_device *dev, int idx) +void 
netdev_rx_queue_lease(struct netdev_rx_queue *rxq_dst, + struct netdev_rx_queue *rxq_src) { - struct netdev_rx_queue *rxq = __netif_get_rx_queue(dev, idx); + netdev_assert_locked(rxq_src->dev); + netdev_assert_locked(rxq_dst->dev); + + netdev_hold(rxq_src->dev, &rxq_src->lease_tracker, GFP_KERNEL); - return !!rxq->mp_params.mp_ops; + WRITE_ONCE(rxq_src->lease, rxq_dst); + WRITE_ONCE(rxq_dst->lease, rxq_src); +} + +void netdev_rx_queue_unlease(struct netdev_rx_queue *rxq_dst, + struct netdev_rx_queue *rxq_src) +{ + netdev_assert_locked(rxq_dst->dev); + netdev_assert_locked(rxq_src->dev); + + WRITE_ONCE(rxq_src->lease, NULL); + WRITE_ONCE(rxq_dst->lease, NULL); + + netdev_put(rxq_src->dev, &rxq_src->lease_tracker); +} + +bool netif_rxq_is_leased(struct net_device *dev, unsigned int rxq_idx) +{ + if (rxq_idx < dev->real_num_rx_queues) + return READ_ONCE(__netif_get_rx_queue(dev, rxq_idx)->lease); + return false; +} + +/* See also page_pool_is_unreadable() */ +bool netif_rxq_has_unreadable_mp(struct net_device *dev, unsigned int rxq_idx) +{ + if (rxq_idx < dev->real_num_rx_queues) + return __netif_get_rx_queue(dev, rxq_idx)->mp_params.mp_ops; + return false; } EXPORT_SYMBOL(netif_rxq_has_unreadable_mp); +bool netif_rxq_has_mp(struct net_device *dev, unsigned int rxq_idx) +{ + if (rxq_idx < dev->real_num_rx_queues) + return __netif_get_rx_queue(dev, rxq_idx)->mp_params.mp_priv; + return false; +} + static int netdev_rx_queue_reconfig(struct net_device *dev, unsigned int rxq_idx, struct netdev_queue_config *qcfg_old, -- cgit v1.2.3 From 48103896053828a8b4d25839a39aa8514071914a Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 3 Apr 2026 01:10:27 +0200 Subject: netkit: Add single device mode for netkit Add a single device mode for netkit instead of netkit pairs. The primary target for the paired devices is to connect network namespaces, of course, and support has been implemented in projects like Cilium [0]. 
For the rxq leasing the plan is to support two main scenarios related to single device mode: * For the use-case of io_uring zero-copy, the control plane can either set up a netkit pair where the peer device can perform rxq leasing which is then tied to the lifetime of the peer device, or the control plane can use a regular netkit pair to connect the hostns to a Pod/container and dynamically add/remove rxq leasing through a single device without having to interrupt the device pair. In the case of io_uring, the memory pool is used as skb non-linear pages, and thus the skb will go its way through the regular stack into netkit. Things like the netkit policy when no BPF is attached or skb scrubbing etc apply as-is in case the paired devices are used, or if the backend memory is tied to the single device and traffic goes through a paired device. * For the use-case of AF_XDP, the control plane needs to use netkit in the single device mode. The single device mode currently enforces only a pass policy when no BPF is attached, and does not yet support BPF link attachments for AF_XDP. skbs sent to that device get dropped at the moment. Given AF_XDP operates at a lower layer of the stack tying this to the netkit pair did not make sense. In future, the plan is to allow BPF at the XDP layer which can: i) process traffic coming from the AF_XDP application (e.g. QEMU with AF_XDP backend) to filter egress traffic or to push selected egress traffic up to the single netkit device to the local stack (e.g. DHCP requests), and ii) vice-versa skbs sent to the single netkit into the AF_XDP application (e.g. DHCP replies). Also, the control-plane can dynamically manage rxq leasing for the single netkit device without having to interrupt (e.g. down/up cycle) the main netkit pair for the Pod which has traffic going in and out. 
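The restrictions on single device mode can be summarised in a few lines of C. This is a simplified sketch modelled on the check that the netkit_new_link() hunk of this patch adds; `struct nk_request`, its field names and `nk_validate_request()` are hypothetical and only mirror the conditions visible in the diff.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors the two values of enum netkit_pairing from the patch. */
enum pairing { PAIRING_PAIR, PAIRING_SINGLE };

/* Hypothetical summary of a link-creation request. */
struct nk_request {
	enum pairing pairing;
	bool has_peer_ifinfo;	/* models the "tb != tbp" condition */
	bool seen_peer;		/* any peer info/scrub/policy attribute given */
	bool seen_scrub;	/* scrub attribute given for the primary */
	bool policy_is_pass;	/* primary policy is the pass policy */
};

/* A single device has no peer, so peer-related attributes, scrub
 * settings and any policy other than pass are rejected.
 */
static int nk_validate_request(const struct nk_request *req)
{
	if (req->pairing == PAIRING_SINGLE &&
	    (req->has_peer_ifinfo || req->seen_peer || req->seen_scrub ||
	     !req->policy_is_pass))
		return -EOPNOTSUPP;
	return 0;
}
```

Requests for the regular paired mode pass through this check unchanged, which matches the intent that existing netkit pair users are unaffected.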
Signed-off-by: Daniel Borkmann Co-developed-by: David Wei Signed-off-by: David Wei Reviewed-by: Jordan Rife Reviewed-by: Nikolay Aleksandrov Link: https://docs.cilium.io/en/stable/operations/performance/tuning/#netkit-device-mode [0] Link: https://patch.msgid.link/20260402231031.447597-11-daniel@iogearbox.net Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/rt-link.yaml | 11 +++ drivers/net/netkit.c | 131 +++++++++++++++++++------------ include/uapi/linux/if_link.h | 6 ++ 3 files changed, 99 insertions(+), 49 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/rt-link.yaml b/Documentation/netlink/specs/rt-link.yaml index df4b56beb818..fcb5aaf0926f 100644 --- a/Documentation/netlink/specs/rt-link.yaml +++ b/Documentation/netlink/specs/rt-link.yaml @@ -825,6 +825,13 @@ definitions: entries: - name: none - name: default + - + name: netkit-pairing + type: enum + enum-name: netkit-pairing + entries: + - name: pair + - name: single - name: ovpn-mode enum-name: ovpn-mode @@ -2299,6 +2306,10 @@ attribute-sets: - name: tailroom type: u16 + - + name: pairing + type: u32 + enum: netkit-pairing - name: linkinfo-ovpn-attrs name-prefix: ifla-ovpn- diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c index 5c0e01396e06..96c098a6db0d 100644 --- a/drivers/net/netkit.c +++ b/drivers/net/netkit.c @@ -26,6 +26,7 @@ struct netkit { __cacheline_group_begin(netkit_slowpath); enum netkit_mode mode; + enum netkit_pairing pair; bool primary; u32 headroom; __cacheline_group_end(netkit_slowpath); @@ -135,6 +136,10 @@ static int netkit_open(struct net_device *dev) struct netkit *nk = netkit_priv(dev); struct net_device *peer = rtnl_dereference(nk->peer); + if (nk->pair == NETKIT_DEVICE_SINGLE) { + netif_carrier_on(dev); + return 0; + } if (!peer) return -ENOTCONN; if (peer->flags & IFF_UP) { @@ -194,16 +199,17 @@ static void netkit_set_headroom(struct net_device *dev, int headroom) rcu_read_lock(); peer = rcu_dereference(nk->peer); - if 
(unlikely(!peer)) - goto out; - - nk2 = netkit_priv(peer); - nk->headroom = headroom; - headroom = max(nk->headroom, nk2->headroom); + if (!peer) { + nk->headroom = headroom; + dev->needed_headroom = headroom; + } else { + nk2 = netkit_priv(peer); + nk->headroom = headroom; + headroom = max(nk->headroom, nk2->headroom); - peer->needed_headroom = headroom; - dev->needed_headroom = headroom; -out: + peer->needed_headroom = headroom; + dev->needed_headroom = headroom; + } rcu_read_unlock(); } @@ -335,15 +341,17 @@ static int netkit_new_link(struct net_device *dev, enum netkit_scrub scrub_prim = NETKIT_SCRUB_DEFAULT; enum netkit_scrub scrub_peer = NETKIT_SCRUB_DEFAULT; struct nlattr *peer_tb[IFLA_MAX + 1], **tbp, *attr; + enum netkit_pairing pair = NETKIT_DEVICE_PAIR; enum netkit_action policy_prim = NETKIT_PASS; enum netkit_action policy_peer = NETKIT_PASS; + bool seen_peer = false, seen_scrub = false; struct nlattr **data = params->data; enum netkit_mode mode = NETKIT_L3; unsigned char ifname_assign_type; struct nlattr **tb = params->tb; u16 headroom = 0, tailroom = 0; struct ifinfomsg *ifmp = NULL; - struct net_device *peer; + struct net_device *peer = NULL; char ifname[IFNAMSIZ]; struct netkit *nk; int err; @@ -380,6 +388,13 @@ static int netkit_new_link(struct net_device *dev, headroom = nla_get_u16(data[IFLA_NETKIT_HEADROOM]); if (data[IFLA_NETKIT_TAILROOM]) tailroom = nla_get_u16(data[IFLA_NETKIT_TAILROOM]); + if (data[IFLA_NETKIT_PAIRING]) + pair = nla_get_u32(data[IFLA_NETKIT_PAIRING]); + + seen_scrub = data[IFLA_NETKIT_SCRUB]; + seen_peer = data[IFLA_NETKIT_PEER_INFO] || + data[IFLA_NETKIT_PEER_SCRUB] || + data[IFLA_NETKIT_PEER_POLICY]; } if (ifmp && tbp[IFLA_IFNAME]) { @@ -392,45 +407,47 @@ static int netkit_new_link(struct net_device *dev, if (mode != NETKIT_L2 && (tb[IFLA_ADDRESS] || tbp[IFLA_ADDRESS])) return -EOPNOTSUPP; + if (pair == NETKIT_DEVICE_SINGLE && + (tb != tbp || seen_peer || seen_scrub || + policy_prim != NETKIT_PASS)) + return -EOPNOTSUPP; - 
peer = rtnl_create_link(peer_net, ifname, ifname_assign_type, - &netkit_link_ops, tbp, extack); - if (IS_ERR(peer)) - return PTR_ERR(peer); - - netif_inherit_tso_max(peer, dev); - if (headroom) { - peer->needed_headroom = headroom; - dev->needed_headroom = headroom; - } - if (tailroom) { - peer->needed_tailroom = tailroom; - dev->needed_tailroom = tailroom; - } - - if (mode == NETKIT_L2 && !(ifmp && tbp[IFLA_ADDRESS])) - eth_hw_addr_random(peer); - if (ifmp && dev->ifindex) - peer->ifindex = ifmp->ifi_index; - - nk = netkit_priv(peer); - nk->primary = false; - nk->policy = policy_peer; - nk->scrub = scrub_peer; - nk->mode = mode; - nk->headroom = headroom; - bpf_mprog_bundle_init(&nk->bundle); + if (pair == NETKIT_DEVICE_PAIR) { + peer = rtnl_create_link(peer_net, ifname, ifname_assign_type, + &netkit_link_ops, tbp, extack); + if (IS_ERR(peer)) + return PTR_ERR(peer); + + netif_inherit_tso_max(peer, dev); + if (headroom) + peer->needed_headroom = headroom; + if (tailroom) + peer->needed_tailroom = tailroom; + if (mode == NETKIT_L2 && !(ifmp && tbp[IFLA_ADDRESS])) + eth_hw_addr_random(peer); + if (ifmp && dev->ifindex) + peer->ifindex = ifmp->ifi_index; - err = register_netdevice(peer); - if (err < 0) - goto err_register_peer; - netif_carrier_off(peer); - if (mode == NETKIT_L2) - dev_change_flags(peer, peer->flags & ~IFF_NOARP, NULL); + nk = netkit_priv(peer); + nk->primary = false; + nk->policy = policy_peer; + nk->scrub = scrub_peer; + nk->mode = mode; + nk->pair = pair; + nk->headroom = headroom; + bpf_mprog_bundle_init(&nk->bundle); + + err = register_netdevice(peer); + if (err < 0) + goto err_register_peer; + netif_carrier_off(peer); + if (mode == NETKIT_L2) + dev_change_flags(peer, peer->flags & ~IFF_NOARP, NULL); - err = rtnl_configure_link(peer, NULL, 0, NULL); - if (err < 0) - goto err_configure_peer; + err = rtnl_configure_link(peer, NULL, 0, NULL); + if (err < 0) + goto err_configure_peer; + } if (mode == NETKIT_L2 && !tb[IFLA_ADDRESS]) 
eth_hw_addr_random(dev); @@ -438,12 +455,17 @@ static int netkit_new_link(struct net_device *dev, nla_strscpy(dev->name, tb[IFLA_IFNAME], IFNAMSIZ); else strscpy(dev->name, "nk%d", IFNAMSIZ); + if (headroom) + dev->needed_headroom = headroom; + if (tailroom) + dev->needed_tailroom = tailroom; nk = netkit_priv(dev); nk->primary = true; nk->policy = policy_prim; nk->scrub = scrub_prim; nk->mode = mode; + nk->pair = pair; nk->headroom = headroom; bpf_mprog_bundle_init(&nk->bundle); @@ -455,10 +477,12 @@ static int netkit_new_link(struct net_device *dev, dev_change_flags(dev, dev->flags & ~IFF_NOARP, NULL); rcu_assign_pointer(netkit_priv(dev)->peer, peer); - rcu_assign_pointer(netkit_priv(peer)->peer, dev); + if (peer) + rcu_assign_pointer(netkit_priv(peer)->peer, dev); return 0; err_configure_peer: - unregister_netdevice(peer); + if (peer) + unregister_netdevice(peer); return err; err_register_peer: free_netdev(peer); @@ -518,6 +542,8 @@ static struct net_device *netkit_dev_fetch(struct net *net, u32 ifindex, u32 whi nk = netkit_priv(dev); if (!nk->primary) return ERR_PTR(-EACCES); + if (nk->pair == NETKIT_DEVICE_SINGLE) + return ERR_PTR(-EOPNOTSUPP); if (which == BPF_NETKIT_PEER) { dev = rcu_dereference_rtnl(nk->peer); if (!dev) @@ -879,6 +905,7 @@ static int netkit_change_link(struct net_device *dev, struct nlattr *tb[], { IFLA_NETKIT_PEER_INFO, "peer info" }, { IFLA_NETKIT_HEADROOM, "headroom" }, { IFLA_NETKIT_TAILROOM, "tailroom" }, + { IFLA_NETKIT_PAIRING, "pairing" }, }; if (!nk->primary) { @@ -898,9 +925,11 @@ static int netkit_change_link(struct net_device *dev, struct nlattr *tb[], } if (data[IFLA_NETKIT_POLICY]) { + err = -EOPNOTSUPP; attr = data[IFLA_NETKIT_POLICY]; policy = nla_get_u32(attr); - err = netkit_check_policy(policy, attr, extack); + if (nk->pair == NETKIT_DEVICE_PAIR) + err = netkit_check_policy(policy, attr, extack); if (err) return err; WRITE_ONCE(nk->policy, policy); @@ -931,6 +960,7 @@ static size_t netkit_get_size(const struct net_device 
*dev) nla_total_size(sizeof(u8)) + /* IFLA_NETKIT_PRIMARY */ nla_total_size(sizeof(u16)) + /* IFLA_NETKIT_HEADROOM */ nla_total_size(sizeof(u16)) + /* IFLA_NETKIT_TAILROOM */ + nla_total_size(sizeof(u32)) + /* IFLA_NETKIT_PAIRING */ 0; } @@ -951,6 +981,8 @@ static int netkit_fill_info(struct sk_buff *skb, const struct net_device *dev) return -EMSGSIZE; if (nla_put_u16(skb, IFLA_NETKIT_TAILROOM, dev->needed_tailroom)) return -EMSGSIZE; + if (nla_put_u32(skb, IFLA_NETKIT_PAIRING, nk->pair)) + return -EMSGSIZE; if (peer) { nk = netkit_priv(peer); @@ -972,6 +1004,7 @@ static const struct nla_policy netkit_policy[IFLA_NETKIT_MAX + 1] = { [IFLA_NETKIT_TAILROOM] = { .type = NLA_U16 }, [IFLA_NETKIT_SCRUB] = NLA_POLICY_MAX(NLA_U32, NETKIT_SCRUB_DEFAULT), [IFLA_NETKIT_PEER_SCRUB] = NLA_POLICY_MAX(NLA_U32, NETKIT_SCRUB_DEFAULT), + [IFLA_NETKIT_PAIRING] = NLA_POLICY_MAX(NLA_U32, NETKIT_DEVICE_SINGLE), [IFLA_NETKIT_PRIMARY] = { .type = NLA_REJECT, .reject_message = "Primary attribute is read-only" }, }; diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 83a96c56b8ca..280bb1780512 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -1296,6 +1296,11 @@ enum netkit_mode { NETKIT_L3, }; +enum netkit_pairing { + NETKIT_DEVICE_PAIR, + NETKIT_DEVICE_SINGLE, +}; + /* NETKIT_SCRUB_NONE leaves clearing skb->{mark,priority} up to * the BPF program if attached. This also means the latter can * consume the two fields if they were populated earlier. @@ -1320,6 +1325,7 @@ enum { IFLA_NETKIT_PEER_SCRUB, IFLA_NETKIT_HEADROOM, IFLA_NETKIT_TAILROOM, + IFLA_NETKIT_PAIRING, __IFLA_NETKIT_MAX, }; #define IFLA_NETKIT_MAX (__IFLA_NETKIT_MAX - 1) -- cgit v1.2.3 From 4de7a8acd18e2b71591e286678a8ed711e258090 Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Mon, 6 Apr 2026 01:29:56 +0200 Subject: dt-bindings: net: realtek,rtl82xx: Keep property list sorted Sort the documented properties alphabetically, no functional change. 
Acked-by: Rob Herring (Arm) Signed-off-by: Marek Vasut Link: https://patch.msgid.link/20260405233008.148974-1-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml b/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml index 2b5697bd7c5d..eafcc2f3e3d6 100644 --- a/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml +++ b/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml @@ -40,15 +40,15 @@ properties: leds: true - realtek,clkout-disable: + realtek,aldps-enable: type: boolean description: - Disable CLKOUT clock, CLKOUT clock default is enabled after hardware reset. + Enable ALDPS mode, ALDPS mode default is disabled after hardware reset. - realtek,aldps-enable: + realtek,clkout-disable: type: boolean description: - Enable ALDPS mode, ALDPS mode default is disabled after hardware reset. + Disable CLKOUT clock, CLKOUT clock default is enabled after hardware reset. wakeup-source: type: boolean -- cgit v1.2.3 From bfb859a5cb4941db06b37fd8bbd8e6d8a0dd5dcf Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Mon, 6 Apr 2026 01:29:57 +0200 Subject: dt-bindings: net: realtek,rtl82xx: Document realtek,*-ssc-enable property Document support for spread spectrum clocking (SSC) on RTL8211F(D)(I)-CG, RTL8211FS(I)(-VS)-CG, RTL8211FG(I)(-VS)-CG PHYs. Introduce DT properties 'realtek,clkout-ssc-enable', 'realtek,rxc-ssc-enable' and 'realtek,sysclk-ssc-enable' which control CLKOUT, RXC and SYSCLK SSC spread spectrum clocking enablement on these signals. These clocks are not exposed via the clock API, therefore the assigned-clock-sscs property does not apply.
Reviewed-by: Krzysztof Kozlowski Signed-off-by: Marek Vasut Link: https://patch.msgid.link/20260405233008.148974-2-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/net/realtek,rtl82xx.yaml | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml b/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml index eafcc2f3e3d6..45033c31a2d5 100644 --- a/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml +++ b/Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml @@ -50,6 +50,21 @@ properties: description: Disable CLKOUT clock, CLKOUT clock default is enabled after hardware reset. + realtek,clkout-ssc-enable: + type: boolean + description: + Enable CLKOUT SSC mode, CLKOUT SSC mode default is disabled after hardware reset. + + realtek,rxc-ssc-enable: + type: boolean + description: + Enable RXC SSC mode, RXC SSC mode default is disabled after hardware reset. + + realtek,sysclk-ssc-enable: + type: boolean + description: + Enable SYSCLK SSC mode, SYSCLK SSC mode default is disabled after hardware reset. + wakeup-source: type: boolean description: -- cgit v1.2.3 From 8d7de5477e47525c870b599fb2de06ef8af63466 Mon Sep 17 00:00:00 2001 From: Julian Anastasov Date: Sat, 4 Apr 2026 18:34:39 +0300 Subject: ipvs: add conn_lfactor and svc_lfactor sysctl vars Allow the default load factor for the connection and service tables to be configured. 
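
The shift-count rule that the ipvs-sysctl.rst hunk of this patch describes (negative values select buckets = nodes << -value, positive values select buckets = nodes >> value, accepted range -8 .. 8) can be sketched as below. This is an illustration only: `lfactor_buckets` and `lfactor_in_range` are hypothetical names that do not appear in the patch, and the sketch ignores whatever rounding and minimum/maximum table-size clamping the kernel applies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the value lies in the documented -8 .. 8 range. */
static bool lfactor_in_range(int lfactor)
{
	return lfactor >= -8 && lfactor <= 8;
}

/*
 * Hypothetical helper: target bucket count for a given number of hash
 * nodes. Negative load factors grow the table (fewer collisions, more
 * memory); positive load factors shrink it (more collisions, less memory).
 * The caller is assumed to have checked lfactor_in_range() first.
 */
static uint64_t lfactor_buckets(uint64_t nodes, int lfactor)
{
	if (lfactor < 0)
		return nodes << -lfactor;
	return nodes >> lfactor;
}
```

With 1000 nodes, a load factor of -4 asks for 16000 buckets (buckets = nodes * 16) and a load factor of 2 asks for 250 buckets (buckets = nodes / 4), matching the two examples in the documentation text.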
Signed-off-by: Julian Anastasov Signed-off-by: Florian Westphal --- Documentation/networking/ipvs-sysctl.rst | 37 ++++++++++++++++ net/netfilter/ipvs/ip_vs_ctl.c | 76 ++++++++++++++++++++++++++++++++ 2 files changed, 113 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/ipvs-sysctl.rst b/Documentation/networking/ipvs-sysctl.rst index 3fb5fa142eef..a556439f8be7 100644 --- a/Documentation/networking/ipvs-sysctl.rst +++ b/Documentation/networking/ipvs-sysctl.rst @@ -29,6 +29,33 @@ backup_only - BOOLEAN If set, disable the director function while the server is in backup mode to avoid packet loops for DR/TUN methods. +conn_lfactor - INTEGER + Possible values: -8 (larger table) .. 8 (smaller table) + + Default: -4 + + Controls the sizing of the connection hash table based on the + load factor (number of connections per table buckets): + + 2^conn_lfactor = nodes / buckets + + As result, the table grows if load increases and shrinks when + load decreases in the range of 2^8 - 2^conn_tab_bits (module + parameter). + The value is a shift count where negative values select + buckets = (connection hash nodes << -value) while positive + values select buckets = (connection hash nodes >> value). The + negative values reduce the collisions and reduce the time for + lookups but increase the table size. Positive values will + tolerate load above 100% when using smaller table is + preferred with the cost of more collisions. If using NAT + connections consider decreasing the value with one because + they add two nodes in the hash table. + + Example: + -4: grow if load goes above 6% (buckets = nodes * 16) + 2: grow if load goes above 400% (buckets = nodes / 4) + conn_reuse_mode - INTEGER 1 - default @@ -219,6 +246,16 @@ secure_tcp - INTEGER The value definition is the same as that of drop_entry and drop_packet. +svc_lfactor - INTEGER + Possible values: -8 (larger table) .. 
8 (smaller table) + + Default: -3 + + Controls the sizing of the service hash table based on the + load factor (number of services per table buckets). The table + will grow and shrink in the range of 2^4 - 2^20. + See conn_lfactor for explanation. + sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period default 3 50 diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index fb1df61edfdd..6632daa87ded 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2445,6 +2445,60 @@ static int ipvs_proc_run_estimation(const struct ctl_table *table, int write, return ret; } +static int ipvs_proc_conn_lfactor(const struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + struct netns_ipvs *ipvs = table->extra2; + int *valp = table->data; + int val = *valp; + int ret; + + struct ctl_table tmp_table = { + .data = &val, + .maxlen = sizeof(int), + }; + + ret = proc_dointvec(&tmp_table, write, buffer, lenp, ppos); + if (write && ret >= 0) { + if (val < -8 || val > 8) { + ret = -EINVAL; + } else { + *valp = val; + if (rcu_access_pointer(ipvs->conn_tab)) + mod_delayed_work(system_unbound_wq, + &ipvs->conn_resize_work, 0); + } + } + return ret; +} + +static int ipvs_proc_svc_lfactor(const struct ctl_table *table, int write, + void *buffer, size_t *lenp, loff_t *ppos) +{ + struct netns_ipvs *ipvs = table->extra2; + int *valp = table->data; + int val = *valp; + int ret; + + struct ctl_table tmp_table = { + .data = &val, + .maxlen = sizeof(int), + }; + + ret = proc_dointvec(&tmp_table, write, buffer, lenp, ppos); + if (write && ret >= 0) { + if (val < -8 || val > 8) { + ret = -EINVAL; + } else { + *valp = val; + if (rcu_access_pointer(ipvs->svc_table)) + mod_delayed_work(system_unbound_wq, + &ipvs->svc_resize_work, 0); + } + } + return ret; +} + /* * IPVS sysctl table (under the /proc/sys/net/ipv4/vs/) * Do not change order or insert new entries without @@ -2633,6 +2687,18 @@ static struct 
ctl_table vs_vars[] = { .mode = 0644, .proc_handler = ipvs_proc_est_nice, }, + { + .procname = "conn_lfactor", + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = ipvs_proc_conn_lfactor, + }, + { + .procname = "svc_lfactor", + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = ipvs_proc_svc_lfactor, + }, #ifdef CONFIG_IP_VS_DEBUG { .procname = "debug_level", @@ -4853,6 +4919,16 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs) tbl[idx].extra2 = ipvs; tbl[idx++].data = &ipvs->sysctl_est_nice; + if (unpriv) + tbl[idx].mode = 0444; + tbl[idx].extra2 = ipvs; + tbl[idx++].data = &ipvs->sysctl_conn_lfactor; + + if (unpriv) + tbl[idx].mode = 0444; + tbl[idx].extra2 = ipvs; + tbl[idx++].data = &ipvs->sysctl_svc_lfactor; + #ifdef CONFIG_IP_VS_DEBUG /* Global sysctls must be ro in non-init netns */ if (!net_eq(net, &init_net)) -- cgit v1.2.3 From 54fc83a1728535831df0f251e155d05574918115 Mon Sep 17 00:00:00 2001 From: Andy Roulin Date: Sun, 5 Apr 2026 13:52:22 -0700 Subject: net: bridge: add stp_mode attribute for STP mode selection The bridge-stp usermode helper is currently restricted to the initial network namespace, preventing userspace STP daemons (e.g. mstpd) from operating on bridges in other network namespaces. Since commit ff62198553e4 ("bridge: Only call /sbin/bridge-stp for the initial network namespace"), bridges in non-init namespaces silently fall back to kernel STP with no way to use userspace STP. Add a new bridge attribute IFLA_BR_STP_MODE that allows explicit per-bridge control over STP mode selection: BR_STP_MODE_AUTO (default) - Existing behavior: invoke the /sbin/bridge-stp helper in init_net only; fall back to kernel STP if it fails or in non-init namespaces. BR_STP_MODE_USER - Directly enable userspace STP (BR_USER_STP) without invoking the helper. Works in any network namespace. Userspace is responsible for ensuring an STP daemon manages the bridge. 
BR_STP_MODE_KERNEL - Directly enable kernel STP (BR_KERNEL_STP) without invoking the helper. The mode can only be changed while STP is disabled, or set to the same value (-EBUSY otherwise). IFLA_BR_STP_MODE is processed before IFLA_BR_STP_STATE in br_changelink(), so both can be set atomically in a single netlink message. The mode can also be changed in the same message that disables STP. The stp_mode struct field is u8 since all possible values fit, while NLA_U32 is used for the netlink attribute since it occupies the same space in the netlink message as NLA_U8. A new stp_helper_active boolean tracks whether the /sbin/bridge-stp helper was invoked during br_stp_start(), so that br_stp_stop() only calls the helper for stop when it was called for start. This avoids calling the helper asymmetrically when stp_mode changes between start and stop. Suggested-by: Ido Schimmel Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Ido Schimmel Acked-by: Nikolay Aleksandrov Signed-off-by: Andy Roulin Link: https://patch.msgid.link/20260405205224.3163000-2-aroulin@nvidia.com Signed-off-by: Jakub Kicinski --- Documentation/netlink/specs/rt-link.yaml | 12 ++++++++++ include/uapi/linux/if_link.h | 39 ++++++++++++++++++++++++++++++++ net/bridge/br_device.c | 1 + net/bridge/br_netlink.c | 24 +++++++++++++++++++- net/bridge/br_private.h | 2 ++ net/bridge/br_stp_if.c | 19 ++++++++++------ 6 files changed, 89 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/netlink/specs/rt-link.yaml b/Documentation/netlink/specs/rt-link.yaml index fcb5aaf0926f..f23aa5f229c5 100644 --- a/Documentation/netlink/specs/rt-link.yaml +++ b/Documentation/netlink/specs/rt-link.yaml @@ -840,6 +840,14 @@ definitions: entries: - p2p - mp + - + name: br-stp-mode + type: enum + enum-name: br-stp-mode + entries: + - auto + - user + - kernel attribute-sets: - @@ -1550,6 +1558,10 @@ attribute-sets: - name: fdb-max-learned type: u32 + - + name: stp-mode + type: u32 + enum: br-stp-mode 
- name: linkinfo-brport-attrs name-prefix: ifla-brport- diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 280bb1780512..79ce4bc24cba 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -744,6 +744,11 @@ enum in6_addr_gen_mode { * @IFLA_BR_FDB_MAX_LEARNED * Set the number of max dynamically learned FDB entries for the current * bridge. + * + * @IFLA_BR_STP_MODE + * Set the STP mode for the bridge, which controls how the bridge + * selects between userspace and kernel STP. The valid values are + * documented below in the ``BR_STP_MODE_*`` constants. */ enum { IFLA_BR_UNSPEC, @@ -796,11 +801,45 @@ enum { IFLA_BR_MCAST_QUERIER_STATE, IFLA_BR_FDB_N_LEARNED, IFLA_BR_FDB_MAX_LEARNED, + IFLA_BR_STP_MODE, __IFLA_BR_MAX, }; #define IFLA_BR_MAX (__IFLA_BR_MAX - 1) +/** + * DOC: Bridge STP mode values + * + * @BR_STP_MODE_AUTO + * Default. The kernel invokes the ``/sbin/bridge-stp`` helper to hand + * the bridge to a userspace STP daemon (e.g. mstpd). Only attempted in + * the initial network namespace; in other namespaces this falls back to + * kernel STP. + * + * @BR_STP_MODE_USER + * Directly enable userspace STP (``BR_USER_STP``) without invoking the + * ``/sbin/bridge-stp`` helper. Works in any network namespace. + * Userspace is responsible for ensuring an STP daemon manages the + * bridge. + * + * @BR_STP_MODE_KERNEL + * Directly enable kernel STP (``BR_KERNEL_STP``) without invoking the + * helper. + * + * The mode controls how the bridge selects between userspace and kernel + * STP when STP is enabled via ``IFLA_BR_STP_STATE``. It can only be + * changed while STP is disabled (``IFLA_BR_STP_STATE`` == 0), returns + * ``-EBUSY`` otherwise. The default value is ``BR_STP_MODE_AUTO``. 
+ */ +enum br_stp_mode { + BR_STP_MODE_AUTO, + BR_STP_MODE_USER, + BR_STP_MODE_KERNEL, + __BR_STP_MODE_MAX +}; + +#define BR_STP_MODE_MAX (__BR_STP_MODE_MAX - 1) + struct ifla_bridge_id { __u8 prio[2]; __u8 addr[6]; /* ETH_ALEN */ diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index f7502e62dd35..a35ceae0a6f2 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -518,6 +518,7 @@ void br_dev_setup(struct net_device *dev) ether_addr_copy(br->group_addr, eth_stp_addr); br->stp_enabled = BR_NO_STP; + br->stp_mode = BR_STP_MODE_AUTO; br->group_fwd_mask = BR_GROUPFWD_DEFAULT; br->group_fwd_mask_required = BR_GROUPFWD_DEFAULT; diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 0264730938f4..6fd5386a1d64 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1270,6 +1270,9 @@ static const struct nla_policy br_policy[IFLA_BR_MAX + 1] = { NLA_POLICY_EXACT_LEN(sizeof(struct br_boolopt_multi)), [IFLA_BR_FDB_N_LEARNED] = { .type = NLA_REJECT }, [IFLA_BR_FDB_MAX_LEARNED] = { .type = NLA_U32 }, + [IFLA_BR_STP_MODE] = NLA_POLICY_RANGE(NLA_U32, + BR_STP_MODE_AUTO, + BR_STP_MODE_MAX), }; static int br_changelink(struct net_device *brdev, struct nlattr *tb[], @@ -1306,6 +1309,23 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[], return err; } + if (data[IFLA_BR_STP_MODE]) { + u32 mode = nla_get_u32(data[IFLA_BR_STP_MODE]); + + if (mode != br->stp_mode) { + bool stp_off = br->stp_enabled == BR_NO_STP || + (data[IFLA_BR_STP_STATE] && + !nla_get_u32(data[IFLA_BR_STP_STATE])); + + if (!stp_off) { + NL_SET_ERR_MSG_MOD(extack, + "Can't change STP mode while STP is enabled"); + return -EBUSY; + } + } + br->stp_mode = mode; + } + if (data[IFLA_BR_STP_STATE]) { u32 stp_enabled = nla_get_u32(data[IFLA_BR_STP_STATE]); @@ -1634,6 +1654,7 @@ static size_t br_get_size(const struct net_device *brdev) nla_total_size(sizeof(u8)) + /* IFLA_BR_NF_CALL_ARPTABLES */ #endif nla_total_size(sizeof(struct br_boolopt_multi)) 
+ /* IFLA_BR_MULTI_BOOLOPT */ + nla_total_size(sizeof(u32)) + /* IFLA_BR_STP_MODE */ 0; } @@ -1686,7 +1707,8 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev) nla_put(skb, IFLA_BR_MULTI_BOOLOPT, sizeof(bm), &bm) || nla_put_u32(skb, IFLA_BR_FDB_N_LEARNED, atomic_read(&br->fdb_n_learned)) || - nla_put_u32(skb, IFLA_BR_FDB_MAX_LEARNED, br->fdb_max_learned)) + nla_put_u32(skb, IFLA_BR_FDB_MAX_LEARNED, br->fdb_max_learned) || + nla_put_u32(skb, IFLA_BR_STP_MODE, br->stp_mode)) return -EMSGSIZE; #ifdef CONFIG_BRIDGE_VLAN_FILTERING diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 6dbca845e625..361a9b84451e 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -523,6 +523,8 @@ struct net_bridge { unsigned char topology_change; unsigned char topology_change_detected; u16 root_port; + u8 stp_mode; + bool stp_helper_active; unsigned long max_age; unsigned long hello_time; unsigned long forward_delay; diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c index cc4b27ff1b08..28c1d3f7e22f 100644 --- a/net/bridge/br_stp_if.c +++ b/net/bridge/br_stp_if.c @@ -149,7 +149,9 @@ static void br_stp_start(struct net_bridge *br) { int err = -ENOENT; - if (net_eq(dev_net(br->dev), &init_net)) + /* AUTO mode: try bridge-stp helper in init_net only */ + if (br->stp_mode == BR_STP_MODE_AUTO && + net_eq(dev_net(br->dev), &init_net)) err = br_stp_call_user(br, "start"); if (err && err != -ENOENT) @@ -162,8 +164,9 @@ static void br_stp_start(struct net_bridge *br) else if (br->bridge_forward_delay > BR_MAX_FORWARD_DELAY) __br_set_forward_delay(br, BR_MAX_FORWARD_DELAY); - if (!err) { + if (br->stp_mode == BR_STP_MODE_USER || !err) { br->stp_enabled = BR_USER_STP; + br->stp_helper_active = !err; br_debug(br, "userspace STP started\n"); } else { br->stp_enabled = BR_KERNEL_STP; @@ -180,12 +183,14 @@ static void br_stp_start(struct net_bridge *br) static void br_stp_stop(struct net_bridge *br) { - int err; - if 
(br->stp_enabled == BR_USER_STP) { - err = br_stp_call_user(br, "stop"); - if (err) - br_err(br, "failed to stop userspace STP (%d)\n", err); + if (br->stp_helper_active) { + int err = br_stp_call_user(br, "stop"); + + if (err) + br_err(br, "failed to stop userspace STP (%d)\n", err); + br->stp_helper_active = false; + } /* To start timers on any ports left in blocking */ spin_lock_bh(&br->lock); -- cgit v1.2.3 From c4f2aab121cdc8400dcd5ec3cc0aa0fb3c06694f Mon Sep 17 00:00:00 2001 From: Andy Roulin Date: Sun, 5 Apr 2026 13:52:23 -0700 Subject: docs: net: bridge: document stp_mode attribute Add documentation for the IFLA_BR_STP_MODE bridge attribute in the "User space STP helper" section of the bridge documentation. Reference the BR_STP_MODE_* values via kernel-doc and describe the use case for network namespace environments. Reviewed-by: Ido Schimmel Acked-by: Nikolay Aleksandrov Signed-off-by: Andy Roulin Link: https://patch.msgid.link/20260405205224.3163000-3-aroulin@nvidia.com Signed-off-by: Jakub Kicinski --- Documentation/networking/bridge.rst | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/bridge.rst b/Documentation/networking/bridge.rst index ef8b73e157b2..c1e6ea52c9e5 100644 --- a/Documentation/networking/bridge.rst +++ b/Documentation/networking/bridge.rst @@ -148,6 +148,28 @@ called by the kernel when STP is enabled/disabled on a bridge stp_state <0|1>``). The kernel enables user_stp mode if that command returns 0, or enables kernel_stp mode if that command returns any other value. +STP mode selection +------------------ + +The ``IFLA_BR_STP_MODE`` bridge attribute allows explicit control over how +STP operates when enabled, bypassing the ``/sbin/bridge-stp`` helper +entirely for the ``user`` and ``kernel`` modes. + +.. 
kernel-doc:: include/uapi/linux/if_link.h + :doc: Bridge STP mode values + +The default mode is ``BR_STP_MODE_AUTO``, which preserves the traditional +behavior of invoking the ``/sbin/bridge-stp`` helper. The ``user`` and +``kernel`` modes are particularly useful in network namespace environments +where the helper mechanism is not available, as ``call_usermodehelper()`` +is restricted to the initial network namespace. + +Example:: + + ip link set dev br0 type bridge stp_mode user stp_state 1 + +The mode can only be changed while STP is disabled. + VLAN ==== -- cgit v1.2.3 From a1a702090def20ab0fea13700128861b70d91bc5 Mon Sep 17 00:00:00 2001 From: Ivan Vecera Date: Wed, 8 Apr 2026 12:27:15 +0200 Subject: dt-bindings: dpll: add ref-sync-sources property Add ref-sync-sources phandle-array property to the dpll-pin schema allowing board designers to declare which input pins can serve as sync sources in a Reference-Sync pair. A Ref-Sync pair consists of a clock reference and a low-frequency sync signal where the DPLL locks to the clock but phase-aligns to the sync reference. Update both examples in the Microchip ZL3073x binding to demonstrate the new property with a 1 PPS sync source paired to a clock source. 
Reviewed-by: Petr Oros Reviewed-by: Prathosh Satish Reviewed-by: Rob Herring (Arm) Signed-off-by: Ivan Vecera Link: https://patch.msgid.link/20260408102716.443099-5-ivecera@redhat.com Signed-off-by: Jakub Kicinski --- .../devicetree/bindings/dpll/dpll-pin.yaml | 13 ++++++++++ .../bindings/dpll/microchip,zl30731.yaml | 30 +++++++++++++++++----- 2 files changed, 36 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/dpll/dpll-pin.yaml b/Documentation/devicetree/bindings/dpll/dpll-pin.yaml index 51db93b77306..1287a472f08f 100644 --- a/Documentation/devicetree/bindings/dpll/dpll-pin.yaml +++ b/Documentation/devicetree/bindings/dpll/dpll-pin.yaml @@ -36,6 +36,19 @@ properties: description: String exposed as the pin board label $ref: /schemas/types.yaml#/definitions/string + ref-sync-sources: + description: | + List of phandles to input pins that can serve as the sync source + in a Reference-Sync pair with this pin acting as the clock source. + A Ref-Sync pair consists of a clock reference and a low-frequency + sync signal. The DPLL locks to the clock reference but + phase-aligns to the sync reference. + Only valid for input pins. Each referenced pin must be a + different input pin on the same device. + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + maxItems: 1 + supported-frequencies-hz: description: List of supported frequencies for this pin, expressed in Hz. 
diff --git a/Documentation/devicetree/bindings/dpll/microchip,zl30731.yaml b/Documentation/devicetree/bindings/dpll/microchip,zl30731.yaml index 17747f754b84..fa5a8f8e390c 100644 --- a/Documentation/devicetree/bindings/dpll/microchip,zl30731.yaml +++ b/Documentation/devicetree/bindings/dpll/microchip,zl30731.yaml @@ -52,11 +52,19 @@ examples: #address-cells = <1>; #size-cells = <0>; - pin@0 { /* REF0P */ + sync0: pin@0 { /* REF0P - 1 PPS sync source */ reg = <0>; connection-type = "ext"; - label = "Input 0"; - supported-frequencies-hz = /bits/ 64 <1 1000>; + label = "SMA1"; + supported-frequencies-hz = /bits/ 64 <1>; + }; + + pin@1 { /* REF0N - clock source, can pair with sync0 */ + reg = <1>; + connection-type = "ext"; + label = "SMA2"; + supported-frequencies-hz = /bits/ 64 <10000 10000000>; + ref-sync-sources = <&sync0>; }; }; @@ -90,11 +98,19 @@ examples: #address-cells = <1>; #size-cells = <0>; - pin@0 { /* REF0P */ + sync1: pin@0 { /* REF0P - 1 PPS sync source */ reg = <0>; - connection-type = "ext"; - label = "Input 0"; - supported-frequencies-hz = /bits/ 64 <1 1000>; + connection-type = "gnss"; + label = "GNSS_1PPS_IN"; + supported-frequencies-hz = /bits/ 64 <1>; + }; + + pin@1 { /* REF0N - clock source */ + reg = <1>; + connection-type = "gnss"; + label = "GNSS_10M_IN"; + supported-frequencies-hz = /bits/ 64 <10000000>; + ref-sync-sources = <&sync1>; }; }; -- cgit v1.2.3 From f757a2da6df52299606512b0920eba728d642543 Mon Sep 17 00:00:00 2001 From: Nora Schiffer Date: Tue, 7 Apr 2026 12:48:01 +0200 Subject: dt-bindings: net: ti: k3-am654-cpsw-nuss: Add ti,j722s-cpsw-nuss compatible The J722S CPSW3G is mostly identical to the AM64's, but additionally supports SGMII. The AM64 compatible ti,am642-cpsw-nuss is used as a fallback. 
Signed-off-by: Nora Schiffer Acked-by: Krzysztof Kozlowski Link: https://patch.msgid.link/191e9f7e3a6c14eabe891a98c5fb646766479c0a.1775558273.git.nora.schiffer@ew.tq-group.com Signed-off-by: Jakub Kicinski --- .../bindings/net/ti,k3-am654-cpsw-nuss.yaml | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml index a959c1d7e643..c409c6310ed4 100644 --- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml +++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml @@ -53,13 +53,18 @@ properties: "#size-cells": true compatible: - enum: - - ti,am642-cpsw-nuss - - ti,am654-cpsw-nuss - - ti,j7200-cpswxg-nuss - - ti,j721e-cpsw-nuss - - ti,j721e-cpswxg-nuss - - ti,j784s4-cpswxg-nuss + oneOf: + - enum: + - ti,am642-cpsw-nuss + - ti,am654-cpsw-nuss + - ti,j7200-cpswxg-nuss + - ti,j721e-cpsw-nuss + - ti,j721e-cpswxg-nuss + - ti,j784s4-cpswxg-nuss + - items: + - enum: + - ti,j722s-cpsw-nuss + - const: ti,am642-cpsw-nuss reg: maxItems: 1 -- cgit v1.2.3 From 600f01dc4bd0c736b3ffea9f7976136d8bf1b136 Mon Sep 17 00:00:00 2001 From: Josua Mayer Date: Thu, 9 Apr 2026 14:34:33 +0200 Subject: dt-bindings: net: dsa: nxp,sja1105: make spi-cpol optional for sja1110 Currently, the binding requires 'spi-cpha' for SJA1105 and 'spi-cpol' for SJA1110. However, the SJA1110 supports both SPI modes 0 and 2. Mode 2 (cpha=0, cpol=1) is used by the NXP LX2160 Bluebox 3. On the SolidRun i.MX8DXL HummingBoard Telematics, mode 0 is stable, while forcing mode 2 introduces CRC errors especially during bursts. Drop the requirement on spi-cpol for SJA1110. 
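
As a rough illustration of the mode numbering used above, the sketch below maps the presence of the 'spi-cpol' and 'spi-cpha' properties to an SPI mode number and checks it against the two modes this message says the SJA1110 supports. The bit values mirror the conventional CPHA = bit 0, CPOL = bit 1 encoding; the `SKETCH_` constants and both function names are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>

/* Local stand-ins for the conventional SPI mode bits (not driver code). */
#define SKETCH_SPI_CPHA 0x01u /* clock phase: sample on the trailing edge */
#define SKETCH_SPI_CPOL 0x02u /* clock polarity: clock idles high */

/* Hypothetical helper: SPI mode number from the two boolean DT properties. */
static unsigned int sketch_spi_mode(bool has_cpol, bool has_cpha)
{
	return (has_cpol ? SKETCH_SPI_CPOL : 0u) |
	       (has_cpha ? SKETCH_SPI_CPHA : 0u);
}

/*
 * Hypothetical check: modes 0 and 2 both have CPHA clear, so with
 * 'spi-cpol' optional the only remaining constraint is "no spi-cpha".
 */
static bool sketch_sja1110_mode_ok(unsigned int mode)
{
	return (mode & SKETCH_SPI_CPHA) == 0u;
}
```

Under these assumptions, a node with only 'spi-cpol' yields mode 2 and a node with neither property yields mode 0; both pass the check, while any mode with CPHA set does not.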
Fixes: af2eab1a8243 ("dt-bindings: net: nxp,sja1105: document spi-cpol/cpha") Signed-off-by: Josua Mayer Acked-by: Conor Dooley Link: https://patch.msgid.link/20260409-imx8dxl-sr-som-v2-1-83ff20629ba0@solid-run.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml | 2 -- 1 file changed, 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml b/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml index 607b7fe8d28e..0486489114cd 100644 --- a/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml +++ b/Documentation/devicetree/bindings/net/dsa/nxp,sja1105.yaml @@ -143,8 +143,6 @@ allOf: else: properties: spi-cpha: false - required: - - spi-cpol unevaluatedProperties: false -- cgit v1.2.3 From 268bb35d1a34c9965ec79922a412e966af98be34 Mon Sep 17 00:00:00 2001 From: Charles Perry Date: Wed, 8 Apr 2026 06:18:14 -0700 Subject: dt-bindings: net: document Microchip PIC64-HPSC/HX MDIO controller This MDIO hardware is based on a Microsemi design supported in Linux by mdio-mscc-miim.c. However, the register interface is completely different with pic64hpsc, hence the need for separate documentation. The hardware supports C22 and C45. The documentation recommends an input clock of 156.25MHz and a prescaler of 39, which yields an MDIO clock of 1.95MHz. The hardware supports an interrupt pin to signal transaction completion, which is not strictly needed as the software can also poll a "TRIGGER" bit for this.
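
The figures quoted above (156.25MHz in, prescaler 39, about 1.95MHz out) are consistent with a divider of 2 * (prescaler + 1). The sketch below works through that arithmetic under that assumption; the relation is inferred from the numbers and is not stated in this message, and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: MDIO clock in Hz for a given input clock and
 * prescaler, ASSUMING f_mdio = f_in / (2 * (prescaler + 1)).
 */
static uint32_t sketch_mdio_clk_hz(uint32_t clk_in_hz, uint32_t prescaler)
{
	return clk_in_hz / (2u * (prescaler + 1u));
}

/*
 * Hypothetical helper: smallest prescaler that keeps the MDIO clock at
 * or below target_hz under the same assumed relation. Both arguments
 * must be non-zero. The division rounds up so the result never
 * overshoots the target.
 */
static uint32_t sketch_mdio_prescaler(uint32_t clk_in_hz, uint32_t target_hz)
{
	uint32_t div = 2u * target_hz;

	return (clk_in_hz + div - 1u) / div - 1u;
}
```

With a 156250000 Hz input, a prescaler of 39 gives 156250000 / 80 = 1953125 Hz, which rounds to the 1.95MHz quoted above. Asking the second helper for the binding's default clock-frequency of 2500000 Hz returns a prescaler of 31, that is 156250000 / 64 = 2441406 Hz.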
Signed-off-by: Charles Perry Acked-by: Conor Dooley Link: https://patch.msgid.link/20260408131821.1145334-2-charles.perry@microchip.com Signed-off-by: Jakub Kicinski --- .../bindings/net/microchip,pic64hpsc-mdio.yaml | 68 ++++++++++++++++++++++ 1 file changed, 68 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/microchip,pic64hpsc-mdio.yaml (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/microchip,pic64hpsc-mdio.yaml b/Documentation/devicetree/bindings/net/microchip,pic64hpsc-mdio.yaml new file mode 100644 index 000000000000..20f29b71566b --- /dev/null +++ b/Documentation/devicetree/bindings/net/microchip,pic64hpsc-mdio.yaml @@ -0,0 +1,68 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/microchip,pic64hpsc-mdio.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Microchip PIC64-HPSC/HX MDIO controller + +maintainers: + - Charles Perry + +description: + This is the MDIO bus controller present in Microchip PIC64-HPSC/HX SoCs. It + supports C22 and C45 register access and is named "MDIO Initiator" in the + documentation. 
+ +allOf: + - $ref: mdio.yaml# + +properties: + compatible: + oneOf: + - const: microchip,pic64hpsc-mdio + - items: + - const: microchip,pic64hx-mdio + - const: microchip,pic64hpsc-mdio + + reg: + maxItems: 1 + + clocks: + maxItems: 1 + + clock-frequency: + default: 2500000 + + interrupts: + maxItems: 1 + +required: + - compatible + - reg + - clocks + - interrupts + +unevaluatedProperties: false + +examples: + - | + #include + bus { + #address-cells = <2>; + #size-cells = <2>; + + mdio@4000c21e000 { + compatible = "microchip,pic64hpsc-mdio"; + reg = <0x400 0x0c21e000 0x0 0x1000>; + #address-cells = <1>; + #size-cells = <0>; + clocks = <&svc_clk>; + interrupt-parent = <&saplic0>; + interrupts = <168 IRQ_TYPE_LEVEL_HIGH>; + + ethernet-phy@0 { + reg = <0>; + }; + }; + }; -- cgit v1.2.3 From d471d70cc964c3fe5c5e7765ab07b5f2f70f276e Mon Sep 17 00:00:00 2001 From: Luca Weiss Date: Fri, 10 Apr 2026 09:40:07 +0200 Subject: dt-bindings: net: qcom,ipa: add Milos compatible Add support for the Milos SoC, which uses IPA v5.2. Acked-by: Krzysztof Kozlowski Signed-off-by: Luca Weiss Link: https://patch.msgid.link/20260410-ipa-v5-2-v2-1-778422a05060@fairphone.com Signed-off-by: Jakub Kicinski --- Documentation/devicetree/bindings/net/qcom,ipa.yaml | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml index e4bb627e1757..fdeaa81b9645 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml @@ -44,6 +44,7 @@ properties: compatible: oneOf: - enum: + - qcom,milos-ipa - qcom,msm8998-ipa - qcom,sc7180-ipa - qcom,sc7280-ipa -- cgit v1.2.3