84 files changed, 5394 insertions, 1722 deletions
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index dceeb0d763aa..50d92084a49c 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -209,13 +209,10 @@ Libbpf
 
 Libbpf is a helper library for eBPF and XDP that makes using these
 technologies a lot simpler. It also contains specific helper functions
-in tools/lib/bpf/xsk.h for facilitating the use of AF_XDP. It
-contains two types of functions: those that can be used to make the
-setup of AF_XDP socket easier and ones that can be used in the data
-plane to access the rings safely and quickly. To see an example on how
-to use this API, please take a look at the sample application in
-samples/bpf/xdpsock_usr.c which uses libbpf for both setup and data
-plane operations.
+in tools/testing/selftests/bpf/xsk.h for facilitating the use of
+AF_XDP. It contains two types of functions: those that can be used to
+make the setup of an AF_XDP socket easier and ones that can be used in
+the data plane to access the rings safely and quickly.
 
 We recommend that you use this library unless you have become a power
 user. It will make your program a lot simpler.
@@ -372,9 +369,8 @@ needs to explicitly notify the kernel to send any packets put on the
 TX ring. This can be accomplished either by a poll() call, as in the
 RX path, or by calling sendto().
 
-An example of how to use this flag can be found in
-samples/bpf/xdpsock_user.c. An example with the use of libbpf helpers
-would look like this for the TX path:
+An example with the use of libbpf helpers would look like this for the
+TX path:
 
 .. code-block:: c
 
@@ -442,6 +438,15 @@ is created by a privileged process and passed to a non-privileged one.
 Once the option is set, kernel will refuse attempts to bind that socket
 to a different interface.  Updating the value requires CAP_NET_RAW.
 
+XDP_MAX_TX_SKB_BUDGET setsockopt
+--------------------------------
+
+This setsockopt sets the maximum number of descriptors that can be handled
+and passed to the driver in one send syscall. It applies in copy mode
+and lets the application tune the per-socket maximum iteration for
+better throughput and fewer send syscalls.
+The allowed range is [32, xs->tx->nentries].
+
 XDP_STATISTICS getsockopt
 -------------------------
 
@@ -549,12 +554,12 @@ later in this document.
 Usage
 -----
 
-In order to use AF_XDP sockets two parts are needed. The
-user-space application and the XDP program. For a complete setup and
-usage example, please refer to the sample application. The user-space
-side is xdpsock_user.c and the XDP side is part of libbpf.
+To use AF_XDP sockets, two parts are needed: the user-space application
+and the XDP program. For a complete setup and usage example, please
+refer to the xdp-project repository at
+https://github.com/xdp-project/bpf-examples/tree/main/AF_XDP-example.
 
-The XDP code sample included in tools/lib/bpf/xsk.c is the following:
+The XDP code sample is the following:
 
 .. code-block:: c
 
@@ -752,11 +757,12 @@ to facilitate extending a zero-copy driver with multi-buffer support.
 
 Sample application
 ==================
-
-There is a xdpsock benchmarking/test application included that
-demonstrates how to use AF_XDP sockets with private UMEMs. Say that
-you would like your UDP traffic from port 4242 to end up in queue 16,
-that we will enable AF_XDP on. Here, we use ethtool for this::
+There is an xdpsock benchmarking/test application, available at
+https://github.com/xdp-project/bpf-examples/tree/main/AF_XDP-example,
+that demonstrates how to use AF_XDP sockets with private UMEMs. Say
+that you would like your UDP traffic from port 4242 to end up in
+queue 16, on which we will enable AF_XDP. Here, we use ethtool for
+this::
 
       ethtool -N p3p2 rx-flow-hash udp4 fn
       ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
@@ -773,7 +779,7 @@ can be displayed with "-h", as usual.
 This sample application uses libbpf to make the setup and usage of
 AF_XDP simpler. If you want to know how the raw uapi of AF_XDP is
 really used to make something more advanced, take a look at the libbpf
-code in tools/lib/bpf/xsk.[ch].
+code in tools/testing/selftests/bpf/xsk.[ch].
 
 FAQ
 =======
diff --git a/Documentation/networking/arcnet-hardware.rst b/Documentation/networking/arcnet-hardware.rst
index 982215723582..3bf7f99cd7bb 100644
--- a/Documentation/networking/arcnet-hardware.rst
+++ b/Documentation/networking/arcnet-hardware.rst
@@ -3152,7 +3152,7 @@ Tiara
 (model unknown)
 ---------------
 
-  - from Christoph Lameter <christoph@lameter.com>
+  - from Christoph Lameter <cl@gentwo.org>
 
 
 Here is information about my card as far as I could figure it out::
diff --git a/Documentation/networking/bareudp.rst b/Documentation/networking/bareudp.rst
index b9d04ee6dac1..621cb9575c8f 100644
--- a/Documentation/networking/bareudp.rst
+++ b/Documentation/networking/bareudp.rst
@@ -6,16 +6,17 @@ Bare UDP Tunnelling Module Documentation
 
 There are various L3 encapsulation standards using UDP being discussed to
 leverage the UDP based load balancing capability of different networks.
-MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
+MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them.
 
 The Bareudp tunnel module provides a generic L3 encapsulation support for
 tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP tunnel.
 
 Special Handling
 ----------------
+
 The bareudp device supports special handling for MPLS & IP as they can have
 multiple ethertypes.
-MPLS procotcol can have ethertypes ETH_P_MPLS_UC  (unicast) & ETH_P_MPLS_MC (multicast).
+The MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast).
 IP protocol can have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
 This special handling can be enabled only for ethertypes ETH_P_IP & ETH_P_MPLS_UC
 with a flag called multiproto mode.
@@ -52,7 +53,7 @@ be enabled explicitly with the "multiproto" flag.
 3) Device Usage
 
 The bareudp device could be used along with OVS or flower filter in TC.
-The OVS or TC flower layer must set the tunnel information in SKB dst field before
-sending packet buffer to the bareudp device for transmission. On reception the
-bareudp device extracts and stores the tunnel information in SKB dst field before
+The OVS or TC flower layer must set the tunnel information in the SKB dst field before
+sending the packet buffer to the bareudp device for transmission. On reception, the
+bareudp device extracts and stores the tunnel information in the SKB dst field before
 passing the packet buffer to the network stack.
diff --git a/Documentation/networking/batman-adv.rst b/Documentation/networking/batman-adv.rst
index 8a0dcb1894b4..ec53a42499c1 100644
--- a/Documentation/networking/batman-adv.rst
+++ b/Documentation/networking/batman-adv.rst
@@ -27,7 +27,7 @@ Load the batman-adv module into your kernel::
   $ insmod batman-adv.ko
 
 The module is now waiting for activation. You must add some interfaces on which
-batman-adv can operate. The batman-adv soft-interface can be created using the
+batman-adv can operate. The batman-adv mesh-interface can be created using the
 iproute2 tool ``ip``::
 
   $ ip link add name bat0 type batadv
@@ -164,5 +164,5 @@ Mailing-list:
 
 You can also contact the Authors:
 
-* Marek Lindner <mareklindner@neomailbox.ch>
+* Marek Lindner <marek.lindner@mailbox.org>
 * Simon Wunderlich <sw@simonwunderlich.de>
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index e774b48de9f5..f8f5766703d4 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -562,6 +562,12 @@ lacp_rate
 
 	The default is slow.
 
+broadcast_neighbor
+
+	Option specifying whether to broadcast ARP/ND packets to all
+	active slaves.  This option has no effect in modes other than
+	802.3ad mode.  The default is off (0).
+
 max_bonds
 
 	Specifies the number of bonding devices to create for this
@@ -767,8 +773,9 @@ num_unsol_na
 	greater than 1.
 
 	The valid range is 0 - 255; the default value is 1.  These options
-	affect only the active-backup mode.  These options were added for
-	bonding versions 3.3.0 and 3.4.0 respectively.
+	affect the active-backup mode and the 802.3ad mode (with
+	broadcast_neighbor enabled).  These options were added for bonding
+	versions 3.3.0 and 3.4.0 respectively.
 
 	From Linux 3.0 and bonding version 3.7.1, these notifications
 	are generated by the ipv4 and ipv6 code and the numbers of
@@ -1963,7 +1970,7 @@ obtain its hardware address from the first slave, which might not
 match the hardware address of the VLAN interfaces (which was
 ultimately copied from an earlier slave).
 
-There are two methods to insure that the VLAN device operates
+There are two methods to ensure that the VLAN device operates
 with the correct hardware address if all slaves are removed from a
 bond interface:
 
@@ -2078,7 +2085,7 @@ as an unsolicited ARP reply (because ARP matches replies on an
 interface basis), and is discarded.  The MII monitor is not affected
 by the state of the routing table.
 
-The solution here is simply to insure that slaves do not have
+The solution here is simply to ensure that slaves do not have
 routes of their own, and if for some reason they must, those routes do
 not supersede routes of their master.  This should generally be the
 case, but unusual configurations or errant manual or automatic static
@@ -2295,7 +2302,7 @@ active-backup:
 	the switches have an ISL and play together well.  If the
 	network configuration is such that one switch is specifically
 	a backup switch (e.g., has lower capacity, higher cost, etc),
-	then the primary option can be used to insure that the
+	then the primary option can be used to ensure that the
 	preferred link is always used when it is available.
 
 broadcast:
@@ -2322,7 +2329,7 @@ monitor can provide a higher level of reliability in detecting end to
 end connectivity failures (which may be caused by the failure of any
 individual component to pass traffic for any reason).  Additionally,
 the ARP monitor should be configured with multiple targets (at least
-one for each switch in the network).  This will insure that,
+one for each switch in the network).  This will ensure that,
 regardless of which switch is active, the ARP monitor has a suitable
 target to query.
 
@@ -2916,6 +2923,17 @@ from the bond (``ifenslave -d bond0 eth0``). The bonding driver will
 then restore the MAC addresses that the slaves had before they were
 enslaved.
 
+9.  What bonding modes support native XDP?
+------------------------------------------
+
+  * balance-rr (0)
+  * active-backup (1)
+  * balance-xor (2)
+  * 802.3ad (4)
+
+Note that the vlan+srcmac hash policy does not support native XDP.
+For other bonding modes, the XDP program must be loaded in generic mode.
+
 16. Resources and Links
 =======================
 
diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst
index 62519d38c58b..bc1b585355f7 100644
--- a/Documentation/networking/can.rst
+++ b/Documentation/networking/can.rst
@@ -699,10 +699,10 @@ RAW socket option CAN_RAW_JOIN_FILTERS
 
 The CAN_RAW socket can set multiple CAN identifier specific filters that
 lead to multiple filters in the af_can.c filter processing. These filters
-are indenpendent from each other which leads to logical OR'ed filters when
+are independent from each other which leads to logical OR'ed filters when
 applied (see :ref:`socketcan-rawfilter`).
 
-This socket option joines the given CAN filters in the way that only CAN
+This socket option joins the given CAN filters in the way that only CAN
 frames are passed to user space that matched *all* given CAN filters. The
 semantic for the applied filters is therefore changed to a logical AND.
 
@@ -1104,15 +1104,12 @@ for writing CAN network device driver are described below:
 General Settings
 ----------------
 
-.. code-block:: C
-
-    dev->type  = ARPHRD_CAN; /* the netdevice hardware type */
-    dev->flags = IFF_NOARP;  /* CAN has no arp */
+CAN network device drivers can use alloc_candev_mqs() and friends instead of
+alloc_netdev_mqs(), to automatically take care of CAN-specific setup:
 
-    dev->mtu = CAN_MTU; /* sizeof(struct can_frame) -> Classical CAN interface */
+.. code-block:: C
 
-    or alternative, when the controller supports CAN with flexible data rate:
-    dev->mtu = CANFD_MTU; /* sizeof(struct canfd_frame) -> CAN FD interface */
+    dev = alloc_candev_mqs(...);
 
 The struct can_frame or struct canfd_frame is the payload of each socket
 buffer (skbuff) in the protocol family PF_CAN.
diff --git a/Documentation/networking/cdc_mbim.rst b/Documentation/networking/cdc_mbim.rst
index 37f968acc473..8404a3f794f3 100644
--- a/Documentation/networking/cdc_mbim.rst
+++ b/Documentation/networking/cdc_mbim.rst
@@ -51,7 +51,7 @@ Such userspace applications includes, but are not limited to:
  - mbimcli (included with the libmbim [3] library), and
  - ModemManager [4]
 
-Establishing a MBIM IP session reequires at least these actions by the
+Establishing a MBIM IP session requires at least these actions by the
 management application:
 
  - open the control channel
diff --git a/Documentation/networking/dccp.rst b/Documentation/networking/dccp.rst
deleted file mode 100644
index 91e5c33ba3ff..000000000000
--- a/Documentation/networking/dccp.rst
+++ /dev/null
@@ -1,219 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=============
-DCCP protocol
-=============
-
-
-.. Contents
-   - Introduction
-   - Missing features
-   - Socket options
-   - Sysctl variables
-   - IOCTLs
-   - Other tunables
-   - Notes
-
-
-Introduction
-============
-Datagram Congestion Control Protocol (DCCP) is an unreliable, connection
-oriented protocol designed to solve issues present in UDP and TCP, particularly
-for real-time and multimedia (streaming) traffic.
-It divides into a base protocol (RFC 4340) and pluggable congestion control
-modules called CCIDs. Like pluggable TCP congestion control, at least one CCID
-needs to be enabled in order for the protocol to function properly. In the Linux
-implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as
-the TCP-friendly CCID3 (RFC 4342), are optional.
-For a brief introduction to CCIDs and suggestions for choosing a CCID to match
-given applications, see section 10 of RFC 4340.
-
-It has a base protocol and pluggable congestion control IDs (CCIDs).
-
-DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
-is at http://www.ietf.org/html.charters/dccp-charter.html
-
-
-Missing features
-================
-The Linux DCCP implementation does not currently support all the features that are
-specified in RFCs 4340...42.
-
-The known bugs are at:
-
-	http://www.linuxfoundation.org/collaborate/workgroups/networking/todo#DCCP
-
-For more up-to-date versions of the DCCP implementation, please consider using
-the experimental DCCP test tree; instructions for checking this out are on:
-http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree
-
-
-Socket options
-==============
-DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes
-a policy ID as argument and can only be set before the connection (i.e. changes
-during an established connection are not supported). Currently, two policies are
-defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special,
-and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an
-u32 priority value as ancillary data to sendmsg(), where higher numbers indicate
-a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to
-be formatted using a cmsg(3) message header filled in as follows::
-
-	cmsg->cmsg_level = SOL_DCCP;
-	cmsg->cmsg_type	 = DCCP_SCM_PRIORITY;
-	cmsg->cmsg_len	 = CMSG_LEN(sizeof(uint32_t));	/* or CMSG_LEN(4) */
-
-DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero
-value is always interpreted as unbounded queue length. If different from zero,
-the interpretation of this parameter depends on the current dequeuing policy
-(see above): the "simple" policy will enforce a fixed queue size by returning
-EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the
-lowest-priority packet first. The default value for this parameter is
-initialised from /proc/sys/net/dccp/default/tx_qlen.
-
-DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
-service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
-the socket will fall back to 0 (which means that no meaningful service code
-is present). On active sockets this is set before connect(); specifying more
-than one code has no effect (all subsequent service codes are ignored). The
-case is different for passive sockets, where multiple service codes (up to 32)
-can be set before calling bind().
-
-DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
-size (application payload size) in bytes, see RFC 4340, section 14.
-
-DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
-supported by the endpoint. The option value is an array of type uint8_t whose
-size is passed as option length. The minimum array size is 4 elements, the
-value returned in the optlen argument always reflects the true number of
-built-in CCIDs.
-
-DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
-time, combining the operation of the next two socket options. This option is
-preferable over the latter two, since often applications will use the same
-type of CCID for both directions; and mixed use of CCIDs is not currently well
-understood. This socket option takes as argument at least one uint8_t value, or
-an array of uint8_t values, which must match available CCIDS (see above). CCIDs
-must be registered on the socket before calling connect() or listen().
-
-DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
-the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
-Please note that the getsockopt argument type here is ``int``, not uint8_t.
-
-DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.
-
-DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
-timewait state when closing the connection (RFC 4340, 8.3). The usual case is
-that the closing server sends a CloseReq, whereupon the client holds timewait
-state. When this boolean socket option is on, the server sends a Close instead
-and will enter TIMEWAIT. This option must be set after accept() returns.
-
-DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
-partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
-always cover the entire packet and that only fully covered application data is
-accepted by the receiver. Hence, when using this feature on the sender, it must
-be enabled at the receiver, too with suitable choice of CsCov.
-
-DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
-	range 0..15 are acceptable. The default setting is 0 (full coverage),
-	values between 1..15 indicate partial coverage.
-
-DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
-	sets a threshold, where again values 0..15 are acceptable. The default
-	of 0 means that all packets with a partial coverage will be discarded.
-	Values in the range 1..15 indicate that packets with minimally such a
-	coverage value are also acceptable. The higher the number, the more
-	restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
-	settings are inherited to the child socket after accept().
-
-The following two options apply to CCID 3 exclusively and are getsockopt()-only.
-In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
-
-DCCP_SOCKOPT_CCID_RX_INFO
-	Returns a ``struct tfrc_rx_info`` in optval; the buffer for optval and
-	optlen must be set to at least sizeof(struct tfrc_rx_info).
-
-DCCP_SOCKOPT_CCID_TX_INFO
-	Returns a ``struct tfrc_tx_info`` in optval; the buffer for optval and
-	optlen must be set to at least sizeof(struct tfrc_tx_info).
-
-On unidirectional connections it is useful to close the unused half-connection
-via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.
-
-
-Sysctl variables
-================
-Several DCCP default parameters can be managed by the following sysctls
-(sysctl net.dccp.default or /proc/sys/net/dccp/default):
-
-request_retries
-	The number of active connection initiation retries (the number of
-	Requests minus one) before timing out. In addition, it also governs
-	the behaviour of the other, passive side: this variable also sets
-	the number of times DCCP repeats sending a Response when the initial
-	handshake does not progress from RESPOND to OPEN (i.e. when no Ack
-	is received after the initial Request).  This value should be greater
-	than 0, suggested is less than 10. Analogue of tcp_syn_retries.
-
-retries1
-	How often a DCCP Response is retransmitted until the listening DCCP
-	side considers its connecting peer dead. Analogue of tcp_retries1.
-
-retries2
-	The number of times a general DCCP packet is retransmitted. This has
-	importance for retransmitted acknowledgments and feature negotiation,
-	data packets are never retransmitted. Analogue of tcp_retries2.
-
-tx_ccid = 2
-	Default CCID for the sender-receiver half-connection. Depending on the
-	choice of CCID, the Send Ack Vector feature is enabled automatically.
-
-rx_ccid = 2
-	Default CCID for the receiver-sender half-connection; see tx_ccid.
-
-seq_window = 100
-	The initial sequence window (sec. 7.5.2) of the sender. This influences
-	the local ackno validity and the remote seqno validity windows (7.5.1).
-	Values in the range Wmin = 32 (RFC 4340, 7.5.2) up to 2^32-1 can be set.
-
-tx_qlen = 5
-	The size of the transmit buffer in packets. A value of 0 corresponds
-	to an unbounded transmit buffer.
-
-sync_ratelimit = 125 ms
-	The timeout between subsequent DCCP-Sync packets sent in response to
-	sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
-	of this parameter is milliseconds; a value of 0 disables rate-limiting.
-
-
-IOCTLS
-======
-FIONREAD
-	Works as in udp(7): returns in the ``int`` argument pointer the size of
-	the next pending datagram in bytes, or 0 when no datagram is pending.
-
-SIOCOUTQ
-	Returns the number of unsent data bytes in the socket send queue as ``int``
-	into the buffer specified by the argument pointer.
-
-Other tunables
-==============
-Per-route rto_min support
-	CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
-	of the RTO timer. This setting can be modified via the 'rto_min' option
-	of iproute2; for example::
-
-		> ip route change 10.0.0.0/24   rto_min 250j dev wlan0
-		> ip route add    10.0.0.254/32 rto_min 800j dev wlan0
-		> ip route show dev wlan0
-
-	CCID-3 also supports the rto_min setting: it is used to define the lower
-	bound for the expiry of the nofeedback timer. This can be useful on LANs
-	with very low RTTs (e.g., loopback, Gbit ethernet).
-
-
-Notes
-=====
-DCCP does not travel through NAT successfully at present on many boxes. This is
-because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
-support for DCCP has been added.
diff --git a/Documentation/networking/device_drivers/cable/index.rst b/Documentation/networking/device_drivers/cable/index.rst
deleted file mode 100644
index cce3c4392972..000000000000
--- a/Documentation/networking/device_drivers/cable/index.rst
+++ /dev/null
@@ -1,18 +0,0 @@
-.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
-
-Cable Modem Device Drivers
-==========================
-
-Contents:
-
-.. toctree::
-   :maxdepth: 2
-
-   sb1000
-
-.. only::  subproject and html
-
-   Indices
-   =======
-
-   * :ref:`genindex`
diff --git a/Documentation/networking/device_drivers/cable/sb1000.rst b/Documentation/networking/device_drivers/cable/sb1000.rst
deleted file mode 100644
index c8582ca4034d..000000000000
--- a/Documentation/networking/device_drivers/cable/sb1000.rst
+++ /dev/null
@@ -1,222 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===================
-SB100 device driver
-===================
-
-sb1000 is a module network device driver for the General Instrument (also known
-as NextLevel) SURFboard1000 internal cable modem board.  This is an ISA card
-which is used by a number of cable TV companies to provide cable modem access.
-It's a one-way downstream-only cable modem, meaning that your upstream net link
-is provided by your regular phone modem.
-
-This driver was written by Franco Venturi <fventuri@mediaone.net>.  He deserves
-a great deal of thanks for this wonderful piece of code!
-
-Needed tools
-============
-
-Support for this device is now a part of the standard Linux kernel.  The
-driver source code file is drivers/net/sb1000.c.  In addition to this
-you will need:
-
-1. The "cmconfig" program.  This is a utility which supplements "ifconfig"
-   to configure the cable modem and network interface (usually called "cm0");
-
-2. Several PPP scripts which live in /etc/ppp to make connecting via your
-   cable modem easy.
-
-   These utilities can be obtained from:
-
-      http://www.jacksonville.net/~fventuri/
-
-   in Franco's original source code distribution .tar.gz file.  Support for
-   the sb1000 driver can be found at:
-
-      - http://web.archive.org/web/%2E/http://home.adelphia.net/~siglercm/sb1000.html
-      - http://web.archive.org/web/%2E/http://linuxpower.cx/~cable/
-
-   along with these utilities.
-
-3. The standard isapnp tools.  These are necessary to configure your SB1000
-   card at boot time (or afterwards by hand) since it's a PnP card.
-
-   If you don't have these installed as a standard part of your Linux
-   distribution, you can find them at:
-
-      http://www.roestock.demon.co.uk/isapnptools/
-
-   or check your Linux distribution binary CD or their web site.  For help with
-   isapnp, pnpdump, or /etc/isapnp.conf, go to:
-
-      http://www.roestock.demon.co.uk/isapnptools/isapnpfaq.html
-
-Using the driver
-================
-
-To make the SB1000 card work, follow these steps:
-
-1. Run ``make config``, or ``make menuconfig``, or ``make xconfig``, whichever
-   you prefer, in the top kernel tree directory to set up your kernel
-   configuration.  Make sure to say "Y" to "Prompt for development drivers"
-   and to say "M" to the sb1000 driver.  Also say "Y" or "M" to all the standard
-   networking questions to get TCP/IP and PPP networking support.
-
-2. **BEFORE** you build the kernel, edit drivers/net/sb1000.c.  Make sure
-   to redefine the value of READ_DATA_PORT to match the I/O address used
-   by isapnp to access your PnP cards.  This is the value of READPORT in
-   /etc/isapnp.conf or given by the output of pnpdump.
-
-3. Build and install the kernel and modules as usual.
-
-4. Boot your new kernel following the usual procedures.
-
-5. Set up to configure the new SB1000 PnP card by capturing the output
-   of "pnpdump" to a file and editing this file to set the correct I/O ports,
-   IRQ, and DMA settings for all your PnP cards.  Make sure none of the settings
-   conflict with one another.  Then test this configuration by running the
-   "isapnp" command with your new config file as the input.  Check for
-   errors and fix as necessary.  (As an aside, I use I/O ports 0x110 and
-   0x310 and IRQ 11 for my SB1000 card and these work well for me.  YMMV.)
-   Then save the finished config file as /etc/isapnp.conf for proper
-   configuration on subsequent reboots.
-
-6. Download the original file sb1000-1.1.2.tar.gz from Franco's site or one of
-   the others referenced above.  As root, unpack it into a temporary directory
-   and do a ``make cmconfig`` and then ``install -c cmconfig /usr/local/sbin``.
-   Don't do ``make install`` because it expects to find all the utilities built
-   and ready for installation, not just cmconfig.
-
-7. As root, copy all the files under the ppp/ subdirectory in Franco's
-   tar file into /etc/ppp, being careful not to overwrite any files that are
-   already in there.  Then modify ppp@gi-on to set the correct login name,
-   phone number, and frequency for the cable modem.  Also edit pap-secrets
-   to specify your login name and password and any site-specific information
-   you need.
-
-8. Be sure to modify /etc/ppp/firewall to use ipchains instead of
-   the older ipfwadm commands from the 2.0.x kernels.  There's a neat utility to
-   convert ipfwadm commands to ipchains commands:
-
-	http://users.dhp.com/~whisper/ipfwadm2ipchains/
-
-   You may also wish to modify the firewall script to implement a different
-   firewalling scheme.
-
-9. Start the PPP connection via the script /etc/ppp/ppp@gi-on.  You must be
-   root to do this.  It's better to use a utility like sudo to execute
-   frequently used commands like this with root permissions if possible.  If you
-   connect successfully the cable modem interface will come up and you'll see a
-   driver message like this at the console::
-
-	 cm0: sb1000 at (0x110,0x310), csn 1, S/N 0x2a0d16d8, IRQ 11.
-	 sb1000.c:v1.1.2 6/01/98 (fventuri@mediaone.net)
-
-   The "ifconfig" command should show two new interfaces, ppp0 and cm0.
-
-   The command "cmconfig cm0" will give you information about the cable modem
-   interface.
-
-10. Try pinging a site via ``ping -c 5 www.yahoo.com``, for example.  You should
-    see packets received.
-
-11. If you can't get site names (like www.yahoo.com) to resolve into
-    IP addresses (like 204.71.200.67), be sure your /etc/resolv.conf file
-    has no syntax errors and has the right nameserver IP addresses in it.
-    If this doesn't help, try something like ``ping -c 5 204.71.200.67`` to
-    see if the networking is running but the DNS resolution is where the
-    problem lies.
-
-12. If you still have problems, go to the support web sites mentioned above
-    and read the information and documentation there.
-
-Common problems
-===============
-
-1. Packets go out on the ppp0 interface but don't come back on the cm0
-   interface.  It looks like I'm connected but I can't even ping any
-   numerical IP addresses.  (This happens predominantly on Debian systems due
-   to a default boot-time configuration script.)
-
-Solution
-   As root ``echo 0 > /proc/sys/net/ipv4/conf/cm0/rp_filter`` so it
-   can share the same IP address as the ppp0 interface.  Note that this
-   command should probably be added to the /etc/ppp/cablemodem script
-   *right*between* the "/sbin/ifconfig" and "/sbin/cmconfig" commands.
-   You may need to do this to /proc/sys/net/ipv4/conf/ppp0/rp_filter as well.
-   If you do this to /proc/sys/net/ipv4/conf/default/rp_filter on each reboot
-   (in rc.local or some such) then any interfaces can share the same IP
-   addresses.
-
-2. I get "unresolved symbol" error messages on executing ``insmod sb1000.o``.
-
-Solution
-   You probably have a non-matching kernel source tree and
-   /usr/include/linux and /usr/include/asm header files.  Make sure you
-   install the correct versions of the header files in these two directories.
-   Then rebuild and reinstall the kernel.
-
-3. When isapnp runs it reports an error, and my SB1000 card isn't working.
-
-Solution
-   There's a problem with later versions of isapnp using the "(CHECK)"
-   option in the lines that allocate the two I/O addresses for the SB1000 card.
-   This first popped up on RH 6.0.  Delete "(CHECK)" for the SB1000 I/O addresses.
-   Make sure they don't conflict with any other pieces of hardware first!  Then
-   rerun isapnp and go from there.
-
-4. I can't execute the /etc/ppp/ppp@gi-on file.
-
-Solution
-   As root do ``chmod ug+x /etc/ppp/ppp@gi-on``.
-
-5. The firewall script isn't working (with 2.2.x and higher kernels).
-
-Solution
-   Use the ipfwadm2ipchains script referenced above to convert the
-   /etc/ppp/firewall script from the deprecated ipfwadm commands to ipchains.
-
-6. I'm getting *tons* of firewall deny messages in the /var/kern.log,
-   /var/messages, and/or /var/syslog files, and they're filling up my /var
-   partition!!!
-
-Solution
-   First, tell your ISP that you're receiving DoS (Denial of Service)
-   and/or portscanning (UDP connection attempts) attacks!  Look over the deny
-   messages to figure out what the attack is and where it's coming from.  Next,
-   edit /etc/ppp/cablemodem and make sure the ",nobroadcast" option is turned on
-   to the "cmconfig" command (uncomment that line).  If you're not receiving these
-   denied packets on your broadcast interface (IP address xxx.yyy.zzz.255
-   typically), then someone is attacking your machine in particular.  Be careful
-   out there....
-
-7. Everything seems to work fine but my computer locks up after a while
-   (and typically during a lengthy download through the cable modem)!
-
-Solution
-   You may need to add a short delay in the driver to 'slow down' the
-   SURFboard because your PC might not be able to keep up with the transfer rate
-   of the SB1000. To do this, it's probably best to download Franco's
-   sb1000-1.1.2.tar.gz archive and build and install sb1000.o manually.  You'll
-   want to edit the 'Makefile' and look for the 'SB1000_DELAY'
-   define.  Uncomment those 'CFLAGS' lines (and comment out the default ones)
-   and try setting the delay to something like 60 microseconds with:
-   '-DSB1000_DELAY=60'.  Then do ``make`` and as root ``make install`` and try
-   it out.  If it still doesn't work or you like playing with the driver, you may
-   try other numbers.  Remember though that the higher the delay, the slower the
-   driver (which slows down the rest of the PC too when it is actively
-   used). Thanks to Ed Daiga for this tip!
-
-Credits
-=======
-
-This README came from Franco Venturi's original README file which is
-still supplied with his driver .tar.gz archive.  I and all other sb1000 users
-owe Franco a tremendous "Thank you!"  Additional thanks goes to Carl Patten
-and Ralph Bonnell who are now managing the Linux SB1000 web site, and to
-the SB1000 users who reported and helped debug the common problems listed
-above.
-
-
-					Clemmitt Sigler
-					csigler@vt.edu
diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
index 4561e8ab9e08..14784a0a6a8a 100644
--- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
+++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
@@ -56,6 +56,9 @@ ena_netdev.[ch]     Main Linux kernel driver.
 ena_ethtool.c       ethtool callbacks.
 ena_xdp.[ch]        XDP files
 ena_pci_id_tbl.h    Supported device IDs.
+ena_phc.[ch]        PTP hardware clock infrastructure (see `PHC`_ for more info)
+ena_devlink.[ch]    devlink files.
+ena_debugfs.[ch]    debugfs files.
 =================   ======================================================
 
 Management Interface:
@@ -221,6 +224,99 @@ descriptor it was received on would be recycled. When a packet smaller
 than RX copybreak bytes is received, it is copied into a new memory
 buffer and the RX descriptor is returned to HW.
 
+.. _`PHC`:
+
+PTP Hardware Clock (PHC)
+========================
+.. _`ptp-userspace-api`: https://docs.kernel.org/driver-api/ptp.html#ptp-hardware-clock-user-space-api
+.. _`testptp`: https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/ptp/testptp.c
+
+The ENA Linux driver supports a PTP hardware clock, which provides a timestamp reference with nanosecond resolution.
+
+**PHC support**
+
+PHC depends on the PTP module, which needs to be either loaded as a module or compiled into the kernel.
+
+Verify if the PTP module is present:
+
+.. code-block:: shell
+
+  grep -w '^CONFIG_PTP_1588_CLOCK=[ym]' /boot/config-`uname -r`
+
+- If no output is provided, the ENA driver cannot be loaded with PHC support.
+
+**PHC activation**
+
+The feature is turned off by default. To turn it on, load the ENA driver
+in the following way:
+
+- devlink:
+
+.. code-block:: shell
+
+  sudo devlink dev param set pci/<domain:bus:slot.function> name enable_phc value true cmode driverinit
+  sudo devlink dev reload pci/<domain:bus:slot.function>
+  # for example:
+  sudo devlink dev param set pci/0000:00:06.0 name enable_phc value true cmode driverinit
+  sudo devlink dev reload pci/0000:00:06.0
+
+All available PTP clock sources are listed here:
+
+.. code-block:: shell
+
+  ls /sys/class/ptp
+
+PHC support and capabilities can be verified using ethtool:
+
+.. code-block:: shell
+
+  ethtool -T <interface>
+
+**PHC timestamp**
+
+To retrieve the PHC timestamp, use the `ptp-userspace-api`_. A usage example using `testptp`_:
+
+.. code-block:: shell
+
+  testptp -d /dev/ptp$(ethtool -T <interface> | awk '/PTP Hardware Clock:/ {print $NF}') -k 1
+
+PHC get time requests should be kept within reasonable bounds;
+avoid excessive utilization to ensure optimal performance and efficiency.
+The ENA device restricts the frequency of PHC get time requests to a maximum
+of 125 requests per second. If this limit is surpassed, the get time request
+will fail, leading to an increment in the phc_err_ts statistic.
+
+**PHC statistics**
+
+PHC can be monitored using debugfs (if mounted):
+
+.. code-block:: shell
+
+  sudo cat /sys/kernel/debug/<domain:bus:slot.function>/phc_stats
+
+  # for example:
+  sudo cat /sys/kernel/debug/0000:00:06.0/phc_stats
+
+PHC errors must remain below 1% of all PHC requests to maintain the desired level of accuracy and reliability.
+
+=================   ======================================================
+**phc_cnt**         | Number of successfully retrieved timestamps (below expire timeout).
+**phc_exp**         | Number of expired retrieved timestamps (above expire timeout).
+**phc_skp**         | Number of skipped get time attempts (during block period).
+**phc_err_dv**      | Number of failed get time attempts due to device errors (entering into block state).
+**phc_err_ts**      | Number of failed get time attempts due to timestamp errors (entering into block state).
+                    | This occurs if the driver exceeded the request limit or the device received an invalid timestamp.
+=================   ======================================================
+
+PHC timeouts:
+
+=================   ======================================================
+**expire**          | Max time for a valid timestamp retrieval; passing this threshold will fail
+                    | the get time request and block new requests until block timeout.
+**block**           | Blocking period starts once a get time request expires or fails;
+                    | all get time requests during the block period will be skipped.
+=================   ======================================================
+
 Statistics
 ==========
 
@@ -268,6 +364,18 @@ RSS
 - The user can provide a hash key, hash function, and configure the
   indirection table through `ethtool(8)`.
 
+DEVLINK SUPPORT
+===============
+.. _`devlink`: https://www.kernel.org/doc/html/latest/networking/devlink/index.html
+
+`devlink`_ supports reloading the driver and initiating re-negotiation with the ENA device:
+
+.. code-block:: shell
+
+  sudo devlink dev reload pci/<domain:bus:slot.function>
+  # for example:
+  sudo devlink dev reload pci/0000:00:06.0
+
 DATA PATH
 =========
 
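Note on the PHC statistics added to ena.rst above: the 1% error budget can be checked with a few lines of arithmetic. The sketch below is an illustration only, not part of the driver or of this patch. It assumes, hypothetically, that "errors" means phc_err_dv + phc_err_ts and that "all PHC requests" is the sum of all five counters; the patch text does not spell out either definition. The function names are made up for this example, and the counters are passed in as a plain dict because the on-disk layout of phc_stats is not described here.

```python
# Hypothetical helpers; the definitions of "errors" and "all requests"
# below are assumptions, not taken from the ENA driver.

def phc_error_ratio(stats):
    """Return failed get time attempts as a fraction of all attempts."""
    failed = stats["phc_err_dv"] + stats["phc_err_ts"]
    total = stats["phc_cnt"] + stats["phc_exp"] + stats["phc_skp"] + failed
    # No requests yet means no errors to report.
    return failed / total if total else 0.0


def phc_within_budget(stats, limit=0.01):
    """True if the error ratio is strictly below the limit (1% by default)."""
    return phc_error_ratio(stats) < limit


sample = {"phc_cnt": 9950, "phc_exp": 20, "phc_skp": 10,
          "phc_err_dv": 5, "phc_err_ts": 15}
print(phc_error_ratio(sample), phc_within_budget(sample))
```

If expired or skipped requests should also count against the budget, only the `failed` sum needs to change.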
diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst
index 8bf411b857d4..5f3885e56f58 100644
--- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst
+++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst
@@ -70,7 +70,7 @@ the DPSW object that it will probe:
 Besides the configuration of the actual DPSW object, the dpaa2-switch driver
 will need the following DPAA2 objects:
 
- * 1 DPMCP - A Management Command Portal object is needed for any interraction
+ * 1 DPMCP - A Management Command Portal object is needed for any interaction
    with the MC firmware.
 
  * 1 DPBP - A Buffer Pool is used for seeding buffers intended for the Rx path
diff --git a/Documentation/networking/device_drivers/ethernet/huawei/hinic3.rst b/Documentation/networking/device_drivers/ethernet/huawei/hinic3.rst
new file mode 100644
index 000000000000..e3dfd083fa52
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/huawei/hinic3.rst
@@ -0,0 +1,137 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================================================
+Linux kernel driver for Huawei Ethernet Device Driver (hinic3) family
+=====================================================================
+
+Overview
+========
+
+The hinic3 is a network interface card (NIC) for data centers. It supports
+a range of link-speed devices (10GE, 25GE, 100GE, etc.). The hinic3
+devices can have multiple physical forms: LOM (Lan on Motherboard) NIC,
+PCIe standard NIC, OCP (Open Compute Project) NIC, etc.
+
+The hinic3 driver supports the following features:
+- IPv4/IPv6 TCP/UDP checksum offload
+- TSO (TCP Segmentation Offload), LRO (Large Receive Offload)
+- RSS (Receive Side Scaling)
+- MSI-X interrupt aggregation configuration and interrupt adaptation
+- SR-IOV (Single Root I/O Virtualization)
+
+Content
+=======
+
+- Supported PCI vendor ID/device IDs
+- Source Code Structure of Hinic3 Driver
+- Management Interface
+
+Supported PCI vendor ID/device IDs
+==================================
+
+19e5:0222 - hinic3 PF/PPF
+19e5:375F - hinic3 VF
+
+The Prime Physical Function (PPF) is responsible for the management of the
+whole NIC, for example, clock synchronization between the NIC and
+the host. Any PF may serve as a PPF. The PPF is selected dynamically.
+
+Source Code Structure of Hinic3 Driver
+======================================
+
+========================  ================================================
+hinic3_pci_id_tbl.h       Supported device IDs
+hinic3_hw_intf.h          Interface between HW and driver
+hinic3_queue_common.[ch]  Common structures and methods for NIC queues
+hinic3_common.[ch]        Encapsulation of memory operations in Linux
+hinic3_csr.h              Register definitions in the BAR
+hinic3_hwif.[ch]          Interface for BAR
+hinic3_eqs.[ch]           Interface for AEQs and CEQs
+hinic3_mbox.[ch]          Interface for mailbox
+hinic3_mgmt.[ch]          Management interface based on mailbox and AEQ
+hinic3_wq.[ch]            Work queue data structures and interface
+hinic3_cmdq.[ch]          Command queue is used to post command to HW
+hinic3_hwdev.[ch]         HW structures and methods abstractions
+hinic3_lld.[ch]           Auxiliary driver adaptation layer
+hinic3_hw_comm.[ch]       Interface for common HW operations
+hinic3_mgmt_interface.h   Interface between firmware and driver
+hinic3_hw_cfg.[ch]        Interface for HW configuration
+hinic3_irq.c              Interrupt request
+hinic3_netdev_ops.c       Operations registered to Linux kernel stack
+hinic3_nic_dev.h          NIC structures and methods abstractions
+hinic3_main.c             Main Linux kernel driver
+hinic3_nic_cfg.[ch]       NIC service configuration
+hinic3_nic_io.[ch]        Management plane interface for TX and RX
+hinic3_rss.[ch]           Interface for Receive Side Scaling (RSS)
+hinic3_rx.[ch]            Interface for receive
+hinic3_tx.[ch]            Interface for transmit
+hinic3_ethtool.c          Interface for ethtool operations (ops)
+hinic3_filter.c           Interface for MAC address
+========================  ================================================
+
+Management Interface
+====================
+
+Asynchronous Event Queue (AEQ)
+------------------------------
+
+AEQ receives high priority events from the HW over a descriptor queue.
+Every descriptor is a fixed size of 64 bytes. AEQ can receive solicited or
+unsolicited events. Every device, VF or PF, can have up to 4 AEQs.
+Every AEQ is associated with a dedicated IRQ. AEQ can receive multiple types
+of events, but in practice the hinic3 driver ignores all events except for
+2 mailbox related events.
+
+Mailbox
+-------
+
+Mailbox is a communication mechanism between the hinic3 driver and the HW.
+Each device has an independent mailbox. Driver can use the mailbox to send
+requests to management. Driver receives mailbox messages, such as responses
+to requests, over the AEQ (using event HINIC3_AEQ_FOR_MBOX). Due to the
+limited size of mailbox data register, mailbox messages are sent
+segment-by-segment.
+
+Every device can use its mailbox to post requests to firmware. The mailbox
+can also be used to post requests and responses between the PF and its VFs.
+
+Completion Event Queue (CEQ)
+----------------------------
+
+The implementation of CEQ is the same as AEQ. It receives completion events
+from HW over a fixed size descriptor of 32 bits. Every device can have up
+to 32 CEQs. Every CEQ has a dedicated IRQ. CEQ only receives solicited
+events that are responses to requests from the driver. CEQ can receive
+multiple types of events, but in practice the hinic3 driver ignores all
+events except for HINIC3_CMDQ that represents completion of previously
+posted commands on a cmdq.
+
+Command Queue (cmdq)
+--------------------
+
+Every cmdq has a dedicated work queue on which commands are posted.
+Commands on the work queue are fixed-size descriptors of 64 bytes.
+Completion of a command will be indicated using ctrl bits in the
+descriptor that carried the command. Notification of command completions
+will also be provided via event on CEQ. Every device has 4 command queues
+that are initialized as a set (called cmdqs), each with its own type.
+The hinic3 driver only uses type HINIC3_CMDQ_SYNC.
+
+Work Queues (WQ)
+----------------
+
+Work queues are logical arrays of fixed size WQEs. The array may be spread
+over multiple non-contiguous pages using an indirection table. Work queues are
+used by I/O queues and command queues.
+
+Global function ID
+------------------
+
+Every function, PF or VF, has a unique ordinal identification within the device.
+Many management commands (mbox or cmdq) contain this ID so HW can apply the
+command effect to the right function.
+
+PF is allowed to post management commands to a subordinate VF by specifying the
+VF's ID. A VF must provide its own ID. Anti-spoofing in the HW will cause
+a command from a VF to fail if it contains the wrong ID.
+
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst
index 6fc1961492b7..40ac552641a3 100644
--- a/Documentation/networking/device_drivers/ethernet/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/index.rst
@@ -28,6 +28,7 @@ Contents:
    freescale/gianfar
    google/gve
    huawei/hinic
+   huawei/hinic3
    intel/e100
    intel/e1000
    intel/e1000e
@@ -55,9 +56,11 @@ Contents:
    ti/cpsw_switchdev
    ti/am65_nuss_cpsw_switchdev
    ti/tlan
-   toshiba/spider_net
+   ti/icssg_prueth
    wangxun/txgbe
+   wangxun/txgbevf
    wangxun/ngbe
+   wangxun/ngbevf
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
index 4fbaa1a2d674..53d9d5829d69 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
@@ -299,6 +299,18 @@ Use ethtool to view and set link-down-on-close, as follows::
   ethtool --show-priv-flags ethX
   ethtool --set-priv-flags ethX link-down-on-close [on|off]
 
+Setting the mdd-auto-reset-vf Private Flag
+------------------------------------------
+
+When the mdd-auto-reset-vf private flag is set to "on", the problematic VF will
+be automatically reset if a malformed descriptor is detected. If the flag is
+set to "off", the problematic VF will be disabled.
+
+Use ethtool to view and set mdd-auto-reset-vf, as follows::
+
+  ethtool --show-priv-flags ethX
+  ethtool --set-priv-flags ethX mdd-auto-reset-vf [on|off]
+
 Viewing Link Messages
 ---------------------
 Link messages will not be displayed to the console if the distribution is
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index 934752f675ba..0bca293cf9cb 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -101,6 +101,37 @@ example, if Rx packets are 10 and Netdev (software statistics) displays
 rx_bytes as "X", then ethtool (hardware statistics) will display rx_bytes as
 "X+40" (4 bytes CRC x 10 packets).
 
+ethtool reset
+-------------
+The driver supports 3 types of resets:
+
+- PF reset - resets only components associated with the given PF, does not
+  impact other PFs
+
+- CORE reset - whole adapter is affected, resets all PFs
+
+- GLOBAL reset - same as CORE but MAC and PHY components are also reinitialized
+
+These are mapped to ethtool reset flags as follows:
+
+- PF reset::
+
+    # ethtool --reset <ethX> irq dma filter offload
+
+- CORE reset::
+
+    # ethtool --reset <ethX> irq-shared dma-shared filter-shared offload-shared \
+    ram-shared
+
+- GLOBAL reset::
+
+    # ethtool --reset <ethX> irq-shared dma-shared filter-shared offload-shared \
+    mac-shared phy-shared ram-shared
+
+In switchdev mode you can reset a VF using port representor::
+
+  # ethtool --reset <repr> irq dma filter offload
+
 
 Viewing Link Messages
 ---------------------
@@ -896,6 +927,19 @@ To enable/disable UDP Segmentation Offload, issue the following command::
 
   # ethtool -K <ethX> tx-udp-segmentation [off|on]
 
+PTP pin interface
+-----------------
+All adapters support the standard PTP pin interface. SDPs (Software Definable
+Pins) are single-ended pins with both periodic output and external timestamp
+supported. There are also specific differential input/output pins (TIME_SYNC,
+1PPS) with only one of the functions supported.
+
+There are adapters with DPLL, where pins are connected to the DPLL instead of
+being exposed on the board. You have to be aware that in those configurations,
+only SDP pins are exposed and each pin has its own fixed direction.
+To see input signal on those PTP pins, you need to configure DPLL properly.
+Output signal is only visible on DPLL and to send it to the board SMA/U.FL pins,
+DPLL output pins have to be manually configured.
 
 GNSS module
 -----------
diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
index 1e196cb9ce25..a52850602cd8 100644
--- a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
+++ b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
@@ -14,6 +14,7 @@ Contents
 - `Basic packet flow`_
 - `Devlink health reporters`_
 - `Quality of service`_
+- `RVU representors`_
 
 Overview
 ========
@@ -65,7 +66,7 @@ Admin Function driver
 As mentioned above RVU PF0 is called the admin function (AF), this driver
 supports resource provisioning and configuration of functional blocks.
 Doesn't handle any I/O. It sets up few basic stuff but most of the
-funcionality is achieved via configuration requests from PFs and VFs.
+functionality is achieved via configuration requests from PFs and VFs.
 
 PF/VFs communicates with AF via a shared memory region (mailbox). Upon
 receiving requests AF does resource provisioning and other HW configuration.
@@ -340,3 +341,93 @@ Setup HTB offload
         # tc class add dev <interface> parent 1: classid 1:2 htb rate 10Gbit prio 2 quantum 188416
 
         # tc class add dev <interface> parent 1: classid 1:3 htb rate 10Gbit prio 2 quantum 32768
+
+
+RVU Representors
+================
+
+The RVU representor driver adds support for creation of representor devices for
+RVU PFs' VFs in the system. Representor devices are created when the user enables
+switchdev mode.
+Switchdev mode can be enabled either before or after setting up SRIOV numVFs.
+All representor devices share a single NIXLF but each has dedicated Rx/Tx
+queues. The RVU PF representor driver registers a separate netdev for each
+Rx/Tx queue pair.
+
+Current HW does not support a built-in switch which can do L2 learning and
+forward packets between representee and representor. Hence, the packet path
+between a representee and its representor is achieved by setting up appropriate
+NPC MCAM filters.
+Transmit packets matching these filters are looped back through the hardware
+loopback channel/interface (i.e., instead of being sent out of the MAC
+interface), where they again match the installed filters and are forwarded.
+This way the representee => representor and representor => representee packet
+path is achieved. These rules are installed when representors are created
+and are activated/deactivated based on the representor/representee interface state.
+
+Usage example:
+
+ - Change device to switchdev mode::
+
+	# devlink dev eswitch set pci/0002:1c:00.0 mode switchdev
+
+ - List of representor devices on the system::
+
+	# ip link show
+	Rpf1vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether f6:43:83:ee:26:21 brd ff:ff:ff:ff:ff:ff
+	Rpf1vf1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 12:b2:54:0e:24:54 brd ff:ff:ff:ff:ff:ff
+	Rpf1vf2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 4a:12:c4:4c:32:62 brd ff:ff:ff:ff:ff:ff
+	Rpf1vf3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether ca:cb:68:0e:e2:6e brd ff:ff:ff:ff:ff:ff
+	Rpf2vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 06:cc:ad:b4:f0:93 brd ff:ff:ff:ff:ff:ff
+
+
+To delete the representor devices from the system, change the device to legacy mode.
+
+ - Change device to legacy mode::
+
+	# devlink dev eswitch set pci/0002:1c:00.0 mode legacy
+
+RVU representors can be managed using devlink ports
+(see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
+
+ - Show devlink ports of representors::
+
+	# devlink port
+	pci/0002:1c:00.0/0: type eth netdev Rpf1vf0 flavour physical port 0 splittable false
+	pci/0002:1c:00.0/1: type eth netdev Rpf1vf1 flavour pcivf controller 0 pfnum 1 vfnum 1 external false splittable false
+	pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
+	pci/0002:1c:00.0/3: type eth netdev Rpf1vf3 flavour pcivf controller 0 pfnum 1 vfnum 3 external false splittable false
+
+Function attributes
+===================
+
+The RVU representor driver supports function attributes for representors.
+Port function configuration of the representors is supported through the devlink eswitch port.
+
+MAC address setup
+-----------------
+
+The RVU representor driver supports the devlink port function attr mechanism to set up
+the MAC address (refer to Documentation/networking/devlink/devlink-port.rst).
+
+ - To set up the MAC address for port 2::
+
+	# devlink port function set pci/0002:1c:00.0/2 hw_addr 5c:a1:1b:5e:43:11
+	# devlink port show pci/0002:1c:00.0/2
+	pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
+	function:
+		hw_addr 5c:a1:1b:5e:43:11
+
+
+TC offload
+==========
+
+The RVU representor driver implements support for offloading tc rules using port representors.
+
+ - Drop packets with VLAN ID 3::
+
+	# tc filter add dev Rpf1vf0 protocol 802.1Q parent ffff: flower vlan_id 3 vlan_ethtype ipv4 skip_sw action drop
+
+ - Redirect IPv4 packets with VLAN ID 5 to eth1, after stripping the VLAN header::
+
+	# tc filter add dev Rpf1vf0 ingress protocol 802.1Q flower vlan_id 5 vlan_ethtype ipv4 skip_sw action vlan pop action mirred ingress redirect dev eth1
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index 99d95be4d159..754c81436408 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -1082,6 +1082,11 @@ like flow control, FEC and more.
        need to replace the cable/transceiver.
      - Error
 
+   * - `total_success_recovery_phy`
+     - The total number of successful recovery events of any type during
+       the port's reset cycle.
+     - Error
+
    * - `rx_out_of_buffer`
      - Number of times receive queue had no software buffers allocated for the
        adapter's incoming traffic.
@@ -1336,3 +1341,35 @@ Device Counters
      - The number of times the device owned queue had not enough buffers
        allocated.
      - Error
+
+   * - `pci_bw_inbound_high`
+     - The number of times the device crossed the high inbound PCIe bandwidth
+       threshold. To be compared to pci_bw_inbound_low to check if the device
+       is in a congested state.
+       If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
+       If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_inbound_low`
+     - The number of times the device crossed the low inbound PCIe bandwidth
+       threshold. To be compared to pci_bw_inbound_high to check if the device
+       is in a congested state.
+       If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
+       If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_outbound_high`
+     - The number of times the device crossed the high outbound PCIe bandwidth
+       threshold. To be compared to pci_bw_outbound_low to check if the device
+       is in a congested state.
+       If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
+       If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_outbound_low`
+     - The number of times the device crossed the low outbound PCIe bandwidth
+       threshold. To be compared to pci_bw_outbound_high to check if the device
+       is in a congested state.
+       If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
+       If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
+     - Informative
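Note on the pci_bw_* counters added to counters.rst above: the congestion rule they describe is a direct comparison of the high and low crossing counts. The sketch below is an illustration under stated assumptions, not driver code. The function name and the dict of counter values are hypothetical, and how the values are collected is left to the caller.

```python
# Hypothetical helper illustrating the rule stated in the counter descriptions:
#   high == low -> not congested
#   high >  low -> congested

def pci_bw_congested(counters, direction):
    """Return True if the given direction ("inbound" or "outbound") is congested."""
    high = counters[f"pci_bw_{direction}_high"]
    low = counters[f"pci_bw_{direction}_low"]
    # The documentation does not describe high < low; it is reported here
    # as "not congested" purely as an assumption of this sketch.
    return high > low


sample = {
    "pci_bw_inbound_high": 7, "pci_bw_inbound_low": 6,
    "pci_bw_outbound_high": 3, "pci_bw_outbound_low": 3,
}
for direction in ("inbound", "outbound"):
    print(direction, pci_bw_congested(sample, direction))
```

Because both counters only ever count threshold crossings, a single snapshot is enough for this check; no rate calculation is involved.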
diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
index 32ff114f5c26..afb8353daefd 100644
--- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
+++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
@@ -27,3 +27,136 @@ driver takes over.
 devlink dev info provides version information for all three components. In
 addition to the version the hg commit hash of the build is included as a
 separate entry.
+
+Configuration
+-------------
+
+Ringparams (ethtool -g / -G)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+fbnic has two submission (host -> device) rings for every completion
+(device -> host) ring. The three ring objects together form a single
+"queue" as used by higher layer software (an Rx or a Tx queue).
+
+For Rx the two submission rings are used to pass empty pages to the NIC.
+Ring 0 is the Header Page Queue (HPQ); the NIC uses its pages to place
+L2-L4 headers (or full frames if frame is not header-data split).
+Ring 1 is the Payload Page Queue (PPQ) and is used for packet payloads.
+The completion ring is used to receive packet notifications / metadata.
+ethtool ``rx`` ringparam maps to the size of the completion ring,
+``rx-mini`` to the HPQ, and ``rx-jumbo`` to the PPQ.
+
+For Tx both submission rings can be used to submit packets; the completion
+ring carries notifications for both. fbnic uses one of the submission
+rings for normal traffic from the stack and the second one for XDP frames.
+ethtool ``tx`` ringparam controls both the size of the submission rings
+and the completion ring.
+
+Every single entry on the HPQ and PPQ (``rx-mini``, ``rx-jumbo``)
+corresponds to 4kB of allocated memory, while entries on the remaining
+rings are in units of descriptors (8B). The ideal ratio of submission
+and completion ring sizes will depend on the workload, as for small packets
+multiple packets will fit into a single page.
+
+Upgrading Firmware
+------------------
+
+fbnic supports updating firmware using signed PLDM images with devlink dev
+flash. PLDM images are written into the flash. Flashing does not interrupt
+the operation of the device.
+
+On host boot the latest UEFI driver is always used; no explicit activation
+is required. Firmware activation is required to run new control firmware. cmrt
+firmware can only be activated by power cycling the NIC.
+
+Statistics
+----------
+
+TX MAC Interface
+~~~~~~~~~~~~~~~~
+
+ - ``ptp_illegal_req``: packets sent to the NIC with PTP request bit set but routed to BMC/FW
+ - ``ptp_good_ts``: packets successfully routed to MAC with PTP request bit set
+ - ``ptp_bad_ts``: packets destined for MAC with PTP request bit set but aborted because of some error (e.g., DMA read error)
+
+TX Extension (TEI) Interface (TTI)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ - ``tti_cm_drop``: control messages dropped at the TX Extension (TEI) Interface because of credit starvation
+ - ``tti_frame_drop``: packets dropped at the TX Extension (TEI) Interface because of credit starvation
+ - ``tti_tbi_drop``: packets dropped at the TX BMC Interface (TBI) because of credit starvation
+
+RXB (RX Buffer) Enqueue
+~~~~~~~~~~~~~~~~~~~~~~~
+
+ - ``rxb_integrity_err[i]``: frames enqueued with integrity errors (e.g., multi-bit ECC errors) on RXB input i
+ - ``rxb_mac_err[i]``: frames enqueued with MAC end-of-frame errors (e.g., bad FCS) on RXB input i
+ - ``rxb_parser_err[i]``: frames that experienced RPC parser errors
+ - ``rxb_frm_err[i]``: frames that experienced signaling errors (e.g., missing end-of-packet/start-of-packet) on RXB input i
+ - ``rxb_drbo[i]_frames``: frames received at RXB input i
+ - ``rxb_drbo[i]_bytes``: bytes received at RXB input i
+
+RXB (RX Buffer) FIFO
+~~~~~~~~~~~~~~~~~~~~
+
+ - ``rxb_fifo[i]_drop``: transitions into the drop state on RXB pool i
+ - ``rxb_fifo[i]_dropped_frames``: frames dropped on RXB pool i
+ - ``rxb_fifo[i]_ecn``: transitions into the ECN mark state on RXB pool i
+ - ``rxb_fifo[i]_level``: current occupancy of RXB pool i
+
+RXB (RX Buffer) Dequeue
+~~~~~~~~~~~~~~~~~~~~~~~
+
+ - ``rxb_intf[i]_frames``: frames sent to the output i
+ - ``rxb_intf[i]_bytes``: bytes sent to the output i
+ - ``rxb_pbuf[i]_frames``: frames sent to output i from the perspective of internal packet buffer
+ - ``rxb_pbuf[i]_bytes``: bytes sent to output i from the perspective of internal packet buffer
+
+RPC (Rx parser)
+~~~~~~~~~~~~~~~
+
+ - ``rpc_unkn_etype``: frames containing unknown EtherType
+ - ``rpc_unkn_ext_hdr``: frames containing unknown IPv6 extension header
+ - ``rpc_ipv4_frag``: frames containing IPv4 fragment
+ - ``rpc_ipv6_frag``: frames containing IPv6 fragment
+ - ``rpc_ipv4_esp``: frames with IPv4 ESP encapsulation
+ - ``rpc_ipv6_esp``: frames with IPv6 ESP encapsulation
+ - ``rpc_tcp_opt_err``: frames which encountered TCP option parsing error
+ - ``rpc_out_of_hdr_err``: frames where header was larger than parsable region
+ - ``ovr_size_err``: oversized frames
+
+Hardware Queues
+~~~~~~~~~~~~~~~
+
+1. RX DMA Engine:
+
+ - ``rde_[i]_pkt_err``: packets with MAC EOP, RPC parser, RXB truncation, or RDE frame truncation errors. These errors are flagged in the packet metadata because of cut-through support, but the actual drop happens once PCIE/RDE is reached.
+ - ``rde_[i]_pkt_cq_drop``: packets dropped because RCQ is full
+ - ``rde_[i]_pkt_bdq_drop``: packets dropped because HPQ or PPQ ran out of host buffer
+
+PCIe
+~~~~
+
+The fbnic driver exposes PCIe hardware performance statistics through debugfs
+(``pcie_stats``). These statistics provide insights into PCIe transaction
+behavior and potential performance bottlenecks.
+
+1. PCIe Transaction Counters:
+
+   These counters track PCIe transaction activity:
+        - ``pcie_ob_rd_tlp``: Outbound read Transaction Layer Packets count
+        - ``pcie_ob_rd_dword``: DWORDs transferred in outbound read transactions
+        - ``pcie_ob_wr_tlp``: Outbound write Transaction Layer Packets count
+        - ``pcie_ob_wr_dword``: DWORDs transferred in outbound write
+	  transactions
+        - ``pcie_ob_cpl_tlp``: Outbound completion TLP count
+        - ``pcie_ob_cpl_dword``: DWORDs transferred in outbound completion TLPs
+
+2. PCIe Resource Monitoring:
+
+   These counters indicate PCIe resource exhaustion events:
+        - ``pcie_ob_rd_no_tag``: Read requests dropped due to tag unavailability
+        - ``pcie_ob_rd_no_cpl_cred``: Read requests dropped due to completion
+	  credit exhaustion
+        - ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
+	  credit exhaustion
diff --git a/Documentation/networking/device_drivers/ethernet/ti/cpsw.rst b/Documentation/networking/device_drivers/ethernet/ti/cpsw.rst
index a88946bd188b..d3e130455043 100644
--- a/Documentation/networking/device_drivers/ethernet/ti/cpsw.rst
+++ b/Documentation/networking/device_drivers/ethernet/ti/cpsw.rst
@@ -268,14 +268,14 @@ Example 1: One port tx AVB configuration scheme for target board
 
 	// Run your appropriate tools with socket option "SO_PRIORITY"
 	// to 3 for class A and/or to 2 for class B
-	// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+	// (I took at https://lore.kernel.org/r/20171017010128.22141-1-vinicius.gomes@intel.com/)
 	./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
 	./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
 
 13) ::
 
 	// run your listener on workstation (should be in same vlan)
-	// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+	// (I took at https://lore.kernel.org/r/20171017010128.22141-1-vinicius.gomes@intel.com/)
 	./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
 	Receiving data rate: 39012 kbps
 	Receiving data rate: 39012 kbps
@@ -555,7 +555,7 @@ Example 2: Two port tx AVB configuration scheme for target board
 20) ::
 
 	// run your listener on workstation (should be in same vlan)
-	// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+	// (I took at https://lore.kernel.org/r/20171017010128.22141-1-vinicius.gomes@intel.com/)
 	./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
 	Receiving data rate: 39012 kbps
 	Receiving data rate: 39012 kbps
diff --git a/Documentation/networking/device_drivers/ethernet/ti/icssg_prueth.rst b/Documentation/networking/device_drivers/ethernet/ti/icssg_prueth.rst
new file mode 100644
index 000000000000..da21ddf431bb
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/ti/icssg_prueth.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================
+Texas Instruments ICSSG PRUETH ethernet driver
+==============================================
+
+:Version: 1.0
+
+ICSSG Firmware
+==============
+
+Every ICSSG core has two Programmable Real-Time Units (PRUs), two auxiliary
+Real-Time Transfer Units (RTUs), and two Transmit Real-Time Transfer Units
+(TX_PRUs). Each of these runs its own firmware. Together, these firmwares are
+referred to as the ICSSG Firmware.
+
+Firmware Statistics
+===================
+
+The ICSSG firmware maintains certain statistics which are dumped by the driver
+via ``ethtool -S <interface>``.
+
+These statistics are as follows:
+
+ - ``FW_RTU_PKT_DROP``: Diagnostic error counter which increments when RTU drops a locally injected packet due to port being disabled or rule violation.
+ - ``FW_Q0_OVERFLOW``: TX overflow counter for queue0
+ - ``FW_Q1_OVERFLOW``: TX overflow counter for queue1
+ - ``FW_Q2_OVERFLOW``: TX overflow counter for queue2
+ - ``FW_Q3_OVERFLOW``: TX overflow counter for queue3
+ - ``FW_Q4_OVERFLOW``: TX overflow counter for queue4
+ - ``FW_Q5_OVERFLOW``: TX overflow counter for queue5
+ - ``FW_Q6_OVERFLOW``: TX overflow counter for queue6
+ - ``FW_Q7_OVERFLOW``: TX overflow counter for queue7
+ - ``FW_DROPPED_PKT``: This counter is incremented when a packet is dropped at PRU because of rule violation.
+ - ``FW_RX_ERROR``: Incremented if there was a CRC error or Min/Max frame error at PRU
+ - ``FW_RX_DS_INVALID``: Incremented when RTU detects Data Status invalid condition
+ - ``FW_TX_DROPPED_PACKET``: Counter for packets dropped via TX Port
+ - ``FW_TX_TS_DROPPED_PACKET``: Counter for packets with TS flag dropped via TX Port
+ - ``FW_INF_PORT_DISABLED``: Incremented when RX frame is dropped due to port being disabled
+ - ``FW_INF_SAV``: Incremented when RX frame is dropped due to Source Address violation
+ - ``FW_INF_SA_DL``: Incremented when RX frame is dropped due to Source Address being in the denylist
+ - ``FW_INF_PORT_BLOCKED``: Incremented when RX frame is dropped due to port being blocked and frame being a special frame
+ - ``FW_INF_DROP_TAGGED``: Incremented when RX frame is dropped for being tagged
+ - ``FW_INF_DROP_PRIOTAGGED``: Incremented when RX frame is dropped for being priority tagged
+ - ``FW_INF_DROP_NOTAG``: Incremented when RX frame is dropped for being untagged
+ - ``FW_INF_DROP_NOTMEMBER``: Incremented when RX frame is dropped for port not being member of VLAN
+ - ``FW_RX_EOF_SHORT_FRMERR``: Incremented if End Of Frame (EOF) task is scheduled without seeing RX_B1
+ - ``FW_RX_B0_DROP_EARLY_EOF``: Incremented when frame is dropped due to Early EOF
+ - ``FW_TX_JUMBO_FRM_CUTOFF``: Incremented when frame is cut off to prevent packet size > 2000 Bytes
+ - ``FW_RX_EXP_FRAG_Q_DROP``: Incremented when express frame is received in the same queue as the previous fragment
+ - ``FW_RX_FIFO_OVERRUN``: RX fifo overrun counter
+ - ``FW_CUT_THR_PKT``: Incremented when a packet is forwarded using Cut-Through forwarding method
+ - ``FW_HOST_RX_PKT_CNT``: Number of valid packets sent by Rx PRU to Host on PSI
+ - ``FW_HOST_TX_PKT_CNT``: Number of valid packets copied by RTU0 to Tx queues
+ - ``FW_HOST_EGRESS_Q_PRE_OVERFLOW``: Host Egress Q (Pre-emptible) Overflow Counter
+ - ``FW_HOST_EGRESS_Q_EXP_OVERFLOW``: Host Egress Q (Express) Overflow Counter
diff --git a/Documentation/networking/device_drivers/ethernet/toshiba/spider_net.rst b/Documentation/networking/device_drivers/ethernet/toshiba/spider_net.rst
deleted file mode 100644
index fe5b32be15cd..000000000000
--- a/Documentation/networking/device_drivers/ethernet/toshiba/spider_net.rst
+++ /dev/null
@@ -1,202 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===========================
-The Spidernet Device Driver
-===========================
-
-Written by Linas Vepstas <linas@austin.ibm.com>
-
-Version of 7 June 2007
-
-Abstract
-========
-This document sketches the structure of portions of the spidernet
-device driver in the Linux kernel tree. The spidernet is a gigabit
-ethernet device built into the Toshiba southbridge commonly used
-in the SONY Playstation 3 and the IBM QS20 Cell blade.
-
-The Structure of the RX Ring.
-=============================
-The receive (RX) ring is a circular linked list of RX descriptors,
-together with three pointers into the ring that are used to manage its
-contents.
-
-The elements of the ring are called "descriptors" or "descrs"; they
-describe the received data. This includes a pointer to a buffer
-containing the received data, the buffer size, and various status bits.
-
-There are three primary states that a descriptor can be in: "empty",
-"full" and "not-in-use".  An "empty" or "ready" descriptor is ready
-to receive data from the hardware. A "full" descriptor has data in it,
-and is waiting to be emptied and processed by the OS. A "not-in-use"
-descriptor is neither empty or full; it is simply not ready. It may
-not even have a data buffer in it, or is otherwise unusable.
-
-During normal operation, on device startup, the OS (specifically, the
-spidernet device driver) allocates a set of RX descriptors and RX
-buffers. These are all marked "empty", ready to receive data. This
-ring is handed off to the hardware, which sequentially fills in the
-buffers, and marks them "full". The OS follows up, taking the full
-buffers, processing them, and re-marking them empty.
-
-This filling and emptying is managed by three pointers, the "head"
-and "tail" pointers, managed by the OS, and a hardware current
-descriptor pointer (GDACTDPA). The GDACTDPA points at the descr
-currently being filled. When this descr is filled, the hardware
-marks it full, and advances the GDACTDPA by one.  Thus, when there is
-flowing RX traffic, every descr behind it should be marked "full",
-and everything in front of it should be "empty".  If the hardware
-discovers that the current descr is not empty, it will signal an
-interrupt, and halt processing.
-
-The tail pointer tails or trails the hardware pointer. When the
-hardware is ahead, the tail pointer will be pointing at a "full"
-descr. The OS will process this descr, and then mark it "not-in-use",
-and advance the tail pointer.  Thus, when there is flowing RX traffic,
-all of the descrs in front of the tail pointer should be "full", and
-all of those behind it should be "not-in-use". When RX traffic is not
-flowing, then the tail pointer can catch up to the hardware pointer.
-The OS will then note that the current tail is "empty", and halt
-processing.
-
-The head pointer (somewhat mis-named) follows after the tail pointer.
-When traffic is flowing, then the head pointer will be pointing at
-a "not-in-use" descr. The OS will perform various housekeeping duties
-on this descr. This includes allocating a new data buffer and
-dma-mapping it so as to make it visible to the hardware. The OS will
-then mark the descr as "empty", ready to receive data. Thus, when there
-is flowing RX traffic, everything in front of the head pointer should
-be "not-in-use", and everything behind it should be "empty". If no
-RX traffic is flowing, then the head pointer can catch up to the tail
-pointer, at which point the OS will notice that the head descr is
-"empty", and it will halt processing.
-
-Thus, in an idle system, the GDACTDPA, tail and head pointers will
-all be pointing at the same descr, which should be "empty". All of the
-other descrs in the ring should be "empty" as well.
-
-The show_rx_chain() routine will print out the locations of the
-GDACTDPA, tail and head pointers. It will also summarize the contents
-of the ring, starting at the tail pointer, and listing the status
-of the descrs that follow.
-
-A typical example of the output, for a nearly idle system, might be::
-
-    net eth1: Total number of descrs=256
-    net eth1: Chain tail located at descr=20
-    net eth1: Chain head is at 20
-    net eth1: HW curr desc (GDACTDPA) is at 21
-    net eth1: Have 1 descrs with stat=x40800101
-    net eth1: HW next desc (GDACNEXTDA) is at 22
-    net eth1: Last 255 descrs with stat=xa0800000
-
-In the above, the hardware has filled in one descr, number 20. Both
-head and tail are pointing at 20, because it has not yet been emptied.
-Meanwhile, hw is pointing at 21, which is free.
-
-The "Have nnn decrs" refers to the descr starting at the tail: in this
-case, nnn=1 descr, starting at descr 20. The "Last nnn descrs" refers
-to all of the rest of the descrs, from the last status change. The "nnn"
-is a count of how many descrs have exactly the same status.
-
-The status x4... corresponds to "full" and status xa... corresponds
-to "empty". The actual value printed is RXCOMST_A.
-
-In the device driver source code, a different set of names are
-used for these same concepts, so that::
-
-    "empty" == SPIDER_NET_DESCR_CARDOWNED == 0xa
-    "full"  == SPIDER_NET_DESCR_FRAME_END == 0x4
-    "not in use" == SPIDER_NET_DESCR_NOT_IN_USE == 0xf
-
-
-The RX RAM full bug/feature
-===========================
-
-As long as the OS can empty out the RX buffers at a rate faster than
-the hardware can fill them, there is no problem. If, for some reason,
-the OS fails to empty the RX ring fast enough, the hardware GDACTDPA
-pointer will catch up to the head, notice the not-empty condition,
-ad stop. However, RX packets may still continue arriving on the wire.
-The spidernet chip can save some limited number of these in local RAM.
-When this local ram fills up, the spider chip will issue an interrupt
-indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit
-will be set in GHIINT1STS).  When the RX ram full condition occurs,
-a certain bug/feature is triggered that has to be specially handled.
-This section describes the special handling for this condition.
-
-When the OS finally has a chance to run, it will empty out the RX ring.
-In particular, it will clear the descriptor on which the hardware had
-stopped. However, once the hardware has decided that a certain
-descriptor is invalid, it will not restart at that descriptor; instead
-it will restart at the next descr. This potentially will lead to a
-deadlock condition, as the tail pointer will be pointing at this descr,
-which, from the OS point of view, is empty; the OS will be waiting for
-this descr to be filled. However, the hardware has skipped this descr,
-and is filling the next descrs. Since the OS doesn't see this, there
-is a potential deadlock, with the OS waiting for one descr to fill,
-while the hardware is waiting for a different set of descrs to become
-empty.
-
-A call to show_rx_chain() at this point indicates the nature of the
-problem. A typical print when the network is hung shows the following::
-
-    net eth1: Spider RX RAM full, incoming packets might be discarded!
-    net eth1: Total number of descrs=256
-    net eth1: Chain tail located at descr=255
-    net eth1: Chain head is at 255
-    net eth1: HW curr desc (GDACTDPA) is at 0
-    net eth1: Have 1 descrs with stat=xa0800000
-    net eth1: HW next desc (GDACNEXTDA) is at 1
-    net eth1: Have 127 descrs with stat=x40800101
-    net eth1: Have 1 descrs with stat=x40800001
-    net eth1: Have 126 descrs with stat=x40800101
-    net eth1: Last 1 descrs with stat=xa0800000
-
-Both the tail and head pointers are pointing at descr 255, which is
-marked xa... which is "empty". Thus, from the OS point of view, there
-is nothing to be done. In particular, there is the implicit assumption
-that everything in front of the "empty" descr must surely also be empty,
-as explained in the last section. The OS is waiting for descr 255 to
-become non-empty, which, in this case, will never happen.
-
-The HW pointer is at descr 0. This descr is marked 0x4.. or "full".
-Since its already full, the hardware can do nothing more, and thus has
-halted processing. Notice that descrs 0 through 254 are all marked
-"full", while descr 254 and 255 are empty. (The "Last 1 descrs" is
-descr 254, since tail was at 255.) Thus, the system is deadlocked,
-and there can be no forward progress; the OS thinks there's nothing
-to do, and the hardware has nowhere to put incoming data.
-
-This bug/feature is worked around with the spider_net_resync_head_ptr()
-routine. When the driver receives RX interrupts, but an examination
-of the RX chain seems to show it is empty, then it is probable that
-the hardware has skipped a descr or two (sometimes dozens under heavy
-network conditions). The spider_net_resync_head_ptr() subroutine will
-search the ring for the next full descr, and the driver will resume
-operations there.  Since this will leave "holes" in the ring, there
-is also a spider_net_resync_tail_ptr() that will skip over such holes.
-
-As of this writing, the spider_net_resync() strategy seems to work very
-well, even under heavy network loads.
-
-
-The TX ring
-===========
-The TX ring uses a low-watermark interrupt scheme to make sure that
-the TX queue is appropriately serviced for large packet sizes.
-
-For packet sizes greater than about 1KBytes, the kernel can fill
-the TX ring quicker than the device can drain it. Once the ring
-is full, the netdev is stopped. When there is room in the ring,
-the netdev needs to be reawakened, so that more TX packets are placed
-in the ring. The hardware can empty the ring about four times per jiffy,
-so its not appropriate to wait for the poll routine to refill, since
-the poll routine runs only once per jiffy.  The low-watermark mechanism
-marks a descr about 1/4th of the way from the bottom of the queue, so
-that an interrupt is generated when the descr is processed. This
-interrupt wakes up the netdev, which can then refill the queue.
-For large packets, this mechanism generates a relatively small number
-of interrupts, about 1K/sec. For smaller packets, this will drop to zero
-interrupts, as the hardware can empty the queue faster than the kernel
-can fill it.
diff --git a/Documentation/networking/device_drivers/ethernet/wangxun/ngbevf.rst b/Documentation/networking/device_drivers/ethernet/wangxun/ngbevf.rst
new file mode 100644
index 000000000000..a39e3d5a1038
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/wangxun/ngbevf.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+==================================================================
+Linux Base Virtual Function Driver for Wangxun(R) Gigabit Ethernet
+==================================================================
+
+WangXun Gigabit Virtual Function Linux driver.
+Copyright(c) 2015 - 2025 Beijing WangXun Technology Co., Ltd.
+
+Support
+=======
+For general information, go to the website at:
+https://www.net-swift.com
+
+If you have any problems, contact the Wangxun support team at
+nic-support@net-swift.com and Cc: netdev.
diff --git a/Documentation/networking/device_drivers/ethernet/wangxun/txgbevf.rst b/Documentation/networking/device_drivers/ethernet/wangxun/txgbevf.rst
new file mode 100644
index 000000000000..b2f759b7b518
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/wangxun/txgbevf.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+===========================================================================
+Linux Base Virtual Function Driver for Wangxun(R) 10/25/40 Gigabit Ethernet
+===========================================================================
+
+WangXun 10/25/40 Gigabit Virtual Function Linux driver.
+Copyright(c) 2015 - 2025 Beijing WangXun Technology Co., Ltd.
+
+Support
+=======
+For general information, go to the website at:
+https://www.net-swift.com
+
+If you have any problems, contact the Wangxun support team at
+nic-support@net-swift.com and Cc: netdev.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 0dd30a84ce25..a254af25b7ef 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -9,7 +9,6 @@ Contents:
    :maxdepth: 2
 
    atm/index
-   cable/index
    can/index
    cellular/index
    ethernet/index
diff --git a/Documentation/networking/device_drivers/wwan/t7xx.rst b/Documentation/networking/device_drivers/wwan/t7xx.rst
index f346f5f85f15..e07de7700dfc 100644
--- a/Documentation/networking/device_drivers/wwan/t7xx.rst
+++ b/Documentation/networking/device_drivers/wwan/t7xx.rst
@@ -7,12 +7,13 @@
 ============================================
 t7xx driver for MTK PCIe based T700 5G modem
 ============================================
-The t7xx driver is a WWAN PCIe host driver developed for linux or Chrome OS platforms
-for data exchange over PCIe interface between Host platform & MediaTek's T700 5G modem.
-The driver exposes an interface conforming to the MBIM protocol [1]. Any front end
-application (e.g. Modem Manager) could easily manage the MBIM interface to enable
-data communication towards WWAN. The driver also provides an interface to interact
-with the MediaTek's modem via AT commands.
+The t7xx driver is a WWAN PCIe host driver developed for Linux or Chrome OS
+platforms for data exchange over the PCIe interface between the host platform
+and MediaTek's T700 5G modem.
+The driver exposes an interface conforming to the MBIM protocol [1]. Any front
+end application (e.g. Modem Manager) can manage the MBIM interface to
+enable data communication towards WWAN. The driver also provides an interface
+to interact with MediaTek's modem via AT commands.
 
 Basic usage
 ===========
@@ -45,8 +46,8 @@ The driver provides sysfs interfaces to userspace.
 
 t7xx_mode
 ---------
-The sysfs interface provides userspace with access to the device mode, this interface
-supports read and write operations.
+The sysfs interface provides userspace with access to the device mode. This
+interface supports read and write operations.
 
 Device mode:
 
@@ -67,6 +68,28 @@ Write from userspace to set the device mode.
 ::
   $ echo fastboot_switching > /sys/bus/pci/devices/${bdf}/t7xx_mode
 
+t7xx_debug_ports
+----------------
+The sysfs interface lets userspace enable or disable the debug ports. This
+interface supports read and write operations.
+
+Debug port status:
+
+- ``1``: debug ports are enabled
+- ``0``: debug ports are disabled
+
+Currently supported debug ports: ADB and MIPC.
+
+Read from userspace to get the current debug ports status.
+
+::
+  $ cat /sys/bus/pci/devices/${bdf}/t7xx_debug_ports
+
+Write from userspace to set the debug ports status.
+
+::
+  $ echo 1 > /sys/bus/pci/devices/${bdf}/t7xx_debug_ports
+
 Management application development
 ==================================
 The driver and userspace interfaces are described below. The MBIM protocol is
@@ -139,6 +162,25 @@ Please note that driver needs to be reloaded to export /dev/wwan0fastboot0
 port, because device needs a cold reset after enter ``fastboot_switching``
 mode.
 
+ADB port userspace ABI
+----------------------
+
+/dev/wwan0adb0 character device
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver exposes an ADB protocol interface by implementing ADB WWAN Port.
+The userspace end of the ADB channel pipe is a /dev/wwan0adb0 character device.
+Application shall use this interface for ADB protocol communication.
+
+MIPC port userspace ABI
+-----------------------
+
+/dev/wwan0mipc0 character device
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver exposes a diagnostic interface by implementing MIPC (Modem
+Information Process Center) WWAN Port. The userspace end of the MIPC channel
+pipe is a /dev/wwan0mipc0 character device.
+Application shall use this interface for MTK modem diagnostic communication.
+
 The MediaTek's T700 modem supports the 3GPP TS 27.007 [4] specification.
 
 References
@@ -164,3 +206,9 @@ speak the Mobile Interface Broadband Model (MBIM) protocol"*
 [5] *fastboot "a mechanism for communicating with bootloaders"*
 
 - https://android.googlesource.com/platform/system/core/+/refs/heads/main/fastboot/README.md
+
+[6] *ADB (Android Debug Bridge) "a mechanism to keep track of Android devices
+and emulators instances connected to or running on a given host developer
+machine with ADB protocol"*
+
+- https://android.googlesource.com/platform/packages/modules/adb/+/refs/heads/main/README.md
diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst
index a4fb27663cd6..9a8b3d76d11f 100644
--- a/Documentation/networking/devlink/bnxt.rst
+++ b/Documentation/networking/devlink/bnxt.rst
@@ -24,6 +24,8 @@ Parameters
      - Permanent
    * - ``enable_remote_dev_reset``
      - Runtime
+   * - ``enable_roce``
+     - Permanent
 
 The ``bnxt`` driver also implements the following driver-specific
 parameters.
diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst
index 23073bc219d8..dd6adc4d0559 100644
--- a/Documentation/networking/devlink/devlink-info.rst
+++ b/Documentation/networking/devlink/devlink-info.rst
@@ -86,6 +86,10 @@ In case software/firmware components are loaded from the disk (e.g.
 ``/lib/firmware``) only the running version should be reported via
 the kernel API.
 
+Please note that any security versions reported via devlink are purely
+informational. Devlink does not use a secure channel to communicate with
+the device.
+
 Generic Versions
 ================
 
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index 4e01dc32bc08..211b58177e12 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -137,3 +137,9 @@ own name.
    * - ``event_eq_size``
      - u32
      - Control the size of asynchronous control events EQ.
+   * - ``enable_phc``
+     - Boolean
+     - Enable PHC (PTP Hardware Clock) functionality in the device.
+   * - ``clock_id``
+     - u64
+     - Clock ID used by the device for registering DPLL devices and pins.
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
index 9d22d41a7cd1..5e397798a402 100644
--- a/Documentation/networking/devlink/devlink-port.rst
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -418,6 +418,14 @@ API allows to configure following rate object's parameters:
   to all node children limits. ``tx_max`` is an upper limit for children.
   ``tx_share`` is a total bandwidth distributed among children.
 
+``tc_bw``
+  Allow users to set the bandwidth allocation per traffic class on rate
+  objects. This enables fine-grained QoS configurations by assigning a relative
+  share value to each traffic class. The bandwidth is distributed in proportion
+  to the share value for each class, relative to the sum of all shares.
+  When applied to a non-leaf node, tc_bw determines how bandwidth is shared
+  among its child elements.
+
 ``tx_priority`` and ``tx_weight`` can be used simultaneously. In that case
 nodes with the same priority form a WFQ subgroup in the sibling group
 and arbitration among them is based on assigned weights.
diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst
index 2c14dfe69b3a..5885e21e2212 100644
--- a/Documentation/networking/devlink/devlink-trap.rst
+++ b/Documentation/networking/devlink/devlink-trap.rst
@@ -451,7 +451,7 @@ be added to the following table:
    * - ``udp_parsing``
      - ``drop``
      - Traps packets dropped due to an error in the UDP header parsing.
-       This packet trap could include checksum errorrs, an improper UDP
+       This packet trap could include checksum errors, an improper UDP
        length detected (smaller than 8 bytes) or detection of header
        truncation.
    * - ``tcp_parsing``
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index e3972d03cea0..792e9f8c846a 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -69,6 +69,17 @@ Parameters
 
        To verify that value has been set:
        $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
+   * - ``msix_vec_per_pf_max``
+     - driverinit
+     - Set the maximum number of MSI-X vectors that can be used by the PF; the
+       rest can be utilized for SR-IOV. The range is from the minimum value set
+       in ``msix_vec_per_pf_min`` to 2k/number of ports.
+   * - ``msix_vec_per_pf_min``
+     - driverinit
+     - Set the minimum number of MSI-X vectors that will be used by the PF. This
+       value indicates how many MSI-X vectors will be allocated statically. The
+       range is from 2 to the value set in ``msix_vec_per_pf_max``.
+
 .. list-table:: Driver specific parameters implemented
     :widths: 5 5 90
 
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 948c8c44e233..270a65a01411 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -84,6 +84,9 @@ parameters, info versions, and other features it supports.
    i40e
    ionic
    ice
+   ixgbe
+   kvaser_pciefd
+   kvaser_usb
    mlx4
    mlx5
    mlxsw
@@ -97,3 +100,4 @@ parameters, info versions, and other features it supports.
    iosm
    octeontx2
    sfc
+   zl3073x
diff --git a/Documentation/networking/devlink/ixgbe.rst b/Documentation/networking/devlink/ixgbe.rst
new file mode 100644
index 000000000000..c27d1436c70e
--- /dev/null
+++ b/Documentation/networking/devlink/ixgbe.rst
@@ -0,0 +1,171 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+ixgbe devlink support
+=====================
+
+This document describes the devlink features implemented by the ``ixgbe``
+device driver.
+
+Info versions
+=============
+
+Any security-related versions presented by ``devlink-info`` are purely
+informational. Devlink does not use a secure channel to communicate
+with the device.
+
+The ``ixgbe`` driver reports the following versions:
+
+.. list-table:: devlink info versions implemented
+    :widths: 5 5 5 90
+
+    * - Name
+      - Type
+      - Example
+      - Description
+    * - ``board.id``
+      - fixed
+      - H49289-000
+      - The Product Board Assembly (PBA) identifier of the board.
+    * - ``fw.undi``
+      - running
+      - 1.1937.0
+      - Version of the Option ROM containing the UEFI driver. The version is
+        reported in ``major.minor.patch`` format. The major version is
+        incremented whenever a major breaking change occurs, or when the
+        minor version would overflow. The minor version is incremented for
+        non-breaking changes and reset to 1 when the major version is
+        incremented. The patch version is normally 0 but is incremented when
+        a fix is delivered as a patch against an older base Option ROM.
+    * - ``fw.undi.srev``
+      - running
+      - 4
+      - Number indicating the security revision of the Option ROM.
+    * - ``fw.bundle_id``
+      - running
+      - 0x80000d0d
+      - Unique identifier of the firmware image file that was loaded onto
+        the device. Also referred to as the EETRACK identifier of the NVM.
+    * - ``fw.mgmt.api``
+      - running
+      - 1.5.1
+      - 3-digit version number (major.minor.patch) of the API exported over
+        the AdminQ by the management firmware. Used by the driver to
+        identify what commands are supported. Historical versions of the
+        kernel only displayed a 2-digit version number (major.minor).
+    * - ``fw.mgmt.build``
+      - running
+      - 0x305d955f
+      - Unique identifier of the source for the management firmware.
+    * - ``fw.mgmt.srev``
+      - running
+      - 3
+      - Number indicating the security revision of the firmware.
+    * - ``fw.psid.api``
+      - running
+      - 0.80
+      - Version defining the format of the flash contents.
+    * - ``fw.netlist``
+      - running
+      - 1.1.2000-6.7.0
+      - The version of the netlist module. This module defines the device's
+        Ethernet capabilities and default settings, and is used by the
+        management firmware as part of managing link and device
+        connectivity.
+    * - ``fw.netlist.build``
+      - running
+      - 0xee16ced7
+      - The first 4 bytes of the hash of the netlist module contents.
+
+Flash Update
+============
+
+The ``ixgbe`` driver implements support for flash update using the
+``devlink-flash`` interface. It supports updating the device flash using a
+combined flash image that contains the ``fw.mgmt``, ``fw.undi``, and
+``fw.netlist`` components.
+
+.. list-table:: List of supported overwrite modes
+   :widths: 5 95
+
+   * - Bits
+     - Behavior
+   * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS``
+     - Do not preserve settings stored in the flash components being
+       updated. This includes overwriting the port configuration that
+       determines the number of physical functions the device will
+       initialize with.
+   * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS`` and ``DEVLINK_FLASH_OVERWRITE_IDENTIFIERS``
+     - Do not preserve either settings or identifiers. Overwrite everything
+       in the flash with the contents from the provided image, without
+       performing any preservation. This includes overwriting device
+       identifying fields such as the MAC address, Vital Product Data (VPD)
+       area, and device serial number. This combination is expected to be
+       used with an image customized for the specific device.
+
+Reload
+======
+
+The ``ixgbe`` driver supports activating new firmware after a flash update
+using ``DEVLINK_CMD_RELOAD`` with the ``DEVLINK_RELOAD_ACTION_FW_ACTIVATE``
+action.
+
+.. code:: shell
+
+    $ devlink dev reload pci/0000:01:00.0 reload action fw_activate
+
+The new firmware is activated by issuing a device specific Embedded
+Management Processor reset which requests the device to reset and reload the
+EMP firmware image.
+
+The driver does not currently support reloading the driver via
+``DEVLINK_RELOAD_ACTION_DRIVER_REINIT``.
+
+Regions
+=======
+
+The ``ixgbe`` driver implements the following regions for accessing internal
+device data.
+
+.. list-table:: regions implemented
+    :widths: 15 85
+
+    * - Name
+      - Description
+    * - ``nvm-flash``
+      - The contents of the entire flash chip, sometimes referred to as
+        the device's Non Volatile Memory.
+    * - ``shadow-ram``
+      - The contents of the Shadow RAM, which is loaded from the beginning
+        of the flash. Although the contents are primarily from the flash,
+        this area also contains data generated during device boot which is
+        not stored in flash.
+    * - ``device-caps``
+      - The contents of the device firmware's capabilities buffer. Useful to
+        determine the current state and configuration of the device.
+
+Both the ``nvm-flash`` and ``shadow-ram`` regions can be accessed without a
+snapshot. The ``device-caps`` region requires a snapshot as the contents are
+sent by firmware and can't be split into separate reads.
+
+Users can request an immediate capture of a snapshot for all three regions
+via the ``DEVLINK_CMD_REGION_NEW`` command.
+
+.. code:: shell
+
+    $ devlink region show
+    pci/0000:01:00.0/nvm-flash: size 10485760 snapshot [] max 1
+    pci/0000:01:00.0/device-caps: size 4096 snapshot [] max 10
+
+    $ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1
+
+    $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
+    0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+    0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
+    0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
+    0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
+
+    $ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0 length 16
+    0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+
+    $ devlink region delete pci/0000:01:00.0/nvm-flash snapshot 1
diff --git a/Documentation/networking/devlink/kvaser_pciefd.rst b/Documentation/networking/devlink/kvaser_pciefd.rst
new file mode 100644
index 000000000000..075edd2a508a
--- /dev/null
+++ b/Documentation/networking/devlink/kvaser_pciefd.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+kvaser_pciefd devlink support
+=============================
+
+This document describes the devlink features implemented by the
+``kvaser_pciefd`` device driver.
+
+Info versions
+=============
+
+The ``kvaser_pciefd`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+   :widths: 5 5 90
+
+   * - Name
+     - Type
+     - Description
+   * - ``fw``
+     - running
+     - Version of the firmware running on the device. Also available
+       through ``ethtool -i`` as ``firmware-version``.
diff --git a/Documentation/networking/devlink/kvaser_usb.rst b/Documentation/networking/devlink/kvaser_usb.rst
new file mode 100644
index 000000000000..403db3766cb4
--- /dev/null
+++ b/Documentation/networking/devlink/kvaser_usb.rst
@@ -0,0 +1,33 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+kvaser_usb devlink support
+==========================
+
+This document describes the devlink features implemented by the
+``kvaser_usb`` device driver.
+
+Info versions
+=============
+
+The ``kvaser_usb`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+   :widths: 5 5 90
+
+   * - Name
+     - Type
+     - Description
+   * - ``fw``
+     - running
+     - Version of the firmware running on the device. Also available
+       through ``ethtool -i`` as ``firmware-version``.
+   * - ``board.rev``
+     - fixed
+     - The device hardware revision.
+   * - ``board.id``
+     - fixed
+     - The device EAN (product number).
+   * - ``serial_number``
+     - fixed
+     - The device serial number.
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 456985407475..7febe0aecd53 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -53,6 +53,9 @@ parameters.
        * ``smfs`` Software managed flow steering. In SMFS mode, the HW
          steering entities are created and manage through the driver without
          firmware intervention.
+       * ``hmfs`` Hardware managed flow steering. In HMFS mode, the driver
+         configures steering rules directly in the HW using Work Queues with
+         a special new type of WQE (Work Queue Element).
 
        SMFS mode is faster and provides better rule insertion rate compared to
        default DMFS mode.
@@ -277,6 +280,10 @@ Description of the vnic counters:
 	number of packets handled by the VNIC experiencing unexpected steering
 	failure (at any point in steering flow owned by the VNIC, including the FDB
 	for the eswitch owner).
+- icm_consumption
+        amount of Interconnect Host Memory (ICM) consumed by the vnic in
+        granularity of 4KB. ICM is host memory allocated by SW upon HCA request
+        and is used for storing data structures that control HCA operation.
 
 User commands examples:
 
diff --git a/Documentation/networking/devlink/netdevsim.rst b/Documentation/networking/devlink/netdevsim.rst
index 88482725422c..3932004eae82 100644
--- a/Documentation/networking/devlink/netdevsim.rst
+++ b/Documentation/networking/devlink/netdevsim.rst
@@ -62,7 +62,7 @@ Rate objects
 
 The ``netdevsim`` driver supports rate objects management, which includes:
 
-- registerging/unregistering leaf rate objects per VF devlink port;
+- registering/unregistering leaf rate objects per VF devlink port;
 - creation/deletion node rate objects;
 - setting tx_share and tx_max rate values for any rate object type;
 - setting parent node for any rate object type.
diff --git a/Documentation/networking/devlink/octeontx2.rst b/Documentation/networking/devlink/octeontx2.rst
index d33a90dd44bf..84206537aedb 100644
--- a/Documentation/networking/devlink/octeontx2.rst
+++ b/Documentation/networking/devlink/octeontx2.rst
@@ -40,6 +40,27 @@ The ``octeontx2 AF`` driver implements the following driver-specific parameters.
      - runtime
      - Use to set the quantum which hardware uses for scheduling among transmit queues.
        Hardware uses weighted DWRR algorithm to schedule among all transmit queues.
+   * - ``npc_mcam_high_zone_percent``
+     - u8
+     - runtime
+     - Use to set the number of high priority zone entries in NPC MCAM that can be allocated
+       by a user, out of the three priority zone categories high, mid and low.
+   * - ``npc_def_rule_cntr``
+     - bool
+     - runtime
+     - Use to enable or disable hit counters for the default rules in NPC MCAM.
+       It's not guaranteed that counters get enabled and mapped to all the default rules,
+       since the counters are scarce and the driver follows a best-effort approach.
+       The default rule serves as the primary packet steering rule for a specific PF or VF,
+       based on its DMAC address, which is installed by the AF driver as part of its initialization.
+       A sample command to read hit counters for the default rules from debugfs is:
+       ``cat /sys/kernel/debug/cn10k/npc/mcam_rules``
+   * - ``nix_maxlf``
+     - u16
+     - runtime
+     - Use to set the maximum number of LFs in NIX hardware block. This would be useful
+       to increase the availability of default resources allocated to enabled LFs like
+       MCAM entries for example.
 
 The ``octeontx2 PF`` driver implements the following driver-specific parameters.
 
diff --git a/Documentation/networking/devlink/sfc.rst b/Documentation/networking/devlink/sfc.rst
index db64a1bd9733..0398d59ea184 100644
--- a/Documentation/networking/devlink/sfc.rst
+++ b/Documentation/networking/devlink/sfc.rst
@@ -5,7 +5,7 @@ sfc devlink support
 ===================
 
 This document describes the devlink features implemented by the ``sfc``
-device driver for the ef100 device.
+device driver for the ef10 and ef100 devices.
 
 Info versions
 =============
@@ -18,6 +18,10 @@ The ``sfc`` driver reports the following versions
    * - Name
      - Type
      - Description
+   * - ``fw.bundle_id``
+     - stored
+     - Version of the firmware "bundle" image that was last used to update
+       multiple components.
    * - ``fw.mgmt.suc``
      - running
      - For boards where the management function is split between multiple
@@ -55,3 +59,13 @@ The ``sfc`` driver reports the following versions
    * - ``fw.uefi``
      - running
      - UEFI driver version (No UNDI support).
+
+Flash Update
+============
+
+The ``sfc`` driver implements support for flash update using the
+``devlink-flash`` interface. It supports updating the device flash using a
+combined flash image ("bundle") that contains multiple components (on ef10,
+typically ``fw.mgmt``, ``fw.app``, ``fw.exprom`` and ``fw.uefi``).
+
+The driver does not support any overwrite mask flags.
diff --git a/Documentation/networking/devlink/zl3073x.rst b/Documentation/networking/devlink/zl3073x.rst
new file mode 100644
index 000000000000..4b6cfaf38643
--- /dev/null
+++ b/Documentation/networking/devlink/zl3073x.rst
@@ -0,0 +1,51 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+zl3073x devlink support
+=======================
+
+This document describes the devlink features implemented by the ``zl3073x``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+   :widths: 5 5 90
+
+   * - Name
+     - Mode
+     - Notes
+   * - ``clock_id``
+     - driverinit
+     - Set the clock ID that is used by the driver for registering DPLL devices
+       and pins.
+
+Info versions
+=============
+
+The ``zl3073x`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+    :widths: 5 5 5 90
+
+    * - Name
+      - Type
+      - Example
+      - Description
+    * - ``asic.id``
+      - fixed
+      - 1E94
+      - Chip identification number
+    * - ``asic.rev``
+      - fixed
+      - 300
+      - Chip revision number
+    * - ``fw``
+      - running
+      - 7006
+      - Firmware version number
+    * - ``custom_cfg``
+      - running
+      - 1.3.0.1
+      - Device configuration version customized by OEM
diff --git a/Documentation/networking/devmem.rst b/Documentation/networking/devmem.rst
index d95363645331..a6cd7236bfbd 100644
--- a/Documentation/networking/devmem.rst
+++ b/Documentation/networking/devmem.rst
@@ -62,15 +62,15 @@ More Info
     https://lore.kernel.org/netdev/20240831004313.3713467-1-almasrymina@google.com/
 
 
-Interface
-=========
+RX Interface
+============
 
 
 Example
 -------
 
-tools/testing/selftests/net/ncdevmem.c:do_server shows an example of setting up
-the RX path of this API.
+./tools/testing/selftests/drivers/net/hw/ncdevmem:do_server shows an example of
+setting up the RX path of this API.
 
 
 NIC Setup
@@ -235,6 +235,148 @@ can be less than the tokens provided by the user in case of:
 (a) an internal kernel leak bug.
 (b) the user passed more than 1024 frags.
 
+TX Interface
+============
+
+
+Example
+-------
+
+./tools/testing/selftests/drivers/net/hw/ncdevmem:do_client shows an example of
+setting up the TX path of this API.
+
+
+NIC Setup
+---------
+
+The user must bind a TX dmabuf to a given NIC using the netlink API::
+
+        struct netdev_bind_tx_req *req = NULL;
+        struct netdev_bind_tx_rsp *rsp = NULL;
+        struct ynl_error yerr;
+
+        *ys = ynl_sock_create(&ynl_netdev_family, &yerr);
+
+        req = netdev_bind_tx_req_alloc();
+        netdev_bind_tx_req_set_ifindex(req, ifindex);
+        netdev_bind_tx_req_set_fd(req, dmabuf_fd);
+
+        rsp = netdev_bind_tx(*ys, req);
+
+        tx_dmabuf_id = rsp->id;
+
+
+The netlink API returns a dmabuf_id: a unique ID that refers to this dmabuf
+that has been bound.
+
+The user can unbind the dmabuf from the netdevice by closing the netlink socket
+that established the binding. We do this so that the binding is automatically
+unbound even if the userspace process crashes.
+
+Note that any reasonably well-behaved dmabuf from any exporter should work with
+devmem TCP, even if the dmabuf is not actually backed by devmem. An example of
+this is udmabuf, which wraps user memory (non-devmem) in a dmabuf.
+
+Socket Setup
+------------
+
+The user application must use the MSG_ZEROCOPY flag when sending devmem TCP
+data. Devmem cannot be copied by the kernel, so the semantics of devmem TX are
+similar to the semantics of MSG_ZEROCOPY::
+
+	setsockopt(socket_fd, SOL_SOCKET, SO_ZEROCOPY, &opt, sizeof(opt));
+
+It is also recommended that the user binds the TX socket to the same interface
+the dma-buf has been bound to via SO_BINDTODEVICE::
+
+	setsockopt(socket_fd, SOL_SOCKET, SO_BINDTODEVICE, ifname, strlen(ifname) + 1);
+
+
+Sending data
+------------
+
+Devmem data is sent using the SCM_DEVMEM_DMABUF cmsg.
+
+The user should create a msghdr where:
+
+* iov_base is set to the offset into the dmabuf to start sending from
+* iov_len is set to the number of bytes to be sent from the dmabuf
+
+The user passes the ID of the dma-buf to send from via dmabuf_tx_cmsg.dmabuf_id.
+
+The example below sends 1024 bytes from offset 100 into the dmabuf, and 2048
+from offset 2000 into the dmabuf. The dmabuf to send from is tx_dmabuf_id::
+
+       char ctrl_data[CMSG_SPACE(sizeof(struct dmabuf_tx_cmsg))];
+       struct dmabuf_tx_cmsg ddmabuf;
+       struct msghdr msg = {};
+       struct cmsghdr *cmsg;
+       struct iovec iov[2];
+
+       iov[0].iov_base = (void*)100;
+       iov[0].iov_len = 1024;
+       iov[1].iov_base = (void*)2000;
+       iov[1].iov_len = 2048;
+
+       msg.msg_iov = iov;
+       msg.msg_iovlen = 2;
+
+       msg.msg_control = ctrl_data;
+       msg.msg_controllen = sizeof(ctrl_data);
+
+       cmsg = CMSG_FIRSTHDR(&msg);
+       cmsg->cmsg_level = SOL_SOCKET;
+       cmsg->cmsg_type = SCM_DEVMEM_DMABUF;
+       cmsg->cmsg_len = CMSG_LEN(sizeof(struct dmabuf_tx_cmsg));
+
+       ddmabuf.dmabuf_id = tx_dmabuf_id;
+
+       *((struct dmabuf_tx_cmsg *)CMSG_DATA(cmsg)) = ddmabuf;
+
+       sendmsg(socket_fd, &msg, MSG_ZEROCOPY);
+
+
+Reusing TX dmabufs
+------------------
+
+Similar to MSG_ZEROCOPY with regular memory, the user should not modify the
+contents of the dma-buf while a send operation is in progress. This is because
+the kernel does not keep a copy of the dmabuf contents. Instead, the kernel
+will pin and send data from the buffer available to the userspace.
+
+Just as in MSG_ZEROCOPY, the kernel notifies the userspace of send completions
+using MSG_ERRQUEUE::
+
+        int64_t tstop = gettimeofday_ms() + waittime_ms;
+        char control[CMSG_SPACE(100)] = {};
+        struct sock_extended_err *serr;
+        struct msghdr msg = {};
+        struct cmsghdr *cm;
+        int ret;
+        __u32 hi, lo;
+
+        msg.msg_control = control;
+        msg.msg_controllen = sizeof(control);
+
+        while (gettimeofday_ms() < tstop) {
+                if (!do_poll(fd)) continue;
+
+                ret = recvmsg(fd, &msg, MSG_ERRQUEUE);
+
+                for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
+                        serr = (void *)CMSG_DATA(cm);
+
+                        hi = serr->ee_data;
+                        lo = serr->ee_info;
+
+                        fprintf(stdout, "tx complete [%u,%u]\n", lo, hi);
+                }
+        }
+
+After the associated sendmsg has been completed, the dmabuf can be reused by
+the userspace.
+
+
 Implementation & Caveats
 ========================
 
@@ -256,7 +398,7 @@ Testing
 =======
 
 More realistic example code can be found in the kernel source under
-``tools/testing/selftests/net/ncdevmem.c``
+``tools/testing/selftests/drivers/net/hw/ncdevmem.c``
 
 ncdevmem is a devmem TCP netcat. It works very similarly to netcat, but
 receives data directly into a udmabuf.
@@ -268,8 +410,7 @@ ncdevmem has a validation mode as well that expects a repeating pattern of
 incoming data and validates it as such. For example, you can launch
 ncdevmem on the server by::
 
-	ncdevmem -s <server IP> -c <client IP> -f eth1 -d 3 -n 0000:06:00.0 -l \
-		 -p 5201 -v 7
+	ncdevmem -s <server IP> -c <client IP> -f <ifname> -l -p 5201 -v 7
 
 On client side, use regular netcat to send TX data to ncdevmem process
 on the server::
diff --git a/Documentation/networking/diagnostic/index.rst b/Documentation/networking/diagnostic/index.rst
new file mode 100644
index 000000000000..86488aa46b48
--- /dev/null
+++ b/Documentation/networking/diagnostic/index.rst
@@ -0,0 +1,17 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+Networking Diagnostics
+======================
+
+.. toctree::
+   :maxdepth: 2
+
+   twisted_pair_layer1_diagnostics.rst
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/networking/diagnostic/twisted_pair_layer1_diagnostics.rst b/Documentation/networking/diagnostic/twisted_pair_layer1_diagnostics.rst
new file mode 100644
index 000000000000..079e17effadf
--- /dev/null
+++ b/Documentation/networking/diagnostic/twisted_pair_layer1_diagnostics.rst
@@ -0,0 +1,784 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Diagnostic Concept for Investigating Twisted Pair Ethernet Variants at OSI Layer 1
+==================================================================================
+
+Introduction
+------------
+
+This documentation is designed for two primary audiences:
+
+1. **Users and System Administrators**: For those dealing with real-world
+   Ethernet issues, this guide provides a practical, step-by-step
+   troubleshooting flow to help identify and resolve common problems in Twisted
+   Pair Ethernet at OSI Layer 1. If you're facing unstable links, speed drops,
+   or mysterious network issues, jump right into the step-by-step guide and
+   follow it through to find your solution.
+
+2. **Kernel Developers**: For developers working with network drivers and PHY
+   support, this documentation outlines the diagnostic process and highlights
+   areas where the Linux kernel’s diagnostic interfaces could be extended or
+   improved. By understanding the diagnostic flow, developers can better
+   prioritize future enhancements.
+
+Step-by-Step Diagnostic Guide from Linux (General Ethernet)
+-----------------------------------------------------------
+
+This diagnostic guide covers common Ethernet troubleshooting scenarios,
+focusing on **link stability and detection** across different Ethernet
+environments, including **Single-Pair Ethernet (SPE)** and **Multi-Pair
+Ethernet (MPE)**, as well as power delivery technologies like **PoDL** (Power
+over Data Line) and **PoE** (Clause 33 PSE).
+
+The guide is designed to help users diagnose physical layer (Layer 1) issues on
+systems running **Linux kernel version 6.11 or newer**, utilizing **ethtool
+version 6.10 or later** and **iproute2 version 6.4.0 or later**.
+
+In this guide, we assume that users may have **limited or no access to the link
+partner** and will focus on diagnosing issues locally.
+
+Diagnostic Scenarios
+~~~~~~~~~~~~~~~~~~~~
+
+- **Link is up and stable, but no data transfer**: If the link is stable but
+  there are issues with data transmission, refer to the **OSI Layer 2
+  Troubleshooting Guide**.
+
+- **Link is unstable**: Link resets, speed drops, or other fluctuations
+  indicate potential issues at the hardware or physical layer.
+
+- **No link detected**: The interface is up, but no link is established.
+
+Verify Interface Status
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Begin by verifying the status of the Ethernet interface to check if it is
+administratively up. `ethtool` provides information on the link and PHY
+status, but it does not show the **administrative state** of the interface.
+To check this, use the `ip` command, which describes the interface state
+within the angle brackets `"<>"` in its output.
+
+For example, in the output `<NO-CARRIER,BROADCAST,MULTICAST,UP>`, the important
+keywords are:
+
+- **UP**: The interface is in the administrative "UP" state.
+- **NO-CARRIER**: The interface is administratively up, but no physical link is
+  detected.
+
+If the output shows `<BROADCAST,MULTICAST>`, this indicates the interface is in
+the administrative "DOWN" state.
+
+- **Command:** `ip link show dev <interface>`
+
+- **Expected Output:**
+
+  .. code-block:: bash
+
+     4: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
+        link/ether 88:14:2b:00:96:f2 brd ff:ff:ff:ff:ff:ff
+
+- **Interpreting the Output:**
+
+  - **Administrative UP State**:
+
+    - If the output contains **"UP"**, the interface is administratively up,
+      and the system is trying to establish a physical link.
+
+    - If you also see **"NO-CARRIER"**, it means the physical link has not been
+      detected, indicating potential Layer 1 issues like a cable fault,
+      misconfiguration, or no connection at the link partner. In this case,
+      proceed to the **Inspect Link Status and PHY Configuration** section.
+
+  - **Administrative DOWN State**:
+
+    - If the output lacks **"UP"** and shows only states like
+      **"<BROADCAST,MULTICAST>"**, it means the interface is administratively
+      down. In this case, bring the interface up using the following command:
+
+      .. code-block:: bash
+
+         ip link set dev <interface> up
+
+- **Next Steps**:
+
+  - If the interface is **administratively up** but shows **NO-CARRIER**,
+    proceed to the **Inspect Link Status and PHY Configuration** section to
+    troubleshoot potential physical layer issues.
+
+  - If the interface was **administratively down** and you have brought it up,
+    be sure to **repeat this verification step** to confirm the new state of
+    the interface before proceeding.
+
+  - **If the interface is up and the link is detected**:
+
+    - If the output shows **"UP"** and there is **no `NO-CARRIER`**, the
+      interface is administratively up, and the physical link has been
+      successfully established. If everything is working as expected, the Layer
+      1 diagnostics are complete, and no further action is needed.
+
+    - If the interface is up and the link is detected but **no data is being
+      transferred**, the issue is likely beyond Layer 1, and you should proceed
+      with diagnosing the higher layers of the OSI model. This may involve
+      checking Layer 2 configurations (such as VLANs or MAC address issues),
+      Layer 3 settings (like IP addresses, routing, or ARP), or Layer 4 and
+      above (firewalls, services, etc.).
+
+    - If the **link is unstable** or **frequently resetting or dropping**, this
+      may indicate a physical layer issue such as a faulty cable, interference,
+      or power delivery problems. In this case, proceed with the next step in
+      this guide.
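+
+The flag interpretation above can also be scripted. The following is a minimal
+sketch, not part of any existing tool: the function names `link_flags` and
+`classify_link_flags` are hypothetical, and the parsing assumes the
+`ip link show` output format shown in this section.
+
+.. code-block:: bash
+
+   # Print the flag list found between "<" and ">" on the first output line.
+   link_flags() {
+       ip link show dev "$1" | sed -n '1s/.*<\(.*\)>.*/\1/p'
+   }
+
+   # Map a comma-separated flag list to one of three states. Commas are
+   # added around the list so that "UP" does not match "LOWER_UP".
+   classify_link_flags() {
+       case ",$1," in
+           *,UP,*) ;;
+           *) echo "admin-down"; return ;;
+       esac
+       case ",$1," in
+           *,NO-CARRIER,*) echo "no-carrier" ;;
+           *) echo "link-detected" ;;
+       esac
+   }
+
+   # Example: classify_link_flags "$(link_flags eth0)"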
+
+Inspect Link Status and PHY Configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Use `ethtool -I` to check the link status, PHY configuration, supported link
+modes, and additional statistics such as the **Link Down Events** counter. This
+step is essential for diagnosing Layer 1 problems such as speed mismatches,
+duplex issues, and link instability.
+
+For both **Single-Pair Ethernet (SPE)** and **Multi-Pair Ethernet (MPE)**
+devices, you will use this step to gather key details about the link. **SPE**
+links generally support a single speed and mode without autonegotiation (with
+the exception of **10BaseT1L**), while **MPE** devices typically support
+multiple link modes and autonegotiation.
+
+- **Command:** `ethtool -I <interface>`
+
+- **Example Output for SPE Interface (Non-autonegotiation)**:
+
+  .. code-block:: bash
+
+     Settings for spe4:
+         Supported ports: [ TP ]
+         Supported link modes:   100baseT1/Full
+         Supported pause frame use: No
+         Supports auto-negotiation: No
+         Supported FEC modes: Not reported
+         Advertised link modes: Not applicable
+         Advertised pause frame use: No
+         Advertised auto-negotiation: No
+         Advertised FEC modes: Not reported
+         Speed: 100Mb/s
+         Duplex: Full
+         Auto-negotiation: off
+         master-slave cfg: forced slave
+         master-slave status: slave
+         Port: Twisted Pair
+         PHYAD: 6
+         Transceiver: external
+         MDI-X: Unknown
+         Supports Wake-on: d
+         Wake-on: d
+         Link detected: yes
+         SQI: 7/7
+         Link Down Events: 2
+
+- **Example Output for MPE Interface (Autonegotiation)**:
+
+  .. code-block:: bash
+
+     Settings for eth1:
+         Supported ports: [ TP    MII ]
+         Supported link modes:   10baseT/Half 10baseT/Full
+                                 100baseT/Half 100baseT/Full
+         Supported pause frame use: Symmetric Receive-only
+         Supports auto-negotiation: Yes
+         Supported FEC modes: Not reported
+         Advertised link modes:  10baseT/Half 10baseT/Full
+                                 100baseT/Half 100baseT/Full
+         Advertised pause frame use: Symmetric Receive-only
+         Advertised auto-negotiation: Yes
+         Advertised FEC modes: Not reported
+         Link partner advertised link modes:  10baseT/Half 10baseT/Full
+                                              100baseT/Half 100baseT/Full
+         Link partner advertised pause frame use: Symmetric Receive-only
+         Link partner advertised auto-negotiation: Yes
+         Link partner advertised FEC modes: Not reported
+         Speed: 100Mb/s
+         Duplex: Full
+         Auto-negotiation: on
+         Port: Twisted Pair
+         PHYAD: 10
+         Transceiver: internal
+         MDI-X: Unknown
+         Supports Wake-on: pg
+         Wake-on: p
+         Link detected: yes
+         Link Down Events: 1
+
+- **Next Steps**:
+
+  - Record the output provided by `ethtool`, particularly noting the
+    **master-slave status**, **speed**, **duplex**, and other relevant fields.
+    This information will be useful for further analysis or troubleshooting.
+    Once the **ethtool** output has been collected and stored, move on to the
+    next diagnostic step.
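+
+The **Link Down Events** counter is most useful when compared over time, so it
+can help to extract it from the saved output. A minimal sketch, assuming the
+`ethtool -I` output format shown above; the function name `link_down_events`
+is hypothetical:
+
+.. code-block:: bash
+
+   # Read "ethtool -I <interface>" output on stdin and print the value of
+   # the "Link Down Events" counter, or nothing if the field is absent.
+   link_down_events() {
+       sed -n 's/^[[:space:]]*Link Down Events:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
+   }
+
+   # Example: ethtool -I eth0 | link_down_events
+
+Running this before and after a test period shows whether the link dropped
+in between.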
+
+Check Power Delivery (PoDL or PoE)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If it is known that **PoDL** or **PoE** is **not implemented** on the system,
+or the **PSE** (Power Sourcing Equipment) is managed by proprietary user-space
+software or external tools, you can skip this step. In such cases, verify power
+delivery through alternative methods, such as checking hardware indicators
+(LEDs), using multimeters, or consulting vendor-specific software for
+monitoring power status.
+
+If **PoDL** or **PoE** is implemented and managed directly by Linux, follow
+these steps to ensure power is being delivered correctly:
+
+- **Command:** `ethtool --show-pse <interface>`
+
+- **Expected Output Examples**:
+
+  1. **PSE Not Supported**:
+
+     If no PSE is attached or the interface does not support PSE, the following
+     output is expected:
+
+     .. code-block:: bash
+
+        netlink error: No PSE is attached
+        netlink error: Operation not supported
+
+  2. **PoDL (Single-Pair Ethernet)**:
+
+     When PoDL is implemented, you might see the following attributes:
+
+     .. code-block:: bash
+
+        PSE attributes for eth1:
+        PoDL PSE Admin State: enabled
+        PoDL PSE Power Detection Status: delivering power
+
+  3. **PoE (Clause 33 PSE)**:
+
+     For standard PoE, the output may look like this:
+
+     .. code-block:: bash
+
+        PSE attributes for eth1:
+        Clause 33 PSE Admin State: enabled
+        Clause 33 PSE Power Detection Status: delivering power
+        Clause 33 PSE Available Power Limit: 18000
+
+- **Adjust Power Limit (if needed)**:
+
+  - Sometimes, the available power limit may not be sufficient for the link
+    partner. You can increase the power limit as needed.
+
+  - **Command:** `ethtool --set-pse <interface> c33-pse-avail-pw-limit <limit>`
+
+    Example:
+
+    .. code-block:: bash
+
+      ethtool --set-pse eth1 c33-pse-avail-pw-limit 18000
+      ethtool --show-pse eth1
+
+    **Expected Output** after adjusting the power limit:
+
+    .. code-block:: bash
+
+      Clause 33 PSE Available Power Limit: 18000
+
+
+- **Next Steps**:
+
+  - **PoE or PoDL Not Used**: If **PoE** or **PoDL** is not implemented or used
+    on the system, proceed to the next diagnostic step, as power delivery is
+    not relevant for this setup.
+
+  - **PoE or PoDL Controlled Externally**: If **PoE** or **PoDL** is used but
+    is not managed by the Linux kernel's **PSE-PD** framework (i.e., it is
+    controlled by proprietary user-space software or external tools), this part
+    is out of scope for this documentation. Please consult vendor-specific
+    documentation or external tools for monitoring and managing power delivery.
+
+  - **PSE Admin State Disabled**:
+
+    - If the `PSE Admin State:` is **disabled**, enable it by running one of
+      the following commands:
+
+      .. code-block:: bash
+
+         ethtool --set-pse <devname> podl-pse-admin-control enable
+
+      or, for Clause 33 PSE (PoE)::
+
+         ethtool --set-pse <devname> c33-pse-admin-control enable
+
+    - After enabling the PSE Admin State, return to the start of the **Check
+      Power Delivery (PoDL or PoE)** step to recheck the power delivery status.
+
+  - **Power Not Delivered**: If the `Power Detection Status` shows something
+    other than "delivering power" (e.g., `over current`), troubleshoot the
+    **PSE**. Check for potential issues such as a short circuit in the cable,
+    insufficient power delivery, or a fault in the PSE itself.
+
+  - **Power Delivered but No Link**: If power is being delivered but no link is
+    established, proceed with further diagnostics by performing **Cable
+    Diagnostics** or reviewing the **Inspect Link Status and PHY
+    Configuration** steps to identify any underlying issues with the physical
+    link or settings.
+
+Cable Diagnostics
+~~~~~~~~~~~~~~~~~
+
+Use `ethtool` to test for physical layer issues such as cable faults. The test
+results can vary depending on the cable's condition, the technology in use, and
+the state of the link partner. The results from the cable test will help in
+diagnosing issues like open circuits, shorts, impedance mismatches, and
+noise-related problems.
+
+- **Command:** `ethtool --cable-test <interface>`
+
+The following are the typical outputs for **Single-Pair Ethernet (SPE)** and
+**Multi-Pair Ethernet (MPE)**:
+
+- **For Single-Pair Ethernet (SPE)**:
+
+  - **Expected Output (SPE)**:
+
+  .. code-block:: bash
+
+    Cable test completed for device eth1.
+    Pair A, fault length: 25.00m
+    Pair A code Open Circuit
+
+  This indicates an open circuit or cable fault at the reported distance, but
+  results can be influenced by the link partner's state. Refer to the
+  **"Troubleshooting Based on Cable Test Results"** section for further
+  interpretation of these results.
+
+- **For Multi-Pair Ethernet (MPE)**:
+
+  - **Expected Output (MPE)**:
+
+  .. code-block:: bash
+
+    Cable test completed for device eth0.
+    Pair A code OK
+    Pair B code OK
+    Pair C code Open Circuit
+
+  Here, Pair C is reported as having an open circuit, while Pairs A and B are
+  functioning correctly. However, if autonegotiation is in use on Pairs A and
+  B, the cable test may be disrupted. Refer to the **"Troubleshooting Based on
+  Cable Test Results"** section for a detailed explanation of these issues and
+  how to resolve them.
+
+For detailed descriptions of the different possible cable test results, please
+refer to the **"Troubleshooting Based on Cable Test Results"** section.
+
+Troubleshooting Based on Cable Test Results
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+After running the cable test, the results can help identify specific issues in
+the physical connection. However, it is important to note that **cable testing
+results heavily depend on the capabilities and characteristics of both the
+local hardware and the link partner**. The accuracy and reliability of the
+results can vary significantly between different hardware implementations.
+
+In some cases, this can introduce **blind spots** in the current cable testing
+implementation, where certain results may not accurately reflect the actual
+physical state of the cable. For example:
+
+- An **Open Circuit** result might not only indicate a damaged or disconnected
+  cable but also occur if the cable is properly attached to a powered-down link
+  partner.
+
+- Some PHYs may report a **Short within Pair** if the link partner is in
+  **forced slave mode**, even though there is no actual short in the cable.
+
+To help users interpret the results more effectively, it could be beneficial to
+extend the **kernel UAPI** (User API) to provide additional context or
+**possible variants** of issues based on the hardware’s characteristics. Since
+these quirks are often hardware-specific, the **kernel driver** would be an
+ideal source of such information. By providing flags or hints related to
+potential false positives for each test result, users would have a better
+understanding of what to verify and where to investigate further.
+
+Until such improvements are made, users should be aware of these limitations
+and manually verify cable issues as needed. Physical inspections may help
+resolve uncertainties related to false positive results.
+
+The results can be one of the following:
+
+- **OK**:
+
+  - The cable is functioning correctly, and no issues were detected.
+
+  - **Next Steps**: If you are still experiencing issues, it might be related
+    to higher-layer problems, such as duplex mismatches or speed negotiation,
+    which are not physical-layer issues.
+
+  - **Special Case for `BaseT1` (1000/100/10BaseT1)**: In `BaseT1` systems, an
+    "OK" result typically also means that the link is up and likely in **slave
+    mode**, since cable tests usually only pass in this mode. For some
+    **10BaseT1L** PHYs, an "OK" result may occur even if the cable is too long
+    for the PHY's configured range (for example, when the range is configured
+    for short-distance mode).
+
+- **Open Circuit**:
+
+  - An **Open Circuit** result typically indicates that the cable is damaged or
+    disconnected at the reported fault length. Consider these possibilities:
+
+    - If the link partner is in **admin down** state or powered off, you might
+      still get an "Open Circuit" result even if the cable is functional.
+
+    - **Next Steps**: Inspect the cable at the fault length for visible damage
+      or loose connections. Verify the link partner is powered on and in the
+      correct mode.
+
+- **Short within Pair**:
+
+  - A **Short within Pair** indicates an unintended connection within the same
+    pair of wires, typically caused by physical damage to the cable.
+
+    - **Next Steps**: Replace or repair the cable and check for any physical
+      damage or improperly crimped connectors.
+
+- **Short to Another Pair**:
+
+  - A **Short to Another Pair** means the wires from different pairs are
+    shorted, which could occur due to physical damage or incorrect wiring.
+
+    - **Next Steps**: Replace or repair the damaged cable. Inspect the cable for
+      incorrect terminations or pinched wiring.
+
+- **Impedance Mismatch**:
+
+  - **Impedance Mismatch** indicates a reflection caused by an impedance
+    discontinuity in the cable. This can happen when a part of the cable has
+    abnormal impedance (e.g., when different cable types are spliced together
+    or when there is a defect in the cable).
+
+    - **Next Steps**: Check the cable quality and ensure consistent impedance
+      throughout its length. Replace any sections of the cable that do not meet
+      specifications.
+
+- **Noise**:
+
+  - **Noise** means that the Time Domain Reflectometry (TDR) test could not
+    complete due to excessive noise on the cable, which can be caused by
+    interference from electromagnetic sources.
+
+    - **Next Steps**: Identify and eliminate sources of electromagnetic
+      interference (EMI) near the cable. Consider using shielded cables or
+      rerouting the cable away from noise sources.
+
+- **Resolution Not Possible**:
+
+  - **Resolution Not Possible** means that the TDR test could not detect the
+    issue due to the resolution limitations of the test or because the fault is
+    beyond the distance that the test can measure.
+
+    - **Next Steps**: Inspect the cable manually if possible, or use alternative
+      diagnostic tools that can handle greater distances or higher resolution.
+
+- **Unknown**:
+
+  - An **Unknown** result may occur when the test cannot classify the fault or
+    when a specific issue is outside the scope of the tool's detection
+    capabilities.
+
+    - **Next Steps**: Re-run the test, verify the link partner's state, and inspect
+      the cable manually if necessary.
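+
+As a starting point for working through the result codes above, the following
+sketch lists every pair whose result is not "OK". It assumes the
+`Pair <X> code <result>` line format shown in the example outputs earlier in
+this section; other `ethtool` versions may format results differently. The
+helper name `faulty_pairs` is illustrative only.
+
+.. code-block:: bash
+
+   # faulty_pairs: print "<pair>: <result>" for every pair that is not OK.
+   # Input: `ethtool --cable-test <interface>` output on stdin.
+   faulty_pairs() {
+       awk '$1 == "Pair" && $3 == "code" {
+           # The result code is everything after the word "code".
+           code = $4
+           for (i = 5; i <= NF; i++)
+               code = code " " $i
+           if (code != "OK")
+               print $2 ": " code
+       }'
+   }
+
+For example, `ethtool --cable-test <interface> | faulty_pairs` prints nothing
+when all pairs pass. Keep the blind spots described above in mind: a reported
+fault still needs to be checked against the state of the link partner.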
+
+Verify Link Partner PHY Configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the cable test passes but the link is still not functioning correctly, it’s
+essential to verify the configuration of the link partner’s PHY. Mismatches in
+speed, duplex settings, or master-slave roles can cause connection issues.
+
+Autonegotiation Mismatch
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+- If both link partners support autonegotiation, ensure that autonegotiation is
+  enabled on both sides and that all supported link modes are advertised. A
+  mismatch can lead to connectivity problems or suboptimal performance.
+
+- **Quick Fix:** Reset autonegotiation to the default settings, which will
+  advertise all default link modes:
+
+  .. code-block:: bash
+
+     ethtool -s <interface> autoneg on
+
+- **Command to check configuration:** `ethtool <interface>`
+
+- **Expected Output:** Ensure that both sides advertise compatible link modes.
+  If autonegotiation is off, verify that both link partners are configured for
+  the same speed and duplex.
+
+  The following example shows a case where the local PHY advertises fewer link
+  modes than it supports. This will reduce the number of overlapping link modes
+  with the link partner. In the worst case, there will be no common link modes,
+  and the link will not be created:
+
+  .. code-block:: bash
+
+     Settings for eth0:
+        Supported link modes:  1000baseT/Full, 100baseT/Full
+        Advertised link modes: 1000baseT/Full
+        Speed: 1000Mb/s
+        Duplex: Full
+        Auto-negotiation: on
+
+Combined Mode Mismatch (Autonegotiation on One Side, Forced on the Other)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- One possible issue occurs when one side is using **autonegotiation** (as in
+  most modern systems), and the other side is set to a **forced link mode**
+  (e.g., older hardware with single-speed hubs). In such cases, modern PHYs
+  will attempt to detect the forced mode on the other side. If the link is
+  established, you may notice:
+
+  - **No or empty "Link partner advertised link modes"**.
+
+  - **"Link partner advertised auto-negotiation:"** will be **"no"** or not
+    present.
+
+- This type of detection does not always work reliably:
+
+  - Typically, the modern PHY will default to **Half Duplex**, even if the link
+    partner is actually configured for **Full Duplex**.
+
+  - Some PHYs may not work reliably if the link partner switches from one
+    forced mode to another. In this case, only a down/up cycle may help.
+
+- **Next Steps**: Set both sides to the same fixed speed and duplex mode to
+  avoid potential detection issues.
+
+  .. code-block:: bash
+
+     ethtool -s <interface> speed 1000 duplex full autoneg off
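+
+The symptoms listed above can be checked automatically. The sketch below
+reads `ethtool <interface>` output on standard input and succeeds if the link
+is up, local autonegotiation is on, and the link partner does not advertise
+autonegotiation. It assumes the field names `Auto-negotiation:`,
+`Link partner advertised auto-negotiation:` and `Link detected:` as printed by
+`ethtool`; the helper name `forced_partner_suspect` is illustrative only.
+
+.. code-block:: bash
+
+   # forced_partner_suspect: exit status 0 if the output on stdin looks like
+   # a link to a partner running in a forced mode, 1 otherwise.
+   forced_partner_suspect() {
+       awk -F': *' '
+           /^[[:space:]]*Auto-negotiation:/            { local_an = tolower($2) }
+           /Link partner advertised auto-negotiation:/ { partner_an = tolower($2) }
+           /Link detected:/                            { link = tolower($2) }
+           END {
+               # A missing link partner line is treated like "no".
+               if (link == "yes" && local_an == "on" && partner_an != "yes")
+                   exit 0
+               exit 1
+           }'
+   }
+
+If `ethtool <interface> | forced_partner_suspect` succeeds and the reported
+duplex is Half, a duplex mismatch with the link partner is a likely cause.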
+
+Master/Slave Role Mismatch (BaseT1 and 1000BaseT PHYs)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- In **BaseT1** systems (e.g., 1000BaseT1, 100BaseT1), link establishment
+  requires that one device is configured as **master** and the other as
+  **slave**. A mismatch in this master-slave configuration can prevent the link
+  from being established. However, **1000BaseT** also supports configurable
+  master/slave roles and can face similar issues.
+
+- **Role Preference in 1000BaseT**: The **1000BaseT** specification allows link
+  partners to negotiate master-slave roles or role preferences during
+  autonegotiation. Some PHYs have hardware limitations or bugs that prevent
+  them from functioning properly in certain roles. In such cases, drivers may
+  force these PHYs into a specific role (e.g., **forced master** or **forced
+  slave**) or try a weaker option by setting preferences. If both link partners
+  have the same issue and are forced into the same mode (e.g., both forced into
+  master mode), they will not be able to establish a link.
+
+- **Next Steps**: Ensure that one side is configured as **master** and the
+  other as **slave** to avoid this issue, particularly when hardware
+  limitations are involved, or try the weaker **preferred** option instead of
+  **forced**. Check for any driver-related restrictions or forced modes.
+
+- **Command to force master/slave mode**:
+
+  .. code-block:: bash
+
+     ethtool -s <interface> master-slave forced-master
+
+  or:
+
+  .. code-block:: bash
+
+     ethtool -s <interface> master-slave forced-master speed 1000 duplex full autoneg off
+
+
+- **Check the current master/slave status**:
+
+  .. code-block:: bash
+
+     ethtool <interface>
+
+  Example Output:
+
+  .. code-block:: bash
+
+     master-slave cfg: forced-master
+     master-slave status: master
+
+- **Hardware Bugs and Driver Forcing**: If a known hardware issue forces the
+  PHY into a specific mode, it’s essential to check the driver source code or
+  hardware documentation for details. Ensure that the roles are compatible
+  across both link partners, and if both PHYs are forced into the same mode,
+  adjust one side accordingly to resolve the mismatch.
+
+Monitor Link Resets and Speed Drops
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the link is unstable, showing frequent resets or speed drops, this may
+indicate issues with the cable, PHY configuration, or environmental factors.
+While there is still no completely unified way in Linux to directly monitor
+downshift events or link speed changes via user space tools, both the Linux
+kernel logs and `ethtool` can provide valuable insights, especially if the
+driver supports reporting such events.
+
+- **Monitor Kernel Logs for Link Resets and Speed Drops**:
+
+  - The Linux kernel will print link status changes, including downshift
+    events, in the system logs. These messages typically include speed changes,
+    duplex mode, and downshifted link speed (if the driver supports it).
+
+  - **Command to monitor kernel logs in real-time:**
+
+    .. code-block:: bash
+
+      dmesg -w | grep "Link is Up\|Link is Down"
+
+  - Example Output (if a downshift occurs):
+
+    .. code-block:: bash
+
+      eth0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
+      eth0: Link is Down
+
+    This indicates that the link has been established but has downshifted from
+    a higher speed.
+
+  - **Note**: Not all drivers or PHYs support downshift reporting, so you may
+    not see this information for all devices.
+
+- **Monitor Link Down Events Using `ethtool`**:
+
+  - With sufficiently recent kernel and `ethtool` versions, you can track
+    **Link Down Events** using the `ethtool -I` command. If the driver
+    supports it, this provides a counter of link drops, which helps to
+    diagnose link instability.
+
+  - **Command to monitor link down events:**
+
+    .. code-block:: bash
+
+      ethtool -I <interface>
+
+  - Example Output (if supported):
+
+    .. code-block:: bash
+
+      Settings for eth1:
+      Link Down Events: 5
+
+    This indicates that the link has dropped 5 times. Frequent link down events
+    may indicate cable or environmental issues that require further
+    investigation.
+
+- **Check Link Status and Speed**:
+
+  - Even though downshift counts or events are not easily tracked, you can
+    still use `ethtool` to manually check the current link speed and status.
+
+  - **Command:** `ethtool <interface>`
+
+  - **Expected Output:**
+
+    .. code-block:: bash
+
+      Speed: 1000Mb/s
+      Duplex: Full
+      Auto-negotiation: on
+      Link detected: yes
+
+    Any inconsistencies in the expected speed or duplex setting could indicate
+    an issue.
+
+- **Disable Energy-Efficient Ethernet (EEE) for Diagnostics**:
+
+  - **EEE** (Energy-Efficient Ethernet) can be a source of link instability due
+    to transitions in and out of low-power states. For diagnostic purposes, it
+    may be useful to **temporarily** disable EEE to determine if it is
+    contributing to link instability. This is **not a generic recommendation**
+    for disabling power management.
+
+  - **Next Steps**: Disable EEE and monitor if the link becomes stable. If
+    disabling EEE resolves the issue, report the bug so that the driver can be
+    fixed.
+
+  - **Command:**
+
+    .. code-block:: bash
+
+      ethtool --set-eee <interface> eee off
+
+  - **Important**: If disabling EEE resolves the instability, the issue should
+    be reported to the maintainers as a bug, and the driver should be corrected
+    to handle EEE properly without causing instability. Disabling EEE
+    permanently should not be seen as a solution.
+
+- **Monitor Error Counters**:
+
+  - Use `ethtool -S <interface> --all-groups` to retrieve standardized interface
+    statistics if the driver supports the unified interface:
+
+  - **Command:** `ethtool -S <interface> --all-groups`
+
+  - **Example Output (if supported)**:
+
+    .. code-block:: bash
+
+      phydev-RxFrames: 100391
+      phydev-RxErrors: 0
+      phydev-TxFrames: 9
+      phydev-TxErrors: 0
+
+  - If the unified interface is not supported, use `ethtool -S <interface>` to
+    retrieve MAC and PHY counters. Note that non-standardized PHY counter names
+    vary by driver and must be interpreted accordingly:
+
+  - **Command:** `ethtool -S <interface>`
+
+  - **Example Output (if supported)**:
+
+    .. code-block:: bash
+
+      rx_crc_errors: 123
+      tx_errors: 45
+      rx_frame_errors: 78
+
+  - **Note**: If no meaningful error counters are available or if counters are
+    not supported, you may need to rely on physical inspections (e.g., cable
+    condition) or kernel log messages (e.g., link up/down events) to further
+    diagnose the issue.
+
+  - **Compare Counters**:
+
+    - Compare the egress and ingress frame counts reported by the PHY and MAC.
+
+    - A small difference may occur due to sampling rate differences between the
+      MAC and PHY drivers, or if the PHY and MAC are not always fully
+      synchronized in their UP or DOWN states.
+
+    - Significant discrepancies indicate potential issues in the data path
+      between the MAC and PHY.
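+
+The counter comparison can be sketched as follows. The helpers read
+`ethtool -S` style output (`<name>: <value>` per line) and compare two frame
+counts against a tolerance. The helper names and the default tolerance of 10
+frames are assumptions made for illustration; pick a tolerance that fits the
+traffic rate, and substitute the counter names your MAC and PHY drivers
+actually expose.
+
+.. code-block:: bash
+
+   # stat_value NAME: print the value of counter NAME from stdin.
+   stat_value() {
+       awk -v name="$1" '{ key = $1; sub(/:$/, "", key) }
+                         key == name { print $2; exit }'
+   }
+
+   # counter_gap A B: print the absolute difference of two counters.
+   counter_gap() {
+       local a=$1 b=$2
+       if [ "$a" -ge "$b" ]; then
+           echo $((a - b))
+       else
+           echo $((b - a))
+       fi
+   }
+
+   # counters_consistent A B [TOLERANCE]: succeed if the gap is small.
+   # The default tolerance of 10 frames is an arbitrary example value.
+   counters_consistent() {
+       local tolerance=${3:-10}
+       [ "$(counter_gap "$1" "$2")" -le "$tolerance" ]
+   }
+
+A possible use is to store the output of `ethtool -S <interface>` in a
+variable, extract the PHY and MAC receive counters with `stat_value`, and
+treat a failing `counters_consistent` as a hint to inspect the data path
+between the MAC and the PHY.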
+
+When All Else Fails...
+~~~~~~~~~~~~~~~~~~~~~~
+
+So you've checked the cables, monitored the logs, disabled EEE, and still...
+nothing? Don’t worry, you’re not alone. Sometimes, Ethernet gremlins just don’t
+want to cooperate.
+
+But before you throw in the towel (or the Ethernet cable), take a deep breath.
+It’s always possible that:
+
+1. Your PHY has a unique, undocumented personality.
+
+2. The problem is lying dormant, waiting for just the right moment to magically
+   resolve itself (hey, it happens!).
+
+3. Or, it could be that the ultimate solution simply hasn’t been invented yet.
+
+If none of the above bring you comfort, there’s one final step: contribute! If
+you've uncovered new or unusual issues, or have creative diagnostic methods,
+feel free to share your findings and extend this documentation. Together, we
+can hunt down every elusive network issue - one twisted pair at a time.
+
+Remember: sometimes the solution is just a reboot away, but if not, it’s time to
+dig deeper - or report that bug!
+
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 295563e91082..ab20c644af24 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -236,6 +236,12 @@ Userspace to kernel:
   ``ETHTOOL_MSG_MM_GET``                get MAC merge layer state
   ``ETHTOOL_MSG_MM_SET``                set MAC merge layer parameters
   ``ETHTOOL_MSG_MODULE_FW_FLASH_ACT``   flash transceiver module firmware
+  ``ETHTOOL_MSG_PHY_GET``               get Ethernet PHY information
+  ``ETHTOOL_MSG_TSCONFIG_GET``          get hw timestamping configuration
+  ``ETHTOOL_MSG_TSCONFIG_SET``          set hw timestamping configuration
+  ``ETHTOOL_MSG_RSS_SET``               set RSS settings
+  ``ETHTOOL_MSG_RSS_CREATE_ACT``        create an additional RSS context
+  ``ETHTOOL_MSG_RSS_DELETE_ACT``        delete an additional RSS context
   ===================================== =================================
 
 Kernel to userspace:
@@ -278,11 +284,21 @@ Kernel to userspace:
   ``ETHTOOL_MSG_MODULE_GET_REPLY``         transceiver module parameters
   ``ETHTOOL_MSG_PSE_GET_REPLY``            PSE parameters
   ``ETHTOOL_MSG_RSS_GET_REPLY``            RSS settings
+  ``ETHTOOL_MSG_RSS_NTF``                  RSS settings
   ``ETHTOOL_MSG_PLCA_GET_CFG_REPLY``       PLCA RS parameters
   ``ETHTOOL_MSG_PLCA_GET_STATUS_REPLY``    PLCA RS status
   ``ETHTOOL_MSG_PLCA_NTF``                 PLCA RS parameters
   ``ETHTOOL_MSG_MM_GET_REPLY``             MAC merge layer status
   ``ETHTOOL_MSG_MODULE_FW_FLASH_NTF``      transceiver module flash updates
+  ``ETHTOOL_MSG_PHY_GET_REPLY``            Ethernet PHY information
+  ``ETHTOOL_MSG_PHY_NTF``                  Ethernet PHY information change
+  ``ETHTOOL_MSG_TSCONFIG_GET_REPLY``       hw timestamping configuration
+  ``ETHTOOL_MSG_TSCONFIG_SET_REPLY``       new hw timestamping configuration
+  ``ETHTOOL_MSG_PSE_NTF``                  PSE events notification
+  ``ETHTOOL_MSG_RSS_NTF``                  RSS settings notification
+  ``ETHTOOL_MSG_RSS_CREATE_ACT_REPLY``     create an additional RSS context
+  ``ETHTOOL_MSG_RSS_CREATE_NTF``           additional RSS context created
+  ``ETHTOOL_MSG_RSS_DELETE_NTF``           additional RSS context deleted
   ======================================== =================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -892,6 +908,10 @@ Kernel response contents:
   ``ETHTOOL_A_RINGS_RX_PUSH``               u8      flag of RX Push mode
   ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN``       u32     size of TX push buffer
   ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX``   u32     max size of TX push buffer
+  ``ETHTOOL_A_RINGS_HDS_THRESH``            u32     threshold of
+                                                    header / data split
+  ``ETHTOOL_A_RINGS_HDS_THRESH_MAX``        u32     max threshold of
+                                                    header / data split
   =======================================   ======  ===========================
 
 ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
@@ -934,10 +954,12 @@ Request contents:
   ``ETHTOOL_A_RINGS_RX_JUMBO``          u32     size of RX jumbo ring
   ``ETHTOOL_A_RINGS_TX``                u32     size of TX ring
   ``ETHTOOL_A_RINGS_RX_BUF_LEN``        u32     size of buffers on the ring
+  ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT``    u8      TCP header / data split
   ``ETHTOOL_A_RINGS_CQE_SIZE``          u32     Size of TX/RX CQE
   ``ETHTOOL_A_RINGS_TX_PUSH``           u8      flag of TX Push mode
   ``ETHTOOL_A_RINGS_RX_PUSH``           u8      flag of RX Push mode
   ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN``   u32     size of TX push buffer
+  ``ETHTOOL_A_RINGS_HDS_THRESH``        u32     threshold of header / data split
   ====================================  ======  ===========================
 
 Kernel checks that requested ring sizes do not exceed limits reported by
@@ -954,6 +976,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can
 transfer a bigger frame from wire. Based on the NIC hardware, the overall
 completion queue size can be adjusted in the driver if CQE size is modified.
 
+``ETHTOOL_A_RINGS_HDS_THRESH`` specifies the threshold of the
+header / data split feature. If a received packet is larger than this
+threshold, its header and data will be split.
+
 CHANNELS_GET
 ============
 
@@ -1242,9 +1268,10 @@ Gets timestamping information like ``ETHTOOL_GET_TS_INFO`` ioctl request.
 
 Request contents:
 
-  =====================================  ======  ==========================
-  ``ETHTOOL_A_TSINFO_HEADER``            nested  request header
-  =====================================  ======  ==========================
+  ========================================  ======  ============================
+  ``ETHTOOL_A_TSINFO_HEADER``               nested  request header
+  ``ETHTOOL_A_TSINFO_HWTSTAMP_PROVIDER``    nested  PTP hw clock provider
+  ========================================  ======  ============================
 
 Kernel response contents:
 
@@ -1263,11 +1290,17 @@ would be empty (no bit set).
 
 Additional hardware timestamping statistics response contents:
 
-  =====================================  ======  ===================================
-  ``ETHTOOL_A_TS_STAT_TX_PKTS``          uint    Packets with Tx HW timestamps
-  ``ETHTOOL_A_TS_STAT_TX_LOST``          uint    Tx HW timestamp not arrived count
-  ``ETHTOOL_A_TS_STAT_TX_ERR``           uint    HW error request Tx timestamp count
-  =====================================  ======  ===================================
+  ==================================================  ======  =====================
+  ``ETHTOOL_A_TS_STAT_TX_PKTS``                       uint    Packets with Tx
+                                                              HW timestamps
+  ``ETHTOOL_A_TS_STAT_TX_LOST``                       uint    Tx HW timestamp
+                                                              not arrived count
+  ``ETHTOOL_A_TS_STAT_TX_ERR``                        uint    HW error request
+                                                              Tx timestamp count
+  ``ETHTOOL_A_TS_STAT_TX_ONESTEP_PKTS_UNCONFIRMED``   uint    Packets with one-step
+                                                              HW TX timestamps with
+                                                              unconfirmed delivery
+  ==================================================  ======  =====================
 
 CABLE_TEST
 ==========
@@ -1608,6 +1641,7 @@ the ``ETHTOOL_A_STATS_GROUPS`` bitset. Currently defined values are:
  ETHTOOL_STATS_ETH_PHY  eth-phy  Basic IEEE 802.3 PHY statistics (30.3.2.1.*)
  ETHTOOL_STATS_ETH_CTRL eth-ctrl Basic IEEE 802.3 MAC Ctrl statistics (30.3.3.*)
  ETHTOOL_STATS_RMON     rmon     RMON (RFC 2819) statistics
+ ETHTOOL_STATS_PHY      phy      Additional PHY statistics, not defined by IEEE
  ====================== ======== ===============================================
 
 Each group should have a corresponding ``ETHTOOL_A_STATS_GRP`` in the reply.
@@ -1763,6 +1797,11 @@ Kernel response contents:
                                                       limit of the PoE PSE.
   ``ETHTOOL_A_C33_PSE_PW_LIMIT_RANGES``       nested  Supported power limit
                                                       configuration ranges.
+  ``ETHTOOL_A_PSE_PW_D_ID``                      u32  Index of the PSE power domain
+  ``ETHTOOL_A_PSE_PRIO_MAX``                     u32  Priority maximum configurable
+                                                      on the PoE PSE
+  ``ETHTOOL_A_PSE_PRIO``                         u32  Priority of the PoE PSE
+                                                      currently configured
   ==========================================  ======  =============================
 
 When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` attribute identifies
@@ -1836,6 +1875,15 @@ identifies the C33 PSE power limit ranges through
 If the controller works with fixed classes, the min and max values will be
 equal.
 
+The ``ETHTOOL_A_PSE_PW_D_ID`` attribute identifies the index of the PSE
+power domain.
+
+When set, the optional ``ETHTOOL_A_PSE_PRIO_MAX`` attribute identifies
+the PSE maximum priority value.
+When set, the optional ``ETHTOOL_A_PSE_PRIO`` attribute identifies the
+currently configured PSE priority.
+For a description of PSE priority attributes, see ``PSE_SET``.
+
 PSE_SET
 =======
 
@@ -1849,6 +1897,8 @@ Request contents:
   ``ETHTOOL_A_C33_PSE_ADMIN_CONTROL``        u32  Control PSE Admin state
   ``ETHTOOL_A_C33_PSE_AVAIL_PWR_LIMIT``      u32  Control PoE PSE available
                                                   power limit
+  ``ETHTOOL_A_PSE_PRIO``                     u32  Control priority of the
+                                                  PoE PSE
   ======================================  ======  =============================
 
 When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` attribute is used
@@ -1871,6 +1921,38 @@ various existing products that document power consumption in watts rather than
 classes. If power limit configuration based on classes is needed, the
 conversion can be done in user space, for example by ethtool.
 
+When set, the optional ``ETHTOOL_A_PSE_PRIO`` attribute is used to
+control the PSE priority. Allowed priority values are between zero and
+the value of the ``ETHTOOL_A_PSE_PRIO_MAX`` attribute.
+
+A lower value indicates a higher priority, meaning that a priority value
+of 0 corresponds to the highest port priority.
+Port priority serves two functions:
+
+ - Power-up Order: After a reset, ports are powered up in order of their
+   priority from highest to lowest. Ports with higher priority
+   (lower values) power up first.
+ - Shutdown Order: When the power budget is exceeded, ports with lower
+   priority (higher values) are turned off first.
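+
+The two orderings can be illustrated with a small sketch. It takes one
+``<port> <priority>`` pair per line on standard input and prints the ports in
+power-up or shutdown order. The helper names are illustrative only, and the
+order among ports with equal priority is not defined by this document, so the
+sketch makes no promise about ties.
+
+.. code-block:: bash
+
+   # Input on stdin: one "<port> <priority>" pair per line.
+   # Lower priority value = higher priority.
+
+   # power_up_order: highest priority (lowest value) first.
+   power_up_order() {
+       sort -k2,2n | awk '{ print $1 }'
+   }
+
+   # shutdown_order: lowest priority (highest value) first.
+   shutdown_order() {
+       sort -k2,2nr | awk '{ print $1 }'
+   }
+
+With the input ``eth0 2``, ``eth1 0``, ``eth2 1``, the power-up order is
+eth1, eth2, eth0, and the shutdown order is the reverse.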
+
+PSE_NTF
+=======
+
+Notify PSE events.
+
+Notification contents:
+
+  ===============================  ======  ========================
+  ``ETHTOOL_A_PSE_HEADER``         nested  request header
+  ``ETHTOOL_A_PSE_EVENTS``         bitset  PSE events
+  ===============================  ======  ========================
+
+When set, the optional ``ETHTOOL_A_PSE_EVENTS`` attribute identifies the
+PSE events.
+
+.. kernel-doc:: include/uapi/linux/ethtool_netlink_generated.h
+    :identifiers: ethtool_pse_event
+
 RSS_GET
 =======
 
@@ -1894,14 +1976,15 @@ used to ignore context 0s and only dump additional contexts).
 
 Kernel response contents:
 
-=====================================  ======  ==========================
+=====================================  ======  ===============================
   ``ETHTOOL_A_RSS_HEADER``             nested  reply header
   ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
   ``ETHTOOL_A_RSS_HFUNC``              u32     RSS hash func
   ``ETHTOOL_A_RSS_INDIR``              binary  Indir table bytes
   ``ETHTOOL_A_RSS_HKEY``               binary  Hash key bytes
   ``ETHTOOL_A_RSS_INPUT_XFRM``         u32     RSS input data transformation
-=====================================  ======  ==========================
+  ``ETHTOOL_A_RSS_FLOW_HASH``          nested  Header fields included in hash
+=====================================  ======  ===============================
 
 ETHTOOL_A_RSS_HFUNC attribute is bitmap indicating the hash function
 being used. Current supported options are toeplitz, xor or crc32.
@@ -1909,7 +1992,68 @@ ETHTOOL_A_RSS_INDIR attribute returns RSS indirection table where each byte
 indicates queue number.
 ETHTOOL_A_RSS_INPUT_XFRM attribute is a bitmap indicating the type of
 transformation applied to the input protocol fields before given to the RSS
-hfunc. Current supported option is symmetric-xor.
+hfunc. Current supported options are symmetric-xor and symmetric-or-xor.
+ETHTOOL_A_RSS_FLOW_HASH carries a per-flow-type bitmask of the header
+fields included in the hash calculation.
+
+RSS_SET
+=======
+
+Request contents:
+
+=====================================  ======  ==============================
+  ``ETHTOOL_A_RSS_HEADER``             nested  request header
+  ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
+  ``ETHTOOL_A_RSS_HFUNC``              u32     RSS hash func
+  ``ETHTOOL_A_RSS_INDIR``              binary  Indir table bytes
+  ``ETHTOOL_A_RSS_HKEY``               binary  Hash key bytes
+  ``ETHTOOL_A_RSS_INPUT_XFRM``         u32     RSS input data transformation
+  ``ETHTOOL_A_RSS_FLOW_HASH``          nested  Header fields included in hash
+=====================================  ======  ==============================
+
+``ETHTOOL_A_RSS_INDIR`` is the minimal RSS table the user expects. Kernel and
+the device driver may replicate the table if it is smaller than the smallest
+table size supported by the device. For example, if the user requests
+``[0, 1]`` but the device needs at least 8 entries, the real table in use
+will end up being ``[0, 1, 0, 1, 0, 1, 0, 1]``. Most devices require the table
+size to be a power of 2, so tables whose size is not a power of 2 will likely
+be rejected. Using a table of size 0 will reset the indirection table to the
+default.
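+
+The replication rule can be sketched in a few lines. This is an illustration
+of the behaviour described above, not the kernel implementation; the helper
+name ``replicate_indir`` is hypothetical. It repeats the requested entries as
+a whole until the table has at least the minimum size.
+
+.. code-block:: bash
+
+   # replicate_indir MIN_SIZE ENTRY...: print the requested entries, repeated
+   # as a whole until at least MIN_SIZE entries are present.
+   replicate_indir() {
+       local min=$1
+       shift
+       # An empty table means "reset to default" and is not handled here.
+       [ "$#" -gt 0 ] || return 1
+       local out=("$@")
+       while [ "${#out[@]}" -lt "$min" ]; do
+           out+=("$@")
+       done
+       echo "${out[*]}"
+   }
+
+   replicate_indir 8 0 1    # prints "0 1 0 1 0 1 0 1"
+
+A request that already meets the minimum size is returned unchanged.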
+
+RSS_CREATE_ACT
+==============
+
+Request contents:
+
+=====================================  ======  ==============================
+  ``ETHTOOL_A_RSS_HEADER``             nested  request header
+  ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
+  ``ETHTOOL_A_RSS_HFUNC``              u32     RSS hash func
+  ``ETHTOOL_A_RSS_INDIR``              binary  Indir table bytes
+  ``ETHTOOL_A_RSS_HKEY``               binary  Hash key bytes
+  ``ETHTOOL_A_RSS_INPUT_XFRM``         u32     RSS input data transformation
+=====================================  ======  ==============================
+
+Kernel response contents:
+
+=====================================  ======  ==============================
+  ``ETHTOOL_A_RSS_HEADER``             nested  reply header
+  ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
+=====================================  ======  ==============================
+
+Create an additional RSS context. If ``ETHTOOL_A_RSS_CONTEXT`` is not
+specified, the kernel will allocate one automatically.
+
+RSS_DELETE_ACT
+==============
+
+Request contents:
+
+=====================================  ======  ==============================
+  ``ETHTOOL_A_RSS_HEADER``             nested  request header
+  ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
+=====================================  ======  ==============================
+
+Delete an additional RSS context.
 
 PLCA_GET_CFG
 ============
@@ -2240,6 +2384,75 @@ Kernel response contents:
 When ``ETHTOOL_A_PHY_UPSTREAM_TYPE`` is PHY_UPSTREAM_PHY, the PHY's parent is
 another PHY.
 
+TSCONFIG_GET
+============
+
+Retrieves the information about the current hardware timestamping source and
+configuration.
+
+It is similar to the deprecated ``SIOCGHWTSTAMP`` ioctl request.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_TSCONFIG_HEADER``         nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ======================================== ======  ============================
+  ``ETHTOOL_A_TSCONFIG_HEADER``            nested  request header
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_PROVIDER`` nested  PTP hw clock provider
+  ``ETHTOOL_A_TSCONFIG_TX_TYPES``          bitset  hwtstamp Tx type
+  ``ETHTOOL_A_TSCONFIG_RX_FILTERS``        bitset  hwtstamp Rx filter
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_FLAGS``    u32     hwtstamp flags
+  ======================================== ======  ============================
+
+When set, the ``ETHTOOL_A_TSCONFIG_HWTSTAMP_PROVIDER`` attribute identifies the
+source of the hw timestamping provider. It is composed of the
+``ETHTOOL_A_TS_HWTSTAMP_PROVIDER_INDEX`` attribute, which describes the index
+of the PTP device, and ``ETHTOOL_A_TS_HWTSTAMP_PROVIDER_QUALIFIER``, which
+describes the qualifier of the timestamp.
+
+When set, the ``ETHTOOL_A_TSCONFIG_TX_TYPES``,
+``ETHTOOL_A_TSCONFIG_RX_FILTERS`` and ``ETHTOOL_A_TSCONFIG_HWTSTAMP_FLAGS``
+attributes identify the Tx type, the Rx filter and the flags configured for
+the current hw timestamping provider. The attributes are propagated to the
+driver through the following structure:
+
+.. kernel-doc:: include/linux/net_tstamp.h
+    :identifiers: kernel_hwtstamp_config
+
+TSCONFIG_SET
+============
+
+Set the information about the current hardware timestamping source and
+configuration.
+
+It is similar to the deprecated ``SIOCSHWTSTAMP`` ioctl request.
+
+Request contents:
+
+  ======================================== ======  ============================
+  ``ETHTOOL_A_TSCONFIG_HEADER``            nested  request header
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_PROVIDER`` nested  PTP hw clock provider
+  ``ETHTOOL_A_TSCONFIG_TX_TYPES``          bitset  hwtstamp Tx type
+  ``ETHTOOL_A_TSCONFIG_RX_FILTERS``        bitset  hwtstamp Rx filter
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_FLAGS``    u32     hwtstamp flags
+  ======================================== ======  ============================
+
+Kernel response contents:
+
+  ======================================== ======  ============================
+  ``ETHTOOL_A_TSCONFIG_HEADER``            nested  request header
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_PROVIDER`` nested  PTP hw clock provider
+  ``ETHTOOL_A_TSCONFIG_TX_TYPES``          bitset  hwtstamp Tx type
+  ``ETHTOOL_A_TSCONFIG_RX_FILTERS``        bitset  hwtstamp Rx filter
+  ``ETHTOOL_A_TSCONFIG_HWTSTAMP_FLAGS``    u32     hwtstamp flags
+  ======================================== ======  ============================
+
+For a description of each attribute, see ``TSCONFIG_GET``.
+
 Request translation
 ===================
 
@@ -2292,8 +2505,8 @@ are netlink only.
   ``ETHTOOL_SFLAGS``                  ``ETHTOOL_MSG_FEATURES_SET``
   ``ETHTOOL_GPFLAGS``                 ``ETHTOOL_MSG_PRIVFLAGS_GET``
   ``ETHTOOL_SPFLAGS``                 ``ETHTOOL_MSG_PRIVFLAGS_SET``
-  ``ETHTOOL_GRXFH``                   n/a
-  ``ETHTOOL_SRXFH``                   n/a
+  ``ETHTOOL_GRXFH``                   ``ETHTOOL_MSG_RSS_GET``
+  ``ETHTOOL_SRXFH``                   ``ETHTOOL_MSG_RSS_SET``
   ``ETHTOOL_GGRO``                    ``ETHTOOL_MSG_FEATURES_GET``
   ``ETHTOOL_SGRO``                    ``ETHTOOL_MSG_FEATURES_SET``
   ``ETHTOOL_GRXRINGS``                n/a
@@ -2307,8 +2520,8 @@ are netlink only.
   ``ETHTOOL_SRXNTUPLE``               n/a
   ``ETHTOOL_GRXNTUPLE``               n/a
   ``ETHTOOL_GSSET_INFO``              ``ETHTOOL_MSG_STRSET_GET``
-  ``ETHTOOL_GRXFHINDIR``              n/a
-  ``ETHTOOL_SRXFHINDIR``              n/a
+  ``ETHTOOL_GRXFHINDIR``              ``ETHTOOL_MSG_RSS_GET``
+  ``ETHTOOL_SRXFHINDIR``              ``ETHTOOL_MSG_RSS_SET``
   ``ETHTOOL_GFEATURES``               ``ETHTOOL_MSG_FEATURES_GET``
   ``ETHTOOL_SFEATURES``               ``ETHTOOL_MSG_FEATURES_SET``
   ``ETHTOOL_GCHANNELS``               ``ETHTOOL_MSG_CHANNELS_GET``
@@ -2348,4 +2561,6 @@ are netlink only.
   n/a                                 ``ETHTOOL_MSG_MM_SET``
   n/a                                 ``ETHTOOL_MSG_MODULE_FW_FLASH_ACT``
   n/a                                 ``ETHTOOL_MSG_PHY_GET``
+  ``SIOCGHWTSTAMP``                   ``ETHTOOL_MSG_TSCONFIG_GET``
+  ``SIOCSHWTSTAMP``                   ``ETHTOOL_MSG_TSCONFIG_SET``
   =================================== =====================================
diff --git a/Documentation/networking/ieee802154.rst b/Documentation/networking/ieee802154.rst
index c652d383fe10..743c0a80e309 100644
--- a/Documentation/networking/ieee802154.rst
+++ b/Documentation/networking/ieee802154.rst
@@ -72,7 +72,8 @@ exports a management (e.g. MLME) and data API.
 possibly with some kinds of acceleration like automatic CRC computation and
 comparison, automagic ACK handling, address matching, etc.
 
-Those types of devices require different approach to be hooked into Linux kernel.
+Each type of device requires a different approach to be hooked into the Linux
+kernel.
 
 HardMAC
 -------
@@ -81,10 +82,10 @@ See the header include/net/ieee802154_netdev.h. You have to implement Linux
 net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
 code via plain sk_buffs. On skb reception skb->cb must contain additional
 info as described in the struct ieee802154_mac_cb. During packet transmission
-the skb->cb is used to provide additional data to device's header_ops->create
-function. Be aware that this data can be overridden later (when socket code
-submits skb to qdisc), so if you need something from that cb later, you should
-store info in the skb->data on your own.
+the skb->cb is used to provide additional data to the device's
+header_ops->create function. Be aware that this data can be overridden later
+(when socket code submits skb to qdisc), so if you need something from that cb
+later, you should store info in the skb->data on your own.
 
 To hook the MLME interface you have to populate the ml_priv field of your
 net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
@@ -94,8 +95,9 @@ All other fields are required.
 SoftMAC
 -------
 
-The MAC is the middle layer in the IEEE 802.15.4 Linux stack. This moment it
-provides interface for drivers registration and management of slave interfaces.
+The MAC is the middle layer in the IEEE 802.15.4 Linux stack. At the moment, it
+provides an interface for driver registration and management of slave
+interfaces.
 
 NOTE: Currently the only monitor device type is supported - it's IEEE 802.15.4
 stack interface for network sniffers (e.g. WireShark).
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 803dfc1efb75..ac90b82f3ce9 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -14,6 +14,7 @@ Contents:
    can
    can_ucan_protocol
    device_drivers/index
+   diagnostic/index
    dsa/index
    devlink/index
    caif/index
@@ -47,7 +48,6 @@ Contents:
    ax25
    bonding
    cdc_mbim
-   dccp
    dctcp
    devmem
    dns_resolver
@@ -62,6 +62,7 @@ Contents:
    gtp
    ila
    ioam6-sysctl
+   iou-zcrx
    ip_dynaddr
    ipsec
    ip-sysctl
@@ -85,6 +86,7 @@ Contents:
    netdevices
    netfilter-sysctl
    netif-msg
+   netmem
    nexthop-group-resilient
    nf_conntrack-sysctl
    nf_flowtable
diff --git a/Documentation/networking/iou-zcrx.rst b/Documentation/networking/iou-zcrx.rst
new file mode 100644
index 000000000000..0127319b30bb
--- /dev/null
+++ b/Documentation/networking/iou-zcrx.rst
@@ -0,0 +1,202 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+io_uring zero copy Rx
+=====================
+
+Introduction
+============
+
+io_uring zero copy Rx (ZC Rx) is a feature that removes kernel-to-user copy on
+the network receive path, allowing packet data to be received directly into
+userspace memory. This feature is different to TCP_ZEROCOPY_RECEIVE in that
+there are no strict alignment requirements and no need to mmap()/munmap().
+Compared to kernel bypass solutions such as DPDK, the packet headers are
+processed by the kernel TCP stack as normal.
+
+NIC HW Requirements
+===================
+
+Several NIC HW features are required for io_uring ZC Rx to work. For now the
+kernel API does not configure the NIC and it must be done by the user.
+
+Header/data split
+-----------------
+
+Required to split packets at the L4 boundary into a header and a payload.
+Headers are received into kernel memory as normal and processed by the TCP
+stack as normal. Payloads are received into userspace memory directly.
+
+Flow steering
+-------------
+
+Specific HW Rx queues are configured for this feature, but modern NICs
+typically distribute flows across all HW Rx queues. Flow steering is required
+to ensure that only desired flows are directed towards HW queues that are
+configured for io_uring ZC Rx.
+
+RSS
+---
+
+In addition to flow steering above, RSS is required to steer all other non-zero
+copy flows away from queues that are configured for io_uring ZC Rx.
+
+Usage
+=====
+
+Setup NIC
+---------
+
+Must be done out of band for now.
+
+Ensure there are at least two queues::
+
+  ethtool -L eth0 combined 2
+
+Enable header/data split::
+
+  ethtool -G eth0 tcp-data-split on
+
+Carve out half of the HW Rx queues for zero copy using RSS::
+
+  ethtool -X eth0 equal 1
+
+Set up flow steering, bearing in mind that queues are 0-indexed::
+
+  ethtool -N eth0 flow-type tcp6 ... action 1
+
+Setup io_uring
+--------------
+
+This section describes the low level io_uring kernel API. Please refer to
+liburing documentation for how to use the higher level API.
+
+Create an io_uring instance with the following required setup flags::
+
+  IORING_SETUP_SINGLE_ISSUER
+  IORING_SETUP_DEFER_TASKRUN
+  IORING_SETUP_CQE32
+
+Create memory area
+------------------
+
+Allocate userspace memory area for receiving zero copy data::
+
+  void *area_ptr = mmap(NULL, area_size,
+                        PROT_READ | PROT_WRITE,
+                        MAP_ANONYMOUS | MAP_PRIVATE,
+                        0, 0);
+
+Create refill ring
+------------------
+
+Allocate memory for a shared ringbuf used for returning consumed buffers::
+
+  void *ring_ptr = mmap(NULL, ring_size,
+                        PROT_READ | PROT_WRITE,
+                        MAP_ANONYMOUS | MAP_PRIVATE,
+                        0, 0);
+
+This refill ring consists of some space for the header, followed by an array of
+``struct io_uring_zcrx_rqe``::
+
+  size_t rq_entries = 4096;
+  size_t ring_size = rq_entries * sizeof(struct io_uring_zcrx_rqe) + PAGE_SIZE;
+  /* align to page size */
+  ring_size = (ring_size + (PAGE_SIZE - 1)) & ~(PAGE_SIZE - 1);
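
The size calculation above can also be written as a standalone helper, which makes the rounding step easier to check in isolation. This is a sketch under stated assumptions: the function name ``zcrx_ring_size`` is hypothetical, and the entry size, header space and page size are passed in as parameters so that it builds without kernel headers. ``page_size`` must be a power of two for the mask trick to be valid.

```c
#include <stddef.h>

/*
 * Illustrative helper (name and parameters are assumptions): compute the
 * refill ring allocation size as header space plus an array of entries,
 * rounded up to a whole number of pages.
 */
static size_t zcrx_ring_size(size_t rq_entries, size_t rqe_size,
                             size_t header_size, size_t page_size)
{
    size_t size = rq_entries * rqe_size + header_size;

    /* round up to the next multiple of page_size (power of two only) */
    return (size + (page_size - 1)) & ~(page_size - 1);
}
```

A caller following the snippet above would pass ``sizeof(struct io_uring_zcrx_rqe)`` as ``rqe_size`` and ``PAGE_SIZE`` for both ``header_size`` and ``page_size``.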
+
+Register ZC Rx
+--------------
+
+Fill in registration structs::
+
+  struct io_uring_zcrx_area_reg area_reg = {
+    .addr = (__u64)(unsigned long)area_ptr,
+    .len = area_size,
+    .flags = 0,
+  };
+
+  struct io_uring_region_desc region_reg = {
+    .user_addr = (__u64)(unsigned long)ring_ptr,
+    .size = ring_size,
+    .flags = IORING_MEM_REGION_TYPE_USER,
+  };
+
+  struct io_uring_zcrx_ifq_reg reg = {
+    .if_idx = if_nametoindex("eth0"),
+    /* this is the HW queue with desired flow steered into it */
+    .if_rxq = 1,
+    .rq_entries = rq_entries,
+    .area_ptr = (__u64)(unsigned long)&area_reg,
+    .region_ptr = (__u64)(unsigned long)&region_reg,
+  };
+
+Register with kernel::
+
+  io_uring_register_ifq(ring, &reg);
+
+Map refill ring
+---------------
+
+The kernel fills in fields for the refill ring in the registration ``struct
+io_uring_zcrx_ifq_reg``. Map it into userspace::
+
+  struct io_uring_zcrx_rq refill_ring;
+
+  refill_ring.khead = (unsigned *)((char *)ring_ptr + reg.offsets.head);
+  refill_ring.ktail = (unsigned *)((char *)ring_ptr + reg.offsets.tail);
+  refill_ring.rqes =
+    (struct io_uring_zcrx_rqe *)((char *)ring_ptr + reg.offsets.rqes);
+  refill_ring.rq_tail = 0;
+  refill_ring.ring_ptr = ring_ptr;
+
+Receiving data
+--------------
+
+Prepare a zero copy recv request::
+
+  struct io_uring_sqe *sqe;
+
+  sqe = io_uring_get_sqe(ring);
+  io_uring_prep_rw(IORING_OP_RECV_ZC, sqe, fd, NULL, 0, 0);
+  sqe->ioprio |= IORING_RECV_MULTISHOT;
+
+Now, submit and wait::
+
+  io_uring_submit_and_wait(ring, 1);
+
+Finally, process completions::
+
+  struct io_uring_cqe *cqe;
+  unsigned int count = 0;
+  unsigned int head;
+
+  io_uring_for_each_cqe(ring, head, cqe) {
+    struct io_uring_zcrx_cqe *rcqe = (struct io_uring_zcrx_cqe *)(cqe + 1);
+
+    unsigned long mask = (1ULL << IORING_ZCRX_AREA_SHIFT) - 1;
+    unsigned char *data = (unsigned char *)area_ptr + (rcqe->off & mask);
+    /* do something with the data */
+
+    count++;
+  }
+  io_uring_cq_advance(ring, count);
+
+Recycling buffers
+-----------------
+
+Return buffers back to the kernel to be used again::
+
+  struct io_uring_zcrx_rqe *rqe;
+  unsigned mask = refill_ring.ring_entries - 1;
+  rqe = &refill_ring.rqes[refill_ring.rq_tail & mask];
+
+  unsigned long area_offset = rcqe->off & ~IORING_ZCRX_AREA_MASK;
+  rqe->off = area_offset | area_reg.rq_area_token;
+  rqe->len = cqe->res;
+  IO_URING_WRITE_ONCE(*refill_ring.ktail, ++refill_ring.rq_tail);
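
The offset handling in the completion and recycling steps comes down to a few bit operations, sketched here as standalone helpers. All ``sketch_`` names are hypothetical. The shift value of 48 is an assumption intended to mirror ``IORING_ZCRX_AREA_SHIFT``, so check it against your uapi header before relying on it. The slot helper assumes the number of ring entries is a power of two.

```c
#include <stdint.h>

/* Assumed value mirroring IORING_ZCRX_AREA_SHIFT; verify in the uapi header. */
#define SKETCH_AREA_SHIFT 48
/* High bits carry the area token, low bits carry the offset into the area. */
#define SKETCH_AREA_MASK  (~((UINT64_C(1) << SKETCH_AREA_SHIFT) - 1))

/* Strip the area bits from a completion's off field. */
static uint64_t sketch_area_offset(uint64_t cqe_off)
{
    return cqe_off & ~SKETCH_AREA_MASK;
}

/* Build the off value for a refill entry from a completion and area token. */
static uint64_t sketch_rqe_off(uint64_t cqe_off, uint64_t area_token)
{
    return sketch_area_offset(cqe_off) | area_token;
}

/* Map a free-running tail counter to a slot; ring_entries is a power of 2. */
static uint32_t sketch_rq_slot(uint32_t rq_tail, uint32_t ring_entries)
{
    return rq_tail & (ring_entries - 1);
}
```

Keeping the tail as a free-running counter and masking only when indexing is why the recycling snippet stores ``++refill_ring.rq_tail`` without wrapping it.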
+
+Testing
+=======
+
+See ``tools/testing/selftests/drivers/net/hw/iou-zcrx.c``
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index eacf8983e230..9756d16e3df1 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -8,15 +8,19 @@ IP Sysctl
 ==============================
 
 ip_forward - BOOLEAN
-	- 0 - disabled (default)
-	- not 0 - enabled
-
 	Forward Packets between interfaces.
 
 	This variable is special, its change resets all configuration
 	parameters to their default state (RFC1122 for hosts, RFC1812
 	for routers)
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 ip_default_ttl - INTEGER
 	Default value of TTL field (Time To Live) for outgoing (but not
 	forwarded) IP packets. Should be between 1 and 255 inclusive.
@@ -37,8 +41,8 @@ ip_no_pmtu_disc - INTEGER
 	Mode 3 is a hardened pmtu discover mode. The kernel will only
 	accept fragmentation-needed errors if the underlying protocol
 	can verify them besides a plain socket lookup. Current
-	protocols for which pmtu events will be honored are TCP, SCTP
-	and DCCP as they verify e.g. the sequence number or the
+	protocols for which pmtu events will be honored are TCP and
+	SCTP as they verify e.g. the sequence number or the
 	association. This mode should not be enabled globally but is
 	only intended to secure e.g. name servers in namespaces where
 	TCP path mtu must still work but path MTU information of other
@@ -62,20 +66,25 @@ ip_forward_use_pmtu - BOOLEAN
 	kernel honoring this information. This is normally not the
 	case.
 
-	Default: 0 (disabled)
-
 	Possible values:
 
-	- 0 - disabled
-	- 1 - enabled
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 fwmark_reflect - BOOLEAN
 	Controls the fwmark of kernel-generated IPv4 reply packets that are not
 	associated with a socket for example, TCP RSTs or ICMP echo replies).
-	If unset, these packets have a fwmark of zero. If set, they have the
+	If disabled, these packets have a fwmark of zero. If enabled, they have the
 	fwmark of the packet they are replying to.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 fib_multipath_use_neigh - BOOLEAN
 	Use status of existing neighbor entry when determining nexthop for
@@ -83,12 +92,12 @@ fib_multipath_use_neigh - BOOLEAN
 	packets could be directed to a failed nexthop. Only valid for kernels
 	built with CONFIG_IP_ROUTE_MULTIPATH enabled.
 
-	Default: 0 (disabled)
-
 	Possible values:
 
-	- 0 - disabled
-	- 1 - enabled
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 fib_multipath_hash_policy - INTEGER
 	Controls which hash policy to use for multipath routes. Only valid
@@ -368,7 +377,12 @@ tcp_autocorking - BOOLEAN
 	queue. Applications can still use TCP_CORK for optimal behavior
 	when they know how/when to uncork their sockets.
 
-	Default : 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 tcp_available_congestion_control - STRING
 	Shows the available congestion control choices that are registered.
@@ -408,9 +422,16 @@ tcp_congestion_control - STRING
 tcp_dsack - BOOLEAN
 	Allows TCP to send "duplicate" SACKs.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
+
 tcp_early_retrans - INTEGER
 	Tail loss probe (TLP) converts RTOs occurring due to tail
-	losses into fast recovery (draft-ietf-tcpm-rack). Note that
+	losses into fast recovery (RFC8985). Note that
 	TLP requires RACK to function properly (see tcp_recovery below)
 
 	Possible values:
@@ -447,7 +468,12 @@ tcp_ecn_fallback - BOOLEAN
 	knob. The value	is not used, if tcp_ecn or per route (or congestion
 	control) ECN settings are disabled.
 
-	Default: 1 (fallback enabled)
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 tcp_fack - BOOLEAN
 	This is a legacy option, it has no effect anymore.
@@ -474,7 +500,7 @@ tcp_frto - INTEGER
 	By default it's enabled with a non-zero value. 0 disables F-RTO.
 
 tcp_fwmark_accept - BOOLEAN
-	If set, incoming connections to listening sockets that do not have a
+	If enabled, incoming connections to listening sockets that do not have a
 	socket mark will set the mark of the accepting socket to the fwmark of
 	the incoming SYN packet. This will cause all packets on that connection
 	(starting from the first SYNACK) to be sent with that fwmark. The
@@ -482,7 +508,12 @@ tcp_fwmark_accept - BOOLEAN
 	have a fwmark set via setsockopt(SOL_SOCKET, SO_MARK, ...) are
 	unaffected.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_invalid_ratelimit - INTEGER
 	Limit the maximal rate for sending duplicate acknowledgments
@@ -528,6 +559,11 @@ tcp_l3mdev_accept - BOOLEAN
 	which the packets originated. Only valid when the kernel was
 	compiled with CONFIG_NET_L3_MASTER_DEV.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
 	Default: 0 (disabled)
 
 tcp_low_latency - BOOLEAN
@@ -593,10 +629,16 @@ tcp_min_rtt_wlen - INTEGER
 	Default: 300
 
 tcp_moderate_rcvbuf - BOOLEAN
-	If set, TCP performs receive buffer auto-tuning, attempting to
+	If enabled, TCP performs receive buffer auto-tuning, attempting to
 	automatically size the buffer (no greater than tcp_rmem[2]) to
-	match the size required by the path for full throughput.  Enabled by
-	default.
+	match the size required by the path for full throughput.
+
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 tcp_mtu_probing - INTEGER
 	Controls TCP Packetization-Layer Path MTU Discovery.  Takes three
@@ -621,13 +663,26 @@ tcp_no_metrics_save - BOOLEAN
 	when the connection closes, so that connections established in the
 	near future can use these to set initial conditions.  Usually, this
 	increases overall performance, but may sometimes cause performance
-	degradation.  If set, TCP will not cache metrics on closing
+	degradation.  If enabled, TCP will not cache metrics on closing
 	connections.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 tcp_no_ssthresh_metrics_save - BOOLEAN
 	Controls whether TCP saves ssthresh metrics in the route cache.
+	If enabled, ssthresh metrics are disabled.
+
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
 
-	Default is 1, which disables ssthresh metrics.
+	Default: 1 (enabled)
 
 tcp_orphan_retries - INTEGER
 	This value influences the timeout of a locally closed TCP connection,
@@ -645,9 +700,11 @@ tcp_recovery - INTEGER
 	features.
 
 	=========   =============================================================
-	RACK: 0x1   enables the RACK loss detection for fast detection of lost
-		    retransmissions and tail drops. It also subsumes and disables
-		    RFC6675 recovery for SACK connections.
+	RACK: 0x1   enables RACK loss detection, for fast detection of lost
+		    retransmissions and tail drops, and resilience to
+		    reordering. Currently, setting this bit to 0 has no
+		    effect, since RACK is the only supported loss detection
+		    algorithm.
 
 	RACK: 0x2   makes RACK's reordering window static (min_rtt/4).
 
@@ -664,6 +721,11 @@ tcp_reflect_tos - BOOLEAN
 
 	This options affects both IPv4 and IPv6.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
 	Default: 0 (disabled)
 
 tcp_reordering - INTEGER
@@ -685,6 +747,13 @@ tcp_retrans_collapse - BOOLEAN
 	On retransmit try to send bigger packets to work around bugs in
 	certain TCP stacks.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
+
 tcp_retries1 - INTEGER
 	This value influences the time, after which TCP decides, that
 	something is wrong due to unacknowledged RTO retransmissions,
@@ -705,16 +774,23 @@ tcp_retries2 - INTEGER
 	seconds and is a lower bound for the effective timeout.
 	TCP will effectively time out at the first RTO which exceeds the
 	hypothetical timeout.
+	If tcp_rto_max_ms is decreased, it is recommended to also
+	change tcp_retries2.
 
 	RFC 1122 recommends at least 100 seconds for the timeout,
 	which corresponds to a value of at least 8.
 
 tcp_rfc1337 - BOOLEAN
-	If set, the TCP stack behaves conforming to RFC1337. If unset,
+	If enabled, the TCP stack behaves conforming to RFC1337. If disabled,
 	we are not conforming to RFC, but prevent TCP TIME_WAIT
 	assassination.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_rmem - vector of 3 INTEGERs: min, default, max
 	min: Minimal size of receive buffer used by TCP sockets.
@@ -733,11 +809,18 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
 	net.core.rmem_max.  Calling setsockopt() with SO_RCVBUF disables
 	automatic tuning of that socket's receive buffer size, in which
 	case this value is ignored.
-	Default: between 131072 and 6MB, depending on RAM size.
+	Default: between 131072 and 32MB, depending on RAM size.
 
 tcp_sack - BOOLEAN
 	Enable select acknowledgments (SACKS).
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
+
 tcp_comp_sack_delay_ns - LONG INTEGER
 	TCP tries to reduce number of SACK sent, using a timer
 	based on 5% of SRTT, capped by this sysctl, in nano seconds.
@@ -760,26 +843,41 @@ tcp_comp_sack_nr - INTEGER
 	Default : 44
 
 tcp_backlog_ack_defer - BOOLEAN
-	If set, user thread processing socket backlog tries sending
+	If enabled, user thread processing socket backlog tries sending
 	one ACK for the whole queue. This helps to avoid potential
 	long latencies at end of a TCP socket syscall.
 
-	Default : true
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 tcp_slow_start_after_idle - BOOLEAN
-	If set, provide RFC2861 behavior and time out the congestion
+	If enabled, provide RFC2861 behavior and time out the congestion
 	window after an idle period.  An idle period is defined at
 	the current RTO.  If unset, the congestion window will not
 	be timed out after an idle period.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 tcp_stdurg - BOOLEAN
 	Use the Host requirements interpretation of the TCP urgent pointer field.
-	Most hosts use the older BSD interpretation, so if you turn this on
+	Most hosts use the older BSD interpretation, so if enabled,
 	Linux might not communicate correctly with them.
 
-	Default: FALSE
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_synack_retries - INTEGER
 	Number of times SYNACKs for a passive TCP connection attempt will
@@ -836,7 +934,12 @@ tcp_migrate_req - BOOLEAN
 	migration by returning SK_DROP in the type of eBPF program, or
 	disable this option.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_fastopen - INTEGER
 	Enable TCP Fast Open (RFC7413) to send and accept data in the opening
@@ -1000,9 +1103,30 @@ tcp_tw_reuse - INTEGER
 
 	Default: 2
 
+tcp_tw_reuse_delay - UNSIGNED INTEGER
+        The delay in milliseconds before a TIME-WAIT socket can be reused by a
+        new connection, if TIME-WAIT socket reuse is enabled. The actual reuse
+        threshold is within [N, N+1] range, where N is the requested delay in
+        milliseconds, to ensure the delay interval is never shorter than the
+        configured value.
+
+        This setting contains an assumption about the peer's TCP timestamp
+        clock tick interval. It should not be set to a value lower than the
+        peer's clock tick for the PAWS (Protection Against Wrapped Sequence
+        numbers) mechanism to work correctly for the reused connection.
+
+        Default: 1000 (milliseconds)
+
 tcp_window_scaling - BOOLEAN
 	Enable window scaling as defined in RFC1323.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
+
 tcp_shrink_window - BOOLEAN
 	This changes how the TCP receive window is calculated.
 
@@ -1010,13 +1134,15 @@ tcp_shrink_window - BOOLEAN
 	window can be offered, and that TCP implementations MUST ensure
 	that they handle a shrinking window, as specified in RFC 1122.
 
-	- 0 - Disabled.	The window is never shrunk.
-	- 1 - Enabled.	The window is shrunk when necessary to remain within
-			the memory limit set by autotuning (sk_rcvbuf).
-			This only occurs if a non-zero receive window
-			scaling factor is also in effect.
+	Possible values:
 
-	Default: 0
+	- 0 (disabled) - The window is never shrunk.
+	- 1 (enabled)  - The window is shrunk when necessary to remain within
+	  the memory limit set by autotuning (sk_rcvbuf).
+	  This only occurs if a non-zero receive window
+	  scaling factor is also in effect.
+
+	Default: 0 (disabled)
 
 tcp_wmem - vector of 3 INTEGERs: min, default, max
 	min: Amount of memory reserved for send buffers for TCP sockets.
@@ -1053,16 +1179,21 @@ tcp_notsent_lowat - UNSIGNED INTEGER
 	Default: UINT_MAX (0xFFFFFFFF)
 
 tcp_workaround_signed_windows - BOOLEAN
-	If set, assume no receipt of a window scaling option means the
+	If enabled, assume no receipt of a window scaling option means the
 	remote TCP is broken and treats the window as a signed quantity.
-	If unset, assume the remote TCP is not broken even if we do
+	If disabled, assume the remote TCP is not broken even if we do
 	not receive a window scaling option from them.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_thin_linear_timeouts - BOOLEAN
 	Enable dynamic triggering of linear timeouts for thin streams.
-	If set, a check is performed upon retransmission by timeout to
+	If enabled, a check is performed upon retransmission by timeout to
 	determine if the stream is thin (less than 4 packets in flight).
 	As long as the stream is found to be thin, up to 6 linear
 	timeouts may be performed before exponential backoff mode is
@@ -1071,7 +1202,12 @@ tcp_thin_linear_timeouts - BOOLEAN
 	For more information on thin streams, see
 	Documentation/networking/tcp-thin.rst
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_limit_output_bytes - INTEGER
 	Controls TCP Small Queue limit per tcp socket.
@@ -1083,7 +1219,7 @@ tcp_limit_output_bytes - INTEGER
 	limits the number of bytes on qdisc or device to reduce artificial
 	RTT/cwnd and reduce bufferbloat.
 
-	Default: 1048576 (16 * 65536)
+	Default: 4194304 (4 MB)
 
 tcp_challenge_ack_limit - INTEGER
 	Limits number of Challenge ACK sent per second, as recommended
@@ -1123,7 +1259,7 @@ tcp_child_ehash_entries - INTEGER
 	Default: 0
 
 tcp_plb_enabled - BOOLEAN
-	If set and the underlying congestion control (e.g. DCTCP) supports
+	If enabled and the underlying congestion control (e.g. DCTCP) supports
 	and enables PLB feature, TCP PLB (Protective Load Balancing) is
 	enabled. PLB is described in the following paper:
 	https://doi.org/10.1145/3544216.3544226. Based on PLB parameters,
@@ -1139,12 +1275,17 @@ tcp_plb_enabled - BOOLEAN
 	by switches to determine next hop. In either case, further host
 	and switch side changes will be needed.
 
-	When set, PLB assumes that congestion signal (e.g. ECN) is made
+	If enabled, PLB assumes that congestion signal (e.g. ECN) is made
 	available and used by congestion control module to estimate a
 	congestion measure (e.g. ce_ratio). PLB needs a congestion measure to
 	make repathing decisions.
 
-	Default: FALSE
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 tcp_plb_idle_rehash_rounds - INTEGER
 	Number of consecutive congested rounds (RTT) seen after which
@@ -1213,8 +1354,8 @@ tcp_pingpong_thresh - INTEGER
 tcp_rto_min_us - INTEGER
 	Minimal TCP retransmission timeout (in microseconds). Note that the
 	rto_min route option has the highest precedence for configuring this
-	setting, followed by the TCP_BPF_RTO_MIN socket option, followed by
-	this tcp_rto_min_us sysctl.
+	setting, followed by the TCP_BPF_RTO_MIN and TCP_RTO_MIN_US socket
+	options, followed by this tcp_rto_min_us sysctl.
 
 	The recommended practice is to use a value less or equal to 200000
 	microseconds.
@@ -1223,6 +1364,17 @@ tcp_rto_min_us - INTEGER
 
 	Default: 200000
 
+tcp_rto_max_ms - INTEGER
+	Maximal TCP retransmission timeout (in ms).
+	Note that the TCP_RTO_MAX_MS socket option has higher precedence.
+
+	When changing tcp_rto_max_ms, it is important to understand
+	that tcp_retries2 might need a change.
+
+	Possible Values: 1000 - 120,000
+
+	Default: 120,000
+
 UDP variables
 =============
 
@@ -1233,6 +1385,11 @@ udp_l3mdev_accept - BOOLEAN
 	originated. Only valid when the kernel was compiled with
 	CONFIG_NET_L3_MASTER_DEV.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
 	Default: 0 (disabled)
 
 udp_mem - vector of 3 INTEGERs: min, pressure, max
@@ -1263,7 +1420,7 @@ udp_hash_entries - INTEGER
 	A negative value means the networking namespace does not own its
 	hash buckets and shares the initial networking namespace's one.
 
-udp_child_ehash_entries - INTEGER
+udp_child_hash_entries - INTEGER
 	Control the number of hash buckets for UDP sockets in the child
 	networking namespace, which must be set before clone() or unshare().
 
@@ -1293,19 +1450,29 @@ raw_l3mdev_accept - BOOLEAN
 	originated. Only valid when the kernel was compiled with
 	CONFIG_NET_L3_MASTER_DEV.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
 	Default: 1 (enabled)
 
 CIPSOv4 Variables
 =================
 
 cipso_cache_enable - BOOLEAN
-	If set, enable additions to and lookups from the CIPSO label mapping
-	cache.  If unset, additions are ignored and lookups always result in a
+	If enabled, enable additions to and lookups from the CIPSO label mapping
+	cache.  If disabled, additions are ignored and lookups always result in a
 	miss.  However, regardless of the setting the cache is still
 	invalidated when required when means you can safely toggle this on and
 	off and the cache will always be "safe".
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 cipso_cache_bucket_size - INTEGER
 	The CIPSO label cache consists of a fixed size hash table with each
@@ -1323,17 +1490,27 @@ cipso_rbm_optfmt - BOOLEAN
 	This means that when set the CIPSO tag will be padded with empty
 	categories in order to make the packet data 32-bit aligned.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
 
-cipso_rbm_structvalid - BOOLEAN
-	If set, do a very strict check of the CIPSO option when
-	ip_options_compile() is called.  If unset, relax the checks done during
+	Default: 0 (disabled)
+
+cipso_rbm_strictvalid - BOOLEAN
+	If enabled, do a very strict check of the CIPSO option when
+	ip_options_compile() is called.  If disabled, relax the checks done during
 	ip_options_compile().  Either way is "safe" as errors are caught else
 	where in the CIPSO processing code but setting this to 0 (False) should
 	result in less work (i.e. it should be faster) but could cause problems
 	with other implementations that require strict checking.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 IP Variables
 ============
@@ -1390,10 +1567,15 @@ ip_unprivileged_port_start - INTEGER
 	Default: 1024
 
 ip_nonlocal_bind - BOOLEAN
-	If set, allows processes to bind() to non-local IP addresses,
+	If enabled, allows processes to bind() to non-local IP addresses,
 	which can be quite useful - but may break some applications.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 ip_autobind_reuse - BOOLEAN
 	By default, bind() does not select the ports automatically even if
@@ -1402,7 +1584,13 @@ ip_autobind_reuse - BOOLEAN
 	when you use bind()+connect(), but may break some applications.
 	The preferred solution is to use IP_BIND_ADDRESS_NO_PORT and this
 	option should only be set by experts.
-	Default: 0
+
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 ip_dynaddr - INTEGER
 	If set non-zero, enables support for dynamic addresses.
@@ -1420,7 +1608,12 @@ ip_early_demux - BOOLEAN
 	It may add an additional cost for pure routing workloads that
 	reduces overall throughput, in such case you should disable it.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 ping_group_range - 2 INTEGERS
 	Restrict ICMP_PROTO datagram sockets to users in the group range.
@@ -1432,31 +1625,56 @@ ping_group_range - 2 INTEGERS
 tcp_early_demux - BOOLEAN
 	Enable early demux for established TCP sockets.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 udp_early_demux - BOOLEAN
 	Enable early demux for connected UDP sockets. Disable this if
 	your system could experience more unconnected load.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 icmp_echo_ignore_all - BOOLEAN
-	If set non-zero, then the kernel will ignore all ICMP ECHO
+	If enabled, then the kernel will ignore all ICMP ECHO
 	requests sent to it.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 icmp_echo_enable_probe - BOOLEAN
-        If set to one, then the kernel will respond to RFC 8335 PROBE
+        If enabled, then the kernel will respond to RFC 8335 PROBE
         requests sent to it.
 
-        Default: 0
+        Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 icmp_echo_ignore_broadcasts - BOOLEAN
-	If set non-zero, then the kernel will ignore all ICMP ECHO and
+	If enabled, then the kernel will ignore all ICMP ECHO and
 	TIMESTAMP requests sent to it via broadcast/multicast.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 icmp_ratelimit - INTEGER
 	Limit the maximal rates for sending ICMP packets whose type matches
@@ -1513,17 +1731,22 @@ icmp_ratemask - INTEGER
 icmp_ignore_bogus_error_responses - BOOLEAN
 	Some routers violate RFC1122 by sending bogus responses to broadcast
 	frames.  Such violations are normally logged via a kernel warning.
-	If this is set to TRUE, the kernel will not give such warnings, which
+	If enabled, the kernel will not give such warnings, which
 	will avoid log file clutter.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 icmp_errors_use_inbound_ifaddr - BOOLEAN
 
-	If zero, icmp error messages are sent with the primary address of
+	If disabled, icmp error messages are sent with the primary address of
 	the exiting interface.
 
-	If non-zero, the message will be sent with the primary address of
+	If enabled, the message will be sent with the primary address of
 	the interface that received the packet that caused the icmp error.
 	This is the behaviour many network administrators will expect from
 	a router. And it can make debugging complicated network layouts
@@ -1533,7 +1756,12 @@ icmp_errors_use_inbound_ifaddr - BOOLEAN
 	then the primary address of the first non-loopback interface that
 	has one will be used regardless of this setting.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 igmp_max_memberships - INTEGER
 	Change the maximum number of multicast groups we can subscribe to.
@@ -1663,10 +1891,10 @@ proxy_arp_pvlan - BOOLEAN
 
 	This technology is known by different names:
 
-	  In RFC 3069 it is called VLAN Aggregation.
-	  Cisco and Allied Telesyn call it Private VLAN.
-	  Hewlett-Packard call it Source-Port filtering or port-isolation.
-	  Ericsson call it MAC-Forced Forwarding (RFC Draft).
+	- In RFC 3069 it is called VLAN Aggregation.
+	- Cisco and Allied Telesyn call it Private VLAN.
+	- Hewlett-Packard call it Source-Port filtering or port-isolation.
+	- Ericsson call it MAC-Forced Forwarding (RFC Draft).
 
 proxy_delay - INTEGER
 	Delay proxy response.
@@ -1883,8 +2111,12 @@ arp_evict_nocarrier - BOOLEAN
 	between access points on the same network. In most cases this should
 	remain as the default (1).
 
-	- 1 - (default): Clear the ARP cache on NOCARRIER events
-	- 0 - Do not clear ARP cache on NOCARRIER events
+	Possible values:
+
+	- 0 (disabled) - Do not clear ARP cache on NOCARRIER events
+	- 1 (enabled)  - Clear the ARP cache on NOCARRIER events
+
+	Default: 1 (enabled)
 
 mcast_solicit - INTEGER
 	The maximum number of multicast probes in INCOMPLETE state,
@@ -1907,9 +2139,23 @@ mcast_resolicit - INTEGER
 disable_policy - BOOLEAN
 	Disable IPSEC policy (SPD) for this interface
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 disable_xfrm - BOOLEAN
 	Disable IPSEC encryption on this interface, whatever the policy
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 igmpv2_unsolicited_report_interval - INTEGER
 	The interval in milliseconds in which the next unsolicited
 	IGMPv1 or IGMPv2 report retransmit will take place.
@@ -1925,11 +2171,25 @@ igmpv3_unsolicited_report_interval - INTEGER
 ignore_routes_with_linkdown - BOOLEAN
         Ignore routes whose link is down when performing a FIB lookup.
 
+        Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 promote_secondaries - BOOLEAN
 	When a primary IP address is removed from this interface
 	promote a corresponding secondary IP address instead of
 	removing all the corresponding secondary IP addresses.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 drop_unicast_in_l2_multicast - BOOLEAN
 	Drop any unicast IP packets that are received in link-layer
 	multicast (or broadcast) frames.
@@ -1937,14 +2197,24 @@ drop_unicast_in_l2_multicast - BOOLEAN
 	This behavior (for multicast) is actually a SHOULD in RFC
 	1122, but is disabled by default for compatibility reasons.
 
-	Default: off (0)
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 drop_gratuitous_arp - BOOLEAN
 	Drop all gratuitous ARP frames, for example if there's a known
 	good ARP proxy on the network and such frames need not be used
 	(or in the case of 802.11, must not be used to prevent attacks.)
 
-	Default: off (0)
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 
 tag - INTEGER
@@ -1988,20 +2258,24 @@ bindv6only - BOOLEAN
 	which restricts use of the IPv6 socket to IPv6 communication
 	only.
 
-		- TRUE: disable IPv4-mapped address feature
-		- FALSE: enable IPv4-mapped address feature
+	Possible values:
+
+	- 0 (disabled) - enable IPv4-mapped address feature
+	- 1 (enabled)  - disable IPv4-mapped address feature
 
-	Default: FALSE (as specified in RFC3493)
+	Default: 0 (disabled)
 
 flowlabel_consistency - BOOLEAN
 	Protect the consistency (and unicity) of flow label.
 	You have to disable it to use IPV6_FL_F_REFLECT flag on the
 	flow label manager.
 
-	- TRUE: enabled
-	- FALSE: disabled
+	Possible values:
 
-	Default: TRUE
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 auto_flowlabels - INTEGER
 	Automatically generate flow labels based on a flow hash of the
@@ -2027,10 +2301,13 @@ flowlabel_state_ranges - BOOLEAN
 	reserved for the IPv6 flow manager facility, 0x80000-0xFFFFF
 	is reserved for stateless flow labels as described in RFC6437.
 
-	- TRUE: enabled
-	- FALSE: disabled
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
-	Default: true
 
 flowlabel_reflect - INTEGER
 	Control flow label reflection. Needed for Path MTU
@@ -2098,10 +2375,13 @@ anycast_src_echo_reply - BOOLEAN
 	Controls the use of anycast addresses as source addresses for ICMPv6
 	echo reply
 
-	- TRUE:  enabled
-	- FALSE: disabled
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
-	Default: FALSE
 
 idgen_delay - INTEGER
 	Controls the delay in seconds after which time to retry
@@ -2158,7 +2438,12 @@ skip_notify_on_dev_down - BOOLEAN
 	to true skips the message, making IPv4 and IPv6 on par in relying
 	on userspace caches to track link events and evict routes.
 
-	Default: false (generate message)
+	Possible values:
+
+	- 0 (disabled) - generate the message
+	- 1 (enabled)  - skip generating the message
+
+	Default: 0 (disabled)
 
 nexthop_compat_mode - BOOLEAN
 	New nexthop API provides a means for managing nexthops independent of
@@ -2170,6 +2455,12 @@ nexthop_compat_mode - BOOLEAN
 	understands the new API, this sysctl can be disabled to achieve full
 	performance benefits of the new API by disabling the nexthop expansion
 	and extraneous notifications.
+
+	Note that in backward-compatible mode, dumping of modern features
+	might be incomplete or wrong. For example, resilient groups will not
+	be shown as such, but rather as just a list of next hops. Also,
+	weights that do not fit into 8 bits will show incorrectly.
+
 	Default: true (backward compat mode)
 
 fib_notify_on_flag_change - INTEGER
@@ -2196,8 +2487,10 @@ fib_notify_on_flag_change - INTEGER
 ioam6_id - INTEGER
         Define the IOAM id of this node. Uses only 24 bits out of 32 in total.
 
-        Min: 0
-        Max: 0xFFFFFF
+        Possible value range:
+
+        - Min: 0
+        - Max: 0xFFFFFF
 
         Default: 0xFFFFFF
 
@@ -2205,8 +2498,10 @@ ioam6_id_wide - LONG INTEGER
         Define the wide IOAM id of this node. Uses only 56 bits out of 64 in
         total. Can be different from ioam6_id.
 
-        Min: 0
-        Max: 0xFFFFFFFFFFFFFF
+        Possible value range:
+
+        - Min: 0
+        - Max: 0xFFFFFFFFFFFFFF
 
         Default: 0xFFFFFFFFFFFFFF
 
@@ -2248,8 +2543,8 @@ conf/all/disable_ipv6 - BOOLEAN
 conf/all/forwarding - BOOLEAN
 	Enable global IPv6 forwarding between all interfaces.
 
-	IPv4 and IPv6 work differently here; e.g. netfilter must be used
-	to control which interfaces may forward packets and which not.
+	IPv4 and IPv6 work differently here; the ``force_forwarding`` flag must
+	be used to control which interfaces may forward packets.
 
 	This also sets all interfaces' Host/Router setting
 	'forwarding' to the specified value.  See below for details.
@@ -2259,13 +2554,30 @@ conf/all/forwarding - BOOLEAN
 proxy_ndp - BOOLEAN
 	Do proxy ndp.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
+force_forwarding - BOOLEAN
+	Enable forwarding on this interface only, regardless of the setting of
+	``conf/all/forwarding``. When ``conf/all/forwarding`` is set to 0,
+	the ``force_forwarding`` flag will be reset on all interfaces.
+
 fwmark_reflect - BOOLEAN
 	Controls the fwmark of kernel-generated IPv6 reply packets that are not
 	associated with a socket for example, TCP RSTs or ICMPv6 echo replies).
-	If unset, these packets have a fwmark of zero. If set, they have the
+	If disabled, these packets have a fwmark of zero. If enabled, they have the
 	fwmark of the packet they are replying to.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 ``conf/interface/*``:
 	Change special settings per interface.
@@ -2356,9 +2668,11 @@ ra_honor_pio_life - BOOLEAN
 	lifetime of an address matching a prefix sent in a Router
 	Advertisement Prefix Information Option.
 
-	- If enabled, the PIO valid lifetime will always be honored.
-	- If disabled, RFC4862 section 5.5.3e is used to determine
+	Possible values:
+
+	- 0 (disabled) - RFC4862 section 5.5.3e is used to determine
 	  the valid lifetime of the address.
+	- 1 (enabled)  - the PIO valid lifetime will always be honored.
 
 	Default: 0 (disabled)
 
@@ -2370,8 +2684,10 @@ ra_honor_pio_pflag - BOOLEAN
 	P-flag suppresses any effects of the A-flag within the same
 	PIO. For a given PIO, P=1 and A=1 is treated as A=0.
 
-	- If disabled, the P-flag is ignored.
-	- If enabled, the P-flag will disable SLAAC autoconfiguration
+	Possible values:
+
+	- 0 (disabled) - the P-flag is ignored.
+	- 1 (enabled)  - the P-flag will disable SLAAC autoconfiguration
 	  for the given Prefix Information Option.
 
 	Default: 0 (disabled)
@@ -2493,10 +2809,15 @@ mtu - INTEGER
 	Default: 1280 (IPv6 required minimum)
 
 ip_nonlocal_bind - BOOLEAN
-	If set, allows processes to bind() to non-local IPv6 addresses,
+	If enabled, allows processes to bind() to non-local IPv6 addresses,
 	which can be quite useful - but may break some applications.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 router_probe_interval - INTEGER
 	Minimum interval (in seconds) between Router Probing described
@@ -2526,7 +2847,12 @@ use_oif_addrs_only - BOOLEAN
 	routed via this interface are restricted to the set of addresses
 	configured on this interface (vis. RFC 6724, section 4).
 
-	Default: false
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 use_tempaddr - INTEGER
 	Preference for Privacy Extensions (RFC3041).
@@ -2651,10 +2977,14 @@ force_tllao - BOOLEAN
 ndisc_notify - BOOLEAN
 	Define mode for notification of address and device changes.
 
-	* 0 - (default): do nothing
-	* 1 - Generate unsolicited neighbour advertisements when device is brought
+	Possible values:
+
+	- 0 (disabled) - do nothing
+	- 1 (enabled)  - Generate unsolicited neighbour advertisements when device is brought
 	  up or hardware address changes.
 
+	Default: 0 (disabled)
+
 ndisc_tclass - INTEGER
 	The IPv6 Traffic Class to use by default when sending IPv6 Neighbor
 	Discovery (Router Solicitation, Router Advertisement, Neighbor
@@ -2671,8 +3001,12 @@ ndisc_evict_nocarrier - BOOLEAN
 	not be cleared when roaming between access points on the same network.
 	In most cases this should remain as the default (1).
 
-	- 1 - (default): Clear neighbor discover cache on NOCARRIER events.
-	- 0 - Do not clear neighbor discovery cache on NOCARRIER events.
+	Possible values:
+
+	- 0 (disabled) - Do not clear neighbor discovery cache on NOCARRIER events.
+	- 1 (enabled)  - Clear neighbor discovery cache on NOCARRIER events.
+
+	Default: 1 (enabled)
 
 mldv1_unsolicited_report_interval - INTEGER
 	The interval in milliseconds in which the next unsolicited
@@ -2701,25 +3035,34 @@ suppress_frag_ndisc - INTEGER
 optimistic_dad - BOOLEAN
 	Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
 
-	* 0: disabled (default)
-	* 1: enabled
-
 	Optimistic Duplicate Address Detection for the interface will be enabled
 	if at least one of conf/{all,interface}/optimistic_dad is set to 1,
 	it will be disabled otherwise.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
+
 use_optimistic - BOOLEAN
 	If enabled, do not classify optimistic addresses as deprecated during
 	source address selection.  Preferred addresses will still be chosen
 	before optimistic addresses, subject to other ranking in the source
 	address selection algorithm.
 
-	* 0: disabled (default)
-	* 1: enabled
-
 	This will be enabled if at least one of
 	conf/{all,interface}/use_optimistic is set to 1, disabled otherwise.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
+
 stable_secret - IPv6 address
 	This IPv6 address will be used as a secret to generate IPv6
 	addresses for link-local addresses and autoconfigured
@@ -2750,14 +3093,24 @@ drop_unicast_in_l2_multicast - BOOLEAN
 	Drop any unicast IPv6 packets that are received in link-layer
 	multicast (or broadcast) frames.
 
-	By default this is turned off.
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 drop_unsolicited_na - BOOLEAN
 	Drop all unsolicited neighbor advertisements, for example if there's
 	a known good NA proxy on the network and such frames need not be used
 	(or in the case of 802.11, must not be used to prevent attacks.)
 
-	By default this is turned off.
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 accept_untracked_na - INTEGER
 	Define behavior for accepting neighbor advertisements from devices that
@@ -2798,7 +3151,12 @@ enhanced_dad - BOOLEAN
 	The nonce option will be sent on an interface unless both of
 	conf/{all,interface}/enhanced_dad are set to FALSE.
 
-	Default: TRUE
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 1 (enabled)
 
 ``icmp/*``:
 ===========
@@ -2827,29 +3185,49 @@ ratemask - list of comma separated ranges
 	Default: 0-1,3-127 (rate limit ICMPv6 errors except Packet Too Big)
 
 echo_ignore_all - BOOLEAN
-	If set non-zero, then the kernel will ignore all ICMP ECHO
+	If enabled, then the kernel will ignore all ICMP ECHO
 	requests sent to it over the IPv6 protocol.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 echo_ignore_multicast - BOOLEAN
-	If set non-zero, then the kernel will ignore all ICMP ECHO
+	If enabled, then the kernel will ignore all ICMP ECHO
 	requests sent to it over the IPv6 protocol via multicast.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 echo_ignore_anycast - BOOLEAN
-	If set non-zero, then the kernel will ignore all ICMP ECHO
+	If enabled, then the kernel will ignore all ICMP ECHO
 	requests sent to it over the IPv6 protocol destined to anycast address.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 error_anycast_as_unicast - BOOLEAN
-	If set to 1, then the kernel will respond with ICMP Errors
+	If enabled, then the kernel will respond with ICMP Errors
 	resulting from requests sent to it over the IPv6 protocol destined
 	to anycast address essentially treating anycast as unicast.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
+	Default: 0 (disabled)
 
 xfrm6_gc_thresh - INTEGER
 	(Obsolete since linux-4.14)
@@ -2867,34 +3245,49 @@ YOSHIFUJI Hideaki / USAGI Project <yoshfuji@linux-ipv6.org>
 =================================
 
 bridge-nf-call-arptables - BOOLEAN
-	- 1 : pass bridged ARP traffic to arptables' FORWARD chain.
-	- 0 : disable this.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled) - disable this.
+	- 1 (enabled)  - pass bridged ARP traffic to arptables' FORWARD chain.
+
+	Default: 1 (enabled)
 
 bridge-nf-call-iptables - BOOLEAN
-	- 1 : pass bridged IPv4 traffic to iptables' chains.
-	- 0 : disable this.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled) - disable this.
+	- 1 (enabled)  - pass bridged IPv4 traffic to iptables' chains.
+
+	Default: 1 (enabled)
 
 bridge-nf-call-ip6tables - BOOLEAN
-	- 1 : pass bridged IPv6 traffic to ip6tables' chains.
-	- 0 : disable this.
 
-	Default: 1
+	Possible values:
+
+	- 0 (disabled) - disable this.
+	- 1 (enabled)  - pass bridged IPv6 traffic to ip6tables' chains.
+
+	Default: 1 (enabled)
 
 bridge-nf-filter-vlan-tagged - BOOLEAN
-	- 1 : pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables.
-	- 0 : disable this.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled) - disable this.
+	- 1 (enabled)  - pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables.
+
+	Default: 0 (disabled)
 
 bridge-nf-filter-pppoe-tagged - BOOLEAN
-	- 1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
-	- 0 : disable this.
 
-	Default: 0
+	Possible values:
+
+	- 0 (disabled) - disable this.
+	- 1 (enabled)  - pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
+
+	Default: 0 (disabled)
 
 bridge-nf-pass-vlan-input-dev - BOOLEAN
 	- 1: if bridge-nf-filter-vlan-tagged is enabled, try to find a vlan
@@ -2917,11 +3310,12 @@ addip_enable - BOOLEAN
 	the ability to dynamically add and remove new addresses for the SCTP
 	associations.
 
-	1: Enable extension.
+	Possible values:
 
-	0: Disable extension.
+	- 0 (disabled) - disable extension.
+	- 1 (enabled)  - enable extension.
 
-	Default: 0
+	Default: 0 (disabled)
 
 pf_enable - INTEGER
 	Enable or disable pf (pf is short for potentially failed) state. A value
@@ -2936,31 +3330,27 @@ pf_enable - INTEGER
 	https://datatracker.ietf.org/doc/draft-ietf-tsvwg-sctp-failover for
 	details.
 
-	1: Enable pf.
+	Possible values:
 
-	0: Disable pf.
+	- 1: Enable pf.
+	- 0: Disable pf.
 
 	Default: 1
 
 pf_expose - INTEGER
 	Unset or enable/disable pf (pf is short for potentially failed) state
 	exposure.  Applications can control the exposure of the PF path state
-	in the SCTP_PEER_ADDR_CHANGE event and the SCTP_GET_PEER_ADDR_INFO
-	sockopt.   When it's unset, no SCTP_PEER_ADDR_CHANGE event with
-	SCTP_ADDR_PF state will be sent and a SCTP_PF-state transport info
-	can be got via SCTP_GET_PEER_ADDR_INFO sockopt;  When it's enabled,
-	a SCTP_PEER_ADDR_CHANGE event will be sent for a transport becoming
-	SCTP_PF state and a SCTP_PF-state transport info can be got via
-	SCTP_GET_PEER_ADDR_INFO sockopt;  When it's disabled, no
-	SCTP_PEER_ADDR_CHANGE event will be sent and it returns -EACCES when
-	trying to get a SCTP_PF-state transport info via SCTP_GET_PEER_ADDR_INFO
-	sockopt.
-
-	0: Unset pf state exposure, Compatible with old applications.
+	in the SCTP_PEER_ADDR_CHANGE event and access to SCTP_PF-state
+	transport info via the SCTP_GET_PEER_ADDR_INFO sockopt.
 
-	1: Disable pf state exposure.
+	Possible values:
 
-	2: Enable pf state exposure.
+	- 0: Unset pf state exposure (compatible with old applications). No
+	  event will be sent but the transport info can be queried.
+	- 1: Disable pf state exposure. No event will be sent and trying to
+	  obtain transport info will return -EACCES.
+	- 2: Enable pf state exposure. The event will be sent for a transport
+	  becoming SCTP_PF state and transport info can be obtained.
 
 	Default: 0
 
@@ -2990,19 +3380,23 @@ auth_enable - BOOLEAN
 	required for secure operation of Dynamic Address Reconfiguration
 	(ADD-IP) extension.
 
-	- 1: Enable this extension.
-	- 0: Disable this extension.
+	Possible values:
 
-	Default: 0
+	- 0 (disabled) - disable extension.
+	- 1 (enabled)  - enable extension.
+
+	Default: 0 (disabled)
 
 prsctp_enable - BOOLEAN
 	Enable or disable the Partial Reliability extension (RFC3758) which
 	is used to notify peers that a given DATA should no longer be expected.
 
-	- 1: Enable extension
-	- 0: Disable
+	Possible values:
+
+	- 0 (disabled) - disable extension.
+	- 1 (enabled)  - enable extension.
 
-	Default: 1
+	Default: 1 (enabled)
 
 max_burst - INTEGER
 	The limit of the number of new packets that can be initially sent.  It
@@ -3102,10 +3496,12 @@ cookie_preserve_enable - BOOLEAN
 	Enable or disable the ability to extend the lifetime of the SCTP cookie
 	that is used during the establishment phase of SCTP association
 
-	- 1: Enable cookie lifetime extension.
-	- 0: Disable
+	Possible values:
 
-	Default: 1
+	- 0 (disabled) - disable.
+	- 1 (enabled)  - enable cookie lifetime extension.
+
+	Default: 1 (enabled)
 
 cookie_hmac_alg - STRING
 	Select the hmac algorithm used when generating the cookie value sent by
@@ -3150,13 +3546,11 @@ sndbuf_policy - INTEGER
 sctp_mem - vector of 3 INTEGERs: min, pressure, max
 	Number of pages allowed for queueing by all SCTP sockets.
 
-	min: Below this number of pages SCTP is not bothered about its
-	memory appetite. When amount of memory allocated by SCTP exceeds
-	this number, SCTP starts to moderate memory usage.
-
-	pressure: This value was introduced to follow format of tcp_mem.
-
-	max: Number of pages allowed for queueing by all SCTP sockets.
+	* min: Below this number of pages SCTP is not bothered about its
+	  memory usage. When the amount of memory allocated by SCTP exceeds
+	  this number, SCTP starts to moderate memory usage.
+	* pressure: This value was introduced to follow the format of tcp_mem.
+	* max: Maximum number of allowed pages.
 
 	Default is calculated at boot time from amount of available memory.
 
@@ -3164,9 +3558,9 @@ sctp_rmem - vector of 3 INTEGERs: min, default, max
 	Only the first value ("min") is used, "default" and "max" are
 	ignored.
 
-	min: Minimal size of receive buffer used by SCTP socket.
-	It is guaranteed to each SCTP socket (but not association) even
-	under moderate memory pressure.
+	* min: Minimal size of receive buffer used by SCTP socket.
+	  It is guaranteed to each SCTP socket (but not association) even
+	  under moderate memory pressure.
 
 	Default: 4K
 
@@ -3174,14 +3568,16 @@ sctp_wmem  - vector of 3 INTEGERs: min, default, max
 	Only the first value ("min") is used, "default" and "max" are
 	ignored.
 
-	min: Minimum size of send buffer that can be used by SCTP sockets.
-	It is guaranteed to each SCTP socket (but not association) even
-	under moderate memory pressure.
+	* min: Minimum size of send buffer that can be used by SCTP sockets.
+	  It is guaranteed to each SCTP socket (but not association) even
+	  under moderate memory pressure.
 
 	Default: 4K
 
 addr_scope_policy - INTEGER
-	Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00
+	Control IPv4 address scoping (see
+	https://datatracker.ietf.org/doc/draft-stewart-tsvwg-sctp-ipv4/00/
+	for details).
 
 	- 0   - Disable IPv4 address scoping
 	- 1   - Enable IPv4 address scoping
@@ -3239,10 +3635,12 @@ reconf_enable - BOOLEAN
         a stream, and it includes the Parameters of "Outgoing/Incoming SSN
         Reset", "SSN/TSN Reset" and "Add Outgoing/Incoming Streams".
 
-	- 1: Enable extension.
-	- 0: Disable extension.
+	Possible values:
 
-	Default: 0
+	- 0 (disabled) - Disable extension.
+	- 1 (enabled) - Enable extension.
+
+	Default: 0 (disabled)
 
 intl_enable - BOOLEAN
         Enable or disable extension of User Message Interleaving functionality
@@ -3253,10 +3651,12 @@ intl_enable - BOOLEAN
         to 1 and also needs to set socket options SCTP_FRAGMENT_INTERLEAVE to 2
         and SCTP_INTERLEAVING_SUPPORTED to 1.
 
-	- 1: Enable extension.
-	- 0: Disable extension.
+	Possible values:
+
+	- 0 (disabled) - Disable extension.
+	- 1 (enabled) - Enable extension.
 
-	Default: 0
+	Default: 0 (disabled)
 
 ecn_enable - BOOLEAN
         Control use of Explicit Congestion Notification (ECN) by SCTP.
@@ -3265,10 +3665,12 @@ ecn_enable - BOOLEAN
         due to congestion by allowing supporting routers to signal congestion
         before having to drop packets.
 
-        1: Enable ecn.
-        0: Disable ecn.
+        Possible values:
+
+	- 0 (disabled) - Disable ecn.
+	- 1 (enabled) - Enable ecn.
 
-        Default: 1
+	Default: 1 (enabled)
 
 l3mdev_accept - BOOLEAN
 	Enabling this option allows a "global" bound socket to work
@@ -3277,6 +3679,11 @@ l3mdev_accept - BOOLEAN
 	originated. Only valid when the kernel was compiled with
 	CONFIG_NET_L3_MASTER_DEV.
 
+	Possible values:
+
+	- 0 (disabled)
+	- 1 (enabled)
+
 	Default: 1 (enabled)
 
 
diff --git a/Documentation/networking/iso15765-2.rst b/Documentation/networking/iso15765-2.rst
index 0e9d96074178..37ebb2c417cb 100644
--- a/Documentation/networking/iso15765-2.rst
+++ b/Documentation/networking/iso15765-2.rst
@@ -369,8 +369,8 @@ to their default.
 
   addr.can_family = AF_CAN;
   addr.can_ifindex = if_nametoindex("can0");
-  addr.tp.tx_id = 0x18DA42F1 | CAN_EFF_FLAG;
-  addr.tp.rx_id = 0x18DAF142 | CAN_EFF_FLAG;
+  addr.can_addr.tp.tx_id = 0x18DA42F1 | CAN_EFF_FLAG;
+  addr.can_addr.tp.rx_id = 0x18DAF142 | CAN_EFF_FLAG;
 
   ret = bind(s, (struct sockaddr *)&addr, sizeof(addr));
   if (ret < 0)
diff --git a/Documentation/networking/j1939.rst b/Documentation/networking/j1939.rst
index 544bad175aae..45f02efe3df5 100644
--- a/Documentation/networking/j1939.rst
+++ b/Documentation/networking/j1939.rst
@@ -66,6 +66,90 @@ the library exclusively, or by the in-kernel system exclusively.
 J1939 concepts
 ==============
 
+Data Sent to the J1939 Stack
+----------------------------
+
+The data buffers sent to the J1939 stack from user space are not CAN frames
+themselves. Instead, they are payloads that the J1939 stack converts into
+proper CAN frames based on the size of the buffer and the type of transfer. The
+size of the buffer influences how the stack processes the data and determines
+the internal code path used for the transfer.
+
+**Handling of Different Buffer Sizes:**
+
+- **Buffers with a size of 8 bytes or less:**
+
+  - These are handled as simple sessions internally within the stack.
+
+  - The stack converts the buffer directly into a single CAN frame without
+    fragmentation.
+
+  - This type of transfer does not require an actual client (receiver) on the
+    receiving side.
+
+- **Buffers up to 1785 bytes:**
+
+  - These are automatically handled as J1939 Transport Protocol (TP) transfers.
+
+  - Internally, the stack splits the buffer into multiple 8-byte CAN frames.
+
+  - TP transfers can be unicast or broadcast.
+
+  - **Broadcast TP:** Does not require a receiver on the other side and can be
+    used in broadcast scenarios.
+
+  - **Unicast TP:** Requires an active receiver (client) on the other side to
+    acknowledge the transfer.
+
+- **Buffers from 1786 bytes up to 111 MiB:**
+
+  - These are handled as ISO 11783 Extended Transport Protocol (ETP) transfers.
+
+  - ETP transfers are used for larger payloads and are split into multiple CAN
+    frames internally.
+
+  - **ETP transfers (unicast):** Require a receiver on the other side to
+    process the incoming data and acknowledge each step of the transfer.
+
+  - Unlike TP transfers, ETP transfers cannot be broadcast and always
+    require a receiver for operation.
+
+**Non-Blocking Operation with MSG_DONTWAIT:**
+
+The J1939 stack supports non-blocking operation when used in combination with
+the ``MSG_DONTWAIT`` flag. In this mode, the stack attempts to take as much data
+as the available memory for the socket allows. It returns the amount of data
+that was successfully taken, and it is the responsibility of user space to
+monitor this value and handle partial transfers.
+
+- If the stack cannot take the entire buffer, it returns the number of bytes
+  successfully taken, and user space should handle the remainder.
+
+- **Error handling:** When using ``MSG_DONTWAIT``, the user must rely on the
+  error queue to detect transfer errors. See the ``SO_J1939_ERRQUEUE`` section
+  for details on how to subscribe to error notifications. Without the error
+  queue, user space has no way to be notified of transfer errors during
+  non-blocking operations.
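+
+A partial transfer can be handled with a loop along the following lines. This
+is a minimal sketch, not a reference implementation: ``send_all_nonblock()``
+is a hypothetical helper, the socket is assumed to be bound and connected
+already, and waiting for ``POLLOUT`` after ``EAGAIN`` is an assumption about
+how the application wants to pace itself. Transfer errors still have to be
+collected from the error queue.
+
+.. code-block:: c
+
+    #include <errno.h>
+    #include <poll.h>
+    #include <stddef.h>
+    #include <sys/socket.h>
+    #include <sys/types.h>
+
+    /*
+     * Hypothetical helper: submit a whole buffer with MSG_DONTWAIT and
+     * resubmit whatever the stack did not take yet.
+     * Returns 0 on success, -1 with errno set on failure.
+     */
+    static int send_all_nonblock(int sock, const void *buf, size_t len)
+    {
+        const char *pos = buf;
+
+        while (len > 0) {
+            ssize_t taken = send(sock, pos, len, MSG_DONTWAIT);
+
+            if (taken < 0) {
+                struct pollfd pfd = { .fd = sock, .events = POLLOUT };
+
+                if (errno == EINTR)
+                    continue;
+                if (errno != EAGAIN && errno != EWOULDBLOCK)
+                    return -1;
+                /* No socket memory left: wait until it is writable. */
+                if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
+                    return -1;
+                continue;
+            }
+
+            /* Partial transfer: skip the bytes that were already taken. */
+            pos += taken;
+            len -= (size_t)taken;
+        }
+
+        return 0;
+    }
+
+The loop only advances by the value returned from ``send(2)``, so the bytes
+handed to the stack always add up to the original buffer size.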
+
+**Behavior and Requirements:**
+
+- **Simple transfers (<= 8 bytes):** Do not require a receiver on the other
+  side, making them easy to send without needing address claiming or
+  coordination with a destination.
+
+- **Unicast TP/ETP:** Requires a receiver on the other side to complete the
+  transfer. The receiver must acknowledge the transfer for the session to
+  proceed successfully.
+
+- **Broadcast TP:** Allows sending data without a receiver, but only works for
+  TP transfers. ETP cannot be broadcast and always needs a receiving client.
+
+These different behaviors depend heavily on the size of the buffer provided to
+the stack, and the appropriate transport mechanism (TP or ETP) is selected
+based on the payload size. The stack automatically manages the fragmentation
+and reassembly of large payloads and ensures that the correct CAN frames are
+generated and transmitted for each session.
+
 PGN
 ---
 
@@ -338,6 +422,459 @@ with ``cmsg_level == SOL_J1939 && cmsg_type == SCM_J1939_DEST_ADDR``,
 		}
 	}
 
+setsockopt(2)
+^^^^^^^^^^^^^
+
+The ``setsockopt(2)`` function is used to configure various socket-level
+options for J1939 communication. The following options are supported:
+
+``SO_J1939_FILTER``
+~~~~~~~~~~~~~~~~~~~
+
+The ``SO_J1939_FILTER`` option is essential when the default behavior of
+``bind(2)`` and ``connect(2)`` is insufficient for specific use cases. By
+default, ``bind(2)`` and ``connect(2)`` allow a socket to be associated with a
+single unicast or broadcast address. However, there are scenarios where finer
+control over the incoming messages is required, such as filtering by Parameter
+Group Number (PGN) rather than by addresses.
+
+For example, in a system where multiple types of J1939 messages are being
+transmitted, a process might only be interested in a subset of those messages,
+such as specific PGNs, and not want to receive all messages destined for its
+address or broadcast to the bus.
+
+By applying the ``SO_J1939_FILTER`` option, you can filter messages based on:
+
+- **Source Address (SA)**: Filter messages coming from specific source
+  addresses.
+
+- **Source Name**: Filter messages coming from ECUs with specific NAME
+  identifiers.
+
+- **Parameter Group Number (PGN)**: Focus on receiving messages with specific
+  PGNs, filtering out irrelevant ones.
+
+This filtering mechanism is particularly useful when:
+
+- You want to receive a subset of messages based on their PGNs, even if the
+  address is the same.
+
+- You need to handle both broadcast and unicast messages but only care about
+  certain message types or parameters.
+
+- The ``bind(2)`` and ``connect(2)`` functions only allow binding to a single
+  address, which might not be sufficient if the process needs to handle multiple
+  PGNs but does not want to open multiple sockets.
+
+To remove existing filters, you can pass ``optval == NULL`` or ``optlen == 0``
+to ``setsockopt(2)``. This will clear all currently set filters. If you want to
+**update** the set of filters, you must pass the updated filter set to
+``setsockopt(2)``, as the new filter set will **replace** the old one entirely.
+This behavior ensures that any previous filter configuration is discarded and
+only the new set is applied.
+
+Example of removing all filters:
+
+.. code-block:: c
+
+    setsockopt(sock, SOL_CAN_J1939, SO_J1939_FILTER, NULL, 0);
+
+**Maximum number of filters:** The maximum number of filters that can be
+applied using ``SO_J1939_FILTER`` is defined by ``J1939_FILTER_MAX``, which is
+set to 512. This means you can configure up to 512 individual filters to match
+your specific filtering needs.
+
+Practical use case: **Monitoring Address Claiming**
+
+One practical use case is monitoring the J1939 address claiming process by
+filtering for specific PGNs related to address claiming. This allows a process
+to monitor and handle address claims without processing unrelated messages.
+
+Example:
+
+.. code-block:: c
+
+    struct j1939_filter filt[] = {
+        {
+            .pgn = J1939_PGN_ADDRESS_CLAIMED,
+            .pgn_mask = J1939_PGN_PDU1_MAX,
+        }, {
+            .pgn = J1939_PGN_REQUEST,
+            .pgn_mask = J1939_PGN_PDU1_MAX,
+        }, {
+            .pgn = J1939_PGN_ADDRESS_COMMANDED,
+            .pgn_mask = J1939_PGN_MAX,
+        },
+    };
+    setsockopt(sock, SOL_CAN_J1939, SO_J1939_FILTER, &filt, sizeof(filt));
+
+In this example, the socket will only receive messages with the PGNs related to
+address claiming: ``J1939_PGN_ADDRESS_CLAIMED``, ``J1939_PGN_REQUEST``, and
+``J1939_PGN_ADDRESS_COMMANDED``. This is particularly useful in scenarios where
+you want to monitor and process address claims without being overwhelmed by
+other traffic on the J1939 network.
+
+``SO_J1939_PROMISC``
+~~~~~~~~~~~~~~~~~~~~
+
+The ``SO_J1939_PROMISC`` option enables socket-level promiscuous mode. When
+this option is enabled, the socket will receive all J1939 traffic, regardless
+of any filters set by ``bind(2)`` or ``connect(2)``. This is analogous to
+enabling promiscuous mode for an Ethernet interface, where all traffic on the
+network segment is captured.
+
+However, ``SO_J1939_FILTER`` has a **higher priority** than
+``SO_J1939_PROMISC``. This means that even in promiscuous mode, you can reduce
+the number of packets received by applying specific filters with
+``SO_J1939_FILTER``. The filters will limit which packets are passed to the
+socket, allowing for more refined traffic selection while promiscuous mode is
+active.
+
+The acceptable value size for this option is ``sizeof(int)``, and the value is
+only differentiated between ``0`` and non-zero. A value of ``0`` disables
+promiscuous mode, while any non-zero value enables it.
+
+This combination can be useful for debugging or monitoring specific types of
+traffic while still capturing a broad set of messages.
+
+Example:
+
+.. code-block:: c
+
+    int value = 1;
+    setsockopt(sock, SOL_CAN_J1939, SO_J1939_PROMISC, &value, sizeof(value));
+
+In this example, setting ``value`` to any non-zero value (e.g., ``1``) enables
+promiscuous mode, allowing the socket to receive all J1939 traffic on the
+network.
+
+``SO_BROADCAST``
+~~~~~~~~~~~~~~~~
+
+The ``SO_BROADCAST`` option enables the sending and receiving of broadcast
+messages. By default, broadcast messages are disabled for J1939 sockets. When
+this option is enabled, the socket will be allowed to send and receive
+broadcast packets on the J1939 network.
+
+Due to the nature of the CAN bus as a shared medium, all messages transmitted
+on the bus are visible to all participants. In the context of J1939,
+broadcasting refers to using a specific destination address field, where the
+destination address is set to a value that indicates the message is intended
+for all participants (usually a global address such as 0xFF). Enabling the
+broadcast option allows the socket to send and receive such broadcast messages.
+
+The acceptable value size for this option is ``sizeof(int)``, and the value is
+only differentiated between ``0`` and non-zero. A value of ``0`` disables the
+ability to send and receive broadcast messages, while any non-zero value
+enables it.
+
+Example:
+
+.. code-block:: c
+
+    int value = 1;
+    setsockopt(sock, SOL_SOCKET, SO_BROADCAST, &value, sizeof(value));
+
+In this example, setting ``value`` to any non-zero value (e.g., ``1``) enables
+the socket to send and receive broadcast messages.
+
+``SO_J1939_SEND_PRIO``
+~~~~~~~~~~~~~~~~~~~~~~
+
+The ``SO_J1939_SEND_PRIO`` option sets the priority of outgoing J1939 messages
+for the socket. In J1939, messages can have different priorities, and lower
+numerical values indicate higher priority. This option allows the user to
+control the priority of messages sent from the socket by adjusting the priority
+bits in the CAN identifier.
+
+The acceptable value **size** for this option is ``sizeof(int)``, and the value
+is expected to be in the range of 0 to 7, where ``0`` is the highest priority
+and ``7`` is the lowest. By default, the priority is set to ``6`` if this
+option is not explicitly configured.
+
+Note that the priority values ``0`` and ``1`` can only be set if the process
+has the ``CAP_NET_ADMIN`` capability. These are reserved for high-priority
+traffic and require administrative privileges.
+
+Example:
+
+.. code-block:: c
+
+    int prio = 3;  // Priority value between 0 (highest) and 7 (lowest)
+    setsockopt(sock, SOL_CAN_J1939, SO_J1939_SEND_PRIO, &prio, sizeof(prio));
+
+In this example, the priority is set to ``3``, so outgoing messages are sent
+with a moderate priority level.
+
+``SO_J1939_ERRQUEUE``
+~~~~~~~~~~~~~~~~~~~~~
+
+The ``SO_J1939_ERRQUEUE`` option enables the socket to receive error messages
+from the error queue, providing diagnostic information about transmission
+failures, protocol violations, or other issues that occur during J1939
+communication. Once this option is set, user space is required to handle
+``MSG_ERRQUEUE`` messages.
+
+Setting ``SO_J1939_ERRQUEUE`` to ``0`` will purge any currently present error
+messages in the error queue. When enabled, error messages can be retrieved
+using the ``recvmsg(2)`` system call.
+
+When subscribing to the error queue, the following error events can be
+accessed:
+
+- ``J1939_EE_INFO_TX_ABORT``: Transmission abort errors.
+- ``J1939_EE_INFO_RX_RTS``: Reception of RTS (Request to Send) control
+  frames.
+- ``J1939_EE_INFO_RX_DPO``: Reception of Data Page Offset (DPO) control
+  frames.
+- ``J1939_EE_INFO_RX_ABORT``: Reception abort errors.
+
+The error queue can be used to correlate errors with specific message transfer
+sessions using the session ID (``tskey``). The session ID is assigned via the
+``SOF_TIMESTAMPING_OPT_ID`` flag, which is set by enabling the
+``SO_TIMESTAMPING`` option.
+
+If ``SO_J1939_ERRQUEUE`` is activated, the user is required to pull messages
+from the error queue, meaning that using plain ``recv(2)`` is not sufficient
+anymore. The user must use ``recvmsg(2)`` with appropriate flags to handle
+error messages. Failure to do so can result in the socket becoming blocked with
+unprocessed error messages in the queue.
+
+It is **recommended** that ``SO_J1939_ERRQUEUE`` be used in combination with
+``SO_TIMESTAMPING`` in most cases. This enables proper error handling along
+with session tracking and timestamping, providing a more detailed analysis of
+message transfers and errors.
+
+The acceptable value **size** for this option is ``sizeof(int)``, and the value
+is only differentiated between ``0`` and non-zero. A value of ``0`` disables
+error queue reception and purges any existing error messages, while any
+non-zero value enables it.
+
+Example:
+
+.. code-block:: c
+
+    int enable = 1;  // Enable error queue reception
+    setsockopt(sock, SOL_CAN_J1939, SO_J1939_ERRQUEUE, &enable, sizeof(enable));
+
+    // Enable timestamping with session tracking via tskey
+    int timestamping = SOF_TIMESTAMPING_OPT_ID | SOF_TIMESTAMPING_TX_ACK |
+                       SOF_TIMESTAMPING_TX_SCHED |
+                       SOF_TIMESTAMPING_RX_SOFTWARE | SOF_TIMESTAMPING_OPT_CMSG;
+    setsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING, &timestamping,
+               sizeof(timestamping));
+
+When enabled, error messages can be retrieved using ``recvmsg(2)``. By
+combining ``SO_J1939_ERRQUEUE`` with ``SO_TIMESTAMPING`` (with
+``SOF_TIMESTAMPING_OPT_ID`` and ``SOF_TIMESTAMPING_OPT_CMSG`` enabled), the
+user can track message transfers, retrieve precise timestamps, and correlate
+errors with specific sessions.
+
+For more information on enabling timestamps and session tracking, refer to the
+``SO_TIMESTAMPING`` section.
+
+``SO_TIMESTAMPING``
+~~~~~~~~~~~~~~~~~~~
+
+The ``SO_TIMESTAMPING`` option allows the socket to receive timestamps for
+various events related to message transmissions and receptions in J1939. This
+option is often used in combination with ``SO_J1939_ERRQUEUE`` to provide
+detailed diagnostic information, session tracking, and precise timing data for
+message transfers.
+
+In J1939, all payloads provided by user space, regardless of size, are
+processed by the kernel as **sessions**. This includes both single-frame
+messages (up to 8 bytes) and multi-frame protocols such as the Transport
+Protocol (TP) and Extended Transport Protocol (ETP). Even for small,
+single-frame messages, the kernel creates a session to manage the transmission
+and reception. The concept of sessions allows the kernel to manage various
+aspects of the protocol, such as reassembling multi-frame messages and tracking
+the status of transmissions.
+
+When receiving extended error messages from the error queue, the error
+information is delivered through a ``struct sock_extended_err``, accessible via
+the control message (``cmsg``) retrieved using the ``recvmsg(2)`` system call.
+
+There are two typical origins for the extended error messages in J1939:
+
+1. ``serr->ee_origin == SO_EE_ORIGIN_TIMESTAMPING``:
+
+   In this case, the ``serr->ee_info`` field will contain one of the following
+   timestamp types:
+
+   - ``SCM_TSTAMP_SCHED``: This timestamp is valid for Extended Transport
+     Protocol (ETP) transfers and simple transfers (8 bytes or less). It
+     indicates when a message or set of frames has been scheduled for
+     transmission.
+
+     - For simple transfers (8 bytes or less), it marks the point when the
+       message is queued and ready to be sent onto the CAN bus.
+
+     - For ETP transfers, it is sent after receiving a CTS (Clear to Send)
+       frame on the sender side, indicating that a new set of frames has been
+       scheduled for transmission.
+
+     - The Transport Protocol (TP) case is currently not implemented for this
+       timestamp.
+
+     - On the receiver side, the counterpart to this event for ETP is
+       represented by the ``J1939_EE_INFO_RX_DPO`` message, which indicates the
+       reception of a Data Page Offset (DPO) control frame.
+
+   - ``SCM_TSTAMP_ACK``: This timestamp indicates the acknowledgment of the
+     message or session.
+
+     - For simple transfers (8 bytes or less), it marks when the message has
+       been sent and an echo confirmation has been received from the CAN
+       controller, indicating that the frame was transmitted onto the bus.
+
+     - For multi-frame transfers (TP or ETP), it signifies that the entire
+       session has been acknowledged, typically after receiving the End of
+       Message Acknowledgment (EOMA) packet.
+
+2. ``serr->ee_origin == SO_EE_ORIGIN_LOCAL``:
+
+   In this case, the ``serr->ee_info`` field will contain one of the following
+   J1939 stack-specific message types:
+
+   - ``J1939_EE_INFO_TX_ABORT``: This message indicates that the transmission
+     of a message or session was aborted. The cause of the abort can come from
+     various sources:
+
+     - **CAN stack failure**: The J1939 stack was unable to pass the frame to
+       the CAN framework for transmission.
+
+     - **Echo failure**: The J1939 stack did not receive an echo confirmation
+       from the CAN controller, meaning the frame may not have been successfully
+       transmitted to the CAN bus.
+
+     - **Protocol-level issues**: For multi-frame transfers (TP/ETP), this
+       could include protocol-related errors, such as an abort signaled by the
+       receiver or a timeout at the protocol level, which causes the session to
+       terminate prematurely.
+
+     - The corresponding error code is stored in ``serr->ee_errno``
+       (``session->err`` on kernel side), providing additional details about
+       the specific reason for the abort.
+
+   - ``J1939_EE_INFO_RX_RTS``: This message indicates that the J1939 stack has
+     received a Request to Send (RTS) control frame, signaling the start of a
+     multi-frame transfer using the Transport Protocol (TP) or Extended
+     Transport Protocol (ETP).
+
+     - It informs the receiver that the sender is ready to transmit a
+       multi-frame message and includes details about the total message size
+       and the number of frames to be sent.
+
+     - Statistics such as ``J1939_NLA_TOTAL_SIZE``, ``J1939_NLA_PGN``,
+       ``J1939_NLA_SRC_NAME``, and ``J1939_NLA_DEST_NAME`` are provided along
+       with the ``J1939_EE_INFO_RX_RTS`` message, giving detailed information
+       about the incoming transfer.
+
+   - ``J1939_EE_INFO_RX_DPO``: This message indicates that the J1939 stack has
+     received a Data Page Offset (DPO) control frame, which is part of the
+     Extended Transport Protocol (ETP).
+
+     - The DPO frame signals the continuation of an ETP multi-frame message by
+       indicating the offset position in the data being transferred. It helps
+       the receiver manage large data sets by identifying which portion of the
+       message is being received.
+
+     - It is typically paired with a corresponding ``SCM_TSTAMP_SCHED`` event
+       on the sender side, which indicates when the next set of frames is
+       scheduled for transmission.
+
+     - This event includes statistics such as ``J1939_NLA_BYTES_ACKED``, which
+       tracks the number of bytes acknowledged up to that point in the session.
+
+   - ``J1939_EE_INFO_RX_ABORT``: This message indicates that the reception of a
+     multi-frame message (Transport Protocol or Extended Transport Protocol) has
+     been aborted.
+
+     - The abort can be triggered by protocol-level errors such as timeouts, an
+       unexpected frame, or a specific abort request from the sender.
+
+     - This message signals that the receiver cannot continue processing the
+       transfer, and the session is terminated.
+
+     - The corresponding error code is stored in ``serr->ee_errno``
+       (``session->err`` on kernel side), providing further details about the
+       reason for the abort, such as protocol violations or timeouts.
+
+     - After receiving this message, the receiver discards the partially received
+       frames, and the multi-frame session is considered incomplete.
+
+In both cases, if ``SOF_TIMESTAMPING_OPT_ID`` is enabled, ``serr->ee_data``
+will be set to the session's unique identifier (``session->tskey``). This
+allows user space to track message transfers by their session identifier across
+multiple frames or stages.
+
+The ``serr->ee_errno`` field is set to ``ENOMSG`` for all events except
+``J1939_EE_INFO_TX_ABORT`` and ``J1939_EE_INFO_RX_ABORT``, where the kernel
+sets ``serr->ee_errno`` to the error stored in ``session->err``.  All
+protocol-specific errors are converted to standard kernel error values and
+stored in ``session->err``. These error values are unified across system calls
+and ``serr->ee_errno``.  Some of the known error values are described in the
+`Error Codes in the J1939 Stack`_ section.
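+
+The fields discussed above could be read in user space roughly as follows.
+This is a sketch under assumptions: ``sketch_event_name()`` and
+``sketch_read_event()`` are hypothetical helpers, and the error queue entry is
+assumed to arrive as a control message with ``cmsg_level == SOL_CAN_J1939``
+and ``cmsg_type == SCM_J1939_ERRQUEUE``. Parsing of the accompanying
+statistics and timestamps is left out.
+
+.. code-block:: c
+
+    #include <linux/can/j1939.h>
+    #include <linux/errqueue.h>
+    #include <stdio.h>
+    #include <string.h>
+    #include <sys/socket.h>
+
+    /* Hypothetical helper: map an origin/info pair to a short label. */
+    static const char *sketch_event_name(const struct sock_extended_err *serr)
+    {
+        if (serr->ee_origin == SO_EE_ORIGIN_TIMESTAMPING) {
+            switch (serr->ee_info) {
+            case SCM_TSTAMP_SCHED: return "tx-sched";
+            case SCM_TSTAMP_ACK:   return "tx-ack";
+            }
+        } else if (serr->ee_origin == SO_EE_ORIGIN_LOCAL) {
+            switch (serr->ee_info) {
+            case J1939_EE_INFO_TX_ABORT: return "tx-abort";
+            case J1939_EE_INFO_RX_RTS:   return "rx-rts";
+            case J1939_EE_INFO_RX_DPO:   return "rx-dpo";
+            case J1939_EE_INFO_RX_ABORT: return "rx-abort";
+            }
+        }
+        return "unknown";
+    }
+
+    /* Read one error queue entry; returns 0 if one was read, -1 otherwise. */
+    static int sketch_read_event(int sock)
+    {
+        union {
+            char buf[512];
+            struct cmsghdr align; /* keeps the buffer suitably aligned */
+        } control;
+        char data[8];
+        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
+        struct msghdr msg = {
+            .msg_iov = &iov,
+            .msg_iovlen = 1,
+            .msg_control = control.buf,
+            .msg_controllen = sizeof(control.buf),
+        };
+        struct cmsghdr *cmsg;
+
+        if (recvmsg(sock, &msg, MSG_ERRQUEUE | MSG_DONTWAIT) < 0)
+            return -1;
+
+        for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+            struct sock_extended_err serr;
+
+            if (cmsg->cmsg_level != SOL_CAN_J1939 ||
+                cmsg->cmsg_type != SCM_J1939_ERRQUEUE)
+                continue;
+
+            /* Copy out of the control buffer before touching the fields. */
+            memcpy(&serr, CMSG_DATA(cmsg), sizeof(serr));
+            printf("%s: ee_errno=%u ee_data=%u\n", sketch_event_name(&serr),
+                   (unsigned int)serr.ee_errno, (unsigned int)serr.ee_data);
+        }
+
+        return 0;
+    }
+
+Calling ``sketch_read_event()`` whenever ``poll(2)`` reports ``POLLERR`` is one
+way to keep the error queue drained.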
+
+When the ``J1939_EE_INFO_RX_RTS`` message is provided, it will include the
+following statistics for multi-frame messages (TP and ETP):
+
+  - ``J1939_NLA_TOTAL_SIZE``: Total size of the message in the session.
+  - ``J1939_NLA_PGN``: Parameter Group Number (PGN) identifying the message type.
+  - ``J1939_NLA_SRC_NAME``: 64-bit name of the source ECU.
+  - ``J1939_NLA_DEST_NAME``: 64-bit name of the destination ECU.
+  - ``J1939_NLA_SRC_ADDR``: 8-bit source address of the sending ECU.
+  - ``J1939_NLA_DEST_ADDR``: 8-bit destination address of the receiving ECU.
+
+For other messages (including single-frame messages), only the following
+statistic is included:
+
+  - ``J1939_NLA_BYTES_ACKED``: Number of bytes successfully acknowledged in the
+    session.
+
+The key flags for ``SO_TIMESTAMPING`` include:
+
+- ``SOF_TIMESTAMPING_OPT_ID``: Enables the use of a unique session identifier
+  (``tskey``) for each transfer. This identifier helps track message transfers
+  and errors as distinct sessions in user space. When this option is enabled,
+  ``serr->ee_data`` will be set to ``session->tskey``.
+
+- ``SOF_TIMESTAMPING_OPT_CMSG``: Sends timestamp information through control
+  messages (``struct scm_timestamping``), allowing the application to retrieve
+  timestamps alongside the data.
+
+- ``SOF_TIMESTAMPING_TX_SCHED``: Provides the timestamp for when a message is
+  scheduled for transmission (``SCM_TSTAMP_SCHED``).
+
+- ``SOF_TIMESTAMPING_TX_ACK``: Provides the timestamp for when a message
+  transmission is fully acknowledged (``SCM_TSTAMP_ACK``).
+
+- ``SOF_TIMESTAMPING_RX_SOFTWARE``: Provides timestamps for reception-related
+  events (e.g., ``J1939_EE_INFO_RX_RTS``, ``J1939_EE_INFO_RX_DPO``,
+  ``J1939_EE_INFO_RX_ABORT``).
+
+These flags enable detailed monitoring of message lifecycles, including
+transmission scheduling, acknowledgments, reception timestamps, and gathering
+detailed statistics about the communication session, especially for multi-frame
+payloads like TP and ETP.
+
+Example:
+
+.. code-block:: c
+
+    // Enable timestamping with various options, including session tracking and
+    // statistics
+    int sock_opt = SOF_TIMESTAMPING_OPT_CMSG |
+                   SOF_TIMESTAMPING_TX_ACK |
+                   SOF_TIMESTAMPING_TX_SCHED |
+                   SOF_TIMESTAMPING_OPT_ID |
+                   SOF_TIMESTAMPING_RX_SOFTWARE;
+
+    setsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING, &sock_opt, sizeof(sock_opt));
+
+
+
 Dynamic Addressing
 ------------------
 
@@ -458,3 +995,141 @@ Send:
 	};
 
 	sendto(sock, dat, sizeof(dat), 0, (const struct sockaddr *)&saddr, sizeof(saddr));
+
+
+Error Codes in the J1939 Stack
+------------------------------
+
+This section lists all potential kernel error codes that can be exposed to user
+space when interacting with the J1939 stack. It includes both standard error
+codes and those derived from protocol-specific abort codes.
+
+- ``EAGAIN``: Operation would block; retry may succeed. One common reason is
+  that an active TP or ETP session exists, and an attempt was made to start a
+  new overlapping TP or ETP session between the same peers.
+
+- ``ENETDOWN``: Network is down. This occurs when the CAN interface is switched
+  to the "down" state.
+
+- ``ENOBUFS``: No buffer space available. This error occurs when the CAN
+  interface's transmit (TX) queue is full, and no more messages can be queued.
+
+- ``EOVERFLOW``: Value too large for defined data type. In J1939, this can
+  happen if the requested data lies outside of the queued buffer. For example,
+  if a CTS (Clear to Send) requests an offset not available in the kernel buffer
+  because user space did not provide enough data.
+
+- ``EBUSY``: Device or resource is busy. For example, this occurs if an
+  identical session is already active and the stack is unable to recover from
+  the condition.
+
+- ``EACCES``: Permission denied. This error can occur, for example, when
+  attempting to send broadcast messages, but the socket is not configured with
+  ``SO_BROADCAST``.
+
+- ``EADDRNOTAVAIL``: Address not available. This error occurs in cases such as:
+
+  - When attempting to use ``getpeername(2)`` to retrieve the peer's address,
+    but the socket is not connected.
+
+  - When trying to send data to or from a NAME, but address claiming for the
+    NAME was not performed or detected by the stack.
+
+- ``EBADFD``: File descriptor in bad state. This error can occur if:
+
+  - Attempting to send data to an unbound socket.
+
+  - The socket is bound but has no source name, and the source address is
+    ``J1939_NO_ADDR``.
+
+  - The ``can_ifindex`` is incorrect.
+
+- ``EFAULT``: Bad address. Occurs mostly when the stack can't copy from or to a
+  sockptr, when there is insufficient data from user space, or when the buffer
+  provided by user space is not large enough for the requested data.
+
+- ``EINTR``: A signal occurred before any data was transmitted; see ``signal(7)``.
+
+- ``EINVAL``: Invalid argument passed. For example:
+
+  - ``msg->msg_namelen`` is less than ``J1939_MIN_NAMELEN``.
+
+  - ``addr->can_family`` is not equal to ``AF_CAN``.
+
+  - An incorrect PGN was provided.
+
+- ``ENODEV``: No such device. This happens when the CAN network device cannot
+  be found for the provided ``can_ifindex`` or if ``can_ifindex`` is 0.
+
+- ``ENOMEM``: Out of memory. Typically related to issues with memory allocation
+  in the stack.
+
+- ``ENOPROTOOPT``: Protocol not available. This can occur when using
+  ``getsockopt(2)`` or ``setsockopt(2)`` if the requested socket option is not
+  available.
+
+- ``EDESTADDRREQ``: Destination address required. This error occurs:
+
+  - In the case of ``connect(2)``, if the ``struct sockaddr *uaddr`` is ``NULL``.
+
+  - In the case of ``send*(2)``, if there is an attempt to send an ETP message
+    to a broadcast address.
+
+- ``EDOM``: Argument out of domain. This error may happen if attempting to send
+  a TP or ETP message to a PGN that is reserved for control PGNs for TP or ETP
+  operations.
+
+- ``EIO``: I/O error. This can occur if the amount of data provided to the
+  socket for a TP or ETP session does not match the announced amount of data for
+  the session.
+
+- ``ENOENT``: No such file or directory. This can happen when the stack
+  attempts to transfer CTS or EOMA but cannot find a matching receiving socket
+  anymore.
+
+- ``ENOIOCTLCMD``: No ioctls are available for the socket layer.
+
+- ``EPERM``: Operation not permitted. For example, this can occur if a
+  requested action requires ``CAP_NET_ADMIN`` privileges.
+
+- ``ENETUNREACH``: Network unreachable. Most likely, this occurs when frames
+  cannot be transmitted to the CAN bus.
+
+- ``ETIME``: Timer expired. This can happen if a timeout occurs while
+  attempting to send a simple message, for example, when an echo message from
+  the controller is not received.
+
+- ``EPROTO``: Protocol error.
+
+  - Used for various protocol-level errors in J1939, including:
+
+    - Duplicate sequence number.
+
+    - Unexpected EDPO or ECTS packet.
+
+    - Invalid PGN or offset in EDPO/ECTS.
+
+    - Number of EDPO packets exceeded CTS allowance.
+
+    - Any other protocol-level error.
+
+- ``EMSGSIZE``: Message too long.
+
+- ``ENOMSG``: No message available.
+
+- ``EALREADY``: The ECU is already engaged in one or more connection-managed
+  sessions and cannot support another.
+
+- ``EHOSTUNREACH``: A timeout occurred, and the session was aborted.
+
+- ``EBADMSG``: CTS (Clear to Send) messages were received during an active data
+  transfer, causing an abort.
+
+- ``ENOTRECOVERABLE``: The maximum retransmission request limit was reached,
+  and the session cannot recover.
+
+- ``ENOTCONN``: An unexpected data transfer packet was received.
+
+- ``EILSEQ``: A bad sequence number was received, and the software could not
+  recover.
+
diff --git a/Documentation/networking/kapi.rst b/Documentation/networking/kapi.rst
index ea55f462cefa..98682b9a13ee 100644
--- a/Documentation/networking/kapi.rst
+++ b/Documentation/networking/kapi.rst
@@ -104,6 +104,9 @@ Driver Support
 .. kernel-doc:: include/linux/netdevice.h
    :internal:
 
+.. kernel-doc:: include/net/net_shaper.h
+   :internal:
+
 PHY Support
 -----------
 
diff --git a/Documentation/networking/kcm.rst b/Documentation/networking/kcm.rst
index db0f5560ac1c..71f44d0beaa3 100644
--- a/Documentation/networking/kcm.rst
+++ b/Documentation/networking/kcm.rst
@@ -200,7 +200,7 @@ while. Example use::
 
   setsockopt(kcmfd, SOL_KCM, KCM_RECV_DISABLE, &val, sizeof(val))
 
-BFP programs for message delineation
+BPF programs for message delineation
 ------------------------------------
 
 BPF programs can be compiled using the BPF LLVM backend. For example,
diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index 95598c21fc8e..1683c139821e 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -12,6 +12,8 @@ add_addr_timeout - INTEGER (seconds)
 	resent to an MPTCP peer that has not acknowledged a previous
 	ADD_ADDR message.
 
+	Do not retransmit if set to 0.
+
 	The default value matches TCP_RTO_MAX. This is a per-namespace
 	sysctl.
 
@@ -30,6 +32,10 @@ allow_join_initial_addr_port - BOOLEAN
 
 	Default: 1
 
+available_path_managers - STRING
+	Shows the available path manager choices that are registered. More
+	path managers may be available, but not loaded.
+
 available_schedulers - STRING
 	Shows the available schedulers choices that are registered. More packet
 	schedulers may be available, but not loaded.
@@ -41,7 +47,7 @@ blackhole_timeout - INTEGER (seconds)
 	MPTCP is re-enabled and will reset to the initial value when the
 	blackhole issue goes away.
 
-	0 to disable the blackhole detection.
+	0 to disable the blackhole detection. This is a per-namespace sysctl.
 
 	Default: 3600
 
@@ -72,6 +78,23 @@ enabled - BOOLEAN
 
 	Default: 1 (enabled)
 
+path_manager - STRING
+	Set the default path manager name to use for each new MPTCP
+	socket. In-kernel path management will control subflow
+	connections and address advertisements according to
+	per-namespace values configured over the MPTCP netlink
+	API. Userspace path management puts per-MPTCP-connection subflow
+	connection decisions and address advertisements under control of
+	a privileged userspace program, at the cost of more netlink
+	traffic to propagate all of the related events and commands.
+
+	This is a per-namespace sysctl.
+
+	* "kernel"          - In-kernel path manager
+	* "userspace"       - Userspace path manager
+
+	Default: "kernel"
+
 pm_type - INTEGER
 	Set the default path manager type to use for each new MPTCP
 	socket. In-kernel path management will control subflow
@@ -84,6 +107,8 @@ pm_type - INTEGER
 
 	This is a per-namespace sysctl.
 
+	Deprecated since v6.15, use path_manager instead.
+
 	* 0 - In-kernel path manager
 	* 1 - Userspace path manager
 
@@ -108,3 +133,19 @@ stale_loss_cnt - INTEGER
 	This is a per-namespace sysctl.
 
 	Default: 4
+
+syn_retrans_before_tcp_fallback - INTEGER
+	The number of SYN + MP_CAPABLE retransmissions before falling back to
+	TCP, i.e. dropping the MPTCP options. In other words, if all the packets
+	are dropped on the way, there will be:
+
+	* The initial SYN with MPTCP support
+	* This number of SYN retransmitted with MPTCP support
+	* The next SYN retransmissions will be without MPTCP support
+
+	0 means the first retransmission will be done without MPTCP options.
+	>= 128 means that all SYN retransmissions will keep the MPTCP options. A
+	lower number might increase false-positive MPTCP blackhole detections.
+	This is a per-namespace sysctl.
+
+	Default: 2
diff --git a/Documentation/networking/multi-pf-netdev.rst b/Documentation/networking/multi-pf-netdev.rst
index 2cd25d81aaa7..2f5a5bb3ca9a 100644
--- a/Documentation/networking/multi-pf-netdev.rst
+++ b/Documentation/networking/multi-pf-netdev.rst
@@ -89,7 +89,7 @@ Observability
 =============
 The relation between PF, irq, napi, and queue can be observed via netlink spec::
 
-  $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump queue-get --json='{"ifindex": 13}'
+  $ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump queue-get --json='{"ifindex": 13}'
   [{'id': 0, 'ifindex': 13, 'napi-id': 539, 'type': 'rx'},
    {'id': 1, 'ifindex': 13, 'napi-id': 540, 'type': 'rx'},
    {'id': 2, 'ifindex': 13, 'napi-id': 541, 'type': 'rx'},
@@ -101,7 +101,7 @@ The relation between PF, irq, napi, and queue can be observed via netlink spec::
    {'id': 3, 'ifindex': 13, 'napi-id': 542, 'type': 'tx'},
    {'id': 4, 'ifindex': 13, 'napi-id': 543, 'type': 'tx'}]
 
-  $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump napi-get --json='{"ifindex": 13}'
+  $ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump napi-get --json='{"ifindex": 13}'
   [{'id': 543, 'ifindex': 13, 'irq': 42},
    {'id': 542, 'ifindex': 13, 'irq': 41},
    {'id': 541, 'ifindex': 13, 'irq': 40},
diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
index dfa5d549be9c..a15754adb041 100644
--- a/Documentation/networking/napi.rst
+++ b/Documentation/networking/napi.rst
@@ -171,12 +171,43 @@ a channel as an IRQ/NAPI which services queues of a given type. For example,
 a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected
 to utilize 3 interrupts, 2 Rx and 2 Tx queues.
 
+Persistent NAPI config
+----------------------
+
+Drivers often allocate and free NAPI instances dynamically. This leads to loss
+of NAPI-related user configuration each time NAPI instances are reallocated.
+The netif_napi_add_config() API prevents this loss of configuration by
+associating each NAPI instance with a persistent NAPI configuration based on
+a driver defined index value, like a queue number.
+
+Using this API allows for persistent NAPI IDs (among other settings), which can
+be beneficial to userspace programs using ``SO_INCOMING_NAPI_ID``. See the
+sections below for other NAPI configuration settings.
+
+Drivers should try to use netif_napi_add_config() whenever possible.
+
 User API
 ========
 
 User interactions with NAPI depend on NAPI instance ID. The instance IDs
 are only visible to the user thru the ``SO_INCOMING_NAPI_ID`` socket option.
-It's not currently possible to query IDs used by a given device.
+
+Users can query NAPI IDs for a device or device queue using netlink. This can
+be done programmatically in a user application or by using a script included in
+the kernel source tree: ``tools/net/ynl/pyynl/cli.py``.
+
+For example, using the script to dump all of the queues for a device (which
+will reveal each queue's NAPI ID):
+
+.. code-block:: bash
+
+   $ kernel-source/tools/net/ynl/pyynl/cli.py \
+             --spec Documentation/netlink/specs/netdev.yaml \
+             --dump queue-get \
+             --json='{"ifindex": 2}'
+
+See ``Documentation/netlink/specs/netdev.yaml`` for more details on
+available operations and attributes.
 
 Software IRQ coalescing
 -----------------------
@@ -192,6 +223,33 @@ is reused to control the delay of the timer, while
 ``napi_defer_hard_irqs`` controls the number of consecutive empty polls
 before NAPI gives up and goes back to using hardware IRQs.
 
+The above parameters can also be set on a per-NAPI basis using netlink via
+netdev-genl. When used with netlink and configured on a per-NAPI basis, the
+parameters mentioned above use hyphens instead of underscores:
+``gro-flush-timeout`` and ``napi-defer-hard-irqs``.
+
+Per-NAPI configuration can be done programmatically in a user application
+or by using a script included in the kernel source tree:
+``tools/net/ynl/pyynl/cli.py``.
+
+For example, using the script:
+
+.. code-block:: bash
+
+  $ kernel-source/tools/net/ynl/pyynl/cli.py \
+            --spec Documentation/netlink/specs/netdev.yaml \
+            --do napi-set \
+            --json='{"id": 345,
+                     "defer-hard-irqs": 111,
+                     "gro-flush-timeout": 11111}'
+
+Similarly, the parameter ``irq-suspend-timeout`` can be set using netlink
+via netdev-genl. There is no global sysfs parameter for this value.
+
+``irq-suspend-timeout`` is used to determine how long an application can
+completely suspend IRQs. It is used in combination with SO_PREFER_BUSY_POLL,
+which can be set on a per-epoll context basis with the ``EPIOCSPARAMS`` ioctl.
+
 .. _poll:
 
 Busy polling
@@ -207,6 +265,46 @@ selected sockets or using the global ``net.core.busy_poll`` and
 ``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
 also exists.
 
+epoll-based busy polling
+------------------------
+
+It is possible to trigger packet processing directly from calls to
+``epoll_wait``. In order to use this feature, a user application must ensure
+all file descriptors which are added to an epoll context have the same NAPI ID.
+
+If the application uses a dedicated acceptor thread, the application can obtain
+the NAPI ID of the incoming connection using SO_INCOMING_NAPI_ID and then
+distribute that file descriptor to a worker thread. The worker thread would add
+the file descriptor to its epoll context. This would ensure each worker thread
+has an epoll context with FDs that have the same NAPI ID.
+
+Alternatively, if the application uses SO_REUSEPORT, a BPF or eBPF program can
+be inserted to distribute incoming connections to threads such that each thread
+is only given incoming connections with the same NAPI ID. Care must be taken to
+handle cases where a system may have multiple NICs.
+
+In order to enable busy polling, there are two choices:
+
+1. ``/proc/sys/net/core/busy_poll`` can be set with a time in microseconds to busy
+   loop waiting for events. This is a system-wide setting and will cause all
+   epoll-based applications to busy poll when they call epoll_wait. This may
+   not be desirable as many applications may not have the need to busy poll.
+
+2. Applications using recent kernels can issue an ioctl on the epoll context
+   file descriptor to set (``EPIOCSPARAMS``) or get (``EPIOCGPARAMS``) ``struct
+   epoll_params``, which user programs can define as follows:
+
+.. code-block:: c
+
+  struct epoll_params {
+      uint32_t busy_poll_usecs;
+      uint16_t busy_poll_budget;
+      uint8_t prefer_busy_poll;
+
+      /* pad the struct to a multiple of 64 bits */
+      uint8_t __pad;
+  };
+
 IRQ mitigation
 ---------------
 
@@ -222,12 +320,111 @@ Such applications can pledge to the kernel that they will perform a busy
 polling operation periodically, and the driver should keep the device IRQs
 permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL``
 socket option. To avoid system misbehavior the pledge is revoked
-if ``gro_flush_timeout`` passes without any busy poll call.
+if ``gro_flush_timeout`` passes without any busy poll call. For epoll-based
+busy polling applications, the ``prefer_busy_poll`` field of ``struct
+epoll_params`` can be set to 1 and the ``EPIOCSPARAMS`` ioctl can be issued to
+enable this mode. See the above section for more details.
 
 The NAPI budget for busy polling is lower than the default (which makes
 sense given the low latency intention of normal busy polling). This is
 not the case with IRQ mitigation, however, so the budget can be adjusted
-with the ``SO_BUSY_POLL_BUDGET`` socket option.
+with the ``SO_BUSY_POLL_BUDGET`` socket option. For epoll-based busy polling
+applications, the ``busy_poll_budget`` field can be adjusted to the desired value
+in ``struct epoll_params`` and set on a specific epoll context using the ``EPIOCSPARAMS``
+ioctl. See the above section for more details.
+
+It is important to note that choosing a large value for ``gro_flush_timeout``
+will defer IRQs to allow for better batch processing, but will induce latency
+when the system is not fully loaded. Choosing a small value for
+``gro_flush_timeout`` can cause device IRQs and softirq processing to
+interfere with the user application which is attempting to busy poll. This value
+should be chosen carefully with these tradeoffs in mind. epoll-based busy
+polling applications may be able to mitigate how much user processing happens
+by choosing an appropriate value for ``maxevents``.
+
+Users may want to consider an alternate approach, IRQ suspension, to help deal
+with these tradeoffs.
+
+IRQ suspension
+--------------
+
+IRQ suspension is a mechanism wherein device IRQs are masked while epoll
+triggers NAPI packet processing.
+
+While application calls to epoll_wait successfully retrieve events, the kernel will
+defer the IRQ suspension timer. If the kernel does not retrieve any events
+while busy polling (for example, because network traffic levels subsided), IRQ
+suspension is disabled and the IRQ mitigation strategies described above are
+engaged.
+
+This allows users to balance CPU consumption with network processing
+efficiency.
+
+To use this mechanism:
+
+  1. The per-NAPI config parameter ``irq-suspend-timeout`` should be set to the
+     maximum time (in nanoseconds) the application can have its IRQs
+     suspended. This is done using netlink, as described above. This timeout
+     serves as a safety mechanism to restart driver interrupt processing if
+     the application has stalled. This value should be chosen so that it covers
+     the amount of time the user application needs to process data from its
+     call to epoll_wait, noting that applications can control how much data
+     they retrieve by setting ``maxevents`` when calling epoll_wait.
+
+  2. The sysfs parameter or per-NAPI config parameters ``gro_flush_timeout``
+     and ``napi_defer_hard_irqs`` can be set to low values. They will be used
+     to defer IRQs after busy poll has found no data.
+
+  3. The ``prefer_busy_poll`` flag must be set to true. This can be done using
+     the ``EPIOCSPARAMS`` ioctl as described above.
+
+  4. The application uses epoll as described above to trigger NAPI packet
+     processing.
+
+As mentioned above, as long as subsequent calls to epoll_wait return events to
+userland, the ``irq-suspend-timeout`` is deferred and IRQs are disabled. This
+allows the application to process data without interference.
+
+Once a call to epoll_wait results in no events being found, IRQ suspension is
+automatically disabled and the ``gro_flush_timeout`` and
+``napi_defer_hard_irqs`` mitigation mechanisms take over.
+
+It is expected that ``irq-suspend-timeout`` will be set to a value much larger
+than ``gro_flush_timeout`` as ``irq-suspend-timeout`` should suspend IRQs for
+the duration of one userland processing cycle.
+
+While it is not strictly necessary to use ``napi_defer_hard_irqs`` and
+``gro_flush_timeout`` to use IRQ suspension, their use is strongly
+recommended.
+
+IRQ suspension causes the system to alternate between polling mode and
+irq-driven packet delivery. During busy periods, ``irq-suspend-timeout``
+overrides ``gro_flush_timeout`` and keeps the system busy polling, but when
+epoll finds no events, the settings of ``gro_flush_timeout`` and
+``napi_defer_hard_irqs`` determine the next step.
+
+There are essentially three possible loops for network processing and
+packet delivery:
+
+1) hardirq -> softirq -> napi poll; basic interrupt delivery
+2) timer -> softirq -> napi poll; deferred irq processing
+3) epoll -> busy-poll -> napi poll; busy looping
+
+Loop 2 can take control from Loop 1, if ``gro_flush_timeout`` and
+``napi_defer_hard_irqs`` are set.
+
+If ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` are set, Loops 2
+and 3 "wrestle" with each other for control.
+
+During busy periods, ``irq-suspend-timeout`` is used as the timer in Loop 2,
+which essentially tilts network processing in favour of Loop 3.
+
+If ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` are not set, Loop 3
+cannot take control from Loop 1.
+
+Therefore, setting ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` is
+the recommended usage, because otherwise setting ``irq-suspend-timeout``
+might not have any discernible effect.
 
 .. _threaded:
 
@@ -247,7 +444,14 @@ dependent). The NAPI instance IDs will be assigned in the opposite
 order than the process IDs of the kernel threads.
 
 Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in
-netdev's sysfs directory.
+netdev's sysfs directory. It can also be enabled for a specific NAPI using the
+netlink interface.
+
+For example, using the script:
+
+.. code-block:: bash
+
+  $ ynl --family netdev --do napi-set --json='{"id": 66, "threaded": 1}'
 
 .. rubric:: Footnotes
 
diff --git a/Documentation/networking/net_cachelines/inet_connection_sock.rst b/Documentation/networking/net_cachelines/inet_connection_sock.rst
index 7a911dc95652..8fae85ebb773 100644
--- a/Documentation/networking/net_cachelines/inet_connection_sock.rst
+++ b/Documentation/networking/net_cachelines/inet_connection_sock.rst
@@ -5,46 +5,47 @@
 inet_connection_sock struct fast path usage breakdown
 =====================================================
 
+=================================== ====================== =================== =================== ========================================================================================================================================================
 Type                                Name                   fastpath_tx_access  fastpath_rx_access  comment
-..struct                            ..inet_connection_sock                                         
-struct_inet_sock                    icsk_inet              read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
-struct_request_sock_queue           icsk_accept_queue      -                   -                   
-struct_inet_bind_bucket             icsk_bind_hash         read_mostly         -                   tcp_set_state
-struct_inet_bind2_bucket            icsk_bind2_hash        read_mostly         -                   tcp_set_state,inet_put_port
-unsigned_long                       icsk_timeout           read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
-struct_timer_list                   icsk_retransmit_timer  read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
-struct_timer_list                   icsk_delack_timer      read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
-u32                                 icsk_rto               read_write          -                   tcp_cwnd_validate,tcp_schedule_loss_probe,tcp_connect_init,tcp_connect,tcp_write_xmit,tcp_push_one
-u32                                 icsk_rto_min           -                   -                   
-u32                                 icsk_delack_max        -                   -                   
-u32                                 icsk_pmtu_cookie       read_write          -                   tcp_sync_mss,tcp_current_mss,tcp_send_syn_data,tcp_connect_init,tcp_connect
-struct_tcp_congestion_ops           icsk_ca_ops            read_write          -                   tcp_cwnd_validate,tcp_tso_segs,tcp_ca_dst_init,tcp_connect_init,tcp_connect,tcp_write_xmit
-struct_inet_connection_sock_af_ops  icsk_af_ops            read_mostly         -                   tcp_finish_connect,tcp_send_syn_data,tcp_mtup_init,tcp_mtu_check_reprobe,tcp_mtu_probe,tcp_connect_init,tcp_connect,__tcp_transmit_skb
-struct_tcp_ulp_ops*                 icsk_ulp_ops           -                   -                   
-void*                               icsk_ulp_data          -                   -                   
-u8:5                                icsk_ca_state          read_write          -                   tcp_cwnd_application_limited,tcp_set_ca_state,tcp_enter_cwr,tcp_tso_should_defer,tcp_mtu_probe,tcp_schedule_loss_probe,tcp_write_xmit,__tcp_transmit_skb
-u8:1                                icsk_ca_initialized    read_write          -                   tcp_init_transfer,tcp_init_congestion_control,tcp_init_transfer,tcp_finish_connect,tcp_connect
-u8:1                                icsk_ca_setsockopt     -                   -                   
-u8:1                                icsk_ca_dst_locked     write_mostly        -                   tcp_ca_dst_init,tcp_connect_init,tcp_connect
-u8                                  icsk_retransmits       write_mostly        -                   tcp_connect_init,tcp_connect
-u8                                  icsk_pending           read_write          -                   inet_csk_reset_xmit_timer,tcp_connect,tcp_check_probe_timer,__tcp_push_pending_frames,tcp_rearm_rto,tcp_event_new_data_sent,tcp_event_new_data_sent
-u8                                  icsk_backoff           write_mostly        -                   tcp_write_queue_purge,tcp_connect_init
-u8                                  icsk_syn_retries       -                   -                   
-u8                                  icsk_probes_out        -                   -                   
-u16                                 icsk_ext_hdr_len       read_mostly         -                   __tcp_mtu_to_mss,tcp_mtu_to_rss,tcp_mtu_probe,tcp_write_xmit,tcp_mtu_to_mss,
-struct_icsk_ack_u8                  pending                read_write          read_write          inet_csk_ack_scheduled,__tcp_cleanup_rbuf,tcp_cleanup_rbuf,inet_csk_clear_xmit_timer,tcp_event_ack-sent,inet_csk_reset_xmit_timer
-struct_icsk_ack_u8                  quick                  read_write          write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_select_window,__tcp_cleanup_rbuf
-struct_icsk_ack_u8                  pingpong               -                   -                   
-struct_icsk_ack_u8                  retry                  write_mostly        read_write          inet_csk_clear_xmit_timer,tcp_rearm_rto,tcp_event_new_data_sent,tcp_write_xmit,__tcp_send_ack,tcp_send_ack,
-struct_icsk_ack_u8                  ato                    read_mostly         write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_send_ack,tcp_send_ack
-struct_icsk_ack_unsigned_long       timeout                read_write          read_write          inet_csk_reset_xmit_timer,tcp_connect
-struct_icsk_ack_u32                 lrcvtime               read_write          -                   tcp_finish_connect,tcp_connect,tcp_event_data_sent,__tcp_transmit_skb
-struct_icsk_ack_u16                 rcv_mss                write_mostly        read_mostly         __tcp_select_window,__tcp_cleanup_rbuf,tcp_initialize_rcv_mss,tcp_connect_init
-struct_icsk_mtup_int                search_high            read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_check_reprobe,tcp_write_xmit
-struct_icsk_mtup_int                search_low             read_write          -                   tcp_mtu_probe,tcp_mtu_check_reprobe,tcp_write_xmit,tcp_sync_mss,tcp_connect_init,tcp_mtup_init
-struct_icsk_mtup_u32:31             probe_size             read_write          -                   tcp_mtup_init,tcp_connect_init,__tcp_transmit_skb
-struct_icsk_mtup_u32:1              enabled                read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_probe,tcp_write_xmit
-struct_icsk_mtup_u32                probe_timestamp        read_write          -                   tcp_mtup_init,tcp_connect_init,tcp_mtu_check_reprobe,tcp_mtu_probe
-u32                                 icsk_probes_tstamp     -                   -                   
-u32                                 icsk_user_timeout      -                   -                   
-u64[104/sizeof(u64)]                icsk_ca_priv           -                   -                   
+=================================== ====================== =================== =================== ========================================================================================================================================================
+struct inet_sock                    icsk_inet              read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct request_sock_queue           icsk_accept_queue
+struct inet_bind_bucket             icsk_bind_hash         read_mostly                             tcp_set_state
+struct inet_bind2_bucket            icsk_bind2_hash        read_mostly                             tcp_set_state,inet_put_port
+struct timer_list                   icsk_retransmit_timer  read_write                              inet_csk_reset_xmit_timer,tcp_connect
+struct timer_list                   icsk_delack_timer      read_mostly                             inet_csk_reset_xmit_timer,tcp_connect
+u32                                 icsk_rto               read_write                              tcp_cwnd_validate,tcp_schedule_loss_probe,tcp_connect_init,tcp_connect,tcp_write_xmit,tcp_push_one
+u32                                 icsk_rto_min
+u32                                 icsk_rto_max           read_mostly                             tcp_reset_xmit_timer
+u32                                 icsk_delack_max
+u32                                 icsk_pmtu_cookie       read_write                              tcp_sync_mss,tcp_current_mss,tcp_send_syn_data,tcp_connect_init,tcp_connect
+struct tcp_congestion_ops           icsk_ca_ops            read_write                              tcp_cwnd_validate,tcp_tso_segs,tcp_ca_dst_init,tcp_connect_init,tcp_connect,tcp_write_xmit
+struct inet_connection_sock_af_ops  icsk_af_ops            read_mostly                             tcp_finish_connect,tcp_send_syn_data,tcp_mtup_init,tcp_mtu_check_reprobe,tcp_mtu_probe,tcp_connect_init,tcp_connect,__tcp_transmit_skb
+struct tcp_ulp_ops*                 icsk_ulp_ops
+void*                               icsk_ulp_data
+u8:5                                icsk_ca_state          read_write                              tcp_cwnd_application_limited,tcp_set_ca_state,tcp_enter_cwr,tcp_tso_should_defer,tcp_mtu_probe,tcp_schedule_loss_probe,tcp_write_xmit,__tcp_transmit_skb
+u8:1                                icsk_ca_initialized    read_write                              tcp_init_transfer,tcp_init_congestion_control,tcp_init_transfer,tcp_finish_connect,tcp_connect
+u8:1                                icsk_ca_setsockopt
+u8:1                                icsk_ca_dst_locked     write_mostly                            tcp_ca_dst_init,tcp_connect_init,tcp_connect
+u8                                  icsk_retransmits       write_mostly                            tcp_connect_init,tcp_connect
+u8                                  icsk_pending           read_write                              inet_csk_reset_xmit_timer,tcp_connect,tcp_check_probe_timer,__tcp_push_pending_frames,tcp_rearm_rto,tcp_event_new_data_sent,tcp_event_new_data_sent
+u8                                  icsk_backoff           write_mostly                            tcp_write_queue_purge,tcp_connect_init
+u8                                  icsk_syn_retries
+u8                                  icsk_probes_out
+u16                                 icsk_ext_hdr_len       read_mostly                             __tcp_mtu_to_mss,tcp_mtu_to_rss,tcp_mtu_probe,tcp_write_xmit,tcp_mtu_to_mss,
+struct icsk_ack_u8                  pending                read_write          read_write          inet_csk_ack_scheduled,__tcp_cleanup_rbuf,tcp_cleanup_rbuf,inet_csk_clear_xmit_timer,tcp_event_ack_sent,inet_csk_reset_xmit_timer
+struct icsk_ack_u8                  quick                  read_write          write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_select_window,__tcp_cleanup_rbuf
+struct icsk_ack_u8                  pingpong
+struct icsk_ack_u8                  retry                  write_mostly        read_write          inet_csk_clear_xmit_timer,tcp_rearm_rto,tcp_event_new_data_sent,tcp_write_xmit,__tcp_send_ack,tcp_send_ack,
+struct icsk_ack_u8                  ato                    read_mostly         write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_send_ack,tcp_send_ack
+struct icsk_ack_u32                 lrcvtime               read_write                              tcp_finish_connect,tcp_connect,tcp_event_data_sent,__tcp_transmit_skb
+struct icsk_ack_u16                 rcv_mss                write_mostly        read_mostly         __tcp_select_window,__tcp_cleanup_rbuf,tcp_initialize_rcv_mss,tcp_connect_init
+struct icsk_mtup_int                search_high            read_write                              tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_check_reprobe,tcp_write_xmit
+struct icsk_mtup_int                search_low             read_write                              tcp_mtu_probe,tcp_mtu_check_reprobe,tcp_write_xmit,tcp_sync_mss,tcp_connect_init,tcp_mtup_init
+struct icsk_mtup_u32:31             probe_size             read_write                              tcp_mtup_init,tcp_connect_init,__tcp_transmit_skb
+struct icsk_mtup_u32:1              enabled                read_write                              tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_probe,tcp_write_xmit
+struct icsk_mtup_u32                probe_timestamp        read_write                              tcp_mtup_init,tcp_connect_init,tcp_mtu_check_reprobe,tcp_mtu_probe
+u32                                 icsk_probes_tstamp
+u32                                 icsk_user_timeout
+u64[104/sizeof(u64)]                icsk_ca_priv
+=================================== ====================== =================== =================== ========================================================================================================================================================
diff --git a/Documentation/networking/net_cachelines/inet_sock.rst b/Documentation/networking/net_cachelines/inet_sock.rst
index 595d7ef5fc8b..b11bf48fa2b3 100644
--- a/Documentation/networking/net_cachelines/inet_sock.rst
+++ b/Documentation/networking/net_cachelines/inet_sock.rst
@@ -5,40 +5,42 @@
 inet_sock struct fast path usage breakdown
 ==========================================
 
+======================= ===================== =================== =================== ======================================================================================================
 Type                    Name                  fastpath_tx_access  fastpath_rx_access  comment
-..struct                ..inet_sock                                                     
-struct_sock             sk                    read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
-struct_ipv6_pinfo*      pinet6                -                   -                   
-be16                    inet_sport            read_mostly         -                   __tcp_transmit_skb
-be32                    inet_daddr            read_mostly         -                   ip_select_ident_segs
-be32                    inet_rcv_saddr        -                   -                   
-be16                    inet_dport            read_mostly         -                   __tcp_transmit_skb
-u16                     inet_num              -                   -                   
-be32                    inet_saddr            -                   -                   
-s16                     uc_ttl                read_mostly         -                   __ip_queue_xmit/ip_select_ttl
-u16                     cmsg_flags            -                   -                   
-struct_ip_options_rcu*  inet_opt              read_mostly         -                   __ip_queue_xmit
-u16                     inet_id               read_mostly         -                   ip_select_ident_segs
-u8                      tos                   read_mostly         -                   ip_queue_xmit
-u8                      min_ttl               -                   -                   
-u8                      mc_ttl                -                   -                   
-u8                      pmtudisc              -                   -                   
-u8:1                    recverr               -                   -                   
-u8:1                    is_icsk               -                   -                   
-u8:1                    freebind              -                   -                   
-u8:1                    hdrincl               -                   -                   
-u8:1                    mc_loop               -                   -                   
-u8:1                    transparent           -                   -                   
-u8:1                    mc_all                -                   -                   
-u8:1                    nodefrag              -                   -                   
-u8:1                    bind_address_no_port  -                   -                   
-u8:1                    recverr_rfc4884       -                   -                   
-u8:1                    defer_connect         read_mostly         -                   tcp_sendmsg_fastopen
-u8                      rcv_tos               -                   -                   
-u8                      convert_csum          -                   -                   
-int                     uc_index              -                   -                   
-int                     mc_index              -                   -                   
-be32                    mc_addr               -                   -                   
-struct_ip_mc_socklist*  mc_list               -                   -                   
-struct_inet_cork_full   cork                  read_mostly         -                   __tcp_transmit_skb
-struct                  local_port_range      -                   -                   
+======================= ===================== =================== =================== ======================================================================================================
+struct sock             sk                    read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct ipv6_pinfo*      pinet6
+be16                    inet_sport            read_mostly                             __tcp_transmit_skb
+be32                    inet_daddr            read_mostly                             ip_select_ident_segs
+be32                    inet_rcv_saddr
+be16                    inet_dport            read_mostly                             __tcp_transmit_skb
+u16                     inet_num
+be32                    inet_saddr
+s16                     uc_ttl                read_mostly                             __ip_queue_xmit/ip_select_ttl
+u16                     cmsg_flags
+struct ip_options_rcu*  inet_opt              read_mostly                             __ip_queue_xmit
+u16                     inet_id               read_mostly                             ip_select_ident_segs
+u8                      tos                   read_mostly                             ip_queue_xmit
+u8                      min_ttl
+u8                      mc_ttl
+u8                      pmtudisc
+u8:1                    recverr
+u8:1                    is_icsk
+u8:1                    freebind
+u8:1                    hdrincl
+u8:1                    mc_loop
+u8:1                    transparent
+u8:1                    mc_all
+u8:1                    nodefrag
+u8:1                    bind_address_no_port
+u8:1                    recverr_rfc4884
+u8:1                    defer_connect         read_mostly                             tcp_sendmsg_fastopen
+u8                      rcv_tos
+u8                      convert_csum
+int                     uc_index
+int                     mc_index
+be32                    mc_addr
+struct ip_mc_socklist*  mc_list
+struct inet_cork_full   cork                  read_mostly                             __tcp_transmit_skb
+struct                  local_port_range
+======================= ===================== =================== =================== ======================================================================================================
diff --git a/Documentation/networking/net_cachelines/net_device.rst b/Documentation/networking/net_cachelines/net_device.rst
index 22b07c814f4a..1c19bb7705df 100644
--- a/Documentation/networking/net_cachelines/net_device.rst
+++ b/Documentation/networking/net_cachelines/net_device.rst
@@ -5,181 +5,189 @@
 net_device struct fast path usage breakdown
 ===========================================
 
-Type                                Name                    fastpath_tx_access  fastpath_rx_access  Comments
-..struct                            ..net_device                                                    
-unsigned_long:32                    priv_flags              read_mostly         -                   __dev_queue_xmit(tx)
-unsigned_long:1                     lltx                    read_mostly         -                   HARD_TX_LOCK,HARD_TX_TRYLOCK,HARD_TX_UNLOCK(tx)
-char                                name[16]                -                   -                   
-struct_netdev_name_node*            name_node                                                       
-struct_dev_ifalias*                 ifalias                                                         
-unsigned_long                       mem_end                                                         
-unsigned_long                       mem_start                                                       
-unsigned_long                       base_addr                                                       
-unsigned_long                       state                   read_mostly         read_mostly         netif_running(dev)
-struct_list_head                    dev_list                                                        
-struct_list_head                    napi_list                                                       
-struct_list_head                    unreg_list                                                      
-struct_list_head                    close_list                                                      
-struct_list_head                    ptype_all               read_mostly         -                   dev_nit_active(tx)
-struct_list_head                    ptype_specific                              read_mostly         deliver_ptype_list_skb/__netif_receive_skb_core(rx)
-struct                              adj_list                                                        
-unsigned_int                        flags                   read_mostly         read_mostly         __dev_queue_xmit,__dev_xmit_skb,ip6_output,__ip6_finish_output(tx);ip6_rcv_core(rx)
-xdp_features_t                      xdp_features                                                    
-struct_net_device_ops*              netdev_ops              read_mostly         -                   netdev_core_pick_tx,netdev_start_xmit(tx)
-struct_xdp_metadata_ops*            xdp_metadata_ops                                                
-int                                 ifindex                 -                   read_mostly         ip6_rcv_core
-unsigned_short                      gflags                                                          
-unsigned_short                      hard_header_len         read_mostly         read_mostly         ip6_xmit(tx);gro_list_prepare(rx)
-unsigned_int                        mtu                     read_mostly         -                   ip_finish_output2
-unsigned_short                      needed_headroom         read_mostly         -                   LL_RESERVED_SPACE/ip_finish_output2
-unsigned_short                      needed_tailroom                                                 
-netdev_features_t                   features                read_mostly         read_mostly         HARD_TX_LOCK,netif_skb_features,sk_setup_caps(tx);netif_elide_gro(rx)
-netdev_features_t                   hw_features                                                     
-netdev_features_t                   wanted_features                                                 
-netdev_features_t                   vlan_features                                                   
-netdev_features_t                   hw_enc_features         -                   -                   netif_skb_features
-netdev_features_t                   mpls_features                                                   
-netdev_features_t                   gso_partial_features    read_mostly                             gso_features_check
-unsigned_int                        min_mtu                                                         
-unsigned_int                        max_mtu                                                         
-unsigned_short                      type                                                            
-unsigned_char                       min_header_len                                                  
-unsigned_char                       name_assign_type                                                
-int                                 group                                                           
-struct_net_device_stats             stats                                                           
-struct_net_device_core_stats*       core_stats                                                      
-atomic_t                            carrier_up_count                                                
-atomic_t                            carrier_down_count                                              
-struct_iw_handler_def*              wireless_handlers                                               
-struct_iw_public_data*              wireless_data                                                   
-struct_ethtool_ops*                 ethtool_ops                                                     
-struct_l3mdev_ops*                  l3mdev_ops                                                      
-struct_ndisc_ops*                   ndisc_ops                                                       
-struct_xfrmdev_ops*                 xfrmdev_ops                                                     
-struct_tlsdev_ops*                  tlsdev_ops                                                      
-struct_header_ops*                  header_ops              read_mostly         -                   ip_finish_output2,ip6_finish_output2(tx)
-unsigned_char                       operstate                                                       
-unsigned_char                       link_mode                                                       
-unsigned_char                       if_port                                                         
-unsigned_char                       dma                                                             
-unsigned_char                       perm_addr[32]                                                   
-unsigned_char                       addr_assign_type                                                
-unsigned_char                       addr_len                                                        
-unsigned_char                       upper_level                                                     
-unsigned_char                       lower_level                                                     
-unsigned_short                      neigh_priv_len                                                  
-unsigned_short                      padded                                                          
-unsigned_short                      dev_id                                                          
-unsigned_short                      dev_port                                                        
-spinlock_t                          addr_list_lock                                                  
-int                                 irq                                                             
-struct_netdev_hw_addr_list          uc                                                              
-struct_netdev_hw_addr_list          mc                                                              
-struct_netdev_hw_addr_list          dev_addrs                                                       
-struct_kset*                        queues_kset                                                     
-struct_list_head                    unlink_list                                                     
-unsigned_int                        promiscuity                                                     
-unsigned_int                        allmulti                                                        
-bool                                uc_promisc                                                      
-unsigned_char                       nested_level                                                    
-struct_in_device*                   ip_ptr                  read_mostly         read_mostly         __in_dev_get
-struct_inet6_dev*                   ip6_ptr                 read_mostly         read_mostly         __in6_dev_get
-struct_vlan_info*                   vlan_info                                                       
-struct_dsa_port*                    dsa_ptr                                                         
-struct_tipc_bearer*                 tipc_ptr                                                        
-void*                               atalk_ptr                                                       
-void*                               ax25_ptr                                                        
-struct_wireless_dev*                ieee80211_ptr                                                   
-struct_wpan_dev*                    ieee802154_ptr                                                  
-struct_mpls_dev*                    mpls_ptr                                                        
-struct_mctp_dev*                    mctp_ptr                                                        
-unsigned_char*                      dev_addr                                                        
-struct_netdev_queue*                _rx                     read_mostly         -                   netdev_get_rx_queue(rx)
-unsigned_int                        num_rx_queues                                                   
-unsigned_int                        real_num_rx_queues      -                   read_mostly         get_rps_cpu
-struct_bpf_prog*                    xdp_prog                -                   read_mostly         netif_elide_gro()
-unsigned_long                       gro_flush_timeout       -                   read_mostly         napi_complete_done
-u32                                 napi_defer_hard_irqs    -                   read_mostly         napi_complete_done
-unsigned_int                        gro_max_size            -                   read_mostly         skb_gro_receive
-unsigned_int                        gro_ipv4_max_size       -                   read_mostly         skb_gro_receive
-rx_handler_func_t*                  rx_handler              read_mostly         -                   __netif_receive_skb_core
-void*                               rx_handler_data         read_mostly         -                   
-struct_netdev_queue*                ingress_queue           read_mostly         -                   
-struct_bpf_mprog_entry              tcx_ingress             -                   read_mostly         sch_handle_ingress
-struct_nf_hook_entries*             nf_hooks_ingress                                                
-unsigned_char                       broadcast[32]                                                   
-struct_cpu_rmap*                    rx_cpu_rmap                                                     
-struct_hlist_node                   index_hlist                                                     
-struct_netdev_queue*                _tx                     read_mostly         -                   netdev_get_tx_queue(tx)
-unsigned_int                        num_tx_queues           -                   -                   
-unsigned_int                        real_num_tx_queues      read_mostly         -                   skb_tx_hash,netdev_core_pick_tx(tx)
-unsigned_int                        tx_queue_len                                                    
-spinlock_t                          tx_global_lock                                                  
-struct_xdp_dev_bulk_queue__percpu*  xdp_bulkq                                                       
-struct_xps_dev_maps*                xps_maps[2]             read_mostly         -                   __netif_set_xps_queue
-struct_bpf_mprog_entry              tcx_egress              read_mostly         -                   sch_handle_egress
-struct_nf_hook_entries*             nf_hooks_egress         read_mostly         -                   
-struct_hlist_head                   qdisc_hash[16]                                                  
-struct_timer_list                   watchdog_timer                                                  
-int                                 watchdog_timeo                                                  
-u32                                 proto_down_reason                                               
-struct_list_head                    todo_list                                                       
-int__percpu*                        pcpu_refcnt                                                     
-refcount_t                          dev_refcnt                                                      
-struct_ref_tracker_dir              refcnt_tracker                                                  
-struct_list_head                    link_watch_list                                                 
-enum:8                              reg_state                                                       
-bool                                dismantle                                                       
-enum:16                             rtnl_link_state                                                 
-bool                                needs_free_netdev                                               
-void*priv_destructor                struct_net_device                                               
-struct_netpoll_info*                npinfo                  -                   read_mostly         napi_poll/napi_poll_lock
-possible_net_t                      nd_net                  -                   read_mostly         (dev_net)napi_busy_loop,tcp_v(4/6)_rcv,ip(v6)_rcv,ip(6)_input,ip(6)_input_finish
-void*                               ml_priv                                                         
-enum_netdev_ml_priv_type            ml_priv_type                                                    
-struct_pcpu_lstats__percpu*         lstats                  read_mostly                             dev_lstats_add()
-struct_pcpu_sw_netstats__percpu*    tstats                  read_mostly                             dev_sw_netstats_tx_add()
-struct_pcpu_dstats__percpu*         dstats                                                          
-struct_garp_port*                   garp_port                                                       
-struct_mrp_port*                    mrp_port                                                        
-struct_dm_hw_stat_delta*            dm_private                                                      
-struct_device                       dev                     -                   -                   
-struct_attribute_group*             sysfs_groups[4]                                                 
-struct_attribute_group*             sysfs_rx_queue_group                                            
-struct_rtnl_link_ops*               rtnl_link_ops                                                   
-unsigned_int                        gso_max_size            read_mostly         -                   sk_dst_gso_max_size
-unsigned_int                        tso_max_size                                                    
-u16                                 gso_max_segs            read_mostly         -                   gso_max_segs
-u16                                 tso_max_segs                                                    
-unsigned_int                        gso_ipv4_max_size       read_mostly         -                   sk_dst_gso_max_size
-struct_dcbnl_rtnl_ops*              dcbnl_ops                                                       
-s16                                 num_tc                  read_mostly         -                   skb_tx_hash
-struct_netdev_tc_txq                tc_to_txq[16]           read_mostly         -                   skb_tx_hash
-u8                                  prio_tc_map[16]                                                 
-unsigned_int                        fcoe_ddp_xid                                                    
-struct_netprio_map*                 priomap                                                         
-struct_phy_device*                  phydev                                                          
-struct_sfp_bus*                     sfp_bus                                                         
-struct_lock_class_key*              qdisc_tx_busylock                                               
-bool                                proto_down                                                      
-unsigned:1                          wol_enabled                                                     
-unsigned:1                          threaded                -                   -                   napi_poll(napi_enable,dev_set_threaded)
-unsigned_long:1                     see_all_hwtstamp_requests                                       
-unsigned_long:1                     change_proto_down                                               
-unsigned_long:1                     netns_local                                                     
-unsigned_long:1                     fcoe_mtu                                                        
-struct_list_head                    net_notifier_list                                               
-struct_macsec_ops*                  macsec_ops                                                      
-struct_udp_tunnel_nic_info*         udp_tunnel_nic_info                                             
-struct_udp_tunnel_nic*              udp_tunnel_nic                                                  
-unsigned_int                        xdp_zc_max_segs                                                 
-struct_bpf_xdp_entity               xdp_state[3]                                                    
-u8                                  dev_addr_shadow[32]                                             
-netdevice_tracker                   linkwatch_dev_tracker                                           
-netdevice_tracker                   watchdog_dev_tracker                                            
-netdevice_tracker                   dev_registered_tracker                                          
-struct_rtnl_hw_stats64*             offload_xstats_l3                                               
-struct_devlink_port*                devlink_port                                                    
-struct_dpll_pin*                    dpll_pin                                                        
+=================================== =========================== =================== =================== ===================================================================================
+Type                                Name                        fastpath_tx_access  fastpath_rx_access  Comments
+=================================== =========================== =================== =================== ===================================================================================
+unsigned_long:32                    priv_flags                  read_mostly                             __dev_queue_xmit(tx)
+unsigned_long:1                     lltx                        read_mostly                             HARD_TX_LOCK,HARD_TX_TRYLOCK,HARD_TX_UNLOCK(tx)
+unsigned_long:1                     netmem_tx                   read_mostly
+char                                name[16]
+struct netdev_name_node*            name_node
+struct dev_ifalias*                 ifalias
+unsigned_long                       mem_end
+unsigned_long                       mem_start
+unsigned_long                       base_addr
+unsigned_long                       state                       read_mostly         read_mostly         netif_running(dev)
+struct list_head                    dev_list
+struct list_head                    napi_list
+struct list_head                    unreg_list
+struct list_head                    close_list
+struct list_head                    ptype_all                   read_mostly                             dev_nit_active(tx)
+struct list_head                    ptype_specific                                  read_mostly         deliver_ptype_list_skb/__netif_receive_skb_core(rx)
+struct                              adj_list
+unsigned_int                        flags                       read_mostly         read_mostly         __dev_queue_xmit,__dev_xmit_skb,ip6_output,__ip6_finish_output(tx);ip6_rcv_core(rx)
+xdp_features_t                      xdp_features
+struct net_device_ops*              netdev_ops                  read_mostly                             netdev_core_pick_tx,netdev_start_xmit(tx)
+struct xdp_metadata_ops*            xdp_metadata_ops
+int                                 ifindex                                         read_mostly         ip6_rcv_core
+unsigned_short                      gflags
+unsigned_short                      hard_header_len             read_mostly         read_mostly         ip6_xmit(tx);gro_list_prepare(rx)
+unsigned_int                        mtu                         read_mostly                             ip_finish_output2
+unsigned_short                      needed_headroom             read_mostly                             LL_RESERVED_SPACE/ip_finish_output2
+unsigned_short                      needed_tailroom
+netdev_features_t                   features                    read_mostly         read_mostly         HARD_TX_LOCK,netif_skb_features,sk_setup_caps(tx);netif_elide_gro(rx)
+netdev_features_t                   hw_features
+netdev_features_t                   wanted_features
+netdev_features_t                   vlan_features
+netdev_features_t                   hw_enc_features                                                     netif_skb_features
+netdev_features_t                   mpls_features
+netdev_features_t                   gso_partial_features        read_mostly                             gso_features_check
+unsigned_int                        min_mtu
+unsigned_int                        max_mtu
+unsigned_short                      type
+unsigned_char                       min_header_len
+unsigned_char                       name_assign_type
+int                                 group
+struct net_device_stats             stats
+struct net_device_core_stats*       core_stats
+atomic_t                            carrier_up_count
+atomic_t                            carrier_down_count
+struct iw_handler_def*              wireless_handlers
+struct ethtool_ops*                 ethtool_ops
+struct l3mdev_ops*                  l3mdev_ops
+struct ndisc_ops*                   ndisc_ops
+struct xfrmdev_ops*                 xfrmdev_ops
+struct tlsdev_ops*                  tlsdev_ops
+struct header_ops*                  header_ops                  read_mostly                             ip_finish_output2,ip6_finish_output2(tx)
+unsigned_char                       operstate
+unsigned_char                       link_mode
+unsigned_char                       if_port
+unsigned_char                       dma
+unsigned_char                       perm_addr[32]
+unsigned_char                       addr_assign_type
+unsigned_char                       addr_len
+unsigned_char                       upper_level
+unsigned_char                       lower_level
+u8                                  threaded                                                            napi_poll(napi_enable,netif_set_threaded)
+unsigned_short                      neigh_priv_len
+unsigned_short                      padded
+unsigned_short                      dev_id
+unsigned_short                      dev_port
+spinlock_t                          addr_list_lock
+int                                 irq
+struct netdev_hw_addr_list          uc
+struct netdev_hw_addr_list          mc
+struct netdev_hw_addr_list          dev_addrs
+struct kset*                        queues_kset
+struct list_head                    unlink_list
+unsigned_int                        promiscuity
+unsigned_int                        allmulti
+bool                                uc_promisc
+unsigned_char                       nested_level
+struct in_device*                   ip_ptr                      read_mostly         read_mostly         __in_dev_get
+struct hlist_head                   fib_nh_head
+struct inet6_dev*                   ip6_ptr                     read_mostly         read_mostly         __in6_dev_get
+struct vlan_info*                   vlan_info
+struct dsa_port*                    dsa_ptr
+struct tipc_bearer*                 tipc_ptr
+void*                               atalk_ptr
+void*                               ax25_ptr
+struct wireless_dev*                ieee80211_ptr
+struct wpan_dev*                    ieee802154_ptr
+struct mpls_dev*                    mpls_ptr
+struct mctp_dev*                    mctp_ptr
+unsigned_char*                      dev_addr
+struct netdev_queue*                _rx                         read_mostly                             netdev_get_rx_queue(rx)
+unsigned_int                        num_rx_queues
+unsigned_int                        real_num_rx_queues                              read_mostly         get_rps_cpu
+struct bpf_prog*                    xdp_prog                                        read_mostly         netif_elide_gro()
+unsigned_long                       gro_flush_timeout                               read_mostly         napi_complete_done
+u32                                 napi_defer_hard_irqs                            read_mostly         napi_complete_done
+unsigned_int                        gro_max_size                                    read_mostly         skb_gro_receive
+unsigned_int                        gro_ipv4_max_size                               read_mostly         skb_gro_receive
+rx_handler_func_t*                  rx_handler                  read_mostly                             __netif_receive_skb_core
+void*                               rx_handler_data             read_mostly
+struct netdev_queue*                ingress_queue               read_mostly
+struct bpf_mprog_entry              tcx_ingress                                     read_mostly         sch_handle_ingress
+struct nf_hook_entries*             nf_hooks_ingress
+unsigned_char                       broadcast[32]
+struct cpu_rmap*                    rx_cpu_rmap
+struct hlist_node                   index_hlist
+struct netdev_queue*                _tx                         read_mostly                             netdev_get_tx_queue(tx)
+unsigned_int                        num_tx_queues
+unsigned_int                        real_num_tx_queues          read_mostly                             skb_tx_hash,netdev_core_pick_tx(tx)
+unsigned_int                        tx_queue_len
+spinlock_t                          tx_global_lock
+struct xdp_dev_bulk_queue__percpu*  xdp_bulkq
+struct xps_dev_maps*                xps_maps[2]                 read_mostly                             __netif_set_xps_queue
+struct bpf_mprog_entry              tcx_egress                  read_mostly                             sch_handle_egress
+struct nf_hook_entries*             nf_hooks_egress             read_mostly
+struct hlist_head                   qdisc_hash[16]
+struct timer_list                   watchdog_timer
+int                                 watchdog_timeo
+u32                                 proto_down_reason
+struct list_head                    todo_list
+int__percpu*                        pcpu_refcnt
+refcount_t                          dev_refcnt
+struct ref_tracker_dir              refcnt_tracker
+struct list_head                    link_watch_list
+enum:8                              reg_state
+bool                                dismantle
+bool                                rtnl_link_initializing
+bool                                needs_free_netdev
+void (*)(struct net_device*)        priv_destructor
+struct netpoll_info*                npinfo                                          read_mostly         napi_poll/napi_poll_lock
+possible_net_t                      nd_net                                          read_mostly         (dev_net)napi_busy_loop,tcp_v(4/6)_rcv,ip(v6)_rcv,ip(6)_input,ip(6)_input_finish
+void*                               ml_priv
+enum netdev_ml_priv_type            ml_priv_type
+struct pcpu_lstats__percpu*         lstats                      read_mostly                             dev_lstats_add()
+struct pcpu_sw_netstats__percpu*    tstats                      read_mostly                             dev_sw_netstats_tx_add()
+struct pcpu_dstats__percpu*         dstats
+struct garp_port*                   garp_port
+struct mrp_port*                    mrp_port
+struct dm_hw_stat_delta*            dm_private
+struct device                       dev
+struct attribute_group*             sysfs_groups[4]
+struct attribute_group*             sysfs_rx_queue_group
+struct rtnl_link_ops*               rtnl_link_ops
+unsigned_int                        gso_max_size                read_mostly                             sk_dst_gso_max_size
+unsigned_int                        tso_max_size
+u16                                 gso_max_segs                read_mostly                             gso_max_segs
+u16                                 tso_max_segs
+unsigned_int                        gso_ipv4_max_size           read_mostly                             sk_dst_gso_max_size
+struct dcbnl_rtnl_ops*              dcbnl_ops
+s16                                 num_tc                      read_mostly                             skb_tx_hash
+struct netdev_tc_txq                tc_to_txq[16]               read_mostly                             skb_tx_hash
+u8                                  prio_tc_map[16]
+unsigned_int                        fcoe_ddp_xid
+struct netprio_map*                 priomap
+struct phy_device*                  phydev
+struct sfp_bus*                     sfp_bus
+struct lock_class_key*              qdisc_tx_busylock
+bool                                proto_down
+unsigned:1                          wol_enabled
+unsigned_long:1                     see_all_hwtstamp_requests
+unsigned_long:1                     change_proto_down
+unsigned_long:1                     netns_immutable
+unsigned_long:1                     fcoe_mtu
+struct list_head                    net_notifier_list
+struct macsec_ops*                  macsec_ops
+struct udp_tunnel_nic_info*         udp_tunnel_nic_info
+struct udp_tunnel_nic*              udp_tunnel_nic
+unsigned_int                        xdp_zc_max_segs
+struct bpf_xdp_entity               xdp_state[3]
+u8                                  dev_addr_shadow[32]
+netdevice_tracker                   linkwatch_dev_tracker
+netdevice_tracker                   watchdog_dev_tracker
+netdevice_tracker                   dev_registered_tracker
+struct rtnl_hw_stats64*             offload_xstats_l3
+struct devlink_port*                devlink_port
+struct dpll_pin*                    dpll_pin
 struct hlist_head                   page_pools
 struct dim_irq_moder*               irq_moder
+u64                                 max_pacing_offload_horizon
+struct napi_config*                 napi_config
+unsigned_long                       gro_flush_timeout
+u32                                 napi_defer_hard_irqs
+struct hlist_head                   neighbours[2]
+=================================== =========================== =================== =================== ===================================================================================
diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
index 9b87089a84c6..6e7b20afd2d4 100644
--- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
+++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
@@ -5,154 +5,158 @@
 netns_ipv4 struct fast path usage breakdown
 ===========================================
 
+=============================== ============================================ =================== =================== =================================================
 Type                            Name                                         fastpath_tx_access  fastpath_rx_access  comment
-..struct                        ..netns_ipv4                                                                         
-struct_inet_timewait_death_row  tcp_death_row                                                                        
-struct_udp_table*               udp_table                                                                            
-struct_ctl_table_header*        forw_hdr                                                                             
-struct_ctl_table_header*        frags_hdr                                                                            
-struct_ctl_table_header*        ipv4_hdr                                                                             
-struct_ctl_table_header*        route_hdr                                                                            
-struct_ctl_table_header*        xfrm4_hdr                                                                            
-struct_ipv4_devconf*            devconf_all                                                                          
-struct_ipv4_devconf*            devconf_dflt                                                                         
-struct_ip_ra_chain              ra_chain                                                                             
-struct_mutex                    ra_mutex                                                                             
-struct_fib_rules_ops*           rules_ops                                                                            
-struct_fib_table                fib_main                                                                             
-struct_fib_table                fib_default                                                                          
-unsigned_int                    fib_rules_require_fldissect                                                          
-bool                            fib_has_custom_rules                                                                 
-bool                            fib_has_custom_local_routes                                                          
-bool                            fib_offload_disabled                                                                 
-atomic_t                        fib_num_tclassid_users                                                               
-struct_hlist_head*              fib_table_hash                                                                       
-struct_sock*                    fibnl                                                                                
-struct_sock*                    mc_autojoin_sk                                                                       
-struct_inet_peer_base*          peers                                                                                
-struct_fqdir*                   fqdir                                                                                
-u8                              sysctl_icmp_echo_ignore_all                                                          
-u8                              sysctl_icmp_echo_enable_probe                                                        
-u8                              sysctl_icmp_echo_ignore_broadcasts                                                   
-u8                              sysctl_icmp_ignore_bogus_error_responses                                             
-u8                              sysctl_icmp_errors_use_inbound_ifaddr                                                
-int                             sysctl_icmp_ratelimit                                                                
-int                             sysctl_icmp_ratemask                                                                 
-u32                             ip_rt_min_pmtu                               -                   -                   
-int                             ip_rt_mtu_expires                            -                   -                   
-int                             ip_rt_min_advmss                             -                   -                   
-struct_local_ports              ip_local_ports                               -                   -                   
-u8                              sysctl_tcp_ecn                               -                   -                   
-u8                              sysctl_tcp_ecn_fallback                      -                   -                   
-u8                              sysctl_ip_default_ttl                        -                   -                   ip4_dst_hoplimit/ip_select_ttl
-u8                              sysctl_ip_no_pmtu_disc                       -                   -                   
-u8                              sysctl_ip_fwd_use_pmtu                       read_mostly         -                   ip_dst_mtu_maybe_forward/ip_skb_dst_mtu
-u8                              sysctl_ip_fwd_update_priority                -                   -                   ip_forward
-u8                              sysctl_ip_nonlocal_bind                      -                   -                   
-u8                              sysctl_ip_autobind_reuse                     -                   -                   
-u8                              sysctl_ip_dynaddr                            -                   -                   
-u8                              sysctl_ip_early_demux                        -                   read_mostly         ip(6)_rcv_finish_core
-u8                              sysctl_raw_l3mdev_accept                     -                   -                   
-u8                              sysctl_tcp_early_demux                       -                   read_mostly         ip(6)_rcv_finish_core
-u8                              sysctl_udp_early_demux                                                               
-u8                              sysctl_nexthop_compat_mode                   -                   -                   
-u8                              sysctl_fwmark_reflect                        -                   -                   
-u8                              sysctl_tcp_fwmark_accept                     -                   -                   
-u8                              sysctl_tcp_l3mdev_accept                     -                   -                   
-u8                              sysctl_tcp_mtu_probing                       -                   -                   
-int                             sysctl_tcp_mtu_probe_floor                   -                   -                   
-int                             sysctl_tcp_base_mss                          -                   -                   
-int                             sysctl_tcp_min_snd_mss                       read_mostly         -                   __tcp_mtu_to_mss(tcp_write_xmit)
-int                             sysctl_tcp_probe_threshold                   -                   -                   tcp_mtu_probe(tcp_write_xmit)
-u32                             sysctl_tcp_probe_interval                    -                   -                   tcp_mtu_check_reprobe(tcp_write_xmit)
-int                             sysctl_tcp_keepalive_time                    -                   -                   
-int                             sysctl_tcp_keepalive_intvl                   -                   -                   
-u8                              sysctl_tcp_keepalive_probes                  -                   -                   
-u8                              sysctl_tcp_syn_retries                       -                   -                   
-u8                              sysctl_tcp_synack_retries                    -                   -                   
-u8                              sysctl_tcp_syncookies                        -                   -                   generated_on_syn
-u8                              sysctl_tcp_migrate_req                       -                   -                   reuseport
-u8                              sysctl_tcp_comp_sack_nr                      -                   -                   __tcp_ack_snd_check
-int                             sysctl_tcp_reordering                        -                   read_mostly         tcp_may_raise_cwnd/tcp_cong_control
-u8                              sysctl_tcp_retries1                          -                   -                   
-u8                              sysctl_tcp_retries2                          -                   -                   
-u8                              sysctl_tcp_orphan_retries                    -                   -                   
-u8                              sysctl_tcp_tw_reuse                          -                   -                   timewait_sock_ops
-int                             sysctl_tcp_fin_timeout                       -                   -                   TCP_LAST_ACK/tcp_rcv_state_process
-unsigned_int                    sysctl_tcp_notsent_lowat                     read_mostly         -                   tcp_notsent_lowat/tcp_stream_memory_free
-u8                              sysctl_tcp_sack                              -                   -                   tcp_syn_options
-u8                              sysctl_tcp_window_scaling                    -                   -                   tcp_syn_options,tcp_parse_options
-u8                              sysctl_tcp_timestamps                                                                
-u8                              sysctl_tcp_early_retrans                     read_mostly         -                   tcp_schedule_loss_probe(tcp_write_xmit)
-u8                              sysctl_tcp_recovery                          -                   -                   tcp_fastretrans_alert
-u8                              sysctl_tcp_thin_linear_timeouts              -                   -                   tcp_retrans_timer(on_thin_streams)
-u8                              sysctl_tcp_slow_start_after_idle             -                   -                   unlikely(tcp_cwnd_validate-network-not-starved)
-u8                              sysctl_tcp_retrans_collapse                  -                   -                   
-u8                              sysctl_tcp_stdurg                            -                   -                   unlikely(tcp_check_urg)
-u8                              sysctl_tcp_rfc1337                           -                   -                   
-u8                              sysctl_tcp_abort_on_overflow                 -                   -                   
-u8                              sysctl_tcp_fack                              -                   -                   
-int                             sysctl_tcp_max_reordering                    -                   -                   tcp_check_sack_reordering
-int                             sysctl_tcp_adv_win_scale                     -                   -                   tcp_init_buffer_space
-u8                              sysctl_tcp_dsack                             -                   -                   partial_packet_or_retrans_in_tcp_data_queue
-u8                              sysctl_tcp_app_win                           -                   -                   tcp_win_from_space
-u8                              sysctl_tcp_frto                              -                   -                   tcp_enter_loss
-u8                              sysctl_tcp_nometrics_save                    -                   -                   TCP_LAST_ACK/tcp_update_metrics
-u8                              sysctl_tcp_no_ssthresh_metrics_save          -                   -                   TCP_LAST_ACK/tcp_(update/init)_metrics
+=============================== ============================================ =================== =================== =================================================
+struct_inet_timewait_death_row  tcp_death_row
+struct_udp_table*               udp_table
+struct_ctl_table_header*        forw_hdr
+struct_ctl_table_header*        frags_hdr
+struct_ctl_table_header*        ipv4_hdr
+struct_ctl_table_header*        route_hdr
+struct_ctl_table_header*        xfrm4_hdr
+struct_ipv4_devconf*            devconf_all
+struct_ipv4_devconf*            devconf_dflt
+struct_ip_ra_chain              ra_chain
+struct_mutex                    ra_mutex
+struct_fib_rules_ops*           rules_ops
+struct_fib_table                fib_main
+struct_fib_table                fib_default
+unsigned_int                    fib_rules_require_fldissect
+bool                            fib_has_custom_rules
+bool                            fib_has_custom_local_routes
+bool                            fib_offload_disabled
+atomic_t                        fib_num_tclassid_users
+struct_hlist_head*              fib_table_hash
+struct_sock*                    fibnl
+struct_sock*                    mc_autojoin_sk
+struct_inet_peer_base*          peers
+struct_fqdir*                   fqdir
+u8                              sysctl_icmp_echo_ignore_all
+u8                              sysctl_icmp_echo_enable_probe
+u8                              sysctl_icmp_echo_ignore_broadcasts
+u8                              sysctl_icmp_ignore_bogus_error_responses
+u8                              sysctl_icmp_errors_use_inbound_ifaddr
+int                             sysctl_icmp_ratelimit
+int                             sysctl_icmp_ratemask
+u32                             ip_rt_min_pmtu
+int                             ip_rt_mtu_expires
+int                             ip_rt_min_advmss
+struct_local_ports              ip_local_ports
+u8                              sysctl_tcp_ecn
+u8                              sysctl_tcp_ecn_fallback
+u8                              sysctl_ip_default_ttl                                                                ip4_dst_hoplimit/ip_select_ttl
+u8                              sysctl_ip_no_pmtu_disc
+u8                              sysctl_ip_fwd_use_pmtu                       read_mostly                             ip_dst_mtu_maybe_forward/ip_skb_dst_mtu
+u8                              sysctl_ip_fwd_update_priority                                                        ip_forward
+u8                              sysctl_ip_nonlocal_bind
+u8                              sysctl_ip_autobind_reuse
+u8                              sysctl_ip_dynaddr
+u8                              sysctl_ip_early_demux                                            read_mostly         ip(6)_rcv_finish_core
+u8                              sysctl_raw_l3mdev_accept
+u8                              sysctl_tcp_early_demux                                           read_mostly         ip(6)_rcv_finish_core
+u8                              sysctl_udp_early_demux
+u8                              sysctl_nexthop_compat_mode
+u8                              sysctl_fwmark_reflect
+u8                              sysctl_tcp_fwmark_accept
+u8                              sysctl_tcp_l3mdev_accept                                         read_mostly         __inet6_lookup_established/inet_request_bound_dev_if
+u8                              sysctl_tcp_mtu_probing
+int                             sysctl_tcp_mtu_probe_floor
+int                             sysctl_tcp_base_mss
+int                             sysctl_tcp_min_snd_mss                       read_mostly                             __tcp_mtu_to_mss(tcp_write_xmit)
+int                             sysctl_tcp_probe_threshold                                                           tcp_mtu_probe(tcp_write_xmit)
+u32                             sysctl_tcp_probe_interval                                                            tcp_mtu_check_reprobe(tcp_write_xmit)
+int                             sysctl_tcp_keepalive_time
+int                             sysctl_tcp_keepalive_intvl
+u8                              sysctl_tcp_keepalive_probes
+u8                              sysctl_tcp_syn_retries
+u8                              sysctl_tcp_synack_retries
+u8                              sysctl_tcp_syncookies                                                                generated_on_syn
+u8                              sysctl_tcp_migrate_req                                                               reuseport
+u8                              sysctl_tcp_comp_sack_nr                                                              __tcp_ack_snd_check
+int                             sysctl_tcp_reordering                                            read_mostly         tcp_may_raise_cwnd/tcp_cong_control
+u8                              sysctl_tcp_retries1
+u8                              sysctl_tcp_retries2
+u8                              sysctl_tcp_orphan_retries
+u8                              sysctl_tcp_tw_reuse                                                                  timewait_sock_ops
+unsigned_int                    sysctl_tcp_tw_reuse_delay                                                            timewait_sock_ops
+int                             sysctl_tcp_fin_timeout                                                               TCP_LAST_ACK/tcp_rcv_state_process
+unsigned_int                    sysctl_tcp_notsent_lowat                     read_mostly                             tcp_notsent_lowat/tcp_stream_memory_free
+u8                              sysctl_tcp_sack                                                                      tcp_syn_options
+u8                              sysctl_tcp_window_scaling                                                            tcp_syn_options,tcp_parse_options
+u8                              sysctl_tcp_timestamps
+u8                              sysctl_tcp_early_retrans                     read_mostly                             tcp_schedule_loss_probe(tcp_write_xmit)
+u32                             sysctl_tcp_rto_max_ms
+u8                              sysctl_tcp_recovery                                                                  tcp_fastretrans_alert
+u8                              sysctl_tcp_thin_linear_timeouts                                                      tcp_retrans_timer(on_thin_streams)
+u8                              sysctl_tcp_slow_start_after_idle                                                     unlikely(tcp_cwnd_validate-network-not-starved)
+u8                              sysctl_tcp_retrans_collapse
+u8                              sysctl_tcp_stdurg                                                                    unlikely(tcp_check_urg)
+u8                              sysctl_tcp_rfc1337
+u8                              sysctl_tcp_abort_on_overflow
+u8                              sysctl_tcp_fack
+int                             sysctl_tcp_max_reordering                                                            tcp_check_sack_reordering
+int                             sysctl_tcp_adv_win_scale                                                             tcp_init_buffer_space
+u8                              sysctl_tcp_dsack                                                                     partial_packet_or_retrans_in_tcp_data_queue
+u8                              sysctl_tcp_app_win                                                                   tcp_win_from_space
+u8                              sysctl_tcp_frto                                                                      tcp_enter_loss
+u8                              sysctl_tcp_nometrics_save                                                            TCP_LAST_ACK/tcp_update_metrics
+u8                              sysctl_tcp_no_ssthresh_metrics_save                                                  TCP_LAST_ACK/tcp_(update/init)_metrics
 u8                              sysctl_tcp_moderate_rcvbuf                   read_mostly         read_mostly         tcp_tso_should_defer(tx);tcp_rcv_space_adjust(rx)
-u8                              sysctl_tcp_tso_win_divisor                   read_mostly         -                   tcp_tso_should_defer(tcp_write_xmit)
-u8                              sysctl_tcp_workaround_signed_windows         -                   -                   tcp_select_window
-int                             sysctl_tcp_limit_output_bytes                read_mostly         -                   tcp_small_queue_check(tcp_write_xmit)
-int                             sysctl_tcp_challenge_ack_limit               -                   -                   
-int                             sysctl_tcp_min_rtt_wlen                      read_mostly         -                   tcp_ack_update_rtt
-u8                              sysctl_tcp_min_tso_segs                      -                   -                   unlikely(icsk_ca_ops-written)
-u8                              sysctl_tcp_tso_rtt_log                       read_mostly         -                   tcp_tso_autosize
-u8                              sysctl_tcp_autocorking                       read_mostly         -                   tcp_push/tcp_should_autocork
-u8                              sysctl_tcp_reflect_tos                       -                   -                   tcp_v(4/6)_send_synack
-int                             sysctl_tcp_invalid_ratelimit                 -                   -                   
-int                             sysctl_tcp_pacing_ss_ratio                   -                   -                   default_cong_cont(tcp_update_pacing_rate)
-int                             sysctl_tcp_pacing_ca_ratio                   -                   -                   default_cong_cont(tcp_update_pacing_rate)
-int                             sysctl_tcp_wmem[3]                           read_mostly         -                   tcp_wmem_schedule(sendmsg/sendpage)
-int                             sysctl_tcp_rmem[3]                           -                   read_mostly         __tcp_grow_window(tx),tcp_rcv_space_adjust(rx)
-unsigned_int                    sysctl_tcp_child_ehash_entries                                                       
-unsigned_long                   sysctl_tcp_comp_sack_delay_ns                -                   -                   __tcp_ack_snd_check
-unsigned_long                   sysctl_tcp_comp_sack_slack_ns                -                   -                   __tcp_ack_snd_check
-int                             sysctl_max_syn_backlog                       -                   -                   
-int                             sysctl_tcp_fastopen                          -                   -                   
-struct_tcp_congestion_ops       tcp_congestion_control                       -                   -                   init_cc
-struct_tcp_fastopen_context     tcp_fastopen_ctx                             -                   -                   
-unsigned_int                    sysctl_tcp_fastopen_blackhole_timeout        -                   -                   
-atomic_t                        tfo_active_disable_times                     -                   -                   
-unsigned_long                   tfo_active_disable_stamp                     -                   -                   
-u32                             tcp_challenge_timestamp                      -                   -                   
-u32                             tcp_challenge_count                          -                   -                   
-u8                              sysctl_tcp_plb_enabled                       -                   -                   
-u8                              sysctl_tcp_plb_idle_rehash_rounds            -                   -                   
-u8                              sysctl_tcp_plb_rehash_rounds                 -                   -                   
-u8                              sysctl_tcp_plb_suspend_rto_sec               -                   -                   
-int                             sysctl_tcp_plb_cong_thresh                   -                   -                   
-int                             sysctl_udp_wmem_min                                                                  
-int                             sysctl_udp_rmem_min                                                                  
-u8                              sysctl_fib_notify_on_flag_change                                                     
-u8                              sysctl_udp_l3mdev_accept                                                             
-u8                              sysctl_igmp_llm_reports                                                              
-int                             sysctl_igmp_max_memberships                                                          
-int                             sysctl_igmp_max_msf                                                                  
-int                             sysctl_igmp_qrv                                                                      
-struct_ping_group_range         ping_group_range                                                                     
-atomic_t                        dev_addr_genid                                                                       
-unsigned_int                    sysctl_udp_child_hash_entries                                                        
-unsigned_long*                  sysctl_local_reserved_ports                                                          
-int                             sysctl_ip_prot_sock                                                                  
-struct_mr_table*                mrt                                                                                  
-struct_list_head                mr_tables                                                                            
-struct_fib_rules_ops*           mr_rules_ops                                                                         
-u32                             sysctl_fib_multipath_hash_fields                                                     
-u8                              sysctl_fib_multipath_use_neigh                                                       
-u8                              sysctl_fib_multipath_hash_policy                                                     
-struct_fib_notifier_ops*        notifier_ops                                                                         
-unsigned_int                    fib_seq                                                                              
-struct_fib_notifier_ops*        ipmr_notifier_ops                                                                    
-unsigned_int                    ipmr_seq                                                                             
-atomic_t                        rt_genid                                                                             
-siphash_key_t                   ip_id_key                                                                                      
+u8                              sysctl_tcp_tso_win_divisor                   read_mostly                             tcp_tso_should_defer(tcp_write_xmit)
+u8                              sysctl_tcp_workaround_signed_windows                                                 tcp_select_window
+int                             sysctl_tcp_limit_output_bytes                read_mostly                             tcp_small_queue_check(tcp_write_xmit)
+int                             sysctl_tcp_challenge_ack_limit
+int                             sysctl_tcp_min_rtt_wlen                      read_mostly                             tcp_ack_update_rtt
+u8                              sysctl_tcp_min_tso_segs                                                              unlikely(icsk_ca_ops-written)
+u8                              sysctl_tcp_tso_rtt_log                       read_mostly                             tcp_tso_autosize
+u8                              sysctl_tcp_autocorking                       read_mostly                             tcp_push/tcp_should_autocork
+u8                              sysctl_tcp_reflect_tos                                                               tcp_v(4/6)_send_synack
+int                             sysctl_tcp_invalid_ratelimit
+int                             sysctl_tcp_pacing_ss_ratio                                                           default_cong_cont(tcp_update_pacing_rate)
+int                             sysctl_tcp_pacing_ca_ratio                                                           default_cong_cont(tcp_update_pacing_rate)
+int                             sysctl_tcp_wmem[3]                           read_mostly                             tcp_wmem_schedule(sendmsg/sendpage)
+int                             sysctl_tcp_rmem[3]                                               read_mostly         __tcp_grow_window(tx),tcp_rcv_space_adjust(rx)
+unsigned_int                    sysctl_tcp_child_ehash_entries
+unsigned_long                   sysctl_tcp_comp_sack_delay_ns                                                        __tcp_ack_snd_check
+unsigned_long                   sysctl_tcp_comp_sack_slack_ns                                                        __tcp_ack_snd_check
+int                             sysctl_max_syn_backlog
+int                             sysctl_tcp_fastopen
+struct_tcp_congestion_ops       tcp_congestion_control                                                               init_cc
+struct_tcp_fastopen_context     tcp_fastopen_ctx
+unsigned_int                    sysctl_tcp_fastopen_blackhole_timeout
+atomic_t                        tfo_active_disable_times
+unsigned_long                   tfo_active_disable_stamp
+u32                             tcp_challenge_timestamp
+u32                             tcp_challenge_count
+u8                              sysctl_tcp_plb_enabled
+u8                              sysctl_tcp_plb_idle_rehash_rounds
+u8                              sysctl_tcp_plb_rehash_rounds
+u8                              sysctl_tcp_plb_suspend_rto_sec
+int                             sysctl_tcp_plb_cong_thresh
+int                             sysctl_udp_wmem_min
+int                             sysctl_udp_rmem_min
+u8                              sysctl_fib_notify_on_flag_change
+u8                              sysctl_udp_l3mdev_accept
+u8                              sysctl_igmp_llm_reports
+int                             sysctl_igmp_max_memberships
+int                             sysctl_igmp_max_msf
+int                             sysctl_igmp_qrv
+struct_ping_group_range         ping_group_range
+atomic_t                        dev_addr_genid
+unsigned_int                    sysctl_udp_child_hash_entries
+unsigned_long*                  sysctl_local_reserved_ports
+int                             sysctl_ip_prot_sock
+struct_mr_table*                mrt
+struct_list_head                mr_tables
+struct_fib_rules_ops*           mr_rules_ops
+u32                             sysctl_fib_multipath_hash_fields
+u8                              sysctl_fib_multipath_use_neigh
+u8                              sysctl_fib_multipath_hash_policy
+struct_fib_notifier_ops*        notifier_ops
+unsigned_int                    fib_seq
+struct_fib_notifier_ops*        ipmr_notifier_ops
+unsigned_int                    ipmr_seq
+atomic_t                        rt_genid
+siphash_key_t                   ip_id_key
+=============================== ============================================ =================== =================== =================================================
diff --git a/Documentation/networking/net_cachelines/snmp.rst b/Documentation/networking/net_cachelines/snmp.rst
index 6a071538566c..bce4eb35ec48 100644
--- a/Documentation/networking/net_cachelines/snmp.rst
+++ b/Documentation/networking/net_cachelines/snmp.rst
@@ -5,131 +5,137 @@
 netns_ipv4 enum fast path usage breakdown
 ===========================================
 
+============== ===================================== =================== =================== ==================================================
 Type           Name                                  fastpath_tx_access  fastpath_rx_access  comment
-..enum                                                                                       
-unsigned_long  LINUX_MIB_TCPKEEPALIVE                write_mostly        -                   tcp_keepalive_timer
-unsigned_long  LINUX_MIB_DELAYEDACKS                 write_mostly        -                   tcp_delack_timer_handler,tcp_delack_timer
-unsigned_long  LINUX_MIB_DELAYEDACKLOCKED            write_mostly        -                   tcp_delack_timer_handler,tcp_delack_timer
-unsigned_long  LINUX_MIB_TCPAUTOCORKING              write_mostly        -                   tcp_push,tcp_sendmsg_locked
-unsigned_long  LINUX_MIB_TCPFROMZEROWINDOWADV        write_mostly        -                   tcp_select_window,tcp_transmit-skb
-unsigned_long  LINUX_MIB_TCPTOZEROWINDOWADV          write_mostly        -                   tcp_select_window,tcp_transmit-skb
-unsigned_long  LINUX_MIB_TCPWANTZEROWINDOWADV        write_mostly        -                   tcp_select_window,tcp_transmit-skb
-unsigned_long  LINUX_MIB_TCPORIGDATASENT             write_mostly        -                   tcp_write_xmit
-unsigned_long  LINUX_MIB_TCPHPHITS                   -                   write_mostly        tcp_rcv_established,tcp_v4_do_rcv,tcp_v6_do_rcv
-unsigned_long  LINUX_MIB_TCPRCVCOALESCE              -                   write_mostly        tcp_try_coalesce,tcp_queue_rcv,tcp_rcv_established
-unsigned_long  LINUX_MIB_TCPPUREACKS                 -                   write_mostly        tcp_ack,tcp_rcv_established
-unsigned_long  LINUX_MIB_TCPHPACKS                   -                   write_mostly        tcp_ack,tcp_rcv_established
-unsigned_long  LINUX_MIB_TCPDELIVERED                -                   write_mostly        tcp_newly_delivered,tcp_ack,tcp_rcv_established
-unsigned_long  LINUX_MIB_SYNCOOKIESSENT                                                      
-unsigned_long  LINUX_MIB_SYNCOOKIESRECV                                                      
-unsigned_long  LINUX_MIB_SYNCOOKIESFAILED                                                    
-unsigned_long  LINUX_MIB_EMBRYONICRSTS                                                       
-unsigned_long  LINUX_MIB_PRUNECALLED                                                         
-unsigned_long  LINUX_MIB_RCVPRUNED                                                           
-unsigned_long  LINUX_MIB_OFOPRUNED                                                           
-unsigned_long  LINUX_MIB_OUTOFWINDOWICMPS                                                    
-unsigned_long  LINUX_MIB_LOCKDROPPEDICMPS                                                    
-unsigned_long  LINUX_MIB_ARPFILTER                                                           
-unsigned_long  LINUX_MIB_TIMEWAITED                                                          
-unsigned_long  LINUX_MIB_TIMEWAITRECYCLED                                                    
-unsigned_long  LINUX_MIB_TIMEWAITKILLED                                                      
-unsigned_long  LINUX_MIB_PAWSACTIVEREJECTED                                                  
-unsigned_long  LINUX_MIB_PAWSESTABREJECTED                                                   
-unsigned_long  LINUX_MIB_DELAYEDACKLOST                                                      
-unsigned_long  LINUX_MIB_LISTENOVERFLOWS                                                     
-unsigned_long  LINUX_MIB_LISTENDROPS                                                         
-unsigned_long  LINUX_MIB_TCPRENORECOVERY                                                     
-unsigned_long  LINUX_MIB_TCPSACKRECOVERY                                                     
-unsigned_long  LINUX_MIB_TCPSACKRENEGING                                                     
-unsigned_long  LINUX_MIB_TCPSACKREORDER                                                      
-unsigned_long  LINUX_MIB_TCPRENOREORDER                                                      
-unsigned_long  LINUX_MIB_TCPTSREORDER                                                        
-unsigned_long  LINUX_MIB_TCPFULLUNDO                                                         
-unsigned_long  LINUX_MIB_TCPPARTIALUNDO                                                      
-unsigned_long  LINUX_MIB_TCPDSACKUNDO                                                        
-unsigned_long  LINUX_MIB_TCPLOSSUNDO                                                         
-unsigned_long  LINUX_MIB_TCPLOSTRETRANSMIT                                                   
-unsigned_long  LINUX_MIB_TCPRENOFAILURES                                                     
-unsigned_long  LINUX_MIB_TCPSACKFAILURES                                                     
-unsigned_long  LINUX_MIB_TCPLOSSFAILURES                                                     
-unsigned_long  LINUX_MIB_TCPFASTRETRANS                                                      
-unsigned_long  LINUX_MIB_TCPSLOWSTARTRETRANS                                                 
-unsigned_long  LINUX_MIB_TCPTIMEOUTS                                                         
-unsigned_long  LINUX_MIB_TCPLOSSPROBES                                                       
-unsigned_long  LINUX_MIB_TCPLOSSPROBERECOVERY                                                
-unsigned_long  LINUX_MIB_TCPRENORECOVERYFAIL                                                 
-unsigned_long  LINUX_MIB_TCPSACKRECOVERYFAIL                                                 
-unsigned_long  LINUX_MIB_TCPRCVCOLLAPSED                                                     
-unsigned_long  LINUX_MIB_TCPDSACKOLDSENT                                                     
-unsigned_long  LINUX_MIB_TCPDSACKOFOSENT                                                     
-unsigned_long  LINUX_MIB_TCPDSACKRECV                                                        
-unsigned_long  LINUX_MIB_TCPDSACKOFORECV                                                     
-unsigned_long  LINUX_MIB_TCPABORTONDATA                                                      
-unsigned_long  LINUX_MIB_TCPABORTONCLOSE                                                     
-unsigned_long  LINUX_MIB_TCPABORTONMEMORY                                                    
-unsigned_long  LINUX_MIB_TCPABORTONTIMEOUT                                                   
-unsigned_long  LINUX_MIB_TCPABORTONLINGER                                                    
-unsigned_long  LINUX_MIB_TCPABORTFAILED                                                      
-unsigned_long  LINUX_MIB_TCPMEMORYPRESSURES                                                  
-unsigned_long  LINUX_MIB_TCPMEMORYPRESSURESCHRONO                                            
-unsigned_long  LINUX_MIB_TCPSACKDISCARD                                                      
-unsigned_long  LINUX_MIB_TCPDSACKIGNOREDOLD                                                  
-unsigned_long  LINUX_MIB_TCPDSACKIGNOREDNOUNDO                                               
-unsigned_long  LINUX_MIB_TCPSPURIOUSRTOS                                                     
-unsigned_long  LINUX_MIB_TCPMD5NOTFOUND                                                      
-unsigned_long  LINUX_MIB_TCPMD5UNEXPECTED                                                    
-unsigned_long  LINUX_MIB_TCPMD5FAILURE                                                       
-unsigned_long  LINUX_MIB_SACKSHIFTED                                                         
-unsigned_long  LINUX_MIB_SACKMERGED                                                          
-unsigned_long  LINUX_MIB_SACKSHIFTFALLBACK                                                   
-unsigned_long  LINUX_MIB_TCPBACKLOGDROP                                                      
-unsigned_long  LINUX_MIB_PFMEMALLOCDROP                                                      
-unsigned_long  LINUX_MIB_TCPMINTTLDROP                                                       
-unsigned_long  LINUX_MIB_TCPDEFERACCEPTDROP                                                  
-unsigned_long  LINUX_MIB_IPRPFILTER                                                          
-unsigned_long  LINUX_MIB_TCPTIMEWAITOVERFLOW                                                 
-unsigned_long  LINUX_MIB_TCPREQQFULLDOCOOKIES                                                
-unsigned_long  LINUX_MIB_TCPREQQFULLDROP                                                     
-unsigned_long  LINUX_MIB_TCPRETRANSFAIL                                                      
-unsigned_long  LINUX_MIB_TCPBACKLOGCOALESCE                                                  
-unsigned_long  LINUX_MIB_TCPOFOQUEUE                                                         
-unsigned_long  LINUX_MIB_TCPOFODROP                                                          
-unsigned_long  LINUX_MIB_TCPOFOMERGE                                                         
-unsigned_long  LINUX_MIB_TCPCHALLENGEACK                                                     
-unsigned_long  LINUX_MIB_TCPSYNCHALLENGE                                                     
-unsigned_long  LINUX_MIB_TCPFASTOPENACTIVE                                                   
-unsigned_long  LINUX_MIB_TCPFASTOPENACTIVEFAIL                                               
-unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVE                                                  
-unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEFAIL                                              
-unsigned_long  LINUX_MIB_TCPFASTOPENLISTENOVERFLOW                                           
-unsigned_long  LINUX_MIB_TCPFASTOPENCOOKIEREQD                                               
-unsigned_long  LINUX_MIB_TCPFASTOPENBLACKHOLE                                                
-unsigned_long  LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES                                          
-unsigned_long  LINUX_MIB_BUSYPOLLRXPACKETS                                                   
-unsigned_long  LINUX_MIB_TCPSYNRETRANS                                                       
-unsigned_long  LINUX_MIB_TCPHYSTARTTRAINDETECT                                               
-unsigned_long  LINUX_MIB_TCPHYSTARTTRAINCWND                                                 
-unsigned_long  LINUX_MIB_TCPHYSTARTDELAYDETECT                                               
-unsigned_long  LINUX_MIB_TCPHYSTARTDELAYCWND                                                 
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDSYNRECV                                                
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDPAWS                                                   
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDSEQ                                                    
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDFINWAIT2                                               
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDTIMEWAIT                                               
-unsigned_long  LINUX_MIB_TCPACKSKIPPEDCHALLENGE                                              
-unsigned_long  LINUX_MIB_TCPWINPROBE                                                         
-unsigned_long  LINUX_MIB_TCPMTUPFAIL                                                         
-unsigned_long  LINUX_MIB_TCPMTUPSUCCESS                                                      
-unsigned_long  LINUX_MIB_TCPDELIVEREDCE                                                      
-unsigned_long  LINUX_MIB_TCPACKCOMPRESSED                                                    
-unsigned_long  LINUX_MIB_TCPZEROWINDOWDROP                                                   
-unsigned_long  LINUX_MIB_TCPRCVQDROP                                                         
-unsigned_long  LINUX_MIB_TCPWQUEUETOOBIG                                                     
-unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEALTKEY                                            
-unsigned_long  LINUX_MIB_TCPTIMEOUTREHASH                                                    
-unsigned_long  LINUX_MIB_TCPDUPLICATEDATAREHASH                                              
-unsigned_long  LINUX_MIB_TCPDSACKRECVSEGS                                                    
-unsigned_long  LINUX_MIB_TCPDSACKIGNOREDDUBIOUS                                              
-unsigned_long  LINUX_MIB_TCPMIGRATEREQSUCCESS                                                
-unsigned_long  LINUX_MIB_TCPMIGRATEREQFAILURE                                                
-unsigned_long  __LINUX_MIB_MAX                                                               
+============== ===================================== =================== =================== ==================================================
+unsigned_long  LINUX_MIB_TCPKEEPALIVE                write_mostly                            tcp_keepalive_timer
+unsigned_long  LINUX_MIB_DELAYEDACKS                 write_mostly                            tcp_delack_timer_handler,tcp_delack_timer
+unsigned_long  LINUX_MIB_DELAYEDACKLOCKED            write_mostly                            tcp_delack_timer_handler,tcp_delack_timer
+unsigned_long  LINUX_MIB_TCPAUTOCORKING              write_mostly                            tcp_push,tcp_sendmsg_locked
+unsigned_long  LINUX_MIB_TCPFROMZEROWINDOWADV        write_mostly                            tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPTOZEROWINDOWADV          write_mostly                            tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPWANTZEROWINDOWADV        write_mostly                            tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPORIGDATASENT             write_mostly                            tcp_write_xmit
+unsigned_long  LINUX_MIB_TCPHPHITS                                       write_mostly        tcp_rcv_established,tcp_v4_do_rcv,tcp_v6_do_rcv
+unsigned_long  LINUX_MIB_TCPRCVCOALESCE                                  write_mostly        tcp_try_coalesce,tcp_queue_rcv,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPPUREACKS                                     write_mostly        tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPHPACKS                                       write_mostly        tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPDELIVERED                                    write_mostly        tcp_newly_delivered,tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_SYNCOOKIESSENT
+unsigned_long  LINUX_MIB_SYNCOOKIESRECV
+unsigned_long  LINUX_MIB_SYNCOOKIESFAILED
+unsigned_long  LINUX_MIB_EMBRYONICRSTS
+unsigned_long  LINUX_MIB_PRUNECALLED
+unsigned_long  LINUX_MIB_RCVPRUNED
+unsigned_long  LINUX_MIB_OFOPRUNED
+unsigned_long  LINUX_MIB_OUTOFWINDOWICMPS
+unsigned_long  LINUX_MIB_LOCKDROPPEDICMPS
+unsigned_long  LINUX_MIB_ARPFILTER
+unsigned_long  LINUX_MIB_TIMEWAITED
+unsigned_long  LINUX_MIB_TIMEWAITRECYCLED
+unsigned_long  LINUX_MIB_TIMEWAITKILLED
+unsigned_long  LINUX_MIB_PAWSACTIVEREJECTED
+unsigned_long  LINUX_MIB_PAWSESTABREJECTED
+unsigned_long  LINUX_MIB_BEYOND_WINDOW
+unsigned_long  LINUX_MIB_TSECR_REJECTED
+unsigned_long  LINUX_MIB_PAWS_OLD_ACK
+unsigned_long  LINUX_MIB_PAWS_TW_REJECTED
+unsigned_long  LINUX_MIB_DELAYEDACKLOST
+unsigned_long  LINUX_MIB_LISTENOVERFLOWS
+unsigned_long  LINUX_MIB_LISTENDROPS
+unsigned_long  LINUX_MIB_TCPRENORECOVERY
+unsigned_long  LINUX_MIB_TCPSACKRECOVERY
+unsigned_long  LINUX_MIB_TCPSACKRENEGING
+unsigned_long  LINUX_MIB_TCPSACKREORDER
+unsigned_long  LINUX_MIB_TCPRENOREORDER
+unsigned_long  LINUX_MIB_TCPTSREORDER
+unsigned_long  LINUX_MIB_TCPFULLUNDO
+unsigned_long  LINUX_MIB_TCPPARTIALUNDO
+unsigned_long  LINUX_MIB_TCPDSACKUNDO
+unsigned_long  LINUX_MIB_TCPLOSSUNDO
+unsigned_long  LINUX_MIB_TCPLOSTRETRANSMIT
+unsigned_long  LINUX_MIB_TCPRENOFAILURES
+unsigned_long  LINUX_MIB_TCPSACKFAILURES
+unsigned_long  LINUX_MIB_TCPLOSSFAILURES
+unsigned_long  LINUX_MIB_TCPFASTRETRANS
+unsigned_long  LINUX_MIB_TCPSLOWSTARTRETRANS
+unsigned_long  LINUX_MIB_TCPTIMEOUTS
+unsigned_long  LINUX_MIB_TCPLOSSPROBES
+unsigned_long  LINUX_MIB_TCPLOSSPROBERECOVERY
+unsigned_long  LINUX_MIB_TCPRENORECOVERYFAIL
+unsigned_long  LINUX_MIB_TCPSACKRECOVERYFAIL
+unsigned_long  LINUX_MIB_TCPRCVCOLLAPSED
+unsigned_long  LINUX_MIB_TCPDSACKOLDSENT
+unsigned_long  LINUX_MIB_TCPDSACKOFOSENT
+unsigned_long  LINUX_MIB_TCPDSACKRECV
+unsigned_long  LINUX_MIB_TCPDSACKOFORECV
+unsigned_long  LINUX_MIB_TCPABORTONDATA
+unsigned_long  LINUX_MIB_TCPABORTONCLOSE
+unsigned_long  LINUX_MIB_TCPABORTONMEMORY
+unsigned_long  LINUX_MIB_TCPABORTONTIMEOUT
+unsigned_long  LINUX_MIB_TCPABORTONLINGER
+unsigned_long  LINUX_MIB_TCPABORTFAILED
+unsigned_long  LINUX_MIB_TCPMEMORYPRESSURES
+unsigned_long  LINUX_MIB_TCPMEMORYPRESSURESCHRONO
+unsigned_long  LINUX_MIB_TCPSACKDISCARD
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDOLD
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDNOUNDO
+unsigned_long  LINUX_MIB_TCPSPURIOUSRTOS
+unsigned_long  LINUX_MIB_TCPMD5NOTFOUND
+unsigned_long  LINUX_MIB_TCPMD5UNEXPECTED
+unsigned_long  LINUX_MIB_TCPMD5FAILURE
+unsigned_long  LINUX_MIB_SACKSHIFTED
+unsigned_long  LINUX_MIB_SACKMERGED
+unsigned_long  LINUX_MIB_SACKSHIFTFALLBACK
+unsigned_long  LINUX_MIB_TCPBACKLOGDROP
+unsigned_long  LINUX_MIB_PFMEMALLOCDROP
+unsigned_long  LINUX_MIB_TCPMINTTLDROP
+unsigned_long  LINUX_MIB_TCPDEFERACCEPTDROP
+unsigned_long  LINUX_MIB_IPRPFILTER
+unsigned_long  LINUX_MIB_TCPTIMEWAITOVERFLOW
+unsigned_long  LINUX_MIB_TCPREQQFULLDOCOOKIES
+unsigned_long  LINUX_MIB_TCPREQQFULLDROP
+unsigned_long  LINUX_MIB_TCPRETRANSFAIL
+unsigned_long  LINUX_MIB_TCPBACKLOGCOALESCE
+unsigned_long  LINUX_MIB_TCPOFOQUEUE
+unsigned_long  LINUX_MIB_TCPOFODROP
+unsigned_long  LINUX_MIB_TCPOFOMERGE
+unsigned_long  LINUX_MIB_TCPCHALLENGEACK
+unsigned_long  LINUX_MIB_TCPSYNCHALLENGE
+unsigned_long  LINUX_MIB_TCPFASTOPENACTIVE
+unsigned_long  LINUX_MIB_TCPFASTOPENACTIVEFAIL
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVE
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEFAIL
+unsigned_long  LINUX_MIB_TCPFASTOPENLISTENOVERFLOW
+unsigned_long  LINUX_MIB_TCPFASTOPENCOOKIEREQD
+unsigned_long  LINUX_MIB_TCPFASTOPENBLACKHOLE
+unsigned_long  LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES
+unsigned_long  LINUX_MIB_BUSYPOLLRXPACKETS
+unsigned_long  LINUX_MIB_TCPSYNRETRANS
+unsigned_long  LINUX_MIB_TCPHYSTARTTRAINDETECT
+unsigned_long  LINUX_MIB_TCPHYSTARTTRAINCWND
+unsigned_long  LINUX_MIB_TCPHYSTARTDELAYDETECT
+unsigned_long  LINUX_MIB_TCPHYSTARTDELAYCWND
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDSYNRECV
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDPAWS
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDSEQ
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDFINWAIT2
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDTIMEWAIT
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDCHALLENGE
+unsigned_long  LINUX_MIB_TCPWINPROBE
+unsigned_long  LINUX_MIB_TCPMTUPFAIL
+unsigned_long  LINUX_MIB_TCPMTUPSUCCESS
+unsigned_long  LINUX_MIB_TCPDELIVEREDCE
+unsigned_long  LINUX_MIB_TCPACKCOMPRESSED
+unsigned_long  LINUX_MIB_TCPZEROWINDOWDROP
+unsigned_long  LINUX_MIB_TCPRCVQDROP
+unsigned_long  LINUX_MIB_TCPWQUEUETOOBIG
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEALTKEY
+unsigned_long  LINUX_MIB_TCPTIMEOUTREHASH
+unsigned_long  LINUX_MIB_TCPDUPLICATEDATAREHASH
+unsigned_long  LINUX_MIB_TCPDSACKRECVSEGS
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDDUBIOUS
+unsigned_long  LINUX_MIB_TCPMIGRATEREQSUCCESS
+unsigned_long  LINUX_MIB_TCPMIGRATEREQFAILURE
+unsigned_long  __LINUX_MIB_MAX
+============== ===================================== =================== =================== ==================================================
diff --git a/Documentation/networking/net_cachelines/tcp_sock.rst b/Documentation/networking/net_cachelines/tcp_sock.rst
index 1c154cbd1848..7bbda5944ee2 100644
--- a/Documentation/networking/net_cachelines/tcp_sock.rst
+++ b/Documentation/networking/net_cachelines/tcp_sock.rst
@@ -5,153 +5,154 @@
 tcp_sock struct fast path usage breakdown
 =========================================
 
+============================= ======================= =================== =================== ==================================================================================================================================================================================================================
 Type                          Name                    fastpath_tx_access  fastpath_rx_access  Comments
-..struct                      ..tcp_sock                                                        
-struct_inet_connection_sock   inet_conn                                                       
+============================= ======================= =================== =================== ==================================================================================================================================================================================================================
+struct inet_connection_sock   inet_conn
 u16                           tcp_header_len          read_mostly         read_mostly         tcp_bound_to_half_wnd,tcp_current_mss(tx);tcp_rcv_established(rx)
-u16                           gso_segs                read_mostly         -                   tcp_xmit_size_goal
+u16                           gso_segs                read_mostly                             tcp_xmit_size_goal
 __be32                        pred_flags              read_write          read_mostly         tcp_select_window(tx);tcp_rcv_established(rx)
-u64                           bytes_received          -                   read_write          tcp_rcv_nxt_update(rx)
-u32                           segs_in                 -                   read_write          tcp_v6_rcv(rx)
-u32                           data_segs_in            -                   read_write          tcp_v6_rcv(rx)
+u64                           bytes_received                              read_write          tcp_rcv_nxt_update(rx)
+u32                           segs_in                                     read_write          tcp_v6_rcv(rx)
+u32                           data_segs_in                                read_write          tcp_v6_rcv(rx)
 u32                           rcv_nxt                 read_mostly         read_write          tcp_cleanup_rbuf,tcp_send_ack,tcp_inq_hint,tcp_transmit_skb,tcp_receive_window(tx);tcp_v6_do_rcv,tcp_rcv_established,tcp_data_queue,tcp_receive_window,tcp_rcv_nxt_update(write)(rx)
-u32                           copied_seq              -                   read_mostly         tcp_cleanup_rbuf,tcp_rcv_space_adjust,tcp_inq_hint
-u32                           rcv_wup                 -                   read_write          __tcp_cleanup_rbuf,tcp_receive_window,tcp_receive_established
+u32                           copied_seq                                  read_mostly         tcp_cleanup_rbuf,tcp_rcv_space_adjust,tcp_inq_hint
+u32                           rcv_wup                                     read_write          __tcp_cleanup_rbuf,tcp_receive_window,tcp_receive_established
 u32                           snd_nxt                 read_write          read_mostly         tcp_rate_check_app_limited,__tcp_transmit_skb,tcp_event_new_data_sent(write)(tx);tcp_rcv_established,tcp_ack,tcp_clean_rtx_queue(rx)
-u32                           segs_out                read_write          -                   __tcp_transmit_skb
-u32                           data_segs_out           read_write          -                   __tcp_transmit_skb,tcp_update_skb_after_send
-u64                           bytes_sent              read_write          -                   __tcp_transmit_skb
-u64                           bytes_acked             -                   read_write          tcp_snd_una_update/tcp_ack
-u32                           dsack_dups                                                      
+u32                           segs_out                read_write                              __tcp_transmit_skb
+u32                           data_segs_out           read_write                              __tcp_transmit_skb,tcp_update_skb_after_send
+u64                           bytes_sent              read_write                              __tcp_transmit_skb
+u64                           bytes_acked                                 read_write          tcp_snd_una_update/tcp_ack
+u32                           dsack_dups
 u32                           snd_una                 read_mostly         read_write          tcp_wnd_end,tcp_urg_mode,tcp_minshall_check,tcp_cwnd_validate(tx);tcp_ack,tcp_may_update_window,tcp_clean_rtx_queue(write),tcp_ack_tstamp(rx)
-u32                           snd_sml                 read_write          -                   tcp_minshall_check,tcp_minshall_update
-u32                           rcv_tstamp              -                   read_mostly         tcp_ack
-u32                           lsndtime                read_write          -                   tcp_slow_start_after_idle_check,tcp_event_data_sent
-u32                           last_oow_ack_time                                               
-u32                           compressed_ack_rcv_nxt                                          
+u32                           snd_sml                 read_write                              tcp_minshall_check,tcp_minshall_update
+u32                           rcv_tstamp                                  read_mostly         tcp_ack
+void *                        tcp_clean_acked                             read_mostly         tcp_ack
+u32                           lsndtime                read_write                              tcp_slow_start_after_idle_check,tcp_event_data_sent
+u32                           last_oow_ack_time
+u32                           compressed_ack_rcv_nxt
 u32                           tsoffset                read_mostly         read_mostly         tcp_established_options(tx);tcp_fast_parse_options(rx)
-struct_list_head              tsq_node                -                   -                   
-struct_list_head              tsorted_sent_queue      read_write          -                   tcp_update_skb_after_send
-u32                           snd_wl1                 -                   read_mostly         tcp_may_update_window
+struct list_head              tsq_node
+struct list_head              tsorted_sent_queue      read_write                              tcp_update_skb_after_send
+u32                           snd_wl1                                     read_mostly         tcp_may_update_window
 u32                           snd_wnd                 read_mostly         read_mostly         tcp_wnd_end,tcp_tso_should_defer(tx);tcp_fast_path_on(rx)
-u32                           max_window              read_mostly         -                   tcp_bound_to_half_wnd,forced_push
+u32                           max_window              read_mostly                             tcp_bound_to_half_wnd,forced_push
 u32                           mss_cache               read_mostly         read_mostly         tcp_rate_check_app_limited,tcp_current_mss,tcp_sync_mss,tcp_sndbuf_expand,tcp_tso_should_defer(tx);tcp_update_pacing_rate,tcp_clean_rtx_queue(rx)
 u32                           window_clamp            read_mostly         read_write          tcp_rcv_space_adjust,__tcp_select_window
-u32                           rcv_ssthresh            read_mostly         -                   __tcp_select_window
+u32                           rcv_ssthresh            read_mostly                             __tcp_select_window
 u8                            scaling_ratio           read_mostly         read_mostly         tcp_win_from_space
-struct                        tcp_rack                                                        
-u16                           advmss                  -                   read_mostly         tcp_rcv_space_adjust
-u8                            compressed_ack                                                  
-u8:2                          dup_ack_counter                                                 
-u8:1                          tlp_retrans                                                     
+struct                        tcp_rack
+u16                           advmss                                      read_mostly         tcp_rcv_space_adjust
+u8                            compressed_ack
+u8:2                          dup_ack_counter
+u8:1                          tlp_retrans
 u8:1                          tcp_usec_ts             read_mostly         read_mostly
-u32                           chrono_start            read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
-u32[3]                        chrono_stat             read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
-u8:2                          chrono_type             read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
-u8:1                          rate_app_limited        -                   read_write          tcp_rate_gen
-u8:1                          fastopen_connect                                                
-u8:1                          fastopen_no_cookie                                              
-u8:1                          is_sack_reneg           -                   read_mostly         tcp_skb_entail,tcp_ack
-u8:2                          fastopen_client_fail                                            
-u8:4                          nonagle                 read_write          -                   tcp_skb_entail,tcp_push_pending_frames
-u8:1                          thin_lto                                                        
-u8:1                          recvmsg_inq                                                     
-u8:1                          repair                  read_mostly         -                   tcp_write_xmit
-u8:1                          frto                                                            
-u8                            repair_queue            -                   -                   
-u8:2                          save_syn                                                        
-u8:1                          syn_data                                                        
-u8:1                          syn_fastopen                                                    
-u8:1                          syn_fastopen_exp                                                
-u8:1                          syn_fastopen_ch                                                 
-u8:1                          syn_data_acked                                                  
-u8:1                          is_cwnd_limited         read_mostly         -                   tcp_cwnd_validate,tcp_is_cwnd_limited
-u32                           tlp_high_seq            -                   read_mostly         tcp_ack
-u32                           tcp_tx_delay                                                    
-u64                           tcp_wstamp_ns           read_write          -                   tcp_pacing_check,tcp_tso_should_defer,tcp_update_skb_after_send
+u32                           chrono_start            read_write                              tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u32[3]                        chrono_stat             read_write                              tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u8:2                          chrono_type             read_write                              tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u8:1                          rate_app_limited                            read_write          tcp_rate_gen
+u8:1                          fastopen_connect
+u8:1                          fastopen_no_cookie
+u8:1                          is_sack_reneg                               read_mostly         tcp_skb_entail,tcp_ack
+u8:2                          fastopen_client_fail
+u8:4                          nonagle                 read_write                              tcp_skb_entail,tcp_push_pending_frames
+u8:1                          thin_lto
+u8:1                          recvmsg_inq
+u8:1                          repair                  read_mostly                             tcp_write_xmit
+u8:1                          frto
+u8                            repair_queue
+u8:2                          save_syn
+u8:1                          syn_data
+u8:1                          syn_fastopen
+u8:1                          syn_fastopen_exp
+u8:1                          syn_fastopen_ch
+u8:1                          syn_data_acked
+u8:1                          is_cwnd_limited         read_mostly                             tcp_cwnd_validate,tcp_is_cwnd_limited
+u32                           tlp_high_seq                                read_mostly         tcp_ack
+u32                           tcp_tx_delay
+u64                           tcp_wstamp_ns           read_write                              tcp_pacing_check,tcp_tso_should_defer,tcp_update_skb_after_send
 u64                           tcp_clock_cache         read_write          read_write          tcp_mstamp_refresh(tcp_write_xmit/tcp_rcv_space_adjust),__tcp_transmit_skb,tcp_tso_should_defer;timer
 u64                           tcp_mstamp              read_write          read_write          tcp_mstamp_refresh(tcp_write_xmit/tcp_rcv_space_adjust)(tx);tcp_rcv_space_adjust,tcp_rate_gen,tcp_clean_rtx_queue,tcp_ack_update_rtt/tcp_time_stamp(rx);timer
 u32                           srtt_us                 read_mostly         read_write          tcp_tso_should_defer(tx);tcp_update_pacing_rate,__tcp_set_rto,tcp_rtt_estimator(rx)
-u32                           mdev_us                 read_write          -                   tcp_rtt_estimator
-u32                           mdev_max_us                                                     
-u32                           rttvar_us               -                   read_mostly         __tcp_set_rto
+u32                           mdev_us                 read_write                              tcp_rtt_estimator
+u32                           mdev_max_us
+u32                           rttvar_us                                   read_mostly         __tcp_set_rto
 u32                           rtt_seq                 read_write                              tcp_rtt_estimator
-struct_minmax                 rtt_min                 -                   read_mostly         tcp_min_rtt/tcp_rate_gen,tcp_min_rtttcp_update_rtt_min
+struct minmax                 rtt_min                                     read_mostly         tcp_min_rtt/tcp_rate_gen,tcp_min_rtttcp_update_rtt_min
 u32                           packets_out             read_write          read_write          tcp_packets_in_flight(tx/rx);tcp_slow_start_after_idle_check,tcp_nagle_check,tcp_rate_skb_sent,tcp_event_new_data_sent,tcp_cwnd_validate,tcp_write_xmit(tx);tcp_ack,tcp_clean_rtx_queue,tcp_update_pacing_rate(rx)
-u32                           retrans_out             -                   read_mostly         tcp_packets_in_flight,tcp_rate_check_app_limited
-u32                           max_packets_out         -                   read_write          tcp_cwnd_validate
-u32                           cwnd_usage_seq          -                   read_write          tcp_cwnd_validate
-u16                           urg_data                -                   read_mostly         tcp_fast_path_check
-u8                            ecn_flags               read_write          -                   tcp_ecn_send
-u8                            keepalive_probes                                                
-u32                           reordering              read_mostly         -                   tcp_sndbuf_expand
-u32                           reord_seen                                                      
+u32                           retrans_out                                 read_mostly         tcp_packets_in_flight,tcp_rate_check_app_limited
+u32                           max_packets_out                             read_write          tcp_cwnd_validate
+u32                           cwnd_usage_seq                              read_write          tcp_cwnd_validate
+u16                           urg_data                                    read_mostly         tcp_fast_path_check
+u8                            ecn_flags               read_write                              tcp_ecn_send
+u8                            keepalive_probes
+u32                           reordering              read_mostly                             tcp_sndbuf_expand
+u32                           reord_seen
 u32                           snd_up                  read_write          read_mostly         tcp_mark_urg,tcp_urg_mode,__tcp_transmit_skb(tx);tcp_clean_rtx_queue(rx)
-struct_tcp_options_received   rx_opt                  read_mostly         read_write          tcp_established_options(tx);tcp_fast_path_on,tcp_ack_update_window,tcp_is_sack,tcp_data_queue,tcp_rcv_established,tcp_ack_update_rtt(rx)
-u32                           snd_ssthresh            -                   read_mostly         tcp_update_pacing_rate
+struct tcp_options_received   rx_opt                  read_mostly         read_write          tcp_established_options(tx);tcp_fast_path_on,tcp_ack_update_window,tcp_is_sack,tcp_data_queue,tcp_rcv_established,tcp_ack_update_rtt(rx)
+u32                           snd_ssthresh                                read_mostly         tcp_update_pacing_rate
 u32                           snd_cwnd                read_mostly         read_mostly         tcp_snd_cwnd,tcp_rate_check_app_limited,tcp_tso_should_defer(tx);tcp_update_pacing_rate
-u32                           snd_cwnd_cnt                                                    
-u32                           snd_cwnd_clamp                                                  
-u32                           snd_cwnd_used                                                   
-u32                           snd_cwnd_stamp                                                  
-u32                           prior_cwnd                                                      
-u32                           prr_delivered                                                   
+u32                           snd_cwnd_cnt
+u32                           snd_cwnd_clamp
+u32                           snd_cwnd_used
+u32                           snd_cwnd_stamp
+u32                           prior_cwnd
+u32                           prr_delivered
 u32                           prr_out                 read_mostly         read_mostly         tcp_rate_skb_sent,tcp_newly_delivered(tx);tcp_ack,tcp_rate_gen,tcp_clean_rtx_queue(rx)
 u32                           delivered               read_mostly         read_write          tcp_rate_skb_sent, tcp_newly_delivered(tx);tcp_ack, tcp_rate_gen, tcp_clean_rtx_queue (rx)
 u32                           delivered_ce            read_mostly         read_write          tcp_rate_skb_sent(tx);tcp_rate_gen(rx)
-u32                           lost                    -                   read_mostly         tcp_ack
+u32                           lost                                        read_mostly         tcp_ack
 u32                           app_limited             read_write          read_mostly         tcp_rate_check_app_limited,tcp_rate_skb_sent(tx);tcp_rate_gen(rx)
-u64                           first_tx_mstamp         read_write          -                   tcp_rate_skb_sent
-u64                           delivered_mstamp        read_write          -                   tcp_rate_skb_sent
-u32                           rate_delivered          -                   read_mostly         tcp_rate_gen
-u32                           rate_interval_us        -                   read_mostly         rate_delivered,rate_app_limited
+u64                           first_tx_mstamp         read_write                              tcp_rate_skb_sent
+u64                           delivered_mstamp        read_write                              tcp_rate_skb_sent
+u32                           rate_delivered                              read_mostly         tcp_rate_gen
+u32                           rate_interval_us                            read_mostly         rate_delivered,rate_app_limited
 u32                           rcv_wnd                 read_write          read_mostly         tcp_select_window,tcp_receive_window,tcp_fast_path_check
-u32                           write_seq               read_write          -                   tcp_rate_check_app_limited,tcp_write_queue_empty,tcp_skb_entail,forced_push,tcp_mark_push
-u32                           notsent_lowat           read_mostly         -                   tcp_stream_memory_free
-u32                           pushed_seq              read_write          -                   tcp_mark_push,forced_push
+u32                           write_seq               read_write                              tcp_rate_check_app_limited,tcp_write_queue_empty,tcp_skb_entail,forced_push,tcp_mark_push
+u32                           notsent_lowat           read_mostly                             tcp_stream_memory_free
+u32                           pushed_seq              read_write                              tcp_mark_push,forced_push
 u32                           lost_out                read_mostly         read_mostly         tcp_left_out(tx);tcp_packets_in_flight(tx/rx);tcp_rate_check_app_limited(rx)
 u32                           sacked_out              read_mostly         read_mostly         tcp_left_out(tx);tcp_packets_in_flight(tx/rx);tcp_clean_rtx_queue(rx)
-struct_hrtimer                pacing_timer                                                    
-struct_hrtimer                compressed_ack_timer                                            
-struct_sk_buff*               lost_skb_hint           read_mostly                             tcp_clean_rtx_queue
-struct_sk_buff*               retransmit_skb_hint     read_mostly         -                   tcp_clean_rtx_queue
-struct_rb_root                out_of_order_queue      -                   read_mostly         tcp_data_queue,tcp_fast_path_check
-struct_sk_buff*               ooo_last_skb                                                    
-struct_tcp_sack_block[1]      duplicate_sack                                                  
-struct_tcp_sack_block[4]      selective_acks                                                  
-struct_tcp_sack_block[4]      recv_sack_cache                                                 
-struct_sk_buff*               highest_sack            read_write          -                   tcp_event_new_data_sent
-int                           lost_cnt_hint                                                   
-u32                           prior_ssthresh                                                  
-u32                           high_seq                                                        
-u32                           retrans_stamp                                                   
-u32                           undo_marker                                                     
-int                           undo_retrans                                                    
-u64                           bytes_retrans                                                   
-u32                           total_retrans                                                   
-u32                           rto_stamp                                                       
-u16                           total_rto                                                       
-u16                           total_rto_recoveries                                            
-u32                           total_rto_time                                                  
-u32                           urg_seq                 -                   -                   
-unsigned_int                  keepalive_time                                                  
-unsigned_int                  keepalive_intvl                                                 
-int                           linger2                                                         
-u8                            bpf_sock_ops_cb_flags                                           
-u8:1                          bpf_chg_cc_inprogress                                           
-u16                           timeout_rehash                                                  
-u32                           rcv_ooopack                                                     
-u32                           rcv_rtt_last_tsecr                                              
-struct                        rcv_rtt_est             -                   read_write          tcp_rcv_space_adjust,tcp_rcv_established
-struct                        rcvq_space              -                   read_write          tcp_rcv_space_adjust
-struct                        mtu_probe                                                       
-u32                           plb_rehash                                                      
-u32                           mtu_info                                                        
-bool                          is_mptcp                                                        
-bool                          smc_hs_congested                                                
-bool                          syn_smc                                                         
-struct_tcp_sock_af_ops*       af_specific                                                     
-struct_tcp_md5sig_info*       md5sig_info                                                     
-struct_tcp_fastopen_request*  fastopen_req                                                    
-struct_request_sock*          fastopen_rsk                                                    
-struct_saved_syn*             saved_syn                                                        
-\ No newline at end of file
+struct hrtimer                pacing_timer
+struct hrtimer                compressed_ack_timer
+struct sk_buff*               retransmit_skb_hint     read_mostly                             tcp_clean_rtx_queue
+struct rb_root                out_of_order_queue                          read_mostly         tcp_data_queue,tcp_fast_path_check
+struct sk_buff*               ooo_last_skb
+struct tcp_sack_block[1]      duplicate_sack
+struct tcp_sack_block[4]      selective_acks
+struct tcp_sack_block[4]      recv_sack_cache
+struct sk_buff*               highest_sack            read_write                              tcp_event_new_data_sent
+u32                           prior_ssthresh
+u32                           high_seq
+u32                           retrans_stamp
+u32                           undo_marker
+int                           undo_retrans
+u64                           bytes_retrans
+u32                           total_retrans
+u32                           rto_stamp
+u16                           total_rto
+u16                           total_rto_recoveries
+u32                           total_rto_time
+u32                           urg_seq
+unsigned_int                  keepalive_time
+unsigned_int                  keepalive_intvl
+int                           linger2
+u8                            bpf_sock_ops_cb_flags
+u8:1                          bpf_chg_cc_inprogress
+u16                           timeout_rehash
+u32                           rcv_ooopack
+u32                           rcv_rtt_last_tsecr
+struct                        rcv_rtt_est                                 read_write          tcp_rcv_space_adjust,tcp_rcv_established
+struct                        rcvq_space                                  read_write          tcp_rcv_space_adjust
+struct                        mtu_probe
+u32                           plb_rehash
+u32                           mtu_info
+bool                          is_mptcp
+bool                          smc_hs_congested
+bool                          syn_smc
+struct tcp_sock_af_ops*       af_specific
+struct tcp_md5sig_info*       md5sig_info
+struct tcp_fastopen_request*  fastopen_req
+struct request_sock*          fastopen_rsk
+struct saved_syn*             saved_syn
+============================= ======================= =================== =================== ==================================================================================================================================================================================================================
diff --git a/Documentation/networking/net_dim.rst b/Documentation/networking/net_dim.rst
index 8908fd7b0a8d..4377998e6826 100644
--- a/Documentation/networking/net_dim.rst
+++ b/Documentation/networking/net_dim.rst
@@ -156,7 +156,7 @@ usage is not complete but it should make the outline of the usage clear.
 		          my_entity->bytes,
 		          &dim_sample);
 	/* Call net DIM */
-	net_dim(&my_entity->dim, dim_sample);
+	net_dim(&my_entity->dim, &dim_sample);
 	...
   }
 
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index d55c2a22ec7a..59cb9982afe6 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -17,6 +17,8 @@ Release prepend support by Breno Leitao <leitao@debian.org>, Jul 7 2023
 
 Userdata append support by Matthew Wood <thepacketgeek@gmail.com>, Jan 22 2024
 
+Sysdata append support by Breno Leitao <leitao@debian.org>, Jan 15 2025
+
 Please send bug reports to Matt Mackall <mpm@selenic.com>
 Satyam Sharma <satyam.sharma@gmail.com>, and Cong Wang <xiyou.wangcong@gmail.com>
 
@@ -45,7 +47,7 @@ following format::
 	r             if present, prepend kernel version (release) to the message
 	src-port      source for UDP packets (defaults to 6665)
 	src-ip        source IP to use (interface address)
-	dev           network interface (eth0)
+	dev           network interface name (eth0) or MAC address
 	tgt-port      port for logging agent (6666)
 	tgt-ip        IP address for logging agent
 	tgt-macaddr   ethernet MAC address for logging agent (broadcast)
@@ -62,6 +64,10 @@ or using IPv6::
 
  insmod netconsole netconsole=@/,@fd00:1:2:3::1/
 
+or using a MAC address to select the egress interface::
+
+   linux netconsole=4444@10.0.0.1/22:33:44:55:66:77,9353@10.0.0.2/12:34:56:78:9a:bc
+
 It also supports logging to multiple remote agents by specifying
 parameters for the multiple agents separated by semicolons and the
 complete string enclosed in "quotes", thusly::
@@ -124,7 +130,7 @@ To remove a target::
 
 The interface exposes these parameters of a netconsole target to userspace:
 
-	==============  =================================       ============
+	=============== =================================       ============
 	enabled		Is this target currently enabled?	(read-write)
 	extended	Extended mode enabled			(read-write)
 	release		Prepend kernel release to message	(read-write)
@@ -135,7 +141,8 @@ The interface exposes these parameters of a netconsole target to userspace:
 	remote_ip	Remote agent's IP address		(read-write)
 	local_mac	Local interface's MAC address		(read-only)
 	remote_mac	Remote agent's MAC address		(read-write)
-	==============  =================================       ============
+	transmit_errors	Number of packet send errors		(read-only)
+	=============== =================================       ============
 
 The "enabled" attribute is also used to control whether the parameters of
 a target can be updated or not -- you can modify the parameters of only
@@ -237,6 +244,134 @@ Delete `userdata` entries with `rmdir`::
 
    It is recommended to not write user data values with newlines.
 
+Task name auto population in userdata
+-------------------------------------
+
+Inside the netconsole configfs hierarchy, there is a file called
+`taskname_enabled` under the `userdata` directory. This file is used to enable
+or disable the automatic task name population feature. This feature
+automatically adds the name of the task currently scheduled on the CPU
+sending the message.
+
+To enable task name auto-population::
+
+  echo 1 > /sys/kernel/config/netconsole/target1/userdata/taskname_enabled
+
+When this option is enabled, the netconsole messages will include an additional
+line in the userdata field with the format `taskname=<task name>`. This allows
+the receiver of the netconsole messages to easily find which application was
+currently scheduled when that message was generated, providing extra context
+for kernel messages and helping to categorize them.
+
+Example::
+
+  echo "This is a message" > /dev/kmsg
+  12,607,22085407756,-;This is a message
+   taskname=echo
+
+In this example, the message was generated while "echo" was the currently
+scheduled process.
+
+Kernel release auto population in userdata
+------------------------------------------
+
+Within the netconsole configfs hierarchy, there is a file named `release_enabled`
+located in the `userdata` directory. This file controls the kernel release
+(version) auto-population feature, which appends the kernel release information
+to the userdata dictionary in every message sent.
+
+To enable the release auto-population::
+
+  echo 1 > /sys/kernel/config/netconsole/target1/userdata/release_enabled
+
+Example::
+
+  echo "This is a message" > /dev/kmsg
+  12,607,22085407756,-;This is a message
+   release=6.14.0-rc6-01219-g3c027fbd941d
+
+.. note::
+
+   This feature provides the same data as the "release prepend" feature.
+   However, in this case, the release information is appended to the userdata
+   dictionary rather than being included in the message header.
+
+
+CPU number auto population in userdata
+--------------------------------------
+
+Inside the netconsole configfs hierarchy, there is a file called
+`cpu_nr` under the `userdata` directory. This file is used to enable or disable
+the automatic CPU number population feature. This feature automatically
+populates the CPU number that is sending the message.
+
+To enable the CPU number auto-population::
+
+  echo 1 > /sys/kernel/config/netconsole/target1/userdata/cpu_nr
+
+When this option is enabled, the netconsole messages will include an additional
+line in the userdata field with the format `cpu=<cpu_number>`. This allows the
+receiver of the netconsole messages to easily differentiate and demultiplex
+messages originating from different CPUs, which is particularly useful when
+dealing with parallel log output.
+
+Example::
+
+  echo "This is a message" > /dev/kmsg
+  12,607,22085407756,-;This is a message
+   cpu=42
+
+In this example, the message was sent by CPU 42.
+
+.. note::
+
+   If the user has set a conflicting `cpu` key in the userdata dictionary,
+   both keys will be reported, with the kernel-populated entry appearing after
+   the user one. For example::
+
+     # User-defined CPU entry
+     mkdir -p /sys/kernel/config/netconsole/target1/userdata/cpu
+     echo "1" > /sys/kernel/config/netconsole/target1/userdata/cpu/value
+
+   Output might look like::
+
+     12,607,22085407756,-;This is a message
+      cpu=1
+      cpu=42    # kernel-populated value
+
+
+Message ID auto population in userdata
+--------------------------------------
+
+Within the netconsole configfs hierarchy, there is a file named `msgid_enabled`
+located in the `userdata` directory. This file controls the message ID
+auto-population feature, which assigns a numeric ID to each message sent to a
+given target and appends that ID to the userdata dictionary of every message.
+
+The message ID is generated using a per-target 32-bit counter that is
+incremented for every message sent to the target. Note that this counter will
+eventually wrap around after reaching uint32_t max value, so the message ID is
+not globally unique over time. However, it can still be used by the target to
+detect if messages were dropped before reaching the target by identifying gaps
+in the sequence of IDs.
+
+It is important to distinguish message IDs from the message <sequnum> field.
+Some kernel messages may never reach netconsole (for example, due to printk
+rate limiting). Thus, a gap in <sequnum> cannot be solely relied upon to
+indicate that a message was dropped during transmission, as it may never have
+been sent via netconsole. The message ID, on the other hand, is only assigned
+to messages that are actually transmitted via netconsole.
+
+Example::
+
+  echo "This is message #1" > /dev/kmsg
+  echo "This is message #2" > /dev/kmsg
+  13,434,54928466,-;This is message #1
+   msgid=1
+  13,435,54934019,-;This is message #2
+   msgid=2
+
+
 Extended console:
 =================
 
diff --git a/Documentation/networking/netdev-features.rst b/Documentation/networking/netdev-features.rst
index 5014f7cc1398..02bd7536fc0c 100644
--- a/Documentation/networking/netdev-features.rst
+++ b/Documentation/networking/netdev-features.rst
@@ -188,3 +188,8 @@ Redundancy) frames from one port to another in hardware.
 This should be set for devices which duplicate outgoing HSR (High-availability
 Seamless Redundancy) or PRP (Parallel Redundancy Protocol) tags automatically
 frames in hardware.
+
+* netmem-tx
+
+This should be set for devices which support netmem TX. See
+Documentation/networking/netmem.rst
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 857c9784f87e..7ebb6c36482d 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -8,7 +8,7 @@ Network Devices, the Kernel, and You!
 Introduction
 ============
 The following is a random collection of documentation regarding
-network devices.
+network devices. It is intended for driver developers.
 
 struct net_device lifetime rules
 ================================
@@ -210,49 +210,55 @@ packets is preferred.
 struct net_device synchronization rules
 =======================================
 ndo_open:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 ndo_stop:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 	Note: netif_running() is guaranteed false
 
 ndo_do_ioctl:
 	Synchronization: rtnl_lock() semaphore.
-	Context: process
 
-        This is only called by network subsystems internally,
-        not by user space calling ioctl as it was in before
-        linux-5.14.
+	This is only called by network subsystems internally,
+	not by user space calling ioctl as it was before
+	linux-5.14.
 
 ndo_siocbond:
-        Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
         Context: process
 
-        Used by the bonding driver for the SIOCBOND family of
-        ioctl commands.
+	Used by the bonding driver for the SIOCBOND family of
+	ioctl commands.
 
 ndo_siocwandev:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 	Used by the drivers/net/wan framework to handle
 	the SIOCWANDEV ioctl with the if_settings structure.
 
 ndo_siocdevprivate:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 	This is used to implement SIOCDEVPRIVATE ioctl helpers.
 	These should not be added to new drivers, so don't use.
 
 ndo_eth_ioctl:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 ndo_get_stats:
-	Synchronization: rtnl_lock() semaphore, or RCU.
+	Synchronization: RCU (can be called concurrently with the stats
+	update path).
 	Context: atomic (can't sleep under RCU)
 
 ndo_start_xmit:
@@ -284,6 +290,16 @@ ndo_set_rx_mode:
 	Synchronization: netif_addr_lock spinlock.
 	Context: BHs disabled
 
+ndo_setup_tc:
+	``TC_SETUP_BLOCK`` and ``TC_SETUP_FT`` are running under NFT locks
+	(i.e. no ``rtnl_lock`` and no device instance lock). The rest of
+	``tc_setup_type`` types run under netdev instance lock if the driver
+	implements queue management or shaper API.
+
+Most ndo callbacks not specified in the list above are running
+under ``rtnl_lock``. In addition, netdev instance lock is taken as well if
+the driver implements queue management or shaper API.
+
 struct napi_struct synchronization rules
 ========================================
 napi->poll:
@@ -297,3 +313,106 @@ napi->poll:
 	Context:
 		 softirq
 		 will be called with interrupts disabled by netconsole.
+
+netdev instance lock
+====================
+
+Historically, all networking control operations were protected by a single
+global lock known as ``rtnl_lock``. There is an ongoing effort to replace this
+global lock with separate locks for each network namespace. Additionally,
+properties of individual netdev are increasingly protected by per-netdev locks.
+
+For device drivers that implement shaping or queue management APIs, all control
+operations will be performed under the netdev instance lock.
+Drivers can also explicitly request instance lock to be held during ops
+by setting ``request_ops_lock`` to true. Code comments and docs refer
+to drivers which have ops called under the instance lock as "ops locked".
+See also the documentation of the ``lock`` member of struct net_device.
+
+In the future, there will be an option for individual
+drivers to opt out of using ``rtnl_lock`` and instead perform their control
+operations directly under the netdev instance lock.
+
+Device drivers are encouraged to rely on the instance lock where possible.
+
+For the (mostly software) drivers that need to interact with the core stack,
+there are two sets of interfaces: ``dev_xxx``/``netdev_xxx`` and ``netif_xxx``
+(e.g., ``dev_set_mtu`` and ``netif_set_mtu``). The ``dev_xxx``/``netdev_xxx``
+functions handle acquiring the instance lock themselves, while the
+``netif_xxx`` functions assume that the driver has already acquired
+the instance lock.
+
+struct net_device_ops
+---------------------
+
+``ndos`` are called without holding the instance lock for most drivers.
+
+"Ops locked" drivers will have most of the ``ndos`` invoked under
+the instance lock.
+
+struct ethtool_ops
+------------------
+
+Similarly to ``ndos``, the instance lock is only held for select drivers.
+For "ops locked" drivers, all ethtool ops without exception should
+be called under the instance lock.
+
+struct netdev_stat_ops
+----------------------
+
+"qstat" ops are invoked under the instance lock for "ops locked" drivers,
+and under ``rtnl_lock`` for all other drivers.
+
+struct net_shaper_ops
+---------------------
+
+All net shaper callbacks are invoked while holding the netdev instance
+lock. ``rtnl_lock`` may or may not be held.
+
+Note that supporting net shapers automatically enables "ops locking".
+
+struct netdev_queue_mgmt_ops
+----------------------------
+
+All queue management callbacks are invoked while holding the netdev instance
+lock. ``rtnl_lock`` may or may not be held.
+
+Note that supporting struct netdev_queue_mgmt_ops automatically enables
+"ops locking".
+
+Notifiers and netdev instance lock
+----------------------------------
+
+For device drivers that implement shaping or queue management APIs,
+some of the notifiers (``enum netdev_cmd``) are running under the netdev
+instance lock.
+
+The following netdev notifiers are always run under the instance lock:
+* ``NETDEV_XDP_FEAT_CHANGE``
+
+For devices with locked ops, currently only the following notifiers are
+running under the lock:
+* ``NETDEV_CHANGE``
+* ``NETDEV_REGISTER``
+* ``NETDEV_UP``
+
+The following notifiers are running without the lock:
+* ``NETDEV_UNREGISTER``
+
+There are no clear expectations for the remaining notifiers. Notifiers not on
+the list may run with or without the instance lock, potentially even invoking
+the same notifier type with and without the lock from different code paths.
+The goal is to eventually ensure that all (or most, with a few documented
+exceptions) notifiers run under the instance lock. Please extend this
+documentation whenever you make an explicit assumption about the lock being
+held from a notifier.
+
+NETDEV_INTERNAL symbol namespace
+================================
+
+Symbols exported as NETDEV_INTERNAL can only be used in networking
+core and drivers which exclusively flow via the main networking list and trees.
+Note that the inverse is not true: most symbols outside of NETDEV_INTERNAL
+are not expected to be used by random code outside netdev either.
+Symbols may lack the designation because they predate the namespaces,
+or simply due to an oversight.
diff --git a/Documentation/networking/netlink_spec/readme.txt b/Documentation/networking/netlink_spec/readme.txt
index 6763f99d216c..030b44aca4e6 100644
--- a/Documentation/networking/netlink_spec/readme.txt
+++ b/Documentation/networking/netlink_spec/readme.txt
@@ -1,4 +1,4 @@
 SPDX-License-Identifier: GPL-2.0
 
 This file is populated during the build of the documentation (htmldocs) by the
-tools/net/ynl/ynl-gen-rst.py script.
+tools/net/ynl/pyynl/ynl_gen_rst.py script.
diff --git a/Documentation/networking/netmem.rst b/Documentation/networking/netmem.rst
new file mode 100644
index 000000000000..b63aded46337
--- /dev/null
+++ b/Documentation/networking/netmem.rst
@@ -0,0 +1,98 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Netmem Support for Network Drivers
+==================================
+
+This document outlines the requirements for network drivers to support netmem,
+an abstract memory type that enables features like device memory TCP. By
+supporting netmem, drivers can work with various underlying memory types
+with little to no modification.
+
+Benefits of Netmem:
+
+* Flexibility: Netmem can be backed by different memory types (e.g., struct
+  page, DMA-buf), allowing drivers to support various use cases such as device
+  memory TCP.
+* Future-proof: Drivers with netmem support are ready for upcoming
+  features that rely on it.
+* Simplified Development: Drivers interact with a consistent API,
+  regardless of the underlying memory implementation.
+
+Driver RX Requirements
+======================
+
+1. The driver must support page_pool.
+
+2. The driver must support the tcp-data-split ethtool option.
+
+3. The driver must use the page_pool netmem APIs for payload memory. The netmem
+   APIs currently 1-to-1 correspond with page APIs. Conversion to netmem should
+   be achievable by switching the page APIs to netmem APIs and tracking memory
+   via netmem_refs in the driver rather than struct page * :
+
+   - page_pool_alloc -> page_pool_alloc_netmem
+   - page_pool_get_dma_addr -> page_pool_get_dma_addr_netmem
+   - page_pool_put_page -> page_pool_put_netmem
+
+   Not all page APIs have netmem equivalents at the moment. If your driver
+   relies on a missing netmem API, feel free to add it and propose it to
+   netdev@, or reach out to the maintainers and/or almasrymina@google.com for
+   help adding the netmem API.
+
+4. The driver must use the following PP_FLAGS:
+
+   - PP_FLAG_DMA_MAP: netmem is not dma-mappable by the driver. The driver
+     must delegate the dma mapping to the page_pool, which knows when
+     dma-mapping is (or is not) appropriate.
+   - PP_FLAG_DMA_SYNC_DEV: netmem dma addr is not necessarily dma-syncable
+     by the driver. The driver must delegate the dma syncing to the page_pool,
+     which knows when dma-syncing is (or is not) appropriate.
+   - PP_FLAG_ALLOW_UNREADABLE_NETMEM: the driver must specify this flag iff
+     tcp-data-split is enabled.
+
+5. The driver must not assume the netmem is readable and/or backed by pages.
+   The netmem returned by the page_pool may be unreadable, in which case
+   netmem_address() will return NULL. The driver must correctly handle
+   unreadable netmem, i.e. don't attempt to handle its contents when
+   netmem_address() is NULL.
+
+   Ideally, drivers should not have to check the underlying netmem type via
+   helpers like netmem_is_net_iov() or convert the netmem to any of its
+   underlying types via netmem_to_page() or netmem_to_net_iov(). In most cases,
+   netmem or page_pool helpers that abstract this complexity are provided
+   (and more can be added).
+
+6. The driver must use page_pool_dma_sync_netmem_for_cpu() in lieu of
+   dma_sync_single_range_for_cpu(). For some memory providers, dma syncing for
+   CPU will be done by the page_pool; for others (particularly the dmabuf
+   memory provider), dma syncing for CPU is the responsibility of userspace
+   using dmabuf APIs. The driver must delegate the entire dma-syncing operation
+   to the page_pool, which will do it correctly.
+
+7. Avoid implementing driver-specific recycling on top of the page_pool. Drivers
+   cannot hold onto a struct page to do their own recycling as the netmem may
+   not be backed by a struct page. However, you may hold onto a page_pool
+   reference with page_pool_fragment_netmem() or page_pool_ref_netmem() for
+   that purpose, but be mindful that some netmem types might have longer
+   circulation times, such as when userspace holds a reference in zerocopy
+   scenarios.
+
+Driver TX Requirements
+======================
+
+1. The driver must not pass the netmem dma_addr to any of the dma-mapping APIs
+   directly. This is because netmem dma_addrs may come from a source like
+   dma-buf that is not compatible with the dma-mapping APIs.
+
+   Helpers like netmem_dma_unmap_page_attrs() & netmem_dma_unmap_addr_set()
+   should be used in lieu of dma_unmap_page[_attrs](), dma_unmap_addr_set().
+   The netmem variants will handle netmem dma_addrs correctly regardless of the
+   source, delegating to the dma-mapping APIs when appropriate.
+
+   Not all dma-mapping APIs have netmem equivalents at the moment. If your
+   driver relies on a missing netmem API, feel free to add it and propose it to
+   netdev@, or reach out to the maintainers and/or almasrymina@google.com for
+   help adding the netmem API.
+
+2. The driver should declare support by setting ``netdev->netmem_tx = true``.
diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index 238b66d0e059..35f889259fcd 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -85,7 +85,6 @@ nf_conntrack_log_invalid - INTEGER
 	- 1   - log ICMP packets
 	- 6   - log TCP packets
 	- 17  - log UDP packets
-	- 33  - log DCCP packets
 	- 41  - log ICMPv6 packets
 	- 136 - log UDPLITE packets
 	- 255 - log packets of any protocol
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index f64641417c54..7f159043ad5a 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -333,6 +333,13 @@ Some of the interface modes are described below:
     SerDes lane, each port having speeds of 2.5G / 1G / 100M / 10M achieved
     through symbol replication. The PCS expects the standard USXGMII code word.
 
+``PHY_INTERFACE_MODE_MIILITE``
+    Non-standard, simplified MII mode, without TXER, RXER, CRS and COL signals
+    as defined for the MII. The absence of COL signal makes half-duplex link
+    modes impossible but does not interfere with BroadR-Reach link modes on
+    Broadcom (and other two-wire Ethernet) PHYs, because they are full-duplex
+    only.
+
 Pause frames / flow control
 ===========================
 
diff --git a/Documentation/networking/rds.rst b/Documentation/networking/rds.rst
index 498395f5fbcb..41b0a6182fe4 100644
--- a/Documentation/networking/rds.rst
+++ b/Documentation/networking/rds.rst
@@ -265,7 +265,7 @@ RDS Protocol
 
       The bitmaps are allocated as connections are brought up.  This
       avoids allocation in the interrupt handling path which queues
-      sages on sockets.  The dense bitmaps let transports send the
+      messages on sockets.  The dense bitmaps let transports send the
       entire bitmap on any bitmap change reasonably efficiently.  This
       is much easier to implement than some finer-grained
       communication of per-port congestion.  The sender does a very
@@ -373,7 +373,7 @@ The recv path
     - validate header checksum
     - copy header to rds_ib_incoming struct if start of a new datagram
     - add to ibinc's fraglist
-    - if competed datagram:
+    - if completed datagram:
 	 - update cong map if datagram was cong update
 	 - call rds_recv_incoming() otherwise
 	 - note if ack is required
@@ -415,7 +415,7 @@ Multipath RDS (mprds)
   I/O workqs and reconnect threads are driven from the rds_conn_path.
   Transports such as TCP that are multipath capable may then set up a
   TCP socket per rds_conn_path, and this is managed by the transport via
-  the transport privatee cp_transport_data pointer.
+  the transport private cp_transport_data pointer.
 
   Transports announce themselves as multipath capable by setting the
   t_mp_capable bit during registration with the rds core module. When the
@@ -430,7 +430,7 @@ Multipath RDS (mprds)
   This is done by sending out a control packet exchange before the
   first data packet. The control packet exchange must have completed
   prior to outgoing hash completion in rds_sendmsg() when the transport
-  is mutlipath capable.
+  is multipath capable.
 
   The control packet is an RDS ping packet (i.e., packet to rds dest
   port 0) with the ping packet having a rds extension header option  of
diff --git a/Documentation/networking/rxrpc.rst b/Documentation/networking/rxrpc.rst
index e807e18ba32a..d63e3e27dd06 100644
--- a/Documentation/networking/rxrpc.rst
+++ b/Documentation/networking/rxrpc.rst
@@ -1062,30 +1062,6 @@ The kernel interface functions are as follows:
      first function to change.  Note that this must be called in TASK_RUNNING
      state.
 
- (#) Get remote client epoch::
-
-	u32 rxrpc_kernel_get_epoch(struct socket *sock,
-				   struct rxrpc_call *call)
-
-     This allows the epoch that's contained in packets of an incoming client
-     call to be queried.  This value is returned.  The function always
-     successful if the call is still in progress.  It shouldn't be called once
-     the call has expired.  Note that calling this on a local client call only
-     returns the local epoch.
-
-     This value can be used to determine if the remote client has been
-     restarted as it shouldn't change otherwise.
-
- (#) Set the maximum lifespan on a call::
-
-	void rxrpc_kernel_set_max_life(struct socket *sock,
-				       struct rxrpc_call *call,
-				       unsigned long hard_timeout)
-
-     This sets the maximum lifespan on a call to hard_timeout (which is in
-     jiffies).  In the event of the timeout occurring, the call will be
-     aborted and -ETIME or -ETIMEDOUT will be returned.
-
  (#) Apply the RXRPC_MIN_SECURITY_LEVEL sockopt to a socket from within in the
      kernel::
 
@@ -1172,3 +1148,18 @@ adjusted through sysctls in /proc/net/rxrpc/:
      header plus exactly 1412 bytes of data.  The terminal packet must contain
      a four byte header plus any amount of data.  In any event, a jumbo packet
      may not exceed rxrpc_rx_mtu in size.
+
+
+API Function Reference
+======================
+
+.. kernel-doc:: net/rxrpc/af_rxrpc.c
+.. kernel-doc:: net/rxrpc/call_object.c
+.. kernel-doc:: net/rxrpc/key.c
+.. kernel-doc:: net/rxrpc/oob.c
+.. kernel-doc:: net/rxrpc/peer_object.c
+.. kernel-doc:: net/rxrpc/recvmsg.c
+.. kernel-doc:: net/rxrpc/rxgk.c
+.. kernel-doc:: net/rxrpc/rxkad.c
+.. kernel-doc:: net/rxrpc/sendmsg.c
+.. kernel-doc:: net/rxrpc/server_key.c
diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 4eb50bcb9d42..99b6a61e5e31 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -49,14 +49,21 @@ destination address) and TCP/UDP (source port, destination port) tuples
 are swapped, the computed hash is the same. This is beneficial in some
 applications that monitor TCP/IP flows (IDS, firewalls, ...etc) and need
 both directions of the flow to land on the same Rx queue (and CPU). The
-"Symmetric-XOR" is a type of RSS algorithms that achieves this hash
-symmetry by XORing the input source and destination fields of the IP
-and/or L4 protocols. This, however, results in reduced input entropy and
-could potentially be exploited. Specifically, the algorithm XORs the input
+"Symmetric-XOR" and "Symmetric-OR-XOR" are types of RSS algorithms that
+achieve this hash symmetry by XOR/ORing the input source and destination
+fields of the IP and/or L4 protocols. This, however, results in reduced
+input entropy and could potentially be exploited.
+
+Specifically, the "Symmetric-XOR" algorithm XORs the input
 as follows::
 
     # (SRC_IP ^ DST_IP, SRC_IP ^ DST_IP, SRC_PORT ^ DST_PORT, SRC_PORT ^ DST_PORT)
 
+The "Symmetric-OR-XOR" algorithm, on the other hand, transforms the input as
+follows::
+
+    # (SRC_IP | DST_IP, SRC_IP ^ DST_IP, SRC_PORT | DST_PORT, SRC_PORT ^ DST_PORT)
+
 The result is then fed to the underlying RSS algorithm.
 
 Some advanced NICs allow steering packets to queues based on
@@ -427,8 +434,10 @@ rps_dev_flow_table. The stack consults a CPU to hardware queue map which
 is maintained by the NIC driver. This is an auto-generated reverse map of
 the IRQ affinity table shown by /proc/interrupts. Drivers can use
 functions in the cpu_rmap (“CPU affinity reverse map”) kernel library
-to populate the map. For each CPU, the corresponding queue in the map is
-set to be one whose processing CPU is closest in cache locality.
+to populate the map. Alternatively, drivers can delegate the cpu_rmap
+management to the kernel by calling netif_enable_cpu_rmap(). For each CPU,
+the corresponding queue in the map is set to be one whose processing CPU is
+closest in cache locality.
 
 
 Accelerated RFS Configuration
diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst
index 75e017dfa825..518284e287b0 100644
--- a/Documentation/networking/statistics.rst
+++ b/Documentation/networking/statistics.rst
@@ -143,7 +143,7 @@ reading multiple stats as it internally performs a full dump of
 and reports only the stat corresponding to the accessed file.
 
 Sysfs files are documented in
-`Documentation/ABI/testing/sysfs-class-net-statistics`.
+Documentation/ABI/testing/sysfs-class-net-statistics.
 
 
 netlink
diff --git a/Documentation/networking/strparser.rst b/Documentation/networking/strparser.rst
index 6cab1f74ae05..8dc6bb04c710 100644
--- a/Documentation/networking/strparser.rst
+++ b/Documentation/networking/strparser.rst
@@ -112,7 +112,7 @@ Functions
 Callbacks
 =========
 
-There are six callbacks:
+There are seven callbacks:
 
     ::
 
@@ -180,10 +180,17 @@ There are six callbacks:
     struct contains two fields: offset and full_len. Offset is
     where the message starts in the skb, and full_len is the
     the length of the message. skb->len - offset may be greater
-    then full_len since strparser does not trim the skb.
+    than full_len since strparser does not trim the skb.
 
     ::
 
+	int (*read_sock)(struct strparser *strp, read_descriptor_t *desc,
+                     sk_read_actor_t recv_actor);
+
+    The read_sock callback is used by strparser instead of
+    sock->ops->read_sock, if provided.
+    ::
+
 	int (*read_sock_done)(struct strparser *strp, int err);
 
      read_sock_done is called when the stream parser is done reading
diff --git a/Documentation/networking/switchdev.rst b/Documentation/networking/switchdev.rst
index f355f0166f1b..2966b7122f05 100644
--- a/Documentation/networking/switchdev.rst
+++ b/Documentation/networking/switchdev.rst
@@ -137,7 +137,7 @@ would be sub-port 0 on port 1 on switch 1.
 Port Features
 ^^^^^^^^^^^^^
 
-dev->netns_local
+dev->netns_immutable
 
 If the switchdev driver (and device) only supports offloading of the default
 network namespace (netns), the driver should set this private flag to prevent
diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 8199e6917671..7aabead90648 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -140,6 +140,14 @@ SOF_TIMESTAMPING_TX_ACK:
   cumulative acknowledgment. The mechanism ignores SACK and FACK.
   This flag can be enabled via both socket options and control messages.
 
+SOF_TIMESTAMPING_TX_COMPLETION:
+  Request tx timestamps on packet tx completion.  The completion
+  timestamp is generated by the kernel when it receives a packet
+  completion report from the hardware. Hardware may report multiple
+  packets at once, and completion timestamps reflect the timing of the
+  report and not actual tx time. This flag can be enabled via both
+  socket options and control messages.
+
 
 1.3.2 Timestamp Reporting
 ^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -194,6 +202,20 @@ SOF_TIMESTAMPING_OPT_ID:
   among all possibly concurrently outstanding timestamp requests for
   that socket.
 
+  The process can optionally override the default generated ID, by
+  passing a specific ID with control message SCM_TS_OPT_ID (not
+  supported for TCP sockets)::
+
+    struct msghdr *msg;
+    ...
+    cmsg			 = CMSG_FIRSTHDR(msg);
+    cmsg->cmsg_level		 = SOL_SOCKET;
+    cmsg->cmsg_type		 = SCM_TS_OPT_ID;
+    cmsg->cmsg_len		 = CMSG_LEN(sizeof(__u32));
+    *((__u32 *) CMSG_DATA(cmsg)) = opt_id;
+    err = sendmsg(fd, msg, 0);
+
+
 SOF_TIMESTAMPING_OPT_ID_TCP:
   Pass this modifier along with SOF_TIMESTAMPING_OPT_ID for new TCP
   timestamping applications. SOF_TIMESTAMPING_OPT_ID defines how the
@@ -511,8 +533,8 @@ implicitly defined. ts[0] holds a software timestamp if set, ts[1]
 is again deprecated and ts[2] holds a hardware timestamp if set.
 
 
-3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP
-=======================================================================
+3. Hardware Timestamping configuration: ETHTOOL_MSG_TSCONFIG_SET/GET
+====================================================================
 
 Hardware time stamping must also be initialized for each device driver
 that is expected to do hardware time stamping. The parameter is defined in
@@ -525,12 +547,14 @@ include/uapi/linux/net_tstamp.h as::
 	};
 
 Desired behavior is passed into the kernel and to a specific device by
-calling ioctl(SIOCSHWTSTAMP) with a pointer to a struct ifreq whose
-ifr_data points to a struct hwtstamp_config. The tx_type and
-rx_filter are hints to the driver what it is expected to do. If
-the requested fine-grained filtering for incoming packets is not
-supported, the driver may time stamp more than just the requested types
-of packets.
+sending the ``ETHTOOL_MSG_TSCONFIG_SET`` ethtool netlink message.
+The ``ETHTOOL_A_TSCONFIG_TX_TYPES``, ``ETHTOOL_A_TSCONFIG_RX_FILTERS`` and
+``ETHTOOL_A_TSCONFIG_HWTSTAMP_FLAGS`` netlink attributes are then used to set
+the struct hwtstamp_config accordingly.
+
+The ``ETHTOOL_A_TSCONFIG_HWTSTAMP_PROVIDER`` netlink nested attribute is used
+to select the source of the hardware time stamping. It is composed of an index
+for the device source and a qualifier for the type of time stamping.
 
 Drivers are free to use a more permissive configuration than the requested
 configuration. It is expected that drivers should only implement directly the
@@ -549,9 +573,16 @@ Only a processes with admin rights may change the configuration. User
 space is responsible to ensure that multiple processes don't interfere
 with each other and that the settings are reset.
 
-Any process can read the actual configuration by passing this
-structure to ioctl(SIOCGHWTSTAMP) in the same way.  However, this has
-not been implemented in all drivers.
+Any process can read the actual configuration by sending an
+``ETHTOOL_MSG_TSCONFIG_GET`` ethtool netlink request.
+
+The legacy configuration method is ioctl(SIOCSHWTSTAMP), called with a pointer
+to a struct ifreq whose ifr_data points to a struct hwtstamp_config.
+The tx_type and rx_filter are hints to the driver about what it is expected
+to do. If the requested fine-grained filtering for incoming packets is not
+supported, the driver may time stamp more than just the requested types
+of packets. ioctl(SIOCGHWTSTAMP) is used in the same way as
+ioctl(SIOCSHWTSTAMP). However, this has not been implemented in all drivers.
 
 ::
 
@@ -596,9 +627,10 @@ not been implemented in all drivers.
 --------------------------------------------------------
 
 A driver which supports hardware time stamping must support the
-SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
-the actual values as described in the section on SIOCSHWTSTAMP.  It
-should also support SIOCGHWTSTAMP.
+ndo_hwtstamp_set NDO or the legacy SIOCSHWTSTAMP ioctl and update the
+supplied struct hwtstamp_config with the actual values as described in
+the section on SIOCSHWTSTAMP. It should also support ndo_hwtstamp_get or
+the legacy SIOCGHWTSTAMP.
 
 Time stamps for received packets must be stored in the skb. To get a pointer
 to the shared time stamp structure of the skb call skb_hwtstamps(). Then
@@ -779,11 +811,9 @@ Documentation/devicetree/bindings/ptp/timestamper.txt for more details.
 3.2.4 Other caveats for MAC drivers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Stacked PHCs, especially DSA (but not only) - since that doesn't require any
-modification to MAC drivers, so it is more difficult to ensure correctness of
-all possible code paths - is that they uncover bugs which were impossible to
-trigger before the existence of stacked PTP clocks.  One example has to do with
-this line of code, already presented earlier::
+The use of stacked PHCs may uncover MAC driver bugs which were impossible to
+trigger without them. One example has to do with this line of code, already
+presented earlier::
 
       skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 
diff --git a/Documentation/networking/tipc.rst b/Documentation/networking/tipc.rst
index ab63d298cca2..9b375b9b9981 100644
--- a/Documentation/networking/tipc.rst
+++ b/Documentation/networking/tipc.rst
@@ -112,7 +112,7 @@ More Information
 
 - How to contribute to TIPC:
 
-- http://tipc.io/contacts.html
+  http://tipc.io/contacts.html
 
 - More details about TIPC specification:
 
diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index 5f0dea3d571e..7354d48cdf92 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -51,7 +51,7 @@ and send them to the device for encryption and transmission.
 RX
 --
 
-On the receive side if the device handled decryption and authentication
+On the receive side, if the device handled decryption and authentication
 successfully, the driver will set the decrypted bit in the associated
 :c:type:`struct sk_buff <sk_buff>`. The packets reach the TCP stack and
 are handled normally. ``ktls`` is informed when data is queued to the socket
@@ -120,8 +120,9 @@ before installing the connection state in the kernel.
 RX
 --
 
-In RX direction local networking stack has little control over the segmentation,
-so the initial records' TCP sequence number may be anywhere inside the segment.
+In the RX direction, the local networking stack has little control over
+segmentation, so the initial records' TCP sequence number may be anywhere
+inside the segment.
 
 Normal operation
 ================
@@ -138,8 +139,8 @@ There are no guarantees on record length or record segmentation. In particular
 segments may start at any point of a record and contain any number of records.
 Assuming segments are received in order, the device should be able to perform
 crypto operations and authentication regardless of segmentation. For this
-to be possible device has to keep small amount of segment-to-segment state.
-This includes at least:
+to be possible, the device has to keep a small amount of segment-to-segment
+state. This includes at least:
 
  * partial headers (if a segment carried only a part of the TLS header)
  * partial data block
@@ -175,12 +176,12 @@ and packet transformation functions) the device validates the Layer 4
 checksum and performs a 5-tuple lookup to find any TLS connection the packet
 may belong to (technically a 4-tuple
 lookup is sufficient - IP addresses and TCP port numbers, as the protocol
-is always TCP). If connection is matched device confirms if the TCP sequence
-number is the expected one and proceeds to TLS handling (record delineation,
-decryption, authentication for each record in the packet). The device leaves
-the record framing unmodified, the stack takes care of record decapsulation.
-Device indicates successful handling of TLS offload in the per-packet context
-(descriptor) passed to the host.
+is always TCP). If the packet is matched to a connection, the device checks
+whether the TCP sequence number is the expected one and proceeds to TLS handling
+(record delineation, decryption, authentication for each record in the packet).
+The device leaves the record framing unmodified; the stack takes care of record
+decapsulation. The device indicates successful handling of TLS offload in the
+per-packet context (descriptor) passed to the host.
 
 Upon reception of a TLS offloaded packet, the driver sets
 the :c:member:`decrypted` mark in :c:type:`struct sk_buff <sk_buff>`
@@ -439,7 +440,7 @@ by the driver:
  * ``rx_tls_resync_req_end`` - number of times the TLS async resync request
     properly ended with providing the HW tracked tcp-seq.
  * ``rx_tls_resync_req_skip`` - number of times the TLS async resync request
-    procedure was started by not properly ended.
+    procedure was started but not properly ended.
  * ``rx_tls_resync_res_ok`` - number of times the TLS resync response call to
     the driver was successfully handled.
  * ``rx_tls_resync_res_skip`` - number of times the TLS resync response call to
@@ -507,8 +508,8 @@ in packets as seen on the wire.
 Transport layer transparency
 ----------------------------
 
-The device should not modify any packet headers for the purpose
-of the simplifying TLS offload.
+For the purpose of simplifying TLS offload, the device should not modify any
+packet headers.
 
 The device should not depend on any packet headers beyond what is strictly
 necessary for TLS offload.
diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
index 658ed3a71e1b..36cc7afc2527 100644
--- a/Documentation/networking/tls.rst
+++ b/Documentation/networking/tls.rst
@@ -16,11 +16,13 @@ User interface
 Creating a TLS connection
 -------------------------
 
-First create a new TCP socket and set the TLS ULP.
+First create a new TCP socket and, once the connection is established, set the
+TLS ULP.
 
 .. code-block:: c
 
   sock = socket(AF_INET, SOCK_STREAM, 0);
+  connect(sock, addr, addrlen);
   setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
 
 Setting the TLS ULP allows us to set/get TLS socket options. Currently
@@ -200,6 +202,32 @@ received without a cmsg buffer set.
 
 recv will never return data from mixed types of TLS records.
 
+TLS 1.3 Key Updates
+-------------------
+
+In TLS 1.3, KeyUpdate handshake messages signal that the sender is
+updating its TX key. Any message sent after a KeyUpdate will be
+encrypted using the new key. The userspace library can pass the new
+key to the kernel using the TLS_TX and TLS_RX socket options, as for
+the initial keys. TLS version and cipher cannot be changed.
+
+To prevent attempting to decrypt incoming records using the wrong key,
+decryption will be paused when a KeyUpdate message is received by the
+kernel, until the new key has been provided using the TLS_RX socket
+option. Any read occurring after the KeyUpdate has been read and
+before the new key is provided will fail with EKEYEXPIRED. poll() will
+not report any read events from the socket until the new key is
+provided. There is no pausing on the transmit side.
+
+Userspace should make sure that the crypto_info provided has been set
+properly. In particular, the kernel will not check for key/nonce
+reuse.
+
+The number of successful and failed key updates is tracked in the
+``TlsTxRekeyOk``, ``TlsRxRekeyOk``, ``TlsTxRekeyError``,
+``TlsRxRekeyError`` statistics. The ``TlsRxRekeyReceived`` statistic
+counts KeyUpdate handshake messages that have been received.
+
 Integrating in to userspace TLS library
 ---------------------------------------
 
@@ -286,3 +314,13 @@ TLS implementation exposes the following per-namespace statistics
 - ``TlsRxNoPadViolation`` -
   number of data RX records which had to be re-decrypted due to
   ``TLS_RX_EXPECT_NO_PAD`` mis-prediction.
+
+- ``TlsTxRekeyOk``, ``TlsRxRekeyOk`` -
+  number of successful rekeys on existing sessions for TX and RX
+
+- ``TlsTxRekeyError``, ``TlsRxRekeyError`` -
+  number of failed rekeys on existing sessions for TX and RX
+
+- ``TlsRxRekeyReceived`` -
+  number of received KeyUpdate handshake messages, requiring userspace
+  to provide a new RX key
diff --git a/Documentation/networking/tproxy.rst b/Documentation/networking/tproxy.rst
index 7f7c1ff6f159..75e4990cc3db 100644
--- a/Documentation/networking/tproxy.rst
+++ b/Documentation/networking/tproxy.rst
@@ -69,9 +69,9 @@ add rules like this to the iptables ruleset above::
     # iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
       --tproxy-mark 0x1/0x1 --on-port 50080
 
-Or the following rule to nft:
+Or the following rule to nft::
 
-# nft add rule filter divert tcp dport 80 tproxy to :50080 meta mark set 1 accept
+    # nft add rule filter divert tcp dport 80 tproxy to :50080 meta mark set 1 accept
 
 Note that for this to work you'll have to modify the proxy to enable (SOL_IP,
 IP_TRANSPARENT) for the listening socket.
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..ce96f4c99505 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -120,6 +120,39 @@ It is possible to query which kfunc the particular netdev implements via
 netlink. See ``xdp-rx-metadata-features`` attribute set in
 ``Documentation/netlink/specs/netdev.yaml``.
 
+Driver Implementation
+=====================
+
+Certain devices may prepend metadata to received packets. However, as of now,
+``AF_XDP`` lacks the ability to communicate the size of the ``data_meta`` area
+to the consumer. Therefore, it is the responsibility of the driver to copy any
+device-reserved metadata out from the metadata area and ensure that
+``xdp_buff->data_meta`` is pointing to ``xdp_buff->data`` before presenting the
+frame to the XDP program. This is necessary so that, after the XDP program
+adjusts the metadata area, the consumer can reliably retrieve the metadata
+address using the ``METADATA_SIZE`` offset.
+
+The following diagram shows how custom metadata is positioned relative to the
+packet data and how pointers are adjusted for metadata access::
+
+              |<-- bpf_xdp_adjust_meta(xdp_buff, -METADATA_SIZE) --|
+  new xdp_buff->data_meta                              old xdp_buff->data_meta
+              |                                                    |
+              |                                            xdp_buff->data
+              |                                                    |
+   +----------+----------------------------------------------------+------+
+   | headroom |                  custom metadata                   | data |
+   +----------+----------------------------------------------------+------+
+              |                                                    |
+              |                                            xdp_desc->addr
+              |<------ xsk_umem__get_data() - METADATA_SIZE -------|
+
+``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
+does not exceed 252 bytes, and leaves sufficient space for building the
+xdp_frame. If these conditions are not met, it returns a negative error. In this
+case, the BPF program should not proceed to populate data into the ``data_meta``
+area.
+
 Example
 =======
 
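The consumer side of the layout described in the Driver Implementation section can be sketched in plain C. The snippet assumes a hypothetical ``struct custom_meta`` agreed on by the BPF program and the AF_XDP consumer, and takes ``pkt`` to be the pointer that ``xsk_umem__get_data()`` would return for ``xdp_desc->addr``. ``metadata_size_ok()`` only mirrors the two size rules stated above (multiple of 4, at most 252 bytes); it does not model the headroom check that ``bpf_xdp_adjust_meta`` also performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout shared by the BPF program and the consumer. */
struct custom_meta {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t pad;		/* keeps the size a multiple of 4 */
};

#define METADATA_SIZE sizeof(struct custom_meta)

/* Size rules from the text: 4-byte aligned and no larger than 252. */
static bool metadata_size_ok(size_t size)
{
	return size % 4 == 0 && size <= 252;
}

/*
 * Copy the metadata that sits immediately in front of the packet data.
 * pkt stands in for the xsk_umem__get_data() result of the descriptor.
 */
static void read_custom_meta(const void *pkt, struct custom_meta *out)
{
	const uint8_t *meta = (const uint8_t *)pkt - METADATA_SIZE;

	memcpy(out, meta, METADATA_SIZE);
}
```

Using ``memcpy()`` instead of a pointer cast avoids alignment assumptions about where the frame starts inside the UMEM.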
diff --git a/Documentation/networking/xfrm_device.rst b/Documentation/networking/xfrm_device.rst
index bfea9d8579ed..122204da0fff 100644
--- a/Documentation/networking/xfrm_device.rst
+++ b/Documentation/networking/xfrm_device.rst
@@ -65,9 +65,13 @@ Callbacks to implement
   /* from include/linux/netdevice.h */
   struct xfrmdev_ops {
         /* Crypto and Packet offload callbacks */
-	int	(*xdo_dev_state_add) (struct xfrm_state *x, struct netlink_ext_ack *extack);
-	void	(*xdo_dev_state_delete) (struct xfrm_state *x);
-	void	(*xdo_dev_state_free) (struct xfrm_state *x);
+	int	(*xdo_dev_state_add)(struct net_device *dev,
+                                     struct xfrm_state *x,
+                                     struct netlink_ext_ack *extack);
+	void	(*xdo_dev_state_delete)(struct net_device *dev,
+                                        struct xfrm_state *x);
+	void	(*xdo_dev_state_free)(struct net_device *dev,
+                                      struct xfrm_state *x);
 	bool	(*xdo_dev_offload_ok) (struct sk_buff *skb,
 				       struct xfrm_state *x);
 	void    (*xdo_dev_state_advance_esn) (struct xfrm_state *x);
@@ -126,7 +130,8 @@ been setup for offload, it first calls into xdo_dev_offload_ok() with
 the skb and the intended offload state to ask the driver if the offload
 will serviceable.  This can check the packet information to be sure the
 offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
-return true of false to signify its support.
+return true or false to signify its support. If the driver does not implement
+this callback, the stack provides reasonable defaults.
 
 Crypto offload mode:
 When ready to send, the driver needs to inspect the Tx packet for the
@@ -169,7 +174,8 @@ the stack in xfrm_input().
 
 	hand the packet to napi_gro_receive() as usual
 
-In ESN mode, xdo_dev_state_advance_esn() is called from xfrm_replay_advance_esn().
+In ESN mode, xdo_dev_state_advance_esn() is called from
+xfrm_replay_advance_esn() for RX and xfrm_replay_overflow_offload_esn() for TX.
 Driver will check packet seq number and update HW ESN state machine if needed.
 
 Packet offload mode:
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index e76b0cfc32f7..df53a10ccac3 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -50,6 +50,10 @@ The flags field enables the particular offload:
   checksum. ``csum_start`` specifies byte offset of where the checksumming
   should start and ``csum_offset`` specifies byte offset where the
   device should store the computed checksum.
+- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
+  packet for transmission at a pre-determined time called launch time. The
+  value of launch time is indicated by the ``launch_time`` field of
+  ``struct xsk_tx_metadata``.
 
 Besides the flags above, in order to trigger the offloads, the first
 packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
@@ -65,6 +69,63 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
 is calculated on the CPU. Do not enable this option in production because
 it will negatively affect performance.
 
+Launch Time
+===========
+
+The value of the requested launch time should be based on the device's PTP
+Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
+compared to the ETF queuing discipline, which organizes packets and delays
+their transmission. Instead, AF_XDP immediately hands off the packets to
+the device driver without rearranging their order or holding them prior to
+transmission. Since the driver maintains FIFO behavior and does not perform
+packet reordering, a packet with a launch time request will block other
+packets in the same Tx Queue until it is sent. Therefore, it is recommended
+to allocate a separate queue for scheduling traffic that is intended for
+future transmission.
+
+In scenarios where the launch time offload feature is disabled, the device
+driver is expected to disregard the launch time request. For correct
+interpretation and meaningful operation, the launch time should never be
+set to a value larger than the farthest programmable time in the future
+(the horizon). Different devices have different hardware limitations on the
+launch time offload feature.
+
+stmmac driver
+-------------
+
+For stmmac, TSO and launch time (TBS) features are mutually exclusive for
+each individual Tx Queue. By default, the driver configures Tx Queue 0 to
+support TSO and the rest of the Tx Queues to support TBS. The launch time
+hardware offload feature can be enabled or disabled by using the tc-etf
+command to call the driver's ndo_setup_tc() callback.
+
+The value of the launch time that is programmed in the Enhanced Normal
+Transmit Descriptors is a 32-bit value, where the most significant 8 bits
+represent the time in seconds and the remaining 24 bits represent the time
+in 256 ns increments. The programmed launch time is compared against the
+PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
+horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
+future.
+
+igc driver
+----------
+
+For igc, all four Tx Queues support the launch time feature. The launch
+time hardware offload feature can be enabled or disabled by using the
+tc-etf command to call the driver's ndo_setup_tc() callback. When entering
+TSN mode, the igc driver will reset the device and create a default Qbv
+schedule with a 1-second cycle time, with all Tx Queues open at all times.
+
+The value of the launch time that is programmed in the Advanced Transmit
+Context Descriptor is a relative offset to the starting time of the Qbv
+transmission window of the queue. The Frst flag of the descriptor can be
+set to schedule the packet for the next Qbv cycle. Therefore, the horizon
+of the launch time for i225 and i226 is the ending time of the next cycle
+of the Qbv transmission window of the queue. For example, when the Qbv
+cycle time is set to 1 second, the horizon of the launch time ranges
+from 1 second to 2 seconds, depending on where the Qbv cycle is currently
+running.
+
 Querying Device Capabilities
 ============================
 
@@ -74,6 +135,7 @@ Refer to ``xsk-flags`` features bitmask in
 
 - ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP``
 - ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM``
+- ``tx-launch-time-fifo``: device supports ``XDP_TXMD_FLAGS_LAUNCH_TIME``
 
 See ``tools/net/ynl/samples/netdev.c`` on how to query this information.