diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-class-net-queues | 11 | ||||
-rw-r--r-- | Documentation/networking/scaling.txt | 61 |
2 files changed, 61 insertions, 11 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net-queues b/Documentation/ABI/testing/sysfs-class-net-queues index 0c0df91b1516..978b76358661 100644 --- a/Documentation/ABI/testing/sysfs-class-net-queues +++ b/Documentation/ABI/testing/sysfs-class-net-queues @@ -42,6 +42,17 @@ Description: network device transmit queue. Possible vaules depend on the number of available CPU(s) in the system. +What: /sys/class/<iface>/queues/tx-<queue>/xps_rxqs +Date: June 2018 +KernelVersion: 4.18.0 +Contact: netdev@vger.kernel.org +Description: + Mask of the receive queue(s) currently enabled to participate + into the Transmit Packet Steering packet processing flow for this + network device transmit queue. Possible values depend on the + number of available receive queue(s) in the network device. + Default is disabled. + What: /sys/class/<iface>/queues/tx-<queue>/byte_queue_limits/hold_time Date: November 2011 KernelVersion: 3.3 diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt index f55639d71d35..b7056a8a0540 100644 --- a/Documentation/networking/scaling.txt +++ b/Documentation/networking/scaling.txt @@ -366,8 +366,13 @@ XPS: Transmit Packet Steering Transmit Packet Steering is a mechanism for intelligently selecting which transmit queue to use when transmitting a packet on a multi-queue -device. To accomplish this, a mapping from CPU to hardware queue(s) is -recorded. The goal of this mapping is usually to assign queues +device. This can be accomplished by recording two kinds of maps, either +a mapping of CPU to hardware queue(s) or a mapping of receive queue(s) +to hardware transmit queue(s). + +1. XPS using CPUs map + +The goal of this mapping is usually to assign queues exclusively to a subset of CPUs, where the transmit completions for these queues are processed on a CPU within this set. This choice provides two benefits. First, contention on the device queue lock is @@ -377,15 +382,40 @@ transmit queue). Secondly, cache miss rate on transmit completion is reduced, in particular for data cache lines that hold the sk_buff structures. -XPS is configured per transmit queue by setting a bitmap of CPUs that -may use that queue to transmit. The reverse mapping, from CPUs to -transmit queues, is computed and maintained for each network device. -When transmitting the first packet in a flow, the function -get_xps_queue() is called to select a queue. This function uses the ID -of the running CPU as a key into the CPU-to-queue lookup table. If the +2. XPS using receive queues map + +This mapping is used to pick transmit queue based on the receive +queue(s) map configuration set by the administrator. A set of receive +queues can be mapped to a set of transmit queues (many:many), although +the common use case is a 1:1 mapping. This will enable sending packets +on the same queue associations for transmit and receive. This is useful for +busy polling multi-threaded workloads where there are challenges in +associating a given CPU to a given application thread. The application +threads are not pinned to CPUs and each thread handles packets +received on a single queue. The receive queue number is cached in the +socket for the connection. In this model, sending the packets on the same +transmit queue corresponding to the associated receive queue has benefits +in keeping the CPU overhead low. Transmit completion work is locked into +the same queue-association that a given application is polling on. This +avoids the overhead of triggering an interrupt on another CPU. When the +application cleans up the packets during the busy poll, transmit completion +may be processed along with it in the same thread context and so result in +reduced latency. + +XPS is configured per transmit queue by setting a bitmap of +CPUs/receive-queues that may use that queue to transmit. The reverse +mapping, from CPUs to transmit queues or from receive-queues to transmit +queues, is computed and maintained for each network device. When +transmitting the first packet in a flow, the function get_xps_queue() is +called to select a queue. This function uses the ID of the receive queue +for the socket connection for a match in the receive queue-to-transmit queue +lookup table. Alternatively, this function can also use the ID of the +running CPU as a key into the CPU-to-queue lookup table. If the ID matches a single queue, that is used for transmission. If multiple queues match, one is selected by using the flow hash to compute an index -into the set. +into the set. When selecting the transmit queue based on receive queue(s) +map, the transmit device is not validated against the receive device as it +requires expensive lookup operation in the datapath. The queue chosen for transmitting a particular flow is saved in the corresponding socket structure for the flow (e.g. a TCP connection). @@ -404,11 +434,15 @@ acknowledged. XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by default for SMP). The functionality remains disabled until explicitly -configured. To enable XPS, the bitmap of CPUs that may use a transmit -queue is configured using the sysfs file entry: +configured. To enable XPS, the bitmap of CPUs/receive-queues that may +use a transmit queue is configured using the sysfs file entry: +For selection based on CPUs map: /sys/class/net/<dev>/queues/tx-<n>/xps_cpus +For selection based on receive-queues map: +/sys/class/net/<dev>/queues/tx-<n>/xps_rxqs + == Suggested Configuration For a network device with a single transmission queue, XPS configuration @@ -421,6 +455,11 @@ best CPUs to share a given queue are probably those that share the cache with the CPU that processes transmit completions for that queue (transmit interrupts). +For transmit queue selection based on receive queue(s), XPS has to be +explicitly configured mapping receive-queue(s) to transmit queue(s). If the +user configuration for receive-queue map does not apply, then the transmit +queue is selected based on the CPUs map. + Per TX Queue rate limitation: ============================= |