summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
authorDavid Arinzon <darinzon@amazon.com>2023-06-12 15:14:48 +0300
committerJakub Kicinski <kuba@kernel.org>2023-06-16 08:45:47 +0300
commitf7d625adeb7bc6a9ec83d32d9615889969d64484 (patch)
tree6771c7b35455f5bde6a5ee4c43aa9c5627b2dba6 /Documentation/networking
parente1d001fa5b477c4da46a29be1fcece91db7c7c6f (diff)
downloadlinux-f7d625adeb7bc6a9ec83d32d9615889969d64484.tar.xz
net: ena: Add dynamic recycling mechanism for rx buffers
The current implementation allocates page-sized rx buffers. As traffic may consist of different types and sizes of packets, in various cases, buffers are not fully used. This change (Dynamic RX Buffers - DRB) uses part of the allocated rx page needed for the incoming packet, and returns the rest of the unused page to be used again as an rx buffer for future packets. A threshold of 2K for unused space has been set in order to declare whether the remainder of the page can be reused again as an rx buffer. As a page may be reused, dma_sync_single_for_cpu() is added in order to sync the memory to the CPU side after it was owned by the HW. In addition, when the rx page can no longer be reused, it is being unmapped using dma_page_unmap(), which implicitly syncs and then unmaps the entire page. In case the kernel still handles the skbs pointing to the previous buffers from that rx page, it may access garbage pointers, caused by the implicit sync overwriting them. The implicit dma sync is removed by replacing dma_page_unmap() with dma_unmap_page_attrs() with DMA_ATTR_SKIP_CPU_SYNC flag. The functionality is disabled for XDP traffic to avoid handling several descriptors per packet. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20230612121448.28829-1-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/device_drivers/ethernet/amazon/ena.rst32
1 files changed, 32 insertions, 0 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
index 8bcb173e0353..491492677632 100644
--- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
+++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
@@ -205,6 +205,7 @@ Adaptive coalescing can be switched on/off through `ethtool(8)`'s
More information about Adaptive Interrupt Moderation (DIM) can be found in
Documentation/networking/net_dim.rst
+.. _`RX copybreak`:
RX copybreak
============
The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
@@ -315,3 +316,34 @@ Rx
- The new SKB is updated with the necessary information (protocol,
checksum hw verify result, etc), and then passed to the network
stack, using the NAPI interface function :code:`napi_gro_receive()`.
+
+Dynamic RX Buffers (DRB)
+------------------------
+
+Each RX descriptor in the RX ring is a single memory page (which is either 4KB
+or 16KB long depending on system's configurations).
+To reduce the memory allocations required when dealing with a high rate of small
+packets, the driver tries to reuse the remaining RX descriptor's space if more
+than 2KB of this page remain unused.
+
+A simple example of this mechanism is the following sequence of events:
+
+::
+
+ 1. Driver allocates page-sized RX buffer and passes it to hardware
+ +----------------------+
+ |4KB RX Buffer |
+ +----------------------+
+
+ 2. A 300Bytes packet is received on this buffer
+
+ 3. The driver increases the ref count on this page and returns it back to
+ HW as an RX buffer of size 4KB - 300Bytes = 3796 Bytes
+ +----+--------------------+
+ |****|3796 Bytes RX Buffer|
+ +----+--------------------+
+
+This mechanism isn't used when an XDP program is loaded, or when the
+RX packet is less than rx_copybreak bytes (in which case the packet is
+copied out of the RX buffer into the linear part of a new skb allocated
+for it and the RX buffer remains the same size, see `RX copybreak`_).