<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/net/ethernet/amazon, branch v6.1.87</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.87</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.87'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-04-17T09:18:26+00:00</updated>
<entry>
<title>net: ena: Fix incorrect descriptor free behavior</title>
<updated>2024-04-17T09:18:26+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2024-04-10T09:13:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=19ff8fed3338898b70b2aad831386c78564912e1'/>
<id>urn:sha1:19ff8fed3338898b70b2aad831386c78564912e1</id>
<content type='text'>
[ Upstream commit bf02d9fe00632d22fa91d34749c7aacf397b6cde ]

ENA has two types of TX queues:
- queues which only process TX packets arriving from the network stack
- queues which only process TX packets forwarded to it by XDP_REDIRECT
  or XDP_TX instructions

The ena_free_tx_bufs() cycles through all descriptors in a TX queue
and unmaps + frees every descriptor that hasn't been acknowledged yet
by the device (uncompleted TX transactions).
The function assumes that the processed TX queue is necessarily from
the first category listed above and ends up using napi_consume_skb()
for descriptors belonging to an XDP specific queue.

This patch solves a bug in which, in case of a VF reset, the
descriptors aren't freed correctly, leading to crashes.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin &lt;shayagr@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Wrong missing IO completions check order</title>
<updated>2024-04-17T09:18:26+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2024-04-10T09:13:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7d44e12efb7d777484feea2818d99d68224b3180'/>
<id>urn:sha1:7d44e12efb7d777484feea2818d99d68224b3180</id>
<content type='text'>
[ Upstream commit f7e417180665234fdb7af2ebe33d89aaa434d16f ]

Missing IO completions check is called every second (HZ jiffies).
This commit fixes several issues with this check:

1. Duplicate queues check:
   Max of 4 queues are scanned on each check due to monitor budget.
   Once reaching the budget, this check exits under the assumption that
   the next check will continue to scan the remainder of the queues,
   but in practice, next check will first scan the last already scanned
   queue which is not necessary and may cause the full queue scan to
   last a couple of seconds longer.
   The fix is to start every check with the next queue to scan.
   For example, on 8 IO queues:
   Bug: [0,1,2,3], [3,4,5,6], [6,7]
   Fix: [0,1,2,3], [4,5,6,7]

2. Unbalanced queues check:
   In case the number of active IO queues is not a multiple of budget,
   there will be checks which don't utilize the full budget
   because the full scan exits when reaching the last queue id.
   The fix is to run every TX completion check with exact queue budget
   regardless of the queue id.
   For example, on 7 IO queues:
   Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
   Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
   The budget may be lowered in case the number of IO queues is less
   than the budget (4) to make sure there are no duplicate queues on
   the same check.
   For example, on 3 IO queues:
   Bug: [0,1,2,0], [1,2,0,1]
   Fix: [0,1,2], [0,1,2]

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Amit Bernstein &lt;amitbern@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Fix potential sign extension issue</title>
<updated>2024-04-17T09:18:26+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2024-04-10T09:13:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4dea83d483d574645763440aaf50f186a17bfbd6'/>
<id>urn:sha1:4dea83d483d574645763440aaf50f186a17bfbd6</id>
<content type='text'>
[ Upstream commit 713a85195aad25d8a26786a37b674e3e5ec09e3c ]

Small unsigned types are promoted to larger signed types in
the case of multiplication, the result of which may overflow.
In case the result of such a multiplication has its MSB
turned on, it will be sign extended with '1's.
This changes the multiplication result.

Code example of the phenomenon:
-------------------------------
u16 x, y;
size_t z1, z2;

x = y = 0xffff;
printk("x=%x y=%x\n",x,y);

z1 = x*y;
z2 = (size_t)x*y;

printk("z1=%lx z2=%lx\n", z1, z2);

Output:
-------
x=ffff y=ffff
z1=fffffffffffe0001 z2=fffe0001

The expected result of ffff*ffff is fffe0001, and without the
explicit casting to avoid the unwanted sign extension we got
fffffffffffe0001.

This commit adds an explicit casting to avoid the sign extension
issue.

Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
Signed-off-by: Arthur Kiyanovski &lt;akiyano@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Remove ena_select_queue</title>
<updated>2024-03-26T22:20:36+00:00</updated>
<author>
<name>Kamal Heib</name>
<email>kheib@redhat.com</email>
</author>
<published>2024-02-15T22:31:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4c51575705d2c2bc7d04be26079e204ee5be6f23'/>
<id>urn:sha1:4c51575705d2c2bc7d04be26079e204ee5be6f23</id>
<content type='text'>
[ Upstream commit 78e886ba2b549945ecada055ee0765f0ded5707a ]

Avoid the following warnings by removing the ena_select_queue() function
and rely on the net core to do the queue selection, The issue happen
when an skb received from an interface with more queues than ena is
forwarded to the ena interface.

[ 1176.159959] eth0 selects TX queue 11, but real number of TX queues is 8
[ 1176.863976] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1180.767877] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1188.703742] eth0 selects TX queue 14, but real number of TX queues is 8

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Kamal Heib &lt;kheib@redhat.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Fix XDP redirection error</title>
<updated>2023-12-20T16:00:18+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2023-12-11T06:28:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63387fe87fc5a429175a2f0dda426f650666b7c5'/>
<id>urn:sha1:63387fe87fc5a429175a2f0dda426f650666b7c5</id>
<content type='text'>
[ Upstream commit 4ab138ca0a340e6d6e7a6a9bd5004bd8f83127ca ]

When sending TX packets, the meta descriptor can be all zeroes
as no meta information is required (as in XDP).

This patch removes the validity check, as when
`disable_meta_caching` is enabled, such TX packets will be
dropped otherwise.

Fixes: 0e3a3f6dacf0 ("net: ena: support new LLQ acceleration mode")
Signed-off-by: Shay Agroskin &lt;shayagr@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Link: https://lore.kernel.org/r/20231211062801.27891-5-darinzon@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Fix xdp drops handling due to multibuf packets</title>
<updated>2023-12-20T16:00:18+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2023-12-11T06:27:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2664b56420b3c9b87a9fc4e09be1cf6e609abb3d'/>
<id>urn:sha1:2664b56420b3c9b87a9fc4e09be1cf6e609abb3d</id>
<content type='text'>
[ Upstream commit 505b1a88d311ff6f8c44a34f94e3be21745cce6f ]

Current xdp code drops packets larger than ENA_XDP_MAX_MTU.
This is an incorrect condition since the problem is not the
size of the packet, rather the number of buffers it contains.

This commit:

1. Identifies and drops XDP multi-buffer packets at the
   beginning of the function.
2. Increases the xdp drop statistic when this drop occurs.
3. Adds a one-time print that such drops are happening to
   give better indication to the user.

Fixes: 838c93dc5449 ("net: ena: implement XDP drop support")
Signed-off-by: Arthur Kiyanovski &lt;akiyano@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Link: https://lore.kernel.org/r/20231211062801.27891-3-darinzon@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Destroy correct number of xdp queues upon failure</title>
<updated>2023-12-20T16:00:18+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2023-12-11T06:27:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e312eed27abaf2bbb492e8cde56dde9bacdacce0'/>
<id>urn:sha1:e312eed27abaf2bbb492e8cde56dde9bacdacce0</id>
<content type='text'>
[ Upstream commit 41db6f99b5489a0d2ef26afe816ef0c6118d1d47 ]

The ena_setup_and_create_all_xdp_queues() function freed all the
resources upon failure, after creating only xdp_num_queues queues,
instead of freeing just the created ones.

In this patch, the only resources that are freed, are the ones
allocated right before the failure occurs.

Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shahar Itzko &lt;itzko@amazon.com&gt;
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Link: https://lore.kernel.org/r/20231211062801.27891-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Flush XDP packets on error.</title>
<updated>2023-10-06T12:56:42+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2023-09-18T15:36:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=26f1829c853860b60cf52175c875ad501e875dfd'/>
<id>urn:sha1:26f1829c853860b60cf52175c875ad501e875dfd</id>
<content type='text'>
[ Upstream commit 6f411fb5ca9419090bee6a0a46425e0a5060b734 ]

xdp_do_flush() should be invoked before leaving the NAPI poll function
after a XDP-redirect. This is not the case if the driver leaves via
the error path (after having a redirect in one of its previous
iterations).

Invoke xdp_do_flush() also in the error path.

Cc: Arthur Kiyanovski &lt;akiyano@amazon.com&gt;
Cc: David Arinzon &lt;darinzon@amazon.com&gt;
Cc: Noam Dagan &lt;ndagan@amazon.com&gt;
Cc: Saeed Bishara &lt;saeedb@amazon.com&gt;
Cc: Shay Agroskin &lt;shayagr@amazon.com&gt;
Fixes: a318c70ad152b ("net: ena: introduce XDP redirect implementation")
Acked-by: Arthur Kiyanovski &lt;akiyano@amazon.com&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Acked-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ena: fix shift-out-of-bounds in exponential backoff</title>
<updated>2023-07-23T11:49:44+00:00</updated>
<author>
<name>Krister Johansen</name>
<email>kjlx@templeofstupid.com</email>
</author>
<published>2023-07-11T01:36:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90947ebf8794e3c229fb2e16e37f1bfea6877f14'/>
<id>urn:sha1:90947ebf8794e3c229fb2e16e37f1bfea6877f14</id>
<content type='text'>
commit 1e9cb763e9bacf0c932aa948f50dcfca6f519a26 upstream.

The ENA adapters on our instances occasionally reset.  Once recently
logged a UBSAN failure to console in the process:

  UBSAN: shift-out-of-bounds in build/linux/drivers/net/ethernet/amazon/ena/ena_com.c:540:13
  shift exponent 32 is too large for 32-bit type 'unsigned int'
  CPU: 28 PID: 70012 Comm: kworker/u72:2 Kdump: loaded not tainted 5.15.117
  Hardware name: Amazon EC2 c5d.9xlarge/, BIOS 1.0 10/16/2017
  Workqueue: ena ena_fw_reset_device [ena]
  Call Trace:
  &lt;TASK&gt;
  dump_stack_lvl+0x4a/0x63
  dump_stack+0x10/0x16
  ubsan_epilogue+0x9/0x36
  __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e
  ? __const_udelay+0x43/0x50
  ena_delay_exponential_backoff_us.cold+0x16/0x1e [ena]
  wait_for_reset_state+0x54/0xa0 [ena]
  ena_com_dev_reset+0xc8/0x110 [ena]
  ena_down+0x3fe/0x480 [ena]
  ena_destroy_device+0xeb/0xf0 [ena]
  ena_fw_reset_device+0x30/0x50 [ena]
  process_one_work+0x22b/0x3d0
  worker_thread+0x4d/0x3f0
  ? process_one_work+0x3d0/0x3d0
  kthread+0x12a/0x150
  ? set_kthread_struct+0x50/0x50
  ret_from_fork+0x22/0x30
  &lt;/TASK&gt;

Apparently, the reset delays are getting so large they can trigger a
UBSAN panic.

Looking at the code, the current timeout is capped at 5000us.  Using a
base value of 100us, the current code will overflow after (1&lt;&lt;29).  Even
at values before 32, this function wraps around, perhaps
unintentionally.

Cap the value of the exponent used for this backoff at (1&lt;&lt;16) which is
larger than currently necessary, but large enough to support bigger
values in the future.

Cc: stable@vger.kernel.org
Fixes: 4bb7f4cf60e3 ("net: ena: reduce driver load time")
Signed-off-by: Krister Johansen &lt;kjlx@templeofstupid.com&gt;
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Reviewed-by: Shay Agroskin &lt;shayagr@amazon.com&gt;
Link: https://lore.kernel.org/r/20230711013621.GE1926@templeofstupid.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: ena: Update NUMA TPH hint register upon NUMA node update</title>
<updated>2023-01-12T11:02:19+00:00</updated>
<author>
<name>David Arinzon</name>
<email>darinzon@amazon.com</email>
</author>
<published>2022-12-29T07:30:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=191c4b94c18ba2c617475d35596cea03bd530376'/>
<id>urn:sha1:191c4b94c18ba2c617475d35596cea03bd530376</id>
<content type='text'>
[ Upstream commit a8ee104f986e720cea52133885cc822d459398c7 ]

The device supports a PCIe optimization hint, which indicates on
which NUMA the queue is currently processed. This hint is utilized
by PCIe in order to reduce its access time by accessing the
correct NUMA resources and maintaining cache coherence.

The driver calls the register update for the hint (called TPH -
TLP Processing Hint) during the NAPI loop.

Though the update is expected upon a NUMA change (when a queue
is moved from one NUMA to the other), the current logic performs
a register update when the queue is moved to a different CPU,
but the CPU is not necessarily in a different NUMA.

The changes include:
1. Performing the TPH update only when the queue has switched
a NUMA node.
2. Moving the TPH update call to be triggered only when NAPI was
scheduled from interrupt context, as opposed to a busy-polling loop.
This is due to the fact that during busy-polling, the frequency
of CPU switches for a particular queue is significantly higher,
thus, the likelihood to switch NUMA is much higher. Therefore,
providing the frequent updates to the device upon a NUMA update
are unlikely to be beneficial.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: David Arinzon &lt;darinzon@amazon.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
