<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/testing/selftests/drivers/net, branch v6.7.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.7.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.7.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-02-01T00:21:10+00:00</updated>
<entry>
<title>selftests: bonding: do not test arp/ns target with mode balance-alb/tlb</title>
<updated>2024-02-01T00:21:10+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2024-01-23T07:59:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=70253e69f1dc26d3e5dec6e7325c3ec1ce7870a9'/>
<id>urn:sha1:70253e69f1dc26d3e5dec6e7325c3ec1ce7870a9</id>
<content type='text'>
[ Upstream commit a2933a8759a62269754e54733d993b19de870e84 ]

The prio_arp/ns tests hard code the mode to active-backup. At the same
time, The balance-alb/tlb modes do not support arp/ns target. So remove
the prio_arp/ns tests from the loop and only test active-backup mode.

Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests")
Reported-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>selftests: netdevsim: fix the udp_tunnel_nic test</title>
<updated>2024-02-01T00:21:09+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2024-01-23T06:05:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1b62822f66443fb5b0b60d342fd91b7cf2b0d7e2'/>
<id>urn:sha1:1b62822f66443fb5b0b60d342fd91b7cf2b0d7e2</id>
<content type='text'>
[ Upstream commit 0879020a7817e7ce636372c016b4528f541c9f4d ]

This test is missing a whole bunch of checks for interface
renaming and one ifup. Presumably it was only used on a system
with renaming disabled and NetworkManager running.

Fixes: 91f430b2c49d ("selftests: net: add a test for UDP tunnel info infra")
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://lore.kernel.org/r/20240123060529.1033912-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>selftests: bonding: Increase timeout to 1200s</title>
<updated>2024-02-01T00:21:05+00:00</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@nvidia.com</email>
</author>
<published>2024-01-18T00:12:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=abc5830a0baf3ba282fa04accc275ec53029ade7'/>
<id>urn:sha1:abc5830a0baf3ba282fa04accc275ec53029ade7</id>
<content type='text'>
[ Upstream commit b01f15a7571b7aa222458bc9bf26ab59bd84e384 ]

When tests are run by runner.sh, bond_options.sh gets killed before
it can complete:

make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
	[...]
	# timeout set to 120
	# selftests: drivers/net/bonding: bond_options.sh
	# TEST: prio (active-backup miimon primary_reselect 0)                [ OK ]
	# TEST: prio (active-backup miimon primary_reselect 1)                [ OK ]
	# TEST: prio (active-backup miimon primary_reselect 2)                [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 0)         [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 1)         [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 2)         [ OK ]
	#
	not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds

This test includes many sleep statements, at least some of which are
related to timers in the operation of the bonding driver itself. Increase
the test timeout to allow the test to complete.

I ran the test in slightly different VMs (including one without HW
virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.

Fixes: 42a8d4aaea84 ("selftests: bonding: add bonding prio option test")
Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Reviewed-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes</title>
<updated>2024-01-25T23:45:31+00:00</updated>
<author>
<name>Amit Cohen</name>
<email>amcohen@nvidia.com</email>
</author>
<published>2024-01-17T15:04:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ff21a0e219bf16c348ad89ba004b24a0c56dbece'/>
<id>urn:sha1:ff21a0e219bf16c348ad89ba004b24a0c56dbece</id>
<content type='text'>
[ Upstream commit b34f4de6d30cbaa8fed905a5080b6eace8c84dc7 ]

'qos_pfc' test checks PFC behavior. The idea is to limit the traffic
using a shaper somewhere in the flow of the packets. In this area, the
buffer is smaller than the buffer at the beginning of the flow, so it fills
up until there is no more space left. The test configures there PFC
which is supposed to notice that the headroom is filling up and send PFC
Xoff to indicate the transmitter to stop sending traffic for the priorities
sharing this PG.

The Xon/Xoff threshold is auto-configured and always equal to
2*(MTU rounded up to cell size). Even after sending the PFC Xoff packet,
traffic will keep arriving until the transmitter receives and processes
the PFC packet. This amount of traffic is known as the PFC delay allowance.

Currently the buffer for the delay traffic is configured as 100KB. The
MTU in the test is 10KB, therefore the threshold for Xoff is about 20KB.
This allows 80KB extra to be stored in this buffer.

8-lane ports use two buffers among which the configured buffer is split,
the Xoff threshold then applies to each buffer in parallel.

The test does not take into account the behavior of 8-lane ports, when the
ports are configured to 400Gbps with 8 lanes or 800Gbps with 8 lanes,
packets are dropped and the test fails.

Check if the relevant ports use 8 lanes, in such case double the size of
the buffer, as the headroom is split half-half.

Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Amit Cohen &lt;amcohen@nvidia.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/23ff11b7dff031eb04a41c0f5254a2b636cd8ebb.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mlxsw: spectrum_acl_tcam: Fix stack corruption</title>
<updated>2024-01-25T23:45:31+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@nvidia.com</email>
</author>
<published>2024-01-17T15:04:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a361c2c1da5dbb13ca67601cf961ab3ad68af383'/>
<id>urn:sha1:a361c2c1da5dbb13ca67601cf961ab3ad68af383</id>
<content type='text'>
[ Upstream commit 483ae90d8f976f8339cf81066312e1329f2d3706 ]

When tc filters are first added to a net device, the corresponding local
port gets bound to an ACL group in the device. The group contains a list
of ACLs. In turn, each ACL points to a different TCAM region where the
filters are stored. During forwarding, the ACLs are sequentially
evaluated until a match is found.

One reason to place filters in different regions is when they are added
with decreasing priorities and in an alternating order so that two
consecutive filters can never fit in the same region because of their
key usage.

In Spectrum-2 and newer ASICs the firmware started to report that the
maximum number of ACLs in a group is more than 16, but the layout of the
register that configures ACL groups (PAGT) was not updated to account
for that. It is therefore possible to hit stack corruption [1] in the
rare case where more than 16 ACLs in a group are required.

Fix by limiting the maximum ACL group size to the minimum between what
the firmware reports and the maximum ACLs that fit in the PAGT register.

Add a test case to make sure the machine does not crash when this
condition is hit.

[1]
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120
[...]
 dump_stack_lvl+0x36/0x50
 panic+0x305/0x330
 __stack_chk_fail+0x15/0x20
 mlxsw_sp_acl_tcam_group_update+0x116/0x120
 mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110
 mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20
 mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
 mlxsw_sp_acl_rule_add+0x47/0x240
 mlxsw_sp_flower_replace+0x1a9/0x1d0
 tc_setup_cb_add+0xdc/0x1c0
 fl_hw_replace_filter+0x146/0x1f0
 fl_change+0xc17/0x1360
 tc_new_tfilter+0x472/0xb90
 rtnetlink_rcv_msg+0x313/0x3b0
 netlink_rcv_skb+0x58/0x100
 netlink_unicast+0x244/0x390
 netlink_sendmsg+0x1e4/0x440
 ____sys_sendmsg+0x164/0x260
 ___sys_sendmsg+0x9a/0xe0
 __sys_sendmsg+0x7a/0xc0
 do_syscall_64+0x40/0xe0
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Reported-by: Orel Hagag &lt;orelh@nvidia.com&gt;
Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: Amit Cohen &lt;amcohen@nvidia.com&gt;
Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/2d91c89afba59c22587b444994ae419dbea8d876.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure</title>
<updated>2024-01-25T23:45:30+00:00</updated>
<author>
<name>Amit Cohen</name>
<email>amcohen@nvidia.com</email>
</author>
<published>2024-01-17T15:04:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1a720f3ec16e510ff1e0d384119475bc0bdc4e26'/>
<id>urn:sha1:1a720f3ec16e510ff1e0d384119475bc0bdc4e26</id>
<content type='text'>
[ Upstream commit 6d6eeabcfaba2fcadf5443b575789ea606f9de83 ]

Lately, a bug was found when many TC filters are added - at some point,
several bugs are printed to dmesg [1] and the switch is crashed with
segmentation fault.

The issue starts when gen_pool_free() fails because of unexpected
behavior - a try to free memory which is already freed, this leads to BUG()
call which crashes the switch and makes many other bugs.

Trying to track down the unexpected behavior led to a bug in eRP code. The
function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated
index, sets the value and returns an error code. When gen_pool_alloc()
fails it returns address 0, we track it and return -ENOBUFS outside, BUT
the call for gen_pool_alloc() already override the index in erp_table
structure. This is a problem when such allocation is done as part of
table expansion. This is not a new table, which will not be used in case
of allocation failure. We try to expand eRP table and override the
current index (non-zero) with zero. Then, it leads to an unexpected
behavior when address 0 is freed twice. Note that address 0 is valid in
erp_table-&gt;base_index and indeed other tables use it.

gen_pool_alloc() fails in case that there is no space left in the
pre-allocated pool, in our case, the pool is limited to
ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than max
erp entries are required, we exceed the limit and return an error, this
error leads to "Failed to migrate vregion" print.

Fix this by changing erp_table-&gt;base_index only in case of a successful
allocation.

Add a test case for such a scenario. Without this fix it causes
segmentation fault:

$ TESTS="max_erp_entries_test" ./tc_flower.sh
./tc_flower.sh: line 988:  1560 Segmentation fault      tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &amp;&gt;/dev/null

[1]:
kernel BUG at lib/genalloc.c:508!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1
Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021
RIP: 0010:gen_pool_free_owner+0xc9/0xe0
...
Call Trace:
 &lt;TASK&gt;
 __mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum]
 mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum]
 objagg_obj_root_destroy+0x18/0x80 [objagg]
 objagg_obj_destroy+0x12c/0x130 [objagg]
 mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum]
 mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum]
 mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum]
 mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum]
 mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum]
 mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum]
 tc_setup_cb_destroy+0xc1/0x180
 fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
 __fl_delete+0x1ac/0x1c0 [cls_flower]
 fl_destroy+0xc2/0x150 [cls_flower]
 tcf_proto_destroy+0x1a/0xa0
...
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion

Fixes: f465261aa105 ("mlxsw: spectrum_acl: Implement common eRP core")
Signed-off-by: Amit Cohen &lt;amcohen@nvidia.com&gt;
Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>selftests: bonding: Change script interpreter</title>
<updated>2024-01-25T23:45:28+00:00</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@nvidia.com</email>
</author>
<published>2024-01-10T14:14:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2b973b5ba8792110dc047ad8ed22d87597df9797'/>
<id>urn:sha1:2b973b5ba8792110dc047ad8ed22d87597df9797</id>
<content type='text'>
[ Upstream commit c2518da8e6b0e248cfff1d4b6682e14020bd4d3f ]

The tests changed by this patch, as well as the scripts they source, use
features which are not part of POSIX sh (ex. 'source' and 'local'). As a
result, these tests fail when /bin/sh is dash such as on Debian. Change the
interpreter to bash so that these tests can run successfully.

Fixes: d43eff0b85ae ("selftests: bonding: up/down delay w/ slave link flapping")
Tested-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Reviewed-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Reviewed-by: Petr Machata &lt;petrm@nvidia.com&gt;
Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>selftests: bonding: do not set port down when adding to bond</title>
<updated>2024-01-02T14:17:05+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2023-12-23T12:59:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=61fa2493ca76fd7bb74e13f0205274f4ab0aa696'/>
<id>urn:sha1:61fa2493ca76fd7bb74e13f0205274f4ab0aa696</id>
<content type='text'>
Similar to commit be809424659c ("selftests: bonding: do not set port down
before adding to bond"). The bond-arp-interval-causes-panic test failed
after commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing
it up") as the kernel will set the port down _after_ adding to bond if setting
port down specifically.

Fix it by removing the link down operation when adding to bond.

Fixes: 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Tested-by: Benjamin Poirier &lt;benjamin.poirier@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>selftests: netdevsim: use suitable existing dummy file for flash test</title>
<updated>2023-10-13T09:43:13+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@nvidia.com</email>
</author>
<published>2023-10-10T14:44:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6151ff9c75210c0f9d3d9ddcd3de6325de12c2a0'/>
<id>urn:sha1:6151ff9c75210c0f9d3d9ddcd3de6325de12c2a0</id>
<content type='text'>
The file name used in flash test was "dummy" because at the time test
was written, drivers were responsible for file request and as netdevsim
didn't do that, name was unused. However, the file load request is
now done in devlink code and therefore the file has to exist.
Use first random file from /lib/firmware for this purpose.

Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>selftests: bonding: create directly devices in the target namespaces</title>
<updated>2023-08-28T09:24:08+00:00</updated>
<author>
<name>Zhengchao Shao</name>
<email>shaozhengchao@huawei.com</email>
</author>
<published>2023-08-26T02:23:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf68583624c56bab26b4394fb4721461ded62f94'/>
<id>urn:sha1:bf68583624c56bab26b4394fb4721461ded62f94</id>
<content type='text'>
If failed to set link1_1 to netns client, we should delete link1_1 in the
cleanup path. But if set link1_1 to netns client successfully, delete
link1_1 will report warning. So it will be safer creating directly the
devices in the target namespaces.

Reported-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Closes: https://lore.kernel.org/all/ZNyJx1HtXaUzOkNA@Laptop-X1/
Signed-off-by: Zhengchao Shao &lt;shaozhengchao@huawei.com&gt;
Acked-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
