kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-07-10	net/mlx5: DR, Remove definer functions from SW Steering API	Yevgeny Kliteynik	2	-5/+5
	No need to expose definer get/put functions as part of SW Steering API - they are internal functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20240708080025.1593555-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	Merge branch 'mlxsw-improvements'	Jakub Kicinski	3	-26/+35
	Petr Machata says: ==================== mlxsw: Improvements This patchset contains assortments of improvements to the mlxsw driver. Please see individual patches for details. ==================== Link: https://patch.msgid.link/cover.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	mlxsw: pci: Lock configuration space of upstream bridge during reset	Ido Schimmel	1	-0/+6
	The driver triggers a "Secondary Bus Reset" (SBR) by calling __pci_reset_function_locked() which asserts the SBR bit in the "Bridge Control Register" in the configuration space of the upstream bridge for 2ms. This is done without locking the configuration space of the upstream bridge port, allowing user space to access it concurrently. Linux 6.11 will start warning about such unlocked resets [1][2]: pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0 Avoid the warning and the concurrent access by locking the configuration space of the upstream bridge prior to the reset and unlocking it afterwards. [1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/ [2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/9937b0afdb50f2f2825945393c94c093c04a5897.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	mlxsw: core_thermal: Report valid current state during cooling device ↵	Ido Schimmel	1	-26/+25
	registration Commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") changed the thermal core to read the current state of the cooling device as part of the cooling device's registration. This is incompatible with the current implementation of the cooling device operations in mlxsw, leading to initialization failure with errors such as: mlxsw_spectrum 0000:01:00.0: Failed to register cooling device mlxsw_spectrum 0000:01:00.0: cannot register bus device The reason for the failure is that when the get current state operation is invoked the driver tries to derive the index of the cooling device by walking a per thermal zone array and looking for the matching cooling device pointer. However, the pointer is returned from the registration function and therefore only set in the array after the registration. The issue was later fixed by commit 1af89dedc8a5 ("thermal: core: Do not fail cdev registration because of invalid initial state") by not failing the registration of the cooling device if it cannot report a valid current state during registration, although drivers are responsible for ensuring that this will not happen. Therefore, make sure the driver is able to report a valid current state for the cooling device during registration by passing to the registration function a per cooling device private data that already has the cooling device index populated. While at it, call thermal_cooling_device_unregister() unconditionally since the function returns immediately if the cooling device pointer is NULL. Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/c823c4678b6b7afb902c35b3551c81a053afd110.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	mlxsw: Warn about invalid accesses to array fields	Petr Machata	1	-0/+4
	A forgotten or buggy variable initialization can cause out-of-bounds access to a register or other item array field. For an overflow, such access would mangle adjacent parts of the register payload. For an underflow, due to all variables being unsigned, the access would likely trample unrelated memory. Since neither is correct, replace these accesses with accesses at the index of 0, and warn about the issue. Suggested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/b988fb265c2f6c1206fe12d5bfdcfa188b7672d1.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	Merge branch 'selftests-drv-net-rss_ctx-more-tests'	Jakub Kicinski	1	-25/+189
	Jakub Kicinski says: ==================== selftests: drv-net: rss_ctx: more tests Add a few more tests for RSS. v1: https://lore.kernel.org/all/20240705015725.680275-1-kuba@kernel.org/ ==================== Link: https://patch.msgid.link/20240708213627.226025-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	selftests: drv-net: rss_ctx: test flow rehashing without impacting traffic	Jakub Kicinski	1	-1/+31
	Some workloads may want to rehash the flows in response to an imbalance. Most effective way to do that is changing the RSS key. Check that changing the key does not cause link flaps or traffic disruption. Disrupting traffic for key update is not incorrect, but makes the key update unusable for rehashing under load. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	selftests: drv-net: rss_ctx: check behavior of indirection table resizing	Jakub Kicinski	1	-1/+36
	Some devices dynamically increase and decrease the size of the RSS indirection table based on the number of enabled queues. When that happens driver must maintain the balance of entries (preferably duplicating the smaller table). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	selftests: drv-net: rss_ctx: test queue changes vs user RSS config	Jakub Kicinski	1	-1/+80
	By default main RSS table should change to include all queues. When user sets a specific RSS config the driver should preserve it, even when queue count changes. Driver should refuse to deactivate queues used in the user-set RSS config. For additional contexts driver should still refuse to deactivate queues in use. Whether the contexts should get resized like context 0 when queue count increases is a bit unclear. I anticipate most drivers today don't do that. Since main use case for additional contexts is to set the indir table - it doesn't seem worthwhile to care about behavior of the default table too much. Don't test that. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	selftests: drv-net: rss_ctx: factor out send traffic and check	Jakub Kicinski	1	-19/+39
	Wrap up sending traffic and checking in which queues it landed in a helper. The method used for testing is to send a lot of iperf traffic and check which queues received the most packets. Those should be the queues where we expect iperf to land - either because we installed a filter for the port iperf uses, or we didn't and expect it to use context 0. Contexts get disjoint queue sets, but the main context (AKA context 0) may receive some background traffic (noise). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	selftests: drv-net: rss_ctx: fix cleanup in the basic test	Jakub Kicinski	1	-4/+4
	The basic test may fail without resetting the RSS indir table. Use the .exec() method to run cleanup early since we re-test with traffic that returning to default state works. While at it reformat the doc a tiny bit. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net: ti: icssg-prueth: add missing deps	Guillaume La Roque	1	-0/+1
	Add missing dependency on NET_SWITCHDEV. Fixes: abd5576b9c57 ("net: ti: icssg-prueth: Add support for ICSSG switch firmware") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guillaume La Roque <glaroque@baylibre.com> Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20240708-net-deps-v2-1-b22fb74da2a3@baylibre.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	dt-bindings: net: fsl,fman: add ptimer-handle property	Frank Li	1	-0/+4
	Add ptimer-handle property to link to ptp-timer node handle. Fix below warning: arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: fman@1a00000: 'ptimer-handle' do not match any of the regexes: '^ethernet@[a-f0-9]+$', '^mdio@[a-f0-9]+$', '^muram@[a-f0-9]+$', '^phc@[a-f0-9]+$', '^port@[a-f0-9]+$', 'pinctrl-[0-9]+' Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20240708180949.1898495-2-Frank.Li@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	dt-bindings: net: fsl,fman: allow dma-coherent property	Frank Li	1	-0/+2
	Add dma-coherent property to fix below warning. arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dtb: fman@1a00000: 'dma-coherent', 'ptimer-handle' do not match any of the regexes: '^ethernet@[a-f0-9]+$', '^mdio@[a-f0-9]+$', '^muram@[a-f0-9]+$', '^phc@[a-f0-9]+$', '^port@[a-f0-9]+$', 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/net/fsl,fman.yaml# Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20240708180949.1898495-1-Frank.Li@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net: tls: Pass union tls_crypto_context pointer to memzero_explicit	Simon Horman	1	-3/+6
	Pass union tls_crypto_context pointer, rather than struct tls_crypto_info pointer, to memzero_explicit(). The address of the pointer is the same before and after. But the new construct means that the size of the dereferenced pointer type matches the size being zeroed. Which aids static analysis. As reported by Smatch: .../tls_main.c:842 do_tls_setsockopt_conf() error: memzero_explicit() 'crypto_info' too small (4 vs 56) No functional change intended. Compile tested only. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240708-tls-memzero-v2-1-9694eaf31b79@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: forwarding: Make vxlan-bridge-1d pass on debug kernels	Ido Schimmel	1	-4/+4
	The ageing time used by the test is too short for debug kernels and results in entries being aged out prematurely [1]. Fix by increasing the ageing time. The same change was done for the VLAN-aware version of the test in commit dfbab74044be ("selftests: forwarding: Make vxlan-bridge-1q pass on debug kernels"). [1] # ./vxlan_bridge_1d.sh [...] # TEST: VXLAN: flood before learning [ OK ] # TEST: VXLAN: show learned FDB entry [ OK ] # TEST: VXLAN: learned FDB entry [FAIL] # veth3: Expected to capture 0 packets, got 4. # RTNETLINK answers: No such file or directory # TEST: VXLAN: deletion of learned FDB entry [ OK ] # TEST: VXLAN: Ageing of learned FDB entry [FAIL] # veth3: Expected to capture 0 packets, got 2. [...] Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240707095458.2870260-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	Merge tag 'for-netdev' of ↵	Paolo Abeni	127	-980/+4606
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-07-08 The following pull-request contains BPF updates for your net-next tree. We've added 102 non-merge commits during the last 28 day(s) which contain a total of 127 files changed, 4606 insertions(+), 980 deletions(-). The main changes are: 1) Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman. 2) Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs, from Daniel Xu. 3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement support for BPF exceptions, from Ilya Leoshkevich. 4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui. 5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option for skbs and add coverage along with it, from Vadim Fedorenko. 6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan. 7) Extend the BPF verifier to track the delta between linked registers in order to better deal with recent LLVM code optimizations, from Alexei Starovoitov. 8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should have been a pointer to the map value, from Benjamin Tissoires. 9) Extend BPF selftests to add regular expression support for test output matching and adjust some of the selftest when compiled under gcc, from Cupertino Miranda. 10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always iterates exactly once anyway, from Dan Carpenter. 11) Add the capability to offload the netfilter flowtable in XDP layer through kfuncs, from Florian Westphal & Lorenzo Bianconi. 12) Various cleanups in networking helpers in BPF selftests to shave off a few lines of open-coded functions on client/server handling, from Geliang Tang. 13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so that x86 JIT does not need to implement detection, from Leon Hwang. 14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an out-of-bounds memory access for dynpointers, from Matt Bobrowski. 15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as it might lead to problems on 32-bit archs, from Jiri Olsa. 16) Enhance traffic validation and dynamic batch size support in xsk selftests, from Tushar Vyavahare. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits) selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature bpf: helpers: fix bpf_wq_set_callback_impl signature libbpf: Add NULL checks to bpf_object__{prev_map,next_map} selftests/bpf: Remove exceptions tests from DENYLIST.s390x s390/bpf: Implement exceptions s390/bpf: Change seen_reg to a mask bpf: Remove unnecessary loop in task_file_seq_get_next() riscv, bpf: Optimize stack usage of trampoline bpf, devmap: Add .map_alloc_check selftests/bpf: Remove arena tests from DENYLIST.s390x selftests/bpf: Add UAF tests for arena atomics selftests/bpf: Introduce __arena_global s390/bpf: Support arena atomics s390/bpf: Enable arena s390/bpf: Support address space cast instruction s390/bpf: Support BPF_PROBE_MEM32 s390/bpf: Land on the next JITed instruction after exception s390/bpf: Introduce pre- and post- probe functions s390/bpf: Get rid of get_probe_mem_regno() ... ==================== Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	net: phy: microchip: lan937x: add support for 100BaseTX PHY	Oleksij Rempel	1	-1/+125
	Add support of 100BaseTX PHY build in to LAN9371 and LAN9372 switches. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Link: https://patch.msgid.link/20240706154201.1456098-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	udp: Remove duplicate included header file trace/events/udp.h	Thorsten Blum	1	-1/+0
	Remove duplicate included header file trace/events/udp.h and the following warning reported by make includecheck: trace/events/udp.h is included more than once Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240706071132.274352-2-thorsten.blum@toblux.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	net: tn40xx: add per queue netdev-genl stats support	FUJITA Tomonori	2	-2/+53
	Add support for the netdev-genl per queue stats API. ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope":"queue"}' [{'ifindex': 4, 'queue-id': 0, 'queue-type': 'rx', 'rx-alloc-fail': 0, 'rx-bytes': 266613, 'rx-packets': 3325}, {'ifindex': 4, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 142823367, 'tx-packets': 2387}] Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240706064324.137574-1-fujita.tomonori@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	sctp: Fix typos and improve comments	Thorsten Blum	1	-4/+4
	Fix typos s/steam/stream/ and spell out Schedule/Unschedule in the comments. Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240704202558.62704-2-thorsten.blum@toblux.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	l2tp: fix possible UAF when cleaning up tunnels	James Chapman	1	-4/+7
	syzbot reported a UAF caused by a race when the L2TP work queue closes a tunnel at the same time as a userspace thread closes a session in that tunnel. Tunnel cleanup is handled by a work queue which iterates through the sessions contained within a tunnel, and closes them in turn. Meanwhile, a userspace thread may arbitrarily close a session via either netlink command or by closing the pppox socket in the case of l2tp_ppp. The race condition may occur when l2tp_tunnel_closeall walks the list of sessions in the tunnel and deletes each one. Currently this is implemented using list_for_each_safe, but because the list spinlock is dropped in the loop body it's possible for other threads to manipulate the list during list_for_each_safe's list walk. This can lead to the list iterator being corrupted, leading to list_for_each_safe spinning. One sequence of events which may lead to this is as follows: * A tunnel is created, containing two sessions A and B. * A thread closes the tunnel, triggering tunnel cleanup via the work queue. * l2tp_tunnel_closeall runs in the context of the work queue. It removes session A from the tunnel session list, then drops the list lock. At this point the list_for_each_safe temporary variable is pointing to the other session on the list, which is session B, and the list can be manipulated by other threads since the list lock has been released. * Userspace closes session B, which removes the session from its parent tunnel via l2tp_session_delete. Since l2tp_tunnel_closeall has released the tunnel list lock, l2tp_session_delete is able to call list_del_init on the session B list node. * Back on the work queue, l2tp_tunnel_closeall resumes execution and will now spin forever on the same list entry until the underlying session structure is freed, at which point UAF occurs. The solution is to iterate over the tunnel's session list using list_first_entry_not_null to avoid the possibility of the list iterator pointing at a list item which may be removed during the walk. Also, have l2tp_tunnel_closeall ref each session while it processes it to prevent another thread from freeing it. cpu1 cpu2 --- --- pppol2tp_release() spin_lock_bh(&tunnel->list_lock); for (;;) { session = list_first_entry_or_null(&tunnel->session_list, struct l2tp_session, list); if (!session) break; list_del_init(&session->list); spin_unlock_bh(&tunnel->list_lock); l2tp_session_delete(session); l2tp_session_delete(session); spin_lock_bh(&tunnel->list_lock); } spin_unlock_bh(&tunnel->list_lock); Calling l2tp_session_delete on the same session twice isn't a problem per-se, but if cpu2 manages to destruct the socket and unref the session to zero before cpu1 progresses then it would lead to UAF. Reported-by: syzbot+b471b7c936301a59745b@syzkaller.appspotmail.com Reported-by: syzbot+c041b4ce3a6dfd1e63e2@syzkaller.appspotmail.com Fixes: d18d3f0a24fc ("l2tp: replace hlist with simple list for per-tunnel session list") Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: Tom Parkin <tparkin@katalix.com> Link: https://patch.msgid.link/20240704152508.1923908-1-jchapman@katalix.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09	Merge branch 'net-stmmac-qcom-ethqos-enable-2-5g-ethernet-on-sa8775p-ride'	Jakub Kicinski	1	-0/+34
	Bartosz Golaszewski says: ==================== net: stmmac: qcom-ethqos: enable 2.5G ethernet on sa8775p-ride Here are the changes required to enable 2.5G ethernet on sa8775p-ride. As advised by Andrew Lunn and Russell King, I am reusing the existing stmmac infrastructure to enable the SGMII loopback and so I dropped the patches adding new callbacks to the driver core. I also added more details to the commit message and made sure the workaround is only enabled on Rev 3 of the board (with AQR115C PHY). Also: dropped any mentions of the OCSGMII mode. v2: https://lore.kernel.org/20240627113948.25358-1-brgl@bgdev.pl v1: https://lore.kernel.org/20240619184550.34524-1-brgl@bgdev.pl ==================== Link: https://patch.msgid.link/20240703181500.28491-1-brgl@bgdev.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net: stmmac: qcom-ethqos: enable SGMII loopback during DMA reset on ↵	Bartosz Golaszewski	1	-0/+23
	sa8775p-ride-r3 On sa8775p-ride-r3 the RX clocks from the AQR115C PHY are not available at the time of the DMA reset. We can however extract the RX clock from the internal SERDES block. Once the link is up, we can revert to the previous state. The AQR115C PHY doesn't support in-band signalling so we can count on getting the link up notification and safely reuse existing callbacks which are already used by another HW quirk workaround which enables the functional clock to avoid a DMA reset due to timeout. Only enable loopback on revision 3 of the board - check the phy_mode to make sure. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240703181500.28491-3-brgl@bgdev.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net: stmmac: qcom-ethqos: add support for 2.5G BASEX mode	Bartosz Golaszewski	1	-0/+11
	Add support for 2.5G speed in 2500BASEX mode to the QCom ethqos driver. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240703181500.28491-2-brgl@bgdev.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net: page_pool: fix warning code	Johannes Berg	1	-1/+1
	WARN_ON_ONCE("string") doesn't really do what appears to be intended, so fix that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Fixes: 90de47f020db ("page_pool: fragment API support for 32-bit arch with 64-bit DMA") Link: https://patch.msgid.link/20240705134221.2f4de205caa1.I28496dc0f2ced580282d1fb892048017c4491e21@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-08	selftests: net: ksft: interrupt cleanly on KeyboardInterrupt	Jakub Kicinski	1	-1/+8
	It's very useful to be able to interrupt the tests during development. Detect KeyboardInterrupt, run the cleanups and exit. Link: https://patch.msgid.link/20240705015222.675840-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-08	selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep	Puranjay Mohan	1	-1/+0
	fexit_sleep test runs successfully now on the BPF CI so remove it from the deny list. ftrace direct calls was blocking tracing programs on arm64 but it has been resolved by now. For more details see also discussion in []. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240705145009.32340-1-puranjay@kernel.org []
2024-07-08	Merge branch 'small-api-fix-for-bpf_wq'	Alexei Starovoitov	4	-9/+18
	Benjamin Tissoires says: ==================== Small API fix for bpf_wq I realized this while having a map containing both a struct bpf_timer and a struct bpf_wq: the third argument provided to the bpf_wq callback is not the struct bpf_wq pointer itself, but the pointer to the value in the map. Which means that the users need to double cast the provided "value" as this is not a struct bpf_wq *. This is a change of API, but there doesn't seem to be much users of bpf_wq right now, so we should be able to go with this right now. Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> --- Changes in v2: - amended the selftests to retrieve something from the third argument of the callback - Link to v1: https://lore.kernel.org/r/20240705-fix-wq-v1-0-91b4d82cd825@kernel.org --- ==================== Link: https://lore.kernel.org/r/20240708-fix-wq-v2-0-667e5c9fbd99@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-08	selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature	Benjamin Tissoires	3	-8/+17
	See the previous patch: the API was wrong, we were provided the pointer to the value, not the actual struct bpf_wq *. Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Link: https://lore.kernel.org/r/20240708-fix-wq-v2-2-667e5c9fbd99@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-08	bpf: helpers: fix bpf_wq_set_callback_impl signature	Benjamin Tissoires	1	-1/+1
	I realized this while having a map containing both a struct bpf_timer and a struct bpf_wq: the third argument provided to the bpf_wq callback is not the struct bpf_wq pointer itself, but the pointer to the value in the map. Which means that the users need to double cast the provided "value" as this is not a struct bpf_wq *. This is a change of API, but there doesn't seem to be much users of bpf_wq right now, so we should be able to go with this right now. Fixes: 81f1d7a583fa ("bpf: wq: add bpf_wq_set_callback_impl") Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Link: https://lore.kernel.org/r/20240708-fix-wq-v2-1-667e5c9fbd99@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-08	libbpf: Add NULL checks to bpf_object__{prev_map,next_map}	Andreas Ziegler	1	-2/+2
	In the current state, an erroneous call to bpf_object__find_map_by_name(NULL, ...) leads to a segmentation fault through the following call chain: bpf_object__find_map_by_name(obj = NULL, ...) -> bpf_object__for_each_map(pos, obj = NULL) -> bpf_object__next_map((obj = NULL), NULL) -> return (obj = NULL)->maps While calling bpf_object__find_map_by_name with obj = NULL is obviously incorrect, this should not lead to a segmentation fault but rather be handled gracefully. As __bpf_map__iter already handles this situation correctly, we can delegate the check for the regular case there and only add a check in case the prev or next parameter is NULL. Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
2024-07-08	selftests/bpf: Remove exceptions tests from DENYLIST.s390x	Ilya Leoshkevich	1	-1/+0
	Now that the s390x JIT supports exceptions, remove the respective tests from the denylist. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-4-iii@linux.ibm.com
2024-07-08	s390/bpf: Implement exceptions	Ilya Leoshkevich	1	-2/+53
	Implement the following three pieces required from the JIT: - A "top-level" BPF prog (exception_boundary) must save all non-volatile registers, and not only the ones that it clobbers. - A "handler" BPF prog (exception_cb) must switch stack to that of exception_boundary, and restore the registers that exception_boundary saved. - arch_bpf_stack_walk() must unwind the stack and provide the results in a way that satisfies both bpf_throw() and exception_cb. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com
2024-07-08	s390/bpf: Change seen_reg to a mask	Ilya Leoshkevich	1	-16/+16
	Using a mask instead of an array saves a small amount of memory and allows marking multiple registers as seen with a simple "or". Another positive side-effect is that it speeds up verification with jitterbug. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com
2024-07-08	bpf: Remove unnecessary loop in task_file_seq_get_next()	Dan Carpenter	1	-6/+3
	After commit 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU") this loop always iterates exactly one time. Delete the for statement and pull the code in a tab. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/ZoWJF51D4zWb6f5t@stanley.mountain
2024-07-08	riscv, bpf: Optimize stack usage of trampoline	Puranjay Mohan	1	-1/+1
	When BPF_TRAMP_F_CALL_ORIG is not set, stack space for passing arguments on stack doesn't need to be reserved because the original function is not called. Only reserve space for stacked arguments when BPF_TRAMP_F_CALL_ORIG is set. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20240708114758.64414-1-puranjay@kernel.org
2024-07-08	net: dsa: microchip: lan9371/2: update MAC capabilities for port 4	Oleksij Rempel	2	-2/+10
	Set proper MAC capabilities for port 4 on LAN9371 and LAN9372 switches with integrated 100BaseTX PHY. And introduce the is_lan937x_tx_phy() function to reuse it where applicable. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-08	act_ct: prepare for stolen verdict coming from conntrack and nat engine	Florian Westphal	1	-6/+25
	At this time, conntrack either returns NF_ACCEPT or NF_DROP. To improve debuging it would be nice to be able to replace NF_DROP verdict with NF_DROP_REASON() helper, This helper releases the skb instantly (so drop_monitor can pinpoint exact location) and returns NF_STOLEN. Prepare call sites to deal with this before introducing such changes in conntrack and nat core. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-06	Merge branch 'net-pse-pd-add-new-pse-c33-features'	Jakub Kicinski	9	-30/+997
	Kory Maincent says: ==================== net: pse-pd: Add new PSE c33 features This patch series adds new c33 features to the PSE API. - Expand the PSE PI informations status with power, class and failure reason - Add the possibility to get and set the PSE PIs power limit v5: https://lore.kernel.org/r/20240628-feature_poe_power_cap-v5-0-5e1375d3817a@bootlin.com v4: https://lore.kernel.org/r/20240625-feature_poe_power_cap-v4-0-b0813aad57d5@bootlin.com v3: https://lore.kernel.org/r/20240614-feature_poe_power_cap-v3-0-a26784e78311@bootlin.com v2: https://lore.kernel.org/r/20240607-feature_poe_power_cap-v2-0-c03c2deb83ab@bootlin.com v1: https://lore.kernel.org/r/20240529-feature_poe_power_cap-v1-0-0c4b1d5953b8@bootlin.com ==================== Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-0-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	net: pse-pd: pd692x0: Enhance with new current limit and voltage read callbacks	Kory Maincent (Dent Project)	1	-2/+216
	This patch expands PSE callbacks with newly introduced pi_get/set_current_limit() and pi_get_voltage() callback. It also add the power limit ranges description in the status returned. The only way to set ps692x0 port power limit is by configure the power class plus a small power supplement which maximum depends on each class. Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-7-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	netlink: specs: Expand the PSE netlink command with C33 pw-limit attributes	Kory Maincent (Dent Project)	1	-0/+22
	Expand the c33 PSE attributes with power limit to be able to set and get the PSE Power Interface power limit. ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get --json '{"header":{"dev-name":"eth1"}}' {'c33-pse-actual-pw': 1700, 'c33-pse-admin-state': 3, 'c33-pse-avail-pw-limit': 97500, 'c33-pse-pw-class': 4, 'c33-pse-pw-d-status': 4, 'c33-pse-pw-limit-ranges': [{'max': 18100, 'min': 15000}, {'max': 38000, 'min': 30000}, {'max': 65000, 'min': 60000}, {'max': 97500, 'min': 90000}], 'header': {'dev-index': 5, 'dev-name': 'eth1'}} ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-set --json '{"header":{"dev-name":"eth1"}, "c33-pse-avail-pw-limit":19000}' None Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-6-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	net: ethtool: Add new power limit get and set features	Kory Maincent (Dent Project)	4	-25/+141
	This patch expands the status information provided by ethtool for PSE c33 with available power limit and available power limit ranges. It also adds a call to pse_ethtool_set_pw_limit() to configure the PSE control power limit. Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-5-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	net: pse-pd: Add new power limit get and set c33 features	Kory Maincent (Dent Project)	2	-11/+204
	This patch add a way to get and set the power limit of a PSE PI. For that it uses regulator API callbacks wrapper like get_voltage() and get/set_current_limit() as power is simply V * I. We used mW unit as defined by the IEEE 802.3-2022 standards. set_current_limit() uses the voltage return by get_voltage() and the desired power limit to calculate the current limit. get_voltage() callback is then mandatory to set the power limit. get_current_limit() callback is by default looking at a driver callback and fallback to extracting the current limit from _pse_ethtool_get_status() if the driver does not set its callback. We prefer let the user the choice because ethtool_get_status return much more information than the current limit. expand pse status with c33_pw_limit_ranges to return the ranges available to configure the power limit. Reviewed-by: Sai Krishna <saikrishnag@marvell.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-4-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	net: pse-pd: pd692x0: Expand ethtool status message	Kory Maincent (Dent Project)	1	-0/+101
	This update expands pd692x0_ethtool_get_status() callback with newly introduced details such as the detected class, current power delivered, and extended state information. Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-3-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	netlink: specs: Expand the PSE netlink command with C33 new features	Kory Maincent (Dent Project)	1	-0/+36
	Expand the c33 PSE attributes with PSE class, extended state information and power consumption. ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get --json '{"header":{"dev-name":"eth0"}}' {'c33-pse-actual-pw': 1700, 'c33-pse-admin-state': 3, 'c33-pse-pw-class': 4, 'c33-pse-pw-d-status': 4, 'header': {'dev-index': 4, 'dev-name': 'eth0'}} ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get --json '{"header":{"dev-name":"eth0"}}' {'c33-pse-admin-state': 3, 'c33-pse-ext-state': 'mr-mps-valid', 'c33-pse-ext-substate': 2, 'c33-pse-pw-d-status': 2, 'header': {'dev-index': 4, 'dev-name': 'eth0'}} Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-2-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	net: ethtool: pse-pd: Expand C33 PSE status with class, power and extended state	Kory Maincent (Dent Project)	6	-1/+286
	This update expands the status information provided by ethtool for PSE c33. It includes details such as the detected class, current power delivered, and extended state information. Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-1-320003204264@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	Merge branch 'net-openvswitch-add-sample-multicasting'	Jakub Kicinski	13	-16/+566
	Adrian Moreno says: ==================== net: openvswitch: Add sample multicasting. Background Currently, OVS supports several packet sampling mechanisms (sFlow, per-bridge IPFIX, per-flow IPFIX). These end up being translated into a userspace action that needs to be handled by ovs-vswitchd's handler threads only to be forwarded to some third party application that will somehow process the sample and provide observability on the datapath. A particularly interesting use-case is controller-driven per-flow IPFIX sampling where the OpenFlow controller can add metadata to samples (via two 32bit integers) and this metadata is then available to the sample-collecting system for correlation. Problem The fact that sampled traffic share netlink sockets and handler thread time with upcalls, apart from being a performance bottleneck in the sample extraction itself, can severely compromise the datapath, yielding this solution unfit for highly loaded production systems. Users are left with little options other than guessing what sampling rate will be OK for their traffic pattern and system load and dealing with the lost accuracy. Looking at available infrastructure, an obvious candidated would be to use psample. However, it's current state does not help with the use-case at stake because sampled packets do not contain user-defined metadata. Proposal This series is an attempt to fix this situation by extending the existing psample infrastructure to carry a variable length user-defined cookie. The main existing user of psample is tc's act_sample. It is also extended to forward the action's cookie to psample. Finally, a new OVS action (OVS_SAMPLE_ATTR_PSAMPLE) is created. It accepts a group and an optional cookie and uses psample to multicast the packet and the metadata. ==================== Link: https://patch.msgid.link/20240704085710.353845-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	selftests: openvswitch: add psample test	Adrian Moreno	2	-6/+182
	Add a test to verify sampling packets via psample works. In order to do that, create a subcommand in ovs-dpctl.py to listen to on the psample multicast group and print samples. Reviewed-by: Aaron Conole <aconole@redhat.com> Tested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Link: https://patch.msgid.link/20240704085710.353845-11-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-06	selftests: openvswitch: parse trunc action	Adrian Moreno	1	-0/+13
	The trunc action was supported decode-able but not parse-able. Add support for parsing the action string. Reviewed-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Link: https://patch.msgid.link/20240704085710.353845-10-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>