summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-02-22rtnetlink: Pack newlink() params into structXiao Liang42-110/+223
There are 4 net namespaces involved when creating links: - source netns - where the netlink socket resides, - target netns - where to put the device being created, - link netns - netns associated with the device (backend), - peer netns - netns of peer device. Currently, two nets are passed to newlink() callback - "src_net" parameter and "dev_net" (implicitly in net_device). They are set as follows, depending on netlink attributes in the request. +------------+-------------------+---------+---------+ | peer netns | IFLA_LINK_NETNSID | src_net | dev_net | +------------+-------------------+---------+---------+ | | absent | source | target | | absent +-------------------+---------+---------+ | | present | link | link | +------------+-------------------+---------+---------+ | | absent | peer | target | | present +-------------------+---------+---------+ | | present | peer | link | +------------+-------------------+---------+---------+ When IFLA_LINK_NETNSID is present, the device is created in link netns first and then moved to target netns. This has some side effects, including extra ifindex allocation, ifname validation and link events. These could be avoided if we create it in target netns from the beginning. On the other hand, the meaning of src_net parameter is ambiguous. It varies depending on how parameters are passed. It is the effective link (or peer netns) by design, but some drivers ignore it and use dev_net instead. To provide more netns context for drivers, this patch packs existing newlink() parameters, along with the source netns, link netns and peer netns, into a struct. The old "src_net" is renamed to "net" to avoid confusion with real source netns, and will be deprecated later. The use of src_net are converted to params->net trivially. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-3-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22rtnetlink: Lookup device in target netns when creating linkXiao Liang1-2/+8
When creating link, lookup for existing device in target net namespace instead of current one. For example, two links created by: # ip link add dummy1 type dummy # ip link add netns ns1 dummy1 type dummy should have no conflict since they are in different namespaces. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250219125039.18024-2-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22Merge branch 'dt-bindings-net-realtek-rtl9301-switch'Jakub Kicinski2-1/+148
Chris Packham says: ==================== dt-bindings: net: realtek,rtl9301-switch (schema part) This is my attempt at trying to sort out the mess I've created with the RTL9300 switch dt-bindings. Some context is available on [1] and [2]. The first patch just moves the binding from mfd/ to net/ (with an adjustment of the internal path name). The next two patches are successors to patches already sent as part of the series [3]. [1] - https://lore.kernel.org/lkml/20250204-eccentric-deer-of-felicity-02b7ee@krzk-bin/ [2] - https://lore.kernel.org/lkml/4e3c5d83-d215-4eff-bf02-6d420592df8f@alliedtelesis.co.nz/ [3] - https://lore.kernel.org/lkml/20250204030249.1965444-1-chris.packham@alliedtelesis.co.nz/ ==================== Link: https://patch.msgid.link/20250218195216.1034220-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22dt-bindings: net: Add Realtek MDIO controllerChris Packham2-0/+117
Add dtschema for the MDIO controller found in the RTL9300 Ethernet switch. The controller is slightly unusual in that direct MDIO communication is not possible. We model the MDIO controller with the MDIO buses as child nodes and the PHYs as children of the buses. The mapping of switch port number to MDIO bus/addr requires the ethernet-ports sibling to provide the mapping via the phy-handle property. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-4-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22dt-bindings: net: Add switch ports and interrupts to RTL9300Chris Packham1-0/+30
Add bindings for the ethernet-switch and interrupt properties for the RTL9300. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22dt-bindings: net: Move realtek,rtl9301-switch to netChris Packham1-1/+1
Initially realtek,rtl9301-switch was placed under mfd/ because it had some non-switch related blocks (specifically i2c and reset) but with a bit more review it has become apparent that this was wrong and the binding should live under net/. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Lee Jones <lee@kernel.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250218195216.1034220-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-22net: sfp: add quirk for 2.5G OEM BX SFPBirger Koblitz1-0/+2
The OEM SFP-2.5G-BX10-D/U SFP module pair is meant to operate with 2500Base-X. However, in their EEPROM they incorrectly specify: Transceiver codes : 0x00 0x12 0x00 0x00 0x12 0x00 0x01 0x05 0x00 BR, Nominal : 2500MBd Use sfp_quirk_2500basex for this module to allow 2500Base-X mode anyway. Tested on BananaPi R3. Signed-off-by: Birger Koblitz <mail@birger-koblitz.de> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20250218-b4-lkmsub-v1-1-1e51dcabed90@birger-koblitz.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: phy: remove unused feature array declarationsHeiner Kallweit1-3/+0
After 12d5151be010 ("net: phy: remove leftovers from switch to linkmode bitmaps") the following declarations are unused and can be removed too. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/b2883c75-4108-48f2-ab73-e81647262bc2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge branch 'selftests-drv-net-improve-the-queue-test-for-xsk'Jakub Kicinski4-40/+161
Jakub Kicinski says: ==================== selftests: drv-net: improve the queue test for XSK We see some flakes in the the XSK test: Exception| Traceback (most recent call last): Exception| File "/home/virtme/testing-18/tools/testing/selftests/net/lib/py/ksft.py", line 218, in ksft_run Exception| case(*args) Exception| File "/home/virtme/testing-18/tools/testing/selftests/drivers/net/./queues.py", line 53, in check_xdp Exception| ksft_eq(q['xsk'], {}) Exception| KeyError: 'xsk' I think it's because the method of running the helper in the background is racy. Add more solid infra for waiting for a background helper to be initialized. v1: https://lore.kernel.org/20250218195048.74692-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250219234956.520599-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: rename queues check_xdp to check_xskJakub Kicinski1-2/+4
The test is for AF_XDP, we refer to AF_XDP as XSK. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: improve the use of ksft helpers in XSK queue testJakub Kicinski2-4/+10
Avoid exceptions when xsk attr is not present, and add a proper ksft helper for "not in" condition. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: add a way to wait for a local processJakub Kicinski3-35/+136
We use wait_port_listen() extensively to wait for a process we spawned to be ready. Not all processes will open listening sockets. Add a method of explicitly waiting for a child to be ready. Pass a FD to the spawned process and wait for it to write a message to us. FD number is passed via KSFT_READY_FD env variable. Similarly use KSFT_WAIT_FD to let the child process for a sign that we are done and child should exit. Sending a signal to a child with shell=True can get tricky. Make use of this method in the queues test to make it less flaky. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: probe for AF_XDP sockets more explicitlyJakub Kicinski2-5/+14
Separate the support check from socket binding for easier refactoring. Use: ./helper - - just to probe if we can open the socket. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: add missing new line in xdp_helperJakub Kicinski1-1/+1
Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: use cfg.rpath() in netlink xsk attr testJakub Kicinski1-2/+1
The cfg.rpath() helper was been recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21selftests: drv-net: add a warning for bkg + shell + terminateJakub Kicinski1-0/+4
Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, termininate=True) most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21octeontx2: hide unused labelArnd Bergmann2-0/+4
A previous patch introduces a build-time warning when CONFIG_DCB is disabled: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c: In function 'otx2_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c:3217:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c: In function 'otx2vf_probe': drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c:740:1: error: label 'err_free_zc_bmap' defined but not used [-Werror=unused-label] Add the same #ifdef check around it. Fixes: efabce290151 ("octeontx2-pf: AF_XDP zero copy receive support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Suman Ghosh <sumang@marvell.com> Link: https://patch.msgid.link/20250219162239.1376865-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: phy: qt2025: Fix hardware revision check commentCharalampos Mitrodimas1-1/+1
Correct the hardware revision check comment in the QT2025 driver. The revision value was documented as 0x3b instead of the correct 0xb3, which matches the actual comparison logic in the code. Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Trevor Gross <tmgross@umich.edu> Link: https://patch.msgid.link/20250219-qt2025-comment-fix-v2-1-029f67696516@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge branch 'mlx5-misc-enhancements-2025-02-19'Jakub Kicinski2-45/+21
Tariq Toukan says: ==================== mlx5 misc enhancements 2025-02-19 This small series enhances the mlx5 ethtool link speed code (no functional change), in addition to a Kconfig description enhancement. ==================== Link: https://patch.msgid.link/20250219114112.403808-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Separate extended link modes request from link modes type selectionShahar Shitrit1-9/+4
The function ext_requested() serves two distinct purposes: it checks if extended link modes were requested, and it selects whether to use extended or legacy link modes. This change separates these two purposes. Now, ext_link_mode_requested() is used directly for checking if extended link modes are requested, while the selection of extended modes is handled independently based on the autonegotiation status. By making this distinction, the logic for determining whether to select extended or legacy link modes is clearer. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Change eth_proto parameter namingShahar Shitrit1-2/+2
eth_proto_cap parameter represents the supported link modes, while eth_proto_admin refers to the configured ones. The function get_advertising() retrieves the configured link modes, thus we update its parameter name to eth_proto_admin. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Introduce ptys2ethtool_process_link()Shahar Shitrit1-25/+10
The functions ptys2ethtool_supported_link(), ptys2ethtool_adver_link() share the same code, thus, in order to remove code duplication we introduce a new function ptys2ethtool_process_link() to handle the processing of both supported and advertised link modes. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5e: Refactor ptys2ethtool_adver_link()Shahar Shitrit1-12/+8
The function ptys2ethtool_adver_link() contains duplicated code that is found in mlx5e_ethtool_get_speed_arr(). To eliminate this redundancy, we update mlx5e_ethtool_get_speed_arr() to select the appropriate table based on the ext argument passed by the caller, rather than querying the supported mode locally. This allows us to replace the current logic in ptys2ethtool_adver_link() with a call to mlx5e_ethtool_get_speed_arr(). This adjustment aligns with the ptys2ethtool_supported_link() function and prepares for an upcoming patch that reduces code duplication. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net/mlx5: Bridge, correct config option descriptionCosmin Ratiu1-2/+2
The implication of the previous help text was that without this option enabled, representor devices couldn't be added to a bridge device, while in fact that was possible, just that rules didn't get offloaded to hw. This commit clarifies the help text. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250219114112.403808-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21neighbour: Replace kvzalloc() with kzalloc() when GFP_ATOMIC is specifiedKohei Enju1-2/+2
kzalloc() uses page allocator when size is larger than KMALLOC_MAX_CACHE_SIZE, so the intention of commit ab101c553bc1 ("neighbour: use kvzalloc()/kvfree()") can be achieved by using kzalloc(). When using GFP_ATOMIC, kvzalloc() only tries the kmalloc path, since the vmalloc path does not support the flag. In this case, kvzalloc() is equivalent to kzalloc() in that neither try the vmalloc path, so this replacement brings no functional change. This is primarily a cleanup change, as the original code functions correctly. This patch replaces kvzalloc() introduced by commit 41b3caa7c076 ("neighbour: Add hlist_node to struct neighbour"), which is called in the same context and with the same gfp flag as the aforementioned commit ab101c553bc1 ("neighbour: use kvzalloc()/kvfree()"). Signed-off-by: Kohei Enju <enjuk@amazon.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Link: https://patch.msgid.link/20250219102227.72488-1-enjuk@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge branch 'some-pktgen-fixes-improvments-part-i'Jakub Kicinski1-17/+22
Peter Seiderer says: ==================== Some pktgen fixes/improvments (part I) While taking a look at '[PATCH net] pktgen: Avoid out-of-range in get_imix_entries' ([1]) and '[PATCH net v2] pktgen: Avoid out-of-bounds access in get_imix_entries' ([2], [3]) and doing some tests and code review I detected that the /proc/net/pktgen/... parsing logic does not honour the user given buffer bounds (resulting in out-of-bounds access). This can be observed e.g. by the following simple test (sometimes the old/'longer' previous value is re-read from the buffer): $ echo add_device lo@0 > /proc/net/pktgen/kpktgend_0 $ echo "min_pkt_size 12345" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0 Params: count 1000 min_pkt_size: 12345 max_pkt_size: 0 Result: OK: min_pkt_size=12345 $ echo -n "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0 Params: count 1000 min_pkt_size: 12345 max_pkt_size: 0 Result: OK: min_pkt_size=12345 $ echo "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0 Params: count 1000 min_pkt_size: 123 max_pkt_size: 0 Result: OK: min_pkt_size=123 So fix the out-of-bounds access (and some minor findings) and add a simple proc_net_pktgen selftest... [1] https://lore.kernel.org/netdev/20241006221221.3744995-1-artem.chernyshev@red-soft.ru/ [2] https://lore.kernel.org/netdev/20250109083039.14004-1-pchelkin@ispras.ru/ [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76201b5979768500bca362871db66d77cb4c225e ==================== Link: https://patch.msgid.link/20250219084527.20488-1-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: fix access outside of user given buffer in pktgen_thread_write()Peter Seiderer1-3/+4
Honour the user given buffer size for the strn_len() calls (otherwise strn_len() will access memory outside of the user given buffer). Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-8-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: fix ctrl interface command parsingPeter Seiderer1-6/+8
Enable command writing without trailing '\n': - the good case $ echo "reset" > /proc/net/pktgen/pgctrl - the bad case (before the patch) $ echo -n "reset" > /proc/net/pktgen/pgctrl -bash: echo: write error: Invalid argument - with patch applied $ echo -n "reset" > /proc/net/pktgen/pgctrl Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-7-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: fix 'ratep 0' error handling (return -EINVAL)Peter Seiderer1-1/+1
Given an invalid 'ratep' command e.g. 'ratep 0' the return value is '1', leading to the following misleading output: - the good case $ echo "ratep 100" > /proc/net/pktgen/lo\@0 $ grep "Result:" /proc/net/pktgen/lo\@0 Result: OK: ratep=100 - the bad case (before the patch) $ echo "ratep 0" > /proc/net/pktgen/lo\@0" -bash: echo: write error: Invalid argument $ grep "Result:" /proc/net/pktgen/lo\@0 Result: No such parameter "atep" - with patch applied $ echo "ratep 0" > /proc/net/pktgen/lo\@0 -bash: echo: write error: Invalid argument $ grep "Result:" /proc/net/pktgen/lo\@0 Result: Idle Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-6-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: fix 'rate 0' error handling (return -EINVAL)Peter Seiderer1-1/+1
Given an invalid 'rate' command e.g. 'rate 0' the return value is '1', leading to the following misleading output: - the good case $ echo "rate 100" > /proc/net/pktgen/lo\@0 $ grep "Result:" /proc/net/pktgen/lo\@0 Result: OK: rate=100 - the bad case (before the patch) $ echo "rate 0" > /proc/net/pktgen/lo\@0" -bash: echo: write error: Invalid argument $ grep "Result:" /proc/net/pktgen/lo\@0 Result: No such parameter "ate" - with patch applied $ echo "rate 0" > /proc/net/pktgen/lo\@0 -bash: echo: write error: Invalid argument $ grep "Result:" /proc/net/pktgen/lo\@0 Result: Idle Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-5-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: fix hex32_arg parsing for short readsPeter Seiderer1-3/+4
Fix hex32_arg parsing for short reads (here 7 hex digits instead of the expected 8), shift result only on successful input parsing. - before the patch $ echo "mpls 0000123" > /proc/net/pktgen/lo\@0 $ grep mpls /proc/net/pktgen/lo\@0 mpls: 00001230 Result: OK: mpls=00001230 - with patch applied $ echo "mpls 0000123" > /proc/net/pktgen/lo\@0 $ grep mpls /proc/net/pktgen/lo\@0 mpls: 00000123 Result: OK: mpls=00000123 Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-4-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: enable 'param=value' parsingPeter Seiderer1-0/+1
Enable more flexible parameters syntax, allowing 'param=value' in addition to the already supported 'param value' pattern (additional this gives the skipping '=' in count_trail_chars() a purpose). Tested with: $ echo "min_pkt_size 999" > /proc/net/pktgen/lo\@0 $ echo "min_pkt_size=999" > /proc/net/pktgen/lo\@0 $ echo "min_pkt_size =999" > /proc/net/pktgen/lo\@0 $ echo "min_pkt_size= 999" > /proc/net/pktgen/lo\@0 $ echo "min_pkt_size = 999" > /proc/net/pktgen/lo\@0 Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-3-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: pktgen: replace ENOTSUPP with EOPNOTSUPPPeter Seiderer1-3/+3
Replace ENOTSUPP with EOPNOTSUPP, fixes checkpatch hint WARNING: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP and e.g. $ echo "clone_skb 1" > /proc/net/pktgen/lo\@0 -bash: echo: write error: Unknown error 524 Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250219084527.20488-2-ps.report@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21af_unix: Fix undefined 'other' errorPurva Yeshi1-1/+0
Fix an issue with the sparse static analysis tool where an "undefined 'other'" error occurs due to `__releases(&unix_sk(other)->lock)` being placed before 'other' is in scope. Remove the `__releases()` annotation from the `unix_wait_for_peer()` function to eliminate the sparse error. The annotation references `other` before it is declared, leading to a false positive error during static analysis. Since AF_UNIX does not use sparse annotations, this annotation is unnecessary and does not impact functionality. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250218141045.38947-1-purvayeshi550@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21eth: fbnic: Add ethtool support for IRQ coalescingMohsin Bashir6-8/+121
Add ethtool support to configure the IRQ coalescing behavior. Support separate timers for Rx and Tx for time based coalescing. For frame based configuration, currently we only support the Rx side. The hardware allows configuration of descriptor count instead of frame count requiring conversion between the two. We assume 2 descriptors per frame, one for the metadata and one for the data segment. When rx-frames are not configured, we set the RX descriptor count to half the ring size as a fail safe. Default configuration: ethtool -c eth0 | grep -E "rx-usecs:|tx-usecs:|rx-frames:" rx-usecs: 30 rx-frames: 0 tx-usecs: 35 IRQ rate test: With single iperf flow we monitor IRQ rate while changing the tx-usesc and rx-usecs to high and low values. ethtool -C eth0 rx-frames 8192 rx-usecs 150 tx-usecs 150 irq/sec 13k irq/sec 14k irq/sec 14k ethtool -C eth0 rx-frames 8192 rx-usecs 10 tx-usecs 10 irq/sec 27k irq/sec 28k irq/sec 28k Validating the use of extack: ethtool -C eth0 rx-frames 16384 netlink error: fbnic: rx_frames is above device max netlink error: Invalid argument Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250218023520.2038010-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge branch 'support-ptp-clock-for-wangxun-nics'Jakub Kicinski17-7/+1194
Jiawen Wu says: ==================== Support PTP clock for Wangxun NICs Implement support for PTP clock on Wangxun NICs. v7: https://lore.kernel.org/20250213083041.78917-1-jiawenwu@trustnetic.com v6: https://lore.kernel.org/20250208031348.4368-1-jiawenwu@trustnetic.com v5: https://lore.kernel.org/20250117062051.2257073-1-jiawenwu@trustnetic.com v4: https://lore.kernel.org/20250114084425.2203428-1-jiawenwu@trustnetic.com v3: https://lore.kernel.org/20250110031716.2120642-1-jiawenwu@trustnetic.com v2: https://lore.kernel.org/20250106084506.2042912-1-jiawenwu@trustnetic.com v1: https://lore.kernel.org/20250102103026.1982137-1-jiawenwu@trustnetic.com ==================== Link: https://patch.msgid.link/20250218023432.146536-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: ngbe: Add support for 1PPS and TODJiawen Wu7-4/+250
Implement support for generating a 1pps output signal on SDP0. And support custom firmware to output TOD. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250218023432.146536-5-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Add periodic checks for overflow and errorsJiawen Wu4-0/+109
Implement watchdog task to detect SYSTIME overflow and error cases of Rx/Tx timestamp. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-4-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Support to get ts infoJiawen Wu4-0/+58
Implement the function get_ts_info and get_ts_stats in ethtool_ops to get the HW capabilities and statistics for timestamping. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-3-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: wangxun: Add support for PTP clockJiawen Wu11-6/+780
Implement support for PTP clock on Wangxun NICs. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250218023432.146536-2-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21tun: Pad virtio headersAkihiko Odaki1-1/+2
tun simply advances iov_iter when it needs to pad virtio header, which leaves the garbage in the buffer as is. This will become especially problematic when tun starts to allow enabling the hash reporting feature; even if the feature is enabled, the packet may lack a hash value and may contain a hole in the virtio header because the packet arrived before the feature gets enabled or does not contain the header fields to be hashed. If the hole is not filled with zero, it is impossible to tell if the packet lacks a hash value. In theory, a user of tun can fill the buffer with zero before calling read() to avoid such a problem, but leaving the garbage in the buffer is awkward anyway so replace advancing the iterator with writing zeros. A user might have initialized the buffer to some non-zero value, expecting tun to skip writing it. As this was never a documented feature, this seems unlikely. The overhead of filling the hole in the header is negligible when the header size is specified according to the specification as doing so will not make another cache line dirty under a reasonable assumption. Below is a proof of this statement: The first 10 bytes of the header is always written and tun also writes the packet itself immediately after the packet unless the packet is empty. This makes a hole between these writes whose size is: sz - 10 where sz is the specified header size. Therefore, we will never make another cache line dirty when: sz < L1_CACHE_BYTES + 10 where L1_CACHE_BYTES is the cache line size. Assuming L1_CACHE_BYTES >= 16, this inequation holds when: sz < 26. sz <= 20 according to the current specification so we even have a margin of 5 bytes in case that the header size grows in a future version of the specification. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Link: https://patch.msgid.link/20250215-buffers-v2-1-1fbc6aaf8ad6@daynix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21netdevsim: call napi_schedule from a timer contextBreno Leitao2-1/+21
The netdevsim driver was experiencing NOHZ tick-stop errors during packet transmission due to pending softirq work when calling napi_schedule(). This issue was observed when running the netconsole selftest, which triggered the following error message: NOHZ tick-stop error: local softirq work is pending, handler #08!!! To fix this issue, introduce a timer that schedules napi_schedule() from a timer context instead of calling it directly from the TX path. Create an hrtimer for each queue and kick it from the TX path, which then schedules napi_schedule() from the timer context. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250219-netdevsim-v3-1-811e2b8abc4c@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21Merge branch 'flexible-array-for-ip-tunnel-options'Jakub Kicinski4-14/+14
Gal Pressman says: ==================== Flexible array for ip tunnel options Remove the hidden assumption that options are allocated at the end of the struct, and teach the compiler about them using a flexible array. First patch is converting hard-coded 'info + 1' to use ip_tunnel_info() helper. Second patch adds the 'options' flexible array and changes the helper to use it. v4: https://lore.kernel.org/20250217202503.265318-1-gal@nvidia.com v3: https://lore.kernel.org/20250212140953.107533-1-gal@nvidia.com v2: https://lore.kernel.org/20250209101853.15828-1-gal@nvidia.com ==================== Link: https://patch.msgid.link/20250219143256.370277-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21net: Add options as a flexible array to struct ip_tunnel_infoGal Pressman3-9/+9
Remove the hidden assumption that options are allocated at the end of the struct, and teach the compiler about them using a flexible array. With this, we can revert the unsafe_memcpy() call we have in tun_dst_unclone() [1], and resolve the false field-spanning write warning caused by the memcpy() in ip_tunnel_info_opts_set(). The layout of struct ip_tunnel_info remains the same with this patch. Before this patch, there was an implicit padding at the end of the struct, options would be written at 'info + 1' which is after the padding. This will remain the same as this patch explicitly aligns 'options'. The alignment is needed as the options are later casted to different structs, and might result in unaligned memory access. Pahole output before this patch: struct ip_tunnel_info { struct ip_tunnel_key key; /* 0 64 */ /* XXX last struct has 1 byte of padding */ /* --- cacheline 1 boundary (64 bytes) --- */ struct ip_tunnel_encap encap; /* 64 8 */ struct dst_cache dst_cache; /* 72 16 */ u8 options_len; /* 88 1 */ u8 mode; /* 89 1 */ /* size: 96, cachelines: 2, members: 5 */ /* padding: 6 */ /* paddings: 1, sum paddings: 1 */ /* last cacheline: 32 bytes */ }; Pahole output after this patch: struct ip_tunnel_info { struct ip_tunnel_key key; /* 0 64 */ /* XXX last struct has 1 byte of padding */ /* --- cacheline 1 boundary (64 bytes) --- */ struct ip_tunnel_encap encap; /* 64 8 */ struct dst_cache dst_cache; /* 72 16 */ u8 options_len; /* 88 1 */ u8 mode; /* 89 1 */ /* XXX 6 bytes hole, try to pack */ u8 options[] __attribute__((__aligned__(16))); /* 96 0 */ /* size: 96, cachelines: 2, members: 6 */ /* sum members: 90, holes: 1, sum holes: 6 */ /* paddings: 1, sum paddings: 1 */ /* forced alignments: 1, forced holes: 1, sum forced holes: 6 */ /* last cacheline: 32 bytes */ } __attribute__((__aligned__(16))); [1] Commit 13cfd6a6d7ac ("net: Silence false field-spanning write warning in metadata_dst memcpy") Link: https://lore.kernel.org/all/53D1D353-B8F6-4ADC-8F29-8C48A7C9C6F1@kernel.org/ Suggested-by: Kees Cook <kees@kernel.org> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20250219143256.370277-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21ip_tunnel: Use ip_tunnel_info() helper instead of 'info + 1'Gal Pressman2-5/+5
Tunnel options should not be accessed directly, use the ip_tunnel_info() accessor instead. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20250219143256.370277-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski338-1927/+3677
Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20Merge tag 'net-6.14-rc4' of ↵Linus Torvalds34-217/+365
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)) Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: - geneve: fix use-after-free in geneve_find_dev() - ibmvnic: don't reference skb after sending to VIOS" * tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: skb: introduce and use a single page frag cache" net: allow small head cache usage with large MAX_SKB_FRAGS values nfp: bpf: Add check for nfp_app_ctrl_msg_alloc() tcp: drop secpath at the same time as we currently drop dst net: axienet: Set mac_managed_pm arp: switch to dev_getbyhwaddr() in arp_req_set_public() net: Add non-RCU dev_getbyhwaddr() helper sctp: Fix undefined behavior in left shift operation selftests/bpf: Add a specific dst port matching flow_dissector: Fix port range key handling in BPF conversion selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range flow_dissector: Fix handling of mixed port and port-range keys geneve: Suppress list corruption splat in geneve_destroy_tunnels(). gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). dev: Use rtnl_net_dev_lock() in unregister_netdev(). net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). net: Add net_passive_inc() and net_passive_dec(). net: pse-pd: pd692x0: Fix power limit retrieval MAINTAINERS: trim the GVE entry gve: set xdp redirect target only when it is available ...
2025-02-20Merge tag 'v6.14-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds8-24/+71
Pull smb client fixes from Steve French: - Fix for chmod regression - Two reparse point related fixes - One minor cleanup (for GCC 14 compiles) - Fix for SMB3.1.1 POSIX Extensions reporting incorrect file type * tag 'v6.14-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Treat unhandled directory name surrogate reparse points as mount directory nodes cifs: Throw -EOPNOTSUPP error on unsupported reparse point type from parse_reparse_point() smb311: failure to open files of length 1040 when mounting with SMB3.1.1 POSIX extensions smb: client, common: Avoid multiple -Wflex-array-member-not-at-end warnings smb: client: fix chmod(2) regression with ATTR_READONLY
2025-02-20Merge tag 'bcachefs-2025-02-20' of git://evilpiepirate.org/bcachefsLinus Torvalds4-58/+42
Pull bcachefs fixes from Kent Overstreet: "Small stuff: - The fsck code for Hongbo's directory i_size patch was wrong, caught by transaction restart injection: we now have the CI running another test variant with restart injection enabled - Another fixup for reflink pointers to missing indirect extents: previous fix was for fsck code, this fixes the normal runtime paths - Another small srcu lock hold time fix, reported by jpsollie" * tag 'bcachefs-2025-02-20' of git://evilpiepirate.org/bcachefs: bcachefs: Fix srcu lock warning in btree_update_nodes_written() bcachefs: Fix bch2_indirect_extent_missing_error() bcachefs: Fix fsck directory i_size checking
2025-02-20Merge tag 'xfs-fixes-6.14-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds7-30/+114
Pull xfs fixes from Carlos Maiolino: "Just a collection of bug fixes, nothing really stands out" * tag 'xfs-fixes-6.14-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: flush inodegc before swapon xfs: rename xfs_iomap_swapfile_activate to xfs_vm_swap_activate xfs: Do not allow norecovery mount with quotacheck xfs: do not check NEEDSREPAIR if ro,norecovery mount. xfs: fix data fork format filtering during inode repair xfs: fix online repair probing when CONFIG_XFS_ONLINE_REPAIR=n