summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-01-02Documentation: add pyyaml to requirements.txtVegard Nossum1-0/+1
Commit f061c9f7d058 ("Documentation: Document each netlink family") added a new Python script that is invoked during 'make htmldocs' and which reads the netlink YAML spec files. Using the virtualenv from scripts/sphinx-pre-install, we get this new error wen running 'make htmldocs': Traceback (most recent call last): File "./tools/net/ynl/ynl-gen-rst.py", line 26, in <module> import yaml ModuleNotFoundError: No module named 'yaml' make[2]: *** [Documentation/Makefile:112: Documentation/networking/netlink_spec/rt_link.rst] Error 1 make[1]: *** [Makefile:1708: htmldocs] Error 2 Fix this by adding 'pyyaml' to requirements.txt. Note: This was somehow present in the original patch submission: <https://lore.kernel.org/all/20231103135622.250314-1-leitao@debian.org/> I'm not sure why the pyyaml requirement disappeared in the meantime. Fixes: f061c9f7d058 ("Documentation: Document each netlink family") Cc: Breno Leitao <leitao@debian.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02Merge branch 'mptcp-mib-counters'David S. Miller8-26/+110
Matthieu Baerts says: ==================== mptcp: add CurrEstab MIB counter This MIB counter is similar to the one of TCP -- CurrEstab -- available in /proc/net/snmp. This is useful to quickly list the number of MPTCP connections without having to iterate over all of them. Patch 1 prepares its support by adding new helper functions: - MPTCP_DEC_STATS(): similar to MPTCP_INC_STATS(), but this time to decrement a counter. - mptcp_set_state(): similar to tcp_set_state(), to change the state of an MPTCP socket, and to inc/decrement the new counter when needed. Patch 2 uses mptcp_set_state() instead of directly calling inet_sk_state_store() to change the state of MPTCP sockets. Patch 3 and 4 validate the new feature in MPTCP "join" and "diag" selftests. ==================== Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftests: mptcp: diag: check CURRESTAB countersGeliang Tang1-1/+16
This patch adds a new helper chk_msk_cestab() to check the current established connections counter MIB_CURRESTAB in diag.sh. Invoke it to check the counter during the connection after every chk_msk_inuse(). Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftests: mptcp: join: check CURRESTAB countersGeliang Tang1-5/+41
This patch adds a new helper chk_cestab_nr() to check the current established connections counter MIB_CURRESTAB. Set the newly added variables cestab_ns1 and cestab_ns2 to indicate how many connections are expected in ns1 or ns2. Invoke check_cestab() to check the counter during the connection in do_transfer() and invoke chk_cestab_nr() to re-check it when the connection closed. These checks are embedded in add_tests(). Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02mptcp: use mptcp_set_stateGeliang Tang3-20/+25
This patch replaces all the 'inet_sk_state_store()' calls under net/mptcp with the new helper mptcp_set_state(). Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/460 Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02mptcp: add CurrEstab MIB counter supportGeliang Tang4-0/+28
Add a new MIB counter named MPTCP_MIB_CURRESTAB to count current established MPTCP connections, similar to TCP_MIB_CURRESTAB. This is useful to quickly list the number of MPTCP connections without having to iterate over all of them. This patch adds a new helper function mptcp_set_state(): if the state switches from or to ESTABLISHED state, this newly added counter is incremented. This helper is going to be used in the following patch. Similar to MPTCP_INC_STATS(), a new helper called MPTCP_DEC_STATS() is also needed to decrement a MIB counter. Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02Merge branch 'selftests-tcp-ao'David S. Miller4-18/+36
Dmitry Safonov says: ==================== selftest/net: Some more TCP-AO selftest post-merge fixups Note that there's another post-merge fix for TCP-AO selftests, but that doesn't conflict with these, so I don't resend that: https://lore.kernel.org/all/20231219-b4-tcp-ao-selftests-out-of-tree-v1-1-0fff92d26eac@arista.com/T/#u ==================== Tested-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com>
2024-01-02selftest/tcp-ao: Work on namespace-ified sysctl_optmem_maxDmitry Safonov1-8/+27
Since commit f5769faeec36 ("net: Namespace-ify sysctl_optmem_max") optmem_max is per-netns, so need of switching to root namespace. It seems trivial to keep the old logic working, so going to keep it for a while (at least, until kernel with netns-optmem_max will be release). Currently, there is a test that checks that optmem_max limit applies to TCP-AO keys and a little benchmark that measures linked-list TCP-AO keys scaling, those are fixed by this. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02selftest/tcp-ao: Set routes in a proper VRF table idDmitry Safonov3-10/+9
In unsigned-md5 selftests ip_route_add() is not needed in client_add_ip(): the route was pre-setup in __test_init() => link_init() for subnet, rather than a specific ip-address. Currently, __ip_route_add() mistakenly always sets VRF table to RT_TABLE_MAIN - this seems to have sneaked in during unsigned-md5 tests debugging. That also explains, why ip_route_add_vrf() ignored EEXIST, returned by fib6. Yet, keep EEXIST ignoring in bench-lookups selftests as it's expected that those selftests may add the same (duplicate) routes. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)Jörn-Thorben Hinz1-2/+9
Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new socket option SO_TIMESTAMPING_NEW. Setting the option is handled in sk_setsockopt(), querying it was not handled in sk_getsockopt(), though. Following remarks on an earlier submission of this patch, keep the old behavior of getsockopt(SO_TIMESTAMPING_OLD) which returns the active flags even if they actually have been set through SO_TIMESTAMPING_NEW. The new getsockopt(SO_TIMESTAMPING_NEW) is stricter, returning flags only if they have been set through the same option. Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") Link: https://lore.kernel.org/lkml/20230703175048.151683-1-jthinz@mailbox.tu-berlin.de/ Link: https://lore.kernel.org/netdev/0d7cddc9-03fa-43db-a579-14f3e822615b@app.fastmail.com/ Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02Merge tag 'wireless-next-2023-12-22' of ↵David S. Miller67-419/+2733
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.8 The third "new features" pull request for v6.8. This is a smaller one to clear up our tree before the break and nothing really noteworthy this time. Major changes: stack * cfg80211: introduce cfg80211_ssid_eq() for SSID matching * cfg80211: support P2P operation on DFS channels * mac80211: allow 64-bit radiotap timestamps iwlwifi * AX210: allow concurrent P2P operation on DFS channels ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02Merge branch 'net-tc-ipt-retire'David S. Miller18-520/+2
Jamal Hadi Salim says: ==================== net/sched: retire tc ipt action In keeping up with my status as a hero who removes code: another one bites the dust. The tc ipt action was intended to run all netfilter/iptables target. Unfortunately it has not benefitted over the years from proper updates when netfilter changes, and for that reason it has remained rudimentary. Pinging a bunch of people that i was aware were using this indicates that removing it wont affect them. Retire it to reduce maintenance efforts. So Long, ipt, and Thanks for all the Fish. ==================== Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Remove CONFIG_NET_ACT_IPT from default configsJamal Hadi Salim10-10/+0
Now that we are retiring the IPT action. Reviewed-by: Victor Noguiera <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net/sched: Retire ipt actionJamal Hadi Salim8-510/+2
The tc ipt action was intended to run all netfilter/iptables target. Unfortunately it has not benefitted over the years from proper updates when netfilter changes, and for that reason it has remained rudimentary. Pinging a bunch of people that i was aware were using this indicates that removing it wont affect them. Retire it to reduce maintenance efforts. Buh-bye. Reviewed-by: Victor Noguiera <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02net-device: move gso_partial_features to net_device_read_txEric Dumazet3-3/+4
dev->gso_partial_features is read from tx fast path for GSO packets. Move it to appropriate section to avoid a cache line miss. Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02ptp: ocp: Use DEFINE_RES_*() in placeAndy Shevchenko1-20/+6
There is no need to have an intermediate functions as DEFINE_RES_*() macros are represented by compound literals. Just use them in place. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: qrtr: ns: Return 0 if server port is not presentSarannya S1-1/+3
When a 'DEL_CLIENT' message is received from the remote, the corresponding server port gets deleted. A DEL_SERVER message is then announced for this server. As part of handling the subsequent DEL_SERVER message, the name- server attempts to delete the server port which results in a '-ENOENT' error. The return value from server_del() is then propagated back to qrtr_ns_worker, causing excessive error prints. To address this, return 0 from control_cmd_del_server() without checking the return value of server_del(), since the above scenario is not an error case and hence server_del() doesn't have any other error return value. Signed-off-by: Sarannya Sasikumar <quic_sarannya@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01Merge branch 'phy-listing-link_topology-tracking'David S. Miller30-35/+912
Maxime Chevallier says: ==================== Introduce PHY listing and link_topology tracking Here's a V5 of the multi-PHY support series. At a glance, besides some minor fixes and R'd-by from Andrew, one of the thing this series does is remove the ASSERT_RTNL() from the topo_add_phy/del_phy operations. These operations will take a PHY device and put it into the list of devices associated to a netdevice. The main thing to protect here is the list itself, but since we use xarrays, my naive understanding of it is that it contains its own protection scheme. There shouldn't be a need for more locking, as the insertion/deletion paths are already hooked into the PHY connection to a netdev, or disconnection from it. Now for the rest of the cover : As a remainder, this ongoing work aims ultimately at supporting complex link topologies that involve multiplexing multiple PHYs/SFPs on a single netdevice. As a first step, it's required that we are able to enumerate the PHYs on a given ethernet interface. By just doing so, we also improve already-existing use-cases, namely the copper SFP modules support when a media-converter is used (as we have 2 PHYs on the link, but only one is referenced by net_device.phydev, which is used on a variety of netlink commands). The series is architectured as follows : - The first patch adds the notion of phy_link_topology, which tracks all PHYs attached to a netdevice. - Patches 2, 3 and 4 adds some plumbing into SFP and phylib to be able to connect the dots when building the topology tree, to know which PHY is connected to which SFP bus, trying not to be too invasive on phylib. - Patch 5 allows passing a PHY_INDEX to ethnl commands. I'm uncertain about this, as there are at least 4 netlink commands ( 5 with the one introduced in patch 7 ) that targets PHYs directly or indirectly, which to me makes it worth-it to have a generic way to pass a PHY index to commands, however the approach taken may be too generic. - Patch 6 is the netlink spec update + ethtool-user.c|h autogenerated code update (the autogenerated code triggers checkpatch warning though) - Patch 7 introduces a new netlink command set to list PHYs on a netdevice. It implements a custom DUMP and GET operation to allow filtered dumps, that lists all PHYs on a given netdevice. I couldn't use most of ethnl's plumbing though. - Patch 8 is the netlink spec update + ethtool-user.c|h update for that new command - Patch 8,9,10 and 11 updates the PLCA, strset, cable-test and pse netlink commands to use the user-provided PHY instead of net_device.phydev. - Finally patch 12 adds some documentation for this whole work. Examples ======== Here's a short overview of the kind of operations you can have regarding the PHY topology. These tests were performed on a MacchiatoBin, which has 3 interfaces : eth0 and eth1 have the following layout: MAC - PHY - SFP eth2 has this more classic topology : MAC - PHY - RJ45 finally eth3 has the following topology : MAC - SFP When performing a dump with all interfaces down, we don't get any result, as no PHY has been attached to their respective net_device : None The following output is with eth0, eth2 and eth3 up, but no SFP module inserted in none of the interfaces : [{'downstream-sfp-name': 'sfp-eth0', 'drvname': 'mv88x3310', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 0, 'index': 1, 'name': 'f212a600.mdio-mii:00', 'upstream-type': 'mac'}, {'drvname': 'Marvell 88E1510', 'header': {'dev-index': 4, 'dev-name': 'eth2'}, 'id': 21040593, 'index': 1, 'name': 'f212a200.mdio-mii:00', 'upstream-type': 'mac'}] And now is a dump operation with a copper SFP in the eth0 port : [{'downstream-sfp-name': 'sfp-eth0', 'drvname': 'mv88x3310', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 0, 'index': 1, 'name': 'f212a600.mdio-mii:00', 'upstream-type': 'mac'}, {'drvname': 'Marvell 88E1111', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 21040322, 'index': 2, 'name': 'i2c:sfp-eth0:16', 'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'}, 'upstream-type': 'phy'}, {'drvname': 'Marvell 88E1510', 'header': {'dev-index': 4, 'dev-name': 'eth2'}, 'id': 21040593, 'index': 1, 'name': 'f212a200.mdio-mii:00', 'upstream-type': 'mac'}] -- Note that this shouldn't actually work as the 88x3310 PHY doesn't allow a 1G SFP to be connected to its SFP interface, and I don't have a 10G copper SFP, so for the sake of the demo I applied the following modification, which of courses gives a non-functionnal link, but the PHY attach still works, which is what I want to demonstrate : @@ -488,7 +488,7 @@ static int mv3310_sfp_insert(void *upstream, const struct sfp_eeprom_id *id) if (iface != PHY_INTERFACE_MODE_10GBASER) { dev_err(&phydev->mdio.dev, "incompatible SFP module inserted\n"); - return -EINVAL; + //return -EINVAL; } return 0; } Finally an example of the filtered DUMP operation that Jakub suggested in V1 : [{'downstream-sfp-name': 'sfp-eth0', 'drvname': 'mv88x3310', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 0, 'index': 1, 'name': 'f212a600.mdio-mii:00', 'upstream-type': 'mac'}, {'drvname': 'Marvell 88E1111', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 21040322, 'index': 2, 'name': 'i2c:sfp-eth0:16', 'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'}, 'upstream-type': 'phy'}] And a classic GET operation allows querying a single PHY's info : {'drvname': 'Marvell 88E1111', 'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'id': 21040322, 'index': 2, 'name': 'i2c:sfp-eth0:16', 'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'}, 'upstream-type': 'phy'} Changed in V5: - Removed the RTNL assertion in the topology ops - Made the phy_topo_get_phy inline - Fixed the PSE-PD multi-PHY support by re-adding a wrongly dropped check - Fixed some typos in the documentation - Fixed reverse xmas trees Changes in V4: - Dropped the RFC flag - Made the net_device integration independent to having phylib enabled - Removed the autogenerated ethtool-user code for the YNL specs Changes in V3: - Added RTNL assertions where needed - Fixed issues in the DUMP code for PHY_GET, which crashed when running it twice in a row - Added the documentation, and moved in-source docs around - renamed link_topology to phy_link_topology Changes in V2: - Added the DUMP operation - Added much more information in the reported data, to be able to reconstruct precisely the topology tree - renamed phy_list to link_topology ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01Documentation: networking: document phy_link_topologyMaxime Chevallier2-0/+122
The newly introduced phy_link_topology tracks all ethernet PHYs that are attached to a netdevice. Document the base principle, internal and external APIs. As the phy_link_topology is expected to be extended, this documentation will hold any further improvements and additions made relative to topology handling. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: strset: Allow querying phy stats by indexMaxime Chevallier1-7/+8
The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer from the ethnl request to allow query phy stats from each PHY on the link. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: cable-test: Target the command to the requested PHYMaxime Chevallier1-6/+6
Cable testing is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: pse-pd: Target the command to the requested PHYMaxime Chevallier1-6/+3
PSE and PD configuration is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY device. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: plca: Target the command to the requested PHYMaxime Chevallier1-7/+6
PLCA is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01netlink: specs: add ethnl PHY_GET command setMaxime Chevallier1-0/+65
The PHY_GET command, supporting both DUMP and GET operations, is used to retrieve the list of PHYs connected to a netdevice, and get topology information to know where exactly it sits on the physical link. Add the netlink specs corresponding to that command. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: Introduce a command to list PHYs on an interfaceMaxime Chevallier6-1/+394
As we have the ability to track the PHYs connected to a net_device through the link_topology, we can expose this list to userspace. This allows userspace to use these identifiers for phy-specific commands and take the decision of which PHY to target by knowing the link topology. Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list devices on only one interface. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01netlink: specs: add phy-index as a header parameterMaxime Chevallier1-0/+3
Update the spec to take the newly introduced phy-index as a generic request parameter. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: ethtool: Allow passing a phy index for some commandsMaxime Chevallier4-2/+37
Some netlink commands are target towards ethernet PHYs, to control some of their features. As there's several such commands, add the ability to pass a PHY index in the ethnl request, which will populate the generic ethnl_req_info with the relevant phydev when the command targets a PHY. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: sfp: Add helper to return the SFP bus nameMaxime Chevallier2-0/+17
Knowing the bus name is helpful when we want to expose the link topology to userspace, add a helper to return the SFP bus name. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: phy: add helpers to handle sfp phy connect/disconnectMaxime Chevallier6-0/+50
There are a few PHY drivers that can handle SFP modules through their sfp_upstream_ops. Introduce Phylib helpers to keep track of connected SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the upstream PHY's netdev's namespace. By doing so, these SFP PHYs can be enumerated and exposed to users, which will be able to use their capabilities. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: sfp: pass the phy_device when disconnecting an sfp module's PHYMaxime Chevallier4-4/+13
Pass the phy_device as a parameter to the sfp upstream .disconnect_phy operation. This is preparatory work to help track phy devices across a net_device's link. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: phy: Introduce ethernet link topology representationMaxime Chevallier10-2/+188
Link topologies containing multiple network PHYs attached to the same net_device can be found when using a PHY as a media converter for use with an SFP connector, on which an SFP transceiver containing a PHY can be used. With the current model, the transceiver's PHY can't be used for operations such as cable testing, timestamping, macsec offload, etc. The reason being that most of the logic for these configuration, coming from either ethtool netlink or ioctls tend to use netdev->phydev, which in multi-phy systems will reference the PHY closest to the MAC. Introduce a numbering scheme allowing to enumerate PHY devices that belong to any netdev, which can in turn allow userspace to take more precise decisions with regard to each PHY's configuration. The numbering is maintained per-netdev, in a phy_device_list. The numbering works similarly to a netdevice's ifindex, with identifiers that are only recycled once INT_MAX has been reached. This prevents races that could occur between PHY listing and SFP transceiver removal/insertion. The identifiers are assigned at phy_attach time, as the numbering depends on the netdevice the phy is attached to. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01bcachefs: make RO snapshots actually ROKent Overstreet6-14/+63
Add checks to all the VFS paths for "are we in a RO snapshot?". Note - we don't check this when setting inode options via our xattr interface, since those generally only affect data placement, not contents of data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
2024-01-01bcachefs: bch_sb_field_downgradeKent Overstreet10-9/+255
Add a new superblock section that contains a list of { minor version, recovery passes, errors_to_fix } that is - a list of recovery passes that must be run when downgrading past a given version, and a list of errors to silently fix. The upcoming disk accounting rewrite is not going to be fully compatible: we're going to have to regenerate accounting both when upgrading to the new version, and also from downgrading from the new version, since the new method of doing disk space accounting is a completely different architecture based on deltas, and synchronizing them for every jounal entry write to maintain compatibility is going to be too expensive and impractical. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_sb.recovery_passes_requiredKent Overstreet9-30/+172
Add two new superblock fields. Since the main section of the superblock is now fully, we have to add a new variable length section for them - bch_sb_field_ext. - recovery_passes_requried: recovery passes that must be run on the next mount - errors_silent: errors that will be silently fixed These are to improve upgrading and dwongrading: these fields won't be cleared until after recovery successfully completes, so there won't be any issues with crashing partway through an upgrade or a downgrade. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Add persistent identifiers for recovery passesKent Overstreet3-39/+84
The next patch will start to refer to recovery passes from the superblock; naturally, we now need identifiers that don't change, since the existing enum is in the order in which they are run and is not fixed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: prt_bitflags_vector()Kent Overstreet3-0/+25
similar to prt_bitflags(), but for ulong arrays Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: move BCH_SB_ERRS() to sb-errors_types.hKent Overstreet2-253/+253
we need BCH_SB_ERR_MAX in bcachefs.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: fix buffer overflow in nocow write pathKent Overstreet1-41/+41
BCH_REPLICAS_MAX isn't the actual maximum number of pointers in an extent, it's the maximum number of dirty pointers. We don't have a real restriction on the number of cached pointers, and we don't want a fixed size array here anyways - so switch to DARRAY_PREALLOCATED(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-and-tested-by: Daniel J Blueman <daniel@quora.org>
2024-01-01bcachefs: DARRAY_PREALLOCATED()Kent Overstreet2-13/+20
Add support to darray for preallocating some number of elements. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Switch darray to kvmalloc()Kent Overstreet2-2/+4
We sometimes use darrays for quite large buffers - the btree write buffer in particular needs large buffers, since it must be sized to hold all the write buffer keys outstanding in the journal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Factor out darray resize slowpathKent Overstreet3-11/+39
Move the slowpath (actually growing the darray) to an out-of-line function; also, add some helpers for the upcoming btree write buffer rewrite. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: fix setting version_upgrade_completeKent Overstreet1-2/+2
If a superblock write hasn't happened (i.e. we never had to go rw), then c->sb.version will be out of date w.r.t. c->disk_sb.sb->version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: fix invalid free in dio write pathKent Overstreet1-8/+5
turns out iterate_iovec() mutates __iov, we need to save our own copy Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: Marcin Mirosław <marcin@mejor.pl>
2024-01-01bcachefs: Fix extents iteration + snapshots interactionKent Overstreet1-11/+24
peek_upto() checks against the end position and bails out before FILTER_SNAPSHOTS checks; this is because if we end up at a different inode number than the original search key none of the keys we see might be visibile in the current snapshot - we might be looking at inode in a completely different subvolume. But this is broken, because when we're iterating over extents we're checking against the extent start position to decide when to bail out, and the extent start position isn't monotonically increasing until after we've run FILTER_SNAPSHOTS. Fix this by adding a simple inode number check where the old bailout check was, and moving the main check to the correct position. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
2024-01-01Merge tag 'nf-next-23-12-22' of ↵David S. Miller7-38/+567
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== netfilter pull request 23-12-22 The following patchset contains Netfilter updates for net-next: 1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a race scenario with two concurrent processes running a dump-and-reset which exposes negative counters to userspace, from Phil Sutter. 2) Use GFP_KERNEL in pipapo GC, from Florian Westphal. 3) Reorder nf_flowtable struct members, place the read-mostly parts accessed by the datapath first. From Florian Westphal. 4) Set on dead flag for NFT_MSG_NEWSET in abort path, from Florian Westphal. 5) Support filtering zone in ctnetlink, from Felix Huettner. 6) Bail out if user tries to redefine an existing chain with different type in nf_tables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01MAINTAINERS: step down as TJA11XX C45 maintainerRadu Pirea (NXP OSS)1-1/+1
I am stepping down as TJA11XX C45 maintainer. Andrei Botila will take the responsibility to maintain and improve the support for TJA11XX C45 PHYs. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01Merge tag 'for-netdev' of ↵David S. Miller23-431/+652
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next-for-netdev The following pull-request contains BPF updates for your *net-next* tree. We've added 22 non-merge commits during the last 3 day(s) which contain a total of 23 files changed, 652 insertions(+), 431 deletions(-). The main changes are: 1) Add verifier support for annotating user's global BPF subprogram arguments with few commonly requested annotations for a better developer experience, from Andrii Nakryiko. These tags are: - Ability to annotate a special PTR_TO_CTX argument - Ability to annotate a generic PTR_TO_MEM as non-NULL 2) Support BPF verifier tracking of BPF_JNE which helps cases when the compiler transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like, from Menglong Dong. 3) Fix a warning in bpf_mem_cache's check_obj_size() as reported by LKP, from Hou Tao. 4) Re-support uid/gid options when mounting bpffs which had to be reverted with the prior token series revert to avoid conflicts, from Daniel Borkmann. 5) Fix a libbpf NULL pointer dereference in bpf_object__collect_prog_relos() found from fuzzing the library with malformed ELF files, from Mingyi Zhang. 6) Skip DWARF sections in libbpf's linker sanity check given compiler options to generate compressed debug sections can trigger a rejection due to misalignment, from Alyssa Ross. 7) Fix an unnecessary use of the comma operator in BPF verifier, from Simon Horman. 8) Fix format specifier for unsigned long values in cpustat sample, from Colin Ian King. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01r8169: Fix PCI error on system resumeKai-Heng Feng1-1/+1
Some r8168 NICs stop working upon system resume: [ 688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000). [ 688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down ... [ 691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000) Not sure if it's related, but those NICs have a BMC device at function 0: 02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a) Trial and error shows that increase the loop wait on rtl_ep_ocp_read_cond to 30 can eliminate the issue, so let rtl8168ep_driver_start() to wait a bit longer. Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net/tcp_sigpool: Use kref_get_unless_zero()Dmitry Safonov1-3/+2
The freeing and re-allocation of algorithm are protected by cpool_mutex, so it doesn't fix an actual use-after-free, but avoids a deserved refcount_warn_saturate() warning. A trivial fix for the racy behavior. Fixes: 8c73b26315aa ("net/tcp: Prepare tcp_md5sig_pool for TCP-AO") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01net: sched: em_text: fix possible memory leak in em_text_destroy()Hangyu Hua1-1/+3
m->data needs to be freed when em_text_destroy is called. Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)") Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>