summaryrefslogtreecommitdiff
path: root/net/mptcp/pm_kernel.c
AgeCommit message (Collapse)AuthorFilesLines
8 daysmptcp: pm: in-kernel: add laminar endpointsMatthieu Baerts (NGI0)1-0/+82
Currently, upon the reception of an ADD_ADDR (and when the fullmesh flag is not used), the in-kernel PM will create new subflows using the local address the routing configuration will pick. It would be easier to pick local addresses from a selected list of endpoints, and use it only once, than relying on routing rules. Use case: both the client (C) and the server (S) have two addresses (a and b). The client establishes the connection between C(a) and S(a). Once established, the server announces its additional address S(b). Once received, the client connects to it using its second address C(b). Compared to a situation without the 'laminar' endpoint for C(b), the client didn't use this address C(b) to establish a subflow to the server's primary address S(a). So at the end, we have: C S C(a) --- S(a) C(b) --- S(b) In case of a 3rd address on each side (C(c) and S(c)), upon the reception of an ADD_ADDR with S(c), the client should not pick C(b) because it has already been used. C(c) should then be used. Note that this situation is currently possible if C doesn't add any endpoint, but configure the routing in order to pick C(b) for the route to S(b), and pick C(c) for the route to S(c). That doesn't sound very practical because it means knowing in advance the IP addresses that will be used and announced by the server. 'laminar', like the idea of laminar flows: the different subflows don't mix with each other on an endpoint, unlike the "turbulent" way traffic is mixed by 'fullmesh'. In the code, the new endpoint type is added. Similar to the other subflow types, an MPTCP_INFO counter is added. While at it, hole are now commented in struct mptcp_info, to remember next time that these holes can no longer be used. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/503 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-15-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: compare IDs instead of addressesMatthieu Baerts (NGI0)1-38/+44
When receiving an ADD_ADDR right after the 3WHS, the connection will switch to 'fully established'. It means the MPTCP worker will be called to treat two events, in this order: ADD_ADDR_RECEIVED, PM_ESTABLISHED. The MPTCP endpoints cannot have the ID 0, because it is reserved to the address and port used by the initial subflow. To be able to deal with this case in different places, msk->mpc_endpoint_id contains the endpoint ID linked to the initial subflow. This variable was only set when treating the first PM_ESTABLISHED event, after ADD_ADDR_RECEIVED. That's why in fill_local_addresses_vec(), the endpoint addresses were compared with the one of the initial subflow, instead of only comparing the IDs. Instead, msk->mpc_endpoint_id is now set when treating ADD_ADDR_RECEIVED as well, if needed, then the IDs can be compared. To be able to do so, the code doing that is now in a dedicated helper, and called from the functions linked to the two actions. While at it, mptcp_endp_get_local_id() has also been moved up, next to this new helper, because they are linked, and to be able to use it in fill_local_addresses_vec() in the next commit. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-14-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: reduce pernet struct sizeMatthieu Baerts (NGI0)1-36/+23
All the 'unsigned int' variables from the 'pm_nl_pernet' structure are bounded to MPTCP_PM_ADDR_MAX, currently set to 8. The endpoint ID is also bounded by the protocol to 8-bit. MPTCP_PM_ADDR_MAX, if extended later, will never over 8-bit. So no need to use 'unsigned int' variables, 'u8' is enough. Note that the exposed counters in MPTCP_INFO are already limited to 8-bit, same for pm->extra_subflows, and others. So it seems even better to limit them to 8-bit. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-13-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: remove stale_loss_cntMatthieu Baerts (NGI0)1-2/+0
It is currently not used. It was in fact never used since its introduction in commit ff5a0b421cb2 ("mptcp: faster active backup recovery"). It was probably initially added to struct pm_nl_pernet during the development of this commit, before being added to struct mptcp_pernet in ctrl.c, but not removed from the first place. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-12-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'addrs' to 'endpoints'Matthieu Baerts (NGI0)1-6/+6
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'addrs', which in fact represents the number of declared endpoints, and not only the 'signal' endpoints. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-11-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'local_addr_list' to 'endp_list'Matthieu Baerts (NGI0)1-12/+12
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'local_addr_list', which in fact represents the list of endpoints, and not only the 'subflow' endpoints. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-10-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'local_addr_max' to 'endp_subflow_max'Matthieu Baerts (NGI0)1-20/+20
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'local_addr_max', which in fact represents the maximum number of 'subflow' endpoints that can be used to create new subflows, and not the number of local addresses that have been used to create subflows. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_endp_subflow_max. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. Also move the variable and function next to the other 'endp_X_max' ones. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-9-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'add_addr_accept_max' to 'limit_add_addr_accepted'Matthieu Baerts (NGI0)1-12/+15
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'add_addr_accept_max', which in fact represents the limit of ADD_ADDR that can be accepted: the limit set via 'ip mptcp limit add_addr_accepted X' for example. It is not linked to the maximum number of accepted ADD_ADDR. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_limit_add_addr_accepted. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-8-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'add_addr_signal_max' to 'endp_signal_max'Matthieu Baerts (NGI0)1-13/+13
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'add_addr_signal_max', which in fact represents the maximum number of 'signal' endpoints that can be used to announced addresses, and not the number of ADD_ADDR that can be signalled. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_endp_signal_max. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-7-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: rename 'subflows_max' to 'limit_extra_subflows'Matthieu Baerts (NGI0)1-23/+25
A few variables linked to the in-kernel Path-Manager are confusing, and it would help current and future developers, to clarify them. One of them is 'subflows_max', which in fact represents the limit of extra subflows: the limit set via 'ip mptcp limit subflows X' for example. It is not linked to the maximum number of created / possible subflows. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_limit_extra_subflows. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-6-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: rename 'subflows' to 'extra_subflows'Matthieu Baerts (NGI0)1-12/+12
A few variables linked to the Path-Managers are confusing, and it would help current and future developers, to clarify them. One of them is 'subflows', which in fact represents the number of extra subflows: all the additional subflows created after the initial one, and not the total number of subflows. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_extra_subflows. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-5-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: refactor fill_remote_addresses_vecMatthieu Baerts (NGI0)1-49/+67
Before this modification, this function was quite long with many levels of indentations. Each case can be split in a dedicated function: fullmesh, non-fullmesh. To remove one level of indentation, msk->pm.subflows >= subflows_max is now checked after having added one subflow, and stops the loop if it is no longer possible to add new subflows. This is fine to do this because this function should only be called if msk->pm.subflows < subflows_max. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-4-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: refactor fill_local_addresses_vecMatthieu Baerts (NGI0)1-71/+104
Before this modification, this function was quite long with many levels of indentations. Each case can be split in a dedicated function: fullmesh, C flag, any. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-3-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daysmptcp: pm: in-kernel: usable client side with C-flagMatthieu Baerts (NGI0)1-1/+49
When servers set the C-flag in their MP_CAPABLE to tell clients not to create subflows to the initial address and port, clients will likely not use their other endpoints. That's because the in-kernel path-manager uses the 'subflow' endpoints to create subflows only to the initial address and port. If the limits have not been modified to accept ADD_ADDR, the client doesn't try to establish new subflows. If the limits accept ADD_ADDR, the routing routes will be used to select the source IP. The C-flag is typically set when the server is operating behind a legacy Layer 4 load balancer, or using anycast IP address. Clients having their different 'subflow' endpoints setup, don't end up creating multiple subflows as expected, and causing some deployment issues. A special case is then added here: when servers set the C-flag in the MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. The 'subflows' endpoints will then be used with this new remote IP and port. This exception is only allowed when the ADD_ADDR is sent immediately after the 3WHS, and makes the client switching to the 'fully established' mode. After that, 'select_local_address()' will not be able to find any subflows, because 'id_avail_bitmap' will be filled in mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully established' mode. Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536 Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-1-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19mptcp: pm: kernel: flush: do not reset ADD_ADDR limitMatthieu Baerts (NGI0)1-1/+0
A flush of the MPTCP endpoints should not affect the MPTCP limits. In other words, 'ip mptcp endpoint flush' should not change 'ip mptcp limits'. But it was the case: the MPTCP_PM_ATTR_RCV_ADD_ADDRS (add_addr_accepted) limit was reset by accident. Removing the reset of this counter during a flush fixes this issue. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reported-by: Thomas Dreibholz <dreibh@simula.no> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/579 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-2-521fe9957892@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-20mptcp: pm: register in-kernel and userspace PMGeliang Tang1-0/+7
This patch defines the original in-kernel netlink path manager as a new struct mptcp_pm_ops named "mptcp_pm_kernel", and register it in mptcp_pm_kernel_register(). And define the userspace path manager as a new struct mptcp_pm_ops named "mptcp_pm_userspace", and register it in mptcp_pm_init(). To ensure that there's always a valid path manager available, the default path manager "mptcp_pm_kernel" will be skipped in mptcp_pm_unregister(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-7-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: in-kernel: use kmemdup helperGeliang Tang1-4/+2
Instead of using kmalloc() or kzalloc() to allocate an entry and then immediately duplicate another entry to the newly allocated one, kmemdup() helper can be used to simplify the code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-2-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: pm: split netlink and in-kernel initMatthieu Baerts (NGI0)1-4/+1
The registration of mptcp_genl_family is useful for both the in-kernel and the userspace PM. It should then be done in pm_netlink.c. On the other hand, the registration of the in-kernel pernet subsystem is specific to the in-kernel PM, and should stay there in pm_kernel.c. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-1-f4e4a88efc50@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-10mptcp: pm: split in-kernel PM specific codeMatthieu Baerts (NGI0)1-0/+1410
Before this patch, the PM code was dispersed in different places: - pm.c had common code for all PMs - pm_netlink.c was supposed to be about the in-kernel PM, but also had exported common Netlink helpers, NL events for PM userspace daemons, etc. quite confusing. To clarify the code, a reorganisation is suggested here, only by moving code around to avoid confusions: - pm_netlink.c now only contains common PM Netlink code: - PM events: this code was already there - shared helpers around Netlink code that were already there as well - more shared Netlink commands code from pm.c will come after - pm_kernel.c now contains only code that is specific to the in-kernel PM. Now all functions are either called from: - pm.c: events coming from the core, when this PM is being used - pm_netlink.c: for shared Netlink commands - mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM - sockopt.c: for the exported counters per netns - (while at it, a useless 'return;' spot by checkpatch at the end of mptcp_pm_nl_set_flags_all, has been removed) The code around the PM is now less confusing, which should help for the maintenance in the long term. This will certainly impact future backports, but because other cleanups have already done recently, and more are coming to ease the addition of a new path-manager controlled with BPF (struct_ops), doing that now seems to be a good time. Also, many issues around the PM have been fixed a few months ago while increasing the code coverage in the selftests, so such big reorganisation can be done with more confidence now. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-14-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>