starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-10-28	net: dsa: microchip: ksz9477: Add Wake on Magic Packet support	Oleksij Rempel	3	-7/+97
	Introduce Wake on Magic Packet (WoL) functionality to the ksz9477 driver. Major changes include: 1. Extending the `ksz9477_handle_wake_reason` function to identify Magic Packet wake events alongside existing wake reasons. 2. Updating the `ksz9477_get_wol` and `ksz9477_set_wol` functions to handle WAKE_MAGIC alongside the existing WAKE_PHY option, and to program the switch's MAC address register accordingly when Magic Packet wake-up is enabled. This change will prevent WAKE_MAGIC activation if the related port has a different MAC address compared to a MAC address already used by HSR or an already active WAKE_MAGIC on another port. 3. Adding a restriction in `ksz_port_set_mac_address` to prevent MAC address changes on ports with active Wake on Magic Packet, as the switch's MAC address register is utilized for this feature. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20231026051051.2316937-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	af_unix: Remove module remnants.	Kuniyuki Iwashima	1	-19/+4
	Since commit 97154bcf4d1b ("af_unix: Kconfig: make CONFIG_UNIX bool"), af_unix.c is no longer built as module. Let's remove unnecessary #if condition, exitcall, and module macros. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20231026212305.45545-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	Merge branch 'mptcp-fixes-and-cleanup-for-v6-7'	Jakub Kicinski	7	-31/+88
	Mat Martineau says: ==================== mptcp: Fixes and cleanup for v6.7 This series includes three initial patches that we had queued in our mptcp-net branch, but given the likely timing of net/net-next syncs this week, the need to avoid introducing branch conflicts, and another batch of net-next patches pending in the mptcp tree, the most practical route is to send everything for net-next. Patches 1 & 2 fix some intermittent selftest failures by adjusting timing. Patch 3 removes an unneccessary userspace path manager restriction on the removal of subflows with subflow ID 0. The remainder of the patches are all cleanup or selftest changes: Patches 4-8 clean up kernel code by removing unused parameters, making more consistent use of existing helper functions, and reducing extra casting of socket pointers. Patch 9 removes an unused variable in a selftest script. Patch 10 adds a little more detail to some mptcp_join test output. ==================== Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-0-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	selftests: mptcp: display simult in extra_msg	Geliang Tang	1	-1/+4
	Just like displaying "invert" after "Info: ", "simult" should be displayed too when rm_subflow_nr doesn't match the expect value in chk_rm_nr(): syn [ ok ] synack [ ok ] ack [ ok ] add [ ok ] echo [ ok ] rm [ ok ] rmsf [ ok ] 3 in [2:4] Info: invert simult syn [ ok ] synack [ ok ] ack [ ok ] add [ ok ] echo [ ok ] rm [ ok ] rmsf [ ok ] Info: invert Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-10-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	selftests: mptcp: sockopt: drop mptcp_connect var	Geliang Tang	1	-1/+0
	Global var mptcp_connect defined at the front of mptcp_sockopt.sh is duplicate with local var mptcp_connect defined in do_transfer(), drop this useless global one. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-9-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: define more local variables sk	Geliang Tang	1	-11/+20
	'(struct sock *)msk' is used several times in mptcp_nl_cmd_announce(), mptcp_nl_cmd_remove() or mptcp_userspace_pm_set_flags() in pm_userspace.c, it's worth adding a local variable sk to point it. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-8-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: move sk assignment statement ahead	Geliang Tang	1	-5/+6
	If we move the sk assignment statement ahead in mptcp_nl_cmd_sf_create() or mptcp_nl_cmd_sf_destroy(), right after the msk null-check statements, sk can be used after the create_err or destroy_err labels instead of open-coding it again. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-7-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: use mptcp_get_ext helper	Geliang Tang	1	-2/+2
	Use mptcp_get_ext() helper defined in protocol.h instead of open-coding it in mptcp_sendmsg_frag(). Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-6-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: use mptcp_check_fallback helper	Geliang Tang	2	-2/+2
	Use __mptcp_check_fallback() helper defined in net/mptcp/protocol.h, instead of open-coding it in both __mptcp_do_fallback() and mptcp_diag_fill_info(). Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-5-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: drop useless ssk in pm_subflow_check_next	Geliang Tang	3	-3/+3
	The code using 'ssk' parameter of mptcp_pm_subflow_check_next() has been dropped in commit "95d686517884 (mptcp: fix subflow accounting on close)". So drop this useless parameter ssk. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-4-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	mptcp: userspace pm send RM_ADDR for ID 0	Geliang Tang	1	-0/+39
	This patch adds the ability to send RM_ADDR for local ID 0. Check whether id 0 address is removed, if not, put id 0 into a removing list, pass it to mptcp_pm_remove_addr() to remove id 0 address. There is no reason not to allow the userspace to remove the initial address (ID 0). This special case was not taken into account not letting the userspace to delete all addresses as announced. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/379 Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-3-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	selftests: mptcp: fix wait_rm_addr/sf parameters	Geliang Tang	1	-4/+10
	The second input parameter of 'wait_rm_addr/sf $1 1' is misused. If it's 1, wait_rm_addr/sf will never break, and will loop ten times, then 'wait_rm_addr/sf' equals to 'sleep 1'. This delay time is too long, which can sometimes make the tests fail. A better way to use wait_rm_addr/sf is to use rm_addr/sf_count to obtain the current value, and then pass into wait_rm_addr/sf. Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer") Cc: stable@vger.kernel.org Suggested-by: Matthieu Baerts <matttbe@kernel.org> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-2-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	selftests: mptcp: run userspace pm tests slower	Geliang Tang	1	-2/+2
	Some userspace pm tests failed are reported by CI: 112 userspace pm add & remove address syn [ ok ] synack [ ok ] ack [ ok ] add [ ok ] echo [ ok ] mptcp_info subflows=1:1 [ ok ] subflows_total 2:2 [ ok ] mptcp_info add_addr_signal=1:1 [ ok ] rm [ ok ] rmsf [ ok ] Info: invert mptcp_info subflows=0:0 [ ok ] subflows_total 1:1 [fail] got subflows 0:0 expected 1:1 Server ns stats TcpPassiveOpens 2 0.0 TcpInSegs 118 0.0 This patch fixes them by changing 'speed' to 5 to run the tests much more slowly. Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer") Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-1-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	net: selftests: use ethtool_sprintf()	Jakub Kicinski	1	-6/+3
	During a W=1 build GCC 13.2 says: net/core/selftests.c: In function ‘net_selftest_get_strings’: net/core/selftests.c:404:52: error: ‘%s’ directive output may be truncated writing up to 279 bytes into a region of size 28 [-Werror=format-truncation=] 404 \| snprintf(p, ETH_GSTRING_LEN, "%2d. %s", i + 1, \| ^~ net/core/selftests.c:404:17: note: ‘snprintf’ output between 5 and 284 bytes into a destination of size 32 404 \| snprintf(p, ETH_GSTRING_LEN, "%2d. %s", i + 1, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 405 \| net_selftests[i].name); \| ~~~~~~~~~~~~~~~~~~~~~~ avoid it by using ethtool_sprintf(). Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20231026022916.566661-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27	net: bridge: fill in MODULE_DESCRIPTION()	Nikolay Aleksandrov	1	-0/+1
	Fill in bridge's module description. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	virtio_net: use u64_stats_t infra to avoid data-races	Eric Dumazet	1	-59/+65
	syzbot reported a data-race in virtnet_poll / virtnet_stats [1] u64_stats_t infra has very nice accessors that must be used to avoid potential load-store tearing. [1] BUG: KCSAN: data-race in virtnet_poll / virtnet_stats read-write to 0xffff88810271b1a0 of 8 bytes by interrupt on cpu 0: virtnet_receive drivers/net/virtio_net.c:2102 [inline] virtnet_poll+0x6c8/0xb40 drivers/net/virtio_net.c:2148 __napi_poll+0x60/0x3b0 net/core/dev.c:6527 napi_poll net/core/dev.c:6594 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6727 __do_softirq+0xc1/0x265 kernel/softirq.c:553 invoke_softirq kernel/softirq.c:427 [inline] __irq_exit_rcu kernel/softirq.c:632 [inline] irq_exit_rcu+0x3b/0x90 kernel/softirq.c:644 common_interrupt+0x7f/0x90 arch/x86/kernel/irq.c:247 asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636 __sanitizer_cov_trace_const_cmp8+0x0/0x80 kernel/kcov.c:306 jbd2_write_access_granted fs/jbd2/transaction.c:1174 [inline] jbd2_journal_get_write_access+0x94/0x1c0 fs/jbd2/transaction.c:1239 __ext4_journal_get_write_access+0x154/0x3f0 fs/ext4/ext4_jbd2.c:241 ext4_reserve_inode_write+0x14e/0x200 fs/ext4/inode.c:5745 __ext4_mark_inode_dirty+0x8e/0x440 fs/ext4/inode.c:5919 ext4_evict_inode+0xaf0/0xdc0 fs/ext4/inode.c:299 evict+0x1aa/0x410 fs/inode.c:664 iput_final fs/inode.c:1775 [inline] iput+0x42c/0x5b0 fs/inode.c:1801 do_unlinkat+0x2b9/0x4f0 fs/namei.c:4405 __do_sys_unlink fs/namei.c:4446 [inline] __se_sys_unlink fs/namei.c:4444 [inline] __x64_sys_unlink+0x30/0x40 fs/namei.c:4444 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff88810271b1a0 of 8 bytes by task 2814 on cpu 1: virtnet_stats+0x1b3/0x340 drivers/net/virtio_net.c:2564 dev_get_stats+0x6d/0x860 net/core/dev.c:10511 rtnl_fill_stats+0x45/0x320 net/core/rtnetlink.c:1261 rtnl_fill_ifinfo+0xd0e/0x1120 net/core/rtnetlink.c:1867 rtnl_dump_ifinfo+0x7f9/0xc20 net/core/rtnetlink.c:2240 netlink_dump+0x390/0x720 net/netlink/af_netlink.c:2266 netlink_recvmsg+0x425/0x780 net/netlink/af_netlink.c:1992 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] ____sys_recvmsg+0x156/0x310 net/socket.c:2760 ___sys_recvmsg net/socket.c:2802 [inline] __sys_recvmsg+0x1ea/0x270 net/socket.c:2832 __do_sys_recvmsg net/socket.c:2842 [inline] __se_sys_recvmsg net/socket.c:2839 [inline] __x64_sys_recvmsg+0x46/0x50 net/socket.c:2839 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x000000000045c334 -> 0x000000000045c376 Fixes: 3fa2a1df9094 ("virtio-net: per cpu 64 bit stats (v2)") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	Merge branch 'mdb-get'	David S. Miller	13	-199/+608
	Ido Schimmel says: ==================== Add MDB get support This patchset adds MDB get support, allowing user space to request a single MDB entry to be retrieved instead of dumping the entire MDB. Support is added in both the bridge and VXLAN drivers. Patches #1-#6 are small preparations in both drivers. Patches #7-#8 add the required uAPI attributes for the new functionality and the MDB get net device operation (NDO), respectively. Patches #9-#10 implement the MDB get NDO in both drivers. Patch #11 registers a handler for RTM_GETMDB messages in rtnetlink core. The handler derives the net device from the ifindex specified in the ancillary header and invokes its MDB get NDO. Patches #12-#13 add selftests by converting tests that use MDB dump with grep to the new MDB get functionality. iproute2 changes can be found here [1]. v2: * Patch #7: Add a comment to describe attributes structure. * Patch #9: Add a comment above spin_lock_bh(). [1] https://github.com/idosch/iproute2/tree/submit/mdb_get_v1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	selftests: vxlan_mdb: Use MDB get instead of dump	Ido Schimmel	1	-54/+54
	Test the new MDB get functionality by converting dump and grep to MDB get. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	selftests: bridge_mdb: Use MDB get instead of dump	Ido Schimmel	1	-113/+71
	Test the new MDB get functionality by converting dump and grep to MDB get. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	rtnetlink: Add MDB get support	Ido Schimmel	1	-1/+88
	Now that both the bridge and VXLAN drivers implement the MDB get net device operation, expose the functionality to user space by registering a handler for RTM_GETMDB messages. Derive the net device from the ifindex specified in the ancillary header and invoke its MDB get NDO. Note that unlike other get handlers, the allocation of the skb containing the response is not performed in the common rtnetlink code as the size is variable and needs to be determined by the respective driver. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	vxlan: mdb: Add MDB get support	Ido Schimmel	3	-0/+153
	Implement support for MDB get operation by looking up a matching MDB entry, allocating the skb according to the entry's size and then filling in the response. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: mcast: Add MDB get support	Ido Schimmel	3	-0/+168
	Implement support for MDB get operation by looking up a matching MDB entry, allocating the skb according to the entry's size and then filling in the response. The operation is performed under the bridge multicast lock to ensure that the entry does not change between the time the reply size is determined and when the reply is filled in. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net: Add MDB get device operation	Ido Schimmel	1	-0/+4
	Add MDB net device operation that will be invoked by rtnetlink code in response to received RTM_GETMDB messages. Subsequent patches will implement the operation in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: add MDB get uAPI attributes	Ido Schimmel	1	-0/+18
	Add MDB get attributes that correspond to the MDB set attributes used in RTM_NEWMDB messages. Specifically, add 'MDBA_GET_ENTRY' which will hold a 'struct br_mdb_entry' and 'MDBA_GET_ENTRY_ATTRS' which will hold 'MDBE_ATTR_*' attributes that are used as indexes (source IP and source VNI). An example request will look as follows: [ struct nlmsghdr ] [ struct br_port_msg ] [ MDBA_GET_ENTRY ] struct br_mdb_entry [ MDBA_GET_ENTRY_ATTRS ] [ MDBE_ATTR_SOURCE ] struct in_addr / struct in6_addr [ MDBE_ATTR_SRC_VNI ] u32 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	vxlan: mdb: Factor out a helper for remote entry size calculation	Ido Schimmel	1	-9/+19
	Currently, netlink notifications are sent for individual remote entries and not for the entire MDB entry itself. Subsequent patches are going to add MDB get support which will require the VXLAN driver to reply with an entire MDB entry. Therefore, as a preparation, factor out a helper to calculate the size of an individual remote entry. When determining the size of the reply this helper will be invoked for each remote entry in the MDB entry. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	vxlan: mdb: Adjust function arguments	Ido Schimmel	1	-6/+4
	Adjust the function's arguments and rename it to allow it to be reused by future call sites that only have access to 'struct vxlan_mdb_entry_key', but not to 'struct vxlan_mdb_config'. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: mcast: Rename MDB entry get function	Ido Schimmel	4	-8/+11
	The current name is going to conflict with the upcoming net device operation for the MDB get operation. Rename the function to br_mdb_entry_skb_get(). No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: mcast: Factor out a helper for PG entry size calculation	Ido Schimmel	1	-7/+13
	Currently, netlink notifications are sent for individual port group entries and not for the entire MDB entry itself. Subsequent patches are going to add MDB get support which will require the bridge driver to reply with an entire MDB entry. Therefore, as a preparation, factor out an helper to calculate the size of an individual port group entry. When determining the size of the reply this helper will be invoked for each port group entry in the MDB entry. No functional changes intended. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: mcast: Account for missing attributes	Ido Schimmel	1	-4/+11
	The 'MDBA_MDB' and 'MDBA_MDB_ENTRY' nest attributes are not accounted for when calculating the size of MDB notifications. Add them along with comments for existing attributes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	bridge: mcast: Dump MDB entries even when snooping is disabled	Ido Schimmel	1	-3/+0
	Currently, the bridge driver does not dump MDB entries when multicast snooping is disabled although the entries are present in the kernel: # bridge mdb add dev br0 port swp1 grp 239.1.1.1 permanent # bridge mdb show dev br0 dev br0 port swp1 grp 239.1.1.1 permanent dev br0 port br0 grp ff02::6a temp dev br0 port br0 grp ff02::1:ff9d:e61b temp # ip link set dev br0 type bridge mcast_snooping 0 # bridge mdb show dev br0 # ip link set dev br0 type bridge mcast_snooping 1 # bridge mdb show dev br0 dev br0 port swp1 grp 239.1.1.1 permanent dev br0 port br0 grp ff02::6a temp dev br0 port br0 grp ff02::1:ff9d:e61b temp This behavior differs from other netlink dump interfaces that dump entries regardless if they are used or not. For example, VLANs are dumped even when VLAN filtering is disabled: # ip link set dev br0 type bridge vlan_filtering 0 # bridge vlan show dev swp1 port vlan-id swp1 1 PVID Egress Untagged Remove the check and always dump MDB entries: # bridge mdb add dev br0 port swp1 grp 239.1.1.1 permanent # bridge mdb show dev br0 dev br0 port swp1 grp 239.1.1.1 permanent dev br0 port br0 grp ff02::6a temp dev br0 port br0 grp ff02::1:ffeb:1a4d temp # ip link set dev br0 type bridge mcast_snooping 0 # bridge mdb show dev br0 dev br0 port swp1 grp 239.1.1.1 permanent dev br0 port br0 grp ff02::6a temp dev br0 port br0 grp ff02::1:ffeb:1a4d temp # ip link set dev br0 type bridge mcast_snooping 1 # bridge mdb show dev br0 dev br0 port swp1 grp 239.1.1.1 permanent dev br0 port br0 grp ff02::6a temp dev br0 port br0 grp ff02::1:ffeb:1a4d temp Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	Merge branch 'tcp-ao'	David S. Miller	24	-436/+5175
	Dmitry Safonov says: ==================== net/tcp: Add TCP-AO support This is version 16 of TCP-AO support. It addresses the build warning in the middle of patch set, reported by kernel test robot. There's one Sparse warning introduced by tcp_sigpool_start(): __cond_acquires() seems to currently being broken. I've described the reasoning for it on v9 cover letter. Also, checkpatch.pl warnings were addressed, but yet I've left the ones that are more personal preferences (i.e. 80 columns limit). Please, ping me if you have a strong feeling about one of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	Documentation/tcp: Add TCP-AO documentation	Dmitry Safonov	2	-0/+445
	It has Frequently Asked Questions (FAQ) on RFC 5925 - I found it very useful answering those before writing the actual code. It provides answers to common questions that arise on a quick read of the RFC, as well as how they were answered. There's also comparison to TCP-MD5 option, evaluation of per-socket vs in-kernel-DB approaches and description of uAPI provided. Hopefully, it will be as useful for reviewing the code as it was for writing. Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP_AO_REPAIR	Dmitry Safonov	4	-11/+125
	Add TCP_AO_REPAIR setsockopt(), getsockopt(). They let a user to repair TCP-AO ISNs/SNEs. Also let the user hack around when (tp->repair) is on and add ao_info on a socket in any supported state. As SNEs now can be read/written at any moment, use WRITE_ONCE()/READ_ONCE() to set/read them. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Wire up l3index to TCP-AO	Dmitry Safonov	8	-79/+177
	Similarly how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds that MKT to a specified by L3 ifinndex. Similarly, without this flag the key will work in the default VRF l3index = 0 for connections. To prevent AO-keys from overlapping, it's restricted to add key B for a socket that has key A, which have the same sndid/rcvid and one of the following is true: - !(A.keyflags & TCP_AO_KEYF_IFINDEX) or !(B.keyflags & TCP_AO_KEYF_IFINDEX) so that any key is non-bound to a VRF - A.l3index == B.l3index both want to work for the same VRF Additionally, it's restricted to match TCP-MD5 keys for the same peer the following way: \|--------------\|--------------------\|----------------\|---------------\| \| \| MD5 key without \| MD5 key \| MD5 key \| \| \| l3index \| l3index=0 \| l3index=N \| \|--------------\|--------------------\|----------------\|---------------\| \| TCP-AO key \| \| \| \| \| without \| reject \| reject \| reject \| \| l3index \| \| \| \| \|--------------\|--------------------\|----------------\|---------------\| \| TCP-AO key \| \| \| \| \| l3index=0 \| reject \| reject \| allow \| \|--------------\|--------------------\|----------------\|---------------\| \| TCP-AO key \| \| \| \| \| l3index=N \| reject \| allow \| reject \| \|--------------\|--------------------\|----------------\|---------------\| This is done with the help of tcp_md5_do_lookup_any_l3index() to reject adding AO key without TCP_AO_KEYF_IFINDEX if there's TCP-MD5 in any VRF. This is important for case where sysctl_tcp_l3mdev_accept = 1 Similarly, for TCP-AO lookups tcp_ao_do_lookup() may be used with l3index < 0, so that __tcp_ao_key_cmp() will match TCP-AO key in any VRF. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add static_key for TCP-AO	Dmitry Safonov	6	-44/+96
	Similarly to TCP-MD5, add a static key to TCP-AO that is patched out when there are no keys on a machine and dynamically enabled with the first setsockopt(TCP_AO) adds a key on any socket. The static key is as well dynamically disabled later when the socket is destructed. The lifetime of enabled static key here is the same as ao_info: it is enabled on allocation, passed over from full socket to twsk and destructed when ao_info is scheduled for destruction. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)	Dmitry Safonov	2	-4/+20
	Delete becomes very, very fast - almost free, but after setsockopt() syscall returns, the key is still alive until next RCU grace period. Which is fine for listen sockets as userspace needs to be aware of setsockopt(TCP_AO) and accept() race and resolve it with verification by getsockopt() after TCP connection was accepted. The benchmark results (on non-loaded box, worse with more RCU work pending): > ok 33 Worst case delete 16384 keys: min=5ms max=10ms mean=6.93904ms stddev=0.263421 > ok 34 Add a new key 16384 keys: min=1ms max=4ms mean=2.17751ms stddev=0.147564 > ok 35 Remove random-search 16384 keys: min=5ms max=10ms mean=6.50243ms stddev=0.254999 > ok 36 Remove async 16384 keys: min=0ms max=0ms mean=0.0296107ms stddev=0.0172078 Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP-AO getsockopt()s	Dmitry Safonov	4	-14/+369
	Introduce getsockopt(TCP_AO_GET_KEYS) that lets a user get TCP-AO keys and their properties from a socket. The user can provide a filter to match the specific key to be dumped or ::get_all = 1 may be used to dump all keys in one syscall. Add another getsockopt(TCP_AO_INFO) for providing per-socket/per-ao_info stats: packet counters, Current_key/RNext_key and flags like ::ao_required and ::accept_icmps. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add option for TCP-AO to (not) hash header	Dmitry Safonov	2	-3/+10
	Provide setsockopt() key flag that makes TCP-AO exclude hashing TCP header for peers that match the key. This is needed for interraction with middleboxes that may change TCP options, see RFC5925 (9.2). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Ignore specific ICMPs for TCP-AO connections	Dmitry Safonov	7	-2/+87
	Similarly to IPsec, RFC5925 prescribes: ">> A TCP-AO implementation MUST default to ignore incoming ICMPv4 messages of Type 3 (destination unreachable), Codes 2-4 (protocol unreachable, port unreachable, and fragmentation needed -- ’hard errors’), and ICMPv6 Type 1 (destination unreachable), Code 1 (administratively prohibited) and Code 4 (port unreachable) intended for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN- WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs." A selftest (later in patch series) verifies that this attack is not possible in this TCP-AO implementation. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add tcp_hash_fail() ratelimited logs	Dmitry Safonov	4	-12/+61
	Add a helper for logging connection-detailed messages for failed TCP hash verification (both MD5 and AO). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP-AO SNE support	Dmitry Safonov	6	-13/+104
	Add Sequence Number Extension (SNE) for TCP-AO. This is needed to protect long-living TCP-AO connections from replaying attacks after sequence number roll-over, see RFC5925 (6.2). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP-AO segments counters	Dmitry Safonov	9	-15/+77
	Introduce segment counters that are useful for troubleshooting/debugging as well as for writing tests. Now there are global snmp counters as well as per-socket and per-key. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Verify inbound TCP-AO signed segments	Dmitry Safonov	8	-47/+248
	Now there is a common function to verify signature on TCP segments: tcp_inbound_hash(). It has checks for all possible cross-interactions with MD5 signs as well as with unsigned segments. The rules from RFC5925 are: (1) Any TCP segment can have at max only one signature. (2) TCP connections can't switch between using TCP-MD5 and TCP-AO. (3) TCP-AO connections can't stop using AO, as well as unsigned connections can't suddenly start using AO. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Sign SYN-ACK segments with TCP-AO	Dmitry Safonov	7	-16/+111
	Similarly to RST segments, wire SYN-ACKs to TCP-AO. tcp_rsk_used_ao() is handy here to check if the request socket used AO and needs a signature on the outgoing segments. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Wire TCP-AO to request sockets	Dmitry Safonov	12	-51/+506
	Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP-AO sign to twsk	Dmitry Safonov	7	-50/+183
	Add support for sockets in time-wait state. ao_info as well as all keys are inherited on transition to time-wait socket. The lifetime of ao_info is now protected by ref counter, so that tcp_ao_destroy_sock() will destruct it only when the last user is gone. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add AO sign to RST packets	Dmitry Safonov	5	-43/+245
	Wire up sending resets to TCP-AO hashing. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add tcp_parse_auth_options()	Dmitry Safonov	7	-22/+93
	Introduce a helper that: (1) shares the common code with TCP-MD5 header options parsing (2) looks for hash signature only once for both TCP-MD5 and TCP-AO (3) fails with -EEXIST if any TCP sign option is present twice, see RFC5925 (2.2): ">> A single TCP segment MUST NOT have more than one TCP-AO in its options sequence. When multiple TCP-AOs appear, TCP MUST discard the segment." Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Add TCP-AO sign to outgoing packets	Dmitry Safonov	7	-38/+391
	Using precalculated traffic keys, sign TCP segments as prescribed by RFC5925. Per RFC, TCP header options are included in sign calculation: "The TCP header, by default including options, and where the TCP checksum and TCP-AO MAC fields are set to zero, all in network- byte order." (5.1.3) tcp_ao_hash_header() has exclude_options parameter to optionally exclude TCP header from hash calculation, as described in RFC5925 (9.1), this is needed for interaction with middleboxes that may change "some TCP options". This is wired up to AO key flags and setsockopt() later. Similarly to TCP-MD5 hash TCP segment fragments. From this moment a user can start sending TCP-AO signed segments with one of crypto ahash algorithms from supported by Linux kernel. It can have a user-specified MAC length, to either save TCP option header space or provide higher protection using a longer signature. The inbound segments are not yet verified, TCP-AO option is ignored and they are accepted. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27	net/tcp: Calculate TCP-AO traffic keys	Dmitry Safonov	8	-2/+314
	Add traffic key calculation the way it's described in RFC5926. Wire it up to tcp_finish_connect() and cache the new keys straight away on already established TCP connections. Co-developed-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Co-developed-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>