summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-18net: dsa: mv88e6xxx: MST OffloadingTobias Waldekranz2-8/+255
Allocate a SID in the STU for each MSTID in use by a bridge and handle the mapping of MSTIDs to VLANs using the SID field of each VTU entry. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: dsa: mv88e6xxx: Export STU as devlink regionTobias Waldekranz2-0/+95
Export the raw STU data in a devlink region so that it can be inspected from userspace and compared to the current bridge configuration. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: dsa: mv88e6xxx: Disentangle STU from VTUTobias Waldekranz4-135/+264
In early LinkStreet silicon (e.g. 6095/6185), the per-VLAN STP states were kept in the VTU - there was no concept of a SID. Later, the information was split into two tables, where the VTU only tracked memberships and deferred the STP state tracking to the STU via a pointer (SID). This meant that a group of VLANs could share the same STU entry. Most likely, this was done to align with MSTP (802.1Q-2018, Clause 13), which is built on this principle. While the VTU is still 4k lines on most devices, the STU is capped at 64 entries. This means that the current stategy, updating STU info whenever a VTU entry is updated, can not easily support MSTP because: - The maximum number of VIDs would also be capped at 64, as we would have to allocate one SID for every VTU entry - even if many VLANs would effectively share the same MST. - MSTP updates would be unnecessarily slow as you would have to iterate over all VLANs that share the same MST. In order to support MSTP offloading in the future, manage the STU as a separate entity from the VTU. Only add support for newer hardware with separate VTU and STU. VTU-only devices can also be supported, but essentially this requires a software implementation of an STU (fanning out state changed to all VLANs tied to the same MST). Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: dsa: Handle MST state changesTobias Waldekranz4-8/+89
Add the usual trampoline functionality from the generic DSA layer down to the drivers for MST state changes. When a state changes to disabled/blocking/listening, make sure to fast age any dynamic entries in the affected VLANs (those controlled by the MSTI in question). Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: dsa: Pass VLAN MSTI migration notifications to driverTobias Waldekranz4-1/+26
Add the usual trampoline functionality from the generic DSA layer down to the drivers for VLAN MSTI migrations. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: dsa: Validate hardware support for MSTTobias Waldekranz3-0/+30
When joining a bridge where MST is enabled, we validate that the proper offloading support is in place, otherwise we fallback to software bridging. When then mode is changed on a bridge in which we are members, we refuse the change if offloading is not supported. At the moment we only check for configurable learning, but this will be further restricted as we support more MST related switchdev events. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Add helper to query a port's MST stateTobias Waldekranz2-0/+31
This is useful for switchdev drivers who are offloading MST states into hardware. As an example, a driver may wish to flush the FDB for a port when it transitions from forwarding to blocking - which means that the previous state must be discoverable. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Add helper to check if MST is enabledTobias Waldekranz2-0/+15
This is useful for switchdev drivers that might want to refuse to join a bridge where MST is enabled, if the hardware can't support it. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Add helper to map an MSTI to a VID setTobias Waldekranz2-0/+33
br_mst_get_info answers the question: "On this bridge, which VIDs are mapped to the given MSTI?" This is useful in switchdev drivers, which might have to fan-out operations, relating to an MSTI, per VLAN. An example: When a port's MST state changes from forwarding to blocking, a driver may choose to flush the dynamic FDB entries on that port to get faster reconvergence of the network, but this should only be done in the VLANs that are managed by the MSTI in question. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Notify switchdev drivers of MST state changesTobias Waldekranz2-0/+25
Generate a switchdev notification whenever an MST state changes. This notification is keyed by the VLANs MSTI rather than the VID, since multiple VLANs may share the same MST instance. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Notify switchdev drivers of VLAN MSTI migrationsTobias Waldekranz3-0/+66
Whenever a VLAN moves to a new MSTI, send a switchdev notification so that switchdevs can track a bridge's VID to MSTI mappings. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Notify switchdev drivers of MST mode changesTobias Waldekranz2-0/+13
Trigger a switchdev event whenever the bridge's MST mode is enabled/disabled. This allows constituent ports to either perform any required hardware config, or refuse the change if it not supported. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Support setting and reporting MST port statesTobias Waldekranz5-1/+209
Make it possible to change the port state in a given MSTI by extending the bridge port netlink interface (RTM_SETLINK on PF_BRIDGE).The proposed iproute2 interface would be: bridge mst set dev <PORT> msti <MSTI> state <STATE> Current states in all applicable MSTIs can also be dumped via a corresponding RTM_GETLINK. The proposed iproute interface looks like this: $ bridge mst port msti vb1 0 state forwarding 100 state disabled vb2 0 state forwarding 100 state forwarding The preexisting per-VLAN states are still valid in the MST mode (although they are read-only), and can be queried as usual if one is interested in knowing a particular VLAN's state without having to care about the VID to MSTI mapping (in this example VLAN 20 and 30 are bound to MSTI 100): $ bridge -d vlan port vlan-id vb1 10 state forwarding mcast_router 1 20 state disabled mcast_router 1 30 state disabled mcast_router 1 40 state forwarding mcast_router 1 vb2 10 state forwarding mcast_router 1 20 state forwarding mcast_router 1 30 state forwarding mcast_router 1 40 state forwarding mcast_router 1 Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Allow changing a VLAN's MSTITobias Waldekranz4-0/+59
Allow a VLAN to move out of the CST (MSTI 0), to an independent tree. The user manages the VID to MSTI mappings via a global VLAN setting. The proposed iproute2 interface would be: bridge vlan global set dev br0 vid <VID> msti <MSTI> Changing the state in non-zero MSTIs is still not supported, but will be addressed in upcoming changes. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: bridge: mst: Multiple Spanning Tree (MST) modeTobias Waldekranz9-4/+176
Allow the user to switch from the current per-VLAN STP mode to an MST mode. Up to this point, per-VLAN STP states where always isolated from each other. This is in contrast to the MSTP standard (802.1Q-2018, Clause 13.5), where VLANs are grouped into MST instances (MSTIs), and the state is managed on a per-MSTI level, rather that at the per-VLAN level. Perhaps due to the prevalence of the standard, many switching ASICs are built after the same model. Therefore, add a corresponding MST mode to the bridge, which we can later add offloading support for in a straight-forward way. For now, all VLANs are fixed to MSTI 0, also called the Common Spanning Tree (CST). That is, all VLANs will follow the port-global state. Upcoming changes will make this actually useful by allowing VLANs to be mapped to arbitrary MSTIs and allow individual MSTI states to be changed. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18r8169: improve driver unload and system shutdown behavior on DASH-enabled ↵Heiner Kallweit1-4/+16
systems There's a number of systems supporting DASH remote management. Driver unload and system shutdown can result in the PHY suspending, thus making DASH unusable. Improve this by handling DASH being enabled very similar to WoL being enabled. Tested-by: Yanko Kaneti <yaneti@declera.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/1de3b176-c09c-1654-6f00-9785f7a4f954@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18Merge branch '100GbE' of ↵Jakub Kicinski5-4/+36
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2022-03-16 This series contains updates to gtp and ice driver. Wojciech fixes smatch reported inconsistent indenting for gtp and ice. Yang Yingliang fixes a couple of return value checks for GNSS to IS_PTR instead of null. Jacob adds support for trace events on tx timestamps. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: add trace events for tx timestamps ice: fix return value check in ice_gnss.c ice: Fix inconsistent indenting in ice_switch gtp: Fix inconsistent indenting ==================== Link: https://lore.kernel.org/r/20220316204024.3201500-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18ethernet: sun: Fix spelling mistake "mis-matched" -> "mismatched"Colin Ian King1-1/+1
There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220316234620.55885-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net: ethernet: ti: Fix spelling mistake and clean up messageColin Ian King1-1/+1
There is a spelling mistake in a dev_err message and the MAX_SKB_FRAGS value does not need to be printed between parentheses. Fix this. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220316233455.54541-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18vlan: use correct format charactersBill Wendling1-1/+1
When compiling with -Wformat, clang emits the following warning: net/8021q/vlanproc.c:284:22: warning: format specifies type 'unsigned short' but the argument has type 'int' [-Wformat] mp->priority, ((mp->vlan_qos >> 13) & 0x7)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Bill Wendling <morbo@google.com> Link: https://lore.kernel.org/r/20220316213125.2353370-1-morbo@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18net/fsl: xgmac_mdio: use correct format charactersBill Wendling1-1/+1
When compiling with -Wformat, clang emits the following warning: drivers/net/ethernet/freescale/xgmac_mdio.c:243:22: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat] phy_id, dev_addr, regnum); ^~~~~~ ./include/linux/dev_printk.h:163:47: note: expanded from macro 'dev_dbg' dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \ ~~~ ^~~~~~~~~~~ ./include/linux/dev_printk.h:129:34: note: expanded from macro 'dev_printk' _dev_printk(level, dev, fmt, ##__VA_ARGS__); \ ~~~ ^~~~~~~~~~~ The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Bill Wendling <morbo@google.com> Link: https://lore.kernel.org/r/20220316213114.2352352-1-morbo@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18bnx2x: use correct format charactersBill Wendling1-2/+2
When compiling with -Wformat, clang emits the following warnings: drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:6181:40: warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] ret = scnprintf(str, *len, "%hx.%hx", num >> 16, num); ~~~ ^~~~~~~~~ %x drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:6181:51: warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] ret = scnprintf(str, *len, "%hx.%hx", num >> 16, num); ~~~ ^~~ %x drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:6196:47: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] ret = scnprintf(str, *len, "%hhx.%hhx.%hhx", num >> 16, num >> 8, num); ~~~~ ^~~~~~~~~ %x drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:6196:58: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] ret = scnprintf(str, *len, "%hhx.%hhx.%hhx", num >> 16, num >> 8, num); ~~~~ ^~~~~~~~ %x drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:6196:68: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] ret = scnprintf(str, *len, "%hhx.%hhx.%hhx", num >> 16, num >> 8, num); ~~~~ ^~~ %x The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Bill Wendling <morbo@google.com> Link: https://lore.kernel.org/r/20220316213104.2351651-1-morbo@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18enetc: use correct format charactersBill Wendling1-1/+1
When compiling with -Wformat, clang emits the following warning: drivers/net/ethernet/freescale/enetc/enetc_mdio.c:151:22: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat] phy_id, dev_addr, regnum); ^~~~~~ ./include/linux/dev_printk.h:163:47: note: expanded from macro 'dev_dbg' dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \ ~~~ ^~~~~~~~~~~ ./include/linux/dev_printk.h:129:34: note: expanded from macro 'dev_printk' _dev_printk(level, dev, fmt, ##__VA_ARGS__); \ ~~~ ^~~~~~~~~~~ The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Bill Wendling <morbo@google.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20220316213109.2352015-1-morbo@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski80-286/+619
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17Merge tag 'net-5.17-final' of ↵Linus Torvalds33-133/+154
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, ipsec, and wireless. A few last minute revert / disable and fix patches came down from our sub-trees. We're not waiting for any fixes at this point. Current release - regressions: - Revert "netfilter: nat: force port remap to prevent shadowing well-known ports", restore working conntrack on asymmetric paths - Revert "ath10k: drop beacon and probe response which leak from other channel", restore working AP and mesh mode on QCA9984 - eth: intel: fix hang during reboot/shutdown Current release - new code bugs: - netfilter: nf_tables: disable register tracking, it needs more work to cover all corner cases Previous releases - regressions: - ipv6: fix skb_over_panic in __ip6_append_data when (admin-only) extension headers get specified - esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return value more selectively - bnx2x: fix driver load failure when FW not present in initrd Previous releases - always broken: - vsock: stop destroying unrelated sockets in nested virtualization - packet: fix slab-out-of-bounds access in packet_recvmsg() Misc: - add Paolo Abeni to networking maintainers!" * tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (26 commits) iavf: Fix hang during reboot/shutdown net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload net: bcmgenet: skip invalid partial checksums bnx2x: fix built-in kernel driver load failure net: phy: mscc: Add MODULE_FIRMWARE macros net: dsa: Add missing of_node_put() in dsa_port_parse_of net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit() Revert "ath10k: drop beacon and probe response which leak from other channel" hv_netvsc: Add check for kvmalloc_array iavf: Fix double free in iavf_reset_task ice: destroy flow director filter mutex after releasing VSIs ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats() Add Paolo Abeni to networking maintainers atm: eni: Add check for dma_map_single net/packet: fix slab-out-of-bounds access in packet_recvmsg() net: mdio: mscc-miim: fix duplicate debugfs entry net: phy: marvell: Fix invalid comparison in the resume and suspend functions esp6: fix check on ipv6_skip_exthdr's return value net: dsa: microchip: add spi_device_id tables netfilter: nf_tables: disable register tracking ...
2022-03-17Merge tag 'acpi-5.17-rc9' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Revert recent commit that caused multiple systems to misbehave due to firmware issues" * tag 'acpi-5.17-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid"
2022-03-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-16/+15
Merge misc fixes from Andrew Morton: "Four patches. Subsystems affected by this patch series: mm/swap, kconfig, ocfs2, and selftests" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: selftests: vm: fix clang build error multiple output files ocfs2: fix crash when initialize filecheck kobj fails configs/debug: restore DEBUG_INFO=y for overriding mm: swap: get rid of livelock in swapin readahead
2022-03-17selftests: vm: fix clang build error multiple output filesYosry Ahmed1-4/+2
When building the vm selftests using clang, some errors are seen due to having headers in the compilation command: clang -Wall -I ../../../../usr/include -no-pie gup_test.c ../../../../mm/gup_test.h -lrt -lpthread -o .../tools/testing/selftests/vm/gup_test clang: error: cannot specify -o when generating multiple output files make[1]: *** [../lib.mk:146: .../tools/testing/selftests/vm/gup_test] Error 1 Rework to add the header files to LOCAL_HDRS before including ../lib.mk, since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in file lib.mk. Link: https://lkml.kernel.org/r/20220304000645.1888133-1-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17ocfs2: fix crash when initialize filecheck kobj failsJoseph Qi1-11/+11
Once s_root is set, genric_shutdown_super() will be called if fill_super() fails. That means, we will call ocfs2_dismount_volume() twice in such case, which can lead to kernel crash. Fix this issue by initializing filecheck kobj before setting s_root. Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com Fixes: 5f483c4abb50 ("ocfs2: add kobject for online file check") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17configs/debug: restore DEBUG_INFO=y for overridingQian Cai1-0/+1
Previously, I failed to realize that Kees' patch [1] has not been merged into the mainline yet, and dropped DEBUG_INFO=y too eagerly from the mainline. As the results, "make debug.config" won't be able to flip DEBUG_INFO=n from the existing .config. This should close the gaps of a few weeks before Kees' patch is there, and work regardless of their merging status anyway. Link: https://lore.kernel.org/all/20220125075126.891825-1-keescook@chromium.org/ [1] Link: https://lkml.kernel.org/r/20220308153524.8618-1-quic_qiancai@quicinc.com Signed-off-by: Qian Cai <quic_qiancai@quicinc.com> Reported-by: Daniel Thompson <daniel.thompson@linaro.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17mm: swap: get rid of livelock in swapin readaheadGuo Ziliang1-1/+1
In our testing, a livelock task was found. Through sysrq printing, same stack was found every time, as follows: __swap_duplicate+0x58/0x1a0 swapcache_prepare+0x24/0x30 __read_swap_cache_async+0xac/0x220 read_swap_cache_async+0x58/0xa0 swapin_readahead+0x24c/0x628 do_swap_page+0x374/0x8a0 __handle_mm_fault+0x598/0xd60 handle_mm_fault+0x114/0x200 do_page_fault+0x148/0x4d0 do_translation_fault+0xb0/0xd4 do_mem_abort+0x50/0xb0 The reason for the livelock is that swapcache_prepare() always returns EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it cannot jump out of the loop. We suspect that the task that clears the SWAP_HAS_CACHE flag never gets a chance to run. We try to lower the priority of the task stuck in a livelock so that the task that clears the SWAP_HAS_CACHE flag will run. The results show that the system returns to normal after the priority is lowered. In our testing, multiple real-time tasks are bound to the same core, and the task in the livelock is the highest priority task of the core, so the livelocked task cannot be preempted. Although cond_resched() is used by __read_swap_cache_async, it is an empty function in the preemptive system and cannot achieve the purpose of releasing the CPU. A high-priority task cannot release the CPU unless preempted by a higher-priority task. But when this task is already the highest priority task on this core, other tasks will not be able to be scheduled. So we think we should replace cond_resched() with schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will call set_current_state first to set the task state, so the task will be removed from the running queue, so as to achieve the purpose of giving up the CPU and prevent it from running in kernel mode for too long. (akpm: ugly hack becomes uglier. But it fixes the issue in a backportable-to-stable fashion while we hopefully work on something better) Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn> Reported-by: Zeal Robot <zealci@zte.com.cn> Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Yang Yang <yang.yang29@zte.com.cn> Acked-by: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roger Quadros <rogerq@kernel.org> Cc: Ziliang Guo <guo.ziliang@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17iavf: Fix hang during reboot/shutdownIvan Vecera1-0/+7
Recent commit 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") adds a wait-loop at the beginning of iavf_remove() to ensure that port initialization is finished prior unregistering net device. This causes a regression in reboot/shutdown scenario because in this case callback iavf_shutdown() is called and this callback detaches the device, makes it down if it is running and sets its state to __IAVF_REMOVE. Later shutdown callback of associated PF driver (e.g. ice_shutdown) is called. That callback calls among other things sriov_disable() that calls indirectly iavf_remove() (see stack trace below). As the adapter state is already __IAVF_REMOVE then the mentioned loop is end-less and shutdown process hangs. The patch fixes this by checking adapter's state at the beginning of iavf_remove() and skips the rest of the function if the adapter is already in remove state (shutdown is in progress). Reproducer: 1. Create VF on PF driven by ice or i40e driver 2. Ensure that the VF is bound to iavf driver 3. Reboot [52625.981294] sysrq: SysRq : Show Blocked State [52625.988377] task:reboot state:D stack: 0 pid:17359 ppid: 1 f2 [52625.996732] Call Trace: [52625.999187] __schedule+0x2d1/0x830 [52626.007400] schedule+0x35/0xa0 [52626.010545] schedule_hrtimeout_range_clock+0x83/0x100 [52626.020046] usleep_range+0x5b/0x80 [52626.023540] iavf_remove+0x63/0x5b0 [iavf] [52626.027645] pci_device_remove+0x3b/0xc0 [52626.031572] device_release_driver_internal+0x103/0x1f0 [52626.036805] pci_stop_bus_device+0x72/0xa0 [52626.040904] pci_stop_and_remove_bus_device+0xe/0x20 [52626.045870] pci_iov_remove_virtfn+0xba/0x120 [52626.050232] sriov_disable+0x2f/0xe0 [52626.053813] ice_free_vfs+0x7c/0x340 [ice] [52626.057946] ice_remove+0x220/0x240 [ice] [52626.061967] ice_shutdown+0x16/0x50 [ice] [52626.065987] pci_device_shutdown+0x34/0x60 [52626.070086] device_shutdown+0x165/0x1c5 [52626.074011] kernel_restart+0xe/0x30 [52626.077593] __do_sys_reboot+0x1d2/0x210 [52626.093815] do_syscall_64+0x5b/0x1a0 [52626.097483] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower ↵Vladimir Oltean1-1/+15
offload ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since the blamed commit, through a chain index whose number encodes a specific PAG (Policy Action Group) and lookup number. The chain number is translated through ocelot_chain_to_pag() into a PAG, and through ocelot_chain_to_lookup() into a lookup number. The problem with the blamed commit is that the above 2 functions don't have special treatment for chain 0. So ocelot_chain_to_pag(0) returns filter->pag = 224, which is in fact -32, but the "pag" field is an u8. So we end up programming the hardware with VCAP IS2 entries having a PAG of 224. But the way in which the PAG works is that it defines a subset of VCAP IS2 filters which should match on a packet. The default PAG is 0, and previous VCAP IS1 rules (which we offload using 'goto') can modify it. So basically, we are installing filters with a PAG on which no packet will ever match. This is the hardware equivalent of adding filters to a chain which has no 'goto' to it. Restore the previous functionality by making ACL filters offloaded to chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly correct, but the choice of lookup number isn't "as before" (which was to leave the lookup a "don't care"). However, lookup 0 should be fine, since even though there are ACL actions (policers) which have a requirement to be used in a specific lookup, that lookup is 0. Fixes: 226e9cd82a96 ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net: bcmgenet: skip invalid partial checksumsDoug Berger1-2/+4
The RXCHK block will return a partial checksum of 0 if it encounters a problem while receiving a packet. Since a 1's complement sum can only produce this result if no bits are set in the received data stream it is fair to treat it as an invalid partial checksum and not pass it up the stack. Fixes: 810155397890 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17bnx2x: fix built-in kernel driver load failureManish Chopra3-26/+19
Commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added request_firmware() logic in probe() which caused load failure when firmware file is not present in initrd (below), as access to firmware file is not feasible during probe. Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2 Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2 This patch fixes this issue by - 1. Removing request_firmware() logic from the probe() such that .ndo_open() handle it as it used to handle it earlier 2. Given request_firmware() is removed from probe(), so driver has to relax FW version comparisons a bit against the already loaded FW version (by some other PFs of same adapter) to allow different compatible/close enough FWs with which multiple PFs may run with (in different environments), as the given PF who is in probe flow has no idea now with which firmware file version it is going to initialize the device in ndo_open() Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/ Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net: phy: mscc: Add MODULE_FIRMWARE macrosJuerg Haefliger1-0/+3
The driver requires firmware so define MODULE_FIRMWARE so that modinfo provides the details. Fixes: fa164e40c53b ("net: phy: mscc: split the driver into separate files") Signed-off-by: Juerg Haefliger <juergh@canonical.com> Link: https://lore.kernel.org/r/20220316151835.88765-1-juergh@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17selftests: net: fix array_size.cocci warningGuo Zhengkui3-5/+10
Fix array_size.cocci warning in tools/testing/selftests/net. Use `ARRAY_SIZE(arr)` instead of forms like `sizeof(arr)/sizeof(arr[0])`. It has been tested with gcc (Debian 8.3.0-6) 8.3.0. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Link: https://lore.kernel.org/r/20220316092858.9398-1-guozhengkui@vivo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17net: stmmac: clean up impossible conditionDan Carpenter1-4/+1
This code works but it has a static checker warning: drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1687 init_dma_rx_desc_rings() warn: always true condition '(queue >= 0) => (0-u32max >= 0)' Obviously, it makes no sense to check if an unsigned int is >= 0. What prevents this code from being a forever loop is that later there is a separate check for if (queue == 0). The "queue" variable is less than MTL_MAX_RX_QUEUES (8) so it can easily fit in an int type. Any larger value for "queue" would lead to an array overflow when we assign "rx_q = &priv->rx_queue[queue]". Fixes: de0b90e52a11 ("net: stmmac: rearrange RX and TX desc init into per-queue basis") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220316083744.GB30941@kili Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17net: dsa: Add missing of_node_put() in dsa_port_parse_ofMiaoqian Lin1-0/+1
The device_node pointer is returned by of_parse_phandle() with refcount incremented. We should use of_node_put() on it when done. Fixes: 6d4e5c570c2d ("net: dsa: get port type at parse time") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220316082602.10785-1-linmq006@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17net: geneve: support IPv4/IPv6 as inner protocolEyal Birger2-19/+64
This patch adds support for encapsulating IPv4/IPv6 within GENEVE. In order to use this, a new IFLA_GENEVE_INNER_PROTO_INHERIT flag needs to be provided at device creation. This property cannot be changed for the time being. In case IP traffic is received on a non-tun device the drop count is increased. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20220316061557.431872-1-eyal.birger@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17Merge branch 'net-mvneta-armada-98dx2530-soc'Paolo Abeni2-0/+13
Chris Packham says: ==================== net: mvneta: Armada 98DX2530 SoC This is split off from [1] to let it go in via net-next rather than waiting for the rest of the series to land. [1] - https://lore.kernel.org/lkml/20220314213143.2404162-1-chris.packham@alliedtelesis.co.nz/ ==================== Link: https://lore.kernel.org/r/20220315215207.2746793-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17net: mvneta: Add support for 98DX2530 Ethernet portChris Packham1-0/+12
The 98DX2530 SoC is similar to the Armada 3700 except it needs a different MBUS window configuration. Add a new compatible string to identify this device and the required MBUS window configuration. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17dt-bindings: net: mvneta: Add marvell,armada-ac5-netaChris Packham1-0/+1
The out of band port on the 98DX2530 SoC is similar to the armada-3700 except it requires a slightly different MBUS window configuration. Add a new compatible string so this difference can be accounted for. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17ptp: ocp: Fix PTP_PF_* verification requestsJonathan Lemon1-8/+20
Update and check functionality for pin configuration requests: PTP_PF_NONE: requests "IN: None", disabling the pin. # testptp -d /dev/ptp3 -L3,0 -i1 set pin function okay # cat sma4 IN: None PTP_PF_EXTTS: should configure external timestamps, but since the timecard can steer inputs to multiple inputs as well as timestamps, allow the request, but don't change configurations. # testptp -d /dev/ptp3 -L3,1 -i1 set pin function okay (no functional or configuration change here yet) PTP_PF_PEROUT: Channel 0 is the PHC, at 1PPS. Channels 1-4 are the programmable frequency generators. # fails because period is not 1PPS. # testptp -d /dev/ptp3 -L3,2 -i0 -p 500000000 PTP_PEROUT_REQUEST: Invalid argument # testptp -d /dev/ptp3 -L3,2 -i0 -p 1000000000 periodic output request okay # cat sma4 OUT: PHC # testptp -d /dev/ptp3 -L3,2 -i1 -p 500000000 -w 200000000 periodic output request okay # cat sma4 OUT: GEN1 # cat gen1/signal 500000000 40 0 1 2022-03-10T23:55:26 TAI # cat gen1/running 1 # testptp -d /dev/ptp3 -L3,2 -i1 -p 0 periodic output request okay # cat gen1/running 0 Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/r/20220315194626.1895-1-jonathan.lemon@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17Merge branch 'flow_offload-add-tc-vlan-push_eth-and-pop_eth-actions'Jakub Kicinski14-70/+82
Roi Dayan says: ==================== flow_offload: add tc vlan push_eth and pop_eth actions Offloading vlan push_eth and pop_eth actions is needed in order to correctly offload MPLSoUDP encap and decap flows, this series extends the flow offload API to support these actions and updates mlx5 to parse them. ==================== Link: https://lore.kernel.org/r/20220315110211.1581468-1-roid@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net/mlx5e: MPLSoUDP encap, support action vlan pop_eth explicitlyMaor Dickman4-0/+10
Currently the MPLSoUDP encap offload does the L2 pop implicitly while adding such action explicitly (vlan eth_push) will cause the rule to not be offloaded. Solve it by adding offload support for vlan eth_push in case of MPLSoUDP decap case. Flow example: filter root protocol ip pref 1 flower chain 0 filter root protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 2.2.2.22 src_ip 2.2.2.21 in_hw in_hw_count 1 action order 1: vlan pop_eth pipe index 1 ref 1 bind 1 used_hw_stats delayed action order 2: mpls push protocol mpls_uc label 555 tc 3 ttl 255 pipe index 1 ref 1 bind 1 used_hw_stats delayed action order 3: tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.22 dst_port 6635 csum tos 0x4 ttl 6 pipe index 1 ref 1 bind 1 used_hw_stats delayed action order 4: mirred (Egress Redirect to device bareudp0) stolen index 1 ref 1 bind 1 used_hw_stats delayed Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net/mlx5e: MPLSoUDP decap, use vlan push_eth instead of peditMaor Dickman11-70/+43
Currently action pedit of source and destination MACs is used to fill the MACs in L2 push step in MPLSoUDP decap offload, this isn't aligned to tc SW which use vlan eth_push action to do this. To fix that, offload support for vlan veth_push action is added together with mpls pop action, and deprecate the use of pedit of MACs. Flow example: filter protocol mpls_uc pref 1 flower chain 0 filter protocol mpls_uc pref 1 flower chain 0 handle 0x1 eth_type 8847 mpls_label 555 enc_dst_port 6635 in_hw in_hw_count 1 action order 1: tunnel_key unset pipe index 2 ref 1 bind 1 used_hw_stats delayed action order 2: mpls pop protocol ip pipe index 2 ref 1 bind 1 used_hw_stats delayed action order 3: vlan push_eth dst_mac de:a2:ec:d6:69:c8 src_mac de:a2:ec:d6:69:c8 pipe index 2 ref 1 bind 1 used_hw_stats delayed action order 4: mirred (Egress Redirect to device enp8s0f0_0) stolen index 2 ref 1 bind 1 used_hw_stats delayed Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net/sched: add vlan push_eth and pop_eth action to the hardware IRMaor Dickman3-0/+29
Add vlan push_eth and pop_eth action to the hardware intermediate representation model which would subsequently allow it to be used by drivers for offload. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()Nicolas Dichtel1-0/+1
This kind of interface doesn't have a mac header. This patch fixes bpf_redirect() to a PIM interface. Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20220315092008.31423-1-nicolas.dichtel@6wind.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net: dsa: Never offload FDB entries on standalone portsTobias Waldekranz1-0/+3
If a port joins a bridge that it can't offload, it will fallback to standalone mode and software bridging. In this case, we never want to offload any FDB entries to hardware either. Previously, for host addresses, we would eventually end up in dsa_port_bridge_host_fdb_add, which would unconditionally dereference dp->bridge and cause a segfault. Fixes: c26933639b54 ("net: dsa: request drivers to perform FDB isolation") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220315233033.1468071-1-tobias@waldekranz.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>