summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-04-23net: dsa: update TX path comments to not mention skb_mac_header()Vladimir Oltean2-3/+3
Once commit 6d1ccff62780 ("net: reset mac header in dev_start_xmit()") will be reverted, it will no longer be true that skb->data points at skb_mac_header(skb) - since the skb->mac_header will not be set - so stop saying that, and just say that it points to the MAC header. I've reviewed vlan_insert_tag() and it does not *actually* depend on skb_mac_header(), so reword that to avoid the confusion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: dsa: tag_sja1105: replace skb_mac_header() with vlan_eth_hdr()Vladimir Oltean1-1/+1
This is a cosmetic patch which consolidates the code to use the helper function offered by if_vlan.h. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: dsa: tag_sja1105: don't rely on skb_mac_header() in TX pathsVladimir Oltean1-1/+1
skb_mac_header() will no longer be available in the TX path when reverting commit 6d1ccff62780 ("net: reset mac header in dev_start_xmit()"). As preparation for that, let's use skb_vlan_eth_hdr() to get to the VLAN header instead, which assumes it's located at skb->data (assumption which holds true here). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: dsa: tag_ksz: do not rely on skb_mac_header() in TX pathsVladimir Oltean1-9/+9
skb_mac_header() will no longer be available in the TX path when reverting commit 6d1ccff62780 ("net: reset mac header in dev_start_xmit()"). As preparation for that, let's use skb_eth_hdr() to get to the Ethernet header's MAC DA instead, helper which assumes this header is located at skb->data (assumption which holds true here). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: dsa: tag_ocelot: do not rely on skb_mac_header() for VLAN xmitVladimir Oltean1-1/+1
skb_mac_header() will no longer be available in the TX path when reverting commit 6d1ccff62780 ("net: reset mac header in dev_start_xmit()"). As preparation for that, let's use skb_vlan_eth_hdr() to get to the VLAN header instead, which assumes it's located at skb->data (assumption which holds true here). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: dpaa: avoid one skb_reset_mac_header() in dpaa_enable_tx_csum()Vladimir Oltean1-7/+2
It appears that dpaa_enable_tx_csum() only calls skb_reset_mac_header() to get to the VLAN header using skb_mac_header(). We can use skb_vlan_eth_hdr() to get to the VLAN header based on skb->data directly. This avoids spending a few cycles to set skb->mac_header. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: vlan: introduce skb_vlan_eth_hdr()Vladimir Oltean12-20/+24
Similar to skb_eth_hdr() introduced in commit 96cc4b69581d ("macvlan: do not assume mac_header is set in macvlan_broadcast()"), let's introduce a skb_vlan_eth_hdr() helper which can be used in TX-only code paths to get to the VLAN header based on skb->data rather than based on the skb_mac_header(skb). We also consolidate the drivers that dereference skb->data to go through this helper. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: vlan: don't adjust MAC header in __vlan_insert_inner_tag() unless setVladimir Oltean1-1/+2
This is a preparatory change for the deletion of skb_reset_mac_header(skb) from __dev_queue_xmit(). After that deletion, skb_mac_header(skb) will no longer be set in TX paths, from which __vlan_insert_inner_tag() can still be called (perhaps indirectly). If we don't make this change, then an unset MAC header (equal to ~0U) will become set after the adjustment with VLAN_HLEN. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23drivers/net/phy: add driver for Microchip LAN867x 10BASE-T1S PHYRamón Nordin Rodriguez3-0/+144
This patch adds support for the Microchip LAN867x 10BASE-T1S family (LAN8670/1/2). The driver supports P2MP with PLCA. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23rxrpc: Replace fake flex-array with flexible-array memberGustavo A. R. Silva1-1/+1
Zero-length arrays as fake flexible arrays are deprecated and we are moving towards adopting C99 flexible-array members instead. Transform zero-length array into flexible-array member in struct rxrpc_ackpacket. Address the following warnings found with GCC-13 and -fstrict-flex-arrays=3 enabled: net/rxrpc/call_event.c:149:38: warning: array subscript i is outside array bounds of ‘uint8_t[0]’ {aka ‘unsigned char[]’} [-Warray-bounds=] This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [1]. Link: https://github.com/KSPP/linux/issues/21 Link: https://github.com/KSPP/linux/issues/263 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/ZAZT11n4q5bBttW0@work/ Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23Merge branch 'napi_threaded_poll-enhancements'David S. Miller3-26/+42
Eric Dumazet says: ==================== net: give napi_threaded_poll() some love There is interest to revert commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job") and use instead the napi_threaded_poll() mode. https://lore.kernel.org/netdev/140f61e2e1fcb8cf53619709046e312e343b53ca.camel@redhat.com/T/#m8a8f5b09844adba157ad0d22fc1233d97013de50 Before doing so, make sure napi_threaded_poll() benefits from recent core stack improvements, to further reduce softirq triggers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: optimize napi_threaded_poll() vs RPS/RFSEric Dumazet2-2/+13
We use napi_threaded_poll() in order to reduce our softirq dependency. We can add a followup of 821eba962d95 ("net: optimize napi_schedule_rps()") to further remove the need of firing NET_RX_SOFTIRQ whenever RPS/RFS are used. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: make napi_threaded_poll() aware of sd->defer_listEric Dumazet1-0/+3
If we call skb_defer_free_flush() from napi_threaded_poll(), we can avoid to raise IPI from skb_attempt_defer_free() when the list becomes too big. This allows napi_threaded_poll() to rely less on softirqs, and lowers latency caused by a too big list. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: move skb_defer_free_flush() upEric Dumazet1-21/+21
We plan using skb_defer_free_flush() from napi_threaded_poll() in the following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: do not provide hard irq safety for sd->defer_lockEric Dumazet2-5/+4
kfree_skb() can be called from hard irq handlers, but skb_attempt_defer_free() is meant to be used from process or BH contexts, and skb_defer_free_flush() is meant to be called from BH contexts. Not having to mask hard irq can save some cycles. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: add debugging checks in skb_attempt_defer_free()Eric Dumazet1-0/+3
Make sure skbs that are stored in softnet_data.defer_list do not have a dst attached. Also make sure the the skb was orphaned. Link: https://lore.kernel.org/netdev/CANn89iJuEVe72bPmEftyEJHLzzN=QNR2yueFjTxYXCEpS5S8HQ@mail.gmail.com/T/ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23Merge branch '100GbE' of ↵David S. Miller5-53/+36
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== This series lowers the CPU usage of the ice driver when using its provided /dev/gnss*. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23net: mtk_eth_soc: mediatek: fix ppe flow accounting for v1 hardwareFelix Fietkau2-3/+10
Older chips (like MT7622) use a different bit in ib2 to enable hardware counter support. Add macros for both and select the appropriate bit. Fixes: 3fbe4d8c0e53 ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting") Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-22Merge tag 'mlx5-updates-2023-04-20' of ↵Jakub Kicinski20-273/+297
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-04-20 1) Dragos Improves RX page pool, and provides some fixes to his previous series: 1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case 1.2) Hook NAPIs to page pools to gain more performance. 2) From Roi, Some cleanups to TC and eswitch modules. 3) Maher migrates vnic diagnostic counters reporting from debugfs to a dedicated devlink health reporter Maher Says: =========== net/mlx5: Expose vnic diagnostic counters using devlink Currently, vnic diagnostic counters are exposed through the following debugfs: $ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/ cq_overrun quota_exceeded_command total_q_under_processor_handle invalid_command send_queue_priority_update_flow nic_receive_steering_discard The current design does not allow the hypervisor to view the diagnostic counters of its VFs, in case the VFs get bound to a VM. In other words, the counters are not exposed for representor interfaces. Furthermore, the debugfs design is inconvenient future-wise, in case more counters need to be reported by the driver in the future. As these counters pertain to vNIC health, it is more appropriate to utilize the devlink health reporter to expose them. Thus, this patchest includes the following changes: * Drop the current vnic diagnostic counters debugfs interface. * Add a vnic devlink health reporter for PFs/VFs core devices, which when diagnosed will dump vnic diagnostic counter values that are queried from FW. * Add a vnic devlink health reporter for the representor interface, which serves the same purpose listed in the previous point, in addition to allowing the hypervisor to view its VFs diagnostic counters, even when the VFs are bounded to external VMs. Example of devlink health reporter usage is: $devlink health diagnose pci/0000:08:00.0 reporter vnic vNIC env counters: total_error_queues: 0 send_queue_priority_update_flow: 0 comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0 invalid_command: 0 quota_exceeded_command: 0 nic_receive_steering_discard: 0 =========== 4) SW steering fixes and improvements Yevgeny Kliteynik Says: ======================= These short patch series are just small fixes / improvements for SW steering: - Patch 1: Fix dumping of legacy modify_hdr in debug dump to align to what is expected by parser - Patch 2: Have separate threshold for ICM sync per ICM type - Patch 3: Add more info to the steering debug dump - Linux version and device name - Patch 4: Keep track of number of buddies that are currently in use per domain per buddy type ======================= * tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Update op_mode to op_mod for port selection net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set() net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc() net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn() net/mlx5e: RX, Hook NAPIs to page pools net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ net/mlx5e: Add vnic devlink health reporter to representors net/mlx5: Add vnic devlink health reporter to PFs/VFs Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports" Revert "net/mlx5: Expose steering dropped packets counter" net/mlx5: DR, Add memory statistics for domain object net/mlx5: DR, Add more info in domain dbg dump net/mlx5: DR, Calculate sync threshold of each pool according to its type net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump ==================== Link: https://lore.kernel.org/r/20230421013850.349646-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22Merge tag 'for-netdev' of ↵Jakub Kicinski116-8896/+13397
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-04-21 We've added 71 non-merge commits during the last 8 day(s) which contain a total of 116 files changed, 13397 insertions(+), 8896 deletions(-). The main changes are: 1) Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward, from Florian Westphal. 2) Fix race between btf_put and btf_idr walk which caused a deadlock, from Alexei Starovoitov. 3) Second big batch to migrate test_verifier unit tests into test_progs for ease of readability and debugging, from Eduard Zingerman. 4) Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both the BPF list and rbtree, from Dave Marchevsky. 5) Migrate bpf_for(), bpf_for_each() and bpf_repeat() macros from BPF selftests into libbpf-provided bpf_helpers.h header and improve kfunc handling, from Andrii Nakryiko. 6) Support 64-bit pointers to kfuncs needed for archs like s390x, from Ilya Leoshkevich. 7) Support BPF progs under getsockopt with a NULL optval, from Stanislav Fomichev. 8) Improve verifier u32 scalar equality checking in order to enable LLVM transformations which earlier had to be disabled specifically for BPF backend, from Yonghong Song. 9) Extend bpftool's struct_ops object loading to support links, from Kui-Feng Lee. 10) Add xsk selftest follow-up fixes for hugepage allocated umem, from Magnus Karlsson. 11) Support BPF redirects from tc BPF to ifb devices, from Daniel Borkmann. 12) Add BPF support for integer type when accessing variable length arrays, from Feng Zhou. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (71 commits) selftests/bpf: verifier/value_ptr_arith converted to inline assembly selftests/bpf: verifier/value_illegal_alu converted to inline assembly selftests/bpf: verifier/unpriv converted to inline assembly selftests/bpf: verifier/subreg converted to inline assembly selftests/bpf: verifier/spin_lock converted to inline assembly selftests/bpf: verifier/sock converted to inline assembly selftests/bpf: verifier/search_pruning converted to inline assembly selftests/bpf: verifier/runtime_jit converted to inline assembly selftests/bpf: verifier/regalloc converted to inline assembly selftests/bpf: verifier/ref_tracking converted to inline assembly selftests/bpf: verifier/map_ptr_mixing converted to inline assembly selftests/bpf: verifier/map_in_map converted to inline assembly selftests/bpf: verifier/lwt converted to inline assembly selftests/bpf: verifier/loops1 converted to inline assembly selftests/bpf: verifier/jeq_infer_not_null converted to inline assembly selftests/bpf: verifier/direct_packet_access converted to inline assembly selftests/bpf: verifier/d_path converted to inline assembly selftests/bpf: verifier/ctx converted to inline assembly selftests/bpf: verifier/btf_ctx_access converted to inline assembly selftests/bpf: verifier/bpf_get_stack converted to inline assembly ... ==================== Link: https://lore.kernel.org/r/20230421211035.9111-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22net: dst: fix missing initialization of rt_uncachedMaxime Bizon5-11/+3
xfrm_alloc_dst() followed by xfrm4_dst_destroy(), without a xfrm4_fill_dst() call in between, causes the following BUG: BUG: spinlock bad magic on CPU#0, fbxhostapd/732 lock: 0x890b7668, .magic: 890b7668, .owner: <none>/-1, .owner_cpu: 0 CPU: 0 PID: 732 Comm: fbxhostapd Not tainted 6.3.0-rc6-next-20230414-00613-ge8de66369925-dirty #9 Hardware name: Marvell Kirkwood (Flattened Device Tree) unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x28/0x30 dump_stack_lvl from do_raw_spin_lock+0x20/0x80 do_raw_spin_lock from rt_del_uncached_list+0x30/0x64 rt_del_uncached_list from xfrm4_dst_destroy+0x3c/0xbc xfrm4_dst_destroy from dst_destroy+0x5c/0xb0 dst_destroy from rcu_process_callbacks+0xc4/0xec rcu_process_callbacks from __do_softirq+0xb4/0x22c __do_softirq from call_with_stack+0x1c/0x24 call_with_stack from do_softirq+0x60/0x6c do_softirq from __local_bh_enable_ip+0xa0/0xcc Patch "net: dst: Prevent false sharing vs. dst_entry:: __refcnt" moved rt_uncached and rt_uncached_list fields from rtable struct to dst struct, so they are more zeroed by memset_after(xdst, 0, u.dst) in xfrm_alloc_dst(). Note that rt_uncached (list_head) was never properly initialized at alloc time, but xfrm[46]_dst_destroy() is written in such a way that it was not an issue thanks to the memset: if (xdst->u.rt.dst.rt_uncached_list) rt_del_uncached_list(&xdst->u.rt); The route code does it the other way around: rt_uncached_list is assumed to be valid IIF rt_uncached list_head is not empty: void rt_del_uncached_list(struct rtable *rt) { if (!list_empty(&rt->dst.rt_uncached)) { struct uncached_list *ul = rt->dst.rt_uncached_list; spin_lock_bh(&ul->lock); list_del_init(&rt->dst.rt_uncached); spin_unlock_bh(&ul->lock); } } This patch adds mandatory rt_uncached list_head initialization in generic dst_init(), and adapt xfrm[46]_dst_destroy logic to match the rest of the code. Fixes: d288a162dd1c ("net: dst: Prevent false sharing vs. dst_entry:: __refcnt") Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202304162125.18b7bcdd-oliver.sang@intel.com Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> CC: Leon Romanovsky <leon@kernel.org> Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Link: https://lore.kernel.org/r/20230420182508.2417582-1-mbizon@freebox.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22net: dsa: qca8k: fix LEDS_CLASS dependencyArnd Bergmann1-1/+1
With LEDS_CLASS=m, a built-in qca8k driver fails to link: arm-linux-gnueabi-ld: drivers/net/dsa/qca/qca8k-leds.o: in function `qca8k_setup_led_ctrl': qca8k-leds.c:(.text+0x1ea): undefined reference to `devm_led_classdev_register_ext' Change the dependency to avoid the broken configuration. Fixes: 1e264f9d2918 ("net: dsa: qca8k: add LEDs basic support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230420213639.2243388-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22net/handshake: Fix section mismatch in handshake_exitGeert Uytterhoeven1-1/+1
If CONFIG_NET_NS=n (e.g. m68k/defconfig): WARNING: modpost: vmlinux.o: section mismatch in reference: handshake_exit (section: .exit.text) -> handshake_genl_net_ops (section: .init.data) ERROR: modpost: Section mismatches detected. Fix this by dropping the __net_initdata tag from handshake_genl_net_ops. Fixes: 3b3009ea8abb713b ("net/handshake: Create a NETLINK service for handling handshake requests") Reported-by: noreply@ellerman.id.au Closes: http://kisskb.ellerman.id.au/kisskb/buildresult/14912987 Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20230420173723.3773434-1-geert@linux-m68k.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22net: phy: add basic driver for NXP CBTX PHYVladimir Oltean3-0/+234
The CBTX PHY is a Fast Ethernet PHY integrated into the SJA1110 A/B/C automotive Ethernet switches. It was hoped it would work with the Generic PHY driver, but alas, it doesn't. The most important reason why is that the PHY is powered down by default, and it needs a vendor register to power it on. It has a linear memory map that is accessed over SPI by the SJA1110 switch driver, which exposes a fake MDIO controller. It has the following (and only the following) standard clause 22 registers: 0x0: MII_BMCR 0x1: MII_BMSR 0x2: MII_PHYSID1 0x3: MII_PHYSID2 0x4: MII_ADVERTISE 0x5: MII_LPA 0x6: MII_EXPANSION 0x7: the missing MII_NPAGE for Next Page Transmit Register Every other register is vendor-defined. The register map expands the standard clause 22 5-bit address space of 0x20 registers, however the driver does not need to access the extra registers for now (and hopefully never). If it ever needs to do that, it is possible to implement a fake (software) page switching mechanism between the PHY driver and the SJA1110 MDIO controller driver. Also, Auto-MDIX is turned off by default in hardware, the driver turns it on by default and reports the current status. I've tested this with a VSC8514 link partner and a crossover cable, by forcing the mode on the link partner, and seeing that the CBTX PHY always sees the reverse of the mode forced on the VSC8514 (and that traffic works). The link doesn't come up (as expected) if MDI modes are forced on both ends in the same way (with the cross-over cable, that is). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230418190141.1040562-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-22netfilter: nf_tables: allow to create netdev chain without devicePablo Neira Ayuso1-12/+11
Relax netdev chain creation to allow for loading the ruleset, then adding/deleting devices at a later stage. Hardware offload does not support for this feature yet. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: support for deleting devices in an existing netdev chainPablo Neira Ayuso1-11/+88
This patch allows for deleting devices in an existing netdev chain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: support for adding new devices to an existing netdev chainPablo Neira Ayuso2-81/+142
This patch allows users to add devices to an existing netdev chain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: rename function to destroy hook listPablo Neira Ayuso1-4/+4
Rename nft_flowtable_hooks_destroy() by nft_hooks_destroy() to prepare for netdev chain device updates. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: do not send complete notification of deletionsPablo Neira Ayuso1-19/+51
In most cases, table, name and handle is sufficient for userspace to identify an object that has been deleted. Skipping unneeded fields in the netlink attributes in the message saves bandwidth (ie. less chances of hitting ENOBUFS). Rules are an exception: the existing userspace monitor code relies on the rule definition. This exception can be removed by implementing a rule cache in userspace, this is already supported by the tracing infrastructure. Regarding flowtables, incremental deletion of devices is possible. Skipping a full notification allows userspace to differentiate between flowtable removal and incremental removal of devices. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: extended netlink error reporting for netdevicePablo Neira Ayuso1-14/+24
Flowtable and netdev chains are bound to one or several netdevice, extend netlink error reporting to specify the the netdevice that triggers the error. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22ipvs: Correct spelling in commentsSimon Horman1-3/+3
Correct some spelling errors flagged by codespell and found by inspection. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22ipvs: Remove {Enter,Leave}FunctionSimon Horman5-112/+9
Remove EnterFunction and LeaveFunction. These debugging macros seem well past their use-by date. And seem to have little value these days. Removing them allows some trivial cleanup of some exit paths for some functions. These are also included in this patch. There is likely scope for further cleanup of both debugging and unwind paths. But let's leave that for another day. Only intended to change debug output, and only when CONFIG_IP_VS_DEBUG is enabled. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22ipvs: Consistently use array_size() in ip_vs_conn_init()Simon Horman1-6/+6
Consistently use array_size() to calculate the size of ip_vs_conn_tab in bytes. Flagged by Coccinelle: WARNING: array_size is already used (line 1498) to compute the same size No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22ipvs: Update width of source for ip_vs_sync_conn_optionsSimon Horman2-3/+5
In ip_vs_sync_conn_v0() copy is made to struct ip_vs_sync_conn_options. That structure looks like this: struct ip_vs_sync_conn_options { struct ip_vs_seq in_seq; struct ip_vs_seq out_seq; }; The source of the copy is the in_seq field of struct ip_vs_conn. Whose type is struct ip_vs_seq. Thus we can see that the source - is not as wide as the amount of data copied, which is the width of struct ip_vs_sync_conn_option. The copy is safe because the next field in is another struct ip_vs_seq. Make use of struct_group() to annotate this. Flagged by gcc-13 as: In file included from ./include/linux/string.h:254, from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/paravirt.h:17, from ./arch/x86/include/asm/cpuid.h:62, from ./arch/x86/include/asm/processor.h:19, from ./arch/x86/include/asm/timex.h:5, from ./include/linux/timex.h:67, from ./include/linux/time32.h:13, from ./include/linux/time.h:60, from ./include/linux/stat.h:19, from ./include/linux/module.h:13, from net/netfilter/ipvs/ip_vs_sync.c:38: In function 'fortify_memcpy_chk', inlined from 'ip_vs_sync_conn_v0' at net/netfilter/ipvs/ip_vs_sync.c:606:3: ./include/linux/fortify-string.h:529:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 529 | __read_overflow2_field(q_size_field, size); | Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: do not store rule in traceinfo structureFlorian Westphal3-16/+16
pass it as argument instead. This reduces size of traceinfo to 16 bytes. Total stack usage: nf_tables_core.c:252 nft_do_chain 304 static While its possible to also pass basechain as argument, doing so increases nft_do_chaininfo function size. Unlike pktinfo/verdict/rule the basechain info isn't used in the expression evaluation path. gcc places it on the stack, which results in extra push/pop when it gets passed to the trace helpers as argument rather than as part of the traceinfo structure. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: do not store verdict in traceinfo structureFlorian Westphal3-19/+20
Just pass it as argument to nft_trace_notify. Stack is reduced by 8 bytes: nf_tables_core.c:256 nft_do_chain 312 static Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: do not store pktinfo in traceinfo structureFlorian Westphal3-15/+16
pass it as argument. No change in object size. stack usage decreases by 8 byte: nf_tables_core.c:254 nft_do_chain 320 static Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: remove unneeded conditionalFlorian Westphal1-4/+2
This helper is inlined, so keep it as small as possible. If the static key is true, there is only a very small chance that info->trace is false: 1. tracing was enabled at this very moment, the static key was updated to active right after nft_do_table was called. 2. tracing was disabled at this very moment. trace->info is already false, the static key is about to be patched to false soon. In both cases, no event will be sent because info->trace is false (checked in noinline slowpath). info->nf_trace is irrelevant. The nf_trace update is redunant in this case, but this will only happen for short duration, when static key flips. text data bss dec hex filename old: 2980 192 32 3204 c84 nf_tables_core.o new: 2964 192 32 3188 c74i nf_tables_core.o Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: make validation state per tableFlorian Westphal2-21/+20
We only need to validate tables that saw changes in the current transaction. The existing code revalidates all tables, but this isn't needed as cross-table jumps are not allowed (chains have table scope). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: don't write table validation state without mutexFlorian Westphal3-9/+2
The ->cleanup callback needs to be removed, this doesn't work anymore as the transaction mutex is already released in the ->abort function. Just do it after a successful validation pass, this either happens from commit or abort phases where transaction mutex is held. Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: don't store chain address on jumpFlorian Westphal4-28/+44
Now that the rule trailer/end marker and the rcu head reside in the same structure, we no longer need to save/restore the chain pointer when performing/returning from a jump. We can simply let the trace infra walk the evaluated rule until it hits the end marker and then fetch the chain pointer from there. When the rule is NULL (policy tracing), then chain and basechain pointers were already identical, so just use the basechain. This cuts size of jumpstack in half, from 256 to 128 bytes in 64bit, scripts/stackusage says: nf_tables_core.c:251 nft_do_chain 328 static Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: don't store address of last rule on jumpFlorian Westphal1-6/+2
Walk the rule headers until the trailer one (last_bit flag set) instead of stopping at last_rule address. This avoids the need to store the address when jumping to another chain. This cuts size of jumpstack array by one third, on 64bit from 384 to 256 bytes. Still, stack usage is still quite large: scripts/stackusage: nf_tables_core.c:258 nft_do_chain 496 static Next patch will also remove chain pointer. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-22netfilter: nf_tables: merge nft_rules_old structure and end of ruleblob markerFlorian Westphal1-28/+27
In order to free the rules in a chain via call_rcu, the rule array used to stash a rcu_head and space for a pointer at the end of the rule array. When the current nft_rule_dp blob format got added in 2c865a8a28a1 ("netfilter: nf_tables: add rule blob layout"), this results in a double-trailer: size (unsigned long) struct nft_rule_dp struct nft_expr ... struct nft_rule_dp struct nft_expr ... struct nft_rule_dp (is_last=1) // Trailer The trailer, struct nft_rule_dp (is_last=1), is not accounted for in size, so it can be located via start_addr + size. Because the rcu_head is stored after 'start+size' as well this means the is_last trailer is *aliased* to the rcu_head (struct nft_rules_old). This is harmless, because at this time the nft_do_chain function never evaluates/accesses the trailer, it only checks the address boundary: for (; rule < last_rule; rule = nft_rule_next(rule)) { ... But this way the last_rule address has to be stashed in the jump structure to restore it after returning from a chain. nft_do_chain stack usage has become way too big, so put it on a diet. Without this patch is impossible to use for (; !rule->is_last; rule = nft_rule_next(rule)) { ... because on free, the needed update of the rcu_head will clobber the nft_rule_dp is_last bit. Furthermore, also stash the chain pointer in the trailer, this allows to recover the original chain structure from nf_tables_trace infra without a need to place them in the jump struct. After this patch it is trivial to diet the jump stack structure, done in the next two patches. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-04-21selftests/bpf: verifier/value_ptr_arith converted to inline assemblyEduard Zingerman3-1146/+1451
Test verifier/value_ptr_arith automatically converted to use inline assembly. Test cases "sanitation: alu with different scalars 2" and "sanitation: alu with different scalars 3" are updated to avoid -ENOENT as return value, as __retval() annotation only supports numeric literals. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-25-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/value_illegal_alu converted to inline assemblyEduard Zingerman3-95/+151
Test verifier/value_illegal_alu automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-24-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/unpriv converted to inline assemblyEduard Zingerman4-562/+764
Test verifier/unpriv semi-automatically converted to use inline assembly. The verifier/unpriv.c had to be split in two parts: - the bulk of the tests is in the progs/verifier_unpriv.c; - the single test that needs `struct bpf_perf_event_data` definition is in the progs/verifier_unpriv_perf.c. The tests above can't be in a single file because: - first requires inclusion of the filter.h header (to get access to BPF_ST_MEM macro, inline assembler does not support this isntruction); - the second requires vmlinux.h, which contains definitions conflicting with filter.h. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-23-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/subreg converted to inline assemblyEduard Zingerman3-533/+675
Test verifier/subreg automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-22-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/spin_lock converted to inline assemblyEduard Zingerman3-447/+535
Test verifier/spin_lock automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-21-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/sock converted to inline assemblyEduard Zingerman3-706/+982
Test verifier/sock automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-20-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21selftests/bpf: verifier/search_pruning converted to inline assemblyEduard Zingerman3-266/+341
Test verifier/search_pruning automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230421174234.2391278-19-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>