summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-20amt: remove unnecessary skb pointer checkYang Yingliang1-4/+2
The skb pointer will be checked in kfree_skb(), so remove the outside check. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20220818093114.2449179-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20selftests/net: test l2 tunnel TOS/TTL inheritingMatthias May2-0/+391
There are currently 3 ip tunnels that are capable of carrying L2 traffic: gretap, vxlan and geneve. They all are capable to inherit the TOS/TTL for the outer IP-header from the inner frame. Add a test that verifies that these fields are correctly inherited. These tests failed before the following commits: b09ab9c92e50 ("ip6_tunnel: allow to inherit from VLAN encapsulated IP") 3f8a8447fd0b ("ip6_gre: use actual protocol to select xmit") 41337f52b967 ("ip6_gre: set DSCP for non-IP") 7ae29fd1be43 ("ip_tunnel: allow to inherit from VLAN encapsulated IP") 7074732c8fae ("ip_tunnels: allow VXLAN/GENEVE to inherit TOS/TTL from VLAN") ca2bb69514a8 ("geneve: do not use RT_TOS for IPv6 flowlabel") b4ab94d6adaa ("geneve: fix TOS inheriting for ipv4") Signed-off-by: Matthias May <matthias.may@westermo.com> Link: https://lore.kernel.org/r/20220817073649.26117-1-matthias.may@westermo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20Merge branch 'net-dpaa-cleanups-in-preparation-for-phylink-conversion'Jakub Kicinski22-1051/+457
Sean Anderson says: ==================== net: dpaa: Cleanups in preparation for phylink conversion This series contains several cleanup patches for dpaa/fman. While they are intended to prepare for a phylink conversion, they stand on their own. This series was originally submitted as part of [1]. [1] https://lore.kernel.org/netdev/20220715215954.1449214-1-sean.anderson@seco.com ==================== Link: https://lore.kernel.org/r/20220818161649.2058728-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: memac: Use params instead of priv for max_speedSean Anderson1-3/+1
This option is present in params, so use it instead of the fman private version. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Export/rename some common functionsSean Anderson2-6/+9
In preparation for moving each of the initialization functions to their own file, export some common functions so they can be re-used. This adds an fman prefix to set_multi to make it a bit less genericly-named. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Configure fixed link in memac_initializationSean Anderson1-47/+46
memac is the only mac which parses fixed links. Move the parsing/configuring to its initialization function. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Move struct dev to mac_deviceSean Anderson2-20/+12
Move the reference to our device to mac_device. This way, macs can use it in their log messages. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Store initialization function in match dataSean Anderson2-192/+165
Instead of re-matching the compatible string in order to determine the init function, just store it in the match data. The separate setup functions aren't needed anymore. Merge their content into init as well. To ensure everything compiles correctly, we move them to the bottom of the file. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Get PCS node in per-mac initSean Anderson2-11/+10
This moves the reading of the PCS property out of the generic probe and into the mac-specific initialization function. This reduces the mac-specific jobs done in the top-level probe function. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: dtsec: Always gracefully stop/startSean Anderson2-74/+30
There are two ways that GRS can be set: graceful_stop and dtsec_isr. It is cleared by graceful_start. If it is already set before calling graceful_stop, then that means that dtsec_isr set it. In that case, we will not set GRS nor will we clear it (which seems like a bug?). For GTS the logic is similar, except that there is no one else messing with this bit (so we will always set and clear it). Simplify the logic by always setting/clearing GRS/GTS. This is less racy that the previous behavior, and ensures that we always end up clearing the bits. This can of course clear GRS while dtsec_isr is waiting, but because we have already done our own waiting it should be fine. This is the last user of enum comm_mode, so remove it. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Store en/disable in mac_device instead of mac_priv_sSean Anderson3-44/+15
All macs use the same start/stop functions. The actual mac-specific code lives in enable/disable. Move these functions to an appropriate struct, and inline the phy enable/disable calls to the caller of start/stop. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Don't pass comm_mode to enable/disableSean Anderson7-46/+24
mac_priv_s->enable() and ->disable() are always called with a comm_mode of COMM_MODE_RX_AND_TX. Remove this parameter, and refactor the macs appropriately. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20net: fman: Convert to SPDX identifiersSean Anderson18-515/+33
This converts the license text of files in the fman directory to use SPDX license identifiers instead. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20dt-bindings: net: Convert FMan MAC bindings to yamlSean Anderson2-127/+146
This converts the MAC portion of the FMan MAC bindings to yaml. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-20Revert "Merge branch 'wwan-t7xx-fw-flashing-and-coredump-support'"Jakub Kicinski25-1706/+73
This reverts commit 5417197dd516a8e115aa69f62a7b7554b0c3829c, reversing changes made to 0630f64d25a0f0a8c6a9ce9fde8750b3b561e6f5. Reverting to allow addressing review comments. Link: https://lore.kernel.org/all/4c5dbea0-52a9-1c3d-7547-00ea54c90550@linux.intel.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski657-9300/+107493
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19igc: add xdp frags support to ndo_xdp_xmitLorenzo Bianconi1-45/+83
Add the capability to map non-linear xdp frames in XDP_TX and ndo_xdp_xmit callback. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220817173628.109102-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19Merge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model'Jakub Kicinski4-0/+1112
Petr Machata says: ==================== selftests: mlxsw: Add ordering tests for unified bridge model Amit Cohen writes: Commit 798661c73672 ("Merge branch 'mlxsw-unified-bridge-conversion-part-6'") converted mlxsw driver to use unified bridge model. In the legacy model, when a RIF was created / destroyed, it was firmware's responsibility to update it in the relevant FID classification records. In the unified bridge model, this responsibility moved to software. This set adds tests to check the order of configuration for the following classifications: 1. {Port, VID} -> FID 2. VID -> FID 3. VNI -> FID (after decapsulation) In addition, in the legacy model, software is responsible to update a table which is used to determine the packet's egress VID. Add a test to check that the order of configuration does not impact switch behavior. See more details in the commit messages. Note that the tests supposed to pass also using the legacy model, they are added now as with the new model they test the driver and not the firmware. Patch set overview: Patch #1 adds test for {Port, VID} -> FID Patch #2 adds test for VID -> FID Patch #3 adds test for VNI -> FID Patch #4 adds test for egress VID classification ==================== Link: https://lore.kernel.org/r/cover.1660747162.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19selftests: mlxsw: Add egress VID classification testAmit Cohen1-0/+273
After routing, the device always consults a table that determines the packet's egress VID based on {egress RIF, egress local port}. In the unified bridge model, it is up to software to maintain this table via REIV register. The table needs to be updated in the following flows: 1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new {RIF, Port}->VID mapping should be created. 2. When a {Port, VID} is mapped to a FID and the FID already has a RIF, a new {RIF, Port}->VID mapping should be created. Add a test to verify that packets get the correct VID after routing, regardless of the order of the configuration. # ./egress_vid_classification.sh TEST: Add RIF for existing {port, VID}->FID mapping [ OK ] TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ] Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19selftests: mlxsw: Add ingress RIF configuration test for VXLANAmit Cohen1-0/+311
Before layer 2 forwarding, the device classifies an incoming packet to a FID. After classification, the FID is known, but also all the attributes of the FID, such as the router interface (RIF) via which a packet that needs to be routed will ingress the router block. For VXLAN decapsulation, the FID classification is done according to the VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping should be updated by the software with the new RIF. In addition, when a new mapping is added for FID which already has a RIF, the correct RIF should be used for it. Add a test to verify that packets can be routed after decapsulation which is done after VNI->FID classification, regardless of the order of the configuration. # ./ingress_rif_conf_vxlan.sh TEST: Add RIF for existing VNI->FID mapping [ OK ] TEST: Add VNI->FID mapping for FID with a RIF [ OK ] Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19selftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridgeAmit Cohen1-0/+264
Before layer 2 forwarding, the device classifies an incoming packet to a FID. After classification, the FID is known, but also all the attributes of the FID, such as the router interface (RIF) via which a packet that needs to be routed will ingress the router block. For VLAN-aware bridges (802.1Q), the FID classification is done according to VID. When a RIF is added on top of a FID, the existing VID->FID mapping should be updated by the software with the new RIF. We never map multiple VLANs to the same FID using VID->FID, so we cannot create VID->FID for FID which already has a RIF using 802.1Q. Anyway, verify that packets can be routed via port which is added after the FID already has a RIF. Add a test to verify that packets can be routed after VID->FID classification, regardless of the order of the configuration. # ./ingress_rif_conf_1q.sh TEST: Add RIF for existing VID->FID mapping [ OK ] TEST: Add port to VID->FID mapping for FID with a RIF [ OK ] Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19selftests: mlxsw: Add ingress RIF configuration test for 802.1D bridgeAmit Cohen1-0/+264
Before layer 2 forwarding, the device classifies an incoming packet to a FID. After classification, the FID is known, but also all the attributes of the FID, such as the router interface (RIF) via which a packet that needs to be routed will ingress the router block. For VLAN-unaware bridges (802.1D), the FID classification is done according to {Port, VID}. When a RIF is added on top of a FID, all the existing {Port, VID}->FID mappings should be updated by the software with the new RIF. In addition, when a new mapping is added for FID which already has a RIF, the correct RIF should be used for it. Add a test to verify that packets can be routed after {Port, VID}->FID classification, regardless of the order of the configuration. # ./ingress_rif_conf_1d.sh TEST: Add RIF for existing {port, VID}->FID mapping [ OK ] TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ] Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19net: ethernet: mtk_eth_soc: remove unused txd_pdma pointer in ↵Lorenzo Bianconi1-5/+7
mtk_xdp_submit_frame Get rid of unnecessary txd_pdma pointer in mtk_xdp_submit_frame for loop since it is actually used at the end of the routine using latest mtk_tx_dma consumed pointer as reference. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/2c40b0fbb9163a0d62ff897abae17db84a9f3b99.1660669138.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19net: macsec: Expose MACSEC_SALT_LEN definition to user spaceEmeel Hakim2-1/+2
Expose MACSEC_SALT_LEN definition to user space to be used in various user space applications such as iproute. Iproute will use this as part of adding macsec extended packet number support. Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Link: https://lore.kernel.org/r/20220818153229.4721-1-ehakim@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19Merge tag 'net-6.0-rc2' of ↵Linus Torvalds65-793/+2351
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. Current release - regressions: - tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF socket maps get data out of the TCP stack) - tls: rx: react to strparser initialization errors - netfilter: nf_tables: fix scheduling-while-atomic splat - net: fix suspicious RCU usage in bpf_sk_reuseport_detach() Current release - new code bugs: - mlxsw: ptp: fix a couple of races, static checker warnings and error handling Previous releases - regressions: - netfilter: - nf_tables: fix possible module reference underflow in error path - make conntrack helpers deal with BIG TCP (skbs > 64kB) - nfnetlink: re-enable conntrack expectation events - net: fix potential refcount leak in ndisc_router_discovery() Previous releases - always broken: - sched: cls_route: disallow handle of 0 - neigh: fix possible local DoS due to net iface start/stop loop - rtnetlink: fix module refcount leak in rtnetlink_rcv_msg - sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu - virtio_net: fix endian-ness for RSS - dsa: mv88e6060: prevent crash on an unused port - fec: fix timer capture timing in `fec_ptp_enable_pps()` - ocelot: stats: fix races, integer wrapping and reading incorrect registers (the change of register definitions here accounts for bulk of the changed LoC in this PR)" * tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits) net: moxa: MAC address reading, generating, validity checking tcp: handle pure FIN case correctly tcp: refactor tcp_read_skb() a bit tcp: fix tcp_cleanup_rbuf() for tcp_read_skb() tcp: fix sock skb accounting in tcp_read_skb() igb: Add lock to avoid data race dt-bindings: Fix incorrect "the the" corrections net: genl: fix error path memory leak in policy dumping stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove() net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run net/mlx5e: Allocate flow steering storage during uplink initialization net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset net: mscc: ocelot: make struct ocelot_stat_layout array indexable net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work net: mscc: ocelot: turn stats_lock into a spinlock net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it ...
2022-08-19Merge tag 'linux-kselftest-next-6.0-rc2' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fix from Shuah Khan: - fix landlock test build regression * tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/landlock: fix broken include of linux/landlock.h
2022-08-19Merge tag 'trace-rtla-v6.0' of ↵Linus Torvalds4-33/+43
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull rtla tool fixes from Steven Rostedt: "Fixes for the Real-Time Linux Analysis tooling: - Fix tracer name in comments and prints - Fix setting up symlinks - Allow extra flags to be set in build - Consolidate and show all necessary libraries not found in build error" * tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: rtla: Consolidate and show all necessary libraries that failed for building tools/rtla: Build with EXTRA_{C,LD}FLAGS tools/rtla: Fix command symlinks rtla: Fix tracer name
2022-08-19ixgbe: Manual AN-37 for troublesome link partners for X550 SFIJeff Daly2-3/+56
Some (Juniper MX5) SFP link partners exhibit a disinclination to autonegotiate with X550 configured in SFI mode. This patch enables a manual AN-37 restart to work around the problem. Signed-off-by: Jeff Daly <jeffd@silicom-usa.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-19Merge branch 'add-dt-property-to-disable-hibernation-mode'Jakub Kicinski2-0/+33
Wei Fang says: ==================== Add DT property to disable hibernation mode The patches add the ability to disable the hibernation mode of AR803x PHYs. Hibernation mode defaults to enabled after hardware reset on these PHYs. If the AR803x PHYs enter hibernation mode, they will not provide any clock. For some MACs, they might need the clocks which provided by the PHYs to support their own hardware logic. So, the patches add the support to disable hibernation mode by adding a boolean: qca,disable-hibernation-mode If one wished to disable hibernation mode to better match with the specifical MAC, just add this property in the phy node of DT. ==================== Link: https://lore.kernel.org/r/20220818030054.1010660-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19net: phy: at803x: add disable hibernation mode supportWei Fang1-0/+25
When the cable is unplugged, the Atheros AR803x PHYs will enter hibernation mode after about 10 seconds if the hibernation mode is enabled and will not provide any clock to the MAC. But for some MACs, this feature might cause unexpected issues due to the logic of MACs. Taking SYNP MAC (stmmac) as an example, if the cable is unplugged and the "eth0" interface is down, the AR803x PHY will enter hibernation mode. Then perform the "ifconfig eth0 up" operation, the stmmac can't be able to complete the software reset operation and fail to init it's own DMA. Therefore, the "eth0" interface is failed to ifconfig up. Why does it cause this issue? The truth is that the software reset operation of the stmmac is designed to depend on the RX_CLK of PHY. So, this patch offers an option for the user to determine whether to disable the hibernation mode of AR803x PHYs. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-19dt-bindings: net: ar803x: add disable-hibernation-mode propetryWei Fang1-0/+8
The hibernation mode of Atheros AR803x PHYs defaults to be enabled after hardware reset. When the cable is unplugged, the PHY will enter hibernation mode after about 10 seconds and the PHY clocks will be stopped to save power. However, some MACs need the phy output clock for proper functioning of their logic. For instance, stmmac needs the RX_CLK of PHY for software reset to complete. Therefore, add a DT property to configure the PHY to disable this hardware hibernation mode. Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18net: moxa: MAC address reading, generating, validity checkingSergei Antonov1-6/+7
This device does not remember its MAC address, so add a possibility to get it from the platform. If it fails, generate a random address. This will provide a MAC address early during boot without user space being involved. Also remove extra calls to is_valid_ether_addr(). Made after suggestions by Andrew Lunn: 1) Use eth_hw_addr_random() to assign a random MAC address during probe. 2) Remove is_valid_ether_addr() from moxart_mac_open() 3) Add a call to platform_get_ethdev_address() during probe 4) Remove is_valid_ether_addr() from moxart_set_mac_address(). The core does this v1 -> v2: Handle EPROBE_DEFER returned from platform_get_ethdev_address(). Move MAC reading code to the beginning of the probe function. Signed-off-by: Sergei Antonov <saproj@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> CC: Yang Yingliang <yangyingliang@huawei.com> CC: Pavel Skripkin <paskripkin@gmail.com> CC: Guobin Huang <huangguobin4@huawei.com> CC: Yang Wei <yang.wei9@zte.com.cn> CC: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220818092317.529557-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18Merge branch 'tcp-some-bug-fixes-for-tcp_read_skb'Jakub Kicinski2-28/+26
Cong Wang says: ==================== tcp: some bug fixes for tcp_read_skb() This patchset contains 3 bug fixes and 1 minor refactor patch for tcp_read_skb(). V1 only had the first patch, as Eric prefers to fix all of them together, I have to group them together. ==================== Link: https://lore.kernel.org/r/20220817195445.151609-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18tcp: handle pure FIN case correctlyCong Wang2-3/+4
When skb->len==0, the recv_actor() returns 0 too, but we also use 0 for error conditions. This patch amends this by propagating the errors to tcp_read_skb() so that we can distinguish skb->len==0 case from error cases. Fixes: 04919bed948d ("tcp: Introduce tcp_read_skb()") Reported-by: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18tcp: refactor tcp_read_skb() a bitCong Wang1-17/+9
As tcp_read_skb() only reads one skb at a time, the while loop is unnecessary, we can turn it into an if. This also simplifies the code logic. Cc: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()Cong Wang1-10/+14
tcp_cleanup_rbuf() retrieves the skb from sk_receive_queue, it assumes the skb is not yet dequeued. This is no longer true for tcp_read_skb() case where we dequeue the skb first. Fix this by introducing a helper __tcp_cleanup_rbuf() which does not require any skb and calling it in tcp_read_skb(). Fixes: 04919bed948d ("tcp: Introduce tcp_read_skb()") Cc: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18tcp: fix sock skb accounting in tcp_read_skb()Cong Wang1-0/+1
Before commit 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()"), skb was not dequeued from receive queue hence when we close TCP socket skb can be just flushed synchronously. After this commit, we have to uncharge skb immediately after being dequeued, otherwise it is still charged in the original sock. And we still need to retain skb->sk, as eBPF programs may extract sock information from skb->sk. Therefore, we have to call skb_set_owner_sk_safe() here. Fixes: 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()") Reported-and-tested-by: syzbot+a0e6f8738b58f7654417@syzkaller.appspotmail.com Tested-by: Stanislav Fomichev <sdf@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18igb: Add lock to avoid data raceLin Ma2-1/+13
The commit c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()") places the unregister_netdev() call after the igb_disable_sriov() call to avoid functionality issue. However, it introduces several race conditions when detaching a device. For example, when .remove() is called, the below interleaving leads to use-after-free. (FREE from device detaching) | (USE from netdev core) igb_remove | igb_ndo_get_vf_config igb_disable_sriov | vf >= adapter->vfs_allocated_count? kfree(adapter->vf_data) | adapter->vfs_allocated_count = 0 | | memcpy(... adapter->vf_data[vf] Moreover, the igb_disable_sriov() also suffers from data race with the requests from VF driver. (FREE from device detaching) | (USE from requests) igb_remove | igb_msix_other igb_disable_sriov | igb_msg_task kfree(adapter->vf_data) | vf < adapter->vfs_allocated_count adapter->vfs_allocated_count = 0 | To this end, this commit first eliminates the data races from netdev core by using rtnl_lock (similar to commit 719479230893 ("dpaa2-eth: add MAC/PHY support through phylink")). And then adds a spinlock to eliminate races from driver requests. (similar to commit 1e53834ce541 ("ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero") Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()") Signed-off-by: Lin Ma <linma@zju.edu.cn> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220817184921.735244-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18Merge branch '100GbE' of ↵Jakub Kicinski6-18/+85
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-17 (ice) This series contains updates to ice driver only. Grzegorz prevents modifications to VLAN 0 when setting VLAN promiscuous as it will already be set. He also ignores -EEXIST error when attempting to set promiscuous and ensures promiscuous mode is properly cleared from the hardware when being removed. Benjamin ignores additional -EEXIST errors when setting promiscuous mode since the existing mode is the desired mode. Sylwester fixes VFs to allow sending of tagged traffic when no VLAN filters exist. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix VF not able to send tagged traffic with no VLAN filters ice: Ignore error message when setting same promiscuous mode ice: Fix clearing of promisc mode with bridge over bond ice: Ignore EEXIST when setting promisc mode ice: Fix double VLAN error when entering promisc mode ==================== Link: https://lore.kernel.org/r/20220817171329.65285-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18dt-bindings: Fix incorrect "the the" correctionsGeert Uytterhoeven2-2/+2
Lots of double occurrences of "the" were replaced by single occurrences, but some of them should become "to the" instead. Fixes: 12e5bde18d7f6ca4 ("dt-bindings: Fix typo in comment") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/c5743c0a1a24b3a8893797b52fed88b99e56b04b.1660755148.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18net: ethernet: altera: Add use of ethtool_op_get_ts_infoMaxime Chevallier1-0/+1
Add the ethtool_op_get_ts_info() callback to ethtool ops, so that we can at least use software timestamping. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://lore.kernel.org/r/20220817095725.97444-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18net: genl: fix error path memory leak in policy dumpingJakub Kicinski2-3/+17
If construction of the array of policies fails when recording non-first policy we need to unwind. netlink_policy_dump_add_policy() itself also needs fixing as it currently gives up on error without recording the allocated pointer in the pstate pointer. Reported-by: syzbot+dc54d9ba8153b216cae0@syzkaller.appspotmail.com Fixes: 50a896cf2d6f ("genetlink: properly support per-op policy dumping") Link: https://lore.kernel.org/r/20220816161939.577583-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18stmmac: intel: Add a missing clk_disable_unprepare() call in ↵Christophe JAILLET1-0/+1
intel_eth_pci_remove() Commit 09f012e64e4b ("stmmac: intel: Fix clock handling on error and remove paths") removed this clk_disable_unprepare() This was partly revert by commit ac322f86b56c ("net: stmmac: Fix clock handling on remove path") which removed this clk_disable_unprepare() because: " While unloading the dwmac-intel driver, clk_disable_unprepare() is being called twice in stmmac_dvr_remove() and intel_eth_pci_remove(). This causes kernel panic on the second call. " However later on, commit 5ec55823438e8 ("net: stmmac: add clocks management for gmac driver") has updated stmmac_dvr_remove() which do not call clk_disable_unprepare() anymore. So this call should now be called from intel_eth_pci_remove(). Fixes: 5ec55823438e8 ("net: stmmac: add clocks management for gmac driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/d7c8c1dadf40df3a7c9e643f76ffadd0ccc1ad1b.1660659689.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18tee: add overflow check in register_shm_helper()Jens Wiklander1-0/+3
With special lengths supplied by user space, register_shm_helper() has an integer overflow when calculating the number of pages covered by a supplied user space memory region. This causes internal_get_user_pages_fast() a helper function of pin_user_pages_fast() to do a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 Modules linked in: CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pc : internal_get_user_pages_fast+0x474/0xa80 Call trace: internal_get_user_pages_fast+0x474/0xa80 pin_user_pages_fast+0x24/0x4c register_shm_helper+0x194/0x330 tee_shm_register_user_buf+0x78/0x120 tee_ioctl+0xd0/0x11a0 __arm64_sys_ioctl+0xa8/0xec invoke_syscall+0x48/0x114 Fix this by adding an an explicit call to access_ok() in tee_shm_register_user_buf() to catch an invalid user space address early. Fixes: 033ddf12bcf5 ("tee: add register user memory") Cc: stable@vger.kernel.org Reported-by: Nimish Mishra <neelam.nimish@gmail.com> Reported-by: Anirban Chakraborty <ch.anirban00727@gmail.com> Reported-by: Debdeep Mukhopadhyay <debdeep.mukhopadhyay@gmail.com> Suggested-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-18ixgbe: Don't call kmap() on page allocated with GFP_ATOMICFabio M. De Francesco1-3/+1
Pages allocated with GFP_ATOMIC cannot come from Highmem. This is why there is no need to call kmap() on them. Therefore, don't call kmap() on rx_buffer->page() and instead use a plain page_address() to get the kernel address. Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: remove non-inclusive languageMikael Barsehyan2-9/+9
Remove non-inclusive language from the driver where possible; replace "master" with "primary"; replace "slave" with "secondary". Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Remove ucast_sharedSylwester Dziedziuch3-165/+5
Remove ucast_shared as it was always true. Remove the code depending on ucast_shared from ice_add_mac and ice_remove_mac. Remove ice_find_ucast_rule_entry function as it was only used when ucast_shared was set to false. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Allow 100M speeds for some devicesAnirudh Venkataramanan3-4/+28
For certain devices, 100M speeds are supported. Do not mask off 100M speed for these devices. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Co-developed-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Mikael Barsehyan <mikael.barsehyan@intel.com> Tested-by: Kavya AV <kavyax.av@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement FCS/CRC and VLAN stripping co-existence policyAnatolii Gerasymenko1-0/+25
Make sure that only the valid combinations of FCS/CRC stripping and VLAN stripping offloads are allowed. You cannot have FCS/CRC stripping disabled while VLAN stripping is enabled - this breaks the correctness of the FCS/CRC. If administrator tries to enable VLAN stripping when FCS/CRC stripping is disabled, the request should be rejected. If administrator tries to disable FCS/CRC stripping when VLAN stripping is enabled, the request should be rejected if VLANs are configured. If there is no VLAN configured, then both FCS/CRC and VLAN stripping should be disabled. Testing Hints: The default settings after driver load are: - VLAN C-Tag offloads are enabled - VLAN S-Tag offloads are disabled - FCS/CRC stripping is enabled Restore the default settings before each test with the command: ethtool -K eth0 rx-fcs off rxvlan on txvlan on rx-vlan-stag-hw-parse off tx-vlan-stag-hw-insert off Test 1: Disable FCS/CRC and VLAN stripping: ethtool -K eth0 rx-fcs on rxvlan off Try to enable VLAN stripping: ethtool -K eth0 rxvlan on Expected: VLAN stripping request is rejected Test 2: Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: VLAN stripping is also disabled, as there are no VLAN configured Test 3: Add a VLAN: ip link add link eth0 eth0.42 type vlan id 42 ip link set eth0 up Try to disable FCS/CRC stripping: ethtool -K eth0 rx-fcs on Expected: FCS/CRC stripping request is rejected Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-18ice: Implement control of FCS/CRC strippingJesse Brandeburg7-6/+69
The driver can allow the user to configure whether the CRC aka the FCS (Frame Check Sequence) is DMA'd to the host as part of the receive buffer. The driver usually wants this feature disabled so that the hardware checks the FCS and strips it in order to save PCI bandwidth. Control the reception of FCS to the host using the command: ethtool -K eth0 rx-fcs <on|off> The default shown in ethtool -k eth0 | grep fcs; should be "off", as the hardware will drop any frame with a bad checksum, and DMA of the checksum is useless overhead especially for small packets. Testing Hints: test the FCS/CRC arrives with received packets using tcpdump -nnpi eth0 -xxxx and it should show crc data as the last 4 bytes of the packet. Can also use wireshark to turn on CRC checking and check the data is correct. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Co-developed-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Signed-off-by: Benjamin Mikailenko <benjamin.mikailenko@intel.com> Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>