summaryrefslogtreecommitdiff
path: root/drivers/net/dsa
AgeCommit message (Collapse)AuthorFilesLines
2021-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-9/+10
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-26net: dsa: hellcreek: Adjust schedule look ahead windowKurt Kanzenbach1-1/+1
Traffic schedules can only be started up to eight seconds within the future. Therefore, the driver periodically checks every two seconds whether the admin base time provided by the user is inside that window. If so the schedule is started. Otherwise the check is deferred. However, according to the programming manual the look ahead window size should be four - not eight - seconds. By using the proposed value of four seconds starting a schedule at a specified admin base time actually works as expected. Fixes: 24dfc6eb39b2 ("net: dsa: hellcreek: Add TAPRIO offloading support") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26net: dsa: hellcreek: Fix incorrect setting of GCLKurt Kanzenbach1-3/+3
Currently the gate control list which is programmed into the hardware is incorrect resulting in wrong traffic schedules. The problem is the loop variables are incremented before they are referenced. Therefore, move the increment to the end of the loop. Fixes: 24dfc6eb39b2 ("net: dsa: hellcreek: Add TAPRIO offloading support") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: tag_sja1105: stop asking the sja1105 driver in sja1105_xmit_tpidVladimir Oltean3-26/+0
Introduced in commit 38b5beeae7a4 ("net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously"), the sja1105_xmit_tpid function solved quite a different problem than our needs are now. Then, we used best-effort VLAN filtering and we were using the xmit_tpid to tunnel packets coming from an 8021q upper through the TX VLAN allocated by tag_8021q to that egress port. The need for a different VLAN protocol depending on switch revision came from the fact that this in itself was more of a hack to trick the hardware into accepting tunneled VLANs in the first place. Right now, we deny 8021q uppers (see sja1105_prechangeupper). Even if we supported them again, we would not do that using the same method of {tunneling the VLAN on egress, retagging the VLAN on ingress} that we had in the best-effort VLAN filtering mode. It seems rather simpler that we just allocate a VLAN in the VLAN table that is simply not used by the bridge at all, or by any other port. Anyway, I have 2 gripes with the current sja1105_xmit_tpid: 1. When sending packets on behalf of a VLAN-aware bridge (with the new TX forwarding offload framework) plus untagged (with the tag_8021q VLAN added by the tagger) packets, we can see that on SJA1105P/Q/R/S and later (which have a qinq_tpid of ETH_P_8021AD), some packets sent through the DSA master have a VLAN protocol of 0x8100 and others of 0x88a8. This is strange and there is no reason for it now. If we have a bridge and are therefore forced to send using that bridge's TPID, we can as well blend with that bridge's VLAN protocol for all packets. 2. The sja1105_xmit_tpid introduces a dependency on the sja1105 driver, because it looks inside dp->priv. It is desirable to keep as much separation between taggers and switch drivers as possible. Now it doesn't do that anymore. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: sja1105: drop untagged packets on the CPU and DSA portsVladimir Oltean1-1/+9
The sja1105 driver is a bit special in its use of VLAN headers as DSA tags. This is because in VLAN-aware mode, the VLAN headers use an actual TPID of 0x8100, which is understood even by the DSA master as an actual VLAN header. Furthermore, control packets such as PTP and STP are transmitted with no VLAN header as a DSA tag, because, depending on switch generation, there are ways to steer these control packets towards a precise egress port other than VLAN tags. Transmitting control packets as untagged means leaving a door open for traffic in general to be transmitted as untagged from the DSA master, and for it to traverse the switch and exit a random switch port according to the FDB lookup. This behavior is a bit out of line with other DSA drivers which have native support for DSA tagging. There, it is to be expected that the switch only accepts DSA-tagged packets on its CPU port, dropping everything that does not match this pattern. We perhaps rely a bit too much on the switches' hardware dropping on the CPU port, and place no other restrictions in the kernel data path to avoid that. For example, sja1105 is also a bit special in that STP/PTP packets are transmitted using "management routes" (sja1105_port_deferred_xmit): when sending a link-local packet from the CPU, we must first write a SPI message to the switch to tell it to expect a packet towards multicast MAC DA 01-80-c2-00-00-0e, and to route it towards port 3 when it gets it. This entry expires as soon as it matches a packet received by the switch, and it needs to be reinstalled for the next packet etc. All in all quite a ghetto mechanism, but it is all that the sja1105 switches offer for injecting a control packet. The driver takes a mutex for serializing control packets and making the pairs of SPI writes of a management route and its associated skb atomic, but to be honest, a mutex is only relevant as long as all parties agree to take it. With the DSA design, it is possible to open an AF_PACKET socket on the DSA master net device, and blast packets towards 01-80-c2-00-00-0e, and whatever locking the DSA switch driver might use, it all goes kaput because management routes installed by the driver will match skbs sent by the DSA master, and not skbs generated by the driver itself. So they will end up being routed on the wrong port. So through the lens of that, maybe it would make sense to avoid that from happening by doing something in the network stack, like: introduce a new bit in struct sk_buff, like xmit_from_dsa. Then, somewhere around dev_hard_start_xmit(), introduce the following check: if (netdev_uses_dsa(dev) && !skb->xmit_from_dsa) kfree_skb(skb); Ok, maybe that is a bit drastic, but that would at least prevent a bunch of problems. For example, right now, even though the majority of DSA switches drop packets without DSA tags sent by the DSA master (and therefore the majority of garbage that user space daemons like avahi and udhcpcd and friends create), it is still conceivable that an aggressive user space program can open an AF_PACKET socket and inject a spoofed DSA tag directly on the DSA master. We have no protection against that; the packet will be understood by the switch and be routed wherever user space says. Furthermore: there are some DSA switches where we even have register access over Ethernet, using DSA tags. So even user space drivers are possible in this way. This is a huge hole. However, the biggest thing that bothers me is that udhcpcd attempts to ask for an IP address on all interfaces by default, and with sja1105, it will attempt to get a valid IP address on both the DSA master as well as on sja1105 switch ports themselves. So with IP addresses in the same subnet on multiple interfaces, the routing table will be messed up and the system will be unusable for traffic until it is configured manually to not ask for an IP address on the DSA master itself. It turns out that it is possible to avoid that in the sja1105 driver, at least very superficially, by requesting the switch to drop VLAN-untagged packets on the CPU port. With the exception of control packets, all traffic originated from tag_sja1105.c is already VLAN-tagged, so only STP and PTP packets need to be converted. For that, we need to uphold the equivalence between an untagged and a pvid-tagged packet, and to remember that the CPU port of sja1105 uses a pvid of 4095. Now that we drop untagged traffic on the CPU port, non-aggressive user space applications like udhcpcd stop bothering us, and sja1105 effectively becomes just as vulnerable to the aggressive kind of user space programs as other DSA switches are (ok, users can also create 8021q uppers on top of the DSA master in the case of sja1105, but in future patches we can easily deny that, but it still doesn't change the fact that VLAN-tagged packets can still be injected over raw sockets). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: sja1105: prevent tag_8021q VLANs from being received on user portsVladimir Oltean1-8/+29
Currently it is possible for an attacker to craft packets with a fake DSA tag and send them to us, and our user ports will accept them and preserve that VLAN when transmitting towards the CPU. Then the tagger will be misled into thinking that the packets came on a different port than they really came on. Up until recently there wasn't a good option to prevent this from happening. In SJA1105P and later, the MAC Configuration Table introduced two options called: - DRPSITAG: Drop Single Inner Tagged Frames - DRPSOTAG: Drop Single Outer Tagged Frames Because the sja1105 driver classifies all VLANs as "outer VLANs" (S-Tags), it would be in principle possible to enable the DRPSOTAG bit on ports using tag_8021q, and drop on ingress all packets which have a VLAN tag. When the switch is VLAN-unaware, this works, because it uses a custom TPID of 0xdadb, so any "tagged" packets received on a user port are probably a spoofing attempt. But when the switch overall is VLAN-aware, and some ports are standalone (therefore they use tag_8021q), the TPID is 0x8100, and the port can receive a mix of untagged and VLAN-tagged packets. The untagged ones will be classified to the tag_8021q pvid, and the tagged ones to the VLAN ID from the packet header. Yes, it is true that since commit 4fbc08bd3665 ("net: dsa: sja1105: deny 8021q uppers on ports") we no longer support this mixed mode, but that is a temporary limitation which will eventually be lifted. It would be nice to not introduce one more restriction via DRPSOTAG, which would make the standalone ports of a VLAN-aware switch drop genuinely VLAN-tagged packets. Also, the DRPSOTAG bit is not available on the first generation of switches (SJA1105E, SJA1105T). So since one of the key features of this driver is compatibility across switch generations, this makes it an even less desirable approach. The breakthrough comes from commit bef0746cf4cc ("net: dsa: sja1105: make sure untagged packets are dropped on ingress ports with no pvid"), where it became obvious that untagged packets are not dropped even if the ingress port is not in the VMEMB_PORT vector of that port's pvid. However, VLAN-tagged packets are subject to VLAN ingress checking/dropping. This means that instead of using the catch-all DRPSOTAG bit introduced in SJA1105P, we can drop tagged packets on a per-VLAN basis, and this is already compatible with SJA1105E/T. This patch adds an "allowed_ingress" argument to sja1105_vlan_add(), and we call it with "false" for tag_8021q VLANs on user ports. The tag_8021q VLANs still need to be allowed, of course, on ingress to DSA ports and CPU ports. We also need to refine the drop_untagged check in sja1105_commit_pvid to make it not freak out about this new configuration. Currently it will try to keep the configuration consistent between untagged and pvid-tagged packets, so if the pvid of a port is 1 but VLAN 1 is not in VMEMB_PORT, packets tagged with VID 1 will behave the same as untagged packets, and be dropped. This behavior is what we want for ports under a VLAN-aware bridge, but for the ports with a tag_8021q pvid, we want untagged packets to be accepted, but packets tagged with a header recognized by the switch as a tag_8021q VLAN to be dropped. So only restrict the drop_untagged check to apply to the bridge_pvid, not to the tag_8021q_pvid. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: dsa: mt7530: manually set up VLAN ID 0DENG Qingfang2-0/+27
The driver was relying on dsa_slave_vlan_rx_add_vid to add VLAN ID 0. After the blamed commit, VLAN ID 0 won't be set up anymore, breaking software bridging fallback on VLAN-unaware bridges. Manually set up VLAN ID 0 to fix this. Fixes: 06cfb2df7eb0 ("net: dsa: don't advertise 'rx-vlan-filter' when not needed") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24net: dsa: mv88e6xxx: Update mv88e6393x serdes errataNathan Rossi1-5/+6
In early erratas this issue only covered port 0 when changing from [x]MII (rev A 3.6). In subsequent errata versions this errata changed to cover the additional "Hardware reset in CPU managed mode" condition, and removed the note specifying that it only applied to port 0. In designs where the device is configured with CPU managed mode (CPU_MGD), on reset all SERDES ports (p0, p9, p10) have a stuck power down bit and require this initial power up procedure. As such apply this errata to all three SERDES ports of the mv88e6393x. Signed-off-by: Nathan Rossi <nathan.rossi@digi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24net: dsa: let drivers state that they need VLAN filtering while standaloneVladimir Oltean1-0/+1
As explained in commit e358bef7c392 ("net: dsa: Give drivers the chance to veto certain upper devices"), the hellcreek driver uses some tricks to comply with the network stack expectations: it enforces port separation in standalone mode using VLANs. For untagged traffic, bridging between ports is prevented by using different PVIDs, and for VLAN-tagged traffic, it never accepts 8021q uppers with the same VID on two ports, so packets with one VLAN cannot leak from one port to another. That is almost fine*, and has worked because hellcreek relied on an implicit behavior of the DSA core that was changed by the previous patch: the standalone ports declare the 'rx-vlan-filter' feature as 'on [fixed]'. Since most of the DSA drivers are actually VLAN-unaware in standalone mode, that feature was actually incorrectly reflecting the hardware/driver state, so there was a desire to fix it. This leaves the hellcreek driver in a situation where it has to explicitly request this behavior from the DSA framework. We configure the ports as follows: - Standalone: 'rx-vlan-filter' is on. An 8021q upper on top of a standalone hellcreek port will go through dsa_slave_vlan_rx_add_vid and will add a VLAN to the hardware tables, giving the driver the opportunity to refuse it through .port_prechangeupper. - Bridged with vlan_filtering=0: 'rx-vlan-filter' is off. An 8021q upper on top of a bridged hellcreek port will not go through dsa_slave_vlan_rx_add_vid, because there will not be any attempt to offload this VLAN. The driver already disables VLAN awareness, so that upper should receive the traffic it needs. - Bridged with vlan_filtering=1: 'rx-vlan-filter' is on. An 8021q upper on top of a bridged hellcreek port will call dsa_slave_vlan_rx_add_vid, and can again be vetoed through .port_prechangeupper. *It is not actually completely fine, because if I follow through correctly, we can have the following situation: ip link add br0 type bridge vlan_filtering 0 ip link set lan0 master br0 # lan0 now becomes VLAN-unaware ip link set lan0 nomaster # lan0 fails to become VLAN-aware again, therefore breaking isolation This patch fixes that corner case by extending the DSA core logic, based on this requested attribute, to change the VLAN awareness state of the switch (port) when it leaves the bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20net: mscc: ocelot: transmit the VLAN filtering restrictions via extackVladimir Oltean1-1/+1
We need to transmit more restrictions in future patches, convert this one to netlink extack. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20net: mscc: ocelot: transmit the "native VLAN" error via extackVladimir Oltean1-3/+5
We need to reject some more configurations in future patches, convert the existing one to netlink extack. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-4/+2
drivers/ptp/Kconfig: 55c8fca1dae1 ("ptp_pch: Restore dependency on PCI") e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18net: dsa: tag_sja1105: be dsa_loop-safeVladimir Oltean1-3/+2
Add support for tag_sja1105 running on non-sja1105 DSA ports, by making sure that every time we dereference dp->priv, we check the switch's dsa_switch_ops (otherwise we access a struct sja1105_port structure that is in fact something else). This adds an unconditional build-time dependency between sja1105 being built as module => tag_sja1105 must also be built as module. This was there only for PTP before. Some sane defaults must also take place when not running on sja1105 hardware. These are: - sja1105_xmit_tpid: the sja1105 driver uses different VLAN protocols depending on VLAN awareness and switch revision (when an encapsulated VLAN must be sent). Default to 0x8100. - sja1105_rcv_meta_state_machine: this aggregates PTP frames with their metadata timestamp frames. When running on non-sja1105 hardware, don't do that and accept all frames unmodified. - sja1105_defer_xmit: calls sja1105_port_deferred_xmit in sja1105_main.c which writes a management route over SPI. When not running on sja1105 hardware, bypass the SPI write and send the frame as-is. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, ↵Vladimir Oltean1-4/+2
or worse It seems that of_find_compatible_node has a weird calling convention in which it calls of_node_put() on the "from" node argument, instead of leaving that up to the caller. This comes from the fact that of_find_compatible_node with a non-NULL "from" argument it only supposed to be used as the iterator function of for_each_compatible_node(). OF iterator functions call of_node_get on the next OF node and of_node_put() on the previous one. When of_find_compatible_node calls of_node_put, it actually never expects the refcount to drop to zero, because the call is done under the atomic devtree_lock context, and when the refcount drops to zero it triggers a kobject and a sysfs file deletion, which assume blocking context. So any driver call to of_find_compatible_node is probably buggy because an unexpected of_node_put() takes place. What should be done is to use the of_get_compatible_child() function. Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX") Link: https://lore.kernel.org/netdev/20210814010139.kzryimmp4rizlznt@skbuf/ Suggested-by: Frank Rowand <frowand.list@gmail.com> Suggested-by: Rob Herring <robh+dt@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16net: dsa: sja1105: reorganize probe, remove, setup and teardown orderingVladimir Oltean1-198/+199
The sja1105 driver's initialization and teardown sequence is a chaotic mess that has gathered a lot of cruft over time. It works because there is no strict dependency between the functions, but it could be improved. The basic principle that teardown should be the exact reverse of setup is obviously not held. We have initialization steps (sja1105_tas_setup, sja1105_flower_setup) in the probe method that are torn down in the DSA .teardown method instead of driver unbind time. We also have code after the dsa_register_switch() call, which implicitly means after the .setup() method has finished, which is pretty unusual. Also, sja1105_teardown() has calls set up in a different order than the error path of sja1105_setup(): see the reversed ordering between sja1105_ptp_clock_unregister and sja1105_mdiobus_unregister. Also, sja1105_static_config_load() is called towards the end of sja1105_setup(), but sja1105_static_config_free() is also towards the end of the error path and teardown path. The static_config_load() call should be earlier. Also, making and breaking the connections between struct sja1105_port and struct dsa_port could be refactored into dedicated functions, makes the code easier to follow. We move some code from the DSA .setup() method into the probe method, like the device tree parsing, and we move some code from the probe method into the DSA .setup() method to be symmetric with its placement in the DSA .teardown() method, which is nice because the unbind function has a single call to dsa_unregister_switch(). Example of the latter type of code movement are the connections between ports mentioned above, they are now in the .setup() method. Finally, due to fact that the kthread_init_worker() call is no longer in sja1105_probe() - located towards the bottom of the file - but in sja1105_setup() - located much higher - there is an inverse ordering with the worker function declaration, sja1105_port_deferred_xmit. To avoid that, the entire sja1105_setup() and sja1105_teardown() functions are moved towards the bottom of the file. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16net: mscc: ocelot: convert to phylinkVladimir Oltean2-85/+6
The felix DSA driver, which is a wrapper over the same hardware class as ocelot, is integrated with phylink, but ocelot is using the plain PHY library. It makes sense to bring together the two implementations, which is what this patch achieves. This is a large patch and hard to break up, but it does the following: The existing ocelot_adjust_link writes some registers, and felix_phylink_mac_link_up writes some registers, some of them are common, but both functions write to some registers to which the other doesn't. The main reasons for this are: - Felix switches so far have used an NXP PCS so they had no need to write the PCS1G registers that ocelot_adjust_link writes - Felix switches have the MAC fixed at 1G, so some of the MAC speed changes actually break the link and must be avoided. The naming conventions for the functions introduced in this patch are: - vsc7514_phylink_{mac_config,validate} are specific to the Ocelot instantiations and placed in ocelot_net.c which is built only for the ocelot switchdev driver. - ocelot_phylink_mac_link_{up,down} are shared between the ocelot switchdev driver and the felix DSA driver (they are put in the common lib). One by one, the registers written by ocelot_adjust_link are: DEV_MAC_MODE_CFG - felix_phylink_mac_link_up had no need to write this register since its out-of-reset value was fine and did not need changing. The write is moved to the common ocelot_phylink_mac_link_up and on felix it is guarded by a quirk bit that makes the written value identical with the out-of-reset one DEV_PORT_MISC - runtime invariant, was moved to vsc7514_phylink_mac_config PCS1G_MODE_CFG - same as above PCS1G_SD_CFG - same as above PCS1G_CFG - same as above PCS1G_ANEG_CFG - same as above PCS1G_LB_CFG - same as above DEV_MAC_ENA_CFG - both ocelot_adjust_link and ocelot_port_disable touched this. felix_phylink_mac_link_{up,down} also do. We go with what felix does and put it in ocelot_phylink_mac_link_up. DEV_CLOCK_CFG - ocelot_adjust_link and felix_phylink_mac_link_up both write this, but to different values. Move to the common ocelot_phylink_mac_link_up and make sure via the quirk that the old values are preserved for both. ANA_PFC_PFC_CFG - ocelot_adjust_link wrote this, felix_phylink_mac_link_up did not. Runtime invariant, speed does not matter since PFC is disabled via the RX_PFC_ENA bits which are cleared. Move to vsc7514_phylink_mac_config. QSYS_SWITCH_PORT_MODE_PORT_ENA - both ocelot_adjust_link and felix_phylink_mac_link_{up,down} wrote this. Ocelot also wrote this register from ocelot_port_disable. Keep what felix did, move in ocelot_phylink_mac_link_{up,down} and delete ocelot_port_disable. ANA_POL_FLOWC - same as above SYS_MAC_FC_CFG - same as above, except slight behavior change. Whereas ocelot always enabled RX and TX flow control, felix listened to phylink (for the most part, at least - see the 2500base-X comment). The registers which only felix_phylink_mac_link_up wrote are: SYS_PAUSE_CFG_PAUSE_ENA - this is why I am not sure that flow control worked on ocelot. Not it should, since the code is shared with felix where it does. ANA_PORT_PORT_CFG - this is a Frame Analyzer block register, phylink should be the one touching them, deleted. Other changes: - The old phylib registration code was in mscc_ocelot_init_ports. It is hard to work with 2 levels of indentation already in, and with hard to follow teardown logic. The new phylink registration code was moved inside ocelot_probe_port(), right between alloc_etherdev() and register_netdev(). It could not be done before (=> outside of) ocelot_probe_port() because ocelot_probe_port() allocates the struct ocelot_port which we then use to assign ocelot_port->phy_mode to. It is more preferable to me to have all PHY handling logic inside the same function. - On the same topic: struct ocelot_port_private :: serdes is only used in ocelot_port_open to set the SERDES protocol to Ethernet. This is logically a runtime invariant and can be done just once, when the port registers with phylink. We therefore don't even need to keep the serdes reference inside struct ocelot_port_private, or to use the devm variant of of_phy_get(). - Phylink needs a valid phy-mode for phylink_create() to succeed, and the existing device tree bindings in arch/mips/boot/dts/mscc/ocelot_pcb120.dts don't define one for the internal PHY ports. So we patch PHY_INTERFACE_MODE_NA into PHY_INTERFACE_MODE_INTERNAL. - There was a strategically placed: switch (priv->phy_mode) { case PHY_INTERFACE_MODE_NA: continue; which made the code skip the serdes initialization for the internal PHY ports. Frankly that is not all that obvious, so now we explicitly initialize the serdes under an "if" condition and not rely on code jumps, so everything is clearer. - There was a write of OCELOT_SPEED_1000 to DEV_CLOCK_CFG for QSGMII ports. Since that is in fact the default value for the register field DEV_CLOCK_CFG_LINK_SPEED, I can only guess the intention was to clear the adjacent fields, MAC_TX_RST and MAC_RX_RST, aka take the port out of reset, which does match the comment. I don't even want to know why this code is placed there, but if there is indeed an issue that all ports that share a QSGMII lane must all be up, then this logic is already buggy, since mscc_ocelot_init_ports iterates using for_each_available_child_of_node, so nobody prevents the user from putting a 'status = "disabled";' for some QSGMII ports which would break the driver's assumption. In any case, in the eventuality that I'm right, we would have yet another issue if ocelot_phylink_mac_link_down would reset those ports and that would be forbidden, so since the ocelot_adjust_link logic did not do that (maybe for a reason), add another quirk to preserve the old logic. The ocelot driver teardown goes through all ports in one fell swoop. When initialization of one port fails, the ocelot->ports[port] pointer for that is reset to NULL, and teardown is done only for non-NULL ports, so there is no reason to do partial teardowns, let the central mscc_ocelot_release_ports() do its job. Tested bind, unbind, rebind, link up, link down, speed change on mock-up hardware (modified the driver to probe on Felix VSC9959). Also regression tested the felix DSA driver. Could not test the Ocelot specific bits (PCS1G, SERDES, device tree bindings). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16net: dsa: felix: stop calling ocelot_port_{enable,disable}Vladimir Oltean1-19/+0
ocelot_port_enable touches ANA_PORT_PORT_CFG, which has the following fields: - LOCKED_PORTMOVE_CPU, LEARNDROP, LEARNCPU, LEARNAUTO, RECV_ENA, all of which are written with their hardware default values, also runtime invariants. So it makes no sense to write these during every .ndo_open. - PORTID_VAL: this field has an out-of-reset value of zero for all ports and must be initialized by software. Additionally, the ocelot_setup_logical_port_ids() code path sets up different logical port IDs for the ports in a hardware LAG, and we absolutely don't want .ndo_open to interfere there and reset those values. So in fact the write from ocelot_port_enable can better be moved to ocelot_init_port, and the .ndo_open hook deleted. ocelot_port_disable touches DEV_MAC_ENA_CFG and QSYS_SWITCH_PORT_MODE_PORT_ENA, in an attempt to undo what ocelot_adjust_link did. But since .ndo_stop does not get called each time the link falls (i.e. this isn't a substitute for .phylink_mac_link_down), felix already does better at this by writing those registers already in felix_phylink_mac_link_down. So keep ocelot_port_disable (for now, until ocelot is converted to phylink too), and just delete the felix call to it, which is not necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14ethernet: fix PTP_1588_CLOCK dependenciesArnd Bergmann3-0/+4
The 'imply' keyword does not do what most people think it does, it only politely asks Kconfig to turn on another symbol, but does not prevent it from being disabled manually or built as a loadable module when the user is built-in. In the ICE driver, the latter now causes a link failure: aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_eth_ioctl': ice_main.c:(.text+0x13b0): undefined reference to `ice_ptp_get_ts_config' ice_main.c:(.text+0x13b0): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_get_ts_config' aarch64-linux-ld: ice_main.c:(.text+0x13bc): undefined reference to `ice_ptp_set_ts_config' ice_main.c:(.text+0x13bc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_set_ts_config' aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_prepare_for_reset': ice_main.c:(.text+0x31fc): undefined reference to `ice_ptp_release' ice_main.c:(.text+0x31fc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_release' aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_rebuild': This is a recurring problem in many drivers, and we have discussed it several times befores, without reaching a consensus. I'm providing a link to the previous email thread for reference, which discusses some related problems. To solve the dependency issue better than the 'imply' keyword, introduce a separate Kconfig symbol "CONFIG_PTP_1588_CLOCK_OPTIONAL" that any driver can depend on if it is able to use PTP support when available, but works fine without it. Whenever CONFIG_PTP_1588_CLOCK=m, those drivers are then prevented from being built-in, the same way as with a 'depends on PTP_1588_CLOCK || !PTP_1588_CLOCK' dependency that does the same trick, but that can be rather confusing when you first see it. Since this should cover the dependencies correctly, the IS_REACHABLE() hack in the header is no longer needed now, and can be turned back into a normal IS_ENABLED() check. Any driver that gets the dependency wrong will now cause a link time failure rather than being unable to use PTP support when that is in a loadable module. However, the two recently added ptp_get_vclocks_index() and ptp_convert_timestamp() interfaces are only called from builtin code with ethtool and socket timestamps, so keep the current behavior by stubbing those out completely when PTP is in a loadable module. This should be addressed properly in a follow-up. As Richard suggested, we may want to actually turn PTP support into a 'bool' option later on, preventing it from being a loadable module altogether, which would be one way to solve the problem with the ethtool interface. Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices") Link: https://lore.kernel.org/netdev/20210804121318.337276-1-arnd@kernel.org/ Link: https://lore.kernel.org/netdev/CAK8P3a06enZOf=XyZ+zcAwBczv41UuCTz+=0FMf2gBz1_cOnZQ@mail.gmail.com/ Link: https://lore.kernel.org/netdev/CAK8P3a3=eOxE-K25754+fB_-i_0BZzf9a9RfPTX3ppSwu9WZXw@mail.gmail.com/ Link: https://lore.kernel.org/netdev/20210726084540.3282344-1-arnd@kernel.org/ Acked-by: Shannon Nelson <snelson@pensando.io> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210812183509.1362782-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski9-44/+185
Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12net: dsa: sja1105: unregister the MDIO buses during teardownVladimir Oltean1-0/+1
The call to sja1105_mdiobus_unregister is present in the error path but absent from the main driver unbind path. Fixes: 5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-12net: dsa: mt7530: fix VLAN traffic leaks againDENG Qingfang1-4/+1
When a port leaves a VLAN-aware bridge, the current code does not clear other ports' matrix field bit. If the bridge is later set to VLAN-unaware mode, traffic in the bridge may leak to that port. Remove the VLAN filtering check in mt7530_port_bridge_leave. Fixes: 474a2ddaa192 ("net: dsa: mt7530: fix VLAN traffic leaks") Fixes: 83163f7dca56 ("net: dsa: mediatek: add VLAN support for MT7530") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: sja1105: fix broken backpressure in .port_fdb_dumpVladimir Oltean1-1/+3
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into multiple netlink skbs if the buffer provided by user space is too small (one buffer will typically handle a few hundred FDB entries). When the current buffer becomes full, nlmsg_put() in dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that point, and then the dump resumes on the same port with a new skb, and FDB entries up to the saved index are simply skipped. Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to drivers, then drivers must check for the -EMSGSIZE error code returned by it. Otherwise, when a netlink skb becomes full, DSA will no longer save newly dumped FDB entries to it, but the driver will continue dumping. So FDB entries will be missing from the dump. Fix the broken backpressure by propagating the "cb" return code and allow rtnl_fdb_dump() to restart the FDB dump with a new skb. Fixes: 291d1e72b756 ("net: dsa: sja1105: Add support for FDB and MDB management") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: lantiq: fix broken backpressure in .port_fdb_dumpVladimir Oltean1-4/+10
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into multiple netlink skbs if the buffer provided by user space is too small (one buffer will typically handle a few hundred FDB entries). When the current buffer becomes full, nlmsg_put() in dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that point, and then the dump resumes on the same port with a new skb, and FDB entries up to the saved index are simply skipped. Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to drivers, then drivers must check for the -EMSGSIZE error code returned by it. Otherwise, when a netlink skb becomes full, DSA will no longer save newly dumped FDB entries to it, but the driver will continue dumping. So FDB entries will be missing from the dump. Fix the broken backpressure by propagating the "cb" return code and allow rtnl_fdb_dump() to restart the FDB dump with a new skb. Fixes: 58c59ef9e930 ("net: dsa: lantiq: Add Forwarding Database access") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: lan9303: fix broken backpressure in .port_fdb_dumpVladimir Oltean1-15/+19
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into multiple netlink skbs if the buffer provided by user space is too small (one buffer will typically handle a few hundred FDB entries). When the current buffer becomes full, nlmsg_put() in dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that point, and then the dump resumes on the same port with a new skb, and FDB entries up to the saved index are simply skipped. Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to drivers, then drivers must check for the -EMSGSIZE error code returned by it. Otherwise, when a netlink skb becomes full, DSA will no longer save newly dumped FDB entries to it, but the driver will continue dumping. So FDB entries will be missing from the dump. Fix the broken backpressure by propagating the "cb" return code and allow rtnl_fdb_dump() to restart the FDB dump with a new skb. Fixes: ab335349b852 ("net: dsa: lan9303: Add port_fast_age and port_fdb_dump methods") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: hellcreek: fix broken backpressure in .port_fdb_dumpVladimir Oltean1-2/+5
rtnl_fdb_dump() has logic to split a dump of PF_BRIDGE neighbors into multiple netlink skbs if the buffer provided by user space is too small (one buffer will typically handle a few hundred FDB entries). When the current buffer becomes full, nlmsg_put() in dsa_slave_port_fdb_do_dump() returns -EMSGSIZE and DSA saves the index of the last dumped FDB entry, returns to rtnl_fdb_dump() up to that point, and then the dump resumes on the same port with a new skb, and FDB entries up to the saved index are simply skipped. Since dsa_slave_port_fdb_do_dump() is pointed to by the "cb" passed to drivers, then drivers must check for the -EMSGSIZE error code returned by it. Otherwise, when a netlink skb becomes full, DSA will no longer save newly dumped FDB entries to it, but the driver will continue dumping. So FDB entries will be missing from the dump. Fix the broken backpressure by propagating the "cb" return code and allow rtnl_fdb_dump() to restart the FDB dump with a new skb. Fixes: e4b27ebc780f ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Don't use phy_port_cnt in VLAN table lookupBen Hutchings1-4/+4
The magic number 4 in VLAN table lookup was the number of entries we can read and write at once. Using phy_port_cnt here doesn't make sense and presumably broke VLAN filtering for 3-port switches. Change it back to 4. Fixes: 4ce2a984abd8 ("net: dsa: microchip: ksz8795: use phy_port_cnt ...") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Fix VLAN filteringBen Hutchings1-0/+11
Currently ksz8_port_vlan_filtering() sets or clears the VLAN Enable hardware flag. That controls discarding of packets with a VID that has not been enabled for any port on the switch. Since it is a global flag, set the dsa_switch::vlan_filtering_is_global flag so that the DSA core understands this can't be controlled per port. When VLAN filtering is enabled, the switch should also discard packets with a VID that's not enabled on the ingress port. Set or clear each external port's VLAN Ingress Filter flag in ksz8_port_vlan_filtering() to make that happen. Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Use software untagging on CPU portBen Hutchings1-1/+8
On the CPU port, we can support both tagged and untagged VLANs at the same time by doing any necessary untagging in software rather than hardware. To enable that, keep the CPU port's Remove Tag flag cleared and set the dsa_switch::untag_bridge_pvid flag. Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Fix VLAN untagged flag change on deletionBen Hutchings1-3/+0
When a VLAN is deleted from a port, the flags in struct switchdev_obj_port_vlan are always 0. ksz8_port_vlan_del() copies the BRIDGE_VLAN_INFO_UNTAGGED flag to the port's Tag Removal flag, and therefore always clears it. In case there are multiple VLANs configured as untagged on this port - which seems useless, but is allowed - deleting one of them changes the remaining VLANs to be tagged. It's only ever necessary to change this flag when a VLAN is added to the port, so leave it unchanged in ksz8_port_vlan_del(). Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Reject unsupported VLAN configurationBen Hutchings2-1/+27
The switches supported by ksz8795 only have a per-port flag for Tag Removal. This means it is not possible to support both tagged and untagged VLANs on the same port. Reject attempts to add a VLAN that requires the flag to be changed, unless there are no VLANs currently configured. VID 0 is excluded from this check since it is untagged regardless of the state of the flag. On the CPU port we could support tagged and untagged VLANs at the same time. This will be enabled by a later patch. Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: ksz8795: Fix PVID tag insertionBen Hutchings2-7/+23
ksz8795 has never actually enabled PVID tag insertion, and it also programmed the PVID incorrectly. To fix this: * Allow tag insertion to be controlled per ingress port. On most chips, set bit 2 in Global Control 19. On KSZ88x3 this control flag doesn't exist. * When adding a PVID: - Set the appropriate register bits to enable tag insertion on egress at every other port if this was the packet's ingress port. - Mask *out* the VID from the default tag, before or-ing in the new PVID. * When removing a PVID: - Clear the same control bits to disable tag insertion. - Don't update the default tag. This wasn't doing anything useful. Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-10net: dsa: microchip: Fix ksz_read64()Ben Hutchings1-6/+2
ksz_read64() currently does some dubious byte-swapping on the two halves of a 64-bit register, and then only returns the high bits. Replace this with a straightforward expression. Fixes: e66f840c08a2 ("net: dsa: ksz: Add Microchip KSZ8795 DSA driver") Signed-off-by: Ben Hutchings <ben.hutchings@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08net: dsa: sja1105: add FDB fast ageing supportVladimir Oltean1-0/+41
Delete the dynamically learned FDB entries when the STP state changes and when address learning is disabled. On sja1105 there is no shorthand SPI command for this, so we need to walk through the entire FDB to delete. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08net: dsa: sja1105: rely on DSA core tracking of port learning stateVladimir Oltean2-20/+13
Now that DSA keeps track of the port learning state, it becomes superfluous to keep an additional variable with this information in the sja1105 driver. Remove it. The DSA core's learning state is present in struct dsa_port *dp. To avoid the antipattern where we iterate through a DSA switch's ports and then call dsa_to_port to obtain the "dp" reference (which is bad because dsa_to_port iterates through the DSA switch tree once again), just iterate through the dst->ports and operate on those directly. The sja1105 had an extra use of priv->learn_ena on non-user ports. DSA does not touch the learning state of those ports - drivers are free to do what they wish on them. Mark that information with a comment in struct dsa_port and let sja1105 set dp->learning for cascade ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08net: dsa: centralize fast ageing when address learning is turned offVladimir Oltean1-7/+0
Currently DSA leaves it down to device drivers to fast age the FDB on a port when address learning is disabled on it. There are 2 reasons for doing that in the first place: - when address learning is disabled by user space, through IFLA_BRPORT_LEARNING or the brport_attr_learning sysfs, what user space typically wants to achieve is to operate in a mode with no dynamic FDB entry on that port. But if the port is already up, some addresses might have been already learned on it, and it seems silly to wait for 5 minutes for them to expire until something useful can be done. - when a port leaves a bridge and becomes standalone, DSA turns off address learning on it. This also has the nice side effect of flushing the dynamically learned bridge FDB entries on it, which is a good idea because standalone ports should not have bridge FDB entries on them. We let drivers manage fast ageing under this condition because if DSA were to do it, it would need to track each port's learning state, and act upon the transition, which it currently doesn't. But there are 2 reasons why doing it is better after all: - drivers might get it wrong and not do it (see b53_port_set_learning) - we would like to flush the dynamic entries from the software bridge too, and letting drivers do that would be another pain point So track the port learning state and trigger a fast age process automatically within DSA. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-07net: dsa: qca: ar9331: make proper initial port defaultsOleksij Rempel1-1/+72
Make sure that all external port are actually isolated from each other, so no packets are leaked. Fixes: ec6698c272de ("net: dsa: add support for Atheros AR9331 built-in switch") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06net: dsa: mt7530: add the missing RxUnicast MIB counterDENG Qingfang1-0/+1
Add the missing RxUnicast counter. Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06net: dsa: mt7530: drop untagged frames on VLAN-aware ports without PVIDDENG Qingfang2-2/+37
The driver currently still accepts untagged frames on VLAN-aware ports without PVID. Use PVC.ACC_FRM to drop untagged frames in that case. Signed-off-by: DENG Qingfang <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06net: dsa: don't disable multicast flooding to the CPU even without an IGMP ↵Vladimir Oltean4-31/+0
querier Commit 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER attribute") added an option for users to turn off multicast flooding towards the CPU if they turn off the IGMP querier on a bridge which already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router). And commit a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags") simply papered over that issue, because it moved the decision to flood the CPU with multicast (or not) from the DSA core down to individual drivers, instead of taking a more radical position then. The truth is that disabling multicast flooding to the CPU is simply something we are not prepared to do now, if at all. Some reasons: - ICMP6 neighbor solicitation messages are unregistered multicast packets as far as the bridge is concerned. So if we stop flooding multicast, the outside world cannot ping the bridge device's IPv6 link-local address. - There might be foreign interfaces bridged with our DSA switch ports (sending a packet towards the host does not necessarily equal termination, but maybe software forwarding). So if there is no one interested in that multicast traffic in the local network stack, that doesn't mean nobody is. - PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as the bridge is concerned. This should reach the CPU port. - The switch driver might not do FDB partitioning. And since we don't even bother to do more fine-grained flood disabling (such as "disable flooding _from_port_N_ towards the CPU port" as opposed to "disable flooding _from_any_port_ towards the CPU port"), this breaks standalone ports, or even multiple bridges where one has an IGMP querier and one doesn't. Reverting the logic makes all of the above work. Fixes: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags") Fixes: 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER attribute") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06net: dsa: mt7530: remove the .port_set_mrouter implementationVladimir Oltean1-13/+0
DSA's idea of optimizing out multicast flooding to the CPU port leaves quite a few holes open, so it should be reverted. The mt7530 driver is the only new driver which added a .port_set_mrouter implementation after the reorg from commit a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags"), so it needs to be reverted separately so that the other revert commit can go a bit further down the git history. Fixes: 5a30833b9a16 ("net: dsa: mt7530: support MDB and bridge flag operations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-40/+95
Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") 0fac6aa098ed ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-05net: dsa: sja1105: enable address learning on cascade portsVladimir Oltean1-2/+7
Right now, address learning is disabled on DSA ports, which means that a packet received over a DSA port from a cross-chip switch will be flooded to unrelated ports. It is desirable to eliminate that, but for that we need a breakdown of the possibilities for the sja1105 driver. A DSA port can be: - a downstream-facing cascade port. This is simple because it will always receive packets from a downstream switch, and there should be no other route to reach that downstream switch in the first place, which means it should be safe to learn that MAC address towards that switch. - an upstream-facing cascade port. This receives packets either: * autonomously forwarded by an upstream switch (and therefore these packets belong to the data plane of a bridge, so address learning should be ok), or * injected from the CPU. This deserves further discussion, as normally, an upstream-facing cascade port is no different than the CPU port itself. But with "H" topologies (a DSA link towards a switch that has its own CPU port), these are more "laterally-facing" cascade ports than they are "upstream-facing". Here, there is a risk that the port might learn the host addresses on the wrong port (on the DSA port instead of on its own CPU port), but this is solved by DSA's RX filtering infrastructure, which installs the host addresses as static FDB entries on the CPU port of all switches in a "H" tree. So even if there will be an attempt from the switch to migrate the FDB entry from the CPU port to the laterally-facing cascade port, it will fail to do that, because the FDB entry that already exists is static and cannot migrate. So address learning should be safe for this configuration too. Ok, so what about other MAC addresses coming from the host, not necessarily the bridge local FDB entries? What about MAC addresses dynamically learned on foreign interfaces, isn't there a risk that cascade ports will learn these entries dynamically when they are supposed to be delivered towards the CPU port? Well, that is correct, and this is why we also need to enable the assisted learning feature, to snoop for these addresses and write them to hardware as static FDB entries towards the CPU, to make the switch's learning process on the cascade ports ineffective for them. With assisted learning enabled, the hardware learning on the CPU port must be disabled. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05net: dsa: sja1105: suppress TX packets from looping back in "H" topologiesVladimir Oltean1-0/+29
H topologies like this one have a problem: eth0 eth1 | | CPU port CPU port | DSA link | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 -------- sw1p4 sw1p3 sw1p2 sw1p1 sw1p0 | | | | | | user user user user user user port port port port port port Basically any packet sent by the eth0 DSA master can be flooded on the interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the eth1 DSA master too. Basically we are talking to ourselves. In VLAN-unaware mode, these packets are encoded using a tag_8021q TX VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains. Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN which _can_ be decoded by the tagger running on eth1, so it will attempt to reinject that packet into the network stack (the bridge, if there is any port under eth1 that is under a bridge). In the case where the ports under eth1 are under the same cross-chip bridge as the ports under eth0, the TX packets will even be learned as RX packets. The only thing that will prevent loops with the software bridging path, and therefore disaster, is that the source port and the destination port are in the same hardware domain, and the bridge will receive packets from the driver with skb->offload_fwd_mark = true and will not forward between the two. The proper solution to this problem is to detect H topologies and enforce that all packets are received through the local switch and we do not attempt to receive packets on our CPU port from switches that have their own. This is a viable solution which works thanks to the fact that MAC addresses which should be filtered towards the host are installed by DSA as static MAC addresses towards the CPU port of each switch. TX from a CPU port towards the DSA port continues to be allowed, this is because sja1105 supports bridge TX forwarding offload, and the skb->dev used initially for xmit does not have any direct correlation with where the station that will respond to that packet is connected. It may very well happen that when we send a ping through a br0 interface that spans all switch ports, the xmit packet will exit the system through a DSA switch interface under eth1 (say sw1p2), but the destination station is connected to a switch port under eth0, like sw0p0. So the switch under eth1 needs to communicate on TX with the switch under eth0. The response, however, will not follow the same path, but instead, this patch enforces that the response is sent by the first switch directly to its DSA master which is eth0. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05net: dsa: sja1105: increase MTU to account for VLAN header on DSA portsVladimir Oltean1-2/+2
Since all packets are transmitted as VLAN-tagged over a DSA link (this VLAN tag represents the tag_8021q header), we need to increase the MTU of these interfaces to account for the possibility that we are already transporting a user-visible VLAN header. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05net: dsa: sja1105: manage VLANs on cascade portsVladimir Oltean1-3/+3
Since commit ed040abca4c1 ("net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic"), this driver uses a reserved value as pvid for the host port (DSA CPU port). Control packets which are sent as untagged get classified to this VLAN, and all ports are members of it (this is to be expected for control packets). Manage all cascade ports in the same way and allow control packets to egress everywhere. Also, all VLANs need to be sent as egress-tagged on all cascade ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05net: dsa: sja1105: manage the forwarding domain towards DSA portsVladimir Oltean1-24/+60
Manage DSA links towards other switches, be they host ports or cascade ports, the same as the CPU port, i.e. allow forwarding and flooding unconditionally from all user ports. We send packets as always VLAN-tagged on a DSA port, and we rely on the cross-chip notifiers from tag_8021q to install the RX VLAN of a switch port only on the proper remote ports of another switch (the ports that are in the same bridging domain). So if there is no cross-chip bridging in the system, the flooded packets will be sent on the DSA ports too, but they will be dropped by the remote switches due to either (a) a lack of the RX VLAN in the VLAN table of the ingress DSA port, or (b) a lack of valid destinations for those packets, due to a lack of the RX VLAN on the user ports of the switch Note that switches which only transport packets in a cross-chip bridge, but have no user ports of their own as part of that bridge, such as switch 1 in this case: DSA link DSA link sw0p0 sw0p1 sw0p2 -------- sw1p0 sw1p2 sw1p3 -------- sw2p0 sw2p2 sw2p3 ip link set sw0p0 master br0 ip link set sw2p3 master br0 will still work, because the tag_8021q cross-chip notifiers keep the RX VLANs installed on all DSA ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05net: dsa: sja1105: configure the cascade ports based on topologyVladimir Oltean1-27/+70
The sja1105 switch family has a feature called "cascade ports" which can be used in topologies where multiple SJA1105/SJA1110 switches are daisy chained. Upstream switches set this bit for the DSA link towards the downstream switches. This is used when the upstream switch receives a control packet (PTP, STP) from a downstream switch, because if the source port for a control packet is marked as a cascade port, then the source port, switch ID and RX timestamp will not be taken again on the upstream switch, it is assumed that this has already been done by the downstream switch (the leaf port in the tree) and that the CPU has everything it needs to decode the information from this packet. We need to distinguish between an upstream-facing DSA link and a downstream-facing DSA link, because the upstream-facing DSA links are "host ports" for the SJA1105/SJA1110 switches, and the downstream-facing DSA links are "cascade ports". Note that SJA1105 supports a single cascade port, so only daisy chain topologies work. With SJA1110, there can be more complex topologies such as: eth0 | host port | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 | | | | cascade cascade user user port port port port | | | | | | | host | port | | | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 | | | | | | user user user user host port port port port port | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 | | | | user user user user port port port port Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04net: dsa: mt7530: always install FDB entries with IVL and FID 1DENG Qingfang2-2/+3
This reverts commit 7e777021780e ("mt7530 mt7530_fdb_write only set ivl bit vid larger than 1"). Before this series, the default value of all ports' PVID is 1, which is copied into the FDB entry, even if the ports are VLAN unaware. So `bridge fdb show` will show entries like `dev swp0 vlan 1 self` even on a VLAN-unaware bridge. The blamed commit does not solve that issue completely, instead it may cause a new issue that FDB is inaccessible in a VLAN-aware bridge with PVID 1. This series sets PVID to 0 on VLAN-unaware ports, so `bridge fdb show` will no longer print `vlan 1` on VLAN-unaware bridges, and that special case in fdb_write is not required anymore. Set FDB entries' filter ID to 1 to match the VLAN table. Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04net: dsa: mt7530: set STP state on filter ID 1DENG Qingfang2-3/+4
As filter ID 1 is the only one used for bridges, set STP state on it. Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridgesDENG Qingfang2-21/+59
Consider the following bridge configuration, where bond0 is not offloaded: +-- br0 --+ / / | \ / / | \ / | | bond0 / | | / \ swp0 swp1 swp2 swp3 swp4 . . . . . . A B C Ideally, when the switch receives a packet from swp3 or swp4, it should forward the packet to the CPU, according to the port matrix and unknown unicast flood settings. But packet loss will happen if the destination address is at one of the offloaded ports (swp0~2). For example, when client C sends a packet to A, the FDB lookup will indicate that it should be forwarded to swp0, but the port matrix of swp3 and swp4 is configured to only allow the CPU to be its destination, so it is dropped. However, this issue does not happen if the bridge is VLAN-aware. That is because VLAN-aware bridges use independent VLAN learning, i.e. use VID for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded, shared VLAN learning with default filter ID of 0 is used instead. So the lookup for A with filter ID 0 never hits and the packet can be forwarded to the CPU. In the current code, only two combinations were used to toggle user ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to security mode with PVC.VLAN_ATTR set to user port. It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and port matrix mode just skips the VLAN table lookup. The reference manual is somehow misleading when describing PORT_VLAN modes. It states that PORT_MEM (VLAN port member) is used for destination if the VLAN table lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of VLAN port member and port matrix) is used instead, which means we can have two or more separate VLAN-aware bridges with the same PVID and traffic won't leak between them. Therefore, to solve this, enable independent VLAN learning with PVID 0 on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback mode, while leaving standalone ports in port matrix mode. The CPU port is always set to fallback mode to serve those bridges. During testing, it is found that FDB lookup with filter ID of 0 will also hit entries with VID 0 even with independent VLAN learning. To avoid that, install all VLANs with filter ID of 1. Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>