path: root/drivers/net/dsa
Age  Commit message  Author  Files  Lines
2022-09-15  net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet  Vladimir Oltean  1  -4/+31

[ Upstream commit 11afdc6526de0e0368c05da632a8c0d29fc60bb8 ]

The blamed commit broke tc-taprio schedules such as this one:

tc qdisc replace dev $swp1 root taprio \
	num_tc 8 \
	map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
	base-time 0 \
	sched-entry S 0x7f 990000 \
	sched-entry S 0x80 10000 \
	flags 0x2

because the gate entry for TC 7 (S 0x80 10000 ns) now has a static guard band added earlier than its 'gate close' event, such that packet overruns won't occur in the worst case of the largest packet possible.

Since guard bands are statically determined based on the per-tc QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU, we need to discuss what happens with TC 7 depending on kernel version, since the driver, prior to commit 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port"), did not touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU.

1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults to 1518, and at gigabit this introduces a static guard band (independent of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit time of 20 octets => 160 ns). But this is larger than the time window itself, of 10000 ns. So, the queue system never considers a frame with TC 7 as eligible for transmission, since the gate practically never opens, and these frames are forever stuck in the TX queues and hang the port.

2 (after vsc9959_tas_guard_bands_update): Under the sole goal of enabling oversized frame dropping, we make an effort to set QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays one more role, which we did not take into account: per-tc static guard band, expressed in L2 byte time (auto-adjusted for FCS and L1 overhead). There is a discrepancy between what the driver thinks (that there is no guard band, and 100% of min_gate_len[tc] is available for egress scheduling) and what the hardware actually does (crops the equivalent of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). In practice, this means that the hardware thinks it has exactly 0 ns for scheduling tc 7.

In both cases, even minimum sized Ethernet frames are stuck on egress rather than being considered for scheduling on TC 7, even if they would fit given a proper configuration. Considering the current situation, with vsc9959_tas_guard_bands_update(), frames between 60 octets and 1230 octets in size are not eligible for oversized dropping (because they are smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible for scheduling either, because the min_gate_len[7] (10000 ns) minus the guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per octet == 9840 ns) minus the guard band auto-added for L1 overhead by QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 ns) leaves 0 ns for scheduling in the queue system proper.

Investigating the hardware behavior, it becomes apparent that the queue system needs precisely 33 ns of 'gate open' time in order to consider a frame as eligible for scheduling to a tc. So the solution to this problem is to amend vsc9959_tas_guard_bands_update(), by giving the per-tc guard bands less space by exactly 33 ns, just enough for one frame to be scheduled in that interval. This allows the queue system to make forward progress for that port-tc, and prevents it from hanging.

Fixes: 297c4de6f780 ("net: dsa: felix: re-enable TAS guard band mode")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
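The guard band arithmetic quoted in this commit message can be checked with a few lines of Python. This is only a sketch of the numbers in the text, assuming gigabit speed (8 ns per octet); the function and constant names are made up for illustration and are not driver identifiers, and how the driver rounds the 33 ns reserve into a register value is not modelled here.

```python
# Illustrative arithmetic only; these names are hypothetical, not driver symbols.
NS_PER_OCTET_1G = 8    # one octet takes 8 ns on the wire at 1 Gbps
FRM_ADJ_OCTETS = 20    # L1 overhead auto-added via QSYS::HSCH_MISC_CFG.FRM_ADJ
MIN_OPEN_NS = 33       # 'gate open' time the queue system needs, per the text


def remaining_ns(gate_len_ns, max_sdu_octets, reserve_ns=0):
    """Gate time left once the static guard band is subtracted.

    reserve_ns shrinks the guard band, which is the idea behind the fix.
    """
    guard_ns = (max_sdu_octets + FRM_ADJ_OCTETS) * NS_PER_OCTET_1G - reserve_ns
    return gate_len_ns - guard_ns


# Case 1: port default of 1518 octets, guard band exceeds the 10000 ns window.
case1 = remaining_ns(10000, 1518)
# Case 2: QSYS_QMAXSDU_CFG_7 = 1230 octets, exactly 0 ns left.
case2 = remaining_ns(10000, 1230)
# With the guard band shrunk by 33 ns, one frame can be scheduled.
fixed = remaining_ns(10000, 1230, reserve_ns=MIN_OPEN_NS)
```

A negative or zero result means the gate never effectively opens for that traffic class; a result of at least 33 ns lets the port-tc make forward progress.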
2022-09-15time64.h: consolidate uses of PSEC_PER_NSECVladimir Oltean1-2/+3
[ Upstream commit 837ced3a1a5d8bb1a637dd584711f31ae6b54d93 ] Time-sensitive networking code needs to work with PTP times expressed in nanoseconds, and with packet transmission times expressed in picoseconds, since those would be fractional at higher than gigabit speed when expressed in nanoseconds. Convert the existing uses in tc-taprio and the ocelot/felix DSA driver to a PSEC_PER_NSEC macro. This macro is placed in include/linux/time64.h as opposed to its relatives (PSEC_PER_SEC etc) from include/vdso/time64.h because the vDSO library does not (yet) need/use it. Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for the vDSO parts Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Sasha Levin <sashal@kernel.org>
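To see why picoseconds are needed above gigabit speed, consider the time one octet spends on the wire. The sketch below assumes PSEC_PER_NSEC is 1000 and PSEC_PER_SEC is 10^12 (their arithmetic meaning); octet_time_ps() is a hypothetical helper written for this illustration only.

```python
PSEC_PER_NSEC = 1000
PSEC_PER_SEC = 10**12


def octet_time_ps(speed_bps):
    """Transmission time of one octet, in picoseconds, at the given link speed."""
    return 8 * PSEC_PER_SEC // speed_bps


gigabit = octet_time_ps(1_000_000_000)       # a whole number of nanoseconds
two_and_half = octet_time_ps(2_500_000_000)  # fractional when expressed in ns
```

At 1 Gbps the result divides evenly by PSEC_PER_NSEC, while at 2.5 Gbps it does not, which is the case the commit message describes.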
2022-09-15  net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set  Vladimir Oltean  1  -2/+2

[ Upstream commit a4bb481aeb9d84cb53112a478e6db4705b794c34 ]

The read-modify-write of QSYS_TAG_CONFIG from vsc9959_sched_speed_set() runs unlocked with respect to the other functions that access it, which are vsc9959_tas_guard_bands_update(), vsc9959_qos_port_tas_set() and vsc9959_tas_clock_adjust(). All the others are under ocelot->tas_lock, so move the vsc9959_sched_speed_set() access under that lock as well, to resolve the concurrency.

Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15net: dsa: felix: disable cut-through forwarding for frames oversized for ↵Vladimir Oltean1-43/+79
tc-taprio [ Upstream commit 843794bbdef83955ae5b43dfafc355c3786e2145 ] Experimentally, it looks like when QSYS_QMAXSDU_CFG_7 is set to 605, frames even way larger than 601 octets are transmitted even though these should be considered as oversized, according to the documentation, and dropped. Since oversized frame dropping depends on frame size, which is only known at the EOF stage, and therefore not at SOF when cut-through forwarding begins, it means that the switch cannot take QSYS_QMAXSDU_CFG_* into consideration for traffic classes that are cut-through. Since cut-through forwarding has no UAPI to control it, and the driver enables it based on the mantra "if we can, then why not", the strategy is to alter vsc9959_cut_through_fwd() to take into consideration which tc's have oversize frame dropping enabled, and disable cut-through for them. Then, from vsc9959_tas_guard_bands_update(), we re-trigger the cut-through determination process. There are 2 strategies for vsc9959_cut_through_fwd() to determine whether a tc has oversized dropping enabled or not. One is to keep a bit mask of traffic classes per port, and the other is to read back from the hardware registers (a non-zero value of QSYS_QMAXSDU_CFG_* means the feature is enabled). We choose reading back from registers, because struct ocelot_port is shared with drivers (ocelot, seville) that don't support either cut-through nor tc-taprio, and we don't have a felix specific extension of struct ocelot_port. Furthermore, reading registers from the Felix hardware is quite cheap, since they are memory-mapped. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
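The register read-back strategy described above can be sketched as follows. This is a simplified model, not the body of vsc9959_cut_through_fwd(): the list of per-tc values stands in for reading QSYS_QMAXSDU_CFG_* from hardware, and cut_through_mask() is a hypothetical name.

```python
def cut_through_mask(candidate_tc_mask, qmaxsdu_per_tc):
    """Clear cut-through for every tc whose max SDU value is non-zero.

    A non-zero value means oversized frame dropping is enabled for that tc,
    and the frame size is not yet known when cut-through forwarding begins.
    """
    mask = candidate_tc_mask
    for tc, max_sdu in enumerate(qmaxsdu_per_tc):
        if max_sdu != 0:
            mask &= ~(1 << tc)
    return mask


# Only tc 7 has oversized dropping enabled (605 is the value from the text).
only_tc7_limited = cut_through_mask(0xFF, [0, 0, 0, 0, 0, 0, 0, 605])
nothing_limited = cut_through_mask(0xFF, [0] * 8)
```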
2022-09-08net: dsa: xrs700x: Use irqsave variant for u64 stats updateSebastian Andrzej Siewior1-2/+3
[ Upstream commit 3f8ae9fe0409698799e173f698b714f34570b64b ]

xrs700x_read_port_counters() updates the stats from a worker using the u64_stats_update_begin() version. This is okay on 32bit-UP since on the reader side preemption is disabled. On 32bit-SMP the writer can be preempted by the reader at which point the reader will spin on the seqcount until the writer continues and completes the update.

Assigning the mib_mutex mutex to the underlying seqcount would ensure proper synchronisation. The API for that on the u64_stats_init() side isn't available. Since it is the only user, just disable interrupts during the update.

Use u64_stats_update_begin_irqsave() on the writer side to ensure an uninterrupted update.

Fixes: ee00b24f32eb8 ("net: dsa: add Arrow SpeedChips XRS700x driver")
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: George McCollister <george.mccollister@gmail.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: keep compatibility with device tree blobs with no phy-modeVladimir Oltean1-1/+7
[ Upstream commit 5fbb08eb7f945c7e8896ea39f03143ce66dfa4c7 ] DSA has multiple ways of specifying a MAC connection to an internal PHY. One requires a DT description like this: port@0 { reg = <0>; phy-handle = <&internal_phy>; phy-mode = "internal"; }; (which is IMO the recommended approach, as it is the clearest description) but it is also possible to leave the specification as just: port@0 { reg = <0>; } and if the driver implements ds->ops->phy_read and ds->ops->phy_write, the DSA framework "knows" it should create a ds->slave_mii_bus, and it should connect to a non-OF-based internal PHY on this MDIO bus, at an MDIO address equal to the port address. There is also an intermediary way of describing things: port@0 { reg = <0>; phy-handle = <&internal_phy>; }; In case 2, DSA calls phylink_connect_phy() and in case 3, it calls phylink_of_phy_connect(). In both cases, phylink_create() has been called with a phy_interface_t of PHY_INTERFACE_MODE_NA, and in both cases, PHY_INTERFACE_MODE_NA is translated into phy->interface. It is important to note that phy_device_create() initializes dev->interface = PHY_INTERFACE_MODE_GMII, and so, when we use phylink_create(PHY_INTERFACE_MODE_NA), no one will override this, and we will end up with a PHY_INTERFACE_MODE_GMII interface inherited from the PHY. All this means that in order to maintain compatibility with device tree blobs where the phy-mode property is missing, we need to allow the "gmii" phy-mode and treat it as "internal". 
Fixes: 2c709e0bdad4 ("net: dsa: microchip: ksz8795: add phylink support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216320 Reported-by: Craig McQueen <craig@mcqueen.id.au> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Tested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Link: https://lore.kernel.org/r/20220818143250.2797111-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: update the ksz_phylink_get_capsArun Ramadoss4-10/+11
[ Upstream commit 7012033ce10e0968e6cb82709aa0ed7f2080b61e ] This patch assigns the phylink_get_caps in ksz8795 and ksz9477 to ksz_phylink_get_caps. And update their mac_capabilities in the respective ksz_dev_ops. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: move the port mirror to ksz_commonArun Ramadoss4-13/+45
[ Upstream commit 00a298bbc23876288b1cd04c38752d8e7ed53ae2 ] This patch updates the common port mirror add/del dsa_switch_ops in ksz_common.c. The individual switches implementation is executed based on the ksz_dev_ops function pointers. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: move vlan functionality to ksz_commonArun Ramadoss4-20/+69
[ Upstream commit f0d997e31bb307c7aa046c4992c568547fd25195 ] This patch moves the vlan dsa_switch_ops such as vlan_add, vlan_del and vlan_filtering from the individual files ksz8795.c, ksz9477.c to ksz_common.c file. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: move tag_protocol to ksz_commonArun Ramadoss4-25/+28
[ Upstream commit 534a0431e9e68959e2c0d71c141d5b911d66ad7c ] This patch moves the dsa hook get_tag_protocol to the ksz_common file, and the tag_protocol is returned based on the dev->chip_id. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: move switch chip_id detection to ksz_commonArun Ramadoss6-90/+93
[ Upstream commit 91a98917a8839923d404a77c21646ca5fc9e330a ] KSZ87xx and KSZ88xx have chip_id representation at reg location 0. And KSZ9477 compatible switch and LAN937x switch have same chip_id detection at location 0x01 and 0x02. To have the common switch detect functionality for ksz switches, ksz_switch_detect function is introduced. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-31net: dsa: microchip: ksz9477: cleanup the ksz9477_switch_detectArun Ramadoss1-26/+22
[ Upstream commit 27faa0aa85f6696d411bbbebaed9f0f723c2a175 ] The ksz9477_switch_detect function reads the chip id from location 0x00 and also checks gigabit compatibility and the number of ports based on the register global_options0. To prepare for the common ksz switch detect function, everything other than the chip id read is moved to ksz9477_switch_init. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25net: mscc: ocelot: make struct ocelot_stat_layout array indexableVladimir Oltean2-190/+746
[ Upstream commit 9190460084ddd0e9235f55eab0fdd5456b5f2fd5 ] The ocelot counters are 32-bit and require periodic reading, every 2 seconds, by ocelot_port_update_stats(), so that wraparounds are detected. Currently, the counters reported by ocelot_get_stats64() come from the 32-bit hardware counters directly, rather than from the 64-bit accumulated ocelot->stats, and this is a problem for their integrity. The strategy is to make ocelot_get_stats64() able to cherry-pick individual stats from ocelot->stats the way in which it currently reads them out from SYS_COUNT_* registers. But currently it can't, because ocelot->stats is an opaque u64 array that's used only to feed data into ethtool -S. To solve that problem, we need to make ocelot->stats indexable, and associate each element with an element of struct ocelot_stat_layout used by ethtool -S. This makes ocelot_stat_layout a fat (and possibly sparse) array, so we need to change the way in which we access it. We no longer need OCELOT_STAT_END as a sentinel, because we know the array's size (OCELOT_NUM_STATS). We just need to skip the array elements that were left unpopulated for the switch revision (ocelot, felix, seville). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
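The reason 32-bit counters need periodic reading is that a 64-bit total can only be reconstructed if at most one wraparound happens between two polls. A minimal sketch of that accumulation, assuming the low 32 bits of the running total mirror the last hardware reading (accumulate() is a hypothetical helper, not a driver function):

```python
U32_MASK = 0xFFFFFFFF


def accumulate(total64, new32):
    """Fold a fresh 32-bit hardware reading into a 64-bit running total."""
    # The low 32 bits of the total hold the previous hardware reading.
    if new32 < (total64 & U32_MASK):
        # The hardware counter wrapped once since the last poll.
        total64 += 1 << 32
    return (total64 & ~U32_MASK) | new32


no_wrap = accumulate(5, 9)
wrapped = accumulate(0xFFFFFFF0, 0x10)
```

If the polling interval is longer than the time the counter needs to wrap twice, the second wrap goes unnoticed, which is why the interval matters for the integrity of the reported stats.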
2022-08-25net: mscc: ocelot: turn stats_lock into a spinlockVladimir Oltean1-2/+2
[ Upstream commit 22d842e3efe56402c33b5e6e303bb71ce9bf9334 ] ocelot_get_stats64() currently runs unlocked and therefore may collide with ocelot_port_update_stats() which indirectly accesses the same counters. However, ocelot_get_stats64() runs in atomic context, and we cannot simply take the sleepable ocelot->stats_lock mutex. We need to convert it to an atomic spinlock first. Do that as a preparatory change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25net: dsa: sja1105: fix buffer overflow in sja1105_setup_devlink_regions()Rustam Subkhankulov1-1/+1
commit fd8e899cdb5ecaf8e8ee73854a99e10807eef1de upstream. If an error occurs in dsa_devlink_region_create(), then 'priv->regions' array will be accessed by negative index '-1'. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Rustam Subkhankulov <subkhankulov@ispras.ru> Fixes: bf425b82059e ("net: dsa: sja1105: expose static config as devlink region") Link: https://lore.kernel.org/r/20220817003845.389644-1-subkhankulov@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25net: mscc: ocelot: fix incorrect ndo_get_stats64 packet countersVladimir Oltean2-15/+21
commit 5152de7b79ab0be150f5966481b0c8f996192531 upstream. Reading stats using the SYS_COUNT_* register definitions is only used by ocelot_get_stats64() from the ocelot switchdev driver, however, currently the bucket definitions are incorrect. Separately, on both RX and TX, we have the following problems: - a 256-1023 bucket which actually tracks the 256-511 packets - the 1024-1526 bucket actually tracks the 512-1023 packets - the 1527-max bucket actually tracks the 1024-1526 packets => nobody tracks the packets from the real 1527-max bucket Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS all track the wrong thing. However this doesn't seem to have any consequence, since ocelot_get_stats64() doesn't use these. Even though this problem only manifests itself for the switchdev driver, we cannot split the fix for ocelot and for DSA, since it requires fixing the bucket definitions from enum ocelot_reg, which makes us necessarily adapt the structures from felix and seville as well. Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch") Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet countersVladimir Oltean1-1/+2
commit 40d21c4565bce064c73a03b79a157a3493c518b9 upstream. What the driver actually reports as 256-511 is in fact 512-1023, and the TX packets in the 256-511 bucket are not reported. Fix that. Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25net: dsa: microchip: ksz9477: fix fdb_dump last invalid entryArun Ramadoss1-0/+3
commit 36c0d935015766bf20d621c18313f17691bda5e3 upstream.

In the ksz9477_fdb_dump function, the driver reads the ALU control register and exits the timeout loop if there is a valid entry or the search is complete. After exiting the loop, it reads the ALU entry and reports it to user space irrespective of whether the entry is valid. This works up to the last valid entry. If the loop exited because the search is complete, it still reads the ALU table. The table returns all ones, and this is reported to user space. So bridge fdb show gives ff:ff:ff:ff:ff:ff as the last entry for every port. To fix it, after exiting the loop the entry is reported only if it is a valid one.

Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220816105516.18350-1-arun.ramadoss@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
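The control flow of the fix can be modelled in a few lines. This is a heavily simplified sketch: each tuple stands in for one poll of the ALU control register plus the table read that follows, and dump_entries() is a hypothetical name, not the driver function.

```python
ALL_ONES = "ff:ff:ff:ff:ff:ff"


def dump_entries(polls):
    """polls: iterable of (valid, search_done, entry) tuples.

    An entry is reported only when the poll flagged it as valid; a poll that
    ended only because the search is complete reports nothing.
    """
    reported = []
    for valid, search_done, entry in polls:
        if valid:
            reported.append(entry)
        if search_done:
            break
    return reported


polls = [
    (True, False, "00:11:22:33:44:55"),
    (True, False, "66:77:88:99:aa:bb"),
    (False, True, ALL_ONES),  # search complete, table returns all ones
]
result = dump_entries(polls)
```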
2022-08-25net: dsa: mv88e6060: prevent crash on an unused portSergei Antonov1-0/+3
commit 246bbf2f977ea36aaf41f5d24370fef433250728 upstream. If the port isn't a CPU port nor a user port, 'cpu_dp' is a null pointer and a crash happened on dereferencing it in mv88e6060_setup_port(): [ 9.575872] Unable to handle kernel NULL pointer dereference at virtual address 00000014 ... [ 9.942216] mv88e6060_setup from dsa_register_switch+0x814/0xe84 [ 9.948616] dsa_register_switch from mdio_probe+0x2c/0x54 [ 9.954433] mdio_probe from really_probe.part.0+0x98/0x2a0 [ 9.960375] really_probe.part.0 from driver_probe_device+0x30/0x10c [ 9.967029] driver_probe_device from __device_attach_driver+0xb8/0x13c [ 9.973946] __device_attach_driver from bus_for_each_drv+0x90/0xe0 [ 9.980509] bus_for_each_drv from __device_attach+0x110/0x184 [ 9.986632] __device_attach from bus_probe_device+0x8c/0x94 [ 9.992577] bus_probe_device from deferred_probe_work_func+0x78/0xa8 [ 9.999311] deferred_probe_work_func from process_one_work+0x290/0x73c [ 10.006292] process_one_work from worker_thread+0x30/0x4b8 [ 10.012155] worker_thread from kthread+0xd4/0x10c [ 10.017238] kthread from ret_from_fork+0x14/0x3c Fixes: 0abfd494deef ("net: dsa: use dedicated CPU port") CC: Vivien Didelot <vivien.didelot@savoirfairelinux.com> CC: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sergei Antonov <saproj@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220811070939.1717146-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25net: dsa: felix: suppress non-changes to the tagging protocolVladimir Oltean1-0/+3
commit 4c46bb49460ee14c69629e813640d8b929e88941 upstream.

The way in which dsa_tree_change_tag_proto() works is that when dsa_tree_notify() fails, it doesn't know whether the operation failed midway in a multi-switch tree, or it failed for a single-switch tree. So even though drivers need to fail cleanly in ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify() again, to restore the old tag protocol for potential switches in the tree where the change did succeed (before failing for others).

This means for the felix driver that if we report an error in felix_change_tag_protocol(), we'll get another call where proto_ops == old_proto_ops. If we proceed to act upon that, we may do unexpected things. For example, we will call dsa_tag_8021q_register() twice in a row, without any dsa_tag_8021q_unregister() in between. Then we will actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown, which (if it manages to run at all, after walking through corrupted data structures) will leave the ports inoperable anyway.

The bug can be readily reproduced if we force an error while in tag_8021q mode; this crashes the kernel.
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014 Call trace: vcap_entry_get+0x24/0x124 ocelot_vcap_filter_del+0x198/0x270 felix_tag_8021q_vlan_del+0xd4/0x21c dsa_switch_tag_8021q_vlan_del+0x168/0x2cc dsa_switch_event+0x68/0x1170 dsa_tree_notify+0x14/0x34 dsa_port_tag_8021q_vlan_del+0x84/0x110 dsa_tag_8021q_unregister+0x15c/0x1c0 felix_tag_8021q_teardown+0x16c/0x180 felix_change_tag_protocol+0x1bc/0x230 dsa_switch_event+0x14c/0x1170 dsa_tree_change_tag_proto+0x118/0x1c0 Fixes: 7a29d220f4c0 ("net: dsa: felix: reimplement tagging protocol change with function pointers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220808125127.3344094-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17  net: dsa: felix: fix min gate len calculation for tc when its first gate is closed  Vladimir Oltean  1  -1/+14

commit 7e4babffa6f340a74c820d44d44d16511e666424 upstream.

min_gate_len[tc] is supposed to track the shortest interval of continuously open gates for a traffic class. For example, in the following case:

TC 76543210
t0 00000001b 200000 ns
t1 00000010b 200000 ns

min_gate_len[0] and min_gate_len[1] should be 200000, while min_gate_len[2-7] should be 0.

However what happens is that min_gate_len[0] is 200000, but min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the point where the logic detects the gate close event for TC 1).

The problem is that the code considers a "gate close" event whenever it sees that there is a 0 for that TC (essentially it's level rather than edge triggered). By doing that, any time a gate is seen as closed without having been open prior, gate_len, which is 0, will be written into min_gate_len. Once min_gate_len becomes 0, it's impossible for it to track anything higher than that (the length of actually open intervals).

To fix this, we make the writing to min_gate_len[tc] be edge-triggered, which avoids writes for gates that are closed in consecutive intervals. However what this does is it makes us need to special-case the permanently closed gates at the end.

Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220804202817.1677572-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
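The edge-triggered calculation can be sketched in Python and checked against the two-entry schedule from the commit message. This is a model of the described logic under stated assumptions, not a copy of the driver code: walking the list twice to account for the schedule being cyclic, the NEVER_CLOSES sentinel for always-open gates, and the function name are all choices made for this sketch.

```python
NUM_TC = 8
NEVER_CLOSES = float("inf")  # sentinel chosen for this sketch


def min_gate_lengths(entries):
    """entries: list of (gate_mask, interval_ns) forming one cyclic schedule."""
    min_len = [NEVER_CLOSES] * NUM_TC
    cur_len = [0] * NUM_TC
    ever_open = [False] * NUM_TC

    # Walk the list twice so an open run that wraps past the end of the
    # cycle is measured as one interval.
    for gate_mask, interval in entries * 2:
        for tc in range(NUM_TC):
            if gate_mask & (1 << tc):
                cur_len[tc] += interval
                ever_open[tc] = True
            else:
                # Edge-triggered: only an open->closed transition
                # (cur_len > 0) may record a new minimum.
                if cur_len[tc]:
                    min_len[tc] = min(min_len[tc], cur_len[tc])
                cur_len[tc] = 0

    # Permanently closed gates are special-cased to 0 at the end.
    return [m if opened else 0 for m, opened in zip(min_len, ever_open)]


schedule = [(0b00000001, 200000), (0b00000010, 200000)]
lengths = min_gate_lengths(schedule)
```

With a level-triggered check (dropping the `if cur_len[tc]` guard), tc 1 would record 0 on the first entry and could never recover, which is the bug the commit describes.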
2022-08-17net: dsa: felix: build as module when tc-taprio is moduleVladimir Oltean1-0/+1
[ Upstream commit 10ed11ab6399813eb652137db9c378433c28a95c ] felix_vsc9959.c calls taprio_offload_get() and taprio_offload_free(), symbols exported by net/sched/sch_taprio.c. As such, we must disallow building the Felix driver as built-in when the symbol exported by tc-taprio isn't present in the kernel image. Fixes: 1c9017e44af2 ("net: dsa: felix: keep reference on entire tc-taprio config") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220704190241.1288847-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the ↵Vladimir Oltean3-0/+211
port [ Upstream commit 55a515b1f5a97df5704a1788fe97a4a740be2b9e ] Currently, sending a packet into a time gate too small for it (or always closed) causes the queue system to hold the frame forever. Even worse, this frame isn't subject to aging either, because for that to happen, it needs to be scheduled for transmission in the first place. But the frame will consume buffer memory and frame references while it is forever held in the queue system. Before commit a4ae997adcbd ("net: mscc: ocelot: initialize watermarks to sane defaults"), this behavior was somewhat subtle, as the switch had a more intricately tuned default watermark configuration out of reset, which did not allow any single port and tc to consume the entire switch buffer space. Nonetheless, the held frames are still there, and they reduce the total backplane capacity of the switch. However, after the aforementioned commit, the behavior can be very clearly seen, since we deliberately allow each {port, tc} to consume the entire shared buffer of the switch minus the reservations (and we disable all reservations by default). That is to say, we allow a permanently closed tc-taprio gate to hang the entire switch. A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU per-port-tc registers serve 2 purposes: one is for guard band calculation (when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to enable oversized frame dropping (when non-zero). Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame dropping is disabled. The goal of the change is to enable it seamlessly. For that, we need to hook into the MTU change, tc-taprio change, and port link speed change procedures, since we depend on these variables. Frames are not dropped on egress due to a queue system oversize condition, instead that egress port is simply excluded from the mask of valid destination ports for the packet. 
If there are no destination ports at all, the ingress counter that increments is the generic "drop_tail" in ethtool -S. The issue exists in various forms since the tc-taprio offload was introduced. Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload") Reported-by: Richie Pearn <richard.pearn@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17net: dsa: felix: keep reference on entire tc-taprio configVladimir Oltean1-10/+13
[ Upstream commit 1c9017e44af2eee94b1001af18c401ae440ad77c ] In a future change we will need to remember the entire tc-taprio config on all ports rather than just the base time, so use the taprio_offload_get() helper function to replace ocelot_port->base_time with ocelot_port->taprio. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17net: dsa: felix: update base time of time-aware shaper when adjusting PTP timeXiaoliang Yang1-6/+77
[ Upstream commit 8670dc33f48bab4d7bb4b8d0232f17f4dae419ec ]

When the PTP clock is adjusted, the base time of the TAS configuration becomes unreliable. We need to reset the TAS configuration using a new base time.

For example, if the driver gets a base time of 0 for the Qbv configuration from the user and the current time is 20000, the driver will set the TAS base time to 20000. After the PTP clock adjustment, the current time becomes 10000. If the TAS base time is still 20000, it lies in the future, and the TAS entry list will stop running. As another example, if the current time becomes 10000000 after the PTP clock adjustment, the large time offset can cause the hardware to hang.

This patch introduces a tas_clock_adjust() function to reset the TAS module using a new base time after the PTP clock adjustment, which avoids the issues above. Because a PTP clock adjustment can occur at any time, it may conflict with the TAS configuration, so we introduce a new TAS lock to serialize access to the TAS registers.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
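The first example in this commit message can be traced with a small model. It deliberately uses the simplest rule consistent with the text (a requested base time in the past is replaced by the current time); whether the hardware or driver additionally aligns the base time to the cycle is not modelled, and effective_base_time() is a hypothetical helper.

```python
def effective_base_time(requested, now):
    """Base time programmed into the TAS for a given requested base time."""
    return max(requested, now)


# Initial configuration: user asks for base time 0 while the clock reads 20000.
programmed = effective_base_time(0, 20000)

# The PTP clock is then stepped back to 10000.
now_after_adjust = 10000
stale_is_in_future = programmed > now_after_adjust

# Recomputing from the user's request after the adjustment fixes this.
reprogrammed = effective_base_time(0, now_after_adjust)
```

The stale value lies in the future relative to the adjusted clock, which is the condition under which the entry list stops running; recomputing after every clock adjustment removes it.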
2022-07-19net: dsa: vitesse-vsc73xx: silent spi_device_id warningsOleksij Rempel1-0/+10
Add spi_device_id entries to silence SPI warnings. Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220717135831.2492844-2-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19net: dsa: sja1105: silent spi_device_id warningsOleksij Rempel1-0/+16
Add spi_device_id entries to silence the following warnings:

SPI driver sja1105 has no spi_device_id for nxp,sja1105e
SPI driver sja1105 has no spi_device_id for nxp,sja1105t
SPI driver sja1105 has no spi_device_id for nxp,sja1105p
SPI driver sja1105 has no spi_device_id for nxp,sja1105q
SPI driver sja1105 has no spi_device_id for nxp,sja1105r
SPI driver sja1105 has no spi_device_id for nxp,sja1105s
SPI driver sja1105 has no spi_device_id for nxp,sja1110a
SPI driver sja1105 has no spi_device_id for nxp,sja1110b
SPI driver sja1105 has no spi_device_id for nxp,sja1110c
SPI driver sja1105 has no spi_device_id for nxp,sja1110d

Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220717135831.2492844-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-16net: dsa: microchip: ksz_common: Fix refcount leak bugLiang He1-1/+4
In ksz_switch_register(), we should call of_node_put() for the reference returned by of_get_child_by_name() which has increased the refcount. Fixes: 912aae27c6af ("net: dsa: microchip: really look for phy-mode in port nodes") Signed-off-by: Liang He <windhl@126.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220714153138.375919-1-windhl@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30net: dsa: felix: fix race between reading PSFP stats and port statsVladimir Oltean1-0/+4
Both PSFP stats and the port stats read by ocelot_check_stats_work() are indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW, read from SYS:STAT:CNT[n]. It's just that for port stats, we write STAT_VIEW with the index of the port, and for PSFP stats, we write STAT_VIEW with the filter index. So if we allow them to run concurrently, ocelot_check_stats_work() may change the view from vsc9959_psfp_counters_get(), and vice versa. Fixes: 7d4b564d6add ("net: dsa: felix: support psfp filter on vsc9959") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24net: dsa: bcm_sf2: force pause link settingsDoug Berger1-0/+5
The pause settings reported by the PHY should also be applied to the GMII port status override otherwise the switch will not generate pause frames towards the link partner despite the advertisement saying otherwise. Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver") Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24net/dsa/hirschmann: Add missing of_node_get() in hellcreek_led_setup()Liang He1-0/+1
of_find_node_by_name() decrements the refcount of its first argument, so we need an of_node_get() to keep the refcount balanced. Fixes: 7d9ee2e8ff15 ("net: dsa: hellcreek: Add PTP status LEDs") Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220622040621.4094304-1-windhl@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23net: dsa: qca8k: reduce mgmt ethernet timeoutChristian Marangi1-1/+1
The current mgmt ethernet timeout is set to 100ms. This value is too large and would slow down any mdio command if the mgmt ethernet packet has problems on the receiving side. Reduce it to just 5ms to handle the case where some operations done on the master port cause the mgmt ethernet to temporarily not work. Fixes: 5950c7c0a68c ("net: dsa: qca8k: add support for mgmt read/write in Ethernet packet") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20220621151633.11741-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23net: dsa: qca8k: reset cpu port on MTU changeChristian Marangi1-1/+21
It was discovered that the documentation lacks a fundamental detail on how to correctly change the MAX_FRAME_SIZE of the switch. In fact, if the MAX_FRAME_SIZE is changed while the cpu port is on, the switch panics and ceases to send any packets. This causes the mgmt ethernet system to not receive any packets (the slow fallback still works) and makes the device unreachable. To recover from this, a switch reset is required. To handle this correctly, turn off the cpu ports before changing the MAX_FRAME_SIZE and turn them on again after the value is applied. Fixes: f58d2598cf70 ("net: dsa: qca8k: implement the port MTU callbacks") Cc: stable@vger.kernel.org Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20220621151122.10220-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09Merge tag 'net-5.19-rc2' of ↵Linus Torvalds3-46/+31
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - eth: amt: fix possible null-ptr-deref in amt_rcv() Previous releases - regressions: - tcp: use alloc_large_system_hash() to allocate table_perturb - af_unix: fix a data-race in unix_dgram_peer_wake_me() - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF Previous releases - always broken: - ipv6: fix signed integer overflow in __ip6_append_data - netfilter: - nat: really support inet nat without l3 address - nf_tables: memleak flow rule from commit path - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs - openvswitch: fix misuse of the cached connection on tuple changes - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred - eth: altera: fix refcount leak in altera_tse_mdio_create Misc: - add Quentin Monnet to bpftool maintainers" * tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) net: amd-xgbe: fix clang -Wformat warning tcp: use alloc_large_system_hash() to allocate table_perturb net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY net: dsa: mv88e6xxx: correctly report serdes link failure net: dsa: mv88e6xxx: fix BMSR error to be consistent with others net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete net: altera: Fix refcount leak in altera_tse_mdio_create net: openvswitch: fix misuse of the cached connection on tuple changes net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag ip_gre: test csum_start instead of transport header au1000_eth: stop using virt_to_bus() ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg ipv6: Fix signed integer overflow in __ip6_append_data nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION 
nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION net: ipv6: unexport __init-annotated seg6_hmac_init() net: xfrm: unexport __init-annotated xfrm4_protocol_init() net: mdio: unexport __init-annotated mdio_bus_init() ...
2022-06-09net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHYAlvin Šipraga1-29/+9
Since commit a18e6521a7d9 ("net: phylink: handle NA interface mode in phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode or phy-connection-type property is specified in a DSA port node of the device tree. The same commit caused a regression in rtl8365mb whereby phylink would fail to connect, because the driver did not advertise support for GMII for ports with internal PHY. It should be noted that the aforementioned regression is not because the blamed commit was incorrect: on the contrary, the blamed commit is correcting the previous behaviour whereby unspecified phy-mode would cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The rtl8365mb driver only worked by accident before because it _did_ advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved for internal use by phylink. With one mistake fixed, the other was exposed. Commit a5dba0f207e5 ("net: dsa: rtl8365mb: add GMII as user port mode") then introduced implicit support for GMII mode on ports with internal PHY to allow a PHY connection for device trees where the phy-mode is not explicitly set to "internal". At this point everything was working OK again. Subsequently, commit 6ff6064605e9 ("net: dsa: realtek: convert to phylink_generic_validate()") broke this behaviour again by discarding the usage of rtl8365mb_phy_mode_supported() - where this GMII support was indicated - while switching to the new .phylink_get_caps API. With the new API, rtl8365mb_phy_mode_supported() is no longer needed. Remove it altogether and add back the GMII capability - this time to rtl8365mb_phylink_get_caps() - so that the above default behaviour works for ports with internal PHY again. 
Fixes: 6ff6064605e9 ("net: dsa: realtek: convert to phylink_generic_validate()") Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09net: dsa: mv88e6xxx: correctly report serdes link failureRussell King (Oracle)1-0/+8
Phylink wants to know if the link has dropped since the last time state was retrieved, and the BMSR gives us that. Read the BMSR and use it when deciding the link state. Fill in the an_complete member as well for the emulated PHY state. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09net: dsa: mv88e6xxx: fix BMSR error to be consistent with othersRussell King (Oracle)1-1/+1
Other errors accessing the registers in mv88e6352_serdes_pcs_get_state() print "PHY " before the register name, except for the BMSR. Make this consistent with the other error messages. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_completeMarek Behún1-16/+11
Commit ede359d8843a ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed") added the ability to link if AN was bypassed, and added filling of state->an_complete field, but set it to true if AN was enabled in BMCR, not when AN was reported complete in BMSR. This was done because for some reason, when I wanted to use BMSR value to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was always 1), instead of BMSR_ANEGCOMPLETE bit. Use BMSR_ANEGCOMPLETE for filling state->an_complete. Fixes: ede359d8843a ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
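The difference between the two BMSR bits mentioned above can be shown in a few lines of user-space C. This is only a sketch: the bit values mirror the MII status register definitions in include/uapi/linux/mii.h, while an_complete_from_bmsr() is a hypothetical helper and not the driver's function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the MII status register bits in include/uapi/linux/mii.h. */
#define BMSR_ANEGCAPABLE	0x0008	/* device is able to auto-negotiate */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation has completed */

/* Hypothetical helper: derive an_complete from the BMSR value. */
static bool an_complete_from_bmsr(uint16_t bmsr)
{
	/*
	 * BMSR_ANEGCAPABLE only advertises a capability, so testing it
	 * says nothing about whether negotiation has actually finished.
	 */
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}
```

A BMSR value with only BMSR_ANEGCAPABLE set therefore yields false, which is the case the original code got wrong.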
2022-06-08net: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_listMiaoqian Lin1-1/+3
Every iteration of for_each_available_child_of_node() decrements the reference count of the previous node. When breaking early from a for_each_available_child_of_node() loop, we need to explicitly call of_node_put() on the gphy_fw_np. Add the missing of_node_put() to avoid a refcount leak. Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220605072335.11257-1-linmq006@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
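The early-exit rule described above can be modelled in user space with a toy refcounted list. Everything here (struct toy_node, toy_get(), toy_put(), toy_next_child(), toy_find()) is hypothetical and only imitates the shape of the device tree iterator; it is not the kernel's OF API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted node; not the kernel's struct device_node. */
struct toy_node {
	int refcount;
	struct toy_node *next;	/* next sibling */
};

static struct toy_node *toy_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_put(struct toy_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Iterator step: take a reference on the next child, drop the previous one. */
static struct toy_node *toy_next_child(struct toy_node *first,
				       struct toy_node *prev)
{
	struct toy_node *next = prev ? prev->next : first;

	toy_get(next);
	toy_put(prev);
	return next;
}

/* Return the 0-based position of target, or -1 if it is not in the list. */
static int toy_find(struct toy_node *first, const struct toy_node *target)
{
	struct toy_node *child;
	int pos = 0;

	for (child = toy_next_child(first, NULL); child;
	     child = toy_next_child(first, child), pos++) {
		if (child == target) {
			/* Early exit: the iterator will not drop this reference. */
			toy_put(child);
			return pos;
		}
	}
	return -1;
}
```

Without the toy_put() on the early-return path, the matched node would keep one extra reference after every successful lookup, which is the kind of leak the commit addresses.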
2022-06-05Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linuxLinus Torvalds1-5/+1
Pull bitmap updates from Yury Norov: - bitmap: optimize bitmap_weight() usage, from me - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab - include/linux/find: Fix documentation, from Anna-Maria Behnsen - bitmap: fix conversion from/to fix-sized arrays, from me - bitmap: Fix return values to be unsigned, from Kees Cook It has been in linux-next for at least a week with no problems. * tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits) nodemask: Fix return values to be unsigned bitmap: Fix return values to be unsigned KVM: x86: hyper-v: replace bitmap_weight() with hweight64() KVM: x86: hyper-v: fix type of valid_bank_mask ia64: cleanup remove_siblinginfo() drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate lib/bitmap: add test for bitmap_{from,to}_arr64 lib: add bitmap_{from,to}_arr64 lib/bitmap: extend comment for bitmap_(from,to)_arr32() include/linux/find: Fix documentation lib/bitmap.c make bitmap_print_bitmask_to_buf parseable MAINTAINERS: add cpumask and nodemask files to BITMAP_API arch/x86: replace nodes_weight with nodes_empty where appropriate mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate clocksource: replace cpumask_weight with cpumask_empty in clocksource.c genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate irq: mips: replace cpumask_weight with cpumask_empty where appropriate drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate arch/x86: replace cpumask_weight with cpumask_empty where appropriate ...
2022-05-27net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_registerMiaoqian Lin1-0/+1
of_get_child_by_name() returns a node pointer with its refcount incremented, so we should call of_node_put() on it when done. mv88e6xxx_mdio_register() passes the device node to of_mdiobus_register(), and we don't need the device node after that. Add the missing of_node_put() to avoid a refcount leak. Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+2
drivers/net/ethernet/cadence/macb_main.c 5cebb40bc955 ("net: macb: Fix PTP one step sync support") 138badbc21a0 ("net: macb: use NAPI for TX completion path") https://lore.kernel.org/all/20220523111021.31489367@canb.auug.org.au/ net/smc/af_smc.c 75c1edf23b95 ("net/smc: postpone sk_refcnt increment in connect()") 3aba103006bc ("net/smc: align the connect behaviour with TCP") https://lore.kernel.org/all/20220524114408.4bf1af38@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-23net: dsa: felix: tag_8021q preparation for multiple CPU portsVladimir Oltean1-40/+64
Update the VCAP filters to support multiple tag_8021q CPU ports. TX works using a filter for VLAN ID on the ingress of the CPU port, with a redirect and a VLAN pop action. This can be updated trivially by amending the ingress port mask of this rule to match on all tag_8021q CPU ports. RX works using a filter for ingress port on the egress of the CPU port, with a VLAN push action. Here we need to replicate these filters for each tag_8021q CPU port, and let them all have the same action. This means that the OCELOT_VCAP_ES0_TAG_8021Q_RXVLAN() cookie needs to encode a unique value for every {user port, CPU port} pair it's given. Do this by encoding the CPU port in the upper 16 bits of the cookie, and the user port in the lower 16 bits. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
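The cookie layout described above (CPU port in the upper 16 bits, user port in the lower 16 bits) can be sketched as follows. The helper names are hypothetical and only illustrate the bit packing; they are not the OCELOT_VCAP_ES0_TAG_8021Q_RXVLAN() macro itself.

```c
#include <stdint.h>

/* Hypothetical helper: pack a {CPU port, user port} pair into one cookie. */
static uint32_t toy_rxvlan_cookie(uint16_t cpu_port, uint16_t user_port)
{
	return ((uint32_t)cpu_port << 16) | user_port;
}

/* Hypothetical helpers: recover each half of the pair from the cookie. */
static uint16_t toy_cookie_cpu_port(uint32_t cookie)
{
	return (uint16_t)(cookie >> 16);
}

static uint16_t toy_cookie_user_port(uint32_t cookie)
{
	return (uint16_t)(cookie & 0xffff);
}
```

Because the two fields occupy disjoint bit ranges, the same user port paired with two different CPU ports produces two different cookies, which is what lets the filter be replicated once per tag_8021q CPU port.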
2022-05-23net: mscc: ocelot: switch from {,un}set to {,un}assign for tag_8021q CPU portsVladimir Oltean2-19/+11
There is a desire for the felix driver to gain support for multiple tag_8021q CPU ports, but the current model prevents it. This is because ocelot_apply_bridge_fwd_mask() only takes into consideration whether a port is a tag_8021q CPU port, but not whose CPU port it is. We need a model where we can have a direct affinity between an ocelot port and a tag_8021q CPU port. This serves as the basis for multiple CPU ports. Declare a "dsa_8021q_cpu" backpointer in struct ocelot_port which encodes that affinity. Repurpose the "ocelot_set_dsa_8021q_cpu" API to "ocelot_assign_dsa_8021q_cpu" to express the change of paradigm. Note that this change makes the first practical use of the new ocelot_port->index field in ocelot_port_unassign_dsa_8021q_cpu(), where we need to remove the old tag_8021q CPU port from the reserved VLAN range. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
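The per-port affinity described above can be pictured with a toy model in which each port carries a backpointer to its CPU port. struct toy_port and the toy_* helpers are hypothetical simplifications; the real struct ocelot_port and ocelot_apply_bridge_fwd_mask() do considerably more.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy port: only the fields needed to show the backpointer idea. */
struct toy_port {
	int index;
	struct toy_port *dsa_8021q_cpu;	/* NULL when no CPU port is assigned */
};

static void toy_assign_cpu(struct toy_port *user, struct toy_port *cpu)
{
	user->dsa_8021q_cpu = cpu;
}

static void toy_unassign_cpu(struct toy_port *user)
{
	user->dsa_8021q_cpu = NULL;
}

/* Bit mask holding only the CPU port this user port is tied to (0 if none). */
static uint32_t toy_cpu_fwd_mask(const struct toy_port *user)
{
	if (!user->dsa_8021q_cpu)
		return 0;
	return UINT32_C(1) << user->dsa_8021q_cpu->index;
}
```

With a single "is this a CPU port" flag, the mask could not distinguish which CPU port belongs to which user port; the backpointer makes that relation explicit for each port.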
2022-05-23net: dsa: felix: directly call ocelot_port_{set,unset}_dsa_8021q_cpuVladimir Oltean1-27/+9
Absorb the final details of calling ocelot_port_{,un}set_dsa_8021q_cpu(), i.e. the need to lock &ocelot->fwd_domain_lock, into the callee, to simplify the caller and permit easier code reuse later. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-23net: dsa: felix: update bridge fwd mask from ocelot lib when changing ↵Vladimir Oltean1-4/+0
tag_8021q CPU Add more logic to ocelot_port_{,un}set_dsa_8021q_cpu() from the ocelot switch lib by encapsulating the ocelot_apply_bridge_fwd_mask() call that felix used to have. This is necessary because the CPU port change procedure will also need to do this, and it's good to reduce code duplication by having an entry point in the ocelot switch lib that does all that is needed. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-23net: dsa: felix: move the updating of PGID_CPU to the ocelot libVladimir Oltean1-7/+0
PGID_CPU must be updated every time a port is configured or unconfigured as a tag_8021q CPU port. The ocelot switch lib already has a hook for that operation, so move the updating of PGID_CPU to those hooks. These bits are pretty specific to DSA, so normally I would keep them out of the common switch lib, but when tag_8021q is in use, this has implications upon the forwarding mask determined by ocelot_apply_bridge_fwd_mask() and called extensively by the switch lib. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-23net: dsa: fix missing adjustment of host broadcast floodingVladimir Oltean1-0/+3
PGID_BC is configured statically by ocelot_init() to flood towards the CPU port module, and dynamically by ocelot_port_set_bcast_flood() towards all user ports. When the tagging protocol changes, the intention is to turn off flooding towards the old pipe towards the host, and to turn it on towards the new pipe. Due to a recent change which removed the adjustment of PGID_BC from felix_set_host_flood(), 3 things happen. - when we change from NPI to tag_8021q mode: in this mode, the CPU port module is accessed via registers, and used to read PTP packets with timestamps. We fail to disable broadcast flooding towards the CPU port module, and to enable broadcast flooding towards the physical port that serves as a DSA tag_8021q CPU port. - from tag_8021q to NPI mode: in this mode, the CPU port module is redirected to a physical port. We fail to disable broadcast flooding towards the physical tag_8021q CPU port, and to enable it towards the CPU port module at ocelot->num_phys_ports. - when the ports are put in promiscuous mode, we also fail to update PGID_BC towards the host pipe of the current protocol. First issue means that felix_check_xtr_pkt() has to do extra work, because it will not see only PTP packets, but also broadcasts. It needs to dequeue these packets just to drop them. Third issue is inconsequential, since PGID_BC is allocated from the nonreserved multicast PGID space, and these PGIDs are conveniently initialized to 0x7f (i.e. flood towards all ports except the CPU port module). Broadcasts reach the NPI port via ocelot_init(), and reach the tag_8021q CPU port via the hardware defaults. Second issue is also inconsequential, because we fail both at disabling and at enabling broadcast flooding on a port, so the defaults mentioned above are preserved, and they are fine except for the performance impact. 
Fixes: 7a29d220f4c0 ("net: dsa: felix: reimplement tagging protocol change with function pointers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-23net: dsa: restrict SMSC_LAN9303_I2C kconfigRandy Dunlap1-1/+2
Since kconfig 'select' does not follow dependency chains, if symbol KSA selects KSB, then KSA should also depend on the same symbols that KSB depends on, in order to prevent Kconfig warnings and possible build errors. Change NET_DSA_SMSC_LAN9303_I2C and NET_DSA_SMSC_LAN9303_MDIO so that they are limited to VLAN_8021Q if the latter is enabled. This prevents the Kconfig warning: WARNING: unmet direct dependencies detected for NET_DSA_SMSC_LAN9303 Depends on [m]: NETDEVICES [=y] && NET_DSA [=y] && (VLAN_8021Q [=m] || VLAN_8021Q [=m]=n) Selected by [y]: - NET_DSA_SMSC_LAN9303_I2C [=y] && NETDEVICES [=y] && NET_DSA [=y] && I2C [=y] Fixes: 430065e26719 ("net: dsa: lan9303: add VLAN IDs to master device") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Juergen Borleis <jbe@pengutronix.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Mans Rullgard <mans@mansr.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-20net: dsa: lantiq_gswip: Fix typo in gswip_port_fdb_dump() error printMartin Blumenstingl1-2/+3
gswip_port_fdb_dump() reads the MAC bridge entries. The error message should say "failed to read mac bridge entry". While here, also add the index to the error print so humans can get to the cause of the problem easier. Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>