Age | Commit message | Author | Files | Lines
2020-03-31 | hv_netvsc: Remove unnecessary round_up for recv_completion_cnt | Haiyang Zhang | 1 | -4/+5
vzalloc_node() already rounds the total size up to whole pages, and sizeof(u64) is smaller than sizeof(struct recv_comp_data). So the round_up of recv_completion_cnt is not necessary, and may cause extra memory allocation. To save memory, remove this unnecessary round_up for recv_completion_cnt. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
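Note (illustrative, not part of the commit): the arithmetic behind this change can be sketched as below. The 4096-byte page size, the 16-byte entry size standing in for sizeof(struct recv_comp_data), and all function names are assumptions made for the example, not values taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ  4096u /* assumed page size */
#define ENTRY_SZ 16u   /* hypothetical stand-in for sizeof(struct recv_comp_data) */

/* Round n up to the next multiple of m. */
size_t round_up_to(size_t n, size_t m)
{
    assert(m > 0);
    return (n + m - 1) / m * m;
}

/* Bytes handed out by a page-granular allocator when the count is used as is. */
size_t bytes_without_round_up(size_t cnt)
{
    return round_up_to(cnt * ENTRY_SZ, PAGE_SZ);
}

/* Bytes handed out when the count is first rounded up to a multiple of
 * PAGE_SZ / sizeof(u64), as the removed round_up did. */
size_t bytes_with_round_up(size_t cnt)
{
    size_t rounded_cnt = round_up_to(cnt, PAGE_SZ / sizeof(uint64_t));

    return round_up_to(rounded_cnt * ENTRY_SZ, PAGE_SZ);
}
```

Under these assumptions, 600 entries need three pages when the count is used directly, but four pages when the count is first rounded up to 1024.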
2020-03-31 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | David S. Miller | 21 | -198/+280
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next:

1) Add support to specify a stateful expression in set definitions, this allows users to specify e.g. counters per set elements.
2) Flowtable software counter support.
3) Flowtable hardware offload counter support, from wenxu.
4) Parallelize flowtable hardware offload requests, from Paul Blakey. This includes a patch to add one work entry per offload command.
5) Several patches to rework nf_queue refcount handling, from Florian Westphal.
6) A few fixes for the flowtable tunnel offload: Fix crash if tunneling information is missing and set up indirect flow block as TC_SETUP_FT, patch from wenxu.
7) Stricter netlink attribute sanity check on filters, from Romain Bellan and Florent Fourcot.
8) Annotations to make sparse happy, from Jules Irenge.
9) Improve icmp errors in debugging information, from Haishuang Yan.
10) Fix warning in IPVS icmp error debugging, from Haishuang Yan.
11) Fix endianness issue in tcp extension header, from Sergey Marinkevich.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Merge branch 'Add-packet-trap-policers-support' | David S. Miller | 16 | -51/+1778
Ido Schimmel says:

====================
Add packet trap policers support

Background
==========

Devices capable of offloading the kernel's datapath and performing functions such as bridging and routing must also be able to send (trap) specific packets to the kernel (i.e., the CPU) for processing.

For example, a device acting as a multicast-aware bridge must be able to trap IGMP membership reports to the kernel for processing by the bridge module.

Motivation
==========

In most cases, the underlying device is capable of handling packet rates that are several orders of magnitude higher compared to those that can be handled by the CPU.

Therefore, in order to prevent the underlying device from overwhelming the CPU, devices usually include packet trap policers that are able to police the trapped packets to rates that can be handled by the CPU.

Proposed solution
=================

This patch set allows capable device drivers to register their supported packet trap policers with devlink. User space can then tune the parameters of these policers (currently, rate and burst size) and read from the device the number of packets that were dropped by the policer, if supported.

These packet trap policers can then be bound to existing packet trap groups, which are used to aggregate logically related packet traps. As a result, trapped packets are policed to rates that can be handled by the host CPU.

Example usage
=============

Instantiate netdevsim:

Dump available packet trap policers:

netdevsim/netdevsim10:
  policer 1 rate 1000 burst 128
  policer 2 rate 2000 burst 256
  policer 3 rate 3000 burst 512

Change the parameters of a packet trap policer:

Bind a packet trap policer to a packet trap group:

Dump parameters and statistics of a packet trap policer:

netdevsim/netdevsim10:
  policer 3 rate 100 burst 16
    stats:
        rx:
          dropped 92

Unbind a packet trap policer from a packet trap group:

Patch set overview
==================

Patch #1 adds the core infrastructure in devlink which allows capable device drivers to register their supported packet trap policers with devlink.

Patch #2 extends the existing devlink-trap documentation.

Patch #3 extends netdevsim to register a few dummy packet trap policers with devlink. Used later on to selftest the core infrastructure.

Patches #4-#5 add infrastructure in devlink to allow binding of packet trap policers to packet trap groups.

Patch #6 extends netdevsim to allow such binding.

Patch #7 adds a selftest over netdevsim that verifies the core devlink-trap policers functionality.

Patches #8-#14 gradually add devlink-trap policers support in mlxsw.

Patch #15 adds a selftest over mlxsw. All registered packet trap policers are verified to handle the configured rate and burst size.

Future plans
============

* Allow changing default association between packet traps and packet trap groups
* Add more packet traps. For example, for control packets (e.g., IGMP)

v3:
* Rebase

v2 (address comments from Jiri and Jakub):
* Patch #1: Add 'strict_start_type' in devlink policy
* Patch #1: Have device drivers provide max/min rate/burst size for each policer. Use them to check validity of user provided parameters
* Patch #3: Remove check about burst size being a power of 2 and instead add a debugfs knob to fail the operation
* Patch #3: Provide max/min rate/burst size when registering policers and remove the validity checks from nsim_dev_devlink_trap_policer_set()
* Patch #5: Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in devlink_trap_group_set() and bail if not present
* Patch #5: Add extack error message in case trap group was partially modified
* Patch #7: Add test case with new 'fail_trap_policer_set' knob
* Patch #7: Add test case for partially modified trap group
* Patch #10: Provide max/min rate/burst size when registering policers
* Patch #11: Remove the max/min validity checks from __mlxsw_sp_trap_policer_set()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
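Note (illustrative, not part of the series): the "rate" and "burst" parameters of a packet trap policer can be pictured with a generic token bucket. The series does not describe how any device implements its policers, so the sketch below is an assumption-laden model; struct trap_policer, policer_init() and policer_admit() are hypothetical names, and the rate is taken to be in packets per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical software model of a policer; not a devlink structure. */
struct trap_policer {
    uint64_t rate;    /* tokens (packets) added per second */
    uint64_t burst;   /* bucket capacity, in packets */
    uint64_t tokens;  /* tokens currently available */
    uint64_t last_ns; /* time up to which tokens were credited */
    uint64_t dropped; /* packets dropped by the policer */
};

void policer_init(struct trap_policer *p, uint64_t rate, uint64_t burst,
                  uint64_t now_ns)
{
    assert(rate > 0 && burst > 0);
    p->rate = rate;
    p->burst = burst;
    p->tokens = burst; /* start with a full bucket */
    p->last_ns = now_ns;
    p->dropped = 0;
}

/* Returns true if the packet arriving at now_ns may be trapped to the CPU.
 * Overflow of elapsed * rate is ignored in this sketch. */
bool policer_admit(struct trap_policer *p, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - p->last_ns;
    uint64_t refill = elapsed * p->rate / NSEC_PER_SEC;

    if (refill > 0) {
        uint64_t total = p->tokens + refill;

        p->tokens = total > p->burst ? p->burst : total;
        /* Advance only by the time that produced whole tokens. */
        p->last_ns += refill * NSEC_PER_SEC / p->rate;
    }
    if (p->tokens == 0) {
        p->dropped++;
        return false;
    }
    p->tokens--;
    return true;
}
```

In this model, a policer with rate 100 and burst 16 that receives 108 back-to-back packets admits the first 16 and counts the remaining 92 as dropped.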
2020-03-31 | selftests: mlxsw: Add test cases for devlink-trap policers | Ido Schimmel | 2 | -0/+390
Add test cases that verify that each registered packet trap policer:

* Honors the imposed limitations of rate and burst size
* Is able to police trapped packets to the specified rate
* Is able to police trapped packets to the specified burst size
* Is able to be unbound from its trap group

Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Add support for setting of packet trap group parameters | Ido Schimmel | 5 | -5/+46
Implement support for setting of packet trap group parameters by invoking the trap_group_init() callback with the new parameters. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Switch to use correct packet trap group | Ido Schimmel | 3 | -18/+16
Some packet traps are currently exposed to user space as being members of the "l3_drops" trap group, but internally they are members of a different group. Switch these traps to use the correct group so that they are all subject to the same policer, as exposed to user space. Set the trap priority of packets trapped due to loopback error during routing to the lowest priority. Such packets are not routed again by the kernel and therefore should not mask other traps (e.g., host miss) that should be routed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Do not initialize dedicated discard policer | Ido Schimmel | 1 | -9/+1
The policer is now initialized as part of the registration with devlink, so there is no need to initialize it before the registration. Remove the initialization. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Add devlink-trap policer support | Ido Schimmel | 6 | -11/+297
Register supported packet trap policers with devlink and implement callbacks to change their parameters and read their counters. Prevent user space from passing invalid policer parameters down to the device by checking their validity and communicating the failure via an appropriate extack message. v2: * Remove the max/min validity checks from __mlxsw_sp_trap_policer_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Prepare policers for registration with devlink | Ido Schimmel | 2 | -1/+77
Prepare an array of policer IDs to register with devlink and their associated parameters. The array is composed of both policers that are currently bound to exposed trap groups and policers that are not bound to any trap group. v2: * Provide max/min rate/burst size when registering policers Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum: Track used packet trap policer IDs | Ido Schimmel | 4 | -3/+38
During initialization the driver configures various packet trap groups and binds policers to them. Currently, most of these groups are not exposed to user space and therefore their policers should not be exposed either. Otherwise, user space will be able to alter policer parameters without knowing which packet traps are policed by the policer. Use a bitmap to track the used policer IDs so that these policers will not be registered with devlink in a subsequent patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
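Note (illustrative, not part of the commit): tracking used policer IDs in a bitmap can be sketched in user-space C as below. The policer count of 64 and every name here are hypothetical; the driver itself would use the kernel's own bitmap helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_POLICERS  64u /* hypothetical number of hardware policers */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_POLICERS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per policer ID; a set bit means the ID is used internally. */
struct policer_ids {
    unsigned long used[BITMAP_WORDS];
};

void policer_id_mark_used(struct policer_ids *ids, unsigned int id)
{
    assert(id < MAX_POLICERS);
    ids->used[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

bool policer_id_is_used(const struct policer_ids *ids, unsigned int id)
{
    assert(id < MAX_POLICERS);
    return (ids->used[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}

/* Collect up to cap IDs that are not marked as used, in ascending order.
 * Returns the number of IDs written to out. */
size_t policer_ids_unused(const struct policer_ids *ids, unsigned int *out,
                          size_t cap)
{
    size_t n = 0;

    for (unsigned int id = 0; id < MAX_POLICERS && n < cap; id++)
        if (!policer_id_is_used(ids, id))
            out[n++] = id;
    return n;
}
```

With IDs 0, 1 and 5 marked as used, the first four IDs left over for exposure are 2, 3, 4 and 6.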
2020-03-31 | mlxsw: reg: Extend QPCR register | Ido Schimmel | 1 | -0/+17
The QoS Policer Configuration Register (QPCR) is used to configure hardware policers. Extend this register with the following fields and defines, which will be used by subsequent patches: 1. Violate counter: reads number of packets dropped by the policer 2. Clear counter: to ensure we start counting from 0 3. Rate and burst size limits Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | selftests: netdevsim: Add test cases for devlink-trap policers | Ido Schimmel | 2 | -0/+153
Add test cases for packet trap policer set / show commands as well as for the binding of these policers to packet trap groups. Both good and bad flows are tested for maximum coverage. v2: * Add test case with new 'fail_trap_policer_set' knob * Add test case for partially modified trap group Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | netdevsim: Add support for setting of packet trap group parameters | Ido Schimmel | 2 | -0/+18
Add a dummy callback to set trap group parameters. Return an error when the 'fail_trap_group_set' debugfs file is set in order to exercise error paths and verify that the error is propagated to user space when it should be. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | devlink: Allow setting of packet trap group parameters | Ido Schimmel | 2 | -2/+63
The previous patch allowed device drivers to publish their default binding between packet trap policers and packet trap groups. However, some users might not be content with this binding and would like to change it. In case user space passed a packet trap policer identifier when setting a packet trap group, invoke the appropriate device driver callback and pass the new policer identifier. v2: * Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in devlink_trap_group_set() and bail if not present * Add extack error message in case trap group was partially modified Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | devlink: Add packet trap group parameters support | Ido Schimmel | 4 | -9/+43
Packet trap groups are used to aggregate logically related packet traps. Currently, these groups allow user space to batch operations such as setting the trap action of all member traps. In order to prevent the CPU from being overwhelmed by too many trapped packets, it is desirable to bind a packet trap policer to these groups. For example, to limit all the packets that encountered an exception during routing to 10Kpps. Allow device drivers to bind default packet trap policers to packet trap groups when the latter are registered with devlink. The next patch will enable user space to change this default binding. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | netdevsim: Add devlink-trap policer support | Ido Schimmel | 2 | -1/+86
Register three dummy packet trap policers with devlink and implement callbacks to change their parameters and read their counters. This will be used later on in the series to test the devlink-trap policer infrastructure. v2: * Remove check about burst size being a power of 2 and instead add a debugfs knob to fail the operation * Provide max/min rate/burst size when registering policers and remove the validity checks from nsim_dev_devlink_trap_policer_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Documentation: Add description of packet trap policers | Ido Schimmel | 1 | -0/+26
Extend devlink-trap documentation with information about packet trap policers. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | devlink: Add packet trap policers support | Ido Schimmel | 3 | -0/+515
Devices capable of offloading the kernel's datapath and performing functions such as bridging and routing must also be able to send (trap) specific packets to the kernel (i.e., the CPU) for processing. For example, a device acting as a multicast-aware bridge must be able to trap IGMP membership reports to the kernel for processing by the bridge module. In most cases, the underlying device is capable of handling packet rates that are several orders of magnitude higher compared to those that can be handled by the CPU. Therefore, in order to prevent the underlying device from overwhelming the CPU, devices usually include packet trap policers that are able to police the trapped packets to rates that can be handled by the CPU. This patch allows capable device drivers to register their supported packet trap policers with devlink. User space can then tune the parameters of these policers (currently, rate and burst size) and read from the device the number of packets that were dropped by the policer, if supported. Subsequent patches in the series will allow device drivers to create a default binding between these policers and packet trap groups and allow user space to change the binding. v2: * Add 'strict_start_type' in devlink policy * Have device drivers provide max/min rate/burst size for each policer. Use them to check validity of user provided parameters Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
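Note (illustrative, not part of the commit): the v2 change, in which drivers provide max/min rate/burst size and the core checks user-provided parameters against them, might look roughly like the range check below. struct policer_limits, enum policer_err and policer_check() are hypothetical names and do not reflect the devlink API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-policer limits supplied by a driver at registration. */
struct policer_limits {
    uint64_t min_rate;
    uint64_t max_rate;
    uint64_t min_burst;
    uint64_t max_burst;
};

/* Hypothetical result codes; a real implementation would report the
 * failure to user space through an extack message. */
enum policer_err {
    POLICER_OK = 0,
    POLICER_RATE_TOO_LOW,
    POLICER_RATE_TOO_HIGH,
    POLICER_BURST_TOO_LOW,
    POLICER_BURST_TOO_HIGH,
};

/* Validate user-provided parameters before they reach the driver. */
enum policer_err policer_check(const struct policer_limits *lim,
                               uint64_t rate, uint64_t burst)
{
    assert(lim->min_rate <= lim->max_rate);
    assert(lim->min_burst <= lim->max_burst);

    if (rate < lim->min_rate)
        return POLICER_RATE_TOO_LOW;
    if (rate > lim->max_rate)
        return POLICER_RATE_TOO_HIGH;
    if (burst < lim->min_burst)
        return POLICER_BURST_TOO_LOW;
    if (burst > lim->max_burst)
        return POLICER_BURST_TOO_HIGH;
    return POLICER_OK;
}
```

Centralizing the check this way is what lets the later patches drop the per-driver validity checks.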
2020-03-30 | ipvs: fix uninitialized variable warning | Haishuang Yan | 1 | -2/+1
If outer_proto is not set, GCC warns as follows:

In file included from net/netfilter/ipvs/ip_vs_core.c:52:
net/netfilter/ipvs/ip_vs_core.c: In function 'ip_vs_in_icmp':
include/net/ip_vs.h:233:4: warning: 'outer_proto' may be used uninitialized in this function [-Wmaybe-uninitialized]
  233 |    printk(KERN_DEBUG pr_fmt(msg), ##__VA_ARGS__); \
      |    ^~~~~~
net/netfilter/ipvs/ip_vs_core.c:1666:8: note: 'outer_proto' was declared here
 1666 |  char *outer_proto;
      |        ^~~~~~~~~~~

Fixes: 73348fed35d0 ("ipvs: optimize tunnel dumps for icmp errors")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-30 | netfilter: nft_exthdr: fix endianness of tcp option cast | Sergey Marinkevich | 1 | -5/+3
I ran into a problem on big-endian MIPS: every time nf_tables tried to change the TCP MSS, it returned early because new.v16 was greater than old.v16. But the real MSS was 1460 and my rule was like this:

  add rule table chain tcp option maxseg size set 1400

And 1400 is less than 1460, not greater.

I later found that the cause is a cast from u32 to __be16.

Debugging:

In this example MSS = 1400 (hex 0x578). Below, each byte is shown as it lies in memory, with addresses increasing from left to right (e.g. [0x0 0x1 0x2 0x3]). LE is a little-endian system, BE is a big-endian system, and the left column is the type.

                      LE               BE
  u32:                [78 05 00 00]    [00 00 05 78]

As you can see, casting the u32 representation to u16 takes a different half of the 4-byte address range on each system. But nf_tables uses registers to store data of various sizes, and the TCP MSS is stored in 2 bytes. The registers are still defined as u32:

  struct nft_regs {
          union {
                  u32 data[20];
                  struct nft_verdict verdict;
          };
  };

So an access like regs->data[priv->sreg] is exactly a u32. According to the table above, the per-byte representation of the TCP MSS stored in a register is:

                      LE               BE
  (u32)regs->data[]:  [78 05 00 00]    [05 78 00 00]
                                              ^^ ^^

The register uses just half of the u32, and the other 2 bytes may be used for other data. But nft_exthdr_tcp_set_eval() casts it directly from u32 to __be16:

  new.v16 = src

The u32 does not fit into the __be16, so the cast keeps the 2 low-order bytes. For clarity, here is one more table (<xx xx> marks the bytes used by the cast):

                      LE               BE
  u32:                [<78 05> 00 00]  [00 00 <05 78>]
  (u32)regs->data[]:  [<78 05> 00 00]  [05 78 <00 00>]

As you can see, nothing changes for little-endian, but for big-endian we take the wrong half. In my case that half holds other data instead of zeros, so the new MSS wrongly appeared greater.

To fix this bug I used the same solution as for port ranges. Applying this patch does not affect little-endian systems.

Signed-off-by: Sergey Marinkevich <sergey.marinkevich@eltex-co.ru> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
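Note (illustrative, not part of the commit): the byte tables in this message can be reproduced on any host by modelling the register as four bytes of memory and choosing the byte order explicitly. The helper names below are hypothetical; they only mimic the two access patterns (read the whole 32-bit register and truncate, versus read just the first two bytes) and are not the nf_tables code.

```c
#include <assert.h>
#include <stdint.h>

enum byte_order { ORDER_LE, ORDER_BE };

/* Interpret the first n bytes of mem (1 <= n <= 4) as an unsigned integer
 * stored in the given byte order. */
uint32_t load_uint(const uint8_t *mem, unsigned int n, enum byte_order order)
{
    uint32_t v = 0;

    assert(n >= 1 && n <= 4);
    for (unsigned int i = 0; i < n; i++) {
        unsigned int shift = (order == ORDER_LE) ? 8 * i : 8 * (n - 1 - i);

        v |= (uint32_t)mem[i] << shift;
    }
    return v;
}

/* Problematic pattern: read the whole 32-bit register, then truncate to
 * 16 bits, which keeps the low-order half of the u32. */
uint16_t mss_via_u32_cast(const uint8_t reg[4], enum byte_order order)
{
    return (uint16_t)load_uint(reg, 4, order);
}

/* Safer pattern: read only the two bytes where the value was stored. */
uint16_t mss_via_u16_load(const uint8_t reg[4], enum byte_order order)
{
    return (uint16_t)load_uint(reg, 2, order);
}
```

With a little-endian register [78 05 00 00] both functions yield 1400. With a big-endian register [05 78 aa bb], where the last two bytes hold unrelated data, the truncating cast yields 0xaabb while the 16-bit load still yields 1400.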
2020-03-30 | Merge branch 'split-phylink-PCS-operations' | David S. Miller | 2 | -42/+166
Russell King says: ==================== split phylink PCS operations This series splits the phylink_mac_ops structure so that PCS can be supported separately with their own PCS operations, separating them from the MAC layer. This may need adaptation later as more users come along. v2: change pcs_config() and associated called function prototypes to only pass the information that is required, and add some documentation. v3: change phylink_create() prototype ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: phylink: add separate pcs operations structure | Russell King | 2 | -22/+143
Add a separate set of PCS operations, which MAC drivers can use to couple phylink with their associated MAC PCS layer. The PCS operations include: - pcs_get_state() - reads the link up/down, resolved speed, duplex and pause from the PCS. - pcs_config() - configures the PCS for the specified mode, PHY interface type, and setting the advertisement. - pcs_an_restart() - restarts 802.3 in-band negotiation with the link partner - pcs_link_up() - informs the PCS that link has come up, and the parameters of the link. Link parameters are used to program the PCS for fixed speed and non-inband modes. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
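Note (illustrative, not part of the commit): the idea of a separate operations table can be sketched in user-space C as below. The callback names follow the list in the commit message, but the argument lists, struct link_state, struct toy_pcs and resolve_link() are simplified, hypothetical stand-ins and are not the kernel prototypes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical link state. */
struct link_state {
    bool link;
    int speed;
    bool full_duplex;
};

/* Hypothetical operations table with the four callbacks named above. */
struct pcs_ops {
    void (*pcs_get_state)(void *priv, struct link_state *state);
    int (*pcs_config)(void *priv, unsigned int mode);
    void (*pcs_an_restart)(void *priv);
    void (*pcs_link_up)(void *priv, int speed, bool full_duplex);
};

/* Toy PCS used only to exercise the table. */
struct toy_pcs {
    struct link_state state;
    unsigned int mode;
    unsigned int an_restarts;
};

static void toy_get_state(void *priv, struct link_state *state)
{
    *state = ((struct toy_pcs *)priv)->state;
}

static int toy_config(void *priv, unsigned int mode)
{
    ((struct toy_pcs *)priv)->mode = mode;
    return 0;
}

static void toy_an_restart(void *priv)
{
    ((struct toy_pcs *)priv)->an_restarts++;
}

static void toy_link_up(void *priv, int speed, bool full_duplex)
{
    struct toy_pcs *pcs = priv;

    pcs->state.link = true;
    pcs->state.speed = speed;
    pcs->state.full_duplex = full_duplex;
}

const struct pcs_ops toy_pcs_ops = {
    .pcs_get_state = toy_get_state,
    .pcs_config = toy_config,
    .pcs_an_restart = toy_an_restart,
    .pcs_link_up = toy_link_up,
};

/* Generic caller that only knows the ops table, not the PCS behind it. */
bool resolve_link(const struct pcs_ops *ops, void *priv,
                  struct link_state *out)
{
    assert(ops != NULL && ops->pcs_get_state != NULL);
    ops->pcs_get_state(priv, out);
    return out->link;
}
```

The point of the split is visible in resolve_link(): the caller depends only on the PCS table, so a MAC driver can supply its own PCS implementation without touching the MAC operations.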
2020-03-30 | net: phylink: rename 'ops' to 'mac_ops' | Russell King | 2 | -16/+16
Rename the bland 'ops' member of struct phylink to be a more descriptive 'mac_ops' - this is necessary as we're about to introduce another set of operations. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: phylink: change phylink_mii_c22_pcs_set_advertisement() prototype | Russell King | 2 | -6/+9
Change phylink_mii_c22_pcs_set_advertisement() to take only the PHY interface and advertisement mask, rather than the full phylink state. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | r8169: factor out rtl8169_tx_map | Heiner Kallweit | 1 | -60/+50
Factor out mapping the tx skb to a new function rtl8169_tx_map(). This allows removing redundancies, and rtl8169_get_txd_opts1() has only one user left, so it can be inlined. As a result rtl8169_xmit_frags() is significantly simplified, and in rtl8169_start_xmit() the code is simpler and more readable. No functional change intended. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next | David S. Miller | 5 | -41/+173
Johan Hedberg says:

====================
pull request: bluetooth-next 2020-03-29

Here are a few more Bluetooth patches for the 5.7 kernel:

- Fix assumption of encryption key size when reading fails
- Add support for DEFER_SETUP with L2CAP Enhanced Credit Based Mode
- Fix issue with auto-connected devices
- Fix suspend handling when entering the state fails
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | qed: Fix use after free in qed_chain_free | Yuval Basson | 2 | -31/+31
The qed_chain data structure was modified in commit 1a4a69751f4d ("qed: Chain support for external PBL") to support receiving an external PBL (due to iWARP FW requirements). The pages pointed to by the PBL are allocated in qed_chain_alloc and their virtual addresses are stored in a virtual address array to enable accessing and freeing the data. The physical addresses, however, weren't stored and were read directly from the external PBL during free. The destroy-qp flow frees the external PBL before the chain is freed, so when the chain is freed it accesses the already freed external PBL, leading to a use-after-free. Therefore we need to store the physical addresses in addition to the virtual addresses in a new data structure. Fixes: 1a4a69751f4d ("qed: Chain support for external PBL") Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Yuval Bason <ybason@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | r8169: improve handling of TD_MSS_MAX | Heiner Kallweit | 1 | -2/+2
If the mtu is greater than TD_MSS_MAX, then TSO is disabled, see rtl8169_fix_features(). Because mss is less than mtu, we can't have the case mss > TD_MSS_MAX in the TSO path. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | Merge branch 'Port-and-flow-policers-for-DSA' | David S. Miller | 14 | -83/+742
Vladimir Oltean says:

====================
Port and flow policers for DSA (SJA1105, Felix/Ocelot)

This series adds support for 2 types of policers:
- port policers, via tc matchall filter
- flow policers, via tc flower filter
for 2 DSA drivers:
- sja1105
- felix/ocelot

First we start with ocelot/felix. Prior to this patch set, the ocelot core library only supported:
- Port policers
- Flow-based dropping and trapping

But the felix wrapper could not actually use the port policers due to missing linkage and support in the DSA core. So one of the patches addresses exactly that limitation by adding the missing support to the DSA core. The other patch for felix flow policers (via the VCAP IS2 engine) is actually in the ocelot library itself, since the linkage with the ocelot flower classifier has already been done in an earlier patch set.

Then with the newly added .port_policer_add and .port_policer_del, we can also start supporting the L2 policers on sja1105.

Then, for full functionality of these L2 policers on sja1105, we also implement a more limited set of flow-based policing keys for this switch, namely for broadcast and VLAN PCP.

Series version 1 was submitted here:
https://patchwork.ozlabs.org/cover/1263353/

Nothing functional changed in v2, only a rebase.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: dsa: sja1105: add broadcast and per-traffic class policers | Vladimir Oltean | 4 | -0/+385
This patch adds complete support for manipulating the L2 Policing Tables from this switch. There are 45 table entries, one entry per port and traffic class, and one dedicated entry for broadcast traffic for each ingress port.

Policing entries are shareable, and we use this functionality to support shared block filters.

We are modeling broadcast policers as simple tc-flower matches on dst_mac. As for the traffic class policers, the switch only deduces the traffic class from the VLAN PCP field, so it makes sense to model this as a tc-flower match on vlan_prio.

How to limit broadcast traffic coming from all front-panel ports to a cumulative total of 10 Mbit/s:

tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
	action police rate 10mbit burst 64k

How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to 100 Mbit/s on port 0 only:

tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
	vlan_prio 0 action police rate 100mbit burst 64k

The broadcast, VLAN PCP and port policers are compatible with one another (can be installed at the same time on a port).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
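Note (illustrative, not part of the commit): the figure of 45 entries follows from 5 ports times 8 traffic classes plus one broadcast entry per port. The sketch below shows one way such a table could be indexed; the layout (per-port traffic class entries first, broadcast entries last) and all names are assumptions made for the example.

```c
#include <assert.h>

#define NUM_PORTS       5u /* ports on the switch */
#define NUM_TC          8u /* traffic classes per port */
#define NUM_L2_POLICERS (NUM_PORTS * NUM_TC + NUM_PORTS)

/* Assumed layout: entries 0..39 are per port and traffic class. */
unsigned int tc_policer_index(unsigned int port, unsigned int tc)
{
    assert(port < NUM_PORTS && tc < NUM_TC);
    return port * NUM_TC + tc;
}

/* Assumed layout: entries 40..44 are the per-port broadcast policers. */
unsigned int bcast_policer_index(unsigned int port)
{
    assert(port < NUM_PORTS);
    return NUM_PORTS * NUM_TC + port;
}
```

Under this assumed layout the last traffic class entry is index 39 and the broadcast entries occupy indices 40 to 44, for 45 entries in total.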
2020-03-30 | net: dsa: sja1105: add configuration of port policers | Vladimir Oltean | 1 | -32/+100
This adds partial configuration support for the L2 Policing Table. Out of the 45 policing entries, only 5 are used (one for each port), in a shared manner. All 8 traffic classes, and the broadcast policer, are redirected to a common instance which belongs to the ingress port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: dsa: felix: add port policers | Vladimir Oltean | 5 | -11/+36
This patch is a trivial passthrough towards the ocelot library, which has supported port policers since commit 2c1d029a017f ("net: mscc: ocelot: Implement port policers via tc command"). Some data structure conversion between the DSA core and the Ocelot library is necessary, for policer parameters. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: dsa: add port policers | Vladimir Oltean | 2 | -7/+85
The approach taken to pass the port policer methods on to drivers is pragmatic. It is similar to the port mirroring implementation (in that the DSA core does all of the filter block interaction and only passes simple operations for the driver to implement) and dissimilar to how flow-based policers are going to be implemented (where the driver has full control over the flow_cls_offload data structure). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: dsa: refactor matchall mirred action to separate function | Vladimir Oltean | 1 | -30/+40
Make room for other actions for the matchall filter by keeping the mirred argument parsing self-contained in its own function. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | net: mscc: ocelot: add action of police on vcap_is2 | Xiaoliang Yang | 6 | -7/+100
Ocelot has 384 policers that can be allocated to ingress ports, QoS classes per port, and VCAP IS2 entries. ocelot_police.c supports configuring policers that can be allocated to the police action of VCAP IS2. We allocate policers starting from the maximum pol_id, and decrease the pol_id each time a new vcap_is2 entry with a police action is added. Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
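Note (illustrative, not part of the commit): handing out policer IDs from the top of the range downwards can be sketched as below. Only the total of 384 policers comes from the commit message; the reserved low range, the absence of a free path, and all names are assumptions made for the example.

```c
#include <assert.h>

#define TOTAL_POLICERS 384 /* total number of policers, from the text above */

/* Hypothetical allocator state. */
struct pol_alloc {
    int next_id; /* next ID to hand out, counting down */
    int floor;   /* IDs below this are assumed reserved for other users */
};

void pol_alloc_init(struct pol_alloc *a, int reserved_low)
{
    assert(reserved_low >= 0 && reserved_low <= TOTAL_POLICERS);
    a->next_id = TOTAL_POLICERS - 1;
    a->floor = reserved_low;
}

/* Returns the allocated policer ID, or -1 if none are left. */
int pol_alloc_get(struct pol_alloc *a)
{
    if (a->next_id < a->floor)
        return -1;
    return a->next_id--;
}
```

Allocating from the top keeps the dynamically assigned IDs away from the low IDs, which in this sketch are assumed to be statically assigned to ports and QoS classes.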
2020-03-30 | Merge branch 'ionic-support-for-firmware-upgrade' | David S. Miller | 7 | -145/+322
Shannon Nelson says: ==================== ionic support for firmware upgrade The Pensando Distributed Services Card can get firmware upgrades from the off-host centralized management suite, and can be upgraded without a host reboot or driver reload. This patchset sets up the support for fw upgrade in the Linux driver. When the upgrade begins, the DSC first brings the link down, then stops the firmware. The driver will notice this and quiesce itself by stopping the queues and releasing DMA resources, then monitoring for firmware to start back up. When the upgrade is finished the firmware is restarted and link is brought up, and the driver rebuilds the queues and restarts traffic flow. First we separate the Link state from the netdev state, then reorganize a few things to prepare for partial tear-down of the queues. Next we fix up the state machine so that we take the Tx and Rx queues down and back up when we get LINK_DOWN and LINK_UP events. Lastly, we add handling of the FW reset itself by tearing down the lif internals and rebuilding them with the new FW setup. v2: This changes the design from (ab)using the full .ndo_stop and .ndo_open routines to getting a better separation between the alloc and the init functions so that we can keep our resource allocations as long as possible. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: remove lifs on fw reset | Shannon Nelson | 5 | -25/+158
When the FW RESET event comes to the driver from the firmware, or the fw_status goes to 0 (stopped) or to 0xff (no PCI connection), then shut down the driver activity. This event signals a FW upgrade where we need to quiesce all operations and wait for the FW to restart. The FW will continue the update process once it sees all the LIFs are reset. When the update process is done it will set the fw_status back to RUNNING. Meanwhile, the heartbeat check continues and when the fw_status is seen as set to running we can restart the driver operations. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: disable the queues on link down | Shannon Nelson | 1 | -56/+61
When the link goes down, we need to disable the queues on the NIC in addition to stopping the netdev stack. This lets the FW know that the driver has stopped queue activity, and then the FW can do internal reconfiguration work, whether actually Link related, or for other internal FW needs. To do this, we pull out the queue enable and disable from ionic_open() and ionic_stop() so they can be used by other routines. To help keep things sane, we swap the queue enables so that the rx queue and its napi are enabled before the tx queue, which rides on the rx queue's napi. We also drop the ionic_lif_quiesce() as it doesn't do anything more than what the queue disable has already taken care of. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: check for queues before deleting | Shannon Nelson | 1 | -19/+38
Make sure the queue structures exist before trying to delete them. This addresses a couple of error recovery issues. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: clean tx queue of unfinished requests | Shannon Nelson | 3 | -0/+18
Clean out tx requests that didn't get finished before shutting down the queue. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: move irq request to qcq alloc | Shannon Nelson | 1 | -22/+19
Move the irq request and free out of the qcq_init and deinit and into the alloc and free routines where they belong for better resource management. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: move debugfs add/delete to match alloc/free | Shannon Nelson | 1 | -12/+8
Move the qcq debugfs add to the queue alloc, and likewise move the debugfs delete to the queue free. The LIF debugfs add also needs to be moved, but the del is already in the LIF free. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: check for linkup in watchdog | Shannon Nelson | 3 | -2/+7
Add a link_status_check to the heartbeat watchdog. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ionic: decouple link message from netdev state | Shannon Nelson | 1 | -14/+18
Rearrange the link_up/link_down messages so that we announce link up when we first notice that the link is up when the driver loads, and decouple the link_up/link_down messages from the UP and DOWN netdev state. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | mlxsw: spectrum_ptp: Fix build warnings | Ido Schimmel | 1 | -0/+4
Cited commit extended the enums 'hwtstamp_tx_types' and 'hwtstamp_rx_filters' with values that were not accounted for in the switch statements, resulting in the build warnings below. Fix by adding a default case.

drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c: In function ‘mlxsw_sp_ptp_get_message_types’:
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:915:2: warning: enumeration value ‘__HWTSTAMP_TX_CNT’ not handled in switch [-Wswitch]
  915 |  switch (tx_type) {
      |  ^~~~~~
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:927:2: warning: enumeration value ‘__HWTSTAMP_FILTER_CNT’ not handled in switch [-Wswitch]
  927 |  switch (rx_filter) {
      |  ^~~~~~

Fixes: f76510b458a5 ("ethtool: add timestamping related string sets") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
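The warning arises when an enum gains a counting sentinel that a switch does not handle. The sketch below reproduces the pattern with a made-up enum and shows how a default case covers the sentinel. The enum, the mask values, and the choice of `-EINVAL` are all assumptions for illustration and are not the contents of the actual fix.

```c
#include <errno.h>

/* Made-up enum with a trailing count sentinel, as in the cited commit. */
enum tx_type {
	TX_TYPE_OFF,
	TX_TYPE_ON,
	TX_TYPE_CNT,	/* number of real values, not a valid setting */
};

/*
 * Map a tx type to a message-type mask (mask values are invented).
 * Without the default case, -Wswitch reports that TX_TYPE_CNT is
 * not handled in the switch.
 */
static int tx_type_to_mask(enum tx_type type, unsigned int *mask)
{
	switch (type) {
	case TX_TYPE_OFF:
		*mask = 0x0;
		break;
	case TX_TYPE_ON:
		*mask = 0xf;
		break;
	default:
		return -EINVAL;	/* sentinel or out-of-range value */
	}
	return 0;
}
```

A default case also silences the warning for any values added to the enum later, which is a trade-off: the compiler will no longer flag a newly added value that the switch should have handled.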
2020-03-30 | Merge branch 'Devlink-health-auto-attributes-refactor' | David S. Miller | 9 | -21/+42
Eran Ben Elisha says: ==================== Devlink health auto attributes refactor This patchset refactors the auto-recover health reporter flag to be explicitly set by the devlink core. In addition, it adds another flag to control the auto-dump attribute, also to be explicitly set by the devlink core. For that, patch 0001 changes the auto-recover default value of the netdevsim dummy reporter. After reporter registration, both flags can be altered by the administrator only. Changes since v1: - Change default behaviour of netdevsim dummy reporter - Move initialization of DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | devlink: Add auto dump flag to health reporter | Eran Ben Elisha | 2 | -4/+24
On low-memory systems, run-time dumps can consume too much memory. Add the ability for an administrator to disable auto dumps per reporter as part of the error flow handle routine. This attribute is not relevant while executing DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET. By default, auto dump is activated for any reporter that has a dump method, as part of the reporter registration to devlink. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
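The behaviour described in this message can be modelled roughly as follows. All names here (`reporter`, `reporter_register`, and so on) are hypothetical and do not correspond to devlink's internal functions. The point is only that the flag gates dumps taken in the error flow, while an explicit dump request ignores it.

```c
#include <stdbool.h>
#include <stddef.h>

struct reporter;

struct reporter_ops {
	int (*dump)(struct reporter *r);	/* may be NULL */
};

struct reporter {
	const struct reporter_ops *ops;
	bool auto_dump;
	unsigned int dumps_taken;
};

/* Example dump method that only counts how often it ran. */
static int example_dump(struct reporter *r)
{
	r->dumps_taken++;
	return 0;
}

static const struct reporter_ops example_ops = { .dump = example_dump };

/* Registration: auto dump defaults to on iff a dump method exists. */
static void reporter_register(struct reporter *r, const struct reporter_ops *ops)
{
	r->ops = ops;
	r->auto_dump = ops->dump != NULL;
	r->dumps_taken = 0;
}

/* Error flow: take a dump only when the flag is still set. */
static int reporter_report_error(struct reporter *r)
{
	if (r->auto_dump && r->ops->dump)
		return r->ops->dump(r);
	return 0;
}

/* Explicit dump request: the auto_dump flag is not consulted. */
static int reporter_dump_get(struct reporter *r)
{
	if (!r->ops->dump)
		return -1;
	return r->ops->dump(r);
}
```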
2020-03-30 | devlink: Implicitly set auto recover flag when registering health reporter | Eran Ben Elisha | 7 | -17/+13
When a health reporter is registered to devlink, devlink will implicitly set auto recover if and only if the reporter has a recover method. There is no reason to explicitly get the auto recover flag from the driver. Remove this flag from all drivers that called devlink_health_reporter_create. All existing health reporters set auto recovery to true if they have a recover method. The administrator can still unset auto recover via a netlink command, as before this patch. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
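The "if and only if" rule can be sketched as below. The types and functions are simplified stand-ins, not the devlink API. Rejecting an attempt to enable auto recover on a reporter without a recover method is an assumption added for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the reporter ops and reporter object. */
struct health_ops {
	int (*recover)(void *priv);	/* may be NULL */
};

struct health_reporter {
	const struct health_ops *ops;
	bool auto_recover;
};

static int example_recover(void *priv)
{
	(void)priv;
	return 0;
}

static const struct health_ops ops_with_recover = { .recover = example_recover };
static const struct health_ops ops_without_recover = { .recover = NULL };

/* The core derives the flag; the driver no longer passes it in. */
static void health_reporter_create(struct health_reporter *r,
				   const struct health_ops *ops)
{
	r->ops = ops;
	r->auto_recover = ops->recover != NULL;
}

/*
 * Administrator override after registration.
 * Assumption: enabling without a recover method is refused.
 */
static int health_reporter_set_auto_recover(struct health_reporter *r, bool enable)
{
	if (enable && !r->ops->recover)
		return -1;
	r->auto_recover = enable;
	return 0;
}
```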
2020-03-30 | netdevsim: Change dummy reporter auto recover default | Eran Ben Elisha | 2 | -1/+6
Health reporters should be registered with auto recover set to true. Align the dummy reporter behaviour with that, as in a later patch the option to set auto recover behaviour will be removed. In addition, align the netdevsim selftest to the new default value. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 | ptp: Avoid deadlocks in the programmable pin code. | Richard Cochran | 4 | -3/+44
The PTP Hardware Clock (PHC) subsystem offers an API for configuring programmable pins. User space sets or gets the settings using ioctls, and drivers verify dialed settings via a callback. Drivers may also query pin settings by calling the ptp_find_pin() method. Although the core subsystem protects concurrent access to the pin settings, the implementation places illogical restrictions on how drivers may call ptp_find_pin(). When enabling an auxiliary function via the .enable(on=1) callback, drivers may invoke the pin finding method, but when disabling with .enable(on=0) drivers are not permitted to do so. With the exception of the mv88e6xxx, all of the PHC drivers do respect this restriction, but still the locking pattern is both confusing and unnecessary. This patch changes the locking implementation to allow PHC drivers to freely call ptp_find_pin() from their .enable() and .verify() callbacks. V2 ChangeLog: - fixed spelling in the kernel doc - add Vladimir's tested by tag Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reported-by: Yangbo Lu <yangbo.lu@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
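The hazard behind this change is that a lookup helper which takes a lock cannot be called from a callback that the core invokes while already holding that same lock. The single-threaded model below illustrates that hazard only. It is not the patch's locking scheme: every name is hypothetical, and the non-recursive mutex is replaced by a flag that reports `-EDEADLK` instead of hanging.

```c
#include <errno.h>
#include <stdbool.h>

#define N_PINS 4

struct clock_model {
	bool pincfg_locked;	/* stand-in for a non-recursive mutex */
	int pin_func[N_PINS];	/* function assigned to each pin */
};

/* A real mutex would block forever here; the model returns an error. */
static int model_lock(struct clock_model *c)
{
	if (c->pincfg_locked)
		return -EDEADLK;
	c->pincfg_locked = true;
	return 0;
}

static void model_unlock(struct clock_model *c)
{
	c->pincfg_locked = false;
}

/* Driver-facing lookup that takes the lock itself; -1 means not found. */
static int model_find_pin(struct clock_model *c, int func)
{
	int i, pin = -1;

	if (model_lock(c))
		return -EDEADLK;
	for (i = 0; i < N_PINS; i++) {
		if (c->pin_func[i] == func) {
			pin = i;
			break;
		}
	}
	model_unlock(c);
	return pin;
}

/* Hypothetical driver .enable() callback that looks up its pin. */
static int model_driver_enable(struct clock_model *c, int func, int on)
{
	(void)on;
	return model_find_pin(c, func);
}

/* Core path that holds the lock across the callback: lookup fails. */
static int core_enable_lock_held(struct clock_model *c, int func, int on)
{
	int ret;

	if (model_lock(c))
		return -EDEADLK;
	ret = model_driver_enable(c, func, on);
	model_unlock(c);
	return ret;
}

/* Core path that calls the callback without the lock: lookup works. */
static int core_enable_lock_free(struct clock_model *c, int func, int on)
{
	return model_driver_enable(c, func, on);
}
```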