summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-06-29e1000e: Increase pause and refresh timeMiguel Bernal Marin1-2/+2
Suggested-by: Tim Pepper <timothy.c.pepper@linux.intel.com> Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-06-29ice: Use struct_size() helperGustavo A. R. Silva1-2/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = alloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: size = struct_size(instance, entry, count); This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-06-29Merge branch 'net-sched-Add-txtime-assist-support-for-taprio'David S. Miller4-36/+405
Vedang Patel says: ==================== net/sched: Add txtime-assist support for taprio. Changes in v6: - Use _BITUL() instead of BIT() in UAPI for etf. (patch #1) - Fix a bug reported by kbuild test bot in length_to_duration(). (patch #6) - Remove an unused function (get_cycle_start()). (Patch #6) Changes in v5: - Commit message improved for the igb patch (patch #1). - Fixed typo in commit message for etf patch (patch #2). Changes in v4: - Remove inline directive from functions in foo.c. - Fix spacing in pkt_sched.h (for etf patch). Changes in v3: - Simplify implementation for taprio flags. - txtime_delay can only be set if txtime-assist mode is enabled. - txtime_delay and flags will only be visible in tc output if set by user. - Minor changes in error reporting. Changes in v2: - Txtime-offload has now been renamed to txtime-assist mode. - Renamed the offload parameter to flags. - Removed the code which introduced the hardware offloading functionality. Original Cover letter (with above changes included) -------------------------------------------------- Currently, we are seeing packets being transmitted outside their timeslices. We can confirm that the packets are being dequeued at the right time. So, the delay is induced after the packet is dequeued, because taprio, without any offloading, has no control of when a packet is actually transmitted. In order to solve this, we are making use of the txtime feature provided by ETF qdisc. Hardware offloading needs to be supported by the ETF qdisc in order to take advantage of this feature. The taprio qdisc will assign txtime (in skb->tstamp) for all the packets which do not have the txtime allocated via the SO_TXTIME socket option. For the packets which already have SO_TXTIME set, taprio will validate whether the packet will be transmitted in the correct interval. In order to support this, the following parameters have been added: - flags (taprio): This is added in order to support different offloading modes which will be added in the future. - txtime-delay (taprio): This indicates the minimum time it will take for the packet to hit the wire after it reaches taprio_enqueue(). This is useful in determining whether we can transmit the packet in the remaining time if the gate corresponding to the packet is currently open. - skip_skb_check (ETF): ETF currently drops any packet which does not have the SO_TXTIME socket option set. This check can be skipped by specifying this option. Following is an example configuration: tc qdisc replace dev $IFACE parent root handle 100 taprio \\ num_tc 3 \\ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \\ queues 1@0 1@0 1@0 \\ base-time $BASE_TIME \\ sched-entry S 01 300000 \\ sched-entry S 02 300000 \\ sched-entry S 04 400000 \\ flags 0x1 \\ txtime-delay 200000 \\ clockid CLOCK_TAI tc qdisc replace dev $IFACE parent 100:1 etf \\ offload delta 200000 clockid CLOCK_TAI skip_skb_check Here, the "flags" parameter is indicating that the txtime-assist mode is enabled. Also, all the traffic classes have been assigned the same queue. This is to prevent the traffic classes in the lower priority queues from getting starved. Note that this configuration is specific to the i210 ethernet card. Other network cards where the hardware queues are given the same priority, might be able to utilize more than one queue. Following are some of the other highlights of the series: - Fix a bug where hardware timestamping and SO_TXTIME options cannot be used together. (Patch 1) - Introduces the skip_skb_check option. (Patch 2) - Make TxTime assist mode work with TCP packets (Patch 7). The following changes are recommended to be done in order to get the best performance from taprio in this mode: ip link set dev enp1s0 mtu 1514 ethtool -K eth0 gso off ethtool -K eth0 tso off ethtool --set-eee eth0 eee off ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29taprio: Adjust timestamps for TCP packetsVedang Patel1-1/+40
When the taprio qdisc is running in "txtime offload" mode, it will set the launchtime value (in skb->tstamp) for all the packets which do not have the SO_TXTIME socket option. But, the TCP packets already have this value set and it indicates the earliest departure time represented in CLOCK_MONOTONIC clock. We need to respect the timestamp set by the TCP subsystem. So, convert this time to the clock which taprio is using and ensure that the packet is not transmitted before the deadline set by TCP. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29taprio: make clock reference conversions easierVedang Patel1-8/+22
Later in this series we will need to transform from CLOCK_MONOTONIC (used in TCP) to the clock reference used in TAPRIO. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29taprio: Add support for txtime-assist modeVedang Patel2-17/+328
Currently, we are seeing non-critical packets being transmitted outside of their timeslice. We can confirm that the packets are being dequeued at the right time. So, the delay is induced in the hardware side. The most likely reason is the hardware queues are starving the lower priority queues. In order to improve the performance of taprio, we will be making use of the txtime feature provided by the ETF qdisc. For all the packets which do not have the SO_TXTIME option set, taprio will set the transmit timestamp (set in skb->tstamp) in this mode. TAPrio Qdisc will ensure that the transmit time for the packet is set to when the gate is open. If SO_TXTIME is set, the TAPrio qdisc will validate whether the timestamp (in skb->tstamp) occurs when the gate corresponding to skb's traffic class is open. Following two parameters added to support this mode: - flags: used to enable txtime-assist mode. Will also be used to enable other modes (like hardware offloading) later. - txtime-delay: This indicates the minimum time it will take for the packet to hit the wire. This is useful in determining whether we can transmit the packet in the remaining time if the gate corresponding to the packet is currently open. An example configuration for enabling txtime-assist: tc qdisc replace dev eth0 parent root handle 100 taprio \\ num_tc 3 \\ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \\ queues 1@0 1@0 1@0 \\ base-time 1558653424279842568 \\ sched-entry S 01 300000 \\ sched-entry S 02 300000 \\ sched-entry S 04 400000 \\ flags 0x1 \\ txtime-delay 40000 \\ clockid CLOCK_TAI tc qdisc replace dev $IFACE parent 100:1 etf skip_sock_check \\ offload delta 200000 clockid CLOCK_TAI Note that all the traffic classes are mapped to the same queue. This is only possible in taprio when txtime-assist is enabled. Also, note that the ETF Qdisc is enabled with offload mode set. In this mode, if the packet's traffic class is open and the complete packet can be transmitted, taprio will try to transmit the packet immediately. This will be done by setting skb->tstamp to current_time + the time delta indicated in the txtime-delay parameter. This parameter indicates the time taken (in software) for packet to reach the network adapter. If the packet cannot be transmitted in the current interval or if the packet's traffic is not currently transmitting, the skb->tstamp is set to the next available timestamp value. This is tracked in the next_launchtime parameter in the struct sched_entry. The behaviour w.r.t admin and oper schedules is not changed from what is present in software mode. The transmit time is already known in advance. So, we do not need the HR timers to advance the schedule and wakeup the dequeue side of taprio. So, HR timer won't be run when this mode is enabled. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29taprio: Remove inline directiveVedang Patel1-1/+1
Remove inline directive from length_to_duration(). We will let the compiler make the decisions. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29taprio: calculate cycle_time when schedule is installedVedang Patel1-18/+11
cycle time for a particular schedule is calculated only when it is first installed. So, it makes sense to just calculate it once right after the 'cycle_time' parameter has been parsed and store it in cycle_time. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29etf: Add skip_sock_checkVedang Patel2-0/+11
Currently, etf expects a socket with SO_TXTIME option set for each packet it encounters. So, it will drop all other packets. But, in the future commits we are planning to add functionality where tstamp value will be set by another qdisc. Also, some packets which are generated from within the kernel (e.g. ICMP packets) do not have any socket associated with them. So, this commit adds support for skip_sock_check. When this option is set, etf will skip checking for a socket and other associated options for all skbs. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29etf: Don't use BIT() in UAPI headers.Vedang Patel1-2/+2
The BIT() macro isn't exported as part of the UAPI interface. So, the compile-test to ensure they are self contained fails. So, use _BITUL() instead. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29igb: clear out skb->tstamp after reading the txtimeVedang Patel1-0/+1
If a packet which is utilizing the launchtime feature (via SO_TXTIME socket option) also requests the hardware transmit timestamp, the hardware timestamp is not delivered to the userspace. This is because the value in skb->tstamp is mistaken as the software timestamp. Applications, like ptp4l, request a hardware timestamp by setting the SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is detected by the driver (this work is done in igb_ptp_tx_work() which calls igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the ERR_QUEUE for the userspace to read. When the userspace is ready, it will issue a recvmsg() call to collect this timestamp. The problem is in this recvmsg() call. If the skb->tstamp is not cleared out, it will be interpreted as a software timestamp and the hardware tx timestamp will not be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the callee function __sock_recv_timestamp() in net/socket.c for more details. Signed-off-by: Vedang Patel <vedang.patel@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29Merge branch 'mirred-recurse'David S. Miller4-6/+19
John Hurley says: ==================== Track recursive calls in TC act_mirred These patches aim to prevent act_mirred causing stack overflow events from recursively calling packet xmit or receive functions. Such events can occur with poor TC configuration that causes packets to travel in loops within the system. Florian Westphal advises that a recursion crash and packets looping are separate issues and should be treated as such. David Miller futher points out that pcpu counters cannot track the precise skb context required to detect loops. Hence these patches are not aimed at detecting packet loops, rather, preventing stack flows arising from such loops. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29net: sched: protect against stack overflow in TC act_mirredJohn Hurley1-0/+14
TC hooks allow the application of filters and actions to packets at both ingress and egress of the network stack. It is possible, with poor configuration, that this can produce loops whereby an ingress hook calls a mirred egress action that has an egress hook that redirects back to the first ingress etc. The TC core classifier protects against loops when doing reclassifies but there is no protection against a packet looping between multiple hooks and recursively calling act_mirred. This can lead to stack overflow panics. Add a per CPU counter to act_mirred that is incremented for each recursive call of the action function when processing a packet. If a limit is passed then the packet is dropped and CPU counter reset. Note that this patch does not protect against loops in TC datapaths. Its aim is to prevent stack overflow kernel panics that can be a consequence of such loops. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29net: sched: refactor reinsert actionJohn Hurley4-6/+5
The TC_ACT_REINSERT return type was added as an in-kernel only option to allow a packet ingress or egress redirect. This is used to avoid unnecessary skb clones in situations where they are not required. If a TC hook returns this code then the packet is 'reinserted' and no skb consume is carried out as no clone took place. This return type is only used in act_mirred. Rather than have the reinsert called from the main datapath, call it directly in act_mirred. Instead of returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED which tells the caller that the packet has been stolen by another process and that no consume call is required. Moving all redirect calls to the act_mirred code is in preparation for tracking recursion created by act_mirred. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-29ipv4: enable route flushing in network namespacesChristian Brauner1-4/+8
Tools such as vpnc try to flush routes when run inside network namespaces by writing 1 into /proc/sys/net/ipv4/route/flush. This currently does not work because flush is not enabled in non-initial network namespaces. Since routes are per network namespace it is safe to enable /proc/sys/net/ipv4/route/flush in there. Link: https://github.com/lxc/lxd/issues/4257 Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge tag 'batadv-next-for-davem-20190627v2' of ↵David S. Miller40-457/+1035
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - fix includes for _MAX constants, atomic functions and fwdecls, by Sven Eckelmann (3 patches) - shorten multicast tt/tvlv worker spinlock section, by Linus Luessing - routeable multicast preparations: implement MAC multicast filtering, by Linus Luessing (2 patches, David Millers comments integrated) - remove return value checks for debugfs_create, by Greg Kroah-Hartman - add routable multicast optimizations, by Linus Luessing (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'hns3-next'David S. Miller13-91/+184
Huazhong Tan says: ==================== net: hns3: some code optimizations & cleanups & bugfixes [patch 01/12] fixes a TX timeout issue. [patch 02/12 - 04/12] adds some patch related to TM module. [patch 05/12] fixes a compile warning. [patch 06/12] adds Asym Pause support for autoneg [patch 07/12] optimizes the error handler for VF reset. [patch 08/12] deals with the empty interrupt case. [patch 09/12 - 12/12] adds some cleanups & optimizations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: optimize the CSQ cmd error handlingPeng Li2-8/+26
If CMDQ ring is full, hclge_cmd_send may return directly, but IMP still working and HW pointer changed, SW ring pointer do not match the HW pointer. This patch update the SW pointer every time when the space is full, so it can work normally next time if IMP and HW still working. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: remove RXD_VLD check in hns3_handle_bdinfoYunsheng Lin3-12/+0
The HNS3_RXD_VLD_B bit has already been checked in hns3_add_frag or hns3_handle_rx_bd before calling hns3_handle_bdinfo, so when hns3_handle_bdinfo is called, the HNS3_RXD_VLD_B bit is always set, which makes the checking in hns3_handle_bdinfo unnecessary. This patch removes the RXD_VLD_B checking in hns3_handle_bdinfo. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: remove unused linkmode definitionJian Shen1-19/+0
This patch removes unused linkmode definition. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: fix a statistics issue about l3l4 checksum errorYufeng Mo1-1/+1
The frame column is based on rx_crc_errors and rx_frame_errors. So l3l4 checksum error should not be counted by rx_crc_errors. Instead, l3l4 checksum error should be counted in ifconfig error column. Fixes: d3ec4ef66937 ("net: hns3: refactor the statistics updating for netdev") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: handle empty unknown interruptHuazhong Tan1-5/+10
Since some MSI-X interrupt's status may be cleared by hardware, so when the driver receives the interrupt, reading HCLGE_VECTOR0_PF_OTHER_INT_STS_REG register will get an empty unknown interrupt. For this case, the irq handler should enable vector0 interrupt. This patch also use dev_info() instead of dev_dbg() in the hclge_check_event_cause(), since this information will be useful for normal usage. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: re-schedule reset task while VF reset failHuazhong Tan2-8/+23
The VF reset may fail for some probabilistic reasons, such as wait for hardware reset timeout, wait for mailbox response timeout, so this patch tries to re-schedule the reset task when the number of reset failing is under HCLGEVF_RESET_MAX_FAIL_CNT. This patch also add a function hclgevf_reset_err_handle() to handle the reset failing. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: add Asym Pause support to fix autoneg problemYonglong Liu2-0/+8
Local device and link partner config auto-negotiation on both, local device config pause frame use as: rx on/tx off, link partner config pause frame use as: rx off/tx on. We except the result is: Local device: Autonegotiate: on RX: on TX: off RX negotiated: on TX negotiated: off Link partner: Autonegotiate: on RX: off TX: on RX negotiated: off TX negotiated: on But actually, the result of Local device and link partner is both: Autonegotiate: on RX: off TX: off RX negotiated: off TX negotiated: off The root cause is that the supported flag is has only Pause, reference to the function genphy_config_advert(): static int genphy_config_advert(struct phy_device *phydev) { ... linkmode_and(phydev->advertising, phydev->advertising, phydev->supported); ... } The pause frame use of link partner is rx off/tx on, so its advertising only set the bit Asym_Pause, and the supported is only set the bit Pause, so the result of linkmode_and(), is rx off/tx off. This patch adds Asym_Pause to the supported flag to fix it. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: fix a -Wformat-nonliteral compile warningYonglong Liu1-2/+1
When setting -Wformat=2, there is a compiler warning like this: hclge_main.c:xxx:x: warning: format not a string literal and no format arguments [-Wformat-nonliteral] strs[i].desc); ^~~~ This patch adds missing format parameter "%s" to snprintf() to fix it. Fixes: 46a3df9f9718 ("Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: add some error checking in hclge_tm moduleYunsheng Lin1-1/+5
When hdev->tx_sch_mode is HCLGE_FLAG_VNET_BASE_SCH_MODE, the hclge_tm_schd_mode_vnet_base_cfg calls hclge_tm_pri_schd_mode_cfg with vport->vport_id as pri_id, which is used as index for hdev->tm_info.tc_info, it will cause out of bound access issue if vport_id is equal to or larger than HNAE3_MAX_TC. Also hardware only support maximum speed of HCLGE_ETHER_MAX_RATE. So this patch adds two checks for above cases. Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: change SSU's buffer allocation according to UMYunsheng Lin3-5/+64
Currently when there is share buffer in the SSU(storage switching unit), the low waterline for RX private buffer is too low to keep the hardware running. Hardware may have processed all the packet stored in the private buffer of the low waterline before the new packet comes, because hardware only tell the peer send packet again when the private buffer is under the low waterline. So this patch only allocate RX private buffer if there is enough buffer according to hardware user manual. This patch also reserve some buffer for reusing when TC num is less than or equal to 2, and change PAUSE_TRANS_GAP & HCLGE_NON_DCB_ADDITIONAL_BUF according to hardware user manual. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: enable DCB when TC num is one and pfc_en is non-zeroYunsheng Lin3-2/+20
Currently when TC num is one, the DCB will be disabled no matter if pfc_en is non-zero or not. This patch enables the DCB if pfc_en is non-zero, even when TC num is one. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: hns3: fix __QUEUE_STATE_STACK_XOFF not cleared issueHuazhong Tan1-28/+26
When change MTU or other operations, which just calling .reset_notify to do HNAE3_DOWN_CLIENT and HNAE3_UP_CLIENT, then the netdev_tx_reset_queue() in the hns3_clear_all_ring() will be ignored. So the dev_watchdog() may misdiagnose a TX timeout. This patch separates netdev_tx_reset_queue() from hns3_clear_all_ring(), and unifies hns3_clear_all_ring() and hns3_force_clear_all_ring into one, since they are doing similar things. Fixes: 3a30964a2eef ("net: hns3: delay ring buffer clearing during reset") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'Better-PHYLINK-compliance-for-SJA1105-DSA'David S. Miller1-2/+54
Vladimir Oltean says: ==================== Better PHYLINK compliance for SJA1105 DSA After discussing with Russell King, it appears this driver is making a few confusions and not performing some checks for consistent operation. Changes in v2: - Removed redundant print in the phylink_validate callback (in 2/3). ==================== Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: dsa: sja1105: Mark in-band AN modes not supported for PHYLINKVladimir Oltean1-0/+5
We need a better way to signal this, perhaps in phylink_validate, but for now just print this error message as guidance for other people looking at this driver's code while trying to rework PHYLINK. Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: dsa: sja1105: Check for PHY mode mismatches with what PHYLINK reportsVladimir Oltean1-0/+44
PHYLINK being designed with PHYs in mind that can change MII protocol, for correct operation it is necessary to ensure that the PHY interface mode stays the same (otherwise clear the supported bit mask, as required). Because this is just a hypothetical situation for now, we don't bother to check whether we could actually support the new PHY interface mode. Actually we could modify the xMII table, reset the switch and send an updated static configuration, but adding that would just be dead code. Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: dsa: sja1105: Don't check state->link in phylink_mac_configVladimir Oltean1-4/+7
It has been pointed out that PHYLINK can call mac_config only to update the phy_interface_type and without knowing what the AN results are. Experimentally, when this was observed to happen, state->link was also unset, and therefore was used as a proxy to ignore this call. However it is also suggested that state->link is undefined for this callback and should not be relied upon. So let the previously-dead codepath for SPEED_UNKNOWN be called, and update the comment to make sure the MAC's behavior is sane. Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28hinic: reduce rss_init stack usageArnd Bergmann1-7/+13
On 32-bit architectures, putting an array of 256 u32 values on the stack uses more space than the warning limit: drivers/net/ethernet/huawei/hinic/hinic_main.c: In function 'hinic_rss_init': drivers/net/ethernet/huawei/hinic/hinic_main.c:286:1: error: the frame size of 1068 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] I considered changing the code to use u8 values here, since that's all the hardware supports, but dynamically allocating the array is a more isolated fix here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'stmmac-10GbE-using-XGMAC'David S. Miller7-56/+176
Jose Abreu says: ==================== net: stmmac: 10GbE using XGMAC Support for 10Gb Link using XGMAC core plus some performance tweaks. Tested in a PCI based setup. iperf3 TCP results: TSO ON, MTU=1500, TX Queues = 1, RX Queues = 1, Flow Control ON Pinned CPU (-A), Zero-Copy (-Z) [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-600.00 sec 643 GBytes 9.21 Gbits/sec 1 sender [ 5] 0.00-600.00 sec 643 GBytes 9.21 Gbits/sec receiver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Update Kconfig entryJose Abreu1-1/+1
We support more speeds now. Update the Kconfig entry. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Only disable interrupts if NAPI is scheduledJose Abreu1-5/+5
Only disable the interrupts if RX NAPI gets to be scheduled. Also, schedule the TX NAPI only when the interrupts are disabled. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Update RX Tail Pointer to last free entryJose Abreu1-0/+2
Update the RX Tail Pointer to the last available SKB entry. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Enable support for > 32 Bits addressing in XGMACJose Abreu5-15/+66
Currently, stmmac only supports 32 bits addressing for SKB. Enable the support for upto 48 bits addressing in XGMAC core. This avoids the use of bounce buffers and increases performance. Changes from v1: - Fallback to 32 bits in failure (Andrew) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Do not disable interrupts when cleaning TXJose Abreu1-5/+3
This is a performance killer and anyways the interrupts are being disabled by RX NAPI so no need to disable them again. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Add the missing speeds that XGMAC supportsJose Abreu4-23/+86
XGMAC supports following speeds: - 10G XGMII - 5G XGMII - 2.5G XGMII - 2.5G GMII - 1G GMII - 100M MII - 10M MII Add them to the stmmac driver. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: dwxgmac: Fix the undefined burst settingJose Abreu1-3/+3
Undefined burst shall only be set if pdata asks to. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Decrease default RX Watchdog valueJose Abreu2-3/+3
For performance reasons decrease the default RX Watchdog value for the minimum allowed. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Do not try to enable PHY EEE if MAC does not support itJose Abreu1-1/+1
Do not enable EEE feature in the PHY if MAC does not support it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: dwxgmac: Enable EDMA by defaultJose Abreu2-0/+6
Enable the EDMA feature by default which gives higher performance. Changes from v1: - Do not use magic values (David) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Fix case when PHY handle is not presentJose Abreu1-2/+6
Some DT bindings do not have the PHY handle. Let's fallback to manually discovery in case phylink_of_phy_connect() fails. Changes from v1: - Fixup comment style (Sergei) Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic") Reported-by: Katsuhiro Suzuki <katsuhiro@katsuster.net> Tested-by: Katsuhiro Suzuki <katsuhiro@katsuster.net> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28sis900: remove TxIDLESergej Benilov1-12/+12
Before "sis900: fix TX completion" patch, TX completion was done on TxIDLE interrupt. TX completion also was the only thing done on TxIDLE interrupt. Since "sis900: fix TX completion", TX completion is done on TxDESC interrupt. So it is not necessary any more to set and to check for TxIDLE. Eliminate TxIDLE from sis900. Correct some typos, too. Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28tipc: add dst_cache support for udp mediaXin Long1-25/+47
As other udp/ip tunnels do, tipc udp media should also have a lockless dst_cache supported on its tx path. Here we add dst_cache into udp_replicast to support dst cache for both rmcast and rcast, and rmcast uses ub->rcast and each rcast uses its own node in ub->rcast.list. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller101-276/+719
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'nfp-extend-flower-capabilities-for-GRE-tunnel-offload'David S. Miller4-89/+263
Jakub Kicinski says: ==================== nfp: extend flower capabilities for GRE tunnel offload Pieter says: This set extends the flower match and action components to offload GRE decapsulation with classification and encapsulation actions. The first 3 patches are refactor and cleanup patches for improving readability and reusability. Patch 4 and 5 implement GRE decap and encap functionality respectively. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>