summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-05-02net: enetc: add hw tc hw offload features for PSPF capabilityPo Liu4-0/+96
This patch is to let ethtool enable/disable the tc flower offload features. Hardware ENETC has the feature of PSFP which is for per-stream policing. When enable the tc hw offloading feature, driver would enable the IEEE 802.1Qci feature. It is only set the register enable bit for this feature not enable for any entry of per stream filtering and stream gate or stream identify but get how much capabilities for each feature. Signed-off-by: Po Liu <Po.Liu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: schedule: add action gate offloadingPo Liu3-0/+142
Add the gate action to the flow action entry. Add the gate parameters to the tc_setup_flow_action() queueing to the entries of flow_action_entry array provide to the driver. Signed-off-by: Po Liu <Po.Liu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: qos: introduce a gate control flow actionPo Liu6-0/+744
Introduce a ingress frame gate control flow action. Tc gate action does the work like this: Assume there is a gate allow specified ingress frames can be passed at specific time slot, and be dropped at specific time slot. Tc filter chooses the ingress frames, and tc gate action would specify what slot does these frames can be passed to device and what time slot would be dropped. Tc gate action would provide an entry list to tell how much time gate keep open and how much time gate keep state close. Gate action also assign a start time to tell when the entry list start. Then driver would repeat the gate entry list cyclically. For the software simulation, gate action requires the user assign a time clock type. Below is the setting example in user space. Tc filter a stream source ip address is 192.168.0.20 and gate action own two time slots. One is last 200ms gate open let frame pass another is last 100ms gate close let frames dropped. When the ingress frames have reach total frames over 8000000 bytes, the excessive frames will be dropped in that 200000000ns time slot. > tc qdisc add dev eth0 ingress > tc filter add dev eth0 parent ffff: protocol ip \ flower src_ip 192.168.0.20 \ action gate index 2 clockid CLOCK_TAI \ sched-entry open 200000000 -1 8000000 \ sched-entry close 100000000 -1 -1 > tc chain del dev eth0 ingress chain 0 "sched-entry" follow the name taprio style. Gate state is "open"/"close". Follow with period nanosecond. Then next item is internal priority value means which ingress queue should put. "-1" means wildcard. The last value optional specifies the maximum number of MSDU octets that are permitted to pass the gate during the specified time interval. Base-time is not set will be 0 as default, as result start time would be ((N + 1) * cycletime) which is the minimal of future time. Below example shows filtering a stream with destination mac address is 10:00:80:00:00:00 and ip type is ICMP, follow the action gate. The gate action would run with one close time slot which means always keep close. The time cycle is total 200000000ns. The base-time would calculate by: 1357000000000 + (N + 1) * cycletime When the total value is the future time, it will be the start time. The cycletime here would be 200000000ns for this case. > tc filter add dev eth0 parent ffff: protocol ip \ flower skip_hw ip_proto icmp dst_mac 10:00:80:00:00:00 \ action gate index 12 base-time 1357000000000 \ sched-entry close 200000000 -1 -1 \ clockid CLOCK_TAI Signed-off-by: Po Liu <Po.Liu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: bcmgenet: Move wake-up event out of side band ISRDoug Berger3-13/+67
The side band interrupt service routine is not available on chips like 7211, or rather, it does not permit the signaling of wake-up events due to the complex interrupt hierarchy. Move the wake-up event accounting into a .resume_noirq function, account for possible wake-up events and clear the MPD/HFB interrupts from there, while leaving the hardware untouched until the resume function proceeds with doing its usual business. Because bcmgenet_wol_power_down_cfg() now enables the MPD and HFB interrupts, it is invoked by a .suspend_noirq function to prevent the servicing of interrupts after the clocks have been disabled. Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02Merge branch 'net-ipa-dont-cache-channel-state'David S. Miller2-38/+59
Alex Elder says: ==================== net: ipa: don't cache channel state This series removes a field that holds a copy of a channel's state at the time it was last fetched. In principle the state can change at any time, so it's better to just fetch it whenever needed. The first patch is just preparatory, simplifying the arguments to gsi_channel_state(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: ipa: do not cache channel stateAlex Elder2-35/+55
It is possible for a GSI channel's state to be changed as a result of an action by a different execution environment. Specifically, the modem is able to issue a GSI generic command that causes a state change on a GSI channel associated with the AP. A channel's state only needs to be known when a channel is allocated or deallocaed, started or stopped, or reset. So there is little value in caching the state anyway. Stop recording a copy of the channel's last known state, and instead fetch the true state from hardware whenever it's needed. In such cases, *do* record the state in a local variable, in case an error message reports it (so the value reported is the value seen). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: ipa: pass channel pointer to gsi_channel_state()Alex Elder1-5/+6
Pass a channel pointer rather than a GSI pointer and channel ID to gsi_channel_state(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02drop_monitor: work around gcc-10 stringop-overflow warningArnd Bergmann1-4/+7
The current gcc-10 snapshot produces a false-positive warning: net/core/drop_monitor.c: In function 'trace_drop_common.constprop': cc1: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] In file included from net/core/drop_monitor.c:23: include/uapi/linux/net_dropmon.h:36:8: note: at offset 0 to object 'entries' with size 4 declared here 36 | __u32 entries; | ^~~~~~~ I reported this in the gcc bugzilla, but in case it does not get fixed in the release, work around it by using a temporary variable. Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02Merge branch 'net-dsa-mv88e6xxx-augment-phylink-support-for-10G'David S. Miller2-13/+49
Russell King says: ==================== net: dsa: mv88e6xxx: augment phylink support for 10G This series adds phylink 10G support for the 88E6390 series switches, as suggested by Andrew Lunn. The first patch cleans up the code to use generic definitions for the registers in a similar way to what was done with the initial conversion of 1G serdes support. The second patch adds the necessary bits 10GBASE mode to the pcs_get_state() method. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: dsa: mv88e6xxx: 88e6390 10G serdes supportRussell King2-2/+42
Add support for reading and reporting the 10G link status on the 88e6390 in addition to the 1000BASE-X/2500BASE-X/SGMII status. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: dsa: mv88e6xxx: use generic clause 45 definitionsRussell King2-11/+7
The private MV88E6390_PCS_CONTROL_1 definitions in serdes.h reflects the IEEE 802.3 standard PCS control register 1 definitions, only offset by 0x1000 in the PHYXS register space. Rather than inventing our own, use those that already exist, and name the register MV88E6390_10G_CTRL1. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02Merge branch 'net-atlantic-A2-support'David S. Miller23-121/+2798
Igor Russkikh says: ==================== net: atlantic: A2 support This patchset adds support for the new generation of Atlantic NICs. Chip generations are mostly compatible register-wise, but there are still some differences. Therefore we've made some of first generation (A1) code non-static to re-use it where possible. Some pieces are A2 specific, in which case we redefine/extend such APIs. v2: * removed #pragma pack (2 structures require the packed attribute); * use defines instead of magic numbers where possible; v1: https://patchwork.ozlabs.org/cover/1276220/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: A2 ingress / egress hw configurationIgor Russkikh7-105/+172
Chip generations are mostly compatible register-wise, but there are still some differences. Therefore we've made some of first generation (A1) code non-static to re-use it where possible. Some pieces are A2 specific, in which case we redefine/extend such APIs. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: basic A2 init/deinit hw_opsIgor Russkikh7-79/+362
This patch adds basic A2 HW initialization / deinitialization. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: common functions needed for basic A2 init/deinit hw_opsDmitry Bogdanov6-2/+163
This patch adds common functions (mostly FW-related), which are needed for basic A2 HW initialization / deinitialization. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Co-developed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: HW bindings for basic A2 init/deinit hw_opsDmitry Bogdanov3-0/+207
This patch adds A2 register definitions for basic A2 HW initialization / deinitialization. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Co-developed-by: Egor Pomozov <epomozov@marvell.com> Signed-off-by: Egor Pomozov <epomozov@marvell.com> Co-developed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Co-developed-by: Nikita Danilov <ndanilov@marvell.com> Signed-off-by: Nikita Danilov <ndanilov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: add A2 RPF hw_opsIgor Russkikh3-0/+259
This patch adds RPF-related hw_ops, which are needed for basic functionality. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: HW bindings for A2 RFPIgor Russkikh5-0/+284
RPF is one of the modules which has been significantly changed/extended on A2. This patch adds the necessary A2 register definitions for RPF, which are used in follow-up patches. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: A2 hw_ops skeletonIgor Russkikh6-7/+294
This patch adds basic hw_ops layout for A2. Actual implementation will be added in the follow-up patches. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: minimal A2 fw_opsDmitry Bogdanov4-0/+352
This patch adds the minimum set of FW ops for A2. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Co-developed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: minimal A2 HW bindings required for fw_opsDmitry Bogdanov5-0/+137
This patch adds the bare minimum of A2 HW bindings required to get fw_ops working. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: A2 driver-firmware interfaceDmitry Bogdanov1-0/+593
This patch adds the driver<->firmware interface for A2 Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: move IS_CHIP_FEATURE to aq_hw.hMark Starovoytov5-32/+36
IS_CHIP feature will be used to differentiate between A1 and A2, where necessary. Thus, move it to aq_hw.h, rename it and make it accept the 'hw' pointer. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: make hw_get_regs optionalMark Starovoytov1-0/+6
This patch fixes potential crash in case if hw_get_regs is NULL. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: simplify hw_get_fw_version() usageNikita Danilov4-12/+6
hw_get_fw_version() never fails, so this patch simplifies its usage by utilizing return value instead of output argument. Signed-off-by: Nikita Danilov <ndanilov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: add hw_soft_reset, hw_prepare to hw_opsMark Starovoytov5-5/+24
A2 will have a different implementation of these 2 APIs, so this patch moves them to hw_ops in preparation for A2. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Co-developed-by: Dmitry Bezrukov <dbezrukov@marvell.com> Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: add defines for 10M and EEE 100M link modeIgor Russkikh3-10/+27
This patch adds defines for 10M and EEE 100M link modes, which are supported by A2. 10M support is added in this patch series. EEE is out of scope, but will be added in a follow-up series. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: add A2 device IDsIgor Russkikh1-0/+7
Adding device ids for the new generation of atlantic nic. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: atlantic: update company name in the driver descriptionIgor Russkikh2-3/+3
Aquantia is now part of Marvell. Thus, update the driver description. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02gtp: set NLM_F_MULTI flag in gtp_genl_dump_pdp()Yoshiyuki Kurauchi1-4/+5
In drivers/net/gtp.c, gtp_genl_dump_pdp() should set NLM_F_MULTI flag since it returns multipart message. This patch adds a new arg "flags" in gtp_genl_fill_info() so that flags can be set by the callers. Signed-off-by: Yoshiyuki Kurauchi <ahochauwaaaaa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02cxgb4: Add missing annotation for service_ofldq()Jules Irenge1-0/+1
Sparse reports a warning at service_ofldq() warning: context imbalance in service_ofldq() - unexpected unlock The root cause is the missing annotation at service_ofldq() Add the missing __must_hold(&q->sendq.lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02ice: cleanup language in ice.rst for fw.appJacob Keller1-2/+2
The documentation for the ice driver around "fw.app" has a spelling mistake in variation. Additionally, the language of "shall have a unique name" sounds like a requirement. Reword this to read more like a description or property. Reported-by: Benjamin Fisher <benjamin.l.fisher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kubakici@wp.pl> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: Make PTP-specific drivers depend on PTP_1588_CLOCKClay McClure5-5/+5
Commit d1cbfd771ce8 ("ptp_clock: Allow for it to be optional") changed all PTP-capable Ethernet drivers from `select PTP_1588_CLOCK` to `imply PTP_1588_CLOCK`, "in order to break the hard dependency between the PTP clock subsystem and ethernet drivers capable of being clock providers." As a result it is possible to build PTP-capable Ethernet drivers without the PTP subsystem by deselecting PTP_1588_CLOCK. Drivers are required to handle the missing dependency gracefully. Some PTP-capable Ethernet drivers (e.g., TI_CPSW) factor their PTP code out into separate drivers (e.g., TI_CPTS_MOD). The above commit also changed these PTP-specific drivers to `imply PTP_1588_CLOCK`, making it possible to build them without the PTP subsystem. But as Grygorii Strashko noted in [1]: On Wed, Apr 22, 2020 at 02:16:11PM +0300, Grygorii Strashko wrote: > Another question is that CPTS completely nonfunctional in this case and > it was never expected that somebody will even try to use/run such > configuration (except for random build purposes). In my view, enabling a PTP-specific driver without the PTP subsystem is a configuration error made possible by the above commit. Kconfig should not allow users to create a configuration with missing dependencies that results in "completely nonfunctional" drivers. I audited all network drivers that call ptp_clock_register() but merely `imply PTP_1588_CLOCK` and found five PTP-specific drivers that are likely nonfunctional without PTP_1588_CLOCK: NET_DSA_MV88E6XXX_PTP NET_DSA_SJA1105_PTP MACB_USE_HWSTAMP CAVIUM_PTP TI_CPTS_MOD Note how these symbols all reference PTP or timestamping in their name; this is a clue that they depend on PTP_1588_CLOCK. Change them from `imply PTP_1588_CLOCK` [2] to `depends on PTP_1588_CLOCK`. I'm not using `select PTP_1588_CLOCK` here because PTP_1588_CLOCK has its own dependencies, which `select` would not transitively apply. Additionally, remove the `select NET_PTP_CLASSIFY` from CPTS_TI_MOD; PTP_1588_CLOCK already selects that. [1]: https://lore.kernel.org/lkml/c04458ed-29ee-1797-3a11-7f3f560553e6@ti.com/ [2]: NET_DSA_SJA1105_PTP had never declared any type of dependency on PTP_1588_CLOCK (`imply` or otherwise); adding a `depends on PTP_1588_CLOCK` here seems appropriate. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: d1cbfd771ce8 ("ptp_clock: Allow for it to be optional") Signed-off-by: Clay McClure <clay@daemons.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02drivers: net: davinci_mdio: fix potential NULL dereference in ↵Wei Yongjun1-0/+2
davinci_mdio_probe() platform_get_resource() may fail and return NULL, so we should better check it's return value to avoid a NULL pointer dereference since devm_ioremap() does not check input parameters for null. This is detected by Coccinelle semantic patch. @@ expression pdev, res, n, t, e, e1, e2; @@ res = \(platform_get_resource\|platform_get_resource_byname\)(pdev, t, n); + if (!res) + return -EINVAL; ... when != res == NULL e = devm_ioremap(e1, res->start, e2); Fixes: 03f66f067560 ("net: ethernet: ti: davinci_mdio: use devm_ioremap()") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02devlink: fix return value after hitting end in region readJakub Kicinski1-0/+5
Commit d5b90e99e1d5 ("devlink: report 0 after hitting end in region read") fixed region dump, but region read still returns a spurious error: $ devlink region read netdevsim/netdevsim1/dummy snapshot 0 addr 0 len 128 0000000000000000 a6 f4 c4 1c 21 35 95 a6 9d 34 c3 5b 87 5b 35 79 0000000000000010 f3 a0 d7 ee 4f 2f 82 7f c6 dd c4 f6 a5 c3 1b ae 0000000000000020 a4 fd c8 62 07 59 48 03 70 3b c7 09 86 88 7f 68 0000000000000030 6f 45 5d 6d 7d 0e 16 38 a9 d0 7a 4b 1e 1e 2e a6 0000000000000040 e6 1d ae 06 d6 18 00 85 ca 62 e8 7e 11 7e f6 0f 0000000000000050 79 7e f7 0f f3 94 68 bd e6 40 22 85 b6 be 6f b1 0000000000000060 af db ef 5e 34 f0 98 4b 62 9a e3 1b 8b 93 fc 17 devlink answers: Invalid argument 0000000000000070 61 e8 11 11 66 10 a5 f7 b1 ea 8d 40 60 53 ed 12 This is a minimal fix, I'll follow up with a restructuring so we don't have two checks for the same condition. Fixes: fdd41ec21e15 ("devlink: Return right error code in case of errors for region read") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02hv_netvsc: Fix netvsc_start_xmit's return typeNathan Chancellor1-1/+2
netvsc_start_xmit is used as a callback function for the ndo_start_xmit function pointer. ndo_start_xmit's return type is netdev_tx_t but netvsc_start_xmit's return type is int. This causes a failure with Control Flow Integrity (CFI), which requires function pointer prototypes and callback function definitions to match exactly. When CFI is in enforcing, the kernel panics. When booting a CFI kernel with WSL 2, the VM is immediately terminated because of this. The splat when CONFIG_CFI_PERMISSIVE is used: [ 5.916765] CFI failure (target: netvsc_start_xmit+0x0/0x10): [ 5.916771] WARNING: CPU: 8 PID: 0 at kernel/cfi.c:29 __cfi_check_fail+0x2e/0x40 [ 5.916772] Modules linked in: [ 5.916774] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 5.7.0-rc3-next-20200424-microsoft-cbl-00001-ged4eb37d2c69-dirty #1 [ 5.916776] RIP: 0010:__cfi_check_fail+0x2e/0x40 [ 5.916777] Code: 48 c7 c7 70 98 63 a9 48 c7 c6 11 db 47 a9 e8 69 55 59 00 85 c0 75 02 5b c3 48 c7 c7 73 c6 43 a9 48 89 de 31 c0 e8 12 2d f0 ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 00 cc cc 00 00 85 f6 74 25 [ 5.916778] RSP: 0018:ffffa803c0260b78 EFLAGS: 00010246 [ 5.916779] RAX: 712a1af25779e900 RBX: ffffffffa8cf7950 RCX: ffffffffa962cf08 [ 5.916779] RDX: ffffffffa9c36b60 RSI: 0000000000000082 RDI: ffffffffa9c36b5c [ 5.916780] RBP: ffff8ffc4779c2c0 R08: 0000000000000001 R09: ffffffffa9c3c300 [ 5.916781] R10: 0000000000000151 R11: ffffffffa9c36b60 R12: ffff8ffe39084000 [ 5.916782] R13: ffffffffa8cf7950 R14: ffffffffa8d12cb0 R15: ffff8ffe39320140 [ 5.916784] FS: 0000000000000000(0000) GS:ffff8ffe3bc00000(0000) knlGS:0000000000000000 [ 5.916785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.916786] CR2: 00007ffef5749408 CR3: 00000002f4f5e000 CR4: 0000000000340ea0 [ 5.916787] Call Trace: [ 5.916788] <IRQ> [ 5.916790] __cfi_check+0x3ab58/0x450e0 [ 5.916793] ? dev_hard_start_xmit+0x11f/0x160 [ 5.916795] ? sch_direct_xmit+0xf2/0x230 [ 5.916796] ? __dev_queue_xmit.llvm.11471227737707190958+0x69d/0x8e0 [ 5.916797] ? neigh_resolve_output+0xdf/0x220 [ 5.916799] ? neigh_connected_output.cfi_jt+0x8/0x8 [ 5.916801] ? ip6_finish_output2+0x398/0x4c0 [ 5.916803] ? nf_nat_ipv6_out+0x10/0xa0 [ 5.916804] ? nf_hook_slow+0x84/0x100 [ 5.916807] ? ip6_input_finish+0x8/0x8 [ 5.916807] ? ip6_output+0x6f/0x110 [ 5.916808] ? __ip6_local_out.cfi_jt+0x8/0x8 [ 5.916810] ? mld_sendpack+0x28e/0x330 [ 5.916811] ? ip_rt_bug+0x8/0x8 [ 5.916813] ? mld_ifc_timer_expire+0x2db/0x400 [ 5.916814] ? neigh_proxy_process+0x8/0x8 [ 5.916816] ? call_timer_fn+0x3d/0xd0 [ 5.916817] ? __run_timers+0x2a9/0x300 [ 5.916819] ? rcu_core_si+0x8/0x8 [ 5.916820] ? run_timer_softirq+0x14/0x30 [ 5.916821] ? __do_softirq+0x154/0x262 [ 5.916822] ? native_x2apic_icr_write+0x8/0x8 [ 5.916824] ? irq_exit+0xba/0xc0 [ 5.916825] ? hv_stimer0_vector_handler+0x99/0xe0 [ 5.916826] ? hv_stimer0_callback_vector+0xf/0x20 [ 5.916826] </IRQ> [ 5.916828] ? hv_stimer_global_cleanup.cfi_jt+0x8/0x8 [ 5.916829] ? raw_setsockopt+0x8/0x8 [ 5.916830] ? default_idle+0xe/0x10 [ 5.916832] ? do_idle.llvm.10446269078108580492+0xb7/0x130 [ 5.916833] ? raw_setsockopt+0x8/0x8 [ 5.916833] ? cpu_startup_entry+0x15/0x20 [ 5.916835] ? cpu_hotplug_enable.cfi_jt+0x8/0x8 [ 5.916836] ? start_secondary+0x188/0x190 [ 5.916837] ? secondary_startup_64+0xa5/0xb0 [ 5.916838] ---[ end trace f2683fa869597ba5 ]--- Avoid this by using the right return type for netvsc_start_xmit. Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver") Link: https://github.com/ClangBuiltLinux/linux/issues/1009 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02Merge branch 'WoL-fixes-for-DP83822-and-DP83tc811'David S. Miller2-25/+26
Dan Murphy says: ==================== WoL fixes for DP83822 and DP83tc811 The WoL feature for each device was enabled during boot or when the PHY was brought up which may be undesired. These patches disable the WoL in the config_init. The disabling and enabling of the WoL is now done though the set_wol call. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: phy: DP83TC811: Fix WoL in config init to be disabledDan Murphy1-9/+12
The WoL feature should be disabled when config_init is called and the feature should turned on or off when set_wol is called. In addition updated the calls to modify the registers to use the set_bit and clear_bit function calls. Fixes: 6d749428788b ("net: phy: DP83TC811: Introduce support for the DP83TC811 phy") Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: phy: DP83822: Fix WoL in config init to be disabledDan Murphy1-16/+14
The WoL feature should be disabled when config_init is called and the feature should turned on or off when set_wol is called. In addition updated the calls to modify the registers to use the set_bit and clear_bit function calls. Fixes: 3b427751a9d0 ("net: phy: DP83822 initial driver submission") Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: fix skb_panic to output real addressJesper Dangaard Brouer1-1/+1
In skb_panic() the real pointer values are really needed to diagnose issues, e.g. data and head are related (to calculate headroom). The hashed versions of the addresses doesn't make much sense here. The patch use the printk specifier %px to print the actual address. The printk documentation on %px: https://www.kernel.org/doc/html/latest/core-api/printk-formats.html#unmodified-addresses Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02net: ethernet: stmmac: simplify phy modes management for stm32Christophe Roullier1-30/+44
No new feature, just to simplify stm32 part to be easier to use. Add by default all Ethernet clocks in DT, and activate or not in function of phy mode, clock frequency, if property "st,ext-phyclk" is set or not. Keep backward compatibility -------------------------------------------------------------------------- |PHY_MODE | Normal | PHY wo crystal| PHY wo crystal | No 125Mhz | | | | 25MHz | 50MHz | from PHY | -------------------------------------------------------------------------- | MII | - | eth-ck | n/a | n/a | | | | st,ext-phyclk | | | -------------------------------------------------------------------------- | GMII | - | eth-ck | n/a | n/a | | | | st,ext-phyclk | | | -------------------------------------------------------------------------- | RGMII | - | eth-ck | n/a | eth-ck | | | | st,ext-phyclk | |st,eth-clk-sel| | | | | | or | | | | | | st,ext-phyclk| ----------------==-------------------------------------------------------- | RMII | - | eth-ck | eth-ck | n/a | | | | st,ext-phyclk | st,eth-ref-clk-sel | | | | | | or st,ext-phyclk | | -------------------------------------------------------------------------- Signed-off-by: Christophe Roullier <christophe.roullier@st.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-02bpf: Fix use-after-free of bpf_link when priming half-failsAndrii Nakryiko1-6/+7
If bpf_link_prime() succeeds to allocate new anon file, but then fails to allocate ID for it, link priming is considered to be failed and user is supposed ot be able to directly kfree() bpf_link, because it was never exposed to user-space. But at that point file already keeps a pointer to bpf_link and will eventually call bpf_link_release(), so if bpf_link was kfree()'d by caller, that would lead to use-after-free. Fix this by first allocating ID and only then allocating file. Adding ID to link_idr is ok, because link at that point still doesn't have its ID set, so no user-space process can create a new FD for it. Fixes: a3b80e107894 ("bpf: Allocate ID for bpf_link") Reported-by: syzbot+39b64425f91b5aab714d@syzkaller.appspotmail.com Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200501185622.3088964-1-andriin@fb.com
2020-05-02net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAXCambda Zhu2-2/+3
This patch changes the behavior of TCP_LINGER2 about its limit. The sysctl_tcp_fin_timeout used to be the limit of TCP_LINGER2 but now it's only the default value. A new macro named TCP_FIN_TIMEOUT_MAX is added as the limit of TCP_LINGER2, which is 2 minutes. Since TCP_LINGER2 used sysctl_tcp_fin_timeout as the default value and the limit in the past, the system administrator cannot set the default value for most of sockets and let some sockets have a greater timeout. It might be a mistake that let the sysctl to be the limit of the TCP_LINGER2. Maybe we can add a new sysctl to set the max of TCP_LINGER2, but FIN-WAIT-2 timeout is usually no need to be too long and 2 minutes are legal considering TCP specs. Changes in v3: - Remove the new socket option and change the TCP_LINGER2 behavior so that the timeout can be set to value between sysctl_tcp_fin_timeout and 2 minutes. Changes in v2: - Add int overflow check for the new socket option. Changes in v1: - Add a new socket option to set timeout greater than sysctl_tcp_fin_timeout. Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01Merge branch 'r8169-improve-user-message-handling'David S. Miller1-76/+36
Heiner Kallweit says: ==================== r8169: improve user message handling Series improves few aspects of handling messages to users. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01r8169: switch from netif_xxx message functions to netdev_xxxHeiner Kallweit1-46/+22
Considering the few messages we have in the driver, there's not really a benefit in being able to control them on a message type level. Therefore simplify the code and switch to the netdev_xxx message functions. In addition add net_ratelimit() to messages that can be printed from a hot path. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01r8169: remove "out of memory" error message from rtl_request_firmwareHeiner Kallweit1-3/+1
When preparing an unrelated change, checkpatch complained about this redundant out-of-memory message. Therefore remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01r8169: simplify counter handlingHeiner Kallweit1-25/+13
The counter handling functions can only fail if rtl8169_do_counters() times out. In the poll function we emit an error message in case of timeout, therefore we don't have to propagate the timeout all the way up just to print another message basically saying the same. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01r8169: remove redundant driver message when entering promiscuous modeHeiner Kallweit1-2/+0
Net core - __dev_set_promiscuity - prints a message already when promiscuous mode in entered/left, therefore we don't have to do this in the driver too. Also the driver message would be misleading (would be because "link" message level is disabled per default) because it would print "promisc mode enabled" even if it's being left. Reason is that __dev_change_flags() calls dev_set_rx_mode() before touching the promisc flag. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01ipv6: Use global sernum for dst validation with nexthop objectsDavid Ahern3-0/+36
Nik reported a bug with pcpu dst cache when nexthop objects are used illustrated by the following: $ ip netns add foo $ ip -netns foo li set lo up $ ip -netns foo addr add 2001:db8:11::1/128 dev lo $ ip netns exec foo sysctl net.ipv6.conf.all.forwarding=1 $ ip li add veth1 type veth peer name veth2 $ ip li set veth1 up $ ip addr add 2001:db8:10::1/64 dev veth1 $ ip li set dev veth2 netns foo $ ip -netns foo li set veth2 up $ ip -netns foo addr add 2001:db8:10::2/64 dev veth2 $ ip -6 nexthop add id 100 via 2001:db8:10::2 dev veth1 $ ip -6 route add 2001:db8:11::1/128 nhid 100 Create a pcpu entry on cpu 0: $ taskset -a -c 0 ip -6 route get 2001:db8:11::1 Re-add the route entry: $ ip -6 ro del 2001:db8:11::1 $ ip -6 route add 2001:db8:11::1/128 nhid 100 Route get on cpu 0 returns the stale pcpu: $ taskset -a -c 0 ip -6 route get 2001:db8:11::1 RTNETLINK answers: Network is unreachable While cpu 1 works: $ taskset -a -c 1 ip -6 route get 2001:db8:11::1 2001:db8:11::1 from :: via 2001:db8:10::2 dev veth1 src 2001:db8:10::1 metric 1024 pref medium Conversion of FIB entries to work with external nexthop objects missed an important difference between IPv4 and IPv6 - how dst entries are invalidated when the FIB changes. IPv4 has a per-network namespace generation id (rt_genid) that is bumped on changes to the FIB. Checking if a dst_entry is still valid means comparing rt_genid in the rtable to the current value of rt_genid for the namespace. IPv6 also has a per network namespace counter, fib6_sernum, but the count is saved per fib6_node. With the per-node counter only dst_entries based on fib entries under the node are invalidated when changes are made to the routes - limiting the scope of invalidations. IPv6 uses a reference in the rt6_info, 'from', to track the corresponding fib entry used to create the dst_entry. When validating a dst_entry, the 'from' is used to backtrack to the fib6_node and check the sernum of it to the cookie passed to the dst_check operation. With the inline format (nexthop definition inline with the fib6_info), dst_entries cached in the fib6_nh have a 1:1 correlation between fib entries, nexthop data and dst_entries. With external nexthops, IPv6 looks more like IPv4 which means multiple fib entries across disparate fib6_nodes can all reference the same fib6_nh. That means validation of dst_entries based on external nexthops needs to use the IPv4 format - the per-network namespace counter. Add sernum to rt6_info and set it when creating a pcpu dst entry. Update rt6_get_cookie to return sernum if it is set and update dst_check for IPv6 to look for sernum set and based the check on it if so. Finally, rt6_get_pcpu_route needs to validate the cached entry before returning a pcpu entry (similar to the rt_cache_valid calls in __mkroute_input and __mkroute_output for IPv4). This problem only affects routes using the new, external nexthops. Thanks to the kbuild test robot for catching the IS_ENABLED needed around rt_genid_ipv6 before I sent this out. Fixes: 5b98324ebe29 ("ipv6: Allow routes to use nexthop objects") Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@kernel.org> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-01bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addrStanislav Fomichev5-27/+166
Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program. Let's generalize them and make them available for 'struct bpf_sock_addr'. That way, in the future, we can allow those helpers in more places. As an example, let's expose those 'struct bpf_sock_addr' based helpers to BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the connection is made. v3: * Expose custom helpers for bpf_sock_addr context instead of doing generic bpf_sock argument (as suggested by Daniel). Even with try_socket_lock that doesn't sleep we have a problem where context sk is already locked and socket lock is non-nestable. v2: * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/ Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200430233152.199403-1-sdf@google.com