summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-08-01virtio_net: fix truesize for mergeable buffersMichael S. Tsirkin1-3/+2
Seth Forshee noticed a performance degradation with some workloads. This turns out to be due to packet drops. Euan Kemp noticed that this is because we drop all packets where length exceeds the truesize, but for some packets we add in extra memory without updating the truesize. This in turn was kept around unchanged from ab7db91705e95 ("virtio-net: auto-tune mergeable rx buffer size for improved performance"). That commit had an internal reason not to account for the extra space: not enough bits to do it. No longer true so let's account for the allocated length exactly. Many thanks to Seth Forshee for the report and bisecting and Euan Kemp for debugging the issue. Fixes: 680557cf79f8 ("virtio_net: rework mergeable buffer handling") Reported-by: Euan Kemp <euan.kemp@coreos.com> Tested-by: Euan Kemp <euan.kemp@coreos.com> Reported-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01mv643xx_eth: fix of_irq_to_resource() error checkSergei Shtylyov1-1/+1
of_irq_to_resource() has recently been fixed to return negative error #'s along with 0 in case of failure, however the Marvell MV643xx Ethernet driver still only regards 0 as invalid IRQ -- fix it up. Fixes: 7a4228bbff76 ("of: irq: use of_irq_get() in of_irq_to_resource()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01MAINTAINERS: Add more files to the PHY LIBRARY sectionFlorian Fainelli1-3/+11
Include missing files that are provided by, used, or directly maintained within the PHY LIBRARY, this include uapi header, header files used by Device Tree code etc. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()Ido Schimmel1-1/+1
Michał reported a NULL pointer deref during fib_sync_down_dev() when unregistering a netdevice. The problem is that we don't check for 'in_dev' being NULL, which can happen in very specific cases. Usually routes are flushed upon NETDEV_DOWN sent in either the netdev or the inetaddr notification chains. However, if an interface isn't configured with any IP address, then it's possible for host routes to be flushed following NETDEV_UNREGISTER, after NULLing dev->ip_ptr in inetdev_destroy(). To reproduce: $ ip link add type dummy $ ip route add local 1.1.1.0/24 dev dummy0 $ ip link del dev dummy0 Fix this by checking for the presence of 'in_dev' before referencing it. Fixes: 982acb97560c ("ipv4: fib: Notify about nexthop status changes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01net: phy: Correctly process PHY_HALTED in phy_stop_machine()Florian Fainelli1-0/+3
Marc reported that he was not getting the PHY library adjust_link() callback function to run when calling phy_stop() + phy_disconnect() which does not indeed happen because we set the state machine to PHY_HALTED but we don't get to run it to process this state past that point. Fix this with a synchronous call to phy_state_machine() in order to have the state machine actually act on PHY_HALTED, set the PHY device's link down, turn the network device's carrier off and finally call the adjust_link() function. Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Fixes: a390d1f379cf ("phylib: convert state_queue work to delayed_work") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01sunhme: fix up GREG_STAT and GREG_IMASK register offsetsMark Cave-Ayland1-3/+3
Update the values to match those from the STP2002QFP documentation. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_lenDaniel Borkmann1-1/+1
bpf_prog_size(prog->len) is not the correct length we want to dump back to user space. The code in bpf_prog_get_info_by_fd() uses this to copy prog->insnsi to user space, but bpf_prog_size(prog->len) also includes the size of struct bpf_prog itself plus program instructions and is usually used either in context of accounting or for bpf_prog_alloc() et al, thus we copy out of bounds in bpf_prog_get_info_by_fd() potentially. Use the correct bpf_prog_insn_size() instead. Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30tcp: avoid bogus gcc-7 array-bounds warningArnd Bergmann1-2/+3
When using CONFIG_UBSAN_SANITIZE_ALL, the TCP code produces a false-positive warning: net/ipv4/tcp_output.c: In function 'tcp_connect': net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ^~ net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ I have opened a gcc bug for this, but distros have already shipped compilers with this problem, and it's not clear yet whether there is a way for gcc to avoid the warning. As the problem is related to the bitfield access, this introduces a temporary variable to store the old enum value. I did not notice this warning earlier, since UBSAN is disabled when building with COMPILE_TEST, and that was always turned on in both allmodconfig and randconfig tests. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30Merge tag 'wireless-drivers-for-davem-2017-07-28' of ↵David S. Miller2-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.13 Two fixes for for brcmfmac, the crash was reported by two people already so it's a high priority fix. brcmfmac * fix a crash in skb headroom handling in v4.13-rc1 * fix a memory leak due to a merge error in v4.6 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"Colin Ian King1-1/+1
Trivial fix to spelling mistake in printk message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30bpf: don't indicate success when copy_from_user failsDaniel Borkmann1-1/+1
err in bpf_prog_get_info_by_fd() still holds 0 at that time from prior check_uarg_tail_zero() check. Explicitly return -EFAULT instead, so user space can be notified of buggy behavior. Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30udp6: fix socket leak on early demuxPaolo Abeni3-10/+21
When an early demuxed packet reaches __udp6_lib_lookup_skb(), the sk reference is retrieved and used, but the relevant reference count is leaked and the socket destructor is never called. Beyond leaking the sk memory, if there are pending UDP packets in the receive queue, even the related accounted memory is leaked. In the long run, this will cause persistent forward allocation errors and no UDP skbs (both ipv4 and ipv6) will be able to reach the user-space. Fix this by explicitly accessing the early demux reference before the lookup, and properly decreasing the socket reference count after usage. Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and the now obsoleted comment about "socket cache". The newly added code is derived from the current ipv4 code for the similar path. v1 -> v2: fixed the __udp6_lib_rcv() return code for resubmission, as suggested by Eric Reported-by: Sam Edwards <CFSworks@gmail.com> Reported-by: Marc Haber <mh+netdev@zugschlus.de> Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30net: thunderx: Fix BGX transmit stall due to underflowSunil Goutham2-5/+24
For SGMII/RGMII/QSGMII interfaces when physical link goes down while traffic is high is resulting in underflow condition being set on that specific BGX's LMAC. Which assets a backpresure and VNIC stops transmitting packets. This is due to BGX being disabled in link status change callback while packet is in transit. This patch fixes this issue by not disabling BGX but instead just disables packet Rx and Tx. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-30Revert "vhost: cache used event for better performance"Jason Wang2-25/+6
This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it was reported to break vhost_net. We want to cache used event and use it to check for notification. The assumption was that guest won't move the event idx back, but this could happen in fact when 16 bit index wraps around after 64K entries. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29Merge tag 'mlx5-fixes-2017-07-27-V2' of ↵David S. Miller12-96/+232
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-07-27 This series contains some misc fixes to the mlx5 driver. Please pull and let me know if there's any problem. V1->V2: - removed redundant braces for -stable: 4.7 net/mlx5: Fix command bad flow on command entry allocation failure 4.9 net/mlx5: Consider tx_enabled in all modes on remap net/mlx5e: Fix outer_header_zero() check size 4.10 net/mlx5: Fix mlx5_add_flow_rules call with correct num of dests 4.11 net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure size net/mlx5e: Add field select to MTPPS register net/mlx5e: Fix broken disable 1PPS flow net/mlx5e: Change 1PPS out scheme net/mlx5e: Add missing support for PTP_CLK_REQ_PPS request net/mlx5e: Fix wrong delay calculation for overflow check scheduling net/mlx5e: Schedule overflow check work to mlx5e workqueue 4.12 net/mlx5: Fix command completion after timeout access invalid structure net/mlx5e: IPoIB, Modify add/remove underlay QPN flows I hope this is not too much, but most of the patches do apply cleanly on -stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29team: use a larger struct for mac addressWANG Cong1-4/+4
IPv6 tunnels use sizeof(struct in6_addr) as dev->addr_len, but in many places especially bonding, we use struct sockaddr to copy and set mac addr, this could lead to stack out-of-bounds access. Fix it by using a larger address storage like bonding. Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-29net: check dev->addr_len for dev_set_mac_address()WANG Cong1-0/+2
Historically, dev_ifsioc() uses struct sockaddr as mac address definition, this is why dev_set_mac_address() accepts a struct sockaddr pointer as input but now we have various types of mac addresse whose lengths are up to MAX_ADDR_LEN, longer than struct sockaddr, and saved in dev->addr_len. It is too late to fix dev_ifsioc() due to API compatibility, so just reject those larger than sizeof(struct sockaddr), otherwise we would read and use some random bytes from kernel stack. Fortunately, only a few IPv6 tunnel devices have addr_len larger than sizeof(struct sockaddr) and they don't support ndo_set_mac_addr(). But with team driver, in lb mode, they can still be enslaved to a team master and make its mac addr length as the same. Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-28phy: bcm-ns-usb3: fix MDIO_BUS dependencyArnd Bergmann1-1/+1
The driver attempts to 'select MDIO_DEVICE', but the code is actually a loadable module when PHYLIB=m: drivers/phy/broadcom/phy-bcm-ns-usb3.o: In function `bcm_ns_usb3_mdiodev_phy_write': phy-bcm-ns-usb3.c:(.text.bcm_ns_usb3_mdiodev_phy_write+0x28): undefined reference to `mdiobus_write' drivers/phy/broadcom/phy-bcm-ns-usb3.o: In function `bcm_ns_usb3_module_exit': phy-bcm-ns-usb3.c:(.exit.text+0x18): undefined reference to `mdio_driver_unregister' drivers/phy/broadcom/phy-bcm-ns-usb3.o: In function `bcm_ns_usb3_module_init': phy-bcm-ns-usb3.c:(.init.text+0x18): undefined reference to `mdio_driver_register' phy-bcm-ns-usb3.c:(.init.text+0x38): undefined reference to `mdio_driver_unregister' Using 'depends on MDIO_BUS' instead will avoid the link error. Fixes: af850e14a7ae ("phy: bcm-ns-usb3: add MDIO driver using proper bus layer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-28net: phy: rework Kconfig settings for MDIO_BUSArnd Bergmann1-3/+10
I still see build errors in randconfig builds and have had this patch for a while to locally work around it: drivers/built-in.o: In function `xgene_mdio_probe': mux-core.c:(.text+0x352154): undefined reference to `of_mdiobus_register' mux-core.c:(.text+0x352168): undefined reference to `mdiobus_free' mux-core.c:(.text+0x3521c0): undefined reference to `mdiobus_alloc_size' The idea is that CONFIG_MDIO_BUS now reflects whether the mdio_bus code is built-in or a module, and other drivers that use the core code can simply depend on that, instead of having a complex dependency line. Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27net/mlx5: Fix mlx5_add_flow_rules call with correct num of destsPaul Blakey1-1/+1
When adding ethtool steering rule with action DISCARD we wrongly pass a NULL dest with dest_num 1 to mlx5_add_flow_rules(). What this error seems to have caused is sending VPORT 0 (MLX5_FLOW_DESTINATION_TYPE_VPORT) as the fte dest instead of no dests. We have fte action correctly set to DROP so it might been ignored anyways. To reproduce use: # sudo ethtool --config-nfc <dev> flow-type ether \ dst aa:bb:cc:dd:ee:ff action -1 Fixes: 74491de93712 ("net/mlx5: Add multi dest support") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Schedule overflow check work to mlx5e workqueueEugenia Emantayev1-6/+5
This is done in order to ensure that work will not run after the cleanup. Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Fix wrong delay calculation for overflow check schedulingEugenia Emantayev1-1/+2
The overflow_period is calculated in seconds. In order to use it for delayed work scheduling translation to jiffies is needed. Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Add missing support for PTP_CLK_REQ_PPS requestEugenia Emantayev3-1/+21
Add the missing option to enable the PTP_CLK_PPS function. In this case pin should be configured as 1PPS IN first and then it will be connected to PPS mechanism. Events will be reported as PTP_CLOCK_PPSUSR events to relevant sysfs. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Change 1PPS out schemeEugenia Emantayev2-38/+87
In order to fix the drift in 1PPS out need to adjust the next pulse. On each 1PPS out falling edge driver gets the event, then the event handler adjusts the next pulse starting time. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Fix broken disable 1PPS flowEugenia Emantayev1-29/+46
Need to disable the MTPPS and unsubscribe from the pulse events when user disables the 1PPS functionality. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Add field select to MTPPS registerEugenia Emantayev4-10/+36
In order to mark relevant fields while setting the MTPPS register add field select. Otherwise it can cause a misconfiguration in firmware. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure sizeEugenia Emantayev1-1/+1
Fix miscalculation in reserved_at_1a0 field. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: Fix outer_header_zero() check sizeIlan Tayari1-1/+1
outer_header_zero() routine checks if the outer_headers match of a flow-table entry are all zero. This function uses the size of whole fte_match_param, instead of just the outer_headers member, causing failure to detect all-zeros if any other members of the fte_match_param are non-zero. Use the correct size for zero check. Fixes: 6dc6071cfcde ("net/mlx5e: Add ethtool flow steering support") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5e: IPoIB, Modify add/remove underlay QPN flowsAlex Vesker1-5/+11
On interface remove, the clean-up was done incorrectly causing an error in the log: "SET_FLOW_TABLE_ROOT(0x92f) op_mod(0x0) failed...syndrome (0x7e9f14)" This was caused by the following flow: -ndo_uninit: Move QP state to RST (this disconnects the QP from FT), the QP cannot be attached to any FT unless it is in RTS. -mlx5_rdma_netdev_free: cleanup_rx: Destroy FT cleanup_tx: Destroy QP and remove QPN from FT This caused a problem when destroying current FT we tried to re-attach the QP to the next FT which is not needed. The correct flow is: -mlx5_rdma_netdev_free: cleanup_rx: remove QPN from FT & Destroy FT cleanup_tx: Destroy QP Fixes: 508541146af1 ("net/mlx5: Use underlay QPN from the root name space") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Fix command bad flow on command entry allocation failureMoshe Shemesh1-2/+17
When driver fail to allocate an entry to send command to FW, it must notify the calling function and release the memory allocated for this command. Fixes: e126ba97dba9e ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Fix command completion after timeout access invalid structureMoshe Shemesh1-2/+4
Completion on timeout should not free the driver command entry structure as it will need to access it again once real completion event from FW will occur. Fixes: 73dd3a4839c1 ('net/mlx5: Avoid using pending command interface slots') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Consider tx_enabled in all modes on remapAviv Heller1-15/+10
The tx_enabled lag event field is used to determine whether a slave is active. Current logic uses this value only if the mode is active-backup. However, LACP mode, although considered a load balancing mode, can mark a slave as inactive in certain situations (e.g., LACP timeout). This fix takes the tx_enabled value into account when remapping, with no respect to the LAG mode (this should not affect the behavior in XOR mode, since in this mode both slaves are marked as active). Fixes: 7907f23adc18 (net/mlx5: Implement RoCE LAG feature) Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Clean SRIOV eswitch resources upon VF creation failureEran Ben Elisha2-1/+7
Upon sriov enable, eswitch is always enabled. Currently, if enable hca failed over all VFs, we would skip eswitch disable as part of sriov disable, which will lead to resources leak. Fix it by disabling eswitch if it was enabled (use indication from eswitch mode). Fixes: 6b6adee3dad2 ('net/mlx5: SRIOV core code refactoring') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twiceArend Van Spriel1-5/+0
Due to a bugfix in wireless tree and the commit mentioned below a merge was needed which went haywire. So the submitted change resulted in the function brcmf_sdiod_sgtable_alloc() being called twice during the probe thus leaking the memory of the first call. Cc: stable@vger.kernel.org # 4.6.x Fixes: 4d7928959832 ("brcmfmac: switch to new platform data") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-07-27brcmfmac: Don't grow SKB by negative sizeDaniel Stone1-1/+1
The commit to rework the headroom check in start_xmit() now calls pxskb_expand_head() unconditionally if the header is CoW. Unfortunately, it does so with the delta between the extant headroom and the header length, which may be negative if there is already sufficient headroom. pskb_expand_head() does allow for size being 0, in which case it just copies, so clamp the header delta to zero. Opening Chrome (and all my tabs) on a PCIE device was enough to reliably hit this. Fixes: 270a6c1f65fe ("brcmfmac: rework headroom check in .start_xmit()") Signed-off-by: Daniel Stone <daniels@collabora.com> Cc: Arend Van Spriel <arend.vanspriel@broadcom.com> Cc: James Hughes <james.hughes@raspberrypi.org> Cc: Hante Meuleman <hante.meuleman@broadcom.com> Cc: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Cc: Franky Lin <franky.lin@broadcom.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-07-27sctp: fix the check for _sctp_walk_params and _sctp_walk_errorsXin Long1-2/+2
Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()") tried to fix the issue that it may overstep the chunk end for _sctp_walk_{params, errors} with 'chunk_end > offset(length) + sizeof(length)'. But it introduced a side effect: When processing INIT, it verifies the chunks with 'param.v == chunk_end' after iterating all params by sctp_walk_params(). With the check 'chunk_end > offset(length) + sizeof(length)', it would return when the last param is not yet accessed. Because the last param usually is fwdtsn supported param whose size is 4 and 'chunk_end == offset(length) + sizeof(length)' This is a badly issue even causing sctp couldn't process 4-shakes. Client would always get abort when connecting to server, due to the failure of INIT chunk verification on server. The patch is to use 'chunk_end <= offset(length) + sizeof(length)' instead of 'chunk_end < offset(length) + sizeof(length)' for both _sctp_walk_params and _sctp_walk_errors. Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27dccp: fix a memleak for dccp_feat_init err processXin Long1-2/+5
In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc memory for rx.val, it should free tx.val before returning an error. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properlyXin Long1-0/+1
The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue exists on dccp_ipv4. This patch is to fix it for dccp_ipv4. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properlyXin Long1-0/+1
In dccp_v6_conn_request, after reqsk gets alloced and hashed into ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer, one is for hlist, and the other one is for current using. The problem is when dccp_v6_conn_request returns and finishes using reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and reqsk obj never gets freed. Jianlin found this issue when running dccp_memleak.c in a loop, the system memory would run out. dccp_memleak.c: int s1 = socket(PF_INET6, 6, IPPROTO_IP); bind(s1, &sa1, 0x20); listen(s1, 0x9); int s2 = socket(PF_INET6, 6, IPPROTO_IP); connect(s2, &sa1, 0x20); close(s1); close(s2); This patch is to put the reqsk before dccp_v6_conn_request returns, just as what tcp_conn_request does. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()Jakub Kicinski2-3/+6
The buffer passed to bpf_obj_get_info_by_fd() should be initialized to zeros. Kernel will enforce that to guarantee we can safely extend info structures in the future. Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing is problematic, however, since some members of the info structures may need to be initialized by the callers (for instance pointers to buffers to which kernel is to dump translated and jited images). Remove the zeroing and fix up the in-tree callers before any kernel has been released with this code. As Daniel points out this seems to be the intended operation anyway, since commit 95b9afd3987f ("bpf: Test for bpf ID") is itself setting the buffer pointers before calling bpf_obj_get_info_by_fd(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27netpoll: Fix device name check in netpoll_setup()Matthias Kaehlcke1-1/+1
Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer when checking if the device name is set: if (np->dev_name) { ... However the field is a character array, therefore the condition always yields true. Check instead whether the first byte of the array has a non-zero value. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26bonding: commit link status change after proposeWANG Cong1-0/+2
Commit de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") moves link status commitment into bond_mii_monitor(), but it still relies on the return value of bond_miimon_inspect() as the hint. We need to return non-zero as long as we propose a link status change. Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26udp: unbreak build lacking CONFIG_XFRMPaolo Abeni1-1/+1
We must use pre-processor conditional block or suitable accessors to manipulate skb->sp elsewhere builds lacking the CONFIG_XFRM will break. Fixes: dce4551cb2ad ("udp: preserve head state for IP_CMSG_PASSSEC") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26net: ethernet: nb8800: Handle all 4 RGMII modes identicallyMarc Gonzalez1-5/+4
Before commit bf8f6952a233 ("Add blurb about RGMII") it was unclear whose responsibility it was to insert the required clock skew, and in hindsight, some PHY drivers got it wrong. The solution forward is to introduce a new property, explicitly requiring skew from the node to which it is attached. In the interim, this driver will handle all 4 RGMII modes identically (no skew). Fixes: 52dfc8301248 ("net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller") Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26Revert "netvsc: optimize calculation of number of slots"stephen hemminger1-10/+33
The logic for computing page buffer scatter does not take into account the impact of compound pages. Therefore the optimization to compute number of slots was incorrect and could cause stack corruption a skb was sent with lots of fragments from huge pages. This reverts commit 60b86665af0dfbeebda8aae43f0ba451cd2dcfe5. Fixes: 60b86665af0d ("netvsc: optimize calculation of number of slots") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26ftgmac100: return error in ftgmac100_alloc_rx_bufJoel Stanley1-2/+2
The error paths set err, but it's not returned. I wondered if we should fix all of the callers to check the returned value, but Ben explains why the code is this way: > Most call sites ignore it on purpose. There's nothing we can do if > we fail to get a buffer at interrupt time, so we point the buffer to > the scratch page so the HW doesn't DMA into lalaland and lose the > packet. > > The one call site that tests and can fail is the one used when brining > the interface up. If we fail to allocate at that point, we fail the > ifup. But as you noticed, I do have a bug not returning the error. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()Stefano Brivio1-4/+0
RFC 2465 defines ipv6IfStatsOutFragFails as: "The number of IPv6 datagrams that have been discarded because they needed to be fragmented at this output interface but could not be." The existing implementation, instead, would increase the counter twice in case we fail to allocate room for single fragments: once for the fragment, once for the datagram. This didn't look intentional though. In one of the two affected affected failure paths, the double increase was simply a result of a new 'goto fail' statement, introduced to avoid a skb leak. The other path appears to be affected since at least 2.6.12-rc2. Reported-by: Sabrina Dubroca <sdubroca@redhat.com> Fixes: 1d325d217c7f ("ipv6: ip6_fragment: fix headroom tests and skb leak") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-25lib: test_rhashtable: Fix KASAN warningPhil Sutter1-1/+3
I forgot one spot when introducing struct test_obj_val. Fixes: e859afe1ee0c5 ("lib: test_rhashtable: fix for large entry counts") Reported by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-25net: phy: Remove trailing semicolon in macro definitionMarc Gonzalez1-1/+1
Commit e5a03bfd873c2 ("phy: Add an mdio_device structure") introduced a spurious trailing semicolon. Remove it. Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-25udp: preserve head state for IP_CMSG_PASSSECPaolo Abeni2-31/+40
Paul Moore reported a SELinux/IP_PASSSEC regression caused by missing skb->sp at recvmsg() time. We need to preserve the skb head state to process the IP_CMSG_PASSSEC cmsg. With this commit we avoid releasing the skb head state in the BH even if a secpath is attached to the current skb, and stores the skb status (with/without head states) in the scratch area, so that we can access it at skb deallocation time, without incurring in cache-miss penalties. This also avoids misusing the skb CB for ipv6 packets, as introduced by the commit 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing"). Clean a bit the scratch area helpers implementation, to reduce the code differences between 32 and 64 bits build. Reported-by: Paul Moore <paul@paul-moore.com> Fixes: 0a463c78d25b ("udp: avoid a cache miss on dequeue") Fixes: 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>