kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2018-04-12	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	10	-19/+166
	Pull networking fixes from David Miller: 1) In ip_gre tunnel, handle the conflict between TUNNEL_{SEQ,CSUM} and GSO/LLTX properly. From Sabrina Dubroca. 2) Stop properly on error in lan78xx_read_otp(), from Phil Elwell. 3) Don't uncompress in slip before rstate is initialized, from Tejaswi Tanikella. 4) When using 1.x firmware on aquantia, issue a deinit before we hardware reset the chip, otherwise we break dirty wake WOL. From Igor Russkikh. 5) Correct log check in vhost_vq_access_ok(), from Stefan Hajnoczi. 6) Fix ethtool -x crashes in bnxt_en, from Michael Chan. 7) Fix races in l2tp tunnel creation and duplicate tunnel detection, from Guillaume Nault. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits) l2tp: fix race in duplicate tunnel detection l2tp: fix races in tunnel creation tun: send netlink notification when the device is modified tun: set the flags before registering the netdevice lan78xx: Don't reset the interface on open bnxt_en: Fix NULL pointer dereference at bnxt_free_irq(). bnxt_en: Need to include RDMA rings in bnxt_check_rings(). bnxt_en: Support max-mtu with VF-reps bnxt_en: Ignore src port field in decap filter nodes bnxt_en: do not allow wildcard matches for L2 flows bnxt_en: Fix ethtool -x crash when device is down. vhost: return bool from *_access_ok() functions vhost: fix vhost_vq_access_ok() log check vhost: Fix vhost_copy_to_user() net: aquantia: oops when shutdown on already stopped device net: aquantia: Regression on reset with 1.x firmware cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN slip: Check if rstate is initialized before uncompressing lan78xx: Avoid spurious kevent 4 "error" lan78xx: Correctly indicate invalid OTP ...
2018-04-12	Merge tag 'pm-4.17-rc1-2' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These include one big-ticket item which is the rework of the idle loop in order to prevent CPUs from spending too much time in shallow idle states. It reduces idle power on some systems by 10% or more and may improve performance of workloads in which the idle loop overhead matters. This has been in the works for several weeks and it has been tested and reviewed quite thoroughly. Also included are changes that finalize the cpufreq cleanup moving frequency table validation from drivers to the core, a few fixes and cleanups of cpufreq drivers, a cpuidle documentation update and a PM QoS core update to mark the expected switch fall-throughs in it. Specifics: - Rework the idle loop in order to prevent CPUs from spending too much time in shallow idle states by making it stop the scheduler tick before putting the CPU into an idle state only if the idle duration predicted by the idle governor is long enough. That required the code to be reordered to invoke the idle governor before stopping the tick, among other things (Rafael Wysocki, Frederic Weisbecker, Arnd Bergmann). - Add the missing description of the residency sysfs attribute to the cpuidle documentation (Prashanth Prakash). - Finalize the cpufreq cleanup moving frequency table validation from drivers to the core (Viresh Kumar). - Fix a clock leak regression in the armada-37xx cpufreq driver (Gregory Clement). - Fix the initialization of the CPU performance data structures for shared policies in the CPPC cpufreq driver (Shunyong Yang). - Clean up the ti-cpufreq, intel_pstate and CPPC cpufreq drivers a bit (Viresh Kumar, Rafael Wysocki). - Mark the expected switch fall-throughs in the PM QoS core (Gustavo Silva)" * tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits) tick-sched: avoid a maybe-uninitialized warning cpufreq: Drop cpufreq_table_validate_and_show() cpufreq: SCMI: Don't validate the frequency table twice cpufreq: CPPC: Initialize shared perf capabilities of CPUs cpufreq: armada-37xx: Fix clock leak cpufreq: CPPC: Don't set transition_latency cpufreq: ti-cpufreq: Use builtin_platform_driver() cpufreq: intel_pstate: Do not include debugfs.h PM / QoS: mark expected switch fall-throughs cpuidle: Add definition of residency to sysfs documentation time: hrtimer: Use timerqueue_iterate_next() to get to the next timer nohz: Avoid duplication of code related to got_idle_tick nohz: Gather tick_sched booleans under a common flag field cpuidle: menu: Avoid selecting shallow states with stopped tick cpuidle: menu: Refine idle state selection for running tick sched: idle: Select idle state before stopping the tick time: hrtimer: Introduce hrtimer_next_event_without() time: tick-sched: Split tick_nohz_stop_sched_tick() cpuidle: Return nohz hint from cpuidle_select() jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC ...
2018-04-11	tun: send netlink notification when the device is modified	Sabrina Dubroca	1	-2/+22
	I added dumping of link information about tun devices over netlink in commit 1ec010e70593 ("tun: export flags, uid, gid, queue information over netlink"), but didn't add the missing netlink notifications when the device's exported properties change. This patch adds notifications when owner/group or flags are modified, when queues are attached/detached, and when a tun fd is closed. Reported-by: Thomas Haller <thaller@redhat.com> Fixes: 1ec010e70593 ("tun: export flags, uid, gid, queue information over netlink") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	tun: set the flags before registering the netdevice	Sabrina Dubroca	1	-3/+6
	Otherwise, register_netdevice advertises the creation of the device with the default flags, instead of what the user requested. Reported-by: Thomas Haller <thaller@redhat.com> Fixes: 1ec010e70593 ("tun: export flags, uid, gid, queue information over netlink") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	lan78xx: Don't reset the interface on open	Phil Elwell	1	-4/+0
	Commit 92571a1aae40 ("lan78xx: Connect phy early") moves the PHY initialisation into lan78xx_probe, but lan78xx_open subsequently calls lan78xx_reset. As well as forcing a second round of link negotiation, this reset frequently prevents the phy interrupt from being generated (even though the link is up), rendering the interface unusable. Fix this issue by removing the lan78xx_reset call from lan78xx_open. Fixes: 92571a1aae40 ("lan78xx: Connect phy early") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: Fix NULL pointer dereference at bnxt_free_irq().	Michael Chan	1	-1/+1
	When open fails during ethtool -L ring change, for example, the driver may crash at bnxt_free_irq() because bp->bnapi is NULL. If we fail to allocate all the new rings, bnxt_open_nic() will free all the memory including bp->bnapi. Subsequent call to bnxt_close_nic() will try to dereference bp->bnapi in bnxt_free_irq(). Fix it by checking for !bp->bnapi in bnxt_free_irq(). Fixes: e5811b8c09df ("bnxt_en: Add IRQ remapping logic.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: Need to include RDMA rings in bnxt_check_rings().	Michael Chan	1	-0/+2
	With recent changes to reserve both L2 and RDMA rings, we need to include the RDMA rings in bnxt_check_rings(). Otherwise we will under-estimate the rings we need during ethtool -L and may lead to failure. Fixes: fbcfc8e46741 ("bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: Support max-mtu with VF-reps	Sriharsha Basavapatna	1	-0/+30
	While a VF is configured with a bigger mtu (> 1500), any packets that are punted to the VF-rep (slow-path) get dropped by OVS kernel-datapath with the following message: "dropped over-mtu packet". Fix this by returning the max-mtu value for a VF-rep derived from its corresponding VF. VF-rep's mtu can be changed using 'ip' command as shown in this example: $ ip link set bnxt0_pf0vf0 mtu 9000 Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: Ignore src port field in decap filter nodes	Sriharsha Basavapatna	1	-1/+3
	The driver currently uses src port field (along with other fields) in the decap tunnel key, while looking up and adding tunnel nodes. This leads to redundant cfa_decap_filter_alloc() requests to the FW and flow-miss in the flow engine. Fix this by ignoring the src port field in decap tunnel nodes. Fixes: f484f6782e01 ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter") Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: do not allow wildcard matches for L2 flows	Andy Gospodarek	1	-0/+59
	Before this patch the following commands would succeed as far as the user was concerned: $ tc qdisc add dev p1p1 ingress $ tc filter add dev p1p1 parent ffff: protocol all \ flower skip_sw action drop $ tc filter add dev p1p1 parent ffff: protocol ipv4 \ flower skip_sw src_mac 00:02:00:00:00:01/44 action drop The current flow offload infrastructure used does not support wildcard matching for ethernet headers, so do not allow the second or third commands to succeed. If a user wants to drop traffic on that interface the protocol and MAC addresses need to be specified explicitly: $ tc qdisc add dev p1p1 ingress $ tc filter add dev p1p1 parent ffff: protocol arp \ flower skip_sw action drop $ tc filter add dev p1p1 parent ffff: protocol ipv4 \ flower skip_sw action drop ... $ tc filter add dev p1p1 parent ffff: protocol ipv4 \ flower skip_sw src_mac 00:02:00:00:00:01 action drop $ tc filter add dev p1p1 parent ffff: protocol ipv4 \ flower skip_sw src_mac 00:02:00:00:00:02 action drop ... There are also checks for VLAN parameters in this patch as other callers may wildcard those parameters even if tc does not. Using different flow infrastructure could allow this to work in the future for L2 flows, but for now it does not. Fixes: 2ae7408fedfe ("bnxt_en: bnxt: add TC flower filter offload support") Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	bnxt_en: Fix ethtool -x crash when device is down.	Michael Chan	1	-3/+8
	Fix ethtool .get_rxfh() crash by checking for valid indirection table address before copying the data. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	mac80211_hwsim: use DEFINE_IDA	Matthew Wilcox	1	-1/+1
	This is preferred to opencoding an IDA_INIT. Link: http://lkml.kernel.org/r/20180313132639.17387-2-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11	net: aquantia: oops when shutdown on already stopped device	Igor Russkikh	1	-3/+5
	In case netdev is closed at the moment of pci shutdown, aq_nic_stop gets called second time. napi_disable in that case hangs indefinitely. In other case, if device was never opened at all, we get oops because of null pointer access. We should invoke aq_nic_stop conditionally, only if device is running at the moment of shutdown. Reported-by: David Arcari <darcari@redhat.com> Fixes: 90869ddfefeb ("net: aquantia: Implement pci shutdown callback") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	net: aquantia: Regression on reset with 1.x firmware	Igor Russkikh	1	-0/+16
	On ASUS XG-C100C with 1.5.44 firmware a special mode called "dirty wake" is active. With this mode when motherboard gets powered (but no poweron happens yet), NIC automatically enables powersave link and watches for WOL packet. This normally allows to powerup the PC after AC power failures. Not all motherboards or bios settings gives power to PCI slots, so this mode is not enabled on all the hardware. 4.16 linux driver introduced full hardware reset sequence This is required since before that we had no NIC hardware reset implemented and there were side effects of "not clean start". But this full reset is incompatible with "dirty wake" WOL feature it keeps the PHY link in a special mode forever. As a consequence, driver sees no link and no traffic. To fix this we forcibly change FW state to idle state before doing the full reset. This makes FW to restore link state. Fixes: c8c82eb net: aquantia: Introduce global AQC hardware reset sequence Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN	Bassem Boubaker	1	-0/+6
	The Cinterion AHS8 is a 3G device with one embedded WWAN interface using cdc_ether as a driver. The modem is controlled via AT commands through the exposed TTYs. AT+CGDCONT write command can be used to activate or deactivate a WWAN connection for a PDP context defined with the same command. UE supports one WWAN adapter. Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	slip: Check if rstate is initialized before uncompressing	Tejaswi Tanikella	1	-0/+5
	On receiving a packet the state index points to the rstate which must be used to fill up IP and TCP headers. But if the state index points to a rstate which is unitialized, i.e. filled with zeros, it gets stuck in an infinite loop inside ip_fast_csum trying to compute the ip checsum of a header with zero length. 89.666953: <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468 89.666965: <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c 89.666978: <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0 89.666991: <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198 89.667005: <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370 89.667027: <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250 89.667040: <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60 89.667053: <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c 89.667065: <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38 89.667085: <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154 89.667098: <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c 89.667117: <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c 89.667131: <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c ./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output: ip_fast_csum at arch/arm64/include/asm/checksum.h:40 (inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615 Adding a variable to indicate if the current rstate is initialized. If such a packet arrives, move to toss state. Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	lan78xx: Avoid spurious kevent 4 "error"	Phil Elwell	1	-1/+1
	lan78xx_defer_event generates an error message whenever the work item is already scheduled. lan78xx_open defers three events - EVENT_STAT_UPDATE, EVENT_DEV_OPEN and EVENT_LINK_RESET. Being aware of the likelihood (or certainty) of an error message, the DEV_OPEN event is added to the set of pending events directly, relying on the subsequent deferral of the EVENT_LINK_RESET call to schedule the work. Take the same precaution with EVENT_STAT_UPDATE to avoid a totally unnecessary error message. Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	lan78xx: Correctly indicate invalid OTP	Phil Elwell	1	-1/+2
	lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP content, but the value gets overwritten before it is returned and the read goes ahead anyway. Make the read conditional as it should be and preserve the error code. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11	Merge branches 'pm-cpuidle' and 'pm-qos'	Rafael J. Wysocki	1	-1/+1
	* pm-cpuidle: tick-sched: avoid a maybe-uninitialized warning cpuidle: Add definition of residency to sysfs documentation time: hrtimer: Use timerqueue_iterate_next() to get to the next timer nohz: Avoid duplication of code related to got_idle_tick nohz: Gather tick_sched booleans under a common flag field cpuidle: menu: Avoid selecting shallow states with stopped tick cpuidle: menu: Refine idle state selection for running tick sched: idle: Select idle state before stopping the tick time: hrtimer: Introduce hrtimer_next_event_without() time: tick-sched: Split tick_nohz_stop_sched_tick() cpuidle: Return nohz hint from cpuidle_select() jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC sched: idle: Do not stop the tick before cpuidle_idle_call() sched: idle: Do not stop the tick upfront in the idle loop time: tick-sched: Reorganize idle tick management code * pm-qos: PM / QoS: mark expected switch fall-throughs
2018-04-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	15	-194/+303
	Pull networking fixes from David Miller: 1) The sockmap code has to free socket memory on close if there is corked data, from John Fastabend. 2) Tunnel names coming from userspace need to be length validated. From Eric Dumazet. 3) arp_filter() has to take VRFs properly into account, from Miguel Fadon Perlines. 4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti. 5) Missing idr_remove() in u32_delete_key(), from Cong Wang. 6) More syzbot stuff. Several use of uninitialized value fixes all over, from Eric Dumazet. 7) Do not leak kernel memory to userspace in sctp, also from Eric Dumazet. 8) Discard frames from unused ports in DSA, from Andrew Lunn. 9) Fix DMA mapping and reset/failover problems in ibmvnic, from Thomas Falcon. 10) Do not access dp83640 PHY registers prematurely after reset, from Esben Haabendal. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vhost-net: set packet weight of tx polling to 2 * vq size net: thunderx: rework mac addresses list to u64 array inetpeer: fix uninit-value in inet_getpeer dp83640: Ensure against premature access to PHY registers after reset devlink: convert occ_get op to separate registration ARM: dts: ls1021a: Specify TBIPA register address net/fsl_pq_mdio: Allow explicit speficition of TBIPA address ibmvnic: Do not reset CRQ for Mobility driver resets ibmvnic: Fix failover case for non-redundant configuration ibmvnic: Fix reset scheduler error handling ibmvnic: Zero used TX descriptor counter on reset ibmvnic: Fix DMA mapping mistakes tipc: use the right skb in tipc_sk_fill_sock_diag() sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6 net: dsa: Discard frames from unused ports sctp: do not leak kernel memory to user space soreuseport: initialise timewait reuseport field ipv4: fix uninit-value in ip_route_output_key_hash_rcu() dccp: initialize ireq->ir_mark net: fix uninit-value in __hw_addr_add_ex() ...
2018-04-09	net: thunderx: rework mac addresses list to u64 array	Vadim Lomovtsev	2	-24/+11
	It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-09	dp83640: Ensure against premature access to PHY registers after reset	Esben Haabendal	1	-0/+18
	The datasheet specifies a 3uS pause after performing a software reset. The default implementation of genphy_soft_reset() does not provide this, so implement soft_reset with the needed pause. Signed-off-by: Esben Haabendal <eha@deif.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	devlink: convert occ_get op to separate registration	Jiri Pirko	4	-83/+74
	This resolves race during initialization where the resources with ops are registered before driver and the structures used by occ_get op is initialized. So keep occ_get callbacks registered only when all structs are initialized. The example flows, as it is in mlxsw: 1) driver load/asic probe: mlxsw_core -> mlxsw_sp_resources_register -> mlxsw_sp_kvdl_resources_register -> devlink_resource_register IDX mlxsw_spectrum -> mlxsw_sp_kvdl_init -> mlxsw_sp_kvdl_parts_init -> mlxsw_sp_kvdl_part_init -> devlink_resource_size_get IDX (to get the current setup size from devlink) -> devlink_resource_occ_get_register IDX (register current occupancy getter) 2) reload triggered by devlink command: -> mlxsw_devlink_core_bus_device_reload -> mlxsw_sp_fini -> mlxsw_sp_kvdl_fini -> devlink_resource_occ_get_unregister IDX (struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get which is using mlxsw_sp would cause use-after free) -> mlxsw_sp_init -> mlxsw_sp_kvdl_init -> mlxsw_sp_kvdl_parts_init -> mlxsw_sp_kvdl_part_init -> devlink_resource_size_get IDX (to get the current setup size from devlink) -> devlink_resource_occ_get_register IDX (register current occupancy getter) Fixes: d9f9b9a4d05f ("devlink: Add support for resource abstraction") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	net/fsl_pq_mdio: Allow explicit speficition of TBIPA address	Esben Haabendal	1	-16/+34
	This introduces a simpler and generic method for for finding (and mapping) the TBIPA register. Instead of relying of complicated logic for finding the TBIPA register address based on the MDIO or MII register block base address, which even in some cases relies on undocumented shadow registers, a second "reg" entry for the mdio bus devicetree node specifies the TBIPA register. Backwards compatibility is kept, as the existing logic is applied when only a single "reg" mapping is specified. Signed-off-by: Esben Haabendal <eha@deif.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	ibmvnic: Do not reset CRQ for Mobility driver resets	Nathan Fontenot	1	-23/+32
	When resetting the ibmvnic driver after a partition migration occurs there is no requirement to do a reset of the main CRQ. The current driver code does the required re-enable of the main CRQ, then does a reset of the main CRQ later. What we should be doing for a driver reset after a migration is to re-enable the main CRQ, release all the sub-CRQs, and then allocate new sub-CRQs after capability negotiation. This patch updates the handling of mobility resets to do the proper work and not reset the main CRQ. To do this the initialization/reset of the main CRQ had to be moved out of the ibmvnic_init routine and in to the ibmvnic_probe and do_reset routines. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	ibmvnic: Fix failover case for non-redundant configuration	Thomas Falcon	2	-8/+30
	There is a failover case for a non-redundant pseries VNIC configuration that was not being handled properly. The current implementation assumes that the driver will always have a redandant device to communicate with following a failover notification. There are cases, however, when a non-redundant configuration can receive a failover request. If that happens, the driver should wait until it receives a signal that the device is ready for operation. The driver is agnostic of its backing hardware configuration, so this fix necessarily affects all device failover management. The driver needs to wait until it receives a signal that the device is ready for resetting. A flag is introduced to track this intermediary state where the driver is waiting for an active device. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	ibmvnic: Fix reset scheduler error handling	Thomas Falcon	1	-10/+29
	In some cases, if the driver is waiting for a reset following a device parameter change, failure to schedule a reset can result in a hang since a completion signal is never sent. If the device configuration is being altered by a tool such as ethtool or ifconfig, it could cause the console to hang if the reset request does not get scheduled. Add some additional error handling code to exit the wait_for_completion if there is one in progress. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	ibmvnic: Zero used TX descriptor counter on reset	Thomas Falcon	1	-0/+1
	The counter that tracks used TX descriptors pending completion needs to be zeroed as part of a device reset. This change fixes a bug causing transmit queues to be stopped unnecessarily and in some cases a transmit queue stall and timeout reset. If the counter is not reset, the remaining descriptors will not be "removed", effectively reducing queue capacity. If the queue is over half full, it will cause the queue to stall if stopped. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-08	ibmvnic: Fix DMA mapping mistakes	Thomas Falcon	1	-8/+6
	Fix some mistakes caught by the DMA debugger. The first change fixes a unnecessary unmap that should have been removed in an earlier update. The next hunk fixes another bad unmap by zeroing the bit checked to determine that an unmap is needed. The final change fixes some buffers that are unmapped with the wrong direction specified. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-07	treewide: fix up files incorrectly marked executable	Linus Torvalds	1	-0/+0
	Joe Perches noted that we have a few source files that for some inexplicable reason (read: I'm too lazy to even go look at the history) are marked executable: drivers/gpu/drm/amd/amdgpu/vce_v4_0.c drivers/net/ethernet/cadence/macb_ptp.c A simple git command line to show executable C/asm/header files is this: git ls-files -s '*.[chsS]' \| grep '^100755' and then you can fix them up with scripting by just feeding that output into: \| cut -f2 \| xargs chmod -x and commit it. Which is exactly what this commit does. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-07	Merge tag 'pci-v4.17-changes' of ↵	Linus Torvalds	4	-197/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - move pci_uevent_ers() out of pci.h (Michael Ellerman) - skip ASPM common clock warning if BIOS already configured it (Sinan Kaya) - fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva) - remove last user of pci_get_bus_and_slot() and the function itself (Sinan Kaya) - add decoding for 16 GT/s link speed (Jay Fang) - add interfaces to get max link speed and width (Tal Gilboa) - add pcie_bandwidth_capable() to compute max supported link bandwidth (Tal Gilboa) - add pcie_bandwidth_available() to compute bandwidth available to device (Tal Gilboa) - add pcie_print_link_status() to log link speed and whether it's limited (Tal Gilboa) - use PCI core interfaces to report when device performance may be limited by its slot instead of doing it in each driver (Tal Gilboa) - fix possible cpqphp NULL pointer dereference (Shawn Lin) - rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI hotplug (Mika Westerberg) - add support for PCI I/O port space that's neither directly accessible via CPU in/out instructions nor directly mapped into CPU physical memory space. This is fairly intrusive and includes minor changes to interfaces used for I/O space on most platforms (Zhichang Yuan, John Garry) - add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan, John Garry) - use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas) - remove possible NULL pointer dereference in of_pci_bus_find_domain_nr() (Shawn Lin) - report quirk timings with dev_info (Bjorn Helgaas) - report quirks that take longer than 10ms (Bjorn Helgaas) - add and use Altera Vendor ID (Johannes Thumshirn) - tidy Makefiles and comments (Bjorn Helgaas) - don't set up INTx if MSI or MSI-X is enabled to align cris, frv, ia64, and mn10300 with x86 (Bjorn Helgaas) - move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick Lawler) - merge pcieport_if.h into portdrv.h (Bjorn Helgaas) - move workaround for BIOS PME issue from portdrv to PCI core (Bjorn Helgaas) - completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas) - remove portdrv link order dependency (Bjorn Helgaas) - remove support for unused VC portdrv service (Bjorn Helgaas) - simplify portdrv feature permission checking (Bjorn Helgaas) - remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn Helgaas) - remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas) - use cached AER capability offset (Frederick Lawler) - don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg) - rename pcie-dpc.c to dpc.c (Bjorn Helgaas) - use generic pci_mmap_resource_range() instead of powerpc and xtensa arch-specific versions (David Woodhouse) - support arbitrary PCI host bridge offsets on sparc (Yinghai Lu) - remove System and Video ROM reservations on sparc (Bjorn Helgaas) - probe for device reset support during enumeration instead of runtime (Bjorn Helgaas) - add ACS quirk for Ampere (née APM) root ports (Feng Kan) - add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas Vincent-Cross) - protect device restore with device lock (Sinan Kaya) - handle failure of FLR gracefully (Sinan Kaya) - handle CRS (config retry status) after device resets (Sinan Kaya) - skip various config reads for SR-IOV VFs as an optimization (KarimAllah Ahmed) - consolidate VPD code in vpd.c (Bjorn Helgaas) - add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann) - add DT support for R-Car r8a7743 (Biju Das) - fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host bridge driver that causes a general protection fault (Dexuan Cui) - fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV (Dexuan Cui) - fix Hyper-V host bridge hang when ejecting a VF before setting up MSI (Dexuan Cui) - make several structures static (Fengguang Wu) - increase number of MSI IRQs supported by Synopsys DesignWare bridges from 32 to 256 (Gustavo Pimentel) - implemented multiplexed IRQ domain API and remove obsolete MSI IRQ API from DesignWare drivers (Gustavo Pimentel) - add Tegra power management support (Manikanta Maddireddy) - add Tegra loadable module support (Manikanta Maddireddy) - handle 64-bit BARs correctly in endpoint support (Niklas Cassel) - support optional regulator for HiSilicon STB (Shawn Guo) - use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla) - support power supplies for Qualcomm msm8996 (Srinivas Kandagatla) * tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits) MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver HISI LPC: Add ACPI support ACPI / scan: Do not enumerate Indirect IO host children ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings of: Add missing I/O range exception for indirect-IO devices PCI: Apply the new generic I/O management on PCI IO hosts PCI: Add fwnode handler as input param of pci_register_io_range() PCI: Remove __weak tag from pci_register_io_range() MAINTAINERS: Add missing /drivers/pci/cadence directory entry fm10k: Report PCIe link properties with pcie_print_link_status() net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth net/mlx5: Report PCIe link properties with pcie_print_link_status() net/mlx4_core: Report PCIe link properties with pcie_print_link_status() PCI: Add pcie_print_link_status() to log link speed and whether it's limited PCI: Add pcie_bandwidth_available() to compute bandwidth available to device misc: pci_endpoint_test: Handle 64-bit BARs properly PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar ...
2018-04-07	Merge tag 'for-linus-unmerged' of ↵	Linus Torvalds	7	-31/+65
	git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Doug and I are at a conference next week so if another PR is sent I expect it to only be bug fixes. Parav noted yesterday that there are some fringe case behavior changes in his work that he would like to fix, and I see that Intel has a number of rc looking patches for HFI1 they posted yesterday. Parav is again the biggest contributor by patch count with his ongoing work to enable container support in the RDMA stack, followed by Leon doing syzkaller inspired cleanups, though most of the actual fixing went to RC. There is one uncomfortable series here fixing the user ABI to actually work as intended in 32 bit mode. There are lots of notes in the commit messages, but the basic summary is we don't think there is an actual 32 bit kernel user of drivers/infiniband for several good reasons. However we are seeing people want to use a 32 bit user space with 64 bit kernel, which didn't completely work today. So in fixing it we required a 32 bit rxe user to upgrade their userspace. rxe users are still already quite rare and we think a 32 bit one is non-existing. - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them" * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits) IB/rxe: Fix for oops in rxe_register_device on ppc64le arch IB/mlx5: Device memory mr registration support net/mlx5: Mkey creation command adjustments IB/mlx5: Device memory support in mlx5_ib net/mlx5: Query device memory capabilities IB/uverbs: Add device memory registration ioctl support IB/uverbs: Add alloc/free dm uverbs ioctl support IB/uverbs: Add device memory capabilities reporting IB/uverbs: Expose device memory capabilities to user RDMA/qedr: Fix wmb usage in qedr IB/rxe: Removed GID add/del dummy routines RDMA/qedr: Zero stack memory before copying to user space IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR IB/mlx5: Add information for querying IPsec capabilities IB/mlx5: Add IPsec support for egress and ingress {net,IB}/mlx5: Add ipsec helper IB/mlx5: Add modify_flow_action_esp verb IB/mlx5: Add implementation for create and destroy action_xfrm IB/uverbs: Introduce ESP steering match filter IB/uverbs: Add modify ESP flow_action ...
2018-04-07	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	3	-2/+1
	Merge updates from Andrew Morton: - a few misc things - ocfs2 updates - the v9fs maintainers have been missing for a long time. I've taken over v9fs patch slinging. - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits) mm,oom_reaper: check for MMF_OOM_SKIP before complaining mm/ksm: fix interaction with THP mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t headers: untangle kmemleak.h from mm.h include/linux/mmdebug.h: make VM_WARN* non-rvals mm/page_isolation.c: make start_isolate_page_range() fail if already isolated mm: change return type to vm_fault_t mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory kernel/fork.c: detect early free of a live mm mm: make counting of list_lru_one::nr_items lockless mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static block_invalidatepage(): only release page if the full page was invalidated mm: kernel-doc: add missing parameter descriptions mm/swap.c: remove @cold parameter description for release_pages() mm/nommu: remove description of alloc_vm_area zram: drop max_zpage_size and use zs_huge_class_size() zsmalloc: introduce zs_huge_class_size() mm: fix races between swapoff and flush dcache fs/direct-io.c: minor cleanups in do_blockdev_direct_IO ...
2018-04-06	Merge tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd	Linus Torvalds	2	-20/+2
	Pull MTD updates from Boris Brezillon: "MTD Core: - Remove support for asynchronous erase (not implemented by any of the existing drivers anyway) - Remove Cyrille from the list of SPI NOR and MTD maintainers - Fix kernel doc headers - Allow users to define the partitions parsers they want to test through a DT property (compatible of the partitions subnode) - Remove the bfin-async-flash driver (the only architecture using it has been removed) - Fix pagetest test - Add extra checks in mtd_erase() - Simplify the MTD partition creation logic and get rid of mtd_add_device_partitions() MTD Drivers: - Add endianness information to the physmap DT binding - Add Eon EN29LV400A IDs to JEDEC probe logic - Use %ph where appropriate SPI NOR Drivers: - Make fsl-quaspi assign different names to MTD devices connected to the same QSPI controller - Remove an unneeded driver.bus assigned in the fsl-qspi driver NAND Core: - Prepare arrival of the SPI NAND subsystem by implementing a generic (interface-agnostic) layer to ease manipulation of NAND devices - Move onenand code base to the drivers/mtd/nand/ dir - Rework timing mode selection - Provide a generic way for NAND chip drivers to flag a specific GET/SET FEATURE operation as supported/unsupported - Stop embedding ONFI/JEDEC param page in nand_chip NAND Drivers: - Rework/cleanup of the mxc driver - Various cleanups in the vf610 driver - Migrate the fsmc and vf610 to ->exec_op() - Get rid of the pxa driver (replaced by marvell_nand) - Support ->setup_data_interface() in the GPMI driver - Fix probe error path in several drivers - Remove support for unused hw_syndrome mode in sunxi_nand - Various minor improvements" tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd: (89 commits) dt-bindings: fsl-quadspi: Add the example of two SPI NOR mtd: fsl-quadspi: Distinguish the mtd device names mtd: nand: Fix some function description mismatches in core.c mtd: fsl-quadspi: Remove unneeded driver.bus assignment mtd: rawnand: marvell: Rename ->ecc_clk into ->core_clk mtd: rawnand: s3c2410: enhance the probe function error path mtd: rawnand: tango: fix probe function error path mtd: rawnand: sh_flctl: fix the probe function error path mtd: rawnand: omap2: fix the probe function error path mtd: rawnand: mxc: fix probe function error path mtd: rawnand: denali: fix probe function error path mtd: rawnand: davinci: fix probe function error path mtd: rawnand: cafe: fix probe function error path mtd: rawnand: brcmnand: fix probe function error path mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode mtd: rawnand: marvell: Fix clock resource by adding a register clock mtd: ftl: Use DIV_ROUND_UP() mtd: Fix some function description mismatches in mtdcore.c mtd: physmap_of: update struct map_info's swap as per map requirement dt-bindings: mtd-physmap: Add endianness supports ...
2018-04-06	net: phy: marvell: Enable interrupt function on LED2 pin	Esben Haabendal	1	-2/+18
	The LED2[2]/INTn pin on Marvell 88E1318S as well as 88E1510/12/14/18 needs to be configured to be usable as interrupt not only when WOL is enabled, but whenever we rely on interrupts from the PHY. Signed-off-by: Esben Haabendal <eha@deif.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06	ice: Bug fixes in ethtool code	Anirudh Venkataramanan	1	-2/+2
	1) Return correct size from ice_get_regs_len. 2) Fix incorrect use of ARRAY_SIZE in ice_get_regs. Fixes: fcea6f3da546 (ice: Add stats and ethtool support) Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-06	ice: Fix error return code in ice_init_hw()	Wei Yongjun	1	-1/+3
	Fix to return error code ICE_ERR_NO_MEMORY from the alloc error handling case instead of 0, as done elsewhere in this function. Fixes: dc49c7723676 ("ice: Get MAC/PHY/link info and scheduler topology") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-04-06	jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC	Rafael J. Wysocki	1	-1/+1
	Since the subsequent changes will need a TICK_USEC definition analogous to TICK_NSEC, rename the existing TICK_USEC as USER_TICK_USEC, update its users and redefine TICK_USEC accordingly. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2018-04-06	headers: untangle kmemleak.h from mm.h	Randy Dunlap	3	-2/+1
	Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious reason. It looks like it's only a convenience, so remove kmemleak.h from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that don't already #include it. Also remove <linux/kmemleak.h> from source files that do not use it. This is tested on i386 allmodconfig and x86_64 allmodconfig. It would be good to run it through the 0day bot for other $ARCHes. I have neither the horsepower nor the storage space for the other $ARCHes. Update: This patch has been extensively build-tested by both the 0day bot & kisskb/ozlabs build farms. Both of them reported 2 build failures for which patches are included here (in v2). [ slab.h is the second most used header file after module.h; kernel.h is right there with slab.h. There could be some minor error in the counting due to some #includes having comments after them and I didn't combine all of those. ] [akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr] Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org Link: http://kisskb.ellerman.id.au/kisskb/head/13396/ Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> [2 build failures] Reported-by: Fengguang Wu <fengguang.wu@intel.com> [2 build failures] Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-06	hv_netvsc: Pass net_device parameter to revoke and teardown functions	Mohammed Gamal	1	-19/+18
	The callers to netvsc_revoke__buf() and netvsc_teardown__gpadl() already have their net_device instances. Pass them as a paramaeter to the function instead of obtaining them from netvsc_device struct everytime Signed-off-by: Mohammed Gamal <mgamal@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06	hv_netvsc: Ensure correct teardown message sequence order	Mohammed Gamal	1	-6/+13
	Prior to commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") the call sequence in netvsc_device_remove() was as follows (as implemented in netvsc_destroy_buf()): 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Teardown receive buffer GPADL 3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 4- Teardown send buffer GPADL 5- Close vmbus This didn't work for WS2016 hosts. Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") rearranged the teardown sequence as follows: 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 3- Close vmbus 4- Teardown receive buffer GPADL 5- Teardown send buffer GPADL That worked well for WS2016 hosts, but it prevented guests on older hosts from shutting down after changing network settings. Commit 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions") ensured the following message sequence for older hosts 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 3- Teardown receive buffer GPADL 4- Teardown send buffer GPADL 5- Close vmbus However, with this sequence calling `ip link set eth0 mtu 1000` hangs and the process becomes uninterruptible. On futher analysis it turns out that on tearing down the receive buffer GPADL the kernel is waiting indefinitely in vmbus_teardown_gpadl() for a completion to be signaled. Here is a snippet of where this occurs: int vmbus_teardown_gpadl(struct vmbus_channel channel, u32 gpadl_handle) { struct vmbus_channel_gpadl_teardown msg; struct vmbus_channel_msginfo info; unsigned long flags; int ret; info = kmalloc(sizeof(info) + sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL); if (!info) return -ENOMEM; init_completion(&info->waitevent); info->waiting_channel = channel; [....] ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown), true); if (ret) goto post_msg_err; wait_for_completion(&info->waitevent); [....] } The completion is signaled from vmbus_ongpadl_torndown(), which gets called when the corresponding message is received from the host, which apparently never happens in that case. This patch works around the issue by restoring the first mentioned message sequence for older hosts Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions") Signed-off-by: Mohammed Gamal <mgamal@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06	hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()	Mohammed Gamal	1	-12/+34
	Split each of the functions into two for each of send/recv buffers. This will be needed in order to implement a fine-grained messaging sequence to the host so that we accommodate the requirements of different Windows versions Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions") Signed-off-by: Mohammed Gamal <mgamal@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06	hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown	Mohammed Gamal	1	-2/+2
	When changing network interface settings, Windows guests older than WS2016 can no longer shutdown. This was addressed by commit 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions"), however the issue also occurs on WS2012 guests that share NVSP protocol versions with WS2016 guests. Hence we use Windows version directly to differentiate them. Fixes: 0ef58b0a05c12 ("hv_netvsc: change GPAD teardown order on older versions") Signed-off-by: Mohammed Gamal <mgamal@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-06	net: mvpp2: Fix parser entry init boundary check	Maxime Chevallier	1	-1/+1
	Boundary check in mvpp2_prs_init_from_hw must be done according to the passed "tid" parameter, not the mvpp2_prs_entry index, which is not yet initialized at the time of the check. Fixes: 47e0e14eb1a6 ("net: mvpp2: Make mvpp2_prs_hw_read a parser entry init function") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-05	net/mlx5: Mkey creation command adjustments	Ariel Levkovich	3	-3/+3
	This change updates the mlx5 interface to create mkey on the device. The updates in the command mailbox include increasing the access mode type field to 5 bits in order to support additional types such as MLX5_MKC_ACCESS_MODE_MEMIC which represents device memory access type and will be used when registering MR on allocated device memory. All the places that use the old access mode format are adjusted as well. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05	net/mlx5: Query device memory capabilities	Ariel Levkovich	1	-0/+6
	This patch adds querying of device memory capabilities by the mlx5_core driver during initialization. Device memory capabilities is a new capability type and structure which contains the necessary data that is needed for future device memory allocation. The presence of this new capabilities struct is indicated in the general capabilities struct which is queried first by the driver. If the presence bit is set, the driver will also query the new capabilities struct and save it in the device context. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05	Merge branch 'for-linus' of ↵	Linus Torvalds	6	-6/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: kfifo: fix inaccurate comment tools/thermal: tmon: fix for segfault net: Spelling s/stucture/structure/ edd: don't spam log if no EDD information is present Documentation: Fix early-microcode.txt references after file rename tracing: Block comments should align the * on each line treewide: Fix typos in printk GenWQE: Fix a typo in two comments treewide: Align function definition open/close braces
2018-04-05	Merge tag 'driver-core-4.17-rc1' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the "big" set of driver core patches for 4.17-rc1. There's really not much here, just a bunch of firmware code refactoring from Luis as he attempts to wrangle that codebase into something that is managable, along with a bunch of userspace tests for it. Other than that, a handful of small bugfixes and reverts of things that didn't work out. Full details are in the shortlog, it's not all that much. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits) drivers: base: remove check for callback in coredump_store() mt7601u: use firmware_request_cache() to address cache on reboot firmware: add firmware_request_cache() to help with cache on reboot firmware: fix typo on pr_info_once() when ignore_sysfs_fallback is used firmware: explicitly include vmalloc.h firmware: ensure the firmware cache is not used on incompatible calls test_firmware: modify custom fallback tests to use unique files firmware: add helper to check to see if fw cache is setup firmware: fix checking for return values for fw_add_devm_name() rename: _request_firmware_load() fw_load_sysfs_fallback() test_firmware: test three firmware kernel configs using a proc knob test_firmware: expand on library with shared helpers firmware: enable to force disable the fallback mechanism at run time firmware: enable run time change of forcing fallback loader firmware: move firmware loader into its own directory firmware: split firmware fallback functionality into its own file firmware: move loading timeout under struct firmware_fallback_config firmware: use helpers for setting up a temporary cache timeout firmware: simplify CONFIG_FW_LOADER_USER_HELPER_FALLBACK further drivers: base: add description for .coredump() callback ...
2018-04-04	netdevsim: remove incorrect __net_initdata annotations	Arnd Bergmann	2	-2/+2
	The __net_initdata section cannot currently be used for structures that get cleaned up in an exitcall using unregister_pernet_operations: WARNING: vmlinux.o(.text+0x868c34): Section mismatch in reference from the function nsim_devlink_exit() to the (unknown reference) .init.data:(unknown) The function nsim_devlink_exit() references the (unknown reference) __initdata (unknown). This is often because nsim_devlink_exit lacks a __initdata annotation or the annotation of (unknown) is wrong. WARNING: vmlinux.o(.text+0x868c64): Section mismatch in reference from the function nsim_devlink_init() to the (unknown reference) .init.data:(unknown) WARNING: vmlinux.o(.text+0x8692bc): Section mismatch in reference from the function nsim_fib_exit() to the (unknown reference) .init.data:(unknown) WARNING: vmlinux.o(.text+0x869300): Section mismatch in reference from the function nsim_fib_init() to the (unknown reference) .init.data:(unknown) As that warning tells us, discarding the structure after a module is loaded would lead to a undefined behavior when that module is removed. It might be possible to change that annotation so it has no effect for loadable modules, but I have not figured out exactly how to do that, and we want this to be fixed in -rc1. This just removes the annotations, just like we do for all other such modules. Fixes: 37923ed6b8ce ("netdevsim: Add simple FIB resource controller via devlink") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-04	sfc: remove ctpio_dmabuf_start from stats	Bert Kenward	2	-3/+0
	The ctpio_dmabuf_start entry is not actually a stat and shouldn't be exposed to ethtool. Fixes: 2c0b6ee837db ("sfc: expose CTPIO stats on NICs that support them") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>