kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-20	octeontx2-af: Support to enable/disable default MCAM entries	Sunil Goutham	4	-28/+122
	For a PF/VF with a NIXLF attached has default/reserved MCAM entries for receiving Ucast/Bcast/Promisc traffic. Ideally traffic should be forwarded to NIXLF only after it's contexts are initialized. This patch keeps these default entries disabled and adds mbox messages for a PF/VF to enable these once NPA/NIXLF initialization is done. Likewise while PF/VF is being teared down, it can send the disable mailbox message to stop receiving traffic. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Add MKEX default profile	Santosh Shukla	3	-25/+151
	Added basic default MKEX profile. This profile tells hardware what data to extract from packet and where to place it (bit offset) in final KEY generated for the parsed packet. Based on the bit placement of the packet data, MCAM entries have to programmed for matching. Also added a msg to retrieve this MKEX profile from PF/VF which inturn can process it to determine how MCAM entry has to be populated. Signed-off-by: Santosh Shukla <sshukla@marvell.com> Signed-off-by: Yuri Tolstov <ytolstov@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Alloc and config NPC MCAM entry at a time	Sunil Goutham	3	-0/+94
	A new mailbox message is added to support allocating a MCAM entry along with a counter and configuring it in one go. This reduces the amount of mailbox communication involved in installing a new MCAM rule. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Map or unmap NPC MCAM entry and counter	Sunil Goutham	3	-4/+192
	Alloc memory to save MCAM 'entry to counter' mapping and since multiple entries can map to same counter, added counter's reference count tracking. Do 'entry to counter' mapping when a entry is being installed and mbox msg sender requested to configure a counter as well. Mapping is removed when a entry or counter is being freed or a explicit mbox msg is received to unmap them. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Support for NPC MCAM counters	Sunil Goutham	3	-0/+190
	NPC HW has counters which can be mapped to MCAM entries to gather entry match statistics. This patch adds support to allocate, free, clear and retrieve stats of NPC MCAM counters. New mailbox messages have been added for this. Similar to MCAM entries both contiguous and non-contiguous counter allocation is supported. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: MCAM entry installation support	Sunil Goutham	3	-8/+228
	Add support for a RVU PF/VF to enable, disable, configure and shuffle MCAM entries via mbox commands. This patch adds mailbox message formats and handling of these commands. As of now otherthan validating MCAM entry index, info like channel number e.t.c in MCAM config data sent by PF/VF are not validated. Also a max of 64 MCAM entries can be shuffled with a single mbox command. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: NPC MCAM entry alloc/free support	Sunil Goutham	4	-5/+588
	This patch adds NPC MCAM entry management and support for allocating and freeing them via mailbox. Both contiguous and non-contiguous allocations are supported. Incase of contiguous, if request cannot be met then max contiguous number of available entries are allocated. High or low priority index allocation w.r.t a reference MCAM index is also supported. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Relax resource lock into mutex	Stanislaw Kardach	4	-25/+30
	Mailbox message handling is done in a workqueue context scheduled from interrupt handler. So resource locks does not need to be a spinlock. Therefore relax them into a mutex so that later on we may use them in routines that might sleep. Signed-off-by: Stanislaw Kardach <skardach@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Support to get NIX HW constants from AF	Kiran Kumar	2	-0/+12
	This patch adds reading HW limits like number of Rx/Tx stats, number of queue IRQs supported per NIX LF from AF registers and sync them to PF/VF. Signed-off-by: Kiran Kumar <kirankumark@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Support to modify min/max allowed packet lengths	Sunil Goutham	5	-1/+220
	This patch adds support for RVU PF/VFs to modify min/max packet lengths allowed by HW. For VFs on PF0, settings will be automatically applied on LBK link. RX link's min/maxlen is configured to min/max of PF and it's all VFs. On the TX side if requested all SMQs attached to the requesting NIXLF will be updated with new min/max lengths. Also updates transmit credits for Tx links based on new maxlen. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	octeontx2-af: Convert mbox handlers APIs to lowercase	Sunil Goutham	7	-102/+107
	This patch converts all mailbox message handler API names to lowercase. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	Merge branch 'r8169-series-with-further-smaller-improvements'	David S. Miller	1	-171/+112
	Heiner Kallweit says: ==================== r8169: series with further smaller improvements Again nothing exciting, just smaller improvements. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: improve chip version identification	Heiner Kallweit	1	-60/+59
	Only the upper 12 bits are used for chip identification, this helps to reduce the size of array mac_info. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: simplify ocp functions	Heiner Kallweit	1	-51/+17
	rtl8168_oob_notify is used in rtl8168dp_driver_start and rtl8168dp_driver_stop only, so we can rename it to r8168dp_oob_notify. The same applies to condition rtl_ocp_read_cond which can be renamed to rtl_dp_ocp_read_cond. This allows to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: remove workaround for ancient gcc bug	Heiner Kallweit	1	-3/+3
	The kernel can't be built any longer with this ancient GCC version. Eventually it becomes clear what this statement actually does. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: remove manual padding in struct ring_info	Heiner Kallweit	1	-1/+0
	The compiler takes care of alignment and padding, I see no need to bother him with manual hints. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: remove "not PCI Express" message	Heiner Kallweit	1	-3/+0
	The ones who want to know can easily identify whether chip is PCI or PCIe based on the chip name. I doubt there's any benefit in this message, so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: remove print_mac_version	Heiner Kallweit	1	-9/+0
	The syslog message printed on driver load allows to easily identify the mac version number (based on chip name and XID). So we don't need this extra debug message which is wrong anyway because e.g. RTL_GIGA_MAC_VER_01 has value 0. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: use PCI_VDEVICE macro	Heiner Kallweit	1	-14/+14
	Using macro PCI_VDEVICE helps to simplify the PCI ID table. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: replace event_slow with irq_mask	Heiner Kallweit	1	-11/+11
	Recently the "slow event" handler was removed, therefore the member name isn't appropriate any longer. In addition store the full mask, including the RTL_EVENT_NAPI interrupt source bits. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: remove unused interrupt sources	Heiner Kallweit	1	-4/+3
	Setting PCSTimeout interrupt source was copied from the vendor driver which uses the chip programmable timer interrupt. The mainline driver doesn't use this timer interrupt. SYSErr indicates a PCI error and isn't defined on the PCIe models. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: use dev_get_drvdata where possible	Heiner Kallweit	1	-10/+5
	Using dev_get_drvdata directly is simpler here. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	r8169: merge rtl_irq_enable and rtl_irq_enable_all	Heiner Kallweit	1	-9/+4
	After the recent changes to the interrupt handler rtl_irq_enable and rtl_irq_enable_all can be merged. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	NFSv4: Fix a NFSv4 state manager deadlock	Trond Myklebust	2	-5/+13
	Fix a deadlock whereby the NFSv4 state manager can get stuck in the delegation return code, waiting for a layout return to complete in another thread. If the server reboots before that other thread completes, then we need to be able to start a second state manager thread in order to perform recovery. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-20	Merge branch 'bpf-zero-hash-seed'	Daniel Borkmann	4	-17/+82
	Lorenz Bauer says: ==================== Allow forcing the seed of a hash table to zero, for deterministic execution during benchmarking and testing. Changes from v2: * Change ordering of BPF_F_ZERO_SEED in linux/bpf.h Comments adressed from v1: * Add comment to discourage production use to linux/bpf.h * Require CAP_SYS_ADMIN ==================== Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	tools: add selftest for BPF_F_ZERO_SEED	Lorenz Bauer	1	-9/+55
	Check that iterating two separate hash maps produces the same order of keys if BPF_F_ZERO_SEED is used. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	tools: sync linux/bpf.h	Lorenz Bauer	1	-3/+10
	Synchronize changes to linux/bpf.h from * "bpf: allow zero-initializing hash map seed" * "bpf: move BPF_F_QUERY_EFFECTIVE after map flags" Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: move BPF_F_QUERY_EFFECTIVE after map flags	Lorenz Bauer	1	-3/+3
	BPF_F_QUERY_EFFECTIVE is in the middle of the flags valid for BPF_MAP_CREATE. Move it to its own section to reduce confusion. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: allow zero-initializing hash map seed	Lorenz Bauer	2	-2/+14
	Add a new flag BPF_F_ZERO_SEED, which forces a hash map to initialize the seed to zero. This is useful when doing performance analysis both on individual BPF programs, as well as the kernel's hash table implementation. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	bpf: libbpf: retry map creation without the name	Stanislav Fomichev	1	-1/+10
	Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name for maps. Pre v4.14 kernels don't know about map names and return an error about unexpected non-zero data. Retry sys_bpf without a map name to cover older kernels. v2 changes: * check for errno == EINVAL as suggested by Daniel Borkmann Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20	net/mlx5e: Fix failing ethtool query on FEC query error	Shay Agroskin	1	-2/+1
	If FEC caps query fails when executing 'ethtool <interface>' the whole callback fails unnecessarily, fixed that by replacing the error return code with debug logging only. Fixes: 6cfa94605091 ("net/mlx5e: Ethtool driver callback for query/set FEC policy") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Removed unnecessary warnings in FEC caps query	Shay Agroskin	2	-4/+4
	Querying interface FEC caps with 'ethtool [int]' after link reset throws warning regading link speed. This warning is not needed as there is already an indication in user space that the link is not up. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Fix wrong field name in FEC related functions	Shay Agroskin	1	-2/+2
	This bug would result in reading wrong FEC capabilities for 10G/40G. Fixes: 2095b2641477 ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds	Shay Agroskin	1	-16/+12
	Some speeds don't support turning FEC policy off. In case a requested FEC policy is not supported for a speed (including current speed), its new FEC policy would be: no FEC - if disabling FEC is supported for that speed unchanged - else Fixes: 2095b2641477 ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	Merge branch 'ena-hibernation-and-rmmod-bug-fixes'	David S. Miller	2	-13/+12
	Arthur Kiyanovski says: ==================== net: ena: hibernation and rmmod bug fixes This patchset includes 2 bug fixes: 1. A fix to a crash during resume from hibernation. 2. A fix to an illegal memory access during driver removal (e.g. during rmmod) which might cause a crash in certain systems. The subminor number in the driver version is also promoted to indicate driver was changed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	net: ena: update driver version from 2.0.1 to 2.0.2	Arthur Kiyanovski	1	-1/+1
	Update driver version due to critical bug fixes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	net: ena: fix crash during ena_remove()	Arthur Kiyanovski	1	-11/+10
	In ena_remove() we have the following stack call: ena_remove() unregister_netdev() ena_destroy_device() netif_carrier_off() Calling netif_carrier_off() causes linkwatch to try to handle the link change event on the already unregistered netdev, which leads to a read from an unreadable memory address. This patch switches the order of the two functions, so that netif_carrier_off() is called on a regiestered netdev. To accomplish this fix we also had to: 1. Remove the set bit ENA_FLAG_TRIGGER_RESET 2. Add a sanitiy check in ena_close() both to prevent double device reset (when calling unregister_netdev() ena_close is called, but the device was already deleted in ena_destroy_device()). 3. Set the admin_queue running state to false to avoid using it after device was reset (for example when calling ena_destroy_all_io_queues() right after ena_com_dev_reset() in ena_down) Fixes: 944b28aa2982 ("net: ena: fix missing lock during device destruction") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	net: ena: fix crash during failed resume from hibernation	Arthur Kiyanovski	1	-1/+1
	During resume from hibernation if ena_restore_device fails, ena_com_dev_reset() is called, and uses the readless read mechanism, which was already destroyed by the call to ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer reference. In this commit we switch the call order of the above two functions to avoid this crash. Fixes: d7703ddbd7c9 ("net: ena: fix rare bug when failed restart/resume is followed by driver removal") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	sctp: not increase stream's incnt before sending addstrm_in request	Xin Long	1	-1/+0
	Different from processing the addstrm_out request, The receiver handles an addstrm_in request by sending back an addstrm_out request to the sender who will increase its stream's in and incnt later. Now stream->incnt has been increased since it sent out the addstrm_in request in sctp_send_add_streams(), with the wrong stream->incnt will even cause crash when copying stream info from the old stream's in to the new one's in sctp_process_strreset_addstrm_out(). This patch is to fix it by simply removing the stream->incnt change from sctp_send_add_streams(). Fixes: 242bd2d519d7 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20	net/mlx5e: Fix selftest for small MTUs	Valentine Fatiev	1	-16/+10
	Loopback test had fixed packet size, which can be bigger than configured MTU. Shorten the loopback packet size to be bigger than minimal MTU allowed by the device. Text field removed from struct 'mlx5ehdr' as redundant to allow send small packets as minimal allowed MTU. Fixes: d605d66 ("net/mlx5e: Add support for ethtool self diagnostics test") Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: RX, verify received packet size in Linear Striding RQ	Moshe Shemesh	5	-1/+15
	In case of striding RQ, we use MPWRQ (Multi Packet WQE RQ), which means that WQE (RX descriptor) can be used for many packets and so the WQE is much bigger than MTU. In virtualization setups where the port mtu can be larger than the vf mtu, if received packet is bigger than MTU, it won't be dropped by HW on too small receive WQE. If we use linear SKB in striding RQ, since each stride has room for mtu size payload and skb info, an oversized packet can lead to crash for crossing allocated page boundary upon the call to build_skb. So driver needs to check packet size and drop it. Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the number of packets dropped by the driver for being too large. As a new field is added to the RQ struct, re-open the channels whenever this field is being used in datapath (i.e., in the case of linear Striding RQ). Fixes: 619a8f2a42f1 ("net/mlx5e: Use linear SKB in Striding RQ") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Apply the correct check for supporting TC esw rules split	Roi Dayan	1	-1/+1
	The mirror and not the output count is the one denoting a split. Fix to condition the offload attempt on the mirror count being > 0 along the firmware to have the related capability. Fixes: 592d36515969 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Yossi Kuperman <yossiku@mellanox.com> Reviewed-by: Chris Mi <chrism@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Adjust to max number of channles when re-attaching	Yuval Avnery	1	-5/+22
	When core driver enters deattach/attach flow after pci reset, Number of logical CPUs may have changed. As a result we need to update the cpu affiliated resource tables. 1. indirect rqt list 2. eq table Reproduction (PowerPC): echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes ppc64_cpu --smt=on # Restart driver modprobe -r ... ; modprobe ... # Link up ifconfig ... # Only physical CPUs ppc64_cpu --smt=off # Inject PCI errors so PCI will reset - calling the pci error handler echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA Call trace when trying to add non-existing rqs to an indirect rqt: mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable) mlx5e_redirect_rqts+0x188/0x190 [mlx5_core] mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core] mlx5e_open_locked+0xbc/0x140 [mlx5_core] mlx5e_open+0x50/0x130 [mlx5_core] mlx5e_nic_enable+0x174/0x1b0 [mlx5_core] mlx5e_attach_netdev+0x154/0x290 [mlx5_core] mlx5e_attach+0x88/0xd0 [mlx5_core] mlx5_attach_device+0x168/0x1e0 [mlx5_core] mlx5_load_one+0x1140/0x1210 [mlx5_core] mlx5_pci_resume+0x6c/0xf0 [mlx5_core] Create cq will fail when trying to use non-existing EQ. Fixes: 89d44f0a6c73 ("net/mlx5_core: Add pci error handlers to mlx5_core driver") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Always use the match level enum when parsing TC rule match	Or Gerlitz	1	-2/+2
	We get the match level (none, l2, l3, l4) while going over the match dissectors of an offloaded tc rule. When doing this, the match level enum and the not min inline enum values should be used, fix that. This worked accidentally b/c both enums have the same numerical values. Fixes: d708f902989b ('net/mlx5e: Get the required HW match level while parsing TC flow matches') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Claim TC hw offloads support only under a proper build config	Or Gerlitz	1	-0/+6
	Currently, we are only supporting tc hw offloads when the eswitch support is compiled in, but we are not gating the adevertizment of the NETIF_F_HW_TC feature on this config being set. Fix it, and while doing that, also avoid dealing with the feature on ethtool when the config is not set. Fixes: e8f887ac6a45 ('net/mlx5e: Introduce tc offload support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded	Or Gerlitz	1	-31/+32
	For the "all" ethertype we should not care whether the packet has vlans. Besides being wrong, the way we did it caused FW error for rules such as: tc filter add dev eth0 protocol all parent ffff: \ prio 1 flower skip_sw action drop b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec) wasn't set. Fix that by matching on vlan non-existence only if we were also told to match on the ethertype. Fixes: cee26487620b ('net/mlx5e: Set vlan masks for all offloaded TC rules') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5e: IPoIB, Reset QP after channels are closed	Denis Drozdov	1	-1/+1
	The mlx5e channels should be closed before mlx5i_uninit_underlay_qp puts the QP into RST (reset) state during mlx5i_close. Currently QP state incorrectly set to RST before channels got deactivated and closed, since mlx5_post_send request expects QP in RTS (Ready To Send) state. The fix is to keep QP in RTS state until mlx5e channels get closed and to reset QP afterwards. Also this fix is simply correct in order to keep the open/close flow symmetric, i.e mlx5i_init_underlay_qp() is called first thing at open, the correct thing to do is to call mlx5i_uninit_underlay_qp() last thing at close, which is exactly what this patch is doing. Fixes: dae37456c8ac ("net/mlx5: Support for attaching multiple underlay QPs to root flow table") Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	net/mlx5: IPSec, Fix the SA context hash key	Raed Salem	1	-2/+8
	The commit "net/mlx5: Refactor accel IPSec code" introduced a bug where asynchronous short time change in hash key value by create/release SA context might happen during an asynchronous hash resize operation this could cause a subsequent remove SA context operation to fail as the key value used during resize is not the same key value used when remove SA context operation is invoked. This commit fixes the bug by defining the SA context hash key such that it includes only fields that never change during the lifetime of the SA context object. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-20	xfs: make xfs_file_remap_range() static	Eric Biggers	1	-1/+1
	xfs_file_remap_range() is only used in fs/xfs/xfs_file.c, so make it static. This addresses a gcc warning when -Wmissing-prototypes is enabled. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-11-20	xfs: fix shared extent data corruption due to missing cow reservation	Brian Foster	1	-0/+1
	Page writeback indirectly handles shared extents via the existence of overlapping COW fork blocks. If COW fork blocks exist, writeback always performs the associated copy-on-write regardless if the underlying blocks are actually shared. If the blocks are shared, then overlapping COW fork blocks must always exist. fstests shared/010 reproduces a case where a buffered write occurs over a shared block without performing the requisite COW fork reservation. This ultimately causes writeback to the shared extent and data corruption that is detected across md5 checks of the filesystem across a mount cycle. The problem occurs when a buffered write lands over a shared extent that crosses an extent size hint boundary and that also happens to have a partial COW reservation that doesn't cover the start and end blocks of the data fork extent. For example, a buffered write occurs across the file offset (in FSB units) range of [29, 57]. A shared extent exists at blocks [29, 35] and COW reservation already exists at blocks [32, 34]. After accommodating a COW extent size hint of 32 blocks and the existing reservation at offset 32, xfs_reflink_reserve_cow() allocates 32 blocks of reservation at offset 0 and returns with COW reservation across the range of [0, 34]. The associated data fork extent is still [29, 35], however, which isn't fully covered by the COW reservation. This leads to a buffered write at file offset 35 over a shared extent without associated COW reservation. Writeback eventually kicks in, performs an overwrite of the underlying shared block and causes the associated data corruption. Update xfs_reflink_reserve_cow() to accommodate the fact that a delalloc allocation request may not fully cover the extent in the data fork. Trim the data fork extent appropriately, just as is done for shared extent boundaries and/or existing COW reservations that happen to overlap the start of the data fork extent. This prevents shared/010 failures due to data corruption on reflink enabled filesystems. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>