summaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp
AgeCommit message (Collapse)AuthorFilesLines
2017-05-01IB/core: Rename struct ib_ah_attr to rdma_ah_attrDasaratharaman Chandramouli5-5/+5
This patch simply renames struct ib_ah_attr to rdma_ah_attr as these fields specify attributes that are not necessarily specific to IB. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/IPoIB: Remove 'else' when the 'if' has a return.Dasaratharaman Chandramouli1-10/+9
This patch fixes a checkpatch issue related to not having to use an 'else' if the 'if' path returns from the function. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-29IB/core: Move opa_class_port_info definition to header fileDasaratharaman Chandramouli1-25/+0
Both opa_vnic and the hfi driver use the same opa_classport_info definition. We will also have ib_sa capable of querying opa class port info and would need this definition. Move it to ib_mad.h for everyone to use. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28IB/SA: Modify SA to implicitly cache Class Port infoDasaratharaman Chandramouli3-78/+3
SA will query and cache class port info as part of its initialization. SA will also invalidate and refresh the cache based on specific events. Callers such as IPoIB and CM can query the SA to get the classportinfo information. Apart from making the caller code much simpler, this change puts the onus on the SA to query and maintain classportinfo much like how it maitains the address handle to the SM. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25IB/iser: fix spelling mistake: "unexepected" -> "unexpected"Colin Ian King1-1/+1
trivial fix to spelling mistake in iser_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/ipoib: Fix deadlock between ipoib_stop and mcast join flowFeras Daoud1-6/+5
Before calling ipoib_stop, rtnl_lock should be taken, then the flow clears the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP flags, and waits for mcast completion if IPOIB_MCAST_FLAG_BUSY is set. On the other hand, the flow of multicast join task initializes a mcast completion, sets the IPOIB_MCAST_FLAG_BUSY and calls ipoib_mcast_join. If IPOIB_FLAG_OPER_UP flag is not set, this call returns EINVAL without setting the mcast completion and leads to a deadlock. ipoib_stop | | | clear_bit(IPOIB_FLAG_ADMIN_UP) | | | Context Switch | | ipoib_mcast_join_task | | | spin_lock_irq(lock) | | | init_completion(mcast) | | | set_bit(IPOIB_MCAST_FLAG_BUSY) | | | Context Switch | | clear_bit(IPOIB_FLAG_OPER_UP) | | | spin_lock_irqsave(lock) | | | Context Switch | | ipoib_mcast_join | return (-EINVAL) | | | spin_unlock_irq(lock) | | | Context Switch | | ipoib_mcast_dev_flush | wait_for_completion(mcast) | ipoib_stop will wait for mcast completion for ever, and will not release the rtnl_lock. As a result panic occurs with the following trace: [13441.639268] Call Trace: [13441.640150] [<ffffffff8168b579>] schedule+0x29/0x70 [13441.641038] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0 [13441.641914] [<ffffffff810bc017>] ? complete+0x47/0x50 [13441.642765] [<ffffffff810a690d>] ? flush_workqueue_prep_pwqs+0x16d/0x200 [13441.643580] [<ffffffff8168b956>] wait_for_completion+0x116/0x170 [13441.644434] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20 [13441.645293] [<ffffffffa05af170>] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib] [13441.646159] [<ffffffffa05ac967>] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib] [13441.647013] [<ffffffffa05a4805>] ipoib_stop+0x75/0x150 [ib_ipoib] Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/ipoib: Update broadcast object if PKey value was changed in index 0Feras Daoud1-0/+13
Update the broadcast address in the priv->broadcast object when the Pkey value changes in index 0, otherwise the multicast GID value will keep the previous value of the PKey, and will not be updated. This leads to interface state down because the interface will keep the old PKey value. For example, in SR-IOV environment, if the PF changes the value of PKey index 0 for one of the VFs, then the VF receives PKey change event that triggers heavy flush. This flush calls update_parent_pkey that update the broadcast object and its relevant members. If in this case the multicast GID will not be updated, the interface state will be down. Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Support acceleration options callbacksErez Shitrit7-76/+188
IPoIB driver now uses the new set of callback functions. If the hardware provider supports the new ipoib_options implementation, the driver uses the callbacks in its data path flows, otherwise it uses the driver default implementation for all data flows in its code. The default implementation wasn't change and it is exactly as it was before introduction of acceleration support. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Use defined function for netdev_priv functionErez Shitrit10-129/+136
Make ipoib_priv point to netdev_priv where the code calls netdev_priv. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Rename qpn to be dqpn in ipoib_send and post_send functionsErez Shitrit2-7/+8
Change of function parameter name from qpn to be dqpn. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Separate control from HW operation on ipoib_open/stop ndoErez Shitrit3-103/+131
This patch is preparing the netdev part at the IPoIB driver to be able to use the ipoib_options. It deals with the two flows from the .ndo: ipoib_open and ipoib_stop. The code is rearranged as follows: * All operations which deal with the hardware resources, (for example change QP state, post-receive etc.) are performed in one place. * All operations that are control oriented (like restart multicast task, start the reap_ah etc.) are performed in separate place. The functions that deal with the hardware resources now located at __ipoib_ib_dev_open for the ipoib_open flow and __ipoib_ib_dev_stop for ipoib_stop. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Separate control and data related initializationsErez Shitrit4-88/+109
This patch prepares init and teardown flows so we can call them through ipoib_options function pointers. It arranges that area of code as the following: * All operations which deal with the resource allocation/deletion are performed in one place. * All operations that are control oriented, meaning that they are not connected to a specific hardware, are performed in a separate place. The operations for allocation of hardware resources are now in the function ipoib_dev_init_default, and the deletion of all the resources are in ipoib_dev_uninit_default The only exception is the creation of the PD object, which is used both for resource allocation (create QP etc.) and for control flows like creating AH. It also does: * Move creation of rx_ring and tx_ring to be in the resources allocation area. * Move the function ipoib_ib_dev_open that does the open device to the control area instead of the dev_init which creates resources. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management Agent (VEMA) functionVishwanathapura, Niranjana5-5/+1106
OPA VEMA function interfaces with the Infiniband MAD stack to exchange the management information packets with the Ethernet Manager (EM). It interfaces with the OPA VNIC netdev function to SET/GET the management information. The information exchanged with the EM includes class port details, encapsulation configuration, various counters, unicast and multicast MAC list and the MAC table. It also supports sending traps to the EM. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management Agent (VEMA) interfaceVishwanathapura, Niranjana5-2/+581
OPA VNIC EMA interface functions are the management interfaces to the OPA VNIC netdev. Add support to add and remove VNIC ports. Implement the required GET/SET management interface functions and processing of new management information. Add support to send trap notifications upon various events like interface status change, unicast/multicast mac list update and mac address change. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC MAC table supportVishwanathapura, Niranjana3-0/+291
OPA VNIC MAC table contains the MAC address to DLID mappings provided by the Ethernet manager. During transmission, the MAC table provides the MAC address to DLID translation. Implement MAC table using simple hash list. Also provide support to update/query the MAC table by Ethernet manager. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC statistics supportVishwanathapura, Niranjana3-0/+132
OPA VNIC driver statistics support maintains various counters including standard netdev counters and the Ethernet manager defined counters. Add the Ethtool hook to read the counters. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management (EM) structure definitionsVishwanathapura, Niranjana2-0/+456
Define VNIC EM MAD structures and the associated macros. These structures are used for information exchange between VNIC EM agent (EMA) on the host and the Ethernet manager. These include the virtual ethernet switch (vesw) port information, vesw port mac table, summay and error counters, vesw port interface mac lists and the EMA trap. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: Virtual Network Interface Controller (VNIC) netdevVishwanathapura, Niranjana8-0/+794
OPA VNIC netdev function supports Ethernet functionality over Omni-Path fabric by encapsulating Ethernet packets inside Omni-Path packet header. It allocates a rdma netdev device and interfaces with the network stack to provide standard Ethernet network interfaces. It overrides HFI1 device's netdev operations where it is required. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdeviceDoug Ledford3-8/+42
2017-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2-19/+49
Pull SCSI target fixes from Nicholas Bellinger: "There has been work in a number of different areas over the last weeks, including: - Fix target-core-user (TCMU) back-end bi-directional handling (Xiubo Li + Mike Christie + Ilias Tsitsimpis) - Fix iscsi-target TMR reference leak during session shutdown (Rob Millner + Chu Yuan Lin) - Fix target_core_fabric_configfs.c race between LUN shutdown + mapped LUN creation (James Shen) - Fix target-core unknown fabric callback queue-full errors (Potnuri Bharat Teja) - Fix iscsi-target + iser-target queue-full handling in order to support iw_cxgb4 RNICs. (Potnuri Bharat Teja + Sagi Grimberg) - Fix ALUA transition state race between multiple initiator (Mike Christie) - Drop work-around for legacy GlobalSAN initiator, to allow QLogic 57840S + 579xx offload HBAs to work out-of-the-box in MSFT environments. (Martin Svec + Arun Easi) Note that a number are CC'ed for stable, and although the queue-full bug-fixes required for iser-target to work with iw_cxgb4 aren't CC'ed here, they'll be posted to Greg-KH separately" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case iscsi-target: Drop work-around for legacy GlobalSAN initiator target: Fix ALUA transition state race between multiple initiators iser-target: avoid posting a recv buffer twice iser-target: Fix queue-full response handling iscsi-target: Propigate queue_data_in + queue_status errors target: Fix unknown fabric callback queue-full errors tcmu: Fix wrongly calculating of the base_command_size tcmu: Fix possible overwrite of t_data_sg's last iov[] target: Avoid mappedlun symlink creation during lun shutdown iscsi-target: Fix TMR reference leak during session shutdown usb: gadget: Correct usb EP argument for BOT status request tcmu: Allow cmd_time_out to be set to zero (disabled)
2017-04-05IB/IPoIB: ibX: failed to create mcg debug fileShamir Rabinovitch3-8/+42
When udev renames the netdev devices, ipoib debugfs entries does not get renamed. As a result, if subsequent probe of ipoib device reuse the name then creating a debugfs entry for the new device would fail. Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part of ipoib event handling in order to avoid any race condition between these. Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs) Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-31iser-target: avoid posting a recv buffer twiceSagi Grimberg2-1/+14
We pre-allocate our send-queues and might overflow them in case we have multi work-request operations which tend to occur for large RDMA transfers over devices with limited allowed sg elements. When we get to a queue-full condition we might retry again later, so track our receive buffers so we don't repost them for a retry case. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-31iser-target: Fix queue-full response handlingNicholas Bellinger1-18/+35
This patch addresses two queue-full handling bugs in iser-target. The first is propagating isert_rdma_rw_ctx_post() return back to target-core via isert_put_datain() + isert_get_dataout() callbacks, in order to trigger queue-full logic in target-core. Note target-core expects -EAGAIN or -ENOMEM error to signal RDMA WRITE/READ data-transfer callbacks should be retried, after queue-full logic been invoked. Other types of errors propagated up from RDMA RW API will result in target-core generating internal CHECK_CONDITION status, avoiding subsequent isert_put_datain() and isert_get_dataout() iscsit_transport callback retry attempts. The second is to use transport_generic_request_failure() during T10-PI hw-offload errors in isert_rdma_write_done() and isert_rdma_read_done(), so CHECK_CONDITION queue-full is handled internally by target-core. Also add isert_put_response() T10-PI failure case fixme in isert_rdma_write_done(), which is currently not internally retried or released until session reinstatement. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-25RDMA/iser: Fix possible mr leak on device removal eventSagi Grimberg2-3/+7
When the rdma device is removed, we must cleanup all the rdma resources within the DEVICE_REMOVAL event handler to let the device teardown gracefully. When this happens with live I/O, some memory regions are occupied. Thus, track them too and dereg all the mr's. We are safe with mr access by iscsi_iser_cleanup_task. Reported-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar2-0/+2
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-26Merge tag 'for-next-dma_ops' of ↵Linus Torvalds5-7/+6
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma DMA mapping updates from Doug Ledford: "Drop IB DMA mapping code and use core DMA code instead. Bart Van Assche noted that the ib DMA mapping code was significantly similar enough to the core DMA mapping code that with a few changes it was possible to remove the IB DMA mapping code entirely and switch the RDMA stack to use the core DMA mapping code. This resulted in a nice set of cleanups, but touched the entire tree and has been kept separate for that reason." * tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits) IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it IB/core: Remove ib_device.dma_device nvme-rdma: Switch from dma_device to dev.parent RDS: net: Switch from dma_device to dev.parent IB/srpt: Modify a debug statement IB/srp: Switch from dma_device to dev.parent IB/iser: Switch from dma_device to dev.parent IB/IPoIB: Switch from dma_device to dev.parent IB/rxe: Switch from dma_device to dev.parent IB/vmw_pvrdma: Switch from dma_device to dev.parent IB/usnic: Switch from dma_device to dev.parent IB/qib: Switch from dma_device to dev.parent IB/qedr: Switch from dma_device to dev.parent IB/ocrdma: Switch from dma_device to dev.parent IB/nes: Remove a superfluous assignment statement IB/mthca: Switch from dma_device to dev.parent IB/mlx5: Switch from dma_device to dev.parent IB/mlx4: Switch from dma_device to dev.parent IB/i40iw: Remove a superfluous assignment statement IB/hns: Switch from dma_device to dev.parent ...
2017-02-23Merge tag 'for-linus' of ↵Linus Torvalds12-173/+248
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...
2017-02-21Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2-0/+2
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (ncr5380, ufs, lpfc, be2iscsi, hisi_sas, storvsc, cxlflash, aacraid, megaraid_sas, ...). There's also an assortment of minor fixes and the major update of switching a bunch of drivers to pci_alloc_irq_vectors from Christoph" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (188 commits) scsi: megaraid_sas: handle dma_addr_t right on 32-bit scsi: megaraid_sas: array overflow in megasas_dump_frame() scsi: snic: switch to pci_irq_alloc_vectors scsi: megaraid_sas: driver version upgrade scsi: megaraid_sas: Change RAID_1_10_RMW_CMDS to RAID_1_PEER_CMDS and set value to 2 scsi: megaraid_sas: Indentation and smatch warning fixes scsi: megaraid_sas: Cleanup VD_EXT_DEBUG and SPAN_DEBUG related debug prints scsi: megaraid_sas: Increase internal command pool scsi: megaraid_sas: Use synchronize_irq to wait for IRQs to complete scsi: megaraid_sas: Bail out the driver load if ld_list_query fails scsi: megaraid_sas: Change build_mpt_mfi_pass_thru to return void scsi: megaraid_sas: During OCR, if get_ctrl_info fails do not continue with OCR scsi: megaraid_sas: Do not set fp_possible if TM capable for non-RW syspdIO, change fp_possible to bool scsi: megaraid_sas: Remove unused pd_index from megasas_build_ld_nonrw_fusion scsi: megaraid_sas: megasas_return_cmd does not memset IO frame to zero scsi: megaraid_sas: max_fw_cmds are decremented twice, remove duplicate scsi: megaraid_sas: update can_queue only if the new value is less scsi: megaraid_sas: Change max_cmd from u32 to u16 in all functions scsi: megaraid_sas: set pd_after_lb from MR_BuildRaidContext and initialize pDevHandle to MR_DEVHANDLE_INVALID scsi: megaraid_sas: latest controller OCR capability from FW before sending shutdown DCMD ...
2017-02-19IB/srp: Drain the send queue before destroying a QPBart Van Assche1-5/+14
A quote from the IB spec: However, if the Consumer does not wait for the Affiliated Asynchronous Last WQE Reached Event, then WQE and Data Segment leakage may occur. Therefore, it is good programming practice to tear down a QP that is associated with an SRQ by using the following process: * Put the QP in the Error State; * wait for the Affiliated Asynchronous Last WQE Reached Event; * either: * drain the CQ by invoking the Poll CQ verb and either wait for CQ to be empty or the number of Poll CQ operations has exceeded CQ capacity size; or * post another WR that completes on the same CQ and wait for this WR to return as a WC; * and then invoke a Destroy QP or Reset QP. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Improve an error pathBart Van Assche1-1/+2
Avoid that the following message is printed if login fails: scsi host0: ib_srp: Sending CM DREQ failed Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Make a diagnostic message more informativeBart Van Assche1-2/+3
Report the destination port GID if connecting fails. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Document locking conventionsBart Van Assche1-0/+5
Use lockdep_assert_held() statements to verify at run-time whether the proper locks are held. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Fix race conditions related to task managementBart Van Assche2-16/+30
Avoid that srp_process_rsp() overwrites the status information in ch if the SRP target response timed out and processing of another task management function has already started. Avoid that issuing multiple task management functions concurrently triggers list corruption. This patch prevents that the following stack trace appears in the system log: WARNING: CPU: 8 PID: 9269 at lib/list_debug.c:52 __list_del_entry_valid+0xbc/0xc0 list_del corruption. prev->next should be ffffc90004bb7b00, but was ffff8804052ecc68 CPU: 8 PID: 9269 Comm: sg_reset Tainted: G W 4.10.0-rc7-dbg+ #3 Call Trace: dump_stack+0x68/0x93 __warn+0xc6/0xe0 warn_slowpath_fmt+0x4a/0x50 __list_del_entry_valid+0xbc/0xc0 wait_for_completion_timeout+0x12e/0x170 srp_send_tsk_mgmt+0x1ef/0x2d0 [ib_srp] srp_reset_device+0x5b/0x110 [ib_srp] scsi_ioctl_reset+0x1c7/0x290 scsi_ioctl+0x12a/0x420 sd_ioctl+0x9d/0x100 blkdev_ioctl+0x51e/0x9f0 block_ioctl+0x38/0x40 do_vfs_ioctl+0x8f/0x700 SyS_ioctl+0x3c/0x70 entry_SYSCALL_64_fastpath+0x18/0xad Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: Steve Feeley <Steve.Feeley@sandisk.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Avoid that duplicate responses trigger a kernel bugBart Van Assche1-1/+3
After srp_process_rsp() returns there is a short time during which the scsi_host_find_tag() call will return a pointer to the SCSI command that is being completed. If during that time a duplicate response is received, avoid that the following call stack appears: BUG: unable to handle kernel NULL pointer dereference at (null) IP: srp_recv_done+0x450/0x6b0 [ib_srp] Oops: 0000 [#1] SMP CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.10.0-rc7-dbg+ #1 Call Trace: <IRQ> __ib_process_cq+0x4b/0xd0 [ib_core] ib_poll_handler+0x1d/0x70 [ib_core] irq_poll_softirq+0xba/0x120 __do_softirq+0xba/0x4c0 irq_exit+0xbe/0xd0 smp_apic_timer_interrupt+0x38/0x50 apic_timer_interrupt+0x90/0xa0 </IRQ> RIP: srp_recv_done+0x450/0x6b0 [ib_srp] RSP: ffff88046f483e20 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: Steve Feeley <Steve.Feeley@sandisk.com> Cc: <stable@vger.kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/SRP: Avoid using IB_MR_TYPE_SG_GAPSBart Van Assche1-9/+3
Tests have shown that the following error message is reported when using SG-GAPS registration with an mlx5 adapter: scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff880bd4270eb0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0f007806 2500002a ad9fafd1 scsi host1: ib_srp: reconnect succeeded mlx5_0:dump_cqe:262:(pid 7369): dump error cqe 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0f007806 25000032 00105dd0 scsi host1: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880b92860138 Hence avoid using SG-GAPS memory registrations. Additionally, always configure the blk_queue_virt_boundary() to avoid to trigger a mapping failure when using adapters that support SG-GAPS (e.g. mlx5). Fixes: commit ad8e66b4a801 ("IB/srp: fix mr allocation when the device supports sg gaps") Fixes: commit 509c5f33f4f6 ("IB/srp: Prevent mapping failures") Reported-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mark Bloch <markb@mellanox.com> Cc: Yuval Shaia <yuval.shaia@oracle.com> Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/ipoib: Remove redudant labelZhu Yanjun1-4/+3
There are 2 labels to mark the same statements. Replace the 2 labels with one versatile labe. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/ipoib: remove the unnecessary memory freeZhu Yanjun1-1/+3
In the function ipoib_cm_nonsrq_init_rx, the memory is not allocated successfully. It is not necessary to free it. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/iser: Protect completion context active_qps updateMax Gurtovoy1-0/+2
As iser connections can share completion contexts, we need to protect the active_qps update each time we set it. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19Merge branch 'k.o/for-4.10-rc' into HEADDoug Ledford4-23/+18
2017-02-15IB/IPoIB: Add destination address when re-queue packetErez Shitrit1-13/+17
When sending packet to destination that was not resolved yet via path query, the driver keeps the skb and tries to re-send it again when the path is resolved. But when re-sending via dev_queue_xmit the kernel doesn't call to dev_hard_header, so IPoIB needs to keep 20 bytes in the skb and to put the destination address inside them. In that way the dev_start_xmit will have the correct destination, and the driver won't take the destination from the skb->data, while nothing exists there, which causes to packet be be dropped. The test flow is: 1. Run the SM on remote node, 2. Restart the driver. 4. Ping some destination, 3. Observe that first ICMP request will be dropped. Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-07scsi: remove eh_timed_out methods in the transport templateChristoph Hellwig2-0/+2
Instead define the timeout behavior purely based on the host_template eh_timed_out method and wire up the existing transport implementations in the host templates. This also clears up the confusion that the transport template method overrides the host template one, so some drivers have to re-override the transport template one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-27IB/srpt: Accept GUIDs as port namesBart Van Assche2-61/+96
Port and ACL information must be configured before an initiator logs in. Make it possible to configure this information before a subnet prefix has been assigned to a port by not only accepting GIDs as target port and initiator port names but by also accepting port GUIDs. Add a 'priv' member to struct se_wwn to allow target drivers to associate their own data with struct se_wwn. Reported-by: Doug Ledford <dledford@redhat.com> References: http://www.spinics.net/lists/linux-rdma/msg39505.html Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-25IB/ipoib: Remove the unnecessary error checkZhu Yanjun4-13/+7
The function ipoib_mcast_start_thread/ipoib_ib_dev_up always return zero. As such, in the function ipoib_open, err_stop will never be reached. So remove this err_stop and change the return type of the function ipoib_mcast_start_thread/ipoib_ib_dev_up to void. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-25IB/ipoib: function interface changeZhu Yanjun2-8/+4
The ipoib_ib_dev_down/ipoib_ib_dev_stop return zero unconditionally and the callers never check the returned values, change the return type to void and remove the redundant return values. Reviewed-by: Shan Hai <shan.hai@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-25IB/ipoib: Remove unnecessary returned value checkZhu Yanjun3-10/+4
In the function ipoib_set_dev_features, the returned value is always 0. As such, it is not necessary to check the returned value. This is not a bug. It is a trivial problem. Reviewed-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-25IB/isert: fix spelling mistake: "teminating" -> "terminating"Colin Ian King1-1/+1
Trivial fix to spelling mistake in isert_warn message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/srpt: Modify a debug statementBart Van Assche1-2/+1
Since a later patch will remove ib_device.dma_device and since knowing the value of that pointer is not too important, remove dma_device from the debug output. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/srp: Switch from dma_device to dev.parentBart Van Assche1-2/+2
Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/iser: Switch from dma_device to dev.parentBart Van Assche1-1/+1
Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/IPoIB: Switch from dma_device to dev.parentBart Van Assche2-2/+2
Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>