summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2014-03-28RDMA/cxgb4: set error code on kmalloc() failureYann Droneaud1-1/+3
If kmalloc() fails in c4iw_alloc_ucontext(), the function leaves but does not set an error code in ret variable: it will return 0 to the caller. This patch set ret to -ENOMEM in such case. Cc: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@chelsio.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-26mlx4: Use actual number of PCI functions (PF + VFs) for alias GUID logicMatan Barak1-1/+1
The code which is dealing with SRIOV alias GUIDs in the mlx4 IB driver has some logic which operated according to the maximal possible active functions (PF + VFs). After the single port VFs code integration this resulted in a flow of false-positive warnings going to the kernel log after the PF driver started the alias GUID work. Fix it by referring to the actual number of functions. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-21net/mlx4: Adapt code for N-Port VFMatan Barak3-18/+31
Adds support for N-Port VFs, this includes: 1. Adding support in the wrapped FW command In wrapped commands, we need to verify and convert the slave's port into the real physical port. Furthermore, when sending the response back to the slave, a reverse conversion should be made. 2. Adjusting sqpn for QP1 para-virtualization The slave assumes that sqpn is used for QP1 communication. If the slave is assigned to a port != (first port), we need to adjust the sqpn that will direct its QP1 packets into the correct endpoint. 3. Adjusting gid[5] to modify the port for raw ethernet In B0 steering, gid[5] contains the port. It needs to be adjusted into the physical port. 4. Adjusting number of ports in the query / ports caps in the FW commands When a slave queries the hardware, it needs to view only the physical ports it's assigned to. 5. Adjusting the sched_qp according to the port number The QP port is encoded in the sched_qp, thus in modify_qp we need to encode the correct port in sched_qp. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-21IB/mlx4_ib: Adapt code to use caps.num_ports instead of a constantMatan Barak1-4/+4
Some code in the mlx4 IB driver stack assumed MLX4_MAX_PORTS ports. Instead, we should only loop until the number of actual ports in i the device, which is stored in dev->caps.num_ports. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-15cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug FixesSteve Wise6-152/+228
The current logic suffers from a slow response time to disable user DB usage, and also fails to avoid DB FIFO drops under heavy load. This commit fixes these deficiencies and makes the avoidance logic more optimal. This is done by more efficiently notifying the ULDs of potential DB problems, and implements a smoother flow control algorithm in iw_cxgb4, which is the ULD that puts the most load on the DB fifo. Design: cxgb4: Direct ULD callback from the DB FULL/DROP interrupt handler. This allows the ULD to stop doing user DB writes as quickly as possible. While user DB usage is disabled, the LLD will accumulate DB write events for its queues. Then once DB usage is reenabled, a single DB write is done for each queue with its accumulated write count. This reduces the load put on the DB fifo when reenabling. iw_cxgb4: Instead of marking each qp to indicate DB writes are disabled, we create a device-global status page that each user process maps. This allows iw_cxgb4 to only set this single bit to disable all DB writes for all user QPs vs traversing the idr of all the active QPs. If the libcxgb4 doesn't support this, then we fall back to the old approach of marking each QP. Thus we allow the new driver to work with an older libcxgb4. When the LLD upcalls iw_cxgb4 indicating DB FULL, we disable all DB writes via the status page and transition the DB state to STOPPED. As user processes see that DB writes are disabled, they call into iw_cxgb4 to submit their DB write events. Since the DB state is in STOPPED, the QP trying to write gets enqueued on a new DB "flow control" list. As subsequent DB writes are submitted for this flow controlled QP, the amount of writes are accumulated for each QP on the flow control list. So all the user QPs that are actively ringing the DB get put on this list and the number of writes they request are accumulated. When the LLD upcalls iw_cxgb4 indicating DB EMPTY, which is in a workq context, we change the DB state to FLOW_CONTROL, and begin resuming all the QPs that are on the flow control list. This logic runs on until the flow control list is empty or we exit FLOW_CONTROL mode (due to a DB DROP upcall, for example). QPs are removed from this list, and their accumulated DB write counts written to the DB FIFO. Sets of QPs, called chunks in the code, are removed at one time. The chunk size is 64. So 64 QPs are resumed at a time, and before the next chunk is resumed, the logic waits (blocks) for the DB FIFO to drain. This prevents resuming to quickly and overflowing the FIFO. Once the flow control list is empty, the db state transitions back to NORMAL and user QPs are again allowed to write directly to the user DB register. The algorithm is designed such that if the DB write load is high enough, then all the DB writes get submitted by the kernel using this flow controlled approach to avoid DB drops. As the load lightens though, we resume to normal DB writes directly by user applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-15cxgb4/iw_cxgb4: Treat CPL_ERR_KEEPALV_NEG_ADVICE as negative adviceSteve Wise1-12/+12
Based on original work by Anand Priyadarshee <anandp@chelsio.com>. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-75/+112
Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Activate RoCE/SRIOVJack Morgenstein1-8/+0
To activate RoCE/SRIOV, need to remove the following: 1. In mlx4_ib_add, need to remove the error return preventing initialization of a RoCE port under SRIOV. 2. In update_vport_qp_params (in resource_tracker.c) need to remove the error return when a RoCE RC or UD qp is detected. This error return causes the INIT-to-RTR qp transition to fail in the wrapper function under RoCE/SRIOV. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4_ib: Fix SIDR support of for UD QPs under SRIOV/RoCEShani Michaelli1-15/+57
* Handle CM_SIDR_REQ_ATTR_ID and CM_SIDR_REP_ATTR_ID in multiplex_cm_handler and demux_cm_handler. * Handle Service ID Resolution messages and REQ messages separately, for their formats are different. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Implement IP based gids support for RoCE/SRIOVJack Morgenstein5-24/+84
Since there is no connection between the MAC/VLAN and the GID when using IP-based addressing, the proxy QP1 (running on the slave) must pass the source-mac, destination-mac, and vlan_id information separately from the GID. Additionally, the Host must pass the remote source-mac and vlan_id back to the slave, This is achieved as follows: Outgoing MADs: 1. Source MAC: obtained from the CQ completion structure (struct ib_wc, smac field). 2. Destination MAC: obtained from the tunnel header 3. vlan_id: obtained from the tunnel header. Incoming MADs 1. The source (i.e., remote) MAC and vlan_id are passed in the tunnel header to the proxy QP1. VST mode support: For outgoing MADs, the vlan_id obtained from the header is discarded, and the vlan_id specified by the Hypervisor is used instead. For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the "invalid" vlan (0xffff) is substituted when forwarding to the slave. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Add ref counting to port MAC table for RoCEJack Morgenstein2-43/+261
The IB side of RoCE requires the MAC table index of the MAC address used by its QPs. To obtain the real MAC index, the IB side registers the MAC (increasing its ref count, and also returning the real MAC index) during the modify-qp sequence. This protects against the ETH side deleting or modifying that MAC table entry while the QP is active. Note that until the modify-qp command returns success, the MAC and VLAN information only has "candidate" status. If the modify-qp succeeds, the "candidate" info is promoted to the operational MAC/VLAN info for the qp. If the modify fails, the candidate MAC/VLAN is unregistered, and the old qp info is preserved. The patch is a bit complex, because there are multiple qp transitions where the primary-path information may be modified: INIT-to-RTR, and SQD-to-SQD. Similarly for the alternate path information. Therefore the code must handle cases where path information has already been entered into the QP context by previous qp transitions. For the MAC address, the success logic is as follows: 1. If there was no previous MAC, simply move the candidate MAC information to the operational information, and reset the candidate MAC info. 2. If there was a previous MAC, unregister it. Then move the MAC information from candidate to operational, and reset the candidate info (as in 1. above). The MAC address failure logic is the same for all cases: - Unregister the candidate MAC, and reset the candidate MAC info. For Vlan registration, the logic is similar. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: In RoCE allow guests to have multiple GIDSJack Morgenstein1-4/+30
The GIDs are statically distributed, as follows: PF: gets 16 GIDs VFs: Remaining GIDS are divided evenly between VFs activated by the driver. If the division is not even, lower-numbered VFs get an extra GID. For an IB interface, the number of gids per guest remains as before: one gid per guest. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12mlx4: Adjust QP1 multiplexing for RoCE/SRIOVJack Morgenstein3-20/+67
This requires the following modifications: 1. Fix build_mlx4_header to properly fill in the ETH fields 2. Adjust mux and demux QP1 flow to support RoCE. This commit still assumes only one GID per slave for RoCE. The commit enabling multiple GIDs is a subsequent commit, and is done separately because of its complexity. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2-75/+112
Pull SCSI target fixes from Nicholas Bellinger: "This series addresses a number of outstanding issues wrt to active I/O shutdown using iser-target. This includes: - Fix a long standing tpg_state bug where a tpg could be referenced during explicit shutdown (v3.1+ stable) - Use list_del_init for iscsi_cmd->i_conn_node so list_empty checks work as expected (v3.10+ stable) - Fix a isert_conn->state related hung task bug + ensure outstanding I/O completes during session shutdown. (v3.10+ stable) - Fix isert_conn->post_send_buf_count accounting for RDMA READ/WRITEs (v3.10+ stable) - Ignore FRWR completions during active I/O shutdown (v3.12+ stable) - Fix command leakage for interrupt coalescing during active I/O shutdown (v3.13+ stable) Also included is another DIF emulation fix from Sagi specific to v3.14-rc code" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: Target/sbc: Fix sbc_copy_prot for offset scatters iser-target: Fix command leak for tx_desc->comp_llnode_batch iser-target: Ignore completions for FRWRs in isert_cq_tx_work iser-target: Fix post_send_buf_count for RDMA READ/WRITE iscsi/iser-target: Fix isert_conn->state hung shutdown issues iscsi/iser-target: Use list_del_init for ->i_conn_node iscsi-target: Fix iscsit_get_tpg_from_np tpg_state bug
2014-03-05iser-target: Fix command leak for tx_desc->comp_llnode_batchNicholas Bellinger2-6/+46
This patch addresses a number of active I/O shutdown issues related to isert_cmd descriptors being leaked that are part of a completion interrupt coalescing batch. This includes adding logic in isert_cq_tx_comp_err() to drain any associated tx_desc->comp_llnode_batch, as well as isert_cq_drain_comp_llist() to drain any associated isert_conn->conn_comp_llist. Also, set tx_desc->llnode_active in isert_init_send_wr() in order to determine when work requests need to be skipped in isert_cq_tx_work() exception path code. Finally, update isert_init_send_wr() to only allow interrupt coalescing when ISER_CONN_UP. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.13+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-05iser-target: Ignore completions for FRWRs in isert_cq_tx_workNicholas Bellinger2-2/+7
This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related work requests to include a ISER_FRWR_LI_WRID value in order to signal isert_cq_tx_work() that these requests should be ignored. This is necessary because even though IB_SEND_SIGNALED is not set for either work request, during a QP failure event the work requests will be returned with exception status from the TX completion queue. v2 changes: - Rename ISER_FRWR_LI_WRID -> ISER_FASTREG_LI_WRID (Sagi) Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-05iser-target: Fix post_send_buf_count for RDMA READ/WRITENicholas Bellinger1-6/+8
This patch fixes the incorrect setting of ->post_send_buf_count related to RDMA WRITEs + READs where isert_rdma_rw->send_wr_num was not being taken into account. This includes incrementing ->post_send_buf_count within isert_put_datain() + isert_get_dataout(), decrementing within __isert_send_completion() + isert_response_completion(), and clearing wr->send_wr_num within isert_completion_rdma_read() This is necessary because even though IB_SEND_SIGNALED is not set for RDMA WRITEs + READs, during a QP failure event the work requests will be returned with exception status from the TX completion queue. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-05iscsi/iser-target: Fix isert_conn->state hung shutdown issuesNicholas Bellinger2-60/+50
This patch addresses a couple of different hug shutdown issues related to wait_event() + isert_conn->state. First, it changes isert_conn->conn_wait + isert_conn->conn_wait_comp_err from waitqueues to completions, and sets ISER_CONN_TERMINATING from within isert_disconnect_work(). Second, it splits isert_free_conn() into isert_wait_conn() that is called earlier in iscsit_close_connection() to ensure that all outstanding commands have completed before continuing. Finally, it breaks isert_cq_comp_err() into seperate TX / RX related code, and adds logic in isert_cq_rx_comp_err() to wait for outstanding commands to complete before setting ISER_CONN_DOWN and calling complete(&isert_conn->conn_wait_comp_err). Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-03-05iscsi/iser-target: Use list_del_init for ->i_conn_nodeNicholas Bellinger1-3/+3
There are a handful of uses of list_empty() for cmd->i_conn_node within iser-target code that expect to return false once a cmd has been removed from the per connect list. This patch changes all uses of list_del -> list_del_init in order to ensure that list_empty() returns false as expected. Acked-by: Sagi Grimberg <sagig@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-02-26net,IB/mlx: Bump all Mellanox driver versionsAmir Vadai2-4/+4
Bump all Mellanox driver versions. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2-7/+8
Pull SCSI target fixes from Nicholas Bellinger: "Mostly minor fixes this time to v3.14-rc1 related changes. Also included is one fix for a free after use regression in persistent reservations UNREGISTER logic that is CC'ed to >= v3.11.y stable" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: Target/sbc: Fix protection copy routine IB/srpt: replace strict_strtoul() with kstrtoul() target: Simplify command completion by removing CMD_T_FAILED flag iser-target: Fix leak on failure in isert_conn_create_fastreg_pool iscsi-target: Fix SNACK Type 1 + BegRun=0 handling target: Fix missing length check in spc_emulate_evpd_83() qla2xxx: Remove last vestiges of qla_tgt_cmd.cmd_list target: Fix 32-bit + CONFIG_LBDAF=n link error w/ sector_div target: Fix free-after-use regression in PR unregister
2014-02-14Merge branches 'cma', 'cxgb4', 'iser', 'misc', 'mlx4', 'mlx5', 'nes', ↵Roland Dreier15-70/+206
'ocrdma', 'qib' and 'usnic' into for-next
2014-02-14RDMA/ocrdma: Fix load time panic during GID table initDevesh Sharma1-1/+1
We should use rdma_vlan_dev_real_dev() instead of using vlan_dev_real_dev() when building the GID table for a vlan interface. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14RDMA/ocrdma: Fix traffic class shiftDevesh Sharma1-1/+1
Use correct value for obtaining traffic class from device response for Query QP request. Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/iser: Fix use after free in iser_snd_completion()Dan Carpenter1-1/+2
We use "tx_desc" again after we free it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/iser: Avoid dereferencing iscsi_iser conn object when not bound to iser ↵Roi Dayan1-3/+7
connection Fix a possible NULL pointer dereference in disconnection flow. This can happen if the target disconnected/rejected the connection request, e.g before the binding stage between iscsi connection to the transport connection. Signed-off-by: Alex Tabachnik <alext@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/usnic: Fix smatch endianness errorUpinder Malhi1-1/+8
Error reported at http://marc.info/?l=linux-rdma&m=138995755801039&w=2 Fix short to int cast for big endian systems. Signed-off-by: Upinder Malhi <umalhi@cisco.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx5: Remove dependency on X86Eli Cohen1-1/+1
Remove Kconfig dependency of mlx5_ib/mlx5_core on X86, since there is no such dependency in reality. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/qib: Add missing serdes init sequenceMike Marciniszyn1-0/+5
Research has shown that commit a77fcf895046 ("IB/qib: Use a single txselect module parameter for serdes tuning") missed a key serdes init sequence. This patch add that sequence. Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14RDMA/cxgb4: Add missing neigh_release in LE-Workaround pathKumar Sanghvi1-0/+1
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB: Report using RoCE IP based gids in port capsMoni Shoua2-2/+2
For userspace RoCE UD QPs we need to know the GID format that the kernel uses, e.g when working over older kernels. For that end, add a new port capability IB_PORT_IP_BASED_GIDS and report it when query port is issued. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx4: Build the port IBoE GID table properly under bondingMoni Shoua1-4/+37
When scanning netdevices we need to check a few more conditions and cases to build the IBoE GID table properly. For example, under bonding we must make sure that when a port is down, the bond IP address isn't programmed as a GID, since doing so will cause failure with IB core flows that selects ports by GID. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx4: Do IBoE GID table resets per-portMoni Shoua1-17/+18
The IBoE code used to reset the GID table did it for all Ethernet ports of the device. Since the whole architecture of generating GIDs and responding to events is port-based, this is inefficient and can lead to wrong content in the GID table. Change the reset flow to be per-port. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx4: Do IBoE locking earlier when initializing the GID tableMoni Shoua1-3/+3
Updating the GID table under IBoE requires read/write from/to shared data structures. These data structures are protected with the device iboe lock. The flows that modify the GID table start from 1. Initializing the GID table 2. NETDEV events 3. INET or INET6 events This patch makes sure that the flow of initializing the GID table is consistent with the other two flows w.r.t on what step the lock is taken. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx4: Move rtnl locking to the right placeMoni Shoua1-2/+2
On the one hand, the invocation of netdev_master_upper_dev_get() within mlx4_ib_scan_netdevs() must be done with rtnl lock held. On the other hand, it's wrong to call rtnl_lock() from within this function since it's also called by our netdev notifier callback. Therefore move the locking to mlx4_ib_add() so that both cases are covered. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-14IB/mlx4: Make sure GID index 0 is always occupiedMoni Shoua1-24/+68
Make sure that for Ethernet ports, the port GID table index 0 is always occupied with a default GID of the relevant IPv6 link-local adderss. This provides better user experience for legacy applications that don't use the RDMA CM and were working on index 0 prior to the IP addressing change. Also, as GIDs are generated from IP addresses of the network devices that are associated with the port, it's basically possible that the GID table will be empty if no IP address was assigned. This doesn't comply with the IB spec section 4.1.1 "GID usage and properties". Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-13IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only deviceMatan Barak1-1/+6
When the device has only Ethernet ports, don't try to allocate range of steerable UD QPs since they aren't needed. This fixes an issue where mlx4 VFs tried to allocate a range of UD steerable QPs, but failed to do so. Fixes: c1c98501121e ("IB/mlx4: Add support for steerable IB UD QPs") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-13IB/srpt: replace strict_strtoul() with kstrtoul()Jingoo Han1-7/+7
The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-02-13iser-target: Fix leak on failure in isert_conn_create_fastreg_poolNicholas Bellinger1-0/+1
This patch fixes a memory leak for fr_desc upon failure of isert_create_fr_desc() in isert_conn_create_fastreg_pool() code. As reported by Coverity 1166659: *** CID 1166659: Resource leak (RESOURCE_LEAK) /drivers/infiniband/ulp/isert/ib_isert.c: 470 in isert_conn_create_fastreg_pool() 464 isert_conn, isert_conn->conn_fr_pool_size); 465 466 return 0; 467 468 err: 469 isert_conn_free_fastreg_pool(isert_conn); >>> CID 1166659: Resource leak (RESOURCE_LEAK) >>> Variable "fr_desc" going out of scope leaks the storage it points to. 470 return ret; 471 } 472 473 static int 474 isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) 475 { Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-02-12RDMA/amso1100: Fix error return codeJulia Lawall2-2/+5
Set the return variable to an error code as done elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-12RDMA/nes: Fix error return codeJulia Lawall1-1/+4
Set the return variable to an error code as done elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-07IB/mlx5: Don't set "block multicast loopback" capabilityEli Cohen2-4/+3
Currently Connect-IB does not support blocking multicast loopback, so don't set IB_DEVICE_BLOCK_MULTICAST_LOOPBACK in the device caps. Reported by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-07IB/mlx5: Fix binary compatibility with libmlx5Eli Cohen3-4/+32
Commit c1be5232d21d ("Fix micro UAR allocator") broke binary compatibility between libmlx5 and mlx5_ib since it defines a different value to the number of micro UARs per page, leading to wrong calculation in libmlx5. This patch defines struct mlx5_ib_alloc_ucontext_req_v2 as an extension to struct mlx5_ib_alloc_ucontext_req. The extended size is determined in mlx5_ib_alloc_ucontext() and in case of old library we use uuarn 0 which works fine -- this is acheived due to create_user_qp() falling back from high to medium then to low class where low class will return 0. For new libraries we use the more sophisticated allocation algorithm. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-07IB/mlx5: Fix RC transport send queue overhead computationEli Cohen1-1/+3
Fix the RC QPs send queue overhead computation to take into account two additional segments in the WQE which are needed for registration operations. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-01Merge branch 'for-next' of ↵Linus Torvalds2-111/+121
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - add support for SCSI Referrals (Hannes) - add support for T10 DIF into target core (nab + mkp) - add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab) - add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab) - prep changes to iser-target for >= v3.15 T10 DIF support (Sagi) - add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn) - allow percpu_ida_alloc() to receive task state bitmask (Kent) - fix >= v3.12 iscsi-target session reset hung task regression (nab) - fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab) - fix a long-standing network portal creation race (Andy)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits) target: Fix percpu_ref_put race in transport_lun_remove_cmd target/iscsi: Fix network portal creation race target: Report bad sector in sense data for DIF errors iscsi-target: Convert gfp_t parameter to task state bitmask iscsi-target: Fix connection reset hang with percpu_ida_alloc percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask iscsi-target: Pre-allocate more tags to avoid ack starvation qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO. qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine IB/isert: Move fastreg descriptor creation to a function IB/isert: Avoid frwr notation, user fastreg IB/isert: seperate connection protection domains and dma MRs tcm_loop: Enable DIF/DIX modes in SCSI host LLD target/rd: Add DIF protection into rd_execute_rw target/rd: Add support for protection SGL setup + release target/rd: Refactor rd_build_device_space + rd_release_device_space target/file: Add DIF protection support to fd_execute_rw target/file: Add DIF protection init/format support ...
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-1/+1
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-25iscsi-target: Convert gfp_t parameter to task state bitmaskNicholas Bellinger1-7/+7
This patch propigates the use of task state bitmask now used by percpu_ida_alloc() up the iscsi-target callchain, replacing the use of GFP_ATOMIC for TASK_RUNNING, and GFP_KERNEL for TASK_INTERRUPTIBLE. Also, drop the unnecessary gfp_t parameter to isert_allocate_cmd(), and just pass TASK_INTERRUPTIBLE into iscsit_allocate_cmd(). Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-01-23Merge branch 'ip-roce' into for-nextRoland Dreier28-363/+834
Conflicts: drivers/infiniband/hw/mlx4/main.c
2014-01-23Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', ↵Roland Dreier51-97/+6368
'ocrdma', 'qib', 'srp' and 'usnic' into for-next
2014-01-23IB/mlx5: Verify reserved fields are clearedEli Cohen1-2/+6
Verify that reserved fields in struct mlx5_ib_resize_cq are cleared before continuing execution of the verb. This is required to allow making use of this area in future revisions. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>