summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-09-11IB/hfi1: Right size user_sdma sequence numbers and related variablesMichael J. Ruhl4-11/+11
Hardware limits the maximum number of packets to u16 packets. Match that size for all relevant sequence numbers in the user_sdma engine. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11IB/hfi1: Remove race conditions in user_sdma send pathMichael J. Ruhl2-18/+15
Packet queue state is over used to determine SDMA descriptor availablitity and packet queue request state. cpu 0 ret = user_sdma_send_pkts(req, pcount); cpu 0 if (atomic_read(&pq->n_reqs)) cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE) cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE); At this point pq->n_reqs == 0 and pq->state is incorrectly SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state to return to _INACTIVE. This can also change the state from _DEFERRED to _ACTIVE. However, this is a mostly benign race. Remove the racy code path. Use n_reqs to determine if a packet queue is active or not. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11IB/hfi1: Eliminate races in the SDMA send error pathMichael J. Ruhl2-51/+39
pq_update() can only be called in two places: from the completion function when the complete (npkts) sequence of packets has been submitted and processed, or from setup function if a subset of the packets were submitted (i.e. the error path). Currently both paths can call pq_update() if an error occurrs. This race will cause the n_req value to go negative, hanging file_close(), or cause a crash by freeing the txlist more than once. Several variables are used to determine SDMA send state. Most of these are unnecessary, and have code inspectible races between the setup function and the completion function, in both the send path and the error path. The request 'status' value can be set by the setup or by the completion function. This is code inspectibly racy. Since the status is not needed in the completion code or by the caller it has been removed. The request 'done' value races between usage by the setup and the completion function. The completion function does not need this. When the number of processed packets matches npkts, it is done. The 'has_error' value races between usage of the setup and the completion function. This can cause incorrect error handling and leave the n_req in an incorrect value (i.e. negative). Simplify the code by removing all of the unneeded state checks and variables. Clean up iovs node when it is freed. Eliminate race conditions in the error path: If all packets are submitted, the completion handler will set the completion status correctly (ok or aborted). If all packets are not submitted, the caller must wait until the submitted packets have completed, and then set the completion status. These two change eliminate the race condition in the error path. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/hns: Fix an error code in hns_roce_v2_init_eq_table()Dan Carpenter1-0/+1
The error code isn't set on this path. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of postingMichael J. Ruhl6-27/+32
The post_send() path determines if it should post directly or, schedule the post for later. The current logic is: if the swqe ring is empty or (for hfi1) wqe->length <= piothreshold post the send else schedule This can allow large requests to call the send engine directly. Large requests can potentially produce a large number of packets prior to returning to the caller, blocking the caller from posting more requests, and allowing better parallel processing. Allow the driver(s) more say in this logic (pass call_send to the driver, rather than examining a return value). Update hfi1/qib logic to schedule the send engine if an RC or UC message is larger than the QP MTU size. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11infiniband: remove redundant condition check before debugfs_removezhong jiang1-2/+1
debugfs_remove has taken the IS_ERR_OR_NULL into account. Just remove the unnecessary condition. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Allow creating a matcher for a NIC TX flow tableMark Bloch3-6/+34
Currently a matcher can only be created and attached to a NIC RX flow table. Extend it to allow it on NIC TX flow tables as well. In order to achieve that, we: 1) Expose a new attribute: MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS. enum ib_flow_flags is used as valid flags. Only IB_FLOW_ATTR_FLAGS_EGRESS is supported. 2) Remove the requirement to have a DEVX or QP destination when creating a flow. A flow added to NIC TX flow table will forward the packet outside of the vport (Wire or E-Switch in the SR-iOV case). Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Add NIC TX namespace when getting a flow tableMark Bloch3-11/+32
Add the ability to get a NIC TX flow table when using _get_flow_table(). This will allow to create a matcher and a flow rule on the NIC TX path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Add flow actions support to raw create flowMark Bloch4-8/+50
Support attaching flow actions to a flow rule via raw create flow. For now only NIC RX path is supported. This change requires to export flow resources management functions so we can maintain proper bookkeeping of flow actions. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Refactor raw flow creationMark Bloch3-7/+13
Move struct mlx5_flow_act to be passed from the method entry point, this will allow to add support for flow action for the raw create flow path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Don't overwrite action if already setMark Bloch1-0/+10
We support only a single action type per flow rule, in case the user passes the same type of flow actions fail the flow creation. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Refactor flow action parsing to be more genericMark Bloch2-6/+10
Make the parsing of flow actions more generic so it could be used by mlx5 raw create flow. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/uverbs: Move flow resources initializationMark Bloch5-38/+36
Use ib_set_flow() when initializing flow related resources. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11IB/uverbs: Add IDRs array attribute type to ioctl() interfaceGuy Levi4-3/+201
Methods sometimes need to get a flexible set of IDRs and not a strict set as can be achieved today by the conventional IDR attribute. Add a new IDRS_ARRAY attribute to the generic uverbs ioctl layer. IDRS_ARRAY points to array of idrs of the same object type and same access rights, only write and read are supported. Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>`` Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Enable attaching packet reformat action to steering flowsMark Bloch1-0/+8
Any matching rules will be mutated based on the packet reformat context which is attached to that given flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Enable reformat on NIC RX if supportedMark Bloch1-0/+4
A L3_TUNNEL_TO_L2 decap flow action requires to enable the encap bit on the flow table, enable it if supported. This will allow to attach those flow actions to NIC RX steering. We don't enable if running on a representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Enable attaching DECAP action to steering flowsMark Bloch1-0/+5
Any matching packet will be stripped of it's VXLAN tunnel, only the inner L2 onward is left. The user will receive the decapsulated packet. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Enable decap and packet reformat on flow tablesMark Bloch1-4/+13
If NIC RX flow tables support decap operation, enable it on creation, This allows to perform decapsulation of tunnelled packets by steering rules. If NIC TX flow tables support reformat operation, enable it on creation. We don't enable those capabilities on representors as the E-Switch should handle packet modification (can be configured via TC) and as current hardware can't handle both FDB and NIC flow tables with decap/packet reformat support. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Enable attaching modify header to steering flowsMark Bloch1-0/+8
When creating a flow steering rule, allow the user to attach a modify header action. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/mlx5: Add NIC TX steering supportMark Bloch2-10/+20
Just like ingress steering, allow a user to create steering rules that match egress vport traffic. We expose the same number of priorities as the bypass (NIC RX) steering. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/core: Document QP @event_handler functionChuck Lever1-0/+2
Add helpful warning for RDMA consumer implementers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-11RDMA/core: Document CM @event_handler functionChuck Lever1-1/+5
Code audit suggests that the RDMA CM event handler callback function is _always_ invoked in a context that is safe to block. That's important for consumer implementers to know, so document that in the comment before rdma_create_id (where the handler function is set up by the consumer). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Assign device ifindex before publishing the deviceParav Pandit1-1/+2
Even though device->ifindex is assigned before adding the device in the list which is read by netlink flow, it is better to assign rdma device index before publishing the device in the system to users and clients. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Follow correct unregister order between sysfs and cgroupParav Pandit1-1/+1
During register_device() init sequence is, (a) register with rdma cgroup followed by (b) register with sysfs Therefore, unregister_device() sequence should follow the reverse order. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/umem: Restore lockdep check while downgrading lockLeon Romanovsky1-6/+0
Lockdep engine handles correctly downgrade of locks and it simply incorrect to disable lockdep checks prior to calling mmu_notifier. Remove lockdep_off and ensure locks correctness. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Define client_data_lock as rwlock instead of spinlockParav Pandit2-17/+18
Even though device registration/unregistration and client registration/unregistration is not a performance path, define the client_data_lock as rwlock for code clarity. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Use simpler spin lock irq API from blocking contextParav Pandit1-11/+9
add_client_context(), ib_unregister_device() and ib_unregister_client() are designed to call from blocking context. There is no need to save and restore last interrupt state when called from such blocking context. Even though this is not a performance path, using the right spin lock API is desired for code clarity. To avoid checkpatch warning while removing flags, sizeof() is used. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Remove context entries from list while unregistering deviceParav Pandit1-1/+5
While unregistering a device, remove the context elements from the list to not have any stale entries. With that any errors/bugs can be checked when device is freed. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Use simplified list_for_eachParav Pandit1-5/+4
While traversing client_data_list in following conditions, linked list is only read, no elements of the list are removed. Therefore, use list_for_each_entry(), instead of list_for_each_safe(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: No need to protect kfree with spin lock and semaphoreParav Pandit1-1/+1
While unregistering a client, only context removal should be protected with lock. There is no need to protect a freeing of such context which is already removed from the list. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()Parav Pandit2-9/+7
Currently rdma_addr_cancel() is an async operation, which notifies that cancel is done by executing the callback function given during rdma_resolve_ip(). If resolve_ip request is already completed than callback is not executed. Instead, now rdma_resolve_addr() and rdma_addr_cancel() simplified in following ways. 1. rdma_addr_cancel() now a synchronous method. If request was pending, after it is cancelled, no callback is notified. 2. rdma_resolve_addr() and respective addr_handler() callback doesn't need to hold reference to cm_id. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Rate limit MAD error messagesParav Pandit1-35/+37
While registering a mad agent, a user space can trigger various errors and flood the logs. Therefore, decrease verbosity and rate limit such error messages. While we are at it, use __func__ to print function name. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/ipoib: Ensure that MTU isn't less than minimum permittedMuhammad Sammar1-1/+2
It is illegal to change MTU to a value lower than the minimum MTU stated in ethernet spec. In addition to that we need to add 4 bytes for encapsulation header (IPOIB_ENCAP_LEN). Before "ifconfig ib0 mtu 0" command, succeeds while it obviously shouldn't. Signed-off-by: Muhammad Sammar <muhammads@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/mlx5: Don't hold spin lock while checking device stateParav Pandit1-14/+12
mdev->state device state is not protected by the QP for which WRs are being processed. Therefore, there is no need to hold spin lock while checking mdev state. Given that device fatal error is unlikely situation, wrap the condition check with unlikely(). Additionally, kernel QP1 is also a kernel ULP for which soft CQEs needs to be generated. Therefore, check for device fatal error before processing QP1 work requests. Fixes: 89ea94a7b6c4 ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Fail early if unsupported QP is providedParav Pandit1-0/+4
When requested QP type is not supported for a {device, port}, return the error right away before validating all parameters during mad agent registration time. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06Merge branch 'uverbs_dev_cleanups' into rdma.git for-nextJason Gunthorpe249-1723/+2314
For dependencies, branch based on rdma.git 'for-rc' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/ Pull 'uverbs_dev_cleanups' from Leon Romanovsky: ==================== Reuse the char device code interfaces to simplify ib_uverbs_device creation and destruction. As part of this series, we are sending fix to cleanup path, which was discovered during internal review, The fix definitely can go to -rc, but it means that this series will be dependent on rdma-rc. ==================== * branch 'uverbs_dev_cleanups': RDMA/uverbs: Use device.groups to initialize device attributes RDMA/uverbs: Use cdev_device_add() instead of cdev_add() RDMA/core: Depend on device_add() to add device attributes RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one() Resolved conflict in ib_device_unregister_sysfs() Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/uverbs: Use device.groups to initialize device attributesParav Pandit2-13/+19
Instead of explicitly adding device attribute files and handling such error conditions, depend on device core layer to create device attributes files based group pointer NULL terminated array. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/uverbs: Use cdev_device_add() instead of cdev_add()Parav Pandit2-39/+30
Instead of doing two step process to add char device and create underlying device, use cdev_device_add() which does both. Currently a kobject per uverbs_device is created to keep reference to its holding ib_uverbs_device in addition to its underlying device 'dev'. Instead just use uverbs_device->dev to keep a reference to. With this change there is single reference tracker for ib_uverbs_device structure. This allows for subsequent patch to registers group attribute as well using single API cdev_device_add(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Depend on device_add() to add device attributesParav Pandit2-34/+30
Instead of adding/removing device attribute files, depend on device_add() which considers adding these device files based on NULL terminated attributes group array. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one()Parav Pandit1-3/+2
If ib_uverbs_create_uapi() fails, dev_num should be freed from the bitmap. Fixes: 7d96c9b17636 ("IB/uverbs: Have the core code create the uverbs_root_spec") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06bnxt_re: Fix couple of memory leaks that could lead to IOMMU call tracesSomnath Kotur2-1/+3
1. DMA-able memory allocated for Shadow QP was not being freed. 2. bnxt_qplib_alloc_qp_hdr_buf() had a bug wherein the SQ pointer was erroneously pointing to the RQ. But since the corresponding free_qp_hdr_buf() was correct, memory being free was less than what was allocated. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/core: Replace open-coded variant of get_deviceParav Pandit1-2/+2
Reuse existing get_device() API to do it symmetric to already used put_device() in commit 924b8900a49d ("RDMA/core: Replace open-coded variant of put_device") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/uverbs: Declare closing variable as booleanLeon Romanovsky2-2/+2
The "closing" variable is used as boolean and set to "true" in one place, update the declaration of that variable and their other assignment to proper type. Fixes: e951747a087a ("IB/uverbs: Rework the locking for cleaning up the ucontext") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/nes: Delete impossible debug printsLeon Romanovsky3-14/+0
The pci-core and net-core logic ensure that parameters provided to nes_probe() and nes_netdev_open() are valid, hence the assert print are not possible. Cc: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/qedr: remove set but not used variable 'ctx'YueHaibing1-2/+0
Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/qedr/verbs.c: In function 'qedr_create_srq': drivers/infiniband/hw/qedr/verbs.c:1450:24: warning: variable 'ctx' set but not used [-Wunused-but-set-variable] Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Rahul Verma <rahul.verma@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/srp: Remove unnecessary unlikely()Igor Stoppa1-1/+1
WARN_ON() already contains an unlikely(), so it's not necessary to wrap it into another. Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/core: Add an unbound WQ type to the new CQ APIJack Morgenstein4-7/+27
The upstream kernel commit cited below modified the workqueue in the new CQ API to be bound to a specific CPU (instead of being unbound). This caused ALL users of the new CQ API to use the same bound WQ. Specifically, MAD handling was severely delayed when the CPU bound to the WQ was busy handling (higher priority) interrupts. This caused a delay in the MAD "heartbeat" response handling, which resulted in ports being incorrectly classified as "down". To fix this, add a new "unbound" WQ type to the new CQ API, so that users have the option to choose either a bound WQ or an unbound WQ. For MADs, choose the new "unbound" WQ. Fixes: b7363e67b23e ("IB/device: Convert ib-comp-wq to be CPU-bound") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.m> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06RDMA/bnxt_re: QPLIB: Add and use #define dev_fmt(fmt) "QPLIB: " fmtJoe Perches4-151/+116
Consistently use the "QPLIB: " prefix for dev_<level> logging. Miscellanea: o Add missing newlines to avoid possible message interleaving o Coalesce consecutive dev_<level> uses that emit a message header to avoid < 80 column lengths and mistakenly output on multiple lines o Reflow modified lines to use 80 columns where appropriate o Consistently use "%s: " where __func__ is output o QPLIB: is now always output immediately after the dev_<level> header Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handlerAaron Knister1-0/+2
Inside of start_xmit() the call to check if the connection is up and the queueing of the packets for later transmission is not atomic which leaves a window where cm_rep_handler can run, set the connection up, dequeue pending packets and leave the subsequently queued packets by start_xmit() sitting on neigh->queue until they're dropped when the connection is torn down. This only applies to connected mode. These dropped packets can really upset TCP, for example, and cause multi-minute delays in transmission for open connections. Here's the code in start_xmit where we check to see if the connection is up: if (ipoib_cm_get(neigh)) { if (ipoib_cm_up(neigh)) { ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); goto unref; } } The race occurs if cm_rep_handler execution occurs after the above connection check (specifically if it gets to the point where it acquires priv->lock to dequeue pending skb's) but before the below code snippet in start_xmit where packets are queued. if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) { push_pseudo_header(skb, phdr->hwaddr); spin_lock_irqsave(&priv->lock, flags); __skb_queue_tail(&neigh->queue, skb); spin_unlock_irqrestore(&priv->lock, flags); } else { ++dev->stats.tx_dropped; dev_kfree_skb_any(skb); } The patch acquires the netif tx lock in cm_rep_handler for the section where it sets the connection up and dequeues and retransmits deferred skb's. Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support") Cc: stable@vger.kernel.org Signed-off-by: Aaron Knister <aaron.s.knister@nasa.gov> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06Merge branch 'mlx5-flow-mutate' into rdma.git for-nextJason Gunthorpe22-147/+643
For dependencies, branch based on 'mellanox/mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git Pull Flow actions to mutate packets from Leon Romanovsky: ==================== This series exposes the ability to create flow actions which can mutate packet headers. We do that by exposing two new verbs: * modify header - can change existing packet headers. packet * reformat - can encapsulate or decapsulate a packet. Once created a flow action must be attached to a steering rule for it to take effect. The first 10 patches refactor mlx5_core code, rename internal structures to better reflect their operation and export needed functions so the RDMA side can allocate the action. The last 5 patches expose via the IOCTL infrastructure mlx5_ib methods which do the actual allocation of resources and return an handle to the user. A user of this API is expected to know how to work with the device's spec as the input to those function is HW depended. An example usage of the modify header action is routing, A user can create an action which edits the L2 header and decrease the TTL. An example usage of the packet reformat action is VXLAN encap/decap which is done by the HW. ==================== * branch 'mlx5-flow-mutate': RDMA/mlx5: Extend packet reformat verbs RDMA/mlx5: Add new flow action verb - packet reformat RDMA/uverbs: Add generic function to fill in flow action object RDMA/mlx5: Add a new flow action verb - modify header RDMA/uverbs: Add UVERBS_ATTR_CONST_IN to the specs language net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>