summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx5
AgeCommit message (Collapse)AuthorFilesLines
2020-10-01RDMA/mlx5: Extend advice MR to support non faulting modeYishai Hadas2-2/+8
Extend advice MR to support non faulting mode, this can improve performance by increasing the populated page tables in the device. Link: https://lore.kernel.org/r/20200930163828.1336747-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01IB/core: Enable ODP sync without faultingYishai Hadas1-1/+1
Enable ODP sync without faulting, this improves performance by reducing the number of page faults in the system. The gain from this option is that the device page table can be aligned with the presented pages in the CPU page table without causing page faults. As of that, the overhead on data path from hardware point of view to trigger a fault which end-up by calling the driver to bring the pages will be dropped. Link: https://lore.kernel.org/r/20200930163828.1336747-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01IB/core: Improve ODP to use hmm_range_fault()Yishai Hadas1-17/+7
Move to use hmm_range_fault() instead of get_user_pags_remote() to improve performance in a few aspects: This includes: - Dropping the need to allocate and free memory to hold its output - No need any more to use put_page() to unpin the pages - The logic to detect contiguous pages is done based on the returned order, no need to run per page and evaluate. In addition, moving to use hmm_range_fault() enables to reduce page faults in the system with it's snapshot mode, this will be introduced in next patches from this series. As part of this, cleanup some flows and use the required data structures to work with hmm_range_fault(). Link: https://lore.kernel.org/r/20200930163828.1336747-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29RDMA/drivers: Remove udata check from special QPLeon Romanovsky1-12/+0
GSI QP can't be created from the user space, hence the udata check is always false (udata == NULL). Remove that check and simplify the flow. Link: https://lore.kernel.org/r/20200926102450.2966017-9-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29RDMA/mlx5: Delete not needed GSI QP signal QP typeLeon Romanovsky2-8/+1
GSI QP doesn't need signal QP type because it is initialized statically to zero, which is IB_SIGNAL_ALL_WR also wr->send_flags isn't set too. This means that the GSI QP signal QP type can be removed. Link: https://lore.kernel.org/r/20200926102450.2966017-5-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29RDMA/mlx5: Change GSI QP to have same creation flow like other QPsLeon Romanovsky3-46/+38
There is no reason to have separate create flow for the GSI QP, while general create_qp routine has all needed checks and ability to allocate and free the proper struct mlx5_ib_qp. Link: https://lore.kernel.org/r/20200926102450.2966017-4-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29RDMA/mlx5: Reuse existing fields in parent QP storage objectLeon Romanovsky3-45/+31
Remove duplication of mlx5_ib_qp and mlx5_ib_gsi_qp fields. This change returns the memory footprint of mlx5_ib QP to be as it was before embedding GSI QP. Link: https://lore.kernel.org/r/20200926102450.2966017-3-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29RDMA/mlx5: Embed GSI QP into general mlx5_ib QPLeon Romanovsky3-33/+32
The GSI QPs have different create flow from the regular QPs, but it is not really needed. Update the code to use mlx5_ib_qp as a storage class for all outside of GSI calls. Link: https://lore.kernel.org/r/20200926102450.2966017-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-25RDMA/mlx5: Fix type warning of sizeof in __mlx5_ib_alloc_counters()Liu Shixin1-2/+2
sizeof() when applied to a pointer typed expression should give the size of the pointed data, even if the data is a pointer. Fixes: e1f24a79f424 ("IB/mlx5: Support congestion related counters") Link: https://lore.kernel.org/r/20200917081354.2083293-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-23RDMA/mlx5: Don't call to restrack recursivelyLeon Romanovsky1-2/+14
The restrack is going to manage memory of all IB objects and must be called before object is created. GSI QP in the mlx5_ib separated between creating dummy interface and HW object beneath. This was achieved by double call to ib_create_qp(). In order to skip such reentry call to internal driver create_qp code. Link: https://lore.kernel.org/r/20200922091106.2152715-3-leon@kernel.org Reviewed-by: Mark Zhang <markz@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Clarify what the UMR is for when creating MRsJason Gunthorpe4-49/+97
Once a mkey is created it can be modified using UMR. This is desirable for performance reasons. However, different hardware has restrictions on what modifications are possible using UMR. Make sense of these checks: - mlx5_ib_can_reconfig_with_umr() returns true if the access flags can be altered. Most cases create MRs using 0 access flags (now made clear by consistent use of set_mkc_access_pd_addr_fields()), but the old logic here was tormented. Make it clear that this is checking if the current access_flags can be modified using UMR to different access_flags. It is always OK to use UMR to change flags that all HW supports. - mlx5_ib_can_load_pas_with_umr() returns true if UMR can be used to enable and update the PAS/XLT. Enabling requires updating the entity size, so UMR ends up completely disabled on this old hardware. Make it clear why it is disabled. FRWR, ODP and cache always requires mlx5_ib_can_load_pas_with_umr(). - mlx5_ib_pas_fits_in_mr() is used to tell if an existing MR can be resized to hold a new PAS list. This only works for cached MR's because we don't store the PAS list size in other cases. To be very clear, arrange things so any pre-created MR's in the cache check the newly requested access_flags before allowing the MR to leave the cache. If UMR cannot set the required access_flags the cache fails to create the MR. This in turn means relaxed ordering and atomic are now correctly blocked early for implicit ODP on older HW. Link: https://lore.kernel.org/r/20200914112653.345244-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Disable IB_DEVICE_MEM_MGT_EXTENSIONS if IB_WR_REG_MR can't workJason Gunthorpe1-1/+3
set_reg_wr() always fails if !umr_modify_entity_size_disabled because mlx5_ib_can_use_umr() always fails. Without set_reg_wr() IB_WR_REG_MR doesn't work and that means the device should not advertise IB_DEVICE_MEM_MGT_EXTENSIONS. Fixes: 841b07f99a47 ("IB/mlx5: Block MR WR if UMR is not possible") Link: https://lore.kernel.org/r/20200914112653.345244-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Make mkeys always owned by the kernel's PD when not enabledJason Gunthorpe1-25/+26
Any mkey that is not enabled and assigned to userspace should have the PD set to a kernel owned PD. When cache entries are created for the first time the PDN is set to 0, which is probably a kernel PD, but be explicit. When a MR is registered using the hybrid reg_create with UMR xlt & enable the disabled mkey is pointing at the user PD, keep it pointing at the kernel until a UMR enables it and sets the user PD. Fixes: 9ec4483a3f0f ("IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache") Link: https://lore.kernel.org/r/20200914112653.345244-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Use set_mkc_access_pd_addr_fields() in reg_create()Jason Gunthorpe1-14/+1
reg_create() open codes this helper, use the shared code. Link: https://lore.kernel.org/r/20200914112653.345244-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Remove dead check for EAGAIN after alloc_mr_from_cache()Jason Gunthorpe1-3/+1
alloc_mr_from_cache() no longer returns EAGAIN, this is just dead code now. Fixes: aad719dcf379 ("RDMA/mlx5: Allow MRs to be created in the cache synchronously") Link: https://lore.kernel.org/r/20200914112653.345244-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18Merge branch 'mlx_sw_owner_v2' into rdma.git for-nextJason Gunthorpe2-3/+7
Leon Romanovsky says: ==================== This series from Alex extends software steering interface to support devices with extra capability "sw_owner_2" which will replace existing "sw_owner". ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_sw_owner_v2: RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devices RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devices RDMA/mlx5: Add sw_owner_v2 bit capability
2020-09-18Merge branch 'mlx5_active_speed' into rdma.git for-nextJason Gunthorpe1-26/+15
Leon Romanovsky says: ==================== IBTA declares speed as 16 bits, but kernel stores it in u8. This series fixes in-kernel declaration while keeping external interface intact. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_active_speed': RDMA: Fix link active_speed size RDMA/mlx5: Delete duplicated mlx5_ptys_width enum net/mlx5: Refactor query port speed functions
2020-09-18RDMA: Fix link active_speed sizeAharon Landau1-6/+2
According to the IB spec active_speed size should be u16 and not u8 as before. Changing it to allow further extensions in offered speeds. Link: https://lore.kernel.org/r/20200917090223.1018224-4-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devicesAlex Vesker1-2/+4
Expose the ICM address to access TIR and QP, this will allow sw_owned_v2 devices to steer traffic to TIRs and QPs same as done with sw_owner capability. Link: https://lore.kernel.org/r/20200903073857.1129166-4-leon@kernel.org Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devicesAlex Vesker1-1/+3
sw_owner_v2 will replace sw_owner for future devices, this means that if sw_owner_v2 is set sw_owner should be ignored and DM allocation is required for sw_owner_v2 devices to function. Link: https://lore.kernel.org/r/20200903073857.1129166-3-leon@kernel.org Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-17RDMA: Convert RWQ table logic to ib_core allocation schemeLeon Romanovsky5-32/+24
Move struct ib_rwq_ind_table allocation to ib_core. Link: https://lore.kernel.org/r/20200902081623.746359-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-17RDMA: Clean MW allocation and free flowsLeon Romanovsky3-27/+18
Move allocation and destruction of memory windows under ib_core responsibility and clean drivers to ensure that no updates to MW ib_core structures are done in driver layer. Link: https://lore.kernel.org/r/20200902081623.746359-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-17RDMA/mlx5: Delete duplicated mlx5_ptys_width enumAharon Landau1-14/+6
Combine two same enums to avoid duplication. Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2020-09-17net/mlx5: Refactor query port speed functionsAharon Landau1-13/+14
The functions mlx5_query_port_link_width_oper and mlx5_query_port_ib_proto_oper are always called together, so combine them to a new function called mlx5_query_port_oper to avoid duplication. And while the mlx5i_get_port_settings is the same as mlx5_query_port_oper therefore let's remove it. According to the IB spec link_width_oper and ib_proto_oper should be u16 and not as written u8, so perform casting as a preparation to cross-RDMA patch which will fix that type for all drivers in the RDMA subsystem. Fixes: ada68c31ba9c ("net/mlx5: Introduce a new header file for physical port functions") Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2020-09-11RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()Jason Gunthorpe1-2/+2
ib_umem_num_pages() should only be used by things working with the SGL in CPU pages directly. Drivers building DMA lists should use the new ib_num_dma_blocks() which returns the number of blocks rdma_umem_for_each_block() will return. To make this general for DMA drivers requires a different implementation. Computing DMA block count based on umem->address only works if the requested page size is < PAGE_SIZE and/or the IOVA == umem->address. Instead the number of DMA pages should be computed in the IOVA address space, not umem->address. Thus the IOVA has to be stored inside the umem so it can be used for these calculations. For now set it to umem->address by default and fix it up if ib_umem_find_best_pgsz() was called. This allows drivers to be converted to ib_umem_num_dma_blocks() safely. Link: https://lore.kernel.org/r/6-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Make counters destroy symmetricalLeon Romanovsky1-1/+2
Change counters to return failure like any other verbs destroy, however this flow shouldn't return error at all. Link: https://lore.kernel.org/r/20200907120921.476363-10-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to return error for destroy WQLeon Romanovsky4-7/+12
Make this interface symmetrical to other destroy paths. Fixes: a49b1dc7ae44 ("RDMA: Convert destroy_wq to be void") Link: https://lore.kernel.org/r/20200907120921.476363-9-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Change XRCD destroy return valueLeon Romanovsky2-3/+3
Update XRCD destroy flow to allow command failure. Fixes: 28ad5f65c314 ("RDMA: Move XRCD to be under ib_core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-8-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Allow fail of destroy CQLeon Romanovsky2-3/+8
Like any other verbs objects, CQ shouldn't fail during destroy, but mlx5_ib didn't follow this contract with mixed IB verbs objects with DEVX. Such mix causes to the situation where FW and kernel are fully interdependent on the reference counting of each side. Kernel verbs and drivers that don't have DEVX flows shouldn't fail. Fixes: e39afe3d6dbd ("RDMA: Convert CQ allocations to be under core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on SRQ destroyLeon Romanovsky4-18/+16
In similar way to other IB objects, restore the ability to return error on SRQ destroy. Strictly speaking, this change is not necessary, and provided here to ensure a symmetrical interface like other destroy functions. Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA/mlx5: Issue FW command to destroy SRQ on reentryLeon Romanovsky1-3/+12
The HW release can fail and leave the system in limbo state, where SRQ is removed from the table, but can't be destroyed later. In every reentry, the initial xa_erase_irq() check will fail. Rewrite the erase logic to keep index, but don't store the entry itself. By doing it, we can safely reinsert entry back in the case of destroy failure. Link: https://lore.kernel.org/r/20200907120921.476363-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on AH destroyLeon Romanovsky2-6/+4
Like any other IB verbs objects, AH are refcounted by ib_core. The release of those objects are controlled by ib_core with promise that AH destroy can't fail. Being SW object for now, this change makes dealloc_ah() to behave like any other destroy IB flows. Fixes: d345691471b4 ("RDMA: Handle AH allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on PD deallocateLeon Romanovsky3-5/+5
The IB verbs objects are counted by the kernel and ib_core ensures that deallocate PD will success so it will be called once all other objects that depends on PD will be released. This is achieved by managing various reference counters on such objects. The mlx5 driver didn't follow this standard flow when allowed DEVX objects that are not managed by ib_core to be interleaved with the ones under ib_core responsibility. In such interleaved scenarios deallocate command can fail and ib_core will leave uobject in internal DB and attempt to clean it later to free resources anyway. This change partially restores returned value from dealloc_pd() for all drivers, but keeping in mind that non-DEVX devices and kernel verbs paths shouldn't fail. Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-03RDMA/mlx5: Fix potential race between destroy and CQE pollLeon Romanovsky1-2/+3
The SRQ can be destroyed right before mlx5_cmd_get_srq is called. In such case the latter will return NULL instead of expected SRQ. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200830084010.102381-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31Merge tag 'v5.9-rc3' into rdma.git for-nextJason Gunthorpe4-7/+6
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27IB/mlx5: Add DCT RoCE LAG supportMark Zhang1-1/+8
When DCT QPs work in RoCE LAG mode: 1. DCT creation is allowed only when it is supported 2. The "port" of a DCT QP is assigned in a round-robin way Link: https://lore.kernel.org/r/20200818115245.700581-3-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27IB/mlx5: Add tx_affinity support for DCI QPMark Zhang1-8/+6
DCI QP supports tx_affinity as well. Link: https://lore.kernel.org/r/20200818115245.700581-2-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva4-7/+6
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-18RDMA/mlx5: Enable sniffer when device is in switchdev modeMaor Gottlieb1-1/+2
In order to allow sniffer when the RDMA device is in switchdev mode, we don't need to set the source port when creating the sniffer rule. Link: https://lore.kernel.org/r/20200803060214.15328-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18RDMA/mlx5: Add new IB rates supportMark Zhang1-3/+27
Support 56, 25, 100, 200 and 50Gbps IB rates in mlx5 driver. Link: https://lore.kernel.org/r/20200802081712.1993490-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18RDMA: Remove constant domain argument from flow creation callLeon Romanovsky1-6/+3
The "domain" argument is constant and modern device (mlx5) doesn't support anything except IB_FLOW_DOMAIN_USER, so delete this extra parameter and simplify code. Link: https://lore.kernel.org/r/20200730081235.1581127-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18RDMA/mlx5: Replace open-coded offsetofend() macroLeon Romanovsky4-21/+19
Clean mlx5_ib from open-coded implementations of offsetofend(). Link: https://lore.kernel.org/r/20200730081235.1581127-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18RDMA/mlx5: Simplify multiple else-if cases with switch keywordLeon Romanovsky1-49/+75
Improve readability of fs.c by converting multiple else-if constructions to be implemented with switch keyword. Link: https://lore.kernel.org/r/20200730081235.1581127-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds21-3918/+4000
Pull rdma updates from Jason Gunthorpe: "A quiet cycle after the larger 5.8 effort. Substantially cleanup and driver work with a few smaller features this time. - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re - Removal of dead or redundant code across the drivers - RAW resource tracker dumps to include a device specific data blob for device objects to aide device debugging - Further advance the IOCTL interface, remove the ability to turn it off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands - Remove stubs related to devices with no pkey table - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a device to give higher performance - Several more static checker, syzkaller and rare crashers fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits) RDMA/mlx5: Fix flow destination setting for RDMA TX flow table RDMA/rxe: Remove pkey table RDMA/umem: Add a schedule point in ib_umem_get() RDMA/hns: Fix the unneeded process when getting a general type of CQE error RDMA/hns: Fix error during modify qp RTS2RTS RDMA/hns: Delete unnecessary memset when allocating VF resource RDMA/hns: Remove redundant parameters in set_rc_wqe() RDMA/hns: Remove support for HIP08_A RDMA/hns: Refactor hns_roce_v2_set_hem() RDMA/hns: Remove redundant hardware opcode definitions RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP RDMA/include: Replace license text with SPDX tags RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting RDMA/cma: Execute rdma_cm destruction from a handler properly RDMA/cma: Remove unneeded locking for req paths RDMA/cma: Using the standard locking pattern when delivering the removal event RDMA/cma: Simplify DEVICE_REMOVAL for internal_id RDMA/efa: Add EFA 0xefa1 PCI ID RDMA/efa: User/kernel compatibility handshake mechanism ...
2020-08-06RDMA/mlx5: Fix flow destination setting for RDMA TX flow tableMichael Guralnik1-2/+4
For RDMA TX flow table, set destination type to be 'port' and prevent creation of flows with TIR destination. As RDMA TX is an egress flow table the rules on this flow table should not forward traffic back to the NIC and should set the destination to be the port. Without the setting of this destination type flow rules on the RDMA TX flow tables are not created as FW invokes a syndrome for undefined destination for the rule. Fixes: 24670b1a3166 ("net/mlx5: Add support for RDMA TX steering") Link: https://lore.kernel.org/r/20200803055849.14947-1-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-04Merge tag 'uninit-macro-v5.9-rc1' of ↵Linus Torvalds3-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
2020-07-30RDMA/mlx5: Initialize QP mutex for the debug kernelsLeon Romanovsky1-4/+1
In DCT and RSS RAW QP creation flows, the QP mutex wasn't initialized and the magic field inside lock was missing. This caused to the following kernel warning for kernels build with CONFIG_DEBUG_MUTEXES. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 16261 at kernel/locking/mutex.c:938 __mutex_lock+0x60e/0x940 Modules linked in: bonding nf_tables ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip6_gre ip6_tunnel tunnel6 ip_gre gre ip_tunnel mlx5_ib mlx5_core mlxfw ptp pps_core rdma_ucm ib_uverbs ib_ipoib ib_umad openvswitch nsh xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core [last unloaded: mlxfw] CPU: 3 PID: 16261 Comm: ib_send_bw Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x60e/0x940 Code: c0 0f 84 6d fa ff ff 44 8b 15 4e 9d ba 00 45 85 d2 0f 85 5d fa ff ff 48 c7 c6 f2 de 2b 82 48 c7 c7 f1 8a 2b 82 e8 d2 4d 72 ff <0f> 0b 4c 8b 4d 88 e9 3f fa ff ff f6 c2 04 0f 84 37 fe ff ff 48 89 RSP: 0018:ffff88810bb8b870 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88829f1dd880 RSI: 0000000000000000 RDI: ffffffff81192afa RBP: ffff88810bb8b910 R08: 0000000000000000 R09: 0000000000000028 R10: 0000000000000000 R11: 0000000000003f85 R12: 0000000000000002 R13: ffff88827d8d3ce0 R14: ffffffffa059f615 R15: ffff8882a4d02610 FS: 00007f3f6988e740(0000) GS:ffff8882f5b80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556556158000 CR3: 000000010a63c005 CR4: 0000000000360ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? cmd_exec+0x947/0xe60 [mlx5_core] ? __mutex_lock+0x76/0x940 ? mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib] mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib] mlx5_ib_counter_bind_qp+0x9b/0xe0 [mlx5_ib] __rdma_counter_bind_qp+0x6b/0xa0 [ib_core] rdma_counter_bind_qp_auto+0x363/0x520 [ib_core] _ib_modify_qp+0x316/0x580 [ib_core] ib_modify_qp_with_udata+0x19/0x30 [ib_core] modify_qp+0x4c4/0x600 [ib_uverbs] ib_uverbs_ex_modify_qp+0x87/0xe0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x129/0x1c0 [ib_uverbs] ib_uverbs_cmd_verbs.isra.5+0x5d5/0x11f0 [ib_uverbs] ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x120/0x120 [ib_uverbs] ? lock_acquire+0xb9/0x3a0 ? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs] ? ib_uverbs_ioctl+0x175/0x210 [ib_uverbs] ib_uverbs_ioctl+0x14b/0x210 [ib_uverbs] ? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs] ksys_ioctl+0x234/0x7d0 ? exc_page_fault+0x202/0x640 ? do_syscall_64+0x1f/0x2e0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x59/0x2e0 ? asm_exc_page_fault+0x8/0x30 ? rcu_read_lock_sched_held+0x52/0x60 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: b4aaa1f0b415 ("IB/mlx5: Handle type IB_QPT_DRIVER when creating a QP") Link: https://lore.kernel.org/r/20200730082719.1582397-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/mlx5: Allow providing extra scatter CQE QP flagLeon Romanovsky1-9/+15
Scatter CQE feature relies on two flags MLX5_QP_FLAG_SCATTER_CQE and MLX5_QP_FLAG_ALLOW_SCATTER_CQE, both of them can be provided without relation to device capability. Relax global validity check to allow MLX5_QP_FLAG_ALLOW_SCATTER_CQE QP flag. Existing user applications are failing on this new validity check. Fixes: 90ecb37a751b ("RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags") Fixes: 37518fa49f76 ("RDMA/mlx5: Process all vendor flags in one place") Link: https://lore.kernel.org/r/20200728120255.805733-1-leon@kernel.org Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-28RDMA/mlx5: Delete unreachable codeLeon Romanovsky2-9/+4
Delete two occurrences of unreachable code discovered by the Coverity. Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr failsJason Gunthorpe1-3/+2
destroy_prefetch_work() must always be called if the work is not going to be queued. The num_sge also should have been set to i, not i-1 which avoids the condition where it shouldn't have been called in the first place. Cc: stable@vger.kernel.org Fixes: fb985e278a30 ("RDMA/mlx5: Use SRCU properly in ODP prefetch") Link: https://lore.kernel.org/r/20200727095712.495652-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>