summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)AuthorFilesLines
2024-05-05RDMA/mana_ib: create and destroy RNIC cqsKonstantin Taranov2-0/+86
Implement RNIC requests for creation and destruction of RNIC CQs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-05RDMA/mana_ib: create EQs for RNIC CQsKonstantin Taranov2-2/+33
Create EQs within mana_ib device. Such EQs are required for creation of RNIC CQs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1714137160-5222-2-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-05-04IB/qib: Use device_show_string() helper for sysfs attributesLukas Wunner3-15/+2
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/185a88dd3e6e18bf508a875d87c95ea2a5c3fa13.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-03IB/hfi1: allocate dummy net_device dynamicallyBreno Leitao2-3/+8
Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from struct hfi1_netdev_rx by converting it into a pointer. Then use the leverage alloc_netdev() to allocate the net_device object at hfi1_alloc_rx(). [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Leon Romanovsky <leon@kernel.org> Link: https://lore.kernel.org/r/20240430162213.746492-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02RDMA/mlx5: Remove NULL check before dev_{put, hold}Jules Irenge1-2/+1
Coccinelle reports a warning WARNING: NULL check before dev_{put, hold} functions is not needed The reason is the call netdev_{put, hold} of dev_{put,hold} will check NULL There is no need to check before using dev_{put, hold} Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Link: https://lore.kernel.org/r/ZjGC4qXrOwZE0aHi@octinomon.home Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-30RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources.Chiara Meiohas2-2/+30
Allow user to see driver-specific QPs (the "driver_detail" QPs) through the rdmatool, when requested. When creating DCT, DCI and REG_UMR QPs, we designate them as driver_detail resources. When filling the QP info for the rdma tool, for the driver_detail QPs: -the QP type is IB_QPT_DRIVER -the subtype is a string with the QP name ("DCT", "DCI", "REG_UMR") Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://lore.kernel.org/r/452432d7d0917f053a80a893a614169857fe3b10.1713268997.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-29RDMA/efa: Add shutdown notifierMichael Margolin1-0/+11
Add driver function to stop the device and release any active IRQs as preparation for shutdown. This should fix issues caused by unexpected AQ interrupts when booting kernel using kexec and possible data integrity issues when the system is being shutdown during traffic. Link: https://lore.kernel.org/r/20240425171814.25216-1-mrgolin@amazon.com Reviewed-by: Firas Jahjah <firasj@amazon.com> Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+2
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25RDMA/vmw_pvrdma: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACYDamien Le Moal1-1/+1
Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro. Link: https://lore.kernel.org/r/20240325070944.3600338-13-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25IB/qib: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACYDamien Le Moal3-5/+4
Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro. Link: https://lore.kernel.org/r/20240325070944.3600338-12-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
2024-04-23RDMA/mana_ib: Fix missing ret valueKonstantin Taranov1-0/+1
Set ret to -ENODEV when netdev_master_upper_dev_get_rcu returns NULL. Fixes: 8b184e4f1c32 ("RDMA/mana_ib: Enable RoCE on port 1") Link: https://lore.kernel.org/r/1713881751-21621-1-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22IB/mlx5: Use __iowrite64_copy() for write combining storesJason Gunthorpe1-5/+3
mlx5 has a built in self-test at driver startup to evaluate if the platform supports write combining to generate a 64 byte PCIe TLP or not. This has proven necessary because a lot of common scenarios end up with broken write combining (especially inside virtual machines) and there is other way to learn this information. This self test has been consistently failing on new ARM64 CPU designs (specifically with NVIDIA Grace's implementation of Neoverse V2). The C loop around writeq() generates some pretty terrible ARM64 assembly, but historically this has worked on a lot of existing ARM64 CPUs till now. We see it succeed about 1 time in 10,000 on the worst effected systems. The CPU architects speculate that the load instructions interspersed with the stores makes the WC buffers statistically flush too often and thus the generation of large TLPs becomes infrequent. This makes the boot up test unreliable in that it indicates no write-combining, however userspace would be fine since it uses a ST4 instruction. Further, S390 has similar issues where only the special zpci_memcpy_toio() will actually generate large TLPs, and the open coded loop does not trigger it at all. Fix both ARM64 and S390 by switching to __iowrite64_copy() which now provides architecture specific variants that have a high change of generating a large TLP with write combining. x86 continues to use a similar writeq loop in the generate __iowrite64_copy(). Fixes: 11f552e21755 ("IB/mlx5: Test write combining support") Link: https://lore.kernel.org/r/6-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-16RDMA/hns: Modify the print level of CQE errorChengchang Tang1-2/+3
Too much print may lead to a panic in kernel. Change ibdev_err() to ibdev_err_ratelimited(), and change the printing level of cqe dump to debug level. Fixes: 7c044adca272 ("RDMA/hns: Simplify the cqe code of poll cq") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-11-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Use complete parentheses in macrosChengchang Tang1-6/+6
Use complete parentheses to ensure that macro expansion does not produce unexpected results. Fixes: a25d13cbe816 ("RDMA/hns: Add the interfaces to support multi hop addressing for the contexts in hip08") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-10-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Add mutex_destroy()wenglianfa6-5/+39
Add mutex_destroy(). Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-9-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix GMV table pagesizeChengchang Tang1-1/+1
GMV's BA table only supports 4K pages. Currently, PAGESIZE is used to calculate gmv_bt_num, which will cause an abnormal number of gmv_bt_num in a 64K OS. Fixes: d6d91e46210f ("RDMA/hns: Add support for configuring GMV table") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-8-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix mismatch exception rollbackwenglianfa1-1/+1
When dma_alloc_coherent() fails in hns_roce_alloc_hem(), just call kfree() to release hem instead of hns_roce_free_hem(). Fixes: c00743cbf2b8 ("RDMA/hns: Simplify 'struct hns_roce_hem' allocation") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-7-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix UAF for cq async eventChengchang Tang1-11/+13
The refcount of CQ is not protected by locks. When CQ asynchronous events and CQ destruction are concurrent, CQ may have been released, which will cause UAF. Use the xa_lock() to protect the CQ refcount. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix deadlock on SRQ async events.Chengchang Tang2-3/+4
xa_lock for SRQ table may be required in AEQ. Use xa_store_irq()/ xa_erase_irq() to avoid deadlock. Fixes: 81fce6291d99 ("RDMA/hns: Add SRQ asynchronous event support") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Add max_ah and cq moderation capacities in query_device()Chengchang Tang4-2/+12
Add max_ah and cq moderation capacities to hns_roce_query_device(). Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Remove unused parameters and variablesChengchang Tang7-33/+20
Remove unused parameters and variables. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Use macro instead of magic numberYangyang Li3-16/+34
Use macro instead of magic number. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/hns: Fix return value in hns_roce_map_mr_sgZhengchao Shao1-8/+7
As described in the ib_map_mr_sg function comment, it returns the number of sg elements that were mapped to the memory region. However, hns_roce_map_mr_sg returns the number of pages required for mapping the DMA area. Fix it. Fixes: 9b2cf76c9f05 ("RDMA/hns: Optimize PBL buffer allocation process") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20240411033851.2884771-1-shaozhengchao@huawei.com Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Configure mac address in RNICKonstantin Taranov3-0/+46
Set local mac address in RNIC, which is required by the HW. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-7-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Adding and deleting GIDsKonstantin Taranov3-0/+97
Implement add_gid and del_gid for RNIC. IPv4 and IPv6 addresses are supported. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-6-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Enable RoCE on port 1Konstantin Taranov2-4/+26
Set netdev and RoCEv2 flag to enable GID population on port 1. Use GIDs of the master netdev. As mc->ports[] stores slave devices, use a helper to get the master netdev. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-5-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Implement port parametersKonstantin Taranov3-1/+42
Implement port parameters for RNIC: 1) extend query_port() method 2) implement get_link_layer() 3) implement query_pkey() Only port 1 can store GIDs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-4-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Create and destroy rnic adapterKonstantin Taranov3-1/+79
Add functions for RNIC creation and destruction. If creation fails, the ib_probe fails as well. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-3-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Add EQ creation for rnic adapterKonstantin Taranov3-3/+41
Create an error EQ for the RNIC adapter. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-2-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16RDMA/mana_ib: Use num_comp_vectors of ib_deviceKonstantin Taranov3-9/+4
Use num_comp_vectors of struct ib_device instead of max_num_queues from gdma_context. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712911656-17352-1-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-11Merge branch mana-ib-flex of ↵Jakub Kicinski1-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git Erick Archer says: ==================== mana: Add flex array to struct mana_cfg_rx_steer_req_v2 (part) The "struct mana_cfg_rx_steer_req_v2" uses a dynamically sized set of trailing elements. Specifically, it uses a "mana_handle_t" array. So, use the preferred way in the kernel declaring a flexible array [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, avoid the open-coded arithmetic in the memory allocator functions [2] using the "struct_size" macro. Moreover, use the "offsetof" helper to get the indirect table offset instead of the "sizeof" operator and avoid the open-coded arithmetic in pointers using the new flex member. This new structure member also allow us to remove the "req_indir_tab" variable since it is no longer needed. Now, it is also possible to use the "flex_array_size" helper to compute the size of these trailing elements in the "memcpy" function. Specifically, the first commit adds the flex member and the patches 2 and 3 refactor the consumers of the "struct mana_cfg_rx_steer_req_v2". This code was detected with the help of Coccinelle, and audited and modified manually. The Coccinelle script used to detect this code pattern is the following: virtual report @rule1@ type t1; type t2; identifier i0; identifier i1; identifier i2; identifier ALLOC =~ "kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc"; position p1; @@ i0 = sizeof(t1) + sizeof(t2) * i1; ... i2 = ALLOC@p1(..., i0, ...); @script:python depends on report@ p1 << rule1.p1; @@ msg = "WARNING: verify allocation on line %s" % (p1[0].line) coccilib.report.print_report(p1[0],msg) Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] v1: https://lore.kernel.org/linux-hardening/AS8PR02MB7237974EF1B9BAFA618166C38B382@AS8PR02MB7237.eurprd02.prod.outlook.com/ v2: https://lore.kernel.org/linux-hardening/AS8PR02MB723729C5A63F24C312FC9CD18B3F2@AS8PR02MB7237.eurprd02.prod.outlook.com/ ==================== Link: https://lore.kernel.org/r/AS8PR02MB72374BD1B23728F2E3C3B1A18B022@AS8PR02MB7237.eurprd02.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-11RDMA/mana_ib: remove useless return values from dbg printsKonstantin Taranov3-7/+5
Remove printing ret value on success as it was always 0. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712672465-29960-1-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-11RDMA/mana_ib: Add flex array to struct mana_cfg_rx_steer_req_v2Leon Romanovsky1-7/+5
The "struct mana_cfg_rx_steer_req_v2" uses a dynamically sized set of trailing elements. Specifically, it uses a "mana_handle_t" array. So, use the preferred way in the kernel declaring a flexible array [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, avoid the open-coded arithmetic in the memory allocator functions [2] using the "struct_size" macro. Moreover, use the "offsetof" helper to get the indirect table offset instead of the "sizeof" operator and avoid the open-coded arithmetic in pointers using the new flex member. This new structure member also allow us to remove the "req_indir_tab" variable since it is no longer needed. Now, it is also possible to use the "flex_array_size" helper to compute the size of these trailing elements in the "memcpy" function. Specifically, the first commit adds the flex member and the patches 2 and 3 refactor the consumers of the "struct mana_cfg_rx_steer_req_v2". This code was detected with the help of Coccinelle, and audited and modified manually. The Coccinelle script used to detect this code pattern is the following: virtual report @rule1@ type t1; type t2; identifier i0; identifier i1; identifier i2; identifier ALLOC =~ "kmalloc|kzalloc|kmalloc_node|kzalloc_node|vmalloc|vzalloc|kvmalloc|kvzalloc"; position p1; @@ i0 = sizeof(t1) + sizeof(t2) * i1; ... i2 = ALLOC@p1(..., i0, ...); @script:python depends on report@ p1 << rule1.p1; @@ msg = "WARNING: verify allocation on line %s" % (p1[0].line) coccilib.report.print_report(p1[0],msg) [1] https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [2] https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Link: https://lore.kernel.org/all/AS8PR02MB72374BD1B23728F2E3C3B1A18B022@AS8PR02MB7237.eurprd02.prod.outlook.com Signed-off-by: Erick Archer <erick.archer@outlook.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> * mana-ib-flex: net: mana: Avoid open coded arithmetic RDMA/mana_ib: Prefer struct_size over open coded arithmetic net: mana: Add flex array to struct mana_cfg_rx_steer_req_v2
2024-04-11RDMA/mana_ib: Prefer struct_size over open coded arithmeticErick Archer1-7/+5
This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1][2]. As the "req" variable is a pointer to "struct mana_cfg_rx_steer_req_v2" and this structure ends in a flexible array: struct mana_cfg_rx_steer_req_v2 { [...] mana_handle_t indir_tab[] __counted_by(num_indir_entries); }; the preferred way in the kernel is to use the struct_size() helper to do the arithmetic instead of the calculation "size + size * count" in the kzalloc() function. Moreover, use the "offsetof" helper to get the indirect table offset instead of the "sizeof" operator and avoid the open-coded arithmetic in pointers using the new flex member. This new structure member also allow us to remove the "req_indir_tab" variable since it is no longer needed. This way, the code is more readable and safer. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/160 [2] Signed-off-by: Erick Archer <erick.archer@outlook.com> Link: https://lore.kernel.org/r/AS8PR02MB72375EB06EE1A84A67BE722E8B022@AS8PR02MB7237.eurprd02.prod.outlook.com Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-09RDMA/hns: Support DSCPJunxian Huang4-29/+108
Add support for DSCP configuration. For DSCP, get dscp-prio mapping via hns3 nic driver api .get_dscp_prio() and fill the SL (in WQE for UD or in QPC for RC) with the priority value. The prio-tc mapping is configured to HW by hns3 nic driver. HW will select a corresponding TC according to SL and the prio-tc mapping. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240315093551.1650088-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08ipv4: Set scope explicitly in ip_route_output().Guillaume Nault2-2/+4
Add a "scope" parameter to ip_route_output() so that callers don't have to override the tos parameter with the RTO_ONLINK flag if they want a local scope. This will allow converting flowi4_tos to dscp_t in the future, thus allowing static analysers to flag invalid interactions between "tos" (the DSCP bits) and ECN. Only three users ask for local scope (bonding, arp and atm). The others continue to use RT_SCOPE_UNIVERSE. While there, add a comment to warn users about the limitations of ip_route_output(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> # infiniband Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-08RDMA/mlx5: Adding remote atomic access flag to updatable flagsOr Har-Toov1-1/+2
Currently IB_ACCESS_REMOTE_ATOMIC is blocked from being updated via UMR although in some cases it should be possible. These cases are checked in mlx5r_umr_can_reconfig function. Fixes: ef3642c4f54d ("RDMA/mlx5: Fix error unwinds for rereg_mr") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/24dac73e2fa48cb806f33a932d97f3e402a5ea2c.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08RDMA/mlx5: Change check for cacheable mkeysOr Har-Toov2-10/+23
umem can be NULL for user application mkeys in some cases. Therefore umem can't be used for checking if the mkey is cacheable and it is changed for checking a flag that indicates it. Also make sure that all mkeys which are not returned to the cache will be destroyed. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/2690bc5c6896bcb937f89af16a1ff0343a7ab3d0.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08RDMA/mlx5: Uncacheable mkey has neither rb_key or cache_entOr Har-Toov1-1/+1
As some mkeys can't be modified with UMR due to some UMR limitations, like the size of translation that can be updated, not all user mkeys can be cached. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/f2742dd934ed73b2d32c66afb8e91b823063880c.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08RDMA/mlx5: Fix port number for counter query in multi-port configurationMichael Guralnik1-1/+2
Set the correct port when querying PPCNT in multi-port configuration. Distinguish between cases where switchdev mode was enabled to multi-port configuration and don't overwrite the queried port to 1 in multi-port case. Fixes: 74b30b3ad5ce ("RDMA/mlx5: Set local port to one when accessing counters") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/9bfcc8ade958b760a51408c3ad654a01b11f7d76.1712134988.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02RDMA/mana_ib: Use struct mana_ib_queue for RAW QPsKonstantin Taranov2-46/+18
Use struct mana_ib_queue and its helpers for RAW QPs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-5-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02RDMA/mana_ib: Use struct mana_ib_queue for WQsKonstantin Taranov3-35/+10
Use struct mana_ib_queue and its helpers for WQs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-4-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02RDMA/mana_ib: Use struct mana_ib_queue for CQsKonstantin Taranov3-58/+24
Use struct mana_ib_queue and its helpers for CQs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02RDMA/mana_ib: Introduce helpers to create and destroy mana queuesKonstantin Taranov2-0/+53
Intoduce helpers to work with mana ib queues (struct mana_ib_queue). A queue always consists of umem, gdma_region, and id. A queue can become a WQ or a CQ. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-2-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-01RDMA/erdma: Remove unnecessary __GFP_ZERO flagBoshi Yu2-8/+4
The dma_alloc_coherent() interface automatically zero the memory returned. Thus, we do not need to specify the __GFP_ZERO flag explicitly when we call dma_alloc_coherent(). Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-4-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-01RDMA/erdma: Unify the names related to doorbell recordsBoshi Yu8-93/+79
There exist two different names for the doorbell records: db_info and db_record. We use dbrec for cpu address of the doorbell record and dbrec_dma for dma address of the doorbell recordi uniformly. Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-3-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-01RDMA/erdma: Allocate doorbell records from dma poolBoshi Yu6-101/+167
Currently, the 8 byte doorbell record is allocated along with the queue buffer, which may result in waste of dma space when the queue buffer is page aligned. To address this issue, we introduce a dma pool named db_pool and allocate doorbell record from it. Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-2-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-03-19Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds23-469/+636
Pull rdma updates from Jason Gunthorpe: "Very small update this cycle: - Minor code improvements in fi, rxe, ipoib, mana, cxgb4, mlx5, irdma, rxe, rtrs, mana - Simplify the hns hem mechanism - Fix EFA's MSI-X allocation in resource constrained configurations - Fix a KASN splat in srpt - Narrow hns's congestion control selection to QPs granularity and allow userspace to select it - Solve a parallel module loading race between the CM module and a driver module - Flexible array cleanup - Dump hns's SCC Conext to 'rdma res' for debugging - Make mana build page lists for HW objects that require a 0 offset correctly - Stuck CM ID debugging" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (29 commits) RDMA/cm: add timeout to cm_destroy_id wait RDMA/mana_ib: Use virtual address in dma regions for MRs RDMA/mana_ib: Fix bug in creation of dma regions RDMA/hns: Append SCC context to the raw dump of QPC RDMA/uverbs: Avoid -Wflex-array-member-not-at-end warnings RDMA/hns: Support userspace configuring congestion control algorithm with QP granularity RDMA/rtrs-clt: Check strnlen return len in sysfs mpath_policy_store() RDMA/uverbs: Remove flexible arrays from struct *_filter RDMA/device: Fix a race between mad_client and cm_client init RDMA/hns: Fix mis-modifying default congestion control algorithm RDMA/rxe: Remove unused 'iova' parameter from rxe_mr_init_user RDMA/srpt: Do not register event handler until srpt device is fully setup RDMA/irdma: Remove duplicate assignment RDMA/efa: Limit EQs to available MSI-X vectors RDMA/mlx5: Delete unused mlx5_ib_copy_pas prototype RDMA/cxgb4: Delete unused c4iw_ep_redirect prototype RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function RDMA/mana_ib: Introduce mana_ib_get_netdev helper function RDMA/mana_ib: Introduce mdev_to_gc helper function RDMA/hns: Simplify 'struct hns_roce_hem' allocation ...
2024-03-07RDMA/mana_ib: Use virtual address in dma regions for MRsKonstantin Taranov6-20/+45
Introduce mana_ib_create_dma_region() to create dma regions with iova for MRs. It allows creating MRs with any page offset. Previously, only page-aligned addresses worked. For dma regions that must have a zero dma offset (e.g., for queues), mana_ib_create_zero_offset_dma_region() is added. To get the zero offset, ib_umem_find_best_pgoff() is used with zero pgoff_bitmask. Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1709560361-26393-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-03-07RDMA/mana_ib: Fix bug in creation of dma regionsKonstantin Taranov1-1/+1
Use ib_umem_dma_offset() helper to calculate correct dma offset. Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1709560361-26393-2-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>