summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2021-04-08RDMA/addr: Be strict with gid sizeLeon Romanovsky1-1/+3
The nla_len() is less than or equal to 16. If it's less than 16 then end of the "gid" buffer is uninitialized. Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Prevent le32 from being implicitly converted to u32Lang Cheng1-8/+7
Replace BUILD_BUG_ON_ZERO() with BUILD_BUG_ON() to avoid sparse complaining "restricted __le32 degrades to integer". Link: https://lore.kernel.org/r/1617354454-47840-10-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Simplify the function config_eqc()Yixing Liu2-185/+65
Use "hr_reg_write" replace "roce_set_filed". Link: https://lore.kernel.org/r/1617354454-47840-9-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Add XRC subtype in QPC and XRC type in SRQCWenpeng Liang2-7/+11
A field to distuiguish basic SRQ from XRC SRQ in SRQC and a field in QPC to determine whether a QP is XRC TGT QP or XRC INI QP are missing. Fixes: 32548870d438 ("RDMA/hns: Add support for XRC on HIP09") Link: https://lore.kernel.org/r/1617354454-47840-8-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Remove unsupported QP typesWenpeng Liang3-5/+1
The hns ROCEE does not support UC QP currently. Link: https://lore.kernel.org/r/1617354454-47840-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete unused members in the structure hns_roce_hwYangyang Li3-32/+0
Some structure members in hns_roce_hw have never been used and need to be deleted. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Fixes: b156269d88e4 ("RDMA/hns: Add modify CQ support for hip08") Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode") Link: https://lore.kernel.org/r/1617354454-47840-6-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete redundant abnormal interrupt statusWenpeng Liang2-16/+6
The hardware supports only two types of abnormal interrupts. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1617354454-47840-5-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Delete redundant condition judgment related to eqYangyang Li1-21/+6
The register value related to the eq interrupt depends only on enable_flag, so the redundant condition judgment is deleted. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1617354454-47840-4-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Fix missing assignment of max_inline_dataWeihang Li1-0/+1
When querying QP, the ULPs should be informed of the max length of inline data supported by the hardware. Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC") Link: https://lore.kernel.org/r/1617354454-47840-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Avoid enabling RQ inline on UDWeihang Li2-3/+8
RQ inline is not supported on UD/GSI QP, it should be disabled in QPC. Link: https://lore.kernel.org/r/1617354454-47840-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Modify prints for mailbox and command queueWenpeng Liang2-10/+20
Use ratelimited print in mbox and cmq. And print mailbox operation if mailbox fails because it's useful information for the user. Link: https://lore.kernel.org/r/1617262341-37571-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Support more return types of command queueLang Cheng1-4/+14
Add error code definition according to the return code from firmware to help find out more detailed reasons why a command fails to be sent. Link: https://lore.kernel.org/r/1617262341-37571-3-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Enable all CMDQ contextLang Cheng2-41/+27
Fix error of cmd's context number calculation algorithm to enable all of 32 cmd entries and support 32 concurrent accesses. Link: https://lore.kernel.org/r/1617262341-37571-2-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/rxe: Fix missing acks from responderBob Pearson2-11/+8
All responder errors from request packets that do not consume a receive WQE fail to generate acks for RC QPs. This patch corrects this behavior by making the flow follow the same path as request packets that do consume a WQE after the completion. Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/ Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/qedr: Fix kernel panic when trying to access recv_cqKamal Heib1-1/+2
As INI QP does not require a recv_cq, avoid the following null pointer dereference by checking if the qp_type is not INI before trying to extract the recv_cq. BUG: kernel NULL pointer dereference, address: 00000000000000e0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 54250 Comm: mpitests-IMB-MP Not tainted 5.12.0-rc5 #1 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.7.0 08/19/2019 RIP: 0010:qedr_create_qp+0x378/0x820 [qedr] Code: 02 00 00 50 e8 29 d4 a9 d1 48 83 c4 18 e9 65 fe ff ff 48 8b 53 10 48 8b 43 18 44 8b 82 e0 00 00 00 45 85 c0 0f 84 10 74 00 00 <8b> b8 e0 00 00 00 85 ff 0f 85 50 fd ff ff e9 fd 73 00 00 48 8d bd RSP: 0018:ffff9c8f056f7a70 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff9c8f056f7b58 RCX: 0000000000000009 RDX: ffff8c41a9744c00 RSI: ffff9c8f056f7b58 RDI: ffff8c41c0dfa280 RBP: ffff8c41c0dfa280 R08: 0000000000000002 R09: 0000000000000001 R10: 0000000000000000 R11: ffff8c41e06fc608 R12: ffff8c4194052000 R13: 0000000000000000 R14: ffff8c4191546070 R15: ffff8c41c0dfa280 FS: 00007f78b2787b80(0000) GS:ffff8c43a3200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 00000001011d6002 CR4: 00000000001706f0 Call Trace: ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x4e4/0xb90 [ib_uverbs] ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs] ib_uverbs_run_method+0x6f6/0x7a0 [ib_uverbs] ? ib_uverbs_handler_UVERBS_METHOD_QP_DESTROY+0x70/0x70 [ib_uverbs] ? __cond_resched+0x15/0x30 ? __kmalloc+0x5a/0x440 ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs] ? xa_load+0x6e/0x90 ? cred_has_capability+0x7c/0x130 ? avc_has_extended_perms+0x17f/0x440 ? vma_link+0xae/0xb0 ? vma_set_page_prot+0x2a/0x60 ? mmap_region+0x298/0x6c0 ? do_mmap+0x373/0x520 ? selinux_file_ioctl+0x17f/0x220 ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f78b120262b Fixes: 06e8d1df46ed ("RDMA/qedr: Add support for user mode XRC-SRQ's") Link: https://lore.kernel.org/r/20210404125501.154789-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/core: Make the wc status prompt message clearerYixian Liu1-2/+2
Local invalidate is also a kind of memory management operation, not only memory bind operation. Furthermore, as invalidate operations include local and remote, add prefix to the prompt message to make it clearer. Link: https://lore.kernel.org/r/1617698772-13871-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08RDMA/hns: Use GFP_ATOMIC under spin lockWei Yongjun1-1/+1
A spin lock is taken here so we should use GFP_ATOMIC. Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW") Link: https://lore.kernel.org/r/20210407154900.3486268-1-weiyongjun1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/mlx5: Reduce max order of memory allocated for xlt updatePraveen Kumar Kannoju1-1/+1
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to 1 MB (order-8) memory, depending on the size of the MR. This costly allocation can sometimes take very long to return (a few seconds). This causes the calling application to hang for a long time, especially when the system is fragmented. To avoid these long latency spikes, the calls the higher order allocations need to fail faster in case they are not available. In order to acheive this we need __GFP_NORETRY flag in the gfp_mask before during fetching the free pages. Allow the algorithm to automatically fall back to smaller page sizes. Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Remove unused functionKaike Wan1-18/+0
Remove the unused function sdma_iowait_schedule(). Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Use kzalloc() for mmu_rb_handler allocationMike Marciniszyn1-1/+1
The code currently assumes that the mmu_notifier struct embedded in mmu_rb_handler only contains two fields. There are now extra fields: struct mmu_notifier { struct hlist_node hlist; const struct mmu_notifier_ops *ops; struct mm_struct *mm; struct rcu_head rcu; unsigned int users; }; Given that there in no init for the mmu_notifier, a kzalloc() should be used to insure that any newly added fields are given a predictable initial value of zero. Fixes: 06e0ffa69312 ("IB/hfi1: Re-factor MMU notification code") Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Adam Goldman <adam.goldman@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Add additional usdma tracesMike Marciniszyn3-3/+85
Add traces that were vital in isolating an issue with pq waitlist in commit fa8dac396863 ("IB/hfi1: Fix another case where pq is left on waitlist") Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Remove indirect call to hfi1_ipoib_send_dma()Mike Marciniszyn3-16/+8
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add. Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send(). Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Use napi_schedule_irqoff() for tx napiMike Marciniszyn1-1/+1
The call is from an ISR context and napi_schedule_irqoff() can be used. Change the call to the more efficient type. Link: https://lore.kernel.org/r/1617026056-50483-5-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Correct oversized ring allocationMike Marciniszyn2-8/+9
The completion ring for tx is using the wrong size to size the ring, oversizing the ring by two orders of magniture. Correct the allocation size and use kcalloc_node() to allocate the ring. Fix mistaken GFP defines in similar allocations. Link: https://lore.kernel.org/r/1617026056-50483-4-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdevMike Marciniszyn4-0/+37
The current rdma_netdev handling in ipoib hooks the tx_timeout handler, but prints out a totally useless message that prevents effective debugging especially when multiple transmit queues are being used. Add a tx_timeout rdma_netdev hook and implement the callback in the hfi1 to print additional information. The existing non-helpful message is avoided when the driver has presented a callback. Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08IB/hfi1: Add AIP tx tracesMike Marciniszyn2-3/+119
Add traces to allow for debugging issues with AIP tx. Link: https://lore.kernel.org/r/1617026056-50483-2-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOSMike Marciniszyn4-19/+16
A panic can result when AIP is enabled: BUG: unable to handle kernel NULL pointer dereference at 000000000000000 PGD 0 P4D 0 Oops: 0000 1 SMP PTI CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014 RIP: 0010:__bitmap_and+0x1b/0x70 RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048 RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750 RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0 R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003 FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0 Call Trace: hfi1_num_netdev_contexts+0x7c/0x110 [hfi1] hfi1_init_dd+0xd7f/0x1a90 [hfi1] ? pci_bus_read_config_dword+0x49/0x70 ? pci_mmcfg_read+0x3e/0xe0 do_init_one.isra.18+0x336/0x640 [hfi1] local_pci_probe+0x41/0x90 pci_device_probe+0x105/0x1c0 really_probe+0x212/0x440 driver_probe_device+0x49/0xc0 device_driver_attach+0x50/0x60 __driver_attach+0x61/0x130 ? device_driver_attach+0x60/0x60 bus_for_each_dev+0x77/0xc0 ? klist_add_tail+0x3b/0x70 bus_add_driver+0x14d/0x1e0 ? dev_init+0x10b/0x10b [hfi1] driver_register+0x6b/0xb0 ? dev_init+0x10b/0x10b [hfi1] hfi1_mod_init+0x1e6/0x20a [hfi1] do_one_initcall+0x46/0x1c3 ? free_unref_page_commit+0x91/0x100 ? _cond_resched+0x15/0x30 ? kmem_cache_alloc_trace+0x140/0x1c0 do_init_module+0x5a/0x220 load_module+0x14b4/0x17e0 ? __do_sys_finit_module+0xa8/0x110 __do_sys_finit_module+0xa8/0x110 do_syscall_64+0x5b/0x1a0 The issue happens when pcibus_to_node() returns NO_NUMA_NODE. Fix this issue by moving the initialization of dd->node to hfi1_devdata allocation and remove the other pcibus_to_node() calls in the probe path and use dd->node instead. Affinity logic is adjusted to use a new field dd->affinity_entry as a guard instead of dd->node. Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev") Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com Cc: stable@vger.kernel.org Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07RDMA/cxgb4: check for ipv6 address properly while destroying listenerPotnuri Bharat Teja1-1/+2
ipv6 bit is wrongly set by the below which causes fatal adapter lookup engine errors for ipv4 connections while destroying a listener. Fix it to properly check the local address for ipv6. Fixes: 3408be145a5d ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server") Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Reorganize doorbell update interfaces for all queuesYixian Liu7-130/+119
The doorbell update interfaces are very similar for different queues, such as SQ, RQ, SRQ, CQ and EQ. So reorganize these code and also fix some inappropriate naming. Link: https://lore.kernel.org/r/1616840738-7866-3-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Support configuring doorbell mode of RQ and CQYixian Liu5-40/+66
HIP08 supports both normal and record doorbell mode for RQ and CQ, SQ record doorbell for userspace is also supported by the software for flushing CQE process. As now the capability of HIP08 are exposed to the user and are configurable, the support of normal doorbell should be added back. Note that, if switching to normal doorbell, the kernel will report "flush cqe is unsupported" if modify qp to error status as the flush is based on record doorbell. Link: https://lore.kernel.org/r/1616840738-7866-2-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Simplify command fields for HEM base address configurationXi Wang2-160/+83
Use hr_reg_write() instead of roce_set_field() to simplify codes about configuring HEM BA. Link: https://lore.kernel.org/r/1616815294-13434-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Reorganize process of setting HEMXi Wang1-33/+44
Encapsulate configuring GMV base address and other type of HEM table into two separate functions to make process of setting HEM clearer. Link: https://lore.kernel.org/r/1616815294-13434-5-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Refactor reset state checking flowXi Wang5-200/+220
The 'HNS_ROCE_OPC_QUERY_MB_ST' command will response the mailbox complete status and hardware busy flag, and the complete status is only valid when the busy flag is 0, so it's better to query these two fields at a time rather than separately. Link: https://lore.kernel.org/r/1616815294-13434-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Reorganize hns_roce_create_cq()Yixing Liu1-28/+58
Encapsulate two subprocesses into functions. Link: https://lore.kernel.org/r/1616815294-13434-3-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Refactor hns_roce_v2_poll_one()Weihang Li1-179/+224
Encapsulate the process of obtaining the current QP and filling WC as functions, also merge some duplicate code. Link: https://lore.kernel.org/r/1616815294-13434-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs-clt: Cap max_io_sizeJack Wang1-1/+3
Max io size is limited by both remote buffer size and the max fr pages per mr. Link: https://lore.kernel.org/r/20210325153308.1214057-20-gi-oh.kim@ionos.com Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs: Cleanup unused 's' variable in __alloc_sessJack Wang1-2/+0
Link: https://lore.kernel.org/r/20210325153308.1214057-18-gi-oh.kim@ionos.com Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs-srv: Report temporary sessname for error messageGioh Kim1-0/+11
Before receiving the session name, the error message cannot include any information about which connection generates the error. This patch stores the addresses of source and target in the sessname field to show which generates the error. That field will be over-written when receiving the session name from client. Link: https://lore.kernel.org/r/20210325153308.1214057-17-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs: New function converting rtrs_addr to stringGioh Kim4-14/+37
There is common code converting addresses of source machine and destination machine to a string. We already have a struct rtrs_addr to store two addresses. This patch introduces a new function that converts two addresses into one string with struct rtrs_addr. Link: https://lore.kernel.org/r/20210325153308.1214057-14-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session filesMd Haris Iqbal1-1/+1
KASAN detected the following BUG: BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Call Trace: <IRQ> dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] kasan_report+0x10/0x20 rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client] ? lockdep_hardirqs_on+0x1a8/0x290 ? process_io_rsp+0xb0/0xb0 [rtrs_client] ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib] ? add_interrupt_randomness+0x1a2/0x340 __ib_process_cq+0x97/0x100 [ib_core] ib_poll_handler+0x41/0xb0 [ib_core] irq_poll_softirq+0xe0/0x260 __do_softirq+0x127/0x672 irq_exit+0xd1/0xe0 do_IRQ+0xa3/0x1d0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xea/0x780 Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00 RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414 RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0 ? lockdep_hardirqs_on+0x1a8/0x290 ? cpuidle_enter_state+0xe5/0x780 cpuidle_enter+0x3c/0x60 do_idle+0x2fb/0x390 ? arch_cpu_idle_exit+0x40/0x40 ? schedule+0x94/0x120 cpu_startup_entry+0x19/0x1b start_kernel+0x5da/0x61b ? thread_stack_cache_init+0x6/0x6 ? load_ucode_amd_bsp+0x6f/0xc4 ? init_amd_microcode+0xa6/0xa6 ? x86_family+0x5/0x20 ? load_ucode_bsp+0x182/0x1fd secondary_startup_64+0xa4/0xb0 Allocated by task 5730: save_stack+0x19/0x80 __kasan_kmalloc.constprop.9+0xc1/0xd0 kmem_cache_alloc_trace+0x15b/0x350 alloc_sess+0xf4/0x570 [rtrs_client] rtrs_clt_open+0x3b4/0x780 [rtrs_client] find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client] rnbd_clt_map_device+0xd7/0xf50 [rnbd_client] rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client] kernfs_fop_write+0x141/0x240 vfs_write+0xf3/0x280 ksys_write+0xba/0x150 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5822: save_stack+0x19/0x80 __kasan_slab_free+0x125/0x170 kfree+0xe7/0x3f0 kobject_put+0xd3/0x240 rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client] rtrs_clt_close+0x3c/0x80 [rtrs_client] close_rtrs+0x45/0x80 [rnbd_client] rnbd_client_exit+0x10f/0x2bd [rnbd_client] __x64_sys_delete_module+0x27b/0x340 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe When rtrs_clt_close is triggered, it iterates over all the present rtrs_clt_sess and triggers close on them. However, the call to rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This is incorrect since during the initialization phase we allocate rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it. If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it may so happen that an inflight IO completion would trigger the function rtrs_clt_rdma_done, which would lead to the above UAF case. Hence close the rtrs_clt_con connections first, and then trigger the destruction of session files. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs: Cleanup the code in rtrs_srv_rdma_cm_handlerGuoqing Jiang1-4/+1
Let these cases share the same path since all of them need to close session. Link: https://lore.kernel.org/r/20210325153308.1214057-11-gi-oh.kim@ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com> Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs: Remove sessname and sess_kobj from rtrs_attrsGuoqing Jiang2-4/+0
The two members are not used in the code, so remove them. Link: https://lore.kernel.org/r/20210325153308.1214057-10-gi-oh.kim@ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com> Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs: Kill the put label in rtrs_srv_create_once_sysfs_root_foldersGuoqing Jiang1-5/+2
We can remove the label after move put_device to the right place. Link: https://lore.kernel.org/r/20210325153308.1214057-9-gi-oh.kim@ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com> Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/rtrs-clt: Remove redundant code from rtrs_clt_read_reqGuoqing Jiang1-4/+1
There is no need to dereference 's' from 'sess', since we have "sess = to_clt_sess(s)" before. And we can deference 'dev' from 's' earlier. Link: https://lore.kernel.org/r/20210325153308.1214057-8-gi-oh.kim@ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com> Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Support congestion control type selection according to the FWYangyang Li4-3/+200
The type of congestion control algorithm includes DCQCN, LDCP, HC3 and DIP. The driver will select one of them according to the firmware when querying PF capabilities, and then set the related configuration fields into QPC. Link: https://lore.kernel.org/r/1616679236-7795-3-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/hns: Support query information of functions from FWWei Xu3-1/+37
Add a new type of command to query mac id of functions from the firmware, it is used to select the template of congestion algorithm. More info will be supported in the future. Link: https://lore.kernel.org/r/1616679236-7795-2-git-send-email-liweihang@huawei.com Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Shengming Shu <shushengming1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01RDMA/core: Fix corrupted SL on passive sideHåkon Bugge1-1/+2
On RoCE systems, a CM REQ contains a Primary Hop Limit > 1 and Primary Subnet Local is zero. In cm_req_handler(), the cm_process_routed_req() function is called. Since the Primary Subnet Local value is zero in the request, and since this is RoCE (Primary Local LID is permissive), the following statement will be executed: IBA_SET(CM_REQ_PRIMARY_SL, req_msg, wc->sl); This corrupts SL in req_msg if it was different from zero. In other words, a request to setup a connection using an SL != zero, will not be honored, and a connection using SL zero will be created instead. Fixed by not calling cm_process_routed_req() on RoCE systems, the cm_process_route_req() is only for IB anyhow. Fixes: 3971c9f6dbf2 ("IB/cm: Add interim support for routed paths") Link: https://lore.kernel.org/r/1616420132-31005-1-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-31RDMA/rxe: Remove rxe_dma_device declarationKamal Heib1-1/+0
The function isn't implemented - delete the declaration. Fixes: a9d2e9ae953f ("RDMA/siw,rxe: Make emulated devices virtual in the device tree") Link: https://lore.kernel.org/r/20210331102043.691950-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-31RDMA/iw_cxgb4: Use DEFINE_SPINLOCK() for spinlockTang Yizhou1-2/+1
spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than explicitly calling spin_lock_init(). Link: https://lore.kernel.org/r/20210331020105.4858-1-tangyizhou@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Tang Yizhou <tangyizhou@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-31RDMA/iser: struct iscsi_iser_task is declared twiceWan Jiabing1-1/+0
struct iscsi_iser_task has been declared at 201st line. Remove the duplicate. Link: https://lore.kernel.org/r/20210326113347.903976-1-wanjiabing@vivo.com Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>