path: root/drivers/infiniband/hw
Age | Commit message | Author | Files | Lines
2020-01-16 | RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths | Jason Gunthorpe | 2 | -6/+15
Till recently it was not possible for userspace to specify a different IOVA, but with the new ibv_reg_mr_iova() library call this can be done. To compute the user_va we must compute: user_va = (iova - iova_start) + user_va_start while being cautious of overflow and other math problems. The iova is not reliably stored in the mmkey when the MR is created. Only the cached creation path (the common one) set it, so it must also be set when creating uncached MRs. Fix the weird use of iova when computing the starting page index in the MR. In the normal case, when iova == umem.address: iova & (~(BIT(page_shift) - 1)) == ALIGN_DOWN(umem.address, odp->page_size) == ib_umem_start(odp) And when iova is different using it in math with a user_va is wrong. Finally, do not allow an implicit ODP to be created with a non-zero IOVA as we have no support for that. Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
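The address translation described in this commit message can be sketched in userspace C. This is a minimal illustration, not the mlx5 driver code: the helper names `iova_to_user_va` and `align_down_pow2` are hypothetical, and the GCC/Clang `__builtin_*_overflow` builtins stand in for the kernel's own overflow-checking helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a multiple of 2^page_shift. */
static uint64_t align_down_pow2(uint64_t addr, unsigned int page_shift)
{
	assert(page_shift < 64);
	return addr & ~((UINT64_C(1) << page_shift) - 1);
}

/*
 * Hypothetical helper: user_va = (iova - iova_start) + user_va_start.
 * Returns false if iova lies before the start of the MR or if the
 * addition wraps around 64 bits; *user_va is only meaningful on success.
 */
static bool iova_to_user_va(uint64_t iova, uint64_t iova_start,
			    uint64_t user_va_start, uint64_t *user_va)
{
	uint64_t offset;

	if (__builtin_sub_overflow(iova, iova_start, &offset))
		return false;
	if (__builtin_add_overflow(offset, user_va_start, user_va))
		return false;
	return true;
}
```

Rejecting the translation on overflow, rather than letting the arithmetic wrap, is one way to be "cautious of overflow"; the actual patch may handle the error differently.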
2020-01-16 | IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs | Moni Shoua | 1 | -0/+17
The ODP handler for WQEs in RQ or SRQ is not implemented for kernel QPs. Therefore, don't report support for these if the query comes from a kernel user. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 | RDMA/mlx5: Don't fake udata for kernel path | Leon Romanovsky | 1 | -18/+16
Kernel paths must not set udata; they should pass a NULL pointer instead of faking a zeroed udata struct. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 | IB/mlx5: Add ODP WQE handlers for kernel QPs | Moni Shoua | 3 | -70/+117
One of the steps in the ODP page fault handler for WQEs is to read a WQE from a QP send queue or receive queue buffer at a specific index. Since the implementation of this buffer differs between kernel and user QPs, the handler needs to be aware of that and handle each case differently. ODP for kernel MRs is currently supported only for RDMA_READ and RDMA_WRITE operations, so change the handler to:
- read a WQE from a kernel QP send queue
- fail if access to the receive queue or shared receive queue is required for a kernel QP
Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 | IB/core: Introduce ib_reg_user_mr | Moni Shoua | 2 | -1/+4
Add ib_reg_user_mr() for kernel ULPs to register user MRs. The common use case for this function is a userspace application that allocates memory for HCA access while the responsibility for registering the memory at the HCA lies with a kernel ULP. This ULP acts as an agent for the userspace application. This function is intended to be used without a user context, so vendor drivers need to be aware that the reg_user_mr() device operation may be called with udata equal to NULL. Among all drivers, i40iw is the only driver which relies on the presence of udata, so check udata existence for that driver. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 | IB: Allow calls to ib_umem_get from kernel ULPs | Moni Shoua | 28 | -56/+62
So far the assumption was that ib_umem_get() and ib_umem_odp_get() are called from flows that start in UVERBS and therefore have a user context. This assumption restricts flows that are initiated by ULPs and need the service that ib_umem_get() provides. This patch changes ib_umem_get() and ib_umem_odp_get() to get the IB device directly by relying on the fact that both UVERBS and ULPs set that field correctly. Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-15 | RDMA/efa: Remove unused ucontext parameter from efa_qp_user_mmap_entries_remove | Gal Pressman | 1 | -7/+4
The ucontext parameter is unused, remove it. Link: https://lore.kernel.org/r/20200114085706.82229-6-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/efa: Remove {} brackets from single statement if | Gal Pressman | 1 | -2/+1
The {} brackets are not needed according to the Linux coding style. Link: https://lore.kernel.org/r/20200114085706.82229-5-galpress@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas JahJah <firasj@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/efa: Device definitions documentation updates | Gal Pressman | 1 | -13/+24
Various clarifications and updates to the documentation of the device definitions. No functional changes in this patch. Link: https://lore.kernel.org/r/20200114085706.82229-4-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/hns: Add support for extended atomic in userspace | Jiaran Zhang | 2 | -2/+17
To support extended atomic operations including cmp & swap and fetch & add of 8 bytes, 16 bytes, 32 bytes, 64 bytes in userspace, some field in qpc should be configured. Link: https://lore.kernel.org/r/1579052546-11746-1-git-send-email-liweihang@huawei.com Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/hns: Get pf capabilities from firmware | Lijun Ou | 2 | -109/+6
Get pf capabilities from firmware according to the hardware type; if this fails, all capabilities will be set to default values. Link: https://lore.kernel.org/r/1578738761-3176-4-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/hns: Add interfaces to get pf capabilities from firmware | Lijun Ou | 3 | -0/+527
pf capabilities were previously set by default for hip08, but they should depend on the type of hardware. So add new interfaces to get them from firmware. Link: https://lore.kernel.org/r/1578738761-3176-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 | RDMA/hns: Remove some redundant variables related to capabilities | Weihang Li | 3 | -7/+1
In struct hns_roce_caps, max_srq_sg and max_srqwqes are unused, and max_srqs has the same effect as num_srqs. So remove them from this structure. Link: https://lore.kernel.org/r/1578738761-3176-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | RDMA/mlx5: Simplify devx async commands | Jason Gunthorpe | 1 | -14/+10
With the new FD structure the async commands do not need to hold any references while running. The existing mlx5_cmd_exec_cb() and mlx5_cmd_cleanup_async_ctx() provide enough synchronization to ensure that all outstanding commands are completed before the uobject can be destructed. Remove the now confusing get_file() and the type erasure of the devx_async_cmd_event_file. Link: https://lore.kernel.org/r/1578504126-9400-4-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | RDMA/core: Simplify destruction of FD uobjects | Jason Gunthorpe | 1 | -63/+48
FD uobjects have a weird split between the struct file and uobject world. Simplify this to make them pure uobjects and use a generic release method for all struct file operations. This fixes the control flow so that mlx5_cmd_cleanup_async_ctx() is always called before erasing the linked list contents to make the concurrency simpler to understand. For this to work the uobject destruction must fence anything that it is cleaning up - the design must not rely on struct file lifetime. Only deliver_event() relies on the struct file when adding new events to the queue, so add an is_destroyed check under lock to block it. Link: https://lore.kernel.org/r/1578504126-9400-3-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | RDMA/mlx5: Use RCU and direct refcounts to keep memory alive | Jason Gunthorpe | 1 | -17/+17
dispatch_event_fd() runs from a notifier with minimal locking, and relies on RCU and a file refcount to keep the uobject and eventfd alive. As the next patch wants to remove the file_operations release function from the drivers, re-organize things so that the devx_event_notifier() path uses the existing RCU to manage the lifetime of the uobject and eventfd. Move the refcount puts to a call_rcu so that the objects are guaranteed to exist and remove the indirect file refcount. Link: https://lore.kernel.org/r/1578504126-9400-2-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | IB/mlx5: Add mmap support for VAR | Yishai Hadas | 1 | -1/+4
Add mmap support for VAR, it uses the 'offset' command mode with involvement of IB core APIs to find the previously allocated mmap entry. Link: https://lore.kernel.org/r/20191212110928.334995-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | IB/mlx5: Introduce VAR object and its alloc/destroy methods | Yishai Hadas | 2 | -0/+164
Introduce VAR object and its alloc/destroy KABI methods. The internal implementation uses the IB core API to manage mmap/munmap calls. Link: https://lore.kernel.org/r/20191212110928.334995-5-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13 | IB/mlx5: Extend caps stage to handle VAR capabilities | Yishai Hadas | 2 | -2/+48
Extend caps stage to handle VAR capabilities. Link: https://lore.kernel.org/r/20191212110928.334995-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | Merge branch 'x86/mm' into efi/core, to pick up dependencies | Ingo Molnar | 1 | -1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-10 | RDMA/hns: Add support for reporting wc as software mode | Xi Wang | 6 | -34/+252
When hardware is in the resetting stage, we may not be able to poll back all the expected work completions, as the hardware won't generate cqes anymore. This patch allows the driver to compose the expected wc instead of the hardware during the resetting stage. Once the hardware has finished resetting, we can poll cq from hardware again. Link: https://lore.kernel.org/r/1578572412-25756-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | RDMA/hns: Bugfix for posting a wqe with sge | Lijun Ou | 1 | -16/+25
The driver should first check whether the sge is valid, then fill the valid sge and the calculated total into hardware; otherwise invalid sges will cause an error. Fixes: 52e3b42a2f58 ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode") Fixes: 7bdee4158b37 ("RDMA/hns: Fill sq wqe context of ud type in hip08") Link: https://lore.kernel.org/r/1578571852-13704-1-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
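The "check first, then fill" order can be sketched in plain C. This is an illustration only, not the hns driver code: `struct sge` and `fill_valid_sges` are hypothetical, and treating a zero-length entry as the invalid case is an assumption based on the first Fixes: line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter/gather element, loosely modelled on struct ib_sge. */
struct sge {
	uint64_t addr;
	uint32_t length;
};

/*
 * Copy only the non-zero-length entries of src into dst and report their
 * total byte count through *total. Returns the number of entries written.
 * dst must have room for at least n entries.
 */
static size_t fill_valid_sges(const struct sge *src, size_t n,
			      struct sge *dst, uint64_t *total)
{
	size_t valid = 0;

	assert(src && dst && total);
	*total = 0;
	for (size_t i = 0; i < n; i++) {
		if (src[i].length == 0)
			continue;	/* skip invalid (empty) entries */
		dst[valid++] = src[i];
		*total += src[i].length;
	}
	return valid;
}
```

Because the total is accumulated in the same loop that filters the entries, the count and the byte total handed to hardware cannot disagree with the entries actually written.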
2020-01-10 | IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats | Mike Marciniszyn | 3 | -0/+3
This counter, RxShrErr, is required for error analysis and debug. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200106134235.119356.29123.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Add software counter for ctxt0 seq drop | Mike Marciniszyn | 4 | -0/+14
All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Return void in packet receiving functions | Grzegorz Andrejczuk | 2 | -25/+18
Packet receiving functions return an int value, and yet the return values are not used at all. This patch converts the functions to return void. Link: https://lore.kernel.org/r/20200106134222.119356.84098.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Decouple IRQ name from type | Grzegorz Andrejczuk | 2 | -48/+59
The IRQ name was tied to the IRQ type. This is not sufficient; it is better to pass the name as an argument to msix_request_irq instead of assigning it to variables when the function is called. The index argument was required to generate the name and can now be removed. To generate the name correctly, helper functions were added and updated. Link: https://lore.kernel.org/r/20200106134216.119356.44478.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Create API for auto activate | Mike Marciniszyn | 1 | -14/+23
Add an auto activate routine for use by the interrupt handler. Link: https://lore.kernel.org/r/20200106134210.119356.43079.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: IB/hfi1: Add an API to handle special case drop | Mike Marciniszyn | 3 | -8/+23
This patch pushes special case drop logic into an API to be shared by all interrupt handlers. Additionally, convert do_drop to a bool. Link: https://lore.kernel.org/r/20200106134203.119356.36962.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Move common receive IRQ code to function | Grzegorz Andrejczuk | 1 | -30/+52
Tracing interrupts, incrementing the interrupt counter, and ASPM handling are parts that will be reused by HFI1 receive IRQ handlers. Create a common function to keep the shared code in one place. Link: https://lore.kernel.org/r/20200106134157.119356.32656.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10 | IB/hfi1: Add fast and slow handlers for receive context | Mike Marciniszyn | 4 | -60/+58
This patch eliminates special cases by adding a fast_handler member to the receive context and changes to the fast handler as specified in the new variable. Initialize the variable as soon as the setting for dma tail is known when the context is created. Setting fast path is called every time any context has entered slow path. Add a function to check if a context is using fast path and do not set fast path when it is already done to improve RCD fastpath setting. Link: https://lore.kernel.org/r/20200106134150.119356.87558.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
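The idea of storing the chosen fast handler in the context and switching to it only when needed can be sketched as follows. This is a reduced illustration under stated assumptions, not the hfi1 code: `struct rcv_ctxt`, its fields, and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rcv_ctxt;
typedef int (*rcv_handler_t)(struct rcv_ctxt *rcd);

/* Hypothetical, heavily reduced receive context. */
struct rcv_ctxt {
	rcv_handler_t do_interrupt;	/* handler currently in use */
	rcv_handler_t fast_handler;	/* chosen once at context creation */
	rcv_handler_t slow_handler;
};

static int handle_fast(struct rcv_ctxt *rcd) { (void)rcd; return 1; }
static int handle_slow(struct rcv_ctxt *rcd) { (void)rcd; return 2; }

/* Record both handlers up front so no later code needs a special case. */
static void ctxt_init(struct rcv_ctxt *rcd)
{
	rcd->fast_handler = handle_fast;
	rcd->slow_handler = handle_slow;
	rcd->do_interrupt = rcd->fast_handler;
}

static bool ctxt_is_fastpath(const struct rcv_ctxt *rcd)
{
	return rcd->do_interrupt == rcd->fast_handler;
}

static void ctxt_set_slowpath(struct rcv_ctxt *rcd)
{
	rcd->do_interrupt = rcd->slow_handler;
}

/* Skip the store when the context is already on the fast path. */
static void ctxt_set_fastpath(struct rcv_ctxt *rcd)
{
	assert(rcd->fast_handler != NULL);
	if (!ctxt_is_fastpath(rcd))
		rcd->do_interrupt = rcd->fast_handler;
}
```

Callers dispatch through `do_interrupt` without knowing which variant is installed, which is what removes the special cases from the interrupt path.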
2020-01-10 | IB/hfi1: Move chip specific functions to chip.c | Mike Marciniszyn | 3 | -69/+87
Move routines and defines associated with hdrq size validation to a chip specific routine since the limits are specific to the device. Fix incorrect value for min size 2 -> 32. CSR writes should also be in chip.c. Create a chip routine to write the hdrq specific CSRs and call as appropriate. Link: https://lore.kernel.org/r/20200106134144.119356.74312.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-08 | IB/mlx5: Do reverse sequence during device removal | Parav Pandit | 1 | -0/+2
When IB device profile initialization completes, device is marked as active. However, IB device is not marked inactive, during device removal flow. It should be the mirror of the add flow. Hence, mark it inactive during remove sequence. Link: https://lore.kernel.org/r/20191212113024.336702-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Fix coding style issues | Lijun Ou | 2 | -143/+92
Fix some coding style issues without changing the logic of the code; most of the modifications address unreasonable line breaks and alignments. Link: https://lore.kernel.org/r/1578313276-29080-8-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Replace custom macros HNS_ROCE_ALIGN_UP | Wenpeng Liang | 2 | -26/+20
HNS_ROCE_ALIGN_UP can be replaced by round_up() which is defined in kernel.h. Link: https://lore.kernel.org/r/1578313276-29080-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
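For readers unfamiliar with round_up(), the following userspace function sketches what the kernel macro computes for power-of-two alignments. The name `round_up_pow2` is hypothetical and the function form is an assumption for illustration; the kernel version is a type-generic macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's round_up(): round x up to the next
 * multiple of y, where y must be a power of two.
 */
static uint64_t round_up_pow2(uint64_t x, uint64_t y)
{
	assert(y != 0 && (y & (y - 1)) == 0);
	/* Set all low bits below the alignment, then step to the boundary. */
	return ((x - 1) | (y - 1)) + 1;
}
```

Values that are already aligned are returned unchanged, and zero stays zero because the unsigned subtraction and addition wrap around symmetrically.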
2020-01-07 | RDMA/hns: Remove redundant print information | Yixing Liu | 1 | -1/+0
There are already necessary prints in the outer function; prints in hns_roce_function_clear() may confuse users. So these prints are removed. Link: https://lore.kernel.org/r/1578313276-29080-6-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Delete unnessary parameters in hns_roce_v2_qp_modify() | Lijun Ou | 1 | -3/+1
Current state and new state of qp won't be configured when modifying qp, so these two redundant parameters should be removed. Link: https://lore.kernel.org/r/1578313276-29080-5-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Update the value of qp type | Lijun Ou | 1 | -5/+7
The values used to represent service type of RC and UD should be interchanged according to design of hardware. And it's better to define these types in enumeration than macros. Link: https://lore.kernel.org/r/1578313276-29080-4-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Remove unused function hns_roce_init_eq_table() | Lijun Ou | 1 | -1/+0
hns_roce_init_eq_table() is an unused function that only retains its declaration in driver. Link: https://lore.kernel.org/r/1578313276-29080-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | RDMA/hns: Avoid printing address of mtt page | Wenpeng Liang | 1 | -2/+2
Address of a page shouldn't be printed in case of security issues. Link: https://lore.kernel.org/r/1578313276-29080-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07 | i40iw: Remove setting of VMA private data and use rdma_user_mmap_io | Shiraz Saleem | 1 | -8/+6
vm_ops is now initialized in ib_uverbs_mmap() with the recent rdma mmap API changes. Earlier it was done in rdma_umap_priv_init() which would not be called unless a driver called rdma_user_mmap_io() in its mmap. i40iw does not use the rdma_user_mmap_io API but sets the vma's vm_private_data to a driver object. This now conflicts with the vm_op rdma_umap_close as priv pointer points to the i40iw driver object instead of the private data setup by core when rdma_user_mmap_io is called. This leads to a crash in rdma_umap_close with a mmap put being called when it should not have. Remove the redundant setting of the vma private_data in i40iw as it is not used. Also move i40iw over to use the rdma_user_mmap_io API. This gives the extra protection of having the mappings zapped when the context is destroyed.
BUG: unable to handle page fault for address: 0000000100000001
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 6 PID: 9528 Comm: rping Kdump: loaded Not tainted 5.5.0-rc4+ #117
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F7 01/17/2014
RIP: 0010:rdma_user_mmap_entry_put+0xa/0x30 [ib_core]
RSP: 0018:ffffb340c04c7c38 EFLAGS: 00010202
RAX: 00000000ffffffff RBX: ffff9308e7be2a00 RCX: 000000000000cec0
RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000100000001
RBP: ffff9308dc7641f0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff8d4414d8 R12: ffff93075182c780
R13: 0000000000000001 R14: ffff93075182d2a8 R15: ffff9308e2ddc840
FS: 0000000000000000(0000) GS:ffff9308fdc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100000001 CR3: 00000002e0412004 CR4: 00000000001606e0
Call Trace:
 rdma_umap_close+0x40/0x90 [ib_uverbs]
 remove_vma+0x43/0x80
 exit_mmap+0xfd/0x1b0
 mmput+0x6e/0x130
 do_exit+0x290/0xcc0
 ? get_signal+0x152/0xc40
 do_group_exit+0x46/0xc0
 get_signal+0x1bd/0xc40
 ? prepare_to_wait_event+0x97/0x190
 do_signal+0x36/0x630
 ? remove_wait_queue+0x60/0x60
 ? __audit_syscall_exit+0x1d9/0x290
 ? rcu_read_lock_sched_held+0x52/0x90
 ? kfree+0x21c/0x2e0
 exit_to_usermode_loop+0x4f/0xc3
 do_syscall_64+0x1ed/0x270
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fae715a81fd
Code: Bad RIP value.
RSP: 002b:00007fae6e163cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: fffffffffffffe00 RBX: 00007fae6e163d30 RCX: 00007fae715a81fd
RDX: 0000000000000010 RSI: 00007fae6e163cf0 RDI: 0000000000000003
RBP: 00000000013413a0 R08: 00007fae68000000 R09: 0000000000000017
R10: 0000000000000001 R11: 0000000000000293 R12: 00007fae680008c0
R13: 00007fae6e163cf0 R14: 00007fae717c9804 R15: 00007fae6e163ed0
CR2: 0000000100000001
---[ end trace b33d58d3a06782cb ]---
RIP: 0010:rdma_user_mmap_entry_put+0xa/0x30 [ib_core]
Fixes: b86deba977a9 ("RDMA/core: Move core content from ib_uverbs to ib_core") Link: https://lore.kernel.org/r/20200107162223.1745-1-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-06 | remove ioremap_nocache and devm_ioremap_nocache | Christoph Hellwig | 7 | -10/+10
ioremap has provided non-cached semantics by default since the Linux 2.6 days, so remove the additional ioremap_nocache interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-04 | RDMA/i40iw: fix a potential NULL pointer dereference | Xiyu Yang | 1 | -0/+2
A NULL pointer can be returned by in_dev_get(). Thus add a corresponding check so that a NULL pointer dereference will be avoided at this place. Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status") Link: https://lore.kernel.org/r/1577672668-46499-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-04 | RDMA/mlx5: use true,false for bool variable | zhengbin | 2 | -3/+3
Fixes coccicheck warning: drivers/infiniband/hw/mlx5/mr.c:150:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/mr.c:1455:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/qp.c:1874:6-20: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-6-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
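The pattern that coccicheck flags, and its fix, looks roughly like the sketch below. The struct and field names are hypothetical and are not taken from mr.c or qp.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structure with a bool member. */
struct cache_state_stub {
	bool fill_delayed;
};

static void mark_delayed(struct cache_state_stub *st)
{
	assert(st != NULL);
	/* Flagged form would be: st->fill_delayed = 1; */
	st->fill_delayed = true;
}
```

Both forms store the same value; the change only makes the intent of the bool member explicit.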
2020-01-04 | RDMA/mlx4: use true,false for bool variable | zhengbin | 1 | -2/+2
Fixes coccicheck warning: drivers/infiniband/hw/mlx4/qp.c:852:2-14: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx4/qp.c:3087:3-10: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-5-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-04 | IB/hfi1: use true,false for bool variable | zhengbin | 1 | -1/+1
Fixes coccicheck warning: drivers/infiniband/hw/hfi1/rc.c:2602:1-8: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-3-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-04 | IB/mlx5: Unify ODP MR code paths to allow extra flexibility | Artemy Kovalyov | 4 | -67/+67
Building MR translation table in the ODP case requires additional flexibility, namely random access to DMA addresses. Make both direct and indirect ODP MR use same code path, separated from the non-ODP MR code path. With the restructuring the correct page_shift is now used around __mlx5_ib_populate_pas(). Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Link: https://lore.kernel.org/r/20191222124649.52300-2-leon@kernel.org Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 | IB/hfi1: Adjust flow PSN with the correct resync_psn | Kaike Wan | 1 | -0/+9
When a TID RDMA ACK to RESYNC request is received, the flow PSNs for pending TID RDMA WRITE segments will be adjusted with the next flow generation number, based on the resync_psn value extracted from the flow PSN of the TID RDMA ACK packet. The resync_psn value indicates the last flow PSN for which a TID RDMA WRITE DATA packet has been received by the responder and the requester should resend TID RDMA WRITE DATA packets, starting from the next flow PSN. However, if resync_psn points to the last flow PSN for a segment and the next segment flow PSN starts with a new generation number, use of the old resync_psn to adjust the flow PSN for the next segment will lead to miscalculation, resulting in WARN_ON and sge rewinding errors:
WARNING: CPU: 4 PID: 146961 at /nfs/site/home/phcvs2/gitrepo/ifs-all/components/Drivers/tmp/rpmbuild/BUILD/ifs-kernel-updates-3.10.0_957.el7.x86_64/hfi1/tid_rdma.c:4764 hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
Modules linked in: ib_ipoib(OE) hfi1(OE) rdmavt(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfsv3 nfs_acl nfs lockd grace fscache iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel ib_isert iscsi_target_mod target_core_mod aesni_intel lrw gf128mul glue_helper ablk_helper cryptd rpcrdma sunrpc opa_vnic ast ttm ib_iser libiscsi drm_kms_helper scsi_transport_iscsi ipmi_ssif syscopyarea sysfillrect sysimgblt fb_sys_fops drm joydev ipmi_si pcspkr sg drm_panel_orientation_quirks ipmi_devintf lpc_ich i2c_i801 ipmi_msghandler wmi rdma_ucm ib_ucm ib_uverbs acpi_cpufreq acpi_power_meter ib_umad rdma_cm ib_cm iw_cm ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul i2c_algo_bit crct10dif_common crc32c_intel e1000e ib_core ahci libahci ptp libata pps_core nfit libnvdimm [last unloaded: rdmavt]
CPU: 4 PID: 146961 Comm: kworker/4:0H Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.0X.02.0117.040420182310 04/04/2018
Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1]
Call Trace:
 <IRQ>
 [<ffffffff9e361dc1>] dump_stack+0x19/0x1b
 [<ffffffff9dc97648>] __warn+0xd8/0x100
 [<ffffffff9dc9778d>] warn_slowpath_null+0x1d/0x20
 [<ffffffffc05d28c6>] hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
 [<ffffffffc05c21cc>] hfi1_kdeth_eager_rcv+0x1dc/0x210 [hfi1]
 [<ffffffffc05c23ef>] ? hfi1_kdeth_expected_rcv+0x1ef/0x210 [hfi1]
 [<ffffffffc0574f15>] kdeth_process_eager+0x35/0x90 [hfi1]
 [<ffffffffc0575b5a>] handle_receive_interrupt_nodma_rtail+0x17a/0x2b0 [hfi1]
 [<ffffffffc056a623>] receive_context_interrupt+0x23/0x40 [hfi1]
 [<ffffffff9dd4a294>] __handle_irq_event_percpu+0x44/0x1c0
 [<ffffffff9dd4a442>] handle_irq_event_percpu+0x32/0x80
 [<ffffffff9dd4a4cc>] handle_irq_event+0x3c/0x60
 [<ffffffff9dd4d27f>] handle_edge_irq+0x7f/0x150
 [<ffffffff9dc2e554>] handle_irq+0xe4/0x1a0
 [<ffffffff9e3795dd>] do_IRQ+0x4d/0xf0
 [<ffffffff9e36b362>] common_interrupt+0x162/0x162
 <EOI>
 [<ffffffff9dfa0f79>] ? swiotlb_map_page+0x49/0x150
 [<ffffffffc05c2ed1>] hfi1_verbs_send_dma+0x291/0xb70 [hfi1]
 [<ffffffffc05c2c40>] ? hfi1_wait_kmem+0xf0/0xf0 [hfi1]
 [<ffffffffc05c3f26>] hfi1_verbs_send+0x126/0x2b0 [hfi1]
 [<ffffffffc05ce683>] _hfi1_do_tid_send+0x1d3/0x320 [hfi1]
 [<ffffffff9dcb9d4f>] process_one_work+0x17f/0x440
 [<ffffffff9dcbade6>] worker_thread+0x126/0x3c0
 [<ffffffff9dcbacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
 [<ffffffff9dcc1c31>] kthread+0xd1/0xe0
 [<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40
 [<ffffffff9e374c1d>] ret_from_fork_nospec_begin+0x7/0x21
 [<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40
This patch fixes the issue by adjusting the resync_psn first if the flow generation has been advanced for a pending segment.
Fixes: 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet") Link: https://lore.kernel.org/r/20191219231920.51069.37147.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 | IB/hfi1: List all receive contexts from debugfs | Michael J. Ruhl | 2 | -4/+10
The current debugfs output for receive contexts (rcds), stops after the kernel receive contexts have been displayed. This is not enough information to fully diagnose packet drops. Display all of the receive contexts. Augment the output with some more context information. Limit the ring buffer header output to 5 entries to avoid overextending the sequential file output. Fixes: bf808b5039c ("IB/hfi1: Add kernel receive context info to debugfs") Link: https://lore.kernel.org/r/20191219211928.58387.20737.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 | IB/hfi1: Add accessor API routines to access context members | Mike Marciniszyn | 9 | -85/+183
This patch adds a set of accessor routines to access context members. Link: https://lore.kernel.org/r/20191219211922.58387.26548.stgit@awfm-01.aw.intel.com Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 | IB/hfi1: Don't cancel unused work item | Kaike Wan | 1 | -1/+3
In the iowait structure, two iowait_work entries were included to queue a given object: one for normal IB operations, and the other for TID RDMA operations. For non-TID RDMA operations, the iowait_work structure for TID RDMA is initialized to contain a NULL function (not used). When the QP is reset, the function iowait_cancel_work will be called to cancel any pending work. The problem is that this function will call cancel_work_sync() for both iowait_work entries, even though the one for TID RDMA is not used at all. Eventually, the call cascades to __flush_work(), wherein a WARN_ON will be triggered due to the fact that work->func is NULL. The WARN_ON was introduced in commit 4d43d395fed1 ("workqueue: Try to catch flush_work() without INIT_WORK().") This patch fixes the issue by making sure that a work function is present for TID RDMA before calling cancel_work_sync in iowait_cancel_work. Fixes: 4d43d395fed1 ("workqueue: Try to catch flush_work() without INIT_WORK().") Fixes: 5da0fc9dbf89 ("IB/hfi1: Prepare resource waits for dual leg") Link: https://lore.kernel.org/r/20191219211941.58387.39883.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
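The guard described in this commit message can be modelled in userspace C. Everything below is a hypothetical stand-in: `struct work_stub`, `struct iowait_stub`, and `cancel_work_stub()` only imitate the roles of the kernel's work item, the hfi1 iowait structure, and cancel_work_sync().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item. */
struct work_stub {
	void (*func)(struct work_stub *work);
	int cancelled;
};

enum { LEG_IB = 0, LEG_TID = 1, LEG_COUNT };

/* Hypothetical stand-in for the iowait structure with its two legs. */
struct iowait_stub {
	struct work_stub wait[LEG_COUNT];
};

static void noop_work(struct work_stub *work) { (void)work; }

/* Stand-in for cancel_work_sync(); must never see an unset work function. */
static void cancel_work_stub(struct work_stub *work)
{
	assert(work->func != NULL);
	work->cancelled = 1;
}

static void iowait_cancel_stub(struct iowait_stub *w)
{
	cancel_work_stub(&w->wait[LEG_IB]);
	/* Only cancel the TID RDMA leg if a work function was ever set. */
	if (w->wait[LEG_TID].func)
		cancel_work_stub(&w->wait[LEG_TID]);
}
```

The assertion inside `cancel_work_stub()` plays the role of the WARN_ON in __flush_work(): without the `func` check in the caller, a non-TID QP would trip it.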