Age | Commit message (Collapse) | Author | Files | Lines |
|
Expose UAPI to query MEMIC DM, this will let user space application
that didn't allocate the DM but has access to by owning the matching
command FD to retrieve its information.
Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
MEMIC buffer, in addition to regular read and write operations, can
support atomic operations from the host.
Introduce and implement new UAPI to allocate address space for MEMIC
operations such as atomic. This includes:
1. Expose new IOCTL for request mapping of MEMIC operation.
2. Hold the operations address in a list, so same operation to same DM
will be allocated only once.
3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid
until all addresses were unmapped.
Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add two functions to allocate and deallocate MEMIC operations
by using the MODIFY_MEMIC command.
Link: https://lore.kernel.org/r/20210411122924.60230-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
1. Inline the checks from check_dm_type_support() into their
respective allocation functions.
2. Fix use after free when driver fails to copy the MEMIC address to the
user by moving the allocation code into their respective functions,
hence we avoid the explicit call to free the DM in the error flow.
3. Split mlx5_ib_dm struct to memic and icm proper typesafety
throughout.
Fixes: dc2316eba73f ("IB/mlx5: Fix device memory flows")
Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Move all device memory related code to a separate file.
Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All other users of the dummy netdevice embed the netdev in other
structures:
init_dummy_netdev(&mal->dummy_dev);
init_dummy_netdev(ð->dummy_dev);
init_dummy_netdev(&ar->napi_dev);
init_dummy_netdev(&irq_grp->napi_ndev);
init_dummy_netdev(&wil->napi_ndev);
init_dummy_netdev(&trans_pcie->napi_dev);
init_dummy_netdev(&dev->napi_dev);
init_dummy_netdev(&bus->mux_dev);
The AIP and VNIC implementation turns that model inside out and used a
kfree() to free what appears to be a netdev struct when in reality, it is
a struct that enbodies the rx state as well as the dummy netdev used to
support napi_poll across disparate receive contexts. The relationship is
infered by the odd allocation:
const int netdev_size = sizeof(*dd->dummy_netdev) +
sizeof(struct hfi1_netdev_priv);
<snip>
dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node);
Correct the issue by:
- Correctly naming the alloc and free functions
- Renaming hfi1_netdev_priv to hfi1_netdev_rx
- Replacing dd dummy_netdev with a netdev_rx pointer
- Embedding the net_device in hfi1_netdev_rx
- Moving the init_dummy_netdev to the alloc routine
- Adjusting wrappers to fit the new model
Fixes: 6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipoib packets")
Link: https://lore.kernel.org/r/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix the following clang warning:
drivers/infiniband/hw/qib/qib_iba7322.c:803:19: warning: unused function 'qib_read_ureg' [-Wunused-function].
Link: https://lore.kernel.org/r/1618305063-29007-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
As a flush operation is implemented inside destroy_workqueue(), there is
no need to do flush operation before.
Fixes: bfcc681bd09d ("IB/hns: Fix the bug when free mr")
Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Link: https://lore.kernel.org/r/1618305087-30799-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
A session can be removed dynamically by sysfs interface "remove_path" that
eventually calls rtrs_clt_remove_path_from_sysfs function. The current
rtrs_clt_remove_path_from_sysfs first removes the sysfs interfaces and
frees sess->stats object. Second it removes the session from the active
list.
Therefore some functions could access non-connected session and access the
freed sess->stats object even-if they check the session status before
accessing the session.
For instance rtrs_clt_request and get_next_path_min_inflight check the
session status and try to send IO to the session. The session status
could be changed when they are trying to send IO but they could not catch
the change and update the statistics information in sess->stats object,
and generate use-after-free problem.
(see: "RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its
stats")
This patch changes the rtrs_clt_remove_path_from_sysfs to remove the
session from the active session list and then destroy the sysfs
interfaces.
Each function still should check the session status because closing or
error recovery paths can change the status.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210412084002.33582-1-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: db7683d7deb2 ("IB/srpt: Fix login-related race conditions")
Link: https://lore.kernel.org/r/20210408113132.87250-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Introduce the ability for kernel ULPs to adjust the minimum RNR Retry
timer. The INIT -> RTR transition executed by RDMA CM will be used for
this adjustment. This avoids an additional ib_modify_qp() call.
rdma_set_min_rnr_timer() must be called before the call to rdma_connect()
on the active side and before the call to rdma_accept() on the passive
side.
The default value of RNR Retry timer is zero, which translates to 655
ms. When the receiver is not ready to accept a send messages, it encodes
the RNR Retry timer value in the NAK. The requestor will then wait at
least the specified time value before retrying the send.
The 5-bit value to be supplied to the rdma_set_min_rnr_timer() is
documented in IBTA Table 45: "Encoding for RNR NAK Timer Field".
Link: https://lore.kernel.org/r/1617216194-12890-2-git-send-email-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
spinlock can be initialized automatically with DEFINE_SPINLOCK() rather
than explicitly calling spin_lock_init().
Link: https://lore.kernel.org/r/20210409095147.2294269-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20210408113137.97202-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210408113140.103032-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: 82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr")
Link: https://lore.kernel.org/r/20210408113135.92165-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Block comments should not use a trailing */ on a separate line and every
line of a block comment should start with an '*'.
Link: https://lore.kernel.org/r/1617783353-48249-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Do following cleanups about braces:
- Add the necessary braces to maintain context alignment.
- Fix the open '{' that is not on the same line as "switch".
- Remove braces that are not necessary for single statement blocks.
- Fix "else" that doesn't follow close brace '}'.
Link: https://lore.kernel.org/r/1617783353-48249-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Space is not required after '(', before ')', before ',' and between '*'
and symbol name of a definition.
Link: https://lore.kernel.org/r/1617783353-48249-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Space is required before '(' of switch statements and around '='.
Link: https://lore.kernel.org/r/1617783353-48249-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The return statements at the end of a void function is meaningless.
Link: https://lore.kernel.org/r/1617783353-48249-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It's better to use __func__ than a fixed string to print a function's
name.
Link: https://lore.kernel.org/r/1617783353-48249-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
This pr contains changes from mlx5-next branch,
already reviewed on netdev and rdma mailing lists, links below.
1) From Leon, Dynamically assign MSI-X vectors count
Already Acked by Bjorn Helgaas.
https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/
2) Cleanup series:
https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/
From Mark, E-Switch cleanups and refactoring, and the addition
of single FDB mode needed HW bits.
From Mikhael, Remove unused struct field
From Saeed, Cleanup W=1 prototype warning
From Zheng, Esw related cleanup
From Tariq, User order-0 page allocation for EQs
====================
* mlx5-next:
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
net/mlx5: Dynamically assign MSI-X vectors count
net/mlx5: Add dynamic MSI-X capabilities bits
PCI/IOV: Add sysfs MSI-X vector assignment interface
net/mlx5: Use order-0 allocations for EQs
net/mlx5: Add IFC bits needed for single FDB mode
net/mlx5: E-Switch, Refactor send to vport to be more generic
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
net/mlx5: E-Switch, Add eswitch pointer to each representor
net/mlx5: E-Switch, Add match on vhca id to default send rules
net/mlx5: Remove unused mlx5_core_health member recover_work
net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
net/mlx5: Cleanup prototype warning
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Conflicts:
MAINTAINERS
- keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
- simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
- trivial
include/linux/ethtool.h
- trivial, fix kdoc while at it
include/linux/skmsg.h
- move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
- add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
- trivial
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-next 2021-04-09
This pr contains changes from mlx5-next branch,
already reviewed on netdev and rdma mailing lists, links below.
1) From Leon, Dynamically assign MSI-X vectors count
Already Acked by Bjorn Helgaas.
https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/
2) Cleanup series:
https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/
From Mark, E-Switch cleanups and refactoring, and the addition
of single FDB mode needed HW bits.
From Mikhael, Remove unused struct field
From Saeed, Cleanup W=1 prototype warning
From Zheng, Esw related cleanup
From Tariq, User order-0 page allocation for EQs
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
net/mlx5: Dynamically assign MSI-X vectors count
net/mlx5: Add dynamic MSI-X capabilities bits
PCI/IOV: Add sysfs MSI-X vector assignment interface
net/mlx5: Use order-0 allocations for EQs
net/mlx5: Add IFC bits needed for single FDB mode
net/mlx5: E-Switch, Refactor send to vport to be more generic
RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
net/mlx5: E-Switch, Add eswitch pointer to each representor
net/mlx5: E-Switch, Add match on vhca id to default send rules
net/mlx5: Remove unused mlx5_core_health member recover_work
net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
net/mlx5: Cleanup prototype warning
====================
Link: https://lore.kernel.org/r/20210409200704.10886-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_sort() internally casts the comparison function passed to it
to a different type with constant struct list_head pointers, and
uses this pointer to call the functions, which trips indirect call
Control-Flow Integrity (CFI) checking.
Instead of removing the consts, this change defines the
list_cmp_func_t type and changes the comparison function types of
all list_sort() callers to use const pointers, thus avoiding type
mismatches.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
|
|
The nla_len() is less than or equal to 16. If it's less than 16 then end
of the "gid" buffer is uninitialized.
Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload")
Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Replace BUILD_BUG_ON_ZERO() with BUILD_BUG_ON() to avoid sparse
complaining "restricted __le32 degrades to integer".
Link: https://lore.kernel.org/r/1617354454-47840-10-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use "hr_reg_write" replace "roce_set_filed".
Link: https://lore.kernel.org/r/1617354454-47840-9-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
A field to distuiguish basic SRQ from XRC SRQ in SRQC and a field in QPC
to determine whether a QP is XRC TGT QP or XRC INI QP are missing.
Fixes: 32548870d438 ("RDMA/hns: Add support for XRC on HIP09")
Link: https://lore.kernel.org/r/1617354454-47840-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The hns ROCEE does not support UC QP currently.
Link: https://lore.kernel.org/r/1617354454-47840-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Some structure members in hns_roce_hw have never been used and need to be
deleted.
Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Fixes: b156269d88e4 ("RDMA/hns: Add modify CQ support for hip08")
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1617354454-47840-6-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The hardware supports only two types of abnormal interrupts.
Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The register value related to the eq interrupt depends only on
enable_flag, so the redundant condition judgment is deleted.
Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-4-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When querying QP, the ULPs should be informed of the max length of inline
data supported by the hardware.
Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")
Link: https://lore.kernel.org/r/1617354454-47840-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
RQ inline is not supported on UD/GSI QP, it should be disabled in QPC.
Link: https://lore.kernel.org/r/1617354454-47840-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use ratelimited print in mbox and cmq. And print mailbox operation if
mailbox fails because it's useful information for the user.
Link: https://lore.kernel.org/r/1617262341-37571-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add error code definition according to the return code from firmware to
help find out more detailed reasons why a command fails to be sent.
Link: https://lore.kernel.org/r/1617262341-37571-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix error of cmd's context number calculation algorithm to enable all of
32 cmd entries and support 32 concurrent accesses.
Link: https://lore.kernel.org/r/1617262341-37571-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All responder errors from request packets that do not consume a receive
WQE fail to generate acks for RC QPs. This patch corrects this behavior
by making the flow follow the same path as request packets that do consume
a WQE after the completion.
Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com
Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
As INI QP does not require a recv_cq, avoid the following null pointer
dereference by checking if the qp_type is not INI before trying to extract
the recv_cq.
BUG: kernel NULL pointer dereference, address: 00000000000000e0
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 54250 Comm: mpitests-IMB-MP Not tainted 5.12.0-rc5 #1
Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.7.0 08/19/2019
RIP: 0010:qedr_create_qp+0x378/0x820 [qedr]
Code: 02 00 00 50 e8 29 d4 a9 d1 48 83 c4 18 e9 65 fe ff ff 48 8b 53 10 48 8b 43 18 44 8b 82 e0 00 00 00 45 85 c0 0f 84 10 74 00 00 <8b> b8 e0 00 00 00 85 ff 0f 85 50 fd ff ff e9 fd 73 00 00 48 8d bd
RSP: 0018:ffff9c8f056f7a70 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff9c8f056f7b58 RCX: 0000000000000009
RDX: ffff8c41a9744c00 RSI: ffff9c8f056f7b58 RDI: ffff8c41c0dfa280
RBP: ffff8c41c0dfa280 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000000 R11: ffff8c41e06fc608 R12: ffff8c4194052000
R13: 0000000000000000 R14: ffff8c4191546070 R15: ffff8c41c0dfa280
FS: 00007f78b2787b80(0000) GS:ffff8c43a3200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e0 CR3: 00000001011d6002 CR4: 00000000001706f0
Call Trace:
ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x4e4/0xb90 [ib_uverbs]
? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
ib_uverbs_run_method+0x6f6/0x7a0 [ib_uverbs]
? ib_uverbs_handler_UVERBS_METHOD_QP_DESTROY+0x70/0x70 [ib_uverbs]
? __cond_resched+0x15/0x30
? __kmalloc+0x5a/0x440
ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
? xa_load+0x6e/0x90
? cred_has_capability+0x7c/0x130
? avc_has_extended_perms+0x17f/0x440
? vma_link+0xae/0xb0
? vma_set_page_prot+0x2a/0x60
? mmap_region+0x298/0x6c0
? do_mmap+0x373/0x520
? selinux_file_ioctl+0x17f/0x220
ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
__x64_sys_ioctl+0x84/0xc0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f78b120262b
Fixes: 06e8d1df46ed ("RDMA/qedr: Add support for user mode XRC-SRQ's")
Link: https://lore.kernel.org/r/20210404125501.154789-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Local invalidate is also a kind of memory management operation, not only
memory bind operation. Furthermore, as invalidate operations include local
and remote, add prefix to the prompt message to make it clearer.
Link: https://lore.kernel.org/r/1617698772-13871-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
A spin lock is taken here so we should use GFP_ATOMIC.
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Link: https://lore.kernel.org/r/20210407154900.3486268-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds). This
causes the calling application to hang for a long time, especially when
the system is fragmented. To avoid these long latency spikes, the calls
the higher order allocations need to fail faster in case they are not
available.
In order to acheive this we need __GFP_NORETRY flag in the gfp_mask before
during fetching the free pages. Allow the algorithm to automatically fall
back to smaller page sizes.
Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com
Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Remove the unused function sdma_iowait_schedule().
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The code currently assumes that the mmu_notifier struct
embedded in mmu_rb_handler only contains two fields.
There are now extra fields:
struct mmu_notifier {
struct hlist_node hlist;
const struct mmu_notifier_ops *ops;
struct mm_struct *mm;
struct rcu_head rcu;
unsigned int users;
};
Given that there in no init for the mmu_notifier, a kzalloc() should
be used to insure that any newly added fields are given a predictable
initial value of zero.
Fixes: 06e0ffa69312 ("IB/hfi1: Re-factor MMU notification code")
Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add traces that were vital in isolating an issue with pq waitlist in
commit fa8dac396863 ("IB/hfi1: Fix another case where pq is left on
waitlist")
Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add.
Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send().
Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The call is from an ISR context and napi_schedule_irqoff() can be used.
Change the call to the more efficient type.
Link: https://lore.kernel.org/r/1617026056-50483-5-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The completion ring for tx is using the wrong size to size the ring,
oversizing the ring by two orders of magniture.
Correct the allocation size and use kcalloc_node() to allocate the ring.
Fix mistaken GFP defines in similar allocations.
Link: https://lore.kernel.org/r/1617026056-50483-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The current rdma_netdev handling in ipoib hooks the tx_timeout handler,
but prints out a totally useless message that prevents effective debugging
especially when multiple transmit queues are being used.
Add a tx_timeout rdma_netdev hook and implement the callback in the hfi1
to print additional information.
The existing non-helpful message is avoided when the driver has presented
a callback.
Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|