|
Any mkey that is not enabled and assigned to userspace should have the PD
set to a kernel owned PD.
When cache entries are created for the first time the PDN is set to 0,
which is probably a kernel PD, but be explicit.
When an MR is registered using the hybrid reg_create with UMR xlt & enable,
the disabled mkey points at the user PD; keep it pointing at the kernel PD
until a UMR enables it and sets the user PD.
Fixes: 9ec4483a3f0f ("IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache")
Link: https://lore.kernel.org/r/20200914112653.345244-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
reg_create() open codes this helper; use the shared code.
Link: https://lore.kernel.org/r/20200914112653.345244-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
alloc_mr_from_cache() no longer returns EAGAIN, so this is now dead code.
Fixes: aad719dcf379 ("RDMA/mlx5: Allow MRs to be created in the cache synchronously")
Link: https://lore.kernel.org/r/20200914112653.345244-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to
an invalid value instead of being left uninitialized.
Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Fixes: a2f3d4479fe9 ("RDMA/hns: Avoid unncessary initialization")
Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
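A minimal sketch of the shape of the fix, assuming the hip06 constants
HNS_ROCE_WQE_OPCODE_MASK and HNS_ROCE_WQE_OPCODE_RDMA_READ (the surrounding
structure is illustrative):

    /* Start from an invalid opcode: hip06 cannot execute
     * IB_WR_LOCAL_INV, so an unsupported WR must be rejected instead
     * of sending whatever value was left uninitialized on the stack. */
    u32 ps_opcode = HNS_ROCE_WQE_OPCODE_MASK;

    switch (wr->opcode) {
    case IB_WR_RDMA_READ:
            ps_opcode = HNS_ROCE_WQE_OPCODE_RDMA_READ;
            break;
    /* ... other supported opcodes ... */
    }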
|
|
When xa_insert() fails, the acquired resources in qedr_create_qp() should
also be freed. However, the current implementation does not handle this
error.
Fix this by adding a new goto label that calls qedr_free_qp_resources.
Fixes: 1212767e23bb ("qedr: Add wrapping generic structure for qpidr and adjust idr routines.")
Link: https://lore.kernel.org/r/20200911125159.4577-1-keitasuzuki.park@sslab.ics.keio.ac.jp
Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
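A sketch of the corrected error path, with illustrative label and parameter
names; the point is that the xa_insert() failure now funnels into the same
teardown as later failures:

    rc = xa_insert(&dev->qps, qp->qp_id, qp, GFP_KERNEL);
    if (rc)
            goto err_free_qp_resources;
    /* ... remaining setup ... */
    return 0;

    err_free_qp_resources:
            qedr_free_qp_resources(dev, qp, udata);
            return rc;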
|
|
Receive side delayed ack mode is needed only for certain area
networks/connections. Therefore disable it by default.
Link: https://lore.kernel.org/r/20200909153707.2795-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
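The change amounts to flipping a module-parameter default; a sketch using
the cxgb4 parameter name (treat the details as an assumption):

    static int dack_mode;   /* was: = 1 */
    module_param(dack_mode, int, 0644);
    MODULE_PARM_DESC(dack_mode, "Delayed ack mode (default = 0)");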
|
|
Leon Romanovsky says:
====================
This series from Alex extends the software steering interface to support
devices with the extra capability "sw_owner_v2", which will replace the
existing "sw_owner".
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_sw_owner_v2:
RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devices
RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devices
RDMA/mlx5: Add sw_owner_v2 bit capability
|
|
Leon Romanovsky says:
====================
IBTA declares speed as 16 bits, but the kernel stores it in a u8. This
series fixes the in-kernel declaration while keeping the external
interface intact.
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_active_speed':
RDMA: Fix link active_speed size
RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
net/mlx5: Refactor query port speed functions
|
|
According to the IB spec, active_speed should be u16 and not u8 as before.
Change it to allow further extensions in offered speeds.
Link: https://lore.kernel.org/r/20200917090223.1018224-4-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Expose the ICM address needed to access TIR and QP; this allows
sw_owner_v2 devices to steer traffic to TIRs and QPs the same as is done
with the sw_owner capability.
Link: https://lore.kernel.org/r/20200903073857.1129166-4-leon@kernel.org
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
sw_owner_v2 will replace sw_owner for future devices. This means that if
sw_owner_v2 is set, sw_owner should be ignored, and DM allocation is
required for sw_owner_v2 devices to function.
Link: https://lore.kernel.org/r/20200903073857.1129166-3-leon@kernel.org
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Move struct ib_rwq_ind_table allocation to ib_core.
Link: https://lore.kernel.org/r/20200902081623.746359-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Move allocation and destruction of memory windows under ib_core
responsibility and clean drivers to ensure that no updates to MW
ib_core structures are done in driver layer.
Link: https://lore.kernel.org/r/20200902081623.746359-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Combine two identical enums to avoid duplication.
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The functions mlx5_query_port_link_width_oper and
mlx5_query_port_ib_proto_oper are always called together, so combine them
into a new function called mlx5_query_port_oper to avoid duplication.
Since mlx5i_get_port_settings is now the same as mlx5_query_port_oper,
remove it.
According to the IB spec, link_width_oper and ib_proto_oper should be u16
and not u8 as currently written, so add casts as preparation for a
cross-RDMA patch that will fix the type for all drivers in the RDMA
subsystem.
Fixes: ada68c31ba9c ("net/mlx5: Introduce a new header file for physical port functions")
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The roce path triggers a work queue that continues to touch the id_priv
but doesn't hold any reference on it. Further, unlike in the IB case, the
work queue is not fenced during rdma_destroy_id().
This can trigger a use-after-free if a destroy is triggered in the
incredibly narrow window between the queue_work() and the work starting
and obtaining the handler_mutex.
The only purpose of this work queue is to run the ULP event callback from
the standard context, so switch the design to use the existing
cma_work_handler() scheme. This simplifies quite a lot of the flow:
- Use the cma_work_handler() callback to launch the work for roce. This
requires generating the event synchronously inside the
rdma_join_multicast(), which in turn means the dummy struct
ib_sa_multicast can become a simple stack variable.
- cma_work_handler() used the id_priv kref, so we can entirely eliminate
the kref inside struct cma_multicast. Since the cma_multicast never
leaks into an unprotected work queue the kfree can be done at the same
time as for IB.
- Eliminating the general multicast.ib requires using cma_set_mgid() in a
few places to recompute the mgid.
Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices")
Link: https://lore.kernel.org/r/20200902081122.745412-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
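For reference, a sketch of the cma_work_handler() scheme the roce path now
shares; the struct mirrors the one in cma.c, and the queued work holds a
reference on id_priv taken before queue_work():

    struct cma_work {
            struct work_struct work;
            struct rdma_id_private *id;   /* reference held until the work runs */
            enum rdma_cm_state old_state; /* 0 == no state transition */
            enum rdma_cm_state new_state;
            struct rdma_cm_event event;
    };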
|
|
Two places were open coding this sequence; also pull in
cma_leave_roce_mc_group(), which was called only once.
Link: https://lore.kernel.org/r/20200902081122.745412-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is no kernel user of RDMA CM multicast so this code managing the
multicast subscription of the kernel-only internal QP is dead. Remove it.
This makes the bug fixes in the next patches much simpler.
Link: https://lore.kernel.org/r/20200902081122.745412-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
These are the same thing, except that cma_ndev_work doesn't have a state
transition. Signal no state transition by setting old_state and new_state
== 0.
In all cases the handler function should not be called once
rdma_destroy_id() has progressed past setting the state.
Link: https://lore.kernel.org/r/20200902081122.745412-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The only place that still uses it is rdma_join_multicast() which is only
doing a sanity check that the caller hasn't done something wrong and
doesn't need the spinlock.
At least in the case of rdma_join_multicast() the information it needs
will remain until the ID is destroyed once it enters these
states. Similarly there is no reason to check for these specific states in
the handler callback, instead use the usual check for a destroyed id under
the handler_mutex.
Link: https://lore.kernel.org/r/20200902081122.745412-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is a strange unlocked read of the ID state when checking for
reuseaddr. This is because an ID cannot be reusable once it becomes a
listening ID. Instead of using the state to exclude reuse, just clear it
as part of rdma_listen()'s flow to convert reusable into not reusable.
Once an ID goes to listen there is no way back out, and the only use of
reusable is in the bind_list check.
Finally, update the checks under handler_mutex to use READ_ONCE and audit
that once RDMA_CM_LISTEN is observed in a req callback it is stable under
the handler_mutex.
Link: https://lore.kernel.org/r/20200902081122.745412-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
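A sketch of the resulting pattern in a req callback (error value
illustrative): the state is only trusted once it has been observed under
handler_mutex:

    mutex_lock(&listen_id->handler_mutex);
    /* Once RDMA_CM_LISTEN is seen under handler_mutex it is stable:
     * an ID that has gone to listen never leaves that state. */
    if (READ_ONCE(listen_id->state) != RDMA_CM_LISTEN) {
            ret = -ECONNABORTED;
            goto err_unlock;
    }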
|
|
Re-organize things so the state variable is not read unlocked. The first
attempt to go directly from ADDR_BOUND immediately tells us if the ID is
already bound; if we can't do that, then the attempt inside
rdma_bind_addr() to go from IDLE to ADDR_BOUND confirms the ID needs
binding.
Link: https://lore.kernel.org/r/20200902081122.745412-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It is currently a bit confusing, but the design is that if the
handler_mutex is held and the state is RDMA_CM_CONNECT, then the state
cannot leave RDMA_CM_CONNECT without also serializing with the
handler_mutex.
Make this clearer by adding a direct assertion, fixing the usage in
rdma_connect and generally using READ_ONCE to read the state value.
Link: https://lore.kernel.org/r/20200902081122.745412-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
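A sketch of the invariant being asserted (the patch adds an equivalent
check in cma.c):

    lockdep_assert_held(&id_priv->handler_mutex);
    /* While handler_mutex is held, RDMA_CM_CONNECT cannot change under
     * us; any transition out of it must also take the mutex. */
    WARN_ON(READ_ONCE(id_priv->state) != RDMA_CM_CONNECT);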
|
|
Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.
Link: https://lore.kernel.org/r/20200916025022.3992627-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
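For reference, DEFINE_SEQ_ATTRIBUTE (from <linux/seq_file.h>) generates
the _open() function and the file_operations from a seq_operations named
<name>_sops; a sketch with illustrative names:

    static const struct seq_operations stats_sops = {
            .start = stats_start,
            .next  = stats_next,
            .stop  = stats_stop,
            .show  = stats_show,
    };
    DEFINE_SEQ_ATTRIBUTE(stats);  /* emits stats_open() and stats_fops */
    /* registration then just references &stats_fops */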
|
|
hw_context stores a pci_device pointer. Use the correct type instead of
void * and avoid typecasts.
Link: https://lore.kernel.org/r/20200914123528.270382-1-parav@mellanox.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Implement the XRC specific verbs. The additional QP type introduced new
logic to the rest of the verbs that now require distinguishing whether a
QP has an "RQ" or an "SQ" or both.
Link: https://lore.kernel.org/r/20200722102339.30104-1-ybason@marvell.com
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
rxe holds a refcount on the IB device as long as CQ objects exist. This
causes destruction of a rxe device to hang if the CQ pool has any cached
CQs, since they are destroyed only after the refcount goes to zero.
Treat the CQ pool like a client and create/destroy it before/after all
other clients. No users of CQ pool can exist past a client remove call.
Link: https://lore.kernel.org/r/e8a240aa-9e9b-3dca-062f-9130b787f29b@acm.org
Fixes: c7ff819aefea ("RDMA/core: Introduce shared CQ pool API")
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Pull rdma fixes from Jason Gunthorpe:
"A number of driver bug fixes and a few recent regressions:
- Several bug fixes for bnxt_re. Crashing, incorrect data reported,
and corruption on new HW
- Memory leak and crash in rxe
- Fix sysfs corruption in rxe if the netdev name is too long
- Fix a crash on error unwind in the new cq_pool code
- Fix kobject panics in rtrs by working device lifetime properly
- Fix a data corruption bug in iser target related to misaligned
buffers"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/isert: Fix unaligned immediate-data handling
RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
RDMA/core: Fix reported speed and width
RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ
RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
RDMA/bnxt_re: Restrict the max_gids to 256
RDMA/bnxt_re: Static NQ depth allocation
RDMA/bnxt_re: Fix the qp table indexing
RDMA/bnxt_re: Do not report transparent vlan from QP1
RDMA/mlx4: Read pkey table length instead of hardcoded value
RDMA/rxe: Fix panic when calling kmem_cache_create()
RDMA/rxe: Fix memleak in rxe_mem_init_user
RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars
RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
|
|
Alignment of parameters was off by one
Link: https://lore.kernel.org/r/20200902165741.8355-9-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
commit 59e8970b3798 ("RDMA/qedr: Return max inline data in QP query
result") changed the query_qp max_inline size to return the max RoCE
inline size. When iWARP was introduced, this should have been modified to
return the max inline size based on the protocol. This size is cached in
the device attributes.
Fixes: 69ad0e7fe845 ("RDMA/qedr: Add support for iWARP in user space")
Link: https://lore.kernel.org/r/20200902165741.8355-8-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently iWARP does not support MTU changes. Notify the user when the
MTU changes that a reload of qedr is required for the new MTU to take
effect, and display the correct active MTU.
Fixes: f5b1b1775be6 ("RDMA/qedr: Add iWARP support in existing verbs")
Link: https://lore.kernel.org/r/20200902165741.8355-7-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
In iWARP, accept could be called after a QP is already destroyed. In this
case an error should be returned rather than success.
Fixes: 82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr")
Link: https://lore.kernel.org/r/20200902165741.8355-5-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
dev->attr.page_size_caps was used uninitialized when setting device
attributes.
Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init")
Link: https://lore.kernel.org/r/20200902165741.8355-4-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Change the doorbell setting so that the maximum of the last and the
current value is written. This avoids doorbells being lost.
Fixes: a7efd7773e31 ("qedr: Add support for PD,PKEY and CQ verbs")
Link: https://lore.kernel.org/r/20200902165741.8355-3-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
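A sketch of the idea with illustrative field names; a racing older update
must never wind the doorbell backwards:

    /* Ring with the max of the last armed value and the new one. */
    cq->db_val = max_t(u32, cq->db_val, new_cons);
    writel(cq->db_val, cq->db_addr);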
|
|
The qedr_qp structure wasn't freed when the protocol was RoCE. kmemleak
output when running a basic RoCE scenario:
unreferenced object 0xffff927ad7e22c00 (size 1024):
comm "ib_send_bw", pid 7082, jiffies 4384133693 (age 274.698s)
hex dump (first 32 bytes):
00 b0 cd a2 79 92 ff ff 00 3f a1 a2 79 92 ff ff ....y....?..y...
00 ee 5c dd 80 92 ff ff 00 f6 5c dd 80 92 ff ff ..\.......\.....
backtrace:
[<00000000b2ba0f35>] qedr_create_qp+0xb3/0x6c0 [qedr]
[<00000000e85a43dd>] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x555/0xad0 [ib_uverbs]
[<00000000fee4d029>] ib_uverbs_cmd_verbs+0xa5a/0xb80 [ib_uverbs]
[<000000005d622660>] ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs]
[<00000000eb4cdc71>] ksys_ioctl+0x87/0xc0
[<00000000abe6b23a>] __x64_sys_ioctl+0x16/0x20
[<0000000046e7cef4>] do_syscall_64+0x4d/0x90
[<00000000c6948f76>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1212767e23bb ("qedr: Add wrapping generic structure for qpidr and adjust idr routines.")
Link: https://lore.kernel.org/r/20200902165741.8355-2-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This is always the same value as the IOVA masked by the page size; just
use that clearer calculation directly.
It is unclear if ocrdma hardware can actually support a true FBO; if so,
it could use a different algorithm to compute the best page size.
Link: https://lore.kernel.org/r/17-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
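The replacement calculation is a one-liner, sketched here assuming the
umem->iova field introduced earlier in this series:

    /* The first byte offset is just the low bits of the IOVA for the
     * chosen HW page size. */
    u64 fbo = umem->iova & (page_size - 1);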
|
|
zbva is always false, so fbo is never read.
A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
supports this.
Link: https://lore.kernel.org/r/16-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
For the calls linked to mlx4_ib_umem_calc_optimal_mtt_size() use
ib_umem_num_dma_blocks() inside the function; it is just some weird static
default.
All other places are just using it with PAGE_SIZE, so switch them to
ib_umem_num_dma_blocks().
As this is the last call site, remove ib_umem_page_count().
Link: https://lore.kernel.org/r/15-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This driver always uses PAGE_SIZE.
Link: https://lore.kernel.org/r/14-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This driver always uses a DMA array made up of PAGE_SIZE elements, so just
use ib_umem_num_dma_blocks().
Since rdma_for_each_dma_block() always iterates exactly
ib_umem_num_dma_blocks() there is no need for the early exit check in
build_user_pbes(), delete it.
Link: https://lore.kernel.org/r/13-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mtr_umem_page_count() does the same thing; replace it with the core code.
Also, ib_umem_find_best_pgsz() should always be called to check that the
umem meets the page_size requirement. If there is a limited set of
page_sizes that work, the pgsz_bitmap should be set to that set. A return
of 0 is a failure and the umem cannot be used.
Lightly tidy the control flow to implement this flow properly.
Link: https://lore.kernel.org/r/12-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
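The required calling pattern, sketched below; a return of 0 means no
supported page size can map this umem and registration must fail:

    page_size = ib_umem_find_best_pgsz(umem, pgsz_bitmap, virt_addr);
    if (!page_size)
            return -EINVAL;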
|
|
ib_umem_page_count() returns the number of 4k entries required for a DMA
map, but bnxt_re already computes a variable page size. The correct API to
determine the size of the page table array is ib_umem_num_dma_blocks().
Fix the overallocation of the page array in fill_umem_pbl_tbl() when
working with larger page sizes by using the right function. Lightly
re-organize this function to make it clearer.
Replace the other calls to ib_umem_num_pages().
Fixes: d85582517e91 ("RDMA/bnxt_re: Use core helpers to get aligned DMA address")
Link: https://lore.kernel.org/r/11-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The length of the list populated by qedr_populate_pbls() should be
calculated using ib_umem_num_dma_blocks() with the same size/shift passed
to qedr_populate_pbls().
Link: https://lore.kernel.org/r/10-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This loop is splitting the DMA SGL into pg_shift sized pages, use the core
code for this directly.
Link: https://lore.kernel.org/r/9-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If ib_umem_find_best_pgsz() returns > PAGE_SIZE then the equation here is
not correct. 'start' should be 'virt'. Change it to use the core code for
page_num and the canonical calculation of page_shift.
Fixes: eb52c0333f06 ("RDMA/i40iw: Use core helpers to get aligned DMA address within a supported page size")
Link: https://lore.kernel.org/r/8-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
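A sketch of the canonical sequence (the supported-size bitmap is
illustrative): the page size is chosen against the MR's virt, then the
count and shift derive from it:

    page_size = ib_umem_find_best_pgsz(umem, SZ_4K | SZ_2M, virt);
    if (!page_size)
            return -EOPNOTSUPP;
    page_num = ib_umem_num_dma_blocks(umem, page_size);
    page_shift = order_base_2(page_size);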
|
|
If ib_umem_find_best_pgsz() returns > PAGE_SIZE then the equation here is
not correct. 'start' should be 'virt'. Change it to use the core code for
page_num and the canonical calculation of page_shift.
Fixes: 40ddb3f02083 ("RDMA/efa: Use API to get contiguous memory blocks aligned to device supported page size")
Link: https://lore.kernel.org/r/7-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
ib_umem_num_pages() should only be used by things working with the SGL in
CPU pages directly.
Drivers building DMA lists should use the new ib_umem_num_dma_blocks()
which returns the number of blocks rdma_umem_for_each_dma_block() will
return.
To make this general for DMA drivers requires a different implementation.
Computing DMA block count based on umem->address only works if the
requested page size is < PAGE_SIZE and/or the IOVA == umem->address.
Instead the number of DMA pages should be computed in the IOVA address
space, not umem->address. Thus the IOVA has to be stored inside the umem
so it can be used for these calculations.
For now set it to umem->address by default and fix it up if
ib_umem_find_best_pgsz() was called. This allows drivers to be converted
to ib_umem_num_dma_blocks() safely.
Link: https://lore.kernel.org/r/6-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
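The helper then computes the block count in IOVA space, essentially:

    static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
                                                unsigned long pgsz)
    {
            /* Round the IOVA range out to pgsz boundaries; umem->iova
             * defaults to umem->address until ib_umem_find_best_pgsz()
             * fixes it up. */
            return (size_t)((ALIGN(umem->iova + umem->length, pgsz) -
                             ALIGN_DOWN(umem->iova, pgsz))) / pgsz;
    }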
|
|
Generally drivers should be using this core helper to split up the umem
into DMA pages.
These drivers are all probably wrong in some way to pass PAGE_SIZE in as
the HW page size. Either the driver doesn't support other page sizes and
it should use 4096, or the driver does support other page sizes and should
use ib_umem_find_best_pgsz() to select the best HW page size from the
HW-supported set.
The only case it could be correct is if the HW has a global setting for
PAGE_SIZE set at driver initialization time.
Link: https://lore.kernel.org/r/5-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This helper does the same as rdma_for_each_block(), except it works on a
umem. This simplifies most of the call sites.
Link: https://lore.kernel.org/r/4-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
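A sketch of the typical converted call site: fill a page table with one
DMA address per HW-sized block (pbl and page_size are the driver's own):

    struct ib_block_iter biter;
    unsigned int i = 0;

    rdma_umem_for_each_dma_block(umem, &biter, page_size)
            pbl[i++] = rdma_block_iter_dma_address(&biter);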
|
|
The calculation in rdma_find_pg_bit() is fairly complicated, and the
function is never called anywhere else. Inline a simpler version into
ib_umem_find_best_pgsz().
Link: https://lore.kernel.org/r/3-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|