Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 02e7d139e5e24abb5fde91934fc9dc0344ac1926 upstream.
Change the key that we send from IB driver to EN driver regarding the
MPV device affiliation, since at that stage the IB device is not yet
initialized, so its index would be zero for different IB devices and
cause wrong associations between unrelated master and slave devices.
Instead use a unique value from inside the core device which is already
initialized at this stage.
Fixes: 0d293714ac32 ("RDMA/mlx5: Send events from IB driver about device affiliation state")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/ac7e66357d963fc68d7a419515180212c96d137d.1697705185.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0d293714ac32650bfb669ceadf7cc2fad8161401 ]
Send blocking events from IB driver whenever the device is done being
affiliated or if it is removed from an affiliation.
This is useful since now the EN driver can register to those event and
know when a device is affiliated or not.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/a7491c3e483cfd8d962f5f75b9a25f253043384a.1695296682.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Stable-dep-of: 762a55a54eec ("net/mlx5e: Disable IPsec offload support if not FW steering")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e3e82fcb79eeb3f1a88a89f676831773caff514a ]
When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().
PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0"
#0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
#3 [fffffe0000549eb8] do_nmi at ffffffff81079582
#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
[exception RIP: native_queued_spin_lock_slowpath+1291]
RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046
RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0
RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa
R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000
R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
-- <NMI exception stack> --
#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
#15 [ffff88aa841efb88] device_del at ffffffff82179d23
#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
#20 [ffff88aa841eff10] kthread at ffffffff811d87a0
#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 03769f72d66edab82484449ed594cb6b00ae0223 ]
Virtual QP and CQ require a 4K HW page size but the driver passes
PAGE_SIZE to ib_umem_find_best_pgsz() instead.
Fix this by using the appropriate 4k value in the bitmap passed to
ib_umem_find_best_pgsz().
Fixes: 693a5386eff0 ("RDMA/irdma: Split mr alloc and free into new functions")
Link: https://lore.kernel.org/r/20231129202143.1434-4-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a5ec366de7e94192669ba08de6ed336607fd282 ]
The SQ is shared for between kernel and used by storing the kernel page
pointer and passing that to a kmap_atomic().
This then requires that the alignment is PAGE_SIZE aligned.
Fix by adding an iWarp specific alignment check.
Fixes: e965ef0e7b2c ("RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp")
Link: https://lore.kernel.org/r/20231129202143.1434-3-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4fbc3a52cd4d14de3793f4b2c721d7306ea84cf9 ]
64k pages introduce the situation in this diagram when the HCA 4k page
size is being used:
+-------------------------------------------+ <--- 64k aligned VA
| |
| HCA 4k page |
| |
+-------------------------------------------+
| o |
| |
| o |
| |
| o |
+-------------------------------------------+
| |
| HCA 4k page |
| |
+-------------------------------------------+ <--- Live HCA page
|OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO| <--- offset
| | <--- VA
| MR data |
+-------------------------------------------+
| |
| HCA 4k page |
| |
+-------------------------------------------+
| o |
| |
| o |
| |
| o |
+-------------------------------------------+
| |
| HCA 4k page |
| |
+-------------------------------------------+
The VA addresses are coming from rdma-core in this diagram can be
arbitrary, but for 64k pages, the VA may be offset by some number of HCA
4k pages and followed by some number of HCA 4k pages.
The current iterator doesn't account for either the preceding 4k pages or
the following 4k pages.
Fix the issue by extending the ib_block_iter to contain the number of DMA
pages like comment [1] says and by using __sg_advance to start the
iterator at the first live HCA page.
The changes are contained in a parallel set of iterator start and next
functions that are umem aware and specific to umem since there is one user
of the rdma_for_each_block() without umem.
These two fixes prevents the extra pages before and after the user MR
data.
Fix the preceding pages by using the __sq_advance field to start at the
first 4k page containing MR data.
Fix the following pages by saving the number of pgsz blocks in the
iterator state and downcounting on each next.
This fix allows for the elimination of the small page crutch noted in the
Fixes.
Fixes: 10c75ccb54e4 ("RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()")
Link: https://lore.kernel.org/r/20231129202143.1434-2-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2b78832f50c4d711e161b166d7d8790968051546 ]
When removing the irdma driver or unplugging its aux device, the ccq
queue is released before destorying the cqp_cmpl_wq queue.
But in the window, there may still be completion events for wqes. That
will cause a UAF in irdma_sc_ccq_get_cqe_info().
[34693.333191] BUG: KASAN: use-after-free in irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma]
[34693.333194] Read of size 8 at addr ffff889097f80818 by task kworker/u67:1/26327
[34693.333194]
[34693.333199] CPU: 9 PID: 26327 Comm: kworker/u67:1 Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1
[34693.333200] Hardware name: SANGFOR Inspur/NULL, BIOS 4.1.13 08/01/2016
[34693.333211] Workqueue: cqp_cmpl_wq cqp_compl_worker [irdma]
[34693.333213] Call Trace:
[34693.333220] dump_stack+0x71/0xab
[34693.333226] print_address_description+0x6b/0x290
[34693.333238] ? irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma]
[34693.333240] kasan_report+0x14a/0x2b0
[34693.333251] irdma_sc_ccq_get_cqe_info+0x82f/0x8c0 [irdma]
[34693.333264] ? irdma_free_cqp_request+0x151/0x1e0 [irdma]
[34693.333274] irdma_cqp_ce_handler+0x1fb/0x3b0 [irdma]
[34693.333285] ? irdma_ctrl_init_hw+0x2c20/0x2c20 [irdma]
[34693.333290] ? __schedule+0x836/0x1570
[34693.333293] ? strscpy+0x83/0x180
[34693.333296] process_one_work+0x56a/0x11f0
[34693.333298] worker_thread+0x8f/0xf40
[34693.333301] ? __kthread_parkme+0x78/0xf0
[34693.333303] ? rescuer_thread+0xc50/0xc50
[34693.333305] kthread+0x2a0/0x390
[34693.333308] ? kthread_destroy_worker+0x90/0x90
[34693.333310] ret_from_fork+0x1f/0x40
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Shifeng Li <lishifeng1992@126.com>
Link: https://lore.kernel.org/r/20231121101236.581694-1-lishifeng1992@126.com
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 422b19f7f006e813ee0865aadce6a62b3c263c42 ]
The word "Driver" is repeated twice in the "modinfo bnxt_re"
output description. Fix it.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1700555387-6277-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0c8bb6eb70ca41031f663b4481aac9ac78b53bc6 ]
As we chain the WR during write request: memory registration,
rdma write, local invalidate, if only the last WR fail to send due
to send queue overrun, the server can send back the reply, while
client mark the req->in_use to false in case of error in rtrs_clt_req
when error out from rtrs_post_rdma_write_sg.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-8-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6d09f6f7d7584e099633282ea915988914f86529 ]
For each write request, we need Request, Response Memory Registration,
Local Invalidate.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-7-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c4d32e77fc1006f99eeb78417efc3d81a384072a ]
Destroying path files may lead to the freeing of rdma_stats. This creates
the following race.
An IO is in-flight, or has just passed the session state check in
process_read/process_write. The close_work gets triggered and the function
rtrs_srv_close_work() starts and does destroy path which frees the
rdma_stats. After this the function process_read/process_write resumes and
tries to update the stats through the function rtrs_srv_update_rdma_stats
This commit solves the problem by moving the destroy path function to a
later point. This point makes sure any inflights are completed. This is
done by qp drain, and waiting for all in-flights through ops_id.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Santosh Kumar Pradhan <santosh.pradhan@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-6-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3a71cd6ca0ce33d1af019ecf1d7167406fa54400 ]
Since srv_mr->iu is allocated and used only when always_invalidate is
true, free it only when always_invalidate is true.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-5-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ed1e52aefa16f15dc2f04054a3baf11726a7460e ]
While processing info request, it could so happen that the srv_path goes
to CLOSING state, cause of any of the error events from RDMA. That state
change should be picked up while trying to change the state in
process_info_req, by checking the return value. In case the state change
call in process_info_req fails, we fail the processing.
We should also check the return value for rtrs_srv_path_up, since it
sends a link event to the client above, and the client can fail for any
reason.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-4-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3e44a61b5db873612e20e7b7922468d7d1ac2d22 ]
If we start hb too early, it will confuse server side to close
the session.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-3-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3ee7ecd712048ade6482bea4b2f3dcaf039c0348 ]
When IO is completed, rtrs can be called in softirq context,
unconditionally enabling irq could cause panic.
To be on safe side, use spin_lock_irqsave and spin_unlock_irqrestore
instread.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Florian-Ewald Mueller <florian-ewald.mueller@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Link: https://lore.kernel.org/r/20231120154146.920486-2-haris.iqbal@ionos.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bd6da690c27d75cae432c09162d054b34fa2156f ]
Currently, there is no wait for the QP suspend to complete on a modify
to SQD state. Add a wait, after the modify to SQD state, for the Suspend
Complete AE. While we are at it, update the suspend timeout value in
irdma_prep_tc_change to use IRDMA_EVENT_TIMEOUT_MS too.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231114170246.238-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ba12ab66aa83a2340a51ad6e74b284269745138c ]
Remove the modify to SQD before going to ERROR state. It is not needed.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231114170246.238-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
algorithm
[ Upstream commit efb9cbf66440482ceaa90493d648226ab7ec2ebf ]
Add a default congest control algorithm so that driver won't return
an error when the configured algorithm is invalid.
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231028093242.670325-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0550d4604e2ca4e653dc13f0c009fc42106b6bfc ]
KMSAN reported the following uninit-value access issue:
lo speed is unknown, defaulting to 1000
=====================================================
BUG: KMSAN: uninit-value in ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline]
BUG: KMSAN: uninit-value in ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998
ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline]
ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998
siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline]
siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203
setup_port_data drivers/infiniband/core/device.c:848 [inline]
setup_device drivers/infiniband/core/device.c:1244 [inline]
ib_register_device+0x1589/0x1df0 drivers/infiniband/core/device.c:1383
siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
siw_newlink+0x129e/0x13d0 drivers/infiniband/sw/siw/siw_main.c:490
nldev_newlink+0x8fd/0xa60 drivers/infiniband/core/nldev.c:1763
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0xe8a/0x1120 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x997/0xd60 net/socket.c:2588
___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
__sys_sendmsg net/socket.c:2671 [inline]
__do_sys_sendmsg net/socket.c:2680 [inline]
__se_sys_sendmsg net/socket.c:2678 [inline]
__x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Local variable lksettings created at:
ib_get_eth_speed+0x4b/0xaf0 drivers/infiniband/core/verbs.c:1974
siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline]
siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203
CPU: 0 PID: 11257 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
=====================================================
If __ethtool_get_link_ksettings() fails, `netdev_speed` is set to the
default value, SPEED_1000. In this case, if `lanes` field of struct
ethtool_link_ksettings is not initialized, an uninitialized value is passed
to ib_get_width_and_speed(). This causes the above issue. This patch
resolves the issue by initializing `lanes` to 0.
Fixes: cb06b6b3f6cb ("RDMA/core: Get IB width and speed from netdev")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://lore.kernel.org/r/20231108143113.1360567-1-syoshida@redhat.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8bf7187d978610b9e327a3d92728c8864a575ebd ]
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of
custom masking and shifting, and remove extract_width() which only
wraps that FIELD_GET().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230919125648.1920-2-ilpo.jarvinen@linux.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2ef422f063b74adcc4a4a9004b0a87bb55e0a836 ]
In the unlikely event that workqueue allocation fails and returns NULL in
mlx5_mkey_cache_init(), delete the call to
mlx5r_umr_resource_cleanup() (which frees the QP) in
mlx5_ib_stage_post_ib_reg_umr_init(). This will avoid attempted double
free of the same QP when __mlx5_ib_add() does its cleanup.
Resolves a splat:
Syzkaller reported a UAF in ib_destroy_qp_user
workqueue: Failed to create a rescuer kthread for wq "mkey_cache": -EINTR
infiniband mlx5_0: mlx5_mkey_cache_init:981:(pid 1642):
failed to create work queue
infiniband mlx5_0: mlx5_ib_stage_post_ib_reg_umr_init:4075:(pid 1642):
mr cache init failed -12
==================================================================
BUG: KASAN: slab-use-after-free in ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
Read of size 8 at addr ffff88810da310a8 by task repro_upstream/1642
Call Trace:
<TASK>
kasan_report (mm/kasan/report.c:590)
ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4178)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
</TASK>
Allocated by task 1642:
__kmalloc (./include/linux/kasan.h:198 mm/slab_common.c:1026
mm/slab_common.c:1039)
create_qp (./include/linux/slab.h:603 ./include/linux/slab.h:720
./include/rdma/ib_verbs.h:2795 drivers/infiniband/core/verbs.c:1209)
ib_create_qp_kernel (drivers/infiniband/core/verbs.c:1347)
mlx5r_umr_resource_init (drivers/infiniband/hw/mlx5/umr.c:164)
mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4070)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
Freed by task 1642:
__kmem_cache_free (mm/slub.c:1826 mm/slub.c:3809 mm/slub.c:3822)
ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2112)
mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4076
drivers/infiniband/hw/mlx5/main.c:4065)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
Fixes: 04876c12c19e ("RDMA/mlx5: Move init and cleanup of UMR to umr.c")
Link: https://lore.kernel.org/r/1698170518-4006-1-git-send-email-george.kennedy@oracle.com
Suggested-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d4b2d165714c0ce8777d5131f6e0aad617b7adc4 ]
Increase name array to be large enough to overcome the following
compilation error.
drivers/infiniband/hw/hfi1/efivar.c: In function ‘read_hfi1_efi_var’:
drivers/infiniband/hw/hfi1/efivar.c:124:44: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
124 | snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
| ^
drivers/infiniband/hw/hfi1/efivar.c:124:9: note: ‘snprintf’ output 2 or more bytes (assuming 65) into a destination of size 64
124 | snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/hfi1/efivar.c:133:52: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
133 | snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
| ^
drivers/infiniband/hw/hfi1/efivar.c:133:17: note: ‘snprintf’ output 2 or more bytes (assuming 65) into a destination of size 64
133 | snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:243: drivers/infiniband/hw/hfi1/efivar.o] Error 1
Fixes: c03c08d50b3d ("IB/hfi1: Check upper-case EFI variables")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/238fa39a8fd60e87a5ad7e1ca6584fcdf32e9519.1698159993.git.leonro@nvidia.com
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 07f06e0e5cd99555c861e874716d9e2627655fd5 ]
During device init, a struct for HW stats will be allocated. As HW
stats are not supported for VF and HIP08, currently
hns_roce_alloc_hw_port_stats() returns NULL in this case. However,
ib-core considers the returned NULL pointer as memory allocation
failure and returns ENOMEM, eventually leading to the failure of VF
and HIP08 init.
In the case where the driver does not support the .alloc_hw_port_stats()
ops, ib-core will return EOPNOTSUPP and ignore this error code in the
upper layer function. So for VF and HIP08, just don't set the HW stats
ops to ib-core.
Fixes: 5a87279591a1 ("RDMA/hns: Support hns HW stats")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-8-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b4a797b894dc91a541ea230db6fa00cc74683bfd ]
The num_ports capability of devices should be compared with the
number of port(i.e. the input param "port_num") but not the port
index(i.e. port_num - 1).
Fixes: 5a87279591a1 ("RDMA/hns: Support hns HW stats")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-7-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 27c5fd271d8b8730fc0bb1b6cae953ad7808a874 ]
Due to hardware limitations, only DCQCN is supported for UD. Therefore, the
default algorithm for UD is set to DCQCN.
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-6-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5e617c18b1f34ec57ad5dce44f09de603cf6bd6c ]
SL set by users may exceed the capability of devices. So add check
for this situation.
Fixes: fba429fcf9a5 ("RDMA/hns: Fix missing fields in address vector")
Fixes: 70f92521584f ("RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT")
Fixes: f0cb411aad23 ("RDMA/hns: Use new interface to modify QP context")
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-5-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b5f9efff101b06fd06a5e280a2b00b1335f5f476 ]
The ib_mtu_enum_to_int() and uverbs_attr_get_len() may returns a negative
value. In this case, mixed comparisons of signed and unsigned types will
throw wrong results.
This patch adds judgement for this situation.
Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-4-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c64e9710f9241e38a1c761ed1c1a30854784da66 ]
ucmd in hns_roce_create_qp_common() are not initialized. But it works fine
until new member sdb_addr is added to struct hns_roce_ib_create_qp.
If the user-mode driver uses an old version ABI, then the value of the new
member will be undefined after ib_copy_from_udata().
This patch fixes it by initialize this variable to 0. And the default value
of the new member sdb_addr will be 0 which is invalid.
Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-3-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9faef73ef4f6666b97e04d99734ac09251098185 ]
The current driver will print all asynchronous events. Some of the
print levels are set improperly, e.g. SRQ limit reach and SRQ last
wqe reach, which may also occur during normal operation of the software.
Currently, the information of these event is printed as a warning,
which causes a large amount of printing even during normal use of the
application. As a result, the service performance deteriorates.
This patch fixes the printing storms by modifying the print level.
Fixes: b00a92c8f2ca ("RDMA/hns: Move all prints out of irq handle")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c1336bb4aa5e809a622a87d74311275514086596 ]
Previously when we had a RAW QP, we bound a counter to it when it moved
to INIT state, using the counter context inside RQC.
But when we try to modify that counter later in RTS state we used
modify QP which tries to change the counter inside QPC instead of RQC.
Now we correctly modify the counter set_id inside of RQC instead of QPC
for the RAW QP.
Fixes: d14133dd4161 ("IB/mlx5: Support set qp counter")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/2e5ab6713784a8fe997d19c508187a0dfecf2dfc.1696847964.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 81760bedc65194ff38e1e4faefd5f9f0c95c19a4 ]
If, for any reason, the open-coded arithmetic causes a wraparound,
the protection that `struct_size()` provides against potential integer
overflows is defeated. Fix this by hardening calls to `struct_size()`
with `size_add()`, `size_sub()` and `size_mul()`.
Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs attributes")
Fixes: a4676388e2e2 ("RDMA/core: Simplify how the gid_attrs sysfs is created")
Fixes: e9dd5daf884c ("IB/umad: Refactor code to use cdev_device_add()")
Fixes: 324e227ea7c9 ("RDMA/device: Add ib_device_get_by_netdev()")
Fixes: 5aad26a7eac5 ("IB/core: Use struct_size() in kzalloc()")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZQdt4NsJFwwOYxUR@work
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Like any other set command, require admin permissions to do it.
Cc: stable@vger.kernel.org
Fixes: 2b34c5580226 ("RDMA/core: Add command to set ib_core device net namspace sharing mode")
Link: https://lore.kernel.org/r/75d329fdd7381b52cbdf87910bef16c9965abb1f.1696443438.git.leon@kernel.org
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
During execution of mlx5_mkey_cache_cleanup(), there is a guarantee
that MR are not registered and/or destroyed. It means that we don't
need newly introduced cache disable flag.
Fixes: 374012b00457 ("RDMA/mlx5: Fix mkey cache possible deadlock on cleanup")
Link: https://lore.kernel.org/r/c7e9c9f98c8ae4a7413d97d9349b29f5b0a23dbe.1695921626.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Initialize the structure to 0 so that it's fields won't have random
values. For example fields like rec.traffic_class (as well as
rec.flow_label and rec.sl) is used to generate the user AH through:
cma_iboe_join_multicast
cma_make_mc_event
ib_init_ah_from_mcmember
And a random traffic_class causes a random IP DSCP in RoCEv2.
Fixes: b5de0c60cc30 ("RDMA/cma: Fix use after free race in roce multicast join")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/20230927090511.603595-1-markzhang@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Fix the deadlock by refactoring the MR cache cleanup flow to flush the
workqueue without holding the rb_lock.
This adds a race between cache cleanup and creation of new entries which
we solve by denied creation of new entries after cache cleanup started.
Lockdep:
WARNING: possible circular locking dependency detected
[ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted
[ 2785.339778 ] ------------------------------------------------------
[ 2785.340848 ] devlink/53872 is trying to acquire lock:
[ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900
[ 2785.343403 ]
[ 2785.343403 ] but task is already holding lock:
[ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
[ 2785.346273 ]
[ 2785.346273 ] which lock already depends on the new lock.
[ 2785.346273 ]
[ 2785.347720 ]
[ 2785.347720 ] the existing dependency chain (in reverse order) is:
[ 2785.349003 ]
[ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}:
[ 2785.350160 ] __mutex_lock+0x14c/0x15c0
[ 2785.350962 ] delayed_cache_work_func+0x2d1/0x610 [mlx5_ib]
[ 2785.352044 ] process_one_work+0x7c2/0x1310
[ 2785.352879 ] worker_thread+0x59d/0xec0
[ 2785.353636 ] kthread+0x28f/0x330
[ 2785.354370 ] ret_from_fork+0x1f/0x30
[ 2785.355135 ]
[ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}:
[ 2785.356515 ] __lock_acquire+0x2d8a/0x5fe0
[ 2785.357349 ] lock_acquire+0x1c1/0x540
[ 2785.358121 ] __flush_work+0xe8/0x900
[ 2785.358852 ] __cancel_work_timer+0x2c7/0x3f0
[ 2785.359711 ] mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib]
[ 2785.360781 ] mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib]
[ 2785.361969 ] __mlx5_ib_remove+0x68/0x120 [mlx5_ib]
[ 2785.362960 ] mlx5r_remove+0x63/0x80 [mlx5_ib]
[ 2785.363870 ] auxiliary_bus_remove+0x52/0x70
[ 2785.364715 ] device_release_driver_internal+0x3c1/0x600
[ 2785.365695 ] bus_remove_device+0x2a5/0x560
[ 2785.366525 ] device_del+0x492/0xb80
[ 2785.367276 ] mlx5_detach_device+0x1a9/0x360 [mlx5_core]
[ 2785.368615 ] mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core]
[ 2785.369934 ] mlx5_devlink_reload_down+0x292/0x580 [mlx5_core]
[ 2785.371292 ] devlink_reload+0x439/0x590
[ 2785.372075 ] devlink_nl_cmd_reload+0xaef/0xff0
[ 2785.372973 ] genl_family_rcv_msg_doit.isra.0+0x1bd/0x290
[ 2785.374011 ] genl_rcv_msg+0x3ca/0x6c0
[ 2785.374798 ] netlink_rcv_skb+0x12c/0x360
[ 2785.375612 ] genl_rcv+0x24/0x40
[ 2785.376295 ] netlink_unicast+0x438/0x710
[ 2785.377121 ] netlink_sendmsg+0x7a1/0xca0
[ 2785.377926 ] sock_sendmsg+0xc5/0x190
[ 2785.378668 ] __sys_sendto+0x1bc/0x290
[ 2785.379440 ] __x64_sys_sendto+0xdc/0x1b0
[ 2785.380255 ] do_syscall_64+0x3d/0x90
[ 2785.381031 ] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 2785.381967 ]
[ 2785.381967 ] other info that might help us debug this:
[ 2785.381967 ]
[ 2785.383448 ] Possible unsafe locking scenario:
[ 2785.383448 ]
[ 2785.384544 ] CPU0 CPU1
[ 2785.385383 ] ---- ----
[ 2785.386193 ] lock(&dev->cache.rb_lock);
[ 2785.386940 ] lock((work_completion)(&(&ent->dwork)->work));
[ 2785.388327 ] lock(&dev->cache.rb_lock);
[ 2785.389425 ] lock((work_completion)(&(&ent->dwork)->work));
[ 2785.390414 ]
[ 2785.390414 ] *** DEADLOCK ***
[ 2785.390414 ]
[ 2785.391579 ] 6 locks held by devlink/53872:
[ 2785.392341 ] #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
[ 2785.393630 ] #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0
[ 2785.395324 ] #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core]
[ 2785.397322 ] #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core]
[ 2785.399231 ] #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
[ 2785.400864 ] #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
Fixes: b95845178328 ("RDMA/mlx5: Change the cache structure to an RB-tree")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
checkpath is complaining about NULL string, change it to 'Unknown'.
Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The mutex was not unlocked on some of the error flows.
Moved the unlock location to include all the error flow scenarios.
Fixes: e1f4a52ac171 ("RDMA/mlx5: Create an indirect flow table for steering anchor")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
After the change to use dynamic cache structure, new cache entries
can be added and the mkey allocation can no longer assume that all
mkeys created for the cache have access_flags equal to zero.
Example of a flow that exposes the issue:
A user registers MR with RO on a HCA that cannot UMR RO and the mkey is
created outside of the cache. When the user deregisters the MR, a new
cache entry is created to store mkeys with RO.
Later, the user registers 2 MRs with RO. The first MR is reused from the
new cache entry. When we try to get the second mkey from the cache we see
the entry is empty so we go to the MR cache mkey allocation flow which
would have allocated a mkey with no access flags, resulting the user getting
a MR without RO.
Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
Reviewed-by: Edward Srouji <edwards@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/8a802700b82def3ace3f77cd7a9ad9d734af87e7.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
In order to be sure that 'buff' is never truncated, its size should be
12, not 11.
When building with W=1, this fixes the following warnings:
drivers/infiniband/hw/mlx4/sysfs.c: In function ‘add_port_entries’:
drivers/infiniband/hw/mlx4/sysfs.c:268:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
268 | sprintf(buff, "%d", i);
| ^
drivers/infiniband/hw/mlx4/sysfs.c:268:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
268 | sprintf(buff, "%d", i);
| ^~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/mlx4/sysfs.c:286:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
286 | sprintf(buff, "%d", i);
| ^
drivers/infiniband/hw/mlx4/sysfs.c:286:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
286 | sprintf(buff, "%d", i);
| ^~~~~~~~~~~~~~~~~~~~~~
Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/0bb1443eb47308bc9be30232cc23004c4d4cf43e.1695448530.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
rc_qp_count and ud_qp_count is not decremented during qp destroy.
Fix this.
Fixes: cb95709e0dca ("bnxt_re: Update the hw counters for resource stats")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1695199280-13520-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Flag that indicate control path command completion should be cleared
only after copying the command response data. As soon as the is_in_used
flag is clear, the waiting thread can proceed with wrong response
data. This wrong data is causing multiple issues like wrong lkey
used in data traffic and wrong AH Id etc.
Use a memory barrier to ensure that the response data
is copied and visible to the process waiting on a different
cpu core before clearing the is_in_used flag.
Clear the is_in_used after copying the command response.
Fixes: bcfee4ce3e01 ("RDMA/bnxt_re: remove redundant cmdq_bitmap")
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1695199280-13520-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The following compilation error is false alarm as RDMA devices don't
have such large amount of ports to actually cause to format truncation.
drivers/infiniband/core/cma_configfs.c: In function ‘make_cma_ports’:
drivers/infiniband/core/cma_configfs.c:223:57: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
223 | snprintf(port_str, sizeof(port_str), "%u", i + 1);
| ^
drivers/infiniband/core/cma_configfs.c:223:17: note: ‘snprintf’ output between 2 and 11 bytes into a destination of size 10
223 | snprintf(port_str, sizeof(port_str), "%u", i + 1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:243: drivers/infiniband/core/cma_configfs.o] Error 1
Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm")
Link: https://lore.kernel.org/r/a7e3b347ee134167fa6a3787c56ef231a04bc8c2.1694434639.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Fix the crash of regmr_cmd called by erdma_ib_alloc_mr. The reason is
that mr->mem.mtt is not initialized but it is accessed in regmr_cmd.
The call trace information:
BUG: kernel NULL pointer dereference, address: 0000000000000000
<...>
RIP: 0010:regmr_cmd+0x170/0x1c0 [erdma]
<...>
Call Trace:
? __die+0x20/0x70
? page_fault_oops+0x66/0x150
? do_user_addr_fault+0x61/0x660
? exc_page_fault+0x65/0x140
? asm_exc_page_fault+0x22/0x30
? regmr_cmd+0x170/0x1c0 [erdma]
? preempt_count_add+0x70/0xa0
? _raw_spin_lock_irqsave+0x19/0x50
? _raw_spin_unlock_irqrestore+0x1b/0x40
? erdma_alloc_idx+0x51/0x90 [erdma]
erdma_get_dma_mr+0xa3/0x120 [erdma]
__ib_alloc_pd+0xeb/0x1c0 [ib_core]
Fixes: 7244b4aa4221 ("RDMA/erdma: Refactor the storage structure of MTT entries")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/3d140c1d-524a-4dbe-a51c-aee4f7ecafdb@moroto.mountain/
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230908060559.80203-1-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The erdma_create_scatter_mtt() function is supposed to return error
pointers. Returning NULL will lead to an Oops.
Fixes: ed10435d3583 ("RDMA/erdma: Implement hierarchical MTT")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/1eb400d5-d8a3-4a8e-b3da-c43c6c377f86@moroto.mountain
Acked-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Since size of 'hdr' pointer and '*hdr' structure is equal on 64-bit
machines issue probably didn't cause any wrong behavior. But anyway,
fixing of typo is required.
Fixes: da0f60df7bd5 ("RDMA/uverbs: Prohibit write() calls with too small buffers")
Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Link: https://lore.kernel.org/r/20230905103258.1738246-1-konstantin.meskhidze@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
get_skb() can fail to allocate skb, so check it.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 5be78ee924ae ("RDMA/cxgb4: Fix LE hash collision bug for active open connection")
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Link: https://lore.kernel.org/r/20230905124048.284165-1-artem.chernyshev@red-soft.ru
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
In case immediate MPA request processing fails, the newly
created endpoint unlinks the listening endpoint and is
ready to be dropped. This special case was not handled
correctly by the code handling the later TCP socket close,
causing a NULL dereference crash in siw_cm_work_handler()
when dereferencing a NULL listener. We now also cancel
the useless MPA timeout, if immediate MPA request
processing fails.
This patch furthermore simplifies MPA processing in general:
Scheduling a useless TCP socket read in sk_data_ready() upcall
is now surpressed, if the socket is already moved out of
TCP_ESTABLISHED state.
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20230905145822.446263-1-bmt@zurich.ibm.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler
callback, it performs one of the following actions:
* Call scsi_queue_insert().
* Call scsi_finish_command().
* Call scsi_eh_scmd_add().
Hence, SCSI abort handlers must not call scsi_done(). Otherwise all
the above actions would trigger a use-after-free. Hence remove the
scsi_done() call from srp_abort(). Keep the srp_free_req() call
before returning SUCCESS because we may not see the command again if
SUCCESS is returned.
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230823205727.505681-1-bvanassche@acm.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas) and
the usual minor updates and bug fixes but no significant core changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (116 commits)
scsi: storvsc: Handle additional SRB status values
scsi: libsas: Delete sas_ata_task.retry_count
scsi: libsas: Delete sas_ata_task.stp_affil_pol
scsi: libsas: Delete sas_ata_task.set_affil_pol
scsi: libsas: Delete sas_ssp_task.task_prio
scsi: libsas: Delete sas_ssp_task.enable_first_burst
scsi: libsas: Delete sas_ssp_task.retry_count
scsi: libsas: Delete struct scsi_core
scsi: libsas: Delete enum sas_phy_type
scsi: libsas: Delete enum sas_class
scsi: libsas: Delete sas_ha_struct.lldd_module
scsi: target: Fix write perf due to unneeded throttling
scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZE
scsi: pm8001: Remove unused declarations
scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
scsi: elx: sli4: Remove code duplication
scsi: bfa: Replace one-element array with flexible-array member in struct fc_rscn_pl_s
scsi: qla2xxx: Remove unused declarations
scsi: pmcraid: Use pci_dev_id() to simplify the code
scsi: pm80xx: Set RETFIS when requested by libsas
...
|
|
Pull rdma updates from Jason Gunthorpe:
"Many small changes across the subystem, some highlights:
- Usual driver cleanups in qedr, siw, erdma, hfi1, mlx4/5, irdma,
mthca, hns, and bnxt_re
- siw now works over tunnel and other netdevs with a MAC address by
removing assumptions about a MAC/GID from the connection manager
- "Doorbell Pacing" for bnxt_re - this is a best effort scheme to
allow userspace to slow down the doorbell rings if the HW gets full
- irdma egress VLAN priority, better QP/WQ sizing
- rxe bug fixes in queue draining and srq resizing
- Support more ethernet speed options in the core layer
- DMABUF support for bnxt_re
- Multi-stage MTT support for erdma to allow much bigger MR
registrations
- A irdma fix with a CVE that came in too late to go to -rc, missing
bounds checking for 0 length MRs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (87 commits)
IB/hfi1: Reduce printing of errors during driver shut down
RDMA/hfi1: Move user SDMA system memory pinning code to its own file
RDMA/hfi1: Use list_for_each_entry() helper
RDMA/mlx5: Fix trailing */ formatting in block comment
RDMA/rxe: Fix redundant break statement in switch-case.
RDMA/efa: Fix wrong resources deallocation order
RDMA/siw: Call llist_reverse_order in siw_run_sq
RDMA/siw: Correct wrong debug message
RDMA/siw: Balance the reference of cep->kref in the error path
Revert "IB/isert: Fix incorrect release of isert connection"
RDMA/bnxt_re: Fix kernel doc errors
RDMA/irdma: Prevent zero-length STAG registration
RDMA/erdma: Implement hierarchical MTT
RDMA/erdma: Refactor the storage structure of MTT entries
RDMA/erdma: Renaming variable names and field names of struct erdma_mem
RDMA/hns: Support hns HW stats
RDMA/hns: Dump whole QP/CQ/MR resource in raw
RDMA/irdma: Add missing kernel-doc in irdma_setup_umode_qp()
RDMA/mlx4: Copy union directly
RDMA/irdma: Drop unused kernel push code
...
|