summaryrefslogtreecommitdiff
path: root/drivers/infiniband/sw/rxe
AgeCommit message (Collapse)AuthorFilesLines
2024-05-02RDMA/rxe: Fix the problem "mutex_destroy missing"Yanjun.Zhu1-0/+2
[ Upstream commit 481047d7e8391d3842ae59025806531cdad710d9 ] When a mutex lock is not used any more, the function mutex_destroy should be called to mark the mutex lock uninitialized. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yanjun.Zhu <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20240314065140.27468-1-yanjun.zhu@linux.dev Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21RDMA/rxe: Fix the use-before-initialization error of resp_pktsZhu Yanjun1-4/+3
[ Upstream commit 2a62b6210ce876c596086ab8fd4c8a0c3d10611a ] In the following: Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 assign_lock_key kernel/locking/lockdep.c:982 [inline] register_lock_class+0xdb6/0x1120 kernel/locking/lockdep.c:1295 __lock_acquire+0x10a/0x5df0 kernel/locking/lockdep.c:4951 lock_acquire kernel/locking/lockdep.c:5691 [inline] lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162 skb_dequeue+0x20/0x180 net/core/skbuff.c:3639 drain_resp_pkts drivers/infiniband/sw/rxe/rxe_comp.c:555 [inline] rxe_completer+0x250d/0x3cc0 drivers/infiniband/sw/rxe/rxe_comp.c:652 rxe_qp_do_cleanup+0x1be/0x820 drivers/infiniband/sw/rxe/rxe_qp.c:761 execute_in_process_context+0x3b/0x150 kernel/workqueue.c:3473 __rxe_cleanup+0x21e/0x370 drivers/infiniband/sw/rxe/rxe_pool.c:233 rxe_create_qp+0x3f6/0x5f0 drivers/infiniband/sw/rxe/rxe_verbs.c:583 This is a use-before-initialization problem. It happens because rxe_qp_do_cleanup is called during error unwind before the struct has been fully initialized. Move the initialization of the skb earlier. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20230602035408.741534-1-yanjun.zhu@intel.com Reported-by: syzbot+eba589d8f49c73d356da@syzkaller.appspotmail.com Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21RDMA/rxe: Removed unused name from rxe_task structBob Pearson3-12/+5
[ Upstream commit de669ae8af49ceed0eed44f5b3d51dc62affc5e4 ] The name field in struct rxe_task is never used. This patch removes it. Link: https://lore.kernel.org/r/20221021200118.2163-4-rpearsonhpe@gmail.com Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Stable-dep-of: 2a62b6210ce8 ("RDMA/rxe: Fix the use-before-initialization error of resp_pkts") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21RDMA/rxe: Remove the unused variable objZhu Yanjun3-7/+5
[ Upstream commit f07853582d1f6ed282f8d9a0b1209a87dd761f58 ] The member variable obj in struct rxe_task is not needed. So remove it to save memory. Link: https://lore.kernel.org/r/20220822011615.805603-4-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Stable-dep-of: 2a62b6210ce8 ("RDMA/rxe: Fix the use-before-initialization error of resp_pkts") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-14RDMA/rxe: Fix NULL-ptr-deref in rxe_qp_do_cleanup() when socket create failedZhang Xiaoxu1-3/+3
[ Upstream commit f67376d801499f4fa0838c18c1efcad8840e550d ] There is a null-ptr-deref when mount.cifs over rdma: BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup+0x2f3/0x360 [rdma_rxe] Read of size 8 at addr 0000000000000018 by task mount.cifs/3046 CPU: 2 PID: 3046 Comm: mount.cifs Not tainted 6.1.0-rc5+ #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc3 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 kasan_report+0xad/0x130 rxe_qp_do_cleanup+0x2f3/0x360 [rdma_rxe] execute_in_process_context+0x25/0x90 __rxe_cleanup+0x101/0x1d0 [rdma_rxe] rxe_create_qp+0x16a/0x180 [rdma_rxe] create_qp.part.0+0x27d/0x340 ib_create_qp_kernel+0x73/0x160 rdma_create_qp+0x100/0x230 _smbd_get_connection+0x752/0x20f0 smbd_get_connection+0x21/0x40 cifs_get_tcp_session+0x8ef/0xda0 mount_get_conns+0x60/0x750 cifs_mount+0x103/0xd00 cifs_smb3_do_mount+0x1dd/0xcb0 smb3_get_tree+0x1d5/0x300 vfs_get_tree+0x41/0xf0 path_mount+0x9b3/0xdd0 __x64_sys_mount+0x190/0x1d0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The root cause of the issue is the socket create failed in rxe_qp_init_req(). So move the reset rxe_qp_do_cleanup() after the NULL ptr check. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20221122151437.1057671-1-zhangxiaoxu5@huawei.com Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26RDMA/rxe: Fix the error caused by qp->skZhu Yanjun1-2/+4
[ Upstream commit 548ce2e66725dcba4e27d1e8ac468d5dd17fd509 ] When sock_create_kern in the function rxe_qp_init_req fails, qp->sk is set to NULL. Then the function rxe_create_qp will call rxe_qp_do_cleanup to handle allocated resource. Before handling qp->sk, this variable should be checked. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220822011615.805603-3-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26RDMA/rxe: Fix "kernel NULL pointer dereference" errorZhu Yanjun1-1/+3
[ Upstream commit a625ca30eff806395175ebad3ac1399014bdb280 ] When rxe_queue_init in the function rxe_qp_init_req fails, both qp->req.task.func and qp->req.task.arg are not initialized. Because of creation of qp fails, the function rxe_create_qp will call rxe_qp_do_cleanup to handle allocated resource. Before calling __rxe_do_task, both qp->req.task.func and qp->req.task.arg should be checked. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220822011615.805603-2-yanjun.zhu@linux.dev Reported-by: syzbot+ab99dc4c6e961eed8b8e@syzkaller.appspotmail.com Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25RDMA/rxe: Limit the number of calls to each taskletBob Pearson2-4/+18
[ Upstream commit eff6d998ca297cb0b2e53b032a56cf8e04dd8b17 ] Limit the maximum number of calls to each tasklet from rxe_do_task() before yielding the cpu. When the limit is reached reschedule the tasklet and exit the calling loop. This patch prevents one tasklet from consuming 100% of a cpu core and causing a deadlock or soft lockup. Link: https://lore.kernel.org/r/20220630190425.2251-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21RDMA/rxe: Fix error unwind in rxe_create_qp()Zhu Yanjun1-4/+8
[ Upstream commit fd5382c5805c4bcb50fd25b7246247d3f7114733 ] In the function rxe_create_qp(), rxe_qp_from_init() is called to initialize qp, internally things like the spin locks are not setup until rxe_qp_init_req(). If an error occures before this point then the unwind will call rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() which will oops when trying to access the uninitialized spinlock. Move the spinlock initializations earlier before any failures. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220731063621.298405-1-yanjun.zhu@linux.dev Reported-by: syzbot+833061116fa28df97f3b@syzkaller.appspotmail.com Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09RDMA/rxe: Generate a completion for unsupported/invalid opcodeXiao Yang1-1/+1
commit 2f917af777011c88e977b9b9a5d00b280d3a59ce upstream. Current rxe_requester() doesn't generate a completion when processing an unsupported/invalid opcode. If rxe driver doesn't support a new opcode (e.g. RDMA Atomic Write) and RDMA library supports it, an application using the new opcode can reproduce this issue. Fix the issue by calling "goto err;". Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220410113513.27537-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27RDMA/rxe: Fix a typo in opcode nameChengguang Xu1-1/+1
commit 8d1cfb884e881efd69a3be4ef10772c71cb22216 upstream. There is a redundant ']' in the name of opcode IB_OPCODE_RC_SEND_MIDDLE, so just fix it. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20211218112320.3558770-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18RDMA/rxe: Fix wrong port_cap_flagsJunji Wei1-1/+1
[ Upstream commit dcd3f985b20ffcc375f82ca0ca9f241c7025eb5e ] The port->attr.port_cap_flags should be set to enum ib_port_capability_mask_bits in ib_mad.h, not RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210831083223.65797-1-weijunji@bytedance.com Signed-off-by: Junji Wei <weijunji@bytedance.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19RDMA/rxe: Don't overwrite errno from ib_umem_get()Xiao Yang1-1/+1
[ Upstream commit 20ec0a6d6016aa28b9b3299be18baef1a0f91cd2 ] rxe_mr_init_user() always returns the fixed -EINVAL when ib_umem_get() fails so it's hard for user to know which actual error happens in ib_umem_get(). For example, ib_umem_get() will return -EOPNOTSUPP when trying to pin pages on a DAX file. Return actual error as mlx4/mlx5 does. Link: https://lore.kernel.org/r/20210621071456.4259-1-ice_yangxiao@163.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rxe: Fix qp reference counting for atomic opsBob Pearson2-3/+0
[ Upstream commit 15ae1375ea91ae2dee6f12d71a79d8c0a10a30bf ] Currently the rdma_rxe driver attempts to protect atomic responder resources by taking a reference to the qp which is only freed when the resource is recycled for a new read or atomic operation. This means that in normal circumstances there is almost always an extra qp reference once an atomic operation has been executed which prevents cleaning up the qp and associated pd and cqs when the qp is destroyed. This patch removes the call to rxe_add_ref() in send_atomic_ack() and the call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is destroyed while a peer is retrying an atomic op it will cause the operation to fail which is acceptable. Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com> Fixes: 86af61764151 ("IB/rxe: remove unnecessary skb_clone") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/rxe: Fix failure during driver loadKamal Heib1-3/+7
[ Upstream commit 32a25f2ea690dfaace19f7a3a916f5d7e1ddafe8 ] To avoid the following failure when trying to load the rdma_rxe module while IPv6 is disabled, add a check for EAFNOSUPPORT and ignore the failure, also delete the needless debug print from rxe_setup_udp_tunnel(). $ modprobe rdma_rxe modprobe: ERROR: could not insert 'rdma_rxe': Operation not permitted Fixes: dfdd6158ca2c ("IB/rxe: Fix kernel panic in udp_setup_tunnel") Link: https://lore.kernel.org/r/20210603090112.36341-1-kamalheib1@gmail.com Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/rxe: Clear all QP fields if creation failedLeon Romanovsky1-0/+7
[ Upstream commit 67f29896fdc83298eed5a6576ff8f9873f709228 ] rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly created ones, but in case rxe_qp_from_init() failed it was filled with garbage and caused tot the following error. refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Modules linked in: CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55 RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800 R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000 FS: 00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] kref_put include/linux/kref.h:64 [inline] rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805 execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327 rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391 kref_put include/linux/kref.h:65 [inline] rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425 _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline] ib_create_named_qp+0x2ad/0x1370 drivers/infiniband/core/verbs.c:1231 ib_create_qp include/rdma/ib_verbs.h:3644 [inline] create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920 ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline] ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331 ib_register_device drivers/infiniband/core/device.c:1413 [inline] ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365 rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14RDMA/rxe: Fix a bug in rxe_fill_ip_info()Bob Pearson1-1/+1
[ Upstream commit 45062f441590810772959d8e1f2b24ba57ce1bd9 ] Fix a bug in rxe_fill_ip_info() which was attempting to convert from RDMA_NETWORK_XXX to RXE_NETWORK_XXX. .._IPV6 should have mapped to .._IPV6 not .._IPV4. Fixes: edebc8407b88 ("RDMA/rxe: Fix small problem in network_type patch") Link: https://lore.kernel.org/r/20210421035952.4892-1-rpearson@hpe.com Suggested-by: Frank Zago <frank.zago@hpe.com> Signed-off-by: Bob Pearson <rpearson@hpe.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-09RDMA/rxe: Fix missing kconfig dependency on CRYPTOJulian Braha1-0/+1
[ Upstream commit 475f23b8c66d2892ad6acbf90ed757cafab13de7 ] When RDMA_RXE is enabled and CRYPTO is disabled, Kbuild gives the following warning: WARNING: unmet direct dependencies detected for CRYPTO_CRC32 Depends on [n]: CRYPTO [=n] Selected by [y]: - RDMA_RXE [=y] && (INFINIBAND_USER_ACCESS [=y] || !INFINIBAND_USER_ACCESS [=y]) && INET [=y] && PCI [=y] && INFINIBAND [=y] && INFINIBAND_VIRT_DMA [=y] This is because RDMA_RXE selects CRYPTO_CRC32, without depending on or selecting CRYPTO, despite that config option being subordinate to CRYPTO. Fixes: cee2688e3cd6 ("IB/rxe: Offload CRC calculation when possible") Signed-off-by: Julian Braha <julianbraha@gmail.com> Link: https://lore.kernel.org/r/21525878.NYvzQUHefP@ubuntu-mate-laptop Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04RDMA/rxe: Correct skb on loopback pathBob Pearson1-0/+5
[ Upstream commit 5120bf0a5fc15dec210a0fe0f39e4a256bb6e349 ] rxe_net.c sends packets at the IP layer with skb->data pointing at the IP header but receives packets from a UDP tunnel with skb->data pointing at the UDP header. On the loopback path this was not correctly accounted for. This patch corrects for this by using sbk_pull() to strip the IP header from the skb on received packets. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210128182301.16859-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04RDMA/rxe: Fix coding error in rxe_rcv_mcast_pktBob Pearson1-6/+10
[ Upstream commit 8fc1b7027fc162738d5a85c82410e501a371a404 ] rxe_rcv_mcast_pkt() in rxe_recv.c can leak SKBs in error path code. The loop over the QPs attached to a multicast group creates new cloned SKBs for all but the last QP in the list and passes the SKB and its clones to rxe_rcv_pkt() for further processing. Any QPs that do not pass some checks are skipped. If the last QP in the list fails the tests the SKB is leaked. This patch checks if the SKB for the last QP was used and if not frees it. Also removes a redundant loop invariant assignment. Fixes: 8700e3e7c485 ("Soft RoCE driver") Fixes: 71abf20b28ff ("RDMA/rxe: Handle skb_clone() failure in rxe_recv.c") Link: https://lore.kernel.org/r/20210128174752.16128-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04RDMA/rxe: Fix coding error in rxe_recv.cBob Pearson1-3/+8
[ Upstream commit 7d9ae80e31df57dd3253e1ec514f0000aa588a81 ] check_type_state() in rxe_recv.c is written as if the type bits in the packet opcode were a bit mask which is not correct. This patch corrects this code to compare all 3 type bits to the required type. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210127214500.3707-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-09RDMA/siw,rxe: Make emulated devices virtual in the device treeJason Gunthorpe2-13/+0
[ Upstream commit a9d2e9ae953f0ddd0327479c81a085adaa76d903 ] This moves siw and rxe to be virtual devices in the device tree: lrwxrwxrwx 1 root root 0 Nov 6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/ Previously they were trying to parent themselves to the physical device of their attached netdev, which doesn't make alot of sense. My hope is this will solve some weird syzkaller hits related to sysfs as it could be possible that the parent of a netdev is another netdev, eg under bonding or some other syzkaller found netdev configuration. Nesting a ib_device under anything but a physical device is going to cause inconsistencies in sysfs during destructions. Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-09RDMA/core: remove use of dma_virt_opsChristoph Hellwig3-9/+0
[ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ] Use the ib_dma_* helpers to skip the DMA translation instead. This removes the last user if dma_virt_ops and keeps the weird layering violation inside the RDMA core instead of burderning the DMA mapping subsystems with it. This also means the software RDMA drivers now don't have to mess with DMA parameters that are not relevant to them at all, and that in the future we can use PCI P2P transfers even for software RDMA, as there is no first fake layer of DMA mapping that the P2P DMA support. Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30RDMA/rxe: Compute PSN windows correctlyBob Pearson1-1/+2
[ Upstream commit bb3ab2979fd69db23328691cb10067861df89037 ] The code which limited the number of unacknowledged PSNs was incorrect. The PSNs are limited to 24 bits and wrap back to zero from 0x00ffffff. The test was computing a 32 bit value which wraps at 32 bits so that qp->req.psn can appear smaller than the limit when it is actually larger. Replace '>' test with psn_compare which is used for other PSN comparisons and correctly handles the 24 bit size. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20201013170741.3590-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-12RMDA/sw: Don't allow drivers using dma_virt_ops on highmem configsChristoph Hellwig1-1/+1
dma_virt_ops requires that all pages have a kernel virtual address. Introduce a INFINIBAND_VIRT_DMA Kconfig symbol that depends on !HIGHMEM and make all three drivers depend on the new symbol. Also remove the ARCH_DMA_ADDR_T_64BIT dependency, which has been obsolete since commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig") Fixes: 551199aca1c3 ("lib/dma-virt: Add dma_virt_ops") Link: https://lore.kernel.org/r/20201106181941.1878556-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02RDMA: Fix software RDMA drivers for dma mapping errorParav Pandit1-1/+5
The commit f959dcd6ddfd ("dma-direct: Fix potential NULL pointer dereference") made dma_mask as mandetory field to be setup even for dma_virt_ops based dma devices. The commit in the fixes tag omitted setting up the dma_mask on virtual devices triggering the below trace when they were combined during the merge window. Fix it by setting empty DMA MASK for software based RDMA devices. WARNING: CPU: 1 PID: 8488 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x493/0x700 CPU: 1 PID: 8488 Comm: syz-executor144 Not tainted 5.9.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:dma_map_page_attrs+0x493/0x700 kernel/dma/mapping.c:149 Trace: dma_map_single_attrs include/linux/dma-mapping.h:279 [inline] ib_dma_map_single include/rdma/ib_verbs.h:3967 [inline] ib_mad_post_receive_mads+0x23f/0xd60 drivers/infiniband/core/mad.c:2715 ib_mad_port_start drivers/infiniband/core/mad.c:2862 [inline] ib_mad_port_open drivers/infiniband/core/mad.c:3016 [inline] ib_mad_init_device+0x72b/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:680 enable_device_and_get+0x1d5/0x3c0 drivers/infiniband/core/device.c:1301 ib_register_device drivers/infiniband/core/device.c:1376 [inline] ib_register_device+0x7a7/0xa40 drivers/infiniband/core/device.c:1335 rxe_register_device+0x46d/0x570 drivers/infiniband/sw/rxe/rxe_verbs.c:1182 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:507 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x540 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x367/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2f2/0x440 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:671 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353 ___sys_sendmsg+0xf3/0x170 net/socket.c:2407 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x443699 Link: https://lore.kernel.org/r/20201030093803.278830-1-parav@nvidia.com Reported-by: syzbot+34dc2fea3478e659af01@syzkaller.appspotmail.com Fixes: e0477b34d9d1 ("RDMA: Explicitly pass in the dma_device to ib_register_device") Signed-off-by: Parav Pandit <parav@nvidia.com> Tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Zhu Yanjun <yanjunz@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-27RDMA/rxe: Fix small problem in network_type patchBob Pearson2-5/+32
The patch referenced below has a typo that results in using the wrong L2 header size for outbound traffic. (V4 <-> V6). It also breaks kernel-side RC traffic because they use AVs that use RDMA_NETWORK_XXX enums instead of RXE_NETWORK_TYPE_XXX enums. Fix this by transcoding between these enum types. Fixes: e0d696d201dd ("RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI") Link: https://lore.kernel.org/r/20201016211343.22906-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16RDMA/rxe: Handle skb_clone() failure in rxe_recv.cBob Pearson1-0/+3
If skb_clone() is unable to allocate memory for a new sk_buff this is not detected by the current code. Check for a NULL return and continue. This is similar to other errors in this loop over QPs attached to the multicast address and consistent with the unreliable UD transport. Fixes: e7ec96fc7932f ("RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()") Addresses-Coverity-ID: 1497804: Null pointer dereferences (NULL_RETURNS) Link: https://lore.kernel.org/r/20201013184236.5231-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16RDMA/rxe: Move the definitions for rxe_av.network_type to uAPIJason Gunthorpe1-4/+4
RXE was wrongly using an internal kernel enum as part of its uAPI, split this out into a dedicated uAPI enum just for RXE. It only uses the IPv4 and IPv6 values. This was exposed by changing the internal kernel enum definition which broke RXE. Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16RDMA: Explicitly pass in the dma_device to ib_register_deviceJason Gunthorpe1-6/+3
The code in setup_dma_device has become rather convoluted, move all of this to the drivers. Drives now pass in a DMA capable struct device which will be used to setup DMA, or drivers must fully configure the ibdev for DMA and pass in NULL. Other than setting the masks in rvt all drivers were doing this already anyhow. mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for DMA based on their hardweare limits in: __mthca_init_one() dma_set_max_seg_size (1G) __mlx4_init_one() dma_set_max_seg_size (1G) mlx5_pci_init() set_dma_caps() dma_set_max_seg_size (2G) Other non software drivers (except usnic) were extended to UINT_MAX [1, 2] instead of 2G as was before. [1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ [2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-09RDMA/rxe: Fix bug rejecting all multicast packetsBob Pearson1-3/+16
Fix a bug in rxe_rcv() that causes all multicast packets to be dropped. Currently rxe_match_dgid() is called for each packet to verify that the destination IP address matches one of the entries in the port source GID table. This is incorrect for IP multicast addresses since they do not appear in the GID table. Add code to detect multicast addresses. Change function name to rxe_chk_dgid() which is clearer. Link: https://lore.kernel.org/r/20201008212753.265249-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-09RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()Bob Pearson1-5/+12
The changes referenced below replaced sbk_clone)_ by taking additional references, passing the skb along and then freeing the skb. This deleted the packets before they could be processed and additionally passed bad data in each packet. Since pkt is stored in skb->cb changing pkt->qp changed it for all the packets. Replace skb_get() by sbk_clone() in rxe_rcv_mcast_pkt() for cases where multiple QPs are receiving multicast packets on the same address. Delete kfree_skb() because the packets need to live until they have been processed by each QP. They are freed later. Fixes: 86af61764151 ("IB/rxe: remove unnecessary skb_clone") Fixes: fe896ceb5772 ("IB/rxe: replace refcount_inc with skb_get") Link: https://lore.kernel.org/r/20201008203651.256958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-09RDMA/rxe: Remove duplicate entries in struct rxe_mrBob Pearson4-22/+28
Struct rxe_mem had pd, lkey and rkey values both in itself and in the struct ib_mr which is also included in rxe_mem. Delete these entries and replace references with the ones in ibmr.Add mr_pd, mr_lkey and mr_rkey macros which extract these values from mr. Link: https://lore.kernel.org/r/20201008212818.265303-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18Merge branch 'mlx5_active_speed' into rdma.git for-nextJason Gunthorpe5-1/+13
Leon Romanovsky says: ==================== IBTA declares speed as 16 bits, but kernel stores it in u8. This series fixes in-kernel declaration while keeping external interface intact. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_active_speed': RDMA: Fix link active_speed size RDMA/mlx5: Delete duplicated mlx5_ptys_width enum net/mlx5: Refactor query port speed functions
2020-09-11Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds5-1/+13
Pull rdma fixes from Jason Gunthorpe: "A number of driver bug fixes and a few recent regressions: - Several bug fixes for bnxt_re. Crashing, incorrect data reported, and corruption on new HW - Memory leak and crash in rxe - Fix sysfs corruption in rxe if the netdev name is too long - Fix a crash on error unwind in the new cq_pool code - Fix kobject panics in rtrs by working device lifetime properly - Fix a data corruption bug in iser target related to misaligned buffers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/isert: Fix unaligned immediate-data handling RDMA/rtrs-srv: Set .release function for rtrs srv device during device init RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx' RDMA/core: Fix reported speed and width RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address RDMA/bnxt_re: Restrict the max_gids to 256 RDMA/bnxt_re: Static NQ depth allocation RDMA/bnxt_re: Fix the qp table indexing RDMA/bnxt_re: Do not report transparent vlan from QP1 RDMA/mlx4: Read pkey table length instead of hardcoded value RDMA/rxe: Fix panic when calling kmem_cache_create() RDMA/rxe: Fix memleak in rxe_mem_init_user RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
2020-09-09RDMA: Allow fail of destroy CQLeon Romanovsky1-1/+2
Like any other verbs objects, CQ shouldn't fail during destroy, but mlx5_ib didn't follow this contract with mixed IB verbs objects with DEVX. Such mix causes to the situation where FW and kernel are fully interdependent on the reference counting of each side. Kernel verbs and drivers that don't have DEVX flows shouldn't fail. Fixes: e39afe3d6dbd ("RDMA: Convert CQ allocations to be under core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on SRQ destroyLeon Romanovsky1-1/+2
In similar way to other IB objects, restore the ability to return error on SRQ destroy. Strictly speaking, this change is not necessary, and provided here to ensure a symmetrical interface like other destroy functions. Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on AH destroyLeon Romanovsky1-1/+2
Like any other IB verbs objects, AH are refcounted by ib_core. The release of those objects are controlled by ib_core with promise that AH destroy can't fail. Being SW object for now, this change makes dealloc_ah() to behave like any other destroy IB flows. Fixes: d345691471b4 ("RDMA: Handle AH allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09RDMA: Restore ability to fail on PD deallocateLeon Romanovsky1-1/+2
The IB verbs objects are counted by the kernel and ib_core ensures that deallocate PD will success so it will be called once all other objects that depends on PD will be released. This is achieved by managing various reference counters on such objects. The mlx5 driver didn't follow this standard flow when allowed DEVX objects that are not managed by ib_core to be interleaved with the ones under ib_core responsibility. In such interleaved scenarios deallocate command can fail and ib_core will leave uobject in internal DB and attempt to clean it later to free resources anyway. This change partially restores returned value from dealloc_pd() for all drivers, but keeping in mind that non-DEVX devices and kernel verbs paths shouldn't fail. Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-03RDMA/rxe: Convert tasklets to use new tasklet_setup() APIAllen Pais3-8/+8
In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Link: https://lore.kernel.org/r/20200903060637.424458-6-allen.lkml@gmail.com Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31Merge tag 'v5.9-rc3' into rdma.git for-nextJason Gunthorpe3-3/+3
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31RDMA/rxe: Address an issue with hardened user copyBob Pearson3-73/+2
Change rxe pools to use kzalloc instead of kmem_cache to allocate memory for rxe objects. The pools are not really necessary and they trigger hardened user copy warnings as the ioctl framework copies the QP number directly to userspace. Also the general project to move object alloation to the core code will eventually clean these out anyhow. Link: https://lore.kernel.org/r/20200827163535.2632-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31RDMA/rxe: Add SPDX hdrs to rxe source filesBob Pearson32-896/+32
Add SPDX headers to all rxe .c and .h files. Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27RDMA/rxe: Fix style warningsBob Pearson4-6/+4
Fixed several minor checkpatch warnings in existing rxe source. Link: https://lore.kernel.org/r/20200820224638.3212-3-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27RDMA/rxe: Fix panic when calling kmem_cache_create()Kamal Heib3-0/+11
To avoid the following kernel panic when calling kmem_cache_create() with a NULL pointer from pool_cache(), Block the rxe_param_set_add() from running if the rdma_rxe module is not initialized. BUG: unable to handle kernel NULL pointer dereference at 000000000000000b PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 4 PID: 8512 Comm: modprobe Kdump: loaded Not tainted 4.18.0-231.el8.x86_64 #1 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 10/02/2018 RIP: 0010:kmem_cache_alloc+0xd1/0x1b0 Code: 8b 57 18 45 8b 77 1c 48 8b 5c 24 30 0f 1f 44 00 00 5b 48 89 e8 5d 41 5c 41 5d 41 5e 41 5f c3 81 e3 00 00 10 00 75 0e 4d 89 fe <41> f6 47 0b 04 0f 84 6c ff ff ff 4c 89 ff e8 cc da 01 00 49 89 c6 RSP: 0018:ffffa2b8c773f9d0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000005 RDX: 0000000000000004 RSI: 00000000006080c0 RDI: 0000000000000000 RBP: ffff8ea0a8634fd0 R08: ffffa2b8c773f988 R09: 00000000006000c0 R10: 0000000000000000 R11: 0000000000000230 R12: 00000000006080c0 R13: ffffffffc0a97fc8 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f9138ed9740(0000) GS:ffff8ea4ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000b CR3: 000000046d59a000 CR4: 00000000003406e0 Call Trace: rxe_alloc+0xc8/0x160 [rdma_rxe] rxe_get_dma_mr+0x25/0xb0 [rdma_rxe] __ib_alloc_pd+0xcb/0x160 [ib_core] ib_mad_init_device+0x296/0x8b0 [ib_core] add_client_context+0x11a/0x160 [ib_core] enable_device_and_get+0xdc/0x1d0 [ib_core] ib_register_device+0x572/0x6b0 [ib_core] ? crypto_create_tfm+0x32/0xe0 ? crypto_create_tfm+0x7a/0xe0 ? crypto_alloc_tfm+0x58/0xf0 rxe_register_device+0x19d/0x1c0 [rdma_rxe] rxe_net_add+0x3d/0x70 [rdma_rxe] ? dev_get_by_name_rcu+0x73/0x90 rxe_param_set_add+0xaf/0xc0 [rdma_rxe] parse_args+0x179/0x370 ? ref_module+0x1b0/0x1b0 load_module+0x135e/0x17e0 ? ref_module+0x1b0/0x1b0 ? __do_sys_init_module+0x13b/0x180 __do_sys_init_module+0x13b/0x180 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca RIP: 0033:0x7f9137ed296e This can be triggered if a user tries to use the 'module option' which is not actually a real module option but some idiotic (and thankfully no obsolete) sysfs interface. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200825151725.254046-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27RDMA/rxe: Fix memleak in rxe_mem_init_userDinghao Liu1-0/+1
When page_address() fails, umem should be freed just like when rxe_mem_alloc() fails. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200819075632.22285-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24RDMA/rxe: prevent rxe creation on top of vlan interfaceMohammad Heib2-0/+12
Creating rxe device on top of vlan interface will create a non-functional device that has an empty gids table and can't be used for rdma cm communication. This is caused by the logic in enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(), which only considers networks connected to "upper devices" of the configured network device, resulting in an empty set of gids for a vlan interface, and attempts to connect via this rdma device fail in cm_init_av_for_response because no gids can be resolved. Apparently, this behavior was implemented to fit the HW-RoCE devices that create RoCE device per port, therefore RXE must behave the same like HW-RoCE devices and create rxe device per real device only. In order to communicate via a vlan interface, the user must use the gid index of the vlan address instead of creating rxe over vlan. Link: https://lore.kernel.org/r/20200811150415.3693-1-goody698@gmail.com Signed-off-by: Mohammad Heib <goody698@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24RDMA/rxe: Fix the parent sysfs read when the interface has 15 charsYi Zhang1-1/+1
'parent' sysfs reads will yield '\0' bytes when the interface name has 15 chars, and there will no "\n" output. To reproduce, create one interface with 15 chars: [root@test ~]# ip a s enp0s29u1u7u3c2 2: enp0s29u1u7u3c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 link/ether 02:21:28:57:47:17 brd ff:ff:ff:ff:ff:ff inet6 fe80::ac41:338f:5bcd:c222/64 scope link noprefixroute valid_lft forever preferred_lft forever [root@test ~]# modprobe rdma_rxe [root@test ~]# echo enp0s29u1u7u3c2 > /sys/module/rdma_rxe/parameters/add [root@test ~]# cat /sys/class/infiniband/rxe0/parent enp0s29u1u7u3c2[root@test ~]# [root@test ~]# f="/sys/class/infiniband/rxe0/parent" [root@test ~]# echo "$(<"$f")" -bash: warning: command substitution: ignored null byte in input enp0s29u1u7u3c2 Use scnprintf and PAGE_SIZE to fill the sysfs output buffer. Cc: stable@vger.kernel.org Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200820153646.31316-1-yi.zhang@redhat.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva3-3/+3
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-07-31RDMA/rxe: Remove pkey tableKamal Heib6-77/+13
The RoCE spec requires RoCE devices to support only the default pkey. However the rxe driver maintains a 64 enties pkey table and uses only the first entry. Remove the pkey table and hard code a table of length one hard wired with the default pkey. Replace all checks of the pkey_table with a comparison to the default_pkey instead. Link: https://lore.kernel.org/r/20200721101618.686110-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>