summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core
AgeCommit message (Collapse)AuthorFilesLines
2021-07-19RDMA/cma: Fix rdma_resolve_route() memory leakGerd Rausch1-1/+2
[ Upstream commit 74f160ead74bfe5f2b38afb4fcf86189f9ff40c9 ] Fix a memory leak when "mda_resolve_route() is called more than once on the same "rdma_cm_id". This is possible if cma_query_handler() triggers the RDMA_CM_EVENT_ROUTE_ERROR flow which puts the state machine back and allows rdma_resolve_route() to be called again. Link: https://lore.kernel.org/r/f6662b7b-bdb7-2706-1e12-47c61d3474b6@oracle.com Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14RDMA/core: Sanitize WQ state received from the userspaceLeon Romanovsky1-2/+19
[ Upstream commit f97442887275d11c88c2899e720fe945c1f61488 ] The mlx4 and mlx5 implemented differently the WQ input checks. Instead of duplicating mlx4 logic in the mlx5, let's prepare the input in the central place. The mlx5 implementation didn't check for validity of state input. It is not real bug because our FW checked that, but still worth to fix. Fixes: f213c0527210 ("IB/uverbs: Add WQ support") Link: https://lore.kernel.org/r/ac41ad6a81b095b1a8ad453dcf62cf8d3c5da779.1621413310.git.leonro@nvidia.com Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/uverbs: Fix a NULL vs IS_ERR() bugDan Carpenter1-2/+2
[ Upstream commit 463a3f66473b58d71428a1c3ce69ea52c05440e5 ] The uapi_get_object() function returns error pointers, it never returns NULL. Fixes: 149d3845f4a5 ("RDMA/uverbs: Add a method to introspect handles in a context") Link: https://lore.kernel.org/r/YJ6Got+U7lz+3n9a@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26RDMA/core: Don't access cm_id after its destructionShay Drory1-1/+3
[ Upstream commit 889d916b6f8a48b8c9489fffcad3b78eedd01a51 ] restrack should only be attached to a cm_id while the ID has a valid device pointer. It is set up when the device is first loaded, but not cleared when the device is removed. There is also two copies of the device pointer, one private and one in the public API, and these were left out of sync. Make everything go to NULL together and manipulate restrack right around the device assignments. Found by syzcaller: BUG: KASAN: wild-memory-access in __list_del include/linux/list.h:112 [inline] BUG: KASAN: wild-memory-access in __list_del_entry include/linux/list.h:135 [inline] BUG: KASAN: wild-memory-access in list_del include/linux/list.h:146 [inline] BUG: KASAN: wild-memory-access in cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 Write of size 8 at addr dead000000000108 by task syz-executor716/334 CPU: 0 PID: 334 Comm: syz-executor716 Not tainted 5.11.0+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xbe/0xf9 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 __list_del include/linux/list.h:112 [inline] __list_del_entry include/linux/list.h:135 [inline] list_del include/linux/list.h:146 [inline] cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 _destroy_id+0x29/0x460 drivers/infiniband/core/cma.c:1862 ucma_close_id+0x36/0x50 drivers/infiniband/core/ucma.c:185 ucma_destroy_private_ctx+0x58d/0x5b0 drivers/infiniband/core/ucma.c:576 ucma_close+0x91/0xd0 drivers/infiniband/core/ucma.c:1797 __fput+0x169/0x540 fs/file_table.c:280 task_work_run+0xb7/0x100 kernel/task_work.c:140 exit_task_work include/linux/task_work.h:30 [inline] do_exit+0x7da/0x17f0 kernel/exit.c:825 do_group_exit+0x9e/0x190 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x2d/0x30 kernel/exit.c:931 do_syscall_64+0x2d/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 255d0c14b375 ("RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count") Link: https://lore.kernel.org/r/3352ee288fe34f2b44220457a29bfc0548686363.1620711734.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14RDMA/addr: Be strict with gid sizeLeon Romanovsky1-1/+3
[ Upstream commit d1c803a9ccd7bd3aff5e989ccfb39ed3b799b975 ] The nla_len() is less than or equal to 16. If it's less than 16 then end of the "gid" buffer is uninitialized. Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20210405074434.264221-1-leon@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04IB/cm: Avoid a loop when device has 255 portsParav Pandit1-4/+4
[ Upstream commit 131be26750379592f0dd6244b2a90bbb504a10bb ] When RDMA device has 255 ports, loop iterator i overflows. Due to which cm_add_one() port iterator loops infinitely. Use core provided port iterator to avoid the infinite loop. Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20210127150010.1876121-9-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04IB/umad: Return EPOLLERR in case of when device disassociatedShay Drory1-0/+10
[ Upstream commit def4cd43f522253645b72c97181399c241b54536 ] Currently, polling a umad device will always works, even if the device was disassociated. A disassociated device should immediately return EPOLLERR from poll(). Otherwise userspace is endlessly hung on poll() with no idea that the device has been removed from the system. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20210125121339.837518-3-leon@kernel.org Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04IB/umad: Return EIO in case of when device disassociatedShay Drory1-1/+6
[ Upstream commit 4fc5461823c9cad547a9bdfbf17d13f0da0d6bb5 ] MAD message received by the user has EINVAL error in all flows including when the device is disassociated. That makes it impossible for the applications to treat such flow differently. Change it to return EIO, so the applications will be able to perform disassociation recovery. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20210125121339.837518-2-leon@kernel.org Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19RDMA/restrack: Don't treat as an error allocation ID wrappingLeon Romanovsky1-0/+1
commit 3c638cdb8ecc0442552156e0fed8708dd2c7f35b upstream. xa_alloc_cyclic() call returns positive number if ID allocation succeeded but wrapped. It is not an error, so normalize the "ret" variable to zero as marker of not-an-error. drivers/infiniband/core/restrack.c:261 rdma_restrack_add() warn: 'ret' can be either negative or positive Fixes: fd47c2f99f04 ("RDMA/restrack: Convert internal DB from hash to XArray") Link: https://lore.kernel.org/r/20201216100753.1127638-1-leon@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30RDMA/cma: Don't overwrite sgid_attr after device is releasedLeon Romanovsky1-3/+4
[ Upstream commit e246b7c035d74abfb3507fa10082d0c42cc016c3 ] As part of the cma_dev release, that pointer will be set to NULL. In case it happens in rdma_bind_addr() (part of an error flow), the next call to addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will overwrite sgid_attr without releasing it. WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: ib_addr process_one_req RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293 RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8 RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998 RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036 R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800 R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998 FS: 0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190 process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645 process_one_work+0x54c/0x930 kernel/workqueue.c:2272 worker_thread+0x82/0x830 kernel/workqueue.c:2418 kthread+0x1ca/0x220 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Fixes: ff11c6cd521f ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()") Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30RDMA/core: Do not indicate device ready when device enablement failsJack Morgenstein1-3/+4
[ Upstream commit 779e0bf47632c609c59f527f9711ecd3214dccb0 ] In procedure ib_register_device, procedure kobject_uevent is called (advertising that the device is ready for userspace usage) even when device_enable_and_get() returned an error. As a result, various RDMA modules attempted to register for the device even while the device driver was preparing to unregister the device. Fix this by advertising the device availability only after enabling the device succeeds. Fixes: e7a5b4aafd82 ("RDMA/device: Don't fire uevent before device is fully initialized") Link: https://lore.kernel.org/r/20201208073545.9723-3-leon@kernel.org Suggested-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewaitLeon Romanovsky1-0/+2
[ Upstream commit 340b940ea0ed12d9adbb8f72dea17d516b2019e8 ] If cm_create_timewait_info() fails, the timewait_info pointer will contain an error value and will be used in cm_remove_remote() later. general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0×0000000000000120-0×0000000000000127] CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:cm_remove_remote.isra.0+0x24/0×170 drivers/infiniband/core/cm.c:978 Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00 RSP: 0018:ffff888013127918 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000 RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4 RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70 R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c FS: 00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: cm_destroy_id+0x189/0×15b0 drivers/infiniband/core/cm.c:1155 cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline] rdma_connect_locked+0x1100/0×17c0 drivers/infiniband/core/cma.c:4107 rdma_connect+0x2a/0×40 drivers/infiniband/core/cma.c:4140 ucma_connect+0x277/0×340 drivers/infiniband/core/ucma.c:1069 ucma_write+0x236/0×2f0 drivers/infiniband/core/ucma.c:1724 vfs_write+0x220/0×830 fs/read_write.c:603 ksys_write+0x1df/0×240 fs/read_write.c:658 do_syscall_64+0x33/0×40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reported-by: Amit Matityahu <mitm@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-01RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel()Jason Gunthorpe1-6/+5
commit 2ee9bf346fbfd1dad0933b9eb3a4c2c0979b633e upstream. This three thread race can result in the work being run once the callback becomes NULL: CPU1 CPU2 CPU3 netevent_callback() process_one_req() rdma_addr_cancel() [..] spin_lock_bh() set_timeout() spin_unlock_bh() spin_lock_bh() list_del_init(&req->list); spin_unlock_bh() req->callback = NULL spin_lock_bh() if (!list_empty(&req->list)) // Skipped! // cancel_delayed_work(&req->work); spin_unlock_bh() process_one_req() // again req->callback() // BOOM cancel_delayed_work_sync() The solution is to always cancel the work once it is completed so any in between set_timeout() does not result in it running again. Cc: stable@vger.kernel.org Fixes: 44e75052bc2a ("RDMA/rdma_cm: Make rdma_addr_cancel into a fence") Link: https://lore.kernel.org/r/20200930072007.1009692-1-leon@kernel.org Reported-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-29RDMA/cma: Consolidate the destruction of a cma_multicast in one placeJason Gunthorpe1-32/+31
[ Upstream commit 3788d2997bc0150ea911a964d5b5a2e11808a936 ] Two places were open coding this sequence, and also pull in cma_leave_roce_mc_group() which was called only once. Link: https://lore.kernel.org/r/20200902081122.745412-8-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/cma: Remove dead code for kernel rdmacm multicastJason Gunthorpe1-15/+4
[ Upstream commit 1bb5091def706732c749df9aae45fbca003696f2 ] There is no kernel user of RDMA CM multicast so this code managing the multicast subscription of the kernel-only internal QP is dead. Remove it. This makes the bug fixes in the next patches much simpler. Link: https://lore.kernel.org/r/20200902081122.745412-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()Jason Gunthorpe1-0/+6
[ Upstream commit 10c75ccb54e4fe548cb16d7ed426d7d709e6ae76 ] rdma_for_each_block() makes assumptions about how the SGL is constructed that don't work if the block size is below the page size used to to build the SGL. The rules for umem SGL construction require that the SG's all be PAGE_SIZE aligned and we don't encode the actual byte offset of the VA range inside the SGL using offset and length. So rdma_for_each_block() has no idea where the actual starting/ending point is to compute the first/last block boundary if the starting address should be within a SGL. Fixing the SGL construction turns out to be really hard, and will be the subject of other patches. For now block smaller pages. Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR") Link: https://lore.kernel.org/r/2-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/umem: Fix ib_umem_find_best_pgsz() for mappings that cross a page boundaryJason Gunthorpe1-2/+7
[ Upstream commit a40c20dabdf9045270767c75918feb67f0727c89 ] It is possible for a single SGL to span an aligned boundary, eg if the SGL is 61440 -> 90112 Then the length is 28672, which currently limits the block size to 32k. With a 32k page size the two covering blocks will be: 32768->65536 and 65536->98304 However, the correct answer is a 128K block size which will span the whole 28672 bytes in a single block. Instead of limiting based on length figure out which high IOVA bits don't change between the start and end addresses. That is the highest useful page size. Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR") Link: https://lore.kernel.org/r/1-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/ucma: Add missing locking around rdma_leave_multicast()Jason Gunthorpe1-0/+2
[ Upstream commit 38e03d092699891c3237b5aee9e8029d4ede0956 ] All entry points to the rdma_cm from a ULP must be single threaded, even this error unwinds. Add the missing locking. Fixes: 7c11910783a1 ("RDMA/ucma: Put a lock around every call to the rdma_cm layer") Link: https://lore.kernel.org/r/20200818120526.702120-11-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/ucma: Fix locking for ctx->events_reportedJason Gunthorpe1-1/+3
[ Upstream commit 98837c6c3d7285f6eca86480b6f7fac6880e27a8 ] This value is locked under the file->mut, ensure it is held whenever touching it. The case in ucma_migrate_id() is a race, while in ucma_free_uctx() it is already not possible for the write side to run, the movement is just for clarity. Fixes: 88314e4dda1e ("RDMA/cma: add support for rdma_migrate_id()") Link: https://lore.kernel.org/r/20200818120526.702120-10-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01RDMA/cm: Remove a race freeing timewait_infoJason Gunthorpe1-10/+15
[ Upstream commit bede86a39d9dc3387ac00dcb8e1ac221676b2f25 ] When creating a cm_id during REQ the id immediately becomes visible to the other MAD handlers, and shortly after the state is moved to IB_CM_REQ_RCVD This allows cm_rej_handler() to run concurrently and free the work: CPU 0 CPU1 cm_req_handler() ib_create_cm_id() cm_match_req() id_priv->state = IB_CM_REQ_RCVD cm_rej_handler() cm_acquire_id() spin_lock(&id_priv->lock) switch (id_priv->state) case IB_CM_REQ_RCVD: cm_reset_to_idle() kfree(id_priv->timewait_info); goto destroy destroy: kfree(id_priv->timewait_info); id_priv->timewait_info = NULL Causing a double free or worse. Do not free the timewait_info without also holding the id_priv->lock. Simplify this entire flow by making the free unconditional during cm_destroy_id() and removing the confusing special case error unwind during creation of the timewait_info. This also fixes a leak of the timewait if cm_destroy_id() is called in IB_CM_ESTABLISHED with an XRC TGT QP. The state machine will be left in ESTABLISHED while it needed to transition through IB_CM_TIMEWAIT to release the timewait pointer. Also fix a leak of the timewait_info if the caller mis-uses the API and does ib_send_cm_reqs(). Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20200310092545.251365-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-17RDMA/core: Fix reported speed and widthKamal Heib1-1/+1
[ Upstream commit 28b0865714b315e318ac45c4fc9156f3d4649646 ] When the returned speed from __ethtool_get_link_ksettings() is SPEED_UNKNOWN this will lead to reporting a wrong speed and width for providers that uses ib_get_eth_speed(), fix that by defaulting the netdev_speed to SPEED_1000 in case the returned value from __ethtool_get_link_ksettings() is SPEED_UNKNOWN. Fixes: d41861942fc5 ("IB/core: Add generic function to extract IB speed from netdev") Link: https://lore.kernel.org/r/20200902124304.170912-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21RDMA/counter: Allow manually bind QPs with different pids to same counterMark Zhang1-1/+1
[ Upstream commit cbeb7d896c0f296451ffa7b67e7706786b8364c8 ] In manual mode allow bind user QPs with different pids to same counter, since this is allowed in auto mode. Bind kernel QPs and user QPs to the same counter are not allowed. Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support") Link: https://lore.kernel.org/r/20200702082933.424537-4-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21RDMA/counter: Only bind user QPs in auto modeMark Zhang1-1/+1
[ Upstream commit c9f557421e505f75da4234a6af8eff46bc08614b ] In auto mode only bind user QPs to a dynamic counter, since this feature is mainly used for system statistic and diagnostic purpose, while there's no need to counter kernel QPs so far. Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support") Link: https://lore.kernel.org/r/20200702082933.424537-3-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21IB/uverbs: Set IOVA on IB MR in uverbs layerYishai Hadas1-0/+4
[ Upstream commit 04c0a5fcfcf65aade2fb238b6336445f1a99b646 ] Set IOVA on IB MR in uverbs layer to let all drivers have it, this includes both reg/rereg MR flows. As part of this change cleaned-up this setting from the drivers that already did it by themselves in their user flows. Fixes: e6f0330106f4 ("mlx4_ib: set user mr attributes in struct ib_mr") Link: https://lore.kernel.org/r/20200630093916.332097-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QPMark Zhang1-3/+0
[ Upstream commit 1d70ad0f85435a7262de802b104e49e6598c50ff ] When dumping QPs bound to a counter, raw QPs should be allowed to dump without the CAP_NET_RAW privilege. This is consistent with what "rdma res show qp" does. Fixes: c4ffee7c9bdb ("RDMA/netlink: Implement counter dumpit calback") Link: https://lore.kernel.org/r/20200727095828.496195-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19RDMA/core: Fix return error value in _ib_modify_qp() to negativeLi Heng1-1/+1
[ Upstream commit 47fda651d5af2506deac57d54887cf55ce26e244 ] The error codes in _ib_modify_qp() are supposed to be negative errno. Fixes: 7a5c938b9ed0 ("IB/core: Check for rdma_protocol_ib only after validating port_num") Link: https://lore.kernel.org/r/1595645787-20375-1-git-send-email-liheng40@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Li Heng <liheng40@huawei.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19RDMA/core: Fix bogus WARN_ON during ib_unregister_device_queued()Jason Gunthorpe1-3/+8
[ Upstream commit 0cb42c0265837fafa2b4f302c8a7fed2631d7869 ] ib_unregister_device_queued() can only be used by drivers using the new dealloc_device callback flow, and it has a safety WARN_ON to ensure drivers are using it properly. However, if unregister and register are raced there is a special destruction path that maintains the uniform error handling semantic of 'caller does ib_dealloc_device() on failure'. This requires disabling the dealloc_device callback which triggers the WARN_ON. Instead of using NULL to disable the callback use a special function pointer so the WARN_ON does not trigger. Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration") Link: https://lore.kernel.org/r/0-v1-a36d512e0a99+762-syz_dealloc_driver_jgg@nvidia.com Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com Suggested-by: Hillf Danton <hdanton@sina.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16IB/sa: Resolv use-after-free in ib_nl_make_request()Divya Indi1-21/+17
[ Upstream commit f427f4d6214c183c474eeb46212d38e6c7223d6a ] There is a race condition where ib_nl_make_request() inserts the request data into the linked list but the timer in ib_nl_request_timeout() can see it and destroy it before ib_nl_send_msg() is done touching it. This could happen, for instance, if there is a long delay allocating memory during nlmsg_new() This causes a use-after-free in the send_mad() thread: [<ffffffffa02f43cb>] ? ib_pack+0x17b/0x240 [ib_core] [ <ffffffffa032aef1>] ib_sa_path_rec_get+0x181/0x200 [ib_sa] [<ffffffffa0379db0>] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm] [<ffffffffa0374450>] ? cma_bind_port+0xa0/0xa0 [rdma_cm] [<ffffffffa040f850>] ? rds_rdma_cm_event_handler_cmn+0x850/0x850 [rds_rdma] [<ffffffffa040f22c>] rds_rdma_cm_event_handler_cmn+0x22c/0x850 [rds_rdma] [<ffffffffa040f860>] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma] [<ffffffffa037778e>] addr_handler+0x9e/0x140 [rdma_cm] [<ffffffffa026cdb4>] process_req+0x134/0x190 [ib_addr] [<ffffffff810a02f9>] process_one_work+0x169/0x4a0 [<ffffffff810a0b2b>] worker_thread+0x5b/0x560 [<ffffffff810a0ad0>] ? flush_delayed_work+0x50/0x50 [<ffffffff810a68fb>] kthread+0xcb/0xf0 [<ffffffff816ec49a>] ? __schedule+0x24a/0x810 [<ffffffff816ec49a>] ? __schedule+0x24a/0x810 [<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180 [<ffffffff816f25a7>] ret_from_fork+0x47/0x90 [<ffffffff810a6830>] ? kthread_create_on_node+0x180/0x180 The ownership rule is once the request is on the list, ownership transfers to the list and the local thread can't touch it any more, just like for the normal MAD case in send_mad(). Thus, instead of adding before send and then trying to delete after on errors, move the entire thing under the spinlock so that the send and update of the lists are atomic to the conurrent threads. Lightly reoganize things so spinlock safe memory allocations are done in the final NL send path and the rest of the setup work is done before and outside the lock. Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list before sending") Link: https://lore.kernel.org/r/1592964789-14533-1-git-send-email-divya.indi@oracle.com Signed-off-by: Divya Indi <divya.indi@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09RDMA/counter: Query a counter before releaseMark Zhang1-1/+3
[ Upstream commit c1d869d64a1955817c4d6fff08ecbbe8e59d36f8 ] Query a dynamically-allocated counter before release it, to update it's hwcounters and log all of them into history data. Otherwise all values of these hwcounters will be lost. Fixes: f34a55e497e8 ("RDMA/core: Get sum value of all counters when perform a sysfs stat read") Link: https://lore.kernel.org/r/20200621110000.56059-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()Fan Guo1-0/+1
[ Upstream commit a17f4bed811c60712d8131883cdba11a105d0161 ] If ib_dma_mapping_error() returns non-zero value, ib_mad_post_receive_mads() will jump out of loops and return -ENOMEM without freeing mad_priv. Fix this memory-leak problem by freeing mad_priv in this case. Fixes: 2c34e68f4261 ("IB/mad: Check and handle potential DMA mapping errors") Link: https://lore.kernel.org/r/20200612063824.180611-1-guofan5@huawei.com Signed-off-by: Fan Guo <guofan5@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30RDMA/cma: Protect bind_list and listen_list while finding matching cm idMark Zhang1-0/+18
[ Upstream commit 730c8912484186d4623d0c76509066d285c3a755 ] The bind_list and listen_list must be accessed under a lock, add the missing locking around the access in cm_ib_id_from_event() In addition add lockdep asserts to make it clearer what the locking semantic is here. general protection fault: 0000 [#1] SMP NOPTI CPU: 226 PID: 126135 Comm: kworker/226:1 Tainted: G OE 4.12.14-150.47-default #1 SLE15 Hardware name: Cray Inc. Windom/Windom, BIOS 0.8.7 01-10-2020 Workqueue: ib_cm cm_work_handler [ib_cm] task: ffff9c5a60a1d2c0 task.stack: ffffc1d91f554000 RIP: 0010:cma_ib_req_handler+0x3f1/0x11b0 [rdma_cm] RSP: 0018:ffffc1d91f557b40 EFLAGS: 00010286 RAX: deacffffffffff30 RBX: 0000000000000001 RCX: ffff9c2af5bb6000 RDX: 00000000000000a9 RSI: ffff9c5aa4ed2f10 RDI: ffffc1d91f557b08 RBP: ffffc1d91f557d90 R08: ffff9c340cc80000 R09: ffff9c2c0f901900 R10: 0000000000000000 R11: 0000000000000001 R12: deacffffffffff30 R13: ffff9c5a48aeec00 R14: ffffc1d91f557c30 R15: ffff9c5c2eea3688 FS: 0000000000000000(0000) GS:ffff9c5c2fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002b5cc03fa320 CR3: 0000003f8500a000 CR4: 00000000003406e0 Call Trace: ? rdma_addr_cancel+0xa0/0xa0 [ib_core] ? cm_process_work+0x28/0x140 [ib_cm] cm_process_work+0x28/0x140 [ib_cm] ? cm_get_bth_pkey.isra.44+0x34/0xa0 [ib_cm] cm_work_handler+0xa06/0x1a6f [ib_cm] ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to+0x7c/0x4b0 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 process_one_work+0x1da/0x400 worker_thread+0x2b/0x3f0 ? process_one_work+0x400/0x400 kthread+0x118/0x140 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x22/0x40 Code: 00 66 83 f8 02 0f 84 ca 05 00 00 49 8b 84 24 d0 01 00 00 48 85 c0 0f 84 68 07 00 00 48 2d d0 01 00 00 49 89 c4 0f 84 59 07 00 00 <41> 0f b7 44 24 20 49 8b 77 50 66 83 f8 0a 75 9e 49 8b 7c 24 28 Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Link: https://lore.kernel.org/r/20200616104304.2426081-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30IB/mad: Fix use after free when destroying MAD agentShay Drory1-1/+1
commit 116a1b9f1cb769b83e5adff323f977a62b1dcb2e upstream. Currently, when RMPP MADs are processed while the MAD agent is destroyed, it could result in use after free of rmpp_recv, as decribed below: cpu-0 cpu-1 ----- ----- ib_mad_recv_done() ib_mad_complete_recv() ib_process_rmpp_recv_wc() unregister_mad_agent() ib_cancel_rmpp_recvs() cancel_delayed_work() process_rmpp_data() start_rmpp() queue_delayed_work(rmpp_recv->cleanup_work) destroy_rmpp_recv() free_rmpp_recv() cleanup_work()[1] spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free [1] cleanup_work() == recv_cleanup_handler Fix it by waiting for the MAD agent reference count becoming zero before calling to ib_cancel_rmpp_recvs(). Fixes: 9a41e38a467c ("IB/mad: Use IDR for agent IDs") Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24IB/cma: Fix ports memory leak in cma_configfsMaor Gottlieb1-0/+13
[ Upstream commit 63a3345c2d42a9b29e1ce2d3a4043689b3995cea ] The allocated ports structure in never freed. The free function should be called by release_cma_ports_group, but the group is never released since we don't remove its default group. Remove default groups when device group is deleted. Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm") Link: https://lore.kernel.org/r/20200521072650.567908-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24RDMA/core: Fix several reference count leaks.Qiushi Wu1-5/+5
[ Upstream commit 0b8e125e213204508e1b3c4bdfe69713280b7abd ] kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject") fixed a similar problem. Link: https://lore.kernel.org/r/20200528030231.9082-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-17RDMA/uverbs: Make the event_queue fds return POLLERR when disassociatedJason Gunthorpe1-0/+2
[ Upstream commit eb356e6dc15a30af604f052cd0e170450193c254 ] If is_closed is set, and the event list is empty, then read() will return -EIO without blocking. After setting is_closed in ib_uverbs_free_event_queue(), we do trigger a wake_up on the poll_wait, but the fops->poll() function does not check it, so poll will continue to sleep on an empty list. Fixes: 14e23bd6d221 ("RDMA/core: Fix locking in ib_uverbs_event_read") Link: https://lore.kernel.org/r/0-v1-ace813388969+48859-uverbs_poll_fix%25jgg@mellanox.com Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03RDMA/core: Fix double destruction of uobjectJason Gunthorpe1-7/+13
[ Upstream commit c85f4abe66bea0b5db8d28d55da760c4fe0a0301 ] Fix use after free when user user space request uobject concurrently for the same object, within the RCU grace period. In that case, remove_handle_idr_uobject() is called twice and we will have an extra put on the uobject which cause use after free. Fix it by leaving the uobject write locked after it was removed from the idr. Call to rdma_lookup_put_uobject with UVERBS_LOOKUP_DESTROY instead of UVERBS_LOOKUP_WRITE will do the work. refcount_t: underflow; use-after-free. WARNING: CPU: 0 PID: 1381 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 1381 Comm: syz-executor.0 Not tainted 5.5.0-rc3 #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x94/0xce panic+0x234/0x56f __warn+0x1cc/0x1e1 report_bug+0x200/0x310 fixup_bug.part.11+0x32/0x80 do_error_trap+0xd3/0x100 do_invalid_op+0x31/0x40 invalid_op+0x1e/0x30 RIP: 0010:refcount_warn_saturate+0xfe/0x1a0 Code: 0f 0b eb 9b e8 23 f6 6d ff 80 3d 6c d4 19 03 00 75 8d e8 15 f6 6d ff 48 c7 c7 c0 02 55 bd c6 05 57 d4 19 03 01 e8 a2 58 49 ff <0f> 0b e9 6e ff ff ff e8 f6 f5 6d ff 80 3d 42 d4 19 03 00 0f 85 5c RSP: 0018:ffffc90002df7b98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88810f6a193c RCX: ffffffffba649009 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88811b0283cc RBP: 0000000000000003 R08: ffffed10236060e3 R09: ffffed10236060e3 R10: 0000000000000001 R11: ffffed10236060e2 R12: ffff88810f6a193c R13: ffffc90002df7d60 R14: 0000000000000000 R15: ffff888116ae6a08 uverbs_uobject_put+0xfd/0x140 __uobj_perform_destroy+0x3d/0x60 ib_uverbs_close_xrcd+0x148/0x170 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465b49 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f759d122c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bfa8 RCX: 0000000000465b49 RDX: 000000000000000c RSI: 0000000020000080 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f759d1236bc R13: 00000000004ca27c R14: 000000000070de40 R15: 00000000ffffffff Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x39400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Fixes: 7452a3c745a2 ("IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate") Link: https://lore.kernel.org/r/20200527135534.482279-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20RDMA/core: Fix double put of resourceSasha Levin1-1/+1
[ Upstream commit 50bbe3d34fea74b7c0fabe553c40c2f4a48bb9c3 ] Do not decrease the reference count of resource tracker object twice in the error flow of res_get_common_doit. Fixes: c5dfe0ea6ffa ("RDMA/nldev: Add resource tracker doit callback") Link: https://lore.kernel.org/r/20200507062942.98305-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/core: Fix potential NULL pointer dereference in pkey cacheJack Morgenstein1-2/+5
[ Upstream commit 1901b91f99821955eac2bd48fe25ee983385dc00 ] The IB core pkey cache is populated by procedure ib_cache_update(). Initially, the pkey cache pointer is NULL. ib_cache_update allocates a buffer and populates it with the device's pkeys, via repeated calls to procedure ib_query_pkey(). If there is a failure in populating the pkey buffer via ib_query_pkey(), ib_cache_update does not replace the old pkey buffer cache with the updated one -- it leaves the old cache as is. Since initially the pkey buffer cache is NULL, when calling ib_cache_update the first time, a failure in ib_query_pkey() will cause the pkey buffer cache pointer to remain NULL. In this situation, any calls subsequent to ib_get_cached_pkey(), ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to dereference the NULL pkey cache pointer, causing a kernel panic. Fix this by checking the ib_cache_update() return value. Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/20200507071012.100594-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-06RDMA/cm: Fix an error check in cm_alloc_id_priv()Dan Carpenter1-1/+1
commit 983653515849fb56b78ce55d349bb384d43030f6 upstream. The xa_alloc_cyclic_irq() function returns either 0 or 1 on success and negatives on error. This code treats 1 as an error and returns ERR_PTR(1) which will cause an Oops in the caller. Fixes: ae78ff3a0f0c ("RDMA/cm: Convert local_id_table to XArray") Link: https://lore.kernel.org/r/20200407093714.GA80285@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id()Jason Gunthorpe1-16/+11
commit e8dc4e885c459343970b25acd9320fe9ee5492e7 upstream. xa_alloc_cyclic() is a SMP release to be paired with some later acquire during xa_load() as part of cm_acquire_id(). As such, xa_alloc_cyclic() must be done after the cm_id is fully initialized, in particular, it absolutely must be after the refcount_set(), otherwise the refcount_inc() in cm_acquire_id() may not see the set. As there are several cases where a reader will be able to use the id.local_id after cm_acquire_id in the IB_CM_IDLE state there needs to be an unfortunate split into a NULL allocate and a finalizing xa_store. Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation") Link: https://lore.kernel.org/r/20200310092545.251365-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/core: Fix race between destroy and release FD objectLeon Romanovsky1-1/+1
commit f0abc761bbb9418876cc4d1ebc473e4ea6352e42 upstream. The call to ->lookup_put() was too early and it caused an unlock of the read/write protection of the uobject after the FD was put. This allows a race: CPU1 CPU2 rdma_lookup_put_uobject() lookup_put_fd_uobject() fput() fput() uverbs_uobject_fd_release() WARN_ON(uverbs_try_lock_object(uobj, UVERBS_LOOKUP_WRITE)); atomic_dec(usecnt) Fix the code by changing the order, first unlock and call to ->lookup_put() after that. Fixes: 3832125624b7 ("IB/core: Add support for idr types") Link: https://lore.kernel.org/r/20200423060122.6182-1-leon@kernel.org Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-06RDMA/core: Prevent mixed use of FDs between shared ufilesLeon Romanovsky1-1/+1
commit 0fb00941dc63990a10951146df216fc7b0e20bc2 upstream. FDs can only be used on the ufile that created them, they cannot be mixed to other ufiles. We are lacking a check to prevent it. BUG: KASAN: null-ptr-deref in atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline] BUG: KASAN: null-ptr-deref in atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline] BUG: KASAN: null-ptr-deref in fput_many+0x1a/0x140 fs/file_table.c:336 Write of size 8 at addr 0000000000000038 by task syz-executor179/284 CPU: 0 PID: 284 Comm: syz-executor179 Not tainted 5.5.0-rc5+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x94/0xce lib/dump_stack.c:118 __kasan_report+0x18f/0x1b7 mm/kasan/report.c:510 kasan_report+0xe/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x15d/0x1b0 mm/kasan/generic.c:192 atomic64_sub_and_test include/asm-generic/atomic-instrumented.h:1547 [inline] atomic_long_sub_and_test include/asm-generic/atomic-long.h:460 [inline] fput_many+0x1a/0x140 fs/file_table.c:336 rdma_lookup_put_uobject+0x85/0x130 drivers/infiniband/core/rdma_core.c:692 uobj_put_read include/rdma/uverbs_std_types.h:96 [inline] _ib_uverbs_lookup_comp_file drivers/infiniband/core/uverbs_cmd.c:198 [inline] create_cq+0x375/0xba0 drivers/infiniband/core/uverbs_cmd.c:1006 ib_uverbs_create_cq+0x114/0x140 drivers/infiniband/core/uverbs_cmd.c:1089 ib_uverbs_write+0xaa5/0xdf0 drivers/infiniband/core/uverbs_main.c:769 __vfs_write+0x7c/0x100 fs/read_write.c:494 vfs_write+0x168/0x4a0 fs/read_write.c:558 ksys_write+0xc8/0x200 fs/read_write.c:611 do_syscall_64+0x9c/0x390 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x44ef99 Code: 00 b8 00 01 00 00 eb e1 e8 74 1c 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c4 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc0b74c028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007ffc0b74c030 RCX: 000000000044ef99 RDX: 0000000000000040 RSI: 0000000020000040 RDI: 0000000000000005 RBP: 00007ffc0b74c038 R08: 0000000000401830 R09: 0000000000401830 R10: 00007ffc0b74c038 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00000000006be018 R15: 0000000000000000 Fixes: cf8966b3477d ("IB/core: Add support for fd objects") Link: https://lore.kernel.org/r/20200421082929.311931-2-leon@kernel.org Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/cm: Update num_paths in cma_resolve_iboe_route error flowAvihai Horon1-0/+1
commit 987914ab841e2ec281a35b54348ab109b4c0bb4e upstream. After a successful allocation of path_rec, num_paths is set to 1, but any error after such allocation will leave num_paths uncleared. This causes to de-referencing a NULL pointer later on. Hence, num_paths needs to be set back to 0 if such an error occurs. The following crash from syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 0 PID: 357 Comm: syz-executor060 Not tainted 4.18.0+ #311 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ib_copy_path_rec_to_user+0x94/0x3e0 Code: f1 f1 f1 f1 c7 40 0c 00 00 f4 f4 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 e8 d7 60 24 ff 48 8d 7b 4c 48 89 f8 48 c1 e8 03 <42> 0f b6 14 30 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 RSP: 0018:ffff88006586f980 EFLAGS: 00010207 RAX: 0000000000000009 RBX: 0000000000000000 RCX: 1ffff1000d5fe475 RDX: ffff8800621e17c0 RSI: ffffffff820d45f9 RDI: 000000000000004c RBP: ffff88006586fa50 R08: ffffed000cb0df73 R09: ffffed000cb0df72 R10: ffff88006586fa70 R11: ffffed000cb0df73 R12: 1ffff1000cb0df30 R13: ffff88006586fae8 R14: dffffc0000000000 R15: ffff88006aff2200 FS: 00000000016fc880(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000040 CR3: 0000000063fec000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ib_copy_path_rec_from_user+0xcc0/0xcc0 ? __mutex_unlock_slowpath+0xfc/0x670 ? wait_for_completion+0x3b0/0x3b0 ? ucma_query_route+0x818/0xc60 ucma_query_route+0x818/0xc60 ? ucma_listen+0x1b0/0x1b0 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 ? ucma_listen+0x1b0/0x1b0 ? ucma_write+0x292/0x460 ucma_write+0x292/0x460 ? ucma_close_id+0x60/0x60 ? sched_clock_cpu+0x18/0x1d0 ? sched_clock_cpu+0x18/0x1d0 __vfs_write+0xf7/0x620 ? ucma_close_id+0x60/0x60 ? kernel_read+0x110/0x110 ? time_hardirqs_on+0x19/0x580 ? lock_acquire+0x18b/0x3a0 ? finish_task_switch+0xf3/0x5d0 ? _raw_spin_unlock_irq+0x29/0x40 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1be/0x5d0 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? security_file_permission+0x172/0x1e0 vfs_write+0x192/0x460 ksys_write+0xc6/0x1a0 ? __ia32_sys_read+0xb0/0xb0 ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe ? do_syscall_64+0x1d/0x470 do_syscall_64+0x9e/0x470 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Link: https://lore.kernel.org/r/20200318101741.47211-1-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/cma: Teach lockdep about the order of rtnl and lockJason Gunthorpe1-0/+13
commit 32ac9e4399b12d3e54d312a0e0e30ed5cd19bd4e upstream. This lock ordering only happens when bonding is enabled and a certain bonding related event fires. However, since it can happen this is a global restriction on lock ordering. Teach lockdep about the order directly and unconditionally so bugs here are found quickly. See https://syzkaller.appspot.com/bug?extid=55de90ab5f44172b0c90 Link: https://lore.kernel.org/r/20200227203651.GA27185@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13RDMA/ucma: Put a lock around every call to the rdma_cm layerJason Gunthorpe1-2/+47
commit 7c11910783a1ea17e88777552ef146cace607b3c upstream. The rdma_cm must be used single threaded. This appears to be a bug in the design, as it does have lots of locking that seems like it should allow concurrency. However, when it is all said and done every single place that uses the cma_exch() scheme is broken, and all the unlocked reads from the ucma of the cm_id data are wrong too. syzkaller has been finding endless bugs related to this. Fixing this in any elegant way is some enormous amount of work. Take a very big hammer and put a mutex around everything to do with the ucma_context at the top of every syscall. Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace") Link: https://lore.kernel.org/r/20200218210432.GA31966@ziepe.ca Reported-by: syzbot+adb15cf8c2798e4e0db4@syzkaller.appspotmail.com Reported-by: syzbot+e5579222b6a3edd96522@syzkaller.appspotmail.com Reported-by: syzbot+4b628fcc748474003457@syzkaller.appspotmail.com Reported-by: syzbot+29ee8f76017ce6cf03da@syzkaller.appspotmail.com Reported-by: syzbot+6956235342b7317ec564@syzkaller.appspotmail.com Reported-by: syzbot+b358909d8d01556b790b@syzkaller.appspotmail.com Reported-by: syzbot+6b46b135602a3f3ac99e@syzkaller.appspotmail.com Reported-by: syzbot+8458d13b13562abf6b77@syzkaller.appspotmail.com Reported-by: syzbot+bd034f3fdc0402e942ed@syzkaller.appspotmail.com Reported-by: syzbot+c92378b32760a4eef756@syzkaller.appspotmail.com Reported-by: syzbot+68b44a1597636e0b342c@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01RDMA/core: Ensure security pkey modify is not lostMike Marciniszyn1-8/+3
commit 2d47fbacf2725a67869f4d3634c2415e7dfab2f4 upstream. The following modify sequence (loosely based on ipoib) will lose a pkey modifcation: - Modify (pkey index, port) - Modify (new pkey index, NO port) After the first modify, the qp_pps list will have saved the pkey and the unit on the main list. During the second modify, get_new_pps() will fetch the port from qp_pps and read the new pkey index from qp_attr->pkey_index. The state will still be zero, or IB_PORT_PKEY_NOT_VALID. Because of the invalid state, the new values will never replace the one in the qp pps list, losing the new pkey. This happens because the following if statements will never correct the state because the first term will be false. If the code had been executed, it would incorrectly overwrite valid values. if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT)) new_pps->main.state = IB_PORT_PKEY_VALID; if (!(qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) && qp_pps) { new_pps->main.port_num = qp_pps->main.port_num; new_pps->main.pkey_index = qp_pps->main.pkey_index; if (qp_pps->main.state != IB_PORT_PKEY_NOT_VALID) new_pps->main.state = IB_PORT_PKEY_VALID; } Fix by joining the two if statements with an or test to see if qp_pps is non-NULL and in the correct state. Fixes: 1dd017882e01 ("RDMA/core: Fix protection fault in get_pkey_idx_qp_list") Link: https://lore.kernel.org/r/20200313124704.14982.55907.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01RDMA/mad: Do not crash if the rdma device does not have a umad interfaceJason Gunthorpe1-11/+22
commit 5bdfa854013ce4193de0d097931fd841382c76a7 upstream. Non-IB devices do not have a umad interface and the client_data will be left set to NULL. In this case calling get_nl_info() will try to kref a NULL cdev causing a crash: general protection fault, probably for non-canonical address 0xdffffc00000000ba: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000005d0-0x00000000000005d7] CPU: 0 PID: 20851 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:kobject_get+0x35/0x150 lib/kobject.c:640 Code: 53 e8 3f b0 8b f9 4d 85 e4 0f 84 a2 00 00 00 e8 31 b0 8b f9 49 8d 7c 24 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f b6 04 02 48 89 fa +83 e2 07 38 d0 7f 08 84 c0 0f 85 eb 00 00 00 RSP: 0018:ffffc9000946f1a0 EFLAGS: 00010203 RAX: dffffc0000000000 RBX: ffffffff85bdbbb0 RCX: ffffc9000bf22000 RDX: 00000000000000ba RSI: ffffffff87e9d78f RDI: 00000000000005d4 RBP: ffffc9000946f1b8 R08: ffff8880581a6440 R09: ffff8880581a6cd0 R10: fffffbfff154b838 R11: ffffffff8aa5c1c7 R12: 0000000000000598 R13: 0000000000000000 R14: ffffc9000946f278 R15: ffff88805cb0c4d0 FS: 00007faa9e8af700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b30121000 CR3: 000000004515d000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: get_device+0x25/0x40 drivers/base/core.c:2574 __ib_get_client_nl_info+0x205/0x2e0 drivers/infiniband/core/device.c:1861 ib_get_client_nl_info+0x35/0x180 drivers/infiniband/core/device.c:1881 nldev_get_chardev+0x575/0xac0 drivers/infiniband/core/nldev.c:1621 rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline] rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Cc: stable@kernel.org Fixes: 8f71bb0030b8 ("RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV") Link: https://lore.kernel.org/r/20200310075339.238090-1-leon@kernel.org Reported-by: syzbot+46fe08363dbba223dec5@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01RDMA/nl: Do not permit empty devices names during RDMA_NLDEV_CMD_NEWLINK/SETJason Gunthorpe1-1/+5
commit 7aefa6237cfe4a6fcf06a8656eee988b36f8fefc upstream. Empty device names cannot be added to sysfs and crash with: kobject: (00000000f9de3792): attempted to be registered with empty name! WARNING: CPU: 1 PID: 10856 at lib/kobject.c:234 kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 10856 Comm: syz-executor459 Not tainted 5.6.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x3e kernel/panic.c:582 report_bug+0x289/0x300 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:174 [inline] fixup_bug arch/x86/kernel/traps.c:169 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234 Code: 7a ca ca f9 e9 f0 f8 ff ff 4c 89 f7 e8 cd ca ca f9 e9 95 f9 ff ff e8 13 25 8c f9 4c 89 e6 48 c7 c7 a0 08 1a 89 e8 a3 76 5c f9 <0f> 0b 41 bd ea ff ff ff e9 52 ff ff ff e8 f2 24 8c f9 0f 0b e8 eb RSP: 0018:ffffc90002006eb0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff815eae46 RDI: fffff52000400dc8 RBP: ffffc90002006f08 R08: ffff8880972ac500 R09: ffffed1015d26659 R10: ffffed1015d26658 R11: ffff8880ae9332c7 R12: ffff888093034668 R13: 0000000000000000 R14: ffff8880a69d7600 R15: 0000000000000001 kobject_add_varg lib/kobject.c:390 [inline] kobject_add+0x150/0x1c0 lib/kobject.c:442 device_add+0x3be/0x1d00 drivers/base/core.c:2412 ib_register_device drivers/infiniband/core/device.c:1371 [inline] ib_register_device+0x93e/0xe40 drivers/infiniband/core/device.c:1343 rxe_register_device+0x52e/0x655 drivers/infiniband/sw/rxe/rxe_verbs.c:1231 rxe_add+0x122b/0x1661 drivers/infiniband/sw/rxe/rxe.c:302 rxe_net_add+0x91/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:539 rxe_newlink+0x39/0x90 drivers/infiniband/sw/rxe/rxe.c:318 nldev_newlink+0x28a/0x430 drivers/infiniband/core/nldev.c:1538 rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline] rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Prevent empty names when checking the name provided from userspace during newlink and rename. Fixes: 3856ec4b93c9 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support") Fixes: 05d940d3a3ec ("RDMA/nldev: Allow IB device rename through RDMA netlink") Cc: stable@kernel.org Link: https://lore.kernel.org/r/20200309191648.GA30852@ziepe.ca Reported-and-tested-by: syzbot+da615ac67d4dbea32cbc@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01RDMA/core: Fix missing error check on dev_set_name()Jason Gunthorpe1-1/+3
commit f2f2b3bbf0d9f8d090b9a019679223b2bd1c66c4 upstream. If name memory allocation fails the name will be left empty and device_add_one() will crash: kobject: (0000000004952746): attempted to be registered with empty name! WARNING: CPU: 0 PID: 329 at lib/kobject.c:234 kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 329 Comm: syz-executor.5 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x3e kernel/panic.c:582 report_bug+0x289/0x300 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:174 [inline] fixup_bug arch/x86/kernel/traps.c:169 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234 Code: 1a 98 ca f9 e9 f0 f8 ff ff 4c 89 f7 e8 6d 98 ca f9 e9 95 f9 ff ff e8 c3 f0 8b f9 4c 89 e6 48 c7 c7 a0 0e 1a 89 e8 e3 41 5c f9 <0f> 0b 41 bd ea ff ff ff e9 52 ff ff ff e8 a2 f0 8b f9 0f 0b e8 9b RSP: 0018:ffffc90005b27908 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815eae46 RDI: fffff52000b64f13 RBP: ffffc90005b27960 R08: ffff88805aeba480 R09: ffffed1015d06659 R10: ffffed1015d06658 R11: ffff8880ae8332c7 R12: ffff8880a37fd000 R13: 0000000000000000 R14: ffff888096691780 R15: 0000000000000001 kobject_add_varg lib/kobject.c:390 [inline] kobject_add+0x150/0x1c0 lib/kobject.c:442 device_add+0x3be/0x1d00 drivers/base/core.c:2412 add_one_compat_dev drivers/infiniband/core/device.c:901 [inline] add_one_compat_dev+0x46a/0x7e0 drivers/infiniband/core/device.c:857 rdma_dev_init_net+0x2eb/0x490 drivers/infiniband/core/device.c:1120 ops_init+0xb3/0x420 net/core/net_namespace.c:137 setup_net+0x2d5/0x8b0 net/core/net_namespace.c:327 copy_net_ns+0x29e/0x5a0 net/core/net_namespace.c:468 create_new_namespaces+0x403/0xb50 kernel/nsproxy.c:108 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:229 ksys_unshare+0x444/0x980 kernel/fork.c:2955 __do_sys_unshare kernel/fork.c:3023 [inline] __se_sys_unshare kernel/fork.c:3021 [inline] __x64_sys_unshare+0x31/0x40 kernel/fork.c:3021 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Link: https://lore.kernel.org/r/20200309193200.GA10633@ziepe.ca Cc: stable@kernel.org Fixes: 4e0f7b907072 ("RDMA/core: Implement compat device/sysfs tree in net namespace") Reported-by: syzbot+ab4dae63f7d310641ded@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()Jason Gunthorpe1-0/+1
commit c14dfddbd869bf0c2bafb7ef260c41d9cebbcfec upstream. The algorithm pre-allocates a cm_id since allocation cannot be done while holding the cm.lock spinlock, however it doesn't free it on one error path, leading to a memory leak. Fixes: 067b171b8679 ("IB/cm: Share listening CM IDs") Link: https://lore.kernel.org/r/20200221152023.GA8680@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>