summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)AuthorFilesLines
2020-10-29RDMA/hns: Set the unsupported wr opcodeLijun Ou1-1/+0
[ Upstream commit 22d3e1ed2cc837af87f76c3c8a4ccf4455e225c5 ] hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to an invalid value instead of being left uninitialized. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Fixes: a2f3d4479fe9 ("RDMA/hns: Avoid unncessary initialization") Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/qedr: Fix use of uninitialized fieldMichal Kalderon1-1/+1
[ Upstream commit a379ad54e55a12618cae7f6333fd1b3071de9606 ] dev->attr.page_size_caps was used uninitialized when setting device attributes Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init") Link: https://lore.kernel.org/r/20200902165741.8355-4-michal.kalderon@marvell.com Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29IB/mlx4: Adjust delayed work when a dup is observedHåkon Bugge1-0/+3
[ Upstream commit 785167a114855c5aa75efca97000e405c2cc85bf ] When scheduling delayed work to clean up the cache, if the entry already has been scheduled for deletion, we adjust the delay. Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200803061941.1139994-7-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29IB/mlx4: Fix starvation in paravirt mux/demuxHåkon Bugge2-3/+33
[ Upstream commit 7fd1507df7cee9c533f38152fcd1dd769fcac6ce ] The mlx4 driver will proxy MAD packets through the PF driver. A VM or an instantiated VF will send its MAD packets to the PF driver using loop-back. The PF driver will be informed by an interrupt, but defer the handling and polling of CQEs to a worker thread running on an ordered work-queue. Consider the following scenario: the VMs will in short proximity in time, for example due to a network event, send many MAD packets to the PF driver. Lets say there are K VMs, each sending N packets. The interrupt from the first VM will start the worker thread, which will poll N CQEs. A common case here is where the PF driver will multiplex the packets received from the VMs out on the wire QP. But before the wire QP has returned a send CQE and associated interrupt, the other K - 1 VMs have sent their N packets as well. The PF driver has to multiplex K * N packets out on the wire QP. But the send-queue on the wire QP has a finite capacity. So, in this scenario, if K * N is larger than the send-queue capacity of the wire QP, we will get MAD packets dropped on the floor with this dynamic debug message: mlx4_ib_multiplex_mad: failed sending GSI to wire on behalf of slave 2 (-11) and this despite the fact that the wire send-queue could have capacity, but the PF driver isn't aware, because the wire send CQEs have not yet been polled. We can also have a similar scenario inbound, with a wire recv-queue larger than the tunnel QP's send-queue. If many remote peers send MAD packets to the very same VM, the tunnel send-queue destined to the VM could allegedly be construed to be full by the PF driver. This starvation is fixed by introducing separate work queues for the wire QPs vs. the tunnel QPs. With this fix, using a dual ported HCA, 8 VFs instantiated, we could run cmtime on each of the 18 interfaces towards a similar configured peer, each cmtime instance with 800 QPs (all in all 14400 QPs) without a single CM packet getting lost. Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200803061941.1139994-5-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01RDMA/iw_cgxb4: Fix an error handling path in 'c4iw_connect()'Christophe JAILLET1-2/+2
[ Upstream commit 9067f2f0b41d7e817fc8c5259bab1f17512b0147 ] We should jump to fail3 in order to undo the 'xa_insert_irq()' call. Link: https://lore.kernel.org/r/20190923190746.10964-1-christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01RDMA/i40iw: Fix potential use after freePan Bian1-1/+1
[ Upstream commit da046d5f895fca18d63b15ac8faebd5bf784e23a ] Release variable dst after logging dst->error to avoid possible use after free. Link: https://lore.kernel.org/r/1573022651-37171-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()Qiushi Wu1-1/+1
[ Upstream commit db857e6ae548f0f4f4a0f63fffeeedf3cca21f9d ] In function pvrdma_pci_probe(), pdev was not disabled in one error path. Thus replace the jump target “err_free_device” by "err_disable_pdev". Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Link: https://lore.kernel.org/r/20200523030457.16160-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03IB/qib: Call kobject_put() when kobject_init_and_add() failsKaike Wan1-4/+5
[ Upstream commit a35cd6447effd5c239b564c80fa109d05ff3d114 ] When kobject_init_and_add() returns an error in the function qib_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. In addition, the ppd->diagc_kobj is released along with other kobjects when the sysfs is unregistered. Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters") Link: https://lore.kernel.org/r/20200512031328.189865.48627.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Suggested-by: Lin Yi <teroincn@gmail.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/mlx4: Test return value of calls to ib_get_cached_pkeyJack Morgenstein1-3/+11
[ Upstream commit 6693ca95bd4330a0ad7326967e1f9bcedd6b0800 ] In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure. Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: e622f2f4ad21 ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20i40iw: Fix error handling in i40iw_manage_arp_cache()Dan Carpenter1-1/+1
[ Upstream commit 37e31d2d26a4124506c24e95434e9baf3405a23a ] The i40iw_arp_table() function can return -EOVERFLOW if i40iw_alloc_resource() fails so we can't just test for "== -1". Fixes: 4e9042e647ff ("i40iw: add hw and utils files") Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-05RDMA/mlx4: Initialize ib_spec on the stackAlaa Hleihel1-1/+2
commit c08cfb2d8d78bfe81b37cc6ba84f0875bddd0d5c upstream. Initialize ib_spec on the stack before using it, otherwise we will have garbage values that will break creating default rules with invalid parsing error. Fixes: a37a1a428431 ("IB/mlx4: Add mechanism to support flow steering over IB links") Link: https://lore.kernel.org/r/20200413132235.930642-1-leon@kernel.org Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05RDMA/mlx5: Set GRH fields in query QP on RoCEAharon Landau1-1/+3
commit 2d7e3ff7b6f2c614eb21d0dc348957a47eaffb57 upstream. GRH fields such as sgid_index, hop limit, et. are set in the QP context when QP is created/modified. Currently, when query QP is performed, we fill the GRH fields only if the GRH bit is set in the QP context, but this bit is not set for RoCE. Adjust the check so we will set all relevant data for the RoCE too. Since this data is returned to userspace, the below is an ABI regression. Fixes: d8966fcd4c25 ("IB/core: Use rdma_ah_attr accessor functions") Link: https://lore.kernel.org/r/20200413132028.930109-1-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13IB/hfi1: Fix memory leaks in sysfs registration and unregistrationKaike Wan1-2/+11
commit 5c15abc4328ad696fa61e2f3604918ed0c207755 upstream. When the hfi1 driver is unloaded, kmemleak will report the following issue: unreferenced object 0xffff8888461a4c08 (size 8): comm "kworker/0:0", pid 5, jiffies 4298601264 (age 2047.134s) hex dump (first 8 bytes): 73 64 6d 61 30 00 ff ff sdma0... backtrace: [<00000000311a6ef5>] kvasprintf+0x62/0xd0 [<00000000ade94d9f>] kobject_set_name_vargs+0x1c/0x90 [<0000000060657dbb>] kobject_init_and_add+0x5d/0xb0 [<00000000346fe72b>] 0xffffffffa0c5ecba [<000000006cfc5819>] 0xffffffffa0c866b9 [<0000000031c65580>] 0xffffffffa0c38e87 [<00000000e9739b3f>] local_pci_probe+0x41/0x80 [<000000006c69911d>] work_for_cpu_fn+0x16/0x20 [<00000000601267b5>] process_one_work+0x171/0x380 [<0000000049a0eefa>] worker_thread+0x1d1/0x3f0 [<00000000909cf2b9>] kthread+0xf8/0x130 [<0000000058f5f874>] ret_from_fork+0x35/0x40 This patch fixes the issue by: - Releasing dd->per_sdma[i].kobject in hfi1_unregister_sysfs(). - This will fix the memory leak. - Calling kobject_put() to unwind operations only for those entries in dd->per_sdma[] whose operations have succeeded (including the current one that has just failed) in hfi1_verbs_register_sysfs(). Cc: <stable@vger.kernel.org> Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Link: https://lore.kernel.org/r/20200326163807.21129.27371.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13IB/hfi1: Call kobject_put() when kobject_init_and_add() failsKaike Wan1-5/+8
commit dfb5394f804ed4fcea1fc925be275a38d66712ab upstream. When kobject_init_and_add() returns an error in the function hfi1_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200326163813.21129.44280.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02RDMA/mlx5: Block delay drop to unprivileged usersMaor Gottlieb1-0/+4
commit ba80013fba656b9830ef45cd40a6a1e44707f47a upstream. It has been discovered that this feature can globally block the RX port, so it should be allowed for highly privileged users only. Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ") Link: https://lore.kernel.org/r/20200322124906.1173790-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11IB/hfi1, qib: Ensure RCU is locked when accessing listDennis Dalessandro2-1/+5
commit 817a68a6584aa08e323c64283fec5ded7be84759 upstream. The packet handling function, specifically the iteration of the qp list for mad packet processing misses locking RCU before running through the list. Not only is this incorrect, but the list_for_each_entry_rcu() call can not be called with a conditional check for lock dependency. Remedy this by invoking the rcu lock and unlock around the critical section. This brings MAD packet processing in line with what is done for non-MAD packets. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28IB/hfi1: Add software counter for ctxt0 seq dropMike Marciniszyn4-0/+14
[ Upstream commit 5ffd048698ea5139743acd45e8ab388a683642b8 ] All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28IB/hfi1: Close window for pq and request colidingMike Marciniszyn4-29/+48
commit be8638344c70bf492963ace206a9896606b6922d upstream. Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff90365ac0>] dump_stack+0x19/0x1b [<ffffffff8fc98b78>] __warn+0xd8/0x100 [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0 [<ffffffff8ff9713d>] list_del+0xd/0x30 [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80 [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP <ffff88b5393abd60> CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: e87473bc1b6c ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-15IB/mlx5: Fix outstanding_pi index for GSI qpsPrabhath Sajeepa1-2/+1
commit b5671afe5e39ed71e94eae788bacdcceec69db09 upstream. Commit b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") changed the way outstanding WRs are tracked for the GSI QP. But the fix did not cover the case when a call to ib_post_send() fails and updates index to track outstanding. Since the prior commmit outstanding_pi should not be bounded otherwise the loop generate_completions() will fail. Fixes: b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") Link: https://lore.kernel.org/r/1576195889-23527-1-git-send-email-psajeepa@purestorage.com Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-27RDMA/hns: Fixs hw access invalid dma memory errorXi Wang1-1/+0
[ Upstream commit ec5bc2cc69b4fc494e04d10fc5226f6f9cf67c56 ] When smmu is enable, if execute the perftest command and then use 'kill -9' to exit, follow this operation repeatedly, the kernel will have a high probability to print the following smmu event: arm-smmu-v3 arm-smmu-v3.1.auto: event 0x10 received: arm-smmu-v3 arm-smmu-v3.1.auto: 0x00007d0000000010 arm-smmu-v3 arm-smmu-v3.1.auto: 0x0000020900000080 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 This is because the hw will periodically refresh the qpc cache until the next reset. This patch fixed it by removing the action that release qpc memory in the 'hns_roce_qp_free' function. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/qedr: Fix incorrect device rate.Sagiv Ozeri1-16/+9
[ Upstream commit 69054666df0a9b4e8331319f98b6b9a88bc3fcc4 ] Use the correct enum value introduced in commit 12113a35ada6 ("IB/core: Add HDR speed enum") Prior to this change a 50Gbps port would show 40Gbps. This patch also cleaned up the redundant redefiniton of ib speeds for qedr. Fixes: 12113a35ada6 ("IB/core: Add HDR speed enum") Signed-off-by: Sagiv Ozeri <sagiv.ozeri@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/mlx5: Add missing XRC options to QP optional params maskJack Morgenstein1-0/+21
[ Upstream commit 8f4426aa19fcdb9326ac44154a117b1a3a5ae126 ] The QP transition optional parameters for the various transition for XRC QPs are identical to those for RC QPs. Many of the XRC QP transition optional parameter bits are missing from the QP optional mask table. These omissions caused failures when doing XRC QP state transitions. For example, when trying to change the response timer of an XRC receive QP via the RTS2RTS transition, the new timer value was ignored because MLX5_QP_OPTPAR_RNR_TIMEOUT bit was missing from the optional params mask for XRC qps for the RTS2RTS transition. Fix this by adding the missing XRC optional parameters for all QP transitions to the opt_mask table. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Fixes: a4774e9095de ("IB/mlx5: Fix opt param mask according to firmware spec") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27iw_cxgb4: use tos when finding ipv6 routesSteve Wise1-2/+3
[ Upstream commit c8a7eb554a83214c3d8ee5cb322da8c72810d2dc ] When IPv6 support was added, the correct tos was not passed to cxgb_find_route6(). This potentially results in the wrong route entry. Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27iw_cxgb4: use tos when importing the endpointSteve Wise1-1/+1
[ Upstream commit cb3ba0bde881f0cb7e3945d2a266901e2bd18c92 ] import_ep() is passed the correct tos, but doesn't use it correctly. Fixes: ac8e4c69a021 ("cxgb4/iw_cxgb4: TOS support") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/iw_cxgb4: Fix the unchecked ep dereferenceRaju Rangoju1-7/+10
[ Upstream commit 3352976c892301fd576a2e9ff0ac7337b2e2ca48 ] The patch 944661dd97f4: "RDMA/iw_cxgb4: atomically lookup ep and get a reference" from May 6, 2016, leads to the following Smatch complaint: drivers/infiniband/hw/cxgb4/cm.c:2953 terminate() error: we previously assumed 'ep' could be null (see line 2945) Fixes: 944661dd97f4 ("RDMA/iw_cxgb4: atomically lookup ep and get a reference") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/qedr: Fix out of bounds index check in query pkeyGal Pressman1-1/+1
[ Upstream commit dbe30dae487e1a232158c24b432d45281c2805b7 ] The pkey table size is QEDR_ROCE_PKEY_TABLE_LEN, index should be tested for >= QEDR_ROCE_PKEY_TABLE_LEN instead of > QEDR_ROCE_PKEY_TABLE_LEN. Fixes: a7efd7773e31 ("qedr: Add support for PD,PKEY and CQ verbs") Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/ocrdma: Fix out of bounds index check in query pkeyGal Pressman1-1/+1
[ Upstream commit b188940796c7be31c1b8c25a9a0e0842c2e7a49e ] The pkey table size is one element, index should be tested for > 0 instead of > 1. Fixes: fe2caefcdf58 ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter") Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/usnic: Fix out of bounds index check in query pkeyGal Pressman1-1/+1
[ Upstream commit 4959d5da5737dd804255c75b8cea0a2929ce279a ] The pkey table size is one element, index should be tested for > 0 instead of > 1. Fixes: e3cf00d0a87f ("IB/usnic: Add Cisco VIC low-level hardware driver") Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Parvi Kaustubhi <pkaustub@cisco.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/hfi1: Add mtu check for operational data VLsAlex Estrin1-4/+22
[ Upstream commit eb50130964e8c1379f37c3d3bab33a411ec62e98 ] Since Virtual Lanes BCT credits and MTU are set through separate MADs, we have to ensure both are valid, and data VLs are ready for transmission before we allow port transition to Armed state. Fixes: 5e2d6764a729 ("IB/hfi1: Verify port data VLs credits on transition to Armed") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-17RDMA/mlx5: Return proper error valueLeon Romanovsky1-1/+1
commit 546d30099ed204792083f043cd7e016de86016a3 upstream. Returned value from mlx5_mr_cache_alloc() is checked to be error or real pointer. Return proper error code instead of NULL which is not checked later. Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20191029055721.7192-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17RDMA/bnxt_re: Fix Send Work Entry state check while polling completionsSelvin Xavier1-6/+6
commit c5275723580922e5f3264f96751337661a153c7d upstream. Some adapters need a fence Work Entry to handle retransmission. Currently the driver checks for this condition, only if the Send queue entry is signalled. Implement the condition check, irrespective of the signalled state of the Work queue entries Failure to add the fence can result in access to memory that is already marked as completed, triggering data corruption, transmission failure, IOMMU failures, etc. Fixes: 9152e0b722b2 ("RDMA/bnxt_re: HW workarounds for handling specific conditions") Link: https://lore.kernel.org/r/1574671174-5064-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-09IB/mlx4: Follow mirror sequence of device add during device removalParav Pandit1-4/+5
[ Upstream commit 89f988d93c62384758b19323c886db917a80c371 ] Current code device add sequence is: ib_register_device() ib_mad_init() init_sriov_init() register_netdev_notifier() Therefore, the remove sequence should be, unregister_netdev_notifier() close_sriov() mad_cleanup() ib_unregister_device() However it is not above. Hence, make do above remove sequence. Fixes: fa417f7b520ee ("IB/mlx4: Add support for IBoE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191212091214.315005-3-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31RDMA/qedr: Fix memory leak in user qp and mrMichal Kalderon1-2/+10
[ Upstream commit 24e412c1e00ebfe73619e6b88cbc26c2c7d41b85 ] User QPs pbl's weren't freed properly. MR pbls weren't freed properly. Fixes: e0290cce6ac0 ("qedr: Add support for memory registeration verbs") Link: https://lore.kernel.org/r/20191027200451.28187-5-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17RDMA/qib: Validate ->show()/store() callbacks before calling themViresh Kumar1-0/+6
commit 7ee23491b39259ae83899dd93b2a29ef0f22f0a7 upstream. The permissions of the read-only or write-only sysfs files can be changed (as root) and the user can then try to read a write-only file or write to a read-only file which will lead to kernel crash here. Protect against that by always validating the show/store callbacks. Link: https://lore.kernel.org/r/d45cc26361a174ae12dbb86c994ef334d257924b.1573096807.git.viresh.kumar@linaro.org Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17mlx4: Use snprintf instead of complicated strcpyQian Cai1-8/+4
[ Upstream commit 0fbc9b8b4ea3f688a5da141a64f97aa33ad02ae9 ] This fixes a compilation warning in sysfs.c drivers/infiniband/hw/mlx4/sysfs.c:360:2: warning: 'strncpy' output may be truncated copying 8 bytes from a string of length 31 [-Wstringop-truncation] By eliminating the temporary stack buffer. Signed-off-by: Qian Cai <cai@gmx.us> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17IB/hfi1: Close VNIC sdma_progress sleep windowMike Marciniszyn1-10/+5
[ Upstream commit 18912c4524385dd6532c682cb9d4f6aa39ba8d47 ] The call to sdma_progress() is called outside the wait lock. In this case, there is a race condition where sdma_progress() can return false and the sdma_engine can idle. If that happens, there will be no more sdma interrupts to cause the wakeup and the vnic_sdma xmit will hang. Fix by moving the lock to enclose the sdma_progress() call. Also, delete the tx_retry. The need for this was removed by: commit bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first") Fixes: 64551ede6cd1 ("IB/hfi1: VNIC SDMA support") Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17IB/hfi1: Ignore LNI errors before DC8051 transitions to Polling stateKaike Wan1-1/+46
[ Upstream commit c1a797c0818e0122c7ec8422edd971cfec9b15ea ] When it is requested to change its physical state back to Offline while in the process to go up, DC8051 will set the ERROR field in the DC8051_DBG_ERR_INFO_SET_BY_8051 register. This ERROR field will remain until the next time when DC8051 transitions from Offline to Polling. Subsequently, when the host requests DC8051 to change its physical state to Polling again, it may receive a DC8051 interrupt with the stale ERROR field still in DC8051_DBG_ERR_INFO_SET_BY_8051. If the host link state has been changed to Polling, this stale ERROR will force the host to transition to Offline state, resulting in a vicious cycle of Polling ->Offline->Polling->Offline. On the other hand, if the host link state is still Offline when the stale ERROR is received, the stale ERROR will be ignored, and the link will come up correctly. This patch implements the correct behavior by changing host link state to Polling only after DC8051 changes its physical state to Polling. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LENSirong Wang1-1/+1
[ Upstream commit 531eb45b3da4267fc2a64233ba256c8ffb02edd2 ] Size of pointer to buf field of struct hns_roce_hem_chunk should be considered when calculating HNS_ROCE_HEM_CHUNK_LEN, or sg table size will be larger than expected when allocating hem. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/1572575610-52530-2-git-send-email-liweihang@hisilicon.com Signed-off-by: Sirong Wang <wangsirong@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05IB/qib: Fix an error code in qib_sdma_verbs_send()Dan Carpenter1-1/+3
[ Upstream commit 5050ae5fa3d54c8e83e1e447cc7e3591110a7f57 ] We accidentally return success on this error path. Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05RDMA/vmw_pvrdma: Use atomic memory allocation in create AHGal Pressman1-1/+1
[ Upstream commit a276a4d93bf1580d737f38d1810e5f4b166f3edd ] Create address handle callback should not sleep, use GFP_ATOMIC instead of GFP_KERNEL for memory allocation. Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Cc: Adit Ranadive <aditr@vmware.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01RDMA/bnxt_re: Fix qp async event reportingDevesh Sharma1-4/+9
[ Upstream commit 4c01f2e3a906a0d2d798be5751c331cf501bc129 ] Reports affiliated async event on the qp-async event channel instead of global event channel. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-24IB/mthca: Fix error return code in __mthca_init_one()Wei Yongjun1-1/+2
[ Upstream commit 39f2495618c5e980d2873ea3f2d1877dd253e07a ] Fix to return a negative error code from the mthca_cmd_init() error handling case instead of 0, as done elsewhere in this function. Fixes: 80fd8238734c ("[PATCH] IB/mthca: Encapsulate command interface init") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20RDMA/i40iw: Fix incorrect iterator typeHåkon Bugge1-1/+1
[ Upstream commit 802fa45cd320de319e86c93bca72abec028ba059 ] Commit f27b4746f378 ("i40iw: add connection management code") uses an incorrect rcu iterator, whilst holding the rtnl_lock. Since the critical region invokes i40iw_manage_qhash(), which is a sleeping function, the rcu locking and traversal cannot be used. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20IB/hfi1: Missing return value in error path for user sdmaMichael J. Ruhl1-1/+3
[ Upstream commit 2bf4b33f83dfe521c4c7c407b6b150aeec04d69c ] If the set_txreq_header_agh() function returns an error, the exit path is chosen. In this path, the code fails to set the return value. This will cause the caller to not realize an error has occurred. Set the return value correctly in the error path. Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-20IB/hfi1: Ensure full Gen3 speed in a Gen4 systemJames Erwin1-1/+3
commit a9c3c4c597704b3a1a2b9bef990e7d8a881f6533 upstream. If an hfi1 card is inserted in a Gen4 systems, the driver will avoid the gen3 speed bump and the card will operate at half speed. This is because the driver avoids the gen3 speed bump when the parent bus speed isn't identical to gen3, 8.0GT/s. This is not compatible with gen4 and newer speeds. Fix by relaxing the test to explicitly look for the lower capability speeds which inherently allows for gen4 and all future speeds. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20191101192059.106248.1699.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: James Erwin <james.erwin@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12RDMA/iw_cxgb4: Avoid freeing skb twice in arp failure casePotnuri Bharat Teja1-2/+0
[ Upstream commit d4934f45693651ea15357dd6c7c36be28b6da884 ] _put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by process_work(). fix double free by freeing the skb only in process_work(). Fixes: 1dad0ebeea1c ("iw_cxgb4: Avoid touch after free error in ARP failure handlers") Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.com Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12RDMA/qedr: Fix reported firmware versionKamal Heib1-1/+1
[ Upstream commit b806c94ee44e53233b8ce6c92d9078d9781786a5 ] Remove spaces from the reported firmware version string. Actual value: $ cat /sys/class/infiniband/qedr0/fw_ver 8. 37. 7. 0 Expected value: $ cat /sys/class/infiniband/qedr0/fw_ver 8.37.7.0 Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init") Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Link: https://lore.kernel.org/r/20191007210730.7173-1-kamalheib1@gmail.com Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06RDMA/hfi1: Prevent memory leak in sdma_initNavid Emamdoost1-1/+4
[ Upstream commit 34b3be18a04ecdc610aae4c48e5d1b799d8689f6 ] In sdma_init if rhashtable_init fails the allocated memory for tmp_sdma_rht should be released. Fixes: 5a52a7acf7e2 ("IB/hfi1: NULL pointer dereference when freeing rhashtable") Link: https://lore.kernel.org/r/20190925144543.10141-1-navid.emamdoost@gmail.com Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-29RDMA/cxgb4: Do not dma memory off of the stackGreg KH1-11/+17
commit 3840c5b78803b2b6cc1ff820100a74a092c40cbb upstream. Nicolas pointed out that the cxgb4 driver is doing dma off of the stack, which is generally considered a very bad thing. On some architectures it could be a security problem, but odds are none of them actually run this driver, so it's just a "normal" bug. Resolve this by allocating the memory for a message off of the heap instead of the stack. kmalloc() always will give us a proper memory location that DMA will work correctly from. Link: https://lore.kernel.org/r/20191001165611.GA3542072@kroah.com Reported-by: Nicolas Waisman <nico@semmle.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05IB/hfi1: Define variables as unsigned long to fix KASAN warningIra Weiny1-26/+19
commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream. Define the working variables to be unsigned long to be compatible with for_each_set_bit and change types as needed. While we are at it remove unused variables from a couple of functions. This was found because of the following KASAN warning: ================================================================== BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70 Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889 CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W 5.3.0-rc2-mm1+ #2 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x9a/0xf0 ? find_first_bit+0x19/0x70 print_address_description+0x6c/0x332 ? find_first_bit+0x19/0x70 ? find_first_bit+0x19/0x70 __kasan_report.cold.6+0x1a/0x3b ? find_first_bit+0x19/0x70 kasan_report+0xe/0x12 find_first_bit+0x19/0x70 pma_get_opa_portstatus+0x5cc/0xa80 [hfi1] ? ret_from_fork+0x3a/0x50 ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1] ? stack_trace_consume_entry+0x80/0x80 hfi1_process_mad+0x39b/0x26c0 [hfi1] ? __lock_acquire+0x65e/0x21b0 ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? check_chain_key+0x1d7/0x2e0 ? lock_downgrade+0x3a0/0x3a0 ? match_held_lock+0x2e/0x250 ib_mad_recv_done+0x698/0x15e0 [ib_core] ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? ib_mad_send_done+0xc80/0xc80 [ib_core] ? mark_held_locks+0x79/0xa0 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? rvt_poll_cq+0x1e1/0x340 [rdmavt] __ib_process_cq+0x97/0x100 [ib_core] ib_cq_poll_work+0x31/0xb0 [ib_core] process_one_work+0x4ee/0xa00 ? pwq_dec_nr_in_flight+0x110/0x110 ? do_raw_spin_lock+0x113/0x1d0 worker_thread+0x57/0x5a0 ? process_one_work+0xa00/0xa00 kthread+0x1bb/0x1e0 ? kthread_create_on_node+0xc0/0xc0 ret_from_fork+0x3a/0x50 The buggy address belongs to the page: page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x17ffffc0000000() raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame: pma_get_opa_portstatus+0x0/0xa80 [hfi1] this frame has 1 object: [32, 36) 'vl_select_mask' Memory state around the buggy address: ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00 ^ ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 ================================================================== Cc: <stable@vger.kernel.org> Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>