summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-11-25scsi: core: sysfs: Fix hang when device state is set via sysfsMike Christie1-11/+19
[ Upstream commit 4edd8cd4e86dd3047e5294bbefcc0a08f66a430f ] This fixes a regression added with: commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") The problem is that after iSCSI recovery, iscsid will call into the kernel to set the dev's state to running, and with that patch we now call scsi_rescan_device() with the state_mutex held. If the SCSI error handler thread is just starting to test the device in scsi_send_eh_cmnd() then it's going to try to grab the state_mutex. We are then stuck, because when scsi_rescan_device() tries to send its I/O scsi_queue_rq() calls -> scsi_host_queue_ready() -> scsi_host_in_recovery() which will return true (the host state is still in recovery) and I/O will just be requeued. scsi_send_eh_cmnd() will then never be able to grab the state_mutex to finish error handling. To prevent the deadlock move the rescan-related code to after we drop the state_mutex. This also adds a check for if we are already in the running state. This prevents extra scans and helps the iscsid case where if the transport class has already onlined the device during its recovery process then we don't need userspace to do it again plus possibly block that daemon. Link: https://lore.kernel.org/r/20211105221048.6541-3-michael.christie@oracle.com Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device") Cc: Bart Van Assche <bvanassche@acm.org> Cc: lijinlin <lijinlin3@huawei.com> Cc: Wu Bo <wubo40@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25scsi: ufs: core: Improve SCSI abort handlingBart Van Assche1-0/+1
[ Upstream commit 3ff1f6b6ba6f97f50862aa50e79959cc8ddc2566 ] The following has been observed on a test setup: WARNING: CPU: 4 PID: 250 at drivers/scsi/ufs/ufshcd.c:2737 ufshcd_queuecommand+0x468/0x65c Call trace: ufshcd_queuecommand+0x468/0x65c scsi_send_eh_cmnd+0x224/0x6a0 scsi_eh_test_devices+0x248/0x418 scsi_eh_ready_devs+0xc34/0xe58 scsi_error_handler+0x204/0x80c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 That warning is triggered by the following statement: WARN_ON(lrbp->cmd); Fix this warning by clearing lrbp->cmd from the abort handler. Link: https://lore.kernel.org/r/20211104181059.4129537-1-bvanassche@acm.org Fixes: 7a3e97b0dc4b ("[SCSI] ufshcd: UFS Host controller driver") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5: E-Switch, return error if encap isn't supportedRaed Salem1-1/+1
[ Upstream commit c4c3176739dfa6efcc5b1d1de4b3fd2b51b048c7 ] On regular ConnectX HCAs getting encap mode isn't supported when the E-Switch is in NONE mode. Current code would return no error code when trying to get encap mode in such case which is wrong. Fix by returning error value to indicate failure to caller in such case. Fixes: 8e0aa4bc959c ("net/mlx5: E-switch, Protect eswitch mode changes") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5: Lag, update tracker when state change event receivedMaher Sanalla1-15/+13
[ Upstream commit ae396d85c01c7bdc9eeceecde1f493d03f793465 ] Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events handling, tracking is not fully completed if the LAG device is not ready at the time the events occur. But, we must keep track of the upper and lower states after receiving the events because RoCE needs this info in mlx5_lag_get_roce_netdev() - in order to return the corresponding port that its running on. Returning the wrong (not most recent) port will lead to gids table being incorrect. For example: If during the attachment of a slave to the bond, the other non-attached port performs pci_reload, then the LAG device is not ready, but that should not result in dismissing attached slave tracker update automatically (which is performed in mlx5_handle_changelowerstate()), Since these events might not come later, which can lead to both bond ports having tx_enabled=0 - which is not a valid state of LAG bond. Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5e: CT, Fix multiple allocations and memleak of mod actsRoi Dayan3-11/+25
[ Upstream commit 806401c20a0f9c51b6c8fd7035671e6ca841f6c2 ] CT clear action offload adds additional mod hdr actions to the flow's original mod actions in order to clear the registers which hold ct_state. When such flow also includes encap action, a neigh update event can cause the driver to unoffload the flow and then reoffload it. Each time this happens, the ct clear handling adds that same set of mod hdr actions to reset ct_state until the max of mod hdr actions is reached. Also the driver never releases the allocated mod hdr actions and causing a memleak. Fix above two issues by moving CT clear mod acts allocation into the parsing actions phase and only use it when offloading the rule. The release of mod acts will be done in the normal flow_put(). backtrace: [<000000007316e2f3>] krealloc+0x83/0xd0 [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core] [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core] [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core] [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core] [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core] [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core] [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core] [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core] [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core] Fixes: 1ef3018f5af3 ("net/mlx5e: CT: Support clear action") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5: E-Switch, rebuild lag only when neededMark Bloch1-2/+10
[ Upstream commit 2eb0cb31bc4ce2ede5460cf3ef433b40cf5f040d ] A user can enable VFs without changing E-Switch mode, this can happen when a user moves straight to switchdev mode and only once in switchdev VFs are enabled via the sysfs interface. The cited commit assumed this isn't possible and exposed a single API function where the E-switch calls into the lag code, breaks the lag and prevents any other lag operations to take place until the E-switch update has ended. Breaking the hardware lag when it isn't needed can make it such that hardware lag can't be enabled again. In the sysfs call path check if the current E-Switch mode is NONE, in the context of the function it can only mean the E-Switch is moving out of NONE mode and the hardware lag should be disabled and enabled once the mode change has ended. If the mode isn't NONE it means VFs are about to be enabled and such operation doesn't require toggling the hardware lag. Fixes: cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5: Update error handler for UCTX and UMEMNeta Ostrovsky1-2/+2
[ Upstream commit ba50cd9451f6c49cf0841c0a4a146ff6a2822699 ] In the fast unload flow, the device state is set to internal error, which indicates that the driver started the destroy process. In this case, when a destroy command is being executed, it should return MLX5_CMD_STAT_OK. Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK instead of EIO. This fixes a call trace in the umem release process - [ 2633.536695] Call Trace: [ 2633.537518] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] [ 2633.538596] remove_client_context+0x8b/0xd0 [ib_core] [ 2633.539641] disable_device+0x8c/0x130 [ib_core] [ 2633.540615] __ib_unregister_device+0x35/0xa0 [ib_core] [ 2633.541640] ib_unregister_device+0x21/0x30 [ib_core] [ 2633.542663] __mlx5_ib_remove+0x38/0x90 [mlx5_ib] [ 2633.543640] auxiliary_bus_remove+0x1e/0x30 [auxiliary] [ 2633.544661] device_release_driver_internal+0x103/0x1f0 [ 2633.545679] bus_remove_device+0xf7/0x170 [ 2633.546640] device_del+0x181/0x410 [ 2633.547606] mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core] [ 2633.548777] mlx5_unregister_device+0x27/0x40 [mlx5_core] [ 2633.549841] mlx5_uninit_one+0x21/0xc0 [mlx5_core] [ 2633.550864] remove_one+0x69/0xe0 [mlx5_core] [ 2633.551819] pci_device_remove+0x3b/0xc0 [ 2633.552731] device_release_driver_internal+0x103/0x1f0 [ 2633.553746] unbind_store+0xf6/0x130 [ 2633.554657] kernfs_fop_write+0x116/0x190 [ 2633.555567] vfs_write+0xa5/0x1a0 [ 2633.556407] ksys_write+0x4f/0xb0 [ 2633.557233] do_syscall_64+0x5b/0x1a0 [ 2633.558071] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 2633.559018] RIP: 0033:0x7f9977132648 [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55 [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648 [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001 [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740 [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0 [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c [ 2633.568725] ---[ end trace 10b4fe52945e544d ]--- Fixes: 6a6fabbfa3e8 ("net/mlx5: Update pci error handler entries and command translation") Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()Valentine Fatiev2-3/+6
[ Upstream commit 76ded29d3fcda4928da8849ffc446ea46871c1c2 ] Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds to rest of destroy operations. mlx5_core_destroy_cq() could be called again by user and cause additional call of mlx5_debug_cq_remove(). cq->dbg was not nullify in previous call and cause the crash. Fix it by nullify cq->dbg pointer after removal. Also proceed to destroy operations only if FW return 0 for MLX5_CMD_OP_DESTROY_CQ command. general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockref_get+0x1/0x60 Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02 00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17 48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48 RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058 RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000 R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0 FS: 00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0 Call Trace: simple_recursive_removal+0x33/0x2e0 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 mlx5_debug_cq_remove+0x32/0x70 [mlx5_core] mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core] devx_obj_cleanup+0x151/0x330 [mlx5_ib] ? __pollwait+0xd0/0xd0 ? xas_load+0x5/0x70 ? xa_load+0x62/0xa0 destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs] uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs] uobj_destroy+0x54/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs] ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs] ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs] __x64_sys_ioctl+0x3e4/0x8e0 Fixes: 94b960b9deff ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path") Signed-off-by: Valentine Fatiev <valentinef@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdevPaul Blakey3-11/+9
[ Upstream commit d7751d6476185ff754b9dad2cba0c0a6e43ecadc ] E-Switch encap mode is relevant only when in switchdev mode. The RDMA driver can query the encap configuration via mlx5_eswitch_get_encap_mode(). Make sure it returns the currently used mode and not the set one. This reverts the cited commit which reset the encap mode on entering switchdev and fixes the original issue properly. Fixes: 9a64144d683a ("net/mlx5: E-Switch, Fix default encap mode") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5e: Wait for concurrent flow deletion during neigh/fib eventsVlad Buslov3-1/+10
[ Upstream commit 362980eada85b5ea691e5e0d9257a991aa7ade47 ] Function mlx5e_take_tmp_flow() skips flows with zero reference count. This can cause syndrome 0x179e84 when the called from neigh or route update code and the skipped flow is not removed from the hardware by the time underlying encap/decap resource is deleted. Add new completion 'del_hw_done' that is completed when flow is unoffloaded. This is safe to do because flow with reference count zero needs to be detached from encap/decap entry before its memory is deallocated, which requires taking the encap_tbl_lock mutex that is held by the event handlers code. Fixes: 8914add2c9e5 ("net/mlx5e: Handle FIB events to update tunnel endpoint device") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/mlx5e: kTLS, Fix crash in RX resync flowTariq Toukan1-6/+17
[ Upstream commit cc4a9cc03faa6d8db1a6954bb536f2c1e63bdff6 ] For the TLS RX resync flow, we maintain a list of TLS contexts that require some attention, to communicate their resync information to the HW. Here we fix list corruptions, by protecting the entries against movements coming from resync_handle_seq_match(), until their resync handling in napi is fully completed. Fixes: e9ce991bce5b ("net/mlx5e: kTLS, Add resiliency to RX resync failures") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25RDMA/core: Set send and receive CQ before forwarding to the driverLeon Romanovsky1-0/+3
[ Upstream commit 6cd7397d01c4a3e09757840299e4f114f0aa5fa0 ] Preset both receive and send CQ pointers prior to call to the drivers and overwrite it later again till the mlx4 is going to be changed do not overwrite ibqp properties. This change is needed for mlx5, because in case of QP creation failure, it will go to the path of QP destroy which relies on proper CQ pointers. BUG: KASAN: use-after-free in create_qp.cold+0x164/0x16e [mlx5_ib] Write of size 8 at addr ffff8880064c55c0 by task a.out/246 CPU: 0 PID: 246 Comm: a.out Not tainted 5.15.0+ #291 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x1f/0x140 kasan_report.cold+0x83/0xdf create_qp.cold+0x164/0x16e [mlx5_ib] mlx5_ib_create_qp+0x358/0x28a0 [mlx5_ib] create_qp.part.0+0x45b/0x6a0 [ib_core] ib_create_qp_user+0x97/0x150 [ib_core] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs] ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs] ib_uverbs_ioctl+0x169/0x260 [ib_uverbs] __x64_sys_ioctl+0x866/0x14d0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Allocated by task 246: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0xa4/0xd0 create_qp.part.0+0x92/0x6a0 [ib_core] ib_create_qp_user+0x97/0x150 [ib_core] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs] ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs] ib_uverbs_ioctl+0x169/0x260 [ib_uverbs] __x64_sys_ioctl+0x866/0x14d0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 246: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x10c/0x150 slab_free_freelist_hook+0xb4/0x1b0 kfree+0xe7/0x2a0 create_qp.part.0+0x52b/0x6a0 [ib_core] ib_create_qp_user+0x97/0x150 [ib_core] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x92c/0x1250 [ib_uverbs] ib_uverbs_cmd_verbs+0x1c38/0x3150 [ib_uverbs] ib_uverbs_ioctl+0x169/0x260 [ib_uverbs] __x64_sys_ioctl+0x866/0x14d0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 514aee660df4 ("RDMA: Globally allocate and release QP memory") Link: https://lore.kernel.org/r/2dbb2e2cbb1efb188a500e5634be1d71956424ce.1636631035.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25btrfs: make 1-bit bit-fields of scrub_page unsigned intColin Ian King1-2/+2
[ Upstream commit d08e38b62327961295be1c63b562cd46ec97cd07 ] The bitfields have_csum and io_error are currently signed which is not recommended as the representation is an implementation defined behaviour. Fix this by making the bit-fields unsigned ints. Fixes: 2c36395430b0 ("btrfs: scrub: remove the anonymous structure from scrub_page") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25udp: Validate checksum in udp_read_sock()Cong Wang1-0/+11
[ Upstream commit 099f896f498a2b26d84f4ddae039b2c542c18b48 ] It turns out the skb's in sock receive queue could have bad checksums, as both ->poll() and ->recvmsg() validate checksums. We have to do the same for ->read_sock() path too before they are redirected in sockmap. Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211115044006.26068-1-xiyou.wangcong@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25platform/x86: think-lmi: Abort probe on analyze failureAlex Williamson2-4/+10
[ Upstream commit 812fcc609502096e98cc3918a4b807722dba8fd9 ] A Lenovo ThinkStation S20 (4157CTO BIOS 60KT41AUS) fails to boot on recent kernels including the think-lmi driver, due to the fact that errors returned by the tlmi_analyze() function are ignored by tlmi_probe(), where tlmi_sysfs_init() is called unconditionally. This results in making use of an array of already freed, non-null pointers and other uninitialized globals, causing all sorts of nasty kobject and memory faults. Make use of the analyze function return value, free a couple leaked allocations, and remove the settings_count field, which is incremented but never consumed. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Mark Gross <markgross@kernel.org> Reviewed-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/163639463588.1330483.15850167112490200219.stgit@omen Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()'Christophe JAILLET1-0/+2
[ Upstream commit c961a7d2aa23ae19e0099fbcdf1040fb760eea83 ] If 'led_classdev_register()' fails, some additional resources should be released. Add the missing 'i8042_remove_filter()' and 'lis3lv02d_remove_fs()' calls that are already in the remove function but are missing here. Fixes: a4c724d0723b ("platform: hp_accel: add a i8042 filter to remove HPQ6000 data from kb bus stream") Fixes: 9e0c79782143 ("lis3lv02d: merge with leds hp disk") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/5a4f218f8f16d2e3a7906b7ca3654ffa946895f8.1636314074.git.christophe.jaillet@wanadoo.fr Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25gpio: rockchip: needs GENERIC_IRQ_CHIP to fix build errorsRandy Dunlap1-0/+1
[ Upstream commit d6912b1251b47e6b04ea8c8881dfb35a6e7a3e29 ] gpio-rockchip uses interfaces that are provided by the Kconfig symbol GENERIC_IRQ_CHIP, so the driver should select that symbol in order to prevent build errors. Fixes these build errors (and more): aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_irq_disable': gpio-rockchip.c:(.text+0x454): undefined reference to `irq_gc_mask_set_bit' aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_irq_enable': gpio-rockchip.c:(.text+0x478): undefined reference to `irq_gc_mask_clr_bit' aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_interrupts_register': gpio-rockchip.c:(.text+0x518): undefined reference to `irq_generic_chip_ops' aarch64-linux-ld: gpio-rockchip.c:(.text+0x594): undefined reference to `__irq_alloc_domain_generic_chips' aarch64-linux-ld: gpio-rockchip.c:(.text+0x5cc): undefined reference to `irq_get_domain_generic_chip' aarch64-linux-ld: gpio-rockchip.c:(.text+0x5e0): undefined reference to `irq_gc_ack_set_bit' aarch64-linux-ld: gpio-rockchip.c:(.text+0x604): undefined reference to `irq_gc_set_wake' Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25mips: lantiq: add support for clk_get_parent()Randy Dunlap1-0/+6
[ Upstream commit fc1aabb088860d6cf9dd03612b7a6f0de91ccac2 ] Provide a simple implementation of clk_get_parent() in the lantiq subarch so that callers of it will build without errors. Fixes this build error: ERROR: modpost: "clk_get_parent" [drivers/iio/adc/ingenic-adc.ko] undefined! Fixes: 171bb2f19ed6 ("MIPS: Lantiq: Add initial support for Lantiq SoCs") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: linux-mips@vger.kernel.org Cc: John Crispin <john@phrozen.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25mips: bcm63xx: add support for clk_get_parent()Randy Dunlap1-0/+6
[ Upstream commit e8f67482e5a4bc8d0b65d606d08cb60ee123b468 ] BCM63XX selects HAVE_LEGACY_CLK but does not provide/support clk_get_parent(), so add a simple implementation of that function so that callers of it will build without errors. Fixes these build errors: mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4770_adc_init_clk_div': ingenic-adc.c:(.text+0xe4): undefined reference to `clk_get_parent' mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4725b_adc_init_clk_div': ingenic-adc.c:(.text+0x1b8): undefined reference to `clk_get_parent' Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs." ) Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: Artur Rojek <contact@artur-rojek.eu> Cc: Paul Cercueil <paul@crapouillou.net> Cc: linux-mips@vger.kernel.org Cc: Jonathan Cameron <jic23@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: linux-iio@vger.kernel.org Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: bcm-kernel-feedback-list@broadcom.com Cc: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25MIPS: generic/yamon-dt: fix uninitialized variable errorColin Ian King1-1/+1
[ Upstream commit 255e51da15baed47531beefd02f222e4dc01f1c1 ] In the case where fw_getenv returns an error when fetching values for ememsizea and memsize then variable phys_memsize is not assigned a variable and will be uninitialized on a zero check of phys_memsize. Fix this by initializing phys_memsize to zero. Cleans up cppcheck error: arch/mips/generic/yamon-dt.c:100:7: error: Uninitialized variable: phys_memsize [uninitvar] Fixes: f41d2430bbd6 ("MIPS: generic/yamon-dt: Support > 256MB of RAM") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25bpf: Fix toctou on read-only map's constant scalar trackingDaniel Borkmann3-23/+54
[ Upstream commit 353050be4c19e102178ccc05988101887c25ae53 ] Commit a23740ec43ba ("bpf: Track contents of read-only maps as scalars") is checking whether maps are read-only both from BPF program side and user space side, and then, given their content is constant, reading out their data via map->ops->map_direct_value_addr() which is then subsequently used as known scalar value for the register, that is, it is marked as __mark_reg_known() with the read value at verification time. Before a23740ec43ba, the register content was marked as an unknown scalar so the verifier could not make any assumptions about the map content. The current implementation however is prone to a TOCTOU race, meaning, the value read as known scalar for the register is not guaranteed to be exactly the same at a later point when the program is executed, and as such, the prior made assumptions of the verifier with regards to the program will be invalid which can cause issues such as OOB access, etc. While the BPF_F_RDONLY_PROG map flag is always fixed and required to be specified at map creation time, the map->frozen property is initially set to false for the map given the map value needs to be populated, e.g. for global data sections. Once complete, the loader "freezes" the map from user space such that no subsequent updates/deletes are possible anymore. For the rest of the lifetime of the map, this freeze one-time trigger cannot be undone anymore after a successful BPF_MAP_FREEZE cmd return. Meaning, any new BPF_* cmd calls which would update/delete map entries will be rejected with -EPERM since map_get_sys_perms() removes the FMODE_CAN_WRITE permission. This also means that pending update/delete map entries must still complete before this guarantee is given. This corner case is not an issue for loaders since they create and prepare such program private map in successive steps. However, a malicious user is able to trigger this TOCTOU race in two different ways: i) via userfaultfd, and ii) via batched updates. For i) userfaultfd is used to expand the competition interval, so that map_update_elem() can modify the contents of the map after map_freeze() and bpf_prog_load() were executed. This works, because userfaultfd halts the parallel thread which triggered a map_update_elem() at the time where we copy key/value from the user buffer and this already passed the FMODE_CAN_WRITE capability test given at that time the map was not "frozen". Then, the main thread performs the map_freeze() and bpf_prog_load(), and once that had completed successfully, the other thread is woken up to complete the pending map_update_elem() which then changes the map content. For ii) the idea of the batched update is similar, meaning, when there are a large number of updates to be processed, it can increase the competition interval between the two. It is therefore possible in practice to modify the contents of the map after executing map_freeze() and bpf_prog_load(). One way to fix both i) and ii) at the same time is to expand the use of the map's map->writecnt. The latter was introduced in fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY") and further refined in 1f6cb19be2e2 ("bpf: Prevent re-mmap()'ing BPF map as writable for initially r/o mapping") with the rationale to make a writable mmap()'ing of a map mutually exclusive with read-only freezing. The counter indicates writable mmap() mappings and then prevents/fails the freeze operation. Its semantics can be expanded beyond just mmap() by generally indicating ongoing write phases. This would essentially span any parallel regular and batched flavor of update/delete operation and then also have map_freeze() fail with -EBUSY. For the check_mem_access() in the verifier we expand upon the bpf_map_is_rdonly() check ensuring that all last pending writes have completed via bpf_map_write_active() test. Once the map->frozen is set and bpf_map_write_active() indicates a map->writecnt of 0 only then we are really guaranteed to use the map's data as known constants. For map->frozen being set and pending writes in process of still being completed we fall back to marking that register as unknown scalar so we don't end up making assumptions about it. With this, both TOCTOU reproducers from i) and ii) are fixed. Note that the map->writecnt has been converted into a atomic64 in the fix in order to avoid a double freeze_mutex mutex_{un,}lock() pair when updating map->writecnt in the various map update/delete BPF_* cmd flavors. Spanning the freeze_mutex over entire map update/delete operations in syscall side would not be possible due to then causing everything to be serialized. Similarly, something like synchronize_rcu() after setting map->frozen to wait for update/deletes to complete is not possible either since it would also have to span the user copy which can sleep. On the libbpf side, this won't break d66562fba1ce ("libbpf: Add BPF object skeleton support") as the anonymous mmap()-ed "map initialization image" is remapped as a BPF map-backed mmap()-ed memory where for .rodata it's non-writable. Fixes: a23740ec43ba ("bpf: Track contents of read-only maps as scalars") Reported-by: w1tcher.bupt@gmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: Restore VLAN filters after link downAkeem G Abodunrin2-5/+31
[ Upstream commit 4293014230b887d94b68aa460ff00153454a3709 ] Restore VLAN filters after the link is brought down, and up - since all filters are deleted from HW during the netdev link down routine. Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close") Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: Fix for setting queues to 0Grzegorz Szczurek1-1/+1
[ Upstream commit 9a6e9e483a9684a34573fd9f9e30ecfb047cb8cb ] Now setting combine to 0 will be rejected with the appropriate error code. This has been implemented by adding a condition that checks the value of combine equal to zero. Without this patch, when the user requested it, no error was returned and combine was set to the default value for VF. Fixes: 5520deb15326 ("iavf: Enable support for up to 16 queues") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: Fix for the false positive ASQ/ARQ errors while issuing VF resetSurabhi Boob1-1/+1
[ Upstream commit 321421b57a12e933f92b228e0e6d0b2c6541f41d ] While issuing VF Reset from the guest OS, the VF driver prints logs about critical / Overflow error detection. This is not an actual error since the VF_MBX_ARQLEN register is set to all FF's for a short period of time and the VF would catch the bits set if it was reading the register during that spike of time. This patch introduces an additional check to ignore this condition since the VF is in reset. Fixes: 19b73d8efaa4 ("i40evf: Add additional check for reset") Signed-off-by: Surabhi Boob <surabhi.boob@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: validate pointersMitch Williams1-7/+6
[ Upstream commit 131b0edc4028bb88bb472456b1ddba526cfb7036 ] In some cases, the ethtool get_rxfh handler may be called with a null key or indir parameter. So check these pointers, or you will have a very bad day. Fixes: 43a3d9ba34c9 ("i40evf: Allow PF driver to configure RSS") Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: prevent accidental free of filter structureJacob Keller1-2/+2
[ Upstream commit 4f0400803818f2642f066d3eacaf013f23554cc7 ] In iavf_config_clsflower, the filter structure could be accidentally released at the end, if iavf_parse_cls_flower or iavf_handle_tclass ever return a non-zero but positive value. In this case, the function continues through to the end, and will call kfree() on the filter structure even though it has been added to the linked list. This can actually happen because iavf_parse_cls_flower will return a positive IAVF_ERR_CONFIG value instead of the traditional negative error codes. Fix this by ensuring that the kfree() check and error checks are similar. Use the more idiomatic "if (err)" to catch all non-zero error codes. Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: Fix failure to exit out from last all-multicast modePiotr Marczak1-2/+1
[ Upstream commit 8905072a192fffe9389255489db250c73ecab008 ] The driver could only quit allmulti when allmulti and promisc modes are turn on at the same time. If promisc had been off there was no way to turn off allmulti mode. The patch corrects this behavior. Switching allmulti does not depends on promisc state mode anymore Fixes: f42a5c74da99 ("i40e: Add allmulti support for the VF") Signed-off-by: Piotr Marczak <piotr.marczak@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: don't clear a lock we don't holdNicholas Nunley1-2/+4
[ Upstream commit 2135a8d5c8186bc92901dc00f179ffd50e54c2ac ] In iavf_configure_clsflower() the function will bail out if it is unable to obtain the crit_section lock in a reasonable time. However, it will clear the lock when exiting, so fix this. Fixes: 640a8af5841f ("i40evf: Reorder configure_clsflower to avoid deadlock on error") Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: free q_vectors before queues in iavf_disable_vfNicholas Nunley1-1/+1
[ Upstream commit 89f22f129696ab53cfbc608e0a2184d0fea46ac1 ] iavf_free_queues() clears adapter->num_active_queues, which iavf_free_q_vectors() relies on, so swap the order of these two function calls in iavf_disable_vf(). This resolves a panic encountered when the interface is disabled and then later brought up again after PF communication is restored. Fixes: 65c7006f234c ("i40evf: assign num_active_queues inside i40evf_alloc_queues") Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: check for null in iavf_fix_featuresNicholas Nunley1-1/+2
[ Upstream commit 8a4a126f4be88eb8b5f00a165ab58c35edf4ef76 ] If the driver has lost contact with the PF then it enters a disabled state and frees adapter->vf_res. However, ndo_fix_features can still be called on the interface, so we need to check for this condition first. Since we have no information on the features at this time simply leave them unmodified and return. Fixes: c4445aedfe09 ("i40evf: Fix VLAN features") Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25iavf: Fix return of set the new channel countMateusz Palczewski1-0/+15
[ Upstream commit 4e5e6b5d9d1334d3490326b6922a2daaf56a867f ] Fixed return correct code from set the new channel count. Implemented by check if reset is done in appropriate time. This solution give a extra time to pf for reset vf in case when user want set new channel count for all vfs. Without this patch it is possible to return misleading output code to user and vf reset not to be correctly performed by pf. Fixes: 5520deb15326 ("iavf: Enable support for up to 16 queues") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25NFSD: Fix exposure in nfsd4_decode_bitmap()Chuck Lever1-5/+2
[ Upstream commit c0019b7db1d7ac62c711cda6b357a659d46428fe ] rtm@csail.mit.edu reports: > nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC > directs it to do so. This can cause nfsd4_decode_state_protect4_a() > to write client-supplied data beyond the end of > nfsd4_exchange_id.spo_must_allow[] when called by > nfsd4_decode_exchange_id(). Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond @bmlen. Reported by: rtm@csail.mit.edu Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/smc: Make sure the link_id is uniqueWen Gu1-1/+2
[ Upstream commit cf4f5530bb55ef7d5a91036b26676643b80b1616 ] The link_id is supposed to be unique, but smcr_next_link_id() doesn't skip the used link_id as expected. So the patch fixes this. Fixes: 026c381fb477 ("net/smc: introduce link_idx for link group array") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25sock: fix /proc/net/sockstat underflow in sk_clone_lock()Tetsuo Handa1-3/+3
[ Upstream commit 938cca9e4109b30ee1d476904538225a825e54eb ] sk_clone_lock() needs to call sock_inuse_add(1) before entering the sk_free_unlock_clone() error path, for __sk_free() from sk_free() from sk_free_unlock_clone() calls sock_inuse_add(-1). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 648845ab7e200993 ("sock: Move the socket inuse to namespace.") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25tipc: only accept encrypted MSG_CRYPTO msgsXin Long1-2/+5
[ Upstream commit 271351d255b09e39c7f6437738cba595f9b235be ] The MSG_CRYPTO msgs are always encrypted and sent to other nodes for keys' deployment. But when receiving in peers, if those nodes do not validate it and make sure it's encrypted, one could craft a malicious MSG_CRYPTO msg to deploy its key with no need to know other nodes' keys. This patch is to do that by checking TIPC_SKB_CB(skb)->decrypted and discard it if this packet never got decrypted. Note that this is also a supplementary fix to CVE-2021-43267 that can be triggered by an unencrypted malicious MSG_CRYPTO msg. Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange") Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25bnxt_en: reject indirect blk offload when hw-tc-offload is offSriharsha Basavapatna1-1/+1
[ Upstream commit b0757491a118ae5727cf9f1c3a11544397d46596 ] The driver does not check if hw-tc-offload is enabled for the device before offloading a flow in the context of indirect block callback. Fix this by checking NETIF_F_HW_TC in the features flag and rejecting the offload request. This will avoid unnecessary dmesg error logs when hw-tc-offload is disabled, such as these: bnxt_en 0000:19:00.1 eno2np1: dev(ifindex=294) not on same switch bnxt_en 0000:19:00.1 eno2np1: Error: bnxt_tc_add_flow: cookie=0xffff8dace1c88000 error=-22 bnxt_en 0000:19:00.0 eno1np0: dev(ifindex=294) not on same switch bnxt_en 0000:19:00.0 eno1np0: Error: bnxt_tc_add_flow: cookie=0xffff8dace1c88000 error=-22 Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Fixes: 627c89d00fb9 ("bnxt_en: flow_offload: offload tunnel decap rules via indirect callbacks") Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net: bnx2x: fix variable dereferenced before checkPavel Skripkin1-1/+3
[ Upstream commit f8885ac89ce310570e5391fe0bf0ec9c7c9b4fdc ] Smatch says: bnx2x_init_ops.h:640 bnx2x_ilt_client_mem_op() warn: variable dereferenced before check 'ilt' (see line 638) Move ilt_cli variable initialization _after_ ilt validation, because it's unsafe to deref the pointer before validation check. Fixes: 523224a3b3cd ("bnx2x, cnic, bnx2i: use new FW/HSI") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25selftests: gpio: fix gpio compiling errorLi Zhijian1-0/+1
[ Upstream commit 92a59d7f381d2caf69385bfa00590028e32eea26 ] The gpio selftests build against the system includes rather than the headers from the linux tree. This results in the compile failing if the system includes are outdated. Prefer the headers from the linux tree, as per other selftests. Fixes: 8bc395a6a2e2 ("selftests: gpio: rework and simplify test implementation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> [Kent: reworded commit comment and added Fixes:] Signed-off-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net: ipa: disable HOLB drop when updating timerAlex Elder1-0/+2
[ Upstream commit 816316cacad2b5abd5b41423cf04e4845239abd4 ] The head-of-line blocking timer should only be modified when head-of-line drop is disabled. One of the steps in recovering from a modem crash is to enable dropping of packets with timeout of 0 (immediate). We don't know how the modem configured its endpoints, so before we program the timer, we need to ensure HOL_BLOCK is disabled. Fixes: 84f9bd12d46db ("soc: qcom: ipa: IPA endpoints") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net: ipa: HOLB register sometimes must be written twiceAlex Elder1-0/+3
[ Upstream commit 6e228d8cbb1cc6ba78022d406340e901e08d26e0 ] Starting with IPA v4.5, the HOL_BLOCK_EN register must be written twice when enabling head-of-line blocking avoidance. Fixes: 84f9bd12d46db ("soc: qcom: ipa: IPA endpoints") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25mac80211: fix monitor_sdata RCU/locking assertionsJohannes Berg3-8/+15
[ Upstream commit 6dd2360334f3cb3b45fc1b8194c670090474b87c ] Since commit a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") we've not only been protecting the pointer to monitor_sdata with the RTNL, but also with the wiphy->mtx. This is relevant in a number of lockdep assertions, e.g. the one we hit in ieee80211_set_monitor_channel(). However, we're now protecting all the assignments/dereferences, even the one in interface iter, with the wiphy->mtx, so switch over the lockdep assertions to that lock. Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20211112135143.cb8e8ceffef3.Iaa210f16f6904c8a7a24954fb3396da0ef86ec08@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25nl80211: fix radio statistics in survey dumpJohannes Berg2-20/+20
[ Upstream commit ce6b69749961426c6d822215ded9e67154e1ad4f ] Even if userspace specifies the NL80211_ATTR_SURVEY_RADIO_STATS attribute, we cannot get the statistics because we're not really parsing the incoming attributes properly any more. Fix this by passing the attrbuf to nl80211_prepare_wdev_dump() and filling it there, if given, and using a local version only if no output is desired. Since I'm touching it anyway, make nl80211_prepare_wdev_dump() static. Fixes: 50508d941c18 ("cfg80211: use parallel_ops for genl") Reported-by: Jan Fuchs <jf@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20211029092539.2851b4799386.If9736d4575ee79420cbec1bd930181e1d53c7317@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25tracing: Add length protection to histogram string copiesSteven Rostedt (VMware)2-3/+8
[ Upstream commit 938aa33f14657c9ed9deea348b7d6f14b6d69cb7 ] The string copies to the histogram storage has a max size of 256 bytes (defined by MAX_FILTER_STR_VAL). Only the string size of the event field needs to be copied to the event storage, but no more than what is in the event storage. Although nothing should be bigger than 256 bytes, there's no protection against overwriting of the storage if one day there is. Copy no more than the destination size, and enforce it. Also had to turn MAX_FILTER_STR_VAL into an unsigned int, to keep the min() comparison of the string sizes of comparable types. Link: https://lore.kernel.org/all/CAHk-=wjREUihCGrtRBwfX47y_KrLCGjiq3t6QtoNJpmVrAEb1w@mail.gmail.com/ Link: https://lkml.kernel.org/r/20211114132834.183429a4@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tom Zanussi <zanussi@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 63f84ae6b82b ("tracing/histogram: Do not copy the fixed-size char array field over the field size") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25tcp: Fix uninitialized access in skb frags array for Rx 0cp.Arjun Roy1-0/+3
[ Upstream commit 70701b83e208767f2720d8cd3e6a62cddafb3a30 ] TCP Receive zerocopy iterates through the SKB queue via tcp_recv_skb(), acquiring a pointer to an SKB and an offset within that SKB to read from. From there, it iterates the SKB frags array to determine which offset to start remapping pages from. However, this is built on the assumption that the offset read so far within the SKB is smaller than the SKB length. If this assumption is violated, we can attempt to read an invalid frags array element, which would cause a fault. tcp_recv_skb() can cause such an SKB to be returned when the TCP FIN flag is set. Therefore, we must guard against this occurrence inside skb_advance_frag(). One way that we can reproduce this error follows: 1) In a receiver program, call getsockopt(TCP_ZEROCOPY_RECEIVE) with: char some_array[32 * 1024]; struct tcp_zerocopy_receive zc = { .copybuf_address = (__u64) &some_array[0], .copybuf_len = 32 * 1024, }; 2) In a sender program, after a TCP handshake, send the following sequence of packets: i) Seq = [X, X+4000] ii) Seq = [X+4000, X+5000] iii) Seq = [X+4000, X+5000], Flags = FIN | URG, urgptr=1000 (This can happen without URG, if we have a signal pending, but URG is a convenient way to reproduce the behaviour). In this case, the following event sequence will occur on the receiver: tcp_zerocopy_receive(): -> receive_fallback_to_copy() // copybuf_len >= inq -> tcp_recvmsg_locked() // reads 5000 bytes, then breaks due to URG -> tcp_recv_skb() // yields skb with skb->len == offset -> tcp_zerocopy_set_hint_for_skb() -> skb_advance_to_frag() // will returns a frags ptr. >= nr_frags -> find_next_mappable_frag() // will dereference this bad frags ptr. With this patch, skb_advance_to_frag() will no longer return an invalid frags pointer, and will return NULL instead, fixing the issue. Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Link: https://lore.kernel.org/r/20211111235215.2605384-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25net/ipa: ipa_resource: Fix wrong for loop rangeKonrad Dybcio1-1/+1
[ Upstream commit 27df68d579c67ef6c39a5047559b6a7c08c96219 ] The source group count was mistakenly assigned to both dst and src loops. Fix it to make IPA probe and work again. Fixes: 4fd704b3608a ("net: ipa: record number of groups in data") Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20211111183724.593478-1-konrad.dybcio@somainline.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25selftests: net: switch to socat in the GSO GRE testJakub Kicinski1-7/+9
[ Upstream commit 0cda7d4bac5fd29dceb13df26083333fa99d6bb4 ] Commit a985442fdecb ("selftests: net: properly support IPv6 in GSO GRE test") is not compatible with: Ncat: Version 7.80 ( https://nmap.org/ncat ) (which is distributed with Fedora/Red Hat), tests fail with: nc: invalid option -- 'N' Let's switch to socat which is far more dependable. Fixes: 025efa0a82df ("selftests: add simple GSO GRE test") Fixes: a985442fdecb ("selftests: net: properly support IPv6 in GSO GRE test") Tested-by: Andrea Righi <andrea.righi@canonical.com> Link: https://lore.kernel.org/r/20211111162929.530470-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpuKumar Kartikeya Dwivedi1-3/+2
[ Upstream commit 2453afe3845523d9dfe89dbfb3d71abfa095e260 ] Commit b599015f044d ("samples/bpf: Fix application of sizeof to pointer") tried to fix a bug where sizeof was incorrectly applied to a pointer instead of the array string was being copied to, to find the destination buffer size, but ended up using strlen, which is still incorrect. However, on closer look ifname_buf has no other use, hence directly use optarg. Fixes: b599015f044d ("samples/bpf: Fix application of sizeof to pointer") Fixes: e531a220cc59 ("samples: bpf: Convert xdp_redirect_cpu to XDP samples helper") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/bpf/20211112020301.528357-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25samples/bpf: Fix summary per-sec stats in xdp_sample_userAlexander Lobakin1-13/+15
[ Upstream commit dc14ca4644f48b1cfa93631e35c28bdc011ad109 ] sample_summary_print() uses accumulated period to calculate and display per-sec averages. This period gets incremented by sampling interval each time a new sample is formed, and thus equals to the number of samples collected multiplied by this interval. However, the totals are being calculated differently, they receive current sample statistics already divided by the interval gotten as a difference between sample timestamps for better precision -- in other words, they are being incremented by the per-sec values each sample. This leads to the excessive division of summary per-secs when interval != 1 sec. It is obvious pps couldn't become two times lower just from picking a different sampling interval value: $ samples/bpf/xdp_redirect_cpu -p xdp_prognum_n1_inverse_qnum -c all -s -d 6 -i 1 < snip > Packets received : 2,197,230,321 Average packets/s : 22,887,816 Packets redirected : 2,197,230,472 Average redir/s : 22,887,817 $ samples/bpf/xdp_redirect_cpu -p xdp_prognum_n1_inverse_qnum -c all -s -d 6 -i 2 < snip > Packets received : 159,566,498 Average packets/s : 11,397,607 Packets redirected : 159,566,995 Average redir/s : 11,397,642 This can be easily fixed by treating the divisor not as a period, but rather as a total number of samples, and thus incrementing it by 1 instead of interval. As a nice side effect, we can now remove so-named argument from a couple of functions. Let us also create an "alias" for sample_output::rx_cnt::pps named 'num' using a union since this field is used to store this number (period previously) as well, and the resulting counter-intuitive code might've been a reason for this bug. Fixes: 156f886cf697 ("samples: bpf: Add basic infrastructure for XDP samples") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20211111215703.690-1-alexandr.lobakin@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25bpf: Fix inner map state pruning regression.Alexei Starovoitov1-1/+2
[ Upstream commit 34d11a440c6167133201b7374065b59f259730d7 ] Introduction of map_uid made two lookups from outer map to be distinct. That distinction is only necessary when inner map has an embedded timer. Otherwise it will make the verifier state pruning to be conservative which will cause complex programs to hit 1M insn_processed limit. Tighten map_uid logic to apply to inner maps with timers only. Fixes: 3e8ce29850f1 ("bpf: Prevent pointer mismatch in bpf_timer_init.") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/CACAyw99hVEJFoiBH_ZGyy=+oO-jyydoz6v1DeKPKs2HVsUH28w@mail.gmail.com Link: https://lore.kernel.org/bpf/20211110172556.20754-1-alexei.starovoitov@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrameHans Verkuil1-1/+0
[ Upstream commit 3cc1ae1fa70ab369e4645e38ce335a19438093ad ] gv100_hdmi_ctrl() writes vendor_infoframe.subpack0_high to 0x6f0110, and then overwrites it with 0. Just drop the overwrite with 0, that's clearly a mistake. Because of this issue the HDMI VIC is 0 instead of 1 in the HDMI Vendor InfoFrame when transmitting 4kp30. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 290ffeafcc1a ("drm/nouveau/disp/gv100: initial support") Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/3d3bd0f7-c150-2479-9350-35d394ee772d@xs4all.nl Signed-off-by: Sasha Levin <sashal@kernel.org>