summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2021-01-19dm: eliminate potential source of excessive kernel log noiseMike Snitzer1-1/+1
commit 0378c625afe80eb3f212adae42cc33c9f6f31abf upstream. There wasn't ever a real need to log an error in the kernel log for ioctls issued with insufficient permissions. Simply return an error and if an admin/user is sufficiently motivated they can enable DM's dynamic debugging to see an explanation for why the ioctls were disallowed. Reported-by: Nir Soffer <nsoffer@redhat.com> Fixes: e980f62353c6 ("dm: don't allow ioctls to targets that don't map to whole devices") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev()Lu Baolu1-2/+20
commit 2d6ffc63f12417b979955a5b22ad9a76d2af5de9 upstream. The VT-d hardware will ignore those Addr bits which have been masked by the AM field in the PASID-based-IOTLB invalidation descriptor. As the result, if the starting address in the descriptor is not aligned with the address mask, some IOTLB caches might not invalidate. Hence people will see below errors. [ 1093.704661] dmar_fault: 29 callbacks suppressed [ 1093.704664] DMAR: DRHD: handling fault status reg 3 [ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2 fault addr 7f81c968d000 [fault reason 113] SM: Present bit in first-level paging entry is clear Fix this by using aligned address for PASID-based-IOTLB invalidation. Fixes: 1c4f88b7f1f9 ("iommu/vt-d: Shared virtual address in scalable mode") Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19drm/i915/gt: Restore clear-residual mitigations for Ivybridge, BaytrailChris Wilson1-1/+1
commit 09aa9e45863e9e25dfbf350bae89fc3c2964482c upstream. The mitigation is required for all gen7 platforms, now that it does not cause GPU hangs, restore it for Ivybridge and Baytrail. Fixes: 47f8253d2b89 ("drm/i915/gen7: Clear all EU/L3 residual contexts") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Cc: Bloomfield Jon <jon.bloomfield@intel.com> Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210111225220.3483-2-chris@chris-wilson.co.uk (cherry picked from commit 008ead6ef8f588a8c832adfe9db201d9be5fd410) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19drm/i915/icl: Fix initing the DSI DSC power refcount during HW readoutImre Deak1-4/+0
commit 2af5268180410b874fc06be91a1b2fbb22b1be0c upstream. For an enabled DSC during HW readout the corresponding power reference is taken along the CRTC power domain references in get_crtc_power_domains(). Remove the incorrect get ref from the DSI encoder hook. Fixes: 2b68392e638d ("drm/i915/dsi: add support for DSC") Cc: Vandita Kulkarni <vandita.kulkarni@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201209153952.3397959-1-imre.deak@intel.com (cherry picked from commit 3a9ec563a4ff770ae647f6ee539810f1866866c9) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19drm/i915/dsi: Use unconditional msleep for the panel_on_delay when there is ↵Hans de Goede1-3/+13
no reset-deassert MIPI-sequence commit 00cb645fd7e29bdd20967cd20fa8f77bcdf422f9 upstream. Commit 25b4620ee822 ("drm/i915/dsi: Skip delays for v3 VBTs in vid-mode") added an intel_dsi_msleep() helper which skips sleeping if the MIPI-sequences have a version of 3 or newer and the panel is in vid-mode; and it moved a bunch of msleep-s over to this new helper. This was based on my reading of the big comment around line 730 which starts with "Panel enable/disable sequences from the VBT spec.", where the "v3 video mode seq" column does not have any wait t# entries. Given that this code has been used on a lot of different devices without issues until now, it seems that my interpretation of the spec here is mostly correct. But now I have encountered one device, an Acer Aspire Switch 10 E SW3-016, where the panel will not light up unless we do actually honor the panel_on_delay after exexuting the MIPI_SEQ_PANEL_ON sequence. What seems to set this model apart is that it is lacking a MIPI_SEQ_DEASSERT_RESET sequence, which is where the power-on delay usually happens. Fix the panel not lighting up on this model by using an unconditional msleep(panel_on_delay) instead of intel_dsi_msleep() when there is no MIPI_SEQ_DEASSERT_RESET sequence. Fixes: 25b4620ee822 ("drm/i915/dsi: Skip delays for v3 VBTs in vid-mode") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201118124058.26021-1-hdegoede@redhat.com (cherry picked from commit 6fdb335f1c9c0845b50625de1624d8445c4c4a07) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19dm zoned: select CONFIG_CRC32Arnd Bergmann1-0/+1
commit b690bd546b227c32b860dae985a18bed8aa946fe upstream. Without crc32 support, this driver fails to link: arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_write_sb': dm-zoned-metadata.c:(.text+0xe98): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_check_sb': dm-zoned-metadata.c:(.text+0x7978): undefined reference to `crc32_le' Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19IB/mlx5: Fix error unwinding when set_has_smi_cap failsParav Pandit1-1/+1
commit 2cb091f6293df898b47f4e0f2e54324e2bbaf816 upstream. When set_has_smi_cap() fails, multiport master cleanup is missed. Fix it by doing the correct error unwinding goto. Fixes: a989ea01cb10 ("RDMA/mlx5: Move SMI caps logic") Link: https://lore.kernel.org/r/20210113121703.559778-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19RDMA/mlx5: Fix wrong free of blue flame register on errorMark Bloch1-1/+1
commit 1c3aa6bd0b823105c2030af85d92d158e815d669 upstream. If the allocation of the fast path blue flame register fails, the driver should free the regular blue flame register allocated a statement above, not the one that it just failed to allocate. Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Link: https://lore.kernel.org/r/20210113121703.559778-6-leon@kernel.org Reported-by: Hans Petter Selasky <hanss@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19bnxt_en: Improve stats context resource accounting with RDMA driver loaded.Michael Chan1-2/+6
commit 869c4d5eb1e6fbda66aa790c48bdb946d71494a0 upstream. The function bnxt_get_ulp_stat_ctxs() does not count the stats contexts used by the RDMA driver correctly when the RDMA driver is freeing the MSIX vectors. It assumes that if the RDMA driver is registered, the additional stats contexts will be needed. This is not true when the RDMA driver is about to unregister and frees the MSIX vectors. This slight error leads to over accouting of the stats contexts needed after the RDMA driver has unloaded. This will cause some firmware warning and error messages in dmesg during subsequent config. changes or ifdown/ifup. Fix it by properly accouting for extra stats contexts only if the RDMA driver is registered and MSIX vectors have been successfully requested. Fixes: c027c6b4e91f ("bnxt_en: get rid of num_stat_ctxs variable") Reviewed-by: Yongping Zhang <yongping.zhang@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grpDinghao Liu1-0/+3
commit a306aba9c8d869b1fdfc8ad9237f1ed718ea55e6 upstream. If usnic_ib_qp_grp_create() fails at the first call, dev_list will not be freed on error, which leads to memleak. Fixes: e3cf00d0a87f ("IB/usnic: Add Cisco VIC low-level hardware driver") Link: https://lore.kernel.org/r/20201226074248.2893-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19RDMA/restrack: Don't treat as an error allocation ID wrappingLeon Romanovsky1-0/+1
commit 3c638cdb8ecc0442552156e0fed8708dd2c7f35b upstream. xa_alloc_cyclic() call returns positive number if ID allocation succeeded but wrapped. It is not an error, so normalize the "ret" variable to zero as marker of not-an-error. drivers/infiniband/core/restrack.c:261 rdma_restrack_add() warn: 'ret' can be either negative or positive Fixes: fd47c2f99f04 ("RDMA/restrack: Convert internal DB from hash to XArray") Link: https://lore.kernel.org/r/20201216100753.1127638-1-leon@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19scsi: ufs: Fix possible power drain during system suspendStanley Chu1-1/+2
commit 1d53864c3617f5235f891ca0fbe9347c4cd35d46 upstream. Currently if device needs to do flush or BKOP operations, the device VCC power is kept during runtime-suspend period. However, if system suspend is happening while device is runtime-suspended, such power may not be disabled successfully. The reasons may be, 1. If current PM level is the same as SPM level, device will keep runtime-suspended by ufshcd_system_suspend(). 2. Flush recheck work may not be scheduled successfully during system suspend period. If it can wake up the system, this is also not the intention of the recheck work. To fix this issue, simply runtime-resume the device if the flush is allowed during runtime suspend period. Flush capability will be disabled while leaving runtime suspend, and also not be allowed in system suspend period. Link: https://lore.kernel.org/r/20201222072905.32221-2-stanley.chu@mediatek.com Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend") Reviewed-by: Chaotian Jing <chaotian.jing@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPTSagi Grimberg1-1/+1
commit ada831772188192243f9ea437c46e37e97a5975d upstream. We shouldn't call smp_processor_id() in a preemptible context, but this is advisory at best, so instead call __smp_processor_id(). Fixes: db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19nvme-tcp: fix possible data corruption with bio mergesSagi Grimberg1-1/+1
commit ca1ff67d0fb14f39cf0cc5102b1fbcc3b14f6fb9 upstream. When a bio merges, we can get a request that spans multiple bios, and the overall request payload size is the sum of all bios. When we calculate how much we need to send from the existing bio (and bvec), we did not take into account the iov_iter byte count cap. Since multipage bvecs support, bvecs can split in the middle which means that when we account for the last bvec send we should also take the iov_iter byte count cap as it might be lower than the last bvec size. Reported-by: Hao Wang <pkuwangh@gmail.com> Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Tested-by: Hao Wang <pkuwangh@gmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19nvme: don't intialize hwmon for discovery controllersSagi Grimberg1-3/+8
commit 5ab25a32cd90ce561ac28b9302766e565d61304c upstream. Discovery controllers usually don't support smart log page command. So when we connect to the discovery controller we see this warning: nvme nvme0: Failed to read smart log (error 24577) nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.123.1:8009 nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery" Introduce a new helper to understand if the controller is a discovery controller and use this helper to skip nvme_init_hwmon (also use it in other places that we check if the controller is a discovery controller). Fixes: 400b6a7b13a3 ("nvme: Add hardware monitoring support") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANYIsrael Rukshin1-8/+8
commit 7a84665619bb5da8c8b6517157875a1fd7632014 upstream. When setting port traddr to INADDR_ANY, the listening cm_id->device is NULL. The associate IB device is known only when a connect request event arrives, so checking T10-PI device capability should be done at this stage. Fixes: b09160c3996c ("nvmet-rdma: add metadata/T10-PI support") Signed-off-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19net/mlx5: E-Switch, fix changing vf VLANIDAlaa Hleihel1-14/+13
[ Upstream commit 25c904b59aaf4816337acd415514b0c47715f604 ] Adding vf VLANID for the first time, or after having cleared previously defined VLANID works fine, however, attempting to change an existing vf VLANID clears the rules on the firmware, but does not add new rules for the new vf VLANID. Fix this by changing the logic in function esw_acl_egress_lgcy_setup() so that it will always configure egress rules. Fixes: ea651a86d468 ("net/mlx5: E-Switch, Refactor eswitch egress acl codes") Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19net/mlx5: Fix passing zero to 'PTR_ERR'YueHaibing4-4/+4
[ Upstream commit 0c4accc41cb56e527c8c049f5495af9f3d6bef7e ] Fix smatch warnings: drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_lgcy.c:105 esw_acl_egress_lgcy_setup() warn: passing zero to 'PTR_ERR' drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c:177 esw_acl_egress_ofld_setup() warn: passing zero to 'PTR_ERR' drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ingress_lgcy.c:184 esw_acl_ingress_lgcy_setup() warn: passing zero to 'PTR_ERR' drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ingress_ofld.c:262 esw_acl_ingress_ofld_setup() warn: passing zero to 'PTR_ERR' esw_acl_table_create() never returns NULL, so NULL test should be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19net/mlx5e: CT: Use per flow counter when CT flow accounting is enabledOz Shlomo1-28/+49
[ Upstream commit eed38eeee734756596e2cc163bdc7dac3be501b1 ] Connection counters may be shared for both directions when the counter is used for connection aging purposes. However, if TC flow accounting is enabled then a unique counter is required per direction. Instantiate a unique counter per direction if the conntrack accounting extension is enabled. Use a shared counter when the connection accounting extension is disabled. Fixes: 1edae2335adf ("net/mlx5e: CT: Use the same counter for both directions") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_devLu Baolu1-2/+14
[ Upstream commit c062db039f40e868c371c36afe8d0fac64305b5d ] The iommu-dma constrains IOVA allocation based on the domain geometry that the driver reports. Update domain geometry everytime a domain is attached to or detached from a device. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19nvme-fc: avoid calling _nvme_fc_abort_outstanding_ios from interrupt contextJames Smart1-1/+14
[ Upstream commit 19fce0470f05031e6af36e49ce222d0f0050d432 ] Recent patches changed calling sequences. nvme_fc_abort_outstanding_ios used to be called from a timeout or work context. Now it is being called in an io completion context, which can be an interrupt handler. Unfortunately, the abort outstanding ios routine attempts to stop nvme queues and nested routines that may try to sleep, which is in conflict with the interrupt handler. Correct replacing the direct call with a work element scheduling, and the abort outstanding ios routine will be called in the work element. Fixes: 95ced8a2c72d ("nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery") Signed-off-by: James Smart <james.smart@broadcom.com> Reported-by: Daniel Wagner <dwagner@suse.de> Tested-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/msm: Call msm_init_vram before binding the gpuCraig Tatlor1-4/+4
[ Upstream commit d863f0c7b536288e2bd40cbc01c10465dd226b11 ] vram.size is needed when binding a gpu without an iommu and is defined in msm_init_vram(), so run that before binding it. Signed-off-by: Craig Tatlor <ctatlor97@gmail.com> Reviewed-by: Brian Masney <masneyb@onstation.org> Tested-by: Alexey Minnekhanov <alexeymin@postmarketos.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19iommu/vt-d: Fix lockdep splat in sva bind()/unbind()Lu Baolu1-6/+8
[ Upstream commit 420d42f6f9db27d88bc4f83e3e668fcdacbf7e29 ] Lock(&iommu->lock) without disabling irq causes lockdep warnings. ======================================================== WARNING: possible irq lock inversion dependency detected 5.11.0-rc1+ #828 Not tainted -------------------------------------------------------- kworker/0:1H/120 just changed the state of lock: ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.0+0x32/0x120 but this lock took another, SOFTIRQ-unsafe lock in the past: (&iommu->lock){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&iommu->lock); local_irq_disable(); lock(device_domain_lock); lock(&iommu->lock); <Interrupt> lock(device_domain_lock); *** DEADLOCK *** Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19usb: typec: Fix copy paste error for NVIDIA alt-mode descriptionPeter Robinson1-1/+1
[ Upstream commit 41952a66015466c3208aac96b14ffd92e0943589 ] The name of the module for the NVIDIA alt-mode is incorrect as it looks to be a copy-paste error from the entry above, update it to the correct typec_nvidia module name. Cc: Ajay Gupta <ajayg@nvidia.com> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Peter Robinson <pbrobinson@gmail.com> Link: https://lore.kernel.org/r/20210106001605.167917-1-pbrobinson@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/amdgpu: fix potential memory leak during navi12 deinitializationJiawei Gu1-4/+14
[ Upstream commit e6d5c64efaa34aae3815a9afeb1314a976142e83 ] Navi12 HDCP & DTM deinitialization needs continue to free bo if already created though initialized flag is not set. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/amd/pm: fix the failure when change power profile for renoirXiaojian Du2-0/+2
[ Upstream commit 44cb39e19a05ca711bcb6e776e0a4399223204a0 ] This patch is to fix the failure when change power profile to "profile_peak" for renoir. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/amdgpu: fix a GPU hang issue when remove deviceDennis Li1-2/+2
[ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ] When GFXOFF is enabled and GPU is idle, driver will fail to access some registers. Therefore change to disable power gating before all access registers with MMIO. Dmesg log is as following: amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device. amdgpu: cp queue pipe 4 queue 0 preemption failed amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/amd/display: fix sysfs amdgpu_current_backlight_pwm NULL pointer issueKevin Wang1-1/+6
[ Upstream commit a7b5d9dd57298333e6e9f4c167f01385d922bbfb ] fix NULL pointer issue when read sysfs amdgpu_current_backlight_pwm sysfs node. Call Trace: [ 248.273833] BUG: kernel NULL pointer dereference, address: 0000000000000130 [ 248.273930] #PF: supervisor read access in kernel mode [ 248.273993] #PF: error_code(0x0000) - not-present page [ 248.274054] PGD 0 P4D 0 [ 248.274092] Oops: 0000 [#1] SMP PTI [ 248.274138] CPU: 2 PID: 1377 Comm: cat Tainted: G OE 5.9.0-rc5-drm-next-5.9+ #1 [ 248.274233] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 3802 03/15/2018 [ 248.274641] RIP: 0010:dc_link_get_backlight_level+0x5/0x70 [amdgpu] [ 248.274718] Code: 67 ff ff ff 41 b9 03 00 00 00 e9 45 ff ff ff d1 ea e9 55 ff ff ff 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 87 30 01 00 00 48 8b 00 48 8b 88 88 03 00 00 48 8d 81 e8 01 [ 248.274919] RSP: 0018:ffffb5ad809b3df0 EFLAGS: 00010203 [ 248.274982] RAX: ffffa0f77d1c0010 RBX: ffffa0f793ae9168 RCX: 0000000000000001 [ 248.275064] RDX: ffffa0f79753db00 RSI: 0000000000000001 RDI: 0000000000000000 [ 248.275145] RBP: ffffb5ad809b3e00 R08: ffffb5ad809b3da0 R09: 0000000000000000 [ 248.275225] R10: ffffb5ad809b3e68 R11: 0000000000000000 R12: ffffa0f793ae9190 [ 248.275306] R13: ffffb5ad809b3ef0 R14: 0000000000000001 R15: ffffa0f793ae9168 [ 248.275388] FS: 00007f5f1ec4d540(0000) GS:ffffa0f79ec80000(0000) knlGS:0000000000000000 [ 248.275480] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 248.275547] CR2: 0000000000000130 CR3: 000000042a03c005 CR4: 00000000003706e0 [ 248.275628] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 248.275708] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 248.275789] Call Trace: [ 248.276124] ? current_backlight_read+0x24/0x40 [amdgpu] [ 248.276194] seq_read+0xc3/0x3f0 [ 248.276240] full_proxy_read+0x5c/0x90 [ 248.276290] vfs_read+0xa7/0x190 [ 248.276334] ksys_read+0xa7/0xe0 [ 248.276379] __x64_sys_read+0x1a/0x20 [ 248.276429] do_syscall_64+0x37/0x80 [ 248.276477] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 248.276538] RIP: 0033:0x7f5f1e75c191 [ 248.276585] Code: fe ff ff 48 8d 3d b7 9d 0a 00 48 83 ec 08 e8 46 4d 02 00 66 0f 1f 44 00 00 48 8d 05 71 07 2e 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53Hw [ 248.276784] RSP: 002b:00007ffcb1fc3f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 248.276872] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5f1e75c191 [ 248.276953] RDX: 0000000000020000 RSI: 00007f5f1ec2b000 RDI: 0000000000000003 [ 248.277034] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000 [ 248.277115] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f5f1ec2b000 [ 248.277195] R13: 0000000000000003 R14: 00007f5f1ec2b00f R15: 0000000000020000 [ 248.277279] Modules linked in: amdgpu(OE) iommu_v2 gpu_sched ttm(OE) drm_kms_helper cec drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic ledtrig_audio intel_rapl_msr intel_rapl_common snd_hda_intel snd_intel_dspcfg x86_pkg_temp_thermal intel_powerclamp snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event mei_hdcp coretemp snd_rawmidi snd_seq kvm_intel kvm snd_seq_device snd_timer irqbypass joydev snd input_leds soundcore crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl intel_cstate mac_hid mei_me serio_raw mei eeepc_wmi wmi_bmof asus_wmi mxm_wmi intel_wmi_thunderbolt acpi_pad sparse_keymap efi_pstore sch_fq_codel parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid e1000e psmouse ahci libahci wmi video [ 248.278211] CR2: 0000000000000130 [ 248.278221] ---[ end trace 1fbe72fe6f91091d ]--- [ 248.357226] RIP: 0010:dc_link_get_backlight_level+0x5/0x70 [amdgpu] [ 248.357272] Code: 67 ff ff ff 41 b9 03 00 00 00 e9 45 ff ff ff d1 ea e9 55 ff ff ff 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b 87 30 01 00 00 48 8b 00 48 8b 88 88 03 00 00 48 8d 81 e8 01 Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19nvmet-rdma: Fix list_del corruption on queue establishment failureIsrael Rukshin1-0/+10
[ Upstream commit 9ceb7863537748c67fa43ac4f2f565819bbd36e4 ] When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some requests at rsp_wait_list. In case a disconnect occurs at this state, no one will empty this list and will return the requests to free_rsps list. Normally nvmet_rdma_queue_established() free those requests after moving the queue to NVMET_RDMA_Q_LIVE state, but in this case __nvmet_rdma_queue_disconnect() is called before. The crash happens at nvmet_rdma_free_rsps() when calling list_del(&rsp->free_list), because the request exists only at the wait list. To fix the issue, simply clear rsp_wait_list when destroying the queue. Signed-off-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19nvme: avoid possible double fetch in handling CQELalithambika Krishnakumar1-3/+4
[ Upstream commit 62df80165d7f197c9c0652e7416164f294a96661 ] While handling the completion queue, keep a local copy of the command id from the DMA-accessible completion entry. This silences a time-of-check to time-of-use (TOCTOU) warning from KF/x[1], with respect to a Thunderclap[2] vulnerability analysis. The double-read impact appears benign. There may be a theoretical window for @command_id to be used as an adversary-controlled array-index-value for mounting a speculative execution attack, but that mitigation is saved for a potential follow-on. A man-in-the-middle attack on the data payload is out of scope for this analysis and is hopefully mitigated by filesystem integrity mechanisms. [1] https://github.com/intel/kernel-fuzzer-for-xen-project [2] http://thunderclap.io/thunderclap-paper-ndss2019.pdf Signed-off-by: Lalithambika Krishna Kumar <lalithambika.krishnakumar@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19nvme-pci: mark Samsung PM1725a as IGNORE_DEV_SUBNQNGopal Tiwari1-1/+2
[ Upstream commit 7ee5c78ca3895d44e918c38332921983ed678be0 ] A system with more than one of these SSDs will only have one usable. Hence the kernel fails to detect nvme devices due to duplicate cntlids. [ 6.274554] nvme nvme1: Duplicate cntlid 33 with nvme0, rejecting [ 6.274566] nvme nvme1: Removing after probe failure status: -22 Adding the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk to resolves the issue. Signed-off-by: Gopal Tiwari <gtiwari@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19net: ethernet: fs_enet: Add missing MODULE_LICENSEMichael Ellerman2-0/+2
[ Upstream commit 445c6198fe7be03b7d38e66fe8d4b3187bc251d4 ] Since commit 1d6cd3929360 ("modpost: turn missing MODULE_LICENSE() into error") the ppc32_allmodconfig build fails with: ERROR: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/freescale/fs_enet/mii-fec.o ERROR: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/freescale/fs_enet/mii-bitbang.o Add the missing MODULE_LICENSEs to fix the build. Both files include a copyright header indicating they are GPL v2. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19misdn: dsp: select CONFIG_BITREVERSEArnd Bergmann1-0/+1
[ Upstream commit 51049bd903a81307f751babe15a1df8d197884e8 ] Without this, we run into a link error arm-linux-gnueabi-ld: drivers/isdn/mISDN/dsp_audio.o: in function `dsp_audio_generate_law_tables': (.text+0x30c): undefined reference to `byte_rev_table' arm-linux-gnueabi-ld: drivers/isdn/mISDN/dsp_audio.o:(.text+0x5e4): more undefined references to `byte_rev_table' follow Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19spi: fix the divide by 0 error when calculating xfer waiting timeXu Yilun1-1/+5
[ Upstream commit 6170d077bf92c5b3dfbe1021688d3c0404f7c9e9 ] The xfer waiting time is the result of xfer->len / xfer->speed_hz. This patch makes the assumption of 100khz xfer speed if the xfer->speed_hz is not assigned and stays 0. This avoids the divide by 0 issue and ensures a reasonable tolerant waiting time. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Link: https://lore.kernel.org/r/1609723749-3557-1-git-send-email-yilun.xu@intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19hwmon: (pwm-fan) Ensure that calculation doesn't discard big period valuesUwe Kleine-König1-1/+11
[ Upstream commit 1eda52334e6d13eb1a85f713ce06dd39342b5020 ] With MAX_PWM being defined to 255 the code unsigned long period; ... period = ctx->pwm->args.period; state.duty_cycle = DIV_ROUND_UP(pwm * (period - 1), MAX_PWM); calculates a too small value for duty_cycle if the configured period is big (either by discarding the 64 bit value ctx->pwm->args.period or by overflowing the multiplication). As this results in a too slow fan and so maybe an overheating machine better be safe than sorry and error out in .probe. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20201215092031.152243-1-u.kleine-koenig@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19habanalabs: Fix memleak in hl_device_resetDinghao Liu1-0/+2
[ Upstream commit b000700d6db50c933ce8b661154e26cf4ad06dba ] When kzalloc() fails, we should execute hl_mmu_fini() to release the MMU module. It's the same when hl_ctx_init() fails. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19spi: altera: fix return value for altera_spi_txrx()Xu Yilun1-12/+14
[ Upstream commit ede090f5a438e97d0586f64067bbb956e30a2a31 ] This patch fixes the return value for altera_spi_txrx. It should return 1 for interrupt transfer mode, and return 0 for polling transfer mode. The altera_spi_txrx() implements the spi_controller.transfer_one callback. According to the spi-summary.rst, the transfer_one should return 0 when transfer is finished, return 1 when transfer is still in progress. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Link: https://lore.kernel.org/r/1609219662-27057-2-git-send-email-yilun.xu@intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19staging: spmi: hisi-spmi-controller: Fix some error handling pathsChristophe JAILLET1-6/+15
[ Upstream commit 12b38ea040b3bb2a30eb9cd488376df5be7ea81f ] IN the probe function, if an error occurs after calling 'spmi_controller_alloc()', it must be undone by a corresponding 'spmi_controller_put() call. In the remove function, use 'spmi_controller_put(ctrl)' instead of 'kfree(ctrl)'. While a it fix an error message (s/spmi_add_controller/spmi_controller_add/) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201213151105.137731-1-christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19habanalabs: register to pci shutdown callbackOded Gabbay1-0/+1
[ Upstream commit fcaebc7354188b0d708c79df4390fbabd4d9799d ] We need to make sure our device is idle when rebooting a virtual machine. This is done in the driver level. The firmware will later handle FLR but we want to be extra safe and stop the devices until the FLR is handled. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19habanalabs/gaudi: retry loading TPC f/w on -EINTROded Gabbay1-2/+8
[ Upstream commit 98e8781f008372057bd5cb059ca6b507371e473d ] If loading the firmware file for the TPC f/w was interrupted, try to do it again, up to 5 times. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19habanalabs: adjust pci controller init to new firmwareOded Gabbay3-14/+21
[ Upstream commit 377182a3cc5ae6cc17fb04d06864c975f9f71c18 ] When the firmware security is enabled, the pcie_aux_dbi_reg_addr register in the PCI controller is blocked. Therefore, ignore the result of writing to this register and assume it worked. Also remove the prints on errors in the internal ELBI write function. If the security is enabled, the firmware is responsible for setting this register correctly so we won't have any problem. If the security is disabled, the write will work (unless something is totally broken at the PCI level and then the whole sequence will fail). In addition, remove a write to register pcie_aux_dbi_reg_addr+4, which was never actually needed. Moreover, PCIE_DBI registers are blocked to access from host when firmware security is enabled. Use a different register to flush the writes. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19ethernet: ucc_geth: fix definition and size of ucc_geth_tx_global_pramRasmus Villemoes1-1/+8
[ Upstream commit 887078de2a23689e29d6fa1b75d7cbc544c280be ] Table 8-53 in the QUICC Engine Reference manual shows definitions of fields up to a size of 192 bytes, not just 128. But in table 8-111, one does find the text Base Address of the Global Transmitter Parameter RAM Page. [...] The user needs to allocate 128 bytes for this page. The address must be aligned to the page size. I've checked both rev. 7 (11/2015) and rev. 9 (05/2018) of the manual; they both have this inconsistency (and the table numbers are the same). Adding a bit of debug printing, on my board the struct ucc_geth_tx_global_pram is allocated at offset 0x880, while the (opaque) ucc_geth_thread_data_tx gets allocated immediately afterwards, at 0x900. So whatever the engine writes into the thread data overlaps with the tail of the global tx pram (and devmem says that something does get written during a simple ping). I haven't observed any failure that could be attributed to this, but it seems to be the kind of thing that would be extremely hard to debug. So extend the struct definition so that we do allocate 192 bytes. Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19regulator: bd718x7: Add enable timesGuido Günther1-0/+57
[ Upstream commit 3b66e4a8e58a85af3212c7117d7a29c9ef6679a2 ] Use the typical startup times from the data sheet so boards get a reasonable default. Not setting any enable time can lead to board hangs when e.g. clocks are enabled too soon afterwards. This fixes gpu power domain resume on the Librem 5. [Moved #defines into driver, seems to be general agreement and avoids any cross tree issues -- broonie] Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/41fb2ed19f584f138336344e2297ae7301f72b75.1608316658.git.agx@sigxcpu.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19ath11k: qmi: try to allocate a big block of DMA memory firstCarl Huang2-2/+23
[ Upstream commit f6f92968e1e5a7a9d211faaebefc26ebe408dad7 ] Not all firmware versions support allocating DMA memory in smaller blocks so first try to allocate big block of DMA memory for QMI. If the allocation fails, let firmware request multiple blocks of DMA memory with smaller size. This also fixes an unnecessary error message seen during ath11k probe on QCA6390: ath11k_pci 0000:06:00.0: Respond mem req failed, result: 1, err: 0 ath11k_pci 0000:06:00.0: qmi failed to respond fw mem req:-22 Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Signed-off-by: Carl Huang <cjhuang@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1608127593-15192-1-git-send-email-kvalo@codeaurora.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19ath11k: fix crash caused by NULL rx_channelCarl Huang1-3/+7
[ Upstream commit 3597010630d0aa96f5778901e691c6068bb86318 ] During connect and disconnect stress test, crashed happened because ar->rx_channel is NULL. Fix it by checking whether ar->rx_channel is NULL. Crash stack is as below: RIP: 0010:ath11k_dp_rx_h_ppdu+0x110/0x230 [ath11k] [ 5028.808963] ath11k_dp_rx_wbm_err+0x14a/0x360 [ath11k] [ 5028.808970] ath11k_dp_rx_process_wbm_err+0x41c/0x520 [ath11k] [ 5028.808978] ath11k_dp_service_srng+0x25e/0x2d0 [ath11k] [ 5028.808982] ath11k_pci_ext_grp_napi_poll+0x23/0x80 [ath11k_pci] [ 5028.808986] net_rx_action+0x27e/0x400 [ 5028.808990] __do_softirq+0xfd/0x2bb [ 5028.808993] irq_exit+0xa6/0xb0 [ 5028.808995] do_IRQ+0x56/0xe0 [ 5028.808997] common_interrupt+0xf/0xf Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Signed-off-by: Carl Huang <cjhuang@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20201211055613.9310-1-cjhuang@codeaurora.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19r8152: Add Lenovo Powered USB-C Travel HubLeon Schuermann2-0/+8
commit cb82a54904a99df9e8f9e9d282046055dae5a730 upstream. This USB-C Hub (17ef:721e) based on the Realtek RTL8153B chip used to use the cdc_ether driver. However, using this driver, with the system suspended the device constantly sends pause-frames as soon as the receive buffer fills up. This causes issues with other devices, where some Ethernet switches stop forwarding packets altogether. Using the Realtek driver (r8152) fixes this issue. Pause frames are no longer sent while the host system is suspended. Signed-off-by: Leon Schuermann <leon@is.currently.online> Tested-by: Leon Schuermann <leon@is.currently.online> Link: https://lore.kernel.org/r/20210111190312.12589-2-leon@is.currently.online Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19stmmac: intel: change all EHL/TGL to auto detect phy addrVoon Weifeng1-5/+1
commit bff6f1db91e330d7fba56f815cdbc412c75fe163 upstream. Set all EHL/TGL phy_addr to -1 so that the driver will automatically detect it at run-time by probing all the possible 32 addresses. Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Link: https://lore.kernel.org/r/20201106094341.4241-1-vee.khee.wong@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19dm crypt: defer decryption to a tasklet if interrupts disabledIgnat Korchagin1-2/+6
commit c87a95dc28b1431c7e77e2c0c983cf37698089d2 upstream. On some specific hardware on early boot we occasionally get: [ 1193.920255][ T0] BUG: sleeping function called from invalid context at mm/mempool.c:381 [ 1193.936616][ T0] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/69 [ 1193.953233][ T0] no locks held by swapper/69/0. [ 1193.965871][ T0] irq event stamp: 575062 [ 1193.977724][ T0] hardirqs last enabled at (575061): [<ffffffffab73f662>] tick_nohz_idle_exit+0xe2/0x3e0 [ 1194.002762][ T0] hardirqs last disabled at (575062): [<ffffffffab74e8af>] flush_smp_call_function_from_idle+0x4f/0x80 [ 1194.029035][ T0] softirqs last enabled at (575050): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20 [ 1194.054227][ T0] softirqs last disabled at (575043): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20 [ 1194.079389][ T0] CPU: 69 PID: 0 Comm: swapper/69 Not tainted 5.10.6-cloudflare-kasan-2021.1.4-dev #1 [ 1194.104103][ T0] Hardware name: NULL R162-Z12-CD/MZ12-HD4-CD, BIOS R10 06/04/2020 [ 1194.119591][ T0] Call Trace: [ 1194.130233][ T0] dump_stack+0x9a/0xcc [ 1194.141617][ T0] ___might_sleep.cold+0x180/0x1b0 [ 1194.153825][ T0] mempool_alloc+0x16b/0x300 [ 1194.165313][ T0] ? remove_element+0x160/0x160 [ 1194.176961][ T0] ? blk_mq_end_request+0x4b/0x490 [ 1194.188778][ T0] crypt_convert+0x27f6/0x45f0 [dm_crypt] [ 1194.201024][ T0] ? rcu_read_lock_sched_held+0x3f/0x70 [ 1194.212906][ T0] ? module_assert_mutex_or_preempt+0x3e/0x70 [ 1194.225318][ T0] ? __module_address.part.0+0x1b/0x3a0 [ 1194.237212][ T0] ? is_kernel_percpu_address+0x5b/0x190 [ 1194.249238][ T0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [dm_crypt] [ 1194.261593][ T0] ? is_module_address+0x25/0x40 [ 1194.272905][ T0] ? static_obj+0x8a/0xc0 [ 1194.283582][ T0] ? lockdep_init_map_waits+0x26a/0x700 [ 1194.295570][ T0] ? __raw_spin_lock_init+0x39/0x110 [ 1194.307330][ T0] kcryptd_crypt_read_convert+0x31c/0x560 [dm_crypt] [ 1194.320496][ T0] ? kcryptd_queue_crypt+0x1be/0x380 [dm_crypt] [ 1194.333203][ T0] blk_update_request+0x6d7/0x1500 [ 1194.344841][ T0] ? blk_mq_trigger_softirq+0x190/0x190 [ 1194.356831][ T0] blk_mq_end_request+0x4b/0x490 [ 1194.367994][ T0] ? blk_mq_trigger_softirq+0x190/0x190 [ 1194.379693][ T0] flush_smp_call_function_queue+0x24b/0x560 [ 1194.391847][ T0] flush_smp_call_function_from_idle+0x59/0x80 [ 1194.403969][ T0] do_idle+0x287/0x450 [ 1194.413891][ T0] ? arch_cpu_idle_exit+0x40/0x40 [ 1194.424716][ T0] ? lockdep_hardirqs_on_prepare+0x286/0x3f0 [ 1194.436399][ T0] ? _raw_spin_unlock_irqrestore+0x39/0x40 [ 1194.447759][ T0] cpu_startup_entry+0x19/0x20 [ 1194.458038][ T0] secondary_startup_64_no_verify+0xb0/0xbb IO completion can be queued to a different CPU by the block subsystem as a "call single function/data". The CPU may run these routines from the idle task, but it does so with interrupts disabled. It is not a good idea to do decryption with irqs disabled even in an idle task context, so just defer it to a tasklet (as is done with requests from hard irqs). Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19dm crypt: do not call bio_endio() from the dm-crypt taskletIgnat Korchagin1-1/+23
commit 8e14f610159d524cd7aac37982826d3ef75c09e8 upstream. Sometimes, when dm-crypt executes decryption in a tasklet, we may get "BUG: KASAN: use-after-free in tasklet_action_common.constprop..." with a kasan-enabled kernel. When the decryption fully completes in the tasklet, dm-crypt will call bio_endio(), which in turn will call clone_endio() from dm.c core code. That function frees the resources associated with the bio, including per bio private structures. For dm-crypt it will free the current struct dm_crypt_io, which contains our tasklet object, causing use-after-free, when the tasklet is being dequeued by the kernel. To avoid this, do not call bio_endio() from the current tasklet context, but delay its execution to the dm-crypt IO workqueue. Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: <stable@vger.kernel.org> # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19dm crypt: do not wait for backlogged crypto request completion in softirqIgnat Korchagin1-5/+98
commit 8abec36d1274bbd5ae8f36f3658b9abb3db56c31 upstream. Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") made it possible for some code paths in dm-crypt to be executed in softirq context, when the underlying driver processes IO requests in interrupt/softirq context. When Crypto API backlogs a crypto request, dm-crypt uses wait_for_completion to avoid sending further requests to an already overloaded crypto driver. However, if the code is executing in softirq context, we might get the following stacktrace: [ 210.235213][ C0] BUG: scheduling while atomic: fio/2602/0x00000102 [ 210.236701][ C0] Modules linked in: [ 210.237566][ C0] CPU: 0 PID: 2602 Comm: fio Tainted: G W 5.10.0+ #50 [ 210.239292][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 210.241233][ C0] Call Trace: [ 210.241946][ C0] <IRQ> [ 210.242561][ C0] dump_stack+0x7d/0xa3 [ 210.243466][ C0] __schedule_bug.cold+0xb3/0xc2 [ 210.244539][ C0] __schedule+0x156f/0x20d0 [ 210.245518][ C0] ? io_schedule_timeout+0x140/0x140 [ 210.246660][ C0] schedule+0xd0/0x270 [ 210.247541][ C0] schedule_timeout+0x1fb/0x280 [ 210.248586][ C0] ? usleep_range+0x150/0x150 [ 210.249624][ C0] ? unpoison_range+0x3a/0x60 [ 210.250632][ C0] ? ____kasan_kmalloc.constprop.0+0x82/0xa0 [ 210.251949][ C0] ? unpoison_range+0x3a/0x60 [ 210.252958][ C0] ? __prepare_to_swait+0xa7/0x190 [ 210.254067][ C0] do_wait_for_common+0x2ab/0x370 [ 210.255158][ C0] ? usleep_range+0x150/0x150 [ 210.256192][ C0] ? bit_wait_io_timeout+0x160/0x160 [ 210.257358][ C0] ? blk_update_request+0x757/0x1150 [ 210.258582][ C0] ? _raw_spin_lock_irq+0x82/0xd0 [ 210.259674][ C0] ? _raw_read_unlock_irqrestore+0x30/0x30 [ 210.260917][ C0] wait_for_completion+0x4c/0x90 [ 210.261971][ C0] crypt_convert+0x19a6/0x4c00 [ 210.263033][ C0] ? _raw_spin_lock_irqsave+0x87/0xe0 [ 210.264193][ C0] ? kasan_set_track+0x1c/0x30 [ 210.265191][ C0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [ 210.266283][ C0] ? kmem_cache_free+0x104/0x470 [ 210.267363][ C0] ? crypt_endio+0x91/0x180 [ 210.268327][ C0] kcryptd_crypt_read_convert+0x30e/0x420 [ 210.269565][ C0] blk_update_request+0x757/0x1150 [ 210.270563][ C0] blk_mq_end_request+0x4b/0x480 [ 210.271680][ C0] blk_done_softirq+0x21d/0x340 [ 210.272775][ C0] ? _raw_spin_lock+0x81/0xd0 [ 210.273847][ C0] ? blk_mq_stop_hw_queue+0x30/0x30 [ 210.275031][ C0] ? _raw_read_lock_irq+0x40/0x40 [ 210.276182][ C0] __do_softirq+0x190/0x611 [ 210.277203][ C0] ? handle_edge_irq+0x221/0xb60 [ 210.278340][ C0] asm_call_irq_on_stack+0x12/0x20 [ 210.279514][ C0] </IRQ> [ 210.280164][ C0] do_softirq_own_stack+0x37/0x40 [ 210.281281][ C0] irq_exit_rcu+0x110/0x1b0 [ 210.282286][ C0] common_interrupt+0x74/0x120 [ 210.283376][ C0] asm_common_interrupt+0x1e/0x40 [ 210.284496][ C0] RIP: 0010:_aesni_enc1+0x65/0xb0 Fix this by making crypt_convert function reentrant from the point of a single bio and make dm-crypt defer further bio processing to a workqueue, if Crypto API backlogs a request in interrupt context. Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>