summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-07-20ALSA: ac97: fix PM reference leak in ac97_bus_remove()Yufen Yu1-1/+1
[ Upstream commit a38e93302ee25b2ca6f4ee76c6c974cf3637985e ] pm_runtime_get_sync will increment pm usage counter even it failed. Forgetting to putting operation will result in reference leak here. Fix it by replacing it with pm_runtime_resume_and_get to keep usage counter balanced. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Link: https://lore.kernel.org/r/20210524093811.612302-1-yuyufen@huawei.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: core: Cap scsi_host cmd_per_lun at can_queueJohn Garry1-0/+3
[ Upstream commit ea2f0f77538c50739b9fb4de4700cee5535e1f77 ] The sysfs handling function sdev_store_queue_depth() enforces that the sdev queue depth cannot exceed shost can_queue. The initial sdev queue depth comes from shost cmd_per_lun. However, the LLDD may manually set cmd_per_lun to be larger than can_queue, which leads to an initial sdev queue depth greater than can_queue. Such an issue was reported in [0], which caused a hang. That has since been fixed in commit fc09acb7de31 ("scsi: scsi_debug: Fix cmd_per_lun, set to max_queue"). Stop this possibly happening for other drivers by capping shost cmd_per_lun at shost can_queue. [0] https://lore.kernel.org/linux-scsi/YHaez6iN2HHYxYOh@T590/ Link: https://lore.kernel.org/r/1621434662-173079-1-git-send-email-john.garry@huawei.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: lpfc: Fix crash when lpfc_sli4_hba_setup() fails to initialize the SGLsJames Smart1-2/+3
[ Upstream commit 5aa615d195f1e142c662cb2253f057c9baec7531 ] The driver is encountering a crash in lpfc_free_iocb_list() while performing initial attachment. Code review found this to be an errant failure path that was taken, jumping to a tag that then referenced structures that were uninitialized. Fix the failure path. Link: https://lore.kernel.org/r/20210514195559.119853-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: lpfc: Fix "Unexpected timeout" error in direct attach topologyJames Smart1-0/+9
[ Upstream commit e30d55137edef47434c40d7570276a0846fe922c ] An 'unexpected timeout' message may be seen in a point-2-point topology. The message occurs when a PLOGI is received before the driver is notified of FLOGI completion. The FLOGI completion failure causes discovery to be triggered for a second time. The discovery timer is restarted but no new discovery activity is initiated, thus the timeout message eventually appears. In point-2-point, when discovery has progressed before the FLOGI completion is processed, it is not a failure. Add code to FLOGI completion to detect that discovery has progressed and exit the FLOGI handling (noop'ing it). Link: https://lore.kernel.org/r/20210514195559.119853-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: hisi_sas: Propagate errors in interrupt_init_v1_hw()Sergey Shtylyov1-6/+6
[ Upstream commit ab17122e758ef68fb21033e25c041144067975f5 ] After commit 6c11dc060427 ("scsi: hisi_sas: Fix IRQ checks") we have the error codes returned by platform_get_irq() ready for the propagation upsream in interrupt_init_v1_hw() -- that will fix still broken deferred probing. Let's propagate the error codes from devm_request_irq() as well since I don't see the reason to override them with -ENOENT... Link: https://lore.kernel.org/r/49ba93a3-d427-7542-d85a-b74fe1a33a73@omp.ru Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: arcmsr: Fix doorbell status being updated late on ARC-1886ching Huang1-2/+9
[ Upstream commit d9a231226f28261a787535e08d0c78669e1ad010 ] It is possible for the IOP to be delayed in updating the doorbell status. The doorbell status should not be 0 so loop until the value changes. Link: https://lore.kernel.org/r/afdfdf7eabecf14632492c4987a6b9ac6312a7ad.camel@areca.com.tw Signed-off-by: ching Huang <ching2048@areca.com.tw> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20w1: ds2438: fixing bug that would always get page0Luiz Sampaio1-2/+2
[ Upstream commit 1f5e7518f063728aee0679c5086b92d8ea429e11 ] The purpose of the w1_ds2438_get_page function is to get the register values at the page passed as the pageno parameter. However, the page0 was hardcoded, such that the function always returned the page0 contents. Fixed so that the function can retrieve any page. Signed-off-by: Luiz Sampaio <sampaio.ime@gmail.com> Link: https://lore.kernel.org/r/20210519223046.13798-5-sampaio.ime@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20ASoC: SOF: topology: fix assignment to use le32_to_cpuJaska Uimonen1-1/+1
[ Upstream commit ccaea61a8d1b8180cc3c470e383381884e4bc1f2 ] Fix sparse warning by using le32_to_cpu. Signed-off-by: Jaska Uimonen <jaska.uimonen@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20210521092804.3721324-6-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20usb: common: usb-conn-gpio: fix NULL pointer dereference of chargerChunfeng Yun1-18/+26
[ Upstream commit 880287910b1892ed2cb38977893b947382a09d21 ] When power on system with OTG cable, IDDIG's interrupt arises before the charger registration, it will cause a NULL pointer dereference, fix the issue by registering the power supply before requesting IDDIG/VBUS irq. Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/1621406386-18838-1-git-send-email-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20Revert "ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro"Takashi Sakamoto3-4/+4
[ Upstream commit 5d6fb80a142b5994355ce675c517baba6089d199 ] This reverts commit 0edabdfe89581669609eaac5f6a8d0ae6fe95e7f. I've explained that optional FireWire card for d.2 is also built-in to d.2 Pro, however it's wrong. The optional card uses DM1000 ASIC and has 'Mackie DJ Mixer' in its model name of configuration ROM. On the other hand, built-in FireWire card for d.2 Pro and d.4 Pro uses OXFW971 ASIC and has 'd.Pro' in its model name according to manuals and user experiences. The former card is not the card for d.2 Pro. They are similar in appearance but different internally. Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210518084557.102681-2-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20ALSA: usx2y: Don't call free_pages_exact() with NULL addressTakashi Iwai1-2/+5
[ Upstream commit cae0cf651adccee2c3f376e78f30fbd788d0829f ] Unlike some other functions, we can't pass NULL pointer to free_pages_exact(). Add a proper NULL check for avoiding possible Oops. Link: https://lore.kernel.org/r/20210517131545.27252-10-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20ALSA: usx2y: Avoid camelCaseTakashi Iwai8-649/+649
[ Upstream commit bae3ce4942980d5f7b2b9855f4a2db0c00f9dfbd ] For improving readability, convert camelCase fields, variables and functions to the plain names with underscore. Also align the macros to be capital letters. All done via sed, no functional changes. Note that you'll still see many coding style issues even after this patch; the fixes will follow. Link: https://lore.kernel.org/r/20210517131545.27252-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20iio: magn: bmc150: Balance runtime pm + use pm_runtime_resume_and_get()Jonathan Cameron1-6/+4
[ Upstream commit 264da512431495e542fcaf56ffe75e7df0e7db74 ] probe() error paths after runtime pm is enabled, should disable it. remove() should not call pm_runtime_put_noidle() as there is no matching get() to have raised the reference count. This case has no affect a the runtime pm core protects against going negative. Whilst here use pm_runtime_resume_and_get() to tidy things up a little. coccicheck script didn't get this one due to complex code structure so found by inspection. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/20210509113354.660190-12-jic23@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20iio: gyro: fxa21002c: Balance runtime pm + use pm_runtime_resume_and_get().Jonathan Cameron1-10/+1
[ Upstream commit 41120ebbb1eb5e9dec93320e259d5b2c93226073 ] In both the probe() error path and remove() pm_runtime_put_noidle() is called which will decrement the runtime pm reference count. However, there is no matching function to have raised the reference count. Not this isn't a fix as the runtime pm core will stop the reference count going negative anyway. An alternative would have been to raise the count in these paths, but it is not clear why that would be necessary. Whilst we are here replace some boilerplate with pm_runtime_resume_and_get() Found using coccicheck script under review at: https://lore.kernel.org/lkml/20210427141946.2478411-1-Julia.Lawall@inria.fr/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Rui Miguel Silva <rui.silva@linaro.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/20210509113354.660190-2-jic23@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20iio: imu: st_lsm6dsx: correct ODR in headerSean Nyekjaer1-3/+3
[ Upstream commit f7d5c18a8c371c306d73757547c2e0d6cfc764b3 ] Fix wrongly stated 13 Hz ODR for accelerometers, the correct ODR is 12.5 Hz Signed-off-by: Sean Nyekjaer <sean@geanix.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20partitions: msdos: fix one-byte get_unaligned()Arnd Bergmann3-17/+12
[ Upstream commit 1b1774998b2dec837a57d729d1a22e5eb2d6d206 ] A simplification of get_unaligned() clashes with callers that pass in a character pointer, causing a harmless warning like: block/partitions/msdos.c: In function 'msdos_partition': include/asm-generic/unaligned.h:13:22: warning: 'packed' attribute ignored for field of type 'u8' {aka 'unsigned char'} [-Wattributes] Remove the SYS_IND() macro with the get_unaligned() call and just use the ->ind field directly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20ASoC: intel/boards: add missing MODULE_DEVICE_TABLEZou Wei2-0/+2
[ Upstream commit a75e5cdf4dd1307bb1541edbb0c008f40896644c ] This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1620791647-16024-1-git-send-email-zou_wei@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20misc: alcor_pci: fix null-ptr-deref when there is no PCI bridgeTong Zhang1-1/+7
[ Upstream commit 3ce3e45cc333da707d4d6eb433574b990bcc26f5 ] There is an issue with the ASPM(optional) capability checking function. A device might be attached to root complex directly, in this case, bus->self(bridge) will be NULL, thus priv->parent_pdev is NULL. Since alcor_pci_init_check_aspm(priv->parent_pdev) checks the PCI link's ASPM capability and populate parent_cap_off, which will be used later by alcor_pci_aspm_ctrl() to dynamically turn on/off device, what we can do here is to avoid checking the capability if we are on the root complex. This will make pdev_cap_off 0 and alcor_pci_aspm_ctrl() will simply return when bring called, effectively disable ASPM for the device. [ 1.246492] BUG: kernel NULL pointer dereference, address: 00000000000000c0 [ 1.248731] RIP: 0010:pci_read_config_byte+0x5/0x40 [ 1.253998] Call Trace: [ 1.254131] ? alcor_pci_find_cap_offset.isra.0+0x3a/0x100 [alcor_pci] [ 1.254476] alcor_pci_probe+0x169/0x2d5 [alcor_pci] Co-developed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Link: https://lore.kernel.org/r/20210513040732.1310159-1-ztong0001@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20misc/libmasm/module: Fix two use after free in ibmasm_init_oneLv Yunlong1-2/+3
[ Upstream commit 7272b591c4cb9327c43443f67b8fbae7657dd9ae ] In ibmasm_init_one, it calls ibmasm_init_remote_input_dev(). Inside ibmasm_init_remote_input_dev, mouse_dev and keybd_dev are allocated by input_allocate_device(), and assigned to sp->remote.mouse_dev and sp->remote.keybd_dev respectively. In the err_free_devices error branch of ibmasm_init_one, mouse_dev and keybd_dev are freed by input_free_device(), and return error. Then the execution runs into error_send_message error branch of ibmasm_init_one, where ibmasm_free_remote_input_dev(sp) is called to unregister the freed sp->remote.mouse_dev and sp->remote.keybd_dev. My patch add a "error_init_remote" label to handle the error of ibmasm_init_remote_input_dev(), to avoid the uaf bugs. Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Link: https://lore.kernel.org/r/20210426170620.10546-1-lyl2019@mail.ustc.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20serial: 8250: of: Check for CONFIG_SERIAL_8250_BCM7271Jim Quinlan1-0/+4
[ Upstream commit f5b08386dee439c7a9e60ce0a4a4a705f3a60dff ] Our SoC's have always had a NS16650A UART core and older SoC's would have a compatible string of: 'compatible = ""ns16550a"' and use the 8250_of driver. Our newer SoC's have added enhancements to the base core to add support for DMA and accurate high speed baud rates and use this newer 8250_bcm7271 driver. The Device Tree node for our enhanced UARTs has a compatible string of: 'compatible = "brcm,bcm7271-uart", "ns16550a"''. With both drivers running and the link order setup so that the 8250_bcm7217 driver is initialized before the 8250_of driver, we should bind the 8250_bcm7271 driver to the enhanced UART, or for upstream kernels that don't have the 8250_bcm7271 driver, we bind to the 8250_of driver. The problem is that when both the 8250_of and 8250_bcm7271 drivers were running, occasionally the 8250_of driver would be bound to the enhanced UART instead of the 8250_bcm7271 driver. This was happening because we use SCMI based clocks which come up late in initialization and cause probe DEFER's when the two drivers get their clocks. Occasionally the SCMI clock would become ready between the 8250_bcm7271 probe and the 8250_of probe and the 8250_of driver would be bound. To fix this we decided to config only our 8250_bcm7271 driver and added "ns16665a0" to the compatible string so the driver would work on our older system. This commit has of_platform_serial_probe() check specifically for the "brcm,bcm7271-uart" and whether its companion driver is enabled. If it is the case, and the clock provider is not ready, we want to make sure that when the 8250_bcm7271.c driver returns EPROBE_DEFER, we are not getting the UART registered via 8250_of.c. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Signed-off-by: Al Cooper <alcooperx@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210423183206.3917725-1-f.fainelli@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20serial: fsl_lpuart: disable DMA for console and fix sysrqMichael Walle1-0/+6
[ Upstream commit 8cac2f6eb8548245e6f8fb893fc7f2a714952654 ] SYSRQ doesn't work with DMA. This is because there is no error indication whether a symbol had a framing error or not. Actually, this is not completely correct, there is a bit in the data register which is set in this case, but we'd have to read change the DMA access to 16 bit and we'd need to post process the data, thus make the DMA pointless in the first place. Signed-off-by: Michael Walle <michael@walle.cc> Link: https://lore.kernel.org/r/20210512141255.18277-10-michael@walle.cc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20tty: serial: fsl_lpuart: fix the potential risk of division or modulo by zeroSherry Sun1-0/+3
[ Upstream commit fcb10ee27fb91b25b68d7745db9817ecea9f1038 ] We should be very careful about the register values that will be used for division or modulo operations, althrough the possibility that the UARTBAUD register value is zero is very low, but we had better to deal with the "bad data" of hardware in advance to avoid division or modulo by zero leading to undefined kernel behavior. Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Link: https://lore.kernel.org/r/20210427021226.27468-1-sherry.sun@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20usb: dwc3: pci: Fix DEFINE for Intel Elkhart LakeRaymond Tan1-4/+4
[ Upstream commit 457d22850b27de3aea336108272d08602c55fdf7 ] There's no separate low power (LP) version of Elkhart Lake, thus this patch updates the PCI Device ID DEFINE to indicate this. Acked-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Raymond Tan <raymond.tan@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20210512135901.28495-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20soundwire: bus: handle -ENODATA errors in clock stop/start sequencesPierre-Louis Bossart1-33/+36
[ Upstream commit b50bb8ba369cdc8814e82558117711c12ef3af3d ] If a device lost sync and can no longer ACK a command, it may not be able to enter a lower-power state but it will still be able to resync when the clock restarts. In those cases, we want to continue with the clock stop sequence. This patch modifies the behavior during clock stop sequences to only log errors unrelated to -ENODATA/Command_Ignored. The flow is also modified so that loops continue to prepare/deprepare other devices even when one seems to have lost sync. When resuming the clocks, all issues are logged with a dev_warn(), previously only some of them were checked. This is the only part that now differs between the clock stop entry and clock stop exit sequences: while we don't want to stop the suspend flow, we do want information on potential issues while resuming, as they may have ripple effects. For consistency the log messages are also modified to be unique and self-explanatory. Errors in sdw_slave_clk_stop_callback() were removed, they are now handled in the caller. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20210511030048.25622-4-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20soundwire: bus: only use CLOCK_STOP_MODE0 and fix confusionsPierre-Louis Bossart2-62/+40
[ Upstream commit 345e9f5ca798600e44c0843646621f2804eb99f4 ] Existing devices and implementations only support the required CLOCK_STOP_MODE0. All the code related to CLOCK_STOP_MODE1 has not been tested and is highly questionable, with a clear confusion between CLOCK_STOP_MODE1 and the simple clock stop state machine. This patch removes all usages of CLOCK_STOP_MODE1 - which has no impact on any solution - and fixes the use of the simple clock stop state machine. The resulting code should be a lot more symmetrical and easier to maintain. Note that CLOCK_STOP_MODE1 is not supported in the SoundWire Device Class specification so it's rather unlikely that we need to re-add this mode later. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20210511030048.25622-2-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20rcu: Reject RCU_LOCKDEP_WARN() false positivesPaul E. McKenney2-2/+2
[ Upstream commit 3066820034b5dd4e89bd74a7739c51c2d6f5e554 ] If another lockdep report runs concurrently with an RCU lockdep report from RCU_LOCKDEP_WARN(), the following sequence of events can occur: 1. debug_lockdep_rcu_enabled() sees that lockdep is enabled when called from (say) synchronize_rcu(). 2. Lockdep is disabled by a concurrent lockdep report. 3. debug_lockdep_rcu_enabled() evaluates its lockdep-expression argument, for example, lock_is_held(&rcu_bh_lock_map). 4. Because lockdep is now disabled, lock_is_held() plays it safe and returns the constant 1. 5. But in this case, the constant 1 is not safe, because invoking synchronize_rcu() under rcu_read_lock_bh() is disallowed. 6. debug_lockdep_rcu_enabled() wrongly invokes lockdep_rcu_suspicious(), resulting in a false-positive splat. This commit therefore changes RCU_LOCKDEP_WARN() to check debug_lockdep_rcu_enabled() after checking the lockdep expression, so that any "safe" returns from lock_is_held() are rejected by debug_lockdep_rcu_enabled(). This requires memory ordering, which is supplied by READ_ONCE(debug_locks). The resulting volatile accesses prevent the compiler from reordering and the fact that only one variable is being accessed prevents the underlying hardware from reordering. The combination works for IA64, which can reorder reads to the same location, but this is defeated by the volatile accesses, which compile to load instructions that provide ordering. Reported-by: syzbot+dde0cc33951735441301@syzkaller.appspotmail.com Reported-by: Matthew Wilcox <willy@infradead.org> Reported-by: syzbot+88e4f02896967fe1ab0d@syzkaller.appspotmail.com Reported-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20srcu: Fix broken node geometry after early ssp initFrederic Weisbecker3-1/+20
[ Upstream commit b5befe842e6612cf894cf4a199924ee872d8b7d8 ] An srcu_struct structure that is initialized before rcu_init_geometry() will have its srcu_node hierarchy based on CONFIG_NR_CPUS. Once rcu_init_geometry() is called, this hierarchy is compressed as needed for the actual maximum number of CPUs for this system. Later on, that srcu_struct structure is confused, sometimes referring to its initial CONFIG_NR_CPUS-based hierarchy, and sometimes instead to the new num_possible_cpus() hierarchy. For example, each of its ->mynode fields continues to reference the original leaf rcu_node structures, some of which might no longer exist. On the other hand, srcu_for_each_node_breadth_first() traverses to the new node hierarchy. There are at least two bad possible outcomes to this: 1) a) A callback enqueued early on an srcu_data structure (call it *sdp) is recorded pending on sdp->mynode->srcu_data_have_cbs in srcu_funnel_gp_start() with sdp->mynode pointing to a deep leaf (say 3 levels). b) The grace period ends after rcu_init_geometry() shrinks the nodes level to a single one. srcu_gp_end() walks through the new srcu_node hierarchy without ever reaching the old leaves so the callback is never executed. This is easily reproduced on an 8 CPUs machine with CONFIG_NR_CPUS >= 32 and "rcupdate.rcu_self_test=1". The srcu_barrier() after early tests verification never completes and the boot hangs: [ 5413.141029] INFO: task swapper/0:1 blocked for more than 4915 seconds. [ 5413.147564] Not tainted 5.12.0-rc4+ #28 [ 5413.151927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 5413.159753] task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00004000 [ 5413.168099] Call Trace: [ 5413.170555] __schedule+0x36c/0x930 [ 5413.174057] ? wait_for_completion+0x88/0x110 [ 5413.178423] schedule+0x46/0xf0 [ 5413.181575] schedule_timeout+0x284/0x380 [ 5413.185591] ? wait_for_completion+0x88/0x110 [ 5413.189957] ? mark_held_locks+0x61/0x80 [ 5413.193882] ? mark_held_locks+0x61/0x80 [ 5413.197809] ? _raw_spin_unlock_irq+0x24/0x50 [ 5413.202173] ? wait_for_completion+0x88/0x110 [ 5413.206535] wait_for_completion+0xb4/0x110 [ 5413.210724] ? srcu_torture_stats_print+0x110/0x110 [ 5413.215610] srcu_barrier+0x187/0x200 [ 5413.219277] ? rcu_tasks_verify_self_tests+0x50/0x50 [ 5413.224244] ? rdinit_setup+0x2b/0x2b [ 5413.227907] rcu_verify_early_boot_tests+0x2d/0x40 [ 5413.232700] do_one_initcall+0x63/0x310 [ 5413.236541] ? rdinit_setup+0x2b/0x2b [ 5413.240207] ? rcu_read_lock_sched_held+0x52/0x80 [ 5413.244912] kernel_init_freeable+0x253/0x28f [ 5413.249273] ? rest_init+0x250/0x250 [ 5413.252846] kernel_init+0xa/0x110 [ 5413.256257] ret_from_fork+0x22/0x30 2) An srcu_struct structure that is initialized before rcu_init_geometry() and used afterward will always have stale rdp->mynode references, resulting in callbacks to be missed in srcu_gp_end(), just like in the previous scenario. This commit therefore causes init_srcu_struct_nodes to initialize the geometry, if needed. This ensures that the srcu_node hierarchy is properly built and distributed from the get-go. Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20scsi: arcmsr: Fix the wrong CDB payload report to IOPching Huang1-2/+6
[ Upstream commit 5b8644968d2ca85abb785e83efec36934974b0c2 ] This patch fixes the wrong CDB payload report to IOP. Link: https://lore.kernel.org/r/d2c97df3c817595c6faf582839316209022f70da.camel@areca.com.tw Signed-off-by: ching Huang <ching2048@areca.com.tw> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20dmaengine: fsl-qdma: check dma_set_mask return valueRobin Gong1-1/+5
[ Upstream commit f0c07993af0acf5545d5c1445798846565f4f147 ] For fix below warning reported by static code analysis tool like Coverity from Synopsys: Signed-off-by: Robin Gong <yibin.gong@nxp.com> Addresses-Coverity-ID: 12285639 ("Unchecked return value") Link: https://lore.kernel.org/r/1619427549-20498-1-git-send-email-yibin.gong@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20ASoC: Intel: sof_sdw: add mutual exclusion between PCH DMIC and RT715Pierre-Louis Bossart2-2/+18
[ Upstream commit 35564e2bf94611c3eb51d35362addb3cb394ad54 ] When external RT714/715 devices are used for capture, we don't want the PCH DMICs to be used. Any information provided by the SOF platform driver or DMI quirks will be overridden. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Libin Yang <libin.yang@intel.com> Link: https://lore.kernel.org/r/20210505163705.305616-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20leds: tlc591xx: fix return value check in tlc591xx_probe()Yang Yingliang1-2/+6
[ Upstream commit ee522bcf026ec82ada793979c3a906274430595a ] After device_get_match_data(), tlc591xx is not checked, add check for it and also check np after dev_of_node. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20net: bridge: multicast: fix MRD advertisement router port marking raceNikolay Aleksandrov1-0/+4
commit 000b7287b67555fee39d39fff75229dedde0dcbf upstream. When an MRD advertisement is received on a bridge port with multicast snooping enabled, we mark it as a router port automatically, that includes adding that port to the router port list. The multicast lock protects that list, but it is not acquired in the MRD advertisement case leading to a race condition, we need to take it to fix the race. Cc: stable@vger.kernel.org Cc: linus.luessing@c0d3.blue Fixes: 4b3087c7e37f ("bridge: Snoop Multicast Router Advertisements") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20net: bridge: multicast: fix PIM hello router port marking raceNikolay Aleksandrov1-0/+2
commit 04bef83a3358946bfc98a5ecebd1b0003d83d882 upstream. When a PIM hello packet is received on a bridge port with multicast snooping enabled, we mark it as a router port automatically, that includes adding that port the router port list. The multicast lock protects that list, but it is not acquired in the PIM message case leading to a race condition, we need to take it to fix the race. Cc: stable@vger.kernel.org Fixes: 91b02d3d133b ("bridge: mcast: add router port on PIM hello message") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20drm/dp_mst: Add missing drm parameters to recently added call to drm_dbg_kms()José Roberto de Souza1-2/+5
commit 24ff3dc18b99c4b912ab1746e803ddb3be5ced4c upstream. Commit 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by ports in stale topology") added to calls to drm_dbg_kms() but it missed the first parameter, the drm device breaking the build. Fixes: 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by ports in stale topology") Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Lyude Paul <lyude@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616194415.36926-1-jose.souza@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20drm/dp_mst: Avoid to mess up payload table by ports in stale topologyWayne Lin1-0/+29
commit 3769e4c0af5b82c8ea21d037013cb9564dfaa51f upstream. [Why] After unplug/hotplug hub from the system, userspace might start to clear stale payloads gradually. If we call drm_dp_mst_deallocate_vcpi() to release stale VCPI of those ports which are not relating to current topology, we have chane to wrongly clear active payload table entry for current topology. E.g. We have allocated VCPI 1 in current payload table and we call drm_dp_mst_deallocate_vcpi() to clean VCPI 1 in stale topology. In drm_dp_mst_deallocate_vcpi(), it will call drm_dp_mst_put_payload_id() tp put VCPI 1 and which means ID 1 is available again. Thereafter, if we want to allocate a new payload stream, it will find ID 1 is available by drm_dp_mst_assign_payload_id(). However, ID 1 is being used [How] Check target sink is relating to current topology or not before doing any payload table update. Searching upward to find the target sink's relevant root branch device. If the found root branch device is not the same root of current topology, don't update payload table. Changes since v1: * Change debug macro to use drm_dbg_kms() instead * Amend the commit message to add Cc tag. Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616035501.3776-3-Wayne.Lin@amd.com Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20drm/dp_mst: Do not set proposed vcpi directlyWayne Lin1-26/+10
commit 35d3e8cb35e75450f87f87e3d314e2d418b6954b upstream. [Why] When we receive CSN message to notify one port is disconnected, we will implicitly set its corresponding num_slots to 0. Later on, we will eventually call drm_dp_update_payload_part1() to arrange down streams. In drm_dp_update_payload_part1(), we iterate over all proposed_vcpis[] to do the update. Not specific to a target sink only. For example, if we light up 2 monitors, Monitor_A and Monitor_B, and then we unplug Monitor_B. Later on, when we call drm_dp_update_payload_part1() to try to update payload for Monitor_A, we'll also implicitly clean payload for Monitor_B at the same time. And finally, when we try to call drm_dp_update_payload_part1() to clean payload for Monitor_B, we will do nothing at this time since payload for Monitor_B has been cleaned up previously. For StarTech 1to3 DP hub, it seems like if we didn't update DPCD payload ID table then polling for "ACT Handled"(BIT_1 of DPCD 002C0h) will fail and this polling will last for 3 seconds. Therefore, guess the best way is we don't set the proposed_vcpi[] diretly. Let user of these herlper functions to set the proposed_vcpi directly. [How] 1. Revert commit 7617e9621bf2 ("drm/dp_mst: clear time slots for ports invalid") 2. Tackle the issue in previous commit by skipping those trasient proposed VCPIs. These stale VCPIs shoulde be explicitly cleared by user later on. Changes since v1: * Change debug macro to use drm_dbg_kms() instead * Amend the commit message to add Fixed & Cc tags Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Fixes: 7617e9621bf2 ("drm/dp_mst: clear time slots for ports invalid") Cc: Lyude Paul <lyude@redhat.com> Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616035501.3776-2-Wayne.Lin@amd.com Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: zoned: fix wrong mutex unlock on failure to allocate log root treeFilipe Manana1-1/+1
commit ea32af47f00a046a1f953370514d6d946efe0152 upstream. When syncing the log, if we fail to allocate the root node for the log root tree: 1) We are unlocking fs_info->tree_log_mutex, but at this point we have not yet locked this mutex; 2) We have locked fs_info->tree_root->log_mutex, but we end up not unlocking it; So fix this by unlocking fs_info->tree_root->log_mutex instead of fs_info->tree_log_mutex. Fixes: e75f9fd194090e ("btrfs: zoned: move log tree node allocation out of log_root_tree->log_mutex") CC: stable@vger.kernel.org # 5.13+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: don't block if we can't acquire the reclaim lockJohannes Thumshirn1-1/+9
commit 9cc0b837e14ae913581ec1ea6e979a738f71b0fd upstream. If we can't acquire the reclaim_bgs_lock on block group reclaim, we block until it is free. This can potentially stall for a long time. While reclaim of block groups is necessary for a good user experience on a zoned file system, there still is no need to block as it is best effort only, just like when we're deleting unused block groups. CC: stable@vger.kernel.org # 5.13 Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: rework chunk allocation to avoid exhaustion of the system chunk arrayFilipe Manana7-184/+546
commit 79bd37120b149532af5b21953643ed74af69654f upstream. Commit eafa4fd0ad0607 ("btrfs: fix exhaustion of the system chunk array due to concurrent allocations") fixed a problem that resulted in exhausting the system chunk array in the superblock when there are many tasks allocating chunks in parallel. Basically too many tasks enter the first phase of chunk allocation without previous tasks having finished their second phase of allocation, resulting in too many system chunks being allocated. That was originally observed when running the fallocate tests of stress-ng on a PowerPC machine, using a node size of 64K. However that commit also introduced a deadlock where a task in phase 1 of the chunk allocation waited for another task that had allocated a system chunk to finish its phase 2, but that other task was waiting on an extent buffer lock held by the first task, therefore resulting in both tasks not making any progress. That change was later reverted by a patch with the subject "btrfs: fix deadlock with concurrent chunk allocations involving system chunks", since there is no simple and short solution to address it and the deadlock is relatively easy to trigger on zoned filesystems, while the system chunk array exhaustion is not so common. This change reworks the chunk allocation to avoid the system chunk array exhaustion. It accomplishes that by making the first phase of chunk allocation do the updates of the device items in the chunk btree and the insertion of the new chunk item in the chunk btree. This is done while under the protection of the chunk mutex (fs_info->chunk_mutex), in the same critical section that checks for available system space, allocates a new system chunk if needed and reserves system chunk space. This way we do not have chunk space reserved until the second phase completes. The same logic is applied to chunk removal as well, since it keeps reserved system space long after it is done updating the chunk btree. For direct allocation of system chunks, the previous behaviour remains, because otherwise we would deadlock on extent buffers of the chunk btree. Changes to the chunk btree are by large done by chunk allocation and chunk removal, which first reserve chunk system space and then later do changes to the chunk btree. The other remaining cases are uncommon and correspond to adding a device, removing a device and resizing a device. All these other cases do not pre-reserve system space, they modify the chunk btree right away, so they don't hold reserved space for a long period like chunk allocation and chunk removal do. The diff of this change is huge, but more than half of it is just addition of comments describing both how things work regarding chunk allocation and removal, including both the new behavior and the parts of the old behavior that did not change. CC: stable@vger.kernel.org # 5.12+ Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Tested-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: fix deadlock with concurrent chunk allocations involving system chunksFilipe Manana3-69/+1
commit 1cb3db1cf383a3c7dbda1aa0ce748b0958759947 upstream. When a task attempting to allocate a new chunk verifies that there is not currently enough free space in the system space_info and there is another task that allocated a new system chunk but it did not finish yet the creation of the respective block group, it waits for that other task to finish creating the block group. This is to avoid exhaustion of the system chunk array in the superblock, which is limited, when we have a thundering herd of tasks allocating new chunks. This problem was described and fixed by commit eafa4fd0ad0607 ("btrfs: fix exhaustion of the system chunk array due to concurrent allocations"). However there are two very similar scenarios where this can lead to a deadlock: 1) Task B allocated a new system chunk and task A is waiting on task B to finish creation of the respective system block group. However before task B ends its transaction handle and finishes the creation of the system block group, it attempts to allocate another chunk (like a data chunk for an fallocate operation for a very large range). Task B will be unable to progress and allocate the new chunk, because task A set space_info->chunk_alloc to 1 and therefore it loops at btrfs_chunk_alloc() waiting for task A to finish its chunk allocation and set space_info->chunk_alloc to 0, but task A is waiting on task B to finish creation of the new system block group, therefore resulting in a deadlock; 2) Task B allocated a new system chunk and task A is waiting on task B to finish creation of the respective system block group. By the time that task B enter the final phase of block group allocation, which happens at btrfs_create_pending_block_groups(), when it modifies the extent tree, the device tree or the chunk tree to insert the items for some new block group, it needs to allocate a new chunk, so it ends up at btrfs_chunk_alloc() and keeps looping there because task A has set space_info->chunk_alloc to 1, but task A is waiting for task B to finish creation of the new system block group and release the reserved system space, therefore resulting in a deadlock. In short, the problem is if a task B needs to allocate a new chunk after it previously allocated a new system chunk and if another task A is currently waiting for task B to complete the allocation of the new system chunk. Unfortunately this deadlock scenario introduced by the previous fix for the system chunk array exhaustion problem does not have a simple and short fix, and requires a big change to rework the chunk allocation code so that chunk btree updates are all made in the first phase of chunk allocation. And since this deadlock regression is being frequently hit on zoned filesystems and the system chunk array exhaustion problem is triggered in more extreme cases (originally observed on PowerPC with a node size of 64K when running the fallocate tests from stress-ng), revert the changes from that commit. The next patch in the series, with a subject of "btrfs: rework chunk allocation to avoid exhaustion of the system chunk array" does the necessary changes to fix the system chunk array exhaustion problem. Reported-by: Naohiro Aota <naohiro.aota@wdc.com> Link: https://lore.kernel.org/linux-btrfs/20210621015922.ewgbffxuawia7liz@naota-xeon/ Fixes: eafa4fd0ad0607 ("btrfs: fix exhaustion of the system chunk array due to concurrent allocations") CC: stable@vger.kernel.org # 5.12+ Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Tested-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: zoned: fix types for u64 division in btrfs_reclaim_bgs_workDavid Sterba1-1/+1
commit 54afaae34ee49e98c1c902b444b42832551d090c upstream. The types in calculation of the used percentage in the reclaiming messages are both u64, though bg->length is either 1GiB (non-zoned) or the zone size in the zoned mode. The upper limit on zone size is 8GiB so this could theoretically overflow in the future, right now the values fit. Fixes: 18bb8bbf13c1 ("btrfs: zoned: automatically reclaim zones") CC: stable@vger.kernel.org # 5.13 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20btrfs: properly split extent_map for REQ_OP_ZONE_APPENDNaohiro Aota1-29/+118
commit abb99cfdaf0759f8a619e5fecf52ccccdf310c8c upstream. Damien reported a test failure with btrfs/209. The test itself ran fine, but the fsck ran afterwards reported a corrupted filesystem. The filesystem corruption happens because we're splitting an extent and then writing the extent twice. We have to split the extent though, because we're creating too large extents for a REQ_OP_ZONE_APPEND operation. When dumping the extent tree, we can see two EXTENT_ITEMs at the same start address but different lengths. $ btrfs inspect dump-tree /dev/nullb1 -t extent ... item 19 key (269484032 EXTENT_ITEM 126976) itemoff 15470 itemsize 53 refs 1 gen 7 flags DATA extent data backref root FS_TREE objectid 257 offset 786432 count 1 item 20 key (269484032 EXTENT_ITEM 262144) itemoff 15417 itemsize 53 refs 1 gen 7 flags DATA extent data backref root FS_TREE objectid 257 offset 786432 count 1 The duplicated EXTENT_ITEMs originally come from wrongly split extent_map in extract_ordered_extent(). Since extract_ordered_extent() uses create_io_em() to split an existing extent_map, we will have split->orig_start != split->start. Then, it will be logged with non-zero "extent data offset". Finally, the logged entries are replayed into a duplicated EXTENT_ITEM. Introduce and use proper splitting function for extent_map. The function is intended to be simple and specific usage for extract_ordered_extent() e.g. not supporting compression case (we do not allow splitting compressed extent_map anyway). There was a question raised by Qu, in summary why we want to split the extent map (and not the bio): The problem is not the limit on the zone end, which as you mention is the same as the block group end. The problem is that data write use zone append (ZA) operations. ZA BIOs cannot be split so a large extent may need to be processed with multiple ZA BIOs, While that is also true for regular writes, the major difference is that ZA are "nameless" write operation giving back the written sectors on completion. And ZA operations may be reordered by the block layer (not intentionally though). Combine both of these characteristics and you can see that the data for a large extent may end up being shuffled when written resulting in data corruption and the impossibility to map the extent to some start sector. To avoid this problem, zoned btrfs uses the principle "one data extent == one ZA BIO". So large extents need to be split. This is unfortunate, but we can revisit this later and optimize, e.g. merge back together the fragments of an extent once written if they actually were written sequentially in the zone. Reported-by: Damien Le Moal <damien.lemoal@wdc.com> Fixes: d22002fd37bd ("btrfs: zoned: split ordered extent when bio is sent") CC: stable@vger.kernel.org # 5.12+ CC: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20io_uring: use right task for exiting checksPavel Begunkov1-1/+1
commit 9c6882608bce249a8918744ecdb65748534e3f17 upstream. When we use delayed_work for fallback execution of requests, current will be not of the submitter task, and so checks in io_req_task_submit() may not behave as expected. Currently, it leaves inline completions not flushed, so making io_ring_exit_work() to hang. Use the submitter task for all those checks. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cb413c715bed0bc9c98b169059ea9c8a2c770715.1625881431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20arm64: Avoid premature usercopy failureRobin Murphy3-13/+35
commit 295cf156231ca3f9e3a66bde7fab5e09c41835e0 upstream. Al reminds us that the usercopy API must only return complete failure if absolutely nothing could be copied. Currently, if userspace does something silly like giving us an unaligned pointer to Device memory, or a size which overruns MTE tag bounds, we may fail to honour that requirement when faulting on a multi-byte access even though a smaller access could have succeeded. Add a mitigation to the fixup routines to fall back to a single-byte copy if we faulted on a larger access before anything has been written to the destination, to guarantee making *some* forward progress. We needn't be too concerned about the overall performance since this should only occur when callers are doing something a bit dodgy in the first place. Particularly broken userspace might still be able to trick generic_perform_write() into an infinite loop by targeting write() at an mmap() of some read-only device register where the fault-in load succeeds but any store synchronously aborts such that copy_to_user() is genuinely unable to make progress, but, well, don't do that... CC: stable@vger.kernel.org Reported-by: Chen Huang <chenhuang5@huawei.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/dc03d5c675731a1f24a62417dba5429ad744234e.1626098433.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20mm/hugetlb: fix refs calculation from unaligned @vaddrJoao Martins1-2/+3
commit d08af0a59684e18a51aa4bfd24c658994ea3fc5b upstream. Commit 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording") refactored the count of subpages but missed an edge case when @vaddr is not aligned to PAGE_SIZE e.g. when close to vma->vm_end. It would then errousnly set @refs to 0 and record_subpages_vmas() wouldn't set the @pages array element to its value, consequently causing the reported null-deref by syzbot. Fix it by aligning down @vaddr by PAGE_SIZE in @refs calculation. Link: https://lkml.kernel.org/r/20210713152440.28650-1-joao.m.martins@oracle.com Fixes: 82e5d378b0e47 ("mm/hugetlb: refactor subpage recording") Reported-by: syzbot+a3fcd59df1b372066f5a@syzkaller.appspotmail.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20EDAC/igen6: fix core dependency AGAINRandy Dunlap1-1/+1
commit a1c9ca5f65c9acfd7c02474b9d5cacbd7ea288df upstream. My previous patch had a typo/thinko which prevents this driver from being enabled: change X64_64 to X86_64. Fixes: 0a9ece9ba154 ("EDAC/igen6: fix core dependency") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac@vger.kernel.org Cc: bowsingbetee <bowsingbetee@protonmail.com> Cc: stable@vger.kernel.org Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20fbmem: Do not delete the mode that is still in useZhen Lei1-7/+5
commit 0af778269a522c988ef0b4188556aba97fb420cc upstream. The execution of fb_delete_videomode() is not based on the result of the previous fbcon_mode_deleted(). As a result, the mode is directly deleted, regardless of whether it is still in use, which may cause UAF. ================================================================== BUG: KASAN: use-after-free in fb_mode_is_equal+0x36e/0x5e0 \ drivers/video/fbdev/core/modedb.c:924 Read of size 4 at addr ffff88807e0ddb1c by task syz-executor.0/18962 CPU: 2 PID: 18962 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ... Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x137/0x1be lib/dump_stack.c:118 print_address_description+0x6c/0x640 mm/kasan/report.c:385 __kasan_report mm/kasan/report.c:545 [inline] kasan_report+0x13d/0x1e0 mm/kasan/report.c:562 fb_mode_is_equal+0x36e/0x5e0 drivers/video/fbdev/core/modedb.c:924 fbcon_mode_deleted+0x16a/0x220 drivers/video/fbdev/core/fbcon.c:2746 fb_set_var+0x1e1/0xdb0 drivers/video/fbdev/core/fbmem.c:975 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 18960: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x3d/0x70 mm/kasan/common.c:56 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0x108/0x140 mm/kasan/common.c:422 slab_free_hook mm/slub.c:1541 [inline] slab_free_freelist_hook+0xd6/0x1a0 mm/slub.c:1574 slab_free mm/slub.c:3139 [inline] kfree+0xca/0x3d0 mm/slub.c:4121 fb_delete_videomode+0x56a/0x820 drivers/video/fbdev/core/modedb.c:1104 fb_set_var+0x1f3/0xdb0 drivers/video/fbdev/core/fbmem.c:978 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 13ff178ccd6d ("fbcon: Call fbcon_mode_deleted/new_modelist directly") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: <stable@vger.kernel.org> # v5.3+ Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210712085544.2828-1-thunder.leizhen@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20cgroup: verify that source is a stringChristian Brauner1-0/+2
commit 3b0462726e7ef281c35a7a4ae33e93ee2bc9975b upstream. The following sequence can be used to trigger a UAF: int fscontext_fd = fsopen("cgroup"); int fd_null = open("/dev/null, O_RDONLY); int fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", fd_null); close_range(3, ~0U, 0); The cgroup v1 specific fs parser expects a string for the "source" parameter. However, it is perfectly legitimate to e.g. specify a file descriptor for the "source" parameter. The fs parser doesn't know what a filesystem allows there. So it's a bug to assume that "source" is always of type fs_value_is_string when it can reasonably also be fs_value_is_file. This assumption in the cgroup code causes a UAF because struct fs_parameter uses a union for the actual value. Access to that union is guarded by the param->type member. Since the cgroup paramter parser didn't check param->type but unconditionally moved param->string into fc->source a close on the fscontext_fd would trigger a UAF during put_fs_context() which frees fc->source thereby freeing the file stashed in param->file causing a UAF during a close of the fd_null. Fix this by verifying that param->type is actually a string and report an error if not. In follow up patches I'll add a new generic helper that can be used here and by other filesystems instead of this error-prone copy-pasta fix. But fixing it in here first makes backporting a it to stable a lot easier. Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing") Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@kernel.org> Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20drm/i915/gt: Fix -EDEADLK handling regressionVille Syrjälä1-1/+1
commit 2feeb52859fc1ab94cd35b61ada3a6ac4ff24243 upstream. The conversion to ww mutexes failed to address the fence code which already returns -EDEADLK when we run out of fences. Ww mutexes on the other hand treat -EDEADLK as an internal errno value indicating a need to restart the operation due to a deadlock. So now when the fence code returns -EDEADLK the higher level code erroneously restarts everything instead of returning the error to userspace as is expected. To remedy this let's switch the fence code to use a different errno value for this. -ENOBUFS seems like a semi-reasonable unique choice. Apart from igt the only user of this I could find is sna, and even there all we do is dump the current fence registers from debugfs into the X server log. So no user visible functionality is affected. If we really cared about preserving this we could of course convert back to -EDEADLK higher up, but doesn't seem like that's worth the hassle here. Not quite sure which commit specifically broke this, but I'll just attribute it to the general gem ww mutex work. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Testcase: igt/gem_pread/exhaustion Testcase: igt/gem_pwrite/basic-exhaustion Testcase: igt/gem_fenced_exec_thrash/too-many-fences Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630164413.25481-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit 78d2ad7eb4e1f0e9cd5d79788446b6092c21d3e0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20drm/i915/gtt: drop the page table optimisationMatthew Auld1-4/+1
commit 0abb33bfca0fb74df76aac03e90ce685016ef7be upstream. We skip filling out the pt with scratch entries if the va range covers the entire pt, since we later have to fill it with the PTEs for the object pages anyway. However this might leave open a small window where the PTEs don't point to anything valid for the HW to consume. When for example using 2M GTT pages this fill_px() showed up as being quite significant in perf measurements, and ends up being completely wasted since we ignore the pt and just use the pde directly. Anyway, currently we have our PTE construction split between alloc and insert, which is probably slightly iffy nowadays, since the alloc doesn't actually allocate anything anymore, instead it just sets up the page directories and points the PTEs at the scratch page. Later when we do the insert step we re-program the PTEs again. Better might be to squash the alloc and insert into a single step, then bringing back this optimisation(along with some others) should be possible. Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> # v4.15+ Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com (cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>