summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-04-20Linux 5.17.4v5.17.4Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20220418121203.462784814@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Rudi Heitbaum <rudi@heitbaum.com> Tested-by: Zan Aziz <zanaziz313@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Fenil Jain<fkjainco@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20io_uring: fix poll error reportingPavel Begunkov1-3/+2
commit 7179c3ce3dbff646c55f7cd664a895f462f049e5 upstream. We should not return an error code in req->result in io_poll_check_events(), because it may get mangled and returned as success. Just return the error code directly, the callers will fail the request or proceed accordingly. Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5f03514ee33324dc811fb93df84aee0f695fb044.1649862516.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20io_uring: fix poll file assign deadlockPavel Begunkov1-1/+2
commit cce64ef01308b677a687d90927fc2b2e0e1cba67 upstream. We pass "unlocked" into io_assign_file() in io_poll_check_events(), which can lead to double locking. Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2476d4ae46554324b599ee4055447b105f20a75a.1649862516.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20io_uring: use right issue_flags for splice/teePavel Begunkov1-2/+2
commit e941976659f1f6834077a1596bf53e6bdb10e90b upstream. Pass right issue_flags into into io_file_get_fixed() instead of IO_URING_F_UNLOCKED. It's probably not a problem at the moment but let's do it safer. Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7d242daa9df5d776907686977cd29fbceb4a2d8d.1649862516.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ax25: Fix UAF bugs in ax25 timersDuoming Zhou1-0/+5
commit 82e31755e55fbcea6a9dfaae5fe4860ade17cbc0 upstream. There are race conditions that may lead to UAF bugs in ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(), ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we call ax25_release() to deallocate ax25_dev. One of the UAF bugs caused by ax25_release() is shown below: (Thread 1) | (Thread 2) ax25_dev_device_up() //(1) | ... | ax25_kill_by_device() ax25_bind() //(2) | ax25_connect() | ... ax25_std_establish_data_link() | ax25_start_t1timer() | ax25_dev_device_down() //(3) mod_timer(&ax25->t1timer,..) | | ax25_release() (wait a time) | ... | ax25_dev_put(ax25_dev) //(4)FREE ax25_t1timer_expiry() | ax25->ax25_dev->values[..] //USE| ... ... | We increase the refcount of ax25_dev in position (1) and (2), and decrease the refcount of ax25_dev in position (3) and (4). The ax25_dev will be freed in position (4) and be used in ax25_t1timer_expiry(). The fail log is shown below: ============================================================== [ 106.116942] BUG: KASAN: use-after-free in ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] Read of size 8 at addr ffff88800bda9028 by task swapper/0/0 [ 106.116942] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-06123-g0905eec574 [ 106.116942] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-14 [ 106.116942] Call Trace: ... [ 106.116942] ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] call_timer_fn+0x122/0x3d0 [ 106.116942] __run_timers.part.0+0x3f6/0x520 [ 106.116942] run_timer_softirq+0x4f/0xb0 [ 106.116942] __do_softirq+0x1c2/0x651 ... This patch adds del_timer_sync() in ax25_release(), which could ensure that all timers stop before we deallocate ax25_dev. Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20cpu/hotplug: Remove the 'cpu' member of cpuhp_cpu_stateSteven Price1-18/+18
commit b7ba6d8dc3569e49800ef0136799f26f43e237e8 upstream. Currently the setting of the 'cpu' member of struct cpuhp_cpu_state in cpuhp_create() is too late as it is used earlier in _cpu_up(). If kzalloc_node() in __smpboot_create_thread() fails then the rollback will be done with st->cpu==0 causing CPU0 to be erroneously set to be dying, causing the scheduler to get mightily confused and throw its toys out of the pram. However the cpu number is actually available directly, so simply remove the 'cpu' member and avoid the problem in the first place. Fixes: 2ea46c6fc945 ("cpumask/hotplug: Fix cpu_dying() state tracking") Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220411152233.474129-2-steven.price@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20drm/i915: Sunset igpu legacy mmap support based on GRAPHICS_VER_FULLMatt Roper1-1/+1
commit 1acb34e7dd7720a1fff00cbd4d000ec3219dc9d6 upstream. The intent of the version check in the mmap ioctl was to maintain support for existing platforms (i.e., ADL/RPL and earlier), but drop support on all future igpu platforms. As we've seen on the dgpu side, the hardware teams are using a more fine-grained numbering system for IP version numbers these days, so it's possible the version number associated with our next igpu could be some form of "12.xx" rather than 13 or higher. Comparing against the full ver.release number will ensure the intent of the check is maintained no matter what numbering the hardware teams settle on. Fixes: d3f3baa3562a ("drm/i915: Reinstate the mmap ioctl for some platforms") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220407161839.1073443-1-matthew.d.roper@intel.com (cherry picked from commit 8e7e5c077cd57ee9a36d58c65f07257dc49a88d5) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20mm, kfence: support kmem_dump_obj() for KFENCE objectsMarco Elver9-25/+105
commit 2dfe63e61cc31ee59ce951672b0850b5229cd5b0 upstream. Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been producing garbage data due to the object not actually being maintained by SLAB or SLUB. Fix this by implementing __kfence_obj_info() that copies relevant information to struct kmem_obj_info when the object was allocated by KFENCE; this is called by a common kmem_obj_info(), which also calls the slab/slub/slob specific variant now called __kmem_obj_info(). For completeness, kmem_dump_obj() now displays if the object was allocated by KFENCE. Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/ Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com Fixes: b89fb5ef0ce6 ("mm, kfence: insert KFENCE hooks for SLUB") Fixes: d3fb45f370d9 ("mm, kfence: insert KFENCE hooks for SLAB") Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Reported-by: kernel test robot <oliver.sang@intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> [slab] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20dma-direct: avoid redundant memory sync for swiotlbChao Gao1-1/+2
commit 9e02977bfad006af328add9434c8bffa40e053bb upstream. When we looked into FIO performance with swiotlb enabled in VM, we found swiotlb_bounce() is always called one more time than expected for each DMA read request. It turns out that the bounce buffer is copied to original DMA buffer twice after the completion of a DMA request (one is done by in dma_direct_sync_single_for_cpu(), the other by swiotlb_tbl_unmap_single()). But the content in bounce buffer actually doesn't change between the two rounds of copy. So, one round of copy is redundant. Pass DMA_ATTR_SKIP_CPU_SYNC flag to swiotlb_tbl_unmap_single() to skip the memory copy in it. This fix increases FIO 64KB sequential read throughput in a guest with swiotlb=force by 5.6%. Fixes: 55897af63091 ("dma-direct: merge swiotlb_dma_ops into the dma_direct code") Reported-by: Wang Zhaoyang1 <zhaoyang1.wang@intel.com> Reported-by: Gao Liang <liang.gao@intel.com> Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20timers: Fix warning condition in __run_timers()Anna-Maria Behnsen1-4/+7
commit c54bc0fc84214b203f7a0ebfd1bd308ce2abe920 upstream. When the timer base is empty, base::next_expiry is set to base::clk + NEXT_TIMER_MAX_DELTA and base::next_expiry_recalc is false. When no timer is queued until jiffies reaches base::next_expiry value, the warning for not finding any expired timer and base::next_expiry_recalc is false in __run_timers() triggers. To prevent triggering the warning in this valid scenario base::timers_pending needs to be added to the warning condition. Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20220405191732.7438-3-anna-maria@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20dt-bindings: net: snps: remove duplicate nameDongjin Yang1-4/+2
commit ce8b3ad1071b764e963d9b08ac34ffddddf12da6 upstream. snps,dwmac has duplicated name for loongson,ls2k-dwmac and loongson,ls7a-dwmac. Signed-off-by: Dongjin Yang <dj76.yang@samsung.com> Fixes: 68277749a013 ("dt-bindings: dwmac: Add bindings for new Loongson SoC and bridge chip") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220404022857epcms1p6e6af1a6a86569f339e50c318abde7d3c@epcms1p6 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20i2c: pasemi: Wait for write xfers to finishMartin Povišer1-0/+6
commit bd8963e602c77adc76dbbbfc3417c3cf14fed76b upstream. Wait for completion of write transfers before returning from the driver. At first sight it may seem advantageous to leave write transfers queued for the controller to carry out on its own time, but there's a couple of issues with it: * Driver doesn't check for FIFO space. * The queued writes can complete while the driver is in its I2C read transfer path which means it will get confused by the raising of XEN (the 'transaction ended' signal). This can cause a spurious ENODATA error due to premature reading of the MRXFIFO register. Adding the wait fixes some unreliability issues with the driver. There's some efficiency cost to it (especially with pasemi_smb_waitready doing its polling), but that will be alleviated once the driver receives interrupt support. Fixes: beb58aa39e6e ("i2c: PA Semi SMBus driver") Signed-off-by: Martin Povišer <povik+lin@cutebit.org> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20dt-bindings: memory: snps,ddrc-3.80a compatible also need interruptsSherry Sun1-2/+4
commit 4f9f45d0eb0e7d449bc9294459df79b9c66edfac upstream. For the snps,ddrc-3.80a compatible, the interrupts property is also required, also order the compatibles by name (s goes before x). Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Fixes: a9e6b3819b36 ("dt-bindings: memory: Add entry for version 3.80a") Link: https://lore.kernel.org/r/20220321075131.17811-2-sherry.sun@nxp.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20smp: Fix offline cpu check in flush_smp_call_function_queue()Nadav Amit1-1/+1
commit 9e949a3886356fe9112c6f6f34a6e23d1d35407f upstream. The check in flush_smp_call_function_queue() for callbacks that are sent to offline CPUs currently checks whether the queue is empty. However, flush_smp_call_function_queue() has just deleted all the callbacks from the queue and moved all the entries into a local list. This checks would only be positive if some callbacks were added in the short time after llist_del_all() was called. This does not seem to be the intention of this check. Change the check to look at the local list to which the entries were moved instead of the queue from which all the callbacks were just removed. Fixes: 8d056c48e4862 ("CPU hotplug, smp: flush any pending IPI callbacks before CPU offline") Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220319072015.1495036-1-namit@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20Revert "net: dsa: setup master before ports"Vladimir Oltean1-13/+10
commit 762c2998c9625f642f0d23da7d3f7e4f90665fdf upstream. This reverts commit 11fd667dac315ea3f2469961f6d2869271a46cae. dsa_slave_change_mtu() updates the MTU of the DSA master and of the associated CPU port, but only if it detects a change to the master MTU. The blamed commit in the Fixes: tag below addressed a regression where dsa_slave_change_mtu() would return early and not do anything due to ds->ops->port_change_mtu() not being implemented. However, that commit also had the effect that the master MTU got set up to the correct value by dsa_master_setup(), but the associated CPU port's MTU did not get updated. This causes breakage for drivers that rely on the ->port_change_mtu() DSA call to account for the tagging overhead on the CPU port, and don't set up the initial MTU during the setup phase. Things actually worked before because they were in a fragile equilibrium where dsa_slave_change_mtu() was called before dsa_master_setup() was. So dsa_slave_change_mtu() could actually detect a change and update the CPU port MTU too. Restore the code to the way things used to work by reverting the reorder of dsa_tree_setup_master() and dsa_tree_setup_ports(). That change did not have a concrete motivation going for it anyway, it just looked better. Fixes: 066dfc429040 ("Revert "net: dsa: stop updating master MTU from master.c"") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20i2c: dev: check return value when calling dev_set_name()Andy Shevchenko1-5/+10
commit 993eb48fa199b5f476df8204e652eff63dd19361 upstream. If dev_set_name() fails, the dev_name() is null, check the return value of dev_set_name() to avoid the null-ptr-deref. Fixes: 1413ef638aba ("i2c: dev: Fix the race between the release of i2c_dev and cdev") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20dm integrity: fix memory corruption when tag_size is less than digest sizeMikulas Patocka1-2/+5
commit 08c1af8f1c13bbf210f1760132f4df24d0ed46d6 upstream. It is possible to set up dm-integrity in such a way that the "tag_size" parameter is less than the actual digest size. In this situation, a part of the digest beyond tag_size is ignored. In this case, dm-integrity would write beyond the end of the ic->recalc_tags array and corrupt memory. The corruption happened in integrity_recalc->integrity_sector_checksum->crypto_shash_final. Fix this corruption by increasing the tags array so that it has enough padding at the end to accomodate the loop in integrity_recalc() being able to write a full digest size for the last member of the tags array. Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ep93xx: clock: Fix UAF in ep93xx_clk_register_gate()Alexander Sverdlin1-1/+3
commit 3b68b08885217abd9c57ff9b3bb3eb173eee02a9 upstream. arch/arm/mach-ep93xx/clock.c:154:2: warning: Use of memory after it is freed [clang-analyzer-unix.Malloc] arch/arm/mach-ep93xx/clock.c:151:2: note: Taking true branch if (IS_ERR(clk)) ^ arch/arm/mach-ep93xx/clock.c:152:3: note: Memory is released kfree(psc); ^~~~~~~~~~ arch/arm/mach-ep93xx/clock.c:154:2: note: Use of memory after it is freed return &psc->hw; ^ ~~~~~~~~ Fixes: 9645ccc7bd7a ("ep93xx: clock: convert in-place to COMMON_CLK") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Cc: stable@vger.kernel.org Link: https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/B5YCO2NJEXINCYE26Y255LCVMO55BGWW/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ARM: davinci: da850-evm: Avoid NULL pointer dereferenceNathan Chancellor1-1/+3
commit 83a1cde5c74bfb44b49cb2a940d044bb2380f4ea upstream. With newer versions of GCC, there is a panic in da850_evm_config_emac() when booting multi_v5_defconfig in QEMU under the palmetto-bmc machine: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = (ptrval) [00000020] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0 #1 Hardware name: Generic DT based system PC is at da850_evm_config_emac+0x1c/0x120 LR is at do_one_initcall+0x50/0x1e0 The emac_pdata pointer in soc_info is NULL because davinci_soc_info only gets populated on davinci machines but da850_evm_config_emac() is called on all machines via device_initcall(). Move the rmii_en assignment below the machine check so that it is only dereferenced when running on a supported SoC. Fixes: bae105879f2f ("davinci: DA850/OMAP-L138 EVM: implement autodetect of RMII PHY") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Bartosz Golaszewski <brgl@bgdev.pl> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/YcS4xVWs6bQlQSPC@archlinux-ax161/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20tick/nohz: Use WARN_ON_ONCE() to prevent console saturationPaul Gortmaker1-1/+1
commit 40e97e42961f8c6cc7bd5fe67cc18417e02d78f1 upstream. While running some testing on code that happened to allow the variable tick_nohz_full_running to get set but with no "possible" NOHZ cores to back up that setting, this warning triggered: if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) WARN_ON(tick_nohz_full_running); The console was overwhemled with an endless stream of one WARN per tick per core and there was no way to even see what was going on w/o using a serial console to capture it and then trace it back to this. Change it to WARN_ON_ONCE(). Fixes: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full") Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211206145950.10927-3-paul.gortmaker@windriver.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20genirq/affinity: Consider that CPUs on nodes can be unbalancedRei Yamamoto1-2/+3
commit 08d835dff916bfe8f45acc7b92c7af6c4081c8a7 upstream. If CPUs on a node are offline at boot time, the number of nodes is different when building affinity masks for present cpus and when building affinity masks for possible cpus. This causes the following problem: In the case that the number of vectors is less than the number of nodes there are cases where bits of masks for present cpus are overwritten when building masks for possible cpus. Fix this by excluding CPUs, which are not part of the current build mask (present/possible). [ tglx: Massaged changelog and added comment ] Fixes: b82592199032 ("genirq/affinity: Spread IRQs to all available NUMA nodes") Signed-off-by: Rei Yamamoto <yamamoto.rei@jp.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220331003309.10891-1-yamamoto.rei@jp.fujitsu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20x86/tsx: Disable TSX development mode at bootPawan Gupta6-18/+55
commit 400331f8ffa3bec5c561417e5eec6848464e9160 upstream. A microcode update on some Intel processors causes all TSX transactions to always abort by default[*]. Microcode also added functionality to re-enable TSX for development purposes. With this microcode loaded, if tsx=on was passed on the cmdline, and TSX development mode was already enabled before the kernel boot, it may make the system vulnerable to TSX Asynchronous Abort (TAA). To be on safer side, unconditionally disable TSX development mode during boot. If a viable use case appears, this can be revisited later. [*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557 [ bp: Drop unstable web link, massage heavily. ] Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20x86/tsx: Use MSR_TSX_CTRL to clear CPUID bitsPawan Gupta2-7/+48
commit 258f3b8c3210b03386e4ad92b4bd8652b5c1beb3 upstream. tsx_clear_cpuid() uses MSR_TSX_FORCE_ABORT to clear CPUID.RTM and CPUID.HLE. Not all CPUs support MSR_TSX_FORCE_ABORT, alternatively use MSR_IA32_TSX_CTRL when supported. [ bp: Document how and why TSX gets disabled. ] Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/5b323e77e251a9c8bcdda498c5cc0095be1e1d3c.1646943780.git.pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20drm/amdgpu: Enable gfxoff quirk on MacBook ProTomasz Moń1-0/+2
commit 4593c1b6d159f1e5c35c07a7f125e79e5a864302 upstream. Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20drm/amd/display: don't ignore alpha property on pre-multiplied modeMelissa Wen2-10/+18
commit e4f1541caf60fcbe5a59e9d25805c0b5865e546a upstream. "Pre-multiplied" is the default pixel blend mode for KMS/DRM, as documented in supported_modes of drm_plane_create_blend_mode_property(): https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_blend.c In this mode, both 'pixel alpha' and 'plane alpha' participate in the calculation, as described by the pixel blend mode formula in KMS/DRM documentation: out.rgb = plane_alpha * fg.rgb + (1 - (plane_alpha * fg.alpha)) * bg.rgb Considering the blend config mechanisms we have in the driver so far, the alpha mode that better fits this blend mode is the _PER_PIXEL_ALPHA_COMBINED_GLOBAL_GAIN, where the value for global_gain is the plane alpha (global_alpha). With this change, alpha property stops to be ignored. It also addresses Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1734 v2: * keep the 8-bit value for global_alpha_value (Nicholas) * correct the logical ordering for combined global gain (Nicholas) * apply to dcn10 too (Nicholas) Signed-off-by: Melissa Wen <mwen@igalia.com> Tested-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Simon Ser <contact@emersion.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ipv6: fix panic when forwarding a pkt with no in6 devNicolas Dichtel1-1/+1
commit e3fa461d8b0e185b7da8a101fe94dfe6dd500ac0 upstream. kongweibin reported a kernel panic in ip6_forward() when input interface has no in6 dev associated. The following tc commands were used to reproduce this panic: tc qdisc del dev vxlan100 root tc qdisc add dev vxlan100 root netem corrupt 5% CC: stable@vger.kernel.org Fixes: ccd27f05ae7b ("ipv6: fix 'disable_policy' for fwd packets") Reported-by: kongweibin <kongweibin2@huawei.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20nl80211: correctly check NL80211_ATTR_REG_ALPHA2 sizeJohannes Berg1-1/+2
commit 6624bb34b4eb19f715db9908cca00122748765d7 upstream. We need this to be at least two bytes, so we can access alpha2[0] and alpha2[1]. It may be three in case some userspace used NUL-termination since it was NLA_STRING (and we also push it out with NUL-termination). Cc: stable@vger.kernel.org Reported-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220411114201.fd4a31f06541.Ie7ff4be2cf348d8cc28ed0d626fc54becf7ea799@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ALSA: pcm: Test for "silence" field in struct "pcm_format_data"Fabio M. De Francesco1-1/+1
commit 2f7a26abb8241a0208c68d22815aa247c5ddacab upstream. Syzbot reports "KASAN: null-ptr-deref Write in snd_pcm_format_set_silence".[1] It is due to missing validation of the "silence" field of struct "pcm_format_data" in "pcm_formats" array. Add a test for valid "pat" and, if it is not so, return -EINVAL. [1] https://lore.kernel.org/lkml/000000000000d188ef05dc2c7279@google.com/ Reported-and-tested-by: syzbot+205eb15961852c2c5974@syzkaller.appspotmail.com Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220409012655.9399-1-fmdefrancesco@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ALSA: hda/realtek: add quirk for Lenovo Thinkpad X12 speakersTao Jin1-0/+1
commit 264fb03497ec1c7841bba872571bcd11beed57a7 upstream. For this specific device on Lenovo Thinkpad X12 tablet, the verbs were dumped by qemu running a guest OS that init this codec properly. After studying the dump, it turns out that the same quirk used by the other Lenovo devices can be reused. The patch was tested working against the mainline kernel. Cc: <stable@vger.kernel.org> Signed-off-by: Tao Jin <tao-j@outlook.com> Link: https://lore.kernel.org/r/CO6PR03MB6241CD73310B37858FE64C85E1E89@CO6PR03MB6241.namprd03.prod.outlook.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ALSA: hda/realtek: Add quirk for Clevo PD50PNTTim Crawford1-0/+1
commit 9eb6f5c388060d8cef3c8b616cc31b765e022359 upstream. Fixes speaker output and headset detection on Clevo PD50PNT. Signed-off-by: Tim Crawford <tcrawford@system76.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220405182029.27431-1-tcrawford@system76.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20btrfs: mark resumed async balance as writingNaohiro Aota1-0/+2
commit a690e5f2db4d1dca742ce734aaff9f3112d63764 upstream. When btrfs balance is interrupted with umount, the background balance resumes on the next mount. There is a potential deadlock with FS freezing here like as described in commit 26559780b953 ("btrfs: zoned: mark relocation as writing"). Mark the process as sb_writing to avoid it. Reviewed-by: Filipe Manana <fdmanana@suse.com> CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20btrfs: fix root ref counts in error handling in btrfs_get_root_refJia-Ju Bai1-2/+3
commit 168a2f776b9762f4021421008512dd7ab7474df1 upstream. In btrfs_get_root_ref(), when btrfs_insert_fs_root() fails, btrfs_put_root() can happen for two reasons: - the root already exists in the tree, in that case it returns the reference obtained in btrfs_lookup_fs_root() - another error so the cleanup is done in the fail label Calling btrfs_put_root() unconditionally would lead to double decrement of the root reference possibly freeing it in the second case. Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Fixes: bc44d7c4b2b1 ("btrfs: push btrfs_grab_fs_root into btrfs_get_fs_root") CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20btrfs: zoned: activate block group only for extent allocationNaohiro Aota3-9/+21
commit 760e69c4c2e2f475a812bdd414b62758215ce9cb upstream. In btrfs_make_block_group(), we activate the allocated block group, expecting that the block group is soon used for allocation. However, the chunk allocation from flush_space() context broke the assumption. There can be a large time gap between the chunk allocation time and the extent allocation time from the chunk. Activating the empty block groups pre-allocated from flush_space() context can exhaust the active zone counter of a device. Once we use all the active zone counts for empty pre-allocated block groups, we cannot activate new block group for the other things: metadata, tree-log, or data relocation block group. That failure results in a fake -ENOSPC. This patch introduces CHUNK_ALLOC_FORCE_FOR_EXTENT to distinguish the chunk allocation from find_free_extent(). Now, the new block group is activated only in that context. Fixes: eb66a010d518 ("btrfs: zoned: activate new block group") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ath9k: Fix usage of driver-private space in tx_infoToke Høiland-Jørgensen2-11/+21
commit 5a6b06f5927c940fa44026695779c30b7536474c upstream. The ieee80211_tx_info_clear_status() helper also clears the rate counts and the driver-private part of struct ieee80211_tx_info, so using it breaks quite a few other things. So back out of using it, and instead define a ath-internal helper that only clears the area between the status_driver_data and the rates info. Combined with moving the ath_frame_info struct to status_driver_data, this avoids clearing anything we shouldn't be, and so we can keep the existing code for handling the rate information. While fixing this I also noticed that the setting of tx_info->status.rates[tx_rateindex].count on hardware underrun errors was always immediately overridden by the normal setting of the same fields, so rearrange the code so that the underrun detection actually takes effect. The new helper could be generalised to a 'memset_between()' helper, but leave it as a driver-internal helper for now since this needs to go to stable. Cc: stable@vger.kernel.org Reported-by: Peter Seiderer <ps.report@gmx.net> Fixes: 037250f0a45c ("ath9k: Properly clear TX status area before reporting to mac80211") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Peter Seiderer <ps.report@gmx.net> Tested-by: Peter Seiderer <ps.report@gmx.net> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220404204800.2681133-1-toke@toke.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20ath9k: Properly clear TX status area before reporting to mac80211Toke Høiland-Jørgensen1-3/+2
commit 037250f0a45cf9ecf5b52d4b9ff8eadeb609c800 upstream. The ath9k driver was not properly clearing the status area in the ieee80211_tx_info struct before reporting TX status to mac80211. Instead, it was manually filling in fields, which meant that fields introduced later were left as-is. Conveniently, mac80211 actually provides a helper to zero out the status area, so use that to make sure we zero everything. The last commit touching the driver function writing the status information seems to have actually been fixing an issue that was also caused by the area being uninitialised; but it only added clearing of a single field instead of the whole struct. That is now redundant, though, so revert that commit and use it as a convenient Fixes tag. Fixes: cc591d77aba1 ("ath9k: Make sure to zero status.tx_time before reporting TX status") Reported-by: Bagas Sanjaya <bagasdotme@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220330164409.16645-1-toke@toke.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20gpio: sim: fix setting and getting multiple linesBartosz Golaszewski1-2/+2
commit 3836c73e6a2585561af928c6641d74528a8bdfa4 upstream. We need to take mask into account in the set/get_multiple() callbacks. Use bitmap_replace() instead of bitmap_copy(). Fixes: cb8c474e79be ("gpio: sim: new testing module") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20cifs: verify that tcon is valid before dereference in cifs_kill_sbRonnie Sahlberg1-3/+3
commit 8b6c58458ee3206dde345fce327a4cb83e69caf9 upstream. On umount, cifs_sb->tlink_tree might contain entries that do not represent a valid tcon. Check the tcon for error before we dereference it. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reported-by: Xiaoli Feng <xifeng@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20gcc-plugins: latent_entropy: use /dev/urandomJason A. Donenfeld1-17/+27
commit c40160f2998c897231f8454bf797558d30a20375 upstream. While the latent entropy plugin mostly doesn't derive entropy from get_random_const() for measuring the call graph, when __latent_entropy is applied to a constant, then it's initialized statically to output from get_random_const(). In that case, this data is derived from a 64-bit seed, which means a buffer of 512 bits doesn't really have that amount of compile-time entropy. This patch fixes that shortcoming by just buffering chunks of /dev/urandom output and doling it out as requested. At the same time, it's important that we don't break the use of -frandom-seed, for people who want the runtime benefits of the latent entropy plugin, while still having compile-time determinism. In that case, we detect whether gcc's set_random_seed() has been called by making a call to get_random_seed(noinit=true) in the plugin init function, which is called after set_random_seed() is called but before anything that calls get_random_seed(noinit=false), and seeing if it's zero or not. If it's not zero, we're in deterministic mode, and so we just generate numbers with a basic xorshift prng. Note that we don't detect if -frandom-seed is being used using the documented local_tick variable, because it's assigned via: local_tick = (unsigned) tv.tv_sec * 1000 + tv.tv_usec / 1000; which may well overflow and become -1 on its own, and so isn't reliable: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105171 [kees: The 256 byte rnd_buf size was chosen based on average (250), median (64), and std deviation (575) bytes of used entropy for a defconfig x86_64 build] Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") Cc: stable@vger.kernel.org Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220405222815.21155-1-Jason@zx2c4.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20memory: renesas-rpc-if: fix platform-device leak in error pathJohan Hovold1-1/+9
commit b452dbf24d7d9a990d70118462925f6ee287d135 upstream. Make sure to free the flash platform device in the event that registration fails during probe. Fixes: ca7d8b980b67 ("memory: add Renesas RPC-IF driver") Cc: stable@vger.kernel.org # 5.8 Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20220303180632.3194-1-johan@kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20SUNRPC: Fix NFSD's request deferral on RDMA transportsChuck Lever3-1/+5
commit 773f91b2cf3f52df0d7508fdbf60f37567cdaee4 upstream. Trond Myklebust reports an NFSD crash in svc_rdma_sendto(). Further investigation shows that the crash occurred while NFSD was handling a deferred request. This patch addresses two inter-related issues that prevent request deferral from working correctly for RPC/RDMA requests: 1. Prevent the crash by ensuring that the original svc_rqst::rq_xprt_ctxt value is available when the request is revisited. Otherwise svc_rdma_sendto() does not have a Receive context available with which to construct its reply. 2. Possibly since before commit 71641d99ce03 ("svcrdma: Properly compute .len and .buflen for received RPC Calls"), svc_rdma_recvfrom() did not include the transport header in the returned xdr_buf. There should have been no need for svc_defer() and friends to save and restore that header, as of that commit. This issue is addressed in a backport-friendly way by simply having svc_rdma_recvfrom() set rq_xprt_hlen to zero unconditionally, just as svc_tcp_recvfrom() does. This enables svc_deferred_recv() to correctly reconstruct an RPC message received via RPC/RDMA. Reported-by: Trond Myklebust <trondmy@hammerspace.com> Link: https://lore.kernel.org/linux-nfs/82662b7190f26fb304eb0ab1bb04279072439d4e.camel@hammerspace.com/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loadedSean Christopherson3-8/+37
commit 1d0e84806047f38027d7572adb4702ef7c09b317 upstream. Resolve nx_huge_pages to true/false when kvm.ko is loaded, leaving it as -1 is technically undefined behavior when its value is read out by param_get_bool(), as boolean values are supposed to be '0' or '1'. Alternatively, KVM could define a custom getter for the param, but the auto value doesn't depend on the vendor module in any way, and printing "auto" would be unnecessarily unfriendly to the user. In addition to fixing the undefined behavior, resolving the auto value also fixes the scenario where the auto value resolves to N and no vendor module is loaded. Previously, -1 would result in Y being printed even though KVM would ultimately disable the mitigation. Rename the existing MMU module init/exit helpers to clarify that they're invoked with respect to the vendor module, and add comments to document why KVM has two separate "module init" flows. ========================================================================= UBSAN: invalid-load in kernel/params.c:320:33 load of value 255 is not a valid value for type '_Bool' CPU: 6 PID: 892 Comm: tail Not tainted 5.17.0-rc3+ #799 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 ubsan_epilogue+0x5/0x40 __ubsan_handle_load_invalid_value.cold+0x43/0x48 param_get_bool.cold+0xf/0x14 param_attr_show+0x55/0x80 module_attr_show+0x1c/0x30 sysfs_kf_seq_show+0x93/0xc0 seq_read_iter+0x11c/0x450 new_sync_read+0x11b/0x1a0 vfs_read+0xf0/0x190 ksys_read+0x5f/0xe0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> ========================================================================= Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation") Cc: stable@vger.kernel.org Reported-by: Bruno Goncalves <bgoncalv@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220331221359.3912754-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"Andrew Morton1-2/+2
commit aeb7923733d100b86c6bc68e7ae32913b0cec9d8 upstream. Despite Mike's attempted fix (925346c129da117122), regressions reports continue: https://lore.kernel.org/lkml/cb5b81bd-9882-e5dc-cd22-54bdbaaefbbc@leemhuis.info/ https://bugzilla.kernel.org/show_bug.cgi?id=215720 https://lkml.kernel.org/r/b685f3d0-da34-531d-1aa9-479accd3e21b@leemhuis.info So revert this patch. Fixes: 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Kennelly <ckennelly@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Fangrui Song <maskray@google.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"Andrew Morton1-1/+1
commit 354e923df042a11d1ab8ca06b3ebfab3a018a4ec upstream. Commit 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") was an attempt to fix regressions due to 9630f0d60fec5f ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE"). But regressionss continue to be reported: https://lore.kernel.org/lkml/cb5b81bd-9882-e5dc-cd22-54bdbaaefbbc@leemhuis.info/ https://bugzilla.kernel.org/show_bug.cgi?id=215720 https://lkml.kernel.org/r/b685f3d0-da34-531d-1aa9-479accd3e21b@leemhuis.info This patch reverts the fix, so the original can also be reverted. Fixes: 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Chris Kennelly <ckennelly@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: David Rientjes <rientjes@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Fangrui Song <maskray@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Mike Rapoport <rppt@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20hugetlb: do not demote poisoned hugetlb pagesMike Kravetz1-7/+10
commit 5a317412ef884763fdf7aa17f9f3636959d11d8f upstream. It is possible for poisoned hugetlb pages to reside on the free lists. The huge page allocation routines which dequeue entries from the free lists make a point of avoiding poisoned pages. There is no such check and avoidance in the demote code path. If a hugetlb page on the is on a free list, poison will only be set in the head page rather then the page with the actual error. If such a page is demoted, then the poison flag may follow the wrong page. A page without error could have poison set, and a page with poison could not have the flag set. Check for poison before attempting to demote a hugetlb page. Also, return -EBUSY to the caller if only poisoned pages are on the free list. Link: https://lkml.kernel.org/r/20220307215707.50916-1-mike.kravetz@oracle.com Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20mm: kmemleak: take a full lowmem check in kmemleak_*_phys()Patrick Wang1-4/+4
commit 23c2d497de21f25898fbea70aeb292ab8acc8c94 upstream. The kmemleak_*_phys() apis do not check the address for lowmem's min boundary, while the caller may pass an address below lowmem, which will trigger an oops: # echo scan > /sys/kernel/debug/kmemleak Unable to handle kernel paging request at virtual address ff5fffffffe00000 Oops [#1] Modules linked in: CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 #33 Hardware name: riscv-virtio,qemu (DT) epc : scan_block+0x74/0x15c ra : scan_block+0x72/0x15c epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30 gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200 t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90 s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000 a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001 a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005 s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90 s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0 s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000 s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f t5 : 0000000000000001 t6 : 0000000000000000 status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d scan_gray_list+0x12e/0x1a6 kmemleak_scan+0x2aa/0x57e kmemleak_write+0x32a/0x40c full_proxy_write+0x56/0x82 vfs_write+0xa6/0x2a6 ksys_write+0x6c/0xe2 sys_write+0x22/0x2a ret_from_syscall+0x0/0x2 The callers may not quite know the actual address they pass(e.g. from devicetree). So the kmemleak_*_phys() apis should guarantee the address they finally use is in lowmem range, so check the address for lowmem's min boundary. Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20mm: fix unexpected zeroed page mapping with zram swapMinchan Kim1-54/+0
commit e914d8f00391520ecc4495dd0ca0124538ab7119 upstream. Two processes under CLONE_VM cloning, user process can be corrupted by seeing zeroed page unexpectedly. CPU A CPU B do_swap_page do_swap_page SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path swap_readpage valid data swap_slot_free_notify delete zram entry swap_readpage zeroed(invalid) data pte_lock map the *zero data* to userspace pte_unlock pte_lock if (!pte_same) goto out_nomap; pte_unlock return and next refault will read zeroed data The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't increase the refcount of swap slot at copy_mm so it couldn't catch up whether it's safe or not to discard data from backing device. In the case, only the lock it could rely on to synchronize swap slot freeing is page table lock. Thus, this patch gets rid of the swap_slot_free_notify function. With this patch, CPU A will see correct data. CPU A CPU B do_swap_page do_swap_page SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path swap_readpage original data pte_lock map the original data swap_free swap_range_free bd_disk->fops->swap_slot_free_notify swap_readpage read zeroed data pte_unlock pte_lock if (!pte_same) goto out_nomap; pte_unlock return on next refault will see mapped data by CPU B The concern of the patch would increase memory consumption since it could keep wasted memory with compressed form in zram as well as uncompressed form in address space. However, most of cases of zram uses no readahead and do_swap_page is followed by swap_free so it will free the compressed form from in zram quickly. Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device") Reported-by: Ivan Babrou <ivan@cloudflare.com> Tested-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20mm, page_alloc: fix build_zonerefs_node()Juergen Gross1-1/+1
commit e553f62f10d93551eb883eca227ac54d1a4fad84 upstream. Since commit 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") only zones with free memory are included in a built zonelist. This is problematic when e.g. all memory of a zone has been ballooned out when zonelists are being rebuilt. The decision whether to rebuild the zonelists when onlining new memory is done based on populated_zone() returning 0 for the zone the memory will be added to. The new zone is added to the zonelists only, if it has free memory pages (managed_zone() returns a non-zero value) after the memory has been onlined. This implies, that onlining memory will always free the added pages to the allocator immediately, but this is not true in all cases: when e.g. running as a Xen guest the onlined new memory will be added only to the ballooned memory list, it will be freed only when the guest is being ballooned up afterwards. Another problem with using managed_zone() for the decision whether a zone is being added to the zonelists is, that a zone with all memory used will in fact be removed from all zonelists in case the zonelists happen to be rebuilt. Use populated_zone() when building a zonelist as it has been done before that commit. There was a report that QubesOS (based on Xen) is hitting this problem. Xen has switched to use the zone device functionality in kernel 5.9 and QubesOS wants to use memory hotplugging for guests in order to be able to start a guest with minimal memory and expand it as needed. This was the report leading to the patch. Link: https://lkml.kernel.org/r/20220407120637.9035-1-jgross@suse.com Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") Signed-off-by: Juergen Gross <jgross@suse.com> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20mm/secretmem: fix panic when growing a memfd_secretAxel Rasmussen1-0/+17
commit f9b141f93659e09a52e28791ccbaf69c273b8e92 upstream. When one tries to grow an existing memfd_secret with ftruncate, one gets a panic [1]. For example, doing the following reliably induces the panic: fd = memfd_secret(); ftruncate(fd, 10); ptr = mmap(NULL, 10, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); strcpy(ptr, "123456789"); munmap(ptr, 10); ftruncate(fd, 20); The basic reason for this is, when we grow with ftruncate, we call down into simple_setattr, and then truncate_inode_pages_range, and eventually we try to zero part of the memory. The normal truncation code does this via the direct map (i.e., it calls page_address() and hands that to memset()). For memfd_secret though, we specifically don't map our pages via the direct map (i.e. we call set_direct_map_invalid_noflush() on every fault). So the address returned by page_address() isn't useful, and when we try to memset() with it we panic. This patch avoids the panic by implementing a custom setattr for memfd_secret, which detects resizes specifically (setting the size for the first time works just fine, since there are no existing pages to try to zero), and rejects them with EINVAL. One could argue growing should be supported, but I think that will require a significantly more lengthy change. So, I propose a minimal fix for the benefit of stable kernels, and then perhaps to extend memfd_secret to support growing in a separate patch. [1]: BUG: unable to handle page fault for address: ffffa0a889277028 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD afa01067 P4D afa01067 PUD 83f909067 PMD 83f8bf067 PTE 800ffffef6d88060 Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 0 PID: 281 Comm: repro Not tainted 5.17.0-dbg-DEV #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:memset_erms+0x9/0x10 Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 RSP: 0018:ffffb932c09afbf0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffda63c4249dc0 RCX: 0000000000000fd8 RDX: 0000000000000fd8 RSI: 0000000000000000 RDI: ffffa0a889277028 RBP: ffffb932c09afc00 R08: 0000000000001000 R09: ffffa0a889277028 R10: 0000000000020023 R11: 0000000000000000 R12: ffffda63c4249dc0 R13: ffffa0a890d70d98 R14: 0000000000000028 R15: 0000000000000fd8 FS: 00007f7294899580(0000) GS:ffffa0af9bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa0a889277028 CR3: 0000000107ef6006 CR4: 0000000000370ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? zero_user_segments+0x82/0x190 truncate_inode_partial_folio+0xd4/0x2a0 truncate_inode_pages_range+0x380/0x830 truncate_setsize+0x63/0x80 simple_setattr+0x37/0x60 notify_change+0x3d8/0x4d0 do_sys_ftruncate+0x162/0x1d0 __x64_sys_ftruncate+0x1c/0x20 do_syscall_64+0x44/0xa0 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: xhci_pci xhci_hcd virtio_net net_failover failover virtio_blk virtio_balloon uhci_hcd ohci_pci ohci_hcd evdev ehci_pci ehci_hcd 9pnet_virtio 9p netfs 9pnet CR2: ffffa0a889277028 [lkp@intel.com: secretmem_iops can be static] Signed-off-by: kernel test robot <lkp@intel.com> [axelrasmussen@google.com: return EINVAL] Link: https://lkml.kernel.org/r/20220324210909.1843814-1-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20220412193023.279320-1-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20perf/imx_ddr: Fix undefined behavior due to shift overflowing the constantBorislav Petkov1-1/+1
[ Upstream commit d02b4dd84e1a90f7f1444d027c0289bf355b0d5a ] Fix: In file included from <command-line>:0:0: In function ‘ddr_perf_counter_enable’, inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2: ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \ declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ... See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory details as to why it triggers with older gccs only. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Frank Li <Frank.li@nxp.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20io_uring: use nospec annotation for more indexesPavel Begunkov1-6/+5
[ Upstream commit 4cdd158be9d09223737df83136a1fb65269d809a ] There are still several places that using pre array_index_nospec() indexes, fix them up. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b01ef5ee83f72ed35ad525912370b729f5d145f4.1649336342.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>