summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-16Linux 5.15.29v5.15.29Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20220314112743.029192918@linuxfoundation.org Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Fox Chen <foxhlchen@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16vhost: allow batching hint without sizeJason Wang1-1/+3
commit 95932ab2ea07b79cdb33121e2f40ccda9e6a73b5 upstream. Commit e2ae38cf3d91 ("vhost: fix hung thread due to erroneous iotlb entries") tries to reject the IOTLB message whose size is zero. But the size is not necessarily meaningful, one example is the batching hint, so the commit breaks that. Fixing this be reject zero size message only if the message is used to update/invalidate the IOTLB. Fixes: e2ae38cf3d91 ("vhost: fix hung thread due to erroneous iotlb entries") Reported-by: Eli Cohen <elic@nvidia.com> Cc: Anirudh Rayabharam <mail@anirudhrb.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220310075211.4801-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16Revert "net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN"Vladimir Oltean4-9/+1
This reverts commit 2566a89b9e163b2fcd104d6005e0149f197b8a48 which is commit a2614140dc0f467a83aa3bb4b6ee2d6480a76202 upstream. The above change depends on upstream commit 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work"), which is not present in linux-5.15.y. Without that change, waiting for the switchdev workqueue causes deadlocks on the rtnl_mutex. Backporting the dependency commit isn't trivial/desirable, since it requires that the following dependencies of the dependency are also backported: df405910ab9f net: dsa: sja1105: wait for dynamic config command completion on writes too eb016afd83a9 net: dsa: sja1105: serialize access to the dynamic config interface 2468346c5677 net: mscc: ocelot: serialize access to the MAC table f7eb4a1c0864 net: dsa: b53: serialize access to the ARL table cf231b436f7c net: dsa: lantiq_gswip: serialize access to the PCE registers 338a3a4745aa net: dsa: introduce locking for the address lists on CPU and DSA ports and then this bugfix on top: 8940e6b669ca ("net: dsa: avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held") Reported-by: Daniel Suchy <danny@danysek.cz> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16block: drop unused includes in <linux/genhd.h>Christoph Hellwig17-12/+18
commit b81e0c2372e65e5627864ba034433b64b2fc73f5 upstream. Drop various include not actually used in genhd.h itself, and move the remaning includes closer together. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>a Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: "Maciej W. Rozycki" <macro@orcam.me.uk> [ resolves MIPS build failure by luck, root cause needs to be fixed in Linus's tree properly, but this is needed for now to fix the build - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16riscv: dts: k210: fix broken IRQs on hart1Niklas Cassel1-1/+2
commit 74583f1b92cb3bbba1a3741cea237545c56f506c upstream. Commit 67d96729a9e7 ("riscv: Update Canaan Kendryte K210 device tree") incorrectly removed two entries from the PLIC interrupt-controller node's interrupts-extended property. The PLIC driver cannot know the mapping between hart contexts and hart ids, so this information has to be provided by device tree, as specified by the PLIC device tree binding. The PLIC driver uses the interrupts-extended property, and initializes the hart context registers in the exact same order as provided by the interrupts-extended property. In other words, if we don't specify the S-mode interrupts, the PLIC driver will simply initialize the hart0 S-mode hart context with the hart1 M-mode configuration. It is therefore essential to specify the S-mode IRQs even though the system itself will only ever be running in M-mode. Re-add the S-mode interrupts, so that we get working IRQs on hart1 again. Cc: <stable@vger.kernel.org> Fixes: 67d96729a9e7 ("riscv: Update Canaan Kendryte K210 device tree") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16drm/i915: Workaround broken BIOS DBUF configuration on TGL/RKLVille Syrjälä4-2/+74
commit 4e6f55120c7eccf6f9323bb681632e23cbcb3f3c upstream. On TGL/RKL the BIOS likes to use some kind of bogus DBUF layout that doesn't match what the spec recommends. With a single active pipe that is not going to be a problem, but with multiple pipes active skl_commit_modeset_enables() goes into an infinite loop since it can't figure out any order in which it can commit the pipes without causing DBUF overlaps between the planes. We'd need some kind of extra DBUF defrag stage in between to make the transition possible. But that is clearly way too complex a solution, so in the name of simplicity let's just sanitize the DBUF state by simply turning off all planes when we detect a pipe encroaching on its neighbours' DBUF slices. We only have to disable the primary planes as all other planes should have already been disabled (if they somehow were enabled) by earlier sanitization steps. And for good measure let's also sanitize in case the DBUF allocations of the pipes already seem to overlap each other. Cc: <stable@vger.kernel.org> # v5.14+ Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4762 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220204141818.1900-3-ville.syrjala@linux.intel.com Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> (cherry picked from commit 15512021eb3975a8c2366e3883337e252bb0eee5) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16btrfs: make send work with concurrent block group relocationFilipe Manana7-104/+395
commit d96b34248c2f4ea8cd09286090f2f6f77102eaab upstream. We don't allow send and balance/relocation to run in parallel in order to prevent send failing or silently producing some bad stream. This is because while send is using an extent (specially metadata) or about to read a metadata extent and expecting it belongs to a specific parent node, relocation can run, the transaction used for the relocation is committed and the extent gets reallocated while send is still using the extent, so it ends up with a different content than expected. This can result in just failing to read a metadata extent due to failure of the validation checks (parent transid, level, etc), failure to find a backreference for a data extent, and other unexpected failures. Besides reallocation, there's also a similar problem of an extent getting discarded when it's unpinned after the transaction used for block group relocation is committed. The restriction between balance and send was added in commit 9e967495e0e0 ("Btrfs: prevent send failures and crashes due to concurrent relocation"), kernel 5.3, while the more general restriction between send and relocation was added in commit 1cea5cf0e664 ("btrfs: ensure relocation never runs while we have send operations running"), kernel 5.14. Both send and relocation can be very long running operations. Relocation because it has to do a lot of IO and expensive backreference lookups in case there are many snapshots, and send due to read IO when operating on very large trees. This makes it inconvenient for users and tools to deal with scheduling both operations. For zoned filesystem we also have automatic block group relocation, so send can fail with -EAGAIN when users least expect it or send can end up delaying the block group relocation for too long. In the future we might also get the automatic block group relocation for non zoned filesystems. This change makes it possible for send and relocation to run in parallel. This is achieved the following way: 1) For all tree searches, send acquires a read lock on the commit root semaphore; 2) After each tree search, and before releasing the commit root semaphore, the leaf is cloned and placed in the search path (struct btrfs_path); 3) After releasing the commit root semaphore, the changed_cb() callback is invoked, which operates on the leaf and writes commands to the pipe (or file in case send/receive is not used with a pipe). It's important here to not hold a lock on the commit root semaphore, because if we did we could deadlock when sending and receiving to the same filesystem using a pipe - the send task blocks on the pipe because it's full, the receive task, which is the only consumer of the pipe, triggers a transaction commit when attempting to create a subvolume or reserve space for a write operation for example, but the transaction commit blocks trying to write lock the commit root semaphore, resulting in a deadlock; 4) Before moving to the next key, or advancing to the next change in case of an incremental send, check if a transaction used for relocation was committed (or is about to finish its commit). If so, release the search path(s) and restart the search, to where we were before, so that we don't operate on stale extent buffers. The search restarts are always possible because both the send and parent roots are RO, and no one can add, remove of update keys (change their offset) in RO trees - the only exception is deduplication, but that is still not allowed to run in parallel with send; 5) Periodically check if there is contention on the commit root semaphore, which means there is a transaction commit trying to write lock it, and release the semaphore and reschedule if there is contention, so as to avoid causing any significant delays to transaction commits. This leaves some room for optimizations for send to have less path releases and re searching the trees when there's relocation running, but for now it's kept simple as it performs quite well (on very large trees with resulting send streams in the order of a few hundred gigabytes). Test case btrfs/187, from fstests, stresses relocation, send and deduplication attempting to run in parallel, but without verifying if send succeeds and if it produces correct streams. A new test case will be added that exercises relocation happening in parallel with send and then checks that send succeeds and the resulting streams are correct. A final note is that for now this still leaves the mutual exclusion between send operations and deduplication on files belonging to a root used by send operations. A solution for that will be slightly more complex but it will eventually be built on top of this change. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDPThomas Zimmermann1-0/+1
commit 3755d35ee1d2454b20b8a1e20d790e56201678a4 upstream. As reported in [1], DRM_PANEL_EDP depends on DRM_DP_HELPER. Select the option to fix the build failure. The error message is shown below. arm-linux-gnueabihf-ld: drivers/gpu/drm/panel/panel-edp.o: in function `panel_edp_probe': panel-edp.c:(.text+0xb74): undefined reference to `drm_panel_dp_aux_backlight' make[1]: *** [/builds/linux/Makefile:1222: vmlinux] Error 1 The issue has been reported before, when DisplayPort helpers were hidden behind the option CONFIG_DRM_KMS_HELPER. [2] v2: * fix and expand commit description (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 9d6366e743f3 ("drm: fb_helper: improve CONFIG_FB dependency") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://lore.kernel.org/dri-devel/CA+G9fYvN0NyaVkRQmA1O6rX7H8PPaZrUAD7=RDy33QY9rUU-9g@mail.gmail.com/ # [1] Link: https://lore.kernel.org/all/20211117062704.14671-1-rdunlap@infradead.org/ # [2] Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Lyude Paul <lyude@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: dri-devel@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20220203093922.20754-1-tzimmermann@suse.de Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16x86/traps: Mark do_int3() NOKPROBE_SYMBOLLi Huafei1-0/+1
commit a365a65f9ca1ceb9cf1ac29db4a4f51df7c507ad upstream. Since kprobe_int3_handler() is called in do_int3(), probing do_int3() can cause a breakpoint recursion and crash the kernel. Therefore, do_int3() should be marked as NOKPROBE_SYMBOL. Fixes: 21e28290b317 ("x86/traps: Split int3 handler up") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16x86/sgx: Free backing memory after faulting the enclave pageJarkko Sakkinen1-9/+48
commit 08999b2489b4c9b939d7483dbd03702ee4576d96 upstream. There is a limited amount of SGX memory (EPC) on each system. When that memory is used up, SGX has its own swapping mechanism which is similar in concept but totally separate from the core mm/* code. Instead of swapping to disk, SGX swaps from EPC to normal RAM. That normal RAM comes from a shared memory pseudo-file and can itself be swapped by the core mm code. There is a hierarchy like this: EPC <-> shmem <-> disk After data is swapped back in from shmem to EPC, the shmem backing storage needs to be freed. Currently, the backing shmem is not freed. This effectively wastes the shmem while the enclave is running. The memory is recovered when the enclave is destroyed and the backing storage freed. Sort this out by freeing memory with shmem_truncate_range(), as soon as a page is faulted back to the EPC. In addition, free the memory for PCMD pages as soon as all PCMD's in a page have been marked as unused by zeroing its contents. Cc: stable@vger.kernel.org Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer") Reported-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16x86/boot: Add setup_indirect support in early_memremap_is_setup_data()Ross Philipson1-2/+31
commit 445c1470b6ef96440e7cfc42dfc160f5004fd149 upstream. The x86 boot documentation describes the setup_indirect structures and how they are used. Only one of the two functions in ioremap.c that needed to be modified to be aware of the introduction of setup_indirect functionality was updated. Adds comparable support to the other function where it was missing. Fixes: b3c72fc9a78e ("x86/boot: Introduce setup_indirect") Signed-off-by: Ross Philipson <ross.philipson@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1645668456-22036-3-git-send-email-ross.philipson@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16x86/boot: Fix memremap of setup_indirect structuresRoss Philipson5-47/+166
commit 7228918b34615ef6317edcd9a058a057bc54aa32 upstream. As documented, the setup_indirect structure is nested inside the setup_data structures in the setup_data list. The code currently accesses the fields inside the setup_indirect structure but only the sizeof(struct setup_data) is being memremapped. No crash occurred but this is just due to how the area is remapped under the covers. Properly memremap both the setup_data and setup_indirect structures in these cases before accessing them. Fixes: b3c72fc9a78e ("x86/boot: Introduce setup_indirect") Signed-off-by: Ross Philipson <ross.philipson@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1645668456-22036-2-git-send-email-ross.philipson@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Make comment about setting ->defunct more accurateDavid Howells1-1/+1
commit 4edc0760412b0c4ecefc7e02cb855b310b122825 upstream. watch_queue_clear() has a comment stating that setting ->defunct to true preventing new additions as well as preventing notifications. Whilst the latter is true, the first bit is superfluous since at the time this function is called, the pipe cannot be accessed to add new event sources. Remove the "new additions" bit from the comment. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Fix lack of barrier/sync/lock between post and readDavid Howells2-2/+3
commit 2ed147f015af2b48f41c6f0b6746aa9ea85c19f3 upstream. There's nothing to synchronise post_one_notification() versus pipe_read(). Whilst posting is done under pipe->rd_wait.lock, the reader only takes pipe->mutex which cannot bar notification posting as that may need to be made from contexts that cannot sleep. Fix this by setting pipe->head with a barrier in post_one_notification() and reading pipe->head with a barrier in pipe_read(). If that's not sufficient, the rd_wait.lock will need to be taken, possibly in a ->confirm() op so that it only applies to notifications. The lock would, however, have to be dropped before copy_page_to_iter() is invoked. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Free the alloc bitmap when the watch_queue is torn downDavid Howells1-0/+1
commit 7ea1a0124b6da246b5bc8c66cddaafd36acf3ecb upstream. Free the watch_queue note allocation bitmap when the watch_queue is destroyed. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Fix the alloc bitmap size to reflect notes allocatedDavid Howells1-1/+2
commit 3b4c0371928c17af03e8397ac842346624017ce6 upstream. Currently, watch_queue_set_size() sets the number of notes available in wqueue->nr_notes according to the number of notes allocated, but sets the size of the bitmap to the unrounded number of notes originally asked for. Fix this by setting the bitmap size to the number of notes we're actually going to make available (ie. the number allocated). Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Fix to always request a pow-of-2 pipe ring sizeDavid Howells1-1/+1
commit 96a4d8912b28451cd62825fd7caa0e66e091d938 upstream. The pipe ring size must always be a power of 2 as the head and tail pointers are masked off by AND'ing with the size of the ring - 1. watch_queue_set_size(), however, lets you specify any number of notes between 1 and 511. This number is passed through to pipe_resize_ring() without checking/forcing its alignment. Fix this by rounding the number of slots required up to the nearest power of two. The request is meant to guarantee that at least that many notifications can be generated before the queue is full, so rounding down isn't an option, but, alternatively, it may be better to give an error if we aren't allowed to allocate that much ring space. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Fix to release page in ->release()David Howells1-0/+1
commit c1853fbadcba1497f4907971e7107888e0714c81 upstream. When a pipe ring descriptor points to a notification message, the refcount on the backing page is incremented by the generic get function, but the release function, which marks the bitmap, doesn't drop the page ref. Fix this by calling generic_pipe_buf_release() at the end of watch_queue_pipe_buf_release(). Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue, pipe: Free watchqueue state after clearing pipe ringDavid Howells1-3/+5
commit db8facfc9fafacefe8a835416a6b77c838088f8b upstream. In free_pipe_info(), free the watchqueue state after clearing the pipe ring as each pipe ring descriptor has a release function, and in the case of a notification message, this is watch_queue_pipe_buf_release() which tries to mark the allocation bitmap that was previously released. Fix this by moving the put of the pipe's ref on the watch queue to after the ring has been cleared. We still need to call watch_queue_clear() before doing that to make sure that the pipe is disconnected from any notification sources first. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16watch_queue: Fix filter limit checkDavid Howells2-3/+4
commit c993ee0f9f81caf5767a50d1faeba39a0dc82af2 upstream. In watch_queue_set_filter(), there are a couple of places where we check that the filter type value does not exceed what the type_filter bitmap can hold. One place calculates the number of bits by: if (tf[i].type >= sizeof(wfilter->type_filter) * 8) which is fine, but the second does: if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG) which is not. This can lead to a couple of out-of-bounds writes due to a too-large type: (1) __set_bit() on wfilter->type_filter (2) Writing more elements in wfilter->filters[] than we allocated. Fix this by just using the proper WATCH_TYPE__NR instead, which is the number of types we actually know about. The bug may cause an oops looking something like: BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740 Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611 ... Call Trace: <TASK> dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x1f/0x150 ... kasan_report.cold+0x7f/0x11b ... watch_queue_set_filter+0x659/0x740 ... __x64_sys_ioctl+0x127/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Allocated by task 611: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 watch_queue_set_filter+0x23a/0x740 __x64_sys_ioctl+0x127/0x190 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88800d2c66a0 which belongs to the cache kmalloc-32 of size 32 The buggy address is located 28 bytes inside of 32-byte region [ffff88800d2c66a0, ffff88800d2c66c0) Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16ARM: fix Thumb2 regression with Spectre BHBRussell King (Oracle)1-2/+2
commit 6c7cb60bff7aec24b834343ff433125f469886a3 upstream. When building for Thumb2, the vectors make use of a local label. Sadly, the Spectre BHB code also uses a local label with the same number which results in the Thumb2 reference pointing at the wrong place. Fix this by changing the number used for the Spectre BHB local label. Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLEDima Chumak1-3/+0
commit 39bab83b119faac4bf7f07173a42ed35be95147e upstream. Only prio 1 is supported for nic mode when there is no ignore flow level support in firmware. But for switchdev mode, which supports fixed number of statically pre-allocated prios, this restriction is not relevant so it can be relaxed. Fixes: d671e109bd85 ("net/mlx5: Fix tc max supported prio for nic mode") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16virtio: acknowledge all features before accessMichael S. Tsirkin2-18/+24
commit 4fa59ede95195f267101a1b8916992cf3f245cdb upstream. The feature negotiation was designed in a way that makes it possible for devices to know which config fields will be accessed by drivers. This is broken since commit 404123c2db79 ("virtio: allow drivers to validate features") with fallout in at least block and net. We have a partial work-around in commit 2f9a174f918e ("virtio: write back F_VERSION_1 before validate") which at least lets devices find out which format should config space have, but this is a partial fix: guests should not access config space without acknowledging features since otherwise we'll never be able to change the config space format. To fix, split finalize_features from virtio_finalize_features and call finalize_features with all feature bits before validation, and then - if validation changed any bits - once again after. Since virtio_finalize_features no longer writes out features rename it to virtio_features_ok - since that is what it does: checks that features are ok with the device. As a side effect, this also reduces the amount of hypervisor accesses - we now only acknowledge features once unless we are clearing any features when validating (which is uncommon). IRC I think that this was more or less always the intent in the spec but unfortunately the way the spec is worded does not say this explicitly, I plan to address this at the spec level, too. Acked-by: Jason Wang <jasowang@redhat.com> Cc: stable@vger.kernel.org Fixes: 404123c2db79 ("virtio: allow drivers to validate features") Fixes: 2f9a174f918e ("virtio: write back F_VERSION_1 before validate") Cc: "Halil Pasic" <pasic@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16virtio: unexport virtio_finalize_featuresMichael S. Tsirkin2-3/+1
commit 838d6d3461db0fdbf33fc5f8a69c27b50b4a46da upstream. virtio_finalize_features is only used internally within virtio. No reason to export it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16KVM: x86/mmu: kvm_faultin_pfn has to return false if pfh is returnedAndrei Vagin1-0/+1
commit a7cc099f2ec3117678adeb69749bef7e9dde3148 upstream. This looks like a typo in 8f32d5e563cb. This change didn't intend to do any functional changes. The problem was caught by gVisor tests. Fixes: 8f32d5e563cb ("KVM: x86/mmu: allow kvm_faultin_pfn to return page fault handling code") Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrei Vagin <avagin@gmail.com> Message-Id: <20211015163221.472508-1-avagin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16swiotlb: rework "fix info leak with DMA_FROM_DEVICE"Halil Pasic3-24/+15
commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13 upstream. Unfortunately, we ended up merging an old version of the patch "fix info leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph (the swiotlb maintainer), he asked me to create an incremental fix (after I have pointed this out the mix up, and asked him for guidance). So here we go. The main differences between what we got and what was agreed are: * swiotlb_sync_single_for_device is also required to do an extra bounce * We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters * The implantation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE must take precedence over DMA_ATTR_SKIP_CPU_SYNC Thus this patch removes DMA_ATTR_OVERWRITE, and makes swiotlb_sync_single_for_device() bounce unconditionally (that is, also when dir == DMA_TO_DEVICE) in order do avoid synchronising back stale data from the swiotlb buffer. Let me note, that if the size used with dma_sync_* API is less than the size used with dma_[un]map_*, under certain circumstances we may still end up with swiotlb not being transparent. In that sense, this is no perfect fix either. To get this bullet proof, we would have to bounce the entire mapping/bounce buffer. For that we would have to figure out the starting address, and the size of the mapping in swiotlb_sync_single_for_device(). While this does seem possible, there seems to be no firm consensus on how things are supposed to work. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16arm64: kasan: fix include error in MTE functionsPaul Semel1-0/+1
commit b859ebedd1e730bbda69142fca87af4e712649a1 upstream. Fix `error: expected string literal in 'asm'`. This happens when compiling an ebpf object file that includes `net/net_namespace.h` from linux kernel headers. Include trace: include/net/net_namespace.h:10 include/linux/workqueue.h:9 include/linux/timer.h:8 include/linux/debugobjects.h:6 include/linux/spinlock.h:90 include/linux/workqueue.h:9 arch/arm64/include/asm/spinlock.h:9 arch/arm64/include/generated/asm/qrwlock.h:1 include/asm-generic/qrwlock.h:14 arch/arm64/include/asm/processor.h:33 arch/arm64/include/asm/kasan.h:9 arch/arm64/include/asm/mte-kasan.h:45 arch/arm64/include/asm/mte-def.h:14 Signed-off-by: Paul Semel <paul.semel@datadoghq.com> Fixes: 2cb34276427a ("arm64: kasan: simplify and inline MTE functions") Cc: <stable@vger.kernel.org> # 5.12.x Link: https://lore.kernel.org/r/bacb5387-2992-97e4-0c48-1ed925905bee@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16arm64: Ensure execute-only permissions are not allowed without EPANCatalin Marinas4-17/+19
commit 6e2edd6371a497a6350bb735534c9bda2a31f43d upstream. Commit 18107f8a2df6 ("arm64: Support execute-only permissions with Enhanced PAN") re-introduced execute-only permissions when EPAN is available. When EPAN is not available, arch_filter_pgprot() is supposed to change a PAGE_EXECONLY permission into PAGE_READONLY_EXEC. However, if BTI or MTE are present, such check does not detect the execute-only pgprot in the presence of PTE_GP (BTI) or MT_NORMAL_TAGGED (MTE), allowing the user to request PROT_EXEC with PROT_BTI or PROT_MTE. Remove the arch_filter_pgprot() function, change the default VM_EXEC permissions to PAGE_READONLY_EXEC and update the protection_map[] array at core_initcall() if EPAN is detected. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 18107f8a2df6 ("arm64: Support execute-only permissions with Enhanced PAN") Cc: <stable@vger.kernel.org> # 5.13.x Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0Pali Rohár2-2/+7
commit a1cc1697bb56cdf880ad4d17b79a39ef2c294bc9 upstream. Legacy and old PCI I/O based cards do not support 32-bit I/O addressing. Since commit 64f160e19e92 ("PCI: aardvark: Configure PCIe resources from 'ranges' DT property") kernel can set different PCIe address on CPU and different on the bus for the one A37xx address mapping without any firmware support in case the bus address does not conflict with other A37xx mapping. So remap I/O space to the bus address 0x0 to enable support for old legacy I/O port based cards which have hardcoded I/O ports in low address space. Note that DDR on A37xx is mapped to bus address 0x0. And mapping of I/O space can be set to address 0x0 too because MEM space and I/O space are separate and so do not conflict. Remapping IO space on Turris Mox to different address is not possible to due bootloader bug. Signed-off-by: Pali Rohár <pali@kernel.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Fixes: 76f6386b25cc ("arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700") Cc: stable@vger.kernel.org # 64f160e19e92 ("PCI: aardvark: Configure PCIe resources from 'ranges' DT property") Cc: stable@vger.kernel.org # 514ef1e62d65 ("arm64: dts: marvell: armada-37xx: Extend PCIe MEM space") Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16tracing/osnoise: Force quiescent states while tracingNicolas Saenz Julienne1-0/+20
commit caf4c86bf136845982c5103b2661751b40c474c0 upstream. At the moment running osnoise on a nohz_full CPU or uncontested FIFO priority and a PREEMPT_RCU kernel might have the side effect of extending grace periods too much. This will entice RCU to force a context switch on the wayward CPU to end the grace period, all while introducing unwarranted noise into the tracer. This behaviour is unavoidable as overly extending grace periods might exhaust the system's memory. This same exact problem is what extended quiescent states (EQS) were created for, conversely, rcu_momentary_dyntick_idle() emulates them by performing a zero duration EQS. So let's make use of it. In the common case rcu_momentary_dyntick_idle() is fairly inexpensive: atomically incrementing a local per-CPU counter and doing a store. So it shouldn't affect osnoise's measurements (which has a 1us granularity), so we'll call it unanimously. The uncommon case involve calling rcu_momentary_dyntick_idle() after having the osnoise process: - Receive an expedited quiescent state IPI with preemption disabled or during an RCU critical section. (activates rdp->cpu_no_qs.b.exp code-path). - Being preempted within in an RCU critical section and having the subsequent outermost rcu_read_unlock() called with interrupts disabled. (t->rcu_read_unlock_special.b.blocked code-path). Neither of those are possible at the moment, and are unlikely to be in the future given the osnoise's loop design. On top of this, the noise generated by the situations described above is unavoidable, and if not exposed by rcu_momentary_dyntick_idle() will be eventually seen in subsequent rcu_read_unlock() calls or schedule operations. Link: https://lkml.kernel.org/r/20220307180740.577607-1-nsaenzju@redhat.com Cc: stable@vger.kernel.org Fixes: bce29ac9ce0b ("trace: Add osnoise tracer") Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16riscv: Fix auipc+jalr relocation range checksEmil Renner Berthing1-5/+16
commit 0966d385830de3470b7131db8e86c0c5bc9c52dc upstream. RISC-V can do PC-relative jumps with a 32bit range using the following two instructions: auipc t0, imm20 ; t0 = PC + imm20 * 2^12 jalr ra, t0, imm12 ; ra = PC + 4, PC = t0 + imm12 Crucially both the 20bit immediate imm20 and the 12bit immediate imm12 are treated as two's-complement signed values. For this reason the immediates are usually calculated like this: imm20 = (offset + 0x800) >> 12 imm12 = offset & 0xfff ..where offset is the signed offset from the auipc instruction. When the 11th bit of offset is 0 the addition of 0x800 doesn't change the top 20 bits and imm12 considered positive. When the 11th bit is 1 the carry of the addition by 0x800 means imm20 is one higher, but since imm12 is then considered negative the two's complement representation means it all cancels out nicely. However, this addition by 0x800 (2^11) means an offset greater than or equal to 2^31 - 2^11 would overflow so imm20 is considered negative and result in a backwards jump. Similarly the lower range of offset is also moved down by 2^11 and hence the true 32bit range is [-2^31 - 2^11, 2^31 - 2^11) Signed-off-by: Emil Renner Berthing <kernel@esmil.dk> Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16mmc: meson: Fix usage of meson_mmc_post_req()Rong Chen1-7/+8
commit f0d2f15362f02444c5d7ffd5a5eb03e4aa54b685 upstream. Currently meson_mmc_post_req() is called in meson_mmc_request() right after meson_mmc_start_cmd(). This could lead to DMA unmapping before the request is actually finished. To fix, don't call meson_mmc_post_req() until meson_mmc_request_done(). Signed-off-by: Rong Chen <rong.chen@amlogic.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Fixes: 79ed05e329c3 ("mmc: meson-gx: add support for descriptor chain mode") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220216124239.4007667-1-rong.chen@amlogic.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16riscv: alternative only works on !XIP_KERNELJisheng Zhang2-2/+3
commit c80ee64a8020ef1a6a92109798080786829b8994 upstream. The alternative mechanism needs runtime code patching, it can't work on XIP_KERNEL. And the errata workarounds are implemented via the alternative mechanism. So add !XIP_KERNEL dependency for alternative and erratas. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Fixes: 44c922572952 ("RISC-V: enable XIP") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16net: macb: Fix lost RX packet wakeup race in NAPI receiveRobert Hancock1-1/+24
commit 0bf476fc3624e3a72af4ba7340d430a91c18cd67 upstream. There is an oddity in the way the RSR register flags propagate to the ISR register (and the actual interrupt output) on this hardware: it appears that RSR register bits only result in ISR being asserted if the interrupt was actually enabled at the time, so enabling interrupts with RSR bits already set doesn't trigger an interrupt to be raised. There was already a partial fix for this race in the macb_poll function where it checked for RSR bits being set and re-triggered NAPI receive. However, there was a still a race window between checking RSR and actually enabling interrupts, where a lost wakeup could happen. It's necessary to check again after enabling interrupts to see if RSR was set just prior to the interrupt being enabled, and re-trigger receive in that case. This issue was noticed in a point-to-point UDP request-response protocol which periodically saw timeouts or abnormally high response times due to received packets not being processed in a timely fashion. In many applications, more packets arriving, including TCP retransmissions, would cause the original packet to be processed, thus masking the issue. Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done") Cc: stable@vger.kernel.org Co-developed-by: Scott McNutt <scott.mcnutt@siriusxm.com> Signed-off-by: Scott McNutt <scott.mcnutt@siriusxm.com> Signed-off-by: Robert Hancock <robert.hancock@calian.com> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16staging: gdm724x: fix use after free in gdm_lte_rx()Dan Carpenter1-2/+3
commit fc7f750dc9d102c1ed7bbe4591f991e770c99033 upstream. The netif_rx_ni() function frees the skb so we can't dereference it to save the skb->len. Fixes: 61e121047645 ("staging: gdm7240: adding LTE USB driver") Cc: stable <stable@vger.kernel.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220228074331.GA13685@kili Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16staging: rtl8723bs: Fix access-point mode deadlockHans de Goede5-24/+33
commit 8f4347081be32e67b0873827e0138ab0fdaaf450 upstream. Commit 54659ca026e5 ("staging: rtl8723bs: remove possible deadlock when disconnect (v2)") split the locking of pxmitpriv->lock vs sleep_q/lock into 2 locks in attempt to fix a lockdep reported issue with the locking order of the sta_hash_lock vs pxmitpriv->lock. But in the end this turned out to not fully solve the sta_hash_lock issue so commit a7ac783c338b ("staging: rtl8723bs: remove a second possible deadlock") was added to fix this in another way. The original fix was kept as it was still seen as a good thing to have, but now it turns out that it creates a deadlock in access-point mode: [Feb20 23:47] ====================================================== [ +0.074085] WARNING: possible circular locking dependency detected [ +0.074077] 5.16.0-1-amd64 #1 Tainted: G C E [ +0.064710] ------------------------------------------------------ [ +0.074075] ksoftirqd/3/29 is trying to acquire lock: [ +0.060542] ffffb8b30062ab00 (&pxmitpriv->lock){+.-.}-{2:2}, at: rtw_xmit_classifier+0x8a/0x140 [r8723bs] [ +0.114921] but task is already holding lock: [ +0.069908] ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs] [ +0.116976] which lock already depends on the new lock. [ +0.098037] the existing dependency chain (in reverse order) is: [ +0.089704] -> #1 (&psta->sleep_q.lock){+.-.}-{2:2}: [ +0.077232] _raw_spin_lock_bh+0x34/0x40 [ +0.053261] xmitframe_enqueue_for_sleeping_sta+0xc1/0x2f0 [r8723bs] [ +0.082572] rtw_xmit+0x58b/0x940 [r8723bs] [ +0.056528] _rtw_xmit_entry+0xba/0x350 [r8723bs] [ +0.062755] dev_hard_start_xmit+0xf1/0x320 [ +0.056381] sch_direct_xmit+0x9e/0x360 [ +0.052212] __dev_queue_xmit+0xce4/0x1080 [ +0.055334] ip6_finish_output2+0x18f/0x6e0 [ +0.056378] ndisc_send_skb+0x2c8/0x870 [ +0.052209] ndisc_send_ns+0xd3/0x210 [ +0.050130] addrconf_dad_work+0x3df/0x5a0 [ +0.055338] process_one_work+0x274/0x5a0 [ +0.054296] worker_thread+0x52/0x3b0 [ +0.050124] kthread+0x16c/0x1a0 [ +0.044925] ret_from_fork+0x1f/0x30 [ +0.049092] -> #0 (&pxmitpriv->lock){+.-.}-{2:2}: [ +0.074101] __lock_acquire+0x10f5/0x1d80 [ +0.054298] lock_acquire+0xd7/0x300 [ +0.049088] _raw_spin_lock_bh+0x34/0x40 [ +0.053248] rtw_xmit_classifier+0x8a/0x140 [r8723bs] [ +0.066949] rtw_xmitframe_enqueue+0xa/0x20 [r8723bs] [ +0.066946] rtl8723bs_hal_xmitframe_enqueue+0x14/0x50 [r8723bs] [ +0.078386] wakeup_sta_to_xmit+0xa6/0x300 [r8723bs] [ +0.065903] rtw_recv_entry+0xe36/0x1160 [r8723bs] [ +0.063809] rtl8723bs_recv_tasklet+0x349/0x6c0 [r8723bs] [ +0.071093] tasklet_action_common.constprop.0+0xe5/0x110 [ +0.070966] __do_softirq+0x16f/0x50a [ +0.050134] __irq_exit_rcu+0xeb/0x140 [ +0.051172] irq_exit_rcu+0xa/0x20 [ +0.047006] common_interrupt+0xb8/0xd0 [ +0.052214] asm_common_interrupt+0x1e/0x40 [ +0.056381] finish_task_switch.isra.0+0x100/0x3a0 [ +0.063670] __schedule+0x3ad/0xd20 [ +0.048047] schedule+0x4e/0xc0 [ +0.043880] smpboot_thread_fn+0xc4/0x220 [ +0.054298] kthread+0x16c/0x1a0 [ +0.044922] ret_from_fork+0x1f/0x30 [ +0.049088] other info that might help us debug this: [ +0.095950] Possible unsafe locking scenario: [ +0.070952] CPU0 CPU1 [ +0.054282] ---- ---- [ +0.054285] lock(&psta->sleep_q.lock); [ +0.047004] lock(&pxmitpriv->lock); [ +0.074082] lock(&psta->sleep_q.lock); [ +0.077209] lock(&pxmitpriv->lock); [ +0.043873] *** DEADLOCK *** [ +0.070950] 1 lock held by ksoftirqd/3/29: [ +0.049082] #0: ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs] Analysis shows that in hindsight the splitting of the lock was not a good idea, so revert this to fix the access-point mode deadlock. Note this is a straight-forward revert done with git revert, the commented out "/* spin_lock_bh(&psta_bmc->sleep_q.lock); */" lines were part of the code before the reverted changes. Fixes: 54659ca026e5 ("staging: rtl8723bs: remove possible deadlock when disconnect (v2)") Cc: stable <stable@vger.kernel.org> Cc: Fabio Aiuto <fabioaiuto83@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215542 Link: https://lore.kernel.org/r/20220302101637.26542-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16fuse: fix pipe buffer lifetime for direct_ioMiklos Szeredi3-1/+13
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16fuse: fix fileattr op failureMiklos Szeredi1-3/+6
commit a679a61520d8a7b0211a1da990404daf5cc80b72 upstream. The fileattr API conversion broke lsattr on ntfs3g. Previously the ioctl(... FS_IOC_GETFLAGS) returned an EINVAL error, but after the conversion the error returned by the fuse filesystem was not propagated back to the ioctl() system call, resulting in success being returned with bogus values. Fix by checking for outarg.result in fuse_priv_ioctl(), just as generic ioctl code does. Reported-by: Jean-Pierre André <jean-pierre.andre@wanadoo.fr> Fixes: 72227eac177d ("fuse: convert to fileattr") Cc: <stable@vger.kernel.org> # v5.13 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16ARM: Spectre-BHB: provide empty stub for non-configRandy Dunlap1-0/+6
commit 68453767131a5deec1e8f9ac92a9042f929e585d upstream. When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references to spectre_v2_update_state() cause a build error, so provide an empty stub for that function when the Kconfig option is not set. Fixes this build error: arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init': proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state' arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state' Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: patches@armlinux.org.uk Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16selftests/memfd: clean up mapping in mfd_fail_writeMike Kravetz1-0/+1
[ Upstream commit fda153c89af344d21df281009a9d046cf587ea0f ] Running the memfd script ./run_hugetlbfs_test.sh will often end in error as follows: memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE memfd-hugetlb: SEAL-SHRINK fallocate(ALLOC) failed: No space left on device ./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs opening: ./mnt/memfd fuse: DONE If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE test the mfd_fail_write routine maps the file, but does not unmap. As a result, two hugetlb pages remain reserved for the mapping. When the fallocate call in the SEAL-SHRINK test attempts allocate all hugetlb pages, it is short by the two reserved pages. Fix by making sure to unmap in mfd_fail_write. Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16selftest/vm: fix map_fixed_noreplace test failureAneesh Kumar K.V1-12/+37
[ Upstream commit f39c58008dee7ab5fc94c3f1995a21e886801df0 ] On the latest RHEL the test fails due to executable mapped at 256MB address # ./map_fixed_noreplace mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists 10000000-10010000 r-xp 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10010000-10020000 r--p 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10020000-10030000 rw-p 00010000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10029b90000-10029bc0000 rw-p 00000000 00:00 0 [heap] 7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0 [vvar] 7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0 [vdso] 7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514 /usr/lib64/ld64.so.2 7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0 [stack] Error: couldn't map the space we need for the test Fix this by finding a free address using mmap instead of hardcoding BASE_ADDRESS. Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Jann Horn <jannh@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16tracing/osnoise: Make osnoise_main to sleep for microsecondsDaniel Bristot de Oliveira1-21/+32
[ Upstream commit dd990352f01ee9a6c6eee152e5d11c021caccfe4 ] osnoise's runtime and period are in the microseconds scale, but it is currently sleeping in the millisecond's scale. This behavior roots in the usage of hwlat as the skeleton for osnoise. Make osnoise to sleep in the microseconds scale. Also, move the sleep to a specialized function. Link: https://lkml.kernel.org/r/302aa6c7bdf2d131719b22901905e9da122a11b2.1645197336.git.bristot@kernel.org Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16tracing: Ensure trace buffer is at least 4096 bytes largeSven Schnelle1-4/+6
[ Upstream commit 7acf3a127bb7c65ff39099afd78960e77b2ca5de ] Booting the kernel with 'trace_buf_size=1' give a warning at boot during the ftrace selftests: [ 0.892809] Running postponed tracer tests: [ 0.892893] Testing tracer function: [ 0.901899] Callback from call_rcu_tasks_trace() invoked. [ 0.983829] Callback from call_rcu_tasks_rude() invoked. [ 1.072003] .. bad ring buffer .. corrupted trace buffer .. [ 1.091944] Callback from call_rcu_tasks() invoked. [ 1.097695] PASSED [ 1.097701] Testing dynamic ftrace: .. filter failed count=0 ..FAILED! [ 1.353474] ------------[ cut here ]------------ [ 1.353478] WARNING: CPU: 0 PID: 1 at kernel/trace/trace.c:1951 run_tracer_selftest+0x13c/0x1b0 Therefore enforce a minimum of 4096 bytes to make the selftest pass. Link: https://lkml.kernel.org/r/20220214134456.1751749-1-svens@linux.ibm.com Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16ipv6: prevent a possible race condition with lifetimesNiels Dossche1-0/+2
[ Upstream commit 6c0d8833a605e195ae219b5042577ce52bf71fff ] valid_lft, prefered_lft and tstamp are always accessed under the lock "lock" in other places. Reading these without taking the lock may result in inconsistencies regarding the calculation of the valid and preferred variables since decisions are taken on these fields for those variables. Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Niels Dossche <niels.dossche@ugent.be> Link: https://lore.kernel.org/r/20220223131954.6570-1-niels.dossche@ugent.be Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16Revert "xen-netback: Check for hotplug-status existence before watching"Marek Marczykowski-Górecki1-8/+4
[ Upstream commit e8240addd0a3919e0fd7436416afe9aa6429c484 ] This reverts commit 2afeec08ab5c86ae21952151f726bfe184f6b23d. The reasoning in the commit was wrong - the code expected to setup the watch even if 'hotplug-status' didn't exist. In fact, it relied on the watch being fired the first time - to check if maybe 'hotplug-status' is already set to 'connected'. Not registering a watch for non-existing path (which is the case if hotplug script hasn't been executed yet), made the backend not waiting for the hotplug script to execute. This in turns, made the netfront think the interface is fully operational, while in fact it was not (the vif interface on xen-netback side might not be configured yet). This was a workaround for 'hotplug-status' erroneously being removed. But since that is reverted now, the workaround is not necessary either. More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Michael Brown <mbrown@fensystems.co.uk> Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingslab.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"Marek Marczykowski-Górecki1-1/+1
[ Upstream commit 0f4558ae91870692ce7f509c31c9d6ee721d8cdc ] This reverts commit 1f2565780e9b7218cf92c7630130e82dcc0fe9c2. The 'hotplug-status' node should not be removed as long as the vif device remains configured. Otherwise the xen-netback would wait for re-running the network script even if it was already called (in case of the frontent re-connecting). But also, it _should_ be removed when the vif device is destroyed (for example when unbinding the driver) - otherwise hotplug script would not configure the device whenever it re-appear. Moving removal of the 'hotplug-status' node was a workaround for nothing calling network script after xen-netback module is reloaded. But when vif interface is re-created (on xen-netback unbind/bind for example), the script should be called, regardless of who does that - currently this case is not handled by the toolstack, and requires manual script call. Keeping hotplug-status=connected to skip the call is wrong and leads to not configured interface. More discussion at https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingslab.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16drm/amdgpu: bypass tiling flag check in virtual display case (v2)Guchun Chen1-1/+1
[ Upstream commit e2b993302f40c4eb714ecf896dd9e1c5be7d4cd7 ] vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16gpio: Return EPROBE_DEFER if gc->to_irq is NULLShreeya Patel1-0/+10
[ Upstream commit ae42f9288846353982e2eab181fb41e7fd8bf60f ] We are racing the registering of .to_irq when probing the i2c driver. This results in random failure of touchscreen devices. Following explains the race condition better. [gpio driver] gpio driver registers gpio chip [gpio consumer] gpio is acquired [gpio consumer] gpiod_to_irq() fails with -ENXIO [gpio driver] gpio driver registers irqchip gpiod_to_irq works at this point, but -ENXIO is fatal We could see the following errors in dmesg logs when gc->to_irq is NULL [2.101857] i2c_hid i2c-FTS3528:00: HID over i2c has not been provided an Int IRQ [2.101953] i2c_hid: probe of i2c-FTS3528:00 failed with error -22 To avoid this situation, defer probing until to_irq is registered. Returning -EPROBE_DEFER would be the first step towards avoiding the failure of devices due to the race in registration of .to_irq. Final solution to this issue would be to avoid using gc irq members until they are fully initialized. This issue has been reported many times in past and people have been using workarounds like changing the pinctrl_amd to built-in instead of loading it as a module or by adding a softdep for pinctrl_amd into the config file. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209413 Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16PCI: Mark all AMD Navi10 and Navi14 GPU ATS as brokenAlex Deucher1-5/+9
[ Upstream commit 3f1271b54edcc692da5a3663f2aa2a64781f9bc3 ] There are enough VBIOS escapes without the proper workaround that some users still hit this. Microsoft never productized ATS on Windows so OEM platforms that were Windows-only didn't always validate ATS. The advantages of ATS are not worth it compared to the potential instabilities on harvested boards. Disable ATS on all Navi10 and Navi14 boards. Symptoms include: amdgpu 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000] AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=6047, emitted seq=6049 amdgpu 0000:07:00.0: amdgpu: GPU reset begin! amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110 amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed Related commits: e8946a53e2a6 ("PCI: Mark AMD Navi14 GPU ATS as broken") a2da5d8cc0b0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms") 45beb31d3afb ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken") 5e89cd303e3a ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken") d28ca864c493 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken") 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken") [bhelgaas: add symptoms and related commits] Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1760 Link: https://lore.kernel.org/r/20220222160801.841643-1-alexander.deucher@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-16hwmon: (pmbus) Clear pmbus fault/warning bits after readVikash Chandola1-0/+5
[ Upstream commit 35f165f08950a876f1b95a61d79c93678fba2fd6 ] Almost all fault/warning bits in pmbus status registers remain set even after fault/warning condition are removed. As per pmbus specification these faults must be cleared by user. Modify hwmon behavior to clear fault/warning bit after fetching data if fault/warning bit was set. This allows to get fresh data in next read. Signed-off-by: Vikash Chandola <vikash.chandola@linux.intel.com> Link: https://lore.kernel.org/r/20220222131253.2426834-1-vikash.chandola@linux.intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>