summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
4 daysriscv: pgtable: Cleanup useless VA_USER_XXX definitionsGuo Ren (Alibaba DAMO Academy)1-4/+0
[ Upstream commit 5e5be092ffadcab0093464ccd9e30f0c5cce16b9 ] These marcos are not used after commit b5b4287accd7 ("riscv: mm: Use hint address in mmap if available"). Cleanup VA_USER_XXX definitions in asm/pgtable.h. Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: dts: mba8mx: Fix Ethernet PHY IRQ supportAlexander Stein1-1/+1
[ Upstream commit 89e87d0dc87eb3654c9ae01afc4a18c1c6d1e523 ] Ethernet PHY interrupt mode is level triggered. Adjust the mode accordingly. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Fixes: 70cf622bb16e ("arm64: dts: mba8mx: Add Ethernet PHY IRQ support") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: dts: imx8qm-ss-dma: correct the dma channels of lpuartSherry Sun1-4/+4
[ Upstream commit a988caeed9d918452aa0a68de2c6e94d86aa43ba ] The commit 616effc0272b5 ("arm64: dts: imx8: Fix lpuart DMA channel order") swap uart rx and tx channel at common imx8-ss-dma.dtsi. But miss update imx8qm-ss-dma.dtsi. The commit 5a8e9b022e569 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") just simple add dma-names as binding doc requirement. Correct lpuart0 - lpuart3 dma rx and tx channels, and use defines for the FSL_EDMA_RX flag. Fixes: 5a8e9b022e56 ("arm64: dts: imx8qm-ss-dma: Pass lpuart dma-names") Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics ↵Marek Vasut1-0/+1
i.MX8M Plus DHCOM [ Upstream commit c63749a7ddc59ac6ec0b05abfa0a21af9f2c1d38 ] Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link bouncing caused by interruptions on the PHY reference clock, by letting the PHY driver manage the reference clock and assure there are no interruptions. This follows the matching PHY driver recommendation described in commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support") Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Tested-by: Christoph Niedermaier <cniedermaier@dh-electronics.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysARM: dts: imx6q-ba16: fix RTC interrupt levelIan Ray1-1/+1
[ Upstream commit e6a4eedd49ce27c16a80506c66a04707e0ee0116 ] RTC interrupt level should be set to "LOW". This was revealed by the introduction of commit: f181987ef477 ("rtc: m41t80: use IRQ flags obtained from fwnode") which changed the way IRQ type is obtained. Fixes: 56c27310c1b4 ("ARM: dts: imx: Add Advantech BA-16 Qseven module") Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: dts: add off-on-delay-us for usdhc2 regulatorHaibo Chen1-0/+1
[ Upstream commit ca643894a37a25713029b36cfe7d1bae515cac08 ] For SD card, according to the spec requirement, for sd card power reset operation, it need sd card supply voltage to be lower than 0.5v and keep over 1ms, otherwise, next time power back the sd card supply voltage to 3.3v, sd card can't support SD3.0 mode again. To match such requirement on imx8qm-mek board, add 4.8ms delay between sd power off and power on. Fixes: 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support") Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: dts: ti: k3-am62-lp-sk-nand: Rename pinctrls to fix schema warningsWadim Egorov1-1/+1
[ Upstream commit cf5e8adebe77917a4cc95e43e461cdbd857591ce ] Rename pinctrl nodes to comply with naming conventions required by pinctrl-single schema. Fixes: e569152274fec ("arm64: dts: ti: am62-lp-sk: Add overlay for NAND expansion card") Signed-off-by: Wadim Egorov <w.egorov@phytec.de> Link: https://patch.msgid.link/20251127122733.2523367-3-w.egorov@phytec.de Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysalpha: don't reference obsolete termio struct for TC* constantsSam James1-4/+4
[ Upstream commit 9aeed9041929812a10a6d693af050846942a1d16 ] Similar in nature to ab107276607af90b13a5994997e19b7b9731e251. glibc-2.42 drops the legacy termio struct, but the ioctls.h header still defines some TC* constants in terms of termio (via sizeof). Hardcode the values instead. This fixes building Python for example, which falls over like: ./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio' Link: https://bugs.gentoo.org/961769 Link: https://bugs.gentoo.org/962600 Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/6ebd3451908785cad53b50ca6bc46cfe9d6bc03c.1764922497.git.sam@gentoo.org Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysARM: 9461/1: Disable HIGHPTE on PREEMPT_RT kernelsSebastian Andrzej Siewior1-1/+1
[ Upstream commit fedadc4137234c3d00c4785eeed3e747fe9036ae ] gup_pgd_range() is invoked with disabled interrupts and invokes __kmap_local_page_prot() via pte_offset_map(), gup_p4d_range(). With HIGHPTE enabled, __kmap_local_page_prot() invokes kmap_high_get() which uses a spinlock_t via lock_kmap_any(). This leads to an sleeping-while-atomic error on PREEMPT_RT because spinlock_t becomes a sleeping lock and must not be acquired in atomic context. The loop in map_new_virtual() uses wait_queue_head_t for wake up which also is using a spinlock_t. Since HIGHPTE is rarely needed at all, turn it off for PREEMPT_RT to allow the use of get_user_pages_fast(). [arnd: rework patch to turn off HIGHPTE instead of HAVE_PAST_GUP] Co-developed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 dayscsky: fix csky_cmpxchg_fixup not workingYang Li1-2/+2
[ Upstream commit 809ef03d6d21d5fea016bbf6babeec462e37e68c ] In the csky_cmpxchg_fixup function, it is incorrect to use the global variable csky_cmpxchg_stw to determine the address where the exception occurred.The global variable csky_cmpxchg_stw stores the opcode at the time of the exception, while &csky_cmpxchg_stw shows the address where the exception occurred. Signed-off-by: Yang Li <yang.li85200@gmail.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysarm64: Fix cleared E0POE bit after cpu_suspend()/resume()Yeoreum Yun2-1/+9
commit bdf3f4176092df5281877cacf42f843063b4784d upstream. TCR2_ELx.E0POE is set during smp_init(). However, this bit is not reprogrammed when the CPU enters suspension and later resumes via cpu_resume(), as __cpu_setup() does not re-enable E0POE and there is no save/restore logic for the TCR2_ELx system register. As a result, the E0POE feature no longer works after cpu_resume(). To address this, save and restore TCR2_EL1 in the cpu_suspend()/cpu_resume() path, rather than adding related logic to __cpu_setup(), taking into account possible future extensions of the TCR2_ELx feature. Fixes: bf83dae90fbc ("arm64: enable the Permission Overlay Extension for EL0") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pagesDavid Hildenbrand1-0/+1
[ Upstream commit 0da2ba35c0d532ca0fe7af698b17d74c4d084b9a ] Let's properly adjust BALLOON_MIGRATE like the other drivers. Note that the INFLATE/DEFLATE events are triggered from the core when enqueueing/dequeueing pages. This was found by code inspection. Link: https://lkml.kernel.org/r/20251021100606.148294-3-david@redhat.com Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysmm/balloon_compaction: convert balloon_page_delete() to balloon_page_finalize()David Hildenbrand1-1/+1
[ Upstream commit 15504b1163007bbfbd9a63460d5c14737c16e96d ] Let's move the removal of the page from the balloon list into the single caller, to remove the dependency on the PG_isolated flag and clarify locking requirements. Note that for now, balloon_page_delete() was used on two paths: (1) Removing a page from the balloon for deflation through balloon_page_list_dequeue() (2) Removing an isolated page from the balloon for migration in the per-driver migration handlers. Isolated pages were already removed from the balloon list during isolation. So instead of relying on the flag, we can just distinguish both cases directly and handle it accordingly in the caller. We'll shuffle the operations a bit such that they logically make more sense (e.g., remove from the list before clearing flags). In balloon migration functions we can now move the balloon_page_finalize() out of the balloon lock and perform the finalization just before dropping the balloon reference. Document that the page lock is currently required when modifying the movability aspects of a page; hopefully we can soon decouple this from the page lock. Link: https://lkml.kernel.org/r/20250704102524.326966-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brendan Jackman <jackmanb@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Brauner <brauner@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eugenio Pé rez <eperezma@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Gregory Price <gourry@gourry.net> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 0da2ba35c0d5 ("powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysx86/microcode/AMD: Select which microcode patch to loadBorislav Petkov (AMD)1-37/+67
commit 8d171045069c804e5ffaa18be590c42c6af0cf3f upstream. All microcode patches up to the proper BIOS Entrysign fix are loaded only after the sha256 signature carried in the driver has been verified. Microcode patches after the Entrysign fix has been applied, do not need that signature verification anymore. In order to not abandon machines which haven't received the BIOS update yet, add the capability to select which microcode patch to load. The corresponding microcode container supplied through firmware-linux has been modified to carry two patches per CPU type (family/model/stepping) so that the proper one gets selected. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Waiman Long <longman@redhat.com> Link: https://patch.msgid.link/20251027133818.4363-1-bp@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysARM: dts: microchip: sama7g5: fix uart fifo size to 32Nicolas Ferre1-2/+2
[ Upstream commit 5654889a94b0de5ad6ceae3793e7f5e0b61b50b6 ] On some flexcom nodes related to uart, the fifo sizes were wrong: fix them to 32 data. Fixes: 7540629e2fc7 ("ARM: dts: at91: add sama7g5 SoC DT and sama7g5-ek") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20251114103313.20220-2-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayslib/crypto: riscv/chacha: Avoid s0/fp registerVivian Wang1-3/+2
commit 43169328c7b4623b54b7713ec68479cebda5465f upstream. In chacha_zvkb, avoid using the s0 register, which is the frame pointer, by reallocating KEY0 to t5. This makes stack traces available if e.g. a crash happens in chacha_zvkb. No frame pointer maintenance is otherwise required since this is a leaf function. Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: BPF: Sign extend kfunc call argumentsHengqi Chen2-0/+42
commit 3f5a238f24d7b75f9efe324d3539ad388f58536e upstream. The kfunc calls are native calls so they should follow LoongArch calling conventions. Sign extend its arguments properly to avoid kernel panic. This is done by adding a new emit_abi_ext() helper. The emit_abi_ext() helper performs extension in place meaning a value already store in the target register (Note: this is different from the existing sign_extend() helper and thus we can't reuse it). Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: BPF: Zero-extend bpf_tail_call() indexHengqi Chen1-0/+2
commit eb71f5c433e1c6dff089b315881dec40a88a7baf upstream. The bpf_tail_call() index should be treated as a u32 value. Let's zero-extend it to avoid calling wrong BPF progs. See similar fixes for x86 [1]) and arm64 ([2]) for more details. [1]: https://github.com/torvalds/linux/commit/90caccdd8cc0215705f18b92771b449b01e2474a [2]: https://github.com/torvalds/linux/commit/16338a9b3ac30740d49f5dfed81bac0ffa53b9c7 Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Refactor register restoration in ftrace_common_returnChenghao Duan1-4/+10
commit 45cb47c628dfbd1994c619f3eac271a780602826 upstream. Refactor the register restoration sequence in the ftrace_common_return function to clearly distinguish between the logic of normal returns and direct call returns in function tracing scenarios. The logic is as follows: 1. In the case of a normal return, the execution flow returns to the traced function, and ftrace must ensure that the register data is consistent with the state when the function was entered. ra = parent return address; t0 = traced function return address. 2. In the case of a direct call return, the execution flow jumps to the custom trampoline function, and ftrace must ensure that the register data is consistent with the state when ftrace was entered. ra = traced function return address; t0 = parent return address. Cc: stable@vger.kernel.org Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysx86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix HaloRong Zhang1-1/+1
commit 150b1b97e27513535dcd3795d5ecd28e61b6cb8c upstream. Zen5 also contains family 1Ah, models 70h-7Fh, which are mistakenly missing from cpu_has_entrysign(). Add the missing range. Fixes: 8a9fb5129e8e ("x86/microcode/AMD: Limit Entrysign signature checking to known generations") Signed-off-by: Rong Zhang <i@rong.moe> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Link: https://patch.msgid.link/20251229182245.152747-1-i@rong.moe Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Use unsigned long for _end and _textTiezhu Yang1-2/+2
commit a258a3cb1895e3acf5f2fe245d17426e894bc935 upstream. It is better to use unsigned long rather than long for _end and _text to calculate the kernel length. Cc: stable@vger.kernel.org # v6.3+ Fixes: e5f02b51fa0c ("LoongArch: Add support for kernel address space layout randomization (KASLR)") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Use __pmd()/__pte() for swap entry conversionsWangYuli1-2/+2
commit 4a71df151e703b5e7e85b33369cee59ef2665e61 upstream. The __pmd() and __pte() helper macros provide the correct initialization syntax and abstraction for the pmd_t and pte_t types. Use __pmd() to fix follow warning about __swp_entry_to_pmd() with gcc-15 under specific configs [1] : In file included from ./include/linux/pgtable.h:6, from ./include/linux/mm.h:31, from ./include/linux/pagemap.h:8, from arch/loongarch/mm/init.c:14: ./include/linux/swapops.h: In function ‘swp_entry_to_pmd’: ./arch/loongarch/include/asm/pgtable.h:302:34: error: missing braces around initializer [-Werror=missing-braces] 302 | #define __swp_entry_to_pmd(x) ((pmd_t) { (x).val | _PAGE_HUGE }) | ^ ./include/linux/swapops.h:559:16: note: in expansion of macro ‘__swp_entry_to_pmd’ 559 | return __swp_entry_to_pmd(arch_entry); | ^~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Also update __swp_entry_to_pte() to use __pte() for consistency. [1]. https://download.01.org/0day-ci/archive/20251119/202511190316.luI90kAo-lkp@intel.com/config Cc: stable@vger.kernel.org Signed-off-by: Yuli Wang <wangyl5933@chinaunicom.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Fix build errors for CONFIG_RANDSTRUCTHuacai Chen1-2/+2
commit 3c250aecef62da81deb38ac6738ac0a88d91f1fc upstream. When CONFIG_RANDSTRUCT enabled, members of task_struct are randomized. There is a chance that TASK_STACK_CANARY be out of 12bit immediate's range and causes build errors. TASK_STACK_CANARY is naturally aligned, so fix it by replacing ld.d/st.d with ldptr.d/stptr.d which have 14bit immediates. Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511240656.0NaPcJs1-lkp@intel.com/ Suggested-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Correct the calculation logic of thread_countQiang Ma1-1/+7
commit 1de0ae21f136efa6c5d8a4d3e07b7d1ca39c750f upstream. For thread_count, the current calculation method has a maximum of 255, which may not be sufficient in the future. Therefore, we are correcting it now. Reference: SMBIOS Specification, 7.5 Processor Information (Type 4)[1] [1]: https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.9.0.pdf Cc: stable@vger.kernel.org Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysLoongArch: Add new PCI ID for pci_fixup_vgadev()Huacai Chen1-0/+2
commit bf3fa8f232a1eec8d7b88dcd9e925e60f04f018d upstream. Loongson-2K3000 has a new PCI ID (0x7a46) for its display controller, Add it for pci_fixup_vgadev() since we prefer a discrete graphics card as default boot device if present. Cc: stable@vger.kernel.org Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc/pseries/cmm: call balloon_devinfo_init() also without ↵David Hildenbrand1-1/+1
CONFIG_BALLOON_COMPACTION commit fc6bcf9ac4de76f5e7bcd020b3c0a86faff3f2d5 upstream. Patch series "powerpc/pseries/cmm: two smaller fixes". Two smaller fixes identified while doing a bigger rework. This patch (of 2): We always have to initialize the balloon_dev_info, even when compaction is not configured in: otherwise the containing list and the lock are left uninitialized. Likely not many such configs exist in practice, but let's CC stable to be sure. This was found by code inspection. Link: https://lkml.kernel.org/r/20251021100606.148294-1-david@redhat.com Link: https://lkml.kernel.org/r/20251021100606.148294-2-david@redhat.com Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysperf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on errorSandipan Das1-4/+1
commit 01439286514ce9d13b8123f8ec3717d7135ff1d6 upstream. If amd_uncore_event_init() fails, return an error irrespective of the pmu_version. Setting hwc->config should be safe even if there is an error so use this opportunity to simplify the code. Closes: https://lore.kernel.org/all/aTaI0ci3vZ44lmBn@stanley.mountain/ Fixes: d6389d3ccc13 ("perf/x86/amd/uncore: Refactor uncore management") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/076935e23a70335d33bd6e23308b75ae0ad35ba2.1765268667.git.sandipan.das@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysparisc: entry: set W bit for !compat tasks in syscall_restore_rfi()Sven Schnelle2-1/+6
commit 5fb1d3ce3e74a4530042795e1e065422295f1371 upstream. When the kernel leaves to userspace via syscall_restore_rfi(), the W bit is not set in the new PSW. This doesn't cause any problems because there's no 64 bit userspace for parisc. Simple static binaries are usually loaded at addresses way below the 32 bit limit so the W bit doesn't matter. Fix this by setting the W bit when TIF_32BIT is not set. Signed-off-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysparisc: entry.S: fix space adjustment on interruption for 64-bit userspaceSven Schnelle1-3/+8
commit 1aa4524c0c1b54842c4c0a370171d11b12d0709b upstream. In wide mode, the IASQ contain the upper part of the GVA during interruption. This needs to be reversed before the space is used - otherwise it contains parts of IAOQ. See Page 2-13 "Processing Resources / Interruption Instruction Address Queues" in the Parisc 2.0 Architecture Manual page 2-13 for an explanation. The IAOQ/IASQ space_adjust was skipped for other interruptions than itlb misses. However, the code in handle_interruption() checks whether iasq[0] contains a valid space. Due to the not masked out bits this match failed and the process was killed. Also add space_adjust for IAOQ1/IASQ1 so ptregs contains sane values. Signed-off-by: Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc/64s/slb: Fix SLB multihit issue during SLB preloadDonet Tom5-98/+0
commit 00312419f0863964625d6dcda8183f96849412c6 upstream. On systems using the hash MMU, there is a software SLB preload cache that mirrors the entries loaded into the hardware SLB buffer. This preload cache is subject to periodic eviction — typically after every 256 context switches — to remove old entry. To optimize performance, the kernel skips switch_mmu_context() in switch_mm_irqs_off() when the prev and next mm_struct are the same. However, on hash MMU systems, this can lead to inconsistencies between the hardware SLB and the software preload cache. If an SLB entry for a process is evicted from the software cache on one CPU, and the same process later runs on another CPU without executing switch_mmu_context(), the hardware SLB may retain stale entries. If the kernel then attempts to reload that entry, it can trigger an SLB multi-hit error. The following timeline shows how stale SLB entries are created and can cause a multi-hit error when a process moves between CPUs without a MMU context switch. CPU 0 CPU 1 ----- ----- Process P exec swapper/1 load_elf_binary begin_new_exc activate_mm switch_mm_irqs_off switch_mmu_context switch_slb /* * This invalidates all * the entries in the HW * and setup the new HW * SLB entries as per the * preload cache. */ context_switch sched_migrate_task migrates process P to cpu-1 Process swapper/0 context switch (to process P) (uses mm_struct of Process P) switch_mm_irqs_off() switch_slb load_slb++ /* * load_slb becomes 0 here * and we evict an entry from * the preload cache with * preload_age(). We still * keep HW SLB and preload * cache in sync, that is * because all HW SLB entries * anyways gets evicted in * switch_slb during SLBIA. * We then only add those * entries back in HW SLB, * which are currently * present in preload_cache * (after eviction). */ load_elf_binary continues... setup_new_exec() slb_setup_new_exec() sched_switch event sched_migrate_task migrates process P to cpu-0 context_switch from swapper/0 to Process P switch_mm_irqs_off() /* * Since both prev and next mm struct are same we don't call * switch_mmu_context(). This will cause the HW SLB and SW preload * cache to go out of sync in preload_new_slb_context. Because there * was an SLB entry which was evicted from both HW and preload cache * on cpu-1. Now later in preload_new_slb_context(), when we will try * to add the same preload entry again, we will add this to the SW * preload cache and then will add it to the HW SLB. Since on cpu-0 * this entry was never invalidated, hence adding this entry to the HW * SLB will cause a SLB multi-hit error. */ load_elf_binary continues... START_THREAD start_thread preload_new_slb_context /* * This tries to add a new EA to preload cache which was earlier * evicted from both cpu-1 HW SLB and preload cache. This caused the * HW SLB of cpu-0 to go out of sync with the SW preload cache. The * reason for this was, that when we context switched back on CPU-0, * we should have ideally called switch_mmu_context() which will * bring the HW SLB entries on CPU-0 in sync with SW preload cache * entries by setting up the mmu context properly. But we didn't do * that since the prev mm_struct running on cpu-0 was same as the * next mm_struct (which is true for swapper / kernel threads). So * now when we try to add this new entry into the HW SLB of cpu-0, * we hit a SLB multi-hit error. */ WARNING: CPU: 0 PID: 1810970 at arch/powerpc/mm/book3s64/slb.c:62 assert_slb_presence+0x2c/0x50(48 results) 02:47:29 [20157/42149] Modules linked in: CPU: 0 UID: 0 PID: 1810970 Comm: dd Not tainted 6.16.0-rc3-dirty #12 VOLUNTARY Hardware name: IBM pSeries (emulated by qemu) POWER8 (architected) 0x4d0200 0xf000004 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c00000000015426c LR: c0000000001543b4 CTR: 0000000000000000 REGS: c0000000497c77e0 TRAP: 0700 Not tainted (6.16.0-rc3-dirty) MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 28888482 XER: 00000000 CFAR: c0000000001543b0 IRQMASK: 3 <...> NIP [c00000000015426c] assert_slb_presence+0x2c/0x50 LR [c0000000001543b4] slb_insert_entry+0x124/0x390 Call Trace: 0x7fffceb5ffff (unreliable) preload_new_slb_context+0x100/0x1a0 start_thread+0x26c/0x420 load_elf_binary+0x1b04/0x1c40 bprm_execve+0x358/0x680 do_execveat_common+0x1f8/0x240 sys_execve+0x58/0x70 system_call_exception+0x114/0x300 system_call_common+0x160/0x2c4 >From the above analysis, during early exec the hardware SLB is cleared, and entries from the software preload cache are reloaded into hardware by switch_slb. However, preload_new_slb_context and slb_setup_new_exec also attempt to load some of the same entries, which can trigger a multi-hit. In most cases, these additional preloads simply hit existing entries and add nothing new. Removing these functions avoids redundant preloads and eliminates the multi-hit issue. This patch removes these two functions. We tested process switching performance using the context_switch benchmark on POWER9/hash, and observed no regression. Without this patch: 129041 ops/sec With this patch: 129341 ops/sec We also measured SLB faults during boot, and the counts are essentially the same with and without this patch. SLB faults without this patch: 19727 SLB faults with this patch: 19786 Fixes: 5434ae74629a ("powerpc/64s/hash: Add a SLB preload cache") cc: stable@vger.kernel.org Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/0ac694ae683494fe8cadbd911a1a5018d5d3c541.1761834163.git.ritesh.list@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc, mm: Fix mprotect on book3s 32-bitDave Vasilevsky2-1/+13
commit 78fc63ffa7813e33681839bb33826c24195f0eb7 upstream. On 32-bit book3s with hash-MMUs, tlb_flush() was a no-op. This was unnoticed because all uses until recently were for unmaps, and thus handled by __tlb_remove_tlb_entry(). After commit 4a18419f71cd ("mm/mprotect: use mmu_gather") in kernel 5.19, tlb_gather_mmu() started being used for mprotect as well. This caused mprotect to simply not work on these machines: int *ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); *ptr = 1; // force HPTE to be created mprotect(ptr, 4096, PROT_READ); *ptr = 2; // should segfault, but succeeds Fixed by making tlb_flush() actually flush TLB pages. This finally agrees with the behaviour of boot3s64's tlb_flush(). Fixes: 4a18419f71cd ("mm/mprotect: use mmu_gather") Cc: stable@vger.kernel.org Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Dave Vasilevsky <dave@vasilevsky.ca> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251116-vasi-mprotect-g3-v3-1-59a9bd33ba00@vasilevsky.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysarm64: dts: ti: k3-j721e-sk: Fix pinmux for pin Y1 used by power regulatorSiddharth Vadapalli1-6/+6
commit 51f89c488f2ecc020f82bfedd77482584ce8027a upstream. The SoC pin Y1 is incorrectly defined in the WKUP Pinmux device-tree node (pinctrl@4301c000) leading to the following silent failure: pinctrl-single 4301c000.pinctrl: mux offset out of range: 0x1dc (0x178) According to the datasheet for the J721E SoC [0], the pin Y1 belongs to the MAIN Pinmux device-tree node (pinctrl@11c000). This is confirmed by the address of the pinmux register for it on page 142 of the datasheet which is 0x00011C1DC. Hence fix it. [0]: https://www.ti.com/lit/ds/symlink/tda4vm.pdf Fixes: 97b67cc102dc ("arm64: dts: ti: k3-j721e-sk: Add DT nodes for power regulators") Cc: stable@vger.kernel.org Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Yemike Abhilash Chandra <y-abhilashchandra@ti.com> Link: https://patch.msgid.link/20251119160148.2752616-1-s-vadapalli@ti.com Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysx86/msi: Make irq_retrigger() functional for posted MSIThomas Gleixner2-0/+30
[ Upstream commit 0edc78b82bea85e1b2165d8e870a5c3535919695 ] Luigi reported that retriggering a posted MSI interrupt does not work correctly. The reason is that the retrigger happens at the vector domain by sending an IPI to the actual vector on the target CPU. That works correctly exactly once because the posted MSI interrupt chip does not issue an EOI as that's only required for the posted MSI notification vector itself. As a consequence the vector becomes stale in the ISR, which not only affects this vector but also any lower priority vector in the affected APIC because the ISR bit is not cleared. Luigi proposed to set the vector in the remap PIR bitmap and raise the posted MSI notification vector. That works, but that still does not cure a related problem: If there is ever a stray interrupt on such a vector, then the related APIC ISR bit becomes stale due to the lack of EOI as described above. Unlikely to happen, but if it happens it's not debuggable at all. So instead of playing games with the PIR, this can be actually solved for both cases by: 1) Keeping track of the posted interrupt vector handler state 2) Implementing a posted MSI specific irq_ack() callback which checks that state. If the posted vector handler is inactive it issues an EOI, otherwise it delegates that to the posted handler. This is correct versus affinity changes and concurrent events on the posted vector as the actual handler invocation is serialized through the interrupt descriptor lock. Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs") Reported-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Luigi Rizzo <lrizzo@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com [ DEFINE_PER_CPU_CACHE_HOT => DEFINE_PER_CPU ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysARM: dts: microchip: sama5d2: fix spi flexcom fifo size to 32Nicolas Ferre1-5/+5
commit 7d5864dc5d5ea6a35983dd05295fb17f2f2f44ce upstream. Unlike standalone spi peripherals, on sama5d2, the flexcom spi have fifo size of 32 data. Fix flexcom/spi nodes where this property is wrong. Fixes: 6b9a3584c7ed ("ARM: dts: at91: sama5d2: Add missing flexcom definitions") Cc: stable@vger.kernel.org # 5.8+ Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20251114140225.30372-1-nicolas.ferre@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysarm64: Revamp HCR_EL2.E2H RES1 detectionMarc Zyngier1-6/+32
[ Upstream commit ca88ecdce5f51874a7c151809bd2c936ee0d3805 ] We currently have two ways to identify CPUs that only implement FEAT_VHE and not FEAT_E2H0: - either they advertise it via ID_AA64MMFR4_EL1.E2H0, - or the HCR_EL2.E2H bit is RAO/WI However, there is a third category of "cpus" that fall between these two cases: on CPUs that do not implement FEAT_FGT, it is IMPDEF whether an access to ID_AA64MMFR4_EL1 can trap to EL2 when the register value is zero. A consequence of this is that on systems such as Neoverse V2, a NV guest cannot reliably detect that it is in a VHE-only configuration (E2H is writable, and ID_AA64MMFR0_EL1 is 0), despite the hypervisor's best effort to repaint the id register. Replace the RAO/WI test by a sequence that makes use of the VHE register remnapping between EL1 and EL2 to detect this situation, and work out whether we get the VHE behaviour even after having set HCR_EL2.E2H to 0. This solves the NV problem, and provides a more reliable acid test for CPUs that do not completely follow the letter of the architecture while providing a RES1 behaviour for HCR_EL2.E2H. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Tested-by: Jan Kotas <jank@cadence.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/15A85F2B-1A0C-4FA7-9FE4-EEC2203CC09E@global.cadence.com Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()Ahmed Genidi4-8/+5
[ Upstream commit 3855a7b91d42ebf3513b7ccffc44807274978b3d ] When KVM is in protected mode, host calls to PSCI are proxied via EL2, and cold entries from CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND bounce through __kvm_hyp_init_cpu() at EL2 before entering the host kernel's entry point at EL1. While __kvm_hyp_init_cpu() initializes SPSR_EL2 for the exception return to EL1, it does not initialize SCTLR_EL1. Due to this, it's possible to enter EL1 with SCTLR_EL1 in an UNKNOWN state. In practice this has been seen to result in kernel crashes after CPU_ON as a result of SCTLR_EL1.M being 1 in violation of the initial core configuration specified by PSCI. Fix this by initializing SCTLR_EL1 for cold entry to the host kernel. As it's necessary to write to SCTLR_EL12 in VHE mode, this initialization is moved into __kvm_host_psci_cpu_entry() where we can use write_sysreg_el1(). The remnants of the '__init_el2_nvhe_prepare_eret' macro are folded into its only caller, as this is clearer than having the macro. Fixes: cdf367192766ad11 ("KVM: arm64: Intercept host's CPU_ON SMCs") Reported-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ahmed Genidi <ahmed.genidi@arm.com> [ Mark: clarify commit message, handle E2H, move to C, remove macro ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ahmed Genidi <ahmed.genidi@arm.com> Cc: Ben Horgan <ben.horgan@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Reviewed-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250227180526.1204723-3-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: arm64: Initialize HCR_EL2.E2H earlyMark Rutland3-19/+34
[ Upstream commit 7a68b55ff39b0a1638acb1694c185d49f6077a0d ] On CPUs without FEAT_E2H0, HCR_EL2.E2H is RES1, but may reset to an UNKNOWN value out of reset and consequently may not read as 1 unless it has been explicitly initialized. We handled this for the head.S boot code in commits: 3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative") b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented") Unfortunately, we forgot to apply a similar fix to the KVM PSCI entry points used when relaying CPU_ON, CPU_SUSPEND, and SYSTEM SUSPEND. When KVM is entered via these entry points, the value of HCR_EL2.E2H may be consumed before it has been initialized (e.g. by the 'init_el2_state' macro). Initialize HCR_EL2.E2H early in these paths such that it can be consumed reliably. The existing code in head.S is factored out into a new 'init_el2_hcr' macro, and this is used in the __kvm_hyp_init_cpu() function common to all the relevant PSCI entry points. For clarity, I've tweaked the assembly used to check whether ID_AA64MMFR4_EL1.E2H0 is negative. The bitfield is extracted as a signed value, and this is checked with a signed-greater-or-equal (GE) comparison. As the hyp code will reconfigure HCR_EL2 later in ___kvm_hyp_init(), all bits other than E2H are initialized to zero in __kvm_hyp_init_cpu(). Fixes: 3944382fa6f22b54 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative") Fixes: b3320142f3db9b3f ("arm64: Fix early handling of FEAT_E2H0 not being implemented") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ahmed Genidi <ahmed.genidi@arm.com> Cc: Ben Horgan <ben.horgan@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250227180526.1204723-2-mark.rutland@arm.com [maz: fixed LT->GE thinko] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayss390/ipl: Clear SBP flag when bootprog is setSven Schnelle2-12/+37
commit b1aa01d31249bd116b18c7f512d3e46b4b4ad83b upstream. With z16 a new flag 'search boot program' was introduced for list-directed IPL (SCSI, NVMe, ECKD DASD). If this flag is set, e.g. via selecting the "Automatic" value for the "Boot program selector" control on an HMC load panel, it is copied to the reipl structure from the initial ipl structure. When a user now sets a boot prog via sysfs, the flag is not cleared and the bootloader will again automatically select the boot program, ignoring user configuration. To avoid that, clear the SBP flag when a bootprog sysfs file is written. Cc: stable@vger.kernel.org Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc/kexec: Enable SMT before waking offline CPUsNysal Jan K.A.1-0/+19
commit c2296a1e42418556efbeb5636c4fa6aa6106713a upstream. If SMT is disabled or a partial SMT state is enabled, when a new kernel image is loaded for kexec, on reboot the following warning is observed: kexec: Waking offline cpu 228. WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc [snip] NIP kexec_prepare_cpus+0x1b0/0x1bc LR kexec_prepare_cpus+0x1a0/0x1bc Call Trace: kexec_prepare_cpus+0x1a0/0x1bc (unreliable) default_machine_kexec+0x160/0x19c machine_kexec+0x80/0x88 kernel_kexec+0xd0/0x118 __do_sys_reboot+0x210/0x2c4 system_call_exception+0x124/0x320 system_call_vectored_common+0x15c/0x2ec This occurs as add_cpu() fails due to cpu_bootable() returning false for CPUs that fail the cpu_smt_thread_allowed() check or non primary threads if SMT is disabled. Fix the issue by enabling SMT and resetting the number of SMT threads to the number of threads per core, before attempting to wake up all present CPUs. Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()") Reported-by: Sachin P Bappalige <sachinpb@linux.ibm.com> Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Samir M <samir@linux.ibm.com> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251028105516.26258-1-nysal@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: nSVM: Clear exit_code_hi in VMCB when synthesizing nested VM-ExitsSean Christopherson2-3/+6
commit da01f64e7470988f8607776aa7afa924208863fb upstream. Explicitly clear exit_code_hi in the VMCB when synthesizing "normal" nested VM-Exits, as the full exit code is a 64-bit value (spoiler alert), and all exit codes for non-failing VMRUN use only bits 31:0. Cc: Jim Mattson <jmattson@google.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251113225621.1688428-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: nSVM: Set exit_code_hi to -1 when synthesizing SVM_EXIT_ERR (failed VMRUN)Sean Christopherson1-2/+2
commit f402ecd7a8b6446547076f4bd24bd5d4dcc94481 upstream. Set exit_code_hi to -1u as a temporary band-aid to fix a long-standing (effectively since KVM's inception) bug where KVM treats the exit code as a 32-bit value, when in reality it's a 64-bit value. Per the APM, offset 0x70 is a single 64-bit value: 070h 63:0 EXITCODE And a sane reading of the error values defined in "Table C-1. SVM Intercept Codes" is that negative values use the full 64 bits: –1 VMEXIT_INVALID Invalid guest state in VMCB. –2 VMEXIT_BUSYBUSY bit was set in the VMSA –3 VMEXIT_IDLE_REQUIREDThe sibling thread is not in an idle state -4 VMEXIT_INVALID_PMC Invalid PMC state And that interpretation is confirmed by testing on Milan and Turin (by setting bits in CR0[63:32] to generate VMEXIT_INVALID on VMRUN). Furthermore, Xen has treated exitcode as a 64-bit value since HVM support was adding in 2006 (see Xen commit d1bd157fbc ("Big merge the HVM full-virtualisation abstractions.")). Cc: Jim Mattson <jmattson@google.com> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: stable@vger.kernel.org Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251113225621.1688428-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: nVMX: Immediately refresh APICv controls as needed on nested VM-ExitDongli Zhang1-1/+2
commit 29763138830916f46daaa50e83e7f4f907a3236b upstream. If an APICv status updated was pended while L2 was active, immediately refresh vmcs01's controls instead of pending KVM_REQ_APICV_UPDATE as kvm_vcpu_update_apicv() only calls into vendor code if a change is necessary. E.g. if APICv is inhibited, and then activated while L2 is running: kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> apic->apicv_active = true | -> vmx_refresh_apicv_exec_ctrl() | -> vmx->nested.update_vmcs01_apicv_status = true | -> return Then L2 exits to L1: __nested_vmx_vmexit() | -> kvm_make_request(KVM_REQ_APICV_UPDATE) vcpu_enter_guest(): KVM_REQ_APICV_UPDATE -> kvm_vcpu_update_apicv() | -> __kvm_vcpu_update_apicv() | -> return // because if (apic->apicv_active == activate) Reported-by: Chao Gao <chao.gao@intel.com> Closes: https://lore.kernel.org/all/aQ2jmnN8wUYVEawF@intel.com Fixes: 7c69661e225c ("KVM: nVMX: Defer APICv updates while L2 is active until L1 is active") Cc: stable@vger.kernel.org Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> [sean: write changelog] Link: https://patch.msgid.link/20251205231913.441872-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: SVM: Mark VMCB_PERM_MAP as dirty on nested VMRUNJim Mattson1-0/+1
commit 93c9e107386dbe1243287a5b14ceca894de372b9 upstream. Mark the VMCB_PERM_MAP bit as dirty in nested_vmcb02_prepare_control() on every nested VMRUN. If L1 changes MSR interception (INTERCEPT_MSR_PROT) between two VMRUN instructions on the same L1 vCPU, the msrpm_base_pa in the associated vmcb02 will change, and the VMCB_PERM_MAP clean bit should be cleared. Fixes: 4bb170a5430b ("KVM: nSVM: do not mark all VMCB02 fields dirty on nested vmexit") Reported-by: Matteo Rizzo <matteorizzo@google.com> Cc: stable@vger.kernel.org Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20250922162935.621409-2-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: nSVM: Propagate SVM_EXIT_CR0_SEL_WRITE correctly for LMSW emulationYosry Ahmed1-9/+9
commit 5674a76db0213f9db1e4d08e847ff649b46889c0 upstream. When emulating L2 instructions, svm_check_intercept() checks whether a write to CR0 should trigger a synthesized #VMEXIT with SVM_EXIT_CR0_SEL_WRITE. For MOV-to-CR0, SVM_EXIT_CR0_SEL_WRITE is only triggered if any bit other than CR0.MP and CR0.TS is updated. However, according to the APM (24593—Rev. 3.42—March 2024, Table 15-7): The LMSW instruction treats the selective CR0-write intercept as a non-selective intercept (i.e., it intercepts regardless of the value being written). Skip checking the changed bits for x86_intercept_lmsw and always inject SVM_EXIT_CR0_SEL_WRITE. Fixes: cfec82cb7d31 ("KVM: SVM: Add intercept check for emulated cr accesses") Cc: stable@vger.kernel.org Reported-by: Matteo Rizzo <matteorizzo@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251024192918.3191141-3-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: SVM: Mark VMCB_NPT as dirty on nested VMRUNJim Mattson1-0/+1
commit 7c8b465a1c91f674655ea9cec5083744ec5f796a upstream. Mark the VMCB_NPT bit as dirty in nested_vmcb02_prepare_save() on every nested VMRUN. If L1 changes the PAT MSR between two VMRUN instructions on the same L1 vCPU, the g_pat field in the associated vmcb02 will change, and the VMCB_NPT clean bit should be cleared. Fixes: 4bb170a5430b ("KVM: nSVM: do not mark all VMCB02 fields dirty on nested vmexit") Cc: stable@vger.kernel.org Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20250922162935.621409-3-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: nSVM: Avoid incorrect injection of SVM_EXIT_CR0_SEL_WRITEYosry Ahmed1-5/+19
commit 3d80f4c93d3d26d0f9a0dd2844961a632eeea634 upstream. When emulating L2 instructions, svm_check_intercept() checks whether a write to CR0 should trigger a synthesized #VMEXIT with SVM_EXIT_CR0_SEL_WRITE. However, it does not check whether L1 enabled the intercept for SVM_EXIT_WRITE_CR0, which has higher priority according to the APM (24593—Rev. 3.42—March 2024, Table 15-7): When both selective and non-selective CR0-write intercepts are active at the same time, the non-selective intercept takes priority. With respect to exceptions, the priority of this intercept is the same as the generic CR0-write intercept. Make sure L1 does NOT intercept SVM_EXIT_WRITE_CR0 before checking if SVM_EXIT_CR0_SEL_WRITE needs to be injected. Opportunistically tweak the "not CR0" logic to explicitly bail early so that it's more obvious that only CR0 has a selective intercept, and that modifying icpt_info.exit_code is functionally necessary so that the call to nested_svm_exit_handled() checks the correct exit code. Fixes: cfec82cb7d31 ("KVM: SVM: Add intercept check for emulated cr accesses") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251024192918.3191141-4-yosry.ahmed@linux.dev [sean: isolate non-CR0 write logic, tweak comments accordingly] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: x86: Fix VM hard lockup after prolonged inactivity with periodic HV timerfuqiang wang1-5/+23
commit 18ab3fc8e880791aa9f7c000261320fc812b5465 upstream. When advancing the target expiration for the guest's APIC timer in periodic mode, set the expiration to "now" if the target expiration is in the past (similar to what is done in update_target_expiration()). Blindly adding the period to the previous target expiration can result in KVM generating a practically unbounded number of hrtimer IRQs due to programming an expired timer over and over. In extreme scenarios, e.g. if userspace pauses/suspends a VM for an extended duration, this can even cause hard lockups in the host. Currently, the bug only affects Intel CPUs when using the hypervisor timer (HV timer), a.k.a. the VMX preemption timer. Unlike the software timer, a.k.a. hrtimer, which KVM keeps running even on exits to userspace, the HV timer only runs while the guest is active. As a result, if the vCPU does not run for an extended duration, there will be a huge gap between the target expiration and the current time the vCPU resumes running. Because the target expiration is incremented by only one period on each timer expiration, this leads to a series of timer expirations occurring rapidly after the vCPU/VM resumes. More critically, when the vCPU first triggers a periodic HV timer expiration after resuming, advancing the expiration by only one period will result in a target expiration in the past. As a result, the delta may be calculated as a negative value. When the delta is converted into an absolute value (tscdeadline is an unsigned u64), the resulting value can overflow what the HV timer is capable of programming. I.e. the large value will exceed the VMX Preemption Timer's maximum bit width of cpu_preemption_timer_multi + 32, and thus cause KVM to switch from the HV timer to the software timer (hrtimers). After switching to the software timer, periodic timer expiration callbacks may be executed consecutively within a single clock interrupt handler, because hrtimers honors KVM's request for an expiration in the past and immediately re-invokes KVM's callback after reprogramming. And because the interrupt handler runs with IRQs disabled, restarting KVM's hrtimer over and over until the target expiration is advanced to "now" can result in a hard lockup. E.g. the following hard lockup was triggered in the host when running a Windows VM (only relevant because it used the APIC timer in periodic mode) after resuming the VM from a long suspend (in the host). NMI watchdog: Watchdog detected hard LOCKUP on cpu 45 ... RIP: 0010:advance_periodic_target_expiration+0x4d/0x80 [kvm] ... RSP: 0018:ff4f88f5d98d8ef0 EFLAGS: 00000046 RAX: fff0103f91be678e RBX: fff0103f91be678e RCX: 00843a7d9e127bcc RDX: 0000000000000002 RSI: 0052ca4003697505 RDI: ff440d5bfbdbd500 RBP: ff440d5956f99200 R08: ff2ff2a42deb6a84 R09: 000000000002a6c0 R10: 0122d794016332b3 R11: 0000000000000000 R12: ff440db1af39cfc0 R13: ff440db1af39cfc0 R14: ffffffffc0d4a560 R15: ff440db1af39d0f8 FS: 00007f04a6ffd700(0000) GS:ff440db1af380000(0000) knlGS:000000e38a3b8000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000d5651feff8 CR3: 000000684e038002 CR4: 0000000000773ee0 PKRU: 55555554 Call Trace: <IRQ> apic_timer_fn+0x31/0x50 [kvm] __hrtimer_run_queues+0x100/0x280 hrtimer_interrupt+0x100/0x210 ? ttwu_do_wakeup+0x19/0x160 smp_apic_timer_interrupt+0x6a/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> Moreover, if the suspend duration of the virtual machine is not long enough to trigger a hard lockup in this scenario, since commit 98c25ead5eda ("KVM: VMX: Move preemption timer <=> hrtimer dance to common x86"), KVM will continue using the software timer until the guest reprograms the APIC timer in some way. Since the periodic timer does not require frequent APIC timer register programming, the guest may continue to use the software timer in perpetuity. Fixes: d8f2f498d9ed ("x86/kvm: fix LAPIC timer drift when guest uses periodic mode") Cc: stable@vger.kernel.org Signed-off-by: fuqiang wang <fuqiang.wng@gmail.com> [sean: massage comments and changelog] Link: https://patch.msgid.link/20251113205114.1647493-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: x86: Explicitly set new periodic hrtimer expiration in apic_timer_fn()fuqiang wang1-1/+1
commit 9633f180ce994ab293ce4924a9b7aaf4673aa114 upstream. When restarting an hrtimer to emulate a the guest's APIC timer in periodic mode, explicitly set the expiration using the target expiration computed by advance_periodic_target_expiration() instead of adding the period to the existing timer. This will allow making adjustments to the expiration, e.g. to deal with expirations far in the past, without having to implement the same logic in both advance_periodic_target_expiration() and apic_timer_fn(). Cc: stable@vger.kernel.org Signed-off-by: fuqiang wang <fuqiang.wng@gmail.com> [sean: split to separate patch, write changelog] Link: https://patch.msgid.link/20251113205114.1647493-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysKVM: x86: WARN if hrtimer callback for periodic APIC timer fires with period=0Sean Christopherson1-1/+1
commit 0ea9494be9c931ddbc084ad5e11fda91b554cf47 upstream. WARN and don't restart the hrtimer if KVM's callback runs with the guest's APIC timer in periodic mode but with a period of '0', as not advancing the hrtimer's deadline would put the CPU into an infinite loop of hrtimer events. Observing a period of '0' should be impossible, even when the hrtimer is running on a different CPU than the vCPU, as KVM is supposed to cancel the hrtimer before changing (or zeroing) the period, e.g. when switching from periodic to one-shot. Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251113205114.1647493-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 dayspowerpc: Add reloc_offset() to font bitmap pointer used for bootx_printf()Finn Thain1-1/+2
commit b94b73567561642323617155bf4ee24ef0d258fe upstream. Since Linux v6.7, booting using BootX on an Old World PowerMac produces an early crash. Stan Johnson writes, "the symptoms are that the screen goes blank and the backlight stays on, and the system freezes (Linux doesn't boot)." Further testing revealed that the failure can be avoided by disabling CONFIG_BOOTX_TEXT. Bisection revealed that the regression was caused by a change to the font bitmap pointer that's used when btext_init() begins painting characters on the display, early in the boot process. Christophe Leroy explains, "before kernel text is relocated to its final location ... data is addressed with an offset which is added to the Global Offset Table (GOT) entries at the start of bootx_init() by function reloc_got2(). But the pointers that are located inside a structure are not referenced in the GOT and are therefore not updated by reloc_got2(). It is therefore needed to apply the offset manually by using PTRRELOC() macro." Cc: stable@vger.kernel.org Link: https://lists.debian.org/debian-powerpc/2025/10/msg00111.html Link: https://lore.kernel.org/linuxppc-dev/d81ddca8-c5ee-d583-d579-02b19ed95301@yahoo.com/ Reported-by: Cedar Maxwell <cedarmaxwell@mac.com> Closes: https://lists.debian.org/debian-powerpc/2025/09/msg00031.html Bisected-by: Stan Johnson <userm57@yahoo.com> Tested-by: Stan Johnson <userm57@yahoo.com> Fixes: 0ebc7feae79a ("powerpc: Use shared font data") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/22b3b247425a052b079ab84da926706b3702c2c7.1762731022.git.fthain@linux-m68k.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>