kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-10-25	s390/pci: fix iommu bitmap allocation	Niklas Schnelle	1	-2/+13
	commit c1ae1c59c8c6e0b66a718308c623e0cb394dab6b upstream. Since the fixed commits both zdev->iommu_bitmap and zdev->lazy_bitmap are allocated as vzalloc(zdev->iommu_pages / 8). The problem is that zdev->iommu_bitmap is a pointer to unsigned long but the above only yields an allocation that is a multiple of sizeof(unsigned long) which is 8 on s390x if the number of IOMMU pages is a multiple of 64. This in turn is the case only if the effective IOMMU aperture is a multiple of 64 * 4K = 256K. This is usually the case and so didn't cause visible issues since both the virt_to_phys(high_memory) reduced limit and hardware limits use nice numbers. Under KVM, and in particular with QEMU limiting the IOMMU aperture to the vfio DMA limit (default 65535), it is possible for the reported aperture not to be a multiple of 256K however. In this case we end up with an iommu_bitmap whose allocation is not a multiple of 8 causing bitmap operations to access it out of bounds. Sadly we can't just fix this in the obvious way and use bitmap_zalloc() because for large RAM systems (tested on 8 TiB) the zdev->iommu_bitmap grows too large for kmalloc(). So add our own bitmap_vzalloc() wrapper. This might be a candidate for common code, but this area of code will be replaced by the upcoming conversion to use the common code DMA API on s390 so just add a local routine. Fixes: 224593215525 ("s390/pci: use virtual memory for iommu bitmap") Fixes: 13954fd6913a ("s390/pci_dma: improve lazy flush for unmap") Cc: stable@vger.kernel.org Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	ARM: dts: ti: omap: Fix noisy serial with overrun-throttle-ms for mapphone	Tony Lindgren	1	-0/+1
	[ Upstream commit 5ad37b5e30433afa7a5513e3eb61f69fa0976785 ] On mapphone devices we may get lots of noise on the micro-USB port in debug uart mode until the phy-cpcap-usb driver probes. Let's limit the noise by using overrun-throttle-ms. Note that there is also a related separate issue where the charger cable connected may cause random sysrq requests until phy-cpcap-usb probes that still remains. Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-25	x86/sev: Check for user-space IOIO pointing to kernel space	Joerg Roedel	2	-2/+34
	Upstream commit: 63e44bc52047f182601e7817da969a105aa1f721 Check the memory operand of INS/OUTS before emulating the instruction. The #VC exception can get raised from user-space, but the memory operand can be manipulated to access kernel memory before the emulation actually begins and after the exception handler has run. [ bp: Massage commit message. ] Fixes: 597cfe48212a ("x86/boot/compressed/64: Setup a GHCB-based VC Exception handler") Reported-by: Tom Dohrmann <erbse.13@gmx.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	x86/sev: Check IOBM for IOIO exceptions from user-space	Joerg Roedel	3	-7/+47
	Upstream commit: b9cb9c45583b911e0db71d09caa6b56469eb2bdf Check the IO permission bitmap (if present) before emulating IOIO #VC exceptions for user-space. These permissions are checked by hardware already before the #VC is raised, but due to the VC-handler decoding race it needs to be checked again in software. Fixes: 25189d08e516 ("x86/sev-es: Add support for handling IOIO exceptions") Reported-by: Tom Dohrmann <erbse.13@gmx.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Tom Dohrmann <erbse.13@gmx.de> Cc: <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	x86/sev: Disable MMIO emulation from user mode	Borislav Petkov (AMD)	1	-0/+3
	Upstream commit: a37cd2a59d0cb270b1bba568fd3a3b8668b9d3ba A virt scenario can be constructed where MMIO memory can be user memory. When that happens, a race condition opens between when the hardware raises the #VC and when the #VC handler gets to emulate the instruction. If the MOVS is replaced with a MOVS accessing kernel memory in that small race window, then write to kernel memory happens as the access checks are not done at emulation time. Disable MMIO emulation in user mode temporarily until a sensible use case appears and justifies properly handling the race window. Fixes: 0118b604c2c9 ("x86/sev-es: Handle MMIO String Instructions") Reported-by: Tom Dohrmann <erbse.13@gmx.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Tom Dohrmann <erbse.13@gmx.de> Cc: <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	KVM: x86: Mask LVTPC when handling a PMI	Jim Mattson	1	-2/+6
	commit a16eb25b09c02a54c1c1b449d4b6cfa2cf3f013a upstream. Per the SDM, "When the local APIC handles a performance-monitoring counters interrupt, it automatically sets the mask flag in the LVT performance counter register." Add this behavior to KVM's local APIC emulation. Failure to mask the LVTPC entry results in spurious PMIs, e.g. when running Linux as a guest, PMI handlers that do a "late_ack" spew a large number of "dazed and confused" spurious NMI warnings. Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests") Cc: stable@vger.kernel.org Signed-off-by: Jim Mattson <jmattson@google.com> Tested-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20230925173448.3518223-3-mizhang@google.com [sean: massage changelog, correct Fixes] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: armv8_deprecated: fix unused-function error	Ren Zhijie	1	-1/+1
	commit 223d3a0d30b6e9f979f5642e430e1753d3e29f89 upstream. If CONFIG_SWP_EMULATION is not set and CONFIG_CP15_BARRIER_EMULATION is not set, aarch64-linux-gnu complained about unused-function : arch/arm64/kernel/armv8_deprecated.c:67:21: error: ‘aarch32_check_condition’ defined but not used [-Werror=unused-function] static unsigned int aarch32_check_condition(u32 opcode, u32 psr) ^~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors To fix this warning, modify aarch32_check_condition() with __maybe_unused. Fixes: 0c5f416219da ("arm64: armv8_deprecated: move aarch32 helper earlier") Signed-off-by: Ren Zhijie <renzhijie2@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20221124022429.19024-1-renzhijie2@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: armv8_deprecated: rework deprected instruction handling	Mark Rutland	3	-192/+154
	commit 124c49b1b5d947b7180c5d6cbb09ddf76ea45ea2 upstream. Support for deprecated instructions can be enabled or disabled at runtime. To handle this, the code in armv8_deprecated.c registers and unregisters undef_hooks, and makes cross CPU calls to configure HW support. This is rather complicated, and the synchronization required to make this safe ends up serializing the handling of instructions which have been trapped. This patch simplifies the deprecated instruction handling by removing the dynamic registration and unregistration, and changing the trap handling code to determine whether a handler should be invoked. This removes the need for dynamic list management, and simplifies the locking requirements, making it possible to handle trapped instructions entirely in parallel. Where changing the emulation state requires a cross-call, this is serialized by locally disabling interrupts, ensuring that the CPU is not left in an inconsistent state. To simplify sysctl management, each insn_emulation is given a separate sysctl table, permitting these to be registered separately. The core sysctl code will iterate over all of these when walking sysfs. I've tested this with userspace programs which use each of the deprecated instructions, and I've concurrently modified the support level for each of the features back-and-forth between HW and emulated to check that there are no spurious SIGILLs sent to userspace when the support level is changed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-10-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: armv8_deprecated: move aarch32 helper earlier	Mark Rutland	1	-20/+19
	commit 0c5f416219da3795dc8b33e5bb7865a6b3c4e55c upstream. Subsequent patches will rework the logic in armv8_deprecated.c. In preparation for subsequent changes, this patch moves some shared logic earlier in the file. This will make subsequent diffs simpler and easier to read. At the same time, drop the `__kprobes` annotation from aarch32_check_condition(), as this is only used for traps from compat userspace, and has no risk of recursion within kprobes. As this is the last kprobes annotation in armve8_deprecated.c, we no longer need to include <asm/kprobes.h>. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-9-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: armv8_deprecated move emulation functions	Mark Rutland	1	-197/+197
	commit 25eeac0cfe7c97ade1be07340e11e7143aab57a6 upstream. Subsequent patches will rework the logic in armv8_deprecated.c. In preparation for subsequent changes, this patch moves the emulation logic earlier in the file, and moves the infrastructure later in the file. This will make subsequent diffs simpler and easier to read. This is purely a move. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: armv8_deprecated: fold ops into insn_emulation	Mark Rutland	1	-43/+33
	commit b4453cc8a7ebbd45436a8cd3ffeaa069ceac146f upstream. The code for emulating deprecated instructions has two related structures: struct insn_emulation_ops and struct insn_emulation, where each struct insn_emulation_ops is associated 1-1 with a struct insn_emulation. It would be simpler to combine the two into a single structure, removing the need for (unconditional) dynamic allocation at boot time, and simplifying some runtime pointer chasing. This patch merges the two structures together. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: rework EL0 MRS emulation	Mark Rutland	3	-19/+10
	commit f5962add74b61f8ae31c6311f75ca35d7e1d2d8f upstream. On CPUs without FEAT_IDST, ID register emulation is slower than it needs to be, as all threads contend for the same lock to perform the emulation. This patch reworks the emulation to avoid this unnecessary contention. On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0 accesses to ID registers result in a SYS trap, and emulation of these is handled with a sys64_hook. These hooks are statically allocated, and no locking is required to iterate through the hooks and perform the emulation, allowing emulation to occur in parallel with no contention. On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an UNDEFINED exception, and emulation of these accesses is handled with an undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel finds the relevant handler by iterating through all of the undef_hooks, requiring undef_lock to be held during this lookup. This locking is only required to safely traverse the list of undef_hooks (as it can be concurrently modified), and the actual emulation of the MRS does not require any mutual exclusion. This locking is an unfortunate bottleneck, especially given that MRS emulation is enabled unconditionally and is never disabled. This patch reworks the non-FEAT_IDST MRS emulation logic so that it can be invoked directly from do_el0_undef(). This removes the bottleneck, allowing MRS traps to be handled entirely in parallel, and is a stepping stone to making all of the undef_hooks lock-free. I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a benchmark which spawns a number of threads which each try to read ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will ever be seen in realistic usage, but clearly demonstrates the removal of the bottleneck: \| Threads \|\| Time (seconds) \| \| \|\| Before \|\| After \| \| \|\| Real \| System \|\| Real \| System \| \|---------++--------+---------++--------+---------\| \| 1 \|\| 0.29 \| 0.20 \|\| 0.24 \| 0.12 \| \| 2 \|\| 0.35 \| 0.51 \|\| 0.23 \| 0.27 \| \| 4 \|\| 1.08 \| 3.87 \|\| 0.24 \| 0.56 \| \| 8 \|\| 4.31 \| 33.60 \|\| 0.24 \| 1.11 \| \| 16 \|\| 9.47 \| 149.39 \|\| 0.23 \| 2.15 \| \| 32 \|\| 19.07 \| 605.27 \|\| 0.24 \| 4.38 \| \| 64 \|\| 65.40 \| 3609.09 \|\| 0.33 \| 11.27 \| Aside from the speedup, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: factor insn read out of call_undef_hook()	Mark Rutland	1	-9/+22
	commit dbfbd87efa79575491af0ba1a87bf567eaea6cae upstream. Subsequent patches will rework EL0 UNDEF handling, removing the need for struct undef_hook and call_undef_hook. In preparation for those changes, this patch factors the logic for reading user instructions out of call_undef_hook() and into a new user_insn_read() helper, matching the style of the existing aarch64_insn_read() helper used for reading kernel instructions. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: factor out EL1 SSBS emulation hook	Mark Rutland	3	-26/+17
	commit bff8f413c71ffc3cb679dbd9a5632b33af563f9f upstream. Currently call_undef_hook() is used to handle UNDEFINED exceptions from EL0 and EL1. As support for deprecated instructions may be enabled independently, the handlers for individual instructions are organised as a linked list of struct undef_hook which can be manipulated dynamically. As this can be manipulated dynamically, the list is protected with a raw_spinlock which must be acquired when handling UNDEFINED exceptions or when manipulating the list of handlers. This locking is unfortunate as it serialises handling of UNDEFINED exceptions, and requires RCU to be enabled for lockdep, requiring the use of RCU_NONIDLE() in resume path of cpu_suspend() since commit: a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path") The list of UNDEFINED handlers largely consist of handlers for exceptions taken from EL0, and the only handler for exceptions taken from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but lack the corresponding MSR (Immediate) instruction. Other than this we never expect to take an UNDEFINED exception from EL1 in normal operation. This patch reworks do_el0_undef() to invoke the EL1 SSBS handler directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This removes redundant work to iterate the list for EL1 UNDEFs, and removes the need for locking, permitting EL1 UNDEFs to be handled in parallel without contention. The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent patch, as there are other potential issues with the use of instrumentable code and RCU in the CPU suspend code. I've tested this by forcing the detection of SSBS on a CPU that doesn't have it, and verifying that the try_emulate_el1_ssbs() callback is invoked. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: split EL0/EL1 UNDEF handlers	Mark Rutland	3	-13/+16
	commit 61d64a376ea80f9097e7ea599bcd68671b836dc6 upstream. In general, exceptions taken from EL1 need to be handled separately from exceptions taken from EL0, as the logic to handle the two cases can be significantly divergent, and exceptions taken from EL1 typically have more stringent requirements on locking and instrumentation. Subsequent patches will rework the way EL1 UNDEFs are handled in order to address longstanding soundness issues with instrumentation and RCU. In preparation for that rework, this patch splits the existing do_undefinstr() handler into separate do_el0_undef() and do_el1_undef() handlers. Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(), preventing instrumentation via kprobes. However, do_undefinstr() invokes other code which can be instrumented, and: * For UNDEFINED exceptions taken from EL0, there is no risk of recursion within kprobes. Therefore it is safe for do_el0_undef to be instrumented with kprobes, and it does not need to be marked with NOKPROBE_SYMBOL(). * For UNDEFINED exceptions taken from EL1, either: (a) The exception is has been taken when manipulating SSBS; these cases are limited and do not occur within code that can be invoked recursively via kprobes. Hence, in these cases instrumentation with kprobes is benign. (b) The exception has been taken for an unknown reason, as other than manipulating SSBS we do not expect to take UNDEFINED exceptions from EL1. Any handling of these exception is best-effort. ... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL() isn't sufficient to prevent recursion via kprobes as functions it calls (including die()) are instrumentable via kprobes. Hence, it's not worthwhile to mark do_el1_undef() with NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(), so their NOKPROBE_SYMBOL() annotations are also removed. Aside from the new instrumentability, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: allow kprobes on EL0 handlers	Mark Rutland	3	-8/+6
	commit b3a0c010e900a9f89dcd99f10bd8f7538d21b0a9 upstream. Currently do_sysinstr() and do_cp15instr() are marked with NOKPROBE_SYMBOL(). However, these are only called for exceptions taken from EL0, and there is no risk of recursion in kprobes, so this is not necessary. Remove the NOKPROBE_SYMBOL() annotation, and rename the two functions to more clearly indicate that these are solely for exceptions taken from EL0, better matching the names used by the lower level entry points in entry-common.c. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221019144123.612388-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: rework BTI exception handling	Mark Rutland	3	-5/+22
	commit 830a2a4d853f2c4a1e4606aa03341b7f273b0e9b upstream. If a BTI exception is taken from EL1, the entry code will treat this as an unhandled exception and will panic() the kernel. This is inconsistent with the way we handle FPAC exceptions, which have a dedicated handler and only necessarily kill the thread from which the exception was taken from, and we don't log all the information that could be relevant to debug the issue. The code in do_bti() has: BUG_ON(!user_mode(regs)); ... and it seems like the intent was to call this for EL1 BTI exceptions, as with FPAC, but this was omitted due to an oversight. This patch adds separate EL0 and EL1 BTI exception handlers, with the latter calling die() directly to report the original context the BTI exception was taken from. This matches our handling of FPAC exceptions. Prior to this patch, a BTI failure is reported as: \| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9 \| Hardware name: linux,dummy-virt (DT) \| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c) \| pc : test_bti_callee+0x4/0x10 \| lr : test_bti_caller+0x1c/0x28 \| sp : ffff80000800bdf0 \| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83 \| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378 \| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4 \| Kernel panic - not syncing: Unhandled exception \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9 \| Hardware name: linux,dummy-virt (DT) \| Call trace: \| dump_backtrace.part.0+0xcc/0xe0 \| show_stack+0x18/0x5c \| dump_stack_lvl+0x64/0x80 \| dump_stack+0x18/0x34 \| panic+0x170/0x360 \| arm64_exit_nmi.isra.0+0x0/0x80 \| el1h_64_sync_handler+0x64/0xd0 \| el1h_64_sync+0x64/0x68 \| test_bti_callee+0x4/0x10 \| smp_cpus_done+0xb0/0xbc \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 With this patch applied, a BTI failure is reported as: \| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8 \| Hardware name: linux,dummy-virt (DT) \| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c) \| pc : test_bti_callee+0x4/0x10 \| lr : test_bti_caller+0x1c/0x28 \| sp : ffff80000800bdf0 \| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83 \| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378 \| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804 \| Call trace: \| test_bti_callee+0x4/0x10 \| smp_cpus_done+0xb0/0xbc \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf) Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: rework FPAC exception handling	Mark Rutland	3	-9/+14
	commit a1fafa3b24a70461bbf3e5c0770893feb0a49292 upstream. If an FPAC exception is taken from EL1, the entry code will call do_ptrauth_fault(), where due to: BUG_ON(!user_mode(regs)) ... the kernel will report a problem within do_ptrauth_fault() rather than reporting the original context the FPAC exception was taken from. The pt_regs and ESR value reported will be from within do_ptrauth_fault() and the code dump will be for the BRK in BUG_ON(), which isn't sufficient to debug the cause of the original exception. This patch makes the reporting better by having separate EL0 and EL1 FPAC exception handlers, with the latter calling die() directly to report the original context the FPAC exception was taken from. Note that we only need to prevent kprobes of the EL1 FPAC handler, since the EL0 FPAC handler cannot be called recursively. For consistency with do_el0_svc*(), I've named the split functions do_el{0,1}_fpac() rather than do_el{0,1}_ptrauth_fault(). I've also clarified the comment to not imply there are casues other than FPAC exceptions. Prior to this patch FPAC exceptions are reported as: \| kernel BUG at arch/arm64/kernel/traps.c:517! \| Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00130-g9c8a180a1cdf-dirty #12 \| Hardware name: FVP Base RevC (DT) \| pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : do_ptrauth_fault+0x3c/0x40 \| lr : el1_fpac+0x34/0x54 \| sp : ffff80000a3bbc80 \| x29: ffff80000a3bbc80 x28: ffff0008001d8000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: 0000000020400009 x22: ffff800008f70fa4 x21: ffff80000a3bbe00 \| x20: 0000000072000000 x19: ffff80000a3bbcb0 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 \| x5 : ffff80000a3bbcb0 x4 : ffff0008001d8000 x3 : 0000000072000000 \| x2 : 0000000000000000 x1 : 0000000020400009 x0 : ffff80000a3bbcb0 \| Call trace: \| do_ptrauth_fault+0x3c/0x40 \| el1h_64_sync_handler+0xc4/0xd0 \| el1h_64_sync+0x64/0x68 \| test_pac+0x8/0x10 \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: 97fffe5e a8c17bfd d50323bf d65f03c0 (d4210000) With this patch applied FPAC exceptions are reported as: \| Internal error: Oops - FPAC: 0000000072000000 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g78846e1c4757-dirty #11 \| Hardware name: FVP Base RevC (DT) \| pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : test_pac+0x8/0x10 \| lr : 0x0 \| sp : ffff80000a3bbe00 \| x29: ffff80000a3bbe00 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2c8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff80000a007000 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 \| x5 : ffff80000a2c6000 x4 : ffff0008001d8000 x3 : ffff800009f88378 \| x2 : 0000000000000000 x1 : 0000000080210000 x0 : ffff000001a90000 \| Call trace: \| test_pac+0x8/0x10 \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: d50323bf d65f03c0 d503233f aa1f03fe (d50323bf) Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: consistently pass ESR_ELx to die()	Mark Rutland	3	-15/+15
	commit 0f2cb928a1547ae8f89e80a4b8df2c6c02ae5f96 upstream. Currently, bug_handler() and kasan_handler() call die() with '0' as the 'err' value, whereas die_kernel_fault() passes the ESR_ELx value. For consistency, this patch ensures we always pass the ESR_ELx value to die(). As this is only called for exceptions taken from kernel mode, there should be no user-visible change as a result of this patch. For UNDEFINED exceptions, I've had to modify do_undefinstr() and its callers to pass the ESR_ELx value. In all cases the ESR_ELx value had already been read and was available. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: die(): pass 'err' as long	Mark Rutland	2	-4/+4
	commit 18906ff9af6517c20763ed63dab602a4150794f7 upstream. Recently, we reworked a lot of code to consistentlt pass ESR_ELx as a 64-bit quantity. However, we missed that this can be passed into die() and __die() as the 'err' parameter where it is truncated to a 32-bit int. As notify_die() already takes 'err' as a long, this patch changes die() and __die() to also take 'err' as a long, ensuring that the full value of ESR_ELx is retained. At the same time, die() is updated to consistently log 'err' as a zero-padded 64-bit quantity. Subsequent patches will pass the ESR_ELx value to die() for a number of exceptions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	arm64: report EL1 UNDEFs better	Mark Rutland	1	-1/+3
	commit b502c87d2a26c349acbc231ff2acd6f17147926b upstream. If an UNDEFINED exception is taken from EL1, and do_undefinstr() doesn't find any suitable undef_hook, it will call: BUG_ON(!user_mode(regs)) ... and the kernel will report a failure witin do_undefinstr() rather than reporting the original context that the UNDEFINED exception was taken from. The pt_regs and ESR value reported within the BUG() handler will be from within do_undefinstr() and the code dump will be for the BRK in BUG_ON(), which isn't sufficient to debug the cause of the original exception. This patch makes the reporting better by having do_undefinstr() call die() directly in this case to report the original context from which the UNDEFINED exception was taken. Prior to this patch, an undefined instruction is reported as: \| kernel BUG at arch/arm64/kernel/traps.c:497! \| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc3-00127-geff044f1b04e-dirty #3 \| Hardware name: linux,dummy-virt (DT) \| pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : do_undefinstr+0x28c/0x2ac \| lr : do_undefinstr+0x298/0x2ac \| sp : ffff800009f63bc0 \| x29: ffff800009f63bc0 x28: ffff800009f73c00 x27: ffff800009644a70 \| x26: ffff8000096778a8 x25: 0000000000000040 x24: 0000000000000000 \| x23: 00000000800000c5 x22: ffff800009894060 x21: ffff800009f63d90 \| x20: 0000000000000000 x19: ffff800009f63c40 x18: 0000000000000006 \| x17: 0000000000403000 x16: 00000000bfbfd000 x15: ffff800009f63830 \| x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000019 \| x11: 0101010101010101 x10: 0000000000161b98 x9 : 0000000000000000 \| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 \| x5 : ffff800009f761d0 x4 : 0000000000000000 x3 : ffff80000a2b80f8 \| x2 : 0000000000000000 x1 : ffff800009f73c00 x0 : 00000000800000c5 \| Call trace: \| do_undefinstr+0x28c/0x2ac \| el1_undef+0x2c/0x4c \| el1h_64_sync_handler+0x84/0xd0 \| el1h_64_sync+0x64/0x68 \| setup_arch+0x550/0x598 \| start_kernel+0x88/0x6ac \| __primary_switched+0xb8/0xc0 \| Code: 17ffff95 a9425bf5 17ffffb8 a9025bf5 (d4210000) With this patch applied, an undefined instruction is reported as: \| Internal error: Oops - Undefined instruction: 0 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc3-00128-gf27cfcc80e52-dirty #5 \| Hardware name: linux,dummy-virt (DT) \| pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : setup_arch+0x550/0x598 \| lr : setup_arch+0x50c/0x598 \| sp : ffff800009f63d90 \| x29: ffff800009f63d90 x28: 0000000081000200 x27: ffff800009644a70 \| x26: ffff8000096778c8 x25: 0000000000000040 x24: 0000000000000000 \| x23: 0000000000000100 x22: ffff800009f69a58 x21: ffff80000a2b80b8 \| x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000006 \| x17: 0000000000403000 x16: 00000000bfbfd000 x15: ffff800009f63830 \| x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000019 \| x11: 0101010101010101 x10: 0000000000161b98 x9 : 0000000000000000 \| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 \| x5 : 0000000000000008 x4 : 0000000000000010 x3 : 0000000000000000 \| x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 \| Call trace: \| setup_arch+0x550/0x598 \| start_kernel+0x88/0x6ac \| __primary_switched+0xb8/0xc0 \| Code: b4000080 90ffed80 912ac000 97db745f (00000000) Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220913101732.3925290-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	x86/alternatives: Disable KASAN in apply_alternatives()	Kirill A. Shutemov	1	-0/+13
	commit d35652a5fc9944784f6f50a5c979518ff8dacf61 upstream. Fei has reported that KASAN triggers during apply_alternatives() on a 5-level paging machine: BUG: KASAN: out-of-bounds in rcu_is_watching() Read of size 4 at addr ff110003ee6419a0 by task swapper/0/0 ... __asan_load4() rcu_is_watching() trace_hardirqs_on() text_poke_early() apply_alternatives() ... On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57) gets patched. It includes KASAN code, where KASAN_SHADOW_START depends on __VIRTUAL_MASK_SHIFT, which is defined with cpu_feature_enabled(). KASAN gets confused when apply_alternatives() patches the KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START static, by replacing __VIRTUAL_MASK_SHIFT with 56, works around the issue. Fix it for real by disabling KASAN while the kernel is patching alternatives. [ mingo: updated the changelog ] Fixes: 6657fca06e3f ("x86/mm: Allow to boot without LA57 if CONFIG_X86_5LEVEL=y") Reported-by: Fei Yang <fei.yang@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231012100424.1456-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	powerpc/64e: Fix wrong test in __ptep_test_and_clear_young()	Christophe Leroy	1	-1/+1
	[ Upstream commit 5ea0bbaa32e8f54e9a57cfee4a3b8769b80be0d2 ] Commit 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.") replaced: if ((pte_val(ptep) & (_PAGE_ACCESSED \| _PAGE_HASHPTE)) == 0) return 0; By: if (pte_young(ptep)) return 0; But it should be: if (!pte_young(*ptep)) return 0; Fix it. Fixes: 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/8bb7f06494e21adada724ede47a4c3d97e879d40.1695659959.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-25	powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE	Christophe Leroy	2	-0/+9
	[ Upstream commit 5d9cea8a552ee122e21fbd5a3c5d4eb85f648e06 ] On 8xx, PAGE_NONE is handled by setting _PAGE_NA instead of clearing _PAGE_USER. But then pte_user() returns 1 also for PAGE_NONE. As _PAGE_NA prevent reads, add a specific version of pte_read() that returns 0 when _PAGE_NA is set instead of always returning 1. Fixes: 351750331fc1 ("powerpc/mm: Introduce _PAGE_NA") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/57bcfbe578e43123f9ed73e040229b80f1ad56ec.1695659959.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-25	x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs	Borislav Petkov (AMD)	2	-2/+15
	commit f454b18e07f518bcd0c05af17a2239138bff52de upstream. Fix erratum #1485 on Zen4 parts where running with STIBP disabled can cause an #UD exception. The performance impact of the fix is negligible. Reported-by: René Rebe <rene@exactcode.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: René Rebe <rene@exactcode.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-25	riscv, bpf: Sign-extend return values	Björn Töpel	1	-2/+3
	[ Upstream commit 2f1b0d3d733169eb11680bfa97c266ae5e757148 ] The RISC-V architecture does not expose sub-registers, and hold all 32-bit values in a sign-extended format [1] [2]: \| The compiler and calling convention maintain an invariant that all \| 32-bit values are held in a sign-extended format in 64-bit \| registers. Even 32-bit unsigned integers extend bit 31 into bits \| 63 through 32. Consequently, conversion between unsigned and \| signed 32-bit integers is a no-op, as is conversion from a signed \| 32-bit integer to a signed 64-bit integer. While BPF, on the other hand, exposes sub-registers, and use zero-extension (similar to arm64/x86). This has led to some subtle bugs, where a BPF JITted program has not sign-extended the a0 register (return value in RISC-V land), passed the return value up the kernel, e.g.: \| int from_bpf(void); \| \| long foo(void) \| { \| return from_bpf(); \| } Here, a0 would be 0xffff_ffff, instead of the expected 0xffff_ffff_ffff_ffff. Internally, the RISC-V JIT uses a5 as a dedicated register for BPF return values. Keep a5 zero-extended, but explicitly sign-extend a0 (which is used outside BPF land). Now that a0 (RISC-V ABI) and a5 (BPF ABI) differs, a0 is only moved to a5 for non-BPF native calls (BPF_PSEUDO_CALL). Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-056b6ff-2023-10-02/unpriv-isa-asciidoc.pdf # [2] Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20230929-e5c800e661a53efe3c2678d71a306323b60eb13b/riscv-abi.pdf # [2] Link: https://lore.kernel.org/bpf/20231004120706.52848-2-bjorn@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-25	riscv, bpf: Factor out emit_call for kernel and bpf context	Pu Lehui	1	-17/+13
	[ Upstream commit 0fd1fd0104954380477353aea29c347e85dff16d ] The current emit_call function is not suitable for kernel function call as it store return value to bpf R0 register. We can separate it out for common use. Meanwhile, simplify judgment logic, that is, fixed function address can use jal or auipc+jalr, while the unfixed can use only auipc+jalr. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-3-pulehui@huaweicloud.com Stable-dep-of: 2f1b0d3d7331 ("riscv, bpf: Sign-extend return values") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	parisc: Restore __ldcw_align for PA-RISC 2.0 processors	John David Anglin	2	-21/+20
	commit 914988e099fc658436fbd7b8f240160c352b6552 upstream. Back in 2005, Kyle McMartin removed the 16-byte alignment for ldcw semaphores on PA 2.0 machines (CONFIG_PA20). This broke spinlocks on pre PA8800 processors. The main symptom was random faults in mmap'd memory (e.g., gcc compilations, etc). Unfortunately, the errata for this ldcw change is lost. The issue is the 16-byte alignment required for ldcw semaphore instructions can only be reduced to natural alignment when the ldcw operation can be handled coherently in cache. Only PA8800 and PA8900 processors actually support doing the operation in cache. Aligning the spinlock dynamically adds two integer instructions to each spinlock. Tested on rp3440, c8000 and a500. Signed-off-by: John David Anglin <dave.anglin@bell.net> Link: https://lore.kernel.org/linux-parisc/6b332788-2227-127f-ba6d-55e99ecf4ed8@bell.net/T/#t Link: https://lore.kernel.org/linux-parisc/20050609050702.GB4641@roadwarrior.mcmartin.ca/ Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-10	arm64: Add Cortex-A520 CPU part definition	Rob Herring	1	-0/+2
	commit a654a69b9f9c06b2e56387d0b99f0e3e6b0ff4ef upstream. Add the CPU Part number for the new Arm design. Cc: stable@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230921194156.1050055-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-10	powerpc/watchpoints: Disable preemption in thread_change_pc()	Benjamin Gray	1	-1/+6
	[ Upstream commit cc879ab3ce39bc39f9b1d238b283f43a5f6f957d ] thread_change_pc() uses CPU local data, so must be protected from swapping CPUs while it is reading the breakpoint struct. The error is more noticeable after 1e60f3564bad ("powerpc/watchpoints: Track perf single step directly on the breakpoint"), which added an unconditional __this_cpu_read() call in thread_change_pc(). However the existing __this_cpu_read() that runs if a breakpoint does need to be re-inserted has the same issue. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230829063457.54157-2-bgray@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	parisc: irq: Make irq_stack_union static to avoid sparse warning	Helge Deller	1	-1/+1
	[ Upstream commit b1bef1388c427cdad7331a9c8eb4ebbbe5b954b0 ] Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	parisc: drivers: Fix sparse warning	Helge Deller	1	-1/+1
	[ Upstream commit b137b9d60b8add5620a06c687a71ce18776730b0 ] Fix "warning: directive in macro's argument list" warning. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	parisc: sba: Fix compile warning wrt list of SBA devices	Helge Deller	1	-0/+3
	[ Upstream commit eb3255ee8f6f4691471a28fbf22db5e8901116cd ] Fix this makecheck warning: drivers/parisc/sba_iommu.c:98:19: warning: symbol 'sba_list' was not declared. Should it be static? Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	xtensa: boot/lib: fix function prototypes	Max Filippov	1	-2/+3
	[ Upstream commit f54d02c8f2cc4b46ba2a3bd8252a6750453b6f2b ] Add function prototype for gunzip() to the boot library code and make exit() and zalloc() static. arch/xtensa/boot/lib/zmem.c:8:6: warning: no previous prototype for 'exit' [-Wmissing-prototypes] 8 \| void exit (void) arch/xtensa/boot/lib/zmem.c:13:7: warning: no previous prototype for 'zalloc' [-Wmissing-prototypes] 13 \| void zalloc(unsigned size) arch/xtensa/boot/lib/zmem.c:35:6: warning: no previous prototype for 'gunzip' [-Wmissing-prototypes] 35 \| void gunzip (void dst, int dstlen, unsigned char src, int lenp) Fixes: 4bedea945451 ("xtensa: Architecture support for Tensilica Xtensa Part 2") Fixes: e7d163f76665 ("xtensa: Removed local copy of zlib and fixed O= support") Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	xtensa: boot: don't add include-dirs	Randy Dunlap	1	-2/+1
	[ Upstream commit 54d3d7d363823782c3444ddc41bb8cf1edc80514 ] Drop the -I<include-dir> options to prevent build warnings since there is not boot/include directory: cc1: warning: arch/xtensa/boot/include: No such file or directory [-Wmissing-include-dirs] Fixes: 437374e9a950 ("restore arch/{ppc/xtensa}/boot cflags") Fixes: 4bedea945451 ("xtensa: Architecture support for Tensilica Xtensa Part 2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Message-Id: <20230920052139.10570-15-rdunlap@infradead.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	xtensa: iss/network: make functions static	Randy Dunlap	1	-2/+2
	[ Upstream commit 1b59efeb59851277266318f4e0132aa61ce3455e ] Make 2 functions static to prevent build warnings: arch/xtensa/platforms/iss/network.c:204:16: warning: no previous prototype for 'tuntap_protocol' [-Wmissing-prototypes] 204 \| unsigned short tuntap_protocol(struct sk_buff skb) arch/xtensa/platforms/iss/network.c:444:6: warning: no previous prototype for 'iss_net_user_timer_expire' [-Wmissing-prototypes] 444 \| void iss_net_user_timer_expire(struct timer_list unused) Fixes: 7282bee78798 ("xtensa: Architecture support for Tensilica Xtensa Part 8") Fixes: d8479a21a98b ("xtensa: Convert timers to use timer_setup()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Message-Id: <20230920052139.10570-14-rdunlap@infradead.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	xtensa: add default definition for XCHAL_HAVE_DIV32	Max Filippov	1	-0/+4
	[ Upstream commit 494e87ffa0159b3f879694a9231089707792a44d ] When variant FSF is set, XCHAL_HAVE_DIV32 is not defined. Add default definition for that macro to prevent build warnings: arch/xtensa/lib/divsi3.S:9:5: warning: "XCHAL_HAVE_DIV32" is not defined, evaluates to 0 [-Wundef] 9 \| #if XCHAL_HAVE_DIV32 arch/xtensa/lib/modsi3.S:9:5: warning: "XCHAL_HAVE_DIV32" is not defined, evaluates to 0 [-Wundef] 9 \| #if XCHAL_HAVE_DIV32 Fixes: 173d6681380a ("xtensa: remove extra header files") Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: lore.kernel.org/r/202309150556.t0yCdv3g-lkp@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot	Tony Lindgren	1	-2/+2
	[ Upstream commit ac08bda1569b06b7a62c7b4dd00d4c3b28ceaaec ] Commit 0840242e8875 ("ARM: dts: Configure clock parent for pwm vibra") attempted to fix the PWM settings but ended up causin an additional clock reparenting error: clk: failed to reparent abe-clkctrl:0060:24 to sys_clkin_ck: -22 Only timer9 is in the PER domain and can use the sys_clkin_ck clock source. For timer8, the there is no sys_clkin_ck available as it's in the ABE domain, instead it should use syc_clk_div_ck. However, for power management, we want to use the always on sys_32k_ck instead. Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Fixes: 0840242e8875 ("ARM: dts: Configure clock parent for pwm vibra") Depends-on: 61978617e905 ("ARM: dts: Add minimal support for Droid Bionic xt875") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: Unify pwm-omap-dmtimer node names	Tony Lindgren	6	-7/+8
	[ Upstream commit 4f15fc7c0f28ffcd6e9a56396db6edcdfa4c9925 ] There is no reg property for pwm-omap-dmtimer. Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Stable-dep-of: ac08bda1569b ("ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: am335x: Guardian: Update beeper label	Gireesh Hiremath	1	-4/+4
	[ Upstream commit b5bf6b434575d32aeaa70c82ec84b3cec92e2973 ] * Update lable pwm to guardian beeper Signed-off-by: Gireesh Hiremath <Gireesh.Hiremath@in.bosch.com> Message-Id: <20220325100613.1494-8-Gireesh.Hiremath@in.bosch.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Stable-dep-of: ac08bda1569b ("ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: motorola-mapphone: Drop second ti,wlcore compatible value	Geert Uytterhoeven	1	-1/+1
	[ Upstream commit 7ebe6e99f7702dad342486e5b30d989a0a6499af ] The TI wlcore DT bindings specify using a single compatible value for each variant, and the Linux kernel driver matches against the first compatible value since commit 078b30da3f074f2e ("wlcore: add wl1285 compatible") in v4.13. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Stable-dep-of: ac08bda1569b ("ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: motorola-mapphone: Add 1.2GHz OPP	Carl Philipp Klemm	1	-0/+15
	[ Upstream commit 19e367147ea8864dff1fb153cfab6d8e8da10324 ] The omap4430 HS HIGH performance devces support 1.2GHz opp, lower speed variants do not. However for mapphone devices Motorola seems to have decided that this does not really matter for the SoC variants they have tested to use, and decided to clock all devices, including the ones with STANDARD performance chips at 1.2GHz upon release of the 3.0.8 vendor kernel shiped with Android 4.0. Therefore it seems safe to do the same, but let's only do it for Motorola devices as the others have not been tested. Note that we prevent overheating with the passive cooling device cpu_alert0 configured in the dts file that starts lowering the speed as needed. This also removes the "failed to find current OPP for freq 1200000000" warning. Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Carl Philipp Klemm <philipp@uvos.xyz> [tony@atomide.com: made motorola specific, updated comments] Signed-off-by: Tony Lindgren <tony@atomide.com> Stable-dep-of: ac08bda1569b ("ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: motorola-mapphone: Configure lower temperature passive cooling	Tony Lindgren	1	-0/+8
	[ Upstream commit 5c3db2d4d4ed747e714387362afe007e6ae5e2d3 ] The current cooling device temperature is too high at 100C as we have a battery on the device right next to the SoC as pointed out by Carl Philipp Klemm <philipp@uvos.xyz>. Let's configure the max temperature to 80C. As we only have a tshut interrupt and no talert interrupt on 4430, we have a passive cooling device configured for 4430. However, we want the poll interval to be 10 seconds instead of 1 second for power management. The value of 10 seconds seems like plenty of time to notice the temperature increase above the 75C temperatures. Having the bandgap temperature change seems to take several tens of seconds because of heat dissipation above 75C range as monitored with a full CPU load. Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Suggested-by: Carl Philipp Klemm <philipp@uvos.xyz> Signed-off-by: Tony Lindgren <tony@atomide.com> Stable-dep-of: ac08bda1569b ("ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4	Tony Lindgren	4	-3/+7
	[ Upstream commit 6469b2feade8fd82d224dd3734e146536f3e9f0e ] Fix "thermal_sys: cpu_thermal: Failed to read thermal-sensors cells: -2" error on boot for omap3/4. This is caused by wrong addressing in the dts for bandgap sensor for single sensor instances. Note that omap4-cpu-thermal.dtsi is shared across omap4/5 and dra7, so we can't just change the addressing in omap4-cpu-thermal.dtsi. Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Fixes: a761d517bbb1 ("ARM: dts: omap3: Add cpu_thermal zone") Fixes: 0bbf6c54d100 ("arm: dts: add omap4 CPU thermal data") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	ARM: dts: omap: correct indentation	Krzysztof Kozlowski	9	-101/+101
	[ Upstream commit 8ae9c7a69fa14e95d032e64d8d758e3f85bee132 ] Do not use spaces for indentation. Link: https://lore.kernel.org/r/20221002092002.68880-1-krzysztof.kozlowski@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Stable-dep-of: 6469b2feade8 ("ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled	Christoph Hellwig	3	-0/+14
	[ Upstream commit ef8f8f04a0b25e8f294b24350e8463a8d6a9ba0b ] While commit d4a5c59a955b ("mmc: au1xmmc: force non-modular build and remove symbol_get usage") to be built in, it can still build a kernel without MMC support and thuse no mmc_detect_change symbol at all. Add ifdefs to build the mmc support code in the alchemy arch code conditional on mmc support. Fixes: d4a5c59a955b ("mmc: au1xmmc: force non-modular build and remove symbol_get usage") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	x86/srso: Fix SBPB enablement for spec_rstack_overflow=off	Josh Poimboeuf	1	-1/+1
	[ Upstream commit 01b057b2f4cc2d905a0bd92195657dbd9a7005ab ] If the user has requested no SRSO mitigation, other mitigations can use the lighter-weight SBPB instead of IBPB. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/b20820c3cfd1003171135ec8d762a0b957348497.1693889988.git.jpoimboe@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	x86/srso: Fix srso_show_state() side effect	Josh Poimboeuf	1	-1/+1
	[ Upstream commit a8cf700c17d9ca6cb8ee7dc5c9330dbac3948237 ] Reading the 'spec_rstack_overflow' sysfs file can trigger an unnecessary MSR write, and possibly even a (handled) exception if the microcode hasn't been updated. Avoid all that by just checking X86_FEATURE_IBPB_BRTYPE instead, which gets set by srso_select_mitigation() if the updated microcode exists. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/27d128899cb8aee9eb2b57ddc996742b0c1d776b.1693889988.git.jpoimboe@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	powerpc/perf/hv-24x7: Update domain value check	Kajol Jain	1	-1/+1
	[ Upstream commit 4ff3ba4db5943cac1045e3e4a3c0463ea10f6930 ] Valid domain value is in range 1 to HV_PERF_DOMAIN_MAX. Current code has check for domain value greater than or equal to HV_PERF_DOMAIN_MAX. But the check for domain value 0 is missing. Fix this issue by adding check for domain value 0. Before: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Error: The sys_perf_event_open() syscall returned with 5 (Input/output error) for event (hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/). /bin/dmesg \| grep -i perf may provide additional information. Result from dmesg: [ 37.819387] hv-24x7: hcall failed: [0 0x60040000 0x100 0] => ret 0xfffffffffffffffc (-4) detail=0x2000000 failing ix=0 After: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Warning: hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ event is not supported by the kernel. failed to read counter hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ Fixes: ebd4a5a3ebd9 ("powerpc/perf/hv-24x7: Minor improvements") Reported-by: Krishan Gopal Sarawast <krishang@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230825055601.360083-1-kjain@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23	x86/boot/compressed: Reserve more memory for page tables	Kirill A. Shutemov	2	-14/+39
	[ Upstream commit f530ee95b72e77b09c141c4b1a4b94d1199ffbd9 ] The decompressor has a hard limit on the number of page tables it can allocate. This limit is defined at compile-time and will cause boot failure if it is reached. The kernel is very strict and calculates the limit precisely for the worst-case scenario based on the current configuration. However, it is easy to forget to adjust the limit when a new use-case arises. The worst-case scenario is rarely encountered during sanity checks. In the case of enabling 5-level paging, a use-case was overlooked. The limit needs to be increased by one to accommodate the additional level. This oversight went unnoticed until Aaron attempted to run the kernel via kexec with 5-level paging and unaccepted memory enabled. Update wost-case calculations to include 5-level paging. To address this issue, let's allocate some extra space for page tables. 128K should be sufficient for any use-case. The logic can be simplified by using a single value for all kernel configurations. [ Also add a warning, should this memory run low - by Dave Hansen. ] Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage") Reported-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>