summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2025-12-14x86/boot/e820: Simplify the e820__range_remove() APIIngo Molnar4-14/+11
Right now e820__range_remove() has two parameters to control the E820 type of the range removed: extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type); Since E820 types start at 1, zero has a natural meaning of 'no type. Consolidate the (old_type,check_type) parameters into a single (filter_type) parameter: extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type); Note that both e820__mapped_raw_any() and e820__mapped_any() already have such semantics for their 'type' parameter, although it's currently not used with '0' by in-kernel code. Also, the __e820__mapped_all() internal helper already has such semantics implemented as well, and the e820__get_entry_type() API uses the '0' type to such effect. This simplifies not just e820__range_remove(), and synchronizes its use of type filters with other E820 API functions, but simplifies usage sites as well, such as parse_memmap_one(), beyond the reduction of the number of parameters: - else if (from) - e820__range_remove(start_at, mem_size, from, 1); else - e820__range_remove(start_at, mem_size, 0, 0); + e820__range_remove(start_at, mem_size, from); The generated code gets smaller as well: add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66) Function old new delta parse_memopt 112 107 -5 efi_init 1048 1039 -9 setup_arch 2719 2709 -10 e820__range_remove 283 273 -10 parse_memmap_opt 559 527 -32 Total: Before=22,675,600, After=22,675,534, chg -0.00% Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-28-mingo@kernel.org
2025-12-14x86/boot/e820: Remove e820__range_remove()'s unused return parameterIngo Molnar2-8/+2
None of the usage sites make use of the 'real_removed_size' return parameter of e820__range_remove(), and it's hard to contemplate much constructive use: E820 maps can have holes, and removing a fixed range may result in removal of any number of bytes from 0 to the requested size. So remove this pointless calculation. This simplifies the function a bit: text data bss dec hex filename 7645 44072 0 51717 ca05 e820.o.before 7597 44072 0 51669 c9d5 e820.o.after Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-27-mingo@kernel.org
2025-12-14x86/boot/e820: Simplify append_e820_table() and remove restriction on ↵Ingo Molnar1-24/+11
single-entry tables So append_e820_table() begins with this weird condition that checks 'nr_entries': static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entries) { /* Only one memory region (or negative)? Ignore it */ if (nr_entries < 2) return -1; Firstly, 'nr_entries' has been an u32 since 2017 and cannot be negative. Secondly, there's nothing inherently wrong with single-entry E820 maps, especially in virtualized environments. So remove this restriction and remove the __append_e820_table() indirection. Also: - fix/update comments - remove obsolete comments This shrinks the generated code a bit as well: text data bss dec hex filename 7549 44072 0 51621 c9a5 e820.o.before 7533 44072 0 51605 c995 e820.o.after Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-26-mingo@kernel.org
2025-12-14x86/boot/e820: Standardize __init/__initdata tag placementIngo Molnar1-46/+46
So the e820.c file has a hodgepodge of __init and __initdata tag placements: static int __init e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size) __init void e820__setup_pci_gap(void) __init void e820__reallocate_tables(void) void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len) void __init e820__register_nosave_regions(unsigned long limit_pfn) static int __init e820__register_nvs_regions(void) u64 __init e820__memblock_alloc_reserved(u64 size, u64 align) Standardize on the style used by e820__setup_pci_gap() and place them before the storage class. In addition to the consistency, as a bonus this makes the grep output rather clean looking: __init void e820__range_remove(u64 start, u64 size, enum e820_type filter_type) __init void e820__update_table_print(void) __init static void e820__update_table_kexec(void) __init static int e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size) __init void e820__setup_pci_gap(void) __init void e820__reallocate_tables(void) __init void e820__memory_setup_extended(u64 phys_addr, u32 data_len) __init void e820__register_nosave_regions(unsigned long limit_pfn) __init static int e820__register_nvs_regions(void) ... and if one learns to just ignore the leftmost '__init' noise then the rest of the line looks just like a regular C function definition. With the 'mixed' tag placement style the __init tag breaks up the function's prototype for no good reason. Do the same for __initdata. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-25-mingo@kernel.org
2025-12-14x86/boot/e820: Simplify & clarify __e820__range_add() a bitIngo Molnar1-4/+7
Use 'entry_new' to make clear we are allocating a new entry. Change the table-full message to say that the table is full. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-24-mingo@kernel.org
2025-12-14x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in ↵Ingo Molnar1-11/+11
e820_search_gap() et al The PCI gap searching functions pass around pointers to the gap_start/gap_size variables, which refer to the maximum size gap found so far. Rename the variables to say so, and disambiguate their namespace from 'current gap' variables. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-23-mingo@kernel.org
2025-12-14x86/boot/e820: Change e820_search_gap() to search for the highest-address ↵Ingo Molnar1-1/+1
PCI gap Right now the main x86 function that determines the position and size of the 'PCI gap', e820_search_gap(), has this curious property: while (--idx >= 0) { ... if (gap >= *gap_size) { I.e. it will iterate the E820 table backwards, from its end to the beginning, and will search for larger and larger gaps in the memory map below 4GB, until it finishes with the table. This logic will, should there be two gaps with the same size, pick the one with the lower physical address - which is contrary to usual practice that the PCI gap is just below 4GB. Furthermore, the commit that introduced this weird logic 16 years ago: 3381959da5a0 ("x86: cleanup e820_setup_gap(), add e820_search_gap(), v2") - if (gap > gapsize) { + if (gap >= *gapsize) { didn't even declare this change, the title says it's a cleanup, and the changelog declares it as a preparatory refactoring for a later bugfix: 809d9a8f93bd ("x86/PCI: ACPI based PCI gap calculation") which bugfix was reverted only 1 day later without much of an explanation, and was never reintroduced: 58b6e5538460 ("Revert "x86/PCI: ACPI based PCI gap calculation"") So based on the Git archeology and by the plain reading of the code I declare this '>=' change an unintended bug and side effect. Change it to '>' again. It should not make much of a difference in practice, as the likelihood of having *two* largest gaps with exactly the same size are very low outside of weird user-provided memory maps. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-22-mingo@kernel.org
2025-12-14x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bitIngo Molnar1-11/+11
Apply misc cleanups: - Use a bit more readable variable names, we haven't run out of underscore characters in the kernel yet. - s/0x400000/SZ_4M - s/1024*1024/SZ_1M Suggested-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-21-mingo@kernel.org
2025-12-14x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32Ingo Molnar1-1/+1
__u32 is for UAPI headers, and this definition is the only place in the kernel-internal E820 code that uses __u32. Change it to u32. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-20-mingo@kernel.org
2025-12-14x86/boot/e820: Standardize e820 table index variable types under 'u32'Ingo Molnar1-12/+12
So we have 'idx' types of 'int' and 'unsigned int', and sometimes we assign 'u32' fields such as e820_table::nr_entries to these 'int' values. While there's no real risk of overflow with these tables, make it all cleaner by standardizing on a single type: u32. This also happens to shrink the code a bit: text data bss dec hex filename 7745 44072 0 51817 ca69 e820.o.before 7613 44072 0 51685 c9e5 e820.o.after Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-19-mingo@kernel.org
2025-12-14x86/boot/e820: Standardize e820 table index variable names under 'idx'Ingo Molnar1-57/+57
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-18-mingo@kernel.org
2025-12-14x86/boot/e820: Remove unnecessary header inclusionsIngo Molnar1-2/+0
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-17-mingo@kernel.org
2025-12-14x86/boot/e820: Clean up __refdata use a bitIngo Molnar1-3/+3
So __refdata, like __init, is more of a storage class specifier, so move the attribute in front of the type, not after the variable name. This also aligns it vertically. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-16-mingo@kernel.org
2025-12-14x86/boot/e820: Clean up __e820__range_add() a bitIngo Molnar1-7/+8
- Use 'idx' index variable instead of a weird 'x' - Make the error message E820-specific - Group the code a bit better Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-15-mingo@kernel.org
2025-12-14x86/boot/e820: Improve e820_print_type() messagesIngo Molnar1-8/+8
For E820_TYPE_RESERVED, print: 'reserved' -> 'device reserved' For E820_TYPE_PRAM and E820_TYPE_PMEM: 'persistent' -> 'persistent RAM' Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-14-mingo@kernel.org
2025-12-14x86/boot/e820: Clean up confusing and self-contradictory verbiage around ↵Ingo Molnar1-18/+37
E820 related resource allocations So the E820 code has a rather confusing area of code at around e820__reserve_resources(), which is, by its plain reading, rather self-contradictory. For example, the comment explaining e820__reserve_resources() claims: - '* Mark E820 reserved areas as busy for the resource manager' By 'E820 reserved areas' one can naively conclude that it's talking about E820_TYPE_RESERVED areas - while those areas are treated in exactly the opposite fashion by do_mark_busy(): switch (type) { case E820_TYPE_RESERVED: case E820_TYPE_SOFT_RESERVED: case E820_TYPE_PRAM: case E820_TYPE_PMEM: return false; Ie. E820_TYPE_RESERVED areas are *not* marked busy for the resource manager, because E820_TYPE_RESERVED areas are device regions that might eventually be claimed by a device driver. This type of confusion permeates this whole area of code, making it exceedingly difficult to read (for me at least). So untangle it bit by bit: - Instead of talking about ambiguous 'reserved areas', talk about 'E820 device address regions' instead, and 'register'/'lock' them. - The do_mark_busy() function is a misnomer as well, because despite its name it 'does' nothing - it only determines what type of resource handling an E820 type should receive from the kernel. Rename it to e820_device_region() and negate its meaning, to avoid the 'busy/reserved' confusion. Because that's what this code is really about: filtering out device regions such as E820_TYPE_RESERVED, E820_TYPE_PRAM, E820_TYPE_PMEM, etc., and allowing them to be claimed by device drivers later on. - All other E820 regions (system regions) are registered and locked early on, before the PCI resource manager does its search for device BAR addresses, etc. Also fix this somewhat misleading comment: /* * Try to bump up RAM regions to reasonable boundaries, to * avoid stolen RAM: */ and explain that here we register artificial 'gap' resources at the end of suspiciously sized RAM regions, as heuristics to try to avoid buggy firmware with undeclared 'stolen RAM' regions: /* * Create additional 'gaps' at the end of RAM regions, * rounding them up to 64k/1MB/64MB boundaries, should * they be weirdly sized, and register extra, locked * resource regions for them, to make sure drivers * won't claim those addresses. * * These are basically blind guesses and heuristics to * avoid resource conflicts with broken firmware that * doesn't properly list 'stolen RAM' as a system region * in the E820 map. */ Also improve the printout of this extra resource a bit: make the message more unambiguous, and upgrade it from pr_debug() (where very few people will see it), to pr_info() (where it will make it into the syslog on default distro configs). Also fix spelling and improve comment placement. No change in functionality intended. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-13-mingo@kernel.org
2025-12-14x86/boot/e820: Remove pointless early_panic() indirectionIngo Molnar1-7/+1
early_panic() is a pointless wrapper around panic(): static void __init early_panic(char *msg) { early_printk(msg); panic(msg); } panic() will already do a printk() of 'msg', and an early_printk() if earlyprintk is enabled. There's no need to print it separately. Remove the function. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-12-mingo@kernel.org
2025-12-14x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long'Ingo Molnar1-5/+5
There's a number of structure fields and local variables related to E820 entry physical addresses that are defined as 'unsigned long long', but then are compared to u64 fields. Make the types all consistently u64. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-11-mingo@kernel.org
2025-12-14x86/boot/e820: Call the PCI gap a 'gap' in the boot log printoutIngo Molnar1-1/+1
It is a bit weird and inconsistent that the PCI gap is advertised during bootup as 'mem'ory: [mem 0xc0000000-0xfed1bfff] available for PCI devices ^^^ It's not really memory, it's a gap that PCI devices can decode and use and they often do not map it to any memory themselves. So advertise it for what it is, a gap: [gap 0xc0000000-0xfed1bfff] available for PCI devices Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-10-mingo@kernel.org
2025-12-14x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entriesIngo Molnar1-1/+1
So it is a bit weird that the actual RAM entries of the E820 table are not actually called RAM, but 'usable': BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable 'usable' is pretty passive-aggressive in that context and ambiguous, most E820 entries denote 'usable' address ranges - reserved ranges may be used by devices, or the platform. Clarify and disambiguate this by making the boot log entry explicitly say 'System RAM', like in /proc/iomem: BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB System RAM Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-9-mingo@kernel.org
2025-12-14x86/boot/e820: Make the field separator space character part of ↵Ingo Molnar1-11/+11
e820_print_type() We are going to add more columns to the E820 table printout, so make e820_print_type()'s field separator (space character) part of the function itself. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-7-mingo@kernel.org
2025-12-14x86/boot/e820: Print gaps in the E820 tableIngo Molnar1-4/+18
Gaps in the E820 table are not obvious at a glance and can easily be overlooked. Print out gaps in the E820 table: Before: BIOS-provided physical RAM map: BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved After: BIOS-provided physical RAM map: BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved Also warn about badly ordered E820 table entries: BUG: out of order E820 entry! ( this is printed before the entry is printed, so there's no need to print any additional data with the warning. ) Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-6-mingo@kernel.org
2025-12-14x86/boot/e820: Mark e820__print_table() staticIngo Molnar2-2/+1
There are no external users of this function left. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-5-mingo@kernel.org
2025-12-14x86/boot/e820: Simplify the PPro Erratum #50 workaroundIngo Molnar1-4/+2
No need to print out the table - users won't really be able to tell much from it anyway and the messages around this erratum are unnecessarily obtuse. Instead clearly inform the user that a 256 kB hole is being punched in their memory map at the 1.75 GB physical address. Not that there are many PPro users left. :-) Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-4-mingo@kernel.org
2025-12-14x86/boot/e820: Simplify e820__print_table() a bitIngo Molnar1-3/+5
Introduce 'entry' for the current table entry and shorten repetitious use of e820_table->entries[i]. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-3-mingo@kernel.org
2025-12-14x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() ↵Ingo Molnar1-6/+10
function name, rename it to e820_type_mergeable() It's a bad practice to put inverted logic into function names, flip it back and rename it to e820_type_mergeable(). Add/update a few comments about this function while at it. Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://patch.msgid.link/20250515120549.2820541-2-mingo@kernel.org
2025-12-14x86/acpi/boot: Correct acpi_is_processor_usable() check againYazen Ghannam2-19/+8
ACPI v6.3 defined a new "Online Capable" MADT LAPIC flag. This bit is used in conjunction with the "Enabled" MADT LAPIC flag to determine if a CPU can be enabled/hotplugged by the OS after boot. Before the new bit was defined, the "Enabled" bit was explicitly described like this (ACPI v6.0 wording provided): "If zero, this processor is unusable, and the operating system support will not attempt to use it" This means that CPU hotplug (based on MADT) is not possible. Many BIOS implementations follow this guidance. They may include LAPIC entries in MADT for unavailable CPUs, but since these entries are marked with "Enabled=0" it is expected that the OS will completely ignore these entries. However, QEMU will do the same (include entries with "Enabled=0") for the purpose of allowing CPU hotplug within the guest. Comment from QEMU function pc_madt_cpu_entry(): /* ACPI spec says that LAPIC entry for non present * CPU may be omitted from MADT or it must be marked * as disabled. However omitting non present CPU from * MADT breaks hotplug on linux. So possible CPUs * should be put in MADT but kept disabled. */ Recent Linux topology changes broke the QEMU use case. A following fix for the QEMU use case broke bare metal topology enumeration. Rework the Linux MADT LAPIC flags check to allow the QEMU use case only for guests and to maintain the ACPI spec behavior for bare metal. Remove an unnecessary check added to fix a bare metal case introduced by the QEMU "fix". [ bp: Change logic as Michal suggested. ] [ mingo: Removed misapplied -stable tag. ] Fixes: fed8d8773b8e ("x86/acpi/boot: Correct acpi_is_processor_usable() check") Fixes: f0551af02130 ("x86/topology: Ignore non-present APIC IDs in a present package") Closes: https://lore.kernel.org/r/20251024204658.3da9bf3f.michal.pecio@gmail.com Reported-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Michal Pecio <michal.pecio@gmail.com> Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Link: https://lore.kernel.org/20251111145357.4031846-1-yazen.ghannam@amd.com Cc: stable@vger.kernel.org
2025-12-14x86/platform/uv: Fix UBSAN array-index-out-of-boundsKyle Meyer1-1/+1
When UBSAN is enabled, multiple array-index-out-of-bounds messages are printed: [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:276:23 [ 0.000000] [ T0] index 1 is out of range for type '<unknown> [1]' ... [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:277:32 [ 0.000000] [ T0] index 1 is out of range for type '<unknown> [1]' ... [ 0.000000] [ T0] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:282:16 [ 0.000000] [ T0] index 1 is out of range for type '<unknown> [1]' ... [ 0.515850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1344:23 [ 0.519851] [ T1] index 1 is out of range for type '<unknown> [1]' ... [ 0.603850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1345:32 [ 0.607850] [ T1] index 1 is out of range for type '<unknown> [1]' ... [ 0.691850] [ T1] UBSAN: array-index-out-of-bounds in arch/x86/kernel/apic/x2apic_uv_x.c:1353:20 [ 0.695850] [ T1] index 1 is out of range for type '<unknown> [1]' One-element arrays have been deprecated: https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays Switch entry in struct uv_systab to a flexible array member to fix UBSAN array-index-out-of-bounds messages. sizeof(struct uv_systab) is passed to early_memremap() and ioremap(). The flexible array member is not accessed until the UV system table size is used to remap the entire UV system table, so changes to sizeof(struct uv_systab) have no impact. Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/aTxksN-3otY41WvQ@hpe.com
2025-12-13Merge tag 'perf-urgent-2025-12-12' of ↵Linus Torvalds2-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fixes from Ingo Molnar: - Fix NULL pointer dereference crash in the Intel PMU driver - Fix missing read event generation on task exit - Fix AMD uncore driver init error handling - Fix whitespace noise * tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common() perf/core: Fix missing read event generation on task exit perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error perf/uprobes: Remove <space><Tab> whitespace noise
2025-12-13x86/sgx: Remove unmatched quote in __sgx_encl_extend function commentThorsten Blum1-1/+1
There is no opening quote. Remove the unmatched closing quote. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251210125628.544916-1-thorsten.blum@linux.dev
2025-12-13Merge tag 'mm-stable-2025-12-11-11-39' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - "powerpc/pseries/cmm: two smaller fixes" (David Hildenbrand) fixes a couple of minor things in ppc land - "Improve folio split related functions" (Zi Yan) some cleanups and minorish fixes in the folio splitting code * tag 'mm-stable-2025-12-11-11-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/damon/tests/core-kunit: avoid damos_test_commit stack warning mm: vmscan: correct nr_requested tracing in scan_folios MAINTAINERS: add idr core-api doc file to XARRAY mm/hugetlb: fix incorrect error return from hugetlb_reserve_pages() mm: fix CONFIG_STACK_GROWSUP typo in mm.h mm/huge_memory: fix folio split stats counting mm/huge_memory: make min_order_for_split() always return an order mm/huge_memory: replace can_split_folio() with direct refcount calculation mm/huge_memory: change folio_split_supported() to folio_check_splittable() mm/sparse: fix sparse_vmemmap_init_nid_early definition without CONFIG_SPARSEMEM powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages powerpc/pseries/cmm: call balloon_devinfo_init() also without CONFIG_BALLOON_COMPACTION
2025-12-13x86/hv: Add gitignore entry for generated header fileLinus Torvalds1-0/+1
Commit 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") added a new generated header file for the offsets into the mshv_vtl_cpu_context structure to be used by the low-level assembly code. But it didn't add the .gitignore file to go with it, so 'git status' and friends will mention it. Let's add the gitignore file before somebody thinks that generated header should be committed. Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-13Merge tag 'sound-fix-6.19-rc1' of ↵Linus Torvalds16-16/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The only slightly large change is the enablement of CIX HD-audio controller, which took a bit time to be cooked up, while most of other changes are device-specific small trivial fixes: - Default disablement of the kconfig for decades old pre-release alsa-lib PCM API; it's only the default config value change, so it can't lead to any regressions for the existing setups - Support for CIX HD-audio controller - A few ASoC ACP fixes - Fixes for ASoC cirrus, bcm, wcd, qcom, ak platforms - Trivial hardening for FireWire and USB-audio - HD-audio Intel binding fix and quirks" * tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits) ALSA: hda/tas2781: Add new quirk for HP new project ALSA: hda: cix-ipbloq: Use modern PM ops ALSA: hda: intel-dsp-config: Prefer legacy driver as fallback ASoC: amd: acp: update tdm channels for specific DAI ASoC: cs35l56: Fix incorrect select SND_SOC_CS35L56_CAL_SYSFS_COMMON ALSA: firewire-motu: add bounds check in put_user loop for DSP events ASoC: cs35l41: Always return 0 when a subsystem ID is found ALSA: uapi: Fix typo in asound.h comment ALSA: Do not build obsolete API ALSA: hda: add CIX IPBLOQ HDA controller support ALSA: hda/core: add addr_offset field for bus address translation ALSA: hda: dt-bindings: add CIX IPBLOQ HDA controller support ALSA: hda/realtek: Add support for ASUS UM3406GA ALSA: hda/realtek: Add support for HP Turbine Laptops ALSA: usb-audio: Initialize status1 to fix uninitialized symbol errors ALSA: firewire-motu: fix buffer overflow in hwdep read for DSP events ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi() ASoC: cros_ec_codec: Remove unnecessary selection of CRYPTO ASoc: qcom: q6afe: fix bad guard conversion ASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning (again) ...
2025-12-12Merge tag 'loongarch-6.19' of ↵Linus Torvalds75-765/+2994
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Add basic LoongArch32 support Note: Build infrastructures of LoongArch32 are not enabled yet, because we need to adjust irqchip drivers and wait for GNU toolchain be upstream first. - Select HAVE_ARCH_BITREVERSE in Kconfig - Fix build and boot for CONFIG_RANDSTRUCT - Correct the calculation logic of thread_count - Some bug fixes and other small changes * tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits) LoongArch: Adjust default config files for 32BIT/64BIT LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BIT LoongArch: Adjust misc routines for 32BIT/64BIT LoongArch: Adjust user accessors for 32BIT/64BIT LoongArch: Adjust system call for 32BIT/64BIT LoongArch: Adjust module loader for 32BIT/64BIT LoongArch: Adjust time routines for 32BIT/64BIT LoongArch: Adjust process management for 32BIT/64BIT LoongArch: Adjust memory management for 32BIT/64BIT LoongArch: Adjust boot & setup for 32BIT/64BIT LoongArch: Adjust common macro definitions for 32BIT/64BIT LoongArch: Add adaptive CSR accessors for 32BIT/64BIT LoongArch: Add atomic operations for 32BIT/64BIT LoongArch: Add new PCI ID for pci_fixup_vgadev() LoongArch: Add and use some macros for AVEC LoongArch: Correct the calculation logic of thread_count LoongArch: Use unsigned long for _end and _text LoongArch: Use __pmd()/__pte() for swap entry conversions LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCT LoongArch: Fix build errors for CONFIG_RANDSTRUCT ...
2025-12-12Merge tag 'libcrypto-fixes-for-linus' of ↵Linus Torvalds5-89/+86
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fixes from Eric Biggers: "Fixes for some recent regressions as well as some longstanding issues: - Fix incorrect output from the arm64 NEON implementation of GHASH - Merge the ksimd scopes in the arm64 XTS code to reduce stack usage - Roll up the BLAKE2b round loop on 32-bit kernels to greatly reduce code size and stack usage - Add missing RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS dependency - Fix chacha-riscv64-zvkb.S to not use frame pointer for data" * tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: crypto: arm64/ghash - Fix incorrect output from ghash-neon crypto/arm64: sm4/xts - Merge ksimd scopes to reduce stack bloat crypto/arm64: aes/xts - Use single ksimd scope to reduce stack bloat lib/crypto: blake2s: Replace manual unrolling with unrolled_full lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS lib/crypto: riscv/chacha: Avoid s0/fp register
2025-12-12perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()Evan Li1-0/+3
handle_pmi_common() may observe an active bit set in cpuc->active_mask while the corresponding cpuc->events[] entry has already been cleared, which leads to a NULL pointer dereference. This can happen when interrupt throttling stops all events in a group while PEBS processing is still in progress. perf_event_overflow() can trigger perf_event_throttle_group(), which stops the group and clears the cpuc->events[] entry, but the active bit may still be set when handle_pmi_common() iterates over the events. The following recent fix: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and relied on cpuc->active_mask/pebs_enabled checks. However, handle_pmi_common() can still encounter a NULL cpuc->events[] entry despite the active bit being set. Add an explicit NULL check on the event pointer before using it, to cover this legitimate scenario and avoid the NULL dereference crash. Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") Reported-by: kitta <kitta@linux.alibaba.com> Co-developed-by: kitta <kitta@linux.alibaba.com> Signed-off-by: Evan Li <evan.li@linux.alibaba.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@linux.alibaba.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855
2025-12-11Merge tag 's390-6.19-2' of ↵Linus Torvalds11-134/+277
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Use the MSI parent domain API instead of the legacy API for setup and teardown of PCI MSI IRQs - Select POSIX_CPU_TIMERS_TASK_WORK now that VIRT_XFER_TO_GUEST_WORK has been implemented for s390 - Fix a KVM bug which can lead to guest memory corruption - Fix KASAN shadow memory mapping for hotplugged memory - Minor bug fixes and improvements * tag 's390-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/bug: Add missing alignment s390/bug: Add missing CONFIG_BUG ifdef again KVM: s390: Fix gmap_helper_zap_one_page() again s390/pci: Migrate s390 IRQ logic to IRQ domain API genirq: Change hwirq parameter to irq_hw_number_t s390: Select POSIX_CPU_TIMERS_TASK_WORK s390: Unmap early KASAN shadow on memory offlining s390/vmem: Support 2G page splitting for KASAN shadow freeing s390/boot: Use entire page for PTEs s390/vmur: Use scnprintf() instead of sprintf()
2025-12-11Merge tag 'alpha-for-v6.19-tag' of ↵Linus Torvalds5-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/lindholm/alpha Pull alpha updates from Magnus Lindholm: "Two small uapi fixes. One patch hardcodes TC* ioctl values that previously depended on the deprecated termio struct, avoiding build issues with newer glibc versions. The other patch switches uapi headers to use the compiler-defined __ASSEMBLER__ macro for better consistency between kernel and userspace. - don't reference obsolete termio struct for TC* constants - Replace __ASSEMBLY__ with __ASSEMBLER__ in the alpha headers" * tag 'alpha-for-v6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/lindholm/alpha: alpha: don't reference obsolete termio struct for TC* constants alpha: Replace __ASSEMBLY__ with __ASSEMBLER__ in the alpha headers
2025-12-11Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds4-33/+87
Pull ARM updates from Russell King: - disable jump label and high PTE for PREEMPT RT kernels - fix input operand modification in load_unaligned_zeropad() - fix hash_name() / fault path induced warnings - fix branch predictor hardening * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: fix branch predictor hardening ARM: fix hash_name() fault ARM: allow __do_kernel_fault() to report execution of memory faults ARM: group is_permission_fault() with is_translation_fault() ARM: 9464/1: fix input-only operand modification in load_unaligned_zeropad() ARM: 9461/1: Disable HIGHPTE on PREEMPT_RT kernels ARM: 9459/1: Disable jump-label on PREEMPT_RT
2025-12-10crypto: arm64/ghash - Fix incorrect output from ghash-neonEric Biggers1-1/+1
Commit 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") made ghash_finup() pass the wrong buffer to ghash_do_simd_update(). As a result, ghash-neon now produces incorrect outputs when the message length isn't divisible by 16 bytes. Fix this. (I didn't notice this earlier because this code is reached only on CPUs that support NEON but not PMULL. I haven't yet found a way to get qemu-system-aarch64 to emulate that configuration.) Fixes: 9a7c987fb92b ("crypto: arm64/ghash - Use API partial block handling") Cc: stable@vger.kernel.org Reported-by: Diederik de Haas <diederik@cknow-tech.com> Closes: https://lore.kernel.org/linux-crypto/DETXT7QI62KE.F3CGH2VWX1SC@cknow-tech.com/ Tested-by: Diederik de Haas <diederik@cknow-tech.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/20251209223417.112294-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-10Merge branches 'fixes' and 'misc' into for-nextRussell King (Oracle)1-2/+2
2025-12-10ARM: fix branch predictor hardeningRussell King (Oracle)2-14/+31
__do_user_fault() may be called with indeterminent interrupt enable state, which means we may be preemptive at this point. This causes problems when calling harden_branch_predictor(). For example, when called from a data abort, do_alignment_fault()->do_bad_area(). Move harden_branch_predictor() out of __do_user_fault() and into the calling contexts. Moving it into do_kernel_address_page_fault(), we can be sure that interrupts will be disabled here. Converting do_translation_fault() to use do_kernel_address_page_fault() rather than do_bad_area() means that we keep branch predictor handling for translation faults. Interrupts will also be disabled at this call site. do_sect_fault() needs special handling, so detect user mode accesses to kernel-addresses, and add an explicit call to branch predictor hardening. Finally, add branch predictor hardening to do_alignment() for the faulting case (user mode accessing kernel addresses) before interrupts are enabled. This should cover all cases where harden_branch_predictor() is called, ensuring that it is always has interrupts disabled, also ensuring that it is called early in each call path. Reviewed-by: Xie Yuanbin <xieyuanbin1@huawei.com> Tested-by: Xie Yuanbin <xieyuanbin1@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2025-12-10ARM: fix hash_name() faultRussell King (Oracle)1-0/+35
Zizhi Wo reports: "During the execution of hash_name()->load_unaligned_zeropad(), a potential memory access beyond the PAGE boundary may occur. For example, when the filename length is near the PAGE_SIZE boundary. This triggers a page fault, which leads to a call to do_page_fault()->mmap_read_trylock(). If we can't acquire the lock, we have to fall back to the mmap_read_lock() path, which calls might_sleep(). This breaks RCU semantics because path lookup occurs under an RCU read-side critical section." This is seen with CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_KFENCE=y. Kernel addresses (with the exception of the vectors/kuser helper page) do not have VMAs associated with them. If the vectors/kuser helper page faults, then there are two possibilities: 1. if the fault happened while in kernel mode, then we're basically dead, because the CPU won't be able to vector through this page to handle the fault. 2. if the fault happened while in user mode, that means the page was protected from user access, and we want to fault anyway. Thus, we can handle kernel addresses from any context entirely separately without going anywhere near the mmap lock. This gives us an entirely non-sleeping path for all kernel mode kernel address faults. As we handle the kernel address faults before interrupts are enabled, this change has the side effect of improving the branch predictor hardening, but does not completely solve the issue. Reported-by: Zizhi Wo <wozizhi@huaweicloud.com> Reported-by: Xie Yuanbin <xieyuanbin1@huawei.com> Link: https://lore.kernel.org/r/20251126090505.3057219-1-wozizhi@huaweicloud.com Reviewed-by: Xie Yuanbin <xieyuanbin1@huawei.com> Tested-by: Xie Yuanbin <xieyuanbin1@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2025-12-10ARM: allow __do_kernel_fault() to report execution of memory faultsRussell King (Oracle)1-0/+2
Allow __do_kernel_fault() to detect the execution of memory, so we can provide the same fault message as do_page_fault() would do. This is required when we split the kernel address fault handling from the main do_page_fault() code path. Reviewed-by: Xie Yuanbin <xieyuanbin1@huawei.com> Tested-by: Xie Yuanbin <xieyuanbin1@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2025-12-10x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeaturesYongxin Liu1-2/+2
Zero can be a valid value of num_records. For example, on Intel Atom x6425RE, only x87 and SSE are supported (features 0, 1), and fpu_user_cfg.max_features is 3. The for_each_extended_xfeature() loop only iterates feature 2, which is not enabled, so num_records = 0. This is valid and should not cause core dump failure. The issue is that dump_xsave_layout_desc() returns 0 for both genuine errors (dump_emit() failure) and valid cases (no extended features). Use negative return values for errors and only abort on genuine failures. Fixes: ba386777a30b ("x86/elf: Add a new FPU buffer layout info to x86 core files") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251210000219.4094353-2-yongxin.liu@windriver.com
2025-12-10x86/unwind/orc: Support reliable unwinding through BPF stack framesJosh Poimboeuf1-12/+27
BPF JIT programs and trampolines use a frame pointer, so the current ORC unwinder strategy of falling back to frame pointers (when an ORC entry is missing) usually works in practice when unwinding through BPF JIT stack frames. However, that frame pointer fallback is just a guess, so the unwind gets marked unreliable for live patching, which can cause livepatch transition stalls. Make the common case reliable by calling the bpf_has_frame_pointer() helper to detect the valid frame pointer region of BPF JIT programs and trampolines. Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Closes: https://lore.kernel.org/0e555733-c670-4e84-b2e6-abb8b84ade38@crowdstrike.com Acked-by: Song Liu <song@kernel.org> Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/a18505975662328c8ffb1090dded890c6f8c1004.1764818927.git.jpoimboe@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org>
2025-12-10bpf: Add bpf_has_frame_pointer()Josh Poimboeuf1-0/+12
Introduce a bpf_has_frame_pointer() helper that unwinders can call to determine whether a given instruction pointer is within the valid frame pointer region of a BPF JIT program or trampoline (i.e., after the prologue, before the epilogue). This will enable livepatch (with the ORC unwinder) to reliably unwind through BPF JIT frames. Acked-by: Song Liu <song@kernel.org> Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/fd2bc5b4e261a680774b28f6100509fd5ebad2f0.1764818927.git.jpoimboe@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org>
2025-12-10bpf, arm64: Do not audit capability check in do_jit()Ondrej Mosnacek1-1/+1
Analogically to the x86 commit 881a9c9cb785 ("bpf: Do not audit capability check in do_jit()"), change the capable() call to ns_capable_noaudit() in order to avoid spurious SELinux denials in audit log. The commit log from that commit applies here as well: """ The failure of this check only results in a security mitigation being applied, slightly affecting performance of the compiled BPF program. It doesn't result in a failed syscall, an thus auditing a failed LSM permission check for it is unwanted. For example with SELinux, it causes a denial to be reported for confined processes running as root, which tends to be flagged as a problem to be fixed in the policy. Yet dontauditing or allowing CAP_SYS_ADMIN to the domain may not be desirable, as it would allow/silence also other checks - either going against the principle of least privilege or making debugging potentially harder. Fix it by changing it from capable() to ns_capable_noaudit(), which instructs the LSMs to not audit the resulting denials. """ Fixes: f300769ead03 ("arm64: bpf: Only mitigate cBPF programs loaded by unprivileged users") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Link: https://lore.kernel.org/r/20251204125916.441021-1-omosnace@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-10Merge tag 'csky-for-linus-6.19' of https://github.com/c-sky/csky-linuxLinus Torvalds13-20/+21
Pull csky updates from Guo Ren: - Remove compile warning for CONFIG_SMP - Fix __ASSEMBLER__ typo in headers - Fix csky_cmpxchg_fixup * tag 'csky-for-linus-6.19' of https://github.com/c-sky/csky-linux: csky: Remove compile warning for CONFIG_SMP csky: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi header csky: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers csky: fix csky_cmpxchg_fixup not working
2025-12-10crypto/arm64: sm4/xts - Merge ksimd scopes to reduce stack bloatArd Biesheuvel1-22/+20
Merge the two ksimd scopes in the implementation of SM4-XTS to prevent stack bloat in cases where the compiler fails to combine the stack slots for the kernel mode FP/SIMD buffers. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20251203163803.157541-6-ardb@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>