summaryrefslogtreecommitdiff
path: root/arch/x86/platform/efi/efi_64.c
AgeCommit message (Collapse)AuthorFilesLines
2020-02-28efi/x86: Don't panic or BUG() on non-critical error conditionsArd Biesheuvel1-4/+5
[ Upstream commit e2d68a955e49d61fd0384f23e92058dc9b79be5e ] The logic in __efi_enter_virtual_mode() does a number of steps in sequence, all of which may fail in one way or the other. In most cases, we simply print an error and disable EFI runtime services support, but in some cases, we BUG() or panic() and bring down the system when encountering conditions that we could easily handle in the same way. While at it, replace a pointless page-to-virt-phys conversion with one that goes straight from struct page to physical. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: Matthew Garrett <mjg59@google.com> Cc: linux-efi@vger.kernel.org Link: https://lkml.kernel.org/r/20200103113953.9571-14-ardb@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-07-03x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=yKirill A. Shutemov1-2/+2
commit cfe19577047e74cdac5826adbdc2337d8437f8fb upstream. Open-coded page table entry checks don't work correctly when we fold the page table level at runtime. pgd_present() on 4-level paging machine always returns true, but open-coded version of the check may return false-negative result and we silently skip the rest of the loop body in efi_call_phys_epilog(). Replace open-coded checks with proper helpers. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Baoquan He <bhe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # v4.12+ Fixes: 94133e46a0f5 ("x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabled") Link: http://lkml.kernel.org/r/20180625120852.18300-1-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28x86/efi: Free efi_pgd with free_pages()Waiman Long1-1/+1
commit 06ace26f4e6fcf747e890a39193be811777a048a upstream. The efi_pgd is allocated as PGD_ALLOCATION_ORDER pages and therefore must also be freed as PGD_ALLOCATION_ORDER pages with free_pages(). Fixes: d9e9a6418065 ("x86/mm/pti: Allocate a separate user PGD") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1521746333-19593-1-git-send-email-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACKLevin, Alexander (Sasha Levin)1-1/+1
commit 75f296d93bcebcfe375884ddac79e30263a31766 upstream. Convert all allocations that used a NOTRACK flag to stop using it. Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Hansen <devtimhansen@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-17x86/pti: Unbreak EFI old_memmapJiri Kosina1-0/+2
commit de53c3786a3ce162a1c815d0c04c766c23ec9c0a upstream. EFI_OLD_MEMMAP's efi_call_phys_prolog() calls set_pgd() with swapper PGD that has PAGE_USER set, which makes PTI set NX on it, and therefore EFI can't execute it's code. Fix that by forcefully clearing _PAGE_NX from the PGD (this can't be done by the pgprot API). _PAGE_NX will be automatically reintroduced in efi_call_phys_epilog(), as _set_pgd() will again notice that this is _PAGE_USER, and set _PAGE_NX on it. Tested-by: Dimitri Sivanich <sivanich@hpe.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1801052215460.11852@cbobk.fhfr.pm Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02x86/mm/pti: Allocate a separate user PGDDave Hansen1-1/+4
commit d9e9a6418065bb376e5de8d93ce346939b9a37a6 upstream. Kernel page table isolation requires to have two PGDs. One for the kernel, which contains the full kernel mapping plus the user space mapping and one for user space which contains the user space mappings and the minimal set of kernel mappings which are required by the architecture to be able to transition from and to user space. Add the necessary preliminaries. [ tglx: Split out from the big kaiser dump. EFI fixup from Kirill ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-18x86/efi: Update EFI pagetable creation to work with SMETom Lendacky1-4/+11
When SME is active, pagetable entries created for EFI need to have the encryption mask set as necessary. When the new pagetable pages are allocated they are mapped encrypted. So, update the efi_pgt value that will be used in CR3 to include the encryption mask so that the PGD table can be read successfully. The pagetable mapping as well as the kernel are also added to the pagetable mapping as encrypted. All other EFI mappings are mapped decrypted (tables, etc.). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Toshimitsu Kani <toshi.kani@hpe.com> Cc: kasan-dev@googlegroups.com Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/9a8f4c502db4a84b09e2f0a1555bb75aa8b69785.1500319216.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-04Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - Continued work to add support for 5-level paging provided by future Intel CPUs. In particular we switch the x86 GUP code to the generic implementation. (Kirill A. Shutemov) - Continued work to add PCID CPU support to native kernels as well. In this round most of the focus is on reworking/refreshing the TLB flush infrastructure for the upcoming PCID changes. (Andy Lutomirski)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/mm: Delete a big outdated comment about TLB flushing x86/mm: Don't reenter flush_tlb_func_common() x86/KASLR: Fix detection 32/64 bit bootloaders for 5-level paging x86/ftrace: Exclude functions in head64.c from function-tracing x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap x86/mm: Remove reset_lazy_tlbstate() x86/ldt: Simplify the LDT switching logic x86/boot/64: Put __startup_64() into .head.text x86/mm: Add support for 5-level paging for KASLR x86/mm: Make kernel_physical_mapping_init() support 5-level paging x86/mm: Add sync_global_pgds() for configuration with 5-level paging x86/boot/64: Add support of additional page table level during early boot x86/boot/64: Rename init_level4_pgt and early_level4_pgt x86/boot/64: Rewrite startup_64() in C x86/boot/compressed: Enable 5-level paging during decompression stage x86/boot/efi: Define __KERNEL32_CS GDT on 64-bit configurations x86/boot/efi: Fix __KERNEL_CS definition of GDT entry on 64-bit configurations x86/boot/efi: Cleanup initialization of GDT entries x86/asm: Fix comment in return_from_SYSCALL_64() x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation ...
2017-06-13x86/mm: Split read_cr3() into read_cr3_pa() and __read_cr3()Andy Lutomirski1-2/+2
The kernel has several code paths that read CR3. Most of them assume that CR3 contains the PGD's physical address, whereas some of them awkwardly use PHYSICAL_PAGE_MASK to mask off low bits. Add explicit mask macros for CR3 and convert all of the CR3 readers. This will keep them from breaking when PCID is enabled. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/883f8fb121f4616c1c1427ad87350bb2f5ffeca1.1497288170.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-05x86/efi: Extend CONFIG_EFI_PGT_DUMP support to x86_32 and kexec as wellSai Praneeth1-1/+4
CONFIG_EFI_PGT_DUMP=y, as the name suggests, dumps EFI page tables to the kernel log during kernel boot. This feature is very useful while debugging page faults/null pointer dereferences to EFI related addresses. Presently, this feature is limited only to x86_64, so let's extend it to other EFI configurations like kexec kernel, efi=old_map and to x86_32 as well. This doesn't effect normal boot path because this config option should be used only for debug purposes. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170602135207.21708-13-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-28x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabledBaoquan He1-8/+71
For EFI with the 'efi=old_map' kernel option specified, the kernel will panic when KASLR is enabled: BUG: unable to handle kernel paging request at 000000007febd57e IP: 0x7febd57e PGD 1025a067 PUD 0 Oops: 0010 [#1] SMP Call Trace: efi_enter_virtual_mode() start_kernel() x86_64_start_reservations() x86_64_start_kernel() start_cpu() The root cause is that the identity mapping is not built correctly in the 'efi=old_map' case. On 'nokaslr' kernels, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE aligned. We can borrow the PUD table from the direct mappings safely. Given a physical address X, we have pud_index(X) == pud_index(__va(X)). However, on KASLR kernels, PAGE_OFFSET is PUD_SIZE aligned. For a given physical address X, pud_index(X) != pud_index(__va(X)). We can't just copy the PGD entry from direct mapping to build identity mapping, instead we need to copy the PUD entries one by one from the direct mapping. Fix it. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Young <dyoung@redhat.com> Cc: Frank Ramsay <frank.ramsay@hpe.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <rja@sgi.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170526113652.21339-5-matt@codeblueprint.co.uk [ Fixed and reworded the changelog and code comments to be more readable. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-02Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-10/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main x86 MM changes in this cycle were: - continued native kernel PCID support preparation patches to the TLB flushing code (Andy Lutomirski) - various fixes related to 32-bit compat syscall returning address over 4Gb in applications, launched from 64-bit binaries - motivated by C/R frameworks such as Virtuozzo. (Dmitry Safonov) - continued Intel 5-level paging enablement: in particular the conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov) - x86/mpx ABI corner case fixes/enhancements (Joerg Roedel) - ... plus misc updates, fixes and cleanups" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits) mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash x86/mm: Fix flush_tlb_page() on Xen x86/mm: Make flush_tlb_mm_range() more predictable x86/mm: Remove flush_tlb() and flush_tlb_current_task() x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() x86/mm/64: Fix crash in remove_pagetable() Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" x86/boot/e820: Remove a redundant self assignment x86/mm: Fix dump pagetables for 4 levels of page tables x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()" x86/espfix: Add support for 5-level paging x86/kasan: Extend KASAN to support 5-level paging x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y x86/paravirt: Add 5-level support to the paravirt code x86/mm: Define virtual memory map for 5-level paging x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert x86/boot: Detect 5-level paging support x86/mm/numa: Remove numa_nodemask_from_meminfo() ...
2017-05-02Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The biggest changes in this cycle were: - reworking of the e820 code: separate in-kernel and boot-ABI data structures and apply a whole range of cleanups to the kernel side. No change in functionality. - enable KASLR by default: it's used by all major distros and it's out of the experimental stage as well. - ... misc fixes and cleanups" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails x86/reboot: Turn off KVM when halting a CPU x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup x86: Enable KASLR by default boot/param: Move next_arg() function to lib/cmdline.c for later reuse x86/boot: Fix Sparse warning by including required header file x86/boot/64: Rename start_cpu() x86/xen: Update e820 table handling to the new core x86 E820 code x86/boot: Fix pr_debug() API braindamage xen, x86/headers: Add <linux/device.h> dependency to <asm/xen/page.h> x86/boot/e820: Simplify e820__update_table() x86/boot/e820: Separate the E820 ABI structures from the in-kernel structures x86/boot/e820: Fix and clean up e820_type switch() statements x86/boot/e820: Rename the remaining E820 APIs to the e820__*() prefix x86/boot/e820: Remove unnecessary #include's x86/boot/e820: Rename e820_mark_nosave_regions() to e820__register_nosave_regions() x86/boot/e820: Rename e820_reserve_resources*() to e820__reserve_resources*() x86/boot/e820: Use bool in query APIs x86/boot/e820: Document e820__reserve_setup_data() x86/boot/e820: Clean up __e820__update_table() et al ...
2017-04-11Merge branch 'x86/boot' into x86/mm, to avoid conflictIngo Molnar1-1/+1
There's a conflict between ongoing level-5 paging support and the E820 rewrite. Since the E820 rewrite is essentially ready, merge it into x86/mm to reduce tree conflicts. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-05x86/efi: Clean up a minor mistake in commentBaoquan He1-1/+1
EFI allocates runtime services regions from EFI_VA_START, -4G, down to -68G, EFI_VA_END - 64G altogether, top-down. The mechanism was introduced in commit: d2f7cbe7b26a7 ("x86/efi: Runtime services virtual mapping") Fix the comment that still says bottom-up. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170404160245.27812-10-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-27x86/efi: Add 5-level paging supportKirill A. Shutemov1-10/+26
Allocate additional page table level and ajdust efi_sync_low_kernel_mappings() to work with additional page table level. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170317185515.8636-3-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-14x86/mm: Convert trivial cases of page table walk to 5-level pagingKirill A. Shutemov1-4/+9
This patch only covers simple cases. Less trivial cases will be converted with separate patches. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170313143309.16020-3-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01Merge branch 'linus' into WIP.x86/boot, to fix up conflicts and to pick up ↵Ingo Molnar1-10/+70
updates Conflicts: arch/x86/xen/setup.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01x86/efi: Add support for EFI_MEMORY_ATTRIBUTES_TABLESai Praneeth1-10/+54
UEFI v2.6 introduces EFI_MEMORY_ATTRIBUTES_TABLE which describes memory protections that may be applied to the EFI Runtime code and data regions by the kernel. This enables the kernel to map these regions more strictly thereby increasing security. Presently, the only valid bits for the attribute field of a memory descriptor are EFI_MEMORY_RO and EFI_MEMORY_XP, hence use these bits to update the mappings in efi_pgd. The UEFI specification recommends to use this feature instead of EFI_PROPERTIES_TABLE and hence while updating EFI mappings we first check for EFI_MEMORY_ATTRIBUTES_TABLE and if it's present we update the mappings according to this table and hence disregarding EFI_PROPERTIES_TABLE even if it's published by the firmware. We consider EFI_PROPERTIES_TABLE only when EFI_MEMORY_ATTRIBUTES_TABLE is absent. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-6-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/boot/e820: Move asm/e820.h to asm/e820/api.hIngo Molnar1-1/+1
In line with asm/e820/types.h, move the e820 API declarations to asm/e820/api.h and update all usage sites. This is just a mechanical, obviously correct move & replace patch, there will be subsequent changes to clean up the code and to make better use of the new header organization. Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-28x86/efi: Always map the first physical page into the EFI pagetablesJiri Kosina1-0/+16
Commit: 129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode") stopped creating 1:1 mappings for all RAM, when running in native 64-bit mode. It turns out though that there are 64-bit EFI implementations in the wild (this particular problem has been reported on a Lenovo Yoga 710-11IKB), which still make use of the first physical page for their own private use, even though they explicitly mark it EFI_CONVENTIONAL_MEMORY in the memory map. In case there is no mapping for this particular frame in the EFI pagetables, as soon as firmware tries to make use of it, a triple fault occurs and the system reboots (in case of the Yoga 710-11IKB this is very early during bootup). Fix that by always mapping the first page of physical memory into the EFI pagetables. We're free to hand this page to the BIOS, as trim_bios_range() will reserve the first page and isolate it away from memory allocators anyway. Note that just reverting 129766708 alone is not enough on v4.9-rc1+ to fix the regression on affected hardware, as this commit: ab72a27da ("x86/efi: Consolidate region mapping logic") later made the first physical frame not to be mapped anyway. Reported-by: Hanka Pavlikova <hanka@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vojtech Pavlik <vojtech@ucw.cz> Cc: Waiman Long <waiman.long@hpe.com> Cc: linux-efi@vger.kernel.org Cc: stable@kernel.org # v4.8+ Fixes: 129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode") Link: http://lkml.kernel.org/r/20170127222552.22336-1-matt@codeblueprint.co.uk [ Tidied up the changelog and the comment. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-13x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=yMatt Fleming1-23/+57
Booting an EFI mixed mode kernel has been crashing since commit: e37e43a497d5 ("x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y)") The user-visible effect in my test setup was the kernel being unable to find the root file system ramdisk. This was likely caused by silent memory or page table corruption. Enabling CONFIG_DEBUG_VIRTUAL=y immediately flagged the thunking code as abusing virt_to_phys() because it was passing addresses that were not part of the kernel direct mapping. Use the slow version instead, which correctly handles all memory regions by performing a page table walk. Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20161112210424.5157-3-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20Merge branch 'efi/urgent' into efi/core, to pick up fixesIngo Molnar1-1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20x86/efi: Only map RAM into EFI page tables if in mixed-modeMatt Fleming1-1/+1
Waiman reported that booting with CONFIG_EFI_MIXED enabled on his multi-terabyte HP machine results in boot crashes, because the EFI region mapping functions loop forever while trying to map those regions describing RAM. While this patch doesn't fix the underlying hang, there's really no reason to map EFI_CONVENTIONAL_MEMORY regions into the EFI page tables when mixed-mode is not in use at runtime. Reported-by: Waiman Long <waiman.long@hpe.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> CC: Theodore Ts'o <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Scott J Norton <scott.norton@hpe.com> Cc: Douglas Hatch <doug.hatch@hpe.com> Cc: <stable@vger.kernel.org> # v4.6+ Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-09x86/efi: Use kmalloc_array() in efi_call_phys_prolog()Markus Elfring1-1/+1
* A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus reuse the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software. * Replace the specification of a data type by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-09x86/efi: Map in physical addresses in efi_map_region_fixedAlex Thorlton1-0/+1
This is a simple change to add in the physical mappings as well as the virtual mappings in efi_map_region_fixed. The motivation here is to get access to EFI runtime code that is only available via the 1:1 mappings on a kexec'd kernel. The added call is essentially the kexec analog of the first __map_region that Boris put in efi_map_region in commit d2f7cbe7b26a ("x86/efi: Runtime services virtual mapping"). Signed-off-by: Alex Thorlton <athorlton@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Mike Travis <travis@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Young <dyoung@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-09x86/efi: Consolidate region mapping logicMatt Fleming1-20/+0
EFI regions are currently mapped in two separate places. The bulk of the work is done in efi_map_regions() but when CONFIG_EFI_MIXED is enabled the additional regions that are required when operating in mixed mode are mapping in efi_setup_page_tables(). Pull everything into efi_map_regions() and refactor the test for which regions should be mapped into a should_map_region() function. Generously sprinkle comments to clarify the different cases. Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Dave Young <dyoung@redhat.com> [kexec/kdump] Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [arm] Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-08-05Merge tag 'rtc-4.8' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "RTC for 4.8 Cleanups: - huge cleanup of rtc-generic and char/genrtc this allowed to cleanup rtc-cmos, rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc - move mn10300 to rtc-cmos Subsystem: - fix wakealarms after hibernate - multiples fixes for rctest - simplify implementations of .read_alarm New drivers: - Maxim MAX6916 Drivers: - ds1307: fix weekday - m41t80: add wakeup support - pcf85063: add support for PCF85063A variant - rv8803: extend i2c fix and other fixes - s35390a: fix alarm reading, this fixes instant reboot after shutdown for QNAP TS-41x - s3c: clock fixes" * tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (65 commits) rtc: rv8803: Clear V1F when setting the time rtc: rv8803: Stop the clock while setting the time rtc: rv8803: Always apply the I²C workaround rtc: rv8803: Fix read day of week rtc: rv8803: Remove the check for valid time rtc: rv8803: Kconfig: Indicate rx8900 support rtc: asm9260: remove .owner field for driver rtc: at91sam9: Fix missing spin_lock_init() rtc: m41t80: add suspend handlers for alarm IRQ rtc: m41t80: make it a real error message rtc: pcf85063: Add support for the PCF85063A device rtc: pcf85063: fix year range rtc: hym8563: in .read_alarm set .tm_sec to 0 to signal minute accuracy rtc: explicitly set tm_sec = 0 for drivers with minute accurancy rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq() rtc: s3c: Remove unnecessary call to disable already disabled clock rtc: abx80x: use devm_add_action_or_reset() rtc: m41t80: use devm_add_action_or_reset() rtc: fix a typo and reduce three empty lines to one rtc: s35390a: improve two comments in .set_alarm ...
2016-08-01Merge branch 'x86-headers-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header cleanups from Ingo Molnar: "This tree is a cleanup of the x86 tree reducing spurious uses of module.h - which should improve build performance a bit" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, crypto: Restore MODULE_LICENSE() to glue_helper.c so it loads x86/apic: Remove duplicated include from probe_64.c x86/ce4100: Remove duplicated include from ce4100.c x86/headers: Include spinlock_types.h in x8664_ksyms_64.c for missing spinlock_t x86/platform: Delete extraneous MODULE_* tags fromm ts5500 x86: Audit and remove any remaining unnecessary uses of module.h x86/kvm: Audit and remove any unnecessary uses of module.h x86/xen: Audit and remove any unnecessary uses of module.h x86/platform: Audit and remove any unnecessary uses of module.h x86/lib: Audit and remove any unnecessary uses of module.h x86/kernel: Audit and remove any unnecessary uses of module.h x86/mm: Audit and remove any unnecessary uses of module.h x86: Don't use module.h just for AUTHOR / LICENSE tags
2016-07-26Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "Various x86 low level modifications: - preparatory work to support virtually mapped kernel stacks (Andy Lutomirski) - support for 64-bit __get_user() on 32-bit kernels (Benjamin LaHaise) - (involved) workaround for Knights Landing CPU erratum (Dave Hansen) - MPX enhancements (Dave Hansen) - mremap() extension to allow remapping of the special VDSO vma, for purposes of user level context save/restore (Dmitry Safonov) - hweight and entry code cleanups (Borislav Petkov) - bitops code generation optimizations and cleanups with modern GCC (H. Peter Anvin) - syscall entry code optimizations (Paolo Bonzini)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) x86/mm/cpa: Add missing comment in populate_pdg() x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 x86/smp: Remove unnecessary initialization of thread_info::cpu x86/smp: Remove stack_smp_processor_id() x86/uaccess: Move thread_info::addr_limit to thread_struct x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_err x86/uaccess: Move thread_info::uaccess_err and thread_info::sig_on_uaccess_err to thread_struct x86/dumpstack: When OOPSing, rewind the stack before do_exit() x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPS x86/dumpstack: Try harder to get a call trace on stack overflow x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables() x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated x86/mm/hotplug: Don't remove PGD entries in remove_pagetable() x86/mm: Use pte_none() to test for empty PTE x86/mm: Disallow running with 32-bit PTEs to work around erratum x86/mm: Ignore A/D bits in pte/pmd/pud_none() x86/mm: Move swap offset/type up in PTE to work around erratum x86/entry: Inline enter_from_user_mode() ...
2016-07-15x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables()Andy Lutomirski1-5/+0
kernel_unmap_pages_in_pgd() is dangerous: if a PGD entry in init_mm.pgd were to be cleared, callers would need to ensure that the pgd entry hadn't been propagated to any other pgd. Its only caller was efi_cleanup_page_tables(), and that, in turn, was unused, so just delete both functions. This leaves a couple of other helpers unused, so delete them, too. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/77ff20fdde3b75cd393be5559ad8218870520248.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/platform: Audit and remove any unnecessary uses of module.hPaul Gortmaker1-1/+1
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace as needed. One module.h was converted to moduleparam.h since the file had multiple module_param() in it, and another file had an instance of MODULE_DEVICE_TABLE deleted, since that is a no-op when builtin. Finally, the 32 bit build coverage of olpc_ofw revealed a couple implicit includes, which were pretty self evident to fix based on what gcc was complaining about. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-6-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-27x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macrosAlex Thorlton1-13/+8
Currently, the efi_thunk macro has some semi-duplicated code in it that can be replaced with the arch_efi_call_virt_setup/teardown macros. This commit simply replaces the duplicated code with those macros. Suggested-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roy Franz <roy.franz@linaro.org> Cc: Russ Anderson <rja@sgi.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1466839230-12781-7-git-send-email-matt@codeblueprint.co.uk [ Renamed variables to the standard __ prefix. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-25x86/efi: get rid of superfluous __GFP_REPEATMichal Hocko1-1/+1
__GFP_REPEAT has a rather weak semantic but since it has been introduced around 2.6.12 it has been ignored for low order allocations. efi_alloc_page_tables uses __GFP_REPEAT but it allocates an order-0 page. This means that this flag has never been actually useful here because it has always been used only for PAGE_ALLOC_COSTLY requests. Link: http://lkml.kernel.org/r/1464599699-30131-4-git-send-email-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-04rtc: cmos: move mc146818rtc code out of asm-generic/rtc.hArnd Bergmann1-0/+1
Drivers should not really include stuff from asm-generic directly, and the PC-style cmos rtc driver does this in order to reuse the mc146818 implementation of get_rtc_time/set_rtc_time rather than the architecture specific one for the architecture it gets built for. To make it more obvious what is going on, this moves and renames the two functions into include/linux/mc146818rtc.h, which holds the other mc146818 specific code. Ideally it would be in a .c file, but that would require extra infrastructure as the functions are called by multiple drivers with conflicting dependencies. With this change, the asm-generic/rtc.h header also becomes much more generic, so it can be reused more easily across any architecture that still relies on the genrtc driver. The only caller of the internal __get_rtc_time/__set_rtc_time functions is in arch/alpha/kernel/rtc.c, and we just change those over to the new naming. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-04-28efi: Iterate over efi.memmap in for_each_efi_memory_desc()Matt Fleming1-7/+3
Most of the users of for_each_efi_memory_desc() are equally happy iterating over the EFI memory map in efi.memmap instead of 'memmap', since the former is usually a pointer to the latter. For those users that want to specify an EFI memory map other than efi.memmap, that can be done using for_each_efi_memory_desc_in_map(). One such example is in the libstub code where the firmware is queried directly for the memory map, it gets iterated over, and then freed. This change goes part of the way toward deleting the global 'memmap' variable, which is not universally available on all architectures (notably IA64) and is rather poorly named. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1461614832-17633-7-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-22x86/efi: Only map kernel text for EFI mixed modeSai Praneeth1-1/+1
The correct symbol to use when figuring out the size of the kernel text is '_etext', not '_end' which is the symbol for the entire kernel image includes data and debug sections. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-14-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-22x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tablesSai Praneeth1-4/+41
Now that we have EFI memory region bits that indicate which regions do not need execute permission or read/write permission in the page tables, let's use them. We also check for EFI_NX_PE_DATA and only enforce the restrictive mappings if it's present (to allow us to ignore buggy firmware that sets bits it didn't mean to and to preserve backwards compatibility). Instead of assuming that firmware would set appropriate attributes in memory descriptor like EFI_MEMORY_RO for code and EFI_MEMORY_XP for data, we can expect some firmware out there which might only set *type* in memory descriptor to be EFI_RUNTIME_SERVICES_CODE or EFI_RUNTIME_SERVICES_DATA leaving away attribute. This will lead to improper mappings of EFI runtime regions. In order to avoid it, we check attribute and type of memory descriptor to update mappings and moreover Windows works this way. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-13-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-22x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()Sai Praneeth1-4/+4
As part of the preparation for the EFI_MEMORY_RO flag added in the UEFI 2.5 specification, we need the ability to map pages in kernel page tables without _PAGE_RW being set. Modify kernel_map_pages_in_pgd() to require its callers to pass _PAGE_RW if the pages need to be mapped read/write. Otherwise, we'll map the pages as read-only. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Lee, Chun-Yi <jlee@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-12-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-19Merge tag 'efi-next' of ↵Thomas Gleixner1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi Pull efi changes from Matt Fleming: * We don't need to carry our own formatting code in the esrt driver because the kobject API can do that for us - Rasmus Villemoes * Update the arm64 file paths in Documentation/efi-stub.txt to match the current tree - Alan Ott * Consistently preface all print statements with "efi" arch/x86 so that it's more obvious to users reporting problems which statements in the kernel log are relevant for EFI - Matt Fleming * Fix a boot crash in the ACPI BGRT driver and delete efi_lookup_mapped_addr() since it's useless now that the EFI mappings *only* exist in the 'efi_pgd' page table. Instead we always early_memremap() the BGRT memory - Sai Praneeth Prakhya
2015-12-14x86/efi: Preface all print statements with efi* tagMatt Fleming1-0/+2
The pr_*() calls in the x86 EFI code may or may not include a subsystem tag, which makes it difficult to grep the kernel log for all relevant EFI messages and leads users to miss important information. Recently, a bug reporter provided all the EFI print messages from the kernel log when trying to diagnose an issue but missed the following statement because it wasn't prefixed with anything indicating it was related to EFI, pr_err("Error ident-mapping new memmap (0x%lx)!\n", pa_memmap); Cc: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2015-11-29x86/efi: Build our own page table structuresMatt Fleming1-15/+82
With commit e1a58320a38d ("x86/mm: Warn on W^X mappings") all users booting on 64-bit UEFI machines see the following warning, ------------[ cut here ]------------ WARNING: CPU: 7 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x5dc/0x780() x86/mm: Found insecure W+X mapping at address ffff88000005f000/0xffff88000005f000 ... x86/mm: Checked W+X mappings: FAILED, 165660 W+X pages found. ... This is caused by mapping EFI regions with RWX permissions. There isn't much we can do to restrict the permissions for these regions due to the way the firmware toolchains mix code and data, but we can at least isolate these mappings so that they do not appear in the regular kernel page tables. In commit d2f7cbe7b26a ("x86/efi: Runtime services virtual mapping") we started using 'trampoline_pgd' to map the EFI regions because there was an existing identity mapping there which we use during the SetVirtualAddressMap() call and for broken firmware that accesses those addresses. But 'trampoline_pgd' shares some PGD entries with 'swapper_pg_dir' and does not provide the isolation we require. Notably the virtual address for __START_KERNEL_map and MODULES_START are mapped by the same PGD entry so we need to be more careful when copying changes over in efi_sync_low_kernel_mappings(). This patch doesn't go the full mile, we still want to share some PGD entries with 'swapper_pg_dir'. Having completely separate page tables brings its own issues such as synchronising new mappings after memory hotplug and module loading. Sharing also keeps memory usage down. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1448658575-17029-6-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-29x86/efi: Hoist page table switching code into efi_call_virt()Matt Fleming1-13/+11
This change is a prerequisite for pending patches that switch to a dedicated EFI page table, instead of using 'trampoline_pgd' which shares PGD entries with 'swapper_pg_dir'. The pending patches make it impossible to dereference the runtime service function pointer without first switching %cr3. It's true that we now have duplicated switching code in efi_call_virt() and efi_call_phys_{prolog,epilog}() but we are sacrificing code duplication for a little more clarity and the ease of writing the page table switching code in C instead of asm. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1448658575-17029-5-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-29x86/efi: Map RAM into the identity page table for mixed modeMatt Fleming1-0/+20
We are relying on the pre-existing mappings in 'trampoline_pgd' when accessing function arguments in the EFI mixed mode thunking code. Instead let's map memory explicitly so that things will continue to work when we move to a separate page table in the future. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1448658575-17029-4-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-29x86/mm/pat: Ensure cpa->pfn only contains page frame numbersMatt Fleming1-6/+10
The x86 pageattr code is confused about the data that is stored in cpa->pfn, sometimes it's treated as a page frame number, sometimes it's treated as an unshifted physical address, and in one place it's treated as a pte. The result of this is that the mapping functions do not map the intended physical address. This isn't a problem in practice because most of the addresses we're mapping in the EFI code paths are already mapped in 'trampoline_pgd' and so the pageattr mapping functions don't actually do anything in this case. But when we move to using a separate page table for the EFI runtime this will be an issue. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1448658575-17029-3-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01efi: Clean up the efi_call_phys_[prolog|epilog]() save/restore interactionIngo Molnar1-10/+16
Currently x86-64 efi_call_phys_prolog() saves into a global variable (save_pgd), and efi_call_phys_epilog() restores the kernel pagetables from that global variable. Change this to a cleaner save/restore pattern where the saving function returns the saved object and the restore function restores that. Apply the same concept to the 32-bit code as well. Plus this approach, as an added bonus, allows us to express the !efi_enabled(EFI_OLD_MEMMAP) situation in a clean fashion as well, via a 'NULL' return value. Cc: Tapasweni Pathak <tapaswenipathak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-04-01efi: Disable interrupts around EFI calls, not in the epilog/prolog callsIngo Molnar1-3/+0
Tapasweni Pathak reported that we do a kmalloc() in efi_call_phys_prolog() on x86-64 while having interrupts disabled, which is a big no-no, as kmalloc() can sleep. Solve this by removing the irq disabling from the prolog/epilog calls around EFI calls: it's unnecessary, as in this stage we are single threaded in the boot thread, and we don't ever execute this from interrupt contexts. Reported-by: Tapasweni Pathak <tapaswenipathak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-11-12x86, ptdump: Add section for EFI runtime servicesMathias Krause1-2/+1
In commit 3891a04aafd6 ("x86-64, espfix: Don't leak bits 31:16 of %esp returning..") the "ESPFix Area" was added to the page table dump special sections. That area, though, has a limited amount of entries printed. The EFI runtime services are, unfortunately, located in-between the espfix area and the high kernel memory mapping. Due to the enforced limitation for the espfix area, the EFI mappings won't be printed in the page table dump. To make the ESP runtime service mappings visible again, provide them a dedicated entry. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-10-03x86/efi: Mark initialization code as suchMathias Krause1-3/+3
The 32 bit and 64 bit implementations differ in their __init annotations for some functions referenced from the common EFI code. Namely, the 32 bit variant is missing some of the __init annotations the 64 bit variant has. To solve the colliding annotations, mark the corresponding functions in efi_32.c as initialization code, too -- as it is such. Actually, quite a few more functions are only used during initialization and therefore can be marked __init. They are therefore annotated, too. Also add the __init annotation to the prototypes in the efi.h header so users of those functions will see it's meant as initialization code only. This patch also fixes the "prelog" typo. ("prologue" / "epilogue" might be more appropriate but this is C code after all, not an opera! :D) Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>