summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2019-02-11x86/defconfig: Enable EFI stub, mixed mode and BGRTArd Biesheuvel2-0/+5
Enable various options in the defconfigs of both i386 and x86_64 that could do with some test coverage in automated testing. Note that these options are typically enabled by distros as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11mm/gup: Remove the 'write' parameter from gup_fast_permitted()Ira Weiny1-2/+1
The 'write' parameter is unused in gup_fast_permitted() so remove it. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20190210223424.13934-1-ira.weiny@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Add counter freezing quirk for GoldmontKan Liang1-2/+7
A microcode patch is also needed for Goldmont while counter freezing feature is enabled. Otherwise, there will be some issues, e.g. PMI lost. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Clean up counter freezing quirkKan Liang1-16/+11
Clean up counter freezing quirk to use the new facility to check for min microcode revisions. Rename the counter freezing quirk related functions. Because other platforms, e.g. Goldmont, also needs to call the quirk. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Clean up SNB PEBS quirkKan Liang1-25/+10
Clean up SNB PEBS quirk to use the new facility to check for min microcode revisions. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/kvm: Avoid unnecessary work in guest filteringAndi Kleen3-16/+75
KVM added a workaround for PEBS events leaking into guests with commit: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") This uses the VT entry/exit list to add an extra disable of the PEBS_ENABLE MSR. Intel also added a fix for this issue to microcode updates on Haswell/Broadwell/Skylake. It turns out using the MSR entry/exit list makes VM exits significantly slower. The list is only needed for disabling PEBS, because the GLOBAL_CTRL change gets optimized by KVM into changing the VMCS. Check for the microcode updates that have the microcode fix for leaking PEBS, and disable the extra entry/exit list entry for PEBS_ENABLE. In addition we always clear the GLOBAL_CTRL for the PEBS counter while running in the guest, which is enough to make them never fire at the wrong side of the host/guest transition. The overhead for VM exits with the filtering active with the patch is reduced from 8% to 4%. The microcode patch has already been merged into future platforms. This patch is one-off thing. The quirks is used here. For other old platforms which doesn't have microcode patch and quirks, extra disable of the PEBS_ENABLE MSR is still required. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11Merge branch 'x86/cpu' into perf/core, to pick up dependent commitIngo Molnar3-5/+62
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11x86/cpufeature: Add facility to check for min microcode revisionsKan Liang2-0/+59
For bug workarounds or checks, it is useful to check for specific microcode revisions. Add a new generic function to match the CPU with stepping. Add the other function to check the min microcode revisions for the matched CPU. A new table format is introduced to facilitate the quirk to fill the related information. This does not change the existing x86_cpu_id because it's an ABI shared with modules, and also has quite different requirements, as in no wildcards, but everything has to be matched exactly. Originally-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Link: https://lkml.kernel.org/r/1549319013-4522-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-10Merge tag 'y2038-new-syscalls' of ↵Thomas Gleixner3-33/+53
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull y2038 - time64 system calls from Arnd Bergmann: This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
2019-02-10Merge tag 'y2038-syscall-cleanup' of ↵Thomas Gleixner2-0/+13
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull preparatory work for y2038 changes from Arnd Bergmann: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds.
2019-02-10Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-26/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A handful of fixes: - Fix an MCE corner case bug/crash found via MCE injection testing - Fix 5-level paging boot crash - Fix MCE recovery cache invalidation bug - Fix regression on Xen guests caused by a recent PMD level mremap speedup optimization" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Make set_pmd_at() paravirt aware x86/mm/cpa: Fix set_mce_nospec() x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
2019-02-10Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-6/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A couple of kernel side fixes: - Fix the Intel uncore driver on certain hardware configurations - Fix a CPU hotplug related memory allocation bug - Remove a spurious WARN() ... plus also a handful of perf tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf script python: Add Python3 support to tests/attr.py perf trace: Support multiple "vfs_getname" probes perf symbols: Filter out hidden symbols from labels perf symbols: Add fallback definitions for GELF_ST_VISIBILITY() tools headers uapi: Sync linux/in.h copy from the kernel sources perf clang: Do not use 'return std::move(something)' perf mem/c2c: Fix perf_mem_events to support powerpc perf tests evsel-tp-sched: Fix bitwise operator perf/core: Don't WARN() for impossible ring-buffer sizes perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() perf/x86/intel/uncore: Add Node ID mask
2019-02-10x86/mm: Make set_pmd_at() paravirt awareJuergen Gross1-1/+1
set_pmd_at() calls native_set_pmd() unconditionally on x86. This was fine as long as only huge page entries were written via set_pmd_at(), as Xen pv guests don't support those. Commit 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") introduced a usage of set_pmd_at() possible on pv guests, leading to failures like: BUG: unable to handle kernel paging request at ffff888023e26778 #PF error: [PROT] [WRITE] RIP: e030:move_page_tables+0x7c1/0xae0 move_vma.isra.3+0xd1/0x2d0 __se_sys_mremap+0x3c6/0x5b0 do_syscall_64+0x49/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Make set_pmd_at() paravirt aware by just letting it use set_pmd(). Fixes: 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Cc: sstabellini@kernel.org Cc: hpa@zytor.com Cc: bp@alien8.de Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
2019-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller16-17/+69
An ipvlan bug fix in 'net' conflicted with the abstraction away of the IPV6 specific support in 'net-next'. Similarly, a bug fix for mlx5 in 'net' conflicted with the flow action conversion in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08x86/fpu: Move init_xstate_size() to __init sectionSergey Senozhatsky1-1/+1
WARNING: vmlinux.o(.text.unlikely+0x1c05): Section mismatch in reference from the function init_xstate_size() to the function .init.text:get_xsave_size() WARNING: vmlinux.o(.text.unlikely+0x1c19): Section mismatch in reference from the function init_xstate_size() to the function .init.text:get_xsaves_size() Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Link: https://lkml.kernel.org/r/20190108130225.5066-2-sergey.senozhatsky@gmail.com
2019-02-08x86/cpu_entry_area: Move percpu_setup_debug_store() to __init sectionSergey Senozhatsky1-1/+1
WARNING: vmlinux.o(.text.unlikely+0x498d): Section mismatch in reference from the function percpu_setup_debug_store() to the function .init.text:cea_map_percpu_pages() Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Link: https://lkml.kernel.org/r/20190108130225.5066-1-sergey.senozhatsky@gmail.com
2019-02-08x86/mtrr: Remove unused variableBo Yu1-2/+1
Compiling the kernel with W=1 results in the following warning: arch/x86/kernel/cpu/mtrr/cleanup.c:299:16: warning: variable ‘second_basek’ set but not used [-Wunused-but-set-variable] unsigned long second_basek, second_sizek; Remove the unused variable. [ tglx: Massaged changelog ] Signed-off-by: Bo Yu <tsu.yubo@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: puwen@hygon.cn Link: https://lkml.kernel.org/r/20190208125343.11451-1-tsu.yubo@gmail.com
2019-02-08x86/mm/cpa: Fix set_mce_nospec()Peter Zijlstra1-25/+25
The recent commit fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") accidentally made the call to make_addr_canonical_again() go away, which breaks set_mce_nospec(). Re-instate the call to convert the address back into canonical form right before invoking either CLFLUSH or INVLPG. Rename the function while at it to be shorter (and less MAGA). Fixes: fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20190208120859.GH32511@hirez.programming.kicks-ass.net
2019-02-08crypto: x86/aesni-gcm - fix crash on empty plaintextEric Biggers1-5/+8
gcmaes_crypt_by_sg() dereferences the NULL pointer returned by scatterwalk_ffwd() when encrypting an empty plaintext and the source scatterlist ends immediately after the associated data. Fix it by only fast-forwarding to the src/dst data scatterlists if the data length is nonzero. This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run with the new AEAD test manager. Fixes: e845520707f8 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather") Cc: <stable@vger.kernel.org> # v4.17+ Cc: Dave Watson <davejwatson@fb.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/morus - fix handling chunked inputs and MAY_SLEEPEric Biggers2-48/+31
The x86 MORUS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin(). Fix these bugs. Fixes: 56e8e57fc3a7 ("crypto: morus - Add common SIMD glue code for MORUS") Cc: <stable@vger.kernel.org> # v4.18+ Cc: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/aegis - fix handling chunked inputs and MAY_SLEEPEric Biggers3-69/+45
The x86 AEGIS implementations all fail the improved AEAD tests because they produce the wrong result with some data layouts. The issue is that they assume that if the skcipher_walk API gives 'nbytes' not aligned to the walksize (a.k.a. walk.stride), then it is the end of the data. In fact, this can happen before the end. Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can incorrectly sleep in the skcipher_walk_*() functions while preemption has been disabled by kernel_fpu_begin(). Fix these bugs. Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Cc: <stable@vger.kernel.org> # v4.18+ Cc: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08crypto: x86/crct10dif-pcl - cleanup and optimizationsEric Biggers2-559/+235
The x86, arm, and arm64 asm implementations of crct10dif are very difficult to understand partly because many of the comments, labels, and macros are named incorrectly: the lengths mentioned are usually off by a factor of two from the actual code. Many other things are unnecessarily convoluted as well, e.g. there are many more fold constants than actually needed and some aren't fully reduced. This series therefore cleans up all these implementations to be much more maintainable. I also made some small optimizations where I saw opportunities, resulting in slightly better performance. This patch cleans up the x86 version. As part of this, I removed support for len < 16 from the x86 assembly; now the glue code falls back to the generic table-based implementation in this case. Due to the overhead of kernel_fpu_begin(), this actually significantly improves performance on these lengths. (And even if kernel_fpu_begin() were free, the generic code is still faster for about len < 11.) This removal also eliminates error-prone special cases and makes the x86, arm32, and arm64 ports of the code match more closely. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-07KVM: nVMX: unconditionally cancel preemption timer in free_nested ↵Peter Shier1-0/+1
(CVE-2019-7221) Bugzilla: 1671904 There are multiple code paths where an hrtimer may have been started to emulate an L1 VMX preemption timer that can result in a call to free_nested without an intervening L2 exit where the hrtimer is normally cancelled. Unconditionally cancel in free_nested to cover all cases. Embargoed until Feb 7th 2019. Signed-off-by: Peter Shier <pshier@google.com> Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Message-Id: <20181011184646.154065-1-pshier@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)Paolo Bonzini1-0/+7
Bugzilla: 1671930 Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with memory operand, INVEPT, INVVPID) can incorrectly inject a page fault when passed an operand that points to an MMIO address. The page fault will use uninitialized kernel stack memory as the CR2 and error code. The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR exit to userspace; however, it is not an easy fix, so for now just ensure that the error code and CR2 are zero. Embargoed until Feb 7th 2019. Reported-by: Felix Wilhelm <fwilhelm@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-07x86/PCI: Fixup RTIT_BAR of Intel Denverton Trace HubAlexander Shishkin1-0/+16
On Denverton's integration of the Intel(R) Trace Hub (for a reference and overview see Documentation/trace/intel_th.rst) the reported size of one of its resources (RTIT_BAR) doesn't match its actual size, which leads to overlaps with other devices' resources. In practice, it overlaps with XHCI MMIO space, which results in the xhci driver bailing out after seeing its registers as 0xffffffff, and perceived disappearance of all USB devices: intel_th_pci 0000:00:1f.7: enabling device (0004 -> 0006) xhci_hcd 0000:00:15.0: xHCI host controller not responding, assume dead xhci_hcd 0000:00:15.0: xHC not responding in xhci_irq, assume controller is dead xhci_hcd 0000:00:15.0: HC died; cleaning up usb 1-1: USB disconnect, device number 2 For this reason, we need to resize the RTIT_BAR on Denverton to its actual size, which in this case is 4MB. The corresponding erratum is DNV36 at the link below: DNV36. Processor Host Root Complex May Incorrectly Route Memory Accesses to Intel® Trace Hub Problem: The Intel® Trace Hub RTIT_BAR (B0:D31:F7 offset 20h) is reported as a 2KB memory range. Due to this erratum, the processor Host Root Complex will forward addresses from RTIT_BAR to RTIT_BAR + 4MB -1 to Intel® Trace Hub. Implication: Devices assigned within the RTIT_BAR to RTIT_BAR + 4MB -1 space may not function correctly. Workaround: A BIOS code change has been identified and may be implemented as a workaround for this erratum. Status: No Fix. Note that 5118ccd34780 ("intel_th: pci: Add Denverton SOC support") updates the Trace Hub driver so it claims the Denverton device, but the resource overlap exists regardless of whether that driver is loaded or that commit is included. Link: https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/atom-c3000-family-spec-update.pdf Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> [bhelgaas: include erratum text, clarify relationship with 5118ccd34780] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2019-02-07y2038: add 64-bit time_t syscalls to all 32-bit architecturesArnd Bergmann1-0/+20
This adds 21 new system calls on each ABI that has 32-bit time_t today. All of these have the exact same semantics as their existing counterparts, and the new ones all have macro names that end in 'time64' for clarification. This gets us to the point of being able to safely use a C library that has 64-bit time_t in user space. There are still a couple of loose ends to tie up in various areas of the code, but this is the big one, and should be entirely uncontroversial at this point. In particular, there are four system calls (getitimer, setitimer, waitid, and getrusage) that don't have a 64-bit counterpart yet, but these can all be safely implemented in the C library by wrapping around the existing system calls because the 32-bit time_t they pass only counts elapsed time, not time since the epoch. They will be dealt with later. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-07y2038: rename old time and utime syscallsArnd Bergmann2-9/+9
The time, stime, utime, utimes, and futimesat system calls are only used on older architectures, and we do not provide y2038 safe variants of them, as they are replaced by clock_gettime64, clock_settime64, and utimensat_time64. However, for consistency it seems better to have the 32-bit architectures that still use them call the "time32" entry points (leaving the traditional handlers for the 64-bit architectures), like we do for system calls that now require two versions. Note: We used to always define __ARCH_WANT_SYS_TIME and __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat mode. The resulting asm/unistd.h changes look a bit counterintuitive. This is only a cleanup patch and it should not change any behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-02-07y2038: use time32 syscall names on 32-bitArnd Bergmann1-22/+22
This is the big flip, where all 32-bit architectures set COMPAT_32BIT_TIME and use the _time32 system calls from the former compat layer instead of the system calls that take __kernel_timespec and similar arguments. The temporary redirects for __kernel_timespec, __kernel_itimerspec and __kernel_timex can get removed with this. It would be easy to split this commit by architecture, but with the new generated system call tables, it's easy enough to do it all at once, which makes it a little easier to check that the changes are the same in each table. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07y2038: syscalls: rename y2038 compat syscallsArnd Bergmann1-26/+26
A lot of system calls that pass a time_t somewhere have an implementation using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have been reworked so that this implementation can now be used on 32-bit architectures as well. The missing step is to redefine them using the regular SYSCALL_DEFINEx() to get them out of the compat namespace and make it possible to build them on 32-bit architectures. Any system call that ends in 'time' gets a '32' suffix on its name for that version, while the others get a '_time32' suffix, to distinguish them from the normal version, which takes a 64-bit time argument in the future. In this step, only 64-bit architectures are changed, doing this rename first lets us avoid touching the 32-bit architectures twice. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07x86/x32: use time64 versions of sigtimedwait and recvmmsgArnd Bergmann1-2/+2
x32 has always followed the time64 calling conventions of these syscalls, which required a special hack in compat_get_timespec aka get_old_timespec32 to continue working. Since we now have the time64 syscalls, use those explicitly. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-06x86/boot/compressed/64: Explain paging_prepare()'s return valueKirill A. Shutemov1-3/+6
paging_prepare() returns a two-quadword structure which lands into RDX:RAX: - Address of the trampoline is returned in RAX. - Non zero RDX means trampoline needs to enable 5-level paging. Document that explicitly. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: dave.hansen@linux.intel.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kyle D Pelton <kyle.d.pelton@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190206154756.matwldebbxkmlnae@black.fi.intel.com
2019-02-06x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 settingKirill A. Shutemov1-0/+2
RDMSR in the trampoline code overwrites EDX but that register is used to indicate whether 5-level paging has to be enabled and if clobbered, leads to failure to boot on a 5-level paging machine. Preserve EDX on the stack while we are dealing with EFER. Fixes: b677dfae5aa1 ("x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode") Reported-by: Kyle D Pelton <kyle.d.pelton@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: dave.hansen@linux.intel.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190206115253.1907-1-kirill.shutemov@linux.intel.com
2019-02-06regulator: fixed/gpio: Pull inversion/OD into gpiolibLinus Walleij1-1/+0
This pushes the handling of inversion semantics and open drain settings to the GPIO descriptor and gpiolib. All affected board files are also augmented. This is especially nice since we don't have to have any confusing flags passed around to the left and right littering the fixed and GPIO regulator drivers and the regulator core. It is all just very straight-forward: the core asks the GPIO line to be asserted or deasserted and gpiolib deals with the rest depending on how the platform is configured: if the line is active low, it deals with that, if the line is open drain, it deals with that too. Cc: Alexander Shiyan <shc_work@mail.ru> # i.MX boards user Cc: Haojian Zhuang <haojian.zhuang@gmail.com> # MMP2 maintainer Cc: Aaro Koskinen <aaro.koskinen@iki.fi> # OMAP1 maintainer Cc: Tony Lindgren <tony@atomide.com> # OMAP1,2,3 maintainer Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> # EM-X270 maintainer Cc: Robert Jarzmik <robert.jarzmik@free.fr> # EZX maintainer Cc: Philipp Zabel <philipp.zabel@gmail.com> # Magician maintainer Cc: Petr Cvek <petr.cvek@tul.cz> # Magician Cc: Robert Jarzmik <robert.jarzmik@free.fr> # PXA Cc: Paul Parsons <lost.distance@yahoo.com> # hx4700 Cc: Daniel Mack <zonque@gmail.com> # Raumfeld maintainer Cc: Marc Zyngier <marc.zyngier@arm.com> # Zeus maintainer Cc: Geert Uytterhoeven <geert+renesas@glider.be> # SuperH pinctrl/GPIO maintainer Cc: Russell King <rmk+kernel@armlinux.org.uk> # SA1100 Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> #OMAP1 Amstrad Delta Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-02-06x86/kexec: Fill in acpi_rsdp_addr from the first kernelKairui Song1-1/+3
When efi=noruntime or efi=oldmap is used on the kernel command line, EFI services won't be available in the second kernel, therefore the second kernel will not be able to get the ACPI RSDP address from firmware by calling EFI services and so it won't boot. Commit e6e094e053af ("x86/acpi, x86/boot: Take RSDP address from boot params if available") added an acpi_rsdp_addr field to boot_params which stores the RSDP address for other kernel users. Recently, after 3a63f70bf4c3 ("x86/boot: Early parse RSDP and save it in boot_params") the acpi_rsdp_addr will always be filled with a valid RSDP address. So fill in that value into the second kernel's boot_params thus ensuring that the second kernel receives the RSDP value from the first kernel. [ bp: massage commit message. ] Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baoquan He <bhe@redhat.com> Cc: Chao Fan <fanc.fnst@cn.fujitsu.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: kexec@lists.infradead.org Cc: Philipp Rudo <prudo@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: Yannik Sembritzki <yannik@sembritzki.me> Link: https://lkml.kernel.org/r/20190204173852.4863-1-kasong@redhat.com
2019-02-06perf/aux: Make perf_event accessible to setup_aux()Mathieu Poirier2-3/+6
When pmu::setup_aux() is called the coresight PMU needs to know which sink to use for the session by looking up the information in the event's attr::config2 field. As such simply replace the cpu information by the complete perf_event structure and change all affected customers. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20190131184714.20388-2-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06x86/boot: Fix randconfig build error due to MEMORY_HOTREMOVEBorislav Petkov2-2/+2
When building randconfigs, one of the failures is: ld: arch/x86/boot/compressed/kaslr.o: in function `choose_random_location': kaslr.c:(.text+0xbf7): undefined reference to `count_immovable_mem_regions' ld: kaslr.c:(.text+0xcbe): undefined reference to `immovable_mem' make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1 because CONFIG_ACPI is not enabled in this particular .config but CONFIG_MEMORY_HOTREMOVE is and count_immovable_mem_regions() is unresolvable because it is defined in compressed/acpi.c which is the compilation unit that depends on CONFIG_ACPI. Add CONFIG_ACPI to the explicit dependencies for MEMORY_HOTREMOVE. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Chao Fan <fanc.fnst@cn.fujitsu.com> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190205131033.9564-1-bp@alien8.de
2019-02-06x86/boot: Fix cmdline_find_option() prototype visibilityBorislav Petkov1-2/+0
ac09c5f43cf6 ("x86/boot: Build the command line parsing code unconditionally") enabled building the command line parsing code unconditionally but it forgot to remove the respective ifdeffery around the prototypes in the misc.h header, leading to arch/x86/boot/compressed/acpi.c: In function ‘get_acpi_rsdp’: arch/x86/boot/compressed/acpi.c:37:8: warning: implicit declaration of function \ ‘cmdline_find_option’ [-Wimplicit-function-declaration] ret = cmdline_find_option("acpi_rsdp", val, MAX_ADDR_LEN); ^~~~~~~~~~~~~~~~~~~ for configs where neither CONFIG_EARLY_PRINTK nor CONFIG_RANDOMIZE_BASE was defined. Drop the ifdeffery in the header too. Fixes: ac09c5f43cf6 ("x86/boot: Build the command line parsing code unconditionally") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Chao Fan <fanc.fnst@cn.fujitsu.com> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/5c51daf0.83pQEkvDZILqoSYW%lkp@intel.com Link: https://lkml.kernel.org/r/20190205131352.GA27396@zn.tnic
2019-02-05x86/resctrl: Remove duplicate MSR_MISC_FEATURE_CONTROL definitionReinette Chatre1-7/+0
The definition of MSR_MISC_FEATURE_CONTROL was first introduced in 98af74599ea0 ("x86 msr_index.h: Define MSR_MISC_FEATURE_CONTROL") and present in Linux since v4.11. The Cache Pseudo-Locking code added this duplicate definition in more recent f2a177292bd0 ("x86/intel_rdt: Discover supported platforms via prefetch disable bits"), available since v4.19. Remove the duplicate definition from the resctrl subsystem and let that code obtain the needed definition from the core architecture msr-index.h instead. Fixes: f2a177292bd0 ("x86/intel_rdt: Discover supported platforms via prefetch disable bits") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: gavin.hindman@intel.com Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: jithu.joseph@intel.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/ff6b95d9b6ef6f4ac96267f130719ba1af09614b.1549312475.git.reinette.chatre@intel.com
2019-02-05kexec, KEYS: Make use of platform keyring for signature verifyKairui Song1-3/+11
This patch allows the kexec_file_load syscall to verify the PE signed kernel image signature based on the preboot keys stored in the .platform keyring, as fall back, if the signature verification failed due to not finding the public key in the secondary or builtin keyrings. This commit adds a VERIFY_USE_PLATFORM_KEYRING similar to previous VERIFY_USE_SECONDARY_KEYRING indicating that verify_pkcs7_signature should verify the signature using platform keyring. Also, decrease the error message log level when verification failed with -ENOKEY, so that if called tried multiple time with different keyring it won't generate extra noises. Signed-off-by: Kairui Song <kasong@redhat.com> Cc: David Howells <dhowells@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> (for kexec_file_load part) [zohar@linux.ibm.com: tweaked the first paragraph of the patch description, and fixed checkpatch warning.] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-02-04refcount_t: Add ACQUIRE ordering on success for dec(sub)_and_test() variantsElena Reshetova1-4/+18
This adds an smp_acquire__after_ctrl_dep() barrier on successful decrease of refcounter value from 1 to 0 for refcount_dec(sub)_and_test variants and therefore gives stronger memory ordering guarantees than prior versions of these functions. Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: dvyukov@google.com Cc: keescook@chromium.org Cc: stern@rowland.harvard.edu Link: https://lkml.kernel.org/r/1548847131-27854-2-git-send-email-elena.reshetova@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar36-165/+255
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()Peter Zijlstra1-5/+11
intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other members of struct cpu_hw_events. This memory is released in intel_pmu_cpu_dying() which is wrong. The counterpart of the intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu(). Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but allocate new memory on the next attempt to online the CPU (leaking the old memory). Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and CPUHP_PERF_X86_PREPARE then the CPU will go back online but never allocate the memory that was released in x86_pmu_dying_cpu(). Make the memory allocation/free symmetrical in regard to the CPU hotplug notifier by moving the deallocation to intel_pmu_cpu_dead(). This started in commit: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). In principle the bug was introduced in v2.6.39 (!), but it will almost certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15... [ bigeasy: Added patch description. ] [ mingo: Added backporting guidance. ] Reported-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jolsa@kernel.org Cc: kan.liang@linux.intel.com Cc: namhyung@kernel.org Cc: <stable@vger.kernel.org> Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04perf/x86/intel/uncore: Add Node ID maskKan Liang1-1/+3
Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE Superdome Flex). To understand which Socket the PCI uncore PMUs belongs to, perf retrieves the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI configuration space, and the mapping between Socket ID and Node ID from GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly. The local Node ID is only available at bit 2:0, but current code doesn't mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID will be fetched. Filter the Node ID by adding a mask. Reported-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # v3.7+ Fixes: 7c94ee2e0917 ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support") Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementationArd Biesheuvel5-256/+0
Move the x86 EFI earlyprintk implementation to a shared location under drivers/firmware and tweak it slightly so we can expose it as an earlycon implementation (which is generic) rather than earlyprintk (which is only implemented for a few architectures) This also involves switching to write-combine mappings by default (which is required on ARM since device mappings lack memory semantics, and so memcpy/memset may not be used on them), and adding support for shared memory framebuffers on cache coherent non-x86 systems (which do not tolerate mismatched attributes). Note that 32-bit ARM does not populate its struct screen_info early enough for earlycon=efifb to work, so it is disabled there. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Alexander Graf <agraf@suse.de> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20190202094119.13230-10-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04x86: Make ARCH_USE_MEMREMAP_PROT a generic Kconfig symbolArd Biesheuvel2-6/+3
Turn ARCH_USE_MEMREMAP_PROT into a generic Kconfig symbol, and fix the dependency expression to reflect that AMD_MEM_ENCRYPT depends on it, instead of the other way around. This will permit ARCH_USE_MEMREMAP_PROT to be selected by other architectures. Note that the encryption related early memremap routines in arch/x86/mm/ioremap.c cannot be built for 32-bit x86 without triggering the following warning: arch/x86//mm/ioremap.c: In function 'early_memremap_encrypted': >> arch/x86/include/asm/pgtable_types.h:193:27: warning: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '9223372036854776163' to '355' [-Woverflow] #define __PAGE_KERNEL_ENC (__PAGE_KERNEL | _PAGE_ENC) ^~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86//mm/ioremap.c:713:46: note: in expansion of macro '__PAGE_KERNEL_ENC' return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC); which essentially means they are 64-bit only anyway. However, we cannot make them dependent on CONFIG_ARCH_HAS_MEM_ENCRYPT, since that is always defined, even for i386 (and changing that results in a slew of build errors) So instead, build those routines only if CONFIG_AMD_MEM_ENCRYPT is defined. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Alexander Graf <agraf@suse.de> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20190202094119.13230-9-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04x86/efi: Mark can_free_region() as an __init functionSai Praneeth Prakhya1-1/+1
can_free_region() is called only once during boot, by efi_reserve_boot_services(). Hence, mark it as an __init function. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Alexander Graf <agraf@suse.de> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20190202094119.13230-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-03arch: Use asm-generic/socket.h when possibleDeepa Dinamani2-1/+1
Many architectures maintain an arch specific copy of the file even though there are no differences with the asm-generic one. Allow these architectures to use the generic one instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: chris@zankel.net Cc: fenghua.yu@intel.com Cc: tglx@linutronix.de Cc: schwidefsky@de.ibm.com Cc: linux-ia64@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-s390@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds10-12/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A few updates for x86: - Fix an unintended sign extension issue in the fault handling code - Rename the new resource control config switch so it's less confusing - Avoid setting up EFI info in kexec when the EFI runtime is disabled. - Fix the microcode version check in the AMD microcode loader so it only loads higher version numbers and never downgrades - Set EFER.LME in the 32bit trampoline before returning to long mode to handle older AMD/KVM behaviour properly. - Add Darren and Andy as x86/platform reviewers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Avoid confusion over the new X86_RESCTRL config x86/kexec: Don't setup EFI info if EFI runtime is not enabled x86/microcode/amd: Don't falsely trick the late loading mechanism MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers x86/fault: Fix sign-extend unintended sign extension x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds2-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug fixes from Thomas Gleixner: "Two fixes for the cpu hotplug machinery: - Replace the overly clever 'SMT disabled by BIOS' detection logic as it breaks KVM scenarios and prevents speculation control updates when the Hyperthreads are brought online late after boot. - Remove a redundant invocation of the speculation control update function" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()Tony Luck1-0/+1
Internal injection testing crashed with a console log that said: mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134 This caused a lot of head scratching because the MCACOD (bits 15:0) of that status is a signature from an L1 data cache error. But Linux says that it found it in "Bank 0", which on this model CPU only reports L1 instruction cache errors. The answer was that Linux doesn't initialize "m->bank" in the case that it finds a fatal error in the mce_no_way_out() pre-scan of banks. If this was a local machine check, then this partially initialized struct mce is being passed to mce_panic(). Fix is simple: just initialize m->bank in the case of a fatal error. Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com