summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2020-07-24x86/entry: Move user return notifier out of loopThomas Gleixner1-4/+4
Guests and user space share certain MSRs. KVM sets these MSRs to guest values once and does not set them back to user space values on every VM exit to spare the costly MSR operations. User return notifiers ensure that these MSRs are set back to the correct values before returning to user space in exit_to_usermode_loop(). There is no reason to evaluate the TIF flag indicating that user return notifiers need to be invoked in the loop. The important point is that they are invoked before returning to user space. Move the invocation out of the loop into the section which does the last preperatory steps before returning to user space. That section is not preemptible and runs with interrupts disabled until the actual return. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200722220520.159112003@linutronix.de
2020-07-24x86/entry: Consolidate 32/64 bit syscall entryThomas Gleixner1-52/+41
64bit and 32bit entry code have the same open coded syscall entry handling after the bitwidth specific bits. Move it to a helper function and share the code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200722220520.051234096@linutronix.de
2020-07-24x86/entry: Consolidate check_user_regs()Thomas Gleixner1-15/+9
The user register sanity check is sprinkled all over the place. Move it into enter_from_user_mode(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200722220519.943016204@linutronix.de
2020-07-24Merge branch 'core/entry' into x86/entryThomas Gleixner20-77/+102
Pick up generic entry code to migrate x86 over.
2020-07-24x86/defconfigs: Remove CONFIG_CRYPTO_AES_586 from i386_defconfigSedat Dilek1-1/+0
Initially CONFIG_CRYPTO_AES_586=y was added to the i386_defconfig file in: c1b362e3b4d3: ("x86: update defconfigs") The code and Kconfig for CONFIG_CRYPTO_AES_586 was removed in: 1d2c3279311e: ("crypto: x86/aes - drop scalar assembler implementations") Remove the leftover from the i386_defconfig file as well. Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20200723171119.9881-1-sedat.dilek@gmail.com
2020-07-24compiler.h: Move instrumentation_begin()/end() to new ↵Ingo Molnar1-0/+1
<linux/instrumentation.h> header Linus pointed out that compiler.h - which is a key header that gets included in every single one of the 28,000+ kernel files during a kernel build - was bloated in: 655389666643: ("vmlinux.lds.h: Create section for protection against instrumentation") Linus noted: > I have pulled this, but do we really want to add this to a header file > that is _so_ core that it gets included for basically every single > file built? > > I don't even see those instrumentation_begin/end() things used > anywhere right now. > > It seems excessive. That 53 lines is maybe not a lot, but it pushed > that header file to over 12kB, and while it's mostly comments, it's > extra IO and parsing basically for _every_ single file compiled in the > kernel. > > For what appears to be absolutely zero upside right now, and I really > don't see why this should be in such a core header file! Move these primitives into a new header: <linux/instrumentation.h>, and include that header in the headers that make use of it. Unfortunately one of these headers is asm-generic/bug.h, which does get included in a lot of places, similarly to compiler.h. So the de-bloating effect isn't as good as we'd like it to be - but at least the interfaces are defined separately. No change to functionality intended. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200604071921.GA1361070@gmail.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org>
2020-07-24x86: Correct noinstr qualifiersIra Weiny2-2/+2
The noinstr qualifier is to be specified before the return type in the same way inline is used. These 2 cases were missed by previous patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200723161405.852613-1-ira.weiny@intel.com
2020-07-24x86/mm: Drop unused MAX_PHYSADDR_BITSArvind Sankar1-5/+1
The macro is not used anywhere, and has an incorrect value (going by the comment) on x86_64 since commit c898faf91b3e ("x86: 46 bit physical address support on 64 bits") To avoid confusion, just remove the definition. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200723231544.17274-2-nivedita@alum.mit.edu
2020-07-24Merge v5.8-rc6 into drm-nextDave Airlie60-293/+487
I've got a silent conflict + two trees based on fixes to merge. Fixes a silent merge with amdgpu Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-23x86/uaccess: Make __get_user_size() Clang compliant on 32-bitNick Desaulniers1-1/+4
Clang fails to compile __get_user_size() on 32-bit for the following code: long long val; __get_user(val, usrptr); with: error: invalid output size for constraint '=q' GCC compiles the same code without complaints. The reason is that GCC and Clang are architecturally different, which leads to subtle issues for code that's invalid but clearly dead, i.e. with code that emulates polymorphism with the preprocessor and sizeof. GCC will perform semantic analysis after early inlining and dead code elimination, so it will not warn on invalid code that's dead. Clang strictly performs optimizations after semantic analysis, so it will warn for dead code. Neither Clang nor GCC like this very much with -m32: long long ret; asm ("movb $5, %0" : "=q" (ret)); However, GCC can tolerate this variant: long long ret; switch (sizeof(ret)) { case 1: asm ("movb $5, %0" : "=q" (ret)); break; case 8:; } Clang, on the other hand, won't accept that because it validates the inline asm for the '1' case before the optimisation phase where it realises that it wouldn't have to emit it anyway. If LLVM (Clang's "back end") fails such as during instruction selection or register allocation, it cannot provide accurate diagnostics (warnings / errors) that contain line information, as the AST has been discarded from memory at that point. While there have been early discussions about having C/C++ specific language optimizations in Clang via the use of MLIR, which would enable such earlier optimizations, such work is not scoped and likely a multi-year endeavor. It was discussed to change the asm output constraint for the one byte case from "=q" to "=r". While it works for 64-bit, it fails on 32-bit. With '=r' the compiler could fail to chose a register accessible as high/low which is required for the byte operation. If that happens the assembly will fail. Use a local temporary variable of type 'unsigned char' as output for the byte copy inline asm and then assign it to the real output variable. This prevents Clang from failing the semantic analysis in the above case. The resulting code for the actual one byte copy is not affected as the temporary variable is optimized out. [ tglx: Amended changelog ] Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: David Woodhouse <dwmw2@infradead.org> Reported-by: Dmitry Golovin <dima@golovin.in> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://bugs.llvm.org/show_bug.cgi?id=33587 Link: https://github.com/ClangBuiltLinux/linux/issues/3 Link: https://github.com/ClangBuiltLinux/linux/issues/194 Link: https://github.com/ClangBuiltLinux/linux/issues/781 Link: https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/ Link: https://lore.kernel.org/lkml/CAK8P3a1EBaWdbAEzirFDSgHVJMtWjuNt2HGG8z+vpXeNHwETFQ@mail.gmail.com/ Link: https://lkml.kernel.org/r/20200720204925.3654302-12-ndesaulniers@google.com
2020-07-23x86/percpu: Remove unused PER_CPU() macroBrian Gerst1-18/+0
Also remove now unused __percpu_mov_op. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-11-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_stable_op()Brian Gerst1-29/+12
Use __pcpu_size_call_return() to simplify this_cpu_read_stable(). Also remove __bad_percpu_size() which is now unused. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-10-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_cmpxchg_op()Brian Gerst1-40/+18
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-9-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_xchg_op()Brian Gerst1-43/+18
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-8-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_add_return_op()Brian Gerst1-35/+16
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-7-ndesaulniers@google.com
2020-07-23x86/percpu: Remove "e" constraint from XADDBrian Gerst1-1/+1
The "e" constraint represents a constant, but the XADD instruction doesn't accept immediate operands. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-6-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_add_op()Brian Gerst1-77/+22
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-5-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_from_op()Brian Gerst1-35/+15
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-4-ndesaulniers@google.com
2020-07-23x86/percpu: Clean up percpu_to_op()Brian Gerst1-55/+35
The core percpu macros already have a switch on the data size, so the switch in the x86 code is redundant and produces more dead code. Also use appropriate types for the width of the instructions. This avoids errors when compiling with Clang. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-3-ndesaulniers@google.com
2020-07-23x86/percpu: Introduce size abstraction macrosBrian Gerst1-0/+30
In preparation for cleaning up the percpu operations, define macros for abstraction based on the width of the operation. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dennis Zhou <dennis@kernel.org> Link: https://lkml.kernel.org/r/20200720204925.3654302-2-ndesaulniers@google.com
2020-07-23crypto: x86 - Put back integer parts of include/asm/inst.hUros Bizjak1-0/+148
Resolves conflict with the tip tree. Fixes: d7866e503bdc ("crypto: x86 - Remove include/asm/inst.h") CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: Borislav Petkov <bp@alien8.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Stephen Rothwell <sfr@canb.auug.org.au>, CC: "Chang S. Bae" <chang.seok.bae@intel.com>, CC: Peter Zijlstra <peterz@infradead.org>, CC: Sasha Levin <sashal@kernel.org> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-23crypto: x86/crc32c - fix building with clang iasArnd Bergmann1-1/+1
The clang integrated assembler complains about movzxw: arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxw' It seems that movzwq is the mnemonic that it expects instead, and this is what objdump prints when disassembling the file. Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-23irqdomain/treewide: Free firmware node after domain removalJon Derrick1-0/+5
Commit 711419e504eb ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode") unintentionally caused a dangling pointer page fault issue on firmware nodes that were freed after IRQ domain allocation. Commit e3beca48a45b fixed that dangling pointer issue by only freeing the firmware node after an IRQ domain allocation failure. That fix no longer frees the firmware node immediately, but leaves the firmware node allocated after the domain is removed. The firmware node must be kept around through irq_domain_remove, but should be freed it afterwards. Add the missing free operations after domain removal where where appropriate. Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1595363169-7157-1-git-send-email-jonathan.derrick@intel.com
2020-07-23x86/dumpstack: Show registers dump with trace's log levelDmitry Safonov1-5/+5
show_trace_log_lvl() provides x86 platform-specific way to unwind backtrace with a given log level. Unfortunately, registers dump(s) are not printed with the same log level - instead, KERN_DEFAULT is always used. Arista's switches uses quite common setup with rsyslog, where only urgent messages goes to console (console_log_level=KERN_ERR), everything else goes into /var/log/ as the console baud-rate often is indecently slow (9600 bps). Backtrace dumps without registers printed have proven to be as useful as morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED (which I believe is still the most elegant way to fix raciness of sysrq[1]) the log level should be passed down the stack to register dumping functions. Besides, there is a potential use-case for printing traces with KERN_DEBUG level [2] (where registers dump shouldn't appear with higher log level). After all preparations are done, provide log_lvl parameter for show_regs_if_on_stack() and wire up to actual log level used as an argument for show_trace_log_lvl(). [1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/ [2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/ Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Petr Mladek <pmladek@suse.com> Link: https://lkml.kernel.org/r/20200629144847.492794-4-dima@arista.com
2020-07-23x86/dumpstack: Add log_lvl to __show_regs()Dmitry Safonov4-43/+49
show_trace_log_lvl() provides x86 platform-specific way to unwind backtrace with a given log level. Unfortunately, registers dump(s) are not printed with the same log level - instead, KERN_DEFAULT is always used. Arista's switches uses quite common setup with rsyslog, where only urgent messages goes to console (console_log_level=KERN_ERR), everything else goes into /var/log/ as the console baud-rate often is indecently slow (9600 bps). Backtrace dumps without registers printed have proven to be as useful as morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED (which I believe is still the most elegant way to fix raciness of sysrq[1]) the log level should be passed down the stack to register dumping functions. Besides, there is a potential use-case for printing traces with KERN_DEBUG level [2] (where registers dump shouldn't appear with higher log level). Add log_lvl parameter to __show_regs(). Keep the used log level intact to separate visible change. [1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/ [2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/ Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Petr Mladek <pmladek@suse.com> Link: https://lkml.kernel.org/r/20200629144847.492794-3-dima@arista.com
2020-07-23x86/dumpstack: Add log_lvl to show_iret_regs()Dmitry Safonov3-6/+6
show_trace_log_lvl() provides x86 platform-specific way to unwind backtrace with a given log level. Unfortunately, registers dump(s) are not printed with the same log level - instead, KERN_DEFAULT is always used. Arista's switches uses quite common setup with rsyslog, where only urgent messages goes to console (console_log_level=KERN_ERR), everything else goes into /var/log/ as the console baud-rate often is indecently slow (9600 bps). Backtrace dumps without registers printed have proven to be as useful as morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED (which I believe is still the most elegant way to fix raciness of sysrq[1]) the log level should be passed down the stack to register dumping functions. Besides, there is a potential use-case for printing traces with KERN_DEBUG level [2] (where registers dump shouldn't appear with higher log level). Add log_lvl parameter to show_iret_regs() as a preparation to add it to __show_regs() and show_regs_if_on_stack(). [1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/ [2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/ Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Petr Mladek <pmladek@suse.com> Link: https://lkml.kernel.org/r/20200629144847.492794-2-dima@arista.com
2020-07-23x86/dumpstack: Dump user space code correctly againThomas Gleixner1-10/+17
H.J. reported that post 5.7 a segfault of a user space task does not longer dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It prints 'Code: Bad RIP value.' instead. This was broken by a recent change which made probe_kernel_read() reject non-kernel addresses. Update show_opcodes() so it retrieves user space opcodes via copy_from_user_nmi(). Fixes: 98a23609b103 ("maccess: always use strict semantics for probe_kernel_read") Reported-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/87h7tz306w.fsf@nanos.tec.linutronix.de
2020-07-23x86/stacktrace: Fix reliable check for empty user task stacksJosh Poimboeuf1-5/+0
If a user task's stack is empty, or if it only has user regs, ORC reports it as a reliable empty stack. But arch_stack_walk_reliable() incorrectly treats it as unreliable. That happens because the only success path for user tasks is inside the loop, which only iterates on non-empty stacks. Generally, a user task must end in a user regs frame, but an empty stack is an exception to that rule. Thanks to commit 71c95825289f ("x86/unwind/orc: Fix error handling in __unwind_start()"), unwind_start() now sets state->error appropriately. So now for both ORC and FP unwinders, unwind_done() and !unwind_error() always means the end of the stack was successfully reached. So the success path for kthreads is no longer needed -- it can also be used for empty user tasks. Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
2020-07-23x86/unwind/orc: Fix ORC for newly forked tasksJosh Poimboeuf1-2/+6
The ORC unwinder fails to unwind newly forked tasks which haven't yet run on the CPU. It correctly reads the 'ret_from_fork' instruction pointer from the stack, but it incorrectly interprets that value as a call stack address rather than a "signal" one, so the address gets incorrectly decremented in the call to orc_find(), resulting in bad ORC data. Fix it by forcing 'ret_from_fork' frames to be signal frames. Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lkml.kernel.org/r/f91a8778dde8aae7f71884b5df2b16d552040441.1594994374.git.jpoimboe@redhat.com
2020-07-22Merge tag 'media/v5.8-3' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into master Pull media fixes from Mauro Carvalho Chehab: "A series of fixes for the upcoming atomisp driver. They solve issues when probing atomisp on devices with multiple cameras and get rid of warnings when built with W=1. The diffstat is a bit long, as this driver has several abstractions. The patches that solved the issues with W=1 had to get rid of some duplicated code (there used to have 2 versions of the same code, one for ISP2401 and another one for ISP2400). As this driver is not in 5.7, such changes won't cause regressions" * tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (38 commits) Revert "media: atomisp: keep the ISP powered on when setting it" media: atomisp: fix mask and shift operation on ISPSSPM0 media: atomisp: move system_local consts into a C file media: atomisp: get rid of version-specific system_local.h media: atomisp: move global stuff into a common header media: atomisp: remove non-used 32-bits consts at system_local media: atomisp: get rid of some unused static vars media: atomisp: Fix error code in ov5693_probe() media: atomisp: Replace trace_printk by pr_info media: atomisp: Fix __func__ style warnings media: atomisp: fix help message for ISP2401 selection media: atomisp: i2c: atomisp-ov2680.c: fixed a brace coding style issue. media: atomisp: make const arrays static, makes object smaller media: atomisp: Clean up non-existing folders from Makefile media: atomisp: Get rid of ACPI specifics in gmin_subdev_add() media: atomisp: Provide Gmin subdev as parameter to gmin_subdev_add() media: atomisp: Use temporary variable for device in gmin_subdev_add() media: atomisp: Refactor PMIC detection to a separate function media: atomisp: Deduplicate return ret in gmin_i2c_write() media: atomisp: Make pointer to PMIC client global ...
2020-07-22x86/perf: Fix a typoHu Haowen1-1/+1
The word "Zhoaxin" is incorrect and the right one is "Zhaoxin". Signed-off-by: Hu Haowen <xianfengting221@163.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200719105007.57649-1-xianfengting221@163.com
2020-07-22Merge branch 'sched/urgent'Peter Zijlstra22-81/+108
2020-07-22x86, vmlinux.lds: Page-align end of ..page_aligned sectionsJoerg Roedel1-0/+1
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is page-aligned, but the end of the .bss..page_aligned section is not guaranteed to be page-aligned. As a result, objects from other .bss sections may end up on the same 4k page as the idt_table, and will accidentially get mapped read-only during boot, causing unexpected page-faults when the kernel writes to them. This could be worked around by making the objects in the page aligned sections page sized, but that's wrong. Explicit sections which store only page aligned objects have an implicit guarantee that the object is alone in the page in which it is placed. That works for all objects except the last one. That's inconsistent. Enforcing page sized objects for these sections would wreckage memory sanitizers, because the object becomes artificially larger than it should be and out of bound access becomes legit. Align the end of the .bss..page_aligned and .data..page_aligned section on page-size so all objects places in these sections are guaranteed to have their own page. [ tglx: Amended changelog ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
2020-07-21exec: Implement kernel_execveEric W. Biederman3-3/+3
To allow the kernel not to play games with set_fs to call exec implement kernel_execve. The function kernel_execve takes pointers into kernel memory and copies the values pointed to onto the new userspace stack. The calls with arguments from kernel space of do_execve are replaced with calls to kernel_execve. The calls do_execve and do_execveat are made static as there are now no callers outside of exec. The comments that mention do_execve are updated to refer to kernel_execve or execve depending on the circumstances. In addition to correcting the comments, this makes it easy to grep for do_execve and verify it is not used. Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-20net: remove compat_sys_{get,set}sockoptChristoph Hellwig3-4/+11
Now that the ->compat_{get,set}sockopt proto_ops methods are gone there is no good reason left to keep the compat syscalls separate. This fixes the odd use of unsigned int for the compat_setsockopt optlen and the missing sock_use_custom_sol_socket. It would also easily allow running the eBPF hooks for the compat syscalls, but such a large change in behavior does not belong into a consolidation patch like this one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20copy_xstate_to_kernel: Fix typo which caused GDB regressionKevin Buettner1-1/+1
This fixes a regression encountered while running the gdb.base/corefile.exp test in GDB's test suite. In my testing, the typo prevented the sw_reserved field of struct fxregs_state from being output to the kernel XSAVES area. Thus the correct mask corresponding to XCR0 was not present in the core file for GDB to interrogate, resulting in the following behavior: [kev@f32-1 gdb]$ ./gdb -q testsuite/outputs/gdb.base/corefile/corefile testsuite/outputs/gdb.base/corefile/corefile.core Reading symbols from testsuite/outputs/gdb.base/corefile/corefile... [New LWP 232880] warning: Unexpected size of section `.reg-xstate/232880' in core file. With the typo fixed, the test works again as expected. Signed-off-by: Kevin Buettner <kevinb@redhat.com> Fixes: 9e4636545933 ("copy_xstate_to_kernel(): don't leave parts of destination uninitialized") Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Airlie <airlied@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-19Merge tag 'x86-urgent-2020-07-19' of ↵Linus Torvalds12-37/+66
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull x86 fixes from Thomas Gleixner: "A pile of fixes for x86: - Fix the I/O bitmap invalidation on XEN PV, which was overlooked in the recent ioperm/iopl rework. This caused the TSS and XEN's I/O bitmap to get out of sync. - Use the proper vectors for HYPERV. - Make disabling of stack protector for the entry code work with GCC builds which enable stack protector by default. Removing the option is not sufficient, it needs an explicit -fno-stack-protector to shut it off. - Mark check_user_regs() noinstr as it is called from noinstr code. The missing annotation causes it to be placed in the text section which makes it instrumentable. - Add the missing interrupt disable in exc_alignment_check() - Fixup a XEN_PV build dependency in the 32bit entry code - A few fixes to make the Clang integrated assembler happy - Move EFI stub build to the right place for out of tree builds - Make prepare_exit_to_usermode() static. It's not longer called from ASM code" * tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Don't add the EFI stub to targets x86/entry: Actually disable stack protector x86/ioperm: Fix io bitmap invalidation on Xen PV x86: math-emu: Fix up 'cmp' insn for clang ias x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV x86/entry: Add compatibility with IAS x86/entry/common: Make prepare_exit_to_usermode() static x86/entry: Mark check_user_regs() noinstr x86/traps: Disable interrupts in exc_aligment_check() x86/entry/32: Fix XEN_PV build dependency
2020-07-19Merge tag 'irq-urgent-2020-07-19' of ↵Linus Torvalds4-30/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull irq fixes from Thomas Gleixner: "Two fixes for the interrupt subsystem: - Make the handling of the firmware node consistent and do not free the node after the domain has been created successfully. The core code stores a pointer to it which can lead to a use after free or double free. This used to "work" because the pointer was not stored when the initial code was written, but at some point later it was required to store it. Of course nobody noticed that the existing users break that way. - Handle affinity setting on inactive interrupts correctly when hierarchical irq domains are enabled. When interrupts are inactive with the modern hierarchical irqdomain design, the interrupt chips are not necessarily in a state where affinity changes can be handled. The legacy irq chip design allowed this because interrupts are immediately fully initialized at allocation time. X86 has a hacky workaround for this, but other implementations do not. This cased malfunction on GIC-V3. Instead of playing whack a mole to find all affected drivers, change the core code to store the requested affinity setting and then establish it when the interrupt is allocated, which makes the X86 hack go away" * tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/affinity: Handle affinity setting on inactive interrupts correctly irqdomain/treewide: Keep firmware node unconditionally allocated
2020-07-19x86/boot: Don't add the EFI stub to targetsArvind Sankar1-2/+2
vmlinux-objs-y is added to targets, which currently means that the EFI stub gets added to the targets as well. It shouldn't be added since it is built elsewhere. This confuses Makefile.build which interprets the EFI stub as a target $(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a and will create drivers/firmware/efi/libstub/ underneath arch/x86/boot/compressed, to hold this supposed target, if building out-of-tree. [0] Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y. [0] See scripts/Makefile.build near the end: # Create directories for object files if they do not exist Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
2020-07-19x86/entry: Actually disable stack protectorKees Cook1-3/+11
Some builds of GCC enable stack protector by default. Simply removing the arguments is not sufficient to disable stack protector, as the stack protector for those GCC builds must be explicitly disabled. Remove the argument removals and add -fno-stack-protector. Additionally include missed x32 argument updates, and adjust whitespace for readability. Fixes: 20355e5f73a7 ("x86/entry: Exclude low level entry code from sanitizing") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
2020-07-19dma-mapping: make support for dma ops optionalChristoph Hellwig1-0/+1
Avoid the overhead of the dma ops support for tiny builds that only use the direct mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2020-07-18x86/ioperm: Fix io bitmap invalidation on Xen PVAndy Lutomirski6-17/+38
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop machinery, so the TSS and Xen's io bitmap would get out of sync whenever disabling a valid io bitmap. Add a new pvop for tss_invalidate_io_bitmap() to fix it. This is XSA-329. Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
2020-07-18media: atomisp: move CCK endpoint address to generic headerAndy Shevchenko1-0/+1
IOSF MBI header contains a lot of definitions, such as end point addresses of IPs. Move CCK address from AtomISP driver to generic header. While here, drop unused one. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-07-18genirq/affinity: Handle affinity setting on inactive interrupts correctlyThomas Gleixner1-17/+5
Setting interrupt affinity on inactive interrupts is inconsistent when hierarchical irq domains are enabled. The core code should just store the affinity and not call into the irq chip driver for inactive interrupts because the chip drivers may not be in a state to handle such requests. X86 has a hacky workaround for that but all other irq chips have not which causes problems e.g. on GIC V3 ITS. Instead of adding more ugly hacks all over the place, solve the problem in the core code. If the affinity is set on an inactive interrupt then: - Store it in the irq descriptors affinity mask - Update the effective affinity to reflect that so user space has a consistent view - Don't call into the irq chip driver This is the core equivalent of the X86 workaround and works correctly because the affinity setting is established in the irq chip when the interrupt is activated later on. Note, that this is only effective when hierarchical irq domains are enabled by the architecture. Doing it unconditionally would break legacy irq chip implementations. For hierarchial irq domains this works correctly as none of the drivers can have a dependency on affinity setting in inactive state by design. Remove the X86 workaround as it is not longer required. Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts") Reported-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ali Saidi <alisaidi@amazon.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
2020-07-17x86/efi: Remove unused EFI_UV1_MEMMAP codesteve.wahl@hpe.com1-18/+2
With UV1 support removed, EFI_UV1_MEMMAP is no longer used. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200713212956.019149227@hpe.com
2020-07-17x86/platform/uv: Remove uv bios and efi code related to EFI_UV1_MEMMAPsteve.wahl@hpe.com2-159/+2
With UV1 removed, EFI_UV1_MEMMAP is not longer used. Remove the code used by it and the related code in EFI. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200713212955.902592618@hpe.com
2020-07-17x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()steve.wahl@hpe.com4-63/+6
In removing UV1 support, efi_have_uv1_memmap is no longer used. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200713212955.786177105@hpe.com
2020-07-17x86/efi: Delete SGI UV1 detection.steve.wahl@hpe.com1-23/+0
As a part of UV1 platform removal, don't try to recognize the platform through DMI to set the EFI_UV1_MEMMAP bit. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200713212955.667726896@hpe.com
2020-07-17x86/platform/uv: Remove efi=old_map command line optionsteve.wahl@hpe.com1-14/+0
As a part of UV1 platform removal, delete the efi=old_map option, which is not longer needed. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200713212955.552098718@hpe.com
2020-07-17x86/platform/uv: Remove vestigial mention of UV1 platform from bios headersteve.wahl@hpe.com1-1/+1
Remove UV1 reference as UV1 is not longer supported by HPE. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200713212955.435951508@hpe.com