summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2021-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-33/+13
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28bpf, x86: Fix extable offset calculationRavi Bangoria1-1/+1
Commit 4c5de127598e1 ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.") is emitting a couple of instructions before the actual load. Consider those additional instructions while calculating extable offset. Fixes: 4c5de127598e1 ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210622110026.1157847-1-ravi.bangoria@linux.ibm.com
2021-06-24arm64: dts: sparx5: Add the Sparx5 switch nodeSteen Hegelund3-84/+1112
This provides the configuration for the currently available evaluation boards PCB134 and PCB135. The series depends on the following series currently on its way into the kernel: - Sparx5 Reset Driver Link: https://lore.kernel.org/r/20210416084054.2922327-1-steen.hegelund@microchip.com/ Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: retrieve netns cookie via getsocketoptMartynas Pumputis4-0/+8
It's getting more common to run nested container environments for testing cloud software. One of such examples is Kind [1] which runs a Kubernetes cluster in Docker containers on a single host. Each container acts as a Kubernetes node, and thus can run any Pod (aka container) inside the former. This approach simplifies testing a lot, as it eliminates complicated VM setups. Unfortunately, such a setup breaks some functionality when cgroupv2 BPF programs are used for load-balancing. The load-balancer BPF program needs to detect whether a request originates from the host netns or a container netns in order to allow some access, e.g. to a service via a loopback IP address. Typically, the programs detect this by comparing netns cookies with the one of the init ns via a call to bpf_get_netns_cookie(NULL). However, in nested environments the latter cannot be used given the Kubernetes node's netns is outside the init ns. To fix this, we need to pass the Kubernetes node netns cookie to the program in a different way: by extending getsockopt() with a SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes node netns can retrieve the cookie and pass it to the program instead. Thus, this is following up on Eric's commit 3d368ab87cf6 ("net: initialize net->net_cookie at netns setup") to allow retrieval via SO_NETNS_COOKIE. This is also in line in how we retrieve socket cookie via SO_COOKIE. [1] https://kind.sigs.k8s.io/ Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Martynas Pumputis <m@lambda.lt> Cc: Eric Dumazet <edumazet@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24bpf, x86: Remove unused cnt increase from EMIT macroJiri Olsa1-32/+12
Removing unused cnt increase from EMIT macro together with cnt declarations. This was introduced in commit [1] to ensure proper code generation. But that code was removed in commit [2] and this extra code was left in. [1] b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") [2] ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210623112504.709856-1-jolsa@kernel.org
2021-06-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski113-486/+777
Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-18Merge tag 'pci-v5.13-fixes-2' of ↵Linus Torvalds1-0/+44
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Clear 64-bit flag for host bridge windows below 4GB to fix a resource allocation regression added in -rc1 (Punit Agrawal) - Fix tegra194 MCFG quirk build regressions added in -rc1 (Jon Hunter) - Avoid secondary bus resets on TI KeyStone C667X devices (Antti Järvinen) - Avoid secondary bus resets on some NVIDIA GPUs (Shanker Donthineni) - Work around FLR erratum on Huawei Intelligent NIC VF (Chiqijun) - Avoid broken ATS on AMD Navi14 GPU (Evan Quan) - Trust Broadcom BCM57414 NIC to isolate functions even though it doesn't advertise ACS support (Sriharsha Basavapatna) - Work around AMD RS690 BIOSes that don't configure DMA above 4GB (Mikel Rychliski) - Fix panic during PIO transfer on Aardvark controller (Pali Rohár) * tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: aardvark: Fix kernel panic during PIO transfer PCI: Add AMD RS690 quirk to enable 64-bit DMA PCI: Add ACS quirk for Broadcom BCM57414 NIC PCI: Mark AMD Navi14 GPU ATS as broken PCI: Work around Huawei Intelligent NIC VF FLR erratum PCI: Mark some NVIDIA GPUs to avoid bus reset PCI: Mark TI C667X to avoid bus reset PCI: tegra194: Fix MCFG quirk build regressions PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
2021-06-18MIPS: Loongson64: DTS: Add GMAC support for LS7A PCHQing Zhang1-2/+4
The GMAC module is now supported, enable it. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18MIPS: Loongson64: Add GMAC support for Loongson-2K1000Qing Zhang1-0/+46
The GMAC module is now supported, enable it. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18Merge tag 'arc-5.13-rc7-fixes' of ↵Linus Torvalds3-1/+45
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - ARCv2 userspace ABI not populating a few registers - Unbork CONFIG_HARDENED_USERCOPY for ARC * tag 'arc-5.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: fix CONFIG_HARDENED_USERCOPY ARCv2: save ABI registers across signal handling
2021-06-18PCI: Add AMD RS690 quirk to enable 64-bit DMAMikel Rychliski1-0/+44
Although the AMD RS690 chipset has 64-bit DMA support, BIOS implementations sometimes fail to configure the memory limit registers correctly. The Acer F690GVM mainboard uses this chipset and a Marvell 88E8056 NIC. The sky2 driver programs the NIC to use 64-bit DMA, which will not work: sky2 0000:02:00.0: error interrupt status=0x8 sky2 0000:02:00.0 eth0: tx timeout sky2 0000:02:00.0 eth0: transmit ring 0 .. 22 report=0 done=0 Other drivers required by this mainboard either don't support 64-bit DMA, or have it disabled using driver specific quirks. For example, the ahci driver has quirks to enable or disable 64-bit DMA depending on the BIOS version (see ahci_sb600_enable_64bit() in ahci.c). This ahci quirk matches against the SB600 SATA controller, but the real issue is almost certainly with the RS690 PCI host that it was commonly attached to. To avoid this issue in all drivers with 64-bit DMA support, fix the configuration of the PCI host. If the kernel is aware of physical memory above 4GB, but the BIOS never configured the PCI host with this information, update the registers with our values. [bhelgaas: drop PCI_DEVICE_ID_ATI_RS690 definition] Link: https://lore.kernel.org/r/20210611214823.4898-1-mikel@mikelr.com Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds7-10/+53
Pull kvm fixes from Paolo Bonzini: "Miscellaneous bugfixes. The main interesting one is a NULL pointer dereference reported by syzkaller ("KVM: x86: Immediately reset the MMU context when the SMM flag is cleared")" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: Fix kvm_check_cap() assertion KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU KVM: X86: Fix x86_emulator slab cache leak KVM: SVM: Call SEV Guest Decommission if ASID binding fails KVM: x86: Immediately reset the MMU context when the SMM flag is cleared KVM: x86: Fix fall-through warnings for Clang KVM: SVM: fix doc warnings KVM: selftests: Fix compiling errors when initializing the static structure kvm: LAPIC: Restore guard to prevent illegal APIC register access
2021-06-12Merge tag 'riscv-for-linus-5.13-rc6' of ↵Linus Torvalds6-16/+36
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A pair of XIP fixes: one to fix alternatives, and one to turn off the rest of the features that require code modification - A fix to a type that was causing some alternatives to break - A build fix for BUILTIN_DTB * tag 'riscv-for-linus-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix BUILTIN_DTB for sifive and microchip soc riscv: alternative: fix typo in macro name riscv: code patching only works on !XIP_KERNEL riscv: xip: support runtime trap patching
2021-06-12Merge tag 'perf-urgent-2021-06-12' of ↵Linus Torvalds2-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - Fix the NMI watchdog on ancient Intel CPUs - Remove a misguided, NMI-unsafe KASAN callback from the NMI-safe irq_work path used by perf. - Fix uncore events on Ice Lake servers. - Someone booted maxcpus=1 on an SNB-EP, and the uncore driver emitted warnings and was probably buggy. Fix it. - KCSAN found a genuine data race in the core perf code. Somewhat ironically the bug was introduced through a recent race fix. :-/ In our defense, the new race window was much more narrow. Fix it" * tag 'perf-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs irq_work: Make irq_work_queue() NMI-safe again perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server perf/x86/intel/uncore: Fix a kernel WARNING triggered by maxcpus=1 perf: Fix data race between pin_count increment/decrement
2021-06-12riscv: Fix BUILTIN_DTB for sifive and microchip socAlexandre Ghiti2-0/+2
Fix BUILTIN_DTB config which resulted in a dtb that was actually not built into the Linux image: in the same manner as Canaan soc does, create an object file from the dtb file that will get linked into the Linux image. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11s390/qeth: remove QAOB's pointer to its TX bufferJulian Wiedmann1-3/+1
Maintaining a pointer inside the aob's user-definable area is fragile and unnecessary. At this stage we only need it to overload the buffer's state field, and to access the buffer's TX queue. The first part is easily solved by tracking the aob's state within the aob itself. This also feels much cleaner and self-contained. For enabling the access to the associated TX queue, we can store the queue's index in the aob. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11x86, lto: Pass -stack-alignment only on LLD < 13.0.0Tor Vic1-2/+3
Since LLVM commit 3787ee4, the '-stack-alignment' flag has been dropped [1], leading to the following error message when building a LTO kernel with Clang-13 and LLD-13: ld.lld: error: -plugin-opt=-: ld.lld: Unknown command line argument '-stack-alignment=8'. Try 'ld.lld --help' ld.lld: Did you mean '--stackrealign=8'? It also appears that the '-code-model' flag is not necessary anymore starting with LLVM-9 [2]. Drop '-code-model' and make '-stack-alignment' conditional on LLD < 13.0.0. These flags were necessary because these flags were not encoded in the IR properly, so the link would restart optimizations without them. Now there are properly encoded in the IR, and these flags exposing implementation details are no longer necessary. [1] https://reviews.llvm.org/D103048 [2] https://reviews.llvm.org/D52322 Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Signed-off-by: Tor Vic <torvic9@mailbox.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/f2c018ee-5999-741e-58d4-e482d5246067@mailbox.org
2021-06-11KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMUSean Christopherson1-1/+25
Calculate and check the full mmu_role when initializing the MMU context for the nested MMU, where "full" means the bits and pieces of the role that aren't handled by kvm_calc_mmu_role_common(). While the nested MMU isn't used for shadow paging, things like the number of levels in the guest's page tables are surprisingly important when walking the guest page tables. Failure to reinitialize the nested MMU context if L2's paging mode changes can result in unexpected and/or missed page faults, and likely other explosions. E.g. if an L1 vCPU is running both a 32-bit PAE L2 and a 64-bit L2, the "common" role calculation will yield the same role for both L2s. If the 64-bit L2 is run after the 32-bit PAE L2, L0 will fail to reinitialize the nested MMU context, ultimately resulting in a bad walk of L2's page tables as the MMU will still have a guest root_level of PT32E_ROOT_LEVEL. WARNING: CPU: 4 PID: 167334 at arch/x86/kvm/vmx/vmx.c:3075 ept_save_pdptrs+0x15/0xe0 [kvm_intel] Modules linked in: kvm_intel] CPU: 4 PID: 167334 Comm: CPU 3/KVM Not tainted 5.13.0-rc1-d849817d5673-reqs #185 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:ept_save_pdptrs+0x15/0xe0 [kvm_intel] Code: <0f> 0b c3 f6 87 d8 02 00f RSP: 0018:ffffbba702dbba00 EFLAGS: 00010202 RAX: 0000000000000011 RBX: 0000000000000002 RCX: ffffffff810a2c08 RDX: ffff91d7bc30acc0 RSI: 0000000000000011 RDI: ffff91d7bc30a600 RBP: ffff91d7bc30a600 R08: 0000000000000010 R09: 0000000000000007 R10: 0000000000000000 R11: 0000000000000000 R12: ffff91d7bc30a600 R13: ffff91d7bc30acc0 R14: ffff91d67c123460 R15: 0000000115d7e005 FS: 00007fe8e9ffb700(0000) GS:ffff91d90fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000029f15a001 CR4: 00000000001726e0 Call Trace: kvm_pdptr_read+0x3a/0x40 [kvm] paging64_walk_addr_generic+0x327/0x6a0 [kvm] paging64_gva_to_gpa_nested+0x3f/0xb0 [kvm] kvm_fetch_guest_virt+0x4c/0xb0 [kvm] __do_insn_fetch_bytes+0x11a/0x1f0 [kvm] x86_decode_insn+0x787/0x1490 [kvm] x86_decode_emulated_instruction+0x58/0x1e0 [kvm] x86_emulate_instruction+0x122/0x4f0 [kvm] vmx_handle_exit+0x120/0x660 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xe25/0x1cb0 [kvm] kvm_vcpu_ioctl+0x211/0x5a0 [kvm] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@vger.kernel.org Fixes: bf627a928837 ("x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210610220026.1364486-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-11KVM: X86: Fix x86_emulator slab cache leakWanpeng Li1-0/+1
Commit c9b8b07cded58 (KVM: x86: Dynamically allocate per-vCPU emulation context) tries to allocate per-vCPU emulation context dynamically, however, the x86_emulator slab cache is still exiting after the kvm module is unload as below after destroying the VM and unloading the kvm module. grep x86_emulator /proc/slabinfo x86_emulator 36 36 2672 12 8 : tunables 0 0 0 : slabdata 3 3 0 This patch fixes this slab cache leak by destroying the x86_emulator slab cache when the kvm module is unloaded. Fixes: c9b8b07cded58 (KVM: x86: Dynamically allocate per-vCPU emulation context) Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1623387573-5969-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-11KVM: SVM: Call SEV Guest Decommission if ASID binding failsAlper Gun1-5/+15
Send SEV_CMD_DECOMMISSION command to PSP firmware if ASID binding fails. If a failure happens after a successful LAUNCH_START command, a decommission command should be executed. Otherwise, guest context will be unfreed inside the AMD SP. After the firmware will not have memory to allocate more SEV guest context, LAUNCH_START command will begin to fail with SEV_RET_RESOURCE_LIMIT error. The existing code calls decommission inside sev_unbind_asid, but it is not called if a failure happens before guest activation succeeds. If sev_bind_asid fails, decommission is never called. PSP firmware has a limit for the number of guests. If sev_asid_binding fails many times, PSP firmware will not have resources to create another guest context. Cc: stable@vger.kernel.org Fixes: 59414c989220 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_START command") Reported-by: Peter Gonda <pgonda@google.com> Signed-off-by: Alper Gun <alpergun@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210610174604.2554090-1-alpergun@google.com>
2021-06-11riscv: alternative: fix typo in macro nameVitaly Wool1-2/+2
alternative-macros.h defines ALT_NEW_CONTENT in its assembly part and ALT_NEW_CONSTENT in the C part. Most likely it is the latter that is wrong. Fixes: 6f4eea90465ad (riscv: Introduce alternative mechanism to apply errata solution) Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11ARC: fix CONFIG_HARDENED_USERCOPYVineet Gupta1-1/+1
Currently enabling this triggers a warning | usercopy: Kernel memory overwrite attempt detected to kernel text (offset 155633, size 11)! | usercopy: BUG: failure at mm/usercopy.c:99/usercopy_abort()! | |gcc generated __builtin_trap |Path: /bin/busybox |CPU: 0 PID: 84 Comm: init Not tainted 5.4.22 | |[ECR ]: 0x00090005 => gcc generated __builtin_trap |[EFA ]: 0x9024fcaa |[BLINK ]: usercopy_abort+0x8a/0x8c |[ERET ]: memfd_fcntl+0x0/0x470 |[STAT32]: 0x80080802 : IE K |... |... |Stack Trace: | memfd_fcntl+0x0/0x470 | usercopy_abort+0x8a/0x8c | __check_object_size+0x10e/0x138 | copy_strings+0x1f4/0x38c | __do_execve_file+0x352/0x848 | EV_Trap+0xcc/0xd0 The issue is triggered by an allocation in "init reclaimed" region. ARC _stext emcompasses the init region (for historical reasons we wanted the init.text to be under .text as well). This however trips up __check_object_size()->check_kernel_text_object() which treats this as object bleeding into kernel text. Fix that by rezoning _stext to start from regular kernel .text and leave out .init altogether. Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/15 Reported-by: Evgeniy Didin <didin@synopsys.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-06-11ARCv2: save ABI registers across signal handlingVineet Gupta2-0/+44
ARCv2 has some configuration dependent registers (r30, r58, r59) which could be targetted by the compiler. To keep the ABI stable, these were unconditionally part of the glibc ABI (sysdeps/unix/sysv/linux/arc/sys/ucontext.h:mcontext_t) however we missed populating them (by saving/restoring them across signal handling). This patch fixes the issue by - adding arcv2 ABI regs to kernel struct sigcontext - populating them during signal handling Change to struct sigcontext might seem like a glibc ABI change (although it primarily uses ucontext_t:mcontext_t) but the fact is - it has only been extended (existing fields are not touched) - the old sigcontext was ABI incomplete to begin with anyways Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/53 Cc: <stable@vger.kernel.org> Tested-by: kernel test robot <lkp@intel.com> Reported-by: Vladimir Isaev <isaev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-06-11riscv: code patching only works on !XIP_KERNELJisheng Zhang1-9/+9
Some features which need code patching such as KPROBES, DYNAMIC_FTRACE KGDB can only work on !XIP_KERNEL. Add dependencies for these features that rely on code patching. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11riscv: xip: support runtime trap patchingVitaly Wool2-5/+23
RISCV_ERRATA_ALTERNATIVE patches text at runtime which is currently not possible when the kernel is executed from the flash in XIP mode. Since runtime patching concerns only traps at the moment, let's just have all the traps reside in RAM anyway if RISCV_ERRATA_ALTERNATIVE is set. Thus, these functions will be patch-able even when the .text section is in flash. Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-10KVM: x86: Immediately reset the MMU context when the SMM flag is clearedSean Christopherson1-1/+4
Immediately reset the MMU context when the vCPU's SMM flag is cleared so that the SMM flag in the MMU role is always synchronized with the vCPU's flag. If RSM fails (which isn't correctly emulated), KVM will bail without calling post_leave_smm() and leave the MMU in a bad state. The bad MMU role can lead to a NULL pointer dereference when grabbing a shadow page's rmap for a page fault as the initial lookups for the gfn will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will use the shadow page's SMM flag, which comes from the MMU (=1). SMM has an entirely different set of memslots, and so the initial lookup can find a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1). general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline] RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947 Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44 RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002 R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000 FS: 000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline] mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604 __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline] direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769 kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline] kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065 vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122 vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428 vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494 kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722 kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:1069 [inline] __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x440ce9 Cc: stable@vger.kernel.org Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com Fixes: 9ec19493fb86 ("KVM: x86: clear SMM flags before loading state while leaving SMM") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609185619.992058-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10KVM: x86: Fix fall-through warnings for ClangGustavo A. R. Silva2-0/+2
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple of warnings by explicitly adding break statements instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Message-Id: <20210528200756.GA39320@embeddedor> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10KVM: SVM: fix doc warningsChenXiaoSong1-3/+3
Fix kernel-doc warnings: arch/x86/kvm/svm/avic.c:233: warning: Function parameter or member 'activate' not described in 'avic_update_access_page' arch/x86/kvm/svm/avic.c:233: warning: Function parameter or member 'kvm' not described in 'avic_update_access_page' arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'e' not described in 'get_pi_vcpu_info' arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'kvm' not described in 'get_pi_vcpu_info' arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'svm' not described in 'get_pi_vcpu_info' arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'vcpu_info' not described in 'get_pi_vcpu_info' arch/x86/kvm/svm/avic.c:1009: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Message-Id: <20210609122217.2967131-1-chenxiaosong2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUsCodyYao-oc1-2/+2
The following commit: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.") Got the old-style NMI watchdog logic wrong and broke it for basically every Intel CPU where it was active. Which is only truly old CPUs, so few people noticed. On CPUs with perf events support we turn off the old-style NMI watchdog, so it was pretty pointless to add the logic for X86_VENDOR_ZHAOXIN to begin with ... :-/ Anyway, the fix is to restore the old logic and add a 'break'. [ mingo: Wrote a new changelog. ] Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.") Signed-off-by: CodyYao-oc <CodyYao-oc@zhaoxin.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210607025335.9643-1-CodyYao-oc@zhaoxin.com
2021-06-10kvm: LAPIC: Restore guard to prevent illegal APIC register accessJim Mattson1-0/+3
Per the SDM, "any access that touches bytes 4 through 15 of an APIC register may cause undefined behavior and must not be executed." Worse, such an access in kvm_lapic_reg_read can result in a leak of kernel stack contents. Prior to commit 01402cf81051 ("kvm: LAPIC: write down valid APIC registers"), such an access was explicitly disallowed. Restore the guard that was removed in that commit. Fixes: 01402cf81051 ("kvm: LAPIC: write down valid APIC registers") Signed-off-by: Jim Mattson <jmattson@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Message-Id: <20210602205224.3189316-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5-20/+42
Pull kvm fixes from Paolo Bonzini: "Bugfixes, including a TLB flush fix that affects processors without nested page tables" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: fix previous commit for 32-bit builds kvm: avoid speculation-based attacks from out-of-range memslot accesses KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU sync KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message selftests: kvm: Add support for customized slot0 memory size KVM: selftests: introduce P47V64 for s390x KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior KVM: X86: MMU: Use the correct inherited permissions to get shadow page KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timer KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length after commit 238eca821cee
2021-06-09KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU syncLai Jiangshan1-0/+13
When using shadow paging, unload the guest MMU when emulating a guest TLB flush to ensure all roots are synchronized. From the guest's perspective, flushing the TLB ensures any and all modifications to its PTEs will be recognized by the CPU. Note, unloading the MMU is overkill, but is done to mirror KVM's existing handling of INVPCID(all) and ensure the bug is squashed. Future cleanup can be done to more precisely synchronize roots when servicing a guest TLB flush. If TDP is enabled, synchronizing the MMU is unnecessary even if nested TDP is in play, as a "legacy" TLB flush from L1 does not invalidate L1's TDP mappings. For EPT, an explicit INVEPT is required to invalidate guest-physical mappings; for NPT, guest mappings are always tagged with an ASID and thus can only be invalidated via the VMCB's ASID control. This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB. It was only recently exposed after Linux guests stopped flushing the local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16, "x86/mm/tlb: Flush remote and local TLBs concurrently"), but is also visible in Windows 10 guests. Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> [sean: massaged comment and changelog] Message-Id: <20210531172256.2908-1-jiangshanlai@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint messageSean Christopherson1-3/+3
Use the __string() machinery provided by the tracing subystem to make a copy of the string literals consumed by the "nested VM-Enter failed" tracepoint. A complete copy is necessary to ensure that the tracepoint can't outlive the data/memory it consumes and deference stale memory. Because the tracepoint itself is defined by kvm, if kvm-intel and/or kvm-amd are built as modules, the memory holding the string literals defined by the vendor modules will be freed when the module is unloaded, whereas the tracepoint and its data in the ring buffer will live until kvm is unloaded (or "indefinitely" if kvm is built-in). This bug has existed since the tracepoint was added, but was recently exposed by a new check in tracing to detect exactly this type of bug. fmt: '%s%s ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: ' WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0 CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:trace_check_vprintf+0x3be/0x3e0 Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20 RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027 RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8 RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8 R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4 R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000 FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0 Call Trace: trace_event_printf+0x5e/0x80 trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm] print_trace_line+0x1dd/0x4e0 s_show+0x45/0x150 seq_read_iter+0x2d5/0x4c0 seq_read+0x106/0x150 vfs_read+0x98/0x180 ksys_read+0x5f/0xe0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Steven Rostedt <rostedt@goodmis.org> Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Message-Id: <20210607175748.674002-1-seanjc@google.com>
2021-06-08Merge tag 'orphans-v5.13-rc6' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull orphan section fixes from Kees Cook: "These two corner case fixes have been in -next for about a week: - Avoid orphan section in ARM cpuidle (Arnd Bergmann) - Avoid orphan section with !SMP (Nathan Chancellor)" * tag 'orphans-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: vmlinux.lds.h: Avoid orphan section with !SMP ARM: cpuidle: Avoid orphan section warning
2021-06-08KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behaviorLai Jiangshan1-2/+4
In record_steal_time(), st->preempted is read twice, and trace_kvm_pv_tlb_flush() might output result inconsistent if kvm_vcpu_flush_tlb_guest() see a different st->preempted later. It is a very trivial problem and hardly has actual harm and can be avoided by reseting and reading st->preempted in atomic way via xchg(). Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20210531174628.10265-1-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08KVM: X86: MMU: Use the correct inherited permissions to get shadow pageLai Jiangshan1-5/+9
When computing the access permissions of a shadow page, use the effective permissions of the walk up to that point, i.e. the logic AND of its parents' permissions. Two guest PxE entries that point at the same table gfn need to be shadowed with different shadow pages if their parents' permissions are different. KVM currently uses the effective permissions of the last non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have full ("uwx") permissions, and the effective permissions are recorded only in role.access and merged into the leaves, this can lead to incorrect reuse of a shadow page and eventually to a missing guest protection page fault. For example, here is a shared pagetable: pgd[] pud[] pmd[] virtual address pointers /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--) /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-) pgd-| (shared pmd[] as above) \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--) \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--) pud1 and pud2 point to the same pmd table, so: - ptr1 and ptr3 points to the same page. - ptr2 and ptr4 points to the same page. (pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries) - First, the guest reads from ptr1 first and KVM prepares a shadow page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1. "u--" comes from the effective permissions of pgd, pud1 and pmd1, which are stored in pt->access. "u--" is used also to get the pagetable for pud1, instead of "uw-". - Then the guest writes to ptr2 and KVM reuses pud1 which is present. The hypervisor set up a shadow page for ptr2 with pt->access is "uw-" even though the pud1 pmd (because of the incorrect argument to kvm_mmu_get_page in the previous step) has role.access="u--". - Then the guest reads from ptr3. The hypervisor reuses pud1's shadow pmd for pud2, because both use "u--" for their permissions. Thus, the shadow pmd already includes entries for both pmd1 and pmd2. - At last, the guest writes to ptr4. This causes no vmexit or pagefault, because pud1's shadow page structures included an "uw-" page even though its role.access was "u--". Any kind of shared pagetable might have the similar problem when in virtual machine without TDP enabled if the permissions are different from different ancestors. In order to fix the problem, we change pt->access to be an array, and any access in it will not include permissions ANDed from child ptes. The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/ Remember to test it with TDP disabled. The problem had existed long before the commit 41074d07c78b ("KVM: MMU: Fix inherited permissions for emulated guest pte updates"), and it is hard to find which is the culprit. So there is no fixes tag here. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com> Cc: stable@vger.kernel.org Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timerWanpeng Li1-6/+11
According to the SDM 10.5.4.1: A write of 0 to the initial-count register effectively stops the local APIC timer, in both one-shot and periodic mode. However, the lapic timer oneshot/periodic mode which is emulated by vmx-preemption timer doesn't stop by writing 0 to TMICT since vmx->hv_deadline_tsc is still programmed and the guest will receive the spurious timer interrupt later. This patch fixes it by also cancelling the vmx-preemption timer when writing 0 to the initial-count register. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1623050385-100988-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length ↵Ashish Kalra1-4/+2
after commit 238eca821cee Commit 238eca821cee ("KVM: SVM: Allocate SEV command structures on local stack") uses the local stack to allocate the structures used to communicate with the PSP, which were earlier being kzalloced. This breaks SEV live migration for computing the SEND_START session length and SEND_UPDATE_DATA query length as session_len and trans_len and hdr_len fields are not zeroed respectively for the above commands before issuing the SEV Firmware API call, hence the firmware returns incorrect session length and update data header or trans length. Also the SEV Firmware API returns SEV_RET_INVALID_LEN firmware error for these length query API calls, and the return value and the firmware error needs to be passed to the userspace as it is, so need to remove the return check in the KVM code. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <20210607061532.27459-1-Ashish.Kalra@amd.com> Fixes: 238eca821cee ("KVM: SVM: Allocate SEV command structures on local stack") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-06Merge tag 'arm-soc-fixes-v5.13-2' of ↵Linus Torvalds26-113/+96
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A set of fixes that have been coming in over the last few weeks, the usual mix of fixes: - DT fixups for TI K3 - SATA drive detection fix for TI DRA7 - Power management fixes and a few build warning removals for OMAP - OP-TEE fix to use standard API for UUID exporting - DT fixes for a handful of i.MX boards And a few other smaller items" * tag 'arm-soc-fixes-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits) arm64: meson: select COMMON_CLK soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe() ARM: OMAP1: ams-delta: remove unused function ams_delta_camera_power bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells ARM: dts: imx7d-pico: Fix the 'tuning-step' property ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property arm64: dts: freescale: sl28: var1: fix RGMII clock and voltage arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage ARM: imx: pm-imx27: Include "common.h" arm64: dts: zii-ultra: fix 12V_MAIN voltage arm64: dts: zii-ultra: remove second GEN_3V3 regulator instance arm64: dts: ls1028a: fix memory node bus: ti-sysc: Fix am335x resume hang for usb otg module ARM: OMAP2+: Fix build warning when mmc_omap is not built ARM: OMAP1: isp1301-omap: Add missing gpiod_add_lookup_table function ARM: OMAP1: Fix use of possibly uninitialized irq variable optee: use export_uuid() to copy client UUID arm64: dts: ti: k3*: Introduce reg definition for interrupt routers arm64: dts: ti: k3-am65|j721e|am64: Map the dma / navigator subsystem via explicit ranges ...
2021-06-06Merge tag 'powerpc-5.13-5' of ↵Linus Torvalds8-57/+49
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix our KVM reverse map real-mode handling since we enabled huge vmalloc (in some configurations). Revert a recent change to our IOMMU code which broke some devices. Fix KVM handling of FSCR on P7/P8, which could have possibly let a guest crash it's Qemu. Fix kprobes validation of prefixed instructions across page boundary. Thanks to Alexey Kardashevskiy, Christophe Leroy, Fabiano Rosas, Frederic Barrat, Naveen N. Rao, and Nicholas Piggin" * tag 'powerpc-5.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: Revert "powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs" KVM: PPC: Book3S HV: Save host FSCR in the P7/8 path powerpc: Fix reverse map real-mode address lookup with huge vmalloc powerpc/kprobes: Fix validation of prefixed instructions across page boundary
2021-06-06Merge tag 'x86_urgent_for_v5.13-rc5' of ↵Linus Torvalds14-120/+132
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "A bunch of x86/urgent stuff accumulated for the last two weeks so lemme unload it to you. It should be all totally risk-free, of course. :-) - Fix out-of-spec hardware (1st gen Hygon) which does not implement MSR_AMD64_SEV even though the spec clearly states so, and check CPUID bits first. - Send only one signal to a task when it is a SEGV_PKUERR si_code type. - Do away with all the wankery of reserving X amount of memory in the first megabyte to prevent BIOS corrupting it and simply and unconditionally reserve the whole first megabyte. - Make alternatives NOP optimization work at an arbitrary position within the patched sequence because the compiler can put single-byte NOPs for alignment anywhere in the sequence (32-bit retpoline), vs our previous assumption that the NOPs are only appended. - Force-disable ENQCMD[S] instructions support and remove update_pasid() because of insufficient protection against FPU state modification in an interrupt context, among other xstate horrors which are being addressed at the moment. This one limits the fallout until proper enablement. - Use cpu_feature_enabled() in the idxd driver so that it can be build-time disabled through the defines in disabled-features.h. - Fix LVT thermal setup for SMI delivery mode by making sure the APIC LVT value is read before APIC initialization so that softlockups during boot do not happen at least on one machine. - Mark all legacy interrupts as legacy vectors when the IO-APIC is disabled and when all legacy interrupts are routed through the PIC" * tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Check SME/SEV support in CPUID first x86/fault: Don't send SIGSEGV twice on SEGV_PKUERR x86/setup: Always reserve the first 1M of RAM x86/alternative: Optimize single-byte NOPs at an arbitrary position x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() dmaengine: idxd: Use cpu_feature_enabled() x86/thermal: Fix LVT thermal setup for SMI delivery mode x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing
2021-06-06Merge tag 'ti-k3-dt-fixes-for-v5.13' of ↵Olof Johansson10-63/+45
git://git.kernel.org/pub/scm/linux/kernel/git/nmenon/linux into arm/fixes Devicetree fixes for TI K3 platforms for v5.13 merge window: These minor fixes include: * Fixups for device tree discovered during yaml conversion * Fixups for missing dma-coherent property in j7200 * Removal of camera sensor node from am65 evm dts to overlay as camera sensor boards are variable. * tag 'ti-k3-dt-fixes-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nmenon/linux: arm64: dts: ti: k3*: Introduce reg definition for interrupt routers arm64: dts: ti: k3-am65|j721e|am64: Map the dma / navigator subsystem via explicit ranges arm64: dts: ti: k3-*: Rename the TI-SCI node arm64: dts: ti: k3-am65-wakeup: Drop un-necessary properties from dmsc node arm64: dts: ti: k3-am65-wakeup: Add debug region to TI-SCI node arm64: dts: ti: k3-*: Rename the TI-SCI clocks node name arm64: dts: ti: j7200-main: Mark Main NAVSS as dma-coherent arm64: dts: ti: k3-am654-base-board: remove ov5640 Link: https://lore.kernel.org/r/20210518115634.467vgpbzplal5kou@obituary Signed-off-by: Olof Johansson <olof@lixom.net>
2021-06-06Merge tag 'omap-for-v5.13/fixes-pm' of ↵Olof Johansson4-19/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes PM and build warning fixes for omaps While chasing system suspend related regressions, I noticed few other issues related to PM would be good to have fixed: - UART idling does not always work for hardware autoidle features - am335x resume works only the first time unless musb module is loaded Then there are three patches for omap1 related warnings caused by the gpio changes, and one build warning fix for legacy mmc platform code when mmc is built as a loadable module. These can all be merged whenever suitable naturally. I've sent the more urgent SATA regression fix separately although it appears in this pull request too because of the branches merged. * tag 'omap-for-v5.13/fixes-pm' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP1: ams-delta: remove unused function ams_delta_camera_power bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act bus: ti-sysc: Fix am335x resume hang for usb otg module ARM: OMAP2+: Fix build warning when mmc_omap is not built ARM: OMAP1: isp1301-omap: Add missing gpiod_add_lookup_table function ARM: OMAP1: Fix use of possibly uninitialized irq variable Link: https://lore.kernel.org/r/pull-1622614772-543196@atomide.com Signed-off-by: Olof Johansson <olof@lixom.net>
2021-06-06Merge tag 'amlogic-fixes-v5.13-rc1' of ↵Olof Johansson1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into arm/fixes Amlogic fixes for v5.13-rc1 - arm64: meson: select COMMON_CLK to select a proper implementation of the clock API - soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe() * tag 'amlogic-fixes-v5.13-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: arm64: meson: select COMMON_CLK soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe() Link: https://lore.kernel.org/r/73e76706-f3f4-bebf-10dd-d2ec9106a234@baylibre.com Signed-off-by: Olof Johansson <olof@lixom.net>
2021-06-06Merge tag 'imx-fixes-5.13' of ↵Olof Johansson11-31/+39
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 5.13: - Fix missing-prototypes warning of 'imx27_pm_init' in i.MX27 platform pm code. - A couple of patches from Fabio Estevam to fix 'tuning-step' property in imx7d-meerkat96 and imx7d-pico DT. - Fix '#gpio-cells' of nxp,pca8574 device in imx6qdl-emcon-avari DT. - A couple of patches from Lucas Stach to fix regulator and voltage for imx8mq-zii-ultra board. - Add missing regulators for imx6q-dhcom to avoid possible instability issues. - Fix memory-controller settings for fsl-ls1028a DT. - Fix RGMII clock and voltage for a couple of fsl-ls1028a-kontron-sl28 boards. - Fix RGMII connection to QCA8334 switch for imx6dl-yapp4 board. * tag 'imx-fixes-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells ARM: dts: imx7d-pico: Fix the 'tuning-step' property ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property arm64: dts: freescale: sl28: var1: fix RGMII clock and voltage arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage ARM: imx: pm-imx27: Include "common.h" arm64: dts: zii-ultra: fix 12V_MAIN voltage arm64: dts: zii-ultra: remove second GEN_3V3 regulator instance arm64: dts: ls1028a: fix memory node ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch Link: https://lore.kernel.org/r/20210527011758.GD8194@dragon Signed-off-by: Olof Johansson <olof@lixom.net>
2021-06-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-16/+14
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: mips, mm (kfence, debug, pagealloc, memory-hotplug, hugetlb, kasan, and hugetlb), init, proc, lib, ocfs2, and mailmap" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mailmap: use private address for Michel Lespinasse ocfs2: fix data corruption by fallocate lib: crc64: fix kernel-doc warning mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY mm/kasan/init.c: fix doc warning proc: add .gitignore for proc-subset-pid selftest hugetlb: pass head page to remove_hugetlb_page() drivers/base/memory: fix trying offlining memory blocks with memory holes on aarch64 mm/page_alloc: fix counting of free pages after take off from buddy mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() pid: take a reference when initializing `cad_pid` kfence: use TASK_IDLE when awaiting allocation Revert "MIPS: make userspace mapping young by default"
2021-06-05Merge tag 'riscv-for-linus-5.13-rc5' of ↵Linus Torvalds4-5/+18
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - Build with '-mno-relax' when using LLVM's linker, which doesn't support linker relaxation. - A fix to build without SiFive's errata. - A fix to use PAs during init_resources() - A fix to avoid W+X mappings during boot. * tag 'riscv-for-linus-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Fix memblock_free() usages in init_resources() riscv: skip errata_cip_453.o if CONFIG_ERRATA_SIFIVE_CIP_453 is disabled riscv: mm: Fix W+X mappings at boot riscv: Use -mno-relax when using lld linker
2021-06-05Revert "MIPS: make userspace mapping young by default"Thomas Bogendoerfer1-16/+14
This reverts commit f685a533a7fab35c5d069dcd663f59c8e4171a75. The MIPS cache flush logic needs to know whether the mapping was already established to decide how to flush caches. This is done by checking the valid bit in the PTE. The commit above breaks this logic by setting the valid in the PTE in new mappings, which causes kernel crashes. Link: https://lkml.kernel.org/r/20210526094335.92948-1-tsbogend@alpha.franken.de Fixes: f685a533a7f ("MIPS: make userspace mapping young by default") Reported-by: Zhou Yanjie <zhouyanjie@wanyeetech.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Huang Pei <huangpei@loongson.cn> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-04x86/sev: Check SME/SEV support in CPUID firstPu Wen1-5/+6
The first two bits of the CPUID leaf 0x8000001F EAX indicate whether SEV or SME is supported, respectively. It's better to check whether SEV or SME is actually supported before accessing the MSR_AMD64_SEV to check whether SEV or SME is enabled. This is both a bare-metal issue and a guest/VM issue. Since the first generation Hygon Dhyana CPU doesn't support the MSR_AMD64_SEV, reading that MSR results in a #GP - either directly from hardware in the bare-metal case or via the hypervisor (because the RDMSR is actually intercepted) in the guest/VM case, resulting in a failed boot. And since this is very early in the boot phase, rdmsrl_safe()/native_read_msr_safe() can't be used. So check the CPUID bits first, before accessing the MSR. [ tlendacky: Expand and improve commit message. ] [ bp: Massage commit message. ] Fixes: eab696d8e8b9 ("x86/sev: Do not require Hypervisor CPUID bit for SEV guests") Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@vger.kernel.org> # v5.10+ Link: https://lkml.kernel.org/r/20210602070207.2480-1-puwen@hygon.cn
2021-06-04x86/fault: Don't send SIGSEGV twice on SEGV_PKUERRJiashuo Liang1-2/+2
__bad_area_nosemaphore() calls both force_sig_pkuerr() and force_sig_fault() when handling SEGV_PKUERR. This does not cause problems because the second signal is filtered by the legacy_queue() check in __send_signal() because in both cases, the signal is SIGSEGV, the second one seeing that the first one is already pending. This causes the kernel to do unnecessary work so send the signal only once for SEGV_PKUERR. [ bp: Massage commit message. ] Fixes: 9db812dbb29d ("signal/x86: Call force_sig_pkuerr from __bad_area_nosemaphore") Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Jiashuo Liang <liangjs@pku.edu.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lkml.kernel.org/r/20210601085203.40214-1-liangjs@pku.edu.cn