summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2019-09-08Revert "x86/apic: Include the LDR when clearing out APIC registers"Linus Torvalds1-4/+0
This reverts commit 558682b5291937a70748d36fd9ba757fb25b99ae. Chris Wilson reports that it breaks his CPU hotplug test scripts. In particular, it breaks offlining and then re-onlining the boot CPU, which we treat specially (and the BIOS does too). The symptoms are that we can offline the CPU, but it then does not come back online again: smpboot: CPU 0 is now offline smpboot: Booting Node 0 Processor 0 APIC 0x0 smpboot: do_boot_cpu failed(-1) to wakeup CPU#0 Thomas says he knows why it's broken (my personal suspicion: our magic handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix is to just revert it, since we've never touched the LDR bits before, and it's not worth the risk to do anything else at this stage. [ Hotpluging of the boot CPU is special anyway, and should be off by default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the cpu0_hotplug kernel parameter. In general you should not do it, and it has various known limitations (hibernate and suspend require the boot CPU, for example). But it should work, even if the boot CPU is special and needs careful treatment - Linus ] Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/ Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bandan Das <bsd@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-06Merge tag 'armsoc-fixes' of ↵Linus Torvalds2-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are three more fixes for this week: - The Windows-on-ARM laptops require a workaround to prevent crashing at boot from ACPI - The Renesas 'draak' board needs one bugfix for the backlight regulator - Also for Renesas, the 'hihope' board accidentally had its eMMC turned off in the 5.3 merge window" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: qcom: geni: Provide parameter error checking arm64: dts: renesas: hihope-common: Fix eMMC status arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name
2019-09-06Merge tag 'powerpc-5.3-5' of ↵Linus Torvalds2-18/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for a boot hang on some Freescale machines when PREEMPT is enabled. Two CVE fixes for bugs in our handling of FP registers and transactional memory, both of which can result in corrupted FP state, or FP state leaking between processes. Thanks to: Chris Packham, Christophe Leroy, Gustavo Romero, Michael Neuling" * tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup
2019-09-05Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds3-4/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - EFI boot fix for signed kernels - an AC flags fix related to UBSAN - Hyper-V infinite loop fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyper-v: Fix overflow bug in fill_gva_list() x86/uaccess: Don't leak the AC flags into __get_user() argument evaluation x86/boot: Preserve boot_params.secure_boot from sanitizing
2019-09-05Merge tag 'renesas-fixes2-for-v5.3' of ↵Arnd Bergmann1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into arm/fixes Second Round of Renesas ARM Based SoC Fixes for v5.3 * RZ/G2M based HiHope main board - Re-enabled accidently disabled SDHI3 (eMMC) support * tag 'renesas-fixes2-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: dts: renesas: hihope-common: Fix eMMC status Link: https://lore.kernel.org/r/cover.1567675986.git.horms+renesas@verge.net.au Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-04Merge tag 'renesas-fixes-for-v5.3' of ↵Arnd Bergmann1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into arm/fixes Renesas ARM Based SoC Fixes for v5.3 * R-Car D3 (r8a77995) based Draak Board - Correct backlight regulator name in device tree * tag 'renesas-fixes-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-04powerpc/tm: Fix restoring FP/VMX facility incorrectly on interruptsGustavo Romero1-16/+2
When in userspace and MSR FP=0 the hardware FP state is unrelated to the current process. This is extended for transactions where if tbegin is run with FP=0, the hardware checkpoint FP state will also be unrelated to the current process. Due to this, we need to ensure this hardware checkpoint is updated with the correct state before we enable FP for this process. Unfortunately we get this wrong when returning to a process from a hardware interrupt. A process that starts a transaction with FP=0 can take an interrupt. When the kernel returns back to that process, we change to FP=1 but with hardware checkpoint FP state not updated. If this transaction is then rolled back, the FP registers now contain the wrong state. The process looks like this: Userspace: Kernel Start userspace with MSR FP=0 TM=1 < ----- ... tbegin bne Hardware interrupt ---- > <do_IRQ...> .... ret_from_except restore_math() /* sees FP=0 */ restore_fp() tm_active_with_fp() /* sees FP=1 (Incorrect) */ load_fp_state() FP = 0 -> 1 < ----- Return to userspace with MSR TM=1 FP=1 with junk in the FP TM checkpoint TM rollback reads FP junk When returning from the hardware exception, tm_active_with_fp() is incorrectly making restore_fp() call load_fp_state() which is setting FP=1. The fix is to remove tm_active_with_fp(). tm_active_with_fp() is attempting to handle the case where FP state has been changed inside a transaction. In this case the checkpointed and transactional FP state is different and hence we must restore the FP state (ie. we can't do lazy FP restore inside a transaction that's used FP). It's safe to remove tm_active_with_fp() as this case is handled by restore_tm_state(). restore_tm_state() detects if FP has been using inside a transaction and will set load_fp and call restore_math() to ensure the FP state (checkpoint and transaction) is restored. This is a data integrity problem for the current process as the FP registers are corrupted. It's also a security problem as the FP registers from one process may be leaked to another. Similarly for VMX. A simple testcase to replicate this will be posted to tools/testing/selftests/powerpc/tm/tm-poison.c This fixes CVE-2019-15031. Fixes: a7771176b439 ("powerpc: Don't enable FP/Altivec if not checkpointed") Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Gustavo Romero <gromero@linux.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com
2019-09-04powerpc/tm: Fix FP/VMX unavailable exceptions inside a transactionGustavo Romero1-1/+2
When we take an FP unavailable exception in a transaction we have to account for the hardware FP TM checkpointed registers being incorrect. In this case for this process we know the current and checkpointed FP registers must be the same (since FP wasn't used inside the transaction) hence in the thread_struct we copy the current FP registers to the checkpointed ones. This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr to determine if FP was on when in userspace. thread->ckpt_regs.msr represents the state of the MSR when exiting userspace. This is setup by check_if_tm_restore_required(). Unfortunatley there is an optimisation in giveup_all() which returns early if tsk->thread.regs->msr (via local variable `usermsr`) has FP=VEC=VSX=SPE=0. This optimisation means that check_if_tm_restore_required() is not called and hence thread->ckpt_regs.msr is not updated and will contain an old value. This can happen if due to load_fp=255 we start a userspace process with MSR FP=1 and then we are context switched out. In this case thread->ckpt_regs.msr will contain FP=1. If that same process is then context switched in and load_fp overflows, MSR will have FP=0. If that process now enters a transaction and does an FP instruction, the FP unavailable will not update thread->ckpt_regs.msr (the bug) and MSR FP=1 will be retained in thread->ckpt_regs.msr. tm_reclaim_thread() will then not perform the required memcpy and the checkpointed FP regs in the thread struct will contain the wrong values. The code path for this happening is: Userspace: Kernel Start userspace with MSR FP/VEC/VSX/SPE=0 TM=1 < ----- ... tbegin bne fp instruction FP unavailable ---- > fp_unavailable_tm() tm_reclaim_current() tm_reclaim_thread() giveup_all() return early since FP/VMX/VSX=0 /* ckpt MSR not updated (Incorrect) */ tm_reclaim() /* thread_struct ckpt FP regs contain junk (OK) */ /* Sees ckpt MSR FP=1 (Incorrect) */ no memcpy() performed /* thread_struct ckpt FP regs not fixed (Incorrect) */ tm_recheckpoint() /* Put junk in hardware checkpoint FP regs */ .... < ----- Return to userspace with MSR TM=1 FP=1 with junk in the FP TM checkpoint TM rollback reads FP junk This is a data integrity problem for the current process as the FP registers are corrupted. It's also a security problem as the FP registers from one process may be leaked to another. This patch moves up check_if_tm_restore_required() in giveup_all() to ensure thread->ckpt_regs.msr is updated correctly. A simple testcase to replicate this will be posted to tools/testing/selftests/powerpc/tm/tm-poison.c Similarly for VMX. This fixes CVE-2019-15030. Fixes: f48e91e87e67 ("powerpc/tm: Fix FP and VMX register corruption") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com
2019-09-02x86/hyper-v: Fix overflow bug in fill_gva_list()Tianyu Lan1-3/+5
When the 'start' parameter is >= 0xFF000000 on 32-bit systems, or >= 0xFFFFFFFF'FF000000 on 64-bit systems, fill_gva_list() gets into an infinite loop. With such inputs, 'cur' overflows after adding HV_TLB_FLUSH_UNIT and always compares as less than end. Memory is filled with guest virtual addresses until the system crashes. Fix this by never incrementing 'cur' to be larger than 'end'. Reported-by: Jong Hyun Park <park.jonghyun@yonsei.ac.kr> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 2ffd9e33ce4a ("x86/hyper-v: Use hypercall for remote TLB flush") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-02x86/uaccess: Don't leak the AC flags into __get_user() argument evaluationPeter Zijlstra1-1/+3
Identical to __put_user(); the __get_user() argument evalution will too leak UBSAN crud into the __uaccess_begin() / __uaccess_end() region. While uncommon this was observed to happen for: drivers/xen/gntdev.c: if (__get_user(old_status, batch->status[i])) where UBSAN added array bound checking. This complements commit: 6ae865615fc4 ("x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation") Tested-by Sedat Dilek <sedat.dilek@gmail.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: broonie@kernel.org Cc: sfr@canb.auug.org.au Cc: akpm@linux-foundation.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: mhocko@suse.cz Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20190829082445.GM2369@hirez.programming.kicks-ass.net
2019-09-02x86/boot: Preserve boot_params.secure_boot from sanitizingJohn S. Gruber1-0/+1
Commit a90118c445cc ("x86/boot: Save fields explicitly, zero out everything else") now zeroes the secure boot setting information (enabled/disabled/...) passed by the boot loader or by the kernel's EFI handover mechanism. The problem manifests itself with signed kernels using the EFI handoff protocol with grub and the kernel loses the information whether secure boot is enabled in the firmware, i.e., the log message "Secure boot enabled" becomes "Secure boot could not be determined". efi_main() arch/x86/boot/compressed/eboot.c sets this field early but it is subsequently zeroed by the above referenced commit. Include boot_params.secure_boot in the preserve field list. [ bp: restructure commit message and massage. ] Fixes: a90118c445cc ("x86/boot: Save fields explicitly, zero out everything else") Signed-off-by: John S. Gruber <JohnSGruber@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Mark Brown <broonie@kernel.org> Cc: stable <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/CAPotdmSPExAuQcy9iAHqX3js_fc4mMLQOTr5RBGvizyCOPcTQQ@mail.gmail.com
2019-09-01Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds7-39/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes for x86: - Fix the bogus detection of 32bit user mode for uretprobes which caused corruption of the user return address resulting in application crashes. In the uprobes handler in_ia32_syscall() is obviously always returning false on a 64bit kernel. Use user_64bit_mode() instead which works correctly. - Prevent large page splitting when ftrace flips RW/RO on the kernel text which caused iTLB performance issues. Ftrace wants to be converted to text_poke() which avoids the problem, but for now allow large page preservation in the static protections check when the change request spawns a full large page. - Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC is configured via device tree. In the device tree case the GSI 1:1 mapping is meaningless therefore the lower bound which protects the GSI range on ACPI machines is irrelevant. Return the lower bound which the core hands to the function instead of blindly returning 0 which causes the core to allocate the invalid virtual interupt number 0 which in turn prevents all drivers from allocating and requesting an interrupt. - Remove the bogus initialization of LDR and DFR in the 32bit bigsmp APIC driver. That uses physical destination mode where LDR/DFR are ignored, but the initialization and the missing clear of LDR caused the APIC to be left in a inconsistent state on kexec/reboot. - Clear LDR when clearing the APIC registers so the APIC is in a well defined state. - Initialize variables proper in the find_trampoline_placement() code. - Silence GCC( build warning for the real mode part of the build" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement() x86/apic: Include the LDR when clearing out APIC registers x86/apic: Do not initialize LDR and DFR for bigsmp uprobes/x86: Fix detection of 32-bit user mode x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
2019-09-01Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-7/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two fixes for perf x86 hardware implementations: - Restrict the period on Nehalem machines to prevent perf from hogging the CPU - Prevent the AMD IBS driver from overwriting the hardwre controlled and pre-seeded reserved bits (0-6) in the count register which caused a sample bias for dispatched micro-ops" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops perf/x86/intel: Restrict period on Nehalem
2019-08-31Merge tag 'trace-v5.3-rc6' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Small fixes and minor cleanups for tracing: - Make exported ftrace function not static - Fix NULL pointer dereference in reading probes as they are created - Fix NULL pointer dereference in k/uprobe clean up path - Various documentation fixes" * tag 'trace-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Correct kdoc formats ftrace/x86: Remove mcount() declaration tracing/probe: Fix null pointer dereference tracing: Make exported ftrace_set_clr_event non-static ftrace: Check for successful allocation of hash ftrace: Check for empty hash and comment the race with registering probes ftrace: Fix NULL pointer dereference in t_probe_next()
2019-08-31Merge tag 'riscv/for-v5.3-rc7' of ↵Linus Torvalds2-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Paul Walmsley: "One significant fix for 32-bit RISC-V systems: Fix the RV32 memory map to prevent userspace from corrupting the FIXMAP area. Without this patch, the system can crash very early during the boot" * tag 'riscv/for-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Fix FIXMAP area corruption on RV32 systems
2019-08-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds6-16/+19
Pull KVM fixes from Radim Krčmář: "PPC: - Fix bug which could leave locks held in the host on return to a guest. x86: - Prevent infinitely looping emulation of a failing syscall while single stepping. - Do not crash the host when nesting is disabled" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Don't update RIP or do single-step on faulting emulation KVM: x86: hyper-v: don't crash on KVM_GET_SUPPORTED_HV_CPUID when kvm_intel.nested is disabled KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling
2019-08-31ftrace/x86: Remove mcount() declarationJisheng Zhang1-1/+0
Commit 562e14f72292 ("ftrace/x86: Remove mcount support") removed the support for using mcount, so we could remove the mcount() declaration to clean up. Link: http://lkml.kernel.org/r/20190826170150.10f101ba@xhacker.debian Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-08-31arm64: dts: renesas: hihope-common: Fix eMMC statusFabrizio Castro1-0/+1
SDHI3 got accidentally disabled while adding USB 2.0 support, this patch fixes it. Fixes: 734d277f412a ("arm64: dts: renesas: hihope-common: Add USB 2.0 support") Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2019-08-30Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds2-2/+8
Pull ARM fixes from Russell King: "Three fixes for ARM this time around: - A fix for update_sections_early() to cope with NULL ->mm pointers. - A correction to the backtrace code to allow proper backtraces. - Reinforcement of pfn_valid() with PFNs >= 4GiB" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8901/1: add a criteria for pfn_valid of arm ARM: 8897/1: check stmfd instruction using right shift ARM: 8874/1: mm: only adjust sections of valid mm structures
2019-08-30Merge tag 'armsoc-fixes' of ↵Linus Torvalds21-85/+120
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "The majority of the fixes this time are for OMAP hardware, here is a breakdown of the significant changes: Various device tree bug fixes: - TI am57xx boards need a voltage level fix to avoid damaging SD cards - vf610-bk4 fails to detect its flash due to an incorrect description - meson-g12a USB phy configuration fails - meson-g12b reboot should not power off the SD card - Some corrections for apparently harmless differences from the documentation. Regression fixes: - ams-delta FIQ interrupts broke in 5.3 - TI am3/am4 mmc controllers broke in 5.2 The logic_pio driver (used on some Huawei ARM servers) got a few bug fixes for reliability. And a couple of compile-time warning fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits) soc: ixp4xx: Protect IXP4xx SoC drivers by ARCH_IXP4XX || COMPILE_TEST soc: ti: pm33xx: Make two symbols static soc: ti: pm33xx: Fix static checker warnings ARM: OMAP: dma: Mark expected switch fall-throughs ARM: dts: Fix incomplete dts data for am3 and am4 mmc bus: ti-sysc: Simplify cleanup upon failures in sysc_probe() ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack ARM: dts: dra74x: Fix iodelay configuration for mmc3 ARM: dts: am335x: Fix UARTs length ARM: OMAP2+: Fix omap4 errata warning on other SoCs bus: hisi_lpc: Add .remove method to avoid driver unbind crash bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free lib: logic_pio: Add logic_pio_unregister_range() lib: logic_pio: Avoid possible overlap for unregistering regions lib: logic_pio: Fix RCU usage arm64: dts: amlogic: odroid-n2: keep SD card regulator always on arm64: dts: meson-g12a-sei510: enable IR controller arm64: dts: meson-g12a: add missing dwc2 phy-names ARM: dts: vf610-bk4: Fix qspi node description ARM: dts: Fix incorrect dcan register mapping for am3, am4 and dra7 ...
2019-08-30perf/x86/amd/ibs: Fix sample bias for dispatched micro-opsKim Phillips2-7/+18
When counting dispatched micro-ops with cnt_ctl=1, in order to prevent sample bias, IBS hardware preloads the least significant 7 bits of current count (IbsOpCurCnt) with random values, such that, after the interrupt is handled and counting resumes, the next sample taken will be slightly perturbed. The current count bitfield is in the IBS execution control h/w register, alongside the maximum count field. Currently, the IBS driver writes that register with the maximum count, leaving zeroes to fill the current count field, thereby overwriting the random bits the hardware preloaded for itself. Fix the driver to actually retain and carry those random bits from the read of the IBS control register, through to its write, instead of overwriting the lower current count bits with zeroes. Tested with: perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload> 'perf annotate' output before: 15.70 65: addsd %xmm0,%xmm1 17.30 add $0x1,%rax 15.88 cmp %rdx,%rax je 82 17.32 72: test $0x1,%al jne 7c 7.52 movapd %xmm1,%xmm0 5.90 jmp 65 8.23 7c: sqrtsd %xmm1,%xmm0 12.15 jmp 65 'perf annotate' output after: 16.63 65: addsd %xmm0,%xmm1 16.82 add $0x1,%rax 16.81 cmp %rdx,%rax je 82 16.69 72: test $0x1,%al jne 7c 8.30 movapd %xmm1,%xmm0 8.13 jmp 65 8.24 7c: sqrtsd %xmm1,%xmm0 8.39 jmp 65 Tested on Family 15h and 17h machines. Machines prior to family 10h Rev. C don't have the RDWROPCNT capability, and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't affect their operation. It is unknown why commit db98c5faf8cb ("perf/x86: Implement 64-bit counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt field; the number of preloaded random bits has always been 7, AFAICT. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org> Cc: <x86@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Borislav Petkov" <bp@alien8.de> Cc: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: "Namhyung Kim" <namhyung@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
2019-08-30perf/x86/intel: Restrict period on NehalemJosh Hunt1-0/+6
We see our Nehalem machines reporting 'perfevents: irq loop stuck!' in some cases when using perf: perfevents: irq loop stuck! WARNING: CPU: 0 PID: 3485 at arch/x86/events/intel/core.c:2282 intel_pmu_handle_irq+0x37b/0x530 ... RIP: 0010:intel_pmu_handle_irq+0x37b/0x530 ... Call Trace: <NMI> ? perf_event_nmi_handler+0x2e/0x50 ? intel_pmu_save_and_restart+0x50/0x50 perf_event_nmi_handler+0x2e/0x50 nmi_handle+0x6e/0x120 default_do_nmi+0x3e/0x100 do_nmi+0x102/0x160 end_repeat_nmi+0x16/0x50 ... ? native_write_msr+0x6/0x20 ? native_write_msr+0x6/0x20 </NMI> intel_pmu_enable_event+0x1ce/0x1f0 x86_pmu_start+0x78/0xa0 x86_pmu_enable+0x252/0x310 __perf_event_task_sched_in+0x181/0x190 ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x35/0x70 ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x35/0x70 finish_task_switch+0x158/0x260 __schedule+0x2f6/0x840 ? hrtimer_start_range_ns+0x153/0x210 schedule+0x32/0x80 schedule_hrtimeout_range_clock+0x8a/0x100 ? hrtimer_init+0x120/0x120 ep_poll+0x2f7/0x3a0 ? wake_up_q+0x60/0x60 do_epoll_wait+0xa9/0xc0 __x64_sys_epoll_wait+0x1a/0x20 do_syscall_64+0x4e/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fdeb1e96c03 ... Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: acme@kernel.org Cc: Josh Hunt <johunt@akamai.com> Cc: bpuranda@akamai.com Cc: mingo@redhat.com Cc: jolsa@redhat.com Cc: tglx@linutronix.de Cc: namhyung@kernel.org Cc: alexander.shishkin@linux.intel.com Link: https://lkml.kernel.org/r/1566256411-18820-1-git-send-email-johunt@akamai.com
2019-08-29x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel textThomas Gleixner1-8/+18
ftrace does not use text_poke() for enabling trace functionality. It uses its own mechanism and flips the whole kernel text to RW and back to RO. The CPA rework removed a loop based check of 4k pages which tried to preserve a large page by checking each 4k page whether the change would actually cover all pages in the large page. This resulted in endless loops for nothing as in testing it turned out that it actually never preserved anything. Of course testing missed to include ftrace, which is the one and only case which benefitted from the 4k loop. As a consequence enabling function tracing or ftrace based kprobes results in a full 4k split of the kernel text, which affects iTLB performance. The kernel RO protection is the only valid case where this can actually preserve large pages. All other static protections (RO data, data NX, PCI, BIOS) are truly static. So a conflict with those protections which results in a split should only ever happen when a change of memory next to a protected region is attempted. But these conflicts are rightfully splitting the large page to preserve the protected regions. In fact a change to the protected regions itself is a bug and is warned about. Add an exception for the static protection check for kernel text RO when the to be changed region spawns a full large page which allows to preserve the large mappings. This also prevents the syslog to be spammed about CPA violations when ftrace is used. The exception needs to be removed once ftrace switched over to text_poke() which avoids the whole issue. Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely") Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Song Liu <songliubraving@fb.com> Reviewed-by: Song Liu <songliubraving@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
2019-08-29Merge tag 'Wimplicit-fallthrough-5.3-rc7' of ↵Linus Torvalds2-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo A. R. Silva: "Fix fall-through warnings on arc and nds32 for multiple configurations" * tag 'Wimplicit-fallthrough-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: nds32: Mark expected switch fall-throughs ARC: unwind: Mark expected switch fall-through
2019-08-29nds32: Mark expected switch fall-throughsGustavo A. R. Silva1-0/+2
Mark switch cases where we are expecting to fall through. This patch fixes the following warnings (Building: allmodconfig nds32): include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=] arch/nds32/kernel/signal.c:362:20: warning: this statement may fall through [-Wimplicit-fallthrough=] arch/nds32/kernel/signal.c:315:7: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=] include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=] Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-08-29ARC: unwind: Mark expected switch fall-throughGustavo A. R. Silva1-0/+1
Mark switch cases where we are expecting to fall through. This patch fixes the following warnings (Building: haps_hs_defconfig arc): arch/arc/kernel/unwind.c: In function ‘read_pointer’: ./include/linux/compiler.h:328:5: warning: this statement may fall through [-Wimplicit-fallthrough=] do { \ ^ ./include/linux/compiler.h:338:2: note: in expansion of macro ‘__compiletime_assert’ __compiletime_assert(condition, msg, prefix, suffix) ^~~~~~~~~~~~~~~~~~~~ ./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’ _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^~~~~~~~~~~~~~~~~~~ ./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’ #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^~~~~~~~~~~~~~~~~~ ./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’ BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^~~~~~~~~~~~~~~~ arch/arc/kernel/unwind.c:573:3: note: in expansion of macro ‘BUILD_BUG_ON’ BUILD_BUG_ON(sizeof(u32) != sizeof(value)); ^~~~~~~~~~~~ arch/arc/kernel/unwind.c:575:2: note: here case DW_EH_PE_native: ^~~~ Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-08-29ARM: 8901/1: add a criteria for pfn_valid of armzhaoyang1-0/+5
pfn_valid can be wrong when parsing a invalid pfn whose phys address exceeds BITS_PER_LONG as the MSB will be trimed when shifted. The issue originally arise from bellowing call stack, which corresponding to an access of the /proc/kpageflags from userspace with a invalid pfn parameter and leads to kernel panic. [46886.723249] c7 [<c031ff98>] (stable_page_flags) from [<c03203f8>] [46886.723264] c7 [<c0320368>] (kpageflags_read) from [<c0312030>] [46886.723280] c7 [<c0311fb0>] (proc_reg_read) from [<c02a6e6c>] [46886.723290] c7 [<c02a6e24>] (__vfs_read) from [<c02a7018>] [46886.723301] c7 [<c02a6f74>] (vfs_read) from [<c02a778c>] [46886.723315] c7 [<c02a770c>] (SyS_pread64) from [<c0108620>] (ret_fast_syscall+0x0/0x28) Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-08-29RISC-V: Fix FIXMAP area corruption on RV32 systemsAnup Patel2-6/+10
Currently, various virtual memory areas of Linux RISC-V are organized in increasing order of their virtual addresses is as follows: 1. User space area (This is lowest area and starts at 0x0) 2. FIXMAP area 3. VMALLOC area 4. Kernel area (This is highest area and starts at PAGE_OFFSET) The maximum size of user space aread is represented by TASK_SIZE. On RV32 systems, TASK_SIZE is defined as VMALLOC_START which causes the user space area to overlap the FIXMAP area. This allows user space apps to potentially corrupt the FIXMAP area and kernel OF APIs will crash whenever they access corrupted FDT in the FIXMAP area. On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas happen to overlap so we don't see any FIXMAP area corruptions. This patch fixes FIXMAP area corruption on RV32 systems by setting TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE, and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic header includes. Signed-off-by: Anup Patel <anup.patel@wdc.com> Tested-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-28x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence ↵Linus Torvalds1-0/+1
GCC9 build warning One of the very few warnings I have in the current build comes from arch/x86/boot/edd.c, where I get the following with a gcc9 build: arch/x86/boot/edd.c: In function ‘query_edd’: arch/x86/boot/edd.c:148:11: warning: taking address of packed member of ‘struct boot_params’ may result in an unaligned pointer value [-Waddress-of-packed-member] 148 | mbrptr = boot_params.edd_mbr_sig_buffer; | ^~~~~~~~~~~ This warning triggers because we throw away all the CFLAGS and then make a new set for REALMODE_CFLAGS, so the -Wno-address-of-packed-member we added in the following commit is not present: 6f303d60534c ("gcc-9: silence 'address-of-packed-member' warning") The simplest solution for now is to adjust the warning for this version of CFLAGS as well, but it would definitely make sense to examine whether REALMODE_CFLAGS could be derived from CFLAGS, so that it picks up changes in the compiler flags environment automatically. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-27KVM: x86: Don't update RIP or do single-step on faulting emulationSean Christopherson1-4/+5
Don't advance RIP or inject a single-step #DB if emulation signals a fault. This logic applies to all state updates that are conditional on clean retirement of the emulation instruction, e.g. updating RFLAGS was previously handled by commit 38827dbd3fb85 ("KVM: x86: Do not update EFLAGS on faulting emulation"). Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with ctxt->_eip until emulation "retires" anyways. Skipping #DB injection fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation overwriting the #UD with #DB and thus restarting the bad SYSCALL over and over. Cc: Nadav Amit <nadav.amit@gmail.com> Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@kernel.org> Fixes: 663f4c61b803 ("KVM: x86: handle singlestep during emulation") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-08-27KVM: x86: hyper-v: don't crash on KVM_GET_SUPPORTED_HV_CPUID when ↵Vitaly Kuznetsov3-8/+6
kvm_intel.nested is disabled If kvm_intel is loaded with nested=0 parameter an attempt to perform KVM_GET_SUPPORTED_HV_CPUID results in OOPS as nested_get_evmcs_version hook in kvm_x86_ops is NULL (we assign it in nested_vmx_hardware_setup() and this only happens in case nested is enabled). Check that kvm_x86_ops->nested_get_evmcs_version is not NULL before calling it. With this, we can remove the stub from svm as it is no longer needed. Cc: <stable@vger.kernel.org> Fixes: e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-08-27Merge tag 'arc-5.3-rc7' of ↵Linus Torvalds8-28/+141
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC updates from Vineet Gupta: - support for Edge Triggered IRQs in ARC IDU intc - other fixes here and there * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: prefer __section from compiler_attributes.h dt-bindings: IDU-intc: Add support for edge-triggered interrupts dt-bindings: IDU-intc: Clean up documentation ARCv2: IDU-intc: Add support for edge-triggered interrupts ARC: unwind: Mark expected switch fall-throughs ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations ARC: fix typo in setup_dma_ops log message ARCv2: entry: early return from exception need not clear U & DE bits
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-5/+7
Pull networking fixes from David Miller: 1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya Leoshkevich. 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron. 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio. 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk. 5) Fix sta object leak in mac80211, from Johannes Berg. 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli. 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit. 8) Use after free in flow dissector, from Jakub Sitnicki. 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu. 10) proto_register() memory leaks, from Zhang Lin. 11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) r8152: Set memory to all 0xFFs on failed reg reads openvswitch: Fix conntrack cache with timeout ipv4: mpls: fix mpls_xmit for iptunnel nexthop: Fix nexthop_num_path for blackhole nexthops net: rds: add service level support in rds-info net: route dump netlink NLM_F_MULTI flag missing s390/qeth: reject oversized SNMP requests sock: fix potential memory leak in proto_register() MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode ipv4/icmp: fix rt dst dev null pointer dereference openvswitch: Fix log message in ovs conntrack bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 bpf: fix use after free in prog symbol exposure bpf: fix precision tracking in presence of bpf2bpf calls flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH Revert "r8169: remove not needed call to dma_sync_single_for_device" ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev net/ncsi: Fix the payload copying for the request coming from Netlink qed: Add cleanup in qed_slowpath_start() ...
2019-08-27Merge tag 'kvm-ppc-fixes-5.3-1' of ↵Radim Krčmář2-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc KVM/PPC fix for 5.3 - Fix bug which could leave locks locked in the host on return to a guest.
2019-08-27x86/boot/compressed/64: Fix missing initialization in ↵Kirill A. Shutemov1-1/+1
find_trampoline_placement() Gustavo noticed that 'new' can be left uninitialized if 'bios_start' happens to be less or equal to 'entry->addr + entry->size'. Initialize the variable at the begin of the iteration to the current value of 'bios_start'. Fixes: 0a46fff2f910 ("x86/boot/compressed/64: Fix boot on machines with broken E820 table") Reported-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190826133326.7cxb4vbmiawffv2r@box
2019-08-27KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handlingAlexey Kardashevskiy2-4/+8
H_PUT_TCE_INDIRECT handlers receive a page with up to 512 TCEs from a guest. Although we verify correctness of TCEs before we do anything with the existing tables, there is a small window when a check in kvmppc_tce_validate might pass and right after that the guest alters the page of TCEs, causing an early exit from the handler and leaving srcu_read_lock(&vcpu->kvm->srcu) (virtual mode) or lock_rmap(rmap) (real mode) locked. This fixes the bug by jumping to the common exit code with an appropriate unlock. Cc: stable@vger.kernel.org # v4.11+ Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-08-26x86/apic: Include the LDR when clearing out APIC registersBandan Das1-0/+4
Although APIC initialization will typically clear out the LDR before setting it, the APIC cleanup code should reset the LDR. This was discovered with a 32-bit KVM guest jumping into a kdump kernel. The stale bits in the LDR triggered a bug in the KVM APIC implementation which caused the destination mapping for VCPUs to be corrupted. Note that this isn't intended to paper over the KVM APIC bug. The kernel has to clear the LDR when resetting the APIC registers except when X2APIC is enabled. This lacks a Fixes tag because missing to clear LDR goes way back into pre git history. [ tglx: Made x2apic_enabled a function call as required ] Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190826101513.5080-3-bsd@redhat.com
2019-08-26x86/apic: Do not initialize LDR and DFR for bigsmpBandan Das1-22/+2
Legacy apic init uses bigsmp for smp systems with 8 and more CPUs. The bigsmp APIC implementation uses physical destination mode, but it nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with multiple bit being set. This does not cause a functional problem because LDR and DFR are ignored when physical destination mode is active, but it triggered a problem on a 32-bit KVM guest which jumps into a kdump kernel. The multiple bits set unearthed a bug in the KVM APIC implementation. The code which creates the logical destination map for VCPUs ignores the disabled state of the APIC and ends up overwriting an existing valid entry and as a result, APIC calibration hangs in the guest during kdump initialization. Remove the bogus LDR/DFR initialization. This is not intended to work around the KVM APIC bug. The LDR/DFR ininitalization is wrong on its own. The issue goes back into the pre git history. The fixes tag is the commit in the bitkeeper import which introduced bigsmp support in 2003. git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: db7b9e9f26b8 ("[PATCH] Clustered APIC setup for >8 CPU systems") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.com
2019-08-26arc: prefer __section from compiler_attributes.hNick Desaulniers2-6/+5
Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-08-26ARCv2: IDU-intc: Add support for edge-triggered interruptsMischa Jonker1-6/+54
This adds support for an optional extra interrupt cell to specify edge vs level triggered. It is backward compatible with dts files with only one cell, and will default to level-triggered in such a case. Note that I had to make a change to idu_irq_set_affinity as well, as this function was setting the interrupt type to "level" unconditionally, since this was the only type supported previously. Signed-off-by: Mischa Jonker <mischa.jonker@synopsys.com> Reviewed-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-08-26uprobes/x86: Fix detection of 32-bit user modeSebastian Mayr1-7/+10
32-bit processes running on a 64-bit kernel are not always detected correctly, causing the process to crash when uretprobes are installed. The reason for the crash is that in_ia32_syscall() is used to determine the process's mode, which only works correctly when called from a syscall. In the case of uretprobes, however, the function is called from a exception and always returns 'false' on a 64-bit kernel. In consequence this leads to corruption of the process's return address. Fix this by using user_64bit_mode() instead of in_ia32_syscall(), which is correct in any situation. [ tglx: Add a comment and the following historical info ] This should have been detected by the rename which happened in commit abfb9498ee13 ("x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()") which states in the changelog: The is_ia32_task()/is_x32_task() function names are a big misnomer: they suggests that the compat-ness of a system call is a task property, which is not true, the compatness of a system call purely depends on how it was invoked through the system call layer. ..... and then it went and blindly renamed every call site. Sadly enough this was already mentioned here: 8faaed1b9f50 ("uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and arch_uretprobe_hijack_return_addr()") where the changelog says: TODO: is_ia32_task() is not what we actually want, TS_COMPAT does not necessarily mean 32bit. Fortunately syscall-like insns can't be probed so it actually works, but it would be better to rename and use is_ia32_frame(). and goes all the way back to: 0326f5a94dde ("uprobes/core: Handle breakpoint and singlestep exceptions") Oh well. 7+ years until someone actually tried a uretprobe on a 32bit process on a 64bit kernel.... Fixes: 0326f5a94dde ("uprobes/core: Handle breakpoint and singlestep exceptions") Signed-off-by: Sebastian Mayr <me@sam.st> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190728152617.7308-1-me@sam.st
2019-08-26x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machinesThomas Gleixner1-1/+7
Rahul Tanwar reported the following bug on DT systems: > 'ioapic_dynirq_base' contains the virtual IRQ base number. Presently, it is > updated to the end of hardware IRQ numbers but this is done only when IOAPIC > configuration type is IOAPIC_DOMAIN_LEGACY or IOAPIC_DOMAIN_STRICT. There is > a third type IOAPIC_DOMAIN_DYNAMIC which applies when IOAPIC configuration > comes from devicetree. > > See dtb_add_ioapic() in arch/x86/kernel/devicetree.c > > In case of IOAPIC_DOMAIN_DYNAMIC (DT/OF based system), 'ioapic_dynirq_base' > remains to zero initialized value. This means that for OF based systems, > virtual IRQ base will get set to zero. Such systems will very likely not even boot. For DT enabled machines ioapic_dynirq_base is irrelevant and not updated, so simply map the IRQ base 1:1 instead. Reported-by: Rahul Tanwar <rahul.tanwar@linux.intel.com> Tested-by: Rahul Tanwar <rahul.tanwar@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: alan@linux.intel.com Cc: bp@alien8.de Cc: cheol.yong.kim@intel.com Cc: qi-ming.wu@intel.com Cc: rahul.tanwar@intel.com Cc: rppt@linux.ibm.com Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/20190821081330.1187-1-rahul.tanwar@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-25Merge tag 'for-linus-5.3-rc6' of ↵Linus Torvalds3-12/+20
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fix from Richard Weinberger: "Fix time travel mode" * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: fix time travel mode
2019-08-25Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds8-33/+220
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A few fixes for x86: - Fix a boot regression caused by the recent bootparam sanitizing change, which escaped the attention of all people who reviewed that code. - Address a boot problem on machines with broken E820 tables caused by an underflow which ended up placing the trampoline start at physical address 0. - Handle machines which do not advertise a legacy timer of any form, but need calibration of the local APIC timer gracefully by making the calibration routine independent from the tick interrupt. Marked for stable as well as there seems to be quite some new laptops rolled out which expose this. - Clear the RDRAND CPUID bit on AMD family 15h and 16h CPUs which are affected by broken firmware which does not initialize RDRAND correctly after resume. Add a command line parameter to override this for machine which either do not use suspend/resume or have a fixed BIOS. Unfortunately there is no way to detect this on boot, so the only safe decision is to turn it off by default. - Prevent RFLAGS from being clobbers in CALL_NOSPEC on 32bit which caused fast KVM instruction emulation to break. - Explain the Intel CPU model naming convention so that the repeating discussions come to an end" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386 x86/boot: Fix boot regression caused by bootparam sanitizing x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h x86/boot/compressed/64: Fix boot on machines with broken E820 table x86/apic: Handle missing global clockevent gracefully x86/cpu: Explain Intel model naming convention
2019-08-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two small fixes for kprobes and perf: - Prevent a deadlock in kprobe_optimizer() causes by reverse lock ordering - Fix a comment typo" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes: Fix potential deadlock in kprobe_optimizer() perf/x86: Fix typo in comment
2019-08-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-2/+1
Mergr misc fixes from Andrew Morton: "11 fixes" Mostly VM fixes, one psi polling fix, and one parisc build fix. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y mm/zsmalloc.c: fix race condition in zs_destroy_pool mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely mm, page_owner: handle THP splits correctly userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx psi: get poll_work to run when calling poll syscall next time mm: memcontrol: flush percpu vmevents before releasing memcg mm: memcontrol: flush percpu vmstats before releasing memcg parisc: fix compilation errrors mm, page_alloc: move_freepages should not examine struct page of reserved memory mm/z3fold.c: fix race between migration and destruction
2019-08-25Merge tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2-4/+4
Pull dma-mapping fixes from Christoph Hellwig: "Two fixes for regressions in this merge window: - select the Kconfig symbols for the noncoherent dma arch helpers on arm if swiotlb is selected, not just for LPAE to not break then Xen build, that uses swiotlb indirectly through swiotlb-xen - fix the page allocator fallback in dma_alloc_contiguous if the CMA allocation fails" * tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: fix zone selection after an unaddressable CMA allocation arm: select the dma-noncoherent symbols for all swiotlb builds
2019-08-25parisc: fix compilation errrorsQian Cai1-2/+1
Commit 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used") converted a few functions from macros to static inline, which causes parisc to complain, In file included from include/asm-generic/4level-fixup.h:38:0, from arch/parisc/include/asm/pgtable.h:5, from arch/parisc/include/asm/io.h:6, from include/linux/io.h:13, from sound/core/memory.c:9: include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'? #define p4d_t pgd_t ^ include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t' static inline int p4d_none(p4d_t p4d) ^~~~~ It is because "4level-fixup.h" is included before "asm/page.h" where "pgd_t" is defined. Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw Fixes: 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used") Signed-off-by: Qian Cai <cai@lca.pw> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-5/+7
Daniel Borkmann says: ==================== pull-request: bpf 2019-08-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix verifier precision tracking with BPF-to-BPF calls, from Alexei. 2) Fix a use-after-free in prog symbol exposure, from Daniel. 3) Several s390x JIT fixes plus BE related fixes in BPF kselftests, from Ilya. 4) Fix memory leak by unpinning XDP umem pages in error path, from Ivan. 5) Fix a potential use-after-free on flow dissector detach, from Jakub. 6) Fix bpftool to close prog fd after showing metadata, from Quentin. 7) BPF kselftest config and TEST_PROGS_EXTENDED fixes, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386Sean Christopherson1-1/+1
Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to avoid clobbering flags. KVM's emulator makes indirect calls into a jump table of sorts, where the destination of the CALL_NOSPEC is a small blob of code that performs fast emulation by executing the target instruction with fixed operands. adcb_al_dl: 0x000339f8 <+0>: adc %dl,%al 0x000339fa <+2>: ret A major motiviation for doing fast emulation is to leverage the CPU to handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is both an input and output to the target of CALL_NOSPEC. Clobbering flags results in all sorts of incorrect emulation, e.g. Jcc instructions often take the wrong path. Sans the nops... asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n" 0x0003595a <+58>: mov 0xc0(%ebx),%eax 0x00035960 <+64>: mov 0x60(%ebx),%edx 0x00035963 <+67>: mov 0x90(%ebx),%ecx 0x00035969 <+73>: push %edi 0x0003596a <+74>: popf 0x0003596b <+75>: call *%esi 0x000359a0 <+128>: pushf 0x000359a1 <+129>: pop %edi 0x000359a2 <+130>: mov %eax,0xc0(%ebx) 0x000359b1 <+145>: mov %edx,0x60(%ebx) ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); 0x000359a8 <+136>: mov -0x10(%ebp),%eax 0x000359ab <+139>: and $0x8d5,%edi 0x000359b4 <+148>: and $0xfffff72a,%eax 0x000359b9 <+153>: or %eax,%edi 0x000359bd <+157>: mov %edi,0x4(%ebx) For the most part this has gone unnoticed as emulation of guest code that can trigger fast emulation is effectively limited to MMIO when running on modern hardware, and MMIO is rarely, if ever, accessed by instructions that affect or consume flags. Breakage is almost instantaneous when running with unrestricted guest disabled, in which case KVM must emulate all instructions when the guest has invalid state, e.g. when the guest is in Big Real Mode during early BIOS. Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support") Fixes: 1a29b5b7f347a ("KVM: x86: Make indirect calls in emulator speculation safe") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com