Age | Commit message (Collapse) | Author | Files | Lines |
|
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Masayoshi Mizuma reported that a warning message is shown while a CPU is
hot-removed on Broadwell servers:
WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988
uncore_pci_remove+0x10b/0x150
Call Trace:
pci_device_remove+0x42/0xd0
device_release_driver_internal+0x148/0x220
pci_stop_bus_device+0x76/0xa0
pci_stop_root_bus+0x44/0x60
acpi_pci_root_remove+0x1f/0x80
acpi_bus_trim+0x57/0x90
acpi_bus_trim+0x2e/0x90
acpi_device_hotplug+0x2bc/0x4b0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x174/0x3a0
worker_thread+0x4c/0x3d0
kthread+0xf8/0x130
This bug was introduced by:
commit 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
The index of "QPI Port 2 filter" was hardcode to 2, but this conflicts with the
index of "PCU.3" which is "HSWEP_PCI_PCU_3", which equals to 2 as well.
To fix the conflict, the hardcoded index needs to be cleaned up:
- introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2
filter" on Broadwell,
- increase UNCORE_EXTRA_PCI_DEV_MAX by one,
- clean up the hardcoded index.
Debugged-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: msys.mizuma@gmail.com
Cc: stable@vger.kernel.org
Fixes: 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
Link: http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- a build race fix
- a Xen entry fix
- a TSC_DEADLINE quirk future-proofing fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Fix if_changed build flip/flop bug
x86/entry/64: Remove %ebx handling from error_entry/exit
x86/apic: Future-proof the TSC_DEADLINE quirk for SKX
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc fixes:
- AMD IBS data corruptor fix (uncovered by UBSAN)
- an Intel PEBS entry unwind error fix
- a HW-tracing crash fix
- a MAINTAINERS update"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix crash when using HW tracing kernel filters
perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)
MAINTAINERS: Add Naveen N. Rao as kprobes co-maintainer
perf/x86/amd/ibs: Don't access non-started event
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"A paravirt UP-patching fix, and an I2C MUX driver lockdep warning fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code
i2c/mux, locking/core: Annotate the nested rt_mutex usage
locking/rtmutex: Allow specifying a subclass for nested locking
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
"An UEFI variables fix for SEV guests"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Access EFI MMIO data as unencrypted when SEV is active
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fix from Paul Burton:
"Here's one more MIPS fix, reverting an errata workaround that was
merged for v4.18-rc2 but has since been found to cause system hangs on
some BCM4718A1-based systems by the OpenWRT project"
* tag 'mips_fixes_4.18_5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"
|
|
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kvm, mm: account shadow page tables to kmemcg
zswap: re-check zswap_is_full() after do zswap_shrink()
include/linux/eventfd.h: include linux/errno.h
mm: fix vma_is_anonymous() false-positives
mm: use vma_init() to initialize VMAs on stack and data segments
mm: introduce vma_init()
mm: fix exports that inadvertently make put_page() EXPORT_SYMBOL_GPL
ipc/sem.c: prevent queue.status tearing in semop
mm: disallow mappings that conflict for devm_memremap_pages()
kasan: only select SLUB_DEBUG with SYSFS=y
delayacct: fix crash in delayacct_blkio_end() after delayacct init failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Inevitably, after saying that I hoped we would be done on the fixes
front, a couple of issues have cropped up over the last week. Next
time I'll stay schtum.
We've fixed an over-eager BUILD_BUG_ON() which Arnd ran into with
arndconfig, as well as ensuring that KPTI really is disabled on
Thunder-X1, where the cure is worse than the disease (this regressed
when we reworked the heterogeneous CPU feature checking).
Summary:
- Fix disabling of kpti on Thunder-X machines
- Fix premature BUILD_BUG_ON() found with randconfig"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups
arm64: Check for errata before evaluating cpu features
|
|
This reverts commit 2a027b47dba6 ("MIPS: BCM47XX: Enable 74K Core
ExternalSync for PCIe erratum").
Enabling ExternalSync caused a regression for BCM4718A1 (used e.g. in
Netgear E3000 and ASUS RT-N16): it simply hangs during PCIe
initialization. It's likely that BCM4717A1 is also affected.
I didn't notice that earlier as the only BCM47XX devices with PCIe I
own are:
1) BCM4706 with 2 x 14e4:4331
2) BCM4706 with 14e4:4360 and 14e4:4331
it appears that BCM4706 is unaffected.
While BCM5300X-ES300-RDS.pdf seems to document that erratum and its
workarounds (according to quotes provided by Tokunori) it seems not even
Broadcom follows them.
According to the provided info Broadcom should define CONF7_ES in their
SDK's mipsinc.h and implement workaround in the si_mips_init(). Checking
both didn't reveal such code. It *could* mean Broadcom also had some
problems with the given workaround.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Michael Marley <michael@michaelmarley.com>
Patchwork: https://patchwork.linux-mips.org/patch/20032/
URL: https://bugs.openwrt.org/index.php?do=details&task_id=1688
Cc: Tokunori Ikegami <ikegami@allied-telesis.co.jp>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
|
|
The size of kvm's shadow page tables corresponds to the size of the
guest virtual machines on the system. Large VMs can spend a significant
amount of memory as shadow page tables which can not be left as system
memory overhead. So, account shadow page tables to the kmemcg.
[shakeelb@google.com: replace (GFP_KERNEL|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT]
Link: http://lkml.kernel.org/r/20180629140224.205849-1-shakeelb@google.com
Link: http://lkml.kernel.org/r/20180627181349.149778-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Make sure to initialize all VMAs properly, not only those which come
from vm_area_cachep.
Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Arnd reports the following arm64 randconfig build error with the PSI
patches that add another page flag:
/git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
/git/arm-soc/include/linux/compiler.h:357:38: error: call to
'__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
The additional page flag causes other information stored in
page->flags to get bumped into their own struct page member:
#if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
BITS_PER_LONG - NR_PAGEFLAGS
#define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
#else
#define LAST_CPUPID_WIDTH 0
#endif
#if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
#define LAST_CPUPID_NOT_IN_PAGE_FLAGS
#endif
which in turn causes the struct page size to exceed the size set in
STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
VMEMMAP page array according to address space and struct page size.
However, the check is performed - and triggers here - on a !VMEMMAP
config, which consumes an additional 22 page bits for the sparse
section id. When VMEMMAP is enabled, those bits are returned, cpupid
doesn't need its own member, and the page passes the VMEMMAP check.
Restrict that check to the situation it was meant to check: that we
are sizing the VMEMMAP page array correctly.
Says Arnd:
Further experiments show that the build error already existed before,
but was only triggered with larger values of CONFIG_NR_CPU and/or
CONFIG_NODES_SHIFT that might be used in actual configurations but
not in randconfig builds.
With longer CPU and node masks, I could recreate the problem with
kernels as old as linux-4.7 when arm64 NUMA support got added.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
Fixes: 3e1907d5bf5a ("arm64: mm: move vmemmap region right below the linear region")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Since commit d3aec8a28be3b8 ("arm64: capabilities: Restrict KPTI
detection to boot-time CPUs") we rely on errata flags being already
populated during feature enumeration. The order of errata and
features was flipped as part of commit ed478b3f9e4a ("arm64:
capabilities: Group handling of features and errata workarounds").
Return to the orginal order of errata and feature evaluation to
ensure errata flags are present during feature evaluation.
Fixes: ed478b3f9e4a ("arm64: capabilities: Group handling of
features and errata workarounds")
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Dirk Gouders reported that two consecutive "make" invocations on an
already compiled tree will show alternating behaviors:
$ make
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
DATAREL arch/x86/boot/compressed/vmlinux
Kernel: arch/x86/boot/bzImage is ready (#48)
Building modules, stage 2.
MODPOST 165 modules
$ make
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
LD arch/x86/boot/compressed/vmlinux
ZOFFSET arch/x86/boot/zoffset.h
AS arch/x86/boot/header.o
LD arch/x86/boot/setup.elf
OBJCOPY arch/x86/boot/setup.bin
OBJCOPY arch/x86/boot/vmlinux.bin
BUILD arch/x86/boot/bzImage
Setup is 15644 bytes (padded to 15872 bytes).
System is 6663 kB
CRC 3eb90f40
Kernel: arch/x86/boot/bzImage is ready (#48)
Building modules, stage 2.
MODPOST 165 modules
He bisected it back to:
commit 98f78525371b ("x86/boot: Refuse to build with data relocations")
The root cause was the use of the "if_changed" kbuild function multiple
times for the same target. It was designed to only be used once per
target, otherwise it will effectively always trigger, flipping back and
forth between the two commands getting recorded by "if_changed". Instead,
this patch merges the two commands into a single function to get stable
build artifacts (i.e. .vmlinux.cmd), and a single build behavior.
Bisected-and-Reported-by: Dirk Gouders <dirk@gouders.net>
Fix-Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180724230827.GA37823@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Enable the extended PEBS for Goldmont Plus.
There is no specific PEBS constrains for Goldmont Plus. Removing the
pebs_constraints for Goldmont Plus.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/20180309021542.11374-4-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The pebs_drain() need to support fixed counters. The DS Save Area now
include "counter reset value" fields for each fixed counters.
Extend the related variables (e.g. mask, counters, error) to support
fixed counters. There is no extended PEBS in PEBS v2 and earlier PEBS
format. Only need to change the code for PEBS v3 and later PEBS format.
Extend the pebs_event_reset[] logic to support new "counter reset value" fields.
Increase the reserve space for fixed counters.
Based-on-code-from: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/20180309021542.11374-3-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The Extended PEBS feature supports PEBS on fixed-function performance
counters as well as all four general purpose counters.
It has to change the order of PEBS and fixed counter enabling to make
sure PEBS is enabled for the fixed counters.
The change of the order doesn't impact the behavior of current code on
other platforms which don't support extended PEBS.
Because there is no dependency among those enable/disable functions.
Don't enable IRQ generation (0x8) for MSR_ARCH_PERFMON_FIXED_CTR_CTRL.
The PEBS ucode will handle the interrupt generation.
Based-on-code-from: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/20180309021542.11374-2-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The Extended PEBS feature, introduced in the Goldmont Plus
microarchitecture, supports all events as "Extended PEBS".
Introduce flag PMU_FL_PEBS_ALL to indicate the platforms which support
extended PEBS.
To support all events, it needs to support all constraints for PEBS. To
avoid duplicating all the constraints in the PEBS table, making the PEBS
code search the normal constraints too.
Based-on-code-from: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/20180309021542.11374-1-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Vince reported the perf_fuzzer giving various unwinder warnings and
Josh reported:
> Deja vu. Most of these are related to perf PEBS, similar to the
> following issue:
>
> b8000586c90b ("perf/x86/intel: Cure bogus unwind from PEBS entries")
>
> This is basically the ORC version of that. setup_pebs_sample_data() is
> assembling a franken-pt_regs which ORC isn't happy about. RIP is
> inconsistent with some of the other registers (like RSP and RBP).
And where the previous unwinder only needed BP,SP ORC also requires
IP. But we cannot spoof IP because then the sample will get displaced,
entirely negating the point of PEBS.
So cure the whole thing differently by doing the unwind early; this
does however require a means to communicate we did the unwind early.
We (ab)use an unused sample_type bit for this, which we set on events
that fill out the data->callchain before the normal
perf_prepare_sample().
Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
assembly code
The LOCK_PREFIX macro should be used in the __raw_callee_save___pv_queued_spin_unlock()
assembly code, so that the lock prefix can be patched out on UP systems.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1531858560-21547-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A couple more MIPS fixes for 4.18:
- Fix an off-by-one in reporting PCI resource sizes to userland which
regressed in v3.12.
- Fix writes to DDR controller registers used to flush write buffers,
which regressed with some refactoring in v4.2"
* tag 'mips_fixes_4.18_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: ath79: fix register address in ath79_ddr_wb_flush()
MIPS: Fix off-by-one in pci_resource_to_user()
|
|
Pull networking fixes from David Miller:
1) Handle stations tied to AP_VLANs properly during mac80211 hw
reconfig. From Manikanta Pubbisetty.
2) Fix jump stack depth validation in nf_tables, from Taehee Yoo.
3) Fix quota handling in aRFS flow expiration of mlx5 driver, from Eran
Ben Elisha.
4) Exit path handling fix in powerpc64 BPF JIT, from Daniel Borkmann.
5) Use ptr_ring_consume_bh() in page pool code, from Tariq Toukan.
6) Fix cached netdev name leak in nf_tables, from Florian Westphal.
7) Fix memory leaks on chain rename, also from Florian Westphal.
8) Several fixes to DCTCP congestion control ACK handling, from Yuchunk
Cheng.
9) Missing rcu_read_unlock() in CAIF protocol code, from Yue Haibing.
10) Fix link local address handling with VRF, from David Ahern.
11) Don't clobber 'err' on a successful call to __skb_linearize() in
skb_segment(). From Eric Dumazet.
12) Fix vxlan fdb notification races, from Roopa Prabhu.
13) Hash UDP fragments consistently, from Paolo Abeni.
14) If TCP receives lots of out of order tiny packets, we do really
silly stuff. Make the out-of-order queue ending more robust to this
kind of behavior, from Eric Dumazet.
15) Don't leak netlink dump state in nf_tables, from Florian Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
net: axienet: Fix double deregister of mdio
qmi_wwan: fix interface number for DW5821e production firmware
ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull
bnx2x: Fix invalid memory access in rss hash config path.
net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
r8169: restore previous behavior to accept BIOS WoL settings
cfg80211: never ignore user regulatory hint
sock: fix sg page frag coalescing in sk_alloc_sg
netfilter: nf_tables: move dumper state allocation into ->start
tcp: add tcp_ooo_try_coalesce() helper
tcp: call tcp_drop() from tcp_data_queue_ofo()
tcp: detect malicious patterns in tcp_collapse_ofo_queue()
tcp: avoid collapses in tcp_prune_queue() if possible
tcp: free batches of packets in tcp_prune_ofo_queue()
ip: hash fragments consistently
ipv6: use fib6_info_hold_safe() when necessary
can: xilinx_can: fix power management handling
can: xilinx_can: fix incorrect clear of non-processed interrupts
can: xilinx_can: fix RX overflow interrupt not being enabled
can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
...
|
|
error_entry and error_exit communicate the user vs. kernel status of
the frame using %ebx. This is unnecessary -- the information is in
regs->cs. Just use regs->cs.
This makes error_entry simpler and makes error_exit more robust.
It also fixes a nasty bug. Before all the Spectre nonsense, the
xen_failsafe_callback entry point returned like this:
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
ENCODE_FRAME_POINTER
jmp error_exit
And it did not go through error_entry. This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.
Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used. As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks. Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes. This was introduced by:
commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.
I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.
[ Historical note: I wrote this patch as a cleanup before I was aware
of the bug it fixed. ]
[ Note to stable maintainers: this should probably get applied to all
kernels. If you're nervous about that, a more conservative fix to
add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
also fix the problem. ]
Reported-and-tested-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
All SKX with stepping higher than 4 support the TSC_DEADLINE,
no matter the microcode version.
Without this patch, upcoming SKX steppings will not be able to use
their TSC_DEADLINE timer.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: <stable@kernel.org> # v4.14+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 616dd5872e ("x86/apic: Update TSC_DEADLINE quirk with additional SKX stepping")
Link: http://lkml.kernel.org/r/d0c7129e509660be9ec6b233284b8d42d90659e8.1532207856.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Paul Menzel reported the following bug:
> Enabling the undefined behavior sanitizer and building GNU/Linux 4.18-rc5+
> (with some unrelated commits) with GCC 8.1.0 from Debian Sid/unstable, the
> warning below is shown.
>
> > [ 2.111913]
> > ================================================================================
> > [ 2.111917] UBSAN: Undefined behaviour in arch/x86/events/amd/ibs.c:582:24
> > [ 2.111919] member access within null pointer of type 'struct perf_event'
> > [ 2.111926] CPU: 0 PID: 144 Comm: udevadm Not tainted 4.18.0-rc5-00316-g4864b68cedf2 #104
> > [ 2.111928] Hardware name: ASROCK E350M1/E350M1, BIOS TIMELESS 01/01/1970
> > [ 2.111930] Call Trace:
> > [ 2.111943] dump_stack+0x55/0x89
> > [ 2.111949] ubsan_epilogue+0xb/0x33
> > [ 2.111953] handle_null_ptr_deref+0x7f/0x90
> > [ 2.111958] __ubsan_handle_type_mismatch_v1+0x55/0x60
> > [ 2.111964] perf_ibs_handle_irq+0x596/0x620
The code dereferences event before checking the STARTED bit. Patch
below should cure the issue.
The warning should not trigger, if I analyzed the thing correctly.
(And Paul's testing confirms this.)
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Menzel <pmenzel+linux-x86@molgen.mpg.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1807200958390.1580@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The s390 build currently fails with the latent entropy plugin:
arch/s390/kernel/als.o: In function `verify_facilities':
als.c:(.init.text+0x24): undefined reference to `latent_entropy'
als.c:(.init.text+0xae): undefined reference to `latent_entropy'
make[3]: *** [arch/s390/boot/compressed/vmlinux] Error 1
make[2]: *** [arch/s390/boot/compressed/vmlinux] Error 2
make[1]: *** [bzImage] Error 2
This will be fixed with the early boot rework from Vasily, which
is planned for the 4.19 merge window.
For 4.18 the simplest solution is to disable the gcc plugins and
reenable them after the early boot rework is upstream.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
kernel_wait4() expects a userland address for status - it's only
rusage that goes as a kernel one (and needs a copyout afterwards)
[ Also, fix the prototype of kernel_wait4() to have that __user
annotation - Linus ]
Fixes: 92ebce5ac55d ("osf_wait4: switch to kernel_wait4()")
Cc: stable@kernel.org # v4.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
SEV guest fails to update the UEFI runtime variables stored in the
flash.
The following commit:
1379edd59673 ("x86/efi: Access EFI data as encrypted when SEV is active")
unconditionally maps all the UEFI runtime data as 'encrypted' (C=1).
When SEV is active the UEFI runtime data marked as EFI_MEMORY_MAPPED_IO
should be mapped as 'unencrypted' so that both guest and hypervisor can
access the data.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@vger.kernel.org> # 4.15.x
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 1379edd59673 ("x86/efi: Access EFI data as encrypted ...")
Link: http://lkml.kernel.org/r/20180720012846.23560-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
- Fix interrupt type on ethernet switch for i.MX-based RDU2
- GPC on i.MX exposed too large a register window which resulted in
userspace being able to crash the machine.
- Fixup of bad merge resolution moving GPIO DT nodes under pinctrl on
droid4.
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch
soc: imx: gpc: restrict register range for regmap access
ARM: dts: omap4-droid4: fix dts w.r.t. pwm
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"A single fix for a MCE-polling regression, which prevented the
disabling of polling"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/MCE: Remove min interval polling limitation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti fixes from Ingo Molnar:
"An APM fix, and a BTS hardware-tracing fix related to PTI changes"
* 'x86-pti-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apm: Don't access __preempt_count with zeroed fs
x86/events/intel/ds: Fix bts_interrupt_threshold alignment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core kernel fixes from Ingo Molnar:
"This is mostly the copy_to_user_mcsafe() related fixes from Dan
Williams, and an ORC fix for Clang"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/memcpy_mcsafe: Fix copy_to_user_mcsafe() exception handling
lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()
lib/iov_iter: Document _copy_to_iter_flushcache()
lib/iov_iter: Document _copy_to_iter_mcsafe()
objtool: Use '.strtab' if '.shstrtab' doesn't exist, to support ORC tables on Clang
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Two regression fixes, one for xmon disassembly formatting and the
other to fix the E500 build.
Two commits to fix a potential security issue in the VFIO code under
obscure circumstances.
And finally a fix to the Power9 idle code to restore SPRG3, which is
user visible and used for sched_getcpu().
Thanks to: Alexey Kardashevskiy, David Gibson. Gautham R. Shenoy,
James Clarke"
* tag 'powerpc-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle)
powerpc/Makefile: Assemble with -me500 when building for E500
KVM: PPC: Check if IOMMU page is contained in the pinned physical page
vfio/spapr: Use IOMMU pageshift rather than pagesize
powerpc/xmon: Fix disassembly since printf changes
|
|
Like vm_area_dup(), it initializes the anon_vma_chain head, and the
basic mm pointer.
The rest of the fields end up being different for different users,
although the plan is to also initialize the 'vm_ops' field to a dummy
entry.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The vm_area_struct is one of the most fundamental memory management
objects, but the management of it is entirely open-coded evertwhere,
ranging from allocation and freeing (using kmem_cache_[z]alloc and
kmem_cache_free) to initializing all the fields.
We want to unify this in order to end up having some unified
initialization of the vmas, and the first step to this is to at least
have basic allocation functions.
Right now those functions are literally just wrappers around the
kmem_cache_*() calls. This is a purely mechanical conversion:
# new vma:
kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc()
# copy old vma
kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old)
# free vma
kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma)
to the point where the old vma passed in to the vm_area_dup() function
isn't even used yet (because I've left all the old manual initialization
alone).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
i.MX fixes for 4.18, round 4:
- A fix for i.MX6 RDU2 board on the wrong IRQ type of Marvell switch,
which might result in a race condition in the interrupt handler and
cause the OS to miss all future events.
* tag 'imx-fixes-4.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6: RDU2: fix irq type for mv88e6xxx switch
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"ARC is back after radio silence in 4.17:
- Fix CONFIG_SWAP [Alexey]
- Robustify cmpxchg emulation for systems w/o atomics [Alexey /
PeterZ]
- Allow mprotext(PROT_EXEC) for stack mappings [Vineet]
- HSDK platform enable PCIe, APG GPIO [Gustavo]
- miscll other fixes, config updates etc"
* tag 'arc-4.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARCv2: [plat-hsdk]: Save accl reg pair by default
ARC: mm: allow mprotect to make stack mappings executable
ARC: Fix CONFIG_SWAP
ARC: [arcompact] entry.S: minor code movement
ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs
ARC: configs: remove no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES
ARC: Improve cmpxchg syscall implementation
ARC: [plat-hsdk]: Configure APB GPIO controller on ARC HSDK platform
ARC: [plat-hsdk] Add PCIe support
ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMP
ARC: Explicitly add -mmedium-calls to CFLAGS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux
Pull nds32 updates from Greentime Hu:
"Bug fixes and build ixes for nds32"
* tag 'nds32-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux:
nds32: fix build error "relocation truncated to fit: R_NDS32_25_PCREL_RELA" when make allyesconfig
nds32: To simplify the implementation of update_mmu_cache()
nds32: Fix the dts pointer is not passed correctly issue.
nds32: To implement these icache invalidation APIs since nds32 cores don't snoop data cache. This issue is found by Guo Ren. Based on the Documentation/core-api/cachetlb.rst and it says:
nds32: Fix build error caused by configuration flag rename
nds32: define __NDS32_E[BL]__ for sparse
|
|
ath79_ddr_wb_flush_base has the type void __iomem *, so register offsets
need to be a multiple of 4 in order to access the intended register.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 24b0e3e84fbf ("MIPS: ath79: Improve the DDR controller interface")
Patchwork: https://patchwork.linux-mips.org/patch/19912/
Cc: Alban Bedel <albeu@free.fr>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # 4.2+
|
|
The Marvell switches report their interrupts in a level sensitive way.
When using edge sensitive detection a race condition in the interrupt
handler of the swich might result in the OS to miss all future events
which might make the switch non-functional.
The problem is that both mv88e6xxx_g2_irq_thread_fn() and
mv88e6xxx_g1_irq_thread_work() sample the irq cause register
(MV88E6XXX_G2_INT_SRC and MV88E6XXX_G1_STS respectively) once and then
handle the observed sources. If after sampling but before all observed
irq sources are handled a new irq source gets active this is not noticed
by the handler which returns unsuspecting, but the interrupt line stays
active which prevents the edge detector to kick in.
All device trees but imx6qdl-zii-rdu2 get this right (most of them by
not specifying an interrupt parent). So fix imx6qdl-zii-rdu2
accordingly.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: f64992d1a916 ("ARM: dts: imx6: RDU2: Add Switch interrupts")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
None of the JITs is allowed to implement exit paths from the BPF
insn mappings other than BPF_JMP | BPF_EXIT. In the BPF core code
we have a couple of rewrites in eBPF (e.g. LD_ABS / LD_IND) and
in eBPF to cBPF translation to retain old existing behavior where
exceptions may occur; they are also tightly controlled by the
verifier where it disallows some of the features such as BPF to
BPF calls when legacy LD_ABS / LD_IND ops are present in the BPF
program. During recent review of all BPF_XADD JIT implementations
I noticed that the ppc64 one is buggy in that it contains two
jumps to exit paths. This is problematic as this can bypass verifier
expectations e.g. pointed out in commit f6b1b3bf0d5f ("bpf: fix
subprog verifier bypass by div/mod by 0 exception"). The first
exit path is obsoleted by the fix in ca36960211eb ("bpf: allow xadd
only on aligned memory") anyway, and for the second one we need to
do a fetch, add and store loop if the reservation from lwarx/ldarx
was lost in the meantime.
Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
One omap dts mismerge fix
The dts patch for droid4 PWM vibrator has added gpio6 entries to the wrong
node. Let's fix it with a note that there seems to be also other GPIO PWM
issues to fix still to get the PWM vibrator working. So this can wait for
v4.19 merge cycle if necessary.
* tag 'omap-for-v4.18/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap4-droid4: fix dts w.r.t. pwm
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
This manifsted as strace segfaulting on HSDK because gcc was targetting
the accumulator registers as GPRs, which kernek was not saving/restoring
by default.
Cc: stable@vger.kernel.org #4.14+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Pull kvm fixes from Paolo Bonzini:
"Miscellaneous bugfixes, plus a small patchlet related to Spectre v2"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvmclock: fix TSC calibration for nested guests
KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled
KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer
KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
x86/kvmclock: set pvti_cpu0_va after enabling kvmclock
x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD
kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading
x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks
KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
|
|
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Some of the assembly files use instructions specific to BookE or E500,
which are rejected with the now-default -mcpu=powerpc, so we must pass
-me500 to the assembler just as we pass -me200 for E200.
Fixes: 4bf4f42a2feb ("powerpc/kbuild: Set default generic machine type for 32-bit compile")
Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Inside a nested guest, access to hardware can be slow enough that
tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work
to be called periodically and the nested guest to spend a lot of time
reading the ACPI timer.
However, if the TSC frequency is available from the pvclock page,
we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the recalibration.
'refine' operation.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Commit message rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When eVMCS is enabled, all VMCS allocated to be used by KVM are marked
with revision_id of KVM_EVMCS_VERSION instead of revision_id reported
by MSR_IA32_VMX_BASIC.
However, even though not explictly documented by TLFS, VMXArea passed
as VMXON argument should still be marked with revision_id reported by
physical CPU.
This issue was found by the following setup:
* L0 = KVM which expose eVMCS to it's L1 guest.
* L1 = KVM which consume eVMCS reported by L0.
This setup caused the following to occur:
1) L1 execute hardware_enable().
2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON.
3) L0 intercept L1 VMXON and execute handle_vmon() which notes
vmxarea->revision_id != VMCS12_REVISION and therefore fails with
nested_vmx_failInvalid() which sets RFLAGS.CF.
4) L1 kvm_cpu_vmxon() don't check RFLAGS.CF for failure and therefore
hardware_enable() continues as usual.
5) L1 hardware_enable() then calls ept_sync_global() which executes
INVEPT.
6) L0 intercept INVEPT and execute handle_invept() which notes
!vmx->nested.vmxon and thus raise a #UD to L1.
7) Raised #UD caused L1 to panic.
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 773e8a0425c923bc02668a2d6534a5ef5a43cc69
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|