Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 2a671f77ee49f3e78997b77fdee139467ff6a598 ]
The struct pci_dev uses reference counting but zPCI assumed erroneously
that the last reference would always be the local reference after
calling pci_stop_and_remove_bus_device(). This is usually the case but
not how reference counting works and thus inherently fragile.
In fact one case where this causes a NULL pointer dereference when on an
SRIOV device the function 0 was hot unplugged before another function of
the same multi-function device. In this case the second function's
pdev->sriov->dev reference keeps the struct pci_dev of function 0 alive
even after the unplug. This bug was previously hidden by the fact that
we were leaking the struct pci_dev which in turn means that it always
outlived the struct zpci_dev. This was fixed in commit 0b13525c20fe
("s390/pci: fix leak of PCI device structure") exposing the broken
behavior.
Fix this by accounting for the long living reference a struct pci_dev
has to its underlying struct zpci_dev via the zbus->function[] array and
only release that in pcibios_release_device() ensuring that the struct
pci_dev is not left with a dangling reference. This is a minimal fix in
the future it would probably better to use fine grained reference
counting for struct zpci_dev.
Fixes: 05bc1be6db4b2 ("s390/pci: create zPCI bus")
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f5e81d1117501546b7be050c5fbafa6efd2c722c ]
In case of JITs, each of the JIT backends compiles the BPF nospec instruction
/either/ to a machine instruction which emits a speculation barrier /or/ to
/no/ machine instruction in case the underlying architecture is not affected
by Speculative Store Bypass or has different mitigations in place already.
This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence'
instruction for mitigation. In case of arm64, we rely on the firmware mitigation
as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled,
it works for all of the kernel code with no need to provide any additional
instructions here (hence only comment in arm64 JIT). Other archs can follow
as needed. The BPF nospec instruction is specifically targeting Spectre v4
since i) we don't use a serialization barrier for the Spectre v1 case, and
ii) mitigation instructions for v1 and v4 might be different on some archs.
The BPF nospec is required for a future commit, where the BPF verifier does
annotate intermediate BPF programs with speculation barriers.
Co-developed-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 463f36c76fa4ec015c640ff63ccf52e7527abee0 upstream.
The DMA code section of the decompressor must be compiled with expolines
if Spectre V2 mitigation has been enabled for the decompressed kernel.
This is required because although the decompressor's image contains
the DMA code section, it is handed over to the decompressed kernel for use.
Because the DMA code is already slow w/o expolines, use expolines always
regardless whether the decompressed kernel is using them or not. This
simplifies the DMA code by dropping the conditional compilation of
expolines.
Fixes: bf72630130c2 ("s390: use proper expoline sections for .dma code")
Cc: <stable@vger.kernel.org> # 5.2
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f8c2602733c953ed7a16e060640b8e96f9d94b9b upstream.
s390 enforces DYNAMIC_FTRACE if FUNCTION_TRACER is selected.
At the same time implementation of ftrace_caller is not compliant with
HAVE_DYNAMIC_FTRACE since it doesn't provide implementation of
ftrace_update_ftrace_func() and calls ftrace_trace_function() directly.
The subtle difference is that during ftrace code patching ftrace
replaces function tracer via ftrace_update_ftrace_func() and activates
it back afterwards. Unexpected direct calls to ftrace_trace_function()
during ftrace code patching leads to nullptr-dereferences when tracing
is activated for one of functions which are used during code patching.
Those function currently are:
copy_from_kernel_nofault()
copy_from_kernel_nofault_allowed()
preempt_count_sub() [with debug_defconfig]
preempt_count_add() [with debug_defconfig]
Corresponding KASAN report:
BUG: KASAN: nullptr-dereference in function_trace_call+0x316/0x3b0
Read of size 4 at addr 0000000000001e08 by task migration/0/15
CPU: 0 PID: 15 Comm: migration/0 Tainted: G B 5.13.0-41423-g08316af3644d
Hardware name: IBM 3906 M04 704 (LPAR)
Stopper: multi_cpu_stop+0x0/0x3e0 <- stop_machine_cpuslocked+0x1e4/0x218
Call Trace:
[<0000000001f77caa>] show_stack+0x16a/0x1d0
[<0000000001f8de42>] dump_stack+0x15a/0x1b0
[<0000000001f81d56>] print_address_description.constprop.0+0x66/0x2e0
[<000000000082b0ca>] kasan_report+0x152/0x1c0
[<00000000004cfd8e>] function_trace_call+0x316/0x3b0
[<0000000001fb7082>] ftrace_caller+0x7a/0x7e
[<00000000006bb3e6>] copy_from_kernel_nofault_allowed+0x6/0x10
[<00000000006bb42e>] copy_from_kernel_nofault+0x3e/0xd0
[<000000000014605c>] ftrace_make_call+0xb4/0x1f8
[<000000000047a1b4>] ftrace_replace_code+0x134/0x1d8
[<000000000047a6e0>] ftrace_modify_all_code+0x120/0x1d0
[<000000000047a7ec>] __ftrace_modify_code+0x5c/0x78
[<000000000042395c>] multi_cpu_stop+0x224/0x3e0
[<0000000000423212>] cpu_stopper_thread+0x33a/0x5a0
[<0000000000243ff2>] smpboot_thread_fn+0x302/0x708
[<00000000002329ea>] kthread+0x342/0x408
[<00000000001066b2>] __ret_from_fork+0x92/0xf0
[<0000000001fb57fa>] ret_from_fork+0xa/0x30
The buggy address belongs to the page:
page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1
flags: 0x1ffff00000001000(reserved|node=0|zone=0|lastcpupid=0x1ffff)
raw: 1ffff00000001000 0000040000000048 0000040000000048 0000000000000000
raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
0000000000001d00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
0000000000001d80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
>0000000000001e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
^
0000000000001e80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
0000000000001f00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
==================================================================
To fix that introduce ftrace_func callback to be called from
ftrace_caller and update it in ftrace_update_ftrace_func().
Fixes: 4cc9bed034d1 ("[S390] cleanup ftrace backend functions")
Cc: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 91091656252f5d6d8c476e0c92776ce9fae7b445 ]
Currently array jit->seen_reg[r1] is being accessed before the range
checking of index r1. The range changing on r1 should be performed
first since it will avoid any potential out-of-range accesses on the
array seen_reg[] and also it is more optimal to perform checks on r1
before fetching data from the array. Fix this by swapping the order
of the checks before the array access.
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20210715125712.24690-1-colin.king@canonical.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 41d71fe59cce41237f24f3b7bdc1b414069a34ed ]
The existing CALL_ON_STACK() macro allows for subtle bugs:
- There is no type checking of the function that is being called. That
is: missing or too many arguments do not cause any compile error or
warning. The same is true if the return type of the called function
changes. This can lead to quite random bugs.
- Sign and zero extension of arguments is missing. Given that the s390
C ABI requires that the caller of a function performs proper sign
and zero extension this can also lead to subtle bugs.
- If arguments to the CALL_ON_STACK() macros contain functions calls
register corruption can happen due to register asm constructs being
used.
Therefore introduce a new call_on_stack() macro which is supposed to
fix all these problems.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b8e9cc20b808e26329090c19ff80b7f5098e98ff ]
tinyconfig fails to boot, because without CONFIG_BUG report_bug()
always returns BUG_TRAP_TYPE_BUG, which causes mc 0,0 in
test_monitor_call() to panic. Fix by skipping the test without
CONFIG_BUG.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0aa4ff7688632a86bdb133fa106f2ccd514b91a7 ]
This is no longer true since we switched to generic entry. The code
switches to the IRQ stack before calling do_IRQ, but switches back
to the kernel stack before calling irq_exit().
Fixes: 56e62a737028 ("s390: convert to generic entry")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit da9057576785aaab52e706e76c0475c85b77ec14 ]
The tprot() inline asm temporarily changes the program check new psw
to redirect a potential program check on the diag instruction.
Restoring of the program check new psw is done in C code behind the
inline asm.
This can be problematic, especially if the function is inlined, since
the compiler can reorder instructions in such a way that a different
instruction, which may result in a program check, might be executed
before the program check new psw has been restored.
To avoid such a scenario move restoring into the inline asm. For
consistency reasons move also saving of the original program check new
psw into the inline asm.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 86807f348f418a84970eebb8f9912a7eea16b497 ]
The __diag260() inline asm temporarily changes the program check new
psw to redirect a potential program check on the diag instruction.
Restoring of the program check new psw is done in C code behind the
inline asm.
This can be problematic, especially if the function is inlined, since
the compiler can reorder instructions in such a way that a different
instruction, which may result in a program check, might be executed
before the program check new psw has been restored.
To avoid such a scenario move restoring into the inline asm. For
consistency reasons move also saving of the original program check new
psw into the inline asm.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 88c2510cecb7e2b518e3c4fdf3cf0e13ebe9377c ]
The __diag308() inline asm temporarily changes the program check new
psw to redirect a potential program check on the diag instruction.
Restoring of the program check new psw is done in C code behind the
inline asm.
This can be problematic, especially if the function is inlined, since
the compiler can reorder instructions in such a way that a different
instruction, which may result in a program check, might be executed
before the program check new psw has been restored.
To avoid such a scenario move restoring into the inline asm. For
consistency reasons move also saving of the original program check new
psw into the inline asm.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9c9a915afd90f7534c16a71d1cd44b58596fddf3 ]
s390 is the only architecture which makes use of the __no_kasan_or_inline
attribute for two functions. Given that both stap() and __load_psw_mask()
are very small functions they can and should be always inlined anyway.
Therefore get rid of __no_kasan_or_inline and always inline these
functions.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 42e8d652438f5ddf04e5dac299cb5e623d113dc0 ]
Though -nostdlib is passed in PURGATORY_LDFLAGS and -ffreestanding in
KBUILD_CFLAGS_DECOMPRESSOR, -fno-stack-protector must also be passed to
avoid linking errors related to undefined references to
'__stack_chk_guard' and '__stack_chk_fail' if toolchain enforces
-fstack-protector.
Fixes
- https://gitlab.com/kubu93/buildroot/-/jobs/1247043361
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/20210510053133.1220167-1-fontaine.fabrice@gmail.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b7d91d230a119fdcc334d10c9889ce9c5e15118b ]
Console name reported in /proc/consoles:
ttyS1 -W- (EC p ) 4:65
does not match the char device name:
crw--w---- 1 root root 4, 65 May 17 12:18 /dev/ttysclp0
so debian-installer inside a QEMU s390x instance gets confused and fails
to start with the following error:
steal-ctty: No such file or directory
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Link: https://lore.kernel.org/r/20210427194010.9330-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 6a942f5780545ebd11aca8b3ac4b163397962322 upstream.
S390's init_idle_preempt_count(p, cpu) doesn't actually let us initialize the
preempt_count of the requested CPU's idle task: it unconditionally writes
to the current CPU's. This clearly conflicts with idle_threads_init(),
which intends to initialize *all* the idle tasks, including their
preempt_count (or their CPU's, if the arch uses a per-CPU preempt_count).
Unfortunately, it seems the way s390 does things doesn't let us initialize
every possible CPU's preempt_count early on, as the pages where this
resides are only allocated when a CPU is brought up and are freed when it
is brought down.
Let the arch-specific code set a CPU's preempt_count when its lowcore is
allocated, and turn init_idle_preempt_count() into an empty stub.
Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210707163338.1623014-1-valentin.schneider@arm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5d3516b3647621d5a1180672ea9e0817fb718ada ]
APPLDATA_BASE should depend on PROC_SYSCTL instead of PROC_FS.
Building with PROC_FS but not PROC_SYSCTL causes a build error,
since appldata_base.c uses data and APIs from fs/proc/proc_sysctl.c.
arch/s390/appldata/appldata_base.o: in function `appldata_generic_handler':
appldata_base.c:(.text+0x192): undefined reference to `sysctl_vals'
Fixes: c185b783b099 ("[S390] Remove config options.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20210528002420.17634-1-rdunlap@infradead.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d460bb6c6417588dd8b0907d34f69b237918812a ]
In commit b02002cc4c0f ("s390/pci: Implement ioremap_wc/prot() with
MIO") we implemented both ioremap_wc() and ioremap_prot() however until
now we had not set HAVE_IOREMAP_PROT in Kconfig, do so now.
This also requires implementing pte_pgprot() as this is used in the
generic_access_phys() code enabled by CONFIG_HAVE_IOREMAP_PROT. As with
ioremap_wc() we need to take the MMIO Write Back bit index into account.
Moreover since the pgprot value returned from pte_pgprot() is to be used
for mappings into kernel address space we must make sure that it uses
appropriate kernel page table protection bits. In particular a pgprot
value originally coming from userspace could have the _PAGE_PROTECT
bit set to enable fault based dirty bit accounting which would then make
the mapping inaccessible when used in kernel address space.
Fixes: b02002cc4c0f ("s390/pci: Implement ioremap_wc/prot() with MIO")
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c0f8aa4fa815daacb6eca52cae04820d6aecb7c2 ]
Commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable")
made PTRS_PER_P4D variable on x86 and introduced MAX_PTRS_PER_P4D as a
constant for cases which need a compile-time constant (e.g. fixed-size
arrays).
powerpc likewise has boot-time selectable MMU features which can cause
other mm "constants" to vary. For KASAN, we have some static
PTE/PMD/PUD/P4D arrays so we need compile-time maximums for all these
constants. Extend the MAX_PTRS_PER_ idiom, and place default definitions
in include/pgtable.h. These define MAX_PTRS_PER_x to be PTRS_PER_x unless
an architecture has defined MAX_PTRS_PER_x in its arch headers.
Clean up pgtable-nop4d.h and s390's MAX_PTRS_PER_P4D definitions while
we're at it: both can just pick up the default now.
Link: https://lkml.kernel.org/r/20210624034050.511391-4-dja@axtens.net
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4fa3b91bdee1b08348c82660668ca0ca34e271ad ]
Using register asm statements has been proven to be very error prone,
especially when using code instrumentation where gcc may add function
calls, which clobbers register contents in an unexpected way.
Therefore get rid of register asm statements in kvm code, even though
there is currently nothing wrong with them. This way we know for sure
that this bug class won't be introduced here.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20210621140356.1210771-1-hca@linux.ibm.com
[borntraeger@de.ibm.com: checkpatch strict fix]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f1a0a376ca0c4ef1fc3d24e3e502acbb5b795674 ]
As pointed out by commit
de9b8f5dcbd9 ("sched: Fix crash trying to dequeue/enqueue the idle thread")
init_idle() can and will be invoked more than once on the same idle
task. At boot time, it is invoked for the boot CPU thread by
sched_init(). Then smp_init() creates the threads for all the secondary
CPUs and invokes init_idle() on them.
As the hotplug machinery brings the secondaries to life, it will issue
calls to idle_thread_get(), which itself invokes init_idle() yet again.
In this case it's invoked twice more per secondary: at _cpu_up(), and at
bringup_cpu().
Given smp_init() already initializes the idle tasks for all *possible*
CPUs, no further initialization should be required. Now, removing
init_idle() from idle_thread_get() exposes some interesting expectations
with regards to the idle task's preempt_count: the secondary startup always
issues a preempt_disable(), requiring some reset of the preempt count to 0
between hot-unplug and hotplug, which is currently served by
idle_thread_get() -> idle_init().
Given the idle task is supposed to have preemption disabled once and never
see it re-enabled, it seems that what we actually want is to initialize its
preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove
init_idle() from idle_thread_get().
Secondary startups were patched via coccinelle:
@begone@
@@
-preempt_disable();
...
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 85b18d7b5e7ffefb2f076186511d39c4990aa005 upstream.
Turns out that the bit 61 in the TEID is not always 1 and if that's
the case the address space ID and the address are
unpredictable. Without an address and its address space ID we can't
export memory and hence we can only send a SIGSEGV to the process or
panic the kernel depending on who caused the exception.
Unfortunately bit 61 is only reliable if we have the "misc" UV feature
bit.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 084ea4d611a3d ("s390/mm: add (non)secure page access exceptions handlers")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix a couple of late pt_regs flags handling findings of conversion to
generic entry.
- Fix potential register clobbering in stack switch helper.
- Fix thread/group masks for offline cpus.
- Fix cleanup of mdev resources when remove callback is invoked in
vfio-ap code.
* tag 's390-5.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/stack: fix possible register corruption with stack switch helper
s390/topology: clear thread/group maps for offline cpus
s390/vfio-ap: clean up mdev resources when remove callback invoked
s390: clear pt_regs::flags on irq entry
s390: fix system call restart with multiple signals
|
|
The Create Secure Configuration Ultravisor Call does not support using
large pages for the virtual memory area. This is a hardware limitation.
This patch replaces the vzalloc call with an almost equivalent call to
the newly introduced vmalloc_no_huge function, which guarantees that
only small pages will be used for the backing.
The new call will not clear the allocated memory, but that has never
been an actual requirement.
Link: https://lkml.kernel.org/r/20210614132357.10202-3-imbrenda@linux.ibm.com
Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The CALL_ON_STACK macro is used to call a C function from inline
assembly, and therefore must consider the C ABI, which says that only
registers 6-13, and 15 are non-volatile (restored by the called
function).
The inline assembly incorrectly marks all registers used to pass
parameters to the called function as read-only input operands, instead
of operands that are read and written to. This might result in
register corruption depending on usage, compiler, and compile options.
Fix this by marking all operands used to pass parameters as read/write
operands. To keep the code simple even register 6, if used, is marked
as read-write operand.
Fixes: ff340d2472ec ("s390: add stack switch helper")
Cc: <stable@kernel.org> # 4.20
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
The current code doesn't clear the thread/group maps for offline
CPUs. This may cause kernel crashes like the one bewlow in common
code that assumes if a CPU has sibblings it is online.
Unable to handle kernel pointer dereference in virtual kernel address space
Call Trace:
[<000000013a4b8c3c>] blk_mq_map_swqueue+0x10c/0x388
([<000000013a4b8bcc>] blk_mq_map_swqueue+0x9c/0x388)
[<000000013a4b9300>] blk_mq_init_allocated_queue+0x448/0x478
[<000000013a4b9416>] blk_mq_init_queue+0x4e/0x90
[<000003ff8019d3e6>] loop_add+0x106/0x278 [loop]
[<000003ff801b8148>] loop_init+0x148/0x1000 [loop]
[<0000000139de4924>] do_one_initcall+0x3c/0x1e0
[<0000000139ef449a>] do_init_module+0x6a/0x2a0
[<0000000139ef61bc>] __do_sys_finit_module+0xa4/0xc0
[<0000000139de9e6e>] do_syscall+0x7e/0xd0
[<000000013a8e0aec>] __do_syscall+0xbc/0x110
[<000000013a8ee2e8>] system_call+0x78/0xa0
Fixes: 52aeda7accb6 ("s390/topology: remove offline CPUs from CPU topology masks")
Cc: <stable@kernel.org> # 5.7+
Reported-by: Marius Hillenbrand <mhillen@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
The current irq entry code doesn't initialize pt_regs::flags. On exit to
user mode arch_do_signal_or_restart() tests whether PIF_SYSCALL is set,
which might yield wrong results.
Fix this by clearing pt_regs::flags in the entry.S irq handler
code.
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 56e62a737028 ("s390: convert to generic entry")
Cc: <stable@vger.kernel.org> # 5.12
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
glibc complained with "The futex facility returned an unexpected error
code.". It turned out that the futex syscall returned -ERESTARTSYS because
a signal is pending. arch_do_signal_or_restart() restored the syscall
parameters (nameley regs->gprs[2]) and set PIF_SYSCALL_RESTART. When
another signal is made pending later in the exit loop
arch_do_signal_or_restart() is called again. This function clears
PIF_SYSCALL_RESTART and checks the return code which is set in
regs->gprs[2]. However, regs->gprs[2] was restored in the previous run
and no longer contains -ERESTARTSYS, so PIF_SYSCALL_RESTART isn't set
again and the syscall is skipped.
Fix this by not clearing PIF_SYSCALL_RESTART - it is already cleared in
__do_syscall() when the syscall is restarted.
Reported-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 56e62a737028 ("s390: convert to generic entry")
Cc: <stable@vger.kernel.org> # 5.12
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Wrong condition check is used to decide if a machine check hit
while in KVM guest. As result of this check the instruction
following the SIE critical section might be considered as still
in KVM guest and _CIF_MCCK_GUEST CPU flag mistakenly set as
result.
Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
The size of SIE critical section is calculated wrongly
as result of a missed subtraction in commit 0b0ed657fe00
("s390: remove critical section cleanup from entry.S")
Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
In commit fa8b90070a80 ("quota: wire up quotactl_path") we have wired up
new quotactl_path syscall. However some people in LWN discussion have
objected that the path based syscall is missing dirfd and flags argument
which is mostly standard for contemporary path based syscalls. Indeed
they have a point and after a discussion with Christian Brauner and
Sascha Hauer I've decided to disable the syscall for now and update its
API. Since there is no userspace currently using that syscall and it
hasn't been released in any major release, we should be fine.
CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/lkml/20210512153621.n5u43jsytbik4yze@wittgenstein
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Merge yet more updates from Andrew Morton:
"This is everything else from -mm for this merge window.
90 patches.
Subsystems affected by this patch series: mm (cleanups and slub),
alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat,
checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov,
panic, delayacct, gdb, resource, selftests, async, initramfs, ipc,
drivers/char, and spelling"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits)
mm: fix typos in comments
mm: fix typos in comments
treewide: remove editor modelines and cruft
ipc/sem.c: spelling fix
fs: fat: fix spelling typo of values
kernel/sys.c: fix typo
kernel/up.c: fix typo
kernel/user_namespace.c: fix typos
kernel/umh.c: fix some spelling mistakes
include/linux/pgtable.h: few spelling fixes
mm/slab.c: fix spelling mistake "disired" -> "desired"
scripts/spelling.txt: add "overflw"
scripts/spelling.txt: Add "diabled" typo
scripts/spelling.txt: add "overlfow"
arm: print alloc free paths for address in registers
mm/vmalloc: remove vwrite()
mm: remove xlate_dev_kmem_ptr()
drivers/char: remove /dev/kmem for good
mm: fix some typos and code style problems
ipc/sem.c: mundane typo fixes
...
|
|
Since /dev/kmem has been removed, let's remove the xlate_dev_kmem_ptr()
leftovers.
Link: https://lkml.kernel.org/r/20210324102351.6932-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- add support for system call stack randomization
- handle stale PCI deconfiguration events
- couple of defconfig updates
- some fixes and cleanups
* tag 's390-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility
s390/entry: add support for syscall stack randomization
s390/configs: change CONFIG_VIRTIO_CONSOLE to "m"
s390/cio: remove invalid condition on IO_SCH_UNREG
s390/cpumf: remove call to perf_event_update_userpage
s390/cpumf: move counter set size calculation to common place
s390/cpumf: beautify if-then-else indentation
s390/configs: enable CONFIG_PCI_IOV
s390/pci: handle stale deconfiguration events
s390/pci: rename zpci_configure_device()
|
|
Merge more updates from Andrew Morton:
"The remainder of the main mm/ queue.
143 patches.
Subsystems affected by this patch series (all mm): pagecache, hugetlb,
userfaultfd, vmscan, compaction, migration, cma, ksm, vmstat, mmap,
kconfig, util, memory-hotplug, zswap, zsmalloc, highmem, cleanups, and
kfence"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (143 commits)
kfence: use power-efficient work queue to run delayed work
kfence: maximize allocation wait timeout duration
kfence: await for allocation using wait_event
kfence: zero guard page after out-of-bounds access
mm/process_vm_access.c: remove duplicate include
mm/mempool: minor coding style tweaks
mm/highmem.c: fix coding style issue
btrfs: use memzero_page() instead of open coded kmap pattern
iov_iter: lift memzero_page() to highmem.h
mm/zsmalloc: use BUG_ON instead of if condition followed by BUG.
mm/zswap.c: switch from strlcpy to strscpy
arm64/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
x86/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
mm,memory_hotplug: add kernel boot option to enable memmap_on_memory
acpi,memhotplug: enable MHP_MEMMAP_ON_MEMORY when supported
mm,memory_hotplug: allocate memmap from the added memory range
mm,memory_hotplug: factor out adjusting present pages into adjust_present_page_count()
mm,memory_hotplug: relax fully spanned sections check
drivers/base/memory: introduce memory_block_{online,offline}
mm/memory_hotplug: remove broken locking of zone PCP structures during hot remove
...
|
|
ARCH_ENABLE_SPLIT_PMD_PTLOCKS has duplicate definitions on platforms
that subscribe it. Drop these redundant definitions and instead just
select it on applicable platforms.
Link: https://lkml.kernel.org/r/1617259448-22529-6-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390]
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ARCH_ENABLE_MEMORY_[HOTPLUG|HOTREMOVE] configs have duplicate
definitions on platforms that subscribe them. Instead, just make them
generic options which can be selected on applicable platforms.
Link: https://lkml.kernel.org/r/1617259448-22529-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390]
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "hugetlb: Disable huge pmd unshare for uffd-wp", v4.
This series tries to disable huge pmd unshare of hugetlbfs backed memory
for uffd-wp. Although uffd-wp of hugetlbfs is still during rfc stage,
the idea of this series may be needed for multiple tasks (Axel's uffd
minor fault series, and Mike's soft dirty series), so I picked it out
from the larger series.
This patch (of 4):
It is a preparation work to be able to behave differently in the per
architecture huge_pte_alloc() according to different VMA attributes.
Pass it deeper into huge_pmd_share() so that we can avoid the find_vma() call.
[peterx@redhat.com: build fix]
Link: https://lkml.kernel.org/r/20210304164653.GB397383@xz-x1Link: https://lkml.kernel.org/r/20210218230633.15028-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210218230633.15028-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
decimal facility
The PoP documents:
134: The vector packed decimal facility is installed in the
z/Architecture architectural mode. When bit 134 is
one, bit 129 is also one.
135: The vector enhancements facility 1 is installed in
the z/Architecture architectural mode. When bit 135
is one, bit 129 is also one.
Looks like we confuse the vector enhancements facility 1 ("EXT") with the
Vector packed decimal facility ("BCD"). Let's fix the facility checks.
Detected while working on QEMU/tcg z14 support and only unlocking
the vector enhancements facility 1, but not the vector packed decimal
facility.
Fixes: 2583b848cad0 ("s390: report new vector facilities")
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20210503121244.25232-1-david@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull Landlock LSM from James Morris:
"Add Landlock, a new LSM from Mickaël Salaün.
Briefly, Landlock provides for unprivileged application sandboxing.
From Mickaël's cover letter:
"The goal of Landlock is to enable to restrict ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock
is a stackable LSM [1], it makes possible to create safe security
sandboxes as new security layers in addition to the existing
system-wide access-controls. This kind of sandbox is expected to
help mitigate the security impact of bugs or unexpected/malicious
behaviors in user-space applications. Landlock empowers any
process, including unprivileged ones, to securely restrict
themselves.
Landlock is inspired by seccomp-bpf but instead of filtering
syscalls and their raw arguments, a Landlock rule can restrict the
use of kernel objects like file hierarchies, according to the
kernel semantic. Landlock also takes inspiration from other OS
sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD
Pledge/Unveil.
In this current form, Landlock misses some access-control features.
This enables to minimize this patch series and ease review. This
series still addresses multiple use cases, especially with the
combined use of seccomp-bpf: applications with built-in sandboxing,
init systems, security sandbox tools and security-oriented APIs [2]"
The cover letter and v34 posting is here:
https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/
See also:
https://landlock.io/
This code has had extensive design discussion and review over several
years"
Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1]
Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2]
* tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
landlock: Enable user space to infer supported features
landlock: Add user and kernel documentation
samples/landlock: Add a sandbox manager example
selftests/landlock: Add user space tests
landlock: Add syscall implementations
arch: Wire up Landlock syscalls
fs,security: Add sb_delete hook
landlock: Support filesystem access-control
LSM: Infrastructure management of the superblock
landlock: Add ptrace restrictions
landlock: Set up the security framework and manage credentials
landlock: Add ruleset and domain management
landlock: Add object management
|
|
Pull kvm updates from Paolo Bonzini:
"This is a large update by KVM standards, including AMD PSP (Platform
Security Processor, aka "AMD Secure Technology") and ARM CoreSight
(debug and trace) changes.
ARM:
- CoreSight: Add support for ETE and TRBE
- Stage-2 isolation for the host kernel when running in protected
mode
- Guest SVE support when running in nVHE mode
- Force W^X hypervisor mappings in nVHE mode
- ITS save/restore for guests using direct injection with GICv4.1
- nVHE panics now produce readable backtraces
- Guest support for PTP using the ptp_kvm driver
- Performance improvements in the S2 fault handler
x86:
- AMD PSP driver changes
- Optimizations and cleanup of nested SVM code
- AMD: Support for virtual SPEC_CTRL
- Optimizations of the new MMU code: fast invalidation, zap under
read lock, enable/disably dirty page logging under read lock
- /dev/kvm API for AMD SEV live migration (guest API coming soon)
- support SEV virtual machines sharing the same encryption context
- support SGX in virtual machines
- add a few more statistics
- improved directed yield heuristics
- Lots and lots of cleanups
Generic:
- Rework of MMU notifier interface, simplifying and optimizing the
architecture-specific code
- a handful of "Get rid of oprofile leftovers" patches
- Some selftests improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits)
KVM: selftests: Speed up set_memory_region_test
selftests: kvm: Fix the check of return value
KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt()
KVM: SVM: Skip SEV cache flush if no ASIDs have been used
KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()
KVM: SVM: Drop redundant svm_sev_enabled() helper
KVM: SVM: Move SEV VMCB tracking allocation to sev.c
KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()
KVM: SVM: Unconditionally invoke sev_hardware_teardown()
KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)
KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y
KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables
KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features
KVM: SVM: Move SEV module params/variables to sev.c
KVM: SVM: Disable SEV/SEV-ES if NPT is disabled
KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails
KVM: SVM: Zero out the VMCB array used to track SEV ASID association
x86/sev: Drop redundant and potentially misleading 'sev_enabled'
KVM: x86: Move reverse CPUID helpers to separate header file
KVM: x86: Rename GPR accessors to make mode-aware variants the defaults
...
|
|
Merge misc updates from Andrew Morton:
"A few misc subsystems and some of MM.
175 patches.
Subsystems affected by this patch series: ia64, kbuild, scripts, sh,
ocfs2, kfifo, vfs, kernel/watchdog, and mm (slab-generic, slub,
kmemleak, debug, pagecache, msync, gup, memremap, memcg, pagemap,
mremap, dma, sparsemem, vmalloc, documentation, kasan, initialization,
pagealloc, and memory-failure)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (175 commits)
mm/memory-failure: unnecessary amount of unmapping
mm/mmzone.h: fix existing kernel-doc comments and link them to core-api
mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1
net: page_pool: use alloc_pages_bulk in refill code path
net: page_pool: refactor dma_map into own function page_pool_dma_map
SUNRPC: refresh rq_pages using a bulk page allocator
SUNRPC: set rq_page_end differently
mm/page_alloc: inline __rmqueue_pcplist
mm/page_alloc: optimize code layout for __alloc_pages_bulk
mm/page_alloc: add an array-based interface to the bulk page allocator
mm/page_alloc: add a bulk page allocator
mm/page_alloc: rename alloced to allocated
mm/page_alloc: duplicate include linux/vmalloc.h
mm, page_alloc: avoid page_to_pfn() in move_freepages()
mm/Kconfig: remove default DISCONTIGMEM_MANUAL
mm: page_alloc: dump migrate-failed pages
mm/mempolicy: fix mpol_misplaced kernel-doc
mm/mempolicy: rewrite alloc_pages_vma documentation
mm/mempolicy: rewrite alloc_pages documentation
mm/mempolicy: rename alloc_pages_current to alloc_pages
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Enable KFENCE for 32-bit.
- Implement EBPF for 32-bit.
- Convert 32-bit to do interrupt entry/exit in C.
- Convert 64-bit BookE to do interrupt entry/exit in C.
- Changes to our signal handling code to use user_access_begin/end()
more extensively.
- Add support for time namespaces (CONFIG_TIME_NS)
- A series of fixes that allow us to reenable STRICT_KERNEL_RWX.
- Other smaller features, fixes & cleanups.
Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le
Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M.
Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique
Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie,
Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren
Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee
Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar,
Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin,
Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria,
Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen
Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar,
Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li,
Yu Kuai, and Zhang Yunkai.
* tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits)
powerpc/signal32: Fix erroneous SIGSEGV on RT signal return
powerpc: Avoid clang uninitialized warning in __get_user_size_allowed
powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe
powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n
powerpc/kasan: Fix shadow start address with modules
powerpc/kernel/iommu: Use largepool as a last resort when !largealloc
powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs
powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants"
powerpc/iommu: Annotate nested lock for lockdep
powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
powerpc/iommu: Allocate it_map by vmalloc
selftests/powerpc: remove unneeded semicolon
powerpc/64s: remove unneeded semicolon
powerpc/eeh: remove unneeded semicolon
powerpc/selftests: Add selftest to test concurrent perf/ptrace events
powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR
powerpc/selftests/perf-hwbreak: Coalesce event creation code
powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR
powerpc/configs: Add IBMVNIC to some 64-bit configs
selftests/powerpc: Add uaccess flush test
...
|
|
mem_init_print_info() is called in mem_init() on each architecture, and
pass NULL argument, so using void argument and move it into mm_init().
Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86]
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> [powerpc]
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com> [sparc64]
Acked-by: Russell King <rmk+kernel@armlinux.org.uk> [arm]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This adds support for adding a random offset to the stack while handling
syscalls. The patch uses get_tod_clock_fast() as this is considered good
enough and has much less performance penalty compared to using
get_random_int(). The patch also adds randomization in pgm_check_handler()
as the sigreturn/rt_sigreturn system calls might be called from there.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Link: https://lore.kernel.org/r/20210429091451.1062594-1-svens@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
In former times, the virtio-console code had to be compiled into
the kernel since the old guest virtio transport had some hard de-
pendencies. But since the old virtio transport has been removed in
commit 7fb2b2d51244 ("s390/virtio: remove the old KVM virtio transport"),
we do not have this limitation anymore.
Commit bb533ec8bacd ("s390/config: do not select VIRTIO_CONSOLE via
Kconfig") then also lifted the hard setting in the Kconfig system, so
we can finally switch the CONFIG_VIRTIO_CONSOLE knob to compile this
driver as a module now, making it more flexible for the user to only
load it if it is really required.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20210428082442.321327-1-thuth@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The function cpumf_pmu_add and cpumf_pmu_del call function
perf_event_update_userpage(). This calls is obsolete, the calls add and
delete a counter event. Counter events do not sample data and the
event->rb member to access the sampling ring buffer is always NULL.
The function perf_event_update_userpage() simply returns in this case.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by : Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The function to calculate the size of counter sets is renamed from
cf_diag_ctrset_size() to cpum_cf_ctrset_size() and moved to the file
containing common functions for the CPU Measurement Counter Facility.
No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by : Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Beautify if-then-else indentation to match coding guideline.
Also use shorter pointer notation hwc instead of event->hw.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by : Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
All major distributions ship with CONFIG_PCI_IOV=y so let us enable it
for our defconfigs as well.
Note also that since commit e5794cf1a270 ("s390/pci: create links
between PFs and VFs") we enabled proper linking between PFs and their
associated VFs so with this commit and its fixes applied we can fully
support handling SR-IOV enabled PFs.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The PCIs event with PEC 0x0303 or 0x0304 are a request to deconfigure
a PCI function, respectively an indication that it was already
deconfigured by the platform. If such an event is queued during boot it
may happen that the platform has already adjusted the configuration flag
of the relevant function in the CLP List PCI Functions result. In this
case we might not have configured the PCI function at all and should
thus ignore the event. Note that no locking is necessary as event
handling only starts after we have fully initialized the zPCI subsystem
and scanned all PCI devices listed in the CLP result.
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|