starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-10-15	Merge tag 'threads-v5.10' of ↵	Linus Torvalds	2	-175/+133
	git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull pidfd updates from Christian Brauner: "This introduces a new extension to the pidfd_open() syscall. Users can now raise the new PIDFD_NONBLOCK flag to support non-blocking pidfd file descriptors. This has been requested for uses in async process management libraries such as async-pidfd in Rust. Ever since the introduction of pidfds and more advanced async io various programming languages such as Rust have grown support for async event libraries. These libraries are created to help build epoll-based event loops around file descriptors. A common pattern is to automatically make all file descriptors they manage to O_NONBLOCK. For such libraries the EAGAIN error code is treated specially. When a function is called that returns EAGAIN the function isn't called again until the event loop indicates the the file descriptor is ready. Supporting EAGAIN when waiting on pidfds makes such libraries just work with little effort. This introduces a new flag PIDFD_NONBLOCK that is equivalent to O_NONBLOCK. This follows the same patterns we have for other (anon inode) file descriptors such as EFD_NONBLOCK, IN_NONBLOCK, SFD_NONBLOCK, TFD_NONBLOCK and the same for close-on-exec flags. Passing a non-blocking pidfd to waitid() currently has no effect, i.e. is not supported. There are users which would like to use waitid() on pidfds that are O_NONBLOCK and mix it with pidfds that are blocking and both pass them to waitid(). The expected behavior is to have waitid() return -EAGAIN for non-blocking pidfds and to block for blocking pidfds without needing to perform any additional checks for flags set on the pidfd before passing it to waitid(). Non-blocking pidfds will return EAGAIN from waitid() when no child process is ready yet. Returning -EAGAIN for non-blocking pidfds makes it easier for event loops that handle EAGAIN specially. It also makes the API more consistent and uniform. In essence, waitid() is treated like a read on a non-blocking pidfd or a recvmsg() on a non-blocking socket. With the addition of support for non-blocking pidfds we support the same functionality that sockets do. For sockets() recvmsg() supports MSG_DONTWAIT for pidfds waitid() supports WNOHANG. Both flags are per-call options. In contrast non-blocking pidfds and non-blocking sockets are a setting on an open file description affecting all threads in the calling process as well as other processes that hold file descriptors referring to the same open file description. Both behaviors, per call and per open file description, have genuine use-cases. The interaction with the WNOHANG flag is documented as follows: - If a non-blocking pidfd is passed and WNOHANG is not raised we simply raise the WNOHANG flag internally. When do_wait() returns indicating that there are eligible child processes but none have exited yet we set EAGAIN. If no child process exists we continue returning ECHILD. - If a non-blocking pidfd is passed and WNOHANG is raised waitid() will continue returning 0, i.e. it will not set EAGAIN. This ensure backwards compatibility with applications passing WNOHANG explicitly with pidfds" * tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: remove O_NONBLOCK before waiting for WSTOPPED tests: add waitid() tests for non-blocking pidfds tests: port pidfd_wait to kselftest harness pidfd: support PIDFD_NONBLOCK in pidfd_open() exit: support non-blocking pidfds
2020-10-15	Merge tag 'kernel-clone-v5.9' of ↵	Linus Torvalds	16	-35/+35
	git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull kernel_clone() updates from Christian Brauner: "During the v5.9 merge window we reworked the process creation codepaths across multiple architectures. After this work we were only left with the _do_fork() helper based on the struct kernel_clone_args calling convention. As was pointed out _do_fork() isn't valid kernelese especially for a helper that isn't just static. This series removes the _do_fork() helper and introduces the new kernel_clone() helper. The process creation cleanup didn't change the name to something more reasonable mainly because _do_fork() was used in quite a few places. So sending this as a separate series seemed the better strategy. I originally intended to send this early in the v5.9 development cycle after the merge window had closed but given that this was touching quite a few places I decided to defer this until the v5.10 merge window" * tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: sched: remove _do_fork() tracing: switch to kernel_clone() kgdbts: switch to kernel_clone() kprobes: switch to kernel_clone() x86: switch to kernel_clone() sparc: switch to kernel_clone() nios2: switch to kernel_clone() m68k: switch to kernel_clone() ia64: switch to kernel_clone() h8300: switch to kernel_clone() fork: introduce kernel_clone()
2020-10-15	Merge tag 'linux-kselftest-fixes-5.10-rc1' of ↵	Linus Torvalds	3	-119/+200
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - a selftests harness fix to flush stdout before forking to avoid parent and child printing duplicates messages. This is evident when test output is redirected to a file. - a tools/ wide change to avoid comma separated statements from Joe Perches. This fix spans tools/lib, tools/power/cpupower, and selftests. * tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: tools: Avoid comma separated statements selftests/harness: Flush stdout before forking
2020-10-14	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	6	-22/+48
	Merge misc updates from Andrew Morton: "181 patches. Subsystems affected by this patch series: kbuild, scripts, ntfs, ocfs2, vfs, mm (slab, slub, kmemleak, dax, debug, pagecache, fadvise, gup, swap, memremap, memcg, selftests, pagemap, mincore, hmm, dma, memory-failure, vmallo and migration)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (181 commits) mm/migrate: remove obsolete comment about device public mm/migrate: remove cpages-- in migrate_vma_finalize() mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary memblock: use separate iterators for memory and reserved regions memblock: implement for_each_reserved_mem_region() using __next_mem_region() memblock: remove unused memblock_mem_size() x86/setup: simplify reserve_crashkernel() x86/setup: simplify initrd relocation and reservation arch, drivers: replace for_each_membock() with for_each_mem_range() arch, mm: replace for_each_memblock() with for_each_mem_pfn_range() memblock: reduce number of parameters in for_each_mem_range() memblock: make memblock_debug and related functionality private memblock: make for_each_memblock_type() iterator private mircoblaze: drop unneeded NUMA and sparsemem initializations riscv: drop unneeded node initialization h8300, nds32, openrisc: simplify detection of memory extents arm64: numa: simplify dummy_numa_init() arm, xtensa: simplify initialization of high memory pages dma-contiguous: simplify cma_early_percent_memory() KVM: PPC: Book3S HV: simplify kvm_cma_reserve() ...
2020-10-14	selftests/vm: 8x compaction_test speedup	John Hubbard	1	-5/+6
	This patch reduces the running time for compaction_test from about 27 sec, to 3.3 sec, which is about an 8x speedup. These numbers are for an Intel x86_64 system with 32 GB of DRAM. The compaction_test.c program was spending most of its time doing mmap(), 1 MB at a time, on about 25 GB of memory. Instead, do the mmaps 100 MB at a time. (Going past 100 MB doesn't make things go much faster, because other parts of the program are using the remaining time.) Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sri Jayaramappa <sjayaram@akamai.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Link: https://lkml.kernel.org/r/20201002080621.551044-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	tools/testing/selftests/vm/hmm-tests.c: use the new SKIP() macro	Ralph Campbell	1	-2/+2
	Some tests might not be able to be run if resources like huge pages are not available. Mark these tests as skipped instead of simply passing. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Shuah Khan <shuah@kernel.org> Link: http://lkml.kernel.org/r/20200827190400.12608-1-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	selftests/vm: fix incorrect gcc invocation in some cases	John Hubbard	1	-0/+12
	Avoid accidental wrong builds, due to built-in rules working just a little bit too well--but not quite as well as required for our situation here. In other words, "make userfaultfd" (for example) is supposed to fail to build at all, because this Makefile only supports either "make" (all), or "make /full/path". However, the built-in rules, if not suppressed, will pick up CFLAGS and the initial LDLIBS (but not the target-specific LDLIBS, because those are only set for the full path target!). This causes it to get pretty far into building things despite using incorrect values such as an occasionally incomplete LDLIBS value. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Link: https://lkml.kernel.org/r/20200915012901.1655280-3-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	selftests/vm: fix false build success on the second and later attempts	John Hubbard	1	-0/+5
	Patch series "selftests/vm: fix some minor aggravating factors in the Makefile". This fixes a couple of minor aggravating factors that I ran across while trying to do some changes in selftests/vm. These are simple things, but like most things with GNU Make, it's rarely obvious what's wrong until you understand the entire Makefile and all of its includes. So while there is, of course, joy in learning those details, I thought I'd fix these little things, so as to allow others to skip out on the Joy if they so choose. :) First of all, if you have an item (let's choose userfaultfd for an example) that fails to build, you might do this: $ make -j32 # ...you observe a failed item in the threaded output # OK, let's get a closer look $ make # ...but now the build quietly "succeeds". That's what Patch 0001 fixes. Second, if you instead attempt this approach for your closer look (a casual mistake, as it's not supported): $ make userfaultfd # ...userfaultfd fails to link, due to incomplete LDLIBS That's what Patch 0002 fixes. This patch (of 2): If one or more of these selftest fail to build, then after the first failure, subsequent invocations of "make" will make it appear that there are no build failures, after all. That's because the failed build products remain, with up-to-date timestamps, thus tricking Make (and you!) into believing that there's nothing else to build. Fix this by telling Make to delete targets that didn't completely succeed. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Link: https://lkml.kernel.org/r/20200915012901.1655280-1-jhubbard@nvidia.com Link: https://lkml.kernel.org/r/20200915012901.1655280-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	mm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flag	Barry Song	1	-7/+7
	According to Documentation/core-api/pin_user_pages.rst, FOLL_PIN is a prerequisite to FOLL_LONGTERM. Another way of saying that is, FOLL_LONGTERM is a specific case, more restrictive case of FOLL_PIN. Almost all kernel modules are using pin_user_pages() with FOLL_LONGTERM, mm/gup_benchmark.c seems to the only exception in which FOLL_PIN is not a prerequisite to FOLL_LONGTERM. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200815122056.29508-1-song.bao.hua@hisilicon.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	device-dax: add dis-contiguous resource support	Dan Williams	1	-6/+14
	Break the requirement that device-dax instances are physically contiguous. With this constraint removed it allows fragmented available capacity to be fully allocated. This capability is useful to mitigate the "noisy neighbor" problem with memory-side-cache management for virtual machines, or any other scenario where a platform address boundary also designates a performance boundary. For example a direct mapped memory side cache might rotate cache colors at 1GB boundaries. With dis-contiguous allocations a device-dax instance could be configured to contain only 1 cache color. It also satisfies Joao's use case (see link) for partitioning memory for exclusive guest access. It allows for a future potential mode where the host kernel need not allocate 'struct page' capacity up-front. Reported-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Juergen Gross <jgross@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/20200110190313.17144-1-joao.m.martins@oracle.com/ Link: https://lkml.kernel.org/r/159643104304.4062302.16561669534797528660.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106116875.30709.11456649969327399771.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	mm/memremap_pages: convert to 'struct range'	Dan Williams	1	-1/+1
	The 'struct resource' in 'struct dev_pagemap' is only used for holding resource span information. The other fields, 'name', 'flags', 'desc', 'parent', 'sibling', and 'child' are all unused wasted space. This is in preparation for introducing a multi-range extension of devm_memremap_pages(). The bulk of this change is unwinding all the places internal to libnvdimm that used 'struct resource' unnecessarily, and replacing instances of 'struct dev_pagemap'.res with 'struct dev_pagemap'.range. P2PDMA had a minor usage of the resource flags field, but only to report failures with "%pR". That is replaced with an open coded print of the range. [dan.carpenter@oracle.com: mm/hmm/test: use after free in dmirror_allocate_chunk()] Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen] Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	device-dax: make pgmap optional for instance creation	Dan Williams	1	-4/+4
	The passed in dev_pagemap is only required in the pmem case as the libnvdimm core may have reserved a vmem_altmap for dev_memremap_pages() to place the memmap in pmem directly. In the hmem case there is no agent reserving an altmap so it can all be handled by a core internal default. Pass the resource range via a new @range property of 'struct dev_dax_data'. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Brice Goglin <Brice.Goglin@inria.fr> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jia He <justin.he@arm.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hulk Robot <hulkci@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/159643099958.4062302.10379230791041872886.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106110513.30709.4303239334850606031.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14	Merge tag 'seccomp-v5.10-rc1' of ↵	Linus Torvalds	7	-203/+318
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The bulk of the changes are with the seccomp selftests to accommodate some powerpc-specific behavioral characteristics. Additional cleanups, fixes, and improvements are also included: - heavily refactor seccomp selftests (and clone3 selftests dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza Cascardo) - fix style issue in selftests (Zou Wei) - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich Felker) - replace task_pt_regs(current) with current_pt_regs() (Denis Efremov) - fix corner-case race in USER_NOTIF (Jann Horn) - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)" * tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits) seccomp: Make duplicate listener detection non-racy seccomp: Move config option SECCOMP to arch/Kconfig selftests/clone3: Avoid OS-defined clone_args selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit selftests/seccomp: Allow syscall nr and ret value to be set separately selftests/seccomp: Record syscall during ptrace entry selftests/seccomp: powerpc: Fix seccomp return value testing selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET selftests/seccomp: Avoid redundant register flushes selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG selftests/seccomp: Remove syscall setting #ifdefs selftests/seccomp: mips: Remove O32-specific macro selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro selftests/seccomp: arm: Define SYSCALL_NUM_SET macro selftests/seccomp: mips: Define SYSCALL_NUM_SET macro selftests/seccomp: Provide generic syscall setting macro selftests/seccomp: Refactor arch register macros to avoid xtensa special case selftests/seccomp: Use __NR_mknodat instead of __NR_mknod selftests/seccomp: Use bitwise instead of arithmetic operator for flags ...
2020-10-12	Merge tag 'sched-core-2020-10-12' of ↵	Linus Torvalds	3	-1/+281
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - reorganize & clean up the SD* flags definitions and add a bunch of sanity checks. These new checks caught quite a few bugs or at least inconsistencies, resulting in another set of patches. - rseq updates, add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ - add a new tracepoint to improve CPU capacity tracking - improve overloaded SMP system load-balancing behavior - tweak SMT balancing - energy-aware scheduling updates - NUMA balancing improvements - deadline scheduler fixes and improvements - CPU isolation fixes - misc cleanups, simplifications and smaller optimizations * tag 'sched-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) sched/deadline: Unthrottle PI boosted threads while enqueuing sched/debug: Add new tracepoint to track cpu_capacity sched/fair: Tweak pick_next_entity() rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ rseq/selftests,x86_64: Add rseq_offset_deref_addv() rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ sched/fair: Use dst group while checking imbalance for NUMA balancer sched/fair: Reduce busy load balance interval sched/fair: Minimize concurrent LBs between domain level sched/fair: Reduce minimal imbalance threshold sched/fair: Relax constraint on task's load during load balance sched/fair: Remove the force parameter of update_tg_load_avg() sched/fair: Fix wrong cpu selecting from isolated domain sched: Remove unused inline function uclamp_bucket_base_value() sched/rt: Disable RT_RUNTIME_SHARE by default sched/deadline: Fix stale throttling on de-/boosted tasks sched/numa: Use runnable_avg to classify node sched/topology: Move sd_flag_debug out of #ifdef CONFIG_SYSCTL MAINTAINERS: Add myself as SCHED_DEADLINE reviewer sched/topology: Move SD_DEGENERATE_GROUPS_MASK out of linux/sched/topology.h ...
2020-10-12	Merge tag 'x86_fsgsbase_for_v5.10' of ↵	Linus Torvalds	1	-0/+68
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fsgsbase updates from Borislav Petkov: "Misc minor cleanups and corrections to the fsgsbase code and respective selftests" * tag 'x86_fsgsbase_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86/fsgsbase: Test PTRACE_PEEKUSER for GSBASE with invalid LDT GS selftests/x86/fsgsbase: Reap a forgotten child x86/fsgsbase: Replace static_cpu_has() with boot_cpu_has() x86/entry/64: Correct the comment over SAVE_AND_SET_GSBASE
2020-10-12	Merge tag 'ras_updates_for_v5.10' of ↵	Linus Torvalds	5	-29/+30
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Extend the recovery from MCE in kernel space also to processes which encounter an MCE in kernel space but while copying from user memory by sending them a SIGBUS on return to user space and umapping the faulty memory, by Tony Luck and Youquan Song. - memcpy_mcsafe() rework by splitting the functionality into copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables support for new hardware which can recover from a machine check encountered during a fast string copy and makes that the default and lets the older hardware which does not support that advance recovery, opt in to use the old, fragile, slow variant, by Dan Williams. - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta. - Do not use MSR-tracing accessors in #MC context and flag any fault while accessing MCA architectural MSRs as an architectural violation with the hope that such hw/fw misdesigns are caught early during the hw eval phase and they don't make it into production. - Misc fixes, improvements and cleanups, as always. * tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Allow for copy_mc_fragile symbol checksum to be generated x86/mce: Decode a kernel instruction to determine if it is copying from user x86/mce: Recover from poison found while copying from user space x86/mce: Avoid tail copy when machine check terminated a copy from user x86/mce: Add _ASM_EXTABLE_CPY for copy user access x86/mce: Provide method to find out the type of an exception handler x86/mce: Pass pointer to saved pt_regs to severity calculation routines x86/copy_mc: Introduce copy_mc_enhanced_fast_string() x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list x86/mce: Add Skylake quirk for patrol scrub reported errors RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE() x86/mce: Annotate mce_rd/wrmsrl() with noinstr x86/mce/dev-mcelog: Do not update kflags on AMD systems x86/mce: Stop mce_reign() from re-computing severity for every CPU x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR x86/mce: Increase maximum number of banks to 64 x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap RAS/CEC: Fix cec_init() prototype
2020-10-12	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	33	-1/+4644
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's quite a lot of code here, but much of it is due to the addition of a new PMU driver as well as some arm64-specific selftests which is an area where we've traditionally been lagging a bit. In terms of exciting features, this includes support for the Memory Tagging Extension which narrowly missed 5.9, hopefully allowing userspace to run with use-after-free detection in production on CPUs that support it. Work is ongoing to integrate the feature with KASAN for 5.11. Another change that I'm excited about (assuming they get the hardware right) is preparing the ASID allocator for sharing the CPU page-table with the SMMU. Those changes will also come in via Joerg with the IOMMU pull. We do stray outside of our usual directories in a few places, mostly due to core changes required by MTE. Although much of this has been Acked, there were a couple of places where we unfortunately didn't get any review feedback. Other than that, we ran into a handful of minor conflicts in -next, but nothing that should post any issues. Summary: - Userspace support for the Memory Tagging Extension introduced by Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11. - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context switching. - Fix and subsequent rewrite of our Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. - Support for the Armv8.3 Pointer Authentication enhancements. - Support for ASID pinning, which is required when sharing page-tables with the SMMU. - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op. - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and also support to handle CPU PMU IRQs as NMIs. - Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. - Implementation of ARCH_STACKWALK for unwinding. - Improve reporting of unexpected kernel traps due to BPF JIT failure. - Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. - Removal of TEXT_OFFSET. - Removal of some unused functions, parameters and prototypes. - Removal of MPIDR-based topology detection in favour of firmware description. - Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. - Cleanups to the SDEI driver in preparation for support in KVM. - Miscellaneous cleanups and refactoring work" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) Revert "arm64: initialize per-cpu offsets earlier" arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 ...
2020-10-09	tests: remove O_NONBLOCK before waiting for WSTOPPED	Christian Brauner	1	-2/+2
	Naresh reported that selftests: pidfd: pidfd_wait hangs on linux next kernel on x86_64, i386 and arm64 Juno-r2 These devices are using NFS mounted rootfs. I have tested pidfd testcases independently and all test PASS. The Hang or exit from test run noticed when run by run_kselftest.sh pidfd_wait.c:208:wait_nonblock:Expected sys_waitid(P_PIDFD, pidfd, &info, WSTOPPED, NULL) (-1) == 0 (0) wait_nonblock: Test terminated by assertion metadata: git branch: master git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git git commit: e64997027d5f171148687e58b78c8b3c869a6158 git describe: next-20200922 make_kernelversion: 5.9.0-rc6 kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-core2-32/lkft/linux-next/865/config The reason for this is a simple race in the selftests, that I overlooked and which is more likely to hit when there's a lot of processes running on the system. Basically the child process hasn't SIGSTOPed itself yet but the parent is already calling waitid() on a O_NONBLOCK pidfd. Since it doesn't find a WSTOPPED process it returns -EAGAIN correctly. The fix for this is to move the line where we're removing the O_NONBLOCK property from the fd before the waitid() WSTOPPED call so we hang until the child becomes stopped. Fixes: cd89597bbe5a ("tests: add waitid() tests for non-blocking pidfds") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lkft.validation.linaro.org/scheduler/job/1813223 Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-10-08	selftests/clone3: Avoid OS-defined clone_args	Kees Cook	7	-44/+41
	As the UAPI headers start to appear in distros, we need to avoid outdated versions of struct clone_args to be able to test modern features, named "struct __clone_args". Additionally update the struct size macro names to match UAPI names. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075432.u4gis3s2o5qrsb5g@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit	Kees Cook	1	-4/+21
	Some archs (like powerpc) only support changing the return code during syscall exit when ptrace is used. Test entry vs exit phases for which portions of the syscall number and return values need to be set at which different phases. For non-powerpc, all changes are made during ptrace syscall entry, as before. For powerpc, the syscall number is changed at ptrace syscall entry and the syscall return value is changed on ptrace syscall exit. Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole") Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075300.7iylzof2w5vrutah@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: Allow syscall nr and ret value to be set separately	Kees Cook	1	-12/+47
	In preparation for setting syscall nr and ret values separately, refactor the helpers to take a pointer to a value, so that a NULL can indicate "do not change this respective value". This is done to keep the regset read/write happening once and in one code path. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075031.j4gruygeugkp2zwd@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: Record syscall during ptrace entry	Kees Cook	1	-13/+27
	In preparation for performing actions during ptrace syscall exit, save the syscall number during ptrace syscall entry. Some architectures do no have the syscall number available during ptrace syscall exit. Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921074354.6shkt2e5yhzhj3sn@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-07	Merge branch 'for-next/late-arrivals' into for-next/core	Will Deacon	13	-1/+2070
	Late patches for 5.10: MTE selftests, minor KCSAN preparation and removal of some unused prototypes. (Amit Daniel Kachhap and others) * for-next/late-arrivals: arm64: random: Remove no longer needed prototypes arm64: initialize per-cpu offsets earlier kselftest/arm64: Check mte tagged user address in kernel kselftest/arm64: Verify KSM page merge for MTE pages kselftest/arm64: Verify all different mmap MTE options kselftest/arm64: Check forked child mte memory accessibility kselftest/arm64: Verify mte tag inclusion via prctl kselftest/arm64: Add utilities and a test to validate mte memory
2020-10-06	x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()	Dan Williams	5	-29/+30
	In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-05	kselftest/arm64: Check mte tagged user address in kernel	Amit Daniel Kachhap	4	-0/+127
	Add a testcase to check that user address with valid/invalid mte tag works in kernel mode. This test verifies that the kernel API's __arch_copy_from_user/__arch_copy_to_user works by considering if the user pointer has valid/invalid allocation tags. In MTE sync mode, file memory read/write and other similar interfaces fails if a user memory with invalid tag is accessed in kernel. In async mode no such failure occurs. Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-7-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05	kselftest/arm64: Verify KSM page merge for MTE pages	Amit Daniel Kachhap	2	-0/+160
	Add a testcase to check that KSM should not merge pages containing same data with same/different MTE tag values. This testcase has one positive tests and passes if page merging happens according to the above rule. It also saves and restores any modified ksm sysfs entries. Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-6-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05	kselftest/arm64: Verify all different mmap MTE options	Amit Daniel Kachhap	2	-0/+263
	This testcase checks the different unsupported/supported options for mmap if used with PROT_MTE memory protection flag. These checks are, * Either pstate.tco enable or prctl PR_MTE_TCF_NONE option should not cause any tag mismatch faults. * Different combinations of anonymous/file memory mmap, mprotect, sync/async error mode and private/shared mappings should work. * mprotect should not be able to clear the PROT_MTE page property. Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-5-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05	kselftest/arm64: Check forked child mte memory accessibility	Amit Daniel Kachhap	2	-0/+196
	This test covers the mte memory behaviour of the forked process with different mapping properties and flags. It checks that all bytes of forked child memory are accessible with the same tag as that of the parent and memory accessed outside the tag range causes fault to occur. Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-4-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05	kselftest/arm64: Verify mte tag inclusion via prctl	Amit Daniel Kachhap	2	-0/+186
	This testcase verifies that the tag generated with "irg" instruction contains only included tags. This is done via prtcl call. This test covers 4 scenarios, * At least one included tag. * More than one included tags. * All included. * None included. Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-3-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05	kselftest/arm64: Add utilities and a test to validate mte memory	Amit Daniel Kachhap	8	-1/+1138
	This test checks that the memory tag is present after mte allocation and the memory is accessible with those tags. This testcase verifies all sync, async and none mte error reporting mode. The allocated mte buffers are verified for Allocated range (no error expected while accessing buffer), Underflow range, and Overflow range. Different test scenarios covered here are, * Verify that mte memory are accessible at byte/block level. * Force underflow and overflow to occur and check the data consistency. * Check to/from between tagged and untagged memory. * Check that initial allocated memory to have 0 tag. This change also creates the necessary infrastructure to add mte test cases. MTE kselftests can use the several utility functions provided here to add wide variety of mte test scenarios. GCC compiler need flag '-march=armv8.5-a+memtag' so those flags are verified before compilation. The mte testcases can be launched with kselftest framework as, make TARGETS=arm64 ARM64_SUBTARGETS=mte kselftest or compiled as, make -C tools/testing/selftests TARGETS=arm64 ARM64_SUBTARGETS=mte CC='compiler' Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-2-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-02	tools: Avoid comma separated statements	Joe Perches	2	-119/+195
	Use semicolons and braces. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-09-26	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-1/+1
	Pull more kvm fixes from Paolo Bonzini: "Five small fixes. The nested migration bug will be fixed with a better API in 5.10 or 5.11, for now this is a fix that works with existing userspace but keeps the current ugly API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Add a dedicated INVD intercept routine KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE KVM: x86: fix MSR_IA32_TSC read for nested migration selftests: kvm: Fix assert failure in single-step test KVM: x86: VMX: Make smaller physical guest address space support user-configurable
2020-09-25	rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ	Peter Oskolkov	2	-1/+224
	Based on Google-internal RSEQ work done by Paul Turner and Andrew Hunter. This patch adds a selftest for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ. The test quite often fails without the previous patch in this patchset, but consistently passes with it. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lkml.kernel.org/r/20200923233618.2572849-3-posk@google.com
2020-09-25	rseq/selftests,x86_64: Add rseq_offset_deref_addv()	Peter Oskolkov	1	-0/+57
	This patch adds rseq_offset_deref_addv() function to tools/testing/selftests/rseq/rseq-x86.h, to be used in a selftest in the next patch in the patchset. Once an architecture adds support for this function they should define "RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV". Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lkml.kernel.org/r/20200923233618.2572849-2-posk@google.com
2020-09-23	selftests: kvm: Fix assert failure in single-step test	Yang Weijiang	1	-1/+1
	This is a follow-up patch to fix an issue left in commit: 98b0bf02738004829d7e26d6cb47b2e469aaba86 selftests: kvm: Use a shorter encoding to clear RAX With the change in the commit, we also need to modify "xor" instruction length from 3 to 2 in array ss_size accordingly to pass below check: for (i = 0; i < (sizeof(ss_size) / sizeof(ss_size[0])); i++) { target_rip += ss_size[i]; CLEAR_DEBUG(); debug.control = KVM_GUESTDBG_ENABLE \| KVM_GUESTDBG_SINGLESTEP; debug.arch.debugreg[7] = 0x00000400; APPLY_DEBUG(); vcpu_run(vm, VCPU_ID); TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG && run->debug.arch.exception == DB_VECTOR && run->debug.arch.pc == target_rip && run->debug.arch.dr6 == target_dr6, "SINGLE_STEP[%d]: exit %d exception %d rip 0x%llx " "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)", i, run->exit_reason, run->debug.arch.exception, run->debug.arch.pc, target_rip, run->debug.arch.dr6, target_dr6); } Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20200826015524.13251-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds	2	-0/+62
	Pull networking fixes from Jakub Kicinski: - fix failure to add bond interfaces to a bridge, the offload-handling code was too defensive there and recent refactoring unearthed that. Users complained (Ido) - fix unnecessarily reflecting ECN bits within TOS values / QoS marking in TCP ACK and reset packets (Wei) - fix a deadlock with bpf iterator. Hopefully we're in the clear on this front now... (Yonghong) - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel) - fix AQL on mt76 devices with FW rate control and add a couple of AQL issues in mac80211 code (Felix) - fix authentication issue with mwifiex (Maximilian) - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro) - fix exception handling for multipath routes via same device (David Ahern) - revert back to a BH spin lock flavor for nsid_lock: there are paths which do require the BH context protection (Taehee) - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke) - fix ife module load deadlock (Cong) - make an adjustment to netlink reply message type for code added in this release (the sole change touching uAPI here) (Michal) - a number of fixes for small NXP and Microchip switches (Vladimir) [ Pull request acked by David: "you can expect more of this in the future as I try to delegate more things to Jakub" ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits) net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU net: Update MAINTAINERS for MediaTek switch driver net/mlx5e: mlx5e_fec_in_caps() returns a boolean net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock net/mlx5e: kTLS, Fix leak on resync error flow net/mlx5e: kTLS, Add missing dma_unmap in RX resync net/mlx5e: kTLS, Fix napi sync and possible use-after-free net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats() net/mlx5e: Fix multicast counter not up-to-date in "ip -s" net/mlx5e: Fix endianness when calculating pedit mask first bit net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported net/mlx5e: CT: Fix freeing ct_label mapping net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready net/mlx5e: Use synchronize_rcu to sync with NAPI net/mlx5e: Use RCU to protect rq->xdp_prog ...
2020-09-19	selftests/vm: fix display of page size in map_hugetlb	Christophe Leroy	1	-1/+1
	The displayed size is in bytes while the text says it is in kB. Shift it by 10 to really display kBytes. Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/e27481224564a93d14106e750de31189deaa8bc8.1598861977.git.christophe.leroy@csgroup.eu Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-19	selftests/seccomp: powerpc: Fix seccomp return value testing	Kees Cook	1	-0/+15
	On powerpc, the errno is not inverted, and depends on ccr.so being set. Add this to a powerpc definition of SYSCALL_RET_SET(). Co-developed-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-13-keescook@chromium.org Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
2020-09-19	selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET	Kees Cook	1	-10/+23
	Instead of special-casing the specific case of shared registers, create a default SYSCALL_RET_SET() macro (mirroring SYSCALL_NUM_SET()), that writes to the SYSCALL_RET register. For architectures that can't set the return value (for whatever reason), they can define SYSCALL_RET_SET() without an associated SYSCALL_RET() macro. This also paves the way for architectures that need to do special things to set the return value (e.g. powerpc). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-12-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Avoid redundant register flushes	Kees Cook	1	-2/+4
	When none of the registers have changed, don't flush them back. This can happen if the architecture uses a non-register way to change the syscall (e.g. arm64) , and a return value hasn't been written. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-11-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG	Kees Cook	1	-27/+15
	Consolidate the REGSET logic into the new ARCH_GETREG() and ARCH_SETREG() macros, avoiding more #ifdef code in function bodies. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-10-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG	Kees Cook	1	-12/+15
	Instead of special-casing the get/set-registers routines, move the HAVE_GETREG logic into the new ARCH_GETREG() and ARCH_SETREG() macros. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-9-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Remove syscall setting #ifdefs	Kees Cook	1	-13/+3
	With all architectures now using the common SYSCALL_NUM_SET() macro, the arch-specific #ifdef can be removed from change_syscall() itself. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-8-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: mips: Remove O32-specific macro	Kees Cook	1	-6/+12
	Instead of having the mips O32 macro special-cased, pull the logic into the SYSCALL_NUM() macro. Additionally include the ABI headers, since these appear to have been missing, leaving __NR_O32_Linux undefined. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-7-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro	Kees Cook	1	-14/+13
	Remove the arm64 special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-6-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: arm: Define SYSCALL_NUM_SET macro	Kees Cook	1	-10/+6
	Remove the arm special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-5-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: mips: Define SYSCALL_NUM_SET macro	Kees Cook	1	-8/+9
	Remove the mips special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-4-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Provide generic syscall setting macro	Kees Cook	1	-2/+13
	In order to avoid "#ifdef"s in the main function bodies, create a new macro, SYSCALL_NUM_SET(), where arch-specific logic can live. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-3-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Refactor arch register macros to avoid xtensa special case	Kees Cook	1	-50/+47
	To avoid an xtensa special-case, refactor all arch register macros to take the register variable instead of depending on the macro expanding as a struct member name. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-2-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Use __NR_mknodat instead of __NR_mknod	Kees Cook	1	-1/+1
	The __NR_mknod syscall doesn't exist on arm64 (only __NR_mknodat). Switch to the modern syscall. Fixes: ad5682184a81 ("selftests/seccomp: Check for EPOLLHUP for user_notif") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-16-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>