summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-4/+7
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov1-4/+7
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-05Merge branch 'linus' of ↵Linus Torvalds1-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - racy use of ctx->rcvused in af_alg - algif_aead crash in chacha20poly1305 - freeing bogus pointer in pcrypt - build error on MIPS in mpi - memory leak in inside-secure - memory overwrite in inside-secure - NULL pointer dereference in inside-secure - state corruption in inside-secure - build error without CRYPTO_GF128MUL in chelsio - use after free in n2" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - do not use areq->result for partial results crypto: inside-secure - fix request allocations in invalidation path crypto: inside-secure - free requests even if their handling failed crypto: inside-secure - per request invalidation lib/mpi: Fix umul_ppmm() for MIPS64r6 crypto: pcrypt - fix freeing pcrypt instances crypto: n2 - cure use after free crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t crypto: chacha20poly1305 - validate the digest size crypto: chelsio - select CRYPTO_GF128MUL
2017-12-31Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A pile of fixes for long standing issues with the timer wheel and the NOHZ code: - Prevent timer base confusion accross the nohz switch, which can cause unlocked access and data corruption - Reinitialize the stale base clock on cpu hotplug to prevent subtle side effects including rollovers on 32bit - Prevent an interrupt storm when the timer softirq is already pending caused by tick_nohz_stop_sched_tick() - Move the timer start tracepoint to a place where it actually makes sense - Add documentation to timerqueue functions as they caused confusion several times now" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timerqueue: Document return values of timerqueue_add/del() timers: Invoke timer_start_debug() where it makes sense nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() timers: Reinitialize per cpu bases on hotplug timers: Use deferrable base independent of base::nohz_active
2017-12-31Merge tag 'driver-core-4.15-rc6' of ↵Linus Torvalds1-4/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core fixes for 4.15-rc6, resolving some reported issues. The first is a cacheinfo fix for DT based systems to resolve a reported issue that has been around for a while, and the other is to resolve a regression in the kobject uevent code that showed up in 4.15-rc1. Both have been in linux-next for a while with no reported issues" * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kobject: fix suppressing modalias in uevents delivered over netlink drivers: base: cacheinfo: fix cache type for non-architected system cache
2017-12-30timerqueue: Document return values of timerqueue_add/del()Thomas Gleixner1-3/+5
The return values of timerqueue_add/del() are not documented in the kernel doc comment. Add proper documentation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
2017-12-22lib/mpi: Fix umul_ppmm() for MIPS64r6James Hogan1-1/+17
Current MIPS64r6 toolchains aren't able to generate efficient DMULU/DMUHU based code for the C implementation of umul_ppmm(), which performs an unsigned 64 x 64 bit multiply and returns the upper and lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128 x 128 multiply. This is both inefficient, and it results in a link error since we don't include __multi3 in MIPS linux. For example commit 90a53e4432b1 ("cfg80211: implement regdb signature checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and 64r6el_defconfig builds by indirectly selecting MPILIB. The same build errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA: lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1': lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3' lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1': lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3' lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1': lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3' lib/mpi/mpih-div.o In function `mpihelp_divrem': lib/mpi/mpih-div.c:205: undefined reference to `__multi3' lib/mpi/mpih-div.c:142: undefined reference to `__multi3' Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using inline assembly and the DMULU/DMUHU instructions, to prevent __multi3 calls being emitted. Fixes: 7fd08ca58ae6 ("MIPS: Add build support for the MIPS R6 ISA") Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-mips@linux-mips.org Cc: linux-crypto@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-21kobject: fix suppressing modalias in uevents delivered over netlinkDmitry Torokhov1-4/+12
The commit 4a336a23d619 ("kobject: copy env blob in one go") optimized constructing uevent data for delivery over netlink by using the raw environment buffer, instead of reconstructing it from individual environment pointers. Unfortunately in doing so it broke suppressing MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this attribute only adjusted the environment pointers, but left the buffer itself alone. Let's fix it by making sure the offending attribute is obliterated form the buffer as well. Reported-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Casey Leedom <leedom@chelsio.com> Fixes: 4a336a23d619 ("kobject: copy env blob in one go") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+43
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-17 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a corner case in generic XDP where we have non-linear skbs but enough tailroom in the skb to not miss to linearizing there, from Song. 2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when BPF context is not skb, from Daniel. 3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper call would use the wrong register for the skb, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds1-33/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes: - Fix a S390 boot hang that was caused by the lock-break logic. Remove lock-break to begin with, as review suggested it was unreasonably fragile and our confidence in its continued good health is lower than our confidence in its removal. - Remove the lockdep cross-release checking code for now, because of unresolved false positive warnings. This should make lockdep work well everywhere again. - Get rid of the final (and single) ACCESS_ONCE() straggler and remove the API from v4.15. - Fix a liblockdep build warning" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/lib/lockdep: Add missing declaration of 'pr_cont()' checkpatch: Remove ACCESS_ONCE() warning compiler.h: Remove ACCESS_ONCE() tools/include: Remove ACCESS_ONCE() tools/perf: Convert ACCESS_ONCE() to READ_ONCE() locking/lockdep: Remove the cross-release locking checks locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
2017-12-15bpf: add test case for ld_abs and helper changing pkt dataDaniel Borkmann1-0/+43
Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls a helper that triggers reload of cached skb data, iv) uses LD_ABS again. It's added for test_bpf in order to do runtime testing after JITing as well as test_verifier to test that the sequence is allowed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15lib/rbtree,drm/mm: add rbtree_replace_node_cached()Chris Wilson1-0/+10
Add a variant of rbtree_replace_node() that maintains the leftmost cache of struct rbtree_root_cached when replacing nodes within the rbtree. As drm_mm is the only rb_replace_node() being used on an interval tree, the mistake looks fairly self-contained. Furthermore the only user of drm_mm_replace_node() is its testsuite... Testcase: igt/drm_mm/replace Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-12locking/lockdep: Remove the cross-release locking checksIngo Molnar1-33/+0
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. If we disable cross-release by default but keep the code upstream then in practice the most likely outcome is that we'll allow the situation to degrade gradually, by allowing entropy to introduce more and more false positives, until it overwhelms maintenance capacity. Another bad side effect was that people were trying to work around the false positives by uglifying/complicating unrelated code. There's a marked difference between annotating locking operations and uglifying good code just due to bad lock debugging code ... This gradual decrease in quality happened to a number of debugging facilities in the kernel, and lockdep is pretty complex already, so we cannot risk this outcome. Either cross-release checking can be done right with no false positives, or it should not be included in the upstream kernel. ( Note that it might make sense to maintain it out of tree and go through the false positives every now and then and see whether new bugs were introduced. ) Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-09Merge tag 'keys-fixes-20171208' of ↵James Morris2-27/+38
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into keys-for-linus Assorted fixes for keyrings, ASN.1, X.509 and PKCS#7.
2017-12-08509: fix printing uninitialized stack memory when OID is emptyEric Biggers1-2/+6
Callers of sprint_oid() do not check its return value before printing the result. In the case where the OID is zero-length, -EBADMSG was being returned without anything being written to the buffer, resulting in uninitialized stack memory being printed. Fix this by writing "(bad)" to the buffer in the cases where -EBADMSG is returned. Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08X.509: fix buffer overflow detection in sprint_oid()Eric Biggers1-4/+4
In sprint_oid(), if the input buffer were to be more than 1 byte too small for the first snprintf(), 'bufsize' would underflow, causing a buffer overflow when printing the remainder of the OID. Fortunately this cannot actually happen currently, because no users pass in a buffer that can be too small for the first snprintf(). Regardless, fix it by checking the snprintf() return value correctly. For consistency also tweak the second snprintf() check to look the same. Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings") Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08ASN.1: check for error from ASN1_OP_END__ACT actionsEric Biggers1-0/+2
asn1_ber_decoder() was ignoring errors from actions associated with the opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT, ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this meant the pkcs7_note_signed_info() action (since that was the only user of those opcodes). Fix it by checking for the error, just like the decoder does for actions associated with the other opcodes. This bug allowed users to leak slab memory by repeatedly trying to add a specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY). In theory, this bug could also be used to bypass module signature verification, by providing a PKCS#7 message that is misparsed such that a signature's ->authattrs do not contain its ->msgdigest. But it doesn't seem practical in normal cases, due to restrictions on the format of the ->authattrs. Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08ASN.1: fix out-of-bounds read when parsing indefinite length itemEric Biggers1-21/+26
In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed to the action functions before their lengths had been computed, using the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in reading data past the end of the input buffer, when given a specially crafted message. Fix it by rearranging the code so that the indefinite length is resolved before the action is called. This bug was originally found by fuzzing the X.509 parser in userspace using libFuzzer from the LLVM project. KASAN report (cleaned up slightly): BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline] BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366 Read of size 128 at addr ffff880035dd9eaf by task keyctl/195 CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xd1/0x175 lib/dump_stack.c:53 print_address_description+0x78/0x260 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x23f/0x350 mm/kasan/report.c:409 memcpy+0x1f/0x50 mm/kasan/kasan.c:302 memcpy ./include/linux/string.h:341 [inline] x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366 asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447 x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89 x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174 asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388 key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0x96 Allocated by task 195: __do_kmalloc_node mm/slab.c:3675 [inline] __kmalloc_node+0x47/0x60 mm/slab.c:3682 kvmalloc ./include/linux/mm.h:540 [inline] SYSC_add_key security/keys/keyctl.c:104 [inline] SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0x96 Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Reported-by: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-07netlink: Relax attr validation for fixed length typesDavid Ahern1-6/+16
Commit 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types") requires attributes using types NLA_U* and NLA_S* to have an exact length. This change is exposing bugs in various userspace commands that are sending attributes with an invalid length (e.g., attribute has type NLA_U8 and userspace sends NLA_U32). While the commands are clearly broken and need to be fixed, users are arguing that the sudden change in enforcement is breaking older commands on newer kernels for use cases that otherwise "worked". Relax the validation to print a warning mesage similar to what is done for messages containing extra bytes after parsing. Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types") Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of ↵Linus Torvalds6-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt: "This contains a handful of small cleanups that are a result of feedback that didn't make it into our original patch set, either because the feedback hadn't been given yet, I missed the original emails, or we weren't ready to submit the changes yet. I've been maintaining the various cleanup patch sets I have as their own branches, which I then merged together and signed. Each merge commit has a short summary of the changes, and each branch is based on your latest tag (4.15-rc1, in this case). If this isn't the right way to do this then feel free to suggest something else, but it seems sane to me. Here's a short summary of the changes, roughly in order of how interesting they are. - libgcc.h has been moved from include/lib, where it's the only member, to include/linux. This is meant to avoid tab completion conflicts. - VDSO entries for clock_get/gettimeofday/getcpu have been added. These are simple syscalls now, but we want to let glibc use them from the start so we can make them faster later. - A VDSO entry for instruction cache flushing has been added so userspace can flush the instruction cache. - The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed, as those VDSO entries don't actually exist. - __io_writes has been corrected to respect the given type. - A new READ_ONCE in arch_spin_is_locked(). - __test_and_op_bit_ord() is now actually ordered. - Various small fixes throughout the tree to enable allmodconfig to build cleanly. - Removal of some dead code in our atomic support headers. - Improvements to various comments in our atomic support headers" * tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits) RISC-V: __io_writes should respect the length argument move libgcc.h to include/linux RISC-V: Clean up an unused include RISC-V: Allow userspace to flush the instruction cache RISC-V: Flush I$ when making a dirty page executable RISC-V: Add missing include RISC-V: Use define for get_cycles like other architectures RISC-V: Provide stub of setup_profiling_timer() RISC-V: Export some expected symbols for modules RISC-V: move empty_zero_page definition to C and export it RISC-V: io.h: type fixes for warnings RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros RISC-V: use generic serial.h RISC-V: remove spin_unlock_wait() RISC-V: `sfence.vma` orderes the instruction cache RISC-V: Add READ_ONCE in arch_spin_is_locked() RISC-V: __test_and_op_bit_ord should be strongly ordered RISC-V: Remove smb_mb__{before,after}_spinlock() RISC-V: Remove __smp_bp__{before,after}_atomic RISC-V: Comment on why {,cmp}xchg is ordered how it is ...
2017-12-02move libgcc.h to include/linuxChristoph Hellwig6-6/+6
Introducing a new include/lib directory just for this file totally messes up tab completion for include/linux, which is highly annoying. Move it to include/linux where we have headers for all kinds of other lib/ code as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-29vsprintf: don't use 'restricted_pointer()' when not restrictingLinus Torvalds1-0/+2
Instead, just fall back on the new '%p' behavior which hashes the pointer. Otherwise, '%pK' - that was intended to mark a pointer as restricted - just ends up leaking pointers that a normal '%p' wouldn't leak. Which just make the whole thing pointless. I suspect we should actually get rid of '%pK' entirely, and make it just work as '%p' regardless, but this is the minimal obvious fix. People who actually use 'kptr_restrict' should weigh in on which behavior they want. Cc: Tobin Harding <me@tobin.cc> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-29vsprintf: add printk specifier %pxTobin C. Harding1-0/+18
printk specifier %p now hashes all addresses before printing. Sometimes we need to see the actual unmodified address. This can be achieved using %lx but then we face the risk that if in future we want to change the way the Kernel handles printing of pointers we will have to grep through the already existent 50 000 %lx call sites. Let's add specifier %px as a clear, opt-in, way to print a pointer and maintain some level of isolation from all the other hex integer output within the Kernel. Add printk specifier %px to print the actual unmodified address. Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-29printk: hash addresses printed with %pTobin C. Harding2-45/+144
Currently there exist approximately 14 000 places in the kernel where addresses are being printed using an unadorned %p. This potentially leaks sensitive information regarding the Kernel layout in memory. Many of these calls are stale, instead of fixing every call lets hash the address by default before printing. This will of course break some users, forcing code printing needed addresses to be updated. Code that _really_ needs the address will soon be able to use the new printk specifier %px to print the address. For what it's worth, usage of unadorned %p can be broken down as follows (thanks to Joe Perches). $ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c 1084 arch 20 block 10 crypto 32 Documentation 8121 drivers 1221 fs 143 include 101 kernel 69 lib 100 mm 1510 net 40 samples 7 scripts 11 security 166 sound 152 tools 2 virt Add function ptr_to_id() to map an address to a 32 bit unique identifier. Hash any unadorned usage of specifier %p and any malformed specifiers. Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-29vsprintf: refactor %pK code out of pointer()Tobin C. Harding1-43/+54
Currently code to handle %pK is all within the switch statement in pointer(). This is the wrong level of abstraction. Each of the other switch clauses call a helper function, pK should do the same. Refactor code out of pointer() to new function restricted_pointer(). Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: - The final conversion of timer wheel timers to timer_setup(). A few manual conversions and a large coccinelle assisted sweep and the removal of the old initialization mechanisms and the related code. - Remove the now unused VSYSCALL update code - Fix permissions of /proc/timer_list. I still need to get rid of that file completely - Rename a misnomed clocksource function and remove a stale declaration * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) m68k/macboing: Fix missed timer callback assignment treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts timer: Remove redundant __setup_timer*() macros timer: Pass function down to initialization routines timer: Remove unused data arguments from macros timer: Switch callback prototype to take struct timer_list * argument timer: Pass timer_list pointer to callbacks unconditionally Coccinelle: Remove setup_timer.cocci timer: Remove setup_*timer() interface timer: Remove init_timer() interface treewide: setup_timer() -> timer_setup() (2 field) treewide: setup_timer() -> timer_setup() treewide: init_timer() -> setup_timer() treewide: Switch DEFINE_TIMER callbacks to struct timer_list * s390: cmm: Convert timers to use timer_setup() lightnvm: Convert timers to use timer_setup() drivers/net: cris: Convert timers to use timer_setup() drm/vc4: Convert timers to use timer_setup() block/laptop_mode: Convert timers to use timer_setup() net/atm/mpc: Avoid open-coded assignment of timer callback function ...
2017-11-23Merge tag 'for-linus-20171120' of git://git.infradead.org/linux-mtdLinus Torvalds1-4/+0
Pull MTD updates from Richard Weinberger: "General changes: - Unconfuse get_unmapped_area and point/unpoint driver methods - New partition parser: sharpslpart - Kill GENERIC_IO - Various fixes NAND changes: - Add a flag to mark NANDs that require 3 address cycles to encode a page address - Set a default ECC/free layout when NAND_ECC_NONE is requested - Fix a bug in panic_nand_write() - Another batch of cleanups for the denali driver - Fix PM support in the atmel driver - Remove support for platform data in the omap driver - Fix subpage write in the omap driver - Fix irq handling in the mtk driver - Change link order of mtk_ecc and mtk_nand drivers to speed up boot time - Change log level of ECC error messages in the mxc driver - Patch the pxa3xx driver to support Armada 8k platforms - Add BAM DMA support to the qcom driver - Convert gpio-nand to the GPIO desc API - Fix ECC handling in the mt29f driver SPI-NOR changes: - Introduce system power management support - New mechanism to select the proper .quad_enable() hook by JEDEC ID, when needed, instead of only by manufacturer ID - Add support to new memory parts from Gigadevice, Winbond, Macronix and Everspin - Maintainance for Cadence, Intel, Mediatek and STM32 drivers" * tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd: (85 commits) mtd: Avoid probe failures when mtd->dbg.dfs_dir is invalid mtd: sharpslpart: Add sharpslpart partition parser mtd: Add sanity checks in mtd_write/read_oob() mtd: remove the get_unmapped_area method mtd: implement mtd_get_unmapped_area() using the point method mtd: chips/map_rom.c: implement point and unpoint methods mtd: chips/map_ram.c: implement point and unpoint methods mtd: mtdram: properly handle the phys argument in the point method mtd: mtdswap: fix spelling mistake: 'TRESHOLD' -> 'THRESHOLD' mtd: slram: use memremap() instead of ioremap() kconfig: kill off GENERIC_IO option mtd: Fix C++ comment in include/linux/mtd/mtd.h mtd: constify mtd_partition mtd: plat-ram: Replace manual resource management by devm mtd: nand: Fix writing mtdoops to nand flash. mtd: intel-spi: Add Intel Lewisburg PCH SPI super SKU PCI ID mtd: nand: mtk: fix infinite ECC decode IRQ issue mtd: spi-nor: Add support for mr25h128 mtd: nand: mtk: change the compile sequence of mtk_nand.o and mtk_ecc.o mtd: spi-nor: enable 4B opcodes for mx66l51235l ...
2017-11-22treewide: Switch DEFINE_TIMER callbacks to struct timer_list *Kees Cook1-2/+2
This changes all DEFINE_TIMER() callbacks to use a struct timer_list pointer instead of unsigned long. Since the data argument has already been removed, none of these callbacks are using their argument currently, so this renames the argument to "unused". Done using the following semantic patch: @match_define_timer@ declarer name DEFINE_TIMER; identifier _timer, _callback; @@ DEFINE_TIMER(_timer, _callback); @change_callback depends on match_define_timer@ identifier match_define_timer._callback; type _origtype; identifier _origarg; @@ void -_callback(_origtype _origarg) +_callback(struct timer_list *unused) { ... } Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-18Merge branch 'akpm' (patches from Andrew)Linus Torvalds17-173/+374
Merge more updates from Andrew Morton: - a bit more MM - procfs updates - dynamic-debug fixes - lib/ updates - checkpatch - epoll - nilfs2 - signals - rapidio - PID management cleanup and optimization - kcov updates - sysvipc updates - quite a few misc things all over the place * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits) EXPERT Kconfig menu: fix broken EXPERT menu include/asm-generic/topology.h: remove unused parent_node() macro arch/tile/include/asm/topology.h: remove unused parent_node() macro arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro arch/sh/include/asm/topology.h: remove unused parent_node() macro arch/ia64/include/asm/topology.h: remove unused parent_node() macro drivers/pcmcia/sa1111_badge4.c: avoid unused function warning mm: add infrastructure for get_user_pages_fast() benchmarking sysvipc: make get_maxid O(1) again sysvipc: properly name ipc_addid() limit parameter sysvipc: duplicate lock comments wrt ipc_addid() sysvipc: unteach ids->next_id for !CHECKPOINT_RESTORE initramfs: use time64_t timestamps drivers/watchdog: make use of devm_register_reboot_notifier() kernel/reboot.c: add devm_register_reboot_notifier() kcov: update documentation Makefile: support flag -fsanitizer-coverage=trace-cmp kcov: support comparison operands collection kcov: remove pointless current != NULL check kernel/panic.c: add TAINT_AUX ...
2017-11-18Makefile: support flag -fsanitizer-coverage=trace-cmpVictor Chibotaru1-0/+10
The flag enables Clang instrumentation of comparison operations (currently not supported by GCC). This instrumentation is needed by the new KCOV device to collect comparison operands. Link: http://lkml.kernel.org/r/20171011095459.70721-2-glider@google.com Signed-off-by: Victor Chibotaru <tchibo@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Popov <alex.popov@linux.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib: test module for find_*_bit() functionsYury Norov3-0/+154
find_bit functions are widely used in the kernel, including hot paths. This module tests performance of those functions in 2 typical scenarios: randomly filled bitmap with relatively equal distribution of set and cleared bits, and sparse bitmap which has 1 set bit for 500 cleared bits. On ThunderX machine: Start testing find_bit() with random-filled bitmap find_next_bit: 240043 cycles, 164062 iterations find_next_zero_bit: 312848 cycles, 163619 iterations find_last_bit: 193748 cycles, 164062 iterations find_first_bit: 177720874 cycles, 164062 iterations Start testing find_bit() with sparse bitmap find_next_bit: 3633 cycles, 656 iterations find_next_zero_bit: 620399 cycles, 327025 iterations find_last_bit: 3038 cycles, 656 iterations find_first_bit: 691407 cycles, 656 iterations [arnd@arndb.de: use correct format string for find-bit tests] Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Clement Courbet <courbet@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/rbtree-test: lower default paramsDavidlohr Bueso2-3/+3
Fengguang reported soft lockups while running the rbtree and interval tree test modules. The logic for these tests all occur in init phase, and we currently are pounding with the default values for number of nodes and number of iterations of each test. Reduce the latter by two orders of magnitude. This does not influence the value of the tests in that one thousand times by default is enough to get the picture. Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805 Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/nmi_backtrace.c: fix kernel text address leakLiu, Changcheng1-2/+2
Don't leak idle function address in NMI backtrace. Link: http://lkml.kernel.org/r/20171106165648.GA95243@sofia Signed-off-by: Liu Changcheng <changcheng.liu@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/genalloc.c: make the avail variable an atomic_long_tStephen Bates1-5/+5
If the amount of resources allocated to a gen_pool exceeds 2^32 then the avail atomic overflows and this causes problems when clients try and borrow resources from the pool. This is only expected to be an issue on 64 bit systems. Add the <linux/atomic.h> header to pull in atomic_long* operations. So that 32 bit systems continue to use atomic32_t but 64 bit systems can use atomic64_t. Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com Signed-off-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Daniel Mentz <danielmentz@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/int_sqrt: adjust commentsPeter Zijlstra1-2/+2
Our current int_sqrt() is not rough nor any approximation; it calculates the exact value of: floor(sqrt()). Document this. Link: http://lkml.kernel.org/r/20171020164645.001652117@infradead.org Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Anshul Garg <aksgarg1989@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Michael Davidson <md@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/int_sqrt: optimize initial value computePeter Zijlstra1-4/+2
The initial value (@m) compute is: m = 1UL << (BITS_PER_LONG - 2); while (m > x) m >>= 2; Which is a linear search for the highest even bit smaller or equal to @x We can implement this using a binary search using __fls() (or better when its hardware implemented). m = 1UL << (__fls(x) & ~1UL); Especially for small values of @x; which are the more common arguments when doing a CDF on idle times; the linear search is near to worst case, while the binary search of __fls() is a constant 6 (or 5 on 32bit) branches. cycles: branches: branch-misses: PRE: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 SOFTWARE FLS: hot: 29.576176 +- 0.028850 26.666730 +- 0.004511 0.019463 +- 0.000663 cold: 165.947136 +- 0.188406 26.666746 +- 0.004511 6.133897 +- 0.004386 HARDWARE FLS: hot: 24.720922 +- 0.025161 20.666784 +- 0.004509 0.020836 +- 0.000677 cold: 132.777197 +- 0.127471 20.666776 +- 0.004509 5.080285 +- 0.003874 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.org Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Joe Perches <joe@perches.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Anshul Garg <aksgarg1989@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Michael Davidson <md@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/int_sqrt: optimize small argumentPeter Zijlstra1-0/+3
The current int_sqrt() computation is sub-optimal for the case of small @x. Which is the interesting case when we're going to do cumulative distribution functions on idle times, which we assume to be a random variable, where the target residency of the deepest idle state gives an upper bound on the variable (5e6ns on recent Intel chips). In the case of small @x, the compute loop: while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } can be reduced to: while (m > x) m >>= 2; Because y==0, b==m and until x>=m y will remain 0. And while this is computationally equivalent, it runs much faster because there's less code, in particular less branches. cycles: branches: branch-misses: OLD: hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593 cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305 PRE: hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113 cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568 POST: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Anshul Garg <aksgarg1989@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Davidson <md@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/test: delete five error messages for failed memory allocationsMarkus Elfring3-15/+7
Omit extra messages for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Link: http://lkml.kernel.org/r/410a4c5a-4ee0-6fcc-969c-103d8e496b78@users.sourceforge.net Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib: add module support to string testsGeert Uytterhoeven4-142/+143
Extract the string test code into its own source file, to allow compiling it either to a loadable module, or built into the kernel. Fixes: 03270c13c5ffaa6a ("lib/string.c: add testcases for memset16/32/64") Link: http://lkml.kernel.org/r/1505397744-3387-1-git-send-email-geert@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18dynamic-debug-howto: fix optional/omitted ending line number to be LARGE ↵Randy Dunlap1-0/+4
instead of 0 line-range is supposed to treat "1-" as "1-endoffile", so handle the special case by setting last_lineno to UINT_MAX. Fixes this error: dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1 dynamic_debug:ddebug_exec_query: query parse failed Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18bug: define the "cut here" string in a single placeKees Cook1-1/+1
The "cut here" string is used in a few paths. Define it in a single place. Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18kernel debug: support resetting WARN_ONCE for all architecturesAndi Kleen1-0/+23
Some architectures store the WARN_ONCE state in the flags field of the bug_entry. Clear that one too when resetting once state through /sys/kernel/debug/clear_warn_once Pointed out by Michael Ellerman Improves the earlier patch that add clear_warn_once. [ak@linux.intel.com: add a missing ifdef CONFIG_MODULES] Link: http://lkml.kernel.org/r/20171020170633.9593-1-andi@firstfloor.org [akpm@linux-foundation.org: fix unused var warning] [akpm@linux-foundation.org: Use 0200 for clear_warn_once file, per mpe] [akpm@linux-foundation.org: clear BUGFLAG_DONE in clear_once_table(), per mpe] Link: http://lkml.kernel.org/r/20171019204642.7404-1-andi@firstfloor.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-18lib/dma-debug.c: fix incorrect pfn calculationMiles Chen1-2/+18
dma-debug reports the following warning: WARNING: CPU: 3 PID: 298 at kernel-4.4/lib/dma-debug.c:604 debug _dma_assert_idle+0x1a8/0x230() DMA-API: cpu touching an active dma mapped cacheline [cln=0x00000882300] CPU: 3 PID: 298 Comm: vold Tainted: G W O 4.4.22+ #1 Hardware name: MT6739 (DT) Call trace: debug_dma_assert_idle+0x1a8/0x230 wp_page_copy.isra.96+0x118/0x520 do_wp_page+0x4fc/0x534 handle_mm_fault+0xd4c/0x1310 do_page_fault+0x1c8/0x394 do_mem_abort+0x50/0xec I found that debug_dma_alloc_coherent() and debug_dma_free_coherent() assume that dma_alloc_coherent() always returns a linear address. However it's possible that dma_alloc_coherent() returns a non-linear address. In this case, page_to_pfn(virt_to_page(virt)) will return an incorrect pfn. If the pfn is valid and mapped as a COW page, we will hit the warning when doing wp_page_copy(). Fix this by calculating pfn for linear and non-linear addresses. [miles.chen@mediatek.com: v4] Link: http://lkml.kernel.org/r/1510872972-23919-1-git-send-email-miles.chen@mediatek.com Link: http://lkml.kernel.org/r/1506484087-1177-1-git-send-email-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17Merge branch 'work.iov_iter' of ↵Linus Torvalds1-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: - bio_{map,copy}_user_iov() series; those are cleanups - fixes from the same pile went into mainline (and stable) in late September. - fs/iomap.c iov_iter-related fixes - new primitive - iov_iter_for_each_range(), which applies a function to kernel-mapped segments of an iov_iter. Usable for kvec and bvec ones, the latter does kmap()/kunmap() around the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the latter is not hard to fix if the need ever appears, the former is by design. Another related primitive will have to wait for the next cycle - it passes page + offset + size instead of pointer + size, and that one will be usable for everything _except_ kvec. Unfortunately, that one didn't get exposure in -next yet, so... - a bit more lustre iov_iter work, including a use case for iov_iter_for_each_range() (checksum calculation) - vhost/scsi leak fix in failure exit - misc cleanups and detritectomy... * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits) iomap_dio_actor(): fix iov_iter bugs switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range() lustre: switch struct ksock_conn to iov_iter vhost/scsi: switch to iov_iter_get_pages() fix a page leak in vhost_scsi_iov_to_sgl() error recovery new primitive: iov_iter_for_each_range() lnet_return_rx_credits_locked: don't abuse list_entry xen: don't open-code iov_iter_kvec() orangefs: remove detritus from struct orangefs_kiocb_s kill iov_shorten() bio_alloc_map_data(): do bmd->iter setup right there bio_copy_user_iov(): saner bio size calculation bio_map_user_iov(): get rid of copying iov_iter bio_copy_from_iter(): get rid of copying iov_iter move more stuff down into bio_copy_user_iov() blk_rq_map_user_iov(): move iov_iter_advance() down bio_map_user_iov(): get rid of the iov_for_each() bio_map_user_iov(): move alignment check into the main loop don't rely upon subsequent bio_add_pc_page() calls failing ... and with iov_iter_get_pages_alloc() it becomes even simpler ...
2017-11-16Merge tag 'driver-core-4.15-rc1' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core / debugfs patches for 4.15-rc1. Not many here, mostly all are debugfs fixes to resolve some long-reported problems with files going away with references to them in userspace. There's also some SPDX cleanups for the debugfs code, as well as a few other minor driver core changes for issues reported by people. All of these have been in linux-next for a week or more with no reported issues" * tag 'driver-core-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: Fix device link deferred probe debugfs: Remove redundant license text debugfs: add SPDX identifiers to all debugfs files debugfs: defer debugfs_fsdata allocation to first usage debugfs: call debugfs_real_fops() only after debugfs_file_get() debugfs: purge obsolete SRCU based removal protection IB/hfi1: convert to debugfs_file_get() and -put() debugfs: convert to debugfs_file_get() and -put() debugfs: debugfs_real_fops(): drop __must_hold sparse annotation debugfs: implement per-file removal protection debugfs: add support for more elaborate ->d_fsdata driver core: Move device_links_purge() after bus_remove_device() arch_topology: Fix section miss match warning due to free_raw_capacity() driver-core: pr_err() strings should end with newlines
2017-11-16Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-27/+68
Pull drm updates from Dave Airlie: "This is the main drm pull request for v4.15. Core: - Atomic object lifetime fixes - Atomic iterator improvements - Sparse/smatch fixes - Legacy kms ioctls to be interruptible - EDID override improvements - fb/gem helper cleanups - Simple outreachy patches - Documentation improvements - Fix dma-buf rcu races - DRM mode object leasing for improving VR use cases. - vgaarb improvements for non-x86 platforms. New driver: - tve200: Faraday Technology TVE200 block. This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in the StorLink SL3516 (later Cortina Systems CS3516) as well as the Grain Media GM8180. New bridges: - SiI9234 support New panels: - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba LT089AC19000, Innolux AT043TN24 i915: - Remove Coffeelake from alpha support - Cannonlake workarounds - Infoframe refactoring for DisplayPort - VBT updates - DisplayPort vswing/emph/buffer translation refactoring - CCS fixes - Restore GPU clock boost on missed vblanks - Scatter list updates for userptr allocations - Gen9+ transition watermarks - Display IPC (Isochronous Priority Control) - Private PAT management - GVT: improved error handling and pci config sanitizing - Execlist refactoring - Transparent Huge Page support - User defined priorities support - HuC/GuC firmware refactoring - DP MST fixes - eDP power sequencing fixes - Use RCU instead of stop_machine - PSR state tracking support - Eviction fixes - BDW DP aux channel timeout fixes - LSPCON fixes - Cannonlake PLL fixes amdgpu: - Per VM BO support - Powerplay cleanups - CI powerplay support - PASID mgr for kfd - SR-IOV fixes - initial GPU reset for vega10 - Prime mmap support - TTM updates - Clock query interface for Raven - Fence to handle ioctl - UVD encode ring support on Polaris - Transparent huge page DMA support - Compute LRU pipe tweaks - BO flag to allow buffers to opt out of implicit sync - CTX priority setting API - VRAM lost infrastructure plumbing qxl: - fix flicker since atomic rework amdkfd: - Further improvements from internal AMD tree - Usermode events - Drop radeon support nouveau: - Pascal temperature sensor support - Improved BAR2 handling - MMU rework to support Pascal MMU exynos: - Improved HDMI/mixer support - HDMI audio interface support tegra: - Prep work for tegra186 - Cleanup/fixes msm: - Preemption support for a5xx - Display fixes for 8x96 (snapdragon 820) - Async cursor plane fixes - FW loading rework - GPU debugging improvements vc4: - Prep for DSI panels - fix T-format tiling scanout - New madvise ioctl Rockchip: - LVDS support omapdrm: - omap4 HDMI CEC support etnaviv: - GPU performance counters groundwork sun4i: - refactor driver load + TCON backend - HDMI improvements - A31 support - Misc fixes udl: - Probe/EDID read fixes. tilcdc: - Misc fixes. pl111: - Support more variants adv7511: - Improve EDID handling. - HDMI CEC support sii8620: - Add remote control support" * tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits) drm/rockchip: analogix_dp: Use mutex rather than spinlock drm/mode_object: fix documentation for object lookups. drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU drm/i915: Move init_clock_gating() back to where it was drm/i915: Prune the reservation shared fence array drm/i915: Idle the GPU before shinking everything drm/i915: Lock llist_del_first() vs llist_del_all() drm/i915: Calculate ironlake intermediate watermarks correctly, v2. drm/i915: Disable lazy PPGTT page table optimization for vGPU drm/i915/execlists: Remove the priority "optimisation" drm/i915: Filter out spurious execlists context-switch interrupts drm/amdgpu: use irq-safe lock for kiq->ring_lock drm/amdgpu: bypass lru touch for KIQ ring submission drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() drm/amd/powerplay: initialize a variable before using it drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug drm/rockchip: add CONFIG_OF dependency for lvds ...
2017-11-16Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-116/+16
Merge updates from Andrew Morton: - a few misc bits - ocfs2 updates - almost all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits) memory hotplug: fix comments when adding section mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP mm: simplify nodemask printing mm,oom_reaper: remove pointless kthread_run() error check mm/page_ext.c: check if page_ext is not prepared writeback: remove unused function parameter mm: do not rely on preempt_count in print_vma_addr mm, sparse: do not swamp log with huge vmemmap allocation failures mm/hmm: remove redundant variable align_end mm/list_lru.c: mark expected switch fall-through mm/shmem.c: mark expected switch fall-through mm/page_alloc.c: broken deferred calculation mm: don't warn about allocations which stall for too long fs: fuse: account fuse_inode slab memory as reclaimable mm, page_alloc: fix potential false positive in __zone_watermark_ok mm: mlock: remove lru_add_drain_all() mm, sysctl: make NUMA stats configurable shmem: convert shmem_init_inodecache() to void Unify migrate_pages and move_pages access checks mm, pagevec: rename pagevec drained field ...
2017-11-16mm, truncate: do not check mapping for every page being truncatedMel Gorman2-18/+14
During truncation, the mapping has already been checked for shmem and dax so it's known that workingset_update_node is required. This patch avoids the checks on mapping for each page being truncated. In all other cases, a lookup helper is used to determine if workingset_update_node() needs to be called. The one danger is that the API is slightly harder to use as calling workingset_update_node directly without checking for dax or shmem mappings could lead to surprises. However, the API rarely needs to be used and hopefully the comment is enough to give people the hint. sparsetruncate (tiny) 4.14.0-rc4 4.14.0-rc4 oneirq-v1r1 pickhelper-v1r1 Min Time 141.00 ( 0.00%) 140.00 ( 0.71%) 1st-qrtle Time 142.00 ( 0.00%) 141.00 ( 0.70%) 2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%) 3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%) Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%) Max-95% Time 147.00 ( 0.00%) 145.00 ( 1.36%) Max-99% Time 195.00 ( 0.00%) 191.00 ( 2.05%) Max Time 230.00 ( 0.00%) 205.00 ( 10.87%) Amean Time 144.37 ( 0.00%) 143.82 ( 0.38%) Stddev Time 10.44 ( 0.00%) 9.00 ( 13.74%) Coeff Time 7.23 ( 0.00%) 6.26 ( 13.41%) Best99%Amean Time 143.72 ( 0.00%) 143.34 ( 0.26%) Best95%Amean Time 142.37 ( 0.00%) 142.00 ( 0.26%) Best90%Amean Time 142.19 ( 0.00%) 141.85 ( 0.24%) Best75%Amean Time 141.92 ( 0.00%) 141.58 ( 0.24%) Best50%Amean Time 141.69 ( 0.00%) 141.31 ( 0.27%) Best25%Amean Time 141.38 ( 0.00%) 140.97 ( 0.29%) As you'd expect, the gain is marginal but it can be detected. The differences in bonnie are all within the noise which is not surprising given the impact on the microbenchmark. radix_tree_update_node_t is a callback for some radix operations that optionally passes in a private field. The only user of the callback is workingset_update_node and as it no longer requires a mapping, the private field is removed. Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-16kmemcheck: rip it outLevin, Alexander (Sasha Levin)2-98/+2
Fix up makefiles, remove references, and git rm kmemcheck. Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexander Potapenko <glider@google.com> Cc: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds5-75/+342
Pull networking updates from David Miller: "Highlights: 1) Maintain the TCP retransmit queue using an rbtree, with 1GB windows at 100Gb this really has become necessary. From Eric Dumazet. 2) Multi-program support for cgroup+bpf, from Alexei Starovoitov. 3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew Lunn. 4) Add meter action support to openvswitch, from Andy Zhou. 5) Add a data meta pointer for BPF accessible packets, from Daniel Borkmann. 6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet. 7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli. 8) More work to move the RTNL mutex down, from Florian Westphal. 9) Add 'bpftool' utility, to help with bpf program introspection. From Jakub Kicinski. 10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper Dangaard Brouer. 11) Support 'blocks' of transformations in the packet scheduler which can span multiple network devices, from Jiri Pirko. 12) TC flower offload support in cxgb4, from Kumar Sanghvi. 13) Priority based stream scheduler for SCTP, from Marcelo Ricardo Leitner. 14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg. 15) Add RED qdisc offloadability, and use it in mlxsw driver. From Nogah Frankel. 16) eBPF based device controller for cgroup v2, from Roman Gushchin. 17) Add some fundamental tracepoints for TCP, from Song Liu. 18) Remove garbage collection from ipv6 route layer, this is a significant accomplishment. From Wei Wang. 19) Add multicast route offload support to mlxsw, from Yotam Gigi" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits) tcp: highest_sack fix geneve: fix fill_info when link down bpf: fix lockdep splat net: cdc_ncm: GetNtbFormat endian fix openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start netem: remove unnecessary 64 bit modulus netem: use 64 bit divide by rate tcp: Namespace-ify sysctl_tcp_default_congestion_control net: Protect iterations over net::fib_notifier_ops in fib_seq_sum() ipv6: set all.accept_dad to 0 by default uapi: fix linux/tls.h userspace compilation error usbnet: ipheth: prevent TX queue timeouts when device not ready vhost_net: conditionally enable tx polling uapi: fix linux/rxrpc.h userspace compilation errors net: stmmac: fix LPI transitioning for dwmac4 atm: horizon: Fix irq release error net-sysfs: trigger netlink notification on ifalias change via sysfs openvswitch: Using kfree_rcu() to simplify the code openvswitch: Make local function ovs_nsh_key_attr_size() static openvswitch: Fix return value check in ovs_meter_cmd_features() ...