Age | Commit message (Collapse) | Author | Files | Lines |
|
Minor conflict with the DSA legacy code removal.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2019-05-06
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Two x32 JIT fixes: one which has buggy signed comparisons in 64
bit conditional jumps and another one for 64 bit negation, both
from Wang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There was NVMEM support added to of_get_mac_address, so it could now return
ERR_PTR encoded error values, so we need to adjust all current users of
of_get_mac_address to this new fact.
Signed-off-by: Petr Štetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace skb->xmit_more usage by netdev_xmit_more().
Fixes: 4f296edeb9d4 ("drivers: net: aurora: use netdev_xmit_more helper")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Three trivial overlapping conflicts.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing.
2) Missing length check for esp4 UDP encap, from Sabrina Dubroca.
3) Fix byte order of RX STBC access in mac80211, from Johannes Berg.
4) Inifnite loop in bpftool map create, from Alban Crequy.
5) Register mark fix in ebpf verifier after pkt/null checks, from Paul
Chaignon.
6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric
Dumazet.
7) Buffer overrun in marvell phy driver, from Andrew Lunn.
8) Several crash and statistics handling fixes to bnxt_en driver, from
Michael Chan and Vasundhara Volam.
9) Several fixes to the TLS layer from Jakub Kicinski (copying negative
amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk
NULL deref, etc).
10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet.
11) PID/UID checks on ipv6 flow labels are inverted, from Willem de
Bruijn.
12) Use after free in l2tp, from Eric Dumazet.
13) IPV6 route destroy races, also from Eric Dumazet.
14) SCTP state machine can erroneously run recursively, fix from Xin
Long.
15) Adjust AF_PACKET msg_name length checks, add padding bytes if
necessary. From Willem de Bruijn.
16) Preserve skb_iif, so that forwarded packets have consistent values
even if fragmentation is involved. From Shmulik Ladkani.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
udp: fix GRO packet of death
ipv6: A few fixes on dereferencing rt->from
rds: ib: force endiannes annotation
selftests: fib_rule_tests: print the result and return 1 if any tests failed
ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
net/tls: avoid NULL pointer deref on nskb->sk in fallback
selftests: fib_rule_tests: Fix icmp proto with ipv6
packet: validate msg_namelen in send directly
packet: in recvmsg msg_name return at least sizeof sockaddr_ll
sctp: avoid running the sctp state machine recursively
stmmac: pci: Fix typo in IOT2000 comment
Documentation: fix netdev-FAQ.rst markup warning
ipv6: fix races in ip6_dst_destroy()
l2ip: fix possible use-after-free
appletalk: Set error code if register_snap_client failed
net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
rxrpc: Fix net namespace cleanup
ipv6/flowlabel: wait rcu grace period before put_pid()
vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
tcp: add sanity tests in tcp_add_backlog()
...
|
|
The current implementation has two errors:
1: The second xor instruction will clear carry flag which
is necessary for following sbb instruction.
2: The select coding for sbb instruction is wrong, the coding
is "sbb dreg_hi,ecx", but what we need is "sbb ecx,dreg_hi".
This patch rewrites the implementation and fixes the errors.
This patch fixes below errors reported by bpf/test_verifier in x32
platform when the jit is enabled:
"
0: (b4) w1 = 4
1: (b4) w2 = 4
2: (1f) r2 -= r1
3: (4f) r2 |= r1
4: (87) r2 = -r2
5: (c7) r2 s>>= 63
6: (5f) r1 &= r2
7: (bf) r0 = r1
8: (95) exit
processed 9 insns (limit 131072), stack depth 0
0: (b4) w1 = 4
1: (b4) w2 = 4
2: (1f) r2 -= r1
3: (4f) r2 |= r1
4: (87) r2 = -r2
5: (c7) r2 s>>= 63
6: (5f) r1 &= r2
7: (bf) r0 = r1
8: (95) exit
processed 9 insns (limit 131072), stack depth 0
......
Summary: 1189 PASSED, 125 SKIPPED, 15 FAILED
"
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The current method to compare 64-bit numbers for conditional jump is:
1) Compare the high 32-bit first.
2) If the high 32-bit isn't the same, then goto step 4.
3) Compare the low 32-bit.
4) Check the desired condition.
This method is right for unsigned comparison, but it is buggy for signed
comparison, because it does signed comparison for low 32-bit too.
There is only one sign bit in 64-bit number, that is the MSB in the 64-bit
number, it is wrong to treat low 32-bit as signed number and do the signed
comparison for it.
This patch fixes the bug and adds a testcase in selftests/bpf for such bug.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"A few minor fixes for ARC.
- regression in memset if line size !64
- avoid panic if PAE and IOC"
* tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: memset: fix build with L1_CACHE_SHIFT != 6
ARC: [hsdk] Make it easier to add PAE40 region to DTB
ARC: PAE40: don't panic and instead turn off hw ioc
|
|
The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.
But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like
warning: array subscript 1 is outside array bounds of ‘u8[1]’
when we then access more than one byte from those variables.
Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.
Signed-off-by: Linus Torvalds <torvalds@-linux-foundation.org>
|
|
This doesn't really do anything, but at least we now parse teh
ZERO_PAGE() address argument so that we'll catch the most obvious errors
in usage next time they'll happen.
See commit 6a5c5d26c4c6 ("rdma: fix build errors on s390 and MIPS due to
bad ZERO_PAGE use") what happens when we don't have any use of the macro
argument at all.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull ARM fixes from Russell King:
"A small number of ARM fixes
- Fix function tracer and unwinder dependencies so that we don't end
up building kernels that will crash
- Fix ARMv7M nommu initialisation (missing register initialisation)
- Fix EFI decompressor entry (ensuring barrier instructions are
enabled prior to use)"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled
ARM: fix function graph tracer and unwinder dependencies
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A one-liner to make our Radix MMU support depend on HUGETLB_PAGE. We
use some of the hugetlb inlines (eg. pud_huge()) when operating on the
linear mapping and if they're compiled into empty wrappers we can
corrupt memory.
Then two fixes to our VFIO IOMMU code. The first is not a regression
but fixes the locking to avoid a user-triggerable deadlock.
The second does fix a regression since rc1, and depends on the first
fix. It makes it possible to run guests with large amounts of memory
again (~256GB).
Thanks to Alexey Kardashevskiy"
* tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm_iommu: Allow pinning large regions
powerpc/mm_iommu: Fix potential deadlock
powerpc/mm/radix: Make Radix require HUGETLB_PAGE
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-04-28
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Introduce BPF socket local storage map so that BPF programs can store
private data they associate with a socket (instead of e.g. separate hash
table), from Martin.
2) Add support for bpftool to dump BTF types. This is done through a new
`bpftool btf dump` sub-command, from Andrii.
3) Enable BPF-based flow dissector for skb-less eth_get_headlen() calls which
was currently not supported since skb was used to lookup netns, from Stanislav.
4) Add an opt-in interface for tracepoints to expose a writable context
for attached BPF programs, used here for NBD sockets, from Matt.
5) BPF xadd related arm64 JIT fixes and scalability improvements, from Daniel.
6) Change the skb->protocol for bpf_skb_adjust_room() helper in order to
support tunnels such as sit. Add selftests as well, from Willem.
7) Various smaller misc fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix an early boot crash in the RSDP parsing code by effectively
turning off the parsing call - we ran out of time but want to fix the
regression. The more involved fix is being worked on.
- Fix a crash that can trigger in the kmemlek code.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix a crash with kmemleak_scan()
x86/boot: Disable RSDP parsing temporarily
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A cstate event enumeration fix for Kaby/Coffee Lake CPUs"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters
|
|
Since ARMv8.1 supplement introduced LSE atomic instructions back in 2016,
lets add support for STADD and use that in favor of LDXR / STXR loop for
the XADD mapping if available. STADD is encoded as an alias for LDADD with
XZR as the destination register, therefore add LDADD to the instruction
encoder along with STADD as special case and use it in the JIT for CPUs
that advertise LSE atomics in CPUID register. If immediate offset in the
BPF XADD insn is 0, then use dst register directly instead of temporary
one.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Prefetch-with-intent-to-write is currently part of the XADD mapping in
the AArch64 JIT and follows the kernel's implementation of atomic_add.
This may interfere with other threads executing the LDXR/STXR loop,
leading to potential starvation and fairness issues. Drop the optional
prefetch instruction.
Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- keep the tail of an unaligned initrd reserved
- adjust ftrace_make_call() to deal with the relative nature of PLTs
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/module: ftrace: deal with place relative nature of PLTs
arm64: mm: Ensure tail of unaligned initrd is reserved
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2019-04-25
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) the bpf verifier fix to properly mark registers in all stack frames, from Paul.
2) preempt_enable_no_resched->preempt_enable fix, from Peter.
3) other misc fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Two easy cases of overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix sparse warning:
arch/mips/net/ebpf_jit.c:196:5: warning:
symbol 'ebpf_to_mips_reg' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
PC8/PC9/PC10 counters
Kaby Lake (and Coffee Lake) has PC8/PC9/PC10 residency counters.
This patch updates the list of Kaby/Coffee Lake PMU event counters
from the snb_cstates[] list of events to the hswult_cstates[]
list of events, which keeps all previously supported events and
also adds the PKG_C8, PKG_C9 and PKG_C10 residency counters.
This allows user space tools to profile them through the perf interface.
Signed-off-by: Harry Pan <harry.pan@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: gs0622@gmail.com
Link: http://lkml.kernel.org/r/20190424145033.1924-1-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The first kmemleak_scan() call after boot would trigger the crash below
because this callpath:
kernel_init
free_initmem
mem_encrypt_free_decrypted_mem
free_init_pages
unmaps memory inside the .bss when DEBUG_PAGEALLOC=y.
kmemleak_init() will register the .data/.bss sections and then
kmemleak_scan() will scan those addresses and dereference them looking
for pointer references. If free_init_pages() frees and unmaps pages in
those sections, kmemleak_scan() will crash if referencing one of those
addresses:
BUG: unable to handle kernel paging request at ffffffffbd402000
CPU: 12 PID: 325 Comm: kmemleak Not tainted 5.1.0-rc4+ #4
RIP: 0010:scan_block
Call Trace:
scan_gray_list
kmemleak_scan
kmemleak_scan_thread
kthread
ret_from_fork
Since kmemleak_free_part() is tolerant to unknown objects (not tracked
by kmemleak), it is fine to call it from free_init_pages() even if not
all address ranges passed to this function are known to kmemleak.
[ bp: Massage. ]
Fixes: b3f0907c71e0 ("x86/mm: Add .bss..decrypted section to hold shared variables")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190423165811.36699-1-cai@lca.pw
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull syscall numbering updates from Arnd Bergmann:
"arch: add pidfd and io_uring syscalls everywhere
This comes a bit late, but should be in 5.1 anyway: we want the newly
added system calls to be synchronized across all architectures in the
release.
I hope that in the future, any newly added system calls can be added
to all architectures at the same time, and tested there while they are
in linux-next, avoiding dependencies between the architecture
maintainer trees and the tree that contains the new system call"
* tag 'syscalls-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
arch: add pidfd and io_uring syscalls everywhere
|
|
The EFI stub is entered with the caches and MMU enabled by the
firmware, and once the stub is ready to hand over to the decompressor,
we clean and disable the caches.
The cache clean routines use CP15 barrier instructions, which can be
disabled via SCTLR. Normally, when using the provided cache handling
routines to enable the caches and MMU, this bit is enabled as well.
However, but since we entered the stub with the caches already enabled,
this routine is not executed before we call the cache clean routines,
resulting in undefined instruction exceptions if the firmware never
enabled this bit.
So set the bit explicitly in the EFI entry code, but do so in a way that
guarantees that the resulting code can still run on v6 cores as well
(which are guaranteed to have CP15 barriers enabled)
Cc: <stable@vger.kernel.org> # v4.9+
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
When CONFIG_ARM_MPU is not defined, the base address of v7M SCB register
is not initialized with correct value. This prevents enabling I/D caches
when the L1 cache poilcy is applied in kernel.
Fixes: 3c24121039c9da14692eb48f6e39565b28c0f3cf ("ARM: 8756/1: NOMMU: Postpone MPU activation till __after_proc_init")
Signed-off-by: Tigran Tadevosyan <tigran.tadevosyan@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Naresh Kamboju recently reported that the function-graph tracer crashes
on ARM. The function-graph tracer assumes that the kernel is built with
frame pointers.
We explicitly disabled the function-graph tracer when building Thumb2,
since the Thumb2 ABI doesn't have frame pointers.
We recently changed the way the unwinder method was selected, which
seems to have made it more likely that we can end up with the function-
graph tracer enabled but without the kernel built with frame pointers.
Fix up the function graph tracer dependencies so the option is not
available when we have no possibility of having frame pointers, and
adjust the dependencies on the unwinder option to hide the non-frame
pointer unwinder options if the function-graph tracer is enabled.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Another bodge for the ftrace PLT code: plt_entries_equal() now takes
the place relative nature of the ADRP/ADD based PLT entries into
account, which means that a struct trampoline instance on the stack
is no longer equal to the same set of opcodes in the module struct,
given that they don't point to the same place in memory anymore.
Work around this by using memcmp() in the ftrace PLT handling code.
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In the event that the start address of the initrd is not aligned, but
has an aligned size, the base + size will not cover the entire initrd
image and there is a chance that the kernel will corrupt the tail of the
image.
By aligning the end of the initrd to a page boundary and then
subtracting the adjusted start address the memblock reservation will
cover all pages that contains the initrd.
Fixes: c756c592e442 ("arm64: Utilize phys_initrd_start/phys_initrd_size")
Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A couple more MIPS fixes:
- Fix indirect syscall tracing & seccomp filtering for big endian
MIPS64 kernels, which previously loaded the syscall number
incorrectly & would always use zero.
- Fix performance counter IRQ setup for Atheros/ath79 SoCs, allowing
perf to function on those systems.
And not really a fix, but a useful addition:
- Add a Broadcom mailing list to the MAINTAINERS entry for BMIPS
systems to allow relevant engineers to track patch submissions"
* tag 'mips_fixes_5.1_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: perf: ath79: Fix perfcount IRQ assignment
MIPS: scall64-o32: Fix indirect syscall number load
MAINTAINERS: BMIPS: Add internal Broadcom mailing list
|
|
The original intention to move RDSP parsing very early, before KASLR
does its ranges selection, was to accommodate movable memory regions
machines (CONFIG_MEMORY_HOTREMOVE) to still be able to do memory
hotplug.
However, that broke kexec'ing a kernel on EFI machines because depending
on where the EFI systab was mapped, on at least one machine it isn't
present in the kexec mapping of the second kernel, leading to a triple
fault in the early code.
Fixing this properly requires significantly involved surgery and we
cannot allow ourselves to do that, that close to the merge window.
So disable the RSDP parsing code temporarily until it is fixed properly
in the next release cycle.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chao Fan <fanc.fnst@cn.fujitsu.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: indou.takao@jp.fujitsu.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: kasong@redhat.com
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: msys.mizuma@gmail.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190419141952.GE10324@zn.tnic
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc fixes:
- various tooling fixes
- kretprobe fixes
- kprobes annotation fixes
- kprobes error checking fix
- fix the default events for AMD Family 17h CPUs
- PEBS fix
- AUX record fix
- address filtering fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Avoid kretprobe recursion bug
kprobes: Mark ftrace mcount handler functions nokprobe
x86/kprobes: Verify stack frame on kretprobe
perf/x86/amd: Add event map for AMD Family 17h
perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf()
perf tools: Fix map reference counting
perf evlist: Fix side band thread draining
perf tools: Check maps for bpf programs
perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info()
tools include uapi: Sync sound/asound.h copy
perf top: Always sample time to satisfy needs of use of ordered queuing
perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)
tools lib traceevent: Fix missing equality check for strcmp
perf stat: Disable DIR_FORMAT feature for 'perf stat record'
perf scripts python: export-to-sqlite.py: Fix use of parent_id in calls_view
perf header: Fix lock/unlock imbalances when processing BPF/BTF info
perf/x86: Fix incorrect PEBS_REGS
perf/ring_buffer: Fix AUX record suppression
perf/core: Fix the address filtering fix
kprobes: Fix error check when reusing optimized probes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes all over the place: a console spam fix, section attributes
fixes, a KASLR fix, a TLB stack-variable alignment fix, a reboot
quirk, boot options related warnings fix, an LTO fix, a deadlock fix
and an RDT fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
x86/cpu/bugs: Use __initconst for 'const' init data
x86/mm/KASLR: Fix the size of the direct mapping section
x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness"
x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info"
x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T
x86/mm: Prevent bogus warnings with "noexec=off"
x86/build/lto: Fix truncated .bss with -fdata-sections
x86/speculation: Prevent deadlock on ssb_state::lock
x86/resctrl: Do not repeat rdtgroup mode initialization
|
|
The 'timeval' and 'timespec' data structures used for socket timestamps
are going to be redefined in user space based on 64-bit time_t in future
versions of the C library to deal with the y2038 overflow problem,
which breaks the ABI definition.
Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
use the _IOR() macro to encode the size of the transferred data, so it
remains ambiguous whether the application uses the old or new layout.
The best workaround I could find is rather ugly: we redefine the command
code based on the size of the respective data structure with a ternary
operator. This lets it get evaluated as late as possible, hopefully after
that structure is visible to the caller. We cannot use an #ifdef here,
because inux/sockios.h might have been included before any libc header
that could determine the size of time_t.
The ioctl implementation now interprets the new command codes as always
referring to the 64-bit structure on all architectures, while the old
architecture specific command code still refers to the old architecture
specific layout. The new command number is only used when they are
actually different.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ia64, parisc and sparc just use a copy of the generic version
of asm/sockios.h, and x86 is a redirect to the same file, so we
can just let the header file be generated.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
priority
The "ENERGY_PERF_BIAS: Set to 'normal', was 'performance'" message triggers
on pretty much every Intel machine. The purpose of log messages with
a warning level is to notify the user of something which potentially is
a problem, or at least somewhat unexpected.
This message clearly does not match those criteria, so lower its log
priority from warning to info.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181230172715.17469-1-hdegoede@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Some of the recently added const tables use __initdata which causes section
attribute conflicts.
Use __initconst instead.
Fixes: fa1202ef2243 ("x86/speculation: Add command line control")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190330004743.29541-9-andi@firstfloor.org
|
|
Avoid kretprobe recursion loop bg by setting a dummy
kprobes to current_kprobe per-CPU variable.
This bug has been introduced with the asm-coded trampoline
code, since previously it used another kprobe for hooking
the function return placeholder (which only has a nop) and
trampoline handler was called from that kprobe.
This revives the old lost kprobe again.
With this fix, we don't see deadlock anymore.
And you can see that all inner-called kretprobe are skipped.
event_1 235 0
event_2 19375 19612
The 1st column is recorded count and the 2nd is missed count.
Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed)
(some difference are here because the counter is racy)
Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster")
Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Verify the stack frame pointer on kretprobe trampoline handler,
If the stack frame pointer does not match, it skips the wrong
entry and tries to find correct one.
This can happen if user puts the kretprobe on the function
which can be used in the path of ftrace user-function call.
Such functions should not be probed, so this adds a warning
message that reports which function should be blacklisted.
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Avoid compiler uninitialised warning introduced by recent arm64 futex
fix"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: futex: Restore oldval initialization to work around buggy compilers
|
|
Commit 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with
non-zero result value") removed oldval's zero initialization in
arch_futex_atomic_op_inuser because it is not necessary. Unfortunately,
Android's arm64 GCC 4.9.4 [1] does not agree:
../kernel/futex.c: In function 'do_futex':
../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized
in this function [-Wmaybe-uninitialized]
return oldval == cmparg;
^
In file included from ../kernel/futex.c:73:0:
../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here
int oldval, ret, tmp;
^
GCC fails to follow that when ret is non-zero, futex_atomic_op_inuser
returns right away, avoiding the uninitialized use that it claims.
Restoring the zero initialization works around this issue.
[1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/
Cc: stable@vger.kernel.org
Fixes: 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 bug fixes from Martin Schwidefsky:
- Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED
- Fix integer overflow in the dasd driver resulting in incorrect number
of blocks for large devices
- Fix a lockdep false positive in the 3270 driver
- Fix a deadlock in the zcrypt driver
- Fix incorrect debug feature entries in the pkey api
- Fix inline assembly constraints fallout with CONFIG_KASAN=y
* tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: correct some inline assembly constraints
s390/pkey: add one more argument space for debug feature entry
s390/zcrypt: fix possible deadlock situation on ap queue remove
s390/3270: fix lockdep false positive on view->lock
s390/dasd: Fix capacity calculation for large volumes
s390/mem_detect: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a bug in the implementation of the x86 accelerated version of
poly1305"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/poly1305 - fix overflow during partial reduction
|
|
Family 17h differs from prior families by:
- Does not support an L2 cache miss event
- It has re-enumerated PMC counters for:
- L2 cache references
- front & back end stalled cycles
So we add a new amd_f17h_perfmon_event_map[] so that the generic
perf event names will resolve to the correct h/w events on
family 17h and above processors.
Reference sections 2.1.13.3.3 (stalls) and 2.1.13.3.6 (L2):
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: e40ed1542dd7 ("perf/x86: Add perf support for AMD family-17h processors")
[ Improved the formatting a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
kernel_randomize_memory() uses __PHYSICAL_MASK_SHIFT to calculate
the maximum amount of system RAM supported. The size of the direct
mapping section is obtained from the smaller one of the below two
values:
(actual system RAM size + padding size) vs (max system RAM size supported)
This calculation is wrong since commit
b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52").
In it, __PHYSICAL_MASK_SHIFT was changed to be 52, regardless of whether
the kernel is using 4-level or 5-level page tables. Thus, it will always
use 4 PB as the maximum amount of system RAM, even in 4-level paging
mode where it should actually be 64 TB.
Thus, the size of the direct mapping section will always
be the sum of the actual system RAM size plus the padding size.
Even when the amount of system RAM is 64 TB, the following layout will
still be used. Obviously KALSR will be weakened significantly.
|____|_______actual RAM_______|_padding_|______the rest_______|
0 64TB ~120TB
Instead, it should be like this:
|____|_______actual RAM_______|_________the rest______________|
0 64TB ~120TB
The size of padding region is controlled by
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, which is 10 TB by default.
The above issue only exists when
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING is set to a non-zero value,
which is the case when CONFIG_MEMORY_HOTPLUG is enabled. Otherwise,
using __PHYSICAL_MASK_SHIFT doesn't affect KASLR.
Fix it by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
[ bp: Massage commit message. ]
Fixes: b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: frank.ramsay@hpe.com
Cc: herbert@gondor.apana.org.au
Cc: kirill@shutemov.name
Cc: mike.travis@hpe.com
Cc: thgarnie@google.com
Cc: x86-ml <x86@kernel.org>
Cc: yamada.masahiro@socionext.com
Link: https://lkml.kernel.org/r/20190417083536.GE7065@MiWiFi-R3L-srv
|
|
When called with vmas_arg==NULL, get_user_pages_longterm() allocates
an array of nr_pages*8 which can easily get greater that the max order,
for example, registering memory for a 256GB guest does this and fails
in __alloc_pages_nodemask().
This adds a loop over chunks of entries to fit the max order limit.
Fixes: 678e174c4c16 ("powerpc/mm/iommu: allow migration of cma allocated pages during mm_iommu_do_alloc", 2019-03-05)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently mm_iommu_do_alloc() is called in 2 cases:
- VFIO_IOMMU_SPAPR_REGISTER_MEMORY ioctl() for normal memory:
this locks &mem_list_mutex and then locks mm::mmap_sem
several times when adjusting locked_vm or pinning pages;
- vfio_pci_nvgpu_regops::mmap() for GPU memory:
this is called with mm::mmap_sem held already and it locks
&mem_list_mutex.
So one can craft a userspace program to do special ioctl and mmap in
2 threads concurrently and cause a deadlock which lockdep warns about
(below).
We did not hit this yet because QEMU constructs the machine in a single
thread.
This moves the overlap check next to where the new entry is added and
reduces the amount of time spent with &mem_list_mutex held.
This moves locked_vm adjustment from under &mem_list_mutex.
This relies on mm_iommu_adjust_locked_vm() doing nothing when entries==0.
This is one of the lockdep warnings:
======================================================
WARNING: possible circular locking dependency detected
5.1.0-rc2-le_nv2_aikATfstn1-p1 #363 Not tainted
------------------------------------------------------
qemu-system-ppc/8038 is trying to acquire lock:
000000002ec6c453 (mem_list_mutex){+.+.}, at: mm_iommu_do_alloc+0x70/0x490
but task is already holding lock:
00000000fd7da97f (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0xf0/0x160
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_sem){++++}:
lock_acquire+0xf8/0x260
down_write+0x44/0xa0
mm_iommu_adjust_locked_vm.part.1+0x4c/0x190
mm_iommu_do_alloc+0x310/0x490
tce_iommu_ioctl.part.9+0xb84/0x1150 [vfio_iommu_spapr_tce]
vfio_fops_unl_ioctl+0x94/0x430 [vfio]
do_vfs_ioctl+0xe4/0x930
ksys_ioctl+0xc4/0x110
sys_ioctl+0x28/0x80
system_call+0x5c/0x70
-> #0 (mem_list_mutex){+.+.}:
__lock_acquire+0x1484/0x1900
lock_acquire+0xf8/0x260
__mutex_lock+0x88/0xa70
mm_iommu_do_alloc+0x70/0x490
vfio_pci_nvgpu_mmap+0xc0/0x130 [vfio_pci]
vfio_pci_mmap+0x198/0x2a0 [vfio_pci]
vfio_device_fops_mmap+0x44/0x70 [vfio]
mmap_region+0x5d4/0x770
do_mmap+0x42c/0x650
vm_mmap_pgoff+0x124/0x160
ksys_mmap_pgoff+0xdc/0x2f0
sys_mmap+0x40/0x80
system_call+0x5c/0x70
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&mm->mmap_sem);
lock(mem_list_mutex);
lock(&mm->mmap_sem);
lock(mem_list_mutex);
*** DEADLOCK ***
1 lock held by qemu-system-ppc/8038:
#0: 00000000fd7da97f (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0xf0/0x160
Fixes: c10c21efa4bc ("powerpc/vfio/iommu/kvm: Do not pin device memory", 2018-12-19)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Joel reported weird crashes using skiroot_defconfig, in his case we
jumped into an NX page:
kernel tried to execute exec-protected page (c000000002bff4f0) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc000000002bff4f0
Looking at the disassembly, we had simply branched to that address:
c000000000c001bc 49fff335 bl c000000002bff4f0
But that didn't match the original kernel image:
c000000000c001bc 4bfff335 bl c000000000bff4f0 <kobject_get+0x8>
When STRICT_KERNEL_RWX is enabled, and we're using the radix MMU, we
call radix__change_memory_range() late in boot to change page
protections. We do that both to mark rodata read only and also to mark
init text no-execute. That involves walking the kernel page tables,
and clearing _PAGE_WRITE or _PAGE_EXEC respectively.
With radix we may use hugepages for the linear mapping, so the code in
radix__change_memory_range() uses eg. pmd_huge() to test if it has
found a huge mapping, and if so it stops the page table walk and
changes the PMD permissions.
However if the kernel is built without HUGETLBFS support, pmd_huge()
is just a #define that always returns 0. That causes the code in
radix__change_memory_range() to incorrectly interpret the PMD value as
a pointer to a PTE page rather than as a PTE at the PMD level.
We can see this using `dv` in xmon which also uses pmd_huge():
0:mon> dv c000000000000000
pgd @ 0xc000000001740000
pgdp @ 0xc000000001740000 = 0x80000000ffffb009
pudp @ 0xc0000000ffffb000 = 0x80000000ffffa009
pmdp @ 0xc0000000ffffa000 = 0xc00000000000018f <- this is a PTE
ptep @ 0xc000000000000100 = 0xa64bb17da64ab07d <- kernel text
The end result is we treat the value at 0xc000000000000100 as a PTE
and clear _PAGE_WRITE or _PAGE_EXEC, potentially corrupting the code
at that address.
In Joel's specific case we cleared the sign bit in the offset of the
branch, causing a backward branch to turn into a forward branch which
caused us to branch into a non-executable page. However the exact
nature of the crash depends on kernel version, compiler version, and
other factors.
We need to fix radix__change_memory_range() to not use accessors that
depend on HUGETLBFS, but we also have radix memory hotplug code that
uses pmd_huge() etc that will also need fixing. So for now just
disallow the broken combination of Radix with HUGETLBFS disabled.
The only defconfig we have that is affected is skiroot_defconfig, so
turn on HUGETLBFS there so that it still gets Radix.
Fixes: 566ca99af026 ("powerpc/mm/radix: Add dummy radix_enabled()")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|