Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- Add CONFIG_REFCOUNT_FULL=y to allow the disabling of the 'full'
(robustness checked) refcount_t implementation with slightly lower
runtime overhead. (Kees Cook)
The lighter weight variant is the default. The two variants use the
same API. Having this variant was a precondition by some
maintainers to merge refcount_t cleanups.
- Add lockdep support for rtmutexes (Peter Zijlstra)
- liblockdep fixes and improvements (Sasha Levin, Ben Hutchings)
- ... misc fixes and improvements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
locking/refcount: Remove the half-implemented refcount_sub() API
locking/refcount: Create unchecked atomic_t implementation
locking/rtmutex: Don't initialize lockdep when not required
locking/selftest: Add RT-mutex support
locking/selftest: Remove the bad unlock ordering test
rt_mutex: Add lockdep annotations
MAINTAINERS: Claim atomic*_t maintainership
locking/x86: Remove the unused atomic_inc_short() methd
tools/lib/lockdep: Remove private kernel headers
tools/lib/lockdep: Hide liblockdep output from test results
tools/lib/lockdep: Add dummy current_gfp_context()
tools/include: Add IS_ERR_OR_NULL to err.h
tools/lib/lockdep: Add empty __is_[module,kernel]_percpu_address
tools/lib/lockdep: Include err.h
tools/include: Add (mostly) empty include/linux/sched/mm.h
tools/lib/lockdep: Use LDFLAGS
tools/lib/lockdep: Remove double-quotes from soname
tools/lib/lockdep: Fix object file paths used in an out-of-tree build
tools/lib/lockdep: Fix compilation for 4.11
tools/lib/lockdep: Don't mix fd-based and stream IO
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Rework the EFI capsule loader to allow for workarounds for
non-compliant firmware (Ard Biesheuvel)
- Implement a capsule loader quirk for Quark X102x (Jan Kiszka)
- Enable SMBIOS/DMI support for the ARM architecture (Ard Biesheuvel)
- Add CONFIG_EFI_PGT_DUMP=y support for x86-32 and kexec (Sai
Praneeth)
- Fixes for EFI support for Xen dom0 guests running under x86-64
hosts (Daniel Kiper)"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/xen/efi: Initialize only the EFI struct members used by Xen
efi: Process the MEMATTR table only if EFI_MEMMAP is enabled
efi/arm: Enable DMI/SMBIOS
x86/efi: Extend CONFIG_EFI_PGT_DUMP support to x86_32 and kexec as well
efi/efi_test: Use memdup_user() helper
efi/capsule: Add support for Quark security header
efi/capsule-loader: Use page addresses rather than struct page pointers
efi/capsule-loader: Redirect calls to efi_capsule_setup_info() via weak alias
efi/capsule: Remove NULL test on kmap()
efi/capsule-loader: Use a cached copy of the capsule header
efi/capsule: Adjust return type of efi_capsule_setup_info()
efi/capsule: Clean up pr_err/_info() messages
efi/capsule: Remove pr_debug() on ENOMEM or EFAULT
efi/capsule: Fix return code on failing kmap/vmap
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
"This is an extensive rewrite of the objdump tool to track all stack
pointer modifications through the machine instructions of disassembled
functions found in kernel .o files.
This re-design removes the prior dependency on CONFIG_FRAME_POINTERS,
with the goal to prepare the tool to generate kernel debuginfo data in
the future. There's also an increase in checking/tracking robustness
as a side effect as well.
No (intended) changes to existing functionality"
* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Silence warnings for functions which use IRET
objtool: Implement stack validation 2.0
objtool, x86: Add several functions and files to the objtool whitelist
objtool: Move checking code to check.c
|
|
Pull core block/IO updates from Jens Axboe:
"This is the main pull request for the block layer for 4.13. Not a huge
round in terms of features, but there's a lot of churn related to some
core cleanups.
Note this depends on the UUID tree pull request, that Christoph
already sent out.
This pull request contains:
- A series from Christoph, unifying the error/stats codes in the
block layer. We now use blk_status_t everywhere, instead of using
different schemes for different places.
- Also from Christoph, some cleanups around request allocation and IO
scheduler interactions in blk-mq.
- And yet another series from Christoph, cleaning up how we handle
and do bounce buffering in the block layer.
- A blk-mq debugfs series from Bart, further improving on the support
we have for exporting internal information to aid debugging IO
hangs or stalls.
- Also from Bart, a series that cleans up the request initialization
differences across types of devices.
- A series from Goldwyn Rodrigues, allowing the block layer to return
failure if we will block and the user asked for non-blocking.
- Patch from Hannes for supporting setting loop devices block size to
that of the underlying device.
- Two series of patches from Javier, fixing various issues with
lightnvm, particular around pblk.
- A series from me, adding support for write hints. This comes with
NVMe support as well, so applications can help guide data placement
on flash to improve performance, latencies, and write
amplification.
- A series from Ming, improving and hardening blk-mq support for
stopping/starting and quiescing hardware queues.
- Two pull requests for NVMe updates. Nothing major on the feature
side, but lots of cleanups and bug fixes. From the usual crew.
- A series from Neil Brown, greatly improving the bio rescue set
support. Most notably, this kills the bio rescue work queues, if we
don't really need them.
- Lots of other little bug fixes that are all over the place"
* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
lightnvm: pblk: set line bitmap check under debug
lightnvm: pblk: verify that cache read is still valid
lightnvm: pblk: add initialization check
lightnvm: pblk: remove target using async. I/Os
lightnvm: pblk: use vmalloc for GC data buffer
lightnvm: pblk: use right metadata buffer for recovery
lightnvm: pblk: schedule if data is not ready
lightnvm: pblk: remove unused return variable
lightnvm: pblk: fix double-free on pblk init
lightnvm: pblk: fix bad le64 assignations
nvme: Makefile: remove dead build rule
blk-mq: map all HWQ also in hyperthreaded system
nvmet-rdma: register ib_client to not deadlock in device removal
nvme_fc: fix error recovery on link down.
nvmet_fc: fix crashes on bad opcodes
nvme_fc: Fix crash when nvme controller connection fails.
nvme_fc: replace ioabort msleep loop with completion
nvme_fc: fix double calls to nvme_cleanup_cmd()
nvme-fabrics: verify that a controller returns the correct NQN
nvme: simplify nvme_dev_attrs_are_visible
...
|
|
Pull uuid subsystem from Christoph Hellwig:
"This is the new uuid subsystem, in which Amir, Andy and I have started
consolidating our uuid/guid helpers and improving the types used for
them. Note that various other subsystems have pulled in this tree, so
I'd like it to go in early.
UUID/GUID summary:
- introduce the new uuid_t/guid_t types that are going to replace the
somewhat confusing uuid_be/uuid_le types and make the terminology
fit the various specs, as well as the userspace libuuid library.
(me, based on a previous version from Amir)
- consolidated generic uuid/guid helper functions lifted from XFS and
libnvdimm (Amir and me)
- conversions to the new types and helpers (Amir, Andy and me)"
* tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits)
ACPI: hns_dsaf_acpi_dsm_guid can be static
mmc: sdhci-pci: make guid intel_dsm_guid static
uuid: Take const on input of uuid_is_null() and guid_is_null()
thermal: int340x_thermal: fix compile after the UUID API switch
thermal: int340x_thermal: Switch to use new generic UUID API
acpi: always include uuid.h
ACPI: Switch to use generic guid_t in acpi_evaluate_dsm()
ACPI / extlog: Switch to use new generic UUID API
ACPI / bus: Switch to use new generic UUID API
ACPI / APEI: Switch to use new generic UUID API
acpi, nfit: Switch to use new generic UUID API
MAINTAINERS: add uuid entry
tmpfs: generate random sb->s_uuid
scsi_debug: switch to uuid_t
nvme: switch to uuid_t
sysctl: switch to use uuid_t
partitions/ldm: switch to use uuid_t
overlayfs: use uuid_t instead of uuid_be
fs: switch ->s_uuid to uuid_t
ima/policy: switch to use uuid_t
...
|
|
Pull MIPS fixes from Ralf Baechle:
"Here's a final round of fixes for 4.12:
- Fix misordered instructions in assembly code making kenel startup
via UHB unreliable.
- Fix special case of MADDF and MADDF emulation.
- Fix alignment issue in address calculation in pm-cps on 64 bit.
- Fix IRQ tracing & lockdep when rescheduling
- Systems with MAARs require post-DMA cache flushes.
The reordering fix and the MADDF/MSUBF fix have sat in linux-next for
a number of days. The others haven't propagated from my pull tree to
linux-next yet but all have survived manual testing and Imagination's
automated test system and there are no pending bug reports"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Avoid accidental raw backtrace
MIPS: Perform post-DMA cache flushes on systems with MAARs
MIPS: Fix IRQ tracing & lockdep when rescheduling
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
MIPS: head: Reorder instructions missing a delay slot
|
|
Pull ARM fix from Russell King:
"One final fix for 4.12 - Doug found a boot failure case triggered by
requesting a non-even MB vmalloc size"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8685/1: ensure memblock-limit is pmd-aligned
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Fixlets for x86:
- Prevent kexec crash when KASLR is enabled, which was caused by an
address calculation bug
- Restore the freeing of PUDs on memory hot remove
- Correct a negated pointer check in the intel uncore performance
monitoring driver
- Plug a memory leak in an error exit path in the RDT code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Fix memory leak on mount failure
x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug
x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization
perf/x86/intel/uncore: Fix wrong box pointer check
x86/mm/hotplug: Fix BUG_ON() after hot-remove by not freeing PUD
|
|
If mount fails, the kn_info directory is not freed causing memory leak.
Add the missing error handling path.
Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system")
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: vikas.shivappa@intel.com
Cc: andi.kleen@intel.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1498503368-20173-3-git-send-email-vikas.shivappa@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Hopefully the last two powerpc fixes for 4.12.
The CXL one is larger than I'd usually send at rc7, but it fixes new
code this cycle, so better to have it working for the release. It was
actually sent a few weeks back but got blocked in testing behind
another fix that was causing issues.
We are still tracking one crash in v4.12-rc7, but only one person has
reproduced it and the commit identified by bisect doesn't touch any of
the relevant code, so I think it's 50/50 whether that commit is
actually the problem or it's some code layout / toolchain issue.
Two fixes for code we merged this cycle:
- cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
- Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline
copy_to/from_user()
Thanks to Al Viro, Larry Finger, Christophe Lombard"
* tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()
cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two fixes:
- A fix for AMD IOMMU interrupt remapping code when IRQs are
forwarded directly to KVM guests
- Fixed check in the recently merged code to allow tboot with
Intel VT-d disabled"
* tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix interrupt remapping when disable guest_mode
iommu/vt-d: Correctly disable Intel IOMMU force on
|
|
In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.
These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage. Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Kernel text KASLR is separated into physical address and virtual
address randomization. And for virtual address randomization, we
only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE.
So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
but not the original kernel loading address 'output'.
The bug will cause kernel boot failure if kernel is loaded at a different
position than the address, 16M, which is decided at compiled time.
Kexec/kdump is such practical case.
To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
value.
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
randomization
For kernel text KASLR, the virtual address is confined to area of 1G,
[0xffffffff80000000, 0xffffffffc0000000). For the implemenataion of
virtual address randomization, we only randomize to get an offset
between 16M and 1G, then add this offset to the starting address,
0xffffffff80000000. Here 16M is the offset which is decided at linking
stage. So the amount of the local variable 'virt_addr' which respresents
the offset plus the kernel output size can not exceed KERNEL_IMAGE_SIZE.
Add a debug check for the offset. If out of bounds, print error
message and hang there.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1498567146-11990-2-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since commit 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with
usermode") show_backtrace() invokes the raw backtracer when
cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
where user and kernel address spaces overlap.
However this is used by show_stack() which creates its own pt_regs on
the stack and leaves cp0_status uninitialised in most of the code paths.
This results in the non deterministic use of the raw back tracer
depending on the previous stack content.
show_stack() deals exclusively with kernel mode stacks anyway, so
explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
we get a useful backtrace.
Fixes: 81a76d7119f6 ("MIPS: Avoid using unwind_stack() with usermode")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/16656/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Recent CPUs from Imagination Technologies such as the I6400 or P6600 are
able to speculatively fetch data from memory into caches. This means
that if used in a system with non-coherent DMA they require that caches
be invalidated after a device performs DMA, and before the CPU reads the
DMA'd data, in order to ensure that stale values weren't speculatively
prefetched.
Such CPUs also introduced Memory Accessibility Attribute Registers
(MAARs) in order to control the regions in which they are allowed to
speculate. Thus we can use the presence of MAARs as a good indication
that the CPU requires the above cache maintenance. Use the presence of
MAARs to determine the result of cpu_needs_post_dma_flush() in the
default case, in order to handle these recent CPUs correctly.
Note that the return type of cpu_needs_post_dma_flush() is changed to
bool, such that it's clearer what's happening when cpu_has_maar is cast
to bool for the return value. If this patch were backported to a
pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int
we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is
currently 1ull<<30, so when truncated to an int gives a non-zero value
anyway, but even so the implicit conversion from long long int to bool
makes it clearer to understand what will happen than the implicit
conversion from long long int to int would. The bool return type also
fits this usage better semantically, so seems like an all-round win.
Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting
the return type change.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Ed Blake <ed.blake@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
from arch/mips/kernel/entry.S we disable interrupts. This is true
regardless of whether we reach work_resched from syscall_exit_work,
resume_userspace or by looping after calling schedule(). Although we
disable interrupts in these paths we don't call trace_hardirqs_off()
before calling into C code which may acquire locks, and we therefore
leave lockdep with an inconsistent view of whether interrupts are
disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
both enabled.
Without tracing this interrupt state lockdep will print warnings such
as the following once a task returns from a syscall via
syscall_exit_partial with TIF_NEED_RESCHED set:
[ 49.927678] ------------[ cut here ]------------
[ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
[ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
[ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197
[ 49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
[ 49.974431] 0000000000000000 0000000000000000 0000000000000000 000000000000004a
[ 49.985300] ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
[ 49.996194] 0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
[ 50.007063] 000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
[ 50.017945] 0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
[ 50.028827] 0000000000000000 0000000000000001 0000000000000000 0000000000000000
[ 50.039688] 0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
[ 50.050575] 00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
[ 50.061448] 0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
[ 50.072327] ...
[ 50.076087] Call Trace:
[ 50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8
[ 50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190
[ 50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108
[ 50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
[ 50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
[ 50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
[ 50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8
[ 50.129221] [<ffffffff80946a60>] schedule+0x30/0x98
[ 50.135659] [<ffffffff80106278>] work_resched+0x8/0x34
[ 50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]---
[ 50.148405] possible reason: unannotated irqs-off.
[ 50.154600] irq event stamp: 400463
[ 50.159566] hardirqs last enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
[ 50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
[ 50.183897] softirqs last enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
[ 50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128
Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
schedule() following the work_resched label because:
1) Interrupts are disabled regardless of the path we take to reach
work_resched() & schedule().
2) Performing the tracing here avoids the need to do it in paths which
disable interrupts but don't call out to C code before hitting a
path which uses the RESTORE_SOME macro that will call
trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
syscall_trace_leave() for similar reasons, ensuring that lockdep has a
consistent view of state after we re-enable interrupts.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/15385/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
We allocate memory for a ready_count variable per-CPU, which is accessed
via a cached non-coherent TLB mapping to perform synchronisation between
threads within the core using LL/SC instructions. In order to ensure
that the variable is contained within its own data cache line we
allocate 2 lines worth of memory & align the resulting pointer to a line
boundary. This is however unnecessary, since kmalloc is guaranteed to
return memory which is at least cache-line aligned (see
ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
Besides cleaning up the code & avoiding needless work, this has the side
effect of avoiding an arithmetic error found by Bryan on 64 bit systems
due to the 32 bit size of the former dlinesz. This led the ready_count
variable to have its upper 32b cleared erroneously for MIPS64 kernels,
causing problems when ready_count was later used on MIPS64 via cpuidle.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 3179d37ee1ed ("MIPS: pm-cps: add PM state entry code for CPS systems")
Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v3.16+
Patchwork: https://patchwork.linux-mips.org/patch/15383/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The pmd containing memblock_limit is cleared by prepare_page_table()
which creates the opportunity for early_alloc() to allocate unmapped
memory if memblock_limit is not pmd aligned causing a boot-time hang.
Commit 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
attempted to resolve this problem, but there is a path through the
adjust_lowmem_bounds() routine where if all memory regions start and
end on pmd-aligned addresses the memblock_limit will be set to
arm_lowmem_limit.
Since arm_lowmem_limit can be affected by the vmalloc early parameter,
the value of arm_lowmem_limit may not be pmd-aligned. This commit
corrects this oversight such that memblock_limit is always rounded
down to pmd-alignment.
Fixes: 965278dcb8ab ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Should not init a NULL box. It will cause system crash.
The issue looks like caused by a typo.
This was not noticed because there is no NULL box. Also, for most
boxes, they are enabled by default. The init code is not critical.
Fixes: fff4b87e594a ("perf/x86/intel/uncore: Make package handling more robust")
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com
|
|
The only user of thread_saved_pc() in non-arch-specific code was removed
in commit 8243d5597793 ("sched/core: Remove pointless printout in
sched_show_task()"). Remove the implementations as well.
Some architectures use thread_saved_pc() in their arch-specific code.
Leave their thread_saved_pc() intact.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Many subsystems will not use refcount_t unless there is a way to build the
kernel so that there is no regression in speed compared to atomic_t. This
adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
which has the validation but is slightly slower. When not enabled,
refcount_t uses the basic unchecked atomic_t routines, which results in
no code changes compared to just using atomic_t directly.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Jann Horn <jannh@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arozansk@redhat.com
Cc: axboe@kernel.dk
Cc: linux-arch <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If accumulator value is zero, just return the value of previously
calculated product. This brings logic in MADDF/MSUBF implementation
closer to the logic in ADD/SUB case.
Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Cc: James.Hogan@imgtec.com
Cc: Paul.Burton@imgtec.com
Cc: Raghu.Gandham@imgtec.com
Cc: Leonid.Yegoshin@imgtec.com
Cc: Douglas.Leung@imgtec.com
Cc: Petar.Jovanovic@imgtec.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16512/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
In this sequence the 'move' is assumed in the delay slot of the 'beq',
but head.S is in reorder mode and the former gets pushed one 'nop'
farther by the assembler.
The corrected behavior made booting with an UHI supplied dtb erratic.
Fixes: 15f37e158892 ("MIPS: store the appended dtb address in a variable")
Signed-off-by: Karl Beldan <karl.beldan+oss@gmail.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Gorski <jogo@openwrt.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16614/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Pull ARM fixes from Russell King:
"Three more fixes:
- Fix the previous fix merged in the last pull for the Thumb2
decompressor.
- A fix from Vladimir to correctly identify the V7M cache type.
- The optimised 3G vmsplit case does not work with LPAE, so don't
allow this to be selected for LPAE configurations"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8682/1: V7M: Set cacheid iff DminLine or IminLine is nonzero
ARM: 8681/1: make VMSPLIT_3G_OPT depends on !ARM_LPAE
ARM: 8680/1: boot/compressed: fix inappropriate Thumb2 mnemonic for __nop
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 bugfix from Martin Schwidefsky:
"One last s390 patch for 4.12
Revert the re-IPL semantics back to the v4.7 state. It turned out that
the memory layout may change due to memory hotplug if load-normal is
used"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL
|
|
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc,
userspace was up but giving weird errors such as:
udevd[64]: starting version 175
udevd[64]: Unable to receive ctrl message: Bad address.
modprobe: chdir(4.12-rc1): No such file or directory
He bisected the problem to commit 3448890c32c3 ("powerpc: get rid of zeroing,
switch to RAW_COPY_USER").
Al identified that the problem is actually a miscompilation by GCC 4.6.3, which
is exposed by the above commit.
Al also pointed out that inlining copy_to/from_user() is probably of little or
no benefit, which is correct. Using Anton's copy_to_user benchmark, with a
pathological single byte copy, we see a small increase in performance
by *removing* inlining:
Before (inlined):
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m22.063s
real 0m22.059s
real 0m22.076s
After:
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m21.325s
real 0m21.299s
real 0m21.364s
So as a small performance improvement and to avoid the miscompilation, drop
inlining copy_to/from_user() on 32-bit.
Fixes: 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER")
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Since commit:
af2cf278ef4f ("x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()")
we no longer free PUDs so that we do not have to synchronize
all PGDs on hot-remove/vfree().
But the new 5-level page table patchset reverted that for 4-level
page tables, in the following commit:
f2a6a7050109: ("x86: Convert the rest of the code to support p4d_t")
This patch restores the damage and disables free_pud() if we are in the
4-level page table case, thus avoiding BUG_ON() after hot-remove.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
[ Clarified the changelog and the code comments. ]
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170624180514.3821-1-jglisse@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"A single fix to unbreak the vdso32 build for 64bit kernels caused by
excess #includes in the mshyperv header"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mshyperv: Remove excess #includes from mshyperv.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- Return the proper error code if aux buffers for a event are not
supported.
- Calculate the probe offset for inlined functions correctly
- Update the Skylake DTLB load/store miss event so it can count 1G
TLB entries as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix probe definition for inlined functions
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
|
|
A recent commit included linux/slab.h in linux/irq.h. This breaks the build
of vdso32 on a 64-bit kernel.
The reason is that linux/irq.h gets included into the vdso code via
linux/interrupt.h which is included from asm/mshyperv.h. That makes the
32-bit vdso compile fail, because slab.h includes the pgtable headers for
64-bit on a 64-bit build.
Neither linux/clocksource.h nor linux/interrupt.h are needed in the
mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
Remove the includes and unbreak the build.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: devel@linuxdriverproject.org
Fixes: dee863b571b0 ("hv: export current Hyper-V clocksource")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.12. Most of these actually came in last
week but got held up for some more testing.
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
Bangoria, Masami Hiramatsu"
* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Initialise thread_info for emergency stacks
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
powerpc/perf: Fix oops when kthread execs user process
powerpc/64s: Handle data breakpoints in Radix mode
powerpc/kprobes: Skip livepatch_handler() for jprobes
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/kprobes: Pause function_graph tracing during jprobes handling
|
|
The current approach, which is the wholesale efi struct initialization from
a 'efi_xen' local template is not robust. Usually if new member is defined
then it is properly initialized in drivers/firmware/efi/efi.c, but not in
arch/x86/xen/efi.c.
The effect is that the Xen initialization clears any fields the generic code
might have set and the Xen code does not know about yet.
I saw this happen a few times, so let's initialize only the EFI struct members
used by Xen and maintain no local duplicate, to avoid such issues in the future.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andrew.cooper3@citrix.com
Cc: jgross@suse.com
Cc: linux-efi@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: stable@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1498128697-12943-3-git-send-email-daniel.kiper@oracle.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull in the fix for shared tags, as it conflicts with the pending
changes in for-4.13/block. We already pulled in v4.12-rc5 to solve
other conflicts or get fixes that went into 4.12, so not a lot
of changes in this merge.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.
Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and using the emergency stack. That crashed
at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.
To fix this, zero the entire THREAD_SIZE allocation, and initialize
the thread_info.
Cc: stable@vger.kernel.org
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.
KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice. This does not affect Linux guests, as they use an IST or task gate
for #DB.
This fixes CVE-2017-7518.
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: fix shadow table handling for nested guests
Some odd-ball cases (real-space designation ASCEs) are handled wrong
for the shadow page tables. Fix it.
|
|
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.
We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().
Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).
Both of the above can result in incorrect address translations. Only
for real space designations with a token origin of zero and addresses
below one megabyte the translation was correct.
Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.
Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows")
Cc: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
Current DTLB load/store miss events (0x608/0x649) only counts 4K,2M and
4M page size.
Need to extend the events to support any page size (4K/2M/4M/1G).
The complete DTLB load/store miss events are:
DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08
DTLB_STORE_MISSES.WALK_COMPLETED 0xe49
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This commit fixes a "maybe-uninitialized" build failure in
arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
enabled. The failure is:
In file included from ./include/linux/printk.h:329:0,
from ./include/linux/kernel.h:13,
from ./include/asm-generic/bug.h:15,
from ./arch/mips/include/asm/bug.h:41,
from ./include/linux/bug.h:4,
from ./include/linux/thread_info.h:11,
from ./include/asm-generic/current.h:4,
from ./arch/mips/include/generated/asm/current.h:1,
from ./include/linux/sched.h:11,
from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
__dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
int idx_user, idx_kernel;
^~~~~~~~~~
There is a similar error relating to "idx_user". Both errors were
observed with GCC 6.
As far as I can tell, it is impossible for either idx_user or idx_kernel
to be uninitialized when they are later read in the calls to kvm_debug,
but to satisfy the compiler, add zero initializers to both variables.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 57e3869cfaae ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
Cc: <stable@vger.kernel.org> # 4.11+
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* fix problems that could cause hangs or crashes in the host on POWER9
* fix problems that could allow guests to potentially affect or disrupt
the execution of the controlling userspace
|
|
Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
49eea433b326 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
clock_gettime() vDSO"). Noticing that the core timekeeping code
never set tkr_raw.xtime_nsec, the vDSO implementation didn't
bother exposing it via the data page and instead took the
unshifted tk->raw_time.tv_nsec value which was then immediately
shifted left in the vDSO code.
Unfortunately, by accellerating the MONOTONIC_RAW clockid, it
uncovered potential 1ns time inconsistencies caused by the
timekeeping core not handing sub-ns resolution.
Now that the core code has been fixed and is actually setting
tkr_raw.xtime_nsec, we need to take that into account in the
vDSO by adding it to the shifted raw_time value, in order to
fix the user-visible inconsistency. Rather than do that at each
use (and expand the data page in the process), instead perform
the shift/addition operation when populating the data page and
remove the shift from the vDSO code entirely.
[jstultz: minor whitespace tweak, tried to improve commit
message to make it more clear this fixes a regression]
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Two fixes for am335x-sl50 to fix a boot time error
for claiming SPI pins, and to fix a SDIO card detect
pin for production version of the device.
* tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush initialized flush_insn_slot to avoid NULL pointer
dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
|