Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Re-engineer the LPAE TTBR setup code. Rather than passing some shifted
address in order to fit in a CPU register, pass either a full physical
address (in the case of r4, r5 for TTBR0) or a PFN (for TTBR1).
This removes the ARCH_PGD_SHIFT hack, and the last dangerous user of
cpu_set_ttbr() in the secondary CPU startup code path (which was there
to re-set TTBR1 to the appropriate high physical address space on
Keystone2.)
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
multiarch
Looks like apps can be made to segfault easily on armhf distros
just by running cpuburn-a8 in the background, then starting apt
get update unless erratum 430973 workaround is enabled. This happens
on r3p2 also, which has 430973 fixed in hardware.
Turns out the reason for this is some bootloaders incorrectly
setting the auxilary register IBE bit, which probably causes us
to hit erratum 687067 on Cortex-A8 later than r1p2.
If the bootloader incorrectly sets the IBE bit in the auxilary control
register for Cortex-A8 revisions with 430973 fixed in hardware, we
need to call flush BTAC/BTB to avoid segfaults probably caused by
erratum 687067. So let's flush BTAC/BTB unconditionally for Cortex-A8.
It won't do anything unless the IBE bit is set.
Note that we keep the erratum 430973 Kconfig option still around and
disabled for multiarch as it may be unsafe to enable for some secure
SoC. It is known safe to be enabled for n900, but won't do anything
on n900 as the IBE bit needs to be set with SMC.
Also note that SoCs probably should also add checks and print warnings
for the misconfigured IBE bit depending on the Cortex-A8 revision
so the bootloaders can be fixed Cortex-A8 revisions later than
r1p2 to not set the IBE bit.
Tested-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Avoid the errata 430973 workaround for non-Cortex A8 CPUs. Having this
workaround enabled introduces an additional branch target buffer flush
into the context switching path, something we wish to avoid. To allow
this errata to be enabled in multiplatform kernels while reducing its
impact, rearrange the Cortex-A8 CPU support to avoid impacting on other
Version 7 CPUs.
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls. Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).
We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.
Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code. This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.
Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
CPU_32v6 currently selects CPU_USE_DOMAINS if CPU_V6 and MMU. This is
because ARM 1136 r0pX CPUs lack the v6k extensions, and therefore do
not have hardware thread registers. The lack of these registers requires
the kernel to update the vectors page at each context switch in order to
write a new TLS pointer. This write must be done via the userspace
mapping, since aliasing caches can lead to expensive flushing when using
kmap. Finally, this requires the vectors page to be mapped r/w for
kernel and r/o for user, which has implications for things like put_user
which must trigger CoW appropriately when targetting user pages.
The upshot of all this is that a v6/v7 kernel makes use of domains to
segregate kernel and user memory accesses. This has the nasty
side-effect of making device mappings executable, which has been
observed to cause subtle bugs on recent cores (e.g. Cortex-A15
performing a speculative instruction fetch from the GIC and acking an
interrupt in the process).
This patch solves this problem by removing the remaining domain support
from ARMv6. A new memory type is added specifically for the vectors page
which allows that page (and only that page) to be mapped as user r/o,
kernel r/w. All other user r/o pages are mapped also as kernel r/o.
Patch co-developed with Russell King.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Commit ae8a8b9553bd ("ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE
and use ALT_SMP instead") added early function returns for page table
cache flushing operations on ARMv7 SMP CPUs.
Unfortunately, when targetting Thumb-2, these `mov pc, lr' sequences
assemble to 2 bytes which can lead to corruption of the instruction
stream after code patching.
This patch fixes the alternates to use wide (32-bit) instructions for
Thumb-2, therefore ensuring that the patching code works correctly.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
the arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
related content into no-ops as early as possible, since that will get
rid of these warnings. In any case, they are temporary and harmless.
This removes all the ARM uses of the __cpuinit macros from C code,
and all __CPUINIT from assembly code. It also had two ".previous"
section statements that were paired off against __CPUINIT
(aka .section ".cpuinit.text") that also get removed here.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
Many ARMv7 cores have hardware page table walkers that can read the L1
cache. This is discoverable from the ID_MMFR3 register, although this
can be expensive to access from the low-level set_pte functions and is a
pain to cache, particularly with multi-cluster systems.
A useful observation is that the multi-processing extensions for ARMv7
require coherent table walks, meaning that we can make use of ALT_SMP
patching in proc-v7-* to patch away the cache flush safely for these
cores.
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The mmid macro is meant to be used to get the mm->context.id data
from the mm structure, but it seems to have been missed in a cuple
of files.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
PROT_NONE mappings apply the page protection attributes defined by _P000
which translate to PAGE_NONE for ARM. These attributes specify an XN,
RDONLY pte that is inaccessible to userspace. However, on kernels
configured without support for domains, such a pte *is* accessible to
the kernel and can be read via get_user, allowing tasks to read
PROT_NONE pages via syscalls such as read/write over a pipe.
This patch introduces a new software pte flag, L_PTE_NONE, that is set
to identify faulting, present entries.
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
For long-descriptor translation table formats, the ARMv7 architecture
defines the last two bits of the second- and third-level descriptors to
be:
x0b - Invalid
01b - Block (second-level), Reserved (third-level)
11b - Table (second-level), Page (third-level)
This allows us to define L_PTE_PRESENT as (3 << 0) and use this value to
create ptes directly. However, when determining whether a given pte
value is present in the low-level page table accessors, we only need to
check the least significant bit of the descriptor, allowing us to write
faulting, present entries which are required for PROT_NONE mappings.
This patch introduces L_PTE_VALID, which can be used to test whether a
pte should fault, and updates the low-level page table accessors
accordingly.
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The simplified access permissions model is not used for the classic MMU
translation regime, so ensure that it is turned off in the sctlr prior
to turning on address translation for ARMv7.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
This patch introduces a new Kconfig option which, when enabled, causes
the kernel to write the PID of the current task into the PROCID field
of the CONTEXTIDR on context switch. This is useful when analysing
hardware trace, since writes to this register can be configured to emit
an event into the trace stream.
The thread notifier for writing the PID is deliberately kept separate
from the ASID-writing code so that we can support newer processors using
LPAE, where the ASID is stored in TTBR0. As such, the switch_mm code is
updated to perform a read-modify-write sequence to ensure that we don't
clobber the PID on CPUs using the classic 2-level page tables.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Since the ASIDs must be unique to an mm across all the CPUs in a system,
the __new_context() function needs to broadcast a context reset event to
all the CPUs during ASID allocation if a roll-over occurred. Such IPIs
cannot be issued with interrupts disabled and ARM had to define
__ARCH_WANT_INTERRUPTS_ON_CTXSW.
This patch changes the check_context() function to
check_and_switch_context() called from switch_mm(). In case of
ASID-capable CPUs (ARMv6 onwards), if a new ASID is needed and the
interrupts are disabled, it defers the __new_context() and
cpu_switch_mm() calls to the post-lock switch hook where the interrupts
are enabled. Setting the reserved TTBR0 was also moved to
check_and_switch_context() from cpu_v7_switch_mm().
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Marc Zyngier <Marc.Zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
On ARMv7 CPUs that cache first level page table entries (like the
Cortex-A15), using a reserved ASID while changing the TTBR or flushing
the TLB is unsafe.
This is because the CPU may cache the first level entry as the result of
a speculative memory access while the reserved ASID is assigned. After
the process owning the page tables dies, the memory will be reallocated
and may be written with junk values which can be interpreted as global,
valid PTEs by the processor. This will result in the TLB being populated
with bogus global entries.
This patch avoids the use of a reserved context ID in the v7 switch_mm
and ASID rollover code by temporarily using the swapper_pg_dir pointed
at by TTBR1, which contains only global entries that are not tagged
with ASIDs.
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Marc Zyngier <Marc.Zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[catalin.marinas@arm.com: add LPAE support]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This patch modifies the proc-v7.S file so that it only contains code
shared between classic MMU and LPAE. The non-common code is factored out
into a separate file.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|