summaryrefslogtreecommitdiff
path: root/arch/arm/mm/cache-v7.S
AgeCommit message (Collapse)AuthorFilesLines
2014-07-18ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+Russell King1-15/+15
ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-05-26ARM: 8055/1: cacheflush: use -st dsb option for ensuring completionWill Deacon1-6/+6
dsb st can be used to ensure completion of pending cache maintenance operations, so use it for the v7 cache maintenance operations. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-29ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequenceLorenzo Pieralisi1-7/+7
Set-associative caches on all v7 implementations map the index bits to physical addresses LSBs and tag bits to MSBs. As the last level of cache on current and upcoming ARM systems grows in size, this means that under normal DRAM controller configurations, the current v7 cache flush routine using set/way operations triggers a DRAM memory controller precharge/activate for every cache line writeback since the cache routine cleans lines by first fixing the index and then looping through ways (index bits are mapped to lower physical addresses on all v7 cache implementations; this means that, with last level cache sizes in the order of MBytes, lines belonging to the same set but different ways map to different DRAM pages). Given the random content of cache tags, swapping the order between indexes and ways loops do not prevent DRAM pages precharge and activate cycles but at least, on average, improves the chances that either multiple lines hit the same page or multiple lines belong to different DRAM banks, improving throughput significantly. This patch swaps the inner loops in the v7 cache flushing routine to carry out the clean operations first on all sets belonging to a given way (looping through sets) and then decrementing the way. Benchmarks showed that by swapping the ordering in which sets and ways are decremented in the v7 cache flushing routine, that uses set/way operations, time required to flush caches is reduced significantly, owing to improved writebacks throughput to the DRAM controller. Benchmarks results vary and depend heavily on the last level of cache tag RAM content when cache is cleaned and invalidated, ranging from 2x throughput when all tag RAM entries contain dirty lines mapping to sequential pages of RAM to 1x (ie no improvement) when all tag RAM accesses trigger a DRAM precharge/activate cycle, as the current code implies on most DRAM controller configurations. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-12ARM: mm: use inner-shareable barriers for TLB and user cache operationsWill Deacon1-2/+2
System-wide barriers aren't required for situations where we only need to make visibility and ordering guarantees in the inner-shareable domain (i.e. we are not dealing with devices or potentially incoherent CPUs). This patch changes the v7 TLB operations, coherent_user_range and dcache_clean_area functions to user inner-shareable barriers. For cache maintenance, only the store access type is required to ensure completion. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-06-17ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrectJon Medhurst1-0/+8
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR register returns zero when it should return one. This leads to cache maintenance operations which rely on this value to not function as intended, causing data corruption. The workaround for this errata is to detect affected CPUs and correct the LoUIS value read. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-02-12arm: Add v7_invalidate_l1 to cache-v7.SDinh Nguyen1-0/+46
mach-socfpga is another platform that needs to use v7_invalidate_l1 to bringup additional cores. There was a comment that the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Pavel Machek <pavel@denx.de> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Pavel Machek <pavel@denx.de> Tested-by: Stephen Warren <swarren@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Olof Johansson <olof@lixom.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Magnus Damm <magnus.damm@gmail.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2012-12-20ARM: 7606/1: cache: flush to LoUU instead of LoUIS on uniprocessor CPUsWill Deacon1-2/+4
flush_cache_louis flushes the D-side caches to the point of unification inner-shareable. On uniprocessor CPUs, this is defined as zero and therefore no flushing will take place. Rather than invent a new interface for UP systems, instead use our SMP_ON_UP patching code to read the LoUU from the CLIDR instead. Cc: <stable@vger.kernel.org> Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> Tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-10-11Merge branch 'fixes' into for-linusRussell King1-0/+3
Conflicts: arch/arm/kernel/smp.c
2012-09-29ARM: 7541/1: Add ARM ERRATA 775420 workaroundSimon Horman1-0/+3
arm: Add ARM ERRATA 775420 workaround Workaround for the 775420 Cortex-A9 (r2p2, r2p6,r2p8,r2p10,r3p0) erratum. In case a date cache maintenance operation aborts with MMU exception, it might cause the processor to deadlock. This workaround puts DSB before executing ISB if an abort may occur on cache maintenance. Based on work by Kouei Abe and feedback from Catalin Marinas. Signed-off-by: Kouei Abe <kouei.abe.cp@rms.renesas.com> [ horms@verge.net.au: Changed to implementation suggested by catalin.marinas@arm.com ] Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-09-25ARM: mm: rename jump labels in v7_flush_dcache_all functionLorenzo Pieralisi1-7/+7
This patch renames jump labels in v7_flush_dcache_all in order to define a specific flush cache levels entry point. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-09-25ARM: mm: implement LoUIS API for cache maintenance opsLorenzo Pieralisi1-0/+36
ARM v7 architecture introduced the concept of cache levels and related control registers. New processors like A7 and A15 embed an L2 unified cache controller that becomes part of the cache level hierarchy. Some operations in the kernel like cpu_suspend and __cpu_disable do not require a flush of the entire cache hierarchy to DRAM but just the cache levels belonging to the Level of Unification Inner Shareable (LoUIS), which in most of ARM v7 systems correspond to L1. The current cache flushing API used in cpu_suspend and __cpu_disable, flush_cache_all(), ends up flushing the whole cache hierarchy since for v7 it cleans and invalidates all cache levels up to Level of Coherency (LoC) which cripples system performance when used in hot paths like hotplug and cpuidle. Therefore a new kernel cache maintenance API must be added to cope with latest ARM system requirements. This patch adds flush_cache_louis() to the ARM kernel cache maintenance API. This function cleans and invalidates all data cache levels up to the Level of Unification Inner Shareable (LoUIS) and invalidates the instruction cache for processors that support it (> v7). This patch also creates an alias of the cache LoUIS function to flush_kern_all for all processor versions prior to v7, so that the current cache flushing behaviour is unchanged for those processors. v7 cache maintenance code implements a cache LoUIS function that cleans and invalidates the D-cache up to LoUIS and invalidates the I-cache, according to the new API. Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-05-02ARM: 7408/1: cacheflush: return error to userspace when flushing syscall failsWill Deacon1-6/+4
The cacheflush syscall can fail for two reasons: (1) The arguments are invalid (nonsensical address range or no VMA) (2) The region generates a translation fault on a VIPT or PIPT cache This patch allows do_cache_op to return an error code to userspace in the case of the above. The various coherent_user_range implementations are modified to return 0 in the case of VIVT caches or -EFAULT in the case of an abort on v6/v7 cores. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-16ARM: 7325/1: fix v7 boot with lockdep enabledRabin Vincent1-1/+1
Bootup with lockdep enabled has been broken on v7 since b46c0f74657d ("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR"). This is because v7_setup (which is called very early during boot) calls v7_flush_dcache_all, and the save_and_disable_irqs added by that patch ends up attempting to call into lockdep C code (trace_hardirqs_off()) when we are in no position to execute it (no stack, MMU off). Fix this by using a notrace variant of save_and_disable_irqs. The code already uses the notrace variant of restore_irqs. Reviewed-by: Nicolas Pitre <nico@linaro.org> Acked-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-09ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDRStephen Boyd1-0/+6
armv7's flush_cache_all() flushes caches via set/way. To determine the cache attributes (line size, number of sets, etc.) the assembly first writes the CSSELR register to select a cache level and then reads the CCSIDR register. The CSSELR register is banked per-cpu and is used to determine which cache level CCSIDR reads. If the task is migrated between when the CSSELR is written and the CCSIDR is read the CCSIDR value may be for an unexpected cache level (for example L1 instead of L2) and incorrect cache flushing could occur. Disable interrupts across the write and read so that the correct cache attributes are read and used for the cache flushing routine. We disable interrupts instead of disabling preemption because the critical section is only 3 instructions and we want to call v7_dcache_flush_all from __v7_setup which doesn't have a full kernel stack with a struct thread_info. This fixes a problem we see in scm_call() when flush_cache_all() is called from preemptible context and sometimes the L2 cache is not properly flushed out. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-17ARM: 7091/1: errata: D-cache line maintenance operation by MVA may not succeedWill Deacon1-0/+20
This patch implements a workaround for erratum 764369 affecting Cortex-A9 MPCore with two or more processors (all current revisions). Under certain timing circumstances, a data cache line maintenance operation by MVA targeting an Inner Shareable memory region may fail to proceed up to either the Point of Coherency or to the Point of Unification of the system. This workaround adds a DSB instruction before the relevant cache maintenance functions and sets a specific bit in the diagnostic control register of the SCU. Cc: <stable@kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-07ARM: mm: cache-v7: Use the new processor struct macrosDave Martin1-13/+2
Signed-off-by: Dave Martin <dave.martin@linaro.org>
2011-05-26ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_areaWill Deacon1-0/+2
The v6 and v7 implementations of flush_kern_dcache_area do not align the passed MVA to the size of a cacheline in the data cache. If a misaligned address is used, only a subset of the requested area will be flushed. This has been observed to cause failures in SMP boot where the secondary_data initialised by the primary CPU is not cacheline aligned, causing the secondary CPUs to read incorrect values for their pgd and stack pointers. This patch ensures that the base address is cacheline aligned before flushing the d-cache. Cc: <stable@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-03-31Fix common misspellingsLucas De Marchi1-1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-12-13ARM: 6528/1: Use CTR for the I-cache line size on ARMv7Catalin Marinas1-10/+17
The current implementation of the v7_coherent_*_range function assumes that the D and I cache lines have the same size, which is incorrect architecturally. This patch adds the icache_line_size macro which reads the CTR register. The main loop in v7_coherent_*_range is split in two independent loops or the D and I caches. This also has the performance advantage that the DSB is moved outside the main loop. Reported-by: Kevin Sapp <ksapp@quicinc.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04ARM: 6405/1: Handle __flush_icache_all for CONFIG_SMP_ON_UPTony Lindgren1-0/+16
Do this by adding flush_icache_all to cache_fns for ARMv6 and 7. As flush_icache_all may neeed to be called from flush_kern_cache_all, add it as the first entry in the cache_fns. Note that now we can remove the ARM_ERRATA_411920 dependency to !SMP so it can be selected on UP ARMv6 processors, such as omap2. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Anand Gadiyar <gadiyar@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04ARM: Allow SMP kernels to boot on UP systemsRussell King1-10/+4
UP systems do not implement all the instructions that SMP systems have, so in order to boot a SMP kernel on a UP system, we need to rewrite parts of the kernel. Do this using an 'alternatives' scheme, where the kernel code and data is modified prior to initialization to replace the SMP instructions, thereby rendering the problematical code ineffectual. We use the linker to generate a list of 32-bit word locations and their replacement values, and run through these replacements when we detect a UP system. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-21ARM: 6139/1: ARMv7: Use the Inner Shareable I-cache on MPSantosh Shilimkar1-0/+4
This patch fixes the flush_cache_all for ARMv7 SMP.It was missing from commit b8349b569aae661dea9d59d7d2ee587ccea3336c Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-08ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMPCatalin Marinas1-0/+4
The standard I-cache Invalidate All (ICIALLU) and Branch Predication Invalidate All (BPIALL) operations are not automatically broadcast to the other CPUs in an ARMv7 MP system. The patch adds the Inner Shareable variants, ICIALLUIS and BPIALLIS, if ARMv7 and SMP. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-02-15ARM: dma-mapping: fix for speculative prefetchingRussell King1-4/+6
ARMv6 and ARMv7 CPUs can perform speculative prefetching, which makes DMA cache coherency handling slightly more interesting. Rather than being able to rely upon the CPU not accessing the DMA buffer until DMA has completed, we now must expect that the cache could be loaded with possibly stale data from the DMA buffer. Where DMA involves data being transferred to the device, we clean the cache before handing it over for DMA, otherwise we invalidate the buffer to get rid of potential writebacks. On DMA Completion, if data was transferred from the device, we invalidate the buffer to get rid of any stale speculative prefetches. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15ARM: dma-mapping: remove dmac_clean_range and dmac_inv_rangeRussell King1-4/+2
These are now unused, and so can be removed. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15ARM: dma-mapping: provide per-cpu type map/unmap functionsRussell King1-0/+26
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2009-12-14ARM: add size argument to __cpuc_flush_dcache_pageRussell King1-6/+7
... and rename the function since it no longer operates on just pages. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-07ARM: 5746/1: Handle possible translation errors in ARMv6/v7 coherent_user_rangeCatalin Marinas1-2/+17
This is needed because applications using the sys_cacheflush system call can pass a memory range which isn't mapped yet even though the corresponding vma is valid. The patch also adds unwinding annotations for correct backtraces from the coherent_user_range() functions. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-07-24Thumb-2: Implement the unified arch/arm/mm supportCatalin Marinas1-5/+11
This patch adds the ARM/Thumb-2 unified support to the arch/arm/mm/* files. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-11-06ARMv7: Add extra barriers for flush_cache_all compressed/head.SCatalin Marinas1-0/+2
The flush_cache_all function on ARMv7 is implemented as a series of cache operations by set/way. These are not guaranteed to be ordered with previous memory accesses, requiring a DMB. This patch also adds barriers for the TLB operations in compressed/head.S Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-09-01[ARM] 5227/1: Add the ENDPROC declarations to the .S filesCatalin Marinas1-0/+10
This declaration specifies the "function" type and size for various assembly functions, mainly needed for generating the correct branch instructions in Thumb-2. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-09[ARM] armv7: add support for ARMv7 cores.Catalin Marinas1-0/+253
This patch adds support for the ARMv7 cores. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>