summaryrefslogtreecommitdiff
path: root/arch/arm/include/asm
AgeCommit message (Collapse)AuthorFilesLines
2022-05-24Merge tag 'random-5.19-rc1-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "These updates continue to refine the work began in 5.17 and 5.18 of modernizing the RNG's crypto and streamlining and documenting its code. New for 5.19, the updates aim to improve entropy collection methods and make some initial decisions regarding the "premature next" problem and our threat model. The cloc utility now reports that random.c is 931 lines of code and 466 lines of comments, not that basic metrics like that mean all that much, but at the very least it tells you that this is very much a manageable driver now. Here's a summary of the various updates: - The random_get_entropy() function now always returns something at least minimally useful. This is the primary entropy source in most collectors, which in the best case expands to something like RDTSC, but prior to this change, in the worst case it would just return 0, contributing nothing. For 5.19, additional architectures are wired up, and architectures that are entirely missing a cycle counter now have a generic fallback path, which uses the highest resolution clock available from the timekeeping subsystem. Some of those clocks can actually be quite good, despite the CPU not having a cycle counter of its own, and going off-core for a stamp is generally thought to increase jitter, something positive from the perspective of entropy gathering. Done very early on in the development cycle, this has been sitting in next getting some testing for a while now and has relevant acks from the archs, so it should be pretty well tested and fine, but is nonetheless the thing I'll be keeping my eye on most closely. - Of particular note with the random_get_entropy() improvements is MIPS, which, on CPUs that lack the c0 count register, will now combine the high-speed but short-cycle c0 random register with the lower-speed but long-cycle generic fallback path. - With random_get_entropy() now always returning something useful, the interrupt handler now collects entropy in a consistent construction. - Rather than comparing two samples of random_get_entropy() for the jitter dance, the algorithm now tests many samples, and uses the amount of differing ones to determine whether or not jitter entropy is usable and how laborious it must be. The problem with comparing only two samples was that if the cycle counter was extremely slow, but just so happened to be on the cusp of a change, the slowness wouldn't be detected. Taking many samples fixes that to some degree. This, combined with the other improvements to random_get_entropy(), should make future unification of /dev/random and /dev/urandom maybe more possible. At the very least, were we to attempt it again today (we're not), it wouldn't break any of Guenter's test rigs that broke when we tried it with 5.18. So, not today, but perhaps down the road, that's something we can revisit. - We attempt to reseed the RNG immediately upon waking up from system suspend or hibernation, making use of the various timestamps about suspend time and such available, as well as the usual inputs such as RDRAND when available. - Batched randomness now falls back to ordinary randomness before the RNG is initialized. This provides more consistent guarantees to the types of random numbers being returned by the various accessors. - The "pre-init injection" code is now gone for good. I suspect you in particular will be happy to read that, as I recall you expressing your distaste for it a few months ago. Instead, to avoid a "premature first" issue, while still allowing for maximal amount of entropy availability during system boot, the first 128 bits of estimated entropy are used immediately as it arrives, with the next 128 bits being buffered. And, as before, after the RNG has been fully initialized, it winds up reseeding anyway a few seconds later in most cases. This resulted in a pretty big simplification of the initialization code and let us remove various ad-hoc mechanisms like the ugly crng_pre_init_inject(). - The RNG no longer pretends to handle the "premature next" security model, something that various academics and other RNG designs have tried to care about in the past. After an interesting mailing list thread, these issues are thought to be a) mainly academic and not practical at all, and b) actively harming the real security of the RNG by delaying new entropy additions after a potential compromise, making a potentially bad situation even worse. As well, in the first place, our RNG never even properly handled the premature next issue, so removing an incomplete solution to a fake problem was particularly nice. This allowed for numerous other simplifications in the code, which is a lot cleaner as a consequence. If you didn't see it before, https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a thread worth skimming through. - While the interrupt handler received a separate code path years ago that avoids locks by using per-cpu data structures and a faster mixing algorithm, in order to reduce interrupt latency, input and disk events that are triggered in hardirq handlers were still hitting locks and more expensive algorithms. Those are now redirected to use the faster per-cpu data structures. - Rather than having the fake-crypto almost-siphash-based random32 implementation be used right and left, and in many places where cryptographically secure randomness is desirable, the batched entropy code is now fast enough to replace that. - As usual, numerous code quality and documentation cleanups. For example, the initialization state machine now uses enum symbolic constants instead of just hard coding numbers everywhere. - Since the RNG initializes once, and then is always initialized thereafter, a pretty heavy amount of code used during that initialization is never used again. It is now completely cordoned off using static branches and it winds up in the .text.unlikely section so that it doesn't reduce cache compactness after the RNG is ready. - A variety of functions meant for waiting on the RNG to be initialized were only used by vsprintf, and in not a particularly optimal way. Replacing that usage with a more ordinary setup made it possible to remove those functions. - A cleanup of how we warn userspace about the use of uninitialized /dev/urandom and uninitialized get_random_bytes() usage. Interestingly, with the change you merged for 5.18 that attempts to use jitter (but does not block if it can't), the majority of users should never see those warnings for /dev/urandom at all now, and the one for in-kernel usage is mainly a debug thing. - The file_operations struct for /dev/[u]random now implements .read_iter and .write_iter instead of .read and .write, allowing it to also implement .splice_read and .splice_write, which makes splice(2) work again after it was broken here (and in many other places in the tree) during the set_fs() removal. This was a bit of a last minute arrival from Jens that hasn't had as much time to bake, so I'll be keeping my eye on this as well, but it seems fairly ordinary. Unfortunately, read_iter() is around 3% slower than read() in my tests, which I'm not thrilled about. But Jens and Al, spurred by this observation, seem to be making progress in removing the bottlenecks on the iter paths in the VFS layer in general, which should remove the performance gap for all drivers. - Assorted other bug fixes, cleanups, and optimizations. - A small SipHash cleanup" * tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits) random: check for signals after page of pool writes random: wire up fops->splice_{read,write}_iter() random: convert to using fops->write_iter() random: convert to using fops->read_iter() random: unify batched entropy implementations random: move randomize_page() into mm where it belongs random: remove mostly unused async readiness notifier random: remove get_random_bytes_arch() and add rng_has_arch_random() random: move initialization functions out of hot pages random: make consistent use of buf and len random: use proper return types on get_random_{int,long}_wait() random: remove extern from functions in header random: use static branch for crng_ready() random: credit architectural init the exact amount random: handle latent entropy and command line from random_init() random: use proper jiffies comparison macro random: remove ratelimiting for in-kernel unseeded randomness random: move initialization out of reseeding hot path random: avoid initializing twice in credit race random: use symbolic constants for crng_init states ...
2022-05-24Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-24/+22
Pull ARM updates from Russell King: - amba bus updates - simplify ldr_this_cpu assembler macro for uniprocessor builds - avoid explicit assembler literal loads - more spectre-bhb improvements - add Cortex-A9 Errata 764319 workaround - add all unwind tables for modules * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9204/2: module: Add all unwind tables when load module ARM: 9206/1: A9: Add ARM ERRATA 764319 workaround (Updated) ARM: 9201/1: spectre-bhb: rely on linker to emit cross-section literal loads ARM: 9200/1: spectre-bhb: avoid cross-subsection jump using a numbered label ARM: 9199/1: spectre-bhb: use local DSB and elide ISB in loop8 sequence ARM: 9198/1: spectre-bhb: simplify BPIALL vector macro ARM: 9195/1: entry: avoid explicit literal loads ARM: 9194/1: assembler: simplify ldr_this_cpu for !SMP builds ARM: 9192/1: amba: fix memory leak in amba_device_try_add() ARM: 9193/1: amba: Add amba_read_periphid() helper
2022-05-24Merge tag 'irq-core-2022-05-23' of ↵Linus Torvalds1-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt handling updates from Thomas Gleixner: "Core code: - Make the managed interrupts more robust by shutting them down in the core code when the assigned affinity mask does not contain online CPUs. - Make the irq simulator chip work on RT - A small set of cpumask and power manageent cleanups Drivers: - A set of changes which mark GPIO interrupt chips immutable to prevent the GPIO subsystem from modifying it under the hood. This provides the necessary infrastructure and converts a set of GPIO and pinctrl drivers over. - A set of changes to make the pseudo-NMI handling for GICv3 more robust: a missing barrier and consistent handling of the priority mask. - Another set of GICv3 improvements and fixes, but nothing outstanding - The usual set of improvements and cleanups all over the place - No new irqchip drivers and not even a new device tree binding! 100+ interrupt chips are truly enough" * tag 'irq-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) irqchip: Add Kconfig symbols for sunxi drivers irqchip/gic-v3: Fix priority mask handling irqchip/gic-v3: Refactor ISB + EOIR at ack time irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling genirq/irq_sim: Make the irq_work always run in hard irq context irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x irqchip/gic: Improved warning about incorrect type irqchip/csky: Return true/false (not 1/0) from bool functions irqchip/imx-irqsteer: Add runtime PM support irqchip/imx-irqsteer: Constify irq_chip struct irqchip/armada-370-xp: Enable MSI affinity configuration irqchip/aspeed-scu-ic: Fix irq_of_parse_and_map() return value irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value irqchip/sun6i-r: Use NULL for chip_data irqchip/xtensa-mx: Fix initial IRQ affinity in non-SMP setup irqchip/exiu: Fix acknowledgment of edge triggered interrupts irqchip/gic-v3: Claim iomem resources dt-bindings: interrupt-controller: arm,gic-v3: Make the v2 compat requirements explicit irqchip/gic-v3: Relax polling of GIC{R,D}_CTLR.RWP irqchip/gic-v3: Detect LPI invalidation MMIO registers ...
2022-05-20ARM: 9204/2: module: Add all unwind tables when load moduleChen Zhongjin2-13/+5
For EABI stack unwinding, when loading .ko module the EXIDX sections will be added to a unwind_table list. However not all EXIDX sections are added because EXIDX sections are searched by hardcoded section names. For functions in other sections such as .ref.text or .kprobes.text, gcc generates seprated EXIDX sections (such as .ARM.exidx.ref.text or .ARM.exidx.kprobes.text). These extra EXIDX sections are not loaded, so when unwinding functions in these sections, we will failed with: unwind: Index not found xxx To fix that, I refactor the code for searching and adding EXIDX sections: - Check section type to search EXIDX tables (0x70000001) instead of strcmp() the hardcoded names. Then find the corresponding text sections by their section names. - Add a unwind_table list in module->arch to save their own unwind_table instead of the fixed-lenth array. - Save .ARM.exidx.init.text section ptr, because it should be cleaned after module init. Now all EXIDX sections of .ko can be added correctly. Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-20ARM: 9195/1: entry: avoid explicit literal loadsArd Biesheuvel1-9/+9
ARMv7 has MOVW/MOVT instruction pairs to load symbol addresses into registers without having to rely on literal loads that go via the D-cache. For older cores, we now support a similar arrangement, based on PC-relative group relocations. This means we can elide most literal loads entirely from the entry path, by switching to the ldr_va macro to emit the appropriate sequence depending on the target architecture revision. While at it, switch to the bl_r macro for invoking the right PABT/DABT helpers instead of setting the LR register explicitly, which does not play well with cores that speculate across function returns. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-20ARM: 9194/1: assembler: simplify ldr_this_cpu for !SMP buildsArd Biesheuvel1-4/+10
When CONFIG_SMP is not defined, the CPU offset is always zero, and so we can simplify the sequence to load a per-CPU variable. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-05-15irqchip/gic-v3: Refactor ISB + EOIR at ack timeMark Rutland1-6/+1
There are cases where a context synchronization event is necessary between an IRQ being raised and being handled, and there are races such that we cannot rely upon the exception entry being subsequent to the interrupt being raised. To fix this, we place an ISB between a read of IAR and the subsequent invocation of an IRQ handler. When EOI mode 1 is in use, we need to EOI an interrupt prior to invoking its handler, and we have a write to EOIR for this. As this write to EOIR requires an ISB, and this is provided by the gic_write_eoir() helper, we omit the usual ISB in this case, with the logic being: | if (static_branch_likely(&supports_deactivate_key)) | gic_write_eoir(irqnr); | else | isb(); This is somewhat opaque, and it would be a little clearer if there were an unconditional ISB, with only the write to EOIR being conditional, e.g. | if (static_branch_likely(&supports_deactivate_key)) | write_gicreg(irqnr, ICC_EOIR1_EL1); | | isb(); This patch rewrites the code that way, with this logic factored into a new helper function with comments explaining what the ISB is for, as were originally laid out in commit: 39a06b67c2c1256b ("irqchip/gic: Ensure we have an ISB between ack and ->handle_irq") Note that since then, we removed the IAR polling in commit: 342677d70ab92142 ("irqchip/gic-v3: Remove acknowledge loop") ... which removed one of the two race conditions. For consistency, other portions of the driver are made to manipulate EOIR using write_gicreg() and explcit ISBs, and the gic_write_eoir() helper function is removed. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220513133038.226182-3-mark.rutland@arm.com
2022-05-14arm: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld1-0/+1
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-11swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on armChristoph Hellwig1-2/+0
swiotlb-xen uses very different ways to allocate coherent memory on x86 vs arm. On the former it allocates memory from the page allocator, while on the later it reuses the dma-direct allocator the handles the complexities of non-coherent DMA on arm platforms. Unfortunately the complexities of trying to deal with the two cases in the swiotlb-xen.c code lead to a bug in the handling of DMA_ATTR_NO_KERNEL_MAPPING on arm. With the DMA_ATTR_NO_KERNEL_MAPPING flag the coherent memory allocator does not actually allocate coherent memory, but just a DMA handle for some memory that is DMA addressable by the device, but which does not have to have a kernel mapping. Thus dereferencing the return value will lead to kernel crashed and memory corruption. Fix this by using the dma-direct allocator directly for arm, which works perfectly fine because on arm swiotlb-xen is only used when the domain is 1:1 mapped, and then simplifying the remaining code to only cater for the x86 case with DMA coherent device. Reported-by: Rahul Singh <Rahul.Singh@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Rahul Singh <rahul.singh@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Rahul Singh <rahul.singh@arm.com>
2022-05-10arm[64]/memremap: don't abuse pfn_valid() to ensure presence of linear mapMike Rapoport1-0/+3
The semantics of pfn_valid() is to check presence of the memory map for a PFN and not whether a PFN is covered by the linear map. The memory map may be present for NOMAP memory regions, but they won't be mapped in the linear mapping. Accessing such regions via __va() when they are memremap()'ed will cause a crash. On v5.4.y the crash happens on qemu-arm with UEFI [1]: <1>[ 0.084476] 8<--- cut here --- <1>[ 0.084595] Unable to handle kernel paging request at virtual address dfb76000 <1>[ 0.084938] pgd = (ptrval) <1>[ 0.085038] [dfb76000] *pgd=5f7fe801, *pte=00000000, *ppte=00000000 ... <4>[ 0.093923] [<c0ed6ce8>] (memcpy) from [<c16a06f8>] (dmi_setup+0x60/0x418) <4>[ 0.094204] [<c16a06f8>] (dmi_setup) from [<c16a38d4>] (arm_dmi_init+0x8/0x10) <4>[ 0.094408] [<c16a38d4>] (arm_dmi_init) from [<c0302e9c>] (do_one_initcall+0x50/0x228) <4>[ 0.094619] [<c0302e9c>] (do_one_initcall) from [<c16011e4>] (kernel_init_freeable+0x15c/0x1f8) <4>[ 0.094841] [<c16011e4>] (kernel_init_freeable) from [<c0f028cc>] (kernel_init+0x8/0x10c) <4>[ 0.095057] [<c0f028cc>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c) On kernels v5.10.y and newer the same crash won't reproduce on ARM because commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") changed the way memory regions are registered in the resource tree, but that merely covers up the problem. On ARM64 memory resources registered in yet another way and there the issue of wrong usage of pfn_valid() to ensure availability of the linear map is also covered. Implement arch_memremap_can_ram_remap() on ARM and ARM64 to prevent access to NOMAP regions via the linear mapping in memremap(). Link: https://lore.kernel.org/all/Yl65zxGgFzF1Okac@sirena.org.uk Link: https://lkml.kernel.org/r/20220426060107.7618-1-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reported-by: "kernelci.org bot" <bot@kernelci.org> Tested-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Brown <broonie@kernel.org> Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> [5.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-07ARM: pxa/sa1100: move I/O space to PCI_IOBASEArnd Bergmann1-23/+4
PXA and StrongARM1100 traditionally map their I/O space 1:1 into virtual memory, using a per-bus io_offset that matches the base address of the ioremap mapping. In order for PXA to work in a multiplatform config, this needs to change so I/O space starts at PCI_IOBASE (0xfee00000). Since the pcmcia soc_common support is shared with StrongARM1100, both have to change at the same time. The affected machines are: - Anything with a PCMCIA slot now uses pci_remap_iospace, which is made available to PCMCIA configurations as well, rather than just PCI. The first PCMCIA slot now starts at port number 0x10000. - The Zeus and Viper platforms have PC/104-style ISA buses, which have a static mapping for both I/O and memory space at 0xf1000000, which can no longer work. It does not appear to have any in-tree users, so moving it to port number 0 makes them behave like a traditional PC. - SA1100 does support ISA slots in theory, but all machines that originally enabled this appear to have been removed from the tree ages ago, and the I/O space is never mapped anywhere. - The Nanoengine machine has support for PCI slots, but looks like this never included I/O space, the resources only define the location for memory and config space. With this, the definitions of __io() and IO_SPACE_LIMIT can be simplified, as the only remaining cases are the generic PCI_IOBASE and the custom inb()/outb() macros on RiscPC. S3C24xx still has a custom inb()/outb() in this here, but this is already removed in another branch. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-07ARM: pxa: remove unused mach/bitfield.hArnd Bergmann1-2/+0
The sa1111.h header defines some constants using the bitfield macros, but those are only used on sa1100, not on pxa, and the users include the bitfield header through mach/hardware.h. Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-04-02Merge branch 'work.misc' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted bits and pieces" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: drop needless assignment in aio_read() clean overflow checks in count_mounts() a bit seq_file: fix NULL pointer arithmetic warning uml/x86: use x86 load_unaligned_zeropad() asm/user.h: killed unused macros constify struct path argument of finish_automount()/do_add_mount() fs: Remove FIXME comment in generic_write_checks()
2022-03-24Merge tag 'asm-generic-5.18' of ↵Linus Torvalds1-21/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-24Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds20-285/+319
Pull ARM updates from Russell King: "Updates for IRQ stacks and virtually mapped stack support, and ftrace: - Support for IRQ and vmap'ed stacks This covers all the work related to implementing IRQ stacks and vmap'ed stacks for all 32-bit ARM systems that are currently supported by the Linux kernel, including RiscPC and Footbridge. It has been submitted for review in four different waves: - IRQ stacks support for v7 SMP systems [0] - vmap'ed stacks support for v7 SMP systems[1] - extending support for both IRQ stacks and vmap'ed stacks for all remaining configurations, including v6/v7 SMP multiplatform kernels and uniprocessor configurations including v7-M [2] - fixes and updates in [3] - ftrace fixes and cleanups Make all flavors of ftrace available on all builds, regardless of ISA choice, unwinder choice or compiler [4]: - use ADD not POP where possible - fix a couple of Thumb2 related issues - enable HAVE_FUNCTION_GRAPH_FP_TEST for robustness - enable the graph tracer with the EABI unwinder - avoid clobbering frame pointer registers to make Clang happy - Fixes for the above" [0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/ [1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/ [2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/ [3] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/ [4] https://lore.kernel.org/linux-arm-kernel/20220203082204.1176734-1-ardb@kernel.org/ * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (62 commits) ARM: fix building NOMMU ARMv4/v5 kernels ARM: unwind: only permit stack switch when unwinding call_with_stack() ARM: Revert "unwind: dump exception stack from calling frame" ARM: entry: fix unwinder problems caused by IRQ stacks ARM: unwind: set frame.pc correctly for current-thread unwinding ARM: 9184/1: return_address: disable again for CONFIG_ARM_UNWIND=y ARM: 9183/1: unwind: avoid spurious warnings on bogus code addresses Revert "ARM: 9144/1: forbid ftrace with clang and thumb2_kernel" ARM: mach-bcm: disable ftrace in SMC invocation routines ARM: cacheflush: avoid clobbering the frame pointer ARM: kprobes: treat R7 as the frame pointer register in Thumb2 builds ARM: ftrace: enable the graph tracer with the EABI unwinder ARM: unwind: track location of LR value in stack frame ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST ARM: ftrace: avoid unnecessary literal loads ARM: ftrace: avoid redundant loads or clobbering IP ARM: ftrace: use trampolines to keep .init.text in branching range ARM: ftrace: use ADD not POP to counter PUSH at entry ARM: ftrace: ensure that ADR takes the Thumb bit into account ARM: make get_current() and __my_cpu_offset() __always_inline ...
2022-03-23Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-0/+2
Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
2022-03-23Merge tag 'sched-core-2022-03-22' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Cleanups for SCHED_DEADLINE - Tracing updates/fixes - CPU Accounting fixes - First wave of changes to optimize the overhead of the scheduler build, from the fast-headers tree - including placeholder *_api.h headers for later header split-ups. - Preempt-dynamic using static_branch() for ARM64 - Isolation housekeeping mask rework; preperatory for further changes - NUMA-balancing: deal with CPU-less nodes - NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD) - Updates to RSEQ UAPI in preparation for glibc usage - Lots of RSEQ/selftests, for same - Add Suren as PSI co-maintainer * tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits) sched/headers: ARM needs asm/paravirt_api_clock.h too sched/numa: Fix boot crash on arm64 systems headers/prep: Fix header to build standalone: <linux/psi.h> sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y cgroup: Fix suspicious rcu_dereference_check() usage warning sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity() sched/deadline,rt: Remove unused functions for !CONFIG_SMP sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy() sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file sched/deadline: Remove unused def_dl_bandwidth sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE sched/tracing: Don't re-read p->state when emitting sched_switch event sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race sched/cpuacct: Remove redundant RCU read lock sched/cpuacct: Optimize away RCU read lock sched/cpuacct: Fix charge percpu cpuusage sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies ...
2022-03-22sched/headers: ARM needs asm/paravirt_api_clock.h tooRandy Dunlap1-0/+1
Add <asm/paravirt_api_clock.h> for arch/arm/, mapped to <asm/paravirt.h>, to simplify #ifdeffery in generic code. Fixes this build error introduced by the scheduler tree: In file included from ../kernel/sched/core.c:81: ../kernel/sched/sched.h:87:11: fatal error: asm/paravirt_api_clock.h: No such file or directory 87 | # include <asm/paravirt_api_clock.h> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Fixes: 4ff8f2ca6ccd ("sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20220316204146.14000-1-rdunlap@infradead.org
2022-03-22Merge branch 'linus' of ↵Linus Torvalds1-14/+28
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - hwrng core now credits for low-quality RNG devices. Algorithms: - Optimisations for neon aes on arm/arm64. - Add accelerated crc32_be on arm64. - Add ffdheXYZ(dh) templates. - Disallow hmac keys < 112 bits in FIPS mode. - Add AVX assembly implementation for sm3 on x86. Drivers: - Add missing local_bh_disable calls for crypto_engine callback. - Ensure BH is disabled in crypto_engine callback path. - Fix zero length DMA mappings in ccree. - Add synchronization between mailbox accesses in octeontx2. - Add Xilinx SHA3 driver. - Add support for the TDES IP available on sama7g5 SoC in atmel" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits) crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list crypto: dh - Remove the unused function dh_safe_prime_dh_alg() hwrng: nomadik - Change clk_disable to clk_disable_unprepare crypto: arm64 - cleanup comments crypto: qat - fix initialization of pfvf rts_map_msg structures crypto: qat - fix initialization of pfvf cap_msg structures crypto: qat - remove unneeded assignment crypto: qat - disable registration of algorithms crypto: hisilicon/qm - fix memset during queues clearing crypto: xilinx: prevent probing on non-xilinx hardware crypto: marvell/octeontx - Use swap() instead of open coding it crypto: ccree - Fix use after free in cc_cipher_exit() crypto: ccp - ccp_dmaengine_unregister release dma channels crypto: octeontx2 - fix missing unlock hwrng: cavium - fix NULL but dereferenced coccicheck error crypto: cavium/nitrox - don't cast parameter in bit operations crypto: vmx - add missing dependencies MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver crypto: xilinx - Add Xilinx SHA3 driver ...
2022-03-21arch: Add pmd_pfn() where it is missingMike Rapoport1-0/+2
We need to use this function in common code, so define it for architectures and/or configrations that miss it. The result of pmd_pfn() will only be used if TRANSPARENT_HUGEPAGE is enabled, but a function or macro called pmd_pfn() must be defined, even on machines with two level page tables. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-11ARM: Spectre-BHB: provide empty stub for non-configRandy Dunlap1-0/+6
When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references to spectre_v2_update_state() cause a build error, so provide an empty stub for that function when the Kconfig option is not set. Fixes this build error: arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init': proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state' arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state' Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: patches@armlinux.org.uk Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11ARM: Revert "unwind: dump exception stack from calling frame"Ard Biesheuvel1-10/+0
After simplifying the stack switch code in the IRQ exception handler by deferring the actual stack switch to call_with_stack(), we no longer need to special case the way we dump the exception stack, since it will always be at the top of whichever stack was active when the exception was taken. So revert this special handling for the ARM unwinder. This reverts commit 4ab6827081c63b83011a18d8e27f621ed34b1194. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-10ARM: Do not use NOCROSSREFS directive with ld.lldNathan Chancellor1-0/+8
ld.lld does not support the NOCROSSREFS directive at the moment, which breaks the build after commit b9baf5c8c5c3 ("ARM: Spectre-BHB workaround"): ld.lld: error: ./arch/arm/kernel/vmlinux.lds:34: AT expected, but got NOCROSSREFS Support for this directive will eventually be implemented, at which point a version check can be added. To avoid breaking the build in the meantime, just define NOCROSSREFS to nothing when using ld.lld, with a link to the issue for tracking. Cc: stable@vger.kernel.org Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Link: https://github.com/ClangBuiltLinux/linux/issues/1609 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-09ARM: fix co-processor register typoRussell King (Oracle)1-1/+1
In the recent Spectre BHB patches, there was a typo that is only exposed in certain configurations: mcr p15,0,XX,c7,r5,4 should have been mcr p15,0,XX,c7,c5,4 Reported-by: kernel test robot <lkp@intel.com> Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-08Merge tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-9/+68
Pull ARM spectre fixes from Russell King: "ARM Spectre BHB mitigations. These patches add Spectre BHB migitations for the following Arm CPUs to the 32-bit ARM kernels: - Cortex A15 - Cortex A57 - Cortex A72 - Cortex A73 - Cortex A75 - Brahma B15 for CVE-2022-23960" * tag 'for-linus-bhb' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: include unprivileged BPF status in Spectre V2 reporting ARM: Spectre-BHB workaround ARM: use LOADADDR() to get load address of sections ARM: early traps initialisation ARM: report Spectre v2 status through sysfs
2022-03-07ARM: 9184/1: return_address: disable again for CONFIG_ARM_UNWIND=yArd Biesheuvel1-0/+18
Commit 41918ec82eb6 ("ARM: ftrace: enable the graph tracer with the EABI unwinder") removed the dummy version of return_address() that was provided for the CONFIG_ARM_UNWIND=y case, on the assumption that the removal of the kernel_text_address() call from unwind_frame() in the preceding patch made it safe to do so. However, this turns out not to be the case: Corentin reports warnings about suspicious RCU usage and other strange behavior that seems to originate in the stack unwinding that occurs in return_address(). Given that the function graph tracer (which is what these changes were enabling for CONFIG_ARM_UNWIND=y builds) does not appear to care about this distinction, let's revert return_address() to the old state. Cc: Corentin Labbe <clabbe.montjoie@gmail.com> Fixes: 41918ec82eb6 ("ARM: ftrace: enable the graph tracer with the EABI unwinder") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com> Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05ARM: Spectre-BHB workaroundRussell King (Oracle)3-3/+29
Workaround the Spectre BHB issues for Cortex-A15, Cortex-A57, Cortex-A72, Cortex-A73 and Cortex-A75. We also include Brahma B15 as well to be safe, which is affected by Spectre V2 in the same ways as Cortex-A15. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05ARM: use LOADADDR() to get load address of sectionsRussell King (Oracle)1-7/+12
Use the linker's LOADADDR() macro to get the load address of the sections, and provide a macro to set the start and end symbols. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-03-05ARM: report Spectre v2 status through sysfsRussell King (Oracle)1-0/+28
As per other architectures, add support for reporting the Spectre vulnerability status via sysfs CPU. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-25uaccess: generalize access_ok()Arnd Bergmann1-19/+1
There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had trick to use 33-bit or 65-bit arithmetic on the addresses to calculate the overflow, however this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25uaccess: add generic __{get,put}_kernel_nofaultArnd Bergmann1-2/+0
Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-11lib/xor: make xor prototypes more friendly to compiler vectorizationArd Biesheuvel1-14/+28
Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-09ARM: cacheflush: avoid clobbering the frame pointerArd Biesheuvel1-9/+3
Thumb2 uses R7 rather than R11 as the frame pointer, and even if we rarely use a frame pointer to begin with when building in Thumb2 mode, there are cases where it is required by the compiler (Clang when inserting profiling hooks via -pg) However, preserving and restoring the frame pointer is risky, as any unhandled exceptions raised in the mean time will produce a bogus backtrace, and it would be better not to touch the frame pointer at all. This is the case even when CONFIG_FRAME_POINTER is not set, as the unwind directive used by the unwinder may also use R7 or R11 as the unwind anchor, even if the frame pointer is not managed strictly according to the frame pointer ABI. So let's tweak the cacheflush asm code not to clobber R7 or R11 at all, so that we can drop R7 from the clobber lists of the inline asm blocks that call these routines, and remove the code that preserves/restores R11. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-02-09ARM: ftrace: enable the graph tracer with the EABI unwinderArd Biesheuvel1-18/+0
Enable the function graph tracer in combination with the EABI unwinder, so that Thumb2 builds or Clang ARM builds can make use of it. This involves using the unwinder to locate the return address of an instrumented function on the stack, so that it can be overridden and made to refer to the ftrace handling routines that need to be called at function return. Given that for these builds, it is not guaranteed that the value of the link register is stored on the stack, fall back to the stack slot that will be used by the ftrace exit code to restore LR in the instrumented function's execution context. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-09ARM: unwind: track location of LR value in stack frameArd Biesheuvel1-0/+3
The ftrace graph tracer needs to override the return address of an instrumented function, in order to install a hook that gets invoked when the function returns again. Currently, we only support this when building for ARM using GCC with frame pointers, as in this case, it is guaranteed that the function will reload LR from [FP, #-4] in all cases, and we can simply pass that address to the ftrace code. In order to support this for configurations that rely on the EABI unwinder, such as Thumb2 builds, make the unwinder keep track of the address from which LR was unwound, permitting ftrace to make use of this in a subsequent patch. Drop the call to is_kernel_text_address(), which is problematic in terms of ftrace recursion, given that it may be instrumented itself. The call is redundant anyway, as no unwind directives will be found unless the PC points to memory that is known to contain executable code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-02-09ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TESTArd Biesheuvel1-0/+2
Fix the frame pointer handling in the function graph tracer entry and exit code so we can enable HAVE_FUNCTION_GRAPH_FP_TEST. Instead of using FP directly (which will have different values between the entry and exit pieces of the function graph tracer), use the value of SP at entry and exit, as we can derive the former value from the frame pointer. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-01-31ARM: make get_current() and __my_cpu_offset() __always_inlineArd Biesheuvel2-2/+2
The get_current() and __my_cpu_offset() accessors evaluate to only a single instruction emitted inline, but due to the size of the asm string that is created for SMP+v6 configurations, the compiler assumes otherwise, and may emit the functions out of line instead. So use __always_inline to avoid this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-01-31asm/user.h: killed unused macrosAl Viro1-4/+0
Some of them used to be used by libbfd for a.out coredump handling. Seeing that * libbfd has their copies anyway * we don't export them into userland headers * we don't support a.out coredumps anymore let's bury the definitions. They never had in-kernel users anyway... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-01-25ARM: mm: make vmalloc_seq handling SMP safeArd Biesheuvel3-5/+22
Rework the vmalloc_seq handling so it can be used safely under SMP, as we started using it to ensure that vmap'ed stacks are guaranteed to be mapped by the active mm before switching to a task, and here we need to ensure that changes to the page tables are visible to other CPUs when they observe a change in the sequence count. Since LPAE needs none of this, fold a check against it into the vmalloc_seq counter check after breaking it out into a separate static inline helper. Given that vmap'ed stacks are now also supported on !SMP configurations, let's drop the WARN() that could potentially now fire spuriously. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-01-25ARM: smp: elide HWCAP_TLS checks or __entry_task updates on SMP+v6Ard Biesheuvel2-8/+18
Use the SMP_ON_UP patching framework to elide HWCAP_TLS tests from the context switch and return to userspace code paths, as SMP systems are guaranteed to have this h/w capability. At the same time, omit the update of __entry_task if the system is detected to be UP at runtime, as in that case, the value is never used. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-01-25Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-2/+11
Pull ARM fixes from Russell King: - Fix panic whe both KASAN and KPROBEs are enabled - Avoid alignment faults in copy_*_kernel_nofault() - Align SMP alternatives in modules * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9180/1: Thumb2: align ALT_UP() sections in modules sufficiently ARM: 9179/1: uaccess: avoid alignment faults in copy_[from|to]_kernel_nofault ARM: 9170/1: fix panic when kasan and kprobe are enabled
2022-01-24ARM: assembler: define a Kconfig symbol for group relocation supportArd Biesheuvel3-10/+10
Nathan reports the group relocations go out of range in pathological cases such as allyesconfig kernels, which have little chance of actually booting but are still used in validation. So add a Kconfig symbol for this feature, and make it depend on !COMPILE_TEST. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-01-23Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linuxLinus Torvalds1-1/+0
Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first_*_bit() instead of find_next_*_bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each_*_bit_from() with for_each_*_bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly
2022-01-19ARM: 9180/1: Thumb2: align ALT_UP() sections in modules sufficientlyArd Biesheuvel2-0/+3
When building for Thumb2, the .alt.smp.init sections that are emitted by the ALT_UP() patching code may not be 32-bit aligned, even though the fixup_smp_on_up() routine expects that. This results in alignment faults at module load time, which need to be fixed up by the fault handler. So let's align those sections explicitly, and prevent this from occurring. Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-19ARM: 9179/1: uaccess: avoid alignment faults in copy_[from|to]_kernel_nofaultArd Biesheuvel1-2/+8
The helpers that are used to implement copy_from_kernel_nofault() and copy_to_kernel_nofault() cast a void* to a pointer to a wider type, which may result in alignment faults on ARM if the compiler decides to use double-word or multiple-word load/store instructions. Only configurations that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y are affected, given that commit 2423de2e6f4d ("ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault") ensures that dst and src are sufficiently aligned otherwise. So use the unaligned accessors for accessing dst and src in cases where they may be misaligned. Cc: <stable@vger.kernel.org> # depends on 2423de2e6f4d Fixes: 2df4c9a741a0 ("ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault") Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-01-16Merge tag 'pci-v5.17-changes' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Use pci_find_vsec_capability() instead of open-coding it (Andy Shevchenko) - Convert pci_dev_present() stub from macro to static inline to avoid 'unused variable' errors (Hans de Goede) - Convert sysfs slot attributes from default_attrs to default_groups (Greg Kroah-Hartman) - Use DWORD accesses for LTR, L1 SS to avoid BayHub OZ711LV2 erratum (Rajat Jain) - Remove unnecessary initialization of static variables (Longji Guo) Resource management: - Always write Intel I210 ROM BAR on update to work around device defect (Bjorn Helgaas) PCIe native device hotplug: - Fix pciehp lockdep errors on Thunderbolt undock (Hans de Goede) - Fix infinite loop in pciehp IRQ handler on power fault (Lukas Wunner) Power management: - Convert amd64-agp, sis-agp, via-agp from legacy PCI power management to generic power management (Vaibhav Gupta) IOMMU: - Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller so it can work with an IOMMU (Yifeng Li) Error handling: - Add PCI_ERROR_RESPONSE and related definitions for signaling and checking for transaction errors on PCI (Naveen Naidu) - Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers, instead of in host controller drivers, when transactions fail on PCI (Naveen Naidu) - Use PCI_POSSIBLE_ERROR() to check for possible failure of config reads (Naveen Naidu) Peer-to-peer DMA: - Add Logan Gunthorpe as P2PDMA maintainer (Bjorn Helgaas) ASPM: - Calculate link L0s and L1 exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Calculate device L0s and L1 acceptable exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Remove struct aspm_latency since it's no longer needed (Saheed O. Bolarinwa) APM X-Gene PCIe controller driver: - Fix IB window setup, which was broken by the fact that IB resources are now sorted in address order instead of DT dma-ranges order (Rob Herring) Apple PCIe controller driver: - Enable clock gating to save power (Hector Martin) - Fix REFCLK1 enable/poll logic (Hector Martin) Broadcom STB PCIe controller driver: - Declare bitmap correctly for use by bitmap interfaces (Christophe JAILLET) - Clean up computation of legacy and non-legacy MSI bitmasks (Florian Fainelli) - Update suspend/resume/remove error handling to warn about errors and not fail the operation (Jim Quinlan) - Correct the "pcie" and "msi" interrupt descriptions in DT binding (Jim Quinlan) - Add DT bindings for endpoint voltage regulators (Jim Quinlan) - Split brcm_pcie_setup() into two functions (Jim Quinlan) - Add mechanism for turning on voltage regulators for connected devices (Jim Quinlan) - Turn voltage regulators for connected devices on/off when bus is added or removed (Jim Quinlan) - When suspending, don't turn off voltage regulators for wakeup devices (Jim Quinlan) Freescale i.MX6 PCIe controller driver: - Add i.MX8MM support (Richard Zhu) Freescale Layerscape PCIe controller driver: - Use DWC common ops instead of layerscape-specific link-up functions (Hou Zhiqiang) Intel VMD host bridge driver: - Honor platform ACPI _OSC feature negotiation for Root Ports below VMD (Kai-Heng Feng) - Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan) - Reset everything below VMD before enumerating to work around failure to enumerate NVMe devices when guest OS reboots (Nirmal Patel) Bridge emulation (used by Marvell Aardvark and MVEBU): - Make emulated ROM BAR read-only by default (Pali Rohár) - Make some emulated legacy PCI bits read-only for PCIe devices (Pali Rohár) - Update reserved bits in emulated PCIe Capability (Pali Rohár) - Allow drivers to emulate different PCIe Capability versions (Pali Rohár) - Set emulated Capabilities List bit for all PCIe devices, since they must have at least a PCIe Capability (Pali Rohár) Marvell Aardvark PCIe controller driver: - Add bridge emulation definitions for PCIe DEVCAP2, DEVCTL2, DEVSTA2, LNKCAP2, LNKCTL2, LNKSTA2, SLTCAP2, SLTCTL2, SLTSTA2 (Pali Rohár) - Add aardvark support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers (Pali Rohár) - Clear all MSIs at setup to avoid spurious interrupts (Pali Rohár) - Disable bus mastering when unbinding host controller driver (Pali Rohár) - Mask all interrupts when unbinding host controller driver (Pali Rohár) - Fix memory leak in host controller unbind (Pali Rohár) - Assert PERST# when unbinding host controller driver (Pali Rohár) - Disable link training when unbinding host controller driver (Pali Rohár) - Disable common PHY when unbinding host controller driver (Pali Rohár) - Fix resource type checking to check only IORESOURCE_MEM, not IORESOURCE_MEM_64, which is a flavor of IORESOURCE_MEM (Pali Rohár) Marvell MVEBU PCIe controller driver: - Implement pci_remap_iospace() for ARM so mvebu can use devm_pci_remap_iospace() instead of the previous ARM-specific pci_ioremap_io() interface (Pali Rohár) - Use the standard pci_host_probe() instead of the device-specific mvebu_pci_host_probe() (Pali Rohár) - Replace all uses of ARM-specific pci_ioremap_io() with the ARM implementation of the standard pci_remap_iospace() interface and remove pci_ioremap_io() (Pali Rohár) - Skip initializing invalid Root Ports (Pali Rohár) - Check for errors from pci_bridge_emul_init() (Pali Rohár) - Ignore any bridges at non-zero function numbers (Pali Rohár) - Return ~0 data for invalid config read size (Pali Rohár) - Disallow mapping interrupts on emulated bridges (Pali Rohár) - Clear Root Port Memory & I/O Space Enable and Bus Master Enable at initialization (Pali Rohár) - Make type bits in Root Port I/O Base register read-only (Pali Rohár) - Disable Root Port windows when base/limit set to invalid values (Pali Rohár) - Set controller to Root Complex mode (Pali Rohár) - Set Root Port Class Code to PCI Bridge (Pali Rohár) - Update emulated Root Port secondary bus numbers to better reflect the actual topology (Pali Rohár) - Add PCI_BRIDGE_CTL_BUS_RESET support to emulated Root Ports so pci_reset_secondary_bus() can reset connected devices (Pali Rohár) - Add PCI_EXP_DEVCTL Error Reporting Enable support to emulated Root Ports (Pali Rohár) - Add PCI_EXP_RTSTA PME Status bit support to emulated Root Ports (Pali Rohár) - Add DEVCAP2, DEVCTL2 and LNKCTL2 support to emulated Root Ports on Armada XP and newer devices (Pali Rohár) - Export mvebu-mbus.c symbols to allow pci-mvebu.c to be a module (Pali Rohár) - Add support for compiling as a module (Pali Rohár) MediaTek PCIe controller driver: - Assert PERST# for 100ms to allow power and clock to stabilize (qizhong cheng) MediaTek PCIe Gen3 controller driver: - Disable Mediatek DVFSRC voltage request since lack of DVFSRC to respond to the request causes failure to exit L1 PM Substate (Jianjun Wang) MediaTek MT7621 PCIe controller driver: - Declare mt7621_pci_ops static (Sergio Paracuellos) - Give pcibios_root_bridge_prepare() access to host bridge windows (Sergio Paracuellos) - Move MIPS I/O coherency unit setup from driver to pcibios_root_bridge_prepare() (Sergio Paracuellos) - Add missing MODULE_LICENSE() (Sergio Paracuellos) - Allow COMPILE_TEST for all arches (Sergio Paracuellos) Microsoft Hyper-V host bridge driver: - Add hv-internal interfaces to encapsulate arch IRQ dependencies (Sunil Muthuswamy) - Add arm64 Hyper-V vPCI support (Sunil Muthuswamy) Qualcomm PCIe controller driver: - Undo PM setup in qcom_pcie_probe() error handling path (Christophe JAILLET) - Use __be16 type to store return value from cpu_to_be16() (Manivannan Sadhasivam) - Constify static dw_pcie_ep_ops (Rikard Falkeborn) Renesas R-Car PCIe controller driver: - Fix aarch32 abort handler so it doesn't check the wrong bus clock before accessing the host controller (Marek Vasut) TI Keystone PCIe controller driver: - Add register offset for ti,syscon-pcie-id and ti,syscon-pcie-mode DT properties (Kishon Vijay Abraham I) MicroSemi Switchtec management driver: - Add Gen4 automotive device IDs (Kelvin Cao) - Declare state_names[] as static so it's not allocated and initialized for every call (Kelvin Cao) Host controller driver cleanups: - Use of_device_get_match_data(), not of_match_device(), when we only need the device data in altera, artpec6, cadence, designware-plat, dra7xx, keystone, kirin (Fan Fei) - Drop pointless of_device_get_match_data() cast in j721e (Bjorn Helgaas) - Drop redundant struct device * from j721e since struct cdns_pcie already has one (Bjorn Helgaas) - Rename driver structs to *_pcie in intel-gw, iproc, ls-gen4, mediatek-gen3, microchip, mt7621, rcar-gen2, tegra194, uniphier, xgene, xilinx, xilinx-cpm for consistency across drivers (Fan Fei) - Fix invalid address space conversions in hisi, spear13xx (Bjorn Helgaas) Miscellaneous: - Sort Intel Device IDs by value (Andy Shevchenko) - Change Capability offsets to hex to match spec (Baruch Siach) - Correct misspellings (Krzysztof Wilczyński) - Terminate statement with semicolon in pci_endpoint_test.c (Ming Wang)" * tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (151 commits) PCI: mt7621: Allow COMPILE_TEST for all arches PCI: mt7621: Add missing MODULE_LICENSE() PCI: mt7621: Move MIPS setup to pcibios_root_bridge_prepare() PCI: Let pcibios_root_bridge_prepare() access bridge->windows PCI: mt7621: Declare mt7621_pci_ops static PCI: brcmstb: Do not turn off WOL regulators on suspend PCI: brcmstb: Add control of subdevice voltage regulators PCI: brcmstb: Add mechanism to turn on subdev regulators PCI: brcmstb: Split brcm_pcie_setup() into two funcs dt-bindings: PCI: Add bindings for Brcmstb EP voltage regulators dt-bindings: PCI: Correct brcmstb interrupts, interrupt-map. PCI: brcmstb: Fix function return value handling PCI: brcmstb: Do not use __GENMASK PCI: brcmstb: Declare 'used' as bitmap, not unsigned long PCI: hv: Add arm64 Hyper-V vPCI support PCI: hv: Make the code arch neutral by adding arch specific interfaces PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors x86/PCI: Remove initialization of static variables to false PCI: Use DWORD accesses for LTR, L1 SS to avoid erratum misc: pci_endpoint_test: Terminate statement with semicolon ...
2022-01-15include: move find.h from asm_generic to linuxYury Norov1-1/+0
find_bit API and bitmap API are closely related, but inclusion paths are different - include/asm-generic and include/linux, correspondingly. In the past it made a lot of troubles due to circular dependencies and/or undefined symbols. Fix this by moving find.h under include/linux. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-01-12Merge tag 'locking_core_for_v5.17_rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Borislav Petkov: "Lots of cleanups and preparation. Highlights: - futex: Cleanup and remove runtime futex_cmpxchg detection - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and annotate the racy owner->on_cpu access *once*. - atomic64: Dead-Code-Elemination" [ Description above by Peter Zijlstra ] * tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: atomic64: Remove unusable atomic ops futex: Fix additional regressions locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h x86/mm: Include spinlock_t definition in pgtable. locking: Mark racy reads of owner->on_cpu locking: Make owner_on_cpu() into <linux/sched.h> lockdep/selftests: Adapt ww-tests for PREEMPT_RT lockdep/selftests: Skip the softirq related tests on PREEMPT_RT lockdep/selftests: Unbalanced migrate_disable() & rcu_read_lock(). lockdep/selftests: Avoid using local_lock_{acquire|release}(). lockdep: Remove softirq accounting on PREEMPT_RT. locking/rtmutex: Add rt_mutex_lock_nest_lock() and rt_mutex_lock_killable(). locking/rtmutex: Squash self-deadlock check for ww_rt_mutex. locking: Remove rt_rwlock_is_contended(). sched: Trigger warning if ->migration_disabled counter underflows. futex: Fix sparc32/m68k/nds32 build regression futex: Remove futex_cmpxchg detection futex: Ensure futex_atomic_cmpxchg_inatomic() is present kernel/locking: Use a pointer in ww_mutex_trylock().
2022-01-12Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds3-14/+54
Pull ARM updates from Russell King: - amba bus irq rework - add kfence support - support for Cortex M33 and M55 CPUs - kbuild updates for decompressor - let core code manage thread_info::cpu - avoid unpredictable NOP encoding in decompressor - reduce information printed in calltraces * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: reduce the information printed in call traces ARM: 9168/1: Add support for Cortex-M55 processor ARM: 9167/1: Add support for Cortex-M33 processor ARM: 9166/1: Support KFENCE for ARM ARM: 9165/1: mm: Provide is_write_fault() ARM: 9164/1: mm: Provide set_memory_valid() ARM: 9163/1: amba: Move of_amba_device_decode_irq() into amba_probe() ARM: 9162/1: amba: Kill sysfs attribute file of irq ARM: 9161/1: mm: mark private VM_FAULT_X defines as vm_fault_t ARM: 9159/1: decompressor: Avoid UNPREDICTABLE NOP encoding ARM: 9158/1: leave it to core code to manage thread_info::cpu ARM: 9154/1: decompressor: do not copy source files while building
2022-01-10Merge branch 'pm-cpufreq'Rafael J. Wysocki1-1/+1
Merge cpufreq updates for 5.17-rc1: - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). * pm-cpufreq: (32 commits) x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment MAINTAINERS: Add AMD P-State driver maintainer entry Documentation: amd-pstate: Add AMD P-State driver introduction cpufreq: amd-pstate: Add AMD P-State performance attributes cpufreq: amd-pstate: Add AMD P-State frequencies attributes cpufreq: amd-pstate: Add boost mode support for AMD P-State cpufreq: amd-pstate: Add trace for AMD P-State module cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution cpufreq: amd-pstate: Add fast switch function for AMD P-State cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors ACPI: CPPC: Add CPPC enable register function ACPI: CPPC: Check present CPUs for determining _CPC is valid ACPI: CPPC: Implement support for SystemIO registers x86/msr: Add AMD CPPC MSR definitions x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag cpufreq: use default_groups in kobj_type cpufreq: mediatek-hw: Fix double devm_remap in hotplug case cpufreq: intel_pstate: Update cpuinfo.max_freq on HWP_CAP changes ...