summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-02-08powerpc/fsl_booke/32: CacheLockingException remove argsNicholas Piggin2-5/+6
Like other interrupt handler conversions, switch to getting registers from the pt_regs argument. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-8-npiggin@gmail.com
2021-02-08powerpc: remove arguments from fault handler functionsNicholas Piggin6-29/+15
Make mm fault handlers all just take the pt_regs * argument and load DAR/DSISR from that. Make those that return a value return long. This is done to make the function signatures match other handlers, which will help with a future patch to add wrappers. Explicit arguments could be added for performance but that would require more wrapper macro variants. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-7-npiggin@gmail.com
2021-02-08powerpc/64s: move the hash fault handling logic to CNicholas Piggin1-99/+28
The fault handling still has some complex logic particularly around hash table handling, in asm. Implement most of this in C. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-6-npiggin@gmail.com
2021-02-08powerpc/64s: move DABR match out of handle_page_faultNicholas Piggin1-18/+16
Similar to the 32/s change, move the test and call to the do_break handler to the DSI. Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-5-npiggin@gmail.com
2021-02-08powerpc/32s: move DABR match out of handle_page_faultChristophe Leroy2-15/+3
handle_page_fault() has some code dedicated to book3s/32 to call do_break() when the DSI is a DABR match. On other platforms, do_break() is handled separately. Do the same for book3s/32, do it earlier in the process of DSI. This change also avoid doing the test on ISI. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-4-npiggin@gmail.com
2021-02-08powerpc/64s: interrupt exit improve bounding of interrupt recursionNicholas Piggin1-22/+33
When replaying pending soft-masked interrupts when an interrupt returns to an irqs-enabled context, there is a special case required if this was an asynchronous interrupt to avoid unbounded interrupt recursion. This case was not tested for in the case the asynchronous interrupt hit in user context, because a subsequent nested interrupt would by definition hit in kernel mode, which then exits via the kernel path which does test this case. There is no reason to allow this for such interrupts. While recursion is bounded at the next level, it's simpler and uses less stack to apply the replay logic consistently. This also expands the comment which was really pretty poor and didn't explain the problem (I can say that because I wrote it). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-2-npiggin@gmail.com
2021-02-08powerpc/pci: Move PHB discovery for PCI_DN using platformsOliver O'Halloran1-22/+0
Make powernv, pseries, powermac and maple use ppc_mc.discover_phbs. These platforms need to be done together because they all depend on pci_dn's being created from the DT. The pci_dn contains a pointer to the relevant pci_controller so they need to be created after the pci_controller structures are available, but before PCI devices are scanned. Currently this ordering is provided by initcalls and the sequence is: 1. PHBs are discovered (setup_arch) (early boot, pre-initcalls) 2. pci_dn are created from the unflattended DT (core initcall) 3. PHBs are scanned pcibios_init() (subsys initcall) The new ppc_md.discover_phbs() function is also a core_initcall so we can't guarantee ordering between the creation of pci_controllers and the creation of pci_dn's which require a pci_controller. We could use the postcore, or core_sync initcall levels, but it's cleaner to just move the pci_dn setup into the per-PHB inits which occur inside of .discover_phb() for these platforms. This brings the boot-time path in line with the PHB hotplug path that is used for pseries DLPAR operations too. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Squash powermac & maple in to avoid breakage those platforms, convert memblock allocs to use kmalloc to avoid warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103043523.916109-2-oohall@gmail.com
2021-02-03powerpc/pci: Add ppc_md.discover_phbs()Oliver O'Halloran1-0/+10
On many powerpc platforms the discovery and initalisation of pci_controllers (PHBs) happens inside of setup_arch(). This is very early in boot (pre-initcalls) and means that we're initialising the PHB long before many basic kernel services (slab allocator, debugfs, a real ioremap) are available. On PowerNV this causes an additional problem since we map the PHB registers with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early use of ioremap()") a warning is printed because we're using the "incorrect" API to setup and MMIO mapping in searly boot. The kernel does provide early_ioremap(), but that is not intended to create long-lived MMIO mappings and a seperate warning is printed by generic code if early_ioremap() mappings are "leaked." This is all fixable with dumb hacks like using early_ioremap() to setup the initial mapping then replacing it with a real ioremap later on in boot, but it does raise the question: Why the hell are we setting up the PHB's this early in boot? The old and wise claim it's due to "hysterical rasins." Aside from amused grapes there doesn't appear to be any real reason to maintain the current behaviour. Already most of the newer embedded platforms perform PHB discovery in an arch_initcall and between the end of setup_arch() and the start of initcalls none of the generic kernel code does anything PCI related. On powerpc scanning PHBs occurs in a subsys_initcall so it should be possible to move the PHB discovery to a core, postcore or arch initcall. This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that calls it. The core_initcalls are the earliest to be called so this will any possibly issues with dependency between initcalls. This isn't just an academic issue either since on pseries and PowerNV EEH init occurs in an arch_initcall and depends on the pci_controllers being available, similarly the creation of pci_dns occurs at core_initcall_sync (i.e. between core and postcore initcalls). These problems need to be addressed seperately. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Make discover_phbs() static] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103043523.916109-1-oohall@gmail.com
2021-02-02powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semanticsRaoni Fassina Firmino2-2/+11
Commit 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline") changed __kernel_sigtramp_rt64() VDSO and trampoline code, and introduced a regression in the way glibc's backtrace()[1] detects the signal-handler stack frame. Apart from the practical implications, __kernel_sigtramp_rt64() was a VDSO function with the semantics that it is a function you can call from userspace to end a signal handling. Now this semantics are no longer valid. I believe the aforementioned change affects all releases since 5.9. This patch tries to fix both the semantics and practical aspect of __kernel_sigtramp_rt64() returning it to the previous code, whilst keeping the intended behaviour of 0138ba5783ae by adding a new symbol to serve as the jump target from the kernel to the trampoline. Now the trampoline has two parts, a new entry point and the old return point. [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-January/223194.html Fixes: 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Raoni Fassina Firmino <raoni@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Minor tweaks to change log formatting, add stable tag] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210201200505.iz46ubcizipnkcxe@work-tp
2021-01-31powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is selectedChristophe Leroy1-0/+12
It is now possible to only build book3s/32 kernel for CPUs without hash table. Opt out hash related code when CONFIG_PPC_BOOK3S_604 is not selected. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/62df436454ef06e104cc334a0859a2878d7888d5.1608274548.git.christophe.leroy@csgroup.eu
2021-01-31powerpc/setup: Adjust six seq_printf() calls in show_cpuinfo()Markus Elfring1-6/+5
A bit of information should be put into a sequence. Thus improve the execution speed for this data output by better usage of corresponding functions. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5b62379e-a35f-4f56-f1b5-6350f76007e7@web.de
2021-01-31powerpc/mce: Remove per cpu variables from MCE handlersGanesh Goudar2-32/+49
Access to per-cpu variables requires translation to be enabled on pseries machine running in hash mmu mode, Since part of MCE handler runs in realmode and part of MCE handling code is shared between ppc architectures pseries and powernv, it becomes difficult to manage these variables differently on different architectures, So have these variables in paca instead of having them as per-cpu variables to avoid complications. Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210128104143.70668-2-ganeshgr@linux.ibm.com
2021-01-31powerpc/time: Enable sched clock for irqtimePingfan Liu1-0/+2
When CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_VIRT_CPU_ACCOUNTING_GEN, powerpc does not enable "sched_clock_irqtime" and can not utilize irq time accounting. Like x86, powerpc does not use the sched_clock_register() interface. So it needs an dedicated call to enable_sched_clock_irqtime() to enable irq time accounting. Fixes: 518470fe962e ("powerpc: Add HAVE_IRQ_TIME_ACCOUNTING") Signed-off-by: Pingfan Liu <kernelfans@gmail.com> [mpe: Add fixes tag] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1603349479-26185-1-git-send-email-kernelfans@gmail.com
2021-01-31powerpc/prom: Fix "ibm,arch-vec-5-platform-support" scanCédric Le Goater1-8/+4
The "ibm,arch-vec-5-platform-support" property is a list of pairs of bytes representing the options and values supported by the platform firmware. At boot time, Linux scans this list and activates the available features it recognizes : Radix and XIVE. A recent change modified the number of entries to loop on and 8 bytes, 4 pairs of { options, values } entries are always scanned. This is fine on KVM but not on PowerVM which can advertises less. As a consequence on this platform, Linux reads extra entries pointing to random data, interprets these as available features and tries to activate them, leading to a firmware crash in ibm,client-architecture-support. Fix that by using the property length of "ibm,arch-vec-5-platform-support". Fixes: ab91239942a9 ("powerpc/prom: Remove VLA in prom_check_platform_support()") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210122075029.797013-1-clg@kaod.org
2021-01-31powerpc/pci: Delete traverse_pci_dn()Oliver O'Halloran1-40/+0
Nothing uses it. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200902035121.1762475-1-oohall@gmail.com
2021-01-31powerpc/eeh: Add a debugfs interface to check if a driver supports recoveryOliver O'Halloran1-0/+50
If a PCI device's current driver implements the error handling callbacks EEH can use them to recover the device after an error occurs. For devices without the error handling callbacks we recover them by removing the device and re-scanning it so the PCI core puts the device back into a known good state. Currently there's no way for userspace to determine if the driver supports recovery or not which makes it difficult to write automated tests for EEH. This patch addressing that by adding a debugfs interface for querying if a specific device can be recovered or not. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103051512.919333-2-oohall@gmail.com
2021-01-31powerpc/eeh: Rework pci_dev lookup in debugfs attributesOliver O'Halloran1-34/+37
Pull the string -> pci_dev lookup stuff into a helper function. No functional change. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201103051512.919333-1-oohall@gmail.com
2021-01-30powerpc/vdso64: remove meaningless vgettimeofday.o build ruleMasahiro Yamada1-2/+0
VDSO64 is only built for the 64-bit kernel, hence vgettimeofday.o is built by the generic rule in scripts/Makefile.build. This line does not provide anything useful. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201223171142.707053-2-masahiroy@kernel.org
2021-01-30powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.oMasahiro Yamada5-11/+4
vgettimeofday.o is unnecessarily rebuilt. Adding it to 'targets' is not enough to fix the issue. Kbuild is correctly rebuilding it because the command line is changed. PowerPC builds each vdso directory twice; first in vdso_prepare to generate vdso{32,64}-offsets.h, second as part of the ordinary build process to embed vdso{32,64}.so.dbg into the kernel. The problem shows up when CONFIG_PPC_WERROR=y due to the following line in arch/powerpc/Kbuild: subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror In the preparation stage, Kbuild directly visits the vdso directories, hence it does not inherit subdir-ccflags-y. In the second descend, Kbuild adds -Werror, which results in the command line flipping with/without -Werror. It implies a potential danger; if a more critical flag that would impact the resulted vdso, the offsets recorded in the headers might be different from real offsets in the embedded vdso images. Removing the unneeded second descend solves the problem. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linuxppc-dev/87tuslxhry.fsf@mpe.ellerman.id.au/ Link: https://lore.kernel.org/r/20201223171142.707053-1-masahiroy@kernel.org
2021-01-30powerpc/iommu/debug: Add debugfs entries for IOMMU tablesAlexey Kardashevskiy1-0/+46
This adds a folder per LIOBN under /sys/kernel/debug/iommu with IOMMU table parameters. This is enabled by CONFIG_IOMMU_DEBUGFS. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210113102014.124452-1-aik@ozlabs.ru
2021-01-30powerpc/watchdog: Declare soft_nmi_interrupt() prototypeCédric Le Goater1-0/+1
soft_nmi_interrupt() usage requires PPC_WATCHDOG to be configured. Check the CONFIG definition to declare the prototype. It fixes this W=1 compile error : ../arch/powerpc/kernel/watchdog.c:250:6: error: no previous prototype for ‘soft_nmi_interrupt’ [-Werror=missing-prototypes] 250 | void soft_nmi_interrupt(struct pt_regs *regs) | ^~~~~~~~~~~~~~~~~~ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-18-clg@kaod.org
2021-01-30powerpc/optprobes: Make patch_imm64_load_insns() staticCédric Le Goater1-1/+1
patch_imm64_load_insns() is only used locally in arch_prepare_optimized_kprobe() and does not need to be external. It fixes this W=1 compile error : ../arch/powerpc/kernel/optprobes.c:149:6: error: no previous prototype for ‘patch_imm64_load_insns’ [-Werror=missing-prototypes] 149 | void patch_imm64_load_insns(unsigned int val, kprobe_opcode_t *addr) Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-12-clg@kaod.org
2021-01-30powerpc/optprobes: Remove unused routine patch_imm32_load_insns()Cédric Le Goater1-19/+0
Commit 650b55b707fd ("powerpc: Add prefixed instructions to instruction data type") removed the use of patch_imm32_load_insns(). Clean it up to fix this W=1 compile error : ../arch/powerpc/kernel/optprobes.c:149:6: error: no previous prototype for ‘patch_imm32_load_insns’ [-Werror=missing-prototypes] 149 | void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr) Fixes: 650b55b707fd ("powerpc: Add prefixed instructions to instruction data type") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-11-clg@kaod.org
2021-01-30powerpc/smp: Make debugger_ipi_callback() staticCédric Le Goater1-1/+1
debugger_ipi_callback() is a local routine used as a NMI IPI handler and does not need to be external. It fixes this W=1 compile error : ../arch/powerpc/kernel/smp.c:579:6: error: no previous prototype for ‘debugger_ipi_callback’ [-Werror=missing-prototypes] 579 | void debugger_ipi_callback(struct pt_regs *regs) | ^~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-10-clg@kaod.org
2021-01-30powerpc/smp: Include tick_broadcast() prototypeCédric Le Goater1-0/+1
It fixes this W=1 compile error : ../arch/powerpc/kernel/smp.c:569:6: error: no previous prototype for ‘tick_broadcast’ [-Werror=missing-prototypes] 569 | void tick_broadcast(const struct cpumask *mask) | ^~~~~~~~~~~~~~ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-9-clg@kaod.org
2021-01-30powerpc/mce: Include prototypesCédric Le Goater1-0/+1
It fixes these W=1 compile errors : ../arch/powerpc/kernel/mce.c:591:14: error: no previous prototype for ‘machine_check_early’ [-Werror=missing-prototypes] 591 | long notrace machine_check_early(struct pt_regs *regs) | ^~~~~~~~~~~~~~~~~~~ ../arch/powerpc/kernel/mce.c:725:6: error: no previous prototype for ‘hmi_exception_realmode’ [-Werror=missing-prototypes] 725 | long hmi_exception_realmode(struct pt_regs *regs) | ^~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-8-clg@kaod.org
2021-01-30powerpc/setup_64: Make some routines staticCédric Le Goater1-3/+4
The following routines are only called by local services and do not need to be external symbols. It fixes these W=1 errors : ../arch/powerpc/kernel/setup_64.c:261:13: error: no previous prototype for ‘record_spr_defaults’ [-Werror=missing-prototypes] 261 | void __init record_spr_defaults(void) | ^~~~~~~~~~~~~~~~~~~ ../arch/powerpc/kernel/setup_64.c:1011:6: error: no previous prototype for ‘entry_flush_enable’ [-Werror=missing-prototypes] 1011 | void entry_flush_enable(bool enable) | ^~~~~~~~~~~~~~~~~~ ../arch/powerpc/kernel/setup_64.c:1023:6: error: no previous prototype for ‘uaccess_flush_enable’ [-Werror=missing-prototypes] 1023 | void uaccess_flush_enable(bool enable) | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210104143206.695198-7-clg@kaod.org
2021-01-29arch: powerpc: Stop building and using oprofileViresh Kumar2-69/+0
The "oprofile" user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. This commits stops building oprofile for powerpc and removes any reference to it from directories in arch/powerpc/ apart from arch/powerpc/oprofile, which will be removed in the next commit (this is broken into two commits as the size of the commit became very big, ~5k lines). Note that the member "oprofile_cpu_type" in "struct cpu_spec" isn't removed as it was also used by other parts of the code. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Robert Richter <rric@kernel.org> Acked-by: William Cohen <wcohen@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2021-01-24fs: add mount_setattr()Christian Brauner1-0/+1
This implements the missing mount_setattr() syscall. While the new mount api allows to change the properties of a superblock there is currently no way to change the properties of a mount or a mount tree using file descriptors which the new mount api is based on. In addition the old mount api has the restriction that mount options cannot be applied recursively. This hasn't changed since changing mount options on a per-mount basis was implemented in [1] and has been a frequent request not just for convenience but also for security reasons. The legacy mount syscall is unable to accommodate this behavior without introducing a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND | MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost mount. Changing MS_REC to apply to the whole mount tree would mean introducing a significant uapi change and would likely cause significant regressions. The new mount_setattr() syscall allows to recursively clear and set mount options in one shot. Multiple calls to change mount options requesting the same changes are idempotent: int mount_setattr(int dfd, const char *path, unsigned flags, struct mount_attr *uattr, size_t usize); Flags to modify path resolution behavior are specified in the @flags argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW, and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to restrict path resolution as introduced with openat2() might be supported in the future. The mount_setattr() syscall can be expected to grow over time and is designed with extensibility in mind. It follows the extensible syscall pattern we have used with other syscalls such as openat2(), clone3(), sched_{set,get}attr(), and others. The set of mount options is passed in the uapi struct mount_attr which currently has the following layout: struct mount_attr { __u64 attr_set; __u64 attr_clr; __u64 propagation; __u64 userns_fd; }; The @attr_set and @attr_clr members are used to clear and set mount options. This way a user can e.g. request that a set of flags is to be raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in @attr_set while at the same time requesting that another set of flags is to be lowered such as removing noexec from a mount tree by specifying MOUNT_ATTR_NOEXEC in @attr_clr. Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0, not a bitmap, users wanting to transition to a different atime setting cannot simply specify the atime setting in @attr_set, but must also specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in @attr_clr. The @propagation field lets callers specify the propagation type of a mount tree. Propagation is a single property that has four different settings and as such is not really a flag argument but an enum. Specifically, it would be unclear what setting and clearing propagation settings in combination would amount to. The legacy mount() syscall thus forbids the combination of multiple propagation settings too. The goal is to keep the semantics of mount propagation somewhat simple as they are overly complex as it is. The @userns_fd field lets user specify a user namespace whose idmapping becomes the idmapping of the mount. This is implemented and explained in detail in the next patch. [1]: commit 2e4b7fcd9260 ("[PATCH] r/o bind mounts: honor mount writer counts at remount") Link: https://lore.kernel.org/r/20210121131959.646623-35-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous ↵Nicholas Piggin1-12/+16
interrupt When an asynchronous interrupt calls irq_exit, it checks for softirqs that may have been created, and runs them. Running softirqs enables local irqs, which can replay pending interrupts causing recursion in replay_soft_interrupts. This abridged trace shows how this can occur: ! NIP replay_soft_interrupts LR interrupt_exit_kernel_prepare Call Trace: interrupt_exit_kernel_prepare (unreliable) interrupt_return --- interrupt: ea0 at __rb_reserve_next NIP __rb_reserve_next LR __rb_reserve_next Call Trace: ring_buffer_lock_reserve trace_function function_trace_call ftrace_call __do_softirq irq_exit timer_interrupt ! replay_soft_interrupts interrupt_exit_kernel_prepare interrupt_return --- interrupt: ea0 at arch_local_irq_restore This can not be prevented easily, because softirqs must not block hard irqs, so it has to be dealt with. The recursion is bounded by design in the softirq code because softirq replay disables softirqs and loops around again to check for new softirqs created while it ran, so that's not a problem. However it does mess up interrupt replay state, causing superfluous interrupts when the second replay_soft_interrupts clears a pending interrupt, leaving it still set in the first call in the 'happened' local variable. Fix this by not caching a copy of irqs_happened across interrupt handler calls. Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210123061244.2076145-1-npiggin@gmail.com
2021-01-20powerpc/64s: fix scv entry fallback flush vs interruptNicholas Piggin3-1/+27
The L1D flush fallback functions are not recoverable vs interrupts, yet the scv entry flush runs with MSR[EE]=1. This can result in a timer (soft-NMI) or MCE or SRESET interrupt hitting here and overwriting the EXRFI save area, which ends up corrupting userspace registers for scv return. Fix this by disabling RI and EE for the scv entry fallback flush. Fixes: f79643787e0a0 ("powerpc/64s: flush L1D on kernel entry") Cc: stable@vger.kernel.org # 5.9+ which also have flush L1D patch backport Reported-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210111062408.287092-1-npiggin@gmail.com
2021-01-12powerpc: Fix alignment bug within the init sectionsAriel Marcovitch1-0/+8
This is a bug that causes early crashes in builds with an .exit.text section smaller than a page and an .init.text section that ends in the beginning of a physical page (this is kinda random, which might explain why this wasn't really encountered before). The init sections are ordered like this: .init.text .exit.text .init.data Currently, these sections aren't page aligned. Because the init code might become read-only at runtime and because the .init.text section can potentially reside on the same physical page as .init.data, the beginning of .init.data might be mapped read-only along with .init.text. Then when the kernel tries to modify a variable in .init.data (like kthreadd_done, used in kernel_init()) the kernel panics. To avoid this, make _einittext page aligned and also align .exit.text to make sure .init.data is always seperated from the text segments. Fixes: 060ef9d89d18 ("powerpc32: PAGE_EXEC required for inittext") Signed-off-by: Ariel Marcovitch <ariel.marcovitch@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210102201156.10805-1-ariel.marcovitch@gmail.com
2021-01-06powerpc: Handle .text.{hot,unlikely}.* in linker scriptNathan Chancellor1-1/+1
Commit eff8728fe698 ("vmlinux.lds.h: Add PGO and AutoFDO input sections") added ".text.unlikely.*" and ".text.hot.*" due to an LLVM change [1]. After another LLVM change [2], these sections are seen in some PowerPC builds, where there is a orphan section warning then build failure: $ make -skj"$(nproc)" \ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- LLVM=1 O=out \ distclean powernv_defconfig zImage.epapr ld.lld: warning: kernel/built-in.a(panic.o):(.text.unlikely.) is being placed in '.text.unlikely.' ... ld.lld: warning: address (0xc000000000009314) of section .text is not a multiple of alignment (256) ... ERROR: start_text address is c000000000009400, should be c000000000008000 ERROR: try to enable LD_HEAD_STUB_CATCH config option ERROR: see comments in arch/powerpc/tools/head_check.sh ... Explicitly handle these sections like in the main linker script so there is no more build failure. [1]: https://reviews.llvm.org/D79600 [2]: https://reviews.llvm.org/D92493 Fixes: 83a092cf95f2 ("powerpc: Link warning for orphan sections") Cc: stable@vger.kernel.org Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/1218 Link: https://lore.kernel.org/r/20210104205952.1399409-1-natechancellor@gmail.com
2021-01-04powerpc/32s: Fix RTAS machine check with VMAP stackChristophe Leroy1-0/+9
When we have VMAP stack, exception prolog 1 sets r1, not r11. When it is not an RTAS machine check, don't trash r1 because it is needed by prolog 1. Fixes: da7bb43ab9da ("powerpc/32: Fix vmap stack - Properly set r1 before activating MMU") Fixes: d2e006036082 ("powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Squash in fixup for RTAS machine check from Christophe] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bc77d61d1c18940e456a2dee464f1e2eda65a3f0.1608621048.git.christophe.leroy@csgroup.eu
2020-12-25Merge tag 'powerpc-5.11-2' of ↵Linus Torvalds4-13/+20
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Four commits fixing various things in the new C VDSO code - One fix for a 32-bit VMAP stack bug - Two minor build fixes Thanks to Cédric Le Goater, Christophe Leroy, and Will Springer. * tag 'powerpc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32: Fix vmap stack - Properly set r1 before activating MMU on syscall too powerpc/vdso: Fix DOTSYM for 32-bit LE VDSO powerpc/vdso: Don't pass 64-bit ABI cflags to 32-bit VDSO powerpc/vdso: Block R_PPC_REL24 relocations powerpc/smp: Add __init to init_big_cores() powerpc/time: Force inlining of get_tb() powerpc/boot: Fix build of dts/fsl
2020-12-23Merge tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-2/+69
Pull dma-mapping updates from Christoph Hellwig: - support for a partial IOMMU bypass (Alexey Kardashevskiy) - add a DMA API benchmark (Barry Song) - misc fixes (Tiezhu Yang, tangjianqiang) * tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mapping: selftests/dma: add test application for DMA_MAP_BENCHMARK dma-mapping: add benchmark support for streaming DMA APIs dma-contiguous: fix a typo error in a comment dma-pool: no need to check return value of debugfs_create functions powerpc/dma: Fallback to dma_ops when persistent memory present dma-mapping: Allow mixing bypass and mapped DMA operation
2020-12-21powerpc/32: Fix vmap stack - Properly set r1 before activating MMU on ↵Christophe Leroy1-9/+16
syscall too We need r1 to be properly set before activating MMU, otherwise any new exception taken while saving registers into the stack in syscall prologs will use the user stack, which is wrong and will even lockup or crash when KUAP is selected. Do that by switching the meaning of r11 and r1 until we have saved r1 to the stack: copy r1 into r11 and setup the new stack pointer in r1. To avoid complicating and impacting all generic and specific prolog code (and more), copy back r1 into r11 once r11 is save onto the stack. We could get rid of copying r1 back and forth at the cost of rewriting everything to use r1 instead of r11 all the way when CONFIG_VMAP_STACK is set, but the effort is probably not worth it for now. Fixes: da7bb43ab9da ("powerpc/32: Fix vmap stack - Properly set r1 before activating MMU") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a3d819d5c348cee9783a311d5d3f3ba9b48fd219.1608531452.git.christophe.leroy@csgroup.eu
2020-12-21powerpc/vdso: Don't pass 64-bit ABI cflags to 32-bit VDSOMichael Ellerman1-1/+1
When building the 32-bit VDSO, we are building 32-bit code as part of a 64-bit kernel build. That requires us to tweak the cflags to trick the compiler into building 32-bit code for us. The main way we do that is by passing -m32, but there are other options that affect code generation and ABI selection. In particular when building vgettimeofday.c, we end up passing -mcall-aixdesc because it's in KBUILD_CFLAGS, which causes the compiler to generate function descriptors, and dot symbols, eg: $ nm arch/powerpc/kernel/vdso32/vgettimeofday.o 000005d0 T .__c_kernel_clock_getres 00000024 D __c_kernel_clock_getres ... We get away with that at the moment because we also use the DOTSYM macro, and that is also incorrectly prepending a '.' in 32-bit VDSO code due to a separate bug. But we shouldn't be generating function descriptors for this file, there's no 32-bit ABI that includes function descriptors, so the resulting object file is some frankenstein and it's surprising that it even links. So filter out all the ABI-related options we add to CFLAGS for 64-bit builds, so that they're not used when building 32-bit code. With that we only see regular text symbols: $ nm arch/powerpc/kernel/vdso32/vgettimeofday.o michael@alpine1-p1 000005d0 T __c_kernel_clock_getres 00000000 T __c_kernel_clock_gettime 00000200 T __c_kernel_clock_gettime64 00000410 T __c_kernel_gettimeofday 00000650 T __c_kernel_time Fixes: ab037dd87a2f ("powerpc/vdso: Switch VDSO to generic C implementation.") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201218111619.1206391-2-mpe@ellerman.id.au
2020-12-21powerpc/vdso: Block R_PPC_REL24 relocationsMichael Ellerman2-2/+2
Add R_PPC_REL24 relocations to the list of relocations we do NOT support in the VDSO. These are generated in some cases and we do not support relocating them at runtime, so if they appear then the VDSO will not work at runtime, therefore it's preferable to break the build if we see them. Fixes: ab037dd87a2f ("powerpc/vdso: Switch VDSO to generic C implementation.") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201218111619.1206391-1-mpe@ellerman.id.au
2020-12-21powerpc/smp: Add __init to init_big_cores()Cédric Le Goater1-1/+1
It fixes this link warning: WARNING: modpost: vmlinux.o(.text.unlikely+0x2d98): Section mismatch in reference from the function init_big_cores.isra.0() to the function .init.text:init_thread_group_cache_map() The function init_big_cores.isra.0() references the function __init init_thread_group_cache_map(). This is often because init_big_cores.isra.0 lacks a __init annotation or the annotation of init_thread_group_cache_map is wrong. Fixes: 425752c63b6f ("powerpc: Detect the presence of big-cores via "ibm, thread-groups"") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201221074154.403779-1-clg@kaod.org
2020-12-19epoll: wire up syscall epoll_pwait2Willem de Bruijn1-0/+1
Split off from prev patch in the series that implements the syscall. Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-18Merge tag 'powerpc-5.11-1' of ↵Linus Torvalds64-3066/+1972
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Switch to the generic C VDSO, as well as some cleanups of our VDSO setup/handling code. - Support for KUAP (Kernel User Access Prevention) on systems using the hashed page table MMU, using memory protection keys. - Better handling of PowerVM SMT8 systems where all threads of a core do not share an L2, allowing the scheduler to make better scheduling decisions. - Further improvements to our machine check handling. - Show registers when unwinding interrupt frames during stack traces. - Improvements to our pseries (PowerVM) partition migration code. - Several series from Christophe refactoring and cleaning up various parts of the 32-bit code. - Other smaller features, fixes & cleanups. Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu. * tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits) powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug powerpc: Add config fragment for disabling -Werror powerpc/configs: Add ppc64le_allnoconfig target powerpc/powernv: Rate limit opal-elog read failure message powerpc/pseries/memhotplug: Quieten some DLPAR operations powerpc/ps3: use dma_mapping_error() powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10 powerpc/perf: Fix Threshold Event Counter Multiplier width for P10 powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range() KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp KVM: PPC: fix comparison to bool warning KVM: PPC: Book3S: Assign boolean values to a bool variable powerpc: Inline setup_kup() powerpc/64s: Mark the kuap/kuep functions non __init KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering powerpc/xive: Improve error reporting of OPAL calls powerpc/xive: Simplify xive_do_source_eoi() powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG ...
2020-12-18Merge tag 'trace-v5.11' of ↵Linus Torvalds1-2/+13
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major update to this release is that there's a new arch config option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS. Currently, only x86_64 enables it. All the ftrace callbacks now take a struct ftrace_regs instead of a struct pt_regs. If the architecture has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will have enough information to read the arguments of the function being traced, as well as access to the stack pointer. This way, if a user (like live kernel patching) only cares about the arguments, then it can avoid using the heavier weight "regs" callback, that puts in enough information in the struct ftrace_regs to simulate a breakpoint exception (needed for kprobes). A new config option that audits the timestamps of the ftrace ring buffer at most every event recorded. Ftrace recursion protection has been cleaned up to move the protection to the callback itself (this saves on an extra function call for those callbacks). Perf now handles its own RCU protection and does not depend on ftrace to do it for it (saving on that extra function call). New debug option to add "recursed_functions" file to tracefs that lists all the places that triggered the recursion protection of the function tracer. This will show where things need to be fixed as recursion slows down the function tracer. The eval enum mapping updates done at boot up are now offloaded to a work queue, as it caused a noticeable pause on slow embedded boards. Various clean ups and last minute fixes" * tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Offload eval map updates to a work queue Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS" ring-buffer: Add rb_check_bpage in __rb_allocate_pages ring-buffer: Fix two typos in comments tracing: Drop unneeded assignment in ring_buffer_resize() tracing: Disable ftrace selftests when any tracer is running seq_buf: Avoid type mismatch for seq_buf_init ring-buffer: Fix a typo in function description ring-buffer: Remove obsolete rb_event_is_commit() ring-buffer: Add test to validate the time stamp deltas ftrace/documentation: Fix RST C code blocks tracing: Clean up after filter logic rewriting tracing: Remove the useless value assignment in test_create_synth_event() livepatch: Use the default ftrace_ops instead of REGS when ARGS is available ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs MAINTAINERS: assign ./fs/tracefs to TRACING tracing: Fix some typos in comments ftrace: Remove unused varible 'ret' ring-buffer: Add recording of ring buffer recursion into recursed_functions ...
2020-12-16Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe: "This sits on top of of the core entry/exit and x86 entry branch from the tip tree, which contains the generic and x86 parts of this work. Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL. With that done, we can get rid of JOBCTL_TASK_WORK from task_work and signal.c, and also remove a deadlock work-around in io_uring around knowing that signal based task_work waking is invoked with the sighand wait queue head lock. The motivation for this work is to decouple signal notify based task_work, of which io_uring is a heavy user of, from sighand. The sighand lock becomes a huge contention point, particularly for threaded workloads where it's shared between threads. Even outside of threaded applications it's slower than it needs to be. Roman Gershman <romger@amazon.com> reported that his networked workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all spent hammering on the sighand lock, showing 57% of the CPU time there [1]. There are further cleanups possible on top of this. One example is TIF_PATCH_PENDING, where a patch already exists to use TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more consolidation, but the work stands on its own as well" [1] https://github.com/axboe/liburing/issues/215 * tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits) io_uring: remove 'twa_signal_ok' deadlock work-around kernel: remove checking for TIF_NOTIFY_SIGNAL signal: kill JOBCTL_TASK_WORK io_uring: JOBCTL_TASK_WORK is no longer used by task_work task_work: remove legacy TWA_SIGNAL path sparc: add support for TIF_NOTIFY_SIGNAL riscv: add support for TIF_NOTIFY_SIGNAL nds32: add support for TIF_NOTIFY_SIGNAL ia64: add support for TIF_NOTIFY_SIGNAL h8300: add support for TIF_NOTIFY_SIGNAL c6x: add support for TIF_NOTIFY_SIGNAL alpha: add support for TIF_NOTIFY_SIGNAL xtensa: add support for TIF_NOTIFY_SIGNAL arm: add support for TIF_NOTIFY_SIGNAL microblaze: add support for TIF_NOTIFY_SIGNAL hexagon: add support for TIF_NOTIFY_SIGNAL csky: add support for TIF_NOTIFY_SIGNAL openrisc: add support for TIF_NOTIFY_SIGNAL sh: add support for TIF_NOTIFY_SIGNAL um: add support for TIF_NOTIFY_SIGNAL ...
2020-12-16Merge tag 'fallthrough-fixes-clang-5.11-rc1' of ↵Linus Torvalds2-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo A. R. Silva: "Fix many fall-through warnings when building with Clang 12.0.0 using -Wimplicit-fallthrough. - powerpc: boot: include compiler_attributes.h (Nick Desaulniers) - Revert "lib: Revert use of fallthrough pseudo-keyword in lib/" (Nick Desaulniers) - powerpc: fix -Wimplicit-fallthrough (Nick Desaulniers) - lib: Fix fall-through warnings for Clang (Gustavo A. R. Silva)" * tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: lib: Fix fall-through warnings for Clang powerpc: fix -Wimplicit-fallthrough Revert "lib: Revert use of fallthrough pseudo-keyword in lib/" powerpc: boot: include compiler_attributes.h
2020-12-16ubsan: enable for all*config buildsKees Cook1-0/+4
With UBSAN_OBJECT_SIZE disabled for GCC, only UBSAN_ALIGNMENT remained a noisy UBSAN option. Disable it for COMPILE_TEST so the rest of UBSAN can be used for full all*config builds or other large combinations. [sfr@canb.auug.org.au: add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly] Link: https://lkml.kernel.org/r/20201208230157.42c42789@canb.auug.org.au Link: https://lore.kernel.org/lkml/CAHk-=wgXW=YLxGN0QVpp-1w5GDd2pf1W-FqY15poKzoVfik2qA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20201203004437.389959-6-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: George Popescu <georgepope@android.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marco Elver <elver@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15Merge tag 'irqchip-5.11' of ↵Thomas Gleixner10-111/+201
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates for 5.11 from Marc Zyngier: - Preliminary support for managed interrupts on platform devices - Correctly identify allocation of MSIs proxyied by another device - Remove the fasteoi IPI flow which has been proved useless - Generalise the Ocelot support to new SoCs - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation - Work around spurious interrupts on Qualcomm PDC - Random fixes and cleanups Link: https://lore.kernel.org/r/20201212135626.1479884-1-maz@kernel.org
2020-12-10powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cacheGautham R. Shenoy1-10/+20
On POWER platforms where only some groups of threads within a core share the L2-cache (indicated by the ibm,thread-groups device-tree property), we currently print the incorrect shared_cpu_map/list for L2-cache in the sysfs. This patch reports the correct shared_cpu_map/list on such platforms. Example: On a platform with "ibm,thread-groups" set to 00000001 00000002 00000004 00000000 00000002 00000004 00000006 00000001 00000003 00000005 00000007 00000002 00000002 00000004 00000000 00000002 00000004 00000006 00000001 00000003 00000005 00000007 This indicates that threads {0,2,4,6} in the core share the L2-cache and threads {1,3,5,7} in the core share the L2 cache. However, without the patch, the shared_cpu_map/list for L2 for CPUs 0, 1 is reported in the sysfs as follows: /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0-7 /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map:000000,000000ff /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_list:0-7 /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map:000000,000000ff With the patch, the shared_cpu_map/list for L2 cache for CPUs 0, 1 is correctly reported as follows: /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0,2,4,6 /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map:000000,00000055 /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_list:1,3,5,7 /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map:000000,000000aa This patch also defines cpu_l2_cache_mask() for !CONFIG_SMP case. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1607596739-32439-6-git-send-email-ego@linux.vnet.ibm.com
2020-12-10powerpc/smp: Add support detecting thread-groups sharing L2 cacheGautham R. Shenoy1-5/+53
On POWER systems, groups of threads within a core sharing the L2-cache can be indicated by the "ibm,thread-groups" property array with the identifier "2". This patch adds support for detecting this, and when present, populate the populating the cpu_l2_cache_mask of every CPU to the core-siblings which share L2 with the CPU as specified in the by the "ibm,thread-groups" property array. On a platform with the following "ibm,thread-group" configuration 00000001 00000002 00000004 00000000 00000002 00000004 00000006 00000001 00000003 00000005 00000007 00000002 00000002 00000004 00000000 00000002 00000004 00000006 00000001 00000003 00000005 00000007 Without this patch, the sched-domain hierarchy for CPUs 0,1 would be CPU0 attaching sched-domain(s): domain-0: span=0,2,4,6 level=SMT domain-1: span=0-7 level=CACHE domain-2: span=0-15,24-39,48-55 level=MC domain-3: span=0-55 level=DIE CPU1 attaching sched-domain(s): domain-0: span=1,3,5,7 level=SMT domain-1: span=0-7 level=CACHE domain-2: span=0-15,24-39,48-55 level=MC domain-3: span=0-55 level=DIE The CACHE domain at 0-7 is incorrect since the ibm,thread-groups sub-array [00000002 00000002 00000004 00000000 00000002 00000004 00000006 00000001 00000003 00000005 00000007] indicates that L2 (Property "2") is shared only between the threads of a single group. There are "2" groups of threads where each group contains "4" threads each. The groups being {0,2,4,6} and {1,3,5,7}. With this patch, the sched-domain hierarchy for CPUs 0,1 would be CPU0 attaching sched-domain(s): domain-0: span=0,2,4,6 level=SMT domain-1: span=0-15,24-39,48-55 level=MC domain-2: span=0-55 level=DIE CPU1 attaching sched-domain(s): domain-0: span=1,3,5,7 level=SMT domain-1: span=0-15,24-39,48-55 level=MC domain-2: span=0-55 level=DIE The CACHE domain with span=0,2,4,6 for CPU 0 (span=1,3,5,7 for CPU 1 resp.) gets degenerated into the SMT domain. Furthermore, the last-level-cache domain gets correctly set to the SMT sched-domain. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1607596739-32439-5-git-send-email-ego@linux.vnet.ibm.com
2020-12-10powerpc/smp: Rename init_thread_group_l1_cache_map() to make it genericGautham R. Shenoy1-7/+10
init_thread_group_l1_cache_map() initializes the per-cpu cpumask thread_group_l1_cache_map with the core-siblings which share L1 cache with the CPU. Make this function generic to the cache-property (L1 or L2) and update a suitable mask. This is a preparatory patch for the next patch where we will introduce discovery of thread-groups that share L2-cache. No functional change. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1607596739-32439-4-git-send-email-ego@linux.vnet.ibm.com