summaryrefslogtreecommitdiff
path: root/arch/loongarch
AgeCommit message (Collapse)AuthorFilesLines
2022-12-13Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
2022-12-13Merge tag 'acpi-6.2-rc1' of ↵Linus Torvalds1-142/+0
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and PNP updates from Rafael Wysocki: "These include new code (for instance, support for the FFH address space type and support for new firmware data structures in ACPICA), some new quirks (mostly related to backlight handling and I2C enumeration), a number of fixes and a fair amount of cleanups all over. Specifics: - Update the ACPICA code in the kernel to the 20221020 upstream version and fix a couple of issues in it: - Make acpi_ex_load_op() match upstream implementation (Rafael Wysocki) - Add support for loong_arch-specific APICs in MADT (Huacai Chen) - Add support for fixed PCIe wake event (Huacai Chen) - Add EBDA pointer sanity checks (Vit Kabele) - Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele) - Add CCEL table support to both compiler/disassembler (Kuppuswamy Sathyanarayanan) - Add a couple of new UUIDs to the known UUID list (Bob Moore) - Add support for FFH Opregion special context data (Sudeep Holla) - Improve warning message for "invalid ACPI name" (Bob Moore) - Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT table (Alison Schofield) - Prepare IORT support for revision E.e (Robin Murphy) - Finish support for the CDAT table (Bob Moore) - Fix error code path in acpi_ds_call_control_method() (Rafael Wysocki) - Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li Zetao) - Update the version of the ACPICA code in the kernel (Bob Moore) - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device enumeration code (Giulio Benetti) - Change the return type of the ACPI driver remove callback to void and update its users accordingly (Dawei Li) - Add general support for FFH address space type and implement the low- level part of it for ARM64 (Sudeep Holla) - Fix stale comments in the ACPI tables parsing code and make it print more messages related to MADT (Hanjun Guo, Huacai Chen) - Replace invocations of generic library functions with more kernel- specific counterparts in the ACPI sysfs interface (Christophe JAILLET, Xu Panda) - Print full name paths of ACPI power resource objects during enumeration (Kane Chen) - Eliminate a compiler warning regarding a missing function prototype in the ACPI power management code (Sudeep Holla) - Fix and clean up the ACPI processor driver (Rafael Wysocki, Li Zhong, Colin Ian King, Sudeep Holla) - Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC driver (Mia Kanashi) - Add some mew ACPI backlight handling quirks and update some existing ones (Hans de Goede) - Make the ACPI backlight driver prefer the native backlight control over vendor backlight control when possible (Hans de Goede) - Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König) - Use xchg_release() instead of cmpxchg() for updating new GHES cache slots (Ard Biesheuvel) - Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay Lu) - Add new I2C device enumeration quirks for Medion Lifetab S10346 and Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede) - Make the ACPI battery driver notify user space about adding new battery hooks and removing the existing ones (Armin Wolf) - Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE() for freeing acpi_object structures to help diagnostics (Wang ShaoBo) - Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface code (ye xingchen) - Fix the _FIF package extraction failure handling in the ACPI fan driver (Hanjun Guo) - Fix the PCC mailbox handling error code path (Huisong Li) - Avoid using PCC Opregions if there is no platform interrupt allocated for this purpose (Huisong Li) - Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and CPPC library (ye xingchen) - Fix some kernel-doc issues in the ACPI GSI processing code (Xiongfeng Wang) - Fix name memory leak in pnp_alloc_dev() (Yang Yingliang) - Do not disable PNP devices on suspend when they cannot be re-enabled on resume (Hans de Goede) - Clean up the ACPI thermal driver a bit (Rafael Wysocki)" * tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits) ACPI: x86: Add skip i2c clients quirk for Medion Lifetab S10346 ACPI: APEI: EINJ: Refactor available_error_type_show() ACPI: APEI: EINJ: Fix formatting errors ACPI: processor: perflib: Adjust acpi_processor_notify_smm() return value ACPI: processor: perflib: Rearrange acpi_processor_notify_smm() ACPI: processor: perflib: Rearrange unregistration routine ACPI: processor: perflib: Drop redundant parentheses ACPI: processor: perflib: Adjust white space ACPI: processor: idle: Drop unnecessary statements and parens ACPI: thermal: Adjust critical.flags.valid check ACPI: fan: Convert to use sysfs_emit_at() API ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() ACPI: battery: Call power_supply_changed() when adding hooks ACPI: use sysfs_emit() instead of scnprintf() ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Tab 3 Pro (YT3-X90F) ACPI: APEI: Remove a useless include PNP: Do not disable devices on suspend when they cannot be re-enabled on resume ACPI: processor: Silence missing prototype warnings ACPI: processor_idle: Silence missing prototype warnings ACPI: PM: Silence missing prototype warning ...
2022-12-12Merge tag 'irq-core-2022-12-10' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt core and driver subsystem: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified message store which is uniform accross all devices). The PCI/MSI[-X] uniformity allowed us to get away with "global" PCI/MSI domains. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stragglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: |--- device 1 [Vector]---[Remapping]---[PCI/MSI]--|... |--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: |--- [PCI/MSI] device 1 [Vector]---[Remapping]---|... |--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: |--- [PCI/MSI] device 1 |--- [PCI/IMS] device 1 [Vector]---[Remapping]---|... |--- [PCI/MSI] device N |--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place" * tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits) irqchip/ti-sci-inta: Fix kernel doc irqchip/gic-v2m: Mark a few functions __init irqchip/gic-v2m: Include arm-gic-common.h irqchip/irq-mvebu-icu: Fix works by chance pointer assignment iommu/amd: Enable PCI/IMS iommu/vt-d: Enable PCI/IMS x86/apic/msi: Enable PCI/IMS PCI/MSI: Provide pci_ims_alloc/free_irq() PCI/MSI: Provide IMS (Interrupt Message Store) support genirq/msi: Provide constants for PCI/IMS support x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X PCI/MSI: Provide prepare_desc() MSI domain op PCI/MSI: Split MSI-X descriptor setup genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN genirq/msi: Provide msi_domain_alloc_irq_at() genirq/msi: Provide msi_domain_ops:: Prepare_desc() genirq/msi: Provide msi_desc:: Msi_data genirq/msi: Provide struct msi_map x86/apic/msi: Remove arch_create_remap_msi_irq_domain() ...
2022-12-12Merge tag 'rcu.2022.12.02a' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates. This is the second in a series from an ongoing review of the RCU documentation. - Miscellaneous fixes. - Introduce a default-off Kconfig option that depends on RCU_NOCB_CPU that, on CPUs mentioned in the nohz_full or rcu_nocbs boot-argument CPU lists, causes call_rcu() to introduce delays. These delays result in significant power savings on nearly idle Android and ChromeOS systems. These savings range from a few percent to more than ten percent. This series also includes several commits that change call_rcu() to a new call_rcu_hurry() function that avoids these delays in a few cases, for example, where timely wakeups are required. Several of these are outside of RCU and thus have acks and reviews from the relevant maintainers. - Create an srcu_read_lock_nmisafe() and an srcu_read_unlock_nmisafe() for architectures that support NMIs, but which do not provide NMI-safe this_cpu_inc(). These NMI-safe SRCU functions are required by the upcoming lockless printk() work by John Ogness et al. - Changes providing minor but important increases in torture test coverage for the new RCU polled-grace-period APIs. - Changes to torturescript that avoid redundant kernel builds, thus providing about a 30% speedup for the torture.sh acceptance test. * tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (49 commits) net: devinet: Reduce refcount before grace period net: Use call_rcu_hurry() for dst_release() workqueue: Make queue_rcu_work() use call_rcu_hurry() percpu-refcount: Use call_rcu_hurry() for atomic switch scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu() rcu/rcutorture: Use call_rcu_hurry() where needed rcu/rcuscale: Use call_rcu_hurry() for async reader test rcu/sync: Use call_rcu_hurry() instead of call_rcu rcuscale: Add laziness and kfree tests rcu: Shrinker for lazy rcu rcu: Refactor code a bit in rcu_nocb_do_flush_bypass() rcu: Make call_rcu() lazy to save power rcu: Implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC srcu: Debug NMI safety even on archs that don't require it srcu: Explain the reason behind the read side critical section on GP start srcu: Warn when NMI-unsafe API is used in NMI arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option arch/loongarch: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state() rcu-tasks: Make grace-period-age message human-readable ...
2022-12-12Merge branch 'acpica'Rafael J. Wysocki1-142/+0
Merge ACPICA changes, including bug fixes and cleanups as well as support for some recently defined data structures, for 6.2-rc1: - Make acpi_ex_load_op() match upstream implementation (Rafael Wysocki). - Add support for loong_arch-specific APICs in MADT (Huacai Chen). - Add support for fixed PCIe wake event (Huacai Chen). - Add EBDA pointer sanity checks (Vit Kabele). - Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele). - Add CCEL table support to both compiler/disassembler (Kuppuswamy Sathyanarayanan). - Add a couple of new UUIDs to the known UUID list (Bob Moore). - Add support for FFH Opregion special context data (Sudeep Holla). - Improve warning message for "invalid ACPI name" (Bob Moore). - Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT table (Alison Schofield). - Prepare IORT support for revision E.e (Robin Murphy). - Finish support for the CDAT table (Bob Moore). - Fix error code path in acpi_ds_call_control_method() (Rafael Wysocki). - Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li Zetao). - Update the version of the ACPICA code in the kernel (Bob Moore). * acpica: ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() ACPICA: Fix error code path in acpi_ds_call_control_method() ACPICA: Update version to 20221020 ACPICA: Add utcksum.o to the acpidump Makefile Revert "LoongArch: Provisionally add ACPICA data structures" ACPICA: Finish support for the CDAT table ACPICA: IORT: Update for revision E.e ACPICA: Add CXL 3.0 structures (CXIMS & RDPAS) to the CEDT table ACPICA: Improve warning message for "invalid ACPI name" ACPICA: Add support for FFH Opregion special context data ACPICA: Add a couple of new UUIDs to the known UUID list ACPICA: iASL: Add CCEL table to both compiler/disassembler ACPICA: Do not touch VGA memory when EBDA < 1ki_b ACPICA: Check that EBDA pointer is in valid memory ACPICA: Events: Support fixed PCIe wake event ACPICA: MADT: Add loong_arch-specific APICs support ACPICA: Make acpi_ex_load_op() match upstream
2022-12-08LoongArch: mm: Fix huge page entry update for virtual machineHuacai Chen1-14/+16
In virtual machine (guest mode), the tlbwr instruction can not write the last entry of MTLB, so we need to make it non-present by invtlb and then write it by tlbfill. This also simplify the whole logic. Signed-off-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-08LoongArch: Export symbol for function smp_send_reschedule()Bibo Mao2-10/+11
Function smp_send_reschedule() is standard kernel API, which is defined in header file include/linux/smp.h. However, on LoongArch it is defined as an inline function, this is confusing and kernel modules can not use this function. Now we define smp_send_reschedule() as a general function, and add a EXPORT_SYMBOL_GPL on this function, so that kernel modules can use it. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-07Merge tag 'irqchip-6.2' of ↵Thomas Gleixner1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates frim Marc Zyngier: - More APCI fixes and improvements for the LoongArch architecture, adding support for the HTVEC irqchip, suspend-resume, and some PCI INTx workarounds - Initial DT support for LoongArch. I'm not even kidding. - Support for the MTK CIRQv2, a minor deviation from the original version - Error handling fixes for wpcm450, GIC... - BE detection for a FSL controller - Declare the Sifive PLIC as wake-up agnostic - Simplify fishing out the device data for the ST irqchip - Mark some data structures as __initconst in the apple-aic driver - Switch over from strtobool to kstrtobool - COMPILE_TEST fixes
2022-12-03Merge tag 'mm-hotfixes-stable-2022-12-02' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes, 11 marked cc:stable. Only three or four of the latter address post-6.0 issues, which is hopefully a sign that things are converging" * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible" Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: take the right locks for page table retraction mm: migrate: fix THP's mapcount on isolation mm: introduce arch_has_hw_nonleaf_pmd_young() mm: add dummy pmd_young() for architectures not having it mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes() tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep" nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing madvise: use zap_page_range_single for madvise dontneed mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-12-01mm: add dummy pmd_young() for architectures not having itJuergen Gross1-0/+1
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-26irqchip/loongson-htvec: Add ACPI init supportHuacai Chen1-1/+1
HTVECINTC stands for "HyperTransport Interrupts" that described in Section 14.3 of "Loongson 3A5000 Processor Reference Manual". For more information please refer Documentation/loongarch/irq-chip-model.rst. Though the extended model is the recommended one, there are still some legacy model machines. So we add ACPI init support for HTVECINTC. Co-developed-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221020142535.1725573-1-chenhuacai@loongson.cn
2022-11-21LoongArch: Fix unsigned comparison with less than zeroKaiLong Wang1-1/+2
Eliminate the following coccicheck warning: ./arch/loongarch/kernel/unwind_prologue.c:84:5-13: WARNING: Unsigned expression compared with zero: frame_ra < 0 Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite()Huacai Chen1-2/+6
Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite(). Otherwise, _PAGE_DIRTY silences the TLB modify exception and make us have no chance to mark a pmd/pte dirty (_PAGE_MODIFIED) for software. Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Set _PAGE_DIRTY only if _PAGE_WRITE is set in {pmd,pte}_mkdirty()Huacai Chen1-2/+6
Now {pmd,pte}_mkdirty() set _PAGE_DIRTY bit unconditionally, this causes random segmentation fault after commit 0ccf7f168e17bb7e ("mm/thp: carry over dirty bit when thp splits on pmd"). The reason is: when fork(), parent process use pmd_wrprotect() to clear huge page's _PAGE_WRITE and _PAGE_DIRTY (for COW); then pte_mkdirty() set _PAGE_DIRTY as well as _PAGE_MODIFIED while splitting dirty huge pages; once _PAGE_DIRTY is set, there will be no tlb modify exception so the COW machanism fails; and at last memory corruption occurred between parent and child processes. So, we should set _PAGE_DIRTY only when _PAGE_WRITE is set in {pmd,pte}_ mkdirty(). Cc: stable@vger.kernel.org Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Clear FPU/SIMD thread info flags for kernel threadHuacai Chen1-4/+5
If a kernel thread is created by a user thread, it may carry FPU/SIMD thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be considered as a fpu owner and kernel try to save its FPU/SIMD context and cause such errors: [ 41.518931] do_fpu invoked from kernel context![#1]: [ 41.523933] CPU: 1 PID: 395 Comm: iou-wrk-394 Not tainted 6.1.0-rc5+ #217 [ 41.530757] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 08/18/2022 [ 41.544064] $ 0 : 0000000000000000 90000000011e9468 9000000106c7c000 9000000106c7fcf0 [ 41.552101] $ 4 : 9000000106305d40 9000000106689800 9000000106c7fd08 0000000003995818 [ 41.560138] $ 8 : 0000000000000001 90000000009a72e4 0000000000000020 fffffffffffffffc [ 41.568174] $12 : 0000000000000000 0000000000000000 0000000000000020 00000009aab7e130 [ 41.576211] $16 : 00000000000001ff 0000000000000407 0000000000000001 0000000000000000 [ 41.584247] $20 : 0000000000000000 0000000000000001 9000000106c7fd70 90000001002f0400 [ 41.592284] $24 : 0000000000000000 900000000178f740 90000000011e9834 90000001063057c0 [ 41.600320] $28 : 0000000000000000 0000000000000001 9000000006826b40 9000000106305140 [ 41.608356] era : 9000000000228848 _save_fp+0x0/0xd8 [ 41.613542] ra : 90000000011e9468 __schedule+0x568/0x8d0 [ 41.619160] CSR crmd: 000000b0 [ 41.619163] CSR prmd: 00000000 [ 41.622359] CSR euen: 00000000 [ 41.625558] CSR ecfg: 00071c1c [ 41.628756] CSR estat: 000f0000 [ 41.635239] ExcCode : f (SubCode 0) [ 41.638783] PrId : 0014c010 (Loongson-64bit) [ 41.643191] Modules linked in: acpi_ipmi vfat fat ipmi_si ipmi_devintf cfg80211 ipmi_msghandler rfkill fuse efivarfs [ 41.653734] Process iou-wrk-394 (pid: 395, threadinfo=0000000004ebe913, task=00000000636fa1be) [ 41.662375] Stack : 00000000ffff0875 9000000006800ec0 9000000006800ec0 90000000002d57e0 [ 41.670412] 0000000000000001 0000000000000000 9000000106535880 0000000000000001 [ 41.678450] 9000000105291800 0000000000000000 9000000105291838 900000000178e000 [ 41.686487] 9000000106c7fd90 9000000106305140 0000000000000001 90000000011e9834 [ 41.694523] 00000000ffff0875 90000000011f034c 9000000105291838 9000000105291830 [ 41.702561] 0000000000000000 9000000006801440 00000000ffff0875 90000000002d48c0 [ 41.710597] 9000000128800001 9000000106305140 9000000105291838 9000000105291838 [ 41.718634] 9000000105291830 9000000107811740 9000000105291848 90000000009bf1e0 [ 41.726672] 9000000105291830 9000000107811748 2d6b72772d756f69 0000000000343933 [ 41.734708] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 41.742745] ... [ 41.745252] Call Trace: [ 42.197868] [<9000000000228848>] _save_fp+0x0/0xd8 [ 42.205214] [<90000000011ed468>] __schedule+0x568/0x8d0 [ 42.210485] [<90000000011ed834>] schedule+0x64/0xd4 [ 42.215411] [<90000000011f434c>] schedule_timeout+0x88/0x188 [ 42.221115] [<90000000009c36d0>] io_wqe_worker+0x184/0x350 [ 42.226645] [<9000000000221cf0>] ret_from_kernel_thread+0xc/0x9c This can be easily triggered by ltp testcase syscalls/io_uring02 and it can also be easily fixed by clearing the FPU/SIMD thread info flags for kernel threads in copy_thread(). Cc: stable@vger.kernel.org Reported-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: SMP: Change prefix from loongson3 to loongsonHuacai Chen4-39/+39
SMP operations can be shared by Loongson-2 series and Loongson-3 series, so we change the prefix from loongson3 to loongson for all functions and data structures. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Combine acpi_boot_table_init() and acpi_boot_init()Huacai Chen2-22/+10
Combine acpi_boot_table_init() and acpi_boot_init() since they are very simple, and we don't need to check the return value of acpi_boot_init(). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Makefile: Use "grep -E" instead of "egrep"Tiezhu Yang1-1/+1
The latest version of grep claims the egrep is now obsolete so the build now contains warnings that look like: egrep: warning: egrep is obsolescent; using grep -E Fix this up by changing the LoongArch Makefile to use "grep -E" instead. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-18treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld2-2/+2
This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-03Revert "LoongArch: Provisionally add ACPICA data structures"Huacai Chen1-142/+0
This reverts commit af6a1cfa6859dab4a843 ("LoongArch: Provisionally add ACPICA data structures") to fix build error for linux-next on LoongArch, since acpica is merged to linux-pm.git now. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-29LoongArch: BPF: Avoid declare variables in switch-caseHuacai Chen1-18/+13
Not all compilers support declare variables in switch-case, so move declarations to the beginning of a function. Otherwise we may get such build errors: arch/loongarch/net/bpf_jit.c: In function ‘emit_atomic’: arch/loongarch/net/bpf_jit.c:362:3: error: a label can only be part of a statement and a declaration is not a statement u8 r0 = regmap[BPF_REG_0]; ^~ arch/loongarch/net/bpf_jit.c: In function ‘build_insn’: arch/loongarch/net/bpf_jit.c:727:3: error: a label can only be part of a statement and a declaration is not a statement u8 t7 = -1; ^~ arch/loongarch/net/bpf_jit.c:778:3: error: a label can only be part of a statement and a declaration is not a statement int ret; ^~~ arch/loongarch/net/bpf_jit.c:779:3: error: expected expression before ‘u64’ u64 func_addr; ^~~ arch/loongarch/net/bpf_jit.c:780:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] bool func_addr_fixed; ^~~~ arch/loongarch/net/bpf_jit.c:784:11: error: ‘func_addr’ undeclared (first use in this function); did you mean ‘in_addr’? &func_addr, &func_addr_fixed); ^~~~~~~~~ in_addr arch/loongarch/net/bpf_jit.c:784:11: note: each undeclared identifier is reported only once for each function it appears in arch/loongarch/net/bpf_jit.c:814:3: error: a label can only be part of a statement and a declaration is not a statement u64 imm64 = (u64)(insn + 1)->imm << 32 | (u32)insn->imm; ^~~ Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-29LoongArch: Use flexible-array member instead of zero-length arrayYushan Zhou1-1/+1
Eliminate the following coccicheck warning: ./arch/loongarch/include/asm/ptrace.h:32:15-21: WARNING use flexible-array member instead Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Yushan Zhou <katrinzhou@tencent.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-29LoongArch: Remove unused kernel stack paddingJinyang He5-7/+6
The current LoongArch kernel stack is padded as if obeying the MIPS o32 calling convention (32 bytes), signifying the port's MIPS lineage but no longer making sense. Remove the padding for clarity. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-21arch/loongarch: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig optionPaul E. McKenney1-0/+1
The loongarch architecture uses the atomic read-modify-write amadd instruction to implement this_cpu_add(), which is NMI safe. This means that the old and more-efficient srcu_read_lock() may be used in NMI context, without the need for srcu_read_lock_nmisafe(). Therefore, add the new Kconfig option ARCH_HAS_NMI_SAFE_THIS_CPU_OPS to arch/loongarch/Kconfig, which will cause NEED_SRCU_NMI_SAFE to be deselected, thus preserving the current srcu_read_lock() behavior. Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Suggested-by: Frederic Weisbecker <frederic@kernel.org> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Cc: <loongarch@lists.linux.dev>
2022-10-17Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
2022-10-14Merge tag 'mm-stable-2022-10-13' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - fix a race which causes page refcounting errors in ZONE_DEVICE pages (Alistair Popple) - fix userfaultfd test harness instability (Peter Xu) - various other patches in MM, mainly fixes * tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits) highmem: fix kmap_to_page() for kmap_local_page() addresses mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page mm/selftest: uffd: explain the write missing fault check mm/hugetlb: use hugetlb_pte_stable in migration race check mm/hugetlb: fix race condition of uffd missing/minor handling zram: always expose rw_page LoongArch: update local TLB if PTE entry exists mm: use update_mmu_tlb() on the second thread kasan: fix array-bounds warnings in tests hmm-tests: add test for migrate_device_range() nouveau/dmem: evict device private memory during release nouveau/dmem: refactor nouveau_dmem_fault_copy_one() mm/migrate_device.c: add migrate_device_range() mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page() mm/memremap.c: take a pgmap reference on page allocation mm: free device private pages have zero refcount mm/memory.c: fix race when faulting a device private page mm/damon: use damon_sz_region() in appropriate place mm/damon: move sz_damon_region to damon_sz_region lib/test_meminit: add checks for the allocation functions ...
2022-10-13LoongArch: update local TLB if PTE entry existsQi Zheng1-0/+3
Currently, the implementation of update_mmu_tlb() is empty if __HAVE_ARCH_UPDATE_MMU_TLB is not defined. Then if two threads concurrently fault at the same page, the second thread that did not win the race will give up and do nothing. In the LoongArch architecture, this second thread will trigger another fault, and only updates its local TLB. Instead of triggering another fault, it's better to implement update_mmu_tlb() to directly update the local TLB of the second thread. Just do it. Link: https://lkml.kernel.org/r/20220929112318.32393-3-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Suggested-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Cc: Chris Zankel <chris@zankel.net> Cc: David Hildenbrand <david@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12Merge tag 'mm-nonmm-stable-2022-10-11' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - hfs and hfsplus kmap API modernization (Fabio Francesco) - make crash-kexec work properly when invoked from an NMI-time panic (Valentin Schneider) - ntfs bugfixes (Hawkins Jiawei) - improve IPC msg scalability by replacing atomic_t's with percpu counters (Jiebin Sun) - nilfs2 cleanups (Minghao Chi) - lots of other single patches all over the tree! * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype proc: test how it holds up with mapping'less process mailmap: update Frank Rowand email address ia64: mca: use strscpy() is more robust and safer init/Kconfig: fix unmet direct dependencies ia64: update config files nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure fork: remove duplicate included header files init/main.c: remove unnecessary (void*) conversions proc: mark more files as permanent nilfs2: remove the unneeded result variable nilfs2: delete unnecessary checks before brelse() checkpatch: warn for non-standard fixes tag style usr/gen_init_cpio.c: remove unnecessary -1 values from int file ipc/msg: mitigate the lock contention with percpu counter percpu: add percpu_counter_add_local and percpu_counter_sub_local fs/ocfs2: fix repeated words in comments relay: use kvcalloc to alloc page array in relay_alloc_page_array proc: make config PROC_CHILDREN depend on PROC_FS fs: uninline inode_maybe_inc_iversion() ...
2022-10-12Merge tag 'loongarch-6.1' of ↵Linus Torvalds56-685/+4696
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Use EXPLICIT_RELOCS (ABIv2.0) - Use generic BUG() handler - Refactor TLB/Cache operations - Add qspinlock support - Add perf events support - Add kexec/kdump support - Add BPF JIT support - Add ACPI-based laptop driver - Update the default config file * tag 'loongarch-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits) LoongArch: Update Loongson-3 default config file LoongArch: Add ACPI-based generic laptop driver LoongArch: Add BPF JIT support LoongArch: Add some instruction opcodes and formats LoongArch: Move {signed,unsigned}_imm_check() to inst.h LoongArch: Add kdump support LoongArch: Add kexec support LoongArch: Use generic BUG() handler LoongArch: Add SysRq-x (TLB Dump) support LoongArch: Add perf events support LoongArch: Add qspinlock support LoongArch: Use TLB for ioremap() LoongArch: Support access filter to /dev/mem interface LoongArch: Refactor cache probe and flush methods LoongArch: mm: Refactor TLB exception handlers LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modules LoongArch: Support PC-relative relocations in modules LoongArch: Define ELF relocation types added in ABIv2.0 LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCS LoongArch: Add Kconfig option AS_HAS_EXPLICIT_RELOCS ...
2022-10-12LoongArch: Update Loongson-3 default config fileHuacai Chen1-8/+55
1, Enable ZBOOT, KEXEC and BPF_JIT; 2, Add more patition types; 3, Add some USB Type-C options; 4, Add some common network options; 5, Add some Bluetooth device drivers; 6, Remove obsolete config options (for some detailed information, see Link). Link: https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulwahn@gmail.com/ Co-developed-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Co-developed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add BPF JIT supportTiezhu Yang7-0/+1700
BPF programs are normally handled by a BPF interpreter, add BPF JIT support for LoongArch to allow the kernel to generate native code when a program is loaded into the kernel. This will significantly speed-up processing of BPF programs. Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add some instruction opcodes and formatsTiezhu Yang1-5/+174
According to the "Table of Instruction Encoding" in LoongArch Reference Manual [1], add some instruction opcodes and formats which are used in the BPF JIT for LoongArch. [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#table-of-instruction-encoding Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Move {signed,unsigned}_imm_check() to inst.hTiezhu Yang2-10/+10
{signed,unsigned}_imm_check() will also be used in the bpf jit, so move them from module.c to inst.h, this is preparation for later patches. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add kdump supportYouling Tang9-8/+227
This patch adds support for kdump. In kdump case the normal kernel will reserve a region for the crash kernel and jump there on panic. Arch-specific functions are added to allow for implementing a crash dump file interface, /proc/vmcore, which can be viewed as a ELF file. A user-space tool, such as kexec-tools, is responsible for allocating a separate region for the core's ELF header within the crash kdump kernel memory and filling it in when executing kexec_load(). Then, its location will be advertised to the crash dump kernel via a command line argument "elfcorehdr=", and the crash dump kernel will preserve this region for later use with arch_reserve_vmcore() at boot time. At the same time, the crash kdump kernel is also limited within the "crashkernel" area via a command line argument "mem=", so as not to destroy the original kernel dump data. In the crash dump kernel environment, /proc/vmcore is used to access the primary kernel's memory with copy_oldmem_page(). I tested kdump on LoongArch machines (Loongson-3A5000) and it works as expected (suggested crashkernel parameter is "crashkernel=512M@2560M"), you may test it by triggering a crash through /proc/sysrq-trigger: $ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1" # echo c > /proc/sysrq-trigger Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add kexec supportYouling Tang6-1/+400
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the LoongArch architecture, so as to add support for the kexec re-boot mechanism (CONFIG_KEXEC) on LoongArch platforms. Kexec supports loading vmlinux.elf in ELF format and vmlinux.efi in PE format. I tested kexec on LoongArch machines (Loongson-3A5000) and it works as expected: $ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline $ sudo kexec -e Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Use generic BUG() handlerYouling Tang4-12/+84
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for generic BUG traps"), do similar for LoongArch to use generic BUG() handler. This patch uses the BREAK software breakpoint instruction to generate a trap instead, similarly to most other arches, with the generic BUG code generating the dmesg boilerplate. This allows bug metadata to be moved to a separate table and reduces the amount of inline code at BUG() and WARN() sites. This also avoids clobbering any registers before they can be dumped. To mitigate the size of the bug table further, this patch makes use of the existing infrastructure for encoding addresses within the bug table as 32-bit relative pointers instead of absolute pointers. (Note: this limits the max kernel size to 2GB.) Before patch: [ 3018.338013] lkdtm: Performing direct entry BUG [ 3018.342445] Kernel bug detected[#5]: [ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35 After patch: [ 125.585985] lkdtm: Performing direct entry BUG [ 125.590433] ------------[ cut here ]------------ [ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78! [ 125.600211] Oops - BUG[#1]: [ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36 Out-of-line file/line data information obtained compared to before. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add SysRq-x (TLB Dump) supportHuacai Chen2-0/+67
Add SysRq-x (TLB Dump) support for LoongArch, which is useful for debugging. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add perf events supportHuacai Chen6-1/+987
The perf events infrastructure of LoongArch is very similar to old MIPS- based Loongson, so most of the codes are derived from MIPS. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add qspinlock supportHuacai Chen4-3/+26
On NUMA system, the performance of qspinlock is better than generic spinlock. Below is the UnixBench test results on a 8 nodes (4 cores per node, 32 cores in total) machine. A. With generic spinlock: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 449574022.5 38523.9 Double-Precision Whetstone 55.0 85190.4 15489.2 Execl Throughput 43.0 14696.2 3417.7 File Copy 1024 bufsize 2000 maxblocks 3960.0 143157.8 361.5 File Copy 256 bufsize 500 maxblocks 1655.0 37631.8 227.4 File Copy 4096 bufsize 8000 maxblocks 5800.0 444814.2 766.9 Pipe Throughput 12440.0 5047490.7 4057.5 Pipe-based Context Switching 4000.0 2021545.7 5053.9 Process Creation 126.0 23829.8 1891.3 Shell Scripts (1 concurrent) 42.4 33756.7 7961.5 Shell Scripts (8 concurrent) 6.0 4062.9 6771.5 System Call Overhead 15000.0 2479748.6 1653.2 ======== System Benchmarks Index Score 2955.6 B. With qspinlock: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 449467876.9 38514.8 Double-Precision Whetstone 55.0 85174.6 15486.3 Execl Throughput 43.0 14769.1 3434.7 File Copy 1024 bufsize 2000 maxblocks 3960.0 146150.5 369.1 File Copy 256 bufsize 500 maxblocks 1655.0 37496.8 226.6 File Copy 4096 bufsize 8000 maxblocks 5800.0 447527.0 771.6 Pipe Throughput 12440.0 5175989.2 4160.8 Pipe-based Context Switching 4000.0 2207747.8 5519.4 Process Creation 126.0 25125.5 1994.1 Shell Scripts (1 concurrent) 42.4 33461.2 7891.8 Shell Scripts (8 concurrent) 6.0 4024.7 6707.8 System Call Overhead 15000.0 2917278.6 1944.9 ======== System Benchmarks Index Score 3040.1 Signed-off-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Use TLB for ioremap()Huacai Chen7-56/+184
We can support more cache attributes (e.g., CC, SUC and WUC) and page protection when we use TLB for ioremap(). The implementation is based on GENERIC_IOREMAP. The existing simple ioremap() implementation has better performance so we keep it and introduce ARCH_IOREMAP to control the selection. We move pagetable_init() earlier to make early ioremap() works, and we modify the PCI ecam mapping because the TLB-based version of ioremap() will actually take the size into account. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Support access filter to /dev/mem interfaceHuacai Chen3-0/+34
Accidental access to /dev/mem is obviously disastrous, but specific access can be used by people debugging the kernel. So select GENERIC_ LIB_DEVMEM_IS_ALLOWED, as well as define ARCH_HAS_VALID_PHYS_ADDR_RANGE and related helpers, to support access filter to /dev/mem interface. Signed-off-by: Weihao Li <liweihao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Refactor cache probe and flush methodsHuacai Chen10-267/+236
Current cache probe and flush methods have some drawbacks: 1, Assume there are 3 cache levels and only 3 levels; 2, Assume L1 = I + D, L2 = V, L3 = S, V is exclusive, S is inclusive. However, the fact is I + D, I + D + V, I + D + S and I + D + V + S are all valid. So, refactor the cache probe and flush methods to adapt more types of cache hierarchy. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: mm: Refactor TLB exception handlersRui Wang1-290/+247
This patch simplifies TLB load, store and modify exception handlers: 1. Reduce instructions, such as alu/csr and memory access; 2. Execute tlb search instruction only in the fast path; 3. Return directly from the fast path for both normal and huge pages; 4. Re-tab the assembly for better vertical alignment. And fixes the concurrent modification issue of fast path for huge pages. This issue will occur in the following steps: CPU-1 (In TLB exception) CPU-2 (In THP splitting) 1: Load PMD entry (HUGE=1) 2: Goto huge path 3: Store PMD entry (HUGE=0) 4: Reload PMD entry (HUGE=0) 5: Fill TLB entry (PA is incorrect) This patch also slightly improves the TLB processing performance: * Normal pages: 2.15%, Huge pages: 1.70%. #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/mman.h> int main(int argc, char *argv[]) { size_t page_size; size_t mem_size; size_t off; void *base; int flags; int i; if (argc < 2) { fprintf(stderr, "%s MEM_SIZE [HUGE]\n", argv[0]); return -1; } page_size = sysconf(_SC_PAGESIZE); flags = MAP_PRIVATE | MAP_ANONYMOUS; mem_size = strtoul(argv[1], NULL, 10); if (argc > 2) flags |= MAP_HUGETLB; for (i = 0; i < 10; i++) { base = mmap(NULL, mem_size, PROT_READ, flags, -1, 0); if (base == MAP_FAILED) { fprintf(stderr, "Map memory failed!\n"); return -1; } for (off = 0; off < mem_size; off += page_size) *(volatile int *)(base + off); munmap(base, mem_size); } return 0; } Signed-off-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modulesXi Ruoyao4-7/+99
GCC >= 13 and GNU assembler >= 2.40 use these relocations to address external symbols, so we need to add them. Let the module loader emit GOT entries for data symbols so we would be able to handle GOT relocations. The GOT entry is just the data's symbol address. In module.lds, emit a stub .got section for a section header entry. The actual content of the section entry will be filled at runtime by module_ frob_arch_sections(). Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Support PC-relative relocations in modulesXi Ruoyao2-1/+75
Binutils >= 2.40 uses R_LARCH_B26 instead of R_LARCH_SOP_PUSH_PLT_PCREL, and R_LARCH_PCALA* instead of R_LARCH_SOP_PUSH_PCREL. Handle R_LARCH_B26 and R_LARCH_PCALA* in the module loader. For R_LARCH_ B26, also create a PLT entry as needed. Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Define ELF relocation types added in ABIv2.0Xi Ruoyao2-1/+38
These relocation types are used by GNU binutils >= 2.40 and GCC >= 13. Add their definitions so we will be able to use them in later patches. Link: https://github.com/loongson/LoongArch-Documentation/pull/57 Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCSXi Ruoyao4-6/+37
If explicit relocation hints are used by the toolchain, -Wa,-mla-* options will be useless for the C code. So only use them for the !CONFIG_AS_HAS_EXPLICIT_RELOCS case. Replace "la" with "la.pcrel" in head.S to keep the semantic consistent with new and old toolchains for the low level startup code. For per-CPU variables, the "address" of the symbol is actually an offset from $r21. The value is near the loading address of main kernel image, but far from the loading address of modules. So we use model("extreme") attibute to tell the compiler that a PC-relative addressing with 32-bit offset is not sufficient for local per-CPU variables. The behavior with different assemblers and compilers are summarized in the following table: AS has CC has explicit relocs explicit relocs * Behavior ============================================================== No No Use la.* macros. No change from Linux 6.0. -------------------------------------------------------------- No Yes Disable explicit relocs. No change from Linux 6.0. -------------------------------------------------------------- Yes No Not supported. -------------------------------------------------------------- Yes Yes Enable explicit relocs. No -Wa,-mla* options used. ============================================================== *: We assume CC must have model attribute if it has explicit relocs. Both features are added in GCC 13 development cycle, so any GCC release >= 13 should be OK. Using early GCC 13 development snapshots may produce modules with unsupported relocations. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f09482a Link: https://gcc.gnu.org/r13-1834 Link: https://gcc.gnu.org/r13-2199 Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add Kconfig option AS_HAS_EXPLICIT_RELOCSXi Ruoyao1-0/+3
GNU as >= 2.40 and GCC >= 13 will support using explicit relocation hints in the assembly code, instead of la.* macros. The usage of explicit relocation hints can improve code generation so it's enabled by default by GCC >= 13. Introduce a Kconfig option AS_HAS_EXPLICIT_RELOCS as the switch for "use explicit relocation hints or not". Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Kconfig: Fix spelling mistake "delibrately" -> "deliberately"Colin Ian King1-1/+1
There is a spelling mistake in a commented section. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Mark __xchg() and __cmpxchg() as __always_inlineHuacai Chen1-4/+4
Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows compiler to uninline functions marked as 'inline'. In case of __xchg()/__cmpxchg() this would cause to reference BUILD_BUG(), which is an error case for catching bugs and will not happen for correct code, if __xchg()/__cmpxchg() is inlined. This bug can be produced with CONFIG_DEBUG_SECTION_MISMATCH enabled, and the solution is similar to below commits: 46f1619500d0225 ("MIPS: include: Mark __xchg as __always_inline"), 88356d09904bc60 ("MIPS: include: Mark __cmpxchg as __always_inline"). Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>