path: root/include/linux
Age | Commit message | Author | Files | Lines
2025-12-17 | perf/x86/core: Register a new vector for handling mediated guest PMIs | Sean Christopherson | 1 | -0/+8
Wire up system vector 0xf5 for handling PMIs (i.e. interrupts delivered through the LVTPC) while running KVM guests with a mediated PMU. Perf currently delivers all PMIs as NMIs, e.g. so that events that trigger while IRQs are disabled aren't delayed and generate useless records, but due to the multiplexing of NMIs throughout the system, correctly identifying NMIs for a mediated PMU is practically infeasible. To (greatly) simplify identifying guest mediated PMU PMIs, perf will switch the CPU's LVTPC between PERF_GUEST_MEDIATED_PMI_VECTOR and NMI when guest PMU context is loaded/put. I.e. PMIs that are generated by the CPU while the guest is active will be identified purely based on the IRQ vector. Route the vector through perf, e.g. as opposed to letting KVM attach a handler directly a la posted interrupt notification vectors, as perf owns the LVTPC and thus is the rightful owner of PERF_GUEST_MEDIATED_PMI_VECTOR. Functionally, having KVM directly own the vector would be fine (both KVM and perf will be completely aware of when a mediated PMU is active), but would lead to an undesirable split in ownership: perf would be responsible for installing the vector, but not handling the resulting IRQs. Add a new perf_guest_info_callbacks hook (and static call) to allow KVM to register its handler with perf when running guests with mediated PMUs. Note, because KVM always runs guests with host IRQs enabled, there is no danger of a PMI being delayed from the guest's perspective due to using a regular IRQ instead of an NMI. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Xudong Hao <xudong.hao@intel.com> Link: https://patch.msgid.link/20251206001720.468579-9-seanjc@google.com
2025-12-17 | perf: Add APIs to load/put guest mediated PMU context | Kan Liang | 1 | -0/+2
Add exported APIs to load/put a guest mediated PMU context. KVM will load the guest PMU shortly before VM-Enter, and put the guest PMU shortly after VM-Exit. On the perf side of things, schedule out all exclude_guest events when the guest context is loaded, and schedule them back in when the guest context is put. I.e. yield the hardware PMU resources to the guest, by way of KVM. Note, perf is only responsible for managing host context. KVM is responsible for loading/storing guest state to/from hardware. [sean: shuffle patches around, write changelog] Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Xudong Hao <xudong.hao@intel.com> Link: https://patch.msgid.link/20251206001720.468579-8-seanjc@google.com
2025-12-17 | perf: Add a EVENT_GUEST flag | Kan Liang | 1 | -0/+6
Current perf doesn't explicitly schedule out all exclude_guest events while the guest is running. This is not a problem for the current emulated vPMU, because perf owns all the PMU counters: it can mask the counter assigned to an exclude_guest event when a guest is running (Intel way), or set the corresponding HOSTONLY bit in eventsel (AMD way), so the counter doesn't count while a guest is running. However, neither approach works with the newly introduced mediated vPMU. A guest owns all the PMU counters when it's running, so the host should not mask any counters: a counter may be in use by the guest, and eventsel may be overwritten. Perf should explicitly schedule out all exclude_guest events to release the PMU resources when entering a guest, and resume counting when exiting the guest. An exclude_guest event may also be created while a guest is running; such a new event should not be scheduled in either. The ctx time is shared among different PMUs, so it cannot be stopped while a guest is running: it is still required to calculate the time for events from other PMUs, e.g., uncore events. Add timeguest to track the guest run time. For an exclude_guest event, the elapsed time equals the ctx time - guest time. Cgroup has dedicated times. Use the same method to deduct the guest time from the cgroup time as well. [sean: massage comments] Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Xudong Hao <xudong.hao@intel.com> Link: https://patch.msgid.link/20251206001720.468579-7-seanjc@google.com
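The time accounting described in this changelog can be illustrated with a minimal userspace sketch. The struct and function names below are hypothetical stand-ins chosen for illustration, not the kernel's perf structures; only the "ctx time minus guest time" rule comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for a perf context's clocks (not the kernel's struct). */
struct ctx_clock {
    uint64_t ctx_time_ns;   /* total time the context has been running */
    uint64_t guest_time_ns; /* part of ctx_time_ns spent while a guest was running */
};

/*
 * Elapsed time as seen by one event. The shared ctx clock never stops,
 * so an exclude_guest event subtracts the guest time instead.
 */
uint64_t event_elapsed_ns(const struct ctx_clock *c, int exclude_guest)
{
    if (exclude_guest)
        return c->ctx_time_ns - c->guest_time_ns;
    return c->ctx_time_ns;
}
```

Keeping a single running ctx clock plus a separate guest-time accumulator is what lets events from other PMUs (e.g. uncore) keep using the unmodified ctx time.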
2025-12-17 | perf: Clean up perf ctx time | Kan Liang | 1 | -6/+7
The current perf tracks two timestamps for the normal ctx and cgroup. The same type of variables and similar codes are used to track the timestamps. In the following patch, the third timestamp to track the guest time will be introduced. To avoid the code duplication, add a new struct perf_time_ctx and factor out a generic function update_perf_time_ctx(). No functional change. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Xudong Hao <xudong.hao@intel.com> Link: https://patch.msgid.link/20251206001720.468579-6-seanjc@google.com
2025-12-17 | perf: Add APIs to create/release mediated guest vPMUs | Kan Liang | 1 | -0/+6
Currently, exposing PMU capabilities to a KVM guest is done by emulating guest PMCs via host perf events, i.e. by having KVM be "just" another user of perf. As a result, the guest and host are effectively competing for resources, and emulating guest accesses to vPMU resources requires expensive actions (expensive relative to the native instruction). The overhead and resource competition result in degraded guest performance and ultimately very poor vPMU accuracy. To address the issues with the perf-emulated vPMU, introduce a "mediated vPMU", where the data plane (PMCs and enable/disable knobs) is exposed directly to the guest, but the control plane (event selectors and access to fixed counters) is managed by KVM (via MSR interceptions). To allow host perf usage of the PMU to (partially) co-exist with KVM/guest usage of the PMU, KVM and perf will coordinate a world switch between host perf context and guest vPMU context near VM-Enter/VM-Exit. Add two exported APIs, perf_{create,release}_mediated_pmu(), to allow KVM to create and release a mediated PMU instance (per VM). Because host perf context will be deactivated while the guest is running, mediated PMU usage will be mutually exclusive with perf analysis of the guest, i.e. perf events that do NOT exclude the guest will not behave as expected. To avoid silent failure of !exclude_guest perf events, disallow creating a mediated PMU if there are active !exclude_guest events, and on the perf side, disallow creating new !exclude_guest perf events while there is at least one active mediated PMU. Exempt PMU resources that do not support mediated PMU usage, i.e. that are outside the scope/view of KVM's vPMU and will not be swapped out while the guest is running. Guard mediated PMU with a new kconfig to help readers identify code paths that are unique to mediated PMU support, and to allow for adding arch-specific hooks without stubs.
KVM x86 is expected to be the only KVM architecture to support a mediated PMU in the near future (e.g. arm64 is trending toward a partitioned PMU implementation), and KVM x86 will select PERF_GUEST_MEDIATED_PMU unconditionally, i.e. won't need stubs. Immediately select PERF_GUEST_MEDIATED_PMU when KVM x86 is enabled so that all paths are compile tested. Full KVM support is on its way... [sean: add kconfig and WARNing, rewrite changelog, swizzle patch ordering] Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Xudong Hao <xudong.hao@intel.com> Link: https://patch.msgid.link/20251206001720.468579-5-seanjc@google.com
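The mutual-exclusion rule in this changelog (no mediated PMU while !exclude_guest events are active, and no new !exclude_guest events while a mediated PMU exists) can be sketched with two counters. This is a simplified single-threaded illustration; every name below is hypothetical, and the real perf code uses its own bookkeeping and locking.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; not the kernel's data structures. */
struct pmu_usage {
    unsigned int nr_include_guest_events; /* active !exclude_guest events */
    unsigned int nr_mediated_pmu_vms;     /* VMs currently holding a mediated PMU */
};

/* Refuse a mediated PMU while any event still wants to count the guest. */
bool create_mediated_pmu(struct pmu_usage *u)
{
    if (u->nr_include_guest_events)
        return false;
    u->nr_mediated_pmu_vms++;
    return true;
}

void release_mediated_pmu(struct pmu_usage *u)
{
    if (u->nr_mediated_pmu_vms)
        u->nr_mediated_pmu_vms--;
}

/* exclude_guest events are always fine; others conflict with a mediated PMU. */
bool create_event(struct pmu_usage *u, bool exclude_guest)
{
    if (exclude_guest)
        return true;
    if (u->nr_mediated_pmu_vms)
        return false;
    u->nr_include_guest_events++;
    return true;
}
```

Failing the creation call up front is what turns a would-be silent miscount into an explicit error for the caller.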
2025-12-17 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1 | Alexei Starovoitov | 245 | -2028/+8162
Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-17 | Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf | Linus Torvalds | 1 | -0/+3
Pull bpf fixes from Alexei Starovoitov:
 - Fix BPF builds due to -fms-extensions. selftests (Alexei Starovoitov), bpftool (Quentin Monnet).
 - Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n (Geert Uytterhoeven)
 - Fix livepatch/BPF interaction and support reliable unwinding through BPF stack frames (Josh Poimboeuf)
 - Do not audit capability check in arm64 JIT (Ondrej Mosnacek)
 - Fix truncated dmabuf BPF iterator reads (T.J. Mercier)
 - Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu)
 - Fix warnings in libbpf when built with -Wdiscarded-qualifiers under C23 (Mikhail Gavrilov)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: add regression test for bpf_d_path()
  bpf: Fix verifier assumptions of bpf_d_path's output buffer
  selftests/bpf: Add test for truncated dmabuf_iter reads
  bpf: Fix truncated dmabuf iterator reads
  x86/unwind/orc: Support reliable unwinding through BPF stack frames
  bpf: Add bpf_has_frame_pointer()
  bpf, arm64: Do not audit capability check in do_jit()
  libbpf: Fix -Wdiscarded-qualifiers under C23
  bpftool: Fix build warnings due to MS extensions
  net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT
  selftests/bpf: Add -fms-extensions to bpf build flags
2025-12-17 | scsi: scsi_transport_fc: Introduce encryption group in fc_rport attribute | Sarah Catania | 1 | -0/+1
Introduce a new structure for reporting an encrypted session over an fc_rport. The encryption group is added as an attribute in struct fc_rport and reports information in fc_encryption_info. This structure contains a status member variable, which stores a bit value indicating an encrypted session. Signed-off-by: Sarah Catania <sarah.catania@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251211001659.138635-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-12-17 | soc: qcom: llcc-qcom: Add support for Glymur | Pankaj Patil | 1 | -0/+4
Add system cache table (SCT) and configs for the Glymur SoC. Update the list of usecase IDs to enable additional clients for Glymur. Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Pankaj Patil <pankaj.patil@oss.qualcomm.com> Link: https://lore.kernel.org/r/20251211-glymur_llcc_enablement-v3-2-43457b354b0d@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-12-16 | dmaengine: shdma: correct most kernel-doc issues in shdma-base.h | Randy Dunlap | 1 | -14/+14
Fix kernel-doc comments in include/linux/shdma-base.h to avoid most warnings: - prefix an enum name with "enum" - prefix enum values with '@' - prefix struct member names with '@' shdma-base.h:28: warning: cannot understand function prototype: 'enum shdma_pm_state ' Warning: shdma-base.h:103 struct member 'desc_completed' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'halt_channel' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'channel_busy' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'slave_addr' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'desc_setup' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'set_slave' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'setup_xfer' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'start_xfer' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'embedded_desc' not described in 'shdma_ops' Warning: shdma-base.h:103 struct member 'chan_irq' not described in 'shdma_ops' This one is not fixed: from 4f46f8ac80416: Warning: shdma-base.h:103 struct member 'get_partial' not described in 'shdma_ops' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251104002001.445297-1-rdunlap@infradead.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-12-16 | dmaengine: dw_edma: correct kernel-doc warnings in <linux/dma/edma.h> | Randy Dunlap | 1 | -13/+11
Use the correct enum name in its kernel-doc heading. Add ending ':' to struct member names. Drop the @id: kernel-doc entry since there is no struct member named 'id'. edma.h:46: warning: expecting prototype for struct dw_edma_core_ops. Prototype was for struct dw_edma_plat_ops instead Warning: edma.h:101 struct member 'ops' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'flags' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'reg_base' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'll_wr_cnt' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'll_rd_cnt' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'll_region_wr' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'll_region_rd' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'dt_region_wr' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'dt_region_rd' not described in 'dw_edma_chip' Warning: edma.h:101 struct member 'mf' not described in 'dw_edma_chip' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251101191524.1991135-1-rdunlap@infradead.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-12-16 | audit: add audit_log_nf_skb helper function | Ricardo Robaina | 1 | -0/+8
Netfilter code (net/netfilter/nft_log.c and net/netfilter/xt_AUDIT.c) have to be kept in sync. Both source files had duplicated versions of audit_ip4() and audit_ip6() functions, which can result in lack of consistency and/or duplicated work. This patch adds a helper function in audit.c that can be called by netfilter code commonly, aiming to improve maintainability and consistency. Suggested-by: Florian Westphal <fw@strlen.de> Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Ricardo Robaina <rrobaina@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-12-16 | efi: Support EDID information | Thomas Zimmermann | 1 | -4/+5
In the EFI config table, rename LINUX_EFI_SCREEN_INFO_TABLE_GUID to LINUX_EFI_PRIMARY_DISPLAY_TABLE_GUID. Read sysfb_primary_display from the entry. In addition to the screen_info, the entry now also contains EDID information. In libstub, replace struct screen_info with struct sysfb_display_info from the kernel's sysfb_primary_display and rename functions accordingly. Transfer it to the runtime kernel using the kernel's global state or the LINUX_EFI_PRIMARY_DISPLAY_TABLE_GUID config-table entry. With CONFIG_FIRMWARE_EDID=y, libstub now transfers the GOP device's EDID information to the kernel. If CONFIG_FIRMWARE_EDID=n, EDID information is disabled. Make the Kconfig symbol CONFIG_FIRMWARE_EDID available with EFI. Setting the value to 'n' disables EDID support. Also rename screen_info.c to primary_display.c and adapt the contained comment according to the changes. Link: https://lore.kernel.org/all/20251126160854.553077-8-tzimmermann@suse.de/ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> [ardb: depend on EFI_GENERIC_STUB not EFI, fix conflicts after dropping the preceding patch from the series] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-12-16 | sysfb: Move edid_info into sysfb_primary_display | Thomas Zimmermann | 1 | -0/+6
Move x86's edid_info into sysfb_primary_display as a new field named edid. Adapt all users. An instance of edid_info has only been defined on x86. With the move into sysfb_primary_display, it becomes available on all architectures. Therefore remove this constraint from CONFIG_FIRMWARE_EDID. x86 fills the EDID data from boot_params.edid_info. DRM drivers pick up the raw data and make it available to DRM clients. Replace the drivers' references to edid_info and instead use the sysfb_display_info as passed from sysfb. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-12-16 | sysfb: Replace screen_info with sysfb_primary_display | Thomas Zimmermann | 2 | -4/+3
Replace the global screen_info with sysfb_primary_display of type struct sysfb_display_info. Adapt all users of screen_info. Instances of screen_info are defined for x86, loongarch and EFI, with only one instance compiled into a specific build. Replace all of them with sysfb_primary_display. All existing users of screen_info are updated by pointing them to sysfb_primary_display.screen instead. This introduces some churn to the code, but has no impact on functionality. Boot parameters and EFI config tables are unchanged. They transfer screen_info as before. The logic in EFI's alloc_screen_info() changes slightly, as it now returns the screen field of sysfb_primary_display. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/ Reviewed-by: Richard Lyu <richard.lyu@suse.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-12-16 | sysfb: Add struct sysfb_display_info | Thomas Zimmermann | 1 | -0/+5
Add struct sysfb_display_info to wrap display-related state. For now it contains only the screen's video mode. Later EDID will be added as well. This struct will be helpful for passing display state to sysfb drivers or from the EFI stub library. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Richard Lyu <richard.lyu@suse.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-12-16 | efi: sysfb_efi: Reduce number of references to global screen_info | Thomas Zimmermann | 1 | -4/+5
Replace usage of global screen_info with local pointers. This will later reduce churn when screen_info is being moved. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Richard Lyu <richard.lyu@suse.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-12-16 | mtd: spinand: add support for Dosilicon DS35Q1GA/DS35M1GA | Ahmed Naseef | 1 | -0/+1
Add support for Dosilicon DS35Q1GA (3.3V) and DS35M1GA (1.8V) SPI NAND. These are 1Gbit (128MB) devices with:
 - 2048 byte pages + 64 byte OOB
 - 64 pages per block, 1024 blocks
 - On-die 4-bit ECC per 512 byte sector
The 64-byte OOB area is divided into 4 segments of 16 bytes, with each segment containing 8 bytes of user data (M2+M1) and 8 bytes of ECC parity (R1). This provides 30 bytes of usable OOB space after reserving 2 bytes for the bad block marker.
Tested on Genexis Platinum 4410 (EcoNet EN751221) by writing known patterns to OOB and verifying ECC parity placement in R1 regions.
Datasheet: https://www.dosilicon.com/resources/SPI%20NAND/DS35X1GAXXX_rev08.pdf
Signed-off-by: Ahmed Naseef <naseefkm@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
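The geometry figures quoted in this changelog can be cross-checked with a little arithmetic. The sketch below is a standalone illustration with hypothetical constant and function names; it is not the driver's code and does not use the kernel's SPI NAND API.

```c
#include <stdint.h>

/* Figures taken from the commit message; the identifiers are illustrative only. */
enum {
    NAND_PAGE_BYTES        = 2048,
    NAND_PAGES_PER_BLOCK   = 64,
    NAND_BLOCKS            = 1024,
    NAND_OOB_BYTES         = 64,
    NAND_OOB_SEGMENTS      = 4,
    NAND_USER_BYTES_PER_SEG = 8, /* M2+M1 user data per 16-byte segment */
    NAND_BBM_BYTES         = 2,  /* reserved for the bad block marker */
};

/* Main-area capacity: page size x pages per block x blocks. */
uint64_t nand_capacity_bytes(void)
{
    return (uint64_t)NAND_PAGE_BYTES * NAND_PAGES_PER_BLOCK * NAND_BLOCKS;
}

/* User OOB bytes left after reserving the bad block marker. */
unsigned int nand_usable_oob_bytes(void)
{
    return NAND_OOB_SEGMENTS * NAND_USER_BYTES_PER_SEG - NAND_BBM_BYTES;
}

/* OOB bytes consumed by ECC parity (the R1 regions). */
unsigned int nand_ecc_parity_bytes(void)
{
    return NAND_OOB_BYTES - NAND_OOB_SEGMENTS * NAND_USER_BYTES_PER_SEG;
}
```

The capacity works out to 128 MiB (1 Gbit), and the OOB split to 30 usable bytes plus 32 parity bytes, matching the numbers stated above.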
2025-12-16 | Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 1 | -1/+1
Pull shmem rename fixes from Al Viro: "A couple of shmem rename fixes - recent regression from tree-in-dcache series and older breakage from stable directory offsets stuff" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: shmem: fix recovery on rename failures shmem_whiteout(): fix regression from tree-in-dcache series
2025-12-16 | shmem: fix recovery on rename failures | Al Viro | 1 | -1/+1
maple_tree insertions can fail if we are seriously short on memory; simple_offset_rename() does not recover well if it runs into that. The same goes for simple_offset_rename_exchange(). Moreover, shmem_whiteout() expects that if it succeeds, the caller will progress to d_move(), i.e. that shmem_rename2() won't fail past the successful call of shmem_whiteout(). Not hard to fix, fortunately - mtree_store() can't fail if the index we are trying to store into is already present in the tree as a singleton. For simple_offset_rename_exchange() that's enough - we just need to be careful about the order of operations. For simple_offset_rename() solution is to preinsert the target into the tree for new_dir; the rest can be done without any potentially failing operations. That preinsertion has to be done in shmem_rename2() rather than in simple_offset_rename() itself - otherwise we'd need to deal with the possibility of failure after successful shmem_whiteout(). Fixes: a2e459555c5f ("shmem: stable directory offsets") Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-12-16 | lsm: fix kernel-doc struct member names | Randy Dunlap | 1 | -2/+2
Use the correct struct member names to avoid kernel-doc warnings: Warning: include/linux/lsm_hooks.h:83 struct member 'name' not described in 'lsm_id' Warning: include/linux/lsm_hooks.h:183 struct member 'initcall_device' not described in 'lsm_info' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-12-16 | irqchip: Add RZ/{T2H,N2H} Interrupt Controller (ICU) driver | Cosmin Tanislav | 1 | -0/+23
The Renesas RZ/T2H (R9A09G077) and Renesas RZ/N2H (R9A09G087) SoCs have an Interrupt Controller (ICU) that supports interrupts from external pins IRQ0 to IRQ15, and SEI, and software-triggered interrupts INTCPU0 to INTCPU15. INTCPU0 to INTCPU13, IRQ0 to IRQ13 are non-safety interrupts, while INTCPU14, INTCPU15, IRQ14, IRQ15 and SEI are safety interrupts, and are exposed via a separate register space. Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251201112933.488801-3-cosmin-gabriel.tanislav.xa@renesas.com
2025-12-16 | genirq: Add interrupt redirection infrastructure | Radu Rendec | 2 | -1/+26
Add infrastructure to redirect interrupt handler execution to a different CPU when the current CPU is not part of the interrupt's CPU affinity mask. This is primarily aimed at (de)multiplexed interrupts, where the child interrupt handler runs in the context of the parent interrupt handler, and therefore CPU affinity control for the child interrupt is typically not available. With the new infrastructure, the child interrupt is allowed to freely change its affinity setting, independently of the parent. If the interrupt handler happens to be triggered on an "incompatible" CPU (a CPU that's not part of the child interrupt's affinity mask), the handler is redirected and runs in IRQ work context on a "compatible" CPU. No functional change is being made to any existing irqchip driver, and irqchip drivers must be explicitly modified to use the newly added infrastructure to support interrupt redirection. Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Radu Rendec <rrendec@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/linux-pci/878qpg4o4t.ffs@tglx/ Link: https://patch.msgid.link/20251128212055.1409093-2-rrendec@redhat.com
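The redirection decision described in this changelog boils down to an affinity-mask check. The following is a minimal sketch under stated assumptions: a 64-bit bitmask stands in for the kernel's cpumask, the function name is hypothetical, and "first CPU in the mask" is an arbitrary illustrative target choice, not necessarily what the new infrastructure does. The actual hand-off through IRQ work is not modelled.

```c
#include <stdint.h>

/*
 * Return the CPU on which the child interrupt's handler should run:
 * the current CPU if it is "compatible" (in the affinity mask),
 * otherwise the lowest-numbered CPU in the mask. Returns -1 for an
 * empty mask. Hypothetical helper, not a kernel API.
 */
int pick_handler_cpu(uint64_t affinity_mask, unsigned int this_cpu)
{
    if (this_cpu < 64 && (affinity_mask & (UINT64_C(1) << this_cpu)))
        return (int)this_cpu;   /* no redirection needed */

    for (unsigned int cpu = 0; cpu < 64; cpu++)
        if (affinity_mask & (UINT64_C(1) << cpu))
            return (int)cpu;    /* redirect to a compatible CPU */

    return -1;
}
```

A result that differs from this_cpu corresponds to the "incompatible CPU" case in which the handler would be redirected.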
2025-12-16 | genirq: Remove setup_percpu_irq() | Marc Zyngier | 1 | -3/+0
setup_percpu_irq() was always a bad kludge, and should never have been there in the first place. Now that the last users are gone, remove it for good. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251210082242.360936-7-maz@kernel.org
2025-12-16 | genirq: Remove __request_percpu_irq() helper | Marc Zyngier | 1 | -14/+4
With the IRQ timing stuff being gone, there is no need to specify a flag when requesting a percpu interrupt. Not only was IRQF_TIMER the only flag (set of flags, actually) allowed, but nobody ever passed it. Get rid of __request_percpu_irq(), which was only getting 0 as flags, and promote request_percpu_irq_affinity() as its replacement. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251210082242.360936-3-maz@kernel.org
2025-12-16 | genirq: Remove IRQ timing tracking infrastructure | Marc Zyngier | 1 | -6/+0
The IRQ timing tracking infrastructure was merged in 2019, but was never plumbed in, is not selectable, and is therefore never used. As Daniel agrees that there is little hope for this infrastructure to be completed in the near term, drop it altogether. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/87zf7vex6h.wl-maz@kernel.org Link: https://patch.msgid.link/20251210082242.360936-2-maz@kernel.org
2025-12-15 | genirq/msi: Correct kernel-doc in <linux/msi.h> | Randy Dunlap | 1 | -6/+7
Eliminate all kernel-doc warnings in <linux/msi.h>: - add "struct" to struct kernel-doc headers - add missing struct member descriptions or correct typos in them Fixes these warnings: Warning: include/linux/msi.h:60 cannot understand function prototype: 'struct msi_msg' Warning: include/linux/msi.h:73 struct member 'arch_addr_lo' not described in 'msi_msg' Warning: include/linux/msi.h:73 struct member 'arch_addr_hi' not described in 'msi_msg' Warning: include/linux/msi.h:106 cannot understand function prototype: 'struct pci_msi_desc' Warning: include/linux/msi.h:124 struct member 'msi_attrib' not described in 'pci_msi_desc' Warning: include/linux/msi.h:204 struct member 'sysfs_attrs' not described in 'msi_desc' Warning: include/linux/msi.h:227 struct member 'domain' not described in 'msi_dev_domain' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251214202341.2205675-1-rdunlap@infradead.org
2025-12-15 | time/timecounter: Inline timecounter_cyc2time() | Eric Dumazet | 1 | -2/+29
New network transport protocols want NIC drivers to get hardware timestamps of all incoming packets, and possibly all outgoing packets. One example is the upcoming 'Swift congestion control' which is used by TCP transport and is the primary need for timecounter_cyc2time(). This means timecounter_cyc2time() can be called more than 100 million times per second on a busy server. Inlining timecounter_cyc2time() brings a 12% improvement on a UDP receive stress test on a 100Gbit NIC. Note that FDO, LTO, PGO are unable to magically help for this case, presumably because NIC drivers are almost exclusively shipped as modules. Add an unlikely() around the cc_cyc2ns_backwards() case, even if FDO (when used) is able to take care of this optimization. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://research.google/pubs/swift-delay-is-simple-and-effective-for-congestion-control-in-the-datacenter/ Link: https://patch.msgid.link/20251129095740.3338476-1-edumazet@google.com
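To make the hot path concrete, here is a simplified userspace sketch of a cycle-stamp-to-nanoseconds conversion written as a static inline helper. It is modelled loosely on the mult/shift scheme and is an assumption-laden illustration: the struct and function names are hypothetical, fractional-nanosecond handling is omitted, and it is not the kernel's implementation.

```c
#include <stdint.h>

/* Branch hint; falls back to a no-op on compilers without the builtin. */
#if defined(__GNUC__)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define sketch_unlikely(x) (x)
#endif

/* Hypothetical, simplified counter description. */
struct cyc_counter {
    uint64_t mask;  /* valid bits of the hardware counter */
    uint32_t mult;  /* ns = (cycles * mult) >> shift */
    uint32_t shift;
};

struct time_counter {
    struct cyc_counter cc;
    uint64_t cycle_last; /* counter value at the last update */
    uint64_t nsec;       /* nanoseconds corresponding to cycle_last */
};

static inline uint64_t cyc_to_ns(const struct cyc_counter *cc, uint64_t cycles)
{
    return (cycles * cc->mult) >> cc->shift;
}

/* Convert a raw cycle timestamp to nanoseconds relative to the last update. */
static inline uint64_t cyc2time(const struct time_counter *tc, uint64_t cycle_tstamp)
{
    uint64_t delta = (cycle_tstamp - tc->cycle_last) & tc->cc.mask;

    /* A delta beyond half the counter range is treated as a stamp from the past. */
    if (sketch_unlikely(delta > tc->cc.mask / 2)) {
        delta = (tc->cycle_last - cycle_tstamp) & tc->cc.mask;
        return tc->nsec - cyc_to_ns(&tc->cc, delta);
    }
    return tc->nsec + cyc_to_ns(&tc->cc, delta);
}
```

Because the helper is defined inline in a header-style fashion, a caller in a separately built module can have the whole conversion folded into its receive path, which is the effect the changelog is after.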
2025-12-15 | filelock: allow lease_managers to dictate what qualifies as a conflict | Jeff Layton | 1 | -0/+1
Requesting a delegation on a file from the userland fcntl() interface currently succeeds when there are conflicting opens present. This is because the lease handling code ignores conflicting opens for FL_LAYOUT and FL_DELEG leases. This was a hack put in place long ago, because nfsd already checks for conflicts in its own way. The kernel needs to perform this check for userland delegations the same way it is done for leases, however. Make this dependent on the lease_manager by adding a new ->lm_open_conflict() lease_manager operation and have generic_add_lease() call that instead of check_conflicting_open(). Morph check_conflicting_open() into a ->lm_open_conflict() op that is only called for userland leases/delegations. Set the ->lm_open_conflict() operations for nfsd to trivial functions that always return 0. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20251204-dir-deleg-ro-v2-2-22d37f92ce2c@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15 | iomap: replace folio_batch allocation with stack allocation | Brian Foster | 1 | -2/+6
Zhang Yi points out that the dynamic folio_batch allocation in iomap_fill_dirty_folios() is problematic for the ext4 on iomap work that is under development because it doesn't sufficiently handle the allocation failure case (by allowing a retry, for example). We've also seen lockdep (via syzbot) complain recently about the scope of the allocation. The dynamic allocation was initially added for simplicity and to help indicate whether the batch was used or not by the calling fs. To address these issues, put the batch on the stack of iomap_zero_range() and use a flag to control whether the batch should be used in the iomap folio lookup path. This keeps things simple and eliminates allocation issues with lockdep and for ext4 on iomap. While here, also clean up the fill helper signature to be more consistent with the underlying filemap helper. Pass through the return value of the filemap helper (folio count) and update the lookup offset via an out param. Fixes: 395ed1ef0012 ("iomap: optional zero range dirty folio processing") Signed-off-by: Brian Foster <bfoster@redhat.com> Link: https://patch.msgid.link/20251208140548.373411-1-bfoster@redhat.com Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15 | ASoC: Fix acronym for Intel Gemini Lake | Andy Shevchenko | 1 | -1/+2
While the used GML is consistent with the pattern for other Intel * Lake SoCs, the de facto use is GLK. Update the acronym and users accordingly. Note, a handful of the drivers for Gemini Lake in the Linux kernel use GLK already (LPC, MEI, pin control, SDHCI, ...) and even some in ASoC. The only ones in this patch used the inconsistent one. Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://patch.msgid.link/20251212181742.3944789-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-12-15 | fs: Remove internal old mount API code | Eric Sandeen | 2 | -3/+0
Now that the last in-tree filesystem has been converted to the new mount API, remove all legacy mount API code designed to handle un-converted filesystems, and remove associated documentation as well. (The code to handle the legacy mount(2) syscall from userspace is still in place, of course.) Tested with an allmodconfig build on x86_64, and a sanity check of an old mount(2) syscall mount. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://patch.msgid.link/20251212174403.2882183-1-sandeen@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15 | ns: pad refcount | Mateusz Guzik | 1 | -1/+3
Note no effort is made to make sure structs embedding the namespace are themselves aligned, so this is not guaranteed to eliminate cacheline bouncing due to refcount management. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251203092851.287617-2-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15fs: track the inode having file locks with a flag in ->i_opflagsMateusz Guzik2-4/+12
Opening and closing an inode dirties the ->i_readcount field. Depending on the alignment of the inode, it may happen to false-share, to varying extents, with other fields loaded for both operations. This notably concerns the ->i_flctx field. Since most inodes don't have the field populated, this bit can be managed with a flag in ->i_opflags instead, which bypasses the problem. Here are results I obtained while opening a file read-only in a loop with 24 cores doing the work on Sapphire Rapids. Utilizing the flag as opposed to reading the ->i_flctx field was toggled at runtime as the benchmark was running, to make sure both results come from the same alignment. before: 3233740 after: 3373346 (+4%) before: 3284313 after: 3518711 (+7%) before: 3505545 after: 4092806 (+16%) Or to put it differently, this varies wildly depending on how (un)lucky you get. The primary bottleneck before and after is the avoidable lockref trip in do_dentry_open(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251203094837.290654-2-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15filelock: use a consume fence in locks_inode_context()Mateusz Guzik1-1/+4
Matches the idiom of storing a pointer with a release fence and safely getting the content with a consume fence after. Eliminates an actual fence on some archs. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251203094837.290654-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
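The release/consume pairing described above can be sketched in userspace with C11 atomics. This is only an illustrative analogue, not the kernel code: `struct fake_inode`, `struct lock_ctx`, `publish_ctx()` and `get_ctx()` are hypothetical names, and the kernel uses its own barrier primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for an inode and its lock context. */
struct lock_ctx {
	int nr_locks;
};

struct fake_inode {
	_Atomic(struct lock_ctx *) flctx;
};

/*
 * Publisher: fully initialise the object first, then store the pointer
 * with release semantics so the initialisation is visible to any reader
 * that observes the pointer.
 */
static void publish_ctx(struct fake_inode *inode, struct lock_ctx *ctx,
			int nr_locks)
{
	ctx->nr_locks = nr_locks;
	atomic_store_explicit(&inode->flctx, ctx, memory_order_release);
}

/*
 * Reader: load the pointer with consume ordering. Accesses made through
 * the returned pointer are ordered after the load by data dependency.
 * Note that current compilers typically implement consume as acquire.
 */
static struct lock_ctx *get_ctx(struct fake_inode *inode)
{
	return atomic_load_explicit(&inode->flctx, memory_order_consume);
}
```

A reader that gets NULL back knows no context has been published yet; a reader that gets a non-NULL pointer may dereference it and see the initialised fields.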
2025-12-15VFS/knfsd: Teach dentry_create() to use atomic_open()Benjamin Coddington1-1/+1
While knfsd offers combined exclusive create and open results to clients, on some filesystems those results may not be atomic. This behavior can be observed. For example, an open O_CREAT with mode 0 will succeed in creating the file but unexpectedly return -EACCES from vfs_open(). Additionally, reducing the number of remote RPC calls required for O_CREAT on network filesystems provides a performance benefit in the open path. Teach knfsd's helper dentry_create() to use atomic_open() for filesystems that support it. The previously const @path is passed up to atomic_open() and may be modified depending on whether an existing entry was found or if the atomic_open() returned an error and consumed the passed-in dentry. Signed-off-by: Benjamin Coddington <bcodding@hammerspace.com> Link: https://patch.msgid.link/8e449bfb64ab055abb9fd82641a171531415a88c.1764259052.git.bcodding@hammerspace.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-12-15Merge drm/drm-next into drm-misc-nextMaxime Ripard408-7220/+17389
Let's kickstart the v6.20 (7.0?) release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-12-15ata: libata: Allow more quirksNiklas Cassel1-32/+32
We have currently used up 30 of the 32 bits in the struct ata_device member quirks. Thus, it is only possible to add two more quirks. Change the struct ata_device member quirks from an unsigned int to a u64. Doing this core-level change now will make things easier later, as we will not need to also make core-level changes once the final two bits are used as well. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
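To illustrate why the width of a flags field matters, here is a minimal sketch with hypothetical quirk bits (the `DEMO_*` names and `struct demo_device` are made up for illustration and are not the real libata definitions). A flag at bit 32 or above survives in a 64-bit field but is silently lost when narrowed to 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk bits, not the real libata values. */
#define DEMO_QUIRK_A	(UINT64_C(1) << 0)
#define DEMO_QUIRK_NEW	(UINT64_C(1) << 32)	/* needs more than 32 bits */

struct demo_device {
	uint64_t quirks;	/* wide enough for up to 64 quirk flags */
};

/* Test a single quirk flag. */
static bool demo_has_quirk(const struct demo_device *dev, uint64_t quirk)
{
	return (dev->quirks & quirk) != 0;
}

/* Show what a 32-bit field would have kept: bits 32..63 are dropped. */
static uint32_t demo_truncate_to_u32(uint64_t quirks)
{
	return (uint32_t)quirks;
}
```

One practical consequence of a wide field is that flag constants must themselves be 64-bit (here via `UINT64_C`), since shifting a plain `int` by 32 is undefined.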
2025-12-15ata: libata: Change libata.force to use the generic ATA_QUIRK_MAX_SEC quirkNiklas Cassel2-6/+0
Modify the existing libata.force parameters "max_sec_128" and "max_sec_1024" to use the generic ATA_QUIRK_MAX_SEC quirk rather than individual quirks. This also allows us to remove the individual quirks ATA_QUIRK_MAX_SEC_128 and ATA_QUIRK_MAX_SEC_1024. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2025-12-15ata: libata: Add ATA_QUIRK_MAX_SEC and convert all device quirksNiklas Cassel2-3/+2
Add a new quirk ATA_QUIRK_MAX_SEC, which has a separate table with device specific values. Convert all existing ATA_QUIRK_MAX_SEC_XXX device quirks in __ata_dev_quirks to the new format. Quirks ATA_QUIRK_MAX_SEC_128 and ATA_QUIRK_MAX_SEC_1024 cannot be removed yet, since they are also used by libata.force, which functionally, is a separate user of the quirks. The quirks will be removed once all users have been converted to use the new format. The quirk ATA_QUIRK_MAX_SEC_8191 can be removed since it has no equivalent libata.force parameter. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
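The idea of one generic quirk flag backed by a separate table of per-device values can be sketched as follows. This is an assumption-laden illustration, not the libata implementation: the table layout, the `demo_*` names and the model strings are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device entry: model string plus its sector limit. */
struct demo_max_sec_entry {
	const char *model;
	unsigned int max_sec;
};

/* One table holds all limits instead of one quirk bit per limit value. */
static const struct demo_max_sec_entry demo_max_sec_table[] = {
	{ "DEMO-DRIVE-A", 128 },
	{ "DEMO-DRIVE-B", 1024 },
	{ "DEMO-DRIVE-C", 8191 },
};

/* Return the limit for a model, or 0 if the model has no entry. */
static unsigned int demo_lookup_max_sec(const char *model)
{
	size_t n = sizeof(demo_max_sec_table) / sizeof(demo_max_sec_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(demo_max_sec_table[i].model, model) == 0)
			return demo_max_sec_table[i].max_sec;
	}
	return 0;
}
```

With this shape, supporting a new limit value means adding a table row rather than consuming another bit in the quirks field.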
2025-12-15sched/fair: Separate se->vlag from se->vprotIngo Molnar1-9/+4
There are no real space concerns here and keeping these fields in a union makes reading (and tracing) the scheduler code harder. Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251201064647.1851919-4-mingo@kernel.org
2025-12-14<linux/compiler_types.h>: Add the __signed_scalar_typeof() helperPeter Zijlstra1-0/+19
Define __signed_scalar_typeof() to declare a signed scalar type, leaving non-scalar types unchanged. To be used to clean up the scheduler load-balancing code a bit. [ mingo: Split off this patch from the scheduler patch. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: https://patch.msgid.link/20251127154725.413564507@infradead.org
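A helper of this kind can be approximated in GNU C with `_Generic` and `__typeof__`. The sketch below is an assumption about the general technique, not the kernel's actual macro: `demo_signed_scalar_typeof()` and `demo_signed_diff()` are hypothetical names, and only a few integer types are covered.

```c
/*
 * Map unsigned (and plain char) scalar types to their signed
 * counterparts; any other type, including structs, falls through the
 * default branch and is left unchanged. Requires GCC or Clang.
 */
#define demo_signed_scalar_typeof(x) __typeof__(		\
	_Generic((x),						\
		char:			(signed char)0,		\
		unsigned char:		(signed char)0,		\
		unsigned short:		(short)0,		\
		unsigned int:		(int)0,			\
		unsigned long:		(long)0,		\
		unsigned long long:	(long long)0,		\
		default:		(x)))

/*
 * Example use: compute a signed difference of two unsigned values.
 * The unsigned-to-signed conversion of an out-of-range value is
 * implementation-defined; GCC and Clang wrap modulo 2^N.
 */
static long demo_signed_diff(unsigned long a, unsigned long b)
{
	demo_signed_scalar_typeof(a) d = (demo_signed_scalar_typeof(a))(a - b);

	return d;
}
```

Only the selected `_Generic` branch determines the resulting type, so already-signed and non-scalar arguments pass through the `default` case untouched.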
2025-12-13Merge tag 'irq-urgent-2025-12-12' of ↵Linus Torvalds1-16/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:

 - Fix error code in the irqchip/mchp-eic driver
 - Fix setup_percpu_irq() affinity assumptions
 - Remove the unused irq_domain_add_tree() function

* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
  irqdomain: Delete irq_domain_add_tree()
  genirq: Allow NULL affinity for setup_percpu_irq()
2025-12-13Merge tag 'core-urgent-2025-12-12' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc core fixes from Ingo Molnar:

 - Improve bug reporting
 - Suppress W=1 format warning
 - Improve rseq scalability on Clang builds

* tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Always inline rseq_debug_syscall_return()
  bug: Hush suggest-attribute=format for __warn_printf()
  bug: Let report_bug_entry() provide the correct bugaddr
2025-12-13mm: Remove tlb_flush_reason::NR_TLB_FLUSH_REASONS from <linux/mm_types.h>Tal Zussman1-1/+0
This has been unused since it was added 11 years ago in: d17d8f9dedb9 ("x86/mm: Add tracepoints for TLB flushes") Signed-off-by: Tal Zussman <tz2294@columbia.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: David Hildenbrand <david@redhat.com> Link: https://patch.msgid.link/20251212-tlb-trace-fix-v2-2-d322e0ad9b69@columbia.edu
2025-12-13Merge tag 'ib-mfd-input-power-regulator-v6.19' of ↵Dmitry Torokhov1-0/+273
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next Pull in pf1550-onkey driver.
2025-12-13Merge tag 'mm-nonmm-stable-2025-12-11-11-47' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc updates from Andrew Morton:
 "There are no significant series in this small merge. Please see the individual changelogs for details"

[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]

* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: memfd_luo: add CONFIG_SHMEM dependency
  mm: shmem: avoid build warning for CONFIG_SHMEM=n
  ocfs2: fix memory leak in ocfs2_merge_rec_left()
  ocfs2: invalidate inode if i_mode is zero after block read
  ocfs2: avoid -Wflex-array-member-not-at-end warning
  ocfs2: convert remaining read-only checks to ocfs2_emergency_state
  ocfs2: add ocfs2_emergency_state helper and apply to setattr
  checkpatch: add uninitialized pointer with __free attribute check
  args: fix documentation to reflect the correct numbers
  ocfs2: fix kernel BUG in ocfs2_find_victim_chain
  liveupdate: luo_core: fix redundant bound check in luo_ioctl()
  ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
  fs/fat: remove unnecessary wrapper fat_max_cache()
  ocfs2: replace deprecated strcpy with strscpy
  ocfs2: check tl_used after reading it from trancate log inode
  liveupdate: luo_file: don't use invalid list iterator
2025-12-13Merge tag 'mm-stable-2025-12-11-11-39' of ↵Linus Torvalds3-9/+8
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - "powerpc/pseries/cmm: two smaller fixes" (David Hildenbrand) fixes a couple of minor things in ppc land
 - "Improve folio split related functions" (Zi Yan) some cleanups and minorish fixes in the folio splitting code

* tag 'mm-stable-2025-12-11-11-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/damon/tests/core-kunit: avoid damos_test_commit stack warning
  mm: vmscan: correct nr_requested tracing in scan_folios
  MAINTAINERS: add idr core-api doc file to XARRAY
  mm/hugetlb: fix incorrect error return from hugetlb_reserve_pages()
  mm: fix CONFIG_STACK_GROWSUP typo in mm.h
  mm/huge_memory: fix folio split stats counting
  mm/huge_memory: make min_order_for_split() always return an order
  mm/huge_memory: replace can_split_folio() with direct refcount calculation
  mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  mm/sparse: fix sparse_vmemmap_init_nid_early definition without CONFIG_SPARSEMEM
  powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages
  powerpc/pseries/cmm: call balloon_devinfo_init() also without CONFIG_BALLOON_COMPACTION
2025-12-13file: ensure cleanupChristian Brauner1-7/+6
Brown paper bag time. This is a silly oversight where I neglected to drop the error condition checking to ensure we clean up on early error returns. I have an internal unit test set coming up for this which will catch all such issues going forward. Reported-by: Chris Mason <clm@fb.com> Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: 011703a9acd7 ("file: add FD_{ADD,PREPARE}()") Signed-off-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-13Merge tag 'i3c/for-6.19-2' of ↵Linus Torvalds1-10/+2
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull further i3c update from Alexandre Belloni:
 "We are removing a legacy API callback and having this sooner rather than later will help ensuring no one introduces a new driver using it. I've also added patches removing the "__free(...) = NULL" pattern because I'm sure we won't avoid people sending those following the mailing list discussion..."

* tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: adi: Fix confusing cleanup.h syntax
  i3c: master: Fix confusing cleanup.h syntax
  i3c: master: cleanup callback .priv_xfers()
  i3c: master: switch to use new callback .i3c_xfers() from .priv_xfers()