| Age | Commit message | Author | Files | Lines |
|
Remove kvm_ioapic_update_eoi_one()'s ASSERT() that the vector's entry is
configured to be level-triggered, as KVM intercepts and forwards EOIs to
the I/O APIC even for edge-triggered IRQs (see kvm_ioapic_scan_entry()),
and nothing guarantees the local APIC's TMR register is synchronized with
the I/O APIC redirection table, i.e. the @trigger_mode check just out of
sight doesn't provide any meaningful protection.
Given that roughly half of the historic ASSERT()s are/were guest- and/or
user-triggerable, it's safe to assume no one has run meaningful workloads
with DEBUG=1, i.e. that the ASSERT() has been dead code since it was
added 18+ years ago.
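To illustrate why such an assertion is dead code unless DEBUG is defined, here is a minimal userspace sketch. SKETCH_ASSERT() and count_evaluations() are hypothetical names, and abort() merely stands in for BUG(); this is not the kernel's actual macro.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ASSERT() that only exists when DEBUG is defined. */
#ifdef DEBUG
#define SKETCH_ASSERT(x)						\
	do {								\
		if (!(x)) {						\
			fprintf(stderr, "assertion failed %s:%d: %s\n",	\
				__FILE__, __LINE__, #x);		\
			abort();					\
		}							\
	} while (0)
#else
/* Without DEBUG the condition is not even evaluated. */
#define SKETCH_ASSERT(x) do { } while (0)
#endif

/* Returns how many times the asserted expression was evaluated. */
static int count_evaluations(void)
{
	int evaluated = 0;

	SKETCH_ASSERT(++evaluated > 0);
	return evaluated;
}
```

In a build without DEBUG, count_evaluations() returns 0, i.e. the check never runs.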
Opportunistically drop the unnecessary forward declaration of
kvm_ioapic_update_eoi_one().
For all intents and purposes, no functional change intended.
Link: https://patch.msgid.link/20251206004311.479939-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Remove the ASSERT()s in apic_find_highest_i{r,s}r() that exist to detect
illegal vectors (0-15 are reserved and never recognized by the local APIC),
as the asserts, if they were ever to be enabled by #defining DEBUG, can be
trivially triggered from both the guest and from userspace, and ultimately
because the ASSERT()s are useless.
In large part due to lack of emulation for the Error Status Register and
its "delayed" read semantics, KVM doesn't filter out bad IRQs (IPIs or
otherwise) when IRQs are sent or received. Instead, probably by dumb
luck on KVM's part, KVM effectively ignores pending illegal vectors in
the IRR due to vectors 0-15 having priority '0', and thus never being higher
priority than PPR.
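The priority argument above can be sketched as follows, assuming the usual local APIC rule that a vector's priority class is its upper four bits and that a pending vector is only serviced when its class is strictly higher than the PPR's class. The helper names are hypothetical and this is not KVM's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Priority class is the upper nibble of a vector or of the PPR. */
static inline uint8_t sketch_priority_class(uint8_t value)
{
	return value >> 4;
}

/* A pending vector is serviced only if its class is strictly above the PPR's. */
static inline bool sketch_vector_beats_ppr(uint8_t vector, uint8_t ppr)
{
	return sketch_priority_class(vector) > sketch_priority_class(ppr);
}
```

Vectors 0-15 all map to class 0, so the comparison is false for every possible PPR value.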
As for ISR, a misbehaving userspace could stuff illegal vector bits, but
again the end result is mostly benign (aside from userspace likely
breaking the VM), as processing illegal vectors "works" and doesn't cause
functional problems.
Regardless of the safety and correctness of KVM's illegal vector handling,
one thing is for certain: the ASSERT()s have done absolutely nothing to
help detect such issues since they were added 18+ years ago by commit
97222cc83163 ("KVM: Emulate local APIC in kernel").
For all intents and purposes, no functional change intended.
Link: https://patch.msgid.link/20251206004311.479939-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Remove ASSERT()s on vCPU and APIC structures being non-NULL in the local
APIC code as the DEBUG=1 path of ASSERT() ends with BUG(), i.e. isn't
meaningfully better for debugging than a NULL pointer dereference.
For all intents and purposes, no functional change intended.
Link: https://patch.msgid.link/20251206004311.479939-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a (gnarly) inline "script" in the Makefile to fail the build if there
is EXPORT_SYMBOL_GPL or EXPORT_SYMBOL usage in virt/kvm or arch/x86/kvm
beyond the known-good/expected exports for other modules. Remembering to
use EXPORT_SYMBOL_FOR_KVM_INTERNAL is surprisingly difficult, and hoping
to detect "bad" exports via code review is not a robust long-term strategy.
Jump through a pile of hoops to coerce make into printing a human-friendly
error message, with the offending files+lines cleanly separated.
E.g. where <srctree> is the resolution of $(srctree), i.e. '.' for in-tree
builds, and the absolute path for out-of-tree builds:
  <srctree>/arch/x86/kvm/Makefile:97: *** ERROR ***
    found 2 unwanted occurrences of EXPORT_SYMBOL_GPL:
      <srctree>/arch/x86/kvm/x86.c:686:EXPORT_SYMBOL_GPL(__kvm_set_user_return_msr);
      <srctree>/arch/x86/kvm/x86.c:703:EXPORT_SYMBOL_GPL(kvm_set_user_return_msr);
    in directories:
      <srctree>/arch/x86/kvm
      <srctree>/virt/kvm
    Use EXPORT_SYMBOL_FOR_KVM_INTERNAL, not EXPORT_SYMBOL_GPL.  Stop.
and
  <srctree>/arch/x86/kvm/Makefile:98: *** ERROR ***
    found 1 unwanted occurrences of EXPORT_SYMBOL:
      <srctree>/arch/x86/kvm/x86.c:709:EXPORT_SYMBOL(kvm_get_user_return_msr);
    in directories:
      <srctree>/arch/x86/kvm
      <srctree>/virt/kvm
    Use EXPORT_SYMBOL_FOR_KVM_INTERNAL, not EXPORT_SYMBOL.  Stop.
Put the enforcement in x86's Makefile even though the rule itself applies
to virt/kvm, as putting the enforcement in virt/kvm/Makefile.kvm would
effectively require exempting every architecture except x86. PPC is the
only other architecture with sub-modules, and PPC hasn't been switched to
use EXPORT_SYMBOL_FOR_KVM_INTERNAL (and given its nearly-orphaned state,
likely never will). And for KVM architectures without sub-modules, that
means that, barring truly spurious exports, the exports are intended for
non-KVM usage and thus shouldn't be using EXPORT_SYMBOL_FOR_KVM_INTERNAL.
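As a rough illustration of the matching rule only, the distinction between the unwanted macros and EXPORT_SYMBOL_FOR_KVM_INTERNAL can be restated in C. This is not the Makefile's implementation; the helper names are hypothetical, and the allow-list of known-good exports is deliberately not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the line starts with "<macro>(". */
static bool line_uses_export(const char *line, const char *macro)
{
	size_t len = strlen(macro);

	return strncmp(line, macro, len) == 0 && line[len] == '(';
}

/* Flags plain and GPL exports, but not EXPORT_SYMBOL_FOR_KVM_INTERNAL. */
static bool is_unwanted_export(const char *line)
{
	return line_uses_export(line, "EXPORT_SYMBOL_GPL") ||
	       line_uses_export(line, "EXPORT_SYMBOL");
}
```

Requiring the opening parenthesis right after the macro name is what keeps the longer EXPORT_SYMBOL_FOR_KVM_INTERNAL from matching the EXPORT_SYMBOL prefix.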
Tested-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20251121190514.293385-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Remove the arch-specific variant of paravirt_steal_clock() and use
the common one instead.
With all archs supporting Xen now having been switched to the common
variant, including paravirt.h can be dropped from drivers/xen/time.c.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-12-jgross@suse.com
|
|
Remove the arch-specific variant of paravirt_steal_clock() and use
the common one instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-11-jgross@suse.com
|
|
Remove the arch-specific variant of paravirt_steal_clock() and use
the common one instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-10-jgross@suse.com
|
|
Remove the arch-specific variant of paravirt_steal_clock() and use
the common one instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-9-jgross@suse.com
|
|
Remove the arch-specific variant of paravirt_steal_clock() and use
the common one instead.
This allows removing paravirt.c and paravirt.h from arch/arm.
Until all archs supporting Xen have been switched to the common code
of paravirt_steal_clock(), drivers/xen/time.c needs to include
asm/paravirt.h for those archs, while this is not necessary for arm
any longer.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-8-jgross@suse.com
|
|
Paravirt clock related functions are available in multiple archs.
In order to share the common parts, move the common static keys
to kernel/sched/ and remove them from the arch specific files.
Make a common paravirt_steal_clock() implementation available in
kernel/sched/cputime.c, guarding it with a new config option
CONFIG_HAVE_PV_STEAL_CLOCK_GEN, which can be selected by an arch
in case it wants to use that common variant.
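A minimal userspace sketch of the idea of one shared steal-clock entry point with a replaceable backend. A plain function pointer stands in for whatever dispatch mechanism the kernel uses, and every name other than paravirt_steal_clock() itself is hypothetical.

```c
#include <stdint.h>

typedef uint64_t (*steal_clock_fn)(int cpu);

/* Default backend: no hypervisor, so no stolen time is reported. */
static uint64_t sketch_native_steal_clock(int cpu)
{
	(void)cpu;
	return 0;
}

static steal_clock_fn sketch_steal_clock = sketch_native_steal_clock;

/* A hypervisor-specific backend installs itself here (hypothetical hook). */
static void sketch_register_steal_clock(steal_clock_fn fn)
{
	sketch_steal_clock = fn;
}

/* The single common entry point that callers use. */
static uint64_t sketch_paravirt_steal_clock(int cpu)
{
	return sketch_steal_clock(cpu);
}

/* Demo backend with made-up numbers, for illustration only. */
static uint64_t sketch_demo_steal_clock(int cpu)
{
	return 1000u * (uint64_t)(cpu + 1);
}
```

With this shape, an architecture only has to provide a backend rather than its own copy of the wrapper.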
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-7-jgross@suse.com
|
|
All architectures supporting CONFIG_PARAVIRT share the same contents
of asm/paravirt_api_clock.h:
  #include <asm/paravirt.h>
So remove all incarnations of asm/paravirt_api_clock.h and remove the
only place where it is included, as there asm/paravirt.h is included
anyway.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> # powerpc, scheduler bits
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-6-jgross@suse.com
|
|
The macros for generating PV-thunks are part of the generic paravirt
infrastructure, so they should be in paravirt_types.h.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-5-jgross@suse.com
|
|
The only effect of setting CONFIG_PARAVIRT_DEBUG is that a BUG() is raised
instead of making a call through a NULL pointer.
While the BUG() is a little bit easier to analyse, the cause of a call through
NULL isn't really that difficult to find.
Remove the config option to make paravirt coding a little bit less annoying.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-4-jgross@suse.com
|
|
In paravirt_types.h and paravirt.h there are some struct declarations which
are not needed. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-3-jgross@suse.com
|
|
Add the UHS state pins for the MMC1 and MMC2 controllers and,
while at it, also add the correct drive strength parameters
for the default pin states for those two.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Remove the clock-stretch-ns property from i2c2, as it has always
been (and still is) unused.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
The only reason to have a regulator in the thermal node is to keep
the CPU cores up while reading temperatures, but this is incorrect
because the AUXADC Thermal IP doesn't need any regulators to work,
at all.
Since the thermal node was inherited only for adding vregs, remove
it entirely.
This change is safe also because, among other things, the actual
driver never used those regulators anyway.
This also fixes a dtbs_check warning.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Since only a single port is present, remove the inner `ports`
parent node and just declare the single port as `port`.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Change the node name for Marvell SD8897 SDIO Bluetooth from
`btmrvl@2` to `bluetooth@2` to fix a dtbs_check warning.
While at it, also change the WiFi one from `mwifiex@1` to a
generic `wifi@1` and reorder the nodes so that wifi@1 comes
before bluetooth@2.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Change all of the pinmux main nodes to have a "-pins" suffix to
satisfy devicetree bindings checks.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Rename "piins-bt-wakeup" to "pins-bt-wakeup" to fix a dtbs_check
warning happening due to this typo.
Fixes: 055ef10ccdd4 ("arm64: dts: mt8183: Add jacuzzi pico/pico6 board")
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
In spi2's flash@0 there is only one `partitions` subnode: this
alone makes specifying address and size cells useless, but then
this subnode has no address and no size, which even makes the
currently declared address/size cells wrong.
Fixes: 869b3bb5ada2 ("arm64: dts: mediatek: mt7981b-openwrt-one: Enable SPI NOR")
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Change the Murata NCM03WF104 node name from "thermal-sensor" to
"thermistor" (as that's what it is, after all), and change all
of the pinmux main nodes to have a "-pins" suffix to satisfy
devicetree bindings checks.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Fix the pinctrl node names to adhere to the bindings, as the main
pin node is supposed to be named like "uart0-pins" and the pinmux
node named like "pins-bus".
While at it, also clean up all of the MTK_DRIVE_(x)mA values by
changing them to just the (x) number.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
Arnd pointed out that having firmware-name in the device tree is wrong.
Drop it.
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
|
|
The PVH entry is available for 32-bit KVM guests, and 32-bit KVM guests
do not depend on CONFIG_X86_PAE. However, mk_early_pgtbl_32() builds
different pagetables depending on whether CONFIG_X86_PAE is set.
Therefore, enabling PAE mode for 32-bit KVM guests without
CONFIG_X86_PAE being set would result in a boot failure during CR3
loading.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <d09ce9a134eb9cbc16928a5b316969f8ba606b81.1768017442.git.houwenlong.hwl@antgroup.com>
|
|
The DTS coding style expects lowercase hex for values and unit
addresses.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20251223152358.152533-4-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
DTS coding style prefers hyphens instead of underscores in the node
names. Change should be safe, because node names are not considered an
ABI.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20251223152358.152533-3-krzysztof.kozlowski@oss.qualcomm.com
[geert: Fix'em all]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
These .dtsi files are not included anywhere in the tree and can't be
tested.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20251212203226.458694-1-robh@kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
In some places asm/paravirt.h is included without really being needed.
Remove the related #include statements.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260105110520.21356-2-jgross@suse.com
|
|
OrangePi 6 Plus adopts the CIX CD8180/CD8160 SoC, with a built-in 12-core
64-bit processor, an NPU and an integrated graphics processor. It is equipped
with 16GB/32GB/64GB LPDDR5 and provides two M.2 KEY-M 2280 interfaces for NVMe
SSDs, as well as SPI flash and TF slots, to meet the needs of fast read/write
and high-capacity storage.
Signed-off-by: Gary Yang <gary.yang@cixtech.com>
Link: https://lore.kernel.org/r/20260110093406.2700505-3-gary.yang@cixtech.com
Signed-off-by: Peter Chen <peter.chen@cixtech.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Disable GCOV instrumentation in the SEV noinstr.c collection of SEV
noinstr methods, to further robustify the code"
* tag 'x86-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Disable GCOV on noinstr object
|
|
In a vain attempt to consolidate the email zoo, switch everything to the
kernel.org account.
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Notable changes include a fix to close one common microarchitectural
attack vector for out-of-order cores. Another patch exposed an
omission in my boot test coverage, which is currently missing
relocatable kernels. Otherwise, the fixes seem to be settling down for
us.
- Fix CONFIG_RELOCATABLE=y boots by building Image files from
vmlinux, rather than vmlinux.unstripped, now that the .modinfo
section is included in vmlinux.unstripped
- Prevent branch predictor poisoning microarchitectural attacks that
use the syscall index as a vector by using array_index_nospec() to
clamp the index after the bounds check (as x86 and ARM64 already
do)
- Fix a crash in test_kprobes when building with Clang
- Fix a deadlock possible when tracing is enabled for SBI ecalls
- Fix the definition of the Zk standard RISC-V ISA extension bundle,
which was missing the Zknh extension
- A few other miscellaneous non-functional cleanups, removing unused
macros, fixing an out-of-date path in code comments, resolving a
compile-time warning for a type mismatch in a pr_crit(), and
removing an unnecessary header file inclusion"
* tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: trace: fix snapshot deadlock with sbi ecall
riscv: remove irqflags.h inclusion in asm/bitops.h
riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int
riscv: configs: Clean up references to non-existing configs
riscv: kexec_image: Fix dead link to boot-image-header.rst
riscv: pgtable: Cleanup useless VA_USER_XXX definitions
riscv: cpufeature: Fix Zk bundled extension missing Zknh
riscv: fix KUnit test_kprobes crash when building with Clang
riscv: Sanitize syscall table indexing under speculation
riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped
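The "clamp the index after the bounds check" step from the syscall hardening item can be sketched in portable C. This mirrors the branchless mask idea behind array_index_nospec(), but it is only an illustration: the names are hypothetical, it assumes an arithmetic right shift of negative values (as GCC and Clang provide), it requires size <= LONG_MAX, and without the kernel's optimizer barrier it is not a real speculation defence.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* All ones if index < size, zero otherwise, computed without a branch. */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (SKETCH_BITS_PER_LONG - 1));
}

/* Out-of-range indices collapse to 0, in-range indices pass through. */
static unsigned long sketch_clamp_index(unsigned long index, unsigned long size)
{
	return index & sketch_index_mask(index, size);
}

/* Bounds check first, then clamp, then index the table. */
static long sketch_dispatch(const long *table, unsigned long nr_entries,
			    unsigned long nr)
{
	if (nr >= nr_entries)
		return -1;
	nr = sketch_clamp_index(nr, nr_entries);
	return table[nr];
}
```

The clamp matters only on the mispredicted path: even if the bounds check is speculated past, the index used for the load stays inside the table.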
|
|
The DT core will call of_platform_default_populate, so it is not
necessary for machine specific code to call it unless there are custom
match entries, auxdata or parent device. None of those apply here, so
remove the call.
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20260105-at91-probe-v3-3-594013ff2965@kernel.org
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
|
|
Move the AT91 PM init functions to .init_late hook to ensure driver
dependencies have probed.
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20260105-at91-probe-v3-2-594013ff2965@kernel.org
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
|
|
Since telemetry events are enumerated on resctrl mount, the RDT_RESOURCE_PERF_PKG
resource is not considered "monitoring capable" during early resctrl initialization.
This means that the domain list for RDT_RESOURCE_PERF_PKG is not built when the CPU
hotplug notifiers are registered and run for the first time right after resctrl
initialization.
Mark the RDT_RESOURCE_PERF_PKG as "monitoring capable" upon successful telemetry
event enumeration to ensure future CPU hotplug events include this resource and
initialize its domain list for CPUs that are already online.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
resctrl assumes that only the L3 resource supports monitor events, so it
simply takes the rdt_resource::num_rmid from RDT_RESOURCE_L3 as the system's
number of RMIDs.
The addition of telemetry events in a different resource breaks that
assumption.
Compute the number of available RMIDs as the minimum value across all
mon_capable resources (analogous to how the number of CLOSIDs is computed
across alloc_capable resources).
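The "minimum across all mon_capable resources" computation can be sketched as below. The struct and function names are hypothetical simplifications, not resctrl's data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down view of a resource. */
struct sketch_resource {
	bool mon_capable;
	uint32_t num_rmid;
};

/*
 * Smallest num_rmid over all monitoring-capable resources.
 * Returns UINT32_MAX if no resource is monitoring capable.
 */
static uint32_t sketch_system_num_rmid(const struct sketch_resource *res,
				       size_t count)
{
	uint32_t num = UINT32_MAX;

	for (size_t i = 0; i < count; i++) {
		if (!res[i].mon_capable)
			continue;
		if (res[i].num_rmid < num)
			num = res[i].num_rmid;
	}
	return num;
}
```

Resources that are not mon_capable are skipped, so a small num_rmid on such a resource does not lower the system-wide value.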
Note that mount time enumeration of the telemetry resource means that
this number can be reduced. If this happens, then some memory will
be wasted as the allocations for rdt_l3_mon_domain::mbm_states[] and
rdt_l3_mon_domain::rmid_busy_llc created during resctrl initialization will
be larger than needed.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
There are now three meanings for "number of RMIDs":
1) The number for legacy features enumerated by CPUID leaf 0xF. This is the
maximum number of distinct values that can be loaded into MSR_IA32_PQR_ASSOC.
Note that systems with Sub-NUMA Cluster mode enabled will force scaling down
the CPUID enumerated value by the number of SNC nodes per L3-cache.
2) The number of registers in MMIO space for each event. This is enumerated in
the XML files and is the value initialized into event_group::num_rmid.
3) The number of "hardware counters" (this isn't a strictly accurate
description of how things work, but serves as a useful analogy that does
describe the limitations) feeding to those MMIO registers. This is enumerated
in telemetry_region::num_rmids returned by intel_pmt_get_regions_by_feature().
Event groups with insufficient "hardware counters" to track all RMIDs are
difficult for users to use, since the system may reassign "hardware counters"
at any time. This means that users cannot reliably collect two consecutive
event counts to compute the rate at which events are occurring.
Disable such event groups by default. The user may override this with
a command line "rdt=" option. In this case limit an under-resourced event
group's number of possible monitor resource groups to the lowest number of
"hardware counters".
Scan all enabled event groups and assign the RDT_RESOURCE_PERF_PKG resource
"num_rmid" value to the smallest of these values as this value will be used
later to compare against the number of RMIDs supported by other resources to
determine how many monitoring resource groups are supported.
N.B. Change type of resctrl_mon::num_rmid to u32 to match its usage and the
type of event_group::num_rmid so that min(r->num_rmid, e->num_rmid) won't
complain about mixing signed and unsigned types.
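A sketch of the policy described above, under simplifying assumptions: each event group carries one num_rmid and one "lowest number of hardware counters" value, and a force_on flag models the "rdt=" override. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event group. */
struct sketch_event_group {
	bool force_on;            /* user override from the command line */
	uint32_t num_rmid;        /* MMIO registers per event */
	uint32_t min_hw_counters; /* lowest "hardware counter" count */
};

/* Returns false if the group stays disabled; may shrink num_rmid if forced. */
static bool sketch_configure_group(struct sketch_event_group *g)
{
	if (g->min_hw_counters < g->num_rmid) {
		if (!g->force_on)
			return false;              /* disabled by default */
		g->num_rmid = g->min_hw_counters;  /* limit when forced on */
	}
	return true;
}

/* Smallest num_rmid over all enabled groups, UINT32_MAX if none. */
static uint32_t sketch_perf_pkg_num_rmid(struct sketch_event_group *groups,
					 size_t count)
{
	uint32_t num = UINT32_MAX;

	for (size_t i = 0; i < count; i++) {
		if (!sketch_configure_group(&groups[i]))
			continue;
		if (groups[i].num_rmid < num)
			num = groups[i].num_rmid;
	}
	return num;
}
```

Keeping both fields as u32 also avoids the signed/unsigned comparison issue that the N.B. above mentions.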
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Commit ddcadb297ce5 ("KVM: arm64: Ignore EAGAIN for walks outside of a
fault") introduced a new walker flag ('KVM_PGTABLE_WALK_HANDLE_FAULT')
to KVM's page-table code. When set, the walk logic maintains its
previous behaviour of terminating a walk as soon as the visitor callback
returns an error. However, when the flag is clear, the walk will
continue if the visitor returns -EAGAIN and the error is then suppressed
and returned as zero to the caller.
Clearing the flag is beneficial when write-protecting a range of IPAs
with kvm_pgtable_stage2_wrprotect() but is not useful in any other
cases, either because we are operating on a single page (e.g.
kvm_pgtable_stage2_mkyoung() or kvm_phys_addr_ioremap()) or because the
early termination is desirable (e.g. when mapping pages from a fault in
user_mem_abort()).
Subsequently, commit e912efed485a ("KVM: arm64: Introduce the EL1 pKVM
MMU") hooked up pKVM's hypercall interface to the MMU code at EL1 but
failed to propagate any of the walker flags. As a result, page-table
walks at EL2 fail to set KVM_PGTABLE_WALK_HANDLE_FAULT even when the
early termination semantics are desirable on the fault handling path.
Rather than complicate the pKVM hypercall interface, invert the flag so
that the whole thing can be simplified and only pass the new flag
('KVM_PGTABLE_WALK_IGNORE_EAGAIN') from the wrprotect code.
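The inverted flag's semantics can be sketched with a toy walker over an array. This is a userspace illustration with hypothetical names (SKETCH_WALK_IGNORE_EAGAIN, sketch_walk(), sketch_demo_visit()), not KVM's page-table code.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_WALK_IGNORE_EAGAIN (1u << 0)

typedef int (*sketch_visit_fn)(int entry, void *arg);

/*
 * Visit each entry in order. By default, stop at the first error.
 * With SKETCH_WALK_IGNORE_EAGAIN, -EAGAIN is suppressed and the walk
 * continues; other errors still terminate the walk.
 */
static int sketch_walk(const int *entries, size_t count, unsigned int flags,
		       sketch_visit_fn visit, void *arg)
{
	for (size_t i = 0; i < count; i++) {
		int ret = visit(entries[i], arg);

		if (ret == -EAGAIN && (flags & SKETCH_WALK_IGNORE_EAGAIN))
			continue;
		if (ret)
			return ret;
	}
	return 0;
}

/* Demo visitor: counts visits and asks for a retry on entry value 2. */
static int sketch_demo_visit(int entry, void *arg)
{
	int *visited = arg;

	(*visited)++;
	return entry == 2 ? -EAGAIN : 0;
}
```

With the flag clear the caller sees -EAGAIN immediately, which suits fault handling; with it set, a range operation such as write-protection visits every entry and reports success.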
Cc: Fuad Tabba <tabba@google.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Link: https://msgid.link/20260105154939.11041-2-will@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>
|
|
When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
response to a guest WRMSR, clear XFD-disabled features in the saved (or to
be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
features that are disabled via the guest's XFD. Because the kernel
executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
will cause XRSTOR to #NM and panic the kernel.
E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:
------------[ cut here ]------------
WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
Modules linked in: kvm_intel kvm irqbypass
CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:exc_device_not_available+0x101/0x110
Call Trace:
<TASK>
asm_exc_device_not_available+0x1a/0x20
RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
switch_fpu_return+0x4a/0xb0
kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x62/0x940
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
call to fpu_update_guest_xfd().
and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:
------------[ cut here ]------------
WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
Modules linked in: kvm_intel kvm irqbypass
CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:exc_device_not_available+0x101/0x110
Call Trace:
<TASK>
asm_exc_device_not_available+0x1a/0x20
RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
fpu_swap_kvm_fpstate+0x6b/0x120
kvm_load_guest_fpu+0x30/0x80 [kvm]
kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x62/0x940
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
The new behavior is consistent with the AMX architecture. Per Intel's SDM,
XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
(and non-compacted XSAVE saves the initial configuration of the state
component):
  If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
  the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
  instead, it operates as if XINUSE[i] = 0 (and the state component was
  in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
  header as 0; in addition, XSAVE saves the initial configuration of the
  state component (the other instructions do not save state component i).
Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
a constant XFD based on the set of enabled features when XSAVEing for
a struct fpu_guest. However, having XSTATE_BV[i]=1 for XFD-disabled
features can only happen in the above interrupt case, or in similar
scenarios involving preemption on preemptible kernels, because
fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
outgoing FPU state with the current XFD; and that is (on all but the
first WRMSR to XFD) the guest XFD.
Therefore, XFD can only go out of sync with XSTATE_BV in the above
interrupt case, or in similar scenarios involving preemption on
preemptible kernels, and we can consider it (de facto) part of KVM
ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.
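The core of the fix is a bit mask operation, sketched here with a hypothetical helper name: every feature bit that is set in XFD is cleared from XSTATE_BV before the state can be restored.

```c
#include <stdint.h>

/* Clear XFD-disabled features from an XSTATE_BV value. */
static inline uint64_t sketch_clear_xfd_disabled(uint64_t xstate_bv,
						 uint64_t xfd)
{
	return xstate_bv & ~xfd;
}
```

For example, with XFD[18] = 1 as in the scenario above, bit 18 is dropped from XSTATE_BV while all other bits are left untouched.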
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Motor Control PWM shares an interrupt line with TIMER4 on the MIC interrupt
controller; the interrupt serves as the period (timer limit), pulse-width
(match) and capture event interrupt.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
|
|
Motor Control PWM depends on its own supply clock; the clock gate control
is present in the TIMCLK_CTRL1 register.
Fixes: b7d41c937ed7 ("ARM: LPC32xx: Add the motor PWM to base dts file")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Do not return false if !preemptible() in current_in_efi(). EFI
runtime services can now run with preemption enabled
- Fix uninitialised variable in the arm MPAM driver, reported by sparse
- Fix partial kasan_reset_tag() use in change_memory_common() when
calculating page indices or comparing ranges
- Save/restore TCR2_EL1 during suspend/resume, otherwise the E0POE bit
is lost
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix cleared E0POE bit after cpu_suspend()/resume()
arm64: mm: Fix incomplete tag reset in change_memory_common()
arm_mpam: Stop using uninitialized variables in __ris_msmon_read()
arm64/efi: Don't fail check current_in_efi() if preemptible
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The main code change is a revert of the Raspberry Pi RP1 overlay
support that was decided to not be ready.
The other fixes are all for devicetree sources:
- ethernet configuration on ixp42x-actiontec-mi424wr is board
revision specific
- validation warning fixes for imx27/imx51/imx6, hikey960 and k3
- Minor corrections across imx8 boards, addressing all types of
issues with interrupts, dma, ethernet and clock settings, all simple
one-line changes"
* tag 'soc-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
arm64: dts: hisilicon: hikey960: Drop "snps,gctl-reset-quirk" and "snps,tx_de_emphasis*" properties
Documentation/process: maintainer-soc: Mark 'make' as commands
Documentation/process: maintainer-soc: Be more explicit about defconfig
arm64: dts: mba8mx: Fix Ethernet PHY IRQ support
arm64: dts: imx8qm-ss-dma: correct the dma channels of lpuart
arm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics i.MX8M Plus DHCOM
arm64: dts: freescale: tx8p-ml81: fix eqos nvmem-cells
arm64: dts: freescale: moduline-display: fix compatible
dt-bindings: arm: fsl: moduline-display: fix compatible
ARM: dts: imx6q-ba16: fix RTC interrupt level
arm64: dts: freescale: imx95-toradex-smarc: fix SMARC_SDIO_WP label position
arm64: dts: freescale: imx95-toradex-smarc: use edge trigger for ethphy1 interrupt
arm64: dts: add off-on-delay-us for usdhc2 regulator
arm64: dts: imx8qm-mek: correct the light sensor interrupt type to low level
ARM: dts: nxp: imx: Fix mc13xxx LED node names
arm64: dts: imx95: correct I3C2 pclk to IMX95_CLK_BUSWAKEUP
MAINTAINERS: Fix a linusw mail address
arm64: dts: broadcom: rp1: drop RP1 overlay
arm64: dts: broadcom: bcm2712: fix RP1 endpoint PCI topology
misc: rp1: drop overlay support
...
|
|
We set PSTATE.PAN to 1 on exiting from a guest if PAN support has
been compiled in and exists on the HW. However, this is not
necessarily correct.
In a nVHE configuration, there is no notion of PAN at EL2, so setting
PSTATE.PAN to anything is pointless.
Furthermore, not setting PAN to 0 when CONFIG_ARM64_PAN isn't set
means we run with the *guest's* PSTATE.PAN (which might be set to 1),
and we will explode on the next userspace access. Yes, the architecture
is delightful in that particular corner.
Fix the whole thing by always setting PAN to something when running
VHE (which implies PAN support), and only ignore it when running nVHE.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://msgid.link/20260107124600.2736328-1-maz@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>
|
|
Naturally, updating the Access Flag in a stage-1 descriptor requires
write permission at stage-2, although this isn't actually enforced in
KVM's software PTW.
Generate a stage-2 permission fault if the stage-1 walk attempts to
update the descriptor and its corresponding stage-2 translation lacks
write permission.
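The rule can be sketched as a small decision function. This is a simplified model with hypothetical types and names, not KVM's software page-table walker: a fault is raised only when the walk actually needs to write the descriptor.

```c
#include <stdbool.h>

enum sketch_walk_status {
	SKETCH_WALK_OK,
	SKETCH_WALK_S2_PERM_FAULT,
};

/* Hypothetical stage-1 descriptor reduced to its Access Flag. */
struct sketch_s1_desc {
	bool access_flag;
};

/*
 * Set the Access Flag if it is clear. Writing the descriptor needs
 * stage-2 write permission; without it, report a permission fault.
 */
static enum sketch_walk_status
sketch_update_access_flag(struct sketch_s1_desc *desc, bool s2_writable)
{
	if (desc->access_flag)
		return SKETCH_WALK_OK;	/* nothing to write */
	if (!s2_writable)
		return SKETCH_WALK_S2_PERM_FAULT;
	desc->access_flag = true;
	return SKETCH_WALK_OK;
}
```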
Fixes: bff8aa213dee ("KVM: arm64: Implement HW access flag management in stage-1 SW PTW")
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://msgid.link/20260108204230.677172-1-oupton@kernel.org
Signed-off-by: Oliver Upton <oupton@kernel.org>
|
|
Functions vcpu_{clear,set}_wfx_traps() have been unused since
commit 0b5afe05377d7 ("KVM: arm64: Add early_param to
control WFx trapping").
Remove them.
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu1024@163.com>
Link: https://msgid.link/20260109080226.761107-1-sundongxu1024@163.com
Signed-off-by: Oliver Upton <oupton@kernel.org>
|
|
Legacy resctrl features are enumerated by X86_FEATURE_* flags. These may be
overridden by quirks to disable features in the case of errata. Users can use
kernel command line options to either disable a feature, or to force enable
a feature that was disabled by a quirk.
A different approach is needed for hardware features that do not have an
X86_FEATURE_* flag.
Update parsing of the "rdt=" boot parameter to call the telemetry driver
directly to handle new "perf" and "energy" options that control activation of
telemetry monitoring of the named type. By itself a "perf" or "energy" option
controls the forced enabling or disabling (with ! prefix) of all event groups
of the named type. A ":guid" suffix allows for fine grained control per event
group.
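A sketch of how a single option token of the form described above might be split into its parts. The struct, enum and function names are hypothetical, the guid is treated as an opaque string because its exact format is not stated here, and this is not the telemetry driver's parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_aet_type { SKETCH_AET_PERF, SKETCH_AET_ENERGY };

struct sketch_aet_option {
	enum sketch_aet_type type;
	bool force_enable;	/* false when the token has a '!' prefix */
	const char *guid;	/* NULL means all event groups of this type */
};

/* Parse "[!]perf[:guid]" or "[!]energy[:guid]"; false if not recognized. */
static bool sketch_parse_aet_token(const char *tok,
				   struct sketch_aet_option *opt)
{
	const char *colon;
	size_t len;

	opt->force_enable = true;
	if (*tok == '!') {
		opt->force_enable = false;
		tok++;
	}

	colon = strchr(tok, ':');
	len = colon ? (size_t)(colon - tok) : strlen(tok);

	if (len == 4 && strncmp(tok, "perf", 4) == 0)
		opt->type = SKETCH_AET_PERF;
	else if (len == 6 && strncmp(tok, "energy", 6) == 0)
		opt->type = SKETCH_AET_ENERGY;
	else
		return false;

	/* Points into the caller's string; not copied. */
	opt->guid = colon ? colon + 1 : NULL;
	return true;
}
```

A NULL guid corresponds to the bare "perf" or "energy" form that applies to every event group of that type.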
[ bp: s/intel_aet_option/intel_handle_aet_option/g ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
The L3 resource has several requirements for domains. There are per-domain
structures that hold the 64-bit values of counters, and elements to keep
track of the overflow and limbo threads.
None of these are needed for the PERF_PKG resource. The hardware counters
are wide enough that they do not wrap around for decades.
Define a new rdt_perf_pkg_mon_domain structure which just consists of the
standard rdt_domain_hdr to keep track of domain id and CPU mask.
Update resctrl_online_mon_domain() for RDT_RESOURCE_PERF_PKG. The only action
needed for this resource is to create and populate domain directories if a
domain is added while resctrl is mounted.
Similarly resctrl_offline_mon_domain() only needs to remove domain directories.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|