Age | Commit message (Collapse) | Author | Files | Lines |
|
Aishwarya reports that KVM selftests for arm64 fail with the following
error:
| make[4]: Entering directory '/tmp/kci/linux/tools/testing/selftests/kvm'
| Makefile:270: warning: overriding recipe for target
| '/tmp/kci/linux/build/kselftest/kvm/get-reg-list'
| Makefile:265: warning: ignoring old recipe for target
| '/tmp/kci/linux/build/kselftest/kvm/get-reg-list'
| make -C ../../../../tools/arch/arm64/tools/
| make[5]: Entering directory '/tmp/kci/linux/tools/arch/arm64/tools'
| Makefile:10: ../tools/scripts/Makefile.include: No such file or directory
| make[5]: *** No rule to make target '../tools/scripts/Makefile.include'.
| Stop.
It would appear that this only affects builds from the top-level
Makefile (e.g. make kselftest-all), as $(srctree) is set to ".". Work
around the issue by shadowing the kselftest naming scheme for the source
tree variable.
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Fixes: 0359c946b131 ("tools headers arm64: Update sysreg.h with kernel sources")
Link: https://lore.kernel.org/r/20231027005439.3142015-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/sgi-injection:
: vSGI injection improvements + fixes, courtesy Marc Zyngier
:
: Avoid linearly searching for vSGI targets using a compressed MPIDR to
: index a cache. While at it, fix some egregious bugs in KVM's mishandling
: of vcpuid (user-controlled value) and vcpu_idx.
KVM: arm64: Clarify the ordering requirements for vcpu/RD creation
KVM: arm64: vgic-v3: Optimize affinity-based SGI injection
KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available
KVM: arm64: Build MPIDR to vcpu index cache at runtime
KVM: arm64: Simplify kvm_vcpu_get_mpidr_aff()
KVM: arm64: Use vcpu_idx for invalidation tracking
KVM: arm64: vgic: Use vcpu_idx for the debug information
KVM: arm64: vgic-v2: Use cpuid from userspace as vcpu_id
KVM: arm64: vgic-v3: Refactor GICv3 SGI generation
KVM: arm64: vgic-its: Treat the collection target address as a vcpu_id
KVM: arm64: vgic: Make kvm_vgic_inject_irq() take a vcpu pointer
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/stage2-vhe-load:
: Setup stage-2 MMU from vcpu_load() for VHE
:
: Unlike nVHE, there is no need to switch the stage-2 MMU around on guest
: entry/exit in VHE mode as the host is running at EL2. Despite this KVM
: reloads the stage-2 on every guest entry, which is needless.
:
: This series moves the setup of the stage-2 MMU context to vcpu_load()
: when running in VHE mode. This is likely to be a win across the board,
: but also allows us to remove an ISB on the guest entry path for systems
: with one of the speculative AT errata.
KVM: arm64: Move VTCR_EL2 into struct s2_mmu
KVM: arm64: Load the stage-2 MMU context in kvm_vcpu_load_vhe()
KVM: arm64: Rename helpers for VHE vCPU load/put
KVM: arm64: Reload stage-2 for VMID change on VHE
KVM: arm64: Restore the stage-2 context in VHE's __tlb_switch_to_host()
KVM: arm64: Don't zero VTTBR in __tlb_switch_to_host()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/nv-trap-fixes:
: NV trap forwarding fixes, courtesy Miguel Luis and Marc Zyngier
:
: - Explicitly define the effects of HCR_EL2.NV on EL2 sysregs in the
: NV trap encoding
:
: - Make EL2 registers that access AArch32 guest state UNDEF or RAZ/WI
: where appropriate for NV guests
KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
KVM: arm64: Refine _EL2 system register list that require trap reinjection
arm64: Add missing _EL2 encodings
arm64: Add missing _EL12 encodings
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/smccc-filter-cleanups:
: Cleanup the management of KVM's SMCCC maple tree
:
: Avoid the cost of maintaining the SMCCC filter maple tree if userspace
: hasn't writen a rule to the filter. While at it, rip out the now
: unnecessary VM flag to indicate whether or not the SMCCC filter was
: configured.
KVM: arm64: Use mtree_empty() to determine if SMCCC filter configured
KVM: arm64: Only insert reserved ranges when SMCCC filter is used
KVM: arm64: Add a predicate for testing if SMCCC filter is configured
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/pmevtyper-filter:
: Fixes to KVM's handling of the PMUv3 exception level filtering bits
:
: - NSH (count at EL2) and M (count at EL3) should be stateful when the
: respective EL is advertised in the ID registers but have no effect on
: event counting.
:
: - NSU and NSK modify the event filtering of EL0 and EL1, respectively.
: Though the kernel may not use these bits, other KVM guests might.
: Implement these bits exactly as written in the pseudocode if EL3 is
: advertised.
KVM: arm64: Add PMU event filter bits required if EL3 is implemented
KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertised
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/feature-flag-refactor:
: vCPU feature flag cleanup
:
: Clean up KVM's handling of vCPU feature flags to get rid of the
: vCPU-scoped bitmaps and remove failure paths from kvm_reset_vcpu().
KVM: arm64: Get rid of vCPU-scoped feature bitmap
KVM: arm64: Remove unused return value from kvm_reset_vcpu()
KVM: arm64: Hoist NV+SVE check into KVM_ARM_VCPU_INIT ioctl handler
KVM: arm64: Prevent NV feature flag on systems w/o nested virt
KVM: arm64: Hoist PAuth checks into KVM_ARM_VCPU_INIT ioctl
KVM: arm64: Hoist SVE check into KVM_ARM_VCPU_INIT ioctl handler
KVM: arm64: Hoist PMUv3 check into KVM_ARM_VCPU_INIT ioctl handler
KVM: arm64: Add generic check for system-supported vCPU features
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Put an upper bound on the number of I-cache invalidations by
: cacheline to avoid soft lockups
:
: - Get rid of bogus refererence count transfer for THP mappings
:
: - Do a local TLB invalidation on permission fault race
:
: - Fixes for page_fault_test KVM selftest
:
: - Add a tracepoint for detecting MMIO instructions unsupported by KVM
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: arm64: Do not transfer page refcount for THP adjustment
KVM: arm64: Avoid soft lockups due to I-cache maintenance
arm64: tlbflush: Rename MAX_TLBI_OPS
KVM: arm64: Don't use kerneldoc comment for arm64_check_features()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
It is a pretty well known fact that KVM does not support MMIO emulation
without valid instruction syndrome information (ESR_EL2.ISV == 0). The
current kvm_pr_unimpl() is pretty useless, as it contains zero context
to relate the event to a vCPU.
Replace it with a precise tracepoint that dumps the relevant context
so the user can make sense of what the guest is doing.
Acked-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231026205306.3045075-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
It looks like a mistake to issue ISB *after* reading PAR_EL1, we should
instead perform it between the AT instruction and the reads of PAR_EL1.
As according to DDI0487J.a IJTYVP,
"When an address translation instruction is executed, explicit
synchronization is required to guarantee the result is visible to
subsequent direct reads of PAR_EL1."
Otherwise all guest_at testcases fail on my box with
==== Test Assertion Failure ====
aarch64/page_fault_test.c:142: par & 1 == 0
pid=1355864 tid=1355864 errno=4 - Interrupted system call
1 0x0000000000402853: vcpu_run_loop at page_fault_test.c:681
2 0x0000000000402cdb: run_test at page_fault_test.c:730
3 0x0000000000403897: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1105
5 (inlined by) main at page_fault_test.c:1131
6 0x0000ffffb153c03b: ?? ??:0
7 0x0000ffffb153c113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
0x1 != 0x0 (par & 1 != 0)
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-2-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Running page_fault_test on a Cortex A72 fails with
Test: ro_memslot_no_syndrome_guest_cas
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
Testing memory backing src type: anonymous
==== Test Assertion Failure ====
aarch64/page_fault_test.c:117: guest_check_lse()
pid=1944087 tid=1944087 errno=4 - Interrupted system call
1 0x00000000004028b3: vcpu_run_loop at page_fault_test.c:682
2 0x0000000000402d93: run_test at page_fault_test.c:731
3 0x0000000000403957: for_each_guest_mode at guest_modes.c:100
4 0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1108
5 (inlined by) main at page_fault_test.c:1134
6 0x0000ffff868e503b: ?? ??:0
7 0x0000ffff868e5113: ?? ??:0
8 0x0000000000401aaf: _start at ??:?
guest_check_lse()
because we don't have a guest_prepare stage to check the presence of
FEAT_LSE and skip the related guest_cas testing, and we end-up failing in
GUEST_ASSERT(guest_check_lse()).
Add the missing .guest_prepare() where it's indeed required.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
It is possible for multiple vCPUs to fault on the same IPA and attempt
to resolve the fault. One of the page table walks will actually update
the PTE and the rest will return -EAGAIN per our race detection scheme.
KVM elides the TLB invalidation on the racing threads as the return
value is nonzero.
Before commit a12ab1378a88 ("KVM: arm64: Use local TLBI on permission
relaxation") KVM always used broadcast TLB invalidations when handling
permission faults, which had the convenient property of making the
stage-2 updates visible to all CPUs in the system. However now we do a
local invalidation, and TLBI elision leads to the vCPU thread faulting
again on the stale entry. Remember that the architecture permits the TLB
to cache translations that precipitate a permission fault.
Invalidate the TLB entry responsible for the permission fault if the
stage-2 descriptor has been relaxed, regardless of which thread actually
did the job.
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922223229.1608155-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix a possible CPU hotplug deadlock bug caused by the new TSC
synchronization code
- Fix a legacy PIC discovery bug that results in device troubles on
affected systems, such as non-working keybards, etc
- Add a new Intel CPU model number to <asm/intel-family.h>
* tag 'x86-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Defer marking TSC unstable to a worker
x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
x86/cpu: Add model number for Intel Arrow Lake mobile processor
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"Restore unintentionally lost quirk settings in the GIC irqchip driver,
which broke certain devices"
* tag 'irq-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Don't override quirk settings with default values
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fix from Ingo Molnar:
"Fix a potential NULL dereference bug"
* tag 'perf-urgent-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix potential NULL deref
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- tracing/kprobes: Fix kernel-doc warnings for the variable length
arguments
- tracing/kprobes: Fix to count the symbols in modules even if the
module name is not specified so that user can probe the symbols in
the modules without module name
* tag 'probes-fixes-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/kprobes: Fix symbol counting logic by looking at modules as well
tracing/kprobes: Fix the description of variable length arguments
|
|
git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
- reduce the initialy dynamic swiotlb size to remove an annoying but
harmless warning from the page allocator (Petr Tesarik)
* tag 'dma-mapping-6.6-2023-10-28' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: do not try to allocate a TLB bigger than MAX_ORDER pages
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some very small driver fixes for 6.6-final that have shown up
in the past two weeks. Included in here are:
- tiny fastrpc bugfixes for reported errors
- nvmem register fixes
- iio driver fixes for some reported problems
- fpga test fix
- MAINTAINERS file update for fpga
All of these have been in linux-next this week with no reported
problems"
* tag 'char-misc-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
fpga: Fix memory leak for fpga_region_test_class_find()
fpga: m10bmc-sec: Change contact for secure update driver
fpga: disable KUnit test suites when module support is enabled
iio: afe: rescale: Accept only offset channels
nvmem: imx: correct nregs for i.MX6ULL
nvmem: imx: correct nregs for i.MX6UL
nvmem: imx: correct nregs for i.MX6SLL
misc: fastrpc: Unmap only if buffer is unmapped from DSP
misc: fastrpc: Clean buffers on remote invocation failures
misc: fastrpc: Free DMA handles for RPC calls with no arguments
misc: fastrpc: Reset metadata buffer to avoid incorrect free
iio: exynos-adc: request second interupt only when touchscreen mode is used
iio: adc: xilinx-xadc: Correct temperature offset/scale for UltraScale
iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature thresholds
dt-bindings: iio: add missing reset-gpios constrain
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Bugfixes for Axxia when it is a target and for PEC handling of
stm32f7.
Plus, fix an OF node leak pattern in the mux subsystem"
* tag 'i2c-for-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: stm32f7: Fix PEC handling in case of SMBUS transfers
i2c: muxes: i2c-mux-gpmux: Use of_get_i2c_adapter_by_node()
i2c: muxes: i2c-demux-pinctrl: Use of_get_i2c_adapter_by_node()
i2c: muxes: i2c-mux-pinctrl: Use of_get_i2c_adapter_by_node()
i2c: aspeed: Fix i2c bus hang in slave read
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Three fixes, one for the clk framework and two for clk drivers:
- Avoid an oops in possible_parent_show() by checking for no parent
properly when a DT index based lookup is used
- Handle errors returned from divider_ro_round_rate() in
clk_stm32_composite_determine_rate()
- Fix clk_ops::determine_rate() implementation of socfpga's
gateclk_ops that was ruining uart output because the divider
was forgotten about"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: stm32: Fix a signedness issue in clk_stm32_composite_determine_rate()
clk: Sanitize possible_parent_show to Handle Return Value of of_clk_get_parent_name
clk: socfpga: gate: Account for the divider in determine_rate
|
|
Pull misc filesystem fixes from Al Viro:
"Assorted fixes all over the place: literally nothing in common, could
have been three separate pull requests.
All are simple regression fixes, but not for anything from this cycle"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ceph_wait_on_conflict_unlink(): grab reference before dropping ->d_lock
io_uring: kiocb_done() should *not* trust ->ki_pos if ->{read,write}_iter() failed
sparc32: fix a braino in fault handling in csum_and_copy_..._user()
|
|
Recent changes to count number of matching symbols when creating
a kprobe event failed to take into account kernel modules. As such, it
breaks kprobes on kernel module symbols, by assuming there is no match.
Fix this my calling module_kallsyms_on_each_symbol() in addition to
kallsyms_on_each_match_symbol() to perform a proper counting.
Link: https://lore.kernel.org/all/20231027233126.2073148-1-andrii@kernel.org/
Cc: Francis Laniel <flaniel@linux.microsoft.com>
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Use of dget() after we'd dropped ->d_lock is too late - dentry might
be gone by that point.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
failed
->ki_pos value is unreliable in such cases. For an obvious example,
consider O_DSYNC write - we feed the data to page cache and start IO,
then we make sure it's completed. Update of ->ki_pos is dealt with
by the first part; failure in the second ends up with negative value
returned _and_ ->ki_pos left advanced as if sync had been successful.
In the same situation write(2) does not advance the file position
at all.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull io_uring fixes from Jens Axboe:
"Fix for an issue reported where reading fdinfo could find a NULL
thread as we didn't properly synchronize, and then a disable for the
IOCB_DIO_CALLER_COMP optimization as a recent reported highlighted how
that could lead to deadlocks if the task issued async O_DIRECT writes
and then proceeded to do sync fallocate() calls"
* tag 'io_uring-6.6-2023-10-27' of git://git.kernel.dk/linux:
io_uring/rw: disable IOCB_DIO_CALLER_COMP
io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid
|
|
Fault handler used to make non-trivial calls, so it needed
to set a stack frame up. Used to be
save ... - grab a stack frame, old %o... become %i...
....
ret - go back to address originally in %o7, currently %i7
restore - switch to previous stack frame, in delay slot
Non-trivial calls had been gone since ab5e8b331244 and that code should
have become
retl - go back to address in %o7
clr %o0 - have return value set to 0
What it had become instead was
ret - go back to address in %i7 - return address of *caller*
clr %o0 - have return value set to 0
which is not good, to put it mildly - we forcibly return 0 from
csum_and_copy_{from,to}_iter() (which is what the call of that
thing had been inlined into) and do that without dropping the
stack frame of said csum_and_copy_..._iter(). Confuses the
hell out of the caller of csum_and_copy_..._iter(), obviously...
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Fixes: ab5e8b331244 "sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull block fix from Jens Axboe:
"Just a single fix for a potential divide-by-zero, introduced in this
cycle"
* tag 'block-6.6-2023-10-27' of git://git.kernel.dk/linux:
blk-throttle: check for overflow in calculate_bytes_allowed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:
"A single patch to fix a regression introduced by the recent
suspend/resume fixes.
The regression is that ATA disks are not stopped on system shutdown,
which is not recommended and increases the disks SMART counters for
unclean power off events.
This patch fixes this by refining the recent rework of the scsi device
manage_xxx flags"
* tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
scsi: sd: Introduce manage_shutdown device flag
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fix from Hans de Goede:
"A single patch to extend the AMD PMC driver DMI quirk list
for laptops which need special handling to avoid NVME s2idle
suspend/resume errors"
* tag 'platform-drivers-x86-v6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: Add s2idle quirk for more Lenovo laptops
|
|
Service NMI and SMI requests after PMI requests in vcpu_enter_guest() so
that KVM does not need to cancel and redo the VM-Enter if the guest
configures its PMIs to be delivered as NMIs (likely) or SMIs (unlikely).
Because APIC emulation "injects" NMIs via KVM_REQ_NMI, handling PMI
requests after NMI requests (the likely case) means KVM won't detect the
pending NMI request until the final check for outstanding requests.
Detecting requests at the final stage is costly as KVM has already loaded
guest state, potentially queued events for injection, disabled IRQs,
dropped SRCU, etc., most of which needs to be unwound.
Note that changing the order of request processing doesn't change the end
result, as KVM's final check for outstanding requests prevents entering
the guest until all requests are serviced. I.e. KVM will ultimately
coalesce events (or not) regardless of the ordering.
Using SPEC2017 benchmark programs running along with Intel vtune in a VM
demonstrates that the following code change reduces 800~1500 canceled
VM-Enters per second.
Some glory details:
Probe the invocation to vmx_cancel_injection():
$ perf probe -a vmx_cancel_injection
$ perf stat -a -e probe:vmx_cancel_injection -I 10000 # per 10 seconds
Partial results when SPEC2017 with Intel vtune are running in the VM:
On kernel without the change:
10.010018010 14254 probe:vmx_cancel_injection
20.037646388 15207 probe:vmx_cancel_injection
30.078739816 15261 probe:vmx_cancel_injection
40.114033258 15085 probe:vmx_cancel_injection
50.149297460 15112 probe:vmx_cancel_injection
60.185103088 15104 probe:vmx_cancel_injection
On kernel with the change:
10.003595390 40 probe:vmx_cancel_injection
20.017855682 31 probe:vmx_cancel_injection
30.028355883 34 probe:vmx_cancel_injection
40.038686298 31 probe:vmx_cancel_injection
50.048795162 20 probe:vmx_cancel_injection
60.069057747 19 probe:vmx_cancel_injection
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/r/20231002040839.2630027-1-mizhang@google.com
[sean: hoist PMU/PMI above SMI too, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Tetsuo reported the following lockdep splat when the TSC synchronization
fails during CPU hotplug:
tsc: Marking TSC unstable due to check_tsc_sync_source failed
WARNING: inconsistent lock state
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
ffffffff8cfa1c78 (watchdog_lock){?.-.}-{2:2}, at: clocksource_watchdog+0x23/0x5a0
{IN-HARDIRQ-W} state was registered at:
_raw_spin_lock_irqsave+0x3f/0x60
clocksource_mark_unstable+0x1b/0x90
mark_tsc_unstable+0x41/0x50
check_tsc_sync_source+0x14f/0x180
sysvec_call_function_single+0x69/0x90
Possible unsafe locking scenario:
lock(watchdog_lock);
<Interrupt>
lock(watchdog_lock);
stack backtrace:
_raw_spin_lock+0x30/0x40
clocksource_watchdog+0x23/0x5a0
run_timer_softirq+0x2a/0x50
sysvec_apic_timer_interrupt+0x6e/0x90
The reason is the recent conversion of the TSC synchronization function
during CPU hotplug on the control CPU to a SMP function call. In case
that the synchronization with the upcoming CPU fails, the TSC has to be
marked unstable via clocksource_mark_unstable().
clocksource_mark_unstable() acquires 'watchdog_lock', but that lock is
taken with interrupts enabled in the watchdog timer callback to minimize
interrupt disabled time. That's obviously a possible deadlock scenario,
Before that change the synchronization function was invoked in thread
context so this could not happen.
As it is not crucical whether the unstable marking happens slightly
delayed, defer the call to a worker thread which avoids the lock context
problem.
Fixes: 9d349d47f0e3 ("x86/smpboot: Make TSC synchronization function call based")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87zg064ceg.ffs@tglx
|
|
David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.
Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.
Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.
In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.
The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.
One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.
Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.
Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.
Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218003
Link: https://lore.kernel.org/r/875y2u5s8g.ffs@tglx
|
|
For "reasons" Intel has code-named this CPU with a "_H" suffix.
[ dhansen: As usual, apply this and send it upstream quickly to
make it easier for anyone who is doing work that
consumes this. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20231025202513.12358-1-tony.luck%40intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fix from Joerg Roedel:
- Fix boot regression for Sapphire Rapids with Intel VT-d driver
* tag 'iommu-fix-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Avoid unnecessary cache invalidations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix boot crash with FLATMEM since set_ptes() introduction
- Avoid calling arch_enter/leave_lazy_mmu() in set_ptes()
Thanks to Aneesh Kumar K.V and Erhard Furtner.
* tag 'powerpc-6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptes
powerpc/mm: Fix boot crash with FLATMEM
|
|
When suspending to idle and resuming on some Lenovo laptops using the
Mendocino APU, multiple NVME IOMMU page faults occur, showing up in
dmesg as repeated errors:
nvme 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000b
address=0xb6674000 flags=0x0000]
The system is unstable afterwards.
Applying the s2idle quirk introduced by commit 455cd867b85b ("platform/x86:
thinkpad_acpi: Add a s2idle resume quirk for a number of laptops")
allows these systems to work with the IOMMU enabled and s2idle
resume to work.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218024
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Suggested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/ZTlsyOaFucF2pWrL@localhost
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Fix the following kernel-doc warnings:
kernel/trace/trace_kprobe.c:1029: warning: Excess function parameter 'args' description in '__kprobe_event_gen_cmd_start'
kernel/trace/trace_kprobe.c:1097: warning: Excess function parameter 'args' description in '__kprobe_event_add_fields'
Refer to the usage of variable length arguments elsewhere in the kernel
code, "@..." is the proper way to express it in the description.
Link: https://lore.kernel.org/all/20231027041315.2613166-1-yujie.liu@intel.com/
Fixes: 2a588dd1d5d6 ("tracing: Add kprobe event command generation functions")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310190437.paI6LYJF-lkp@intel.com/
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
The iommu_create_device_direct_mappings() only needs to flush the caches
when the mappings are changed in the affected domain. This is not true
for non-DMA domains, or for devices attached to the domain that have no
reserved regions. To avoid unnecessary cache invalidations, add a check
before iommu_flush_iotlb_all().
Fixes: a48ce36e2786 ("iommu: Prevent RESV_DIRECT devices from blocking domains")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Henry Willard <henry.willard@oracle.com>
Link: https://lore.kernel.org/r/20231026084942.17387-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Pull drm fixes from Dave Airlie:
"This is the final set of fixes for 6.6, just misc bits mainly in
amdgpu and i915, nothing too noteworthy.
amdgpu:
- ignore duplicated BOs in CS parser
- remove redundant call to amdgpu_ctx_priority_is_valid()
- Extend VI APSM quirks to more platforms
amdkfd:
- reserve fence slot while locking BO
dp_mst:
- Fix NULL deref in get_mst_branch_device_by_guid_helper()
logicvc:
- Kconfig: Select REGMAP and REGMAP_MMIO
ivpu:
- Fix missing VPUIP interrupts
i915:
- Determine context valid in OA reports
- Hold GT forcewake during steering operations
- Check if PMU is closed before stopping event"
* tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm:
accel/ivpu/37xx: Fix missing VPUIP interrupts
drm/amd: Disable ASPM for VI w/ all Intel systems
drm/i915/pmu: Check if pmu is closed before stopping event
drm/i915/mcr: Hold GT forcewake during steering operations
drm/logicvc: Kconfig: select REGMAP and REGMAP_MMIO
drm/i915/perf: Determine context valid in OA reports
drm/amdkfd: reserve a fence slot while locking the BO
drm/amdgpu: Remove redundant call to priority_is_valid()
drm/dp_mst: Fix NULL deref in get_mst_branch_device_by_guid_helper()
drm/amdgpu: ignore duplicate BOs again
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.6-2023-10-25:
amdgpu:
- Extend VI APSM quirks to more platforms
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231026035452.14921-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Determine context valid in OA reports (Umesh)
- Hold GT forcewake during steering operations (Matt Roper)
- Check if PMU is closed before stopping event (Umesh)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZTp8IQ0wxzxVjN7J@intel.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
amdgpu:
- ignore duplicated BOs in CS parser
- remove redundant call to amdgpu_ctx_priority_is_valid()
amdkfd:
- reserve fence slot while locking BO
dp_mst:
- Fix NULL deref in get_mst_branch_device_by_guid_helper()
logicvc:
- Kconfig: Select REGMAP and REGMAP_MMIO
ivpu:
- Fix missing VPUIP interrupts
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231026110132.GA10591@linux-uq9g.fritz.box
|
|
Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") change setting the manage_system_start_stop
flag to false for libata managed disks to enable libata internal
management of disk suspend/resume. However, a side effect of this change
is that on system shutdown, disks are no longer being stopped (set to
standby mode with the heads unloaded). While this is not a critical
issue, this unclean shutdown is not recommended and shows up with
increased smart counters (e.g. the unexpected power loss counter
"Unexpect_Power_Loss_Ct").
Instead of defining a shutdown driver method for all ATA adapter
drivers (not all of them define that operation), this patch resolves
this issue by further refining the sd driver start/stop control of disks
using the new flag manage_shutdown. If this new flag is set to true by
a low level driver, the function sd_shutdown() will issue a
START STOP UNIT command with the start argument set to 0 when a disk
needs to be powered off (suspended) on system power off, that is, when
system_state is equal to SYSTEM_POWER_OFF.
Similarly to the other manage_xxx flags, the new manage_shutdown flag is
exposed through sysfs as a read-write device attribute.
To avoid any confusion between manage_shutdown and
manage_system_start_stop, the comments describing these flags in
include/scsi/scsi.h are also improved.
Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038
Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"A couple of platforms have some last-minute fixes, in particular:
- riscv gets some fixes for noncoherent DMA on the renesas and thead
platforms and dts fix for SPI on the visionfive 2 board
- Qualcomm Snapdragon gets three dts fixes to address board specific
regressions on the pmic and gpio nodes
- Rockchip platforms get multiple dts fixes to address issues on the
recent rk3399 platform as well as the older rk3128 platform that
apparently regressed a while ago.
- TI OMAP gets some trivial code and dts fixes and a regression fix
for the omap1 ams-delta modem
- NXP i.MX firmware has one fix for a use-after-free but in its error
handling"
* tag 'soc-fixes-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
soc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM
riscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT
riscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT
riscv: dts: thead: set dma-noncoherent to soc bus
arm64: dts: rockchip: Fix i2s0 pin conflict on ROCK Pi 4 boards
arm64: dts: rockchip: Add i2s0-2ch-bus-bclk-off pins to RK3399
clk: ti: Fix missing omap5 mcbsp functional clock and aliases
clk: ti: Fix missing omap4 mcbsp functional clock and aliases
ARM: OMAP1: ams-delta: Fix MODEM initialization failure
soc: renesas: Make ARCH_R9A07G043 depend on required options
riscv: dts: starfive: visionfive 2: correct spi's ss pin
firmware/imx-dsp: Fix use_after_free in imx_dsp_setup_channels()
ARM: OMAP: timer32K: fix all kernel-doc warnings
ARM: omap2: fix a debug printk
ARM: dts: rockchip: Fix timer clocks for RK3128
ARM: dts: rockchip: Add missing quirk for RK3128's dma engine
ARM: dts: rockchip: Add missing arm timer interrupt for RK3128
ARM: dts: rockchip: Fix i2c0 register address for RK3128
arm64: dts: rockchip: set codec system-clock-fixed on px30-ringneck-haikou
arm64: dts: rockchip: use codec as clock master on px30-ringneck-haikou
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and netfilter.
Most regressions addressed here come from quite old versions, with the
exceptions of the iavf one and the WiFi fixes. No known outstanding
reports or investigation.
Fixes to fixes:
- eth: iavf: in iavf_down, disable queues when removing the driver
Previous releases - regressions:
- sched: act_ct: additional checks for outdated flows
- tcp: do not leave an empty skb in write queue
- tcp: fix wrong RTO timeout when received SACK reneging
- wifi: cfg80211: pass correct pointer to rdev_inform_bss()
- eth: i40e: sync next_to_clean and next_to_process for programming
status desc
- eth: iavf: initialize waitqueues before starting watchdog_task
Previous releases - always broken:
- eth: r8169: fix data-races
- eth: igb: fix potential memory leak in igb_add_ethtool_nfc_entry
- eth: r8152: avoid writing garbage to the adapter's registers
- eth: gtp: fix fragmentation needed check with gso"
* tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
iavf: in iavf_down, disable queues when removing the driver
vsock/virtio: initialize the_virtio_vsock before using VQs
net: ipv6: fix typo in comments
net: ipv4: fix typo in comments
net/sched: act_ct: additional checks for outdated flows
netfilter: flowtable: GC pushes back packets to classic path
i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR
gtp: fix fragmentation needed check with gso
gtp: uapi: fix GTPA_MAX
Fix NULL pointer dereference in cn_filter()
sfc: cleanup and reduce netlink error messages
net/handshake: fix file ref count in handshake_nl_accept_doit()
wifi: mac80211: don't drop all unprotected public action frames
wifi: cfg80211: fix assoc response warning on failed links
wifi: cfg80211: pass correct pointer to rdev_inform_bss()
isdn: mISDN: hfcsusb: Spelling fix in comment
tcp: fix wrong RTO timeout when received SACK reneging
r8152: Block future register access if register access fails
r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v6.6 (take three)
- Sort out a few Kconfig dependency issues for the rich set of RISC-V
non-coherent DMA support.
* tag 'renesas-fixes-for-v6.6-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
soc: renesas: ARCH_R9A07G043 depends on !RISCV_ISA_ZICBOM
riscv: only select DMA_DIRECT_REMAP from RISCV_ISA_ZICBOM and ERRATA_THEAD_PBMT
riscv: RISCV_NONSTANDARD_CACHE_OPS shouldn't depend on RISCV_DMA_NONCOHERENT
Link: https://lore.kernel.org/r/cover.1698312384.git.geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
ARCH_R9A07G043 has its own non-standard global pool based DMA coherent
allocator, which conflicts with the remap based RISCV_ISA_ZICBOM version.
Add a proper dependency.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20231018052654.50074-4-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
RISCV_DMA_NONCOHERENT is also used for whacky non-standard
non-coherent ops that use different hooks in dma-direct.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231018052654.50074-3-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
RISCV_NONSTANDARD_CACHE_OPS is also used for the pmem cache maintenance
helpers, which are built into the kernel unconditionally.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231018052654.50074-2-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|