<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/virt, branch v7.0.13</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.13</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.13'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-19T11:47:58+00:00</updated>
<entry>
<title>KVM: Don't WARN if memory is dirtied without a vCPU when the VM is dying</title>
<updated>2026-06-19T11:47:58+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2026-05-29T18:35:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=99d7d43784ae3235026581e9bf892c036e04c8e6'/>
<id>urn:sha1:99d7d43784ae3235026581e9bf892c036e04c8e6</id>
<content type='text'>
commit 8618004d3e897c0f1b71d9a9ab860461289bb89a upstream.

When marking a page dirty, complain about not having a running/loaded vCPU
if and only if the VM is still alive, i.e. its refcount is non-zero.  This
will allow fixing a memory leak for x86 SEV-ES guests without hitting what
is effectively a false positive on the WARN.

For some SEV-ES VM-Exits, KVM keeps a writable mapping of a guest page
across an exit to userspace, and typically unmaps the page on the next
KVM_RUN.  But if userspace never calls KVM_RUN after such an exit, then KVM
needs to unmap the page when the vCPU is destroyed, which in turn triggers
the WARN about not having a running vCPU.

Alternatively, SEV-ES could temporarily load the vCPU to suppress the WARN,
as is done in nested_vmx_free_vcpu() (but for completely unrelated reasons;
suppressing WARN from nested_put_vmcs12_pages() is pure happenstance).  But
loading a vCPU during destruction is gross (ideally nVMX code would be
cleaned up), risks complicating the SEV-ES code (KVM would need to ensure
the temporary load()+put() only runs when the vCPU isn't already loaded),
and is ultimately pointless.

The motivation for the WARN is to guard against KVM dirtying guest memory
without pushing the corresponding GFN to the active vCPU's dirty ring, e.g.
to ensure userspace doesn't miss a dirty page.  But for the VM's refcount
to reach zero, there can't be _any_ userspace mappings to the dirty ring,
as mapping the dirty ring requires doing mmap() on the vCPU FD.  I.e. if
userspace had a valid mapping for the dirty ring, then the vCPU file and
thus the owning VM would still be alive.  And so since userspace can't
possibly reach the dirty ring, whether or not KVM technically "misses" a
push to the dirty ring is irrelevant.

Reported-by: Michael Roth &lt;michael.roth@amd.com&gt;
Cc: stable@vger.kernel.org
Reviewed-by: Michael Roth &lt;michael.roth@amd.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Message-ID: &lt;20260501202250.2115252-15-seanjc@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Message-ID: &lt;20260529183549.1104619-15-pbonzini@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>KVM: Reject wrapped offset in kvm_reset_dirty_gfn()</title>
<updated>2026-05-23T11:09:37+00:00</updated>
<author>
<name>Aaron Sacks</name>
<email>contact@xchglabs.com</email>
</author>
<published>2026-05-12T06:07:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ecf9b3ea7847fe14f34b8c41f00de1eb95c747da'/>
<id>urn:sha1:ecf9b3ea7847fe14f34b8c41f00de1eb95c747da</id>
<content type='text'>
commit 577a8d3bae0531f0e5ccfac919cd8192f920a804 upstream.

kvm_reset_dirty_gfn() guards the gfn range with

	if (!memslot || (offset + __fls(mask)) &gt;= memslot-&gt;npages)
		return;

but offset is u64 and the addition is unchecked.  The check can be
silently bypassed by a u64 wrap.

The dirty ring backing those entries is MAP_SHARED at
KVM_DIRTY_LOG_PAGE_OFFSET of the vcpu fd, so the VMM can rewrite the
slot and offset fields of any entry between when the kernel pushes
them and when KVM_RESET_DIRTY_RINGS consumes them.  On reset,
kvm_dirty_ring_reset() re-reads the values via READ_ONCE() and feeds
them straight back into this check; only the flags handshake is
treated as the handover, the slot/offset payload is taken on trust.

Crafting two entries

	entry[i].offset   = 0xffffffffffffffc1
	entry[i+1].offset = 0

makes the coalescing loop in kvm_dirty_ring_reset() compute

	delta = (s64)(0 - 0xffffffffffffffc1) = 63

which falls in [0, BITS_PER_LONG), so it folds entry[i+1] into the
existing mask by setting bit 63.  The trailing kvm_reset_dirty_gfn()
call then sees offset = 0xffffffffffffffc1 and __fls(mask) = 63;
the sum is 0 in u64 and the bounds check passes.

That offset propagates into kvm_arch_mmu_enable_log_dirty_pt_masked()
unchanged.  On the legacy MMU path -- kvm_memslots_have_rmaps() ==
true, i.e. shadow paging, any VM that has allocated shadow roots, or
a write-tracked slot -- it reaches gfn_to_rmap(), which indexes
slot-&gt;arch.rmap[0][] with a near-U64_MAX gfn.  That is an
out-of-bounds load of a kvm_rmap_head, followed by a conditional
clear of PT_WRITABLE_MASK in whatever the loaded pointer points at.
The path is reachable from any process holding /dev/kvm.

Range-check offset on its own first, so the addition cannot wrap.
memslot-&gt;npages is bounded well below U64_MAX, so once offset &lt;
npages holds, offset + __fls(mask) (with __fls(mask) &lt; BITS_PER_LONG)
stays in range.
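
The wrap described above can be reproduced with plain integer arithmetic.
The following is a minimal Python sketch, not the kernel code: u64
arithmetic is modelled with a modulus, the helper names and the npages
value are hypothetical, and the "new" check is only one possible reading
of "range-check offset on its own first".

```python
U64 = 2**64          # modulus used to model the kernel's u64 arithmetic

def fls(mask):
    """Index of the highest set bit, like __fls() for a non-zero mask."""
    return mask.bit_length() - 1

def old_check_rejects(offset, mask, npages):
    # Models the unchecked u64 addition in the original guard.
    return (offset + fls(mask)) % U64 >= npages

def new_check_rejects(offset, mask, npages):
    # Assumed shape of the fix: test offset by itself before adding.
    return offset >= npages or offset + fls(mask) >= npages

npages = 1024                          # hypothetical memslot size
offset = 0xffffffffffffffc1            # entry[i].offset from the example
next_offset = 0                        # entry[i+1].offset from the example

delta = (next_offset - offset) % U64   # distance seen by the coalescing loop
mask = 1 + 2**delta                    # bit 0 for entry[i], bit delta for entry[i+1]

print("delta =", delta)
print("old check rejects:", old_check_rejects(offset, mask, npages))
print("new check rejects:", new_check_rejects(offset, mask, npages))
```

With these inputs the modelled old guard lets the entry through because
the sum wraps to 0, while the modelled new guard rejects it on the
offset test alone.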

Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Sacks &lt;contact@xchglabs.com&gt;
Link: https://patch.msgid.link/20260512060742.1628959-1-contact@xchglabs.com/
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-generic-7.0-rc3' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2026-03-11T17:01:55+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-03-11T17:01:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94fe3e6515ddca2fd33ca1ec53d3635e54fbe456'/>
<id>urn:sha1:94fe3e6515ddca2fd33ca1ec53d3635e54fbe456</id>
<content type='text'>
KVM generic changes for 7.0

 - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from being
   unnecessary and confusing, triggered compiler warnings due to
   -Wflex-array-member-not-at-end.

 - Document that vcpu-&gt;mutex is taken outside of kvm-&gt;slots_lock and
   kvm-&gt;slots_arch_lock, which is intentional and desirable despite being
   rather unintuitive.
</content>
</entry>
<entry>
<title>KVM: always define KVM_CAP_SYNC_MMU</title>
<updated>2026-02-28T14:31:35+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-11T18:06:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=70295a479da684905c18d96656d781823f418ec2'/>
<id>urn:sha1:70295a479da684905c18d96656d781823f418ec2</id>
<content type='text'>
KVM_CAP_SYNC_MMU is provided by KVM's MMU notifiers, which are now always
available.  Move the definition from individual architectures to common
code.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: remove CONFIG_KVM_GENERIC_MMU_NOTIFIER</title>
<updated>2026-02-28T14:31:35+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-11T18:03:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=407fd8b8d8cce03856aa67329715de48b254b529'/>
<id>urn:sha1:407fd8b8d8cce03856aa67329715de48b254b529</id>
<content type='text'>
All architectures now use MMU notifier for KVM page table management.
Remove the Kconfig symbol and the code that is used when it is
disabled.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.
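
The effect of the sed expression above on a single line can be sketched
in Python.  This is an illustration only: the pattern is a hand
translation of the sed expression into Python's regex syntax, the helper
name is hypothetical, and the sample lines are made up rather than taken
from the tree.

```python
import re

# Python equivalent of: s/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/
PATTERN = re.compile(r"(alloc_objs*\(.*), GFP_KERNEL\)")

def drop_default_gfp(line):
    """Drop a trailing ', GFP_KERNEL' argument, first match only (no sed 'g' flag)."""
    return PATTERN.sub(r"\1)", line, count=1)

samples = [
    "p = kmalloc_obj(*p, GFP_KERNEL);",              # converted
    "q = kzalloc_objs(struct foo, n, GFP_KERNEL);",  # converted
    "r = kmalloc_obj(*r, GFP_ATOMIC);",              # other flags: untouched
    "s = kmalloc_obj(*s,",                           # call split over lines: untouched
]

for line in samples:
    print(drop_default_gfp(line))
```

The last sample shows why multi-line calls are missed: the substitution
works one line at a time, so an argument list that continues on the next
line never matches.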

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2026-02-11T17:45:40+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-09T18:35:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf2c3138ae3694d4687cbe451c774c288ae2ad06'/>
<id>urn:sha1:bf2c3138ae3694d4687cbe451c774c288ae2ad06</id>
<content type='text'>
KVM mediated PMU support for 6.20

Add support for mediated PMUs, where KVM gives the guest full ownership of PMU
hardware (context switched around the fastpath run loop) and allows direct
access to data MSRs and PMCs (restricted by the vPMU model), but intercepts
access to control registers, e.g. to enforce event filtering and to prevent the
guest from profiling sensitive host state.

To keep overall complexity reasonable, mediated PMU usage is all or nothing
for a given instance of KVM (controlled via module param).  The mediated PMU
is disabled by default, partly to maintain backwards compatibility for existing
setups, partly because there are tradeoffs when running with a mediated PMU that
may be non-starters for some use cases, e.g. the host loses the ability to
profile guests with mediated PMUs, the fastpath run loop is also a blind spot,
entry/exit transitions are more expensive, etc.

Versus the emulated PMU, where KVM is "just another perf user", the mediated
PMU delivers more accurate profiling and monitoring (no risk of contention and
thus dropped events), with significantly less overhead (fewer exits and faster
emulation/programming of event selectors).  E.g. when running Specint-2017 on
a single-socket Sapphire Rapids with 56 cores and no-SMT, and using perf from
within the guest:

  Perf command:
  a. basic-sampling: perf record -F 1000 -e 6-instructions  -a --overwrite
  b. multiplex-sampling: perf record -F 1000 -e 10-instructions -a --overwrite

  Guest performance overhead:
  ---------------------------------------------------------------------------
  | Test case          | emulated vPMU | all passthrough | passthrough with |
  |                    |               |                 | event filters    |
  ---------------------------------------------------------------------------
  | basic-sampling     |   33.62%      |    4.24%        |   6.21%          |
  ---------------------------------------------------------------------------
  | multiplex-sampling |   79.32%      |    7.34%        |   10.45%         |
  ---------------------------------------------------------------------------
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-apic-6.20' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2026-02-11T17:45:32+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-09T18:33:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1b13885edf0a55a451a26d5fa53e7877b31debb5'/>
<id>urn:sha1:1b13885edf0a55a451a26d5fa53e7877b31debb5</id>
<content type='text'>
KVM x86 APIC-ish changes for 6.20

 - Fix a benign bug where KVM could use the wrong memslots (ignored SMM) when
   creating a vCPU-specific mapping of guest memory.

 - Clean up KVM's handling of marking mapped vCPU pages dirty.

 - Drop a pile of *ancient* sanity checks hidden behind KVM's unused
   ASSERT() macro, most of which could be trivially triggered by the guest
   and/or user, and all of which were useless.

 - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it
   more obvious what the weird parameter is used for, and to allow burying the
   RTC shenanigans behind CONFIG_KVM_IOAPIC=y.

 - Bury all of ioapic.h and KVM_IRQCHIP_KERNEL behind CONFIG_KVM_IOAPIC=y.

 - Add a regression test for recent APICv update fixes.

 - Rework KVM's handling of VMCS updates while L2 is active to temporarily
   switch to vmcs01 instead of deferring the update until the next nested
   VM-Exit.  The deferred updates approach directly contributed to several
   bugs, was proving to be a maintenance burden due to the difficulty in
   auditing the correctness of deferred updates, and was polluting
   "struct nested_vmx" with a growing pile of booleans.

 - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv()
   to consolidate the updates, and to co-locate SVI updates with the updates
   for KVM's own cache of ISR information.

 - Drop a dead function declaration.
</content>
</entry>
<entry>
<title>Merge tag 'kvm-x86-gmem-6.20' of https://github.com/kvm-x86/linux into HEAD</title>
<updated>2026-02-11T17:45:12+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2026-02-09T18:08:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9123c5f956b1fbedd63821eb528ece55ddd0e49c'/>
<id>urn:sha1:9123c5f956b1fbedd63821eb528ece55ddd0e49c</id>
<content type='text'>
KVM guest_memfd changes for 6.20

 - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage
   handling, and instead rely on SNP (the only user of the tracking) to do its
   own tracking via the RMP.

 - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE
   and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to
   avoid non-trivial complexity for a non-existent usecase (and because
   in-place conversion simply can't support unaligned sources).

 - When populating guest_memfd memory, GUP the source page in common code and
   pass the refcounted page to the vendor callback, instead of letting vendor
   code do the heavy lifting.  Doing so avoids a looming deadlock bug with
   in-place conversion due to an AB-BA conflict between mmap_lock and guest_memfd's filemap
   invalidate lock.
</content>
</entry>
</feed>
