<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/x86/virt, branch v6.19.12</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-11-12T23:29:38+00:00</updated>
<entry>
<title>x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible</title>
<updated>2025-11-12T23:29:38+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2025-11-12T17:39:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6276c67f2bc4aeaf350a7cf889c33c38b3330ea9'/>
<id>urn:sha1:6276c67f2bc4aeaf350a7cf889c33c38b3330ea9</id>
<content type='text'>
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(),
and use the helper macro to export symbols for KVM throughout x86 if and
only if KVM will build one or more modules, and only for those modules.

To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be
built (because no vendor modules are selected), let arch code #define
EXPORT_SYMBOL_FOR_KVM to suppress/override the exports.

Note, the set of symbols to restrict to KVM was generated by manual search
and audit; any "misses" are due to human error, not some grand plan.

Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Tested-by: Kai Huang &lt;kai.huang@intel.com&gt;
Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
</content>
</entry>
<entry>
<title>Merge tag 'x86_tdx_for_6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-10-04T17:01:30+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-10-04T17:01:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=50ac57c3b156e893e34310f0be340a130f36f6db'/>
<id>urn:sha1:50ac57c3b156e893e34310f0be340a130f36f6db</id>
<content type='text'>
Pull x86 TDX updates from Dave Hansen:
 "The biggest change here is making TDX and kexec play nicely together.

  Before this, the memory encryption hardware (which doesn't respect
  cache coherency) could write back old cachelines on top of data in the
  new kernel, so kexec and TDX were made mutually exclusive. This
  removes the limitation.

  There is also some work to tighten up a hardware bug workaround and
  some MAINTAINERS updates.

   - Make TDX and kexec work together

    - Skip TDX bug workaround when the bug is not present

    - Update maintainers entries"

* tag 'x86_tdx_for_6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/virt/tdx: Use precalculated TDVPR page physical address
  KVM/TDX: Explicitly do WBINVD when no more TDX SEAMCALLs
  x86/virt/tdx: Update the kexec section in the TDX documentation
  x86/virt/tdx: Remove the !KEXEC_CORE dependency
  x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
  x86/virt/tdx: Mark memory cache state incoherent when making SEAMCALL
  x86/sme: Use percpu boolean to control WBINVD during kexec
  x86/kexec: Consolidate relocate_kernel() function parameters
  x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present
  x86/tdx: Tidy reset_pamt functions
  x86/tdx: Eliminate duplicate code in tdx_clear_page()
  MAINTAINERS: Add KVM mail list to the TDX entry
  MAINTAINERS: Add Rick Edgecombe as a TDX reviewer
  MAINTAINERS: Update the file list in the TDX entry.
</content>
</entry>
<entry>
<title>x86/sev: Add new dump_rmp parameter to snp_leak_pages() API</title>
<updated>2025-09-17T10:04:04+00:00</updated>
<author>
<name>Ashish Kalra</name>
<email>ashish.kalra@amd.com</email>
</author>
<published>2025-09-16T21:29:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4c00c4ce2aafe61dc7436e763a78d6d112d9e2f'/>
<id>urn:sha1:e4c00c4ce2aafe61dc7436e763a78d6d112d9e2f</id>
<content type='text'>
When leaking certain page types, such as Hypervisor Fixed (HV_FIXED)
pages, it does not make sense to dump RMP contents for the 2MB range of
the page(s) being leaked. In the case of HV_FIXED pages, this is not an
error situation where the surrounding 2MB page RMP entries can provide
debug information.

Add new __snp_leak_pages() API with dump_rmp bool parameter to support
continue adding pages to the snp_leaked_pages_list but not issue
dump_rmpentry().

Make snp_leak_pages() a wrapper for the common case which also allows
existing users to continue to dump RMP entries.

Suggested-by: Thomas Lendacky &lt;Thomas.Lendacky@amd.com&gt;
Suggested-by: Sean Christopherson &lt;seanjc@google.com&gt;
Signed-off-by: Ashish Kalra &lt;ashish.kalra@amd.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Tom Lendacky &lt;thomas.lendacky@amd.com&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Link: https://lore.kernel.org/cover.1758057691.git.ashish.kalra@amd.com
</content>
</entry>
<entry>
<title>x86/virt/tdx: Use precalculated TDVPR page physical address</title>
<updated>2025-09-11T18:38:28+00:00</updated>
<author>
<name>Kai Huang</name>
<email>kai.huang@intel.com</email>
</author>
<published>2025-09-09T07:55:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e414b1005891d74bb0c3d27684c58dfbfbd1754b'/>
<id>urn:sha1:e414b1005891d74bb0c3d27684c58dfbfbd1754b</id>
<content type='text'>
All of the x86 KVM guest types (VMX, SEV and TDX) do some special context
tracking when entering guests. This means that the actual guest entry
sequence must be noinstr.

Part of entering a TDX guest is passing a physical address to the TDX
module. Right now, that physical address is stored as a 'struct page'
and converted to a physical address at guest entry. That page=&gt;phys
conversion can be complicated, can vary greatly based on kernel
config, and it is definitely _not_ a noinstr path today.

There have been a number of tinkering approaches to try and fix this
up, but they all fall down due to some part of the page=&gt;phys
conversion infrastructure not being noinstr friendly.

Precalculate the page=&gt;phys conversion and store it in the existing
'tdx_vp' structure.  Use the new field at every site that needs a
tdvpr physical address. Remove the now redundant tdx_tdvpr_pa().
Remove the __flatten remnant from the tinkering.

Note that only one user of the new field is actually noinstr. All
others can use page_to_phys(). But, they might as well save the effort
since there is a pre-calculated value sitting there for them.

[ dhansen: rewrite all the text ]

Signed-off-by: Kai Huang &lt;kai.huang@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Kiryl Shutsemau &lt;kas@kernel.org&gt;
Tested-by: Farrah Chen &lt;farrah.chen@intel.com&gt;
</content>
</entry>
<entry>
<title>KVM/TDX: Explicitly do WBINVD when no more TDX SEAMCALLs</title>
<updated>2025-09-05T17:40:41+00:00</updated>
<author>
<name>Kai Huang</name>
<email>kai.huang@intel.com</email>
</author>
<published>2025-09-01T16:09:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=61221d07e815008ba758995d79fd442b5217f51a'/>
<id>urn:sha1:61221d07e815008ba758995d79fd442b5217f51a</id>
<content type='text'>
On TDX platforms, during kexec, the kernel needs to make sure there
are no dirty cachelines of TDX private memory before booting to the new
kernel to avoid silent memory corruption to the new kernel.

To do this, the kernel has a percpu boolean to indicate whether the
cache of a CPU may be in incoherent state.  During kexec, namely in
stop_this_cpu(), the kernel does WBINVD if that percpu boolean is true.
TDX turns on that percpu boolean on a CPU when the kernel does SEAMCALL,
Thus making sure the cache will be flushed during kexec.

However, kexec has a race condition that, while remaining extremely rare,
would be more likely in the presence of a relatively long operation such
as WBINVD.

In particular, the kexec-ing CPU invokes native_stop_other_cpus()
to stop all remote CPUs before booting to the new kernel.
native_stop_other_cpus() then sends a REBOOT vector IPI to remote CPUs
and waits for them to stop; if that times out, it also sends NMIs to the
still-alive CPUs and waits again for them to stop.  If the race happens,
kexec proceeds before all CPUs have processed the NMI and stopped[1],
and the system hangs.

But after tdx_disable_virtualization_cpu(), no more TDX activity
can happen on this cpu.  When kexec is enabled, flush the cache
explicitly at that point; this moves the WBINVD to an earlier stage than
stop_this_cpus(), avoiding a possibly lengthy operation at a time where
it could cause this race.

[1] https://lore.kernel.org/kvm/b963fcd60abe26c7ec5dc20b42f1a2ebbcc72397.1750934177.git.kai.huang@intel.com/

[Make the new function a stub for !CONFIG_KEXEC_CORE. - Paolo]
Signed-off-by: Kai Huang &lt;kai.huang@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Tested-by: Farrah Chen &lt;farrah.chen@intel.com&gt;
Link: https://lore.kernel.org/all/20250901160930.1785244-8-pbonzini%40redhat.com
</content>
</entry>
<entry>
<title>x86/virt/tdx: Mark memory cache state incoherent when making SEAMCALL</title>
<updated>2025-09-05T17:40:40+00:00</updated>
<author>
<name>Kai Huang</name>
<email>kai.huang@intel.com</email>
</author>
<published>2025-09-01T16:09:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10df8607bf1a22249d21859f56eeb61e9a033313'/>
<id>urn:sha1:10df8607bf1a22249d21859f56eeb61e9a033313</id>
<content type='text'>
On TDX platforms, dirty cacheline aliases with and without encryption
bits can coexist, and the cpu can flush them back to memory in random
order.  During kexec, the caches must be flushed before jumping to the
new kernel otherwise the dirty cachelines could silently corrupt the
memory used by the new kernel due to different encryption property.

A percpu boolean is used to mark whether the cache of a given CPU may be
in an incoherent state, and the kexec performs WBINVD on the CPUs with
that boolean turned on.

For TDX, only the TDX module or the TDX guests can generate dirty
cachelines of TDX private memory, i.e., they are only generated when the
kernel does a SEAMCALL.

Set that boolean when the kernel does SEAMCALL so that kexec can flush
the cache correctly.

The kernel provides both the __seamcall*() assembly functions and the
seamcall*() wrapper ones which additionally handle running out of
entropy error in a loop.  Most of the SEAMCALLs are called using the
seamcall*(), except TDH.VP.ENTER and TDH.PHYMEM.PAGE.RDMD which are
called using __seamcall*() variant directly.

To cover the two special cases, add a new __seamcall_dirty_cache()
helper which only sets the percpu boolean and calls the __seamcall*(),
and change the special cases to use the new helper.  To cover all other
SEAMCALLs, change seamcall*() to call the new helper.

For the SEAMCALLs invoked via seamcall*(), they can be made from both
task context and IRQ disabled context.  Given SEAMCALL is just a lengthy
instruction (e.g., thousands of cycles) from kernel's point of view and
preempt_{disable|enable}() is cheap compared to it, just unconditionally
disable preemption during setting the boolean and making SEAMCALL.

Signed-off-by: Kai Huang &lt;kai.huang@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Chao Gao &lt;chao.gao@intel.com&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Tested-by: Farrah Chen &lt;farrah.chen@intel.com&gt;
Link: https://lore.kernel.org/all/20250901160930.1785244-4-pbonzini%40redhat.com
</content>
</entry>
<entry>
<title>x86/tdx: Skip clearing reclaimed pages unless X86_BUG_TDX_PW_MCE is present</title>
<updated>2025-08-22T14:45:50+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2025-08-19T15:58:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=01fb93a363e0583a3ce48098aca5ab9825a5b790'/>
<id>urn:sha1:01fb93a363e0583a3ce48098aca5ab9825a5b790</id>
<content type='text'>
Avoid clearing reclaimed TDX private pages unless the platform is affected
by the X86_BUG_TDX_PW_MCE erratum. This significantly reduces VM shutdown
time on unaffected systems.

Background

KVM currently clears reclaimed TDX private pages using MOVDIR64B, which:

   - Clears the TD Owner bit (which identifies TDX private memory) and
     integrity metadata without triggering integrity violations.
   - Clears poison from cache lines without consuming it, avoiding MCEs on
     access (refer TDX Module Base spec. 1348549-006US section 6.5.
     Handling Machine Check Events during Guest TD Operation).

The TDX module also uses MOVDIR64B to initialize private pages before use.
If cache flushing is needed, it sets TDX_FEATURES.CLFLUSH_BEFORE_ALLOC.
However, KVM currently flushes unconditionally, refer commit 94c477a751c7b
("x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages")

In contrast, when private pages are reclaimed, the TDX Module handles
flushing via the TDH.PHYMEM.CACHE.WB SEAMCALL.

Problem

Clearing all private pages during VM shutdown is costly. For guests
with a large amount of memory it can take minutes.

Solution

TDX Module Base Architecture spec. documents that private pages reclaimed
from a TD should be initialized using MOVDIR64B, in order to avoid
integrity violation or TD bit mismatch detection when later being read
using a shared HKID, refer April 2025 spec. "Page Initialization" in
section "8.6.2. Platforms not Using ACT: Required Cache Flush and
Initialization by the Host VMM"

That is an overstatement and will be clarified in coming versions of the
spec. In fact, as outlined in "Table 16.2: Non-ACT Platforms Checks on
Memory" and "Table 16.3: Non-ACT Platforms Checks on Memory Reads in Li
Mode" in the same spec, there is no issue accessing such reclaimed pages
using a shared key that does not have integrity enabled. Linux always uses
KeyID 0 which never has integrity enabled. KeyID 0 is also the TME KeyID
which disallows integrity, refer "TME Policy/Encryption Algorithm" bit
description in "Intel Architecture Memory Encryption Technologies" spec
version 1.6 April 2025. So there is no need to clear pages to avoid
integrity violations.

There remains a risk of poison consumption. However, in the context of
TDX, it is expected that there would be a machine check associated with the
original poisoning. On some platforms that results in a panic. However
platforms may support "SEAM_NR" Machine Check capability, in which case
Linux machine check handler marks the page as poisoned, which prevents it
from being allocated anymore, refer commit 7911f145de5fe ("x86/mce:
Implement recovery for errors in TDX/SEAM non-root mode")

Improvement

By skipping the clearing step on unaffected platforms, shutdown time
can improve by up to 40%.

On platforms with the X86_BUG_TDX_PW_MCE erratum (SPR and EMR), continue
clearing because these platforms may trigger poison on partial writes to
previously-private pages, even with KeyID 0, refer commit 1e536e1068970
("x86/cpu: Detect TDX partial write machine check erratum")

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Kirill A. Shutemov &lt;kas@kernel.org&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Reviewed-by: Xiaoyao Li &lt;xiaoyao.li@intel.com&gt;
Reviewed-by: Binbin Wu &lt;binbin.wu@linux.intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Acked-by: Vishal Annapurve &lt;vannapurve@google.com&gt;
Link: https://lore.kernel.org/all/20250819155811.136099-4-adrian.hunter%40intel.com
</content>
</entry>
<entry>
<title>x86/tdx: Tidy reset_pamt functions</title>
<updated>2025-08-22T14:45:50+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2025-08-19T15:58:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a27b008a5d7e8c49740dfd4b560cd2d1abe722e4'/>
<id>urn:sha1:a27b008a5d7e8c49740dfd4b560cd2d1abe722e4</id>
<content type='text'>
tdx_quirk_reset_paddr() was renamed to reflect that, in fact, the clearing
is necessary only for hardware with a certain quirk.  That is dealt with in
a subsequent patch.

Rename reset_pamt functions to contain "quirk" to reflect the new
functionality, and remove the now misleading comment.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Reviewed-by: Binbin Wu &lt;binbin.wu@linux.intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Acked-by: Vishal Annapurve &lt;vannapurve@google.com&gt;
Link: https://lore.kernel.org/all/20250819155811.136099-3-adrian.hunter%40intel.com
</content>
</entry>
<entry>
<title>x86/tdx: Eliminate duplicate code in tdx_clear_page()</title>
<updated>2025-08-22T14:45:50+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2025-08-19T15:58:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94272b084a745940e076a170d8193ac3427292e6'/>
<id>urn:sha1:94272b084a745940e076a170d8193ac3427292e6</id>
<content type='text'>
tdx_clear_page() and reset_tdx_pages() duplicate the TDX page clearing
logic.  Rename reset_tdx_pages() to tdx_quirk_reset_paddr() and create
tdx_quirk_reset_page() to call tdx_quirk_reset_paddr() and be used in
place of tdx_clear_page().

The new name reflects that, in fact, the clearing is necessary only for
hardware with a certain quirk.  That is dealt with in a subsequent patch
but doing the rename here avoids additional churn.

Note reset_tdx_pages() is slightly different from tdx_clear_page() because,
more appropriately, it uses mb() in place of __mb().  Except when extra
debugging is enabled (kcsan at present), mb() just calls __mb().

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Kirill A. Shutemov &lt;kas@kernel.org&gt;
Reviewed-by: Binbin Wu &lt;binbin.wu@linux.intel.com&gt;
Reviewed-by: Xiaoyao Li &lt;xiaoyao.li@intel.com&gt;
Reviewed-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Acked-by: Kai Huang &lt;kai.huang@intel.com&gt;
Acked-by: Sean Christopherson &lt;seanjc@google.com&gt;
Acked-by: Vishal Annapurve &lt;vannapurve@google.com&gt;
Link: https://lore.kernel.org/all/20250819155811.136099-2-adrian.hunter%40intel.com
</content>
</entry>
<entry>
<title>x86/virt/tdx: Avoid indirect calls to TDX assembly functions</title>
<updated>2025-06-10T19:32:52+00:00</updated>
<author>
<name>Kai Huang</name>
<email>kai.huang@intel.com</email>
</author>
<published>2025-06-06T13:07:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b3bc018e86afdc0cbfef61328c63d5c08f8b370'/>
<id>urn:sha1:0b3bc018e86afdc0cbfef61328c63d5c08f8b370</id>
<content type='text'>
Two 'static inline' TDX helper functions (sc_retry() and
sc_retry_prerr()) take function pointer arguments which refer to
assembly functions.  Normally, the compiler inlines the TDX helper,
realizes that the function pointer targets are completely static --
thus can be resolved at compile time -- and generates direct call
instructions.

But, other times (like when CONFIG_CC_OPTIMIZE_FOR_SIZE=y), the
compiler declines to inline the helpers and will instead generate
indirect call instructions.

Indirect calls to assembly functions require special annotation (for
various Control Flow Integrity mechanisms).  But TDX assembly
functions lack the special annotations and can only be called
directly.

Annotate both the helpers as '__always_inline' to prod the compiler
into maintaining the direct calls. There is no guarantee here, but
Peter has volunteered to report the compiler bug if this assumption
ever breaks[1].

Fixes: 1e66a7e27539 ("x86/virt/tdx: Handle SEAMCALL no entropy error in common code")
Fixes: df01f5ae07dd ("x86/virt/tdx: Add SEAMCALL error printing for module initialization")
Signed-off-by: Kai Huang &lt;kai.huang@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/20250605145914.GW39944@noisy.programming.kicks-ass.net/ [1]
Link: https://lore.kernel.org/all/20250606130737.30713-1-kai.huang%40intel.com
</content>
</entry>
</feed>
