<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/anon_inodes.c, branch v6.12.91</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.91</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.91'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-07-10T14:05:09+00:00</updated>
<entry>
<title>fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass</title>
<updated>2025-07-10T14:05:09+00:00</updated>
<author>
<name>Shivank Garg</name>
<email>shivankg@amd.com</email>
</author>
<published>2025-06-20T07:03:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f94c422157f3e43dd31990567b3e5d54b3e5b32b'/>
<id>urn:sha1:f94c422157f3e43dd31990567b3e5d54b3e5b32b</id>
<content type='text'>
[ Upstream commit cbe4134ea4bc493239786220bd69cb8a13493190 ]

Export anon_inode_make_secure_inode() to allow KVM guest_memfd to create
anonymous inodes with proper security context. This replaces the current
pattern of calling alloc_anon_inode() followed by
inode_init_security_anon() for creating security context manually.

This change also fixes a security regression in secretmem where the
S_PRIVATE flag was not cleared after alloc_anon_inode(), causing
LSM/SELinux checks to be bypassed for secretmem file descriptors.

As guest_memfd currently resides in the KVM module, we need to export this
symbol for use outside the core kernel. In the future, guest_memfd might be
moved to core-mm, at which point the symbols no longer would have to be
exported. When/if that happens is still unclear.

Fixes: 2bfe15c52612 ("mm: create security context for memfd_secret inodes")
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Suggested-by: Mike Rapoport &lt;rppt@kernel.org&gt;
Signed-off-by: Shivank Garg &lt;shivankg@amd.com&gt;
Link: https://lore.kernel.org/20250620070328.803704-3-shivankg@amd.com
Acked-by: "Mike Rapoport (Microsoft)" &lt;rppt@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>add a string-to-qstr constructor</title>
<updated>2025-07-10T14:05:08+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2025-01-24T03:51:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e036efbe5822dd679e475432743c32dbe4bd63a6'/>
<id>urn:sha1:e036efbe5822dd679e475432743c32dbe4bd63a6</id>
<content type='text'>
[ Upstream commit c1feab95e0b2e9fce7e4f4b2739baf40d84543af ]

Quite a few places want to build a struct qstr by given string;
it would be convenient to have a primitive doing that, rather
than open-coding it via QSTR_INIT().

The closest approximation was in bcachefs, but that expands to
initializer list - {.len = strlen(string), .name = string}.
It would be more useful to have it as compound literal -
(struct qstr){.len = strlen(string), .name = string}.

Unlike initializer list it's a valid expression.  What's more,
it's a valid lvalue - it's an equivalent of anonymous local
variable with such initializer, so the things like
	path-&gt;dentry = d_alloc_pseudo(mnt-&gt;mnt_sb, &amp;QSTR(name));
are valid.  It can also be used as initializer, with identical
effect -
	struct qstr x = (struct qstr){.name = s, .len = strlen(s)};
is equivalent to
	struct qstr anon_variable = {.name = s, .len = strlen(s)};
	struct qstr x = anon_variable;
	// anon_variable is never used after that point
and any even remotely sane compiler will manage to collapse that
into
	struct qstr x = {.name = s, .len = strlen(s)};

What compound literals can't be used for is initialization of
global variables, but those are covered by QSTR_INIT().

This commit lifts definition(s) of QSTR() into linux/dcache.h,
converts it to compound literal (all bcachefs users are fine
with that) and converts assorted open-coded instances to using
that.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Stable-dep-of: cbe4134ea4bc ("fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: Create anon_inode_getfile_fmode()</title>
<updated>2024-04-26T08:33:05+00:00</updated>
<author>
<name>Dawid Osuchowski</name>
<email>linux@osuchow.ski</email>
</author>
<published>2024-04-26T07:58:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=55394d29c9e164f2e1991352f1dc8f973c4f1589'/>
<id>urn:sha1:55394d29c9e164f2e1991352f1dc8f973c4f1589</id>
<content type='text'>
Creates an anon_inode_getfile_fmode() function that works similarly to
anon_inode_getfile() with the addition of being able to set the fmode
member.

Signed-off-by: Dawid Osuchowski &lt;linux@osuchow.ski&gt;
Link: https://lore.kernel.org/r/20240426075854.4723-1-linux@osuchow.ski
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'kvm-guestmemfd' into HEAD</title>
<updated>2023-11-14T13:31:31+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2023-11-13T10:58:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6c370dc65374db5afbc5c6c64c662f922a2555ad'/>
<id>urn:sha1:6c370dc65374db5afbc5c6c64c662f922a2555ad</id>
<content type='text'>
Introduce several new KVM uAPIs to ultimately create a guest-first memory
subsystem within KVM, a.k.a. guest_memfd.  Guest-first memory allows KVM
to provide features, enhancements, and optimizations that are kludgly
or outright impossible to implement in a generic memory subsystem.

The core KVM ioctl() for guest_memfd is KVM_CREATE_GUEST_MEMFD, which
similar to the generic memfd_create(), creates an anonymous file and
returns a file descriptor that refers to it.  Again like "regular"
memfd files, guest_memfd files live in RAM, have volatile storage,
and are automatically released when the last reference is dropped.
The key differences between memfd files (and every other memory subystem)
is that guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be resized.
guest_memfd files do however support PUNCH_HOLE, which can be used to
convert a guest memory area between the shared and guest-private states.

A second KVM ioctl(), KVM_SET_MEMORY_ATTRIBUTES, allows userspace to
specify attributes for a given page of guest memory.  In the long term,
it will likely be extended to allow userspace to specify per-gfn RWX
protections, including allowing memory to be writable in the guest
without it also being writable in host userspace.

The immediate and driving use case for guest_memfd are Confidential
(CoCo) VMs, specifically AMD's SEV-SNP, Intel's TDX, and KVM's own pKVM.
For such use cases, being able to map memory into KVM guests without
requiring said memory to be mapped into the host is a hard requirement.
While SEV+ and TDX prevent untrusted software from reading guest private
data by encrypting guest memory, pKVM provides confidentiality and
integrity *without* relying on memory encryption.  In addition, with
SEV-SNP and especially TDX, accessing guest private memory can be fatal
to the host, i.e. KVM must be prevent host userspace from accessing
guest memory irrespective of hardware behavior.

Long term, guest_memfd may be useful for use cases beyond CoCo VMs,
for example hardening userspace against unintentional accesses to guest
memory.  As mentioned earlier, KVM's ABI uses userspace VMA protections to
define the allow guest protection (with an exception granted to mapping
guest memory executable), and similarly KVM currently requires the guest
mapping size to be a strict subset of the host userspace mapping size.
Decoupling the mappings sizes would allow userspace to precisely map
only what is needed and with the required permissions, without impacting
guest performance.

A guest-first memory subsystem also provides clearer line of sight to
things like a dedicated memory pool (for slice-of-hardware VMs) and
elimination of "struct page" (for offload setups where userspace _never_
needs to DMA from or into guest memory).

guest_memfd is the result of 3+ years of development and exploration;
taking on memory management responsibilities in KVM was not the first,
second, or even third choice for supporting CoCo VMs.  But after many
failed attempts to avoid KVM-specific backing memory, and looking at
where things ended up, it is quite clear that of all approaches tried,
guest_memfd is the simplest, most robust, and most extensible, and the
right thing to do for KVM and the kernel at-large.

The "development cycle" for this version is going to be very short;
ideally, next week I will merge it as is in kvm/next, taking this through
the KVM tree for 6.8 immediately after the end of the merge window.
The series is still based on 6.6 (plus KVM changes for 6.7) so it
will require a small fixup for changes to get_file_rcu() introduced in
6.7 by commit 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU").
The fixup will be done as part of the merge commit, and most of the text
above will become the commit message for the merge.

Pending post-merge work includes:
- hugepage support
- looking into using the restrictedmem framework for guest memory
- introducing a testing mechanism to poison memory, possibly using
  the same memory attributes introduced here
- SNP and TDX support

There are two non-KVM patches buried in the middle of this series:

  fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
  mm: Add AS_UNMOVABLE to mark mapping as completely unmovable

The first is small and mostly suggested-by Christian Brauner; the second
a bit less so but it was written by an mm person (Vlastimil Babka).
</content>
</entry>
<entry>
<title>KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory</title>
<updated>2023-11-14T13:01:03+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>seanjc@google.com</email>
</author>
<published>2023-11-13T10:42:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a7800aa80ea4d5356b8474c2302812e9d4926fa6'/>
<id>urn:sha1:a7800aa80ea4d5356b8474c2302812e9d4926fa6</id>
<content type='text'>
Introduce an ioctl(), KVM_CREATE_GUEST_MEMFD, to allow creating file-based
memory that is tied to a specific KVM virtual machine and whose primary
purpose is to serve guest memory.

A guest-first memory subsystem allows for optimizations and enhancements
that are kludgy or outright infeasible to implement/support in a generic
memory subsystem.  With guest_memfd, guest protections and mapping sizes
are fully decoupled from host userspace mappings.   E.g. KVM currently
doesn't support mapping memory as writable in the guest without it also
being writable in host userspace, as KVM's ABI uses VMA protections to
define the allow guest protection.  Userspace can fudge this by
establishing two mappings, a writable mapping for the guest and readable
one for itself, but that’s suboptimal on multiple fronts.

Similarly, KVM currently requires the guest mapping size to be a strict
subset of the host userspace mapping size, e.g. KVM doesn’t support
creating a 1GiB guest mapping unless userspace also has a 1GiB guest
mapping.  Decoupling the mappings sizes would allow userspace to precisely
map only what is needed without impacting guest performance, e.g. to
harden against unintentional accesses to guest memory.

Decoupling guest and userspace mappings may also allow for a cleaner
alternative to high-granularity mappings for HugeTLB, which has reached a
bit of an impasse and is unlikely to ever be merged.

A guest-first memory subsystem also provides clearer line of sight to
things like a dedicated memory pool (for slice-of-hardware VMs) and
elimination of "struct page" (for offload setups where userspace _never_
needs to mmap() guest memory).

More immediately, being able to map memory into KVM guests without mapping
said memory into the host is critical for Confidential VMs (CoCo VMs), the
initial use case for guest_memfd.  While AMD's SEV and Intel's TDX prevent
untrusted software from reading guest private data by encrypting guest
memory with a key that isn't usable by the untrusted host, projects such
as Protected KVM (pKVM) provide confidentiality and integrity *without*
relying on memory encryption.  And with SEV-SNP and TDX, accessing guest
private memory can be fatal to the host, i.e. KVM must be prevent host
userspace from accessing guest memory irrespective of hardware behavior.

Attempt #1 to support CoCo VMs was to add a VMA flag to mark memory as
being mappable only by KVM (or a similarly enlightened kernel subsystem).
That approach was abandoned largely due to it needing to play games with
PROT_NONE to prevent userspace from accessing guest memory.

Attempt #2 to was to usurp PG_hwpoison to prevent the host from mapping
guest private memory into userspace, but that approach failed to meet
several requirements for software-based CoCo VMs, e.g. pKVM, as the kernel
wouldn't easily be able to enforce a 1:1 page:guest association, let alone
a 1:1 pfn:gfn mapping.  And using PG_hwpoison does not work for memory
that isn't backed by 'struct page', e.g. if devices gain support for
exposing encrypted memory regions to guests.

Attempt #3 was to extend the memfd() syscall and wrap shmem to provide
dedicated file-based guest memory.  That approach made it as far as v10
before feedback from Hugh Dickins and Christian Brauner (and others) led
to it demise.

Hugh's objection was that piggybacking shmem made no sense for KVM's use
case as KVM didn't actually *want* the features provided by shmem.  I.e.
KVM was using memfd() and shmem to avoid having to manage memory directly,
not because memfd() and shmem were the optimal solution, e.g. things like
read/write/mmap in shmem were dead weight.

Christian pointed out flaws with implementing a partial overlay (wrapping
only _some_ of shmem), e.g. poking at inode_operations or super_operations
would show shmem stuff, but address_space_operations and file_operations
would show KVM's overlay.  Paraphrashing heavily, Christian suggested KVM
stop being lazy and create a proper API.

Link: https://lore.kernel.org/all/20201020061859.18385-1-kirill.shutemov@linux.intel.com
Link: https://lore.kernel.org/all/20210416154106.23721-1-kirill.shutemov@linux.intel.com
Link: https://lore.kernel.org/all/20210824005248.200037-1-seanjc@google.com
Link: https://lore.kernel.org/all/20211111141352.26311-1-chao.p.peng@linux.intel.com
Link: https://lore.kernel.org/all/20221202061347.1070246-1-chao.p.peng@linux.intel.com
Link: https://lore.kernel.org/all/ff5c5b97-acdf-9745-ebe5-c6609dd6322e@google.com
Link: https://lore.kernel.org/all/20230418-anfallen-irdisch-6993a61be10b@brauner
Link: https://lore.kernel.org/all/ZEM5Zq8oo+xnApW9@google.com
Link: https://lore.kernel.org/linux-mm/20230306191944.GA15773@monkey
Link: https://lore.kernel.org/linux-mm/ZII1p8ZHlHaQ3dDl@casper.infradead.org
Cc: Fuad Tabba &lt;tabba@google.com&gt;
Cc: Vishal Annapurve &lt;vannapurve@google.com&gt;
Cc: Ackerley Tng &lt;ackerleytng@google.com&gt;
Cc: Jarkko Sakkinen &lt;jarkko@kernel.org&gt;
Cc: Maciej Szmigiero &lt;mail@maciej.szmigiero.name&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Quentin Perret &lt;qperret@google.com&gt;
Cc: Michael Roth &lt;michael.roth@amd.com&gt;
Cc: Wang &lt;wei.w.wang@intel.com&gt;
Cc: Liam Merwick &lt;liam.merwick@oracle.com&gt;
Cc: Isaku Yamahata &lt;isaku.yamahata@gmail.com&gt;
Co-developed-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Co-developed-by: Yu Zhang &lt;yu.c.zhang@linux.intel.com&gt;
Signed-off-by: Yu Zhang &lt;yu.c.zhang@linux.intel.com&gt;
Co-developed-by: Chao Peng &lt;chao.p.peng@linux.intel.com&gt;
Signed-off-by: Chao Peng &lt;chao.p.peng@linux.intel.com&gt;
Co-developed-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Signed-off-by: Ackerley Tng &lt;ackerleytng@google.com&gt;
Co-developed-by: Isaku Yamahata &lt;isaku.yamahata@intel.com&gt;
Signed-off-by: Isaku Yamahata &lt;isaku.yamahata@intel.com&gt;
Co-developed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Co-developed-by: Michael Roth &lt;michael.roth@amd.com&gt;
Signed-off-by: Michael Roth &lt;michael.roth@amd.com&gt;
Signed-off-by: Sean Christopherson &lt;seanjc@google.com&gt;
Message-Id: &lt;20231027182217.3615211-17-seanjc@google.com&gt;
Reviewed-by: Fuad Tabba &lt;tabba@google.com&gt;
Tested-by: Fuad Tabba &lt;tabba@google.com&gt;
Reviewed-by: Xiaoyao Li &lt;xiaoyao.li@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()</title>
<updated>2023-11-14T13:00:57+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2023-11-03T10:47:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4f0b9194bc119a9850a99e5e824808e2f468c348'/>
<id>urn:sha1:4f0b9194bc119a9850a99e5e824808e2f468c348</id>
<content type='text'>
The call to the inode_init_security_anon() LSM hook is not the sole
reason to use anon_inode_getfile_secure() or anon_inode_getfd_secure().
For example, the functions also allow one to create a file with non-zero
size, without needing a full-blown filesystem.  In this case, you don't
need a "secure" version, just unique inodes; the current name of the
functions is confusing and does not explain well the difference with
the more "standard" anon_inode_getfile() and anon_inode_getfd().

Of course, there is another side of the coin; neither io_uring nor
userfaultfd strictly speaking need distinct inodes, and it is not
that clear anymore that anon_inode_create_get{file,fd}() allow the LSM
to intercept and block the inode's creation.  If one was so inclined,
anon_inode_getfile_secure() and anon_inode_getfd_secure() could be kept,
using the shared inode or a new one depending on CONFIG_SECURITY.
However, this is probably overkill, and potentially a cause of bugs in
different configurations.  Therefore, just add a comment to io_uring
and userfaultfd explaining the choice of the function.

While at it, remove the export for what is now anon_inode_create_getfd().
There is no in-tree module that uses it, and the old name is gone anyway.
If anybody actually needs the symbol, they can ask or they can just use
anon_inode_create_getfile(), which will be exported very soon for use
in KVM.

Suggested-by: Christian Brauner &lt;brauner@kernel.org&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>treewide: mark stuff as __ro_after_init</title>
<updated>2023-10-18T21:43:23+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2023-10-11T16:55:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=68279f9c9f592e75d30a9ba5154a15e0a0b42ae8'/>
<id>urn:sha1:68279f9c9f592e75d30a9ba5154a15e0a0b42ae8</id>
<content type='text'>
__read_mostly predates __ro_after_init. Many variables which are marked
__read_mostly should have been __ro_after_init from day 1.

Also, mark some stuff as "const" and "__init" while I'm at it.

[akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>dynamic_dname(): drop unused dentry argument</title>
<updated>2022-08-20T15:34:04+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2022-01-30T20:03:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f60d28828dd94779c6527440289e1c36a05115a'/>
<id>urn:sha1:0f60d28828dd94779c6527440289e1c36a05115a</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()</title>
<updated>2021-09-20T02:35:37+00:00</updated>
<author>
<name>Paul Moore</name>
<email>paul@paul-moore.com</email>
</author>
<published>2021-02-02T00:22:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3a862cacf8670729b1ad8fc28e4f7e078f9c570c'/>
<id>urn:sha1:3a862cacf8670729b1ad8fc28e4f7e078f9c570c</id>
<content type='text'>
Extending the secure anonymous inode support to other subsystems
requires that we have a secure anon_inode_getfile() variant in
addition to the existing secure anon_inode_getfd() variant.

Thankfully we can reuse the existing __anon_inode_getfile() function
and just wrap it with the proper arguments.

Acked-by: Mickaël Salaün &lt;mic@linux.microsoft.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>fs: anon_inodes: rephrase to appropriate kernel-doc</title>
<updated>2021-01-15T17:17:25+00:00</updated>
<author>
<name>Lukas Bulwahn</name>
<email>lukas.bulwahn@gmail.com</email>
</author>
<published>2021-01-15T12:03:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=365982aba1f264dba26f0908700d62bfa046918c'/>
<id>urn:sha1:365982aba1f264dba26f0908700d62bfa046918c</id>
<content type='text'>
Commit e7e832ce6fa7 ("fs: add LSM-supporting anon-inode interface") adds
more kerneldoc description, but also a few new warnings on
anon_inode_getfd_secure() due to missing parameter descriptions.

Rephrase to appropriate kernel-doc for anon_inode_getfd_secure().

Signed-off-by: Lukas Bulwahn &lt;lukas.bulwahn@gmail.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
</feed>
