diff options
| author | Deepanshu Kartikey <kartikey406@gmail.com> | 2026-02-14 03:15:35 +0300 |
|---|---|---|
| committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2026-03-25 13:08:45 +0300 |
| commit | 08de46a75f91a6661bc1ce0a93614f4bc313c581 (patch) | |
| tree | ce9de46db1ed4d6a64b9e65b7b37b834cf53bf0f /include/linux | |
| parent | ad07ea069f924465061cfee40ef2861bb99f4dd8 (diff) | |
| download | linux-08de46a75f91a6661bc1ce0a93614f4bc313c581.tar.xz | |
mm: thp: deny THP for files on anonymous inodes
commit dd085fe9a8ebfc5d10314c60452db38d2b75e609 upstream.
file_thp_enabled() incorrectly allows THP for files on anonymous inodes
(e.g. guest_memfd and secretmem). These files are created via
alloc_file_pseudo(), which does not call get_write_access() and leaves
inode->i_writecount at 0. Combined with S_ISREG(inode->i_mode) being
true, they appear as read-only regular files when
CONFIG_READ_ONLY_THP_FOR_FS is enabled, making them eligible for THP
collapse.
Anonymous inodes can never pass the inode_is_open_for_write() check
since their i_writecount is never incremented through the normal VFS
open path. The right thing to do is to exclude them from THP eligibility
altogether, since CONFIG_READ_ONLY_THP_FOR_FS was designed for real
filesystem files (e.g. shared libraries), not for pseudo-filesystem
inodes.
For guest_memfd, this allows khugepaged and MADV_COLLAPSE to create
large folios in the page cache via the collapse path, but the
guest_memfd fault handler does not support large folios. This triggers
WARN_ON_ONCE(folio_test_large(folio)) in kvm_gmem_fault_user_mapping().
For secretmem, collapse_file() tries to copy page contents through the
direct map, but secretmem pages are removed from the direct map. This
can result in a kernel crash:
BUG: unable to handle page fault for address: ffff88810284d000
RIP: 0010:memcpy_orig+0x16/0x130
Call Trace:
collapse_file
hpage_collapse_scan_file
madvise_collapse
Secretmem is not affected by the crash on upstream as the memory failure
recovery handles the failed copy gracefully, but it still triggers
confusing false memory failure reports:
Memory failure: 0x106d96f: recovery action for clean unevictable
LRU page: Recovered
Check IS_ANON_FILE(inode) in file_thp_enabled() to deny THP for all
anonymous inode files.
Link: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44
Link: https://lore.kernel.org/linux-mm/CAEvNRgHegcz3ro35ixkDw39ES8=U6rs6S7iP0gkR9enr7HoGtA@mail.gmail.com
Link: https://lkml.kernel.org/r/20260214001535.435626-1-kartikey406@gmail.com
Fixes: 7fbb5e188248 ("mm: remove VM_EXEC requirement for THP eligibility")
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Reported-by: syzbot+33a04338019ac7e43a44@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44
Tested-by: syzbot+33a04338019ac7e43a44@syzkaller.appspotmail.com
Tested-by: Lance Yang <lance.yang@linux.dev>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Fangrui Song <i@maskray.me>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Ackerley: we don't have IS_ANON_FILE() yet. As guest_memfd does
not apply yet, simply check for secretmem explicitly. ]
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/huge_mm.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index f70b048596b5..bd34b70dd0cf 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -7,6 +7,7 @@ #include <linux/fs.h> /* only for vma_is_dax() */ #include <linux/kobject.h> +#include <linux/secretmem.h> vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf); int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, @@ -262,6 +263,9 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma) inode = vma->vm_file->f_inode; + if (secretmem_mapping(inode->i_mapping)) + return false; + return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS)) && !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); } |
