| author | Kiryl Shutsemau (Meta) <kas@kernel.org> | 2026-05-29 20:23:25 +0300 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2026-06-09 04:21:28 +0300 |
| commit | 04718f7c9290f95385f0dd328758753dc1c36dec (patch) | |
| tree | 7812cb6582a4ae104a5983caea22b7affee44762 /include/linux/timerqueue.h | |
| parent | 3862816c98152553106dd762c66c0f390337fa38 (diff) | |
fs/proc/task_mmu: fix make_uffd_wp_huge_pte() prot-update race
Patch series "userfaultfd/pagemap: pre-existing fixes".
These are pre-existing bug fixes that were carried at the front of the
userfaultfd RWP working-set-tracking series up to v5 [1]. Per review
feedback that fixes should not sit in the middle of a feature series, they
are split out and sent on their own; the RWP series is reposted rebased on
top of this.
All six were flagged by the Sashiko AI review of the RWP series and carry
Reported-by: Sashiko AI review <sashiko-bot@kernel.org>. They are
independent of RWP, apply to mm-new directly, and carry Cc: stable@.
  1: fs/proc/task_mmu: a missing huge_ptep_modify_prot_start() in
     make_uffd_wp_huge_pte() can lose hardware Dirty/Accessed updates
     when PAGEMAP_SCAN write-protects a hugetlb PTE.

  2: fs/proc/task_mmu: pagemap_scan_hugetlb_entry() compares the range
     against HPAGE_SIZE rather than the hstate page size, so it never
     write-protects gigantic hugetlb pages.

  3: fs/proc/task_mmu: PAGEMAP_SCAN with PM_SCAN_WP_MATCHING over an
     unpopulated hugetlb range self-deadlocks -- pagemap_scan_pte_hole()
     calls uffd_wp_range() while walk_hugetlb_range() holds the hugetlb
     vma lock for read, and hugetlb_change_protection() then takes it
     for write. Install the marker inline instead.

  4: mm/huge_memory: change_non_present_huge_pmd() drops pmd_swp_uffd_wp
     on a device-private PMD permission downgrade, silently losing the
     uffd-wp marker.

  5: userfaultfd: must_wait() applies pte_write() to a locklessly read
     PTE without checking pte_present(), so swap/migration entries
     decode random offset bits and a thread can stay parked on a stale
     fault.

  6: userfaultfd: __VMA_UFFD_FLAGS feeds VMA_UFFD_MINOR_BIT (41) to
     mk_vma_flags() unconditionally, an out-of-bounds write into the
     single-word vma_flags_t on 32-bit. Build the mask from config-gated
     per-mode masks so an unavailable bit is never materialised.
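
As an illustration of item 6 only, the following userspace sketch models a
flags bitmap whose word count depends on the configuration and builds a mask
from config-gated pieces. Every toy_* name, the TOY_HAVE_MINOR switch, the
32-bit word size and the bit numbers other than 41 are assumptions made for
the example; this is not the kernel's vma_flags_t or mk_vma_flags() code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical config switch: 1 models a build where the minor mode exists. */
#ifndef TOY_HAVE_MINOR
#define TOY_HAVE_MINOR 0
#endif

/* Toy bitmap: one 32-bit word without the minor mode, two words with it. */
#if TOY_HAVE_MINOR
#define TOY_WORDS 2
#else
#define TOY_WORDS 1
#endif
#define TOY_FLAG_BITS (TOY_WORDS * 32u)

typedef struct { uint32_t word[TOY_WORDS]; } toy_flags_t;

/* Illustrative bit numbers; only 41 is taken from the text above. */
#define TOY_MISSING_BIT 9
#define TOY_WP_BIT      12
#define TOY_MINOR_BIT   41

/* True if the bit index lands inside the bitmap's storage. */
static int toy_bit_fits(unsigned int bit)
{
	return bit < TOY_FLAG_BITS;
}

static void toy_set_bit(toy_flags_t *f, unsigned int bit)
{
	/* An unconditional bit 41 on a one-word bitmap would fail here. */
	assert(toy_bit_fits(bit));
	f->word[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Build the mask from per-mode pieces; the minor piece is config-gated. */
static toy_flags_t toy_uffd_mask(void)
{
	toy_flags_t f = { { 0 } };

	toy_set_bit(&f, TOY_MISSING_BIT);
	toy_set_bit(&f, TOY_WP_BIT);
#if TOY_HAVE_MINOR
	toy_set_bit(&f, TOY_MINOR_BIT);
#endif
	return f;
}
```

In the default one-word configuration, toy_uffd_mask() never touches bit 41,
so the index past the end of word[] is never formed.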
This patch (of 6):
make_uffd_wp_huge_pte() arms the UFFD_WP bit on a present HugeTLB PTE by
calling huge_ptep_modify_prot_commit() with a ptent snapshot that was
fetched without the corresponding huge_ptep_modify_prot_start(). The
start helper is what atomically clears the entry so the kernel-owned
snapshot stays consistent until the commit; without it, the hardware may
set Dirty or Accessed in the live PTE between the original read and the
commit, and huge_ptep_modify_prot_commit() (whose generic implementation
just calls set_huge_pte_at()) then writes the stale snapshot back over the
live hardware bits, losing the update.
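
The lost update described above can be modelled in userspace. The sketch
below is an assumption-laden toy, not kernel code: the toy_* helpers, the bit
layout and the "hardware" function are all hypothetical, and the hardware is
modelled as only ever updating an entry that is still present.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical toy PTE bits; not any real architecture's layout. */
#define TOY_PRESENT (1u << 0)
#define TOY_DIRTY   (1u << 1)
#define TOY_UFFD_WP (1u << 2)

typedef _Atomic uint32_t toy_pte_t;

/* Models the MMU setting Dirty: it only updates a present entry. */
static void toy_hw_set_dirty(toy_pte_t *ptep)
{
	uint32_t old = atomic_load(ptep);

	while ((old & TOY_PRESENT) &&
	       !atomic_compare_exchange_weak(ptep, &old, old | TOY_DIRTY))
		;
}

/* Models the start helper: atomically clear the entry, return the old value. */
static uint32_t toy_modify_prot_start(toy_pte_t *ptep)
{
	return atomic_exchange(ptep, 0);
}

/* Models the generic commit helper: a plain store of the new value. */
static void toy_modify_prot_commit(toy_pte_t *ptep, uint32_t val)
{
	atomic_store(ptep, val);
}

/* Racy ordering: plain read, hardware update, commit of the stale snapshot. */
static uint32_t toy_wp_without_start(void)
{
	toy_pte_t pte = TOY_PRESENT;
	uint32_t snap = atomic_load(&pte);

	toy_hw_set_dirty(&pte);		/* lands in the live entry */
	toy_modify_prot_commit(&pte, snap | TOY_UFFD_WP);
	return atomic_load(&pte);	/* Dirty has been overwritten */
}

/* Paired ordering: start clears the entry, so no update can slip in between. */
static uint32_t toy_wp_with_start(void)
{
	toy_pte_t pte = TOY_PRESENT;
	uint32_t snap;

	toy_hw_set_dirty(&pte);		/* before start: captured in snap */
	snap = toy_modify_prot_start(&pte);
	toy_hw_set_dirty(&pte);		/* entry is clear: nothing to update */
	assert(atomic_load(&pte) == 0);
	toy_modify_prot_commit(&pte, snap | TOY_UFFD_WP);
	return atomic_load(&pte);
}
```

In this model toy_wp_without_start() ends with the Dirty bit missing, while
toy_wp_with_start() keeps it alongside the newly armed write-protect bit.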
The non-hugetlb sibling make_uffd_wp_pte() does this correctly via
ptep_modify_prot_start() / ptep_modify_prot_commit(). Mirror that pattern
for the present-PTE branch. The migration case stays as-is -- migration
entries are non-present, so there's no hardware update to race against.
Link: https://lore.kernel.org/20260529172331.356655-1-kas@kernel.org
Link: https://lore.kernel.org/20260529172331.356655-2-kas@kernel.org
Link: https://lore.kernel.org/all/20260526130509.2748441-1-kirill@shutemov.name/ [1]
Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Reported-by: Sashiko AI review <sashiko-bot@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
