From de195c67bfcbc770bb6327bbfd3010f1c371216a Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Thu, 29 May 2025 18:15:45 +0100 Subject: mm: ksm: have KSM VMA checks not require a VMA pointer Patch series "mm: ksm: prevent KSM from breaking merging of new VMAs", v3. When KSM-by-default is established using prctl(PR_SET_MEMORY_MERGE), this defaults all newly mapped VMAs to having VM_MERGEABLE set, and thus makes them available to KSM for samepage merging. It also sets VM_MERGEABLE in all existing VMAs. However this causes an issue upon mapping of new VMAs - the initial flags will never have VM_MERGEABLE set when attempting a merge with adjacent VMAs (this is set later in the mmap() logic), and adjacent VMAs will ALWAYS have VM_MERGEABLE set. This renders all newly mapped VMAs unmergeable. To avoid this, this series performs the check for PR_SET_MEMORY_MERGE far earlier in the mmap() logic, prior to the merge being attempted. However we run into complexity with the depreciated .mmap() callback - if a driver hooks this, it might change flags which adjust KSM merge eligibility. We have to worry about this because, while KSM is only applicable to private mappings, this includes both anonymous and MAP_PRIVATE-mapped file-backed mappings. This isn't a problem for brk(), where the VMA must be anonymous. However in mmap() we must be conservative - if the VMA is anonymous then we can always proceed, however if not, we permit only shmem mappings (whose .mmap hook does not affect KSM eligibility) and drivers which implement .mmap_prepare() (invoked prior to the KSM eligibility check). If we can't be sure of the driver changing things, then we maintain the same behaviour of performing the KSM check later in the mmap() logic (and thus losing new VMA mergeability). A great many use-cases for this logic will use anonymous mappings any rate, so this change should already cover the majority of actual KSM use-cases. This patch (of 4): In subsequent commits we are going to determine KSM eligibility prior to a VMA being constructed, at which point we will of course not yet have access to a VMA pointer. It is trivial to boil down the check logic to be parameterised on mm_struct, file and VMA flags, so do so. As a part of this change, additionally expose and use file_is_dax() to determine whether a file is being mapped under a DAX inode. Link: https://lkml.kernel.org/r/cover.1748537921.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/36ad13eb50cdbd8aac6dcfba22c65d5031667295.1748537921.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes Acked-by: David Hildenbrand Reviewed-by: Chengming Zhou Reviewed-by: Vlastimil Babka Reviewed-by: Xu Xin Reviewed-by: Liam R. Howlett Cc: Al Viro Cc: Christian Brauner Cc: Jan Kara Cc: Jann Horn Cc: Stefan Roesch Signed-off-by: Andrew Morton --- include/linux/fs.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'include/linux') diff --git a/include/linux/fs.h b/include/linux/fs.h index 040c0036320f..62634af97da6 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3726,9 +3726,14 @@ void setattr_copy(struct mnt_idmap *, struct inode *inode, extern int file_update_time(struct file *file); +static inline bool file_is_dax(const struct file *file) +{ + return file && IS_DAX(file->f_mapping->host); +} + static inline bool vma_is_dax(const struct vm_area_struct *vma) { - return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host); + return file_is_dax(vma->vm_file); } static inline bool vma_is_fsdax(struct vm_area_struct *vma) -- cgit v1.2.3