From 0fa5bc4023c188082024833b3deffd5543b93bc9 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Wed, 24 Feb 2021 12:07:12 -0800 Subject: mm/hugetlb: grab head page refcount once for group of subpages Patch series "mm/hugetlb: follow_hugetlb_page() improvements", v2. While looking at ZONE_DEVICE struct page reuse particularly the last patch[0], I found two possible improvements for follow_hugetlb_page() which is solely used for get_user_pages()/pin_user_pages(). The first patch batches page refcount updates while the second tidies up storing the subpages/vmas. Both together bring the cost of slow variant of gup() cost from ~87.6k usecs to ~5.8k usecs. libhugetlbfs tests seem to pass as well gup_test benchmarks with hugetlbfs vmas. This patch (of 2): follow_hugetlb_page() once it locks the pmd/pud, checks all its N subpages in a huge page and grabs a reference for each one. Similar to gup-fast, have follow_hugetlb_page() grab the head page refcount only after counting all its subpages that are part of the just faulted huge page. Consequently we reduce the number of atomics necessary to pin said huge page, which improves non-fast gup() considerably: - 16G with 1G huge page size gup_test -f /mnt/huge/file -m 16384 -r 10 -L -S -n 512 -w PIN_LONGTERM_BENCHMARK: ~87.6k us -> ~12.8k us Link: https://lkml.kernel.org/r/20210128182632.24562-1-joao.m.martins@oracle.com Link: https://lkml.kernel.org/r/20210128182632.24562-2-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Mike Kravetz Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/hugetlb.c | 43 ++++++++++++++++++++++++------------------- 1 file changed, 24 insertions(+), 19 deletions(-) (limited to 'mm/hugetlb.c') diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6920c71d7e0d..d4fc5db6bc32 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4796,7 +4796,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long vaddr = *position; unsigned long remainder = *nr_pages; struct hstate *h = hstate_vma(vma); - int err = -EFAULT; + int err = -EFAULT, refs; while (vaddr < vma->vm_end && remainder) { pte_t *pte; @@ -4916,26 +4916,11 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, continue; } + refs = 0; + same_page: - if (pages) { + if (pages) pages[i] = mem_map_offset(page, pfn_offset); - /* - * try_grab_page() should always succeed here, because: - * a) we hold the ptl lock, and b) we've just checked - * that the huge page is present in the page tables. If - * the huge page is present, then the tail pages must - * also be present. The ptl prevents the head page and - * tail pages from being rearranged in any way. So this - * page must be available at this point, unless the page - * refcount overflowed: - */ - if (WARN_ON_ONCE(!try_grab_page(pages[i], flags))) { - spin_unlock(ptl); - remainder = 0; - err = -ENOMEM; - break; - } - } if (vmas) vmas[i] = vma; @@ -4944,6 +4929,7 @@ same_page: ++pfn_offset; --remainder; ++i; + ++refs; if (vaddr < vma->vm_end && remainder && pfn_offset < pages_per_huge_page(h)) { /* @@ -4951,6 +4937,25 @@ same_page: * of this compound page. */ goto same_page; + } else if (pages) { + /* + * try_grab_compound_head() should always succeed here, + * because: a) we hold the ptl lock, and b) we've just + * checked that the huge page is present in the page + * tables. If the huge page is present, then the tail + * pages must also be present. The ptl prevents the + * head page and tail pages from being rearranged in + * any way. So this page must be available at this + * point, unless the page refcount overflowed: + */ + if (WARN_ON_ONCE(!try_grab_compound_head(pages[i-1], + refs, + flags))) { + spin_unlock(ptl); + remainder = 0; + err = -ENOMEM; + break; + } } spin_unlock(ptl); } -- cgit v1.2.3