path: root/tools/perf/scripts/python/arm-cs-trace-disasm.py
author Florian Westphal <fw@strlen.de> 2024-02-13 18:23:40 +0300
committer Florian Westphal <fw@strlen.de> 2024-02-21 13:57:11 +0300
commit 9f439bd6ef4f60c8a37bd9138fa9ed9b5e7ae0d7 (patch)
tree c9bfc5424d8c35cc020e23122bc0e0f73565d995 /tools/perf/scripts/python/arm-cs-trace-disasm.py
parent aac14d516c2b575af20b426fa04129a28d45c287 (diff)
download linux-9f439bd6ef4f60c8a37bd9138fa9ed9b5e7ae0d7.tar.xz
netfilter: nft_set_pipapo: speed up bulk element insertions

Insertions into the set are slow when we try to add many elements.
For 800k elements I get:

time nft -f pipapo_800k
real    19m34.849s
user    0m2.390s
sys     19m12.828s

perf stats:
 --95.39%--nft_pipapo_insert
     |--76.60%--pipapo_insert
     |           --76.37%--pipapo_resize
     |                     |--72.87%--memcpy_orig
     |                     |--1.88%--__free_pages_ok
     |                     |          --0.89%--free_tail_page_prepare
     |                      --1.38%--kvmalloc_node
     ..
     --18.56%--pipapo_get.isra.0
     |--13.91%--__bitmap_and
     |--3.01%--pipapo_refill
     |--0.81%--__kmalloc
     |           --0.74%--__kmalloc_large_node
     |                      --0.66%--__alloc_pages
     ..
     --0.52%--memset_orig

So lots of time is spent copying existing elements to make space for
the next one.

Instead of allocating to the exact size of the new rule count, allocate
extra slack to reduce alloc/copy/free overhead.

After:
time nft -f pipapo_800k
real    1m54.110s
user    0m2.515s
sys     1m51.377s

 --80.46%--nft_pipapo_insert
     |--73.45%--pipapo_get.isra.0
     |--57.63%--__bitmap_and
     |          |--8.52%--pipapo_refill
     |--3.45%--__kmalloc
     |           --3.05%--__kmalloc_large_node
     |                      --2.58%--__alloc_pages
     --2.59%--memset_orig
     |--6.51%--pipapo_insert
     --5.96%--pipapo_resize
     |--3.63%--memcpy_orig
     --2.13%--kvmalloc_node

The new @rules_alloc fills a hole, so struct size doesn't go up.

Also make it so rule removal doesn't shrink unless the free/extra space
exceeds two pages. This should be safe as well: when a rule gets
removed, the attempt to lower the allocated size is already allowed to
fail.

Exception: do exact allocations as long as the set is very small (less
than one page needed).

v2: address comments from Stefano:
    kdoc comment formatting changes
    remove redundant assignment
    switch back to PAGE_SIZE

Link: https://lore.kernel.org/netfilter-devel/20240213141753.17ef27a6@elisabeth/
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
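
The sizing policy described above can be illustrated with a small userspace sketch. It is not the kernel code from this patch: the function names, the fixed 4096-byte page size and the choice of one page worth of slack are all assumptions made for illustration. The commit message only states that extra slack is allocated, that very small sets (under one page) get exact allocations, and that removal does not shrink unless the spare space exceeds two pages.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel uses PAGE_SIZE, which varies by arch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: how many rule slots to allocate when the set
 * needs room for rules_needed rules of rule_size bytes each.
 * Small sets (less than one page) are allocated exactly; larger sets
 * get one page worth of extra slots (the slack amount is an assumption).
 */
static size_t sketch_rules_alloc(size_t rules_needed, size_t rule_size)
{
	if (rule_size == 0)
		return rules_needed;

	if (rules_needed * rule_size < SKETCH_PAGE_SIZE)
		return rules_needed;	/* very small set: exact allocation */

	return rules_needed + SKETCH_PAGE_SIZE / rule_size;
}

/*
 * Hypothetical helper: after a removal, shrink only if the unused
 * space exceeds two pages. Returns 1 to shrink, 0 to keep the buffer.
 */
static int sketch_should_shrink(size_t rules_alloc, size_t rules_needed,
				size_t rule_size)
{
	if (rules_alloc <= rules_needed)
		return 0;

	return (rules_alloc - rules_needed) * rule_size > 2 * SKETCH_PAGE_SIZE;
}
```

With slack in place, most insertions find a free slot in the existing buffer, so the allocate, copy and free cycle runs once per batch of insertions rather than once per element. That matches the drop in memcpy_orig time between the two profiles.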
Diffstat (limited to 'tools/perf/scripts/python/arm-cs-trace-disasm.py')
0 files changed, 0 insertions, 0 deletions