diff options
| author | Alexei Starovoitov <ast@kernel.org> | 2025-02-27 20:24:04 +0300 | 
|---|---|---|
| committer | Alexei Starovoitov <ast@kernel.org> | 2025-02-27 20:41:09 +0300 | 
| commit | 93ed6fc268c4cc3f1c2b3718d2beb0aa6d04ddc4 (patch) | |
| tree | e236a453607f51e48c9c3910f57fae94fb5bbe15 /tools/perf/scripts/python/check-perf-trace.py | |
| parent | 2014c95afecee3e76ca4a56956a936e23283f05b (diff) | |
| parent | c9eb8102e21e8c4c6fb74d1921d99e4d43713520 (diff) | |
| download | linux-93ed6fc268c4cc3f1c2b3718d2beb0aa6d04ddc4.tar.xz | |
Merge branch 'bpf-mm-introduce-try_alloc_pages'
Alexei Starovoitov says:
====================
The main motivation is to make alloc page and slab reentrant and
remove bpf_mem_alloc.
v8->v9:
- Squash Vlastimil's fix/feature for localtry_trylock, and
  udpate commit log as suggested by Sebastian.
- Drop _noprof suffix in try_alloc_pages kdoc
- rebase
v8:
https://lore.kernel.org/bpf/20250213033556.9534-1-alexei.starovoitov@gmail.com/
v7->v8:
- rebase: s/free_unref_page/free_frozen_page/
v6->v7:
- Took Sebastian's patch for localtry_lock_t as-is with minor
  addition of local_trylock_acquire() for proper LOCKDEP.
  Kept his authorship.
- Adjusted patch 4 to use it. The rest is unchanged.
v6:
https://lore.kernel.org/bpf/20250124035655.78899-1-alexei.starovoitov@gmail.com/
v5->v6:
- Addressed comments from Sebastian, Vlastimil
- New approach for local_lock_t in patch 3. Instead of unconditionally
  increasing local_lock_t size to 4 bytes introduce local_trylock_t
  and use _Generic() tricks to manipulate active field.
- Address stackdepot reentrance issues. alloc part in patch 1 and
  free part in patch 2.
- Inlined mem_cgroup_cancel_charge() in patch 4 since this helper
  is being removed.
- Added Acks.
- Dropped failslab, kfence, kmemleak patch.
- Improved bpf_map_alloc_pages() in patch 6 a bit to demo intended usage.
  It will be refactored further.
- Considered using __GFP_COMP in try_alloc_pages to simplify
  free_pages_nolock a bit, but then decided to make it work
  for all types of pages, since free_pages_nolock() is used by
  stackdepot and currently it's using non-compound order 2.
  I felt it's best to leave it as-is and make free_pages_nolock()
  support all pages.
v5:
https://lore.kernel.org/all/20250115021746.34691-1-alexei.starovoitov@gmail.com/
v4->v5:
- Fixed patch 1 and 4 commit logs and comments per Michal suggestions.
  Added Acks.
- Added patch 6 to make failslab, kfence, kmemleak complaint
  with trylock mode. It's a prerequisite for reentrant slab patches.
v4:
https://lore.kernel.org/bpf/20250114021922.92609-1-alexei.starovoitov@gmail.com/
v3->v4:
Addressed feedback from Michal and Shakeel:
- GFP_TRYLOCK flag is gone. gfpflags_allow_spinning() is used instead.
- Improved comments and commit logs.
v3:
https://lore.kernel.org/bpf/20241218030720.1602449-1-alexei.starovoitov@gmail.com/
v2->v3:
To address the issues spotted by Sebastian, Vlastimil, Steven:
- Made GFP_TRYLOCK internal to mm/internal.h
  try_alloc_pages() and free_pages_nolock() are the only interfaces.
- Since spin_trylock() is not safe in RT from hard IRQ and NMI
  disable such usage in lock_trylock and in try_alloc_pages().
  In such case free_pages_nolock() falls back to llist right away.
- Process trylock_free_pages llist when preemptible.
- Check for things like unaccepted memory and order <= 3 early.
- Don't call into __alloc_pages_slowpath() at all.
- Inspired by Vlastimil's struct local_tryirq_lock adopted it in
  local_lock_t. Extra 4 bytes in !RT in local_lock_t shouldn't
  affect any of the current local_lock_t users. This is patch 3.
- Tested with bpf selftests in RT and !RT and realized how much
  more work is necessary on bpf side to play nice with RT.
  The urgency of this work got higher. The alternative is to
  convert bpf bits left and right to bpf_mem_alloc.
v2:
https://lore.kernel.org/bpf/20241210023936.46871-1-alexei.starovoitov@gmail.com/
v1->v2:
- fixed buggy try_alloc_pages_noprof() in PREEMPT_RT. Thanks Peter.
- optimize all paths by doing spin_trylock_irqsave() first
  and only then check for gfp_flags & __GFP_TRYLOCK.
  Then spin_lock_irqsave() if it's a regular mode.
  So new gfp flag will not add performance overhead.
- patches 2-5 are new. They introduce lockless and/or trylock free_pages_nolock()
  and memcg support. So it's in usable shape for bpf in patch 6.
v1:
https://lore.kernel.org/bpf/20241116014854.55141-1-alexei.starovoitov@gmail.com/
====================
Link: https://patch.msgid.link/20250222024427.30294-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'tools/perf/scripts/python/check-perf-trace.py')
0 files changed, 0 insertions, 0 deletions
