summaryrefslogtreecommitdiff
path: root/lib/maple_tree.c
AgeCommit message (Collapse)AuthorFilesLines
2026-04-27MAINTAINERS: update Liam's email addressLiam R. Howlett1-1/+1
Switching to private email address. Update all contact information Add an entry to mailmap at the same time. Link: https://lore.kernel.org/20260422184310.2682901-1-liam@infradead.org Signed-off-by: Liam R. Howlett <liam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: clean up mas_wr_node_store()Liam R. Howlett1-16/+26
The new_end does not need to be passed in as the data is already being checked. This allows for other areas to skip getting the node new_end in the calling function. The type was incorrectly void * instead of void __rcu *, which isn't an issue but is technically incorrect. Move the variable assignment to after the declarations to clean up the initial setup. Ensure there is something to copy before calling memcpy(). Link: https://lkml.kernel.org/r/20260130205935.2559335-31-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: don't pass end to mas_wr_append()Liam R. Howlett1-4/+3
Figure out the end internally. This is necessary for future cleanups. Link: https://lkml.kernel.org/r/20260130205935.2559335-30-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: pass maple copy node to mas_wmb_replace()Liam R. Howlett1-35/+25
mas_wmb_replace() is called in three places with the same setup, move the setup into the function itself. The function needs to be relocated as it calls mtree_range_walk(). Link: https://lkml.kernel.org/r/20260130205935.2559335-29-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: remove maple big node and subtree structsLiam R. Howlett1-1184/+0
Now that no one uses the structures and functions, drop the dead code. Link: https://lkml.kernel.org/r/20260130205935.2559335-28-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: use maple copy node for mas_wr_split()Liam R. Howlett1-6/+93
Instead of using the maple big node, use the maple copy node for reduced stack usage and aligning with mas_wr_rebalance() and mas_wr_spanning_store(). Splitting a node is similar to rebalancing, but a new evaluation of when to ascend is needed. The only other difference is that the data is pushed and never rebalanced at each level. The testing must also align with the changes to this commit to ensure the test suite continues to pass. Link: https://lkml.kernel.org/r/20260130205935.2559335-27-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: add cp_converged() helperLiam R. Howlett1-3/+11
When the maple copy node converges into a single entry, then certain operations can stop ascending the tree. This is used more later. Link: https://lkml.kernel.org/r/20260130205935.2559335-26-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: add copy_tree_location() helperLiam R. Howlett1-6/+12
Extract the copying of the tree location from one maple state to another into its own function. This is used more later. Link: https://lkml.kernel.org/r/20260130205935.2559335-25-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: use maple copy node for mas_wr_rebalance() operationLiam R. Howlett1-7/+206
Stop using the maple big node for rebalance operations by changing to more align with spanning store. The rebalance operation needs its own data calculation in rebalance_data(). In the event of too much data, the rebalance tries to push the data using push_data_sib(). If there is insufficient data, the rebalance operation will rebalance against a sibling (found with rebalance_sib()). The rebalance starts at the leaf and works its way upward in the tree using rebalance_ascend(). Most of the code is shared with spanning store such as the copy node having a new root, but is fundamentally different in that the data must come from a sibling. A parent maple state is used to track the parent location to avoid multiple mas_ascend() calls. The maple state tree location is copied from the parent to the mas (child) in the ascend step. Ascending itself is done in the main loop. Link: https://lkml.kernel.org/r/20260130205935.2559335-23-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: add cp_is_new_root() helperLiam R. Howlett1-32/+38
Add a helper to do what is needed when the maple copy node contains a new root node. This is useful for future commits and is self-documenting code. [Liam.Howlett@oracle.com: remove warnings on older compilers] Link: https://lkml.kernel.org/r/malwmirqnpuxqkqrobcmzfkmmxipoyzwfs2nwc5fbpxlt2r2ej@wchmjtaljvw3 [akpm@linux-foundation.org: s/cp->slot[0]/&cp->slot[0]/, per Liam] Link: https://lkml.kernel.org/r/20260130205935.2559335-22-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: separate wr_split_store and wr_rebalance store type code pathLiam R. Howlett1-24/+23
The split and rebalance store types both go through the same function that uses the big node. Separate the code paths so that each can be updated independently. No functional change intended Link: https://lkml.kernel.org/r/20260130205935.2559335-21-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: remove unnecessary return statementsLiam R. Howlett1-11/+0
Functions do not need to state return at the end, unless skipping unwind. These can safely be dropped. Link: https://lkml.kernel.org/r/20260130205935.2559335-20-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: inline mas_wr_spanning_rebalance()Liam R. Howlett1-23/+15
Now that the spanning rebalance is small, fully inline it in mas_wr_spanning_store(). No functional change. Link: https://lkml.kernel.org/r/20260130205935.2559335-19-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: start using maple copy node for destinationLiam R. Howlett1-181/+443
Stop using the maple subtree state and big node in favour of using three destinations in the maple copy node. That is, expand the way leaves were handled to all levels of the tree and use the maple copy node to track the new nodes. Extract out the sibling init into the data calculation since this is where the insufficient data can be detected. The remainder of the sibling code to shift the next iteration is moved to the spanning_ascend() function, since it is not always needed. Next introduce the dst_setup() function which will decide how many nodes are needed to contain the data at this level. Using the destination count, populate the copy node's dst array with the new nodes and set d_count to the correct value. Note that this can be tricky in the case of a leaf node with exactly enough room because of the rule against NULLs at the end of leaves. Once the destinations are ready, copy the data by altering the cp_data_write() function to copy from the sources to the destinations directly. This eliminates the use of the big node in this code path. On node completion, node_finalise() will zero out the remaining area and set the metadata, if necessary. spanning_ascend() is used to decide if the operation is complete. It may create a new root, converge into one destination, or continue upwards by ascending the left and right write maple states. One test case setup needed to be tweaked so that the targeted node was surrounded by full nodes. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20260130205935.2559335-18-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: add gap support, slot and pivot sizes for maple copyLiam R. Howlett1-0/+5
Add plumbing work for using maple copy as a normal node for a source of copy operations. This is needed later. Link: https://lkml.kernel.org/r/20260130205935.2559335-17-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: introduce ma_leaf_max_gap()Liam R. Howlett1-20/+28
This is the same as mas_leaf_max_gap(), but the information necessary is known without a maple state in future code. Adding this function now simplifies the review for a subsequent patch. Link: https://lkml.kernel.org/r/20260130205935.2559335-16-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: change initial big node setup in mas_wr_spanning_rebalance()Liam R. Howlett1-23/+152
Instead of copying the data into the big node and finding out that the data may need to be moved or appended to, calculate the data space up front (in the maple copy node) and set up another source for the copy. The additional copy source is tracked in the maple state sib (short for sibling), and is put into the maple write states for future operations after the data is in the big node. To facilitate the newly moved node, some initial setup of the maple subtree state are relocated after the potential shift caused by the new way of rebalancing against a sibling. Link: https://lkml.kernel.org/r/20260130205935.2559335-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: inline mas_spanning_rebalance_loop() into ↵Liam R. Howlett1-1/+107
mas_wr_spanning_rebalance() Just copy the code and replace count with height. This is done to avoid affecting other code paths into mas_spanning_rebalance_loop() for the next change. No functional change intended. Link: https://lkml.kernel.org/r/20260130205935.2559335-14-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: introduce maple_copy node and use it in mas_spanning_rebalance()Liam R. Howlett1-9/+131
Introduce an internal-memory only node type called maple_copy to facilitate internal copy operations. Use it in mas_spanning_rebalance() for just the leaf nodes. Initially, the maple_copy node is used to configure the source nodes and copy the data into the big_node. The maple_copy contains a list of source entries with start and end offsets. One of the maple_copy entries can be itself with an offset of 0 to 2, representing the data where the store partially overwrites entries, or fully overwrites the entry. The side effect is that the source nodes no longer have to worry about partially copying the existing offset if it is not fully overwritten. This is in preparation of removal of the maple big_node, but for the time being the data is copied to the big node to limit the change size. Link: https://lkml.kernel.org/r/20260130205935.2559335-12-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: correct right ma_wr_state end pivot in mas_wr_spanning_store()Liam R. Howlett1-0/+1
The end_piv will be needed in the next patch set and has not been set correctly in this code path. Correct the oversight before using it. Link: https://lkml.kernel.org/r/20260130205935.2559335-11-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: move maple_subtree_state from mas_wr_spanning_store to ↵Liam R. Howlett1-20/+21
mas_wr_spanning_rebalance Moving the maple_subtree_state is necessary for future cleanups and is only set up in mas_wr_spanning_rebalance() but never used. Link: https://lkml.kernel.org/r/20260130205935.2559335-10-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: don't pass through height in mas_wr_spanning_storeLiam R. Howlett1-5/+4
Height is not used locally in the function, so call the height argument closer to where it is passed in the next level. Link: https://lkml.kernel.org/r/20260130205935.2559335-9-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: remove l_wr_mas from mas_wr_spanning_rebalanceLiam R. Howlett1-11/+8
Use the wr_mas instead of creating another variable on the stack. Take the opportunity to remove l_mas from being used anywhere but in the maple_subtree_state. Link: https://lkml.kernel.org/r/20260130205935.2559335-8-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: make ma_wr_states reliable for reuse in spanning storeLiam R. Howlett1-0/+2
mas_extend_spanning_null() was not modifying the range min and range max of the resulting store operation. The result was that the maple write state no longer matched what the write was doing. This was not an issue as the values were previously not used, but to make the ma_wr_state usable in future changes, the range min/max stored in the ma_wr_state for left and right need to be consistent with the operation. Link: https://lkml.kernel.org/r/20260130205935.2559335-7-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: inline mas_spanning_rebalance() into mas_wr_spanning_rebalance()Liam R. Howlett1-1/+19
Copy the contents of mas_spanning_rebalance() into mas_wr_spanning_rebalance(), in preparation of removing initial big node use. No functional changes intended. Link: https://lkml.kernel.org/r/20260130205935.2559335-6-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: remove unnecessary assignment of orig_l indexLiam R. Howlett1-1/+1
The index value is already a copy of the maple state so there is no need to set it again. Link: https://lkml.kernel.org/r/20260130205935.2559335-5-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: extract use of big node from mas_wr_spanning_store()Liam R. Howlett1-18/+26
Isolate big node to use in its own function. No functional changes intended. Link: https://lkml.kernel.org/r/20260130205935.2559335-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: move mas_spanning_rebalance loop to functionLiam R. Howlett1-50/+58
Move the loop over the tree levels to its own function. No intended functional changes. Link: https://lkml.kernel.org/r/20260130205935.2559335-3-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: fix mas_dup_alloc() sparse warningLiam R. Howlett1-2/+9
Patch series "maple_tree: Replace big node with maple copy", v3. The big node struct was created for simplicity of splitting, rebalancing, and spanning store operations by using a copy buffer to create the data necessary prior to breaking it up into 256B nodes. Certain operations were rather tricky due to the restriction of keeping NULL entries together and never at the end of a node (except the right-most node). The big node struct is incompatible with future features that are currently in development. Specifically different node types and different data type sizes for pivots. The big node struct was also a stack variable, which caused issues with certain configurations of kernel build. This series removes big node by introducing another node type which will never be written to the tree: maple_copy. The maple copy node operates more like a scatter/gather operation with a number of sources and destinations of allocated nodes. The sources are copied to the destinations, in turn, until the sources are exhausted. The destination is changed if it is filled or the split location is reached prior to the source data end. New data is inserted by using the maple copy node itself as a source with up to 3 slots and pivots. The data in the maple copy node is the data being written to the tree along with any fragment of the range(s) being overwritten. As with all nodes, the maple copy node is of size 256B. Using a node type allows for the copy operation to treat the new data stored in the maple copy node the same as any other source node. Analysis of the runtime shows no regression or benefit of removing the larger stack structure. The motivation is the ground work to use new node types and to help those with odd configurations that have had issues. The change was tested by myself using mm_tests on amd64 and by Suren on android (arm64). Limited testing on s390 qemu was also performed using stress-ng on the virtual memory, which should cover many corner cases. This patch (of 30): Use RCU_INIT_POINTER to initialize an rcu pointer to an initial value since there are no readers within the tree being created during duplication. There is no risk of readers seeing the initialized or uninitialized value until after the synchronization call in mas_dup_buld(). Link: https://lkml.kernel.org/r/20260130205935.2559335-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20260130205935.2559335-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-10maple_tree: fix tracepoint string pointersMartin Kaiser1-14/+16
maple_tree tracepoints contain pointers to function names. Such a pointer is saved when a tracepoint logs an event. There's no guarantee that it's still valid when the event is parsed later and the pointer is dereferenced. The kernel warns about these unsafe pointers. event 'ma_read' has unsafe pointer field 'fn' WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4 Mark the function names as tracepoint_string() to fix the events. One case that doesn't work without my patch would be trace-cmd record to save the binary ringbuffer and trace-cmd report to parse it in userspace. The address of __func__ can't be dereferenced from userspace but tracepoint_string will add an entry to /sys/kernel/tracing/printk_formats Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Martin Kaiser <martin@kaiser.cx> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-10-03Merge tag 'mm-stable-2025-10-01-19-00' of ↵Linus Torvalds1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page*() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...
2025-09-29maple_tree: Convert forking to use the sheaf interfaceLiam R. Howlett1-24/+23
Use the generic interface which should result in less bulk allocations during a forking. A part of this is to abstract the freeing of the sheaf or maple state allocations into its own function so mas_destroy() and the tree duplication code can use the same functionality to return any unused resources. [andriy.shevchenko@linux.intel.com: remove unused mt_alloc_bulk()] Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: Add single node allocation support to maple stateLiam R. Howlett1-6/+41
The fast path through a write will require replacing a single node in the tree. Using a sheaf (32 nodes) is too heavy for the fast path, so special case the node store operation by just allocating one node in the maple state. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: Prefilled sheaf conversion and testingLiam R. Howlett1-264/+62
Use prefilled sheaves instead of bulk allocations. This should speed up the allocations and the return path of unused allocations. Remove the push and pop of nodes from the maple state as this is now handled by the slab layer with sheaves. Testing has been removed as necessary since the features of the tree have been reduced. Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: Replace mt_free_one() with kfree()Pedro Falcato1-9/+4
kfree() is a little shorter and works with kmem_cache_alloc'd pointers too. Also lets us remove one more helper. Signed-off-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: Use kfree_rcu in ma_free_rcuPedro Falcato1-10/+3
kfree_rcu is an optimized version of call_rcu + kfree. It used to not be possible to call it on non-kmalloc objects, but this restriction was lifted ever since SLOB was dropped from the kernel, and since commit 6c6c47b063b5 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()"). Thus, replace call_rcu + mt_free_rcu with kfree_rcu. Signed-off-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: use percpu sheaves for maple_node_cacheVlastimil Babka1-2/+7
Setup the maple_node_cache with percpu sheaves of size 32 to hopefully improve its performance. Note this will not immediately take advantage of sheaf batching of kfree_rcu() operations due to the maple tree using call_rcu with custom callbacks. The followup changes to maple tree will change that and also make use of the prefilled sheaves functionality. Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: Drop bulk insert supportLiam R. Howlett1-266/+4
Bulk insert mode was added to facilitate forking faster, but forking now uses __mt_dup() to duplicate the tree. The addition of sheaves has made the bulk allocations difficult to maintain - since the expected entries would preallocate into the maple state. A big part of the maple state node allocation was the ability to push nodes back onto the state for later use, which was essential to the bulk insert algorithm. Remove mas_expected_entries() and mas_destroy_rebalance() functions as well as the MA_STATE_BULK and MA_STATE_REBALANCE maple state flags since there are no users anymore. Drop the associated testing as well. Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-29maple_tree: remove redundant __GFP_NOWARNQianfeng Rong1-2/+2
Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made GFP_NOWAIT implicitly include __GFP_NOWARN. Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g., `GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these redundant flags across subsystems. No functional changes. Link: https://lkml.kernel.org/r/20250804125657.482109-1-rongqianfeng@vivo.com Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-09-14maple_tree: fix MAPLE_PARENT_RANGE32 and parent pointer docsSidhartha Kumar1-6/+6
MAPLE_PARENT_RANGE32 should be 0x02 as a 32 bit node is indicated by the bit pattern 0b010 which is the hex value 0x02. There are no users currently, so there is no associated bug with this wrong value. Fix typo Note -> Node and replace x with b to indicate binary values. Link: https://lkml.kernel.org/r/20250826151344.403286-1-sidhartha.kumar@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-14maple tree: add some commentsDev Jain1-1/+7
Add comments explaining the fields for maple_metadata, since "end" is ambiguous and "gap" can be confused as the largest gap, whereas it is actually the offset of the largest gap. Add comment for mas_ascend() to explain, whose min and max we are trying to find. Explain that, for example, if we are already on offset zero, then the parent min is mas->min, otherwise we need to walk up to find the implied pivot min. Link: https://lkml.kernel.org/r/20250703063338.51509-1-dev.jain@arm.com Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-13Merge branch 'mm-hotfixes-stable' into mm-stable to pick up changes whichAndrew Morton1-0/+1
are required for a merge of the series "mm: folio_pte_batch() improvements".
2025-07-10maple_tree: fix status setup on restore to activeLiam R. Howlett1-7/+18
During the initial call with a maple state, an error status may be set before a valid node is populated into the maple state node. Subsequent calls with the maple state may restore the state into an active state with no node set. This was masked by the mas_walk() always resetting the status to ma_start and result in an extra walk in this rare scenario. Don't restore the state to active unless there is a value in the structs node. This also allows mas_walk() to be fixed to use the active state without exposing an issue. User visible results are marginal performance improvements when an active state can be restored and used instead of rewalking the tree. Stable is not Cc'ed because the existing code is stable and the performance gains are not worth the risk. Link: https://lore.kernel.org/all/20250611011253.19515-1-richard.weiyang@gmail.com/ Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/ Link: https://lore.kernel.org/all/202506191556.6bfc7b93-lkp@intel.com/ Link: https://lkml.kernel.org/r/20250624154823.52221-1-Liam.Howlett@oracle.com Fixes: a8091f039c1e ("maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Wei Yang <richard.weiyang@gmail.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202506191556.6bfc7b93-lkp@intel.com Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10maple tree: use goto label to simplify codeDev Jain1-5/+2
Use the underflow goto label to set the status to ma_underflow and return NULL, as is being done elsewhere. [akpm@linux-foundation.org: add newline, per Liam (and remove one, per akpm)] Link: https://lkml.kernel.org/r/20250624080748.4855-1-dev.jain@arm.com Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10maple_tree: fix mt_destroy_walk() on root leaf nodeWei Yang1-0/+1
On destroy, we should set each node dead. But current code miss this when the maple tree has only the root node. The reason is mt_destroy_walk() leverage mte_destroy_descend() to set node dead, but this is skipped since the only root node is a leaf. Fixes this by setting the node dead if it is a leaf. Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/ Link: https://lkml.kernel.org/r/20250624191841.64682-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-20maple_tree: fix MA_STATE_PREALLOC flag in mas_preallocate()Liam R. Howlett1-1/+3
Temporarily clear the preallocation flag when explicitly requesting allocations. Pre-existing allocations are already counted against the request through mas_node_count_gfp(), but the allocations will not happen if the MA_STATE_PREALLOC flag is set. This flag is meant to avoid re-allocating in bulk allocation mode, and to detect issues with preallocation calculations. The MA_STATE_PREALLOC flag should also always be set on zero allocations so that detection of underflow allocations will print a WARN_ON() during consumption. User visible effect of this flaw is a WARN_ON() followed by a null pointer dereference when subsequent requests for larger number of nodes is ignored, such as the vma merge retry in mmap_region() caused by drivers altering the vma flags (which happens in v6.6, at least) Link: https://lkml.kernel.org/r/20250616184521.3382795-3-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Reported-by: Hailong Liu <hailong.liu@oppo.com> Link: https://lore.kernel.org/all/1652f7eb-a51b-4fee-8058-c73af63bacd1@oppo.com/ Link: https://lore.kernel.org/all/20250428184058.1416274-1-Liam.Howlett@oracle.com/ Link: https://lore.kernel.org/all/20250429014754.1479118-1-Liam.Howlett@oracle.com/ Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Hailong Liu <hailong.liu@oppo.com> Cc: zhangpeng.00@bytedance.com <zhangpeng.00@bytedance.com> Cc: Steve Kang <Steve.Kang@unisoc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: reorder mas->store_type case statementsSidhartha Kumar1-26/+25
Move the unlikely case that mas->store_type is invalid to be the last evaluated case and put liklier cases higher up. Link: https://lkml.kernel.org/r/20250410191446.2474640-7-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Suggested-by: Liam R. Howlett <liam.howlett@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: add sufficient heightSidhartha Kumar1-3/+16
In order to support rebalancing and spanning stores using less than the worst case number of nodes, we need to track more than just the vacant height. Using only vacant height to reduce the worst case maple node allocation count can lead to a shortcoming of nodes in the following scenarios. For rebalancing writes, when a leaf node becomes insufficient, it may be combined with a sibling into a single node. This means that the parent node which has entries for this children will lose one entry. If this parent node was just meeting the minimum entries, losing one entry will now cause this parent node to be insufficient. This leads to a cascading operation of rebalancing at different levels and can lead to more node allocations than simply using vacant height can return. For spanning writes, a similar situation occurs. At the location at which a spanning write is detected, the number of ancestor nodes may similarly need to rebalanced into a smaller number of nodes and the same cascading situation could occur. To use less than the full height of the tree for the number of allocations, we also need to track the height at which a non-leaf node cannot become insufficient. This means even if a rebalance occurs to a child of this node, it currently has enough entries that it can lose one without any further action. This field is stored in the maple write state as sufficient height. In mas_prealloc_calc() when figuring out how many nodes to allocate, we check if the vacant node is lower in the tree than a sufficient node (has a larger value). If it is, we cannot use the vacant height and must use the difference in the height and sufficient height as the basis for the number of nodes needed. An off by one bug was also discovered in mast_overflow() where it is using >= rather than >. This caused extra iterations of the mas_spanning_rebalance() loop and lead to unneeded allocations. A test is also added to check the number of allocations is correct. Link: https://lkml.kernel.org/r/20250410191446.2474640-6-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: break on convergence in mas_spanning_rebalance()Sidhartha Kumar1-3/+13
This allows support for using the vacant height to calculate the worst case number of nodes needed for wr_rebalance operation. mas_spanning_rebalance() was seen to perform unnecessary node allocations. We can reduce allocations by breaking early during the rebalancing loop once we realize that we have ascended to a common ancestor. Link: https://lkml.kernel.org/r/20250410191446.2474640-5-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Suggested-by: Liam Howlett <liam.howlett@oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12maple_tree: use vacant nodes to reduce worst case allocationsSidhartha Kumar1-4/+9
In order to determine the store type for a maple tree operation, a walk of the tree is done through mas_wr_walk(). This function descends the tree until a spanning write is detected or we reach a leaf node. While descending, keep track of the height at which we encounter a node with available space. This is done by checking if mas->end is less than the number of slots a given node type can fit. Now that the height of the vacant node is tracked, we can use the difference between the height of the tree and the height of the vacant node to know how many levels we will have to propagate creating new nodes. Update mas_prealloc_calc() to consider the vacant height and reduce the number of worst-case allocations. Rebalancing and spanning stores are not supported and fall back to using the full height of the tree for allocations. Update preallocation testing assertions to take into account vacant height. Link: https://lkml.kernel.org/r/20250410191446.2474640-4-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>