<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/topology.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-09-03T08:03:12+00:00</updated>
<entry>
<title>sched/fair: Get rid of sched_domains_curr_level hack for tl-&gt;cpumask()</title>
<updated>2025-09-03T08:03:12+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-08-25T12:02:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=661f951e371cc134ea31c84238dbdc9a898b8403'/>
<id>urn:sha1:661f951e371cc134ea31c84238dbdc9a898b8403</id>
<content type='text'>
Leon [1] and Vinicius [2] noted a topology_span_sane() warning during
their testing starting from v6.16-rc1. Debug that followed pointed to
the tl-&gt;mask() for the NODE domain being incorrectly resolved to that of
the highest NUMA domain.

tl-&gt;mask() for NODE is set to the sd_numa_mask() which depends on the
global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
set to the "tl-&gt;numa_level" during tl traversal in build_sched_domains()
calling sd_init() but was not reset before topology_span_sane().

Since "tl-&gt;numa_level" still reflected the old value from
build_sched_domains(), topology_span_sane() for the NODE domain trips
when the span of the last NUMA domain overlaps.

Instead of replicating the "sched_domains_curr_level" hack, get rid of
it entirely and instead, pass the entire "sched_domain_topology_level"
object to tl-&gt;cpumask() function to prevent such mishap in the future.

sd_numa_mask() now directly references "tl-&gt;numa_level" instead of
relying on the global "sched_domains_curr_level" hack to index into
sched_domains_numa_masks[].

The original warning was reproducible on the following NUMA topology
reported by Leon:

    $ sudo numactl -H
    available: 5 nodes (0-4)
    node 0 cpus: 0 1
    node 0 size: 2927 MB
    node 0 free: 1603 MB
    node 1 cpus: 2 3
    node 1 size: 3023 MB
    node 1 free: 3008 MB
    node 2 cpus: 4 5
    node 2 size: 3023 MB
    node 2 free: 3007 MB
    node 3 cpus: 6 7
    node 3 size: 3023 MB
    node 3 free: 3002 MB
    node 4 cpus: 8 9
    node 4 size: 3022 MB
    node 4 free: 2718 MB
    node distances:
    node   0   1   2   3   4
      0:  10  39  38  37  36
      1:  39  10  38  37  36
      2:  38  38  10  37  36
      3:  37  37  37  10  36
      4:  36  36  36  36  10

The above topology can be mimicked using the following QEMU cmd that was
used to reproduce the warning and test the fix:

     sudo qemu-system-x86_64 -enable-kvm -cpu host \
     -m 20G -smp cpus=10,sockets=10 -machine q35 \
     -object memory-backend-ram,size=4G,id=m0 \
     -object memory-backend-ram,size=4G,id=m1 \
     -object memory-backend-ram,size=4G,id=m2 \
     -object memory-backend-ram,size=4G,id=m3 \
     -object memory-backend-ram,size=4G,id=m4 \
     -numa node,cpus=0-1,memdev=m0,nodeid=0 \
     -numa node,cpus=2-3,memdev=m1,nodeid=1 \
     -numa node,cpus=4-5,memdev=m2,nodeid=2 \
     -numa node,cpus=6-7,memdev=m3,nodeid=3 \
     -numa node,cpus=8-9,memdev=m4,nodeid=4 \
     -numa dist,src=0,dst=1,val=39 \
     -numa dist,src=0,dst=2,val=38 \
     -numa dist,src=0,dst=3,val=37 \
     -numa dist,src=0,dst=4,val=36 \
     -numa dist,src=1,dst=0,val=39 \
     -numa dist,src=1,dst=2,val=38 \
     -numa dist,src=1,dst=3,val=37 \
     -numa dist,src=1,dst=4,val=36 \
     -numa dist,src=2,dst=0,val=38 \
     -numa dist,src=2,dst=1,val=38 \
     -numa dist,src=2,dst=3,val=37 \
     -numa dist,src=2,dst=4,val=36 \
     -numa dist,src=3,dst=0,val=37 \
     -numa dist,src=3,dst=1,val=37 \
     -numa dist,src=3,dst=2,val=37 \
     -numa dist,src=3,dst=4,val=36 \
     -numa dist,src=4,dst=0,val=36 \
     -numa dist,src=4,dst=1,val=36 \
     -numa dist,src=4,dst=2,val=36 \
     -numa dist,src=4,dst=3,val=36 \
     ...

  [ prateek: Moved common functions to include/linux/sched/topology.h,
    reuse the common bits for s390 and ppc, commit message ]

Closes: https://lore.kernel.org/lkml/20250610110701.GA256154@unreal/ [1]
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") # ce29a7da84cd, f55dac1dafb3
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reported-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt;
Tested-by: Valentin Schneider &lt;vschneid@redhat.com&gt; # x86
Tested-by: Shrikanth Hegde &lt;sshegde@linux.ibm.com&gt; # powerpc
Link: https://lore.kernel.org/lkml/a3de98387abad28592e6ab591f3ff6107fe01dc1.1755893468.git.tim.c.chen@linux.intel.com/ [2]
</content>
</entry>
<entry>
<title>Merge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux</title>
<updated>2025-06-03T14:39:23+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-06-03T14:39:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8b2198f03776c5c25f0cafe0ba5c0c60807b554b'/>
<id>urn:sha1:8b2198f03776c5c25f0cafe0ba5c0c60807b554b</id>
<content type='text'>
Pull bitmap updates from Yury Norov:

 - dead code cleanups for cpumasks and nodemasks (me)

 - fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me)

 - FIELD_MODIFY() helper (Luo)

 - for_each_node_with_cpus() optimization (me)

 - bitmap-str fixes (Andy)

* tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux:
  topology: make for_each_node_with_cpus() O(N)
  bitfield: Add FIELD_MODIFY() helper
  bitmap-str: Add missing header(s)
  bitmap-str: Get rid of 'extern' for function prototypes
  build_bug.h: more user friendly error messages in BUILD_BUG_ON_ZERO()
  test_bits: add tests for BIT_U*()
  test_bits: add tests for GENMASK_U*()
  drm/i915: Convert REG_GENMASK*() to fixed-width GENMASK_U*()
  bits: introduce fixed-type BIT_U*()
  bits: introduce fixed-type GENMASK_U*()
  bits: add comments and newlines to #if, #else and #endif directives
  cpumask: drop cpumask_assign_cpu()
  riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu()
  cpumask: add non-atomic __assign_cpu()
  nodemask: drop nodes_shift
</content>
</entry>
<entry>
<title>topology: make for_each_node_with_cpus() O(N)</title>
<updated>2025-05-13T15:40:04+00:00</updated>
<author>
<name>Yury Norov [NVIDIA]</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2025-05-09T16:20:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=895ee6a22e3195b7c1fee140c842bdeedb89ed33'/>
<id>urn:sha1:895ee6a22e3195b7c1fee140c842bdeedb89ed33</id>
<content type='text'>
for_each_node_with_cpus() calls nr_cpus_node() at every iteration, which
makes it O(N^2). Kernel tracks such nodes with N_CPU record in node_states
array. Switching to it makes for_each_node_with_cpus() O(N).

Andrea:

Now we can include also offline nodes with CPUs assigned (assuming it's
possible). If checking the online state is required, the user must use
node_online() within the loop.

CC: Andrea Righi &lt;arighi@nvidia.com&gt;
CC:Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Yury Norov [NVIDIA] &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>arch_topology: Relocate cpu_scale to topology.[h|c]</title>
<updated>2025-05-07T19:56:55+00:00</updated>
<author>
<name>Ricardo Neri</name>
<email>ricardo.neri-calderon@linux.intel.com</email>
</author>
<published>2025-04-19T02:55:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6bceea7a1e076ef9d71b20d8dda2f7dc52bd34d2'/>
<id>urn:sha1:6bceea7a1e076ef9d71b20d8dda2f7dc52bd34d2</id>
<content type='text'>
arch_topology.c provides functionality to parse and scale CPU capacity.
It also provides a corresponding sysfs interface. Some architectures
parse and scale CPU capacity differently as per their own needs. On
Intel processors, for instance, it is responsibility of the Intel
P-state driver.

Relocate the implementation of that interface to a common location in
topology.c. Architectures can use the interface and populate it using
their own mechanisms.

An alternative approach would be to compile arch_topology.c even if
not needed only to get this interface. This approach would create
duplicated and conflicting functionality and data structures.

Signed-off-by: Ricardo Neri &lt;ricardo.neri-calderon@linux.intel.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/20250419025504.9760-2-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2025-03-25T20:16:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-03-25T20:16:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d09a9449ecd9a2b9fdac62408c12ee20b6307d2'/>
<id>urn:sha1:2d09a9449ecd9a2b9fdac62408c12ee20b6307d2</id>
<content type='text'>
Pull arm64 updates from Catalin Marinas:
 "Nothing major this time around.

  Apart from the usual perf/PMU updates, some page table cleanups, the
  notable features are average CPU frequency based on the AMUv1
  counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
  the uaccess routines.

  Perf and PMUs:

   - Support for the 'Rainier' CPU PMU from Arm

   - Preparatory driver changes and cleanups that pave the way for BRBE
     support

   - Support for partial virtualisation of the Apple-M1 PMU

   - Support for the second event filter in Arm CSPMU designs

   - Minor fixes and cleanups (CMN and DWC PMUs)

   - Enable EL2 requirements for FEAT_PMUv3p9

  Power, CPU topology:

   - Support for AMUv1-based average CPU frequency

   - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
     adds a generic topology_is_primary_thread() function overridden by
     x86 and powerpc

  New(ish) features:

   - MOPS (memcpy/memset) support for the uaccess routines

  Security/confidential compute:

   - Fix the DMA address for devices used in Realms with Arm CCA. The
     CCA architecture uses the address bit to differentiate between
     shared and private addresses

   - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
     default

  Memory management clean-ups:

   - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs

   - Some minor page table accessor clean-ups

   - PIE/POE (permission indirection/overlay) helpers clean-up

  Kselftests:

   - MTE: skip hugetlb tests if MTE is not supported on such mappings
     and user correct naming for sync/async tag checking modes

  Miscellaneous:

   - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
     request)

   - Sysreg updates for new register fields

   - CPU type info for some Qualcomm Kryo cores"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
  arm64: mm: Don't use %pK through printk
  perf/arm_cspmu: Fix missing io.h include
  arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
  arm64: cputype: Add MIDR_CORTEX_A76AE
  arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
  arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
  arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
  arm64/sysreg: Enforce whole word match for open/close tokens
  arm64/sysreg: Fix unbalanced closing block
  arm64: Kconfig: Enable HOTPLUG_SMT
  arm64: topology: Support SMT control on ACPI based system
  arch_topology: Support SMT control for OF based system
  cpu/SMT: Provide a default topology_is_primary_thread()
  arm64/mm: Define PTDESC_ORDER
  perf/arm_cspmu: Add PMEVFILT2R support
  perf/arm_cspmu: Generalise event filtering
  perf/arm_cspmu: Move register definitons to header
  arm64/kernel: Always use level 2 or higher for early mappings
  arm64/mm: Drop PXD_TABLE_BIT
  arm64/mm: Check pmd_table() in pmd_trans_huge()
  ...
</content>
</entry>
<entry>
<title>cpu/SMT: Provide a default topology_is_primary_thread()</title>
<updated>2025-03-14T17:31:02+00:00</updated>
<author>
<name>Yicong Yang</name>
<email>yangyicong@hisilicon.com</email>
</author>
<published>2025-03-11T07:51:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b455f59945aab5610828a1320b045c82cbe8852'/>
<id>urn:sha1:4b455f59945aab5610828a1320b045c82cbe8852</id>
<content type='text'>
Currently if architectures want to support HOTPLUG_SMT they need to
provide a topology_is_primary_thread() telling the framework which
thread in the SMT cannot offline. However arm64 doesn't have a
restriction on which thread in the SMT cannot offline, a simplest
choice is that just make 1st thread as the "primary" thread. So
just make this as the default implementation in the framework and
let architectures like x86 that have special primary thread to
override this function (which they've already done).

There's no need to provide a stub function if !CONFIG_SMP or
!CONFIG_HOTPLUG_SMT. In such case the testing CPU is already
the 1st CPU in the SMT so it's always the primary thread.

Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20250311075143.61078-2-yangyicong@huawei.com
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
</content>
</entry>
<entry>
<title>sched/topology: Introduce for_each_node_numadist() iterator</title>
<updated>2025-02-16T16:52:20+00:00</updated>
<author>
<name>Andrea Righi</name>
<email>arighi@nvidia.com</email>
</author>
<published>2025-02-14T19:40:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f09177ca5f242a32368a2e9414dce4c90dc1d405'/>
<id>urn:sha1:f09177ca5f242a32368a2e9414dce4c90dc1d405</id>
<content type='text'>
Introduce the new helper for_each_node_numadist() to iterate over node
IDs in order of increasing NUMA distance from a given starting node.

This iterator is somehow similar to for_each_numa_hop_mask(), but
instead of providing a cpumask at each iteration, it provides a node ID.

Example usage:

  nodemask_t unvisited = NODE_MASK_ALL;
  int node, start = cpu_to_node(smp_processor_id());

  node = start;
  for_each_node_numadist(node, unvisited)
  	pr_info("node (%d, %d) -&gt; %d\n",
  		 start, node, node_distance(start, node));

On a system with equidistant nodes:

 $ numactl -H
 ...
 node distances:
 node     0    1    2    3
    0:   10   20   20   20
    1:   20   10   20   20
    2:   20   20   10   20
    3:   20   20   20   10

Output of the example above (on node 0):

[    7.367022] node (0, 0) -&gt; 10
[    7.367151] node (0, 1) -&gt; 20
[    7.367186] node (0, 2) -&gt; 20
[    7.367247] node (0, 3) -&gt; 20

On a system with non-equidistant nodes (simulated using virtme-ng):

 $ numactl -H
 ...
 node distances:
 node     0    1    2    3
    0:   10   51   31   41
    1:   51   10   21   61
    2:   31   21   10   11
    3:   41   61   11   10

Output of the example above (on node 0):

 [    8.953644] node (0, 0) -&gt; 10
 [    8.953712] node (0, 2) -&gt; 31
 [    8.953764] node (0, 3) -&gt; 41
 [    8.953817] node (0, 1) -&gt; 51

Suggested-by: Yury Norov [NVIDIA] &lt;yury.norov@gmail.com&gt;
Signed-off-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Acked-by: Yury Norov [NVIDIA] &lt;yury.norov@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/topology: Fix sched_numa_find_nth_cpu() in non-NUMA case</title>
<updated>2023-09-15T11:48:10+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2023-08-19T14:12:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8ab63d418d4339d996f80d02a00dbce0aa3ff972'/>
<id>urn:sha1:8ab63d418d4339d996f80d02a00dbce0aa3ff972</id>
<content type='text'>
When CONFIG_NUMA is enabled, sched_numa_find_nth_cpu() searches for a
CPU in sched_domains_numa_masks. The masks includes only online CPUs,
so effectively offline CPUs are skipped.

When CONFIG_NUMA is disabled, the fallback function should be consistent.

Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()")
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Link: https://lore.kernel.org/r/20230819141239.287290-5-yury.norov@gmail.com
</content>
</entry>
<entry>
<title>sched/topology: Introduce for_each_numa_hop_mask()</title>
<updated>2023-02-08T02:20:00+00:00</updated>
<author>
<name>Valentin Schneider</name>
<email>vschneid@redhat.com</email>
</author>
<published>2023-01-21T04:24:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=06ac01721f7d07da722abe0ec6f147b90bfc8c77'/>
<id>urn:sha1:06ac01721f7d07da722abe0ec6f147b90bfc8c77</id>
<content type='text'>
The recently introduced sched_numa_hop_mask() exposes cpumasks of CPUs
reachable within a given distance budget, wrap the logic for iterating over
all (distance, mask) values inside an iterator macro.

Signed-off-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/topology: Introduce sched_numa_hop_mask()</title>
<updated>2023-02-08T02:20:00+00:00</updated>
<author>
<name>Valentin Schneider</name>
<email>vschneid@redhat.com</email>
</author>
<published>2023-01-21T04:24:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9feae65845f7b16376716fe70b7d4b9bf8721848'/>
<id>urn:sha1:9feae65845f7b16376716fe70b7d4b9bf8721848</id>
<content type='text'>
Tariq has pointed out that drivers allocating IRQ vectors would benefit
from having smarter NUMA-awareness - cpumask_local_spread() only knows
about the local node and everything outside is in the same bucket.

sched_domains_numa_masks is pretty much what we want to hand out (a cpumask
of CPUs reachable within a given distance budget), introduce
sched_numa_hop_mask() to export those cpumasks.

Link: http://lore.kernel.org/r/20220728191203.4055-1-tariqt@nvidia.com
Signed-off-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
