<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/arch_topology.h, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-11-06T20:31:36+00:00</updated>
<entry>
<title>ACPI: processor: Move arch_init_invariance_cppc() call later</title>
<updated>2024-11-06T20:31:36+00:00</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2024-11-04T22:28:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b79276dcac9124a79c8cf7cc8fbdd3d4c3c9a7c7'/>
<id>urn:sha1:b79276dcac9124a79c8cf7cc8fbdd3d4c3c9a7c7</id>
<content type='text'>
arch_init_invariance_cppc() is called at the end of
acpi_cppc_processor_probe() in order to configure frequency invariance
based upon the values from _CPC.

This however doesn't work on AMD CPPC shared memory designs that have
AMD preferred cores enabled because _CPC needs to be analyzed from all
cores to judge if preferred cores are enabled.

This issue manifests to users as a warning since commit 21fb59ab4b97
("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"):
```
Could not retrieve highest performance (-19)
```

However the warning isn't the cause of this, it was actually
commit 279f838a61f9 ("x86/amd: Detect preferred cores in
amd_get_boost_ratio_numerator()") which exposed the issue.

To fix this problem, change arch_init_invariance_cppc() into a new weak
symbol that is called at the end of acpi_processor_driver_init().
Each architecture that supports it can declare the symbol to override
the weak one.

Define it for x86, in arch/x86/kernel/acpi/cppc.c, and for all of the
architectures using the generic arch_topology.c code.

Fixes: 279f838a61f9 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Reported-by: Ivan Shapovalov &lt;intelfx@intelfx.name&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219431
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Link: https://patch.msgid.link/20241104222855.3959267-1-superm1@kernel.org
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>sched/cpufreq: Rename arch_update_thermal_pressure() =&gt; arch_update_hw_pressure()</title>
<updated>2024-04-24T10:08:01+00:00</updated>
<author>
<name>Vincent Guittot</name>
<email>vincent.guittot@linaro.org</email>
</author>
<published>2024-03-26T09:16:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d4dbc991714eefcbd8d54a3204bd77a0a52bd32d'/>
<id>urn:sha1:d4dbc991714eefcbd8d54a3204bd77a0a52bd32d</id>
<content type='text'>
Now that cpufreq provides a pressure value to the scheduler, rename
arch_update_thermal_pressure into HW pressure to reflect that it returns
a pressure applied by HW (i.e. with a high frequency change) and not
always related to thermal mitigation but also generated by max current
limitation as an example. Such high frequency signal needs filtering to be
smoothed and provide an value that reflects the average available capacity
into the scheduler time scale.

Signed-off-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Reviewed-by: Qais Yousef &lt;qyousef@layalina.io&gt;
Reviewed-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Link: https://lore.kernel.org/r/20240326091616.3696851-5-vincent.guittot@linaro.org
</content>
</entry>
<entry>
<title>arm64/amu: Use capacity_ref_freq() to set AMU ratio</title>
<updated>2023-12-23T14:52:36+00:00</updated>
<author>
<name>Vincent Guittot</name>
<email>vincent.guittot@linaro.org</email>
</author>
<published>2023-12-11T10:48:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f023007f5e782bda19ad9104830c404fd622c5d'/>
<id>urn:sha1:1f023007f5e782bda19ad9104830c404fd622c5d</id>
<content type='text'>
Use the new capacity_ref_freq() method to set the ratio that is used by AMU for
computing the arch_scale_freq_capacity().
This helps to keep everything aligned using the same reference for
computing CPUs capacity.

The default value of the ratio (stored in per_cpu(arch_max_freq_scale))
ensures that arch_scale_freq_capacity() returns max capacity until it is
set to its correct value with the cpu capacity and capacity_ref_freq().

Signed-off-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://lore.kernel.org/r/20231211104855.558096-8-vincent.guittot@linaro.org
</content>
</entry>
<entry>
<title>sched/topology: Add a new arch_scale_freq_ref() method</title>
<updated>2023-12-23T14:52:34+00:00</updated>
<author>
<name>Vincent Guittot</name>
<email>vincent.guittot@linaro.org</email>
</author>
<published>2023-12-11T10:48:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9942cb22ea458c34fa17b73d143ea32d4df1caca'/>
<id>urn:sha1:9942cb22ea458c34fa17b73d143ea32d4df1caca</id>
<content type='text'>
Create a new method to get a unique and fixed max frequency. Currently
cpuinfo.max_freq or the highest (or last) state of performance domain are
used as the max frequency when computing the frequency for a level of
utilization, but:

  - cpuinfo_max_freq can change at runtime. boost is one example of
    such change.

  - cpuinfo.max_freq and last item of the PD can be different leading to
    different results between cpufreq and energy model.

We need to save the reference frequency that has been used when computing
the CPUs capacity and use this fixed and coherent value to convert between
frequency and CPU's capacity.

In fact, we already save the frequency that has been used when computing
the capacity of each CPU. We extend the precision to save kHz instead of
MHz currently and we modify the type to be aligned with other variables
used when converting frequency to capacity and the other way.

[ mingo: Minor edits. ]

Signed-off-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Reviewed-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Acked-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20231211104855.558096-2-vincent.guittot@linaro.org
</content>
</entry>
<entry>
<title>arch_topology: Drop LLC identifier stash from the CPU topology</title>
<updated>2022-07-04T15:22:29+00:00</updated>
<author>
<name>Sudeep Holla</name>
<email>sudeep.holla@arm.com</email>
</author>
<published>2022-07-04T10:15:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5b8dc787ce4a45c87254d1b0b22f161347ab7f81'/>
<id>urn:sha1:5b8dc787ce4a45c87254d1b0b22f161347ab7f81</id>
<content type='text'>
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology
information.

Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.com
Tested-by: Ionela Voinescu &lt;ionela.voinescu@arm.com&gt;
Tested-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Reviewed-by: Gavin Shan &lt;gshan@redhat.com&gt;
Signed-off-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
</content>
</entry>
<entry>
<title>arch_topology: obtain cpu capacity using information from CPPC</title>
<updated>2022-03-10T19:21:58+00:00</updated>
<author>
<name>Ionela Voinescu</name>
<email>ionela.voinescu@arm.com</email>
</author>
<published>2022-03-10T14:54:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9924fbb51e0ae30b8d7eec7c1c839d74da9678b3'/>
<id>urn:sha1:9924fbb51e0ae30b8d7eec7c1c839d74da9678b3</id>
<content type='text'>
Define topology_init_cpu_capacity_cppc() to use highest performance
values from _CPC objects to obtain and set maximum capacity information
for each CPU. acpi_cppc_processor_probe() is a good point at which to
trigger the initialization of CPU (u-arch) capacity values, as at this
point the highest performance values can be obtained from each CPU's
_CPC objects. Architectures can therefore use this functionality
through arch_init_invariance_cppc().

The performance scale used by CPPC is a unified scale for all CPUs in
the system. Therefore, by obtaining the raw highest performance values
from the _CPC objects, and normalizing them on the [0, 1024] capacity
scale, used by the task scheduler, we obtain the CPU capacity of each
CPU.

While an ACPI Notify(0x85) could alert about a change in the highest
performance value, which should in turn retrigger the CPU capacity
computations, this notification is not currently handled by the ACPI
processor driver. When supported, a call to arch_init_invariance_cppc()
would perform the update.

Signed-off-by: Ionela Voinescu &lt;ionela.voinescu@arm.com&gt;
Acked-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Tested-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>arch_topology: Remove unused topology_set_thermal_pressure() and related</title>
<updated>2021-11-23T09:40:26+00:00</updated>
<author>
<name>Lukasz Luba</name>
<email>lukasz.luba@arm.com</email>
</author>
<published>2021-11-09T19:57:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e97b3dc2556743dd02612c92a8de7026e8d7dc9'/>
<id>urn:sha1:7e97b3dc2556743dd02612c92a8de7026e8d7dc9</id>
<content type='text'>
There is no need of this function (and related) since code has been
converted to use the new arch_update_thermal_pressure() API. The old
code can be removed.

Signed-off-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
</content>
</entry>
<entry>
<title>arch_topology: Introduce thermal pressure update function</title>
<updated>2021-11-23T09:40:26+00:00</updated>
<author>
<name>Lukasz Luba</name>
<email>lukasz.luba@arm.com</email>
</author>
<published>2021-11-09T19:57:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c214f124161d446b340597e7c968e0a2dc149142'/>
<id>urn:sha1:c214f124161d446b340597e7c968e0a2dc149142</id>
<content type='text'>
The thermal pressure is a mechanism which is used for providing
information about reduced CPU performance to the scheduler. Usually code
has to convert the value from frequency units into capacity units,
which are understandable by the scheduler. Create a common conversion code
which can be just used via a handy API.

Internally, the topology_update_thermal_pressure() operates on frequency
in MHz and max CPU frequency is taken from 'freq_factor' (per-cpu).

Signed-off-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Reviewed-by: Thara Gopinath &lt;thara.gopinath@linaro.org&gt;
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
</content>
</entry>
<entry>
<title>topology: Represent clusters of CPUs within a die</title>
<updated>2021-10-15T09:25:15+00:00</updated>
<author>
<name>Jonathan Cameron</name>
<email>Jonathan.Cameron@huawei.com</email>
</author>
<published>2021-09-24T08:51:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5e22feffdd736cb02b98b0f5b375c8ebc858dd4'/>
<id>urn:sha1:c5e22feffdd736cb02b98b0f5b375c8ebc858dd4</id>
<content type='text'>
Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Nodes Structure representing
a higher level of topology.

For example Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 cpus. All clusters share L3 cache data, but each cluster
has local L3 tag. On the other hand, each clusters will share some
internal system bus.

+-----------------------------------+                          +---------+
|  +------+    +------+             +--------------------------+         |
|  | CPU0 |    | cpu1 |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+   cluster   |    |    tag    |         |         |
|  | CPU2 |    | CPU3 |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   |    |    L3     |         |         |
|  +------+    +------+             +----+    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |   L3    |
                                                               |   data  |
+-----------------------------------+                          |         |
|  +------+    +------+             |    +-----------+         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             +----+    L3     |         |         |
|                                   |    |    tag    |         |         |
|  +------+    +------+             |    |           |         |         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             +--------------------------+         |
+-----------------------------------|                          |         |
+-----------------------------------|                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+             |    |    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |   +-----------+          |         |
|  +------+    +------+             |   |           |          |         |
|                                   |   |    L3     |          |         |
|  +------+    +------+             +---+    tag    |          |         |
|  |      |    |      |             |   |           |          |         |
|  +------+    +------+             |   +-----------+          |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |  +-----------+           |         |
|  +------+    +------+             |  |           |           |         |
|                                   |  |    L3     |           |         |
|  +------+    +------+             +--+    tag    |           |         |
|  |      |    |      |             |  |           |           |         |
|  +------+    +------+             |  +-----------+           |         |
|                                   |                          +---------+
+-----------------------------------+

That means spreading tasks among clusters will bring more bandwidth
while packing tasks within one cluster will lead to smaller cache
synchronization latency. So both kernel and userspace will have
a chance to leverage this topology to deploy tasks accordingly to
achieve either smaller cache latency within one cluster or an even
distribution of load among clusters for higher throughput.

This patch exposes cluster topology to both kernel and userspace.
Libraried like hwloc will know cluster by cluster_cpus and related
sysfs attributes. PoC of HWLOC support at [2].

Note this patch only handle the ACPI case.

Special consideration is needed for SMT processors, where it is
necessary to move 2 levels up the hierarchy from the leaf nodes
(thus skipping the processor core level).

Note that arm64 / ACPI does not provide any means of identifying
a die level in the topology but that may be unrelate to the cluster
level.

[1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
    structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster

Signed-off-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Signed-off-by: Tian Tao &lt;tiantao6@hisilicon.com&gt;
Signed-off-by: Barry Song &lt;song.bao.hua@hisilicon.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
</content>
</entry>
<entry>
<title>cpufreq: CPPC: Add support for frequency invariance</title>
<updated>2021-03-22T03:25:28+00:00</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2020-06-23T10:19:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4c38f2df71c8e33c0b64865992d693f5022eeaad'/>
<id>urn:sha1:4c38f2df71c8e33c0b64865992d693f5022eeaad</id>
<content type='text'>
The Frequency Invariance Engine (FIE) is providing a frequency scaling
correction factor that helps achieve more accurate load-tracking.

Normally, this scaling factor can be obtained directly with the help of
the cpufreq drivers as they know the exact frequency the hardware is
running at. But that isn't the case for CPPC cpufreq driver.

Another way of obtaining that is using the arch specific counter
support, which is already present in kernel, but that hardware is
optional for platforms.

This patch updates the CPPC driver to register itself with the topology
core to provide its own implementation (cppc_scale_freq_tick()) of
topology_scale_freq_tick() which gets called by the scheduler on every
tick. Note that the arch specific counters have higher priority than
CPPC counters, if available, though the CPPC driver doesn't need to have
any special handling for that.

On an invocation of cppc_scale_freq_tick(), we schedule an irq work
(since we reach here from hard-irq context), which then schedules a
normal work item and cppc_scale_freq_workfn() updates the per_cpu
arch_freq_scale variable based on the counter updates since the last
tick.

To allow platforms to disable this CPPC counter-based frequency
invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE,
which is enabled by default.

This also exports sched_setattr_nocheck() as the CPPC driver can be
built as a module.

Cc: linux-acpi@vger.kernel.org
Reviewed-by: Ionela Voinescu &lt;ionela.voinescu@arm.com&gt;
Tested-by: Ionela Voinescu &lt;ionela.voinescu@arm.com&gt;
Tested-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Acked-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
</content>
</entry>
</feed>
