<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/base/cacheinfo.c, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-02-19T15:25:17+00:00</updated>
<entry>
<title>cacheinfo: Remove of_node_put() for fw_token</title>
<updated>2026-02-19T15:25:17+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2026-02-13T17:57:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2cceca9e2e1cf249d1821923d4a6749363e007bc'/>
<id>urn:sha1:2cceca9e2e1cf249d1821923d4a6749363e007bc</id>
<content type='text'>
commit 2613cc29c5723881ca603b1a3b50f0107010d5d6 upstream

fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
For DT based systems, fw_token is set to a pointer to a DT node.

commit 3da72e18371c ("cacheinfo: Decrement refcount in
cache_setup_of_node()")
doesn't increment the refcount of fw_token anymore in
cache_setup_of_node(). fw_token is indeed used as a token and not
as a (struct device_node*), so no reference to fw_token should be
kept.

However, [1] is triggered when hotplugging a CPU multiple times
since cache_shared_cpu_map_remove() decrements the refcount to
fw_token at each CPU unplugging, eventually reaching 0.

Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().

[1]
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
Modules linked in:
CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G        W          6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[...]
Call trace:
[...]
of_node_release (drivers/of/dynamic.c:335)
kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
of_node_put (drivers/of/dynamic.c:49)
free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
cpuhp_thread_fun (kernel/cpu.c:785)
smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
kthread (kernel/kthread.c:376)
ret_from_fork (arch/arm64/kernel/entry.S:861)
---[ end trace 0000000000000000 ]---

Fixes: 3da72e18371c ("cacheinfo: Decrement refcount in cache_setup_of_node()")
Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reported-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Tested-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Tested-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Link: https://lore.kernel.org/r/20221116094958.2141072-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Decrement refcount in cache_setup_of_node()</title>
<updated>2026-02-19T15:25:17+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2026-02-13T17:56:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dcb53a6b13b42bd5f73585a24c238f71a8a39fbc'/>
<id>urn:sha1:dcb53a6b13b42bd5f73585a24c238f71a8a39fbc</id>
<content type='text'>
commit 3da72e18371c41a6f6f96b594854b178168c7757 upstream

Refcounts to DT nodes are only incremented in the function
and never decremented. Decrease the refcounts when necessary.

Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20221026185954.991547-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug</title>
<updated>2025-12-06T21:12:14+00:00</updated>
<author>
<name>K Prateek Nayak</name>
<email>kprateek.nayak@amd.com</email>
</author>
<published>2025-10-20T17:36:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ebb70465e55ff0109d14f1d03d8ac089759fa141'/>
<id>urn:sha1:ebb70465e55ff0109d14f1d03d8ac089759fa141</id>
<content type='text'>
[ Upstream commit c26fabe73330d983c7ce822c6b6ec0879b4da61f ]

Until commit 5c2712387d48 ("cacheinfo: Fix LLC is not exported through
sysfs"), cacheinfo called populate_cache_leaves() for CPU coming online
which let the arch specific functions handle (at least on x86)
populating the shared_cpu_map. However, with the changes in the
aforementioned commit, populate_cache_leaves() is not called when a CPU
comes online as a result of hotplug since last_level_cache_is_valid()
returns true as the cacheinfo data is not discarded. The CPU coming
online is not present in shared_cpu_map, however, it will not be added
since the cpu_cacheinfo-&gt;cpu_map_populated flag is set (it is set in
populate_cache_leaves() when cacheinfo is first populated for x86)

This can lead to inconsistencies in the shared_cpu_map when an offlined
CPU comes online again. Example below depicts the inconsistency in the
shared_cpu_list in cacheinfo when CPU8 is offlined and onlined again on
a 3rd Generation EPYC processor:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 &gt; /sys/devices/system/cpu/cpu8/online
  # echo 1 &gt; /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8

  # cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
    136

  # cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
    9-15,136-143

Clear the flag when the CPU is removed from shared_cpu_map when
cache_shared_cpu_map_remove() is called during CPU hotplug. This will
allow cache_shared_cpu_map_setup() to add the CPU coming back online in
the shared_cpu_map. Set the flag again when the shared_cpu_map is setup.
Following are results of performing the same test as described above with
the changes:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 &gt; /sys/devices/system/cpu/cpu8/online
  # echo 1 &gt; /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
    8,136

  # cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
    8-15,136-143

Fixes: 5c2712387d48 ("cacheinfo: Fix LLC is not exported through sysfs")
Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Reviewed-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20230508084115.1157-3-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Fix LLC is not exported through sysfs</title>
<updated>2025-12-06T21:12:14+00:00</updated>
<author>
<name>Yicong Yang</name>
<email>yangyicong@hisilicon.com</email>
</author>
<published>2025-10-20T17:36:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=119b5f15607fca5b86556c8e606017725a637545'/>
<id>urn:sha1:119b5f15607fca5b86556c8e606017725a637545</id>
<content type='text'>
[ Upstream commit 5c2712387d4850e0b64121d5fd3e6c4e84ea3266 ]

After entering 6.3-rc1 the LLC cacheinfo is not exported on our ACPI
based arm64 server. This is because the LLC cacheinfo is partly reset
when secondary CPUs boot up. On arm64 the primary cpu will allocate
and setup cacheinfo:
init_cpu_topology()
  for_each_possible_cpu()
    fetch_cache_info() // Allocate cacheinfo and init levels
detect_cache_attributes()
  cache_shared_cpu_map_setup()
    if (!last_level_cache_is_valid()) // not valid, setup LLC
      cache_setup_properties() // setup LLC

On secondary CPU boot up:
detect_cache_attributes()
  populate_cache_leaves()
    get_cache_type() // Get cache type from clidr_el1,
                     // for LLC type=CACHE_TYPE_NOCACHE
  cache_shared_cpu_map_setup()
    if (!last_level_cache_is_valid()) // Valid and won't go to this branch,
                                      // leave LLC's type=CACHE_TYPE_NOCACHE

The last_level_cache_is_valid() use cacheinfo-&gt;{attributes, fw_token} to
test it's valid or not, but populate_cache_leaves() will only reset
LLC's type, so we won't try to re-setup LLC's type and leave it
CACHE_TYPE_NOCACHE and won't export it through sysfs.

This patch tries to fix this by not re-populating the cache leaves if
the LLC is valid.

Fixes: 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Reviewed-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20230328114915.33340-1-yangyicong@huawei.com
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Initialize variables in fetch_cache_info()</title>
<updated>2025-12-06T21:12:13+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2025-10-20T17:36:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fc6572e48cd5cba201fce081c24742aa6eb629b2'/>
<id>urn:sha1:fc6572e48cd5cba201fce081c24742aa6eb629b2</id>
<content type='text'>
[ Upstream commit ecaef469920fd6d2c7687f19081946f47684a423 ]

Set potentially uninitialized variables to 0. This is particularly
relevant when CONFIG_ACPI_PPTT is not set.

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Link: https://lore.kernel.org/all/202301052307.JYt1GWaJ-lkp@intel.com/
Reported-by: Dan Carpenter &lt;error27@gmail.com&gt;
Link: https://lore.kernel.org/all/Y86iruJPuwNN7rZw@kili/
Fixes: 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")
Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Link: https://lore.kernel.org/r/20230124154053.355376-2-pierre.gondois@arm.com
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>arch_topology: Build cacheinfo from primary CPU</title>
<updated>2025-12-06T21:12:13+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2025-10-20T17:36:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22def0b492e683cc5df2a8ef1b94a17be0d50d84'/>
<id>urn:sha1:22def0b492e683cc5df2a8ef1b94a17be0d50d84</id>
<content type='text'>
[ Upstream commit 5944ce092b97caed5d86d961e963b883b5c44ee2 ]

commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
  'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.

The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
the CPU topology")

allocate_cache_info() tries to build the cacheinfo from the primary
CPU prior secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fallback to the current state
for the cacheinfo allocation. [1] will be triggered in such case.

When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].

Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.

[1]:
 | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
 | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
 | preempt_count: 1, expected: 0
 | RCU nest depth: 1, expected: 1
 | 3 locks held by swapper/111/0:
 |  #0:  (&amp;pcp-&gt;lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
 |  #1:  (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
 |  #2:  (&amp;zone-&gt;lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
 | irq event stamp: 0
 | hardirqs last  enabled at (0):  0x0
 | hardirqs last disabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last  enabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last disabled at (0):  0x0
 | Preemption disabled at:
 |  migrate_enable+0x30/0x130
 | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-rt6-[...]
 | Call trace:
 |  __kmalloc+0xbc/0x1e8
 |  detect_cache_attributes+0x2d4/0x5f0
 |  update_siblings_masks+0x30/0x368
 |  store_cpu_topology+0x78/0xb8
 |  secondary_start_kernel+0xd0/0x198
 |  __secondary_switched+0xb0/0xb4

Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Acked-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Check 'cache-unified' property to count cache leaves</title>
<updated>2025-12-06T21:12:13+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2025-10-20T17:36:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=100309b45cc4d18815f68c97db012ab9714e25e2'/>
<id>urn:sha1:100309b45cc4d18815f68c97db012ab9714e25e2</id>
<content type='text'>
[ Upstream commit de0df442ee49cb1f6ee58f3fec5dcb5e5eb70aab ]

The DeviceTree Specification v0.3 specifies that the cache node
'[d-|i-|]cache-size' property is required. The 'cache-unified'
property is specifies whether the cache level is separate
or unified.

If the cache-size property is missing, no cache leaves is accounted.
This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.

Check 'cache-unified' property and always account for at least
one cache leaf when parsing the device tree.

[1] https://lore.kernel.org/all/0f19cb3f-d6cf-4032-66d2-dedc9d09a0e3@linaro.org/

Reported-by: Krzysztof Kozlowski &lt;krzysztof.kozlowski@linaro.org&gt;
Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Tested-by: Krzysztof Kozlowski &lt;krzysztof.kozlowski@linaro.org&gt;
Link: https://lore.kernel.org/r/20230104183033.755668-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Return error code in init_of_cache_level()</title>
<updated>2025-12-06T21:12:13+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2025-10-20T17:36:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=52a43345ec152e9881b890277bad48e950e8ae31'/>
<id>urn:sha1:52a43345ec152e9881b890277bad48e950e8ae31</id>
<content type='text'>
[ Upstream commit 8844c3df001bc1d8397fddea341308da63855d53 ]

Make init_of_cache_level() return an error code when the cache
information parsing fails to help detecting missing information.

init_of_cache_level() is only called for riscv. Returning an error
code instead of 0 will prevent detect_cache_attributes() to allocate
memory if an incomplete DT is parsed.

Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Acked-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20230104183033.755668-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation</title>
<updated>2025-12-06T21:12:13+00:00</updated>
<author>
<name>Pierre Gondois</name>
<email>pierre.gondois@arm.com</email>
</author>
<published>2025-10-20T17:36:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a6d0510f295d99739abe16349a9c5d7a7f0ee55d'/>
<id>urn:sha1:a6d0510f295d99739abe16349a9c5d7a7f0ee55d</id>
<content type='text'>
[ Upstream commit c3719bd9eeb2edf84bd263d662e36ca0ba262a23 ]

RISC-V's implementation of init_of_cache_level() is following
the Devicetree Specification v0.3 regarding caches, cf.:
- s3.7.3 'Internal (L1) Cache Properties'
- s3.8 'Multi-level and Shared Cache Nodes'

Allow reusing the implementation by moving it.

Also make 'levels', 'leaves' and 'level' unsigned int.

Signed-off-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Reviewed-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Acked-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
Link: https://lore.kernel.org/r/20230104183033.755668-2-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Signed-off-by: Wen Yang &lt;wen.yang@linux.dev&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug</title>
<updated>2023-06-09T08:34:17+00:00</updated>
<author>
<name>K Prateek Nayak</name>
<email>kprateek.nayak@amd.com</email>
</author>
<published>2023-05-08T08:41:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4d776371127ec55590d49eea7579e77f555548ad'/>
<id>urn:sha1:4d776371127ec55590d49eea7579e77f555548ad</id>
<content type='text'>
[ Upstream commit 126310c9f669c9a8c875a3e5c2292299ca90225d ]

While building the shared_cpu_map, check if the cache level and cache
type matches. On certain systems that build the cache topology based on
the instance ID, there are cases where the same ID may repeat across
multiple cache levels, leading inaccurate topology.

In event of CPU offlining, the cache_shared_cpu_map_remove() does not
consider if IDs at same level are being compared. As a result, when same
IDs repeat across different cache levels, the CPU going offline is not
removed from all the shared_cpu_map.

Below is the output of cache topology of CPU8 and it's SMT sibling after
CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor
(2 x 64C/128T) running kernel release v6.3:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 &gt; /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143

CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of
index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the
SMT siblings, and they have the same cache instance ID, CPU 2 is only
removed from the first index with matching ID which is index1 (L1i) in
this case. With this fix, the results are as expected when performing
the same experiment on the same system:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 &gt; /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143

When rebuilding topology, the same problem appears as
cache_shared_cpu_map_setup() implements a similar logic. Consider the
same 3rd Generation EPYC processor: CPUs in Core 1, that share the L1
and L2 caches, have L1 and L2 instance ID as 1. For all the CPUs on
the second chiplet, the L3 ID is also 1 leading to grouping on CPUs from
Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs
sharing one cache domain. This went undetected since x86 processors
depended on arch specific populate_cache_leaves() method to repopulate
the shared_cpus_map when CPU came back online until kernel release
v6.3-rc5.

Fixes: 198102c9103f ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels")
Signed-off-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
