kernel/linux.git/drivers/base/cacheinfo.c, branch v6.1.87

drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug

2023-06-09T08:34:17+00:00

[ Upstream commit 126310c9f669c9a8c875a3e5c2292299ca90225d ] While building the shared_cpu_map, check if the cache level and cache type matches. On certain systems that build the cache topology based on the instance ID, there are cases where the same ID may repeat across multiple cache levels, leading inaccurate topology. In event of CPU offlining, the cache_shared_cpu_map_remove() does not consider if IDs at same level are being compared. As a result, when same IDs repeat across different cache levels, the CPU going offline is not removed from all the shared_cpu_map. Below is the output of cache topology of CPU8 and it's SMT sibling after CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor (2 x 64C/128T) running kernel release v6.3: # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143 # echo 0 > /sys/devices/system/cpu/cpu8/online # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136 /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143 CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the SMT siblings, and they have the same cache instance ID, CPU 2 is only removed from the first index with matching ID which is index1 (L1i) in this case. With this fix, the results are as expected when performing the same experiment on the same system: # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136 /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143 # echo 0 > /sys/devices/system/cpu/cpu8/online # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136 /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136 /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136 /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143 When rebuilding topology, the same problem appears as cache_shared_cpu_map_setup() implements a similar logic. Consider the same 3rd Generation EPYC processor: CPUs in Core 1, that share the L1 and L2 caches, have L1 and L2 instance ID as 1. For all the CPUs on the second chiplet, the L3 ID is also 1 leading to grouping on CPUs from Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs sharing one cache domain. This went undetected since x86 processors depended on arch specific populate_cache_leaves() method to repopulate the shared_cpus_map when CPU came back online until kernel release v6.3-rc5. Fixes: 198102c9103f ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels") Signed-off-by: K Prateek Nayak Reviewed-by: Sudeep Holla Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin

cacheinfo: Check sib_leaf in cache_leaves_are_shared()

2023-05-11T14:03:29+00:00

[ Upstream commit 7a306e3eabf2b2fd8cffa69b87b32dbf814d79ce ] If there is no ACPI/DT information, it is assumed that L1 caches are private and L2 (and higher) caches are shared. A cache is 'shared' between two CPUs if it is accessible from these two CPUs. Each CPU owns a representation (i.e. has a dedicated cacheinfo struct) of the caches it has access to. cache_leaves_are_shared() tries to identify whether two representations are designating the same actual cache. In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher) and 'sib_leaf' is a L1 cache, the caches are detected as shared as only this_leaf's cache level is checked. This is leads to setting sib_leaf as being shared with another CPU, which is incorrect as this is a L1 cache. Check 'sib_leaf->level'. Also update the comment as the function is called when populating 'shared_cpu_map'. Fixes: f16d1becf96f ("cacheinfo: Use cache identifiers to check if the caches are shared if available") Signed-off-by: Pierre Gondois Reviewed-by: Conor Dooley Link: https://lore.kernel.org/r/20230414081453.244787-2-pierre.gondois@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin

cacheinfo: Fix shared_cpu_map to handle shared caches at different levels

2023-03-11T12:55:34+00:00

[ Upstream commit 198102c9103fc78d8478495971947af77edb05c1 ] The cacheinfo sets up the shared_cpu_map by checking whether the caches with the same index are shared between CPUs. However, this will trigger slab-out-of-bounds access if the CPUs do not have the same cache hierarchy. Another problem is the mismatched shared_cpu_map when the shared cache does not have the same index between CPUs. CPU0 I D L3 index 0 1 2 x ^ ^ ^ ^ index 0 1 2 3 CPU1 I D L2 L3 This patch checks each cache is shared with all caches on other CPUs. Reviewed-by: Pierre Gondois Signed-off-by: Yong-Xuan Wang Link: https://lore.kernel.org/r/20230117105133.4445-2-yongxuan.wang@sifive.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin

cacheinfo: Use atomic allocation for percpu cache attributes

2022-07-22T08:04:42+00:00

On couple of architectures like RISC-V and ARM64, we need to detect cache attribues quite early during the boot when the secondary CPUs start. So we will call detect_cache_attributes in the atomic context and since use of normal allocation can sleep, we will end up getting "sleeping in the atomic context" bug splat. In order avoid that, move the allocation to use atomic version in preparation to move the actual detection of cache attributes in the CPU hotplug path which is atomic. Cc: Ionela Voinescu Tested-by: Conor Dooley Signed-off-by: Sudeep Holla Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-1-43d696288e84@arm.com Signed-off-by: Greg Kroah-Hartman

cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability

2022-07-04T15:22:28+00:00

The checks to skip the CPU itself or no cacheinfo case are implemented bit differently though the effect is exactly same. Just align the implementation in both cache_shared_cpu_map_{setup,remove} just for improved readability. No functional change. Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com Tested-by: Conor Dooley Signed-off-by: Sudeep Holla

cacheinfo: Use cache identifiers to check if the caches are shared if available

2022-07-04T15:22:28+00:00

The cache identifiers is an optional property on most of the platforms. The presence of one must be indicated by the CACHE_ID valid bit in the attributes. We can use the cache identifiers provided by the firmware to check if any two cpus share the same cache instead of relying on the fw_token generated and set in the OS. Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com Tested-by: Ionela Voinescu Tested-by: Conor Dooley Signed-off-by: Sudeep Holla

cacheinfo: Allow early detection and population of cache attributes

2022-07-04T15:22:28+00:00

Some architecture/platforms may need to setup cache properties very early in the boot along with other cpu topologies so that all these information can be used to build sched_domains which is used by the scheduler. Allow detect_cache_attributes to be called quite early during the boot. Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com Tested-by: Ionela Voinescu Tested-by: Conor Dooley Reviewed-by: Gavin Shan Signed-off-by: Sudeep Holla

cacheinfo: Add support to check if last level cache(LLC) is valid or shared

2022-07-04T15:22:28+00:00

It is useful to have helper to check if the given two CPUs share last level cache. We can do that check by comparing fw_token or by comparing the cache ID. Currently we check just for fw_token as the cache ID is optional. This helper can be used to build the llc_sibling during arch specific topology parsing and feeding information to the sched_domains. This also helps to get rid of llc_id in the CPU topology as it is sort of duplicate information. Also add helper to check if the llc information in cacheinfo is valid or not. Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com Tested-by: Ionela Voinescu Tested-by: Conor Dooley Reviewed-by: Gavin Shan Signed-off-by: Sudeep Holla

cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF

2022-07-04T15:22:28+00:00

cache_leaves_are_shared is already used even with ACPI and PPTT. It checks if the cache leaves are the shared based on fw_token pointer. However it is defined conditionally only if CONFIG_OF is enabled which is wrong. Move the function cache_leaves_are_shared out of CONFIG_OF and keep it generic. It also handles the case where both OF and ACPI is not defined. Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com Tested-by: Ionela Voinescu Tested-by: Conor Dooley Reviewed-by: Gavin Shan Signed-off-by: Sudeep Holla

cacheinfo: Add helper to access any cache index for a given CPU

2022-07-04T15:22:28+00:00

The cacheinfo for a given CPU at a given index is used at quite a few places by fetching the base point for index 0 using the helper per_cpu_cacheinfo(cpu) and offsetting it by the required index. Instead, add another helper to fetch the required pointer directly and use it to simplify and improve readability. Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com Tested-by: Ionela Voinescu Tested-by: Conor Dooley Reviewed-by: Gavin Shan Signed-off-by: Sudeep Holla