kernel/linux.git/tools/testing/selftests/resctrl, branch v6.18.21

selftests/resctrl: Fix a division by zero error on Hygon

2026-02-26T22:58:57+00:00

[ Upstream commit 671ef08d9455f5754d1fc96f5a14e357d6b80936 ] Change to adjust effective L3 cache size with SNC enabled change introduced the snc_nodes_per_l3_cache() function to detect the Intel Sub-NUMA Clustering (SNC) feature by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0. The function was designed to return: (1) >1: SNC mode is enabled. (2) 1: SNC mode is not enabled or not supported. However, on certain Hygon CPUs, #CPUs sharing LLC with CPU0 is actually less than #CPUs in node0. This results in snc_nodes_per_l3_cache() returning 0 (calculated as cache_cpus / node_cpus). This leads to a division by zero error in get_cache_size(): *cache_size /= snc_nodes_per_l3_cache(); Causing the resctrl selftest to fail with: "Floating point exception (core dumped)" Fix the issue by ensuring snc_nodes_per_l3_cache() returns 1 when SNC mode is not supported on the platform. Updated commit log to fix commit has issues: Shuah Khan Link: https://lore.kernel.org/r/20251217030456.3834956-2-shenxiaochen@open-hieco.net Fixes: a1cd99e700ec ("selftests/resctrl: Adjust effective L3 cache size with SNC enabled") Signed-off-by: Xiaochen Shen Reviewed-by: Reinette Chatre Reviewed-by: Fenghua Yu Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin

selftests/resctrl: Discover SNC kernel support and adjust messages

2025-01-15T00:06:32+00:00

Resctrl selftest prints a message on test failure that Sub-Numa Clustering (SNC) could be enabled and points the user to check their BIOS settings. No actual check is performed before printing that message so it is not very accurate in pinpointing a problem. When there is SNC support for kernel's resctrl subsystem and SNC is enabled then sub node files are created for each node in the resctrlfs. The sub node files exist in each regular node's L3 monitoring directory. The reliable path to check for existence of sub node files is /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00. Add helper that checks for mon_sub_L3_00 existence. Correct old messages to account for kernel support of SNC in resctrl. Signed-off-by: Maciej Wieczor-Retman Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan

selftests/resctrl: Adjust effective L3 cache size with SNC enabled

2025-01-15T00:06:32+00:00

Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two, three, four or six nodes. When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3. It's possible to detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0. Detect SNC mode once and let other tests inherit that information. Update CFLAGS after including lib.mk in the Makefile so that fallthrough macro can be used. To check if SNC detection is reliable one can check the /sys/devices/system/cpu/offline file. If it's empty, it means all cores are operational and the ratio should be calculated correctly. If it has any contents, it means the detected SNC mode can't be trusted and should be disabled. Check if detection was not reliable due to offline cpus. If it was skip running tests since the results couldn't be trusted. Co-developed-by: Tony Luck Signed-off-by: Tony Luck Signed-off-by: Maciej Wieczor-Retman Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan

selftests/resctrl: Replace magic constants used as array size

2024-11-05T00:02:03+00:00

The Memory Bandwidth Allocation (MBA) test iterates through all possible MBA allocations, from 10% (ALLOCATION_MIN) to 100% (ALLOCATION_MAX) with increments of 10% (ALLOCATION_STEP) at each iteration. During each iteration the test measures the actual memory bandwidth NUM_OF_RUNS times to determine the impact of MBA on actual memory bandwidth. After the MBA test completes all the memory bandwidth measurements are parsed into an array. One array for resctrl Memory Bandwidth Monitoring (MBM) measurements and one array for the Integrated Memory Controller (iMC) measurements. Each array has a hardcoded size of 1024 that is large enough to hold the current test data, but this hardcoded value makes the implementation difficult to understand. It will not be clear that this array needs to be reconsidered if any of the test parameters are changed. Replace the magic constant as array size with the test parameters the array size depends on. Reported-by: Ilpo Järvinen Closes: https://lore.kernel.org/all/45af2a8c-517d-8f0d-137d-ad0f3f6a3c68@linux.intel.com/ Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Keep results from first test run

2024-11-05T00:02:03+00:00

The resctrl selftests drop the results from every first test run to avoid (per comment) "inaccurate due to monitoring setup transition phase" data. Previously inaccurate data resulted from workloads needing some time to "settle" and also the measurements themselves to account for earlier measurements to measure across needed timeframe. commit da50de0a92f3 ("selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1) only") ensured that measurements accurately measure just the time frame of interest. The default "fill_buf" benchmark since separated the buffer prepare phase from the benchmark run phase reducing the need for the tests themselves to accommodate the benchmark's "settle" time. With these enhancements there are no remaining portions needing to "settle" and the first test run can contribute to measurements. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth

2024-11-05T00:02:03+00:00

The MBA test incrementally throttles memory bandwidth, each time followed by a comparison between the memory bandwidth observed by the performance counters and resctrl respectively. While a comparison between performance counters and resctrl is generally appropriate, they do not have an identical view of memory bandwidth. For example RAS features or memory performance features that generate memory traffic may drive accesses that are counted differently by performance counters and MBM respectively, for instance generating "overhead" traffic which is not counted against any specific RMID. As a ratio, this different view of memory bandwidth becomes more apparent at low memory bandwidths. It is not practical to enable/disable the various features that may generate memory bandwidth to give performance counters and resctrl an identical view. Instead, do not compare performance counters and resctrl view of memory bandwidth when the memory bandwidth is low. Bandwidth throttling behaves differently across platforms so it is not appropriate to drop measurement data simply based on the throttling level. Instead, use a threshold of 750MiB that has been observed to support adequate comparison between performance counters and resctrl. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Use cache size to determine "fill_buf" buffer size

2024-11-05T00:02:03+00:00

By default the MBM and MBA tests use the "fill_buf" benchmark to read from a buffer with the goal to measure the memory bandwidth generated by this buffer access. Care should be taken when sizing the buffer used by the "fill_buf" benchmark. If the buffer is small enough to fit in the cache then it cannot be expected that the benchmark will generate much memory bandwidth. For example, on a system with 320MB L3 cache the existing hardcoded default of 250MB is insufficient. Use the measured cache size to determine a buffer size that can be expected to trigger memory access while keeping the existing default as minimum, now renamed to MINIMUM_SPAN, that has been appropriate for testing so far. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Ensure measurements skip initialization of default benchmark

2024-11-05T00:02:03+00:00

The CMT, MBA, and MBM tests rely on the resctrl_val() wrapper to start and run a benchmark while providing test specific flows via callbacks to do test specific configuration and measurements. At a high level, the resctrl_val() flow is: a) Start by fork()ing a child process that installs a signal handler for SIGUSR1 that, on receipt of SIGUSR1, will start running a benchmark. b) Assign the child process created in (a) to the resctrl control and monitoring group that dictates the memory and cache allocations with which the process can run and will contain all resctrl monitoring data of that process. c) Once parent and child are considered "ready" (determined via a message over a pipe) the parent signals the child (via SIGUSR1) to start the benchmark, waits one second for the benchmark to run, and then starts collecting monitoring data for the tests, potentially also changing allocation configuration depending on the various test callbacks. A problem with the above flow is the "black box" view of the benchmark that is combined with an arbitrarily chosen "wait one second" before measurements start. No matter what the benchmark does, it is given one second to initialize before measurements start. The default benchmark "fill_buf" consists of two parts, first it prepares a buffer (allocate, initialize, then flush), then it reads from the buffer (in unpredictable ways) until terminated. Depending on the system and the size of the buffer, the first "prepare" part may not be complete by the time the one second delay expires. Test measurements may thus start before the work needing to be measured runs. Split the default benchmark into its "prepare" and "runtime" parts and simplify the resctrl_val() wrapper while doing so. This same split cannot be done for the user provided benchmark (without a user interface change), so the current behavior is maintained for user provided benchmark. Assign the test itself to the control and monitoring group and run the "prepare" part of the benchmark in this context, ensuring it runs with required cache and memory bandwidth allocations. With the benchmark preparation complete it is only needed to fork() the "runtime" part of the benchmark (or entire user provided benchmark). Keep the "wait one second" delay before measurements start. For the default "fill_buf" benchmark this time now covers only the "runtime" portion that needs to be measured. For the user provided benchmark this delay maintains current behavior. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Make benchmark parameter passing robust

2024-11-05T00:02:03+00:00

The benchmark used during the CMT, MBM, and MBA tests can be provided by the user via (-b) parameter, if not provided the default "fill_buf" benchmark is used. The user is additionally able to override any of the "fill_buf" default parameters when running the tests with "-b fill_buf ". The "fill_buf" parameters are managed as an array of strings. Using an array of strings is complex because it requires transformations to/from strings at every producer and consumer. This is made worse for the individual tests where the default benchmark parameters values may not be appropriate and additional data wrangling is required. For example, the CMT test duplicates the entire array of strings in order to replace one of the parameters. More issues appear when combining the usage of an array of strings with the use case of user overriding default parameters by specifying "-b fill_buf ". This use case is fragile with opportunities to trigger a SIGSEGV because of opportunities for NULL pointers to exist in the array of strings. For example, by running below (thus by specifying "fill_buf" should be used but all parameters are NULL): $ sudo resctrl_tests -t mbm -b fill_buf Replace the "array of strings" parameters used for "fill_buf" with new struct fill_buf_param that contains the "fill_buf" parameters that can be used directly without transformations to/from strings. Two instances of struct fill_buf_param may exist at any point in time: * If the user provides new parameters to "fill_buf", the user parameter structure (struct user_params) will point to a fully initialized and immutable struct fill_buf_param containing the user provided parameters. * If "fill_buf" is the benchmark that should be used by a test, then the test parameter structure (struct resctrl_val_param) will point to a fully initialized struct fill_buf_param. The latter may contain (a) the user provided parameters verbatim, (b) user provided parameters adjusted to be appropriate for the test, or (c) the default parameters for "fill_buf" that is appropriate for the test if the user did not provide "fill_buf" parameters nor an alternate benchmark. The existing behavior of CMT test is to use test defined value for the buffer size even if the user provides another value via command line. This behavior is maintained since the test requires that the buffer size matches the size of the cache allocated, and the amount of cache allocated can instead be changed by the user with the "-n" command line parameter. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Remove unused measurement code

2024-11-05T00:02:03+00:00

The MBM and MBA resctrl selftests run a benchmark during which it takes measurements of read memory bandwidth via perf. Code exists to support measurements of write memory bandwidth but there exists no path with which this code can execute. While code exists for write memory bandwidth measurement there has not yet been a use case for it. Remove this unused code. Rename relevant functions to include "read" so that it is clear that it relates only to memory bandwidth reads, while renaming the functions also add consistency by changing the "membw" instances to more prevalent "mem_bw". Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan