kernel/linux.git/tools/testing/selftests/resctrl/fill_buf.c, branch v6.18.21

selftests/resctrl: Use cache size to determine "fill_buf" buffer size

2024-11-05T00:02:03+00:00

By default the MBM and MBA tests use the "fill_buf" benchmark to read from a buffer with the goal to measure the memory bandwidth generated by this buffer access. Care should be taken when sizing the buffer used by the "fill_buf" benchmark. If the buffer is small enough to fit in the cache then it cannot be expected that the benchmark will generate much memory bandwidth. For example, on a system with 320MB L3 cache the existing hardcoded default of 250MB is insufficient. Use the measured cache size to determine a buffer size that can be expected to trigger memory access while keeping the existing default as minimum, now renamed to MINIMUM_SPAN, that has been appropriate for testing so far. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Ensure measurements skip initialization of default benchmark

2024-11-05T00:02:03+00:00

The CMT, MBA, and MBM tests rely on the resctrl_val() wrapper to start and run a benchmark while providing test specific flows via callbacks to do test specific configuration and measurements. At a high level, the resctrl_val() flow is: a) Start by fork()ing a child process that installs a signal handler for SIGUSR1 that, on receipt of SIGUSR1, will start running a benchmark. b) Assign the child process created in (a) to the resctrl control and monitoring group that dictates the memory and cache allocations with which the process can run and will contain all resctrl monitoring data of that process. c) Once parent and child are considered "ready" (determined via a message over a pipe) the parent signals the child (via SIGUSR1) to start the benchmark, waits one second for the benchmark to run, and then starts collecting monitoring data for the tests, potentially also changing allocation configuration depending on the various test callbacks. A problem with the above flow is the "black box" view of the benchmark that is combined with an arbitrarily chosen "wait one second" before measurements start. No matter what the benchmark does, it is given one second to initialize before measurements start. The default benchmark "fill_buf" consists of two parts, first it prepares a buffer (allocate, initialize, then flush), then it reads from the buffer (in unpredictable ways) until terminated. Depending on the system and the size of the buffer, the first "prepare" part may not be complete by the time the one second delay expires. Test measurements may thus start before the work needing to be measured runs. Split the default benchmark into its "prepare" and "runtime" parts and simplify the resctrl_val() wrapper while doing so. This same split cannot be done for the user provided benchmark (without a user interface change), so the current behavior is maintained for user provided benchmark. Assign the test itself to the control and monitoring group and run the "prepare" part of the benchmark in this context, ensuring it runs with required cache and memory bandwidth allocations. With the benchmark preparation complete it is only needed to fork() the "runtime" part of the benchmark (or entire user provided benchmark). Keep the "wait one second" delay before measurements start. For the default "fill_buf" benchmark this time now covers only the "runtime" portion that needs to be measured. For the user provided benchmark this delay maintains current behavior. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Make benchmark parameter passing robust

2024-11-05T00:02:03+00:00

The benchmark used during the CMT, MBM, and MBA tests can be provided by the user via (-b) parameter, if not provided the default "fill_buf" benchmark is used. The user is additionally able to override any of the "fill_buf" default parameters when running the tests with "-b fill_buf ". The "fill_buf" parameters are managed as an array of strings. Using an array of strings is complex because it requires transformations to/from strings at every producer and consumer. This is made worse for the individual tests where the default benchmark parameters values may not be appropriate and additional data wrangling is required. For example, the CMT test duplicates the entire array of strings in order to replace one of the parameters. More issues appear when combining the usage of an array of strings with the use case of user overriding default parameters by specifying "-b fill_buf ". This use case is fragile with opportunities to trigger a SIGSEGV because of opportunities for NULL pointers to exist in the array of strings. For example, by running below (thus by specifying "fill_buf" should be used but all parameters are NULL): $ sudo resctrl_tests -t mbm -b fill_buf Replace the "array of strings" parameters used for "fill_buf" with new struct fill_buf_param that contains the "fill_buf" parameters that can be used directly without transformations to/from strings. Two instances of struct fill_buf_param may exist at any point in time: * If the user provides new parameters to "fill_buf", the user parameter structure (struct user_params) will point to a fully initialized and immutable struct fill_buf_param containing the user provided parameters. * If "fill_buf" is the benchmark that should be used by a test, then the test parameter structure (struct resctrl_val_param) will point to a fully initialized struct fill_buf_param. The latter may contain (a) the user provided parameters verbatim, (b) user provided parameters adjusted to be appropriate for the test, or (c) the default parameters for "fill_buf" that is appropriate for the test if the user did not provide "fill_buf" parameters nor an alternate benchmark. The existing behavior of CMT test is to use test defined value for the buffer size even if the user provides another value via command line. This behavior is maintained since the test requires that the buffer size matches the size of the cache allocated, and the amount of cache allocated can instead be changed by the user with the "-n" command line parameter. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Only support measured read operation

2024-11-05T00:02:03+00:00

The CMT, MBM, and MBA tests rely on a benchmark to generate memory traffic. By default this is the "fill_buf" benchmark that can be replaced via the "-b" command line argument. The original intent of the "-b" command line parameter was to replace the default "fill_buf" benchmark, but the implementation also exposes an alternative use case where the "fill_buf" parameters itself can be modified. One of the parameters to "fill_buf" is the "operation" that can be either "read" or "write" and indicates whether the "fill_buf" should use "read" or "write" operations on the allocated buffer. While replacing "fill_buf" default parameters is technically possible, replacing the default "read" parameter with "write" is not supported because the MBA and MBM tests only measure "read" operations. The "read" operation is also most appropriate for the CMT test that aims to use the benchmark to allocate into the cache. Avoid any potential inconsistencies between test and measurement by removing code for unsupported "write" operations to the buffer. Ignore any attempt from user space to enable this unsupported test configuration, instead always use read operations. Keep the initialization of the, now unused, "fill_buf" parameters to reserve these parameter positions since it has been exposed as an API. Future parameter additions cannot use these parameter positions. Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Remove "once" parameter required to be false

2024-11-05T00:02:03+00:00

The CMT, MBM, and MBA tests rely on a benchmark that runs while the test makes changes to needed configuration (for example memory bandwidth allocation) and takes needed measurements. By default the "fill_buf" benchmark is used and by default (via its "once = false" setting) "fill_buf" is configured to run until terminated after the test completes. An unintended consequence of enabling the user to override the benchmark also enables the user to change parameters to the "fill_buf" benchmark. This enables the user to set "fill_buf" to only cycle through the buffer once (by setting "once = true") and thus breaking the CMT, MBA, and MBM tests that expect workload/interference to be reflected by their measurements. Prevent user space from changing the "once" parameter and ensure that it is always false for the CMT, MBA, and MBM tests. Suggested-by: Ilpo Järvinen Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Fix memory overflow due to unhandled wraparound

2024-11-05T00:02:02+00:00

alloc_buffer() allocates and initializes (with random data) a buffer of requested size. The initialization starts from the beginning of the allocated buffer and incrementally assigns sizeof(uint64_t) random data to each cache line. The initialization uses the size of the buffer to control the initialization flow, decrementing the amount of buffer needing to be initialized after each iteration. The size of the buffer is stored in an unsigned (size_t) variable s64 and the test "s64 > 0" is used to decide if initialization is complete. The problem is that decrementing the buffer size may wrap around if the buffer size is not divisible by "CL_SIZE / sizeof(uint64_t)" resulting in the "s64 > 0" test being true and memory beyond the buffer "initialized". Use a signed value for the buffer size to support all buffer sizes. Fixes: a2561b12fe39 ("selftests/resctrl: Add built in benchmark") Signed-off-by: Reinette Chatre Reviewed-by: Ilpo Järvinen Signed-off-by: Shuah Khan

selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

2024-02-13T20:56:45+00:00

CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT. Rewrite the CAT test to allocate a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly. The new CAT test is using only a single process because it relies on measured effect against another run of itself rather than another process adding noise. The rest of the system is set to use the CBM bits not used by the CAT test to keep the test isolated. Replace count_bits() with count_contiguous_bits() to get the first bit position in order to be able to calculate masks based on it. This change has been tested with a number of systems from different generations. Suggested-by: Reinette Chatre Signed-off-by: Ilpo Järvinen Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan

selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

2024-02-13T20:56:44+00:00

When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results. Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo. Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required). It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures). Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/ Signed-off-by: Ilpo Järvinen Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan

selftests/resctrl: Replace file write with volatile variable

2024-02-13T20:56:44+00:00

The fill_buf code prevents compiler optimizating the entire read loop away by writing the final value of the variable into a file. While it achieves the goal, writing into a file requires significant amount of work within the innermost test loop and also error handling. A simpler approach is to take advantage of volatile. Writing through a pointer to a volatile variable is enough to prevent compiler from optimizing the write away, and therefore compiler cannot remove the read loop either. Add a volatile 'value_sink' into resctrl_tests.c and make fill_buf to write into it. As a result, the error handling in fill_buf.c can be simplified. Signed-off-by: Ilpo Järvinen Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan

selftests/resctrl: Refactor fill_buf functions

2024-02-13T20:56:43+00:00

There are unnecessary nested calls in fill_buf.c: - run_fill_buf() calls fill_cache() - alloc_buffer() calls malloc_and_init_memory() Simplify the code flow and remove those unnecessary call levels by moving the called code inside the calling function and remove the duplicated error print. Resolve the difference in run_fill_buf() and fill_cache() parameter name into 'buf_size' which is more descriptive than 'span'. Also, while moving the allocation related code, rename 'p' into 'buf' to be consistent in naming the variables. Signed-off-by: Ilpo Järvinen Reviewed-by: Reinette Chatre Signed-off-by: Shuah Khan