<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/testing/selftests/resctrl/fill_buf.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-11-05T00:02:03+00:00</updated>
<entry>
<title>selftests/resctrl: Use cache size to determine "fill_buf" buffer size</title>
<updated>2024-11-05T00:02:03+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f77b9672536e581c945b2623b521a284fdbf75ff'/>
<id>urn:sha1:f77b9672536e581c945b2623b521a284fdbf75ff</id>
<content type='text'>
By default the MBM and MBA tests use the "fill_buf" benchmark to
read from a buffer with the goal to measure the memory bandwidth
generated by this buffer access.

Care should be taken when sizing the buffer used by the "fill_buf"
benchmark. If the buffer is small enough to fit in the cache then
it cannot be expected that the benchmark will generate much memory
bandwidth. For example, on a system with 320MB L3 cache the existing
hardcoded default of 250MB is insufficient.

Use the measured cache size to determine a buffer size that can be
expected to trigger memory access while keeping the existing default
as minimum, now renamed to MINIMUM_SPAN, that has been appropriate for
testing so far.

Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Ensure measurements skip initialization of default benchmark</title>
<updated>2024-11-05T00:02:03+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3cb3f0b8755919648281064d44c06319743db343'/>
<id>urn:sha1:3cb3f0b8755919648281064d44c06319743db343</id>
<content type='text'>
The CMT, MBA, and MBM tests rely on the resctrl_val() wrapper to
start and run a benchmark while providing test specific flows
via callbacks to do test specific configuration and measurements.

At a high level, the resctrl_val() flow is:
	a) Start by fork()ing a child process that installs a signal
	   handler for SIGUSR1 that, on receipt of SIGUSR1, will
	   start running a benchmark.
	b) Assign the child process created in (a) to the resctrl
	   control and monitoring group that dictates the memory and
	   cache allocations with which the process can run and will
	   contain all resctrl monitoring data of that process.
	c) Once parent and child are considered "ready" (determined via
	   a message over a pipe) the parent signals the child (via
	   SIGUSR1) to start the benchmark, waits one second for the
	   benchmark to run, and then starts collecting monitoring data
	   for the tests, potentially also changing allocation
	   configuration depending on the various test callbacks.

A problem with the above flow is the "black box" view of the
benchmark that is combined with an arbitrarily chosen
"wait one second" before measurements start. No matter what
the benchmark does, it is given one second to initialize before
measurements start.

The default benchmark "fill_buf" consists of two parts,
first it prepares a buffer (allocate, initialize, then flush), then it
reads from the buffer (in unpredictable ways) until terminated.
Depending on the system and the size of the buffer, the first "prepare"
part may not be complete by the time the one second delay expires. Test
measurements may thus start before the work needing to be measured runs.

Split the default benchmark into its "prepare" and "runtime" parts and
simplify the resctrl_val() wrapper while doing so. This same split
cannot be done for the user provided benchmark (without a user
interface change), so the current behavior is maintained for user
provided benchmark.

Assign the test itself to the control and monitoring group and run the
"prepare" part of the benchmark in this context, ensuring it runs with
required cache and memory bandwidth allocations. With the benchmark
preparation complete it is only needed to fork() the "runtime" part
of the benchmark (or entire user provided benchmark).

Keep the "wait one second" delay before measurements start. For the
default "fill_buf" benchmark this time now covers only the "runtime"
portion that needs to be measured. For the user provided benchmark this
delay maintains current behavior.

Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Make benchmark parameter passing robust</title>
<updated>2024-11-05T00:02:03+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e958c21e2edecd17c87c4a2df39d1e47a3f7c039'/>
<id>urn:sha1:e958c21e2edecd17c87c4a2df39d1e47a3f7c039</id>
<content type='text'>
The benchmark used during the CMT, MBM, and MBA tests can be provided by
the user via (-b) parameter, if not provided the default "fill_buf"
benchmark is used. The user is additionally able to override
any of the "fill_buf" default parameters when running the tests with
"-b fill_buf &lt;fill_buf parameters&gt;".

The "fill_buf" parameters are managed as an array of strings. Using an
array of strings is complex because it requires transformations to/from
strings at every producer and consumer. This is made worse for the
individual tests where the default benchmark parameters values may not
be appropriate and additional data wrangling is required. For example,
the CMT test duplicates the entire array of strings in order to replace
one of the parameters.

More issues appear when combining the usage of an array of strings with
the use case of user overriding default parameters by specifying
"-b fill_buf &lt;parameters&gt;". This use case is fragile with opportunities
to trigger a SIGSEGV because of opportunities for NULL pointers to exist
in the array of strings. For example, by running below (thus by specifying
"fill_buf" should be used but all parameters are NULL):
	$ sudo resctrl_tests -t mbm -b fill_buf

Replace the "array of strings" parameters used for "fill_buf" with
new struct fill_buf_param that contains the "fill_buf" parameters that
can be used directly without transformations to/from strings. Two
instances of struct fill_buf_param may exist at any point in time:
	* If the user provides new parameters to "fill_buf", the
	  user parameter structure (struct user_params) will point to a
	  fully initialized and immutable struct fill_buf_param
	  containing the user provided parameters.
	* If "fill_buf" is the benchmark that should be used by a test,
	  then the test parameter structure (struct resctrl_val_param)
	  will point to a fully initialized struct fill_buf_param. The
	  latter may contain (a) the user provided parameters verbatim,
	  (b) user provided parameters adjusted to be appropriate for
	  the test, or (c) the default parameters for "fill_buf" that
	  is appropriate for the test if the user did not provide
	  "fill_buf" parameters nor an alternate benchmark.

The existing behavior of CMT test is to use test defined value for the
buffer size even if the user provides another value via command line.
This behavior is maintained since the test requires that the buffer size
matches the size of the cache allocated, and the amount of cache
allocated can instead be changed by the user with the "-n" command line
parameter.

Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Only support measured read operation</title>
<updated>2024-11-05T00:02:03+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=138424170eb06c956bc7e77e8740479161f65463'/>
<id>urn:sha1:138424170eb06c956bc7e77e8740479161f65463</id>
<content type='text'>
The CMT, MBM, and MBA tests rely on a benchmark to generate
memory traffic. By default this is the "fill_buf" benchmark that
can be replaced via the "-b" command line argument.

The original intent of the "-b" command line parameter was
to replace the default "fill_buf" benchmark, but the implementation
also exposes an alternative use case where the "fill_buf" parameters
itself can be modified. One of the parameters to "fill_buf" is the
"operation" that can be either "read" or "write" and indicates
whether the "fill_buf" should use "read" or "write" operations on the
allocated buffer.

While replacing "fill_buf" default parameters is technically possible,
replacing the default "read" parameter with "write" is not supported
because the MBA and MBM tests only measure "read" operations. The
"read" operation is also most appropriate for the CMT test that aims
to use the benchmark to allocate into the cache.

Avoid any potential inconsistencies between test and measurement by
removing code for unsupported "write" operations to the buffer.
Ignore any attempt from user space to enable this unsupported test
configuration, instead always use read operations.

Keep the initialization of the, now unused, "fill_buf" parameters
to reserve these parameter positions since it has been exposed as an API.
Future parameter additions cannot use these parameter positions.

Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Remove "once" parameter required to be false</title>
<updated>2024-11-05T00:02:03+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f3069136c95f15c05d166281cbb55f0509a42ae1'/>
<id>urn:sha1:f3069136c95f15c05d166281cbb55f0509a42ae1</id>
<content type='text'>
The CMT, MBM, and MBA tests rely on a benchmark that runs while
the test makes changes to needed configuration (for example memory
bandwidth allocation) and takes needed measurements. By default
the "fill_buf" benchmark is used and by default (via its
"once = false" setting) "fill_buf" is configured to run until
terminated after the test completes.

An unintended consequence of enabling the user to override the
benchmark also enables the user to change parameters to the
"fill_buf" benchmark. This enables the user to set "fill_buf" to
only cycle through the buffer once (by setting "once = true")
and thus breaking the CMT, MBA, and MBM tests that expect
workload/interference to be reflected by their measurements.

Prevent user space from changing the "once" parameter and ensure
that it is always false for the CMT, MBA, and MBM tests.

Suggested-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Fix memory overflow due to unhandled wraparound</title>
<updated>2024-11-05T00:02:02+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-10-24T21:18:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=caf02626b2bf164a02c808240f19dbf97aced664'/>
<id>urn:sha1:caf02626b2bf164a02c808240f19dbf97aced664</id>
<content type='text'>
alloc_buffer() allocates and initializes (with random data) a
buffer of requested size. The initialization starts from the beginning
of the allocated buffer and incrementally assigns sizeof(uint64_t) random
data to each cache line. The initialization uses the size of the
buffer to control the initialization flow, decrementing the amount of
buffer needing to be initialized after each iteration.

The size of the buffer is stored in an unsigned (size_t) variable s64
and the test "s64 &gt; 0" is used to decide if initialization is complete.
The problem is that decrementing the buffer size may wrap around
if the buffer size is not divisible by "CL_SIZE / sizeof(uint64_t)"
resulting in the "s64 &gt; 0" test being true and memory beyond the buffer
"initialized".

Use a signed value for the buffer size to support all buffer sizes.

Fixes: a2561b12fe39 ("selftests/resctrl: Add built in benchmark")
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test</title>
<updated>2024-02-13T20:56:45+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2023-12-15T15:05:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=205de6ddd7fff9340bd5e4b68105f28120671c6b'/>
<id>urn:sha1:205de6ddd7fff9340bd5e4b68105f28120671c6b</id>
<content type='text'>
CAT test spawns two processes into two different control groups with
exclusive schemata. Both the processes alloc a buffer from memory
matching their allocated LLC block size and flush the entire buffer out
of caches. Since the processes are reading through the buffer only once
during the measurement and initially all the buffer was flushed, the
test isn't testing CAT.

Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
perform a sequence of tests with different LLC alloc sizes starting
from half of the CBM bits down to 1-bit CBM. Flush the buffer before
each test and read the buffer twice. Observe the LLC misses on the
second read through the buffer. As the allocated LLC block gets smaller
and smaller, the LLC misses will become larger and larger giving a
strong signal on CAT working properly.

The new CAT test is using only a single process because it relies on
measured effect against another run of itself rather than another
process adding noise. The rest of the system is set to use the CBM bits
not used by the CAT test to keep the test isolated.

Replace count_bits() with count_contiguous_bits() to get the first bit
position in order to be able to calculate masks based on it.

This change has been tested with a number of systems from different
generations.

Suggested-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Read in less obvious order to defeat prefetch optimizations</title>
<updated>2024-02-13T20:56:44+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2023-12-15T15:05:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bcdb2e9d9f319935938a5addee040e37b60192ff'/>
<id>urn:sha1:bcdb2e9d9f319935938a5addee040e37b60192ff</id>
<content type='text'>
When reading memory in order, HW prefetching optimizations will
interfere with measuring how caches and memory are being accessed. This
adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access
using multiply by a prime and modulo.

Using a prime multiplier with modulo ensures the entire buffer is
eventually read. 23 is small enough that the reads are spread out but
wrapping does not occur very frequently (wrapping too often can trigger
L2 hits more frequently which causes noise to the test because getting
the data from LLC is not required).

It was discovered that not all primes work equally well and some can
cause wildly unstable results (e.g., in an earlier version of this
patch, the reads were done in reversed order and 59 was used as the
prime resulting in unacceptably high and unstable results in MBA and
MBM test on some architectures).

Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/
Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Replace file write with volatile variable</title>
<updated>2024-02-13T20:56:44+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2023-12-15T15:05:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90a009db09e2b91990df28e47844d44ae0ef9aa8'/>
<id>urn:sha1:90a009db09e2b91990df28e47844d44ae0ef9aa8</id>
<content type='text'>
The fill_buf code prevents compiler optimizating the entire read loop
away by writing the final value of the variable into a file. While it
achieves the goal, writing into a file requires significant amount of
work within the innermost test loop and also error handling.

A simpler approach is to take advantage of volatile. Writing through
a pointer to a volatile variable is enough to prevent compiler from
optimizing the write away, and therefore compiler cannot remove the
read loop either.

Add a volatile 'value_sink' into resctrl_tests.c and make fill_buf to
write into it. As a result, the error handling in fill_buf.c can be
simplified.

Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>selftests/resctrl: Refactor fill_buf functions</title>
<updated>2024-02-13T20:56:43+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2023-12-15T15:04:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=24be05591fb7a2a3edd639092c045298dd57aeea'/>
<id>urn:sha1:24be05591fb7a2a3edd639092c045298dd57aeea</id>
<content type='text'>
There are unnecessary nested calls in fill_buf.c:
  - run_fill_buf() calls fill_cache()
  - alloc_buffer() calls malloc_and_init_memory()

Simplify the code flow and remove those unnecessary call levels by
moving the called code inside the calling function and remove the
duplicated error print.

Resolve the difference in run_fill_buf() and fill_cache() parameter
name into 'buf_size' which is more descriptive than 'span'. Also, while
moving the allocation related code, rename 'p' into 'buf' to be
consistent in naming the variables.

Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
</content>
</entry>
</feed>
