<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/testing/selftests/powerpc/eeh, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-04-29T13:54:42+00:00</updated>
<entry>
<title>selftests/powerpc: make sub-folders buildable on their own</title>
<updated>2024-04-29T13:54:42+00:00</updated>
<author>
<name>Madhavan Srinivasan</name>
<email>maddy@linux.ibm.com</email>
</author>
<published>2024-02-29T09:37:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=108e5e683333615023265a9a73a29d4c2fa16c70'/>
<id>urn:sha1:108e5e683333615023265a9a73a29d4c2fa16c70</id>
<content type='text'>
Build breaks when executing make with run_tests for sub-folders
under powerpc. This is because, CFLAGS and GIT_VERSION macros are
defined in Makefile of toplevel powerpc folder.

  make: Entering directory '/home/maddy/linux/tools/testing/selftests/powerpc/mm'
  gcc     hugetlb_vs_thp_test.c ../harness.c ../utils.c  -o /home/maddy/selftest_output//hugetlb_vs_thp_test
  hugetlb_vs_thp_test.c:6:10: fatal error: utils.h: No such file or directory
      6 | #include "utils.h"
        |          ^~~~~~~~~
  compilation terminated.

Fix this by adding the flags.mk in each sub-folder Makefile. Also remove
the CFLAGS and GIT_VERSION macros from powerpc/ folder Makefile since
the same is definied in flags.mk

Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20240229093711.581230-3-maddy@linux.ibm.com

</content>
</entry>
<entry>
<title>selftests/powerpc: Add VF recovery tests</title>
<updated>2021-01-31T11:35:47+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-11-03T04:45:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38132cc0e5a6b22b04fac2e4df25c59435fcd6de'/>
<id>urn:sha1:38132cc0e5a6b22b04fac2e4df25c59435fcd6de</id>
<content type='text'>
The basic EEH test ignores VFs since we the way the eeh_dev_break debugfs
interface works means that if multiple VFs are enabled we may cause errors
on all them them. However, we can work around that by only enabling a
single VF at a time.

This patch adds some infrastructure for finding SR-IOV capable devices and
enabling / disabling VFs so we can exercise the VF specific EEH recovery
paths. Two new tests are added, one for testing EEH aware devices and one
for EEH un-aware VFs.

Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201103044503.917128-3-oohall@gmail.com
</content>
</entry>
<entry>
<title>selftests/powerpc: Use stderr for debug messages in eeh-functions</title>
<updated>2021-01-31T11:35:47+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-11-03T04:45:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d6749ccba7ff86f99b4672e50db871487ba69f19'/>
<id>urn:sha1:d6749ccba7ff86f99b4672e50db871487ba69f19</id>
<content type='text'>
We want to use stdout to return lists of devices, etc so log debug / status
messages to stderr rather than stdout.

Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201103044503.917128-2-oohall@gmail.com
</content>
</entry>
<entry>
<title>selftests/powerpc: Hoist helper code out of eeh-basic</title>
<updated>2021-01-31T11:35:47+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-11-03T04:45:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db82f7097c265776c22ad866511074836f17665e'/>
<id>urn:sha1:db82f7097c265776c22ad866511074836f17665e</id>
<content type='text'>
Hoist some of the useful test environment checking and prep code into
eeh-functions.sh so they can be reused in other tests.

Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201103044503.917128-1-oohall@gmail.com
</content>
</entry>
<entry>
<title>selftests/powerpc: Make the test check in eeh-basic.sh posix compliant</title>
<updated>2021-01-04T13:00:29+00:00</updated>
<author>
<name>Po-Hsu Lin</name>
<email>po-hsu.lin@canonical.com</email>
</author>
<published>2020-12-28T04:34:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3db380570af7052620ace20c29e244938610ca71'/>
<id>urn:sha1:3db380570af7052620ace20c29e244938610ca71</id>
<content type='text'>
The == operand is a bash extension, thus this will fail on Ubuntu
with:
  ./eeh-basic.sh: 89: test: 2: unexpected operator

As the /bin/sh on Ubuntu is pointed to DASH.

Use -eq to fix this posix compatibility issue.

Fixes: 996f9e0f93f162 ("selftests/powerpc: Fix eeh-basic.sh exit codes")
Signed-off-by: Po-Hsu Lin &lt;po-hsu.lin@canonical.com&gt;
Reviewed-by: Frederic Barrat &lt;fbarrat@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201228043459.14281-1-po-hsu.lin@canonical.com
</content>
</entry>
<entry>
<title>selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic</title>
<updated>2020-11-19T03:50:14+00:00</updated>
<author>
<name>Po-Hsu Lin</name>
<email>po-hsu.lin@canonical.com</email>
</author>
<published>2020-10-23T02:45:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5eca0b279117f25020112a2f65ec9c3ea25f3ac'/>
<id>urn:sha1:f5eca0b279117f25020112a2f65ec9c3ea25f3ac</id>
<content type='text'>
The eeh-basic test got its own 60 seconds timeout (defined in commit
414f50434aa2 "selftests/eeh: Bump EEH wait time to 60s") per breakable
device.

And we have discovered that the number of breakable devices varies
on different hardware. The device recovery time ranges from 0 to 35
seconds. In our test pool it will take about 30 seconds to run on a
Power8 system that with 5 breakable devices, 60 seconds to run on a
Power9 system that with 4 breakable devices.

Extend the timeout setting in the kselftest framework to 5 minutes
to give it a chance to finish.

Signed-off-by: Po-Hsu Lin &lt;po-hsu.lin@canonical.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201023024539.9512-1-po-hsu.lin@canonical.com
</content>
</entry>
<entry>
<title>selftests/powerpc: Fix eeh-basic.sh exit codes</title>
<updated>2020-10-14T11:03:39+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-10-14T02:47:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=996f9e0f93f16211945c8d5f18f296a88cb32f91'/>
<id>urn:sha1:996f9e0f93f16211945c8d5f18f296a88cb32f91</id>
<content type='text'>
The kselftests test running infrastructure expects tests to finish with an
exit code of 4 if the test decided it should be skipped. Currently
eeh-basic.sh exits with the number of devices that failed to recover, so if
four devices didn't recover we'll report a skip instead of a fail.

Fix this by checking if the return code is non-zero and report success
and failure by returning 0 or 1 respectively. For the cases where should
actually skip return 4.

Fixes: 85d86c8aa52e ("selftests/powerpc: Add basic EEH selftest")
Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201014024711.1138386-1-oohall@gmail.com
</content>
</entry>
<entry>
<title>selftests/powerpc: Squash spurious errors due to device removal</title>
<updated>2020-07-29T11:02:11+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-07-27T01:01:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f8cf6475828b600ff6d000e580c961ac839cc61'/>
<id>urn:sha1:5f8cf6475828b600ff6d000e580c961ac839cc61</id>
<content type='text'>
For drivers that don't have the error handling callbacks we implement
recovery by removing the device and re-probing it. This causes the sysfs
directory for the PCI device to be removed which causes the following
spurious error to be printed when checking the PE state:

Breaking 0005:03:00.0...
./eeh-basic.sh: line 13: can't open /sys/bus/pci/devices/0005:03:00.0/eeh_pe_state: no such file
0005:03:00.0, waited 0/60
0005:03:00.0, waited 1/60
0005:03:00.0, waited 2/60
0005:03:00.0, waited 3/60
0005:03:00.0, waited 4/60
0005:03:00.0, waited 5/60
0005:03:00.0, waited 6/60
0005:03:00.0, waited 7/60
0005:03:00.0, Recovered after 8 seconds

We currently try to avoid this by checking if the PE state file exists
before reading from it. This is however inherently racy so re-work the
state checking so that we only read from the file once, and we squash any
errors that occur while reading.

Fixes: 85d86c8aa52e ("selftests/powerpc: Add basic EEH selftest")
Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200727010127.23698-1-oohall@gmail.com
</content>
</entry>
<entry>
<title>selftests/eeh: Skip ahci adapters</title>
<updated>2020-04-02T13:09:53+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2020-03-26T06:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bbe9064f30f06e7791d04f9f61a842226a6a44fe'/>
<id>urn:sha1:bbe9064f30f06e7791d04f9f61a842226a6a44fe</id>
<content type='text'>
The ahci driver doesn't support error recovery, and if your root
filesystem is attached to it the eeh-basic.sh test will likely kill
your machine.

So skip any device we see using the ahci driver.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200326061144.2006522-1-mpe@ellerman.id.au
</content>
</entry>
<entry>
<title>selftests/eeh: Bump EEH wait time to 60s</title>
<updated>2020-01-25T13:11:37+00:00</updated>
<author>
<name>Oliver O'Halloran</name>
<email>oohall@gmail.com</email>
</author>
<published>2020-01-22T03:11:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=414f50434aa2463202a5b35e844f4125dd1a7101'/>
<id>urn:sha1:414f50434aa2463202a5b35e844f4125dd1a7101</id>
<content type='text'>
Some newer cards supported by aacraid can take up to 40s to recover
after an EEH event. This causes spurious failures in the basic EEH
self-test since the current maximim timeout is only 30s.

Fix the immediate issue by bumping the timeout to a default of 60s,
and allow the wait time to be specified via an environmental variable
(EEH_MAX_WAIT).

Reported-by: Steve Best &lt;sbest@redhat.com&gt;
Suggested-by: Douglas Miller &lt;dougmill@us.ibm.com&gt;
Signed-off-by: Oliver O'Halloran &lt;oohall@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200122031125.25991-1-oohall@gmail.com
</content>
</entry>
</feed>
