summaryrefslogtreecommitdiff
path: root/Documentation/trace/hwlat_detector.rst
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2018-04-03 23:35:51 +0300
committerLinus Torvalds <torvalds@linux-foundation.org>2018-04-03 23:35:51 +0300
commitbb2407a7219760926760f0448fddf00d625e5aec (patch)
tree68d2b7a17f0bb837d04d658a56baf16dd261210f /Documentation/trace/hwlat_detector.rst
parente40dc66220b7ff1b816311b135b9298f8ba14ce6 (diff)
parent86afad7d87f535ebb1a0e978bc32a8c58ac99268 (diff)
downloadlinux-bb2407a7219760926760f0448fddf00d625e5aec.tar.xz
Merge tag 'docs-4.17' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet: "There's been a fair amount of activity in Documentation/ this time around: - Lots of work aligning Documentation/ABI with reality, done by Aishwarya Pant. - The trace documentation has been converted to RST by Changbin Du - I thrashed up kernel-doc to deal with a parsing issue and to try to make the code more readable. It's still a 20+-year-old Perl hack, though. - Lots of other updates, typo fixes, and more" * tag 'docs-4.17' of git://git.lwn.net/linux: (82 commits) Documentation/process: update FUSE project website docs: kernel-doc: fix parsing of arrays dmaengine: Fix spelling for parenthesis in dmatest documentation dmaengine: Make dmatest.rst indeed reST compatible dmaengine: Add note to dmatest documentation about supported channels Documentation: magic-numbers: Fix typo Documentation: admin-guide: add kvmconfig, xenconfig and tinyconfig commands Input: alps - Update documentation for trackstick v3 format Documentation: Mention why %p prints ptrval COPYING: use the new text with points to the license files COPYING: create a new file with points to the Kernel license files Input: trackpoint: document sysfs interface xfs: Change URL for the project in xfs.txt char/bsr: add sysfs interface documentation acpi: nfit: document sysfs interface block: rbd: update sysfs interface Documentation/sparse: fix typo Documentation/CodingStyle: Add an example for braces docs/vm: update 00-INDEX kernel-doc: Remove __sched markings ...
Diffstat (limited to 'Documentation/trace/hwlat_detector.rst')
-rw-r--r--Documentation/trace/hwlat_detector.rst83
1 files changed, 83 insertions, 0 deletions
diff --git a/Documentation/trace/hwlat_detector.rst b/Documentation/trace/hwlat_detector.rst
new file mode 100644
index 000000000000..5739349649c8
--- /dev/null
+++ b/Documentation/trace/hwlat_detector.rst
@@ -0,0 +1,83 @@
+=========================
+Hardware Latency Detector
+=========================
+
+Introduction
+-------------
+
+The tracer hwlat_detector is a special purpose tracer that is used to
+detect large system latencies induced by the behavior of certain underlying
+hardware or firmware, independent of Linux itself. The code was developed
+originally to detect SMIs (System Management Interrupts) on x86 systems,
+however there is nothing x86 specific about this patchset. It was
+originally written for use by the "RT" patch since the Real Time
+kernel is highly latency sensitive.
+
+SMIs are not serviced by the Linux kernel, which means that it does not
+even know that they are occuring. SMIs are instead set up by BIOS code
+and are serviced by BIOS code, usually for "critical" events such as
+management of thermal sensors and fans. Sometimes though, SMIs are used for
+other tasks and those tasks can spend an inordinate amount of time in the
+handler (sometimes measured in milliseconds). Obviously this is a problem if
+you are trying to keep event service latencies down in the microsecond range.
+
+The hardware latency detector works by hogging one of the cpus for configurable
+amounts of time (with interrupts disabled), polling the CPU Time Stamp Counter
+for some period, then looking for gaps in the TSC data. Any gap indicates a
+time when the polling was interrupted and since the interrupts are disabled,
+the only thing that could do that would be an SMI or other hardware hiccup
+(or an NMI, but those can be tracked).
+
+Note that the hwlat detector should *NEVER* be used in a production environment.
+It is intended to be run manually to determine if the hardware platform has a
+problem with long system firmware service routines.
+
+Usage
+------
+
+Write the ASCII text "hwlat" into the current_tracer file of the tracing system
+(mounted at /sys/kernel/tracing or /sys/kernel/tracing). It is possible to
+redefine the threshold in microseconds (us) above which latency spikes will
+be taken into account.
+
+Example::
+
+ # echo hwlat > /sys/kernel/tracing/current_tracer
+ # echo 100 > /sys/kernel/tracing/tracing_thresh
+
+The /sys/kernel/tracing/hwlat_detector interface contains the following files:
+
+ - width - time period to sample with CPUs held (usecs)
+ must be less than the total window size (enforced)
+ - window - total period of sampling, width being inside (usecs)
+
+By default the width is set to 500,000 and window to 1,000,000, meaning that
+for every 1,000,000 usecs (1s) the hwlat detector will spin for 500,000 usecs
+(0.5s). If tracing_thresh contains zero when hwlat tracer is enabled, it will
+change to a default of 10 usecs. If any latencies that exceed the threshold is
+observed then the data will be written to the tracing ring buffer.
+
+The minimum sleep time between periods is 1 millisecond. Even if width
+is less than 1 millisecond apart from window, to allow the system to not
+be totally starved.
+
+If tracing_thresh was zero when hwlat detector was started, it will be set
+back to zero if another tracer is loaded. Note, the last value in
+tracing_thresh that hwlat detector had will be saved and this value will
+be restored in tracing_thresh if it is still zero when hwlat detector is
+started again.
+
+The following tracing directory files are used by the hwlat_detector:
+
+in /sys/kernel/tracing:
+
+ - tracing_threshold - minimum latency value to be considered (usecs)
+ - tracing_max_latency - maximum hardware latency actually observed (usecs)
+ - tracing_cpumask - the CPUs to move the hwlat thread across
+ - hwlat_detector/width - specified amount of time to spin within window (usecs)
+ - hwlat_detector/window - amount of time between (width) runs (usecs)
+
+The hwlat detector's kernel thread will migrate across each CPU specified in
+tracing_cpumask between each window. To limit the migration, either modify
+tracing_cpumask, or modify the hwlat kernel thread (named [hwlatd]) CPU
+affinity directly, and the migration will stop.