summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-bus-mei7
-rw-r--r--Documentation/ABI/testing/sysfs-bus-usb6
-rw-r--r--Documentation/ABI/testing/sysfs-devices-system-cpu12
-rw-r--r--Documentation/DocBook/device-drivers.tmpl2
-rw-r--r--Documentation/devicetree/bindings/arm/atmel-adc.txt13
-rw-r--r--Documentation/devicetree/bindings/arm/msm/ssbi.txt18
-rw-r--r--Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt60
-rw-r--r--Documentation/devicetree/bindings/gpio/gpio.txt6
-rw-r--r--Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt29
-rw-r--r--Documentation/devicetree/bindings/iio/iio-bindings.txt97
-rw-r--r--Documentation/devicetree/bindings/mfd/mc13xxx.txt36
-rw-r--r--Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt109
-rw-r--r--Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt3
-rw-r--r--Documentation/devicetree/bindings/regulator/max8952.txt52
-rw-r--r--Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt22
-rw-r--r--Documentation/devicetree/bindings/spi/fsl-spi.txt3
-rw-r--r--Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt26
-rw-r--r--Documentation/devicetree/bindings/spi/spi-samsung.txt8
-rw-r--r--Documentation/devicetree/bindings/staging/dwc2.txt15
-rw-r--r--Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt2
-rw-r--r--Documentation/devicetree/bindings/tty/serial/of-serial.txt4
-rw-r--r--Documentation/devicetree/bindings/usb/ci13xxx-imx.txt2
-rw-r--r--Documentation/devicetree/bindings/usb/ehci-omap.txt32
-rw-r--r--Documentation/devicetree/bindings/usb/ohci-omap3.txt15
-rw-r--r--Documentation/devicetree/bindings/usb/omap-usb.txt40
-rw-r--r--Documentation/devicetree/bindings/usb/samsung-usbphy.txt76
-rw-r--r--Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt34
-rw-r--r--Documentation/devicetree/bindings/vendor-prefixes.txt1
-rw-r--r--Documentation/devicetree/bindings/video/via,vt8500-fb.txt48
-rw-r--r--Documentation/devicetree/bindings/video/wm,wm8505-fb.txt32
-rw-r--r--Documentation/hwmon/adt741047
-rw-r--r--Documentation/hwmon/lm2506634
-rw-r--r--Documentation/hwmon/lm752
-rw-r--r--Documentation/hwmon/lm9523436
-rw-r--r--Documentation/hwmon/ltc2978143
-rw-r--r--Documentation/hwmon/nct6775188
-rw-r--r--Documentation/hwmon/sht152
-rw-r--r--Documentation/hwmon/tmp40125
-rw-r--r--Documentation/hwmon/zl61002
-rw-r--r--Documentation/i2c/busses/i2c-diolan-u2c2
-rw-r--r--Documentation/ia64/err_inject.txt2
-rw-r--r--Documentation/ioctl/ioctl-number.txt1
-rw-r--r--Documentation/kdump/kdump.txt1
-rw-r--r--Documentation/kernel-parameters.txt36
-rw-r--r--Documentation/misc-devices/mei/mei-client-bus.txt138
-rw-r--r--Documentation/networking/ipvs-sysctl.txt7
-rw-r--r--Documentation/pinctrl.txt112
-rw-r--r--Documentation/s390/s390dbf.txt3
-rw-r--r--Documentation/scsi/LICENSE.qla2xxx2
-rw-r--r--Documentation/sound/alsa/ALSA-Configuration.txt7
-rw-r--r--Documentation/sound/alsa/seq_oss.html2
-rw-r--r--Documentation/trace/ftrace.txt2097
-rw-r--r--Documentation/usb/power-management.txt10
53 files changed, 2903 insertions, 806 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-mei b/Documentation/ABI/testing/sysfs-bus-mei
new file mode 100644
index 000000000000..2066f0bbd453
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-mei
@@ -0,0 +1,7 @@
+What: /sys/bus/mei/devices/.../modalias
+Date: March 2013
+KernelVersion: 3.10
+Contact: Samuel Ortiz <sameo@linux.intel.com>
+ linux-mei@linux.intel.com
+Description: Stores the same MODALIAS value emitted by uevent
+ Format: mei:<mei device name>
diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb
index c8baaf53594a..f093e59cbe5f 100644
--- a/Documentation/ABI/testing/sysfs-bus-usb
+++ b/Documentation/ABI/testing/sysfs-bus-usb
@@ -32,7 +32,7 @@ Date: January 2008
KernelVersion: 2.6.25
Contact: Sarah Sharp <sarah.a.sharp@intel.com>
Description:
- If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file
+ If CONFIG_PM_RUNTIME is enabled then this file
is present. When read, it returns the total time (in msec)
that the USB device has been connected to the machine. This
file is read-only.
@@ -45,7 +45,7 @@ Date: January 2008
KernelVersion: 2.6.25
Contact: Sarah Sharp <sarah.a.sharp@intel.com>
Description:
- If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file
+ If CONFIG_PM_RUNTIME is enabled then this file
is present. When read, it returns the total time (in msec)
that the USB device has been active, i.e. not in a suspended
state. This file is read-only.
@@ -187,7 +187,7 @@ What: /sys/bus/usb/devices/.../power/usb2_hardware_lpm
Date: September 2011
Contact: Andiry Xu <andiry.xu@amd.com>
Description:
- If CONFIG_USB_SUSPEND is set and a USB 2.0 lpm-capable device
+ If CONFIG_PM_RUNTIME is set and a USB 2.0 lpm-capable device
is plugged in to a xHCI host which support link PM, it will
perform a LPM test; if the test is passed and host supports
USB2 hardware LPM (xHCI 1.0 feature), USB2 hardware LPM will
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 9c978dcae07d..2447698aed41 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -173,3 +173,15 @@ Description: Processor frequency boosting control
Boosting allows the CPU and the firmware to run at a frequency
beyound it's nominal limit.
More details can be found in Documentation/cpu-freq/boost.txt
+
+
+What: /sys/devices/system/cpu/cpu#/crash_notes
+ /sys/devices/system/cpu/cpu#/crash_notes_size
+Date: April 2013
+Contact: kexec@lists.infradead.org
+Description: address and size of the percpu note.
+
+ crash_notes: the physical address of the memory that holds the
+ note of cpu#.
+
+ crash_notes_size: size of the note of cpu#.
diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 7514dbf0a679..c36892c072da 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -227,7 +227,7 @@ X!Isound/sound_firmware.c
<chapter id="uart16x50">
<title>16x50 UART Driver</title>
!Edrivers/tty/serial/serial_core.c
-!Edrivers/tty/serial/8250/8250.c
+!Edrivers/tty/serial/8250/8250_core.c
</chapter>
<chapter id="fbdev">
diff --git a/Documentation/devicetree/bindings/arm/atmel-adc.txt b/Documentation/devicetree/bindings/arm/atmel-adc.txt
index c63097d6afeb..16769d9cedd6 100644
--- a/Documentation/devicetree/bindings/arm/atmel-adc.txt
+++ b/Documentation/devicetree/bindings/arm/atmel-adc.txt
@@ -14,9 +14,19 @@ Required properties:
- atmel,adc-status-register: Offset of the Interrupt Status Register
- atmel,adc-trigger-register: Offset of the Trigger Register
- atmel,adc-vref: Reference voltage in millivolts for the conversions
+ - atmel,adc-res: List of resolution in bits supported by the ADC. List size
+ must be two at least.
+ - atmel,adc-res-names: Contains one identifier string for each resolution
+ in atmel,adc-res property. "lowres" and "highres"
+ identifiers are required.
Optional properties:
- atmel,adc-use-external: Boolean to enable of external triggers
+ - atmel,adc-use-res: String corresponding to an identifier from
+ atmel,adc-res-names property. If not specified, the highest
+ resolution will be used.
+ - atmel,adc-sleep-mode: Boolean to enable sleep mode when no conversion
+ - atmel,adc-sample-hold-time: Sample and Hold Time in microseconds
Optional trigger Nodes:
- Required properties:
@@ -40,6 +50,9 @@ adc0: adc@fffb0000 {
atmel,adc-trigger-register = <0x08>;
atmel,adc-use-external;
atmel,adc-vref = <3300>;
+ atmel,adc-res = <8 10>;
+ atmel,adc-res-names = "lowres", "highres";
+ atmel,adc-use-res = "lowres";
trigger@0 {
trigger-name = "external-rising";
diff --git a/Documentation/devicetree/bindings/arm/msm/ssbi.txt b/Documentation/devicetree/bindings/arm/msm/ssbi.txt
new file mode 100644
index 000000000000..54fd5ced3401
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/msm/ssbi.txt
@@ -0,0 +1,18 @@
+* Qualcomm SSBI
+
+Some Qualcomm MSM devices contain a point-to-point serial bus used to
+communicate with a limited range of devices (mostly power management
+chips).
+
+These require the following properties:
+
+- compatible: "qcom,ssbi"
+
+- qcom,controller-type
+ indicates the SSBI bus variant the controller should use to talk
+ with the slave device. This should be one of "ssbi", "ssbi2", or
+ "pmic-arbiter". The type chosen is determined by the attached
+ slave.
+
+The slave device should be the single child node of the ssbi device
+with a compatible field.
diff --git a/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt b/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt
new file mode 100644
index 000000000000..47ada1dff216
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt
@@ -0,0 +1,60 @@
+Samsung Exynos Analog to Digital Converter bindings
+
+The devicetree bindings are for the new ADC driver written for
+Exynos4 and upward SoCs from Samsung.
+
+New driver handles the following
+1. Supports ADC IF found on EXYNOS4412/EXYNOS5250
+ and future SoCs from Samsung
+2. Add ADC driver under iio/adc framework
+3. Also adds the Documentation for device tree bindings
+
+Required properties:
+- compatible: Must be "samsung,exynos-adc-v1"
+ for exynos4412/5250 controllers.
+ Must be "samsung,exynos-adc-v2" for
+ future controllers.
+- reg: Contains ADC register address range (base address and
+ length) and the address of the phy enable register.
+- interrupts: Contains the interrupt information for the timer. The
+ format is being dependent on which interrupt controller
+ the Samsung device uses.
+- #io-channel-cells = <1>; As ADC has multiple outputs
+- clocks From common clock binding: handle to adc clock.
+- clock-names From common clock binding: Shall be "adc".
+- vdd-supply VDD input supply.
+
+Note: child nodes can be added for auto probing from device tree.
+
+Example: adding device info in dtsi file
+
+adc: adc@12D10000 {
+ compatible = "samsung,exynos-adc-v1";
+ reg = <0x12D10000 0x100>, <0x10040718 0x4>;
+ interrupts = <0 106 0>;
+ #io-channel-cells = <1>;
+ io-channel-ranges;
+
+ clocks = <&clock 303>;
+ clock-names = "adc";
+
+ vdd-supply = <&buck5_reg>;
+};
+
+
+Example: Adding child nodes in dts file
+
+adc@12D10000 {
+
+ /* NTC thermistor is a hwmon device */
+ ncp15wb473@0 {
+ compatible = "ntc,ncp15wb473";
+ pullup-uV = <1800000>;
+ pullup-ohm = <47000>;
+ pulldown-ohm = <0>;
+ io-channels = <&adc 4>;
+ };
+};
+
+Note: Does not apply to ADC driver under arch/arm/plat-samsung/
+Note: The child node can be added under the adc node or separately.
diff --git a/Documentation/devicetree/bindings/gpio/gpio.txt b/Documentation/devicetree/bindings/gpio/gpio.txt
index a33628759d36..d933af370697 100644
--- a/Documentation/devicetree/bindings/gpio/gpio.txt
+++ b/Documentation/devicetree/bindings/gpio/gpio.txt
@@ -98,7 +98,7 @@ announce the pinrange to the pin ctrl subsystem. For example,
compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank";
reg = <0x1460 0x18>;
gpio-controller;
- gpio-ranges = <&pinctrl1 20 10>, <&pinctrl2 50 20>;
+ gpio-ranges = <&pinctrl1 0 20 10>, <&pinctrl2 10 50 20>;
}
@@ -107,8 +107,8 @@ where,
Next values specify the base pin and number of pins for the range
handled by 'qe_pio_e' gpio. In the given example from base pin 20 to
- pin 29 under pinctrl1 and pin 50 to pin 69 under pinctrl2 is handled
- by this gpio controller.
+ pin 29 under pinctrl1 with gpio offset 0 and pin 50 to pin 69 under
+ pinctrl2 with gpio offset 10 is handled by this gpio controller.
The pinctrl node must have "#gpio-range-cells" property to show number of
arguments to pass with phandle from gpio controllers node.
diff --git a/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt
new file mode 100644
index 000000000000..c6f66674f19c
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt
@@ -0,0 +1,29 @@
+NTC Thermistor hwmon sensors
+-------------------------------
+
+Requires node properties:
+- "compatible" value : one of
+ "ntc,ncp15wb473"
+ "ntc,ncp18wb473"
+ "ntc,ncp21wb473"
+ "ntc,ncp03wb473"
+ "ntc,ncp15wl333"
+- "pullup-uv" Pull up voltage in micro volts
+- "pullup-ohm" Pull up resistor value in ohms
+- "pulldown-ohm" Pull down resistor value in ohms
+- "connected-positive" Always ON, If not specified.
+ Status change is possible.
+- "io-channels" Channel node of ADC to be used for
+ conversion.
+
+Read more about iio bindings at
+ Documentation/devicetree/bindings/iio/iio-bindings.txt
+
+Example:
+ ncp15wb473@0 {
+ compatible = "ntc,ncp15wb473";
+ pullup-uv = <1800000>;
+ pullup-ohm = <47000>;
+ pulldown-ohm = <0>;
+ io-channels = <&adc 3>;
+ };
diff --git a/Documentation/devicetree/bindings/iio/iio-bindings.txt b/Documentation/devicetree/bindings/iio/iio-bindings.txt
new file mode 100644
index 000000000000..0b447d9ad196
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/iio-bindings.txt
@@ -0,0 +1,97 @@
+This binding is derived from clock bindings, and based on suggestions
+from Lars-Peter Clausen [1].
+
+Sources of IIO channels can be represented by any node in the device
+tree. Those nodes are designated as IIO providers. IIO consumer
+nodes use a phandle and IIO specifier pair to connect IIO provider
+outputs to IIO inputs. Similar to the gpio specifiers, an IIO
+specifier is an array of one or more cells identifying the IIO
+output on a device. The length of an IIO specifier is defined by the
+value of a #io-channel-cells property in the IIO provider node.
+
+[1] http://marc.info/?l=linux-iio&m=135902119507483&w=2
+
+==IIO providers==
+
+Required properties:
+#io-channel-cells: Number of cells in an IIO specifier; Typically 0 for nodes
+ with a single IIO output and 1 for nodes with multiple
+ IIO outputs.
+
+Example for a simple configuration with no trigger:
+
+ adc: voltage-sensor@35 {
+ compatible = "maxim,max1139";
+ reg = <0x35>;
+ #io-channel-cells = <1>;
+ };
+
+Example for a configuration with trigger:
+
+ adc@35 {
+ compatible = "some-vendor,some-adc";
+ reg = <0x35>;
+
+ adc1: iio-device@0 {
+ #io-channel-cells = <1>;
+ /* other properties */
+ };
+ adc2: iio-device@1 {
+ #io-channel-cells = <1>;
+ /* other properties */
+ };
+ };
+
+==IIO consumers==
+
+Required properties:
+io-channels: List of phandle and IIO specifier pairs, one pair
+ for each IIO input to the device. Note: if the
+ IIO provider specifies '0' for #io-channel-cells,
+ then only the phandle portion of the pair will appear.
+
+Optional properties:
+io-channel-names:
+ List of IIO input name strings sorted in the same
+ order as the io-channels property. Consumers drivers
+ will use io-channel-names to match IIO input names
+ with IIO specifiers.
+io-channel-ranges:
+ Empty property indicating that child nodes can inherit named
+ IIO channels from this node. Useful for bus nodes to provide
+ and IIO channel to their children.
+
+For example:
+
+ device {
+ io-channels = <&adc 1>, <&ref 0>;
+ io-channel-names = "vcc", "vdd";
+ };
+
+This represents a device with two IIO inputs, named "vcc" and "vdd".
+The vcc channel is connected to output 1 of the &adc device, and the
+vdd channel is connected to output 0 of the &ref device.
+
+==Example==
+
+ adc: max1139@35 {
+ compatible = "maxim,max1139";
+ reg = <0x35>;
+ #io-channel-cells = <1>;
+ };
+
+ ...
+
+ iio_hwmon {
+ compatible = "iio-hwmon";
+ io-channels = <&adc 0>, <&adc 1>, <&adc 2>,
+ <&adc 3>, <&adc 4>, <&adc 5>,
+ <&adc 6>, <&adc 7>, <&adc 8>,
+ <&adc 9>;
+ };
+
+ some_consumer {
+ compatible = "some-consumer";
+ io-channels = <&adc 10>, <&adc 11>;
+ io-channel-names = "adc1", "adc2";
+ };
diff --git a/Documentation/devicetree/bindings/mfd/mc13xxx.txt b/Documentation/devicetree/bindings/mfd/mc13xxx.txt
index baf07987ae68..abd9e3cb2db7 100644
--- a/Documentation/devicetree/bindings/mfd/mc13xxx.txt
+++ b/Documentation/devicetree/bindings/mfd/mc13xxx.txt
@@ -10,10 +10,40 @@ Optional properties:
- fsl,mc13xxx-uses-touch : Indicate the touchscreen controller is being used
Sub-nodes:
-- regulators : Contain the regulator nodes. The MC13892 regulators are
- bound using their names as listed below with their registers and bits
- for enabling.
+- regulators : Contain the regulator nodes. The regulators are bound using
+ their names as listed below with their registers and bits for enabling.
+MC13783 regulators:
+ sw1a : regulator SW1A (register 24, bit 0)
+ sw1b : regulator SW1B (register 25, bit 0)
+ sw2a : regulator SW2A (register 26, bit 0)
+ sw2b : regulator SW2B (register 27, bit 0)
+ sw3 : regulator SW3 (register 29, bit 20)
+ vaudio : regulator VAUDIO (register 32, bit 0)
+ viohi : regulator VIOHI (register 32, bit 3)
+ violo : regulator VIOLO (register 32, bit 6)
+ vdig : regulator VDIG (register 32, bit 9)
+ vgen : regulator VGEN (register 32, bit 12)
+ vrfdig : regulator VRFDIG (register 32, bit 15)
+ vrfref : regulator VRFREF (register 32, bit 18)
+ vrfcp : regulator VRFCP (register 32, bit 21)
+ vsim : regulator VSIM (register 33, bit 0)
+ vesim : regulator VESIM (register 33, bit 3)
+ vcam : regulator VCAM (register 33, bit 6)
+ vrfbg : regulator VRFBG (register 33, bit 9)
+ vvib : regulator VVIB (register 33, bit 11)
+ vrf1 : regulator VRF1 (register 33, bit 12)
+ vrf2 : regulator VRF2 (register 33, bit 15)
+ vmmc1 : regulator VMMC1 (register 33, bit 18)
+ vmmc2 : regulator VMMC2 (register 33, bit 21)
+ gpo1 : regulator GPO1 (register 34, bit 6)
+ gpo2 : regulator GPO2 (register 34, bit 8)
+ gpo3 : regulator GPO3 (register 34, bit 10)
+ gpo4 : regulator GPO4 (register 34, bit 12)
+ pwgt1spi : regulator PWGT1SPI (register 34, bit 15)
+ pwgt2spi : regulator PWGT2SPI (register 34, bit 16)
+
+MC13892 regulators:
vcoincell : regulator VCOINCELL (register 13, bit 23)
sw1 : regulator SW1 (register 24, bit 0)
sw2 : regulator SW2 (register 25, bit 0)
diff --git a/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt b/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt
index 2c81e45f1374..08f0c3d01575 100644
--- a/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt
+++ b/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt
@@ -1,7 +1,9 @@
One-register-per-pin type device tree based pinctrl driver
Required properties:
-- compatible : "pinctrl-single"
+- compatible : "pinctrl-single" or "pinconf-single".
+ "pinctrl-single" means that pinconf isn't supported.
+ "pinconf-single" means that generic pinconf is supported.
- reg : offset and length of the register set for the mux registers
@@ -14,9 +16,61 @@ Optional properties:
- pinctrl-single,function-off : function off mode for disabled state if
available and same for all registers; if not specified, disabling of
pin functions is ignored
+
- pinctrl-single,bit-per-mux : boolean to indicate that one register controls
more than one pin
+- pinctrl-single,drive-strength : array of value that are used to configure
+ drive strength in the pinmux register. They're value of drive strength
+ current and drive strength mask.
+
+ /* drive strength current, mask */
+ pinctrl-single,power-source = <0x30 0xf0>;
+
+- pinctrl-single,bias-pullup : array of value that are used to configure the
+ input bias pullup in the pinmux register.
+
+ /* input, enabled pullup bits, disabled pullup bits, mask */
+ pinctrl-single,bias-pullup = <0 1 0 1>;
+
+- pinctrl-single,bias-pulldown : array of value that are used to configure the
+ input bias pulldown in the pinmux register.
+
+ /* input, enabled pulldown bits, disabled pulldown bits, mask */
+ pinctrl-single,bias-pulldown = <2 2 0 2>;
+
+ * Two bits to control input bias pullup and pulldown: User should use
+ pinctrl-single,bias-pullup & pinctrl-single,bias-pulldown. One bit means
+ pullup, and the other one bit means pulldown.
+ * Three bits to control input bias enable, pullup and pulldown. User should
+ use pinctrl-single,bias-pullup & pinctrl-single,bias-pulldown. Input bias
+ enable bit should be included in pullup or pulldown bits.
+ * Although driver could set PIN_CONFIG_BIAS_DISABLE, there's no property as
+ pinctrl-single,bias-disable. Because pinctrl single driver could implement
+ it by calling pulldown, pullup disabled.
+
+- pinctrl-single,input-schmitt : array of value that are used to configure
+ input schmitt in the pinmux register. In some silicons, there're two input
+ schmitt value (rising-edge & falling-edge) in the pinmux register.
+
+ /* input schmitt value, mask */
+ pinctrl-single,input-schmitt = <0x30 0x70>;
+
+- pinctrl-single,input-schmitt-enable : array of value that are used to
+ configure input schmitt enable or disable in the pinmux register.
+
+ /* input, enable bits, disable bits, mask */
+ pinctrl-single,input-schmitt-enable = <0x30 0x40 0 0x70>;
+
+- pinctrl-single,gpio-range : list of value that are used to configure a GPIO
+ range. They're value of subnode phandle, pin base in pinctrl device, pin
+ number in this range, GPIO function value of this GPIO range.
+ The number of parameters is depend on #pinctrl-single,gpio-range-cells
+ property.
+
+ /* pin base, nr pins & gpio function */
+ pinctrl-single,gpio-range = <&range 0 3 0 &range 3 9 1>;
+
This driver assumes that there is only one register for each pin (unless the
pinctrl-single,bit-per-mux is set), and uses the common pinctrl bindings as
specified in the pinctrl-bindings.txt document in this directory.
@@ -42,6 +96,20 @@ Where 0xdc is the offset from the pinctrl register base address for the
device pinctrl register, 0x18 is the desired value, and 0xff is the sub mask to
be used when applying this change to the register.
+
+Optional sub-node: In case some pins could be configured as GPIO in the pinmux
+register, those pins could be defined as a GPIO range. This sub-node is required
+by pinctrl-single,gpio-range property.
+
+Required properties in sub-node:
+- #pinctrl-single,gpio-range-cells : the number of parameters after phandle in
+ pinctrl-single,gpio-range property.
+
+ range: gpio-range {
+ #pinctrl-single,gpio-range-cells = <3>;
+ };
+
+
Example:
/* SoC common file */
@@ -58,7 +126,7 @@ pmx_core: pinmux@4a100040 {
/* second controller instance for pins in wkup domain */
pmx_wkup: pinmux@4a31e040 {
- compatible = "pinctrl-single;
+ compatible = "pinctrl-single";
reg = <0x4a31e040 0x0038>;
#address-cells = <1>;
#size-cells = <0>;
@@ -76,6 +144,29 @@ control_devconf0: pinmux@48002274 {
pinctrl-single,function-mask = <0x5F>;
};
+/* third controller instance for pins in gpio domain */
+pmx_gpio: pinmux@d401e000 {
+ compatible = "pinconf-single";
+ reg = <0xd401e000 0x0330>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
+
+ pinctrl-single,register-width = <32>;
+ pinctrl-single,function-mask = <7>;
+
+ /* sparse GPIO range could be supported */
+ pinctrl-single,gpio-range = <&range 0 3 0 &range 3 9 1
+ &range 12 1 0 &range 13 29 1
+ &range 43 1 0 &range 44 49 1
+ &range 94 1 1 &range 96 2 1>;
+
+ range: gpio-range {
+ #pinctrl-single,gpio-range-cells = <3>;
+ };
+};
+
+
/* board specific .dts file */
&pmx_core {
@@ -96,6 +187,15 @@ control_devconf0: pinmux@48002274 {
>;
};
+ uart0_pins: pinmux_uart0_pins {
+ pinctrl-single,pins = <
+ 0x208 0 /* UART0_RXD (IOCFG138) */
+ 0x20c 0 /* UART0_TXD (IOCFG139) */
+ >;
+ pinctrl-single,bias-pulldown = <0 2 2>;
+ pinctrl-single,bias-pullup = <0 1 1>;
+ };
+
/* map uart2 pins */
uart2_pins: pinmux_uart2_pins {
pinctrl-single,pins = <
@@ -122,6 +222,11 @@ control_devconf0: pinmux@48002274 {
};
+&uart1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart0_pins>;
+};
+
&uart2 {
pinctrl-names = "default";
pinctrl-0 = <&uart2_pins>;
diff --git a/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt
index 4598a47aa0cd..c70fca146e91 100644
--- a/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt
+++ b/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt
@@ -7,6 +7,7 @@ on-chip controllers onto these pads.
Required Properties:
- compatible: should be one of the following.
+ - "samsung,s3c64xx-pinctrl": for S3C64xx-compatible pin-controller,
- "samsung,exynos4210-pinctrl": for Exynos4210 compatible pin-controller.
- "samsung,exynos4x12-pinctrl": for Exynos4x12 compatible pin-controller.
- "samsung,exynos5250-pinctrl": for Exynos5250 compatible pin-controller.
@@ -105,6 +106,8 @@ B. External Wakeup Interrupts: For supporting external wakeup interrupts, a
- compatible: identifies the type of the external wakeup interrupt controller
The possible values are:
+ - samsung,s3c64xx-wakeup-eint: represents wakeup interrupt controller
+ found on Samsung S3C64xx SoCs,
- samsung,exynos4210-wakeup-eint: represents wakeup interrupt controller
found on Samsung Exynos4210 SoC.
- interrupt-parent: phandle of the interrupt parent to which the external
diff --git a/Documentation/devicetree/bindings/regulator/max8952.txt b/Documentation/devicetree/bindings/regulator/max8952.txt
new file mode 100644
index 000000000000..866fcdd0f4eb
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/max8952.txt
@@ -0,0 +1,52 @@
+Maxim MAX8952 voltage regulator
+
+Required properties:
+- compatible: must be equal to "maxim,max8952"
+- reg: I2C slave address, usually 0x60
+- max8952,dvs-mode-microvolt: array of 4 integer values defining DVS voltages
+ in microvolts. All values must be from range <770000, 1400000>
+- any required generic properties defined in regulator.txt
+
+Optional properties:
+- max8952,vid-gpios: array of two GPIO pins used for DVS voltage selection
+- max8952,en-gpio: GPIO used to control enable status of regulator
+- max8952,default-mode: index of default DVS voltage, from <0, 3> range
+- max8952,sync-freq: sync frequency, must be one of following values:
+ - 0: 26 MHz
+ - 1: 13 MHz
+ - 2: 19.2 MHz
+ Defaults to 26 MHz if not specified.
+- max8952,ramp-speed: voltage ramp speed, must be one of following values:
+ - 0: 32mV/us
+ - 1: 16mV/us
+ - 2: 8mV/us
+ - 3: 4mV/us
+ - 4: 2mV/us
+ - 5: 1mV/us
+ - 6: 0.5mV/us
+ - 7: 0.25mV/us
+ Defaults to 32mV/us if not specified.
+- any available generic properties defined in regulator.txt
+
+Example:
+
+ vdd_arm_reg: pmic@60 {
+ compatible = "maxim,max8952";
+ reg = <0x60>;
+
+ /* max8952-specific properties */
+ max8952,vid-gpios = <&gpx0 3 0>, <&gpx0 4 0>;
+ max8952,en-gpio = <&gpx0 1 0>;
+ max8952,default-mode = <0>;
+ max8952,dvs-mode-microvolt = <1250000>, <1200000>,
+ <1050000>, <950000>;
+ max8952,sync-freq = <0>;
+ max8952,ramp-speed = <0>;
+
+ /* generic regulator properties */
+ regulator-name = "vdd_arm";
+ regulator-min-microvolt = <770000>;
+ regulator-max-microvolt = <1400000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
diff --git a/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt b/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt
new file mode 100644
index 000000000000..8bf89c643640
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt
@@ -0,0 +1,22 @@
+Broadcom BCM2835 SPI0 controller
+
+The BCM2835 contains two forms of SPI master controller, one known simply as
+SPI0, and the other known as the "Universal SPI Master"; part of the
+auxilliary block. This binding applies to the SPI0 controller.
+
+Required properties:
+- compatible: Should be "brcm,bcm2835-spi".
+- reg: Should contain register location and length.
+- interrupts: Should contain interrupt.
+- clocks: The clock feeding the SPI controller.
+
+Example:
+
+spi@20204000 {
+ compatible = "brcm,bcm2835-spi";
+ reg = <0x7e204000 0x1000>;
+ interrupts = <2 22>;
+ clocks = <&clk_spi>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+};
diff --git a/Documentation/devicetree/bindings/spi/fsl-spi.txt b/Documentation/devicetree/bindings/spi/fsl-spi.txt
index 777abd7399d5..b032dd76e9d2 100644
--- a/Documentation/devicetree/bindings/spi/fsl-spi.txt
+++ b/Documentation/devicetree/bindings/spi/fsl-spi.txt
@@ -4,7 +4,7 @@ Required properties:
- cell-index : QE SPI subblock index.
0: QE subblock SPI1
1: QE subblock SPI2
-- compatible : should be "fsl,spi".
+- compatible : should be "fsl,spi" or "aeroflexgaisler,spictrl".
- mode : the SPI operation mode, it can be "cpu" or "cpu-qe".
- reg : Offset and length of the register set for the device
- interrupts : <a b> where a is the interrupt number and b is a
@@ -14,6 +14,7 @@ Required properties:
controller you have.
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
+- clock-frequency : input clock frequency to non FSL_SOC cores
Optional properties:
- gpios : specifies the gpio pins to be used for chipselects.
diff --git a/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt
new file mode 100644
index 000000000000..91ff771c7e77
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt
@@ -0,0 +1,26 @@
+NVIDIA Tegra114 SPI controller.
+
+Required properties:
+- compatible : should be "nvidia,tegra114-spi".
+- reg: Should contain SPI registers location and length.
+- interrupts: Should contain SPI interrupts.
+- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
+ request selector for this SPI controller.
+- This is also require clock named "spi" as per binding document
+ Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+Recommended properties:
+- spi-max-frequency: Definition as per
+ Documentation/devicetree/bindings/spi/spi-bus.txt
+Example:
+
+spi@7000d600 {
+ compatible = "nvidia,tegra114-spi";
+ reg = <0x7000d600 0x200>;
+ interrupts = <0 82 0x04>;
+ nvidia,dma-request-selector = <&apbdma 16>;
+ spi-max-frequency = <25000000>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ status = "disabled";
+};
diff --git a/Documentation/devicetree/bindings/spi/spi-samsung.txt b/Documentation/devicetree/bindings/spi/spi-samsung.txt
index a15ffeddfba4..86aa061f069f 100644
--- a/Documentation/devicetree/bindings/spi/spi-samsung.txt
+++ b/Documentation/devicetree/bindings/spi/spi-samsung.txt
@@ -31,9 +31,6 @@ Required Board Specific Properties:
- #address-cells: should be 1.
- #size-cells: should be 0.
-- gpios: The gpio specifier for clock, mosi and miso interface lines (in the
- order specified). The format of the gpio specifier depends on the gpio
- controller.
Optional Board Specific Properties:
@@ -86,9 +83,8 @@ Example:
spi_0: spi@12d20000 {
#address-cells = <1>;
#size-cells = <0>;
- gpios = <&gpa2 4 2 3 0>,
- <&gpa2 6 2 3 0>,
- <&gpa2 7 2 3 0>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&spi0_bus>;
w25q80bw@0 {
#address-cells = <1>;
diff --git a/Documentation/devicetree/bindings/staging/dwc2.txt b/Documentation/devicetree/bindings/staging/dwc2.txt
new file mode 100644
index 000000000000..1a1b7cfa4845
--- /dev/null
+++ b/Documentation/devicetree/bindings/staging/dwc2.txt
@@ -0,0 +1,15 @@
+Platform DesignWare HS OTG USB 2.0 controller
+-----------------------------------------------------
+
+Required properties:
+- compatible : "snps,dwc2"
+- reg : Should contain 1 register range (address and length)
+- interrupts : Should contain 1 interrupt
+
+Example:
+
+ usb@101c0000 {
+ compatible = "ralink,rt3050-usb, snps,dwc2";
+ reg = <0x101c0000 40000>;
+ interrupts = <18>;
+ };
diff --git a/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt b/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt
index 07654f0338b6..8071ac20d4b3 100644
--- a/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt
+++ b/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt
@@ -26,7 +26,7 @@ Required properties:
- crtc: the crtc this display is connected to, see below
Optional properties:
- interface_pix_fmt: How this display is connected to the
- crtc. Currently supported types: "rgb24", "rgb565"
+ crtc. Currently supported types: "rgb24", "rgb565", "bgr666"
- edid: verbatim EDID data block describing attached display.
- ddc: phandle describing the i2c bus handling the display data
channel
diff --git a/Documentation/devicetree/bindings/tty/serial/of-serial.txt b/Documentation/devicetree/bindings/tty/serial/of-serial.txt
index 8f01cb190f25..1928a3e83cd0 100644
--- a/Documentation/devicetree/bindings/tty/serial/of-serial.txt
+++ b/Documentation/devicetree/bindings/tty/serial/of-serial.txt
@@ -33,6 +33,10 @@ Optional properties:
RTAS and should not be registered.
- no-loopback-test: set to indicate that the port does not implements loopback
test mode
+- fifo-size: the fifo size of the UART.
+- auto-flow-control: one way to enable automatic flow control support. The
+ driver is allowed to detect support for the capability even without this
+ property.
Example:
diff --git a/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt b/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt
index 5778b9c83bd8..1c04a4c9515f 100644
--- a/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt
+++ b/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt
@@ -11,6 +11,7 @@ Optional properties:
that indicate usb controller index
- vbus-supply: regulator for vbus
- disable-over-current: disable over current detect
+- external-vbus-divider: enables off-chip resistor divider for Vbus
Examples:
usb@02184000 { /* USB OTG */
@@ -20,4 +21,5 @@ usb@02184000 { /* USB OTG */
fsl,usbphy = <&usbphy1>;
fsl,usbmisc = <&usbmisc 0>;
disable-over-current;
+ external-vbus-divider;
};
diff --git a/Documentation/devicetree/bindings/usb/ehci-omap.txt b/Documentation/devicetree/bindings/usb/ehci-omap.txt
new file mode 100644
index 000000000000..485a9a1efa7a
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/ehci-omap.txt
@@ -0,0 +1,32 @@
+OMAP HS USB EHCI controller
+
+This device is usually the child of the omap-usb-host
+Documentation/devicetree/bindings/mfd/omap-usb-host.txt
+
+Required properties:
+
+- compatible: should be "ti,ehci-omap"
+- reg: should contain one register range i.e. start and length
+- interrupts: description of the interrupt line
+
+Optional properties:
+
+- phys: list of phandles to PHY nodes.
+ This property is required if at least one of the ports are in
+ PHY mode i.e. OMAP_EHCI_PORT_MODE_PHY
+
+To specify the port mode, see
+Documentation/devicetree/bindings/mfd/omap-usb-host.txt
+
+Example for OMAP4:
+
+usbhsehci: ehci@4a064c00 {
+ compatible = "ti,ehci-omap", "usb-ehci";
+ reg = <0x4a064c00 0x400>;
+ interrupts = <0 77 0x4>;
+};
+
+&usbhsehci {
+ phys = <&hsusb1_phy 0 &hsusb3_phy>;
+};
+
diff --git a/Documentation/devicetree/bindings/usb/ohci-omap3.txt b/Documentation/devicetree/bindings/usb/ohci-omap3.txt
new file mode 100644
index 000000000000..14ab42812a8e
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/ohci-omap3.txt
@@ -0,0 +1,15 @@
+OMAP HS USB OHCI controller (OMAP3 and later)
+
+Required properties:
+
+- compatible: should be "ti,ohci-omap3"
+- reg: should contain one register range i.e. start and length
+- interrupts: description of the interrupt line
+
+Example for OMAP4:
+
+usbhsohci: ohci@4a064800 {
+ compatible = "ti,ohci-omap3", "usb-ohci";
+ reg = <0x4a064800 0x400>;
+ interrupts = <0 76 0x4>;
+};
diff --git a/Documentation/devicetree/bindings/usb/omap-usb.txt b/Documentation/devicetree/bindings/usb/omap-usb.txt
index 1ef0ce71f8fa..662f0f1d2315 100644
--- a/Documentation/devicetree/bindings/usb/omap-usb.txt
+++ b/Documentation/devicetree/bindings/usb/omap-usb.txt
@@ -8,10 +8,10 @@ OMAP MUSB GLUE
and disconnect.
- multipoint : Should be "1" indicating the musb controller supports
multipoint. This is a MUSB configuration-specific setting.
- - num_eps : Specifies the number of endpoints. This is also a
+ - num-eps : Specifies the number of endpoints. This is also a
MUSB configuration-specific setting. Should be set to "16"
- - ram_bits : Specifies the ram address size. Should be set to "12"
- - interface_type : This is a board specific setting to describe the type of
+ - ram-bits : Specifies the ram address size. Should be set to "12"
+ - interface-type : This is a board specific setting to describe the type of
interface between the controller and the phy. It should be "0" or "1"
specifying ULPI and UTMI respectively.
- mode : Should be "3" to represent OTG. "1" signifies HOST and "2"
@@ -29,18 +29,46 @@ usb_otg_hs: usb_otg_hs@4a0ab000 {
ti,hwmods = "usb_otg_hs";
ti,has-mailbox;
multipoint = <1>;
- num_eps = <16>;
- ram_bits = <12>;
+ num-eps = <16>;
+ ram-bits = <12>;
ctrl-module = <&omap_control_usb>;
};
Board specific device node entry
&usb_otg_hs {
- interface_type = <1>;
+ interface-type = <1>;
mode = <3>;
power = <50>;
};
+OMAP DWC3 GLUE
+ - compatible : Should be "ti,dwc3"
+ - ti,hwmods : Should be "usb_otg_ss"
+ - reg : Address and length of the register set for the device.
+ - interrupts : The irq number of this device that is used to interrupt the
+ MPU
+ - #address-cells, #size-cells : Must be present if the device has sub-nodes
+ - utmi-mode : controls the source of UTMI/PIPE status for VBUS and OTG ID.
+ It should be set to "1" for HW mode and "2" for SW mode.
+ - ranges: the child address space are mapped 1:1 onto the parent address space
+
+Sub-nodes:
+The dwc3 core should be added as subnode to omap dwc3 glue.
+- dwc3 :
+ The binding details of dwc3 can be found in:
+ Documentation/devicetree/bindings/usb/dwc3.txt
+
+omap_dwc3 {
+ compatible = "ti,dwc3";
+ ti,hwmods = "usb_otg_ss";
+ reg = <0x4a020000 0x1ff>;
+ interrupts = <0 93 4>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ utmi-mode = <2>;
+ ranges;
+};
+
OMAP CONTROL USB
Required properties:
diff --git a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
index 033194934f64..f575302e5173 100644
--- a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
+++ b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
@@ -1,20 +1,25 @@
-* Samsung's usb phy transceiver
+SAMSUNG USB-PHY controllers
-The Samsung's phy transceiver is used for controlling usb phy for
-s3c-hsotg as well as ehci-s5p and ohci-exynos usb controllers
-across Samsung SOCs.
+** Samsung's usb 2.0 phy transceiver
+
+The Samsung's usb 2.0 phy transceiver is used for controlling
+usb 2.0 phy for s3c-hsotg as well as ehci-s5p and ohci-exynos
+usb controllers across Samsung SOCs.
TODO: Adding the PHY binding with controller(s) according to the under
developement generic PHY driver.
Required properties:
Exynos4210:
-- compatible : should be "samsung,exynos4210-usbphy"
+- compatible : should be "samsung,exynos4210-usb2phy"
- reg : base physical address of the phy registers and length of memory mapped
region.
+- clocks: Clock IDs array as required by the controller.
+- clock-names: names of clock correseponding IDs clock property as requested
+ by the controller driver.
Exynos5250:
-- compatible : should be "samsung,exynos5250-usbphy"
+- compatible : should be "samsung,exynos5250-usb2phy"
- reg : base physical address of the phy registers and length of memory mapped
region.
@@ -44,12 +49,69 @@ Example:
usbphy@125B0000 {
#address-cells = <1>;
#size-cells = <1>;
- compatible = "samsung,exynos4210-usbphy";
+ compatible = "samsung,exynos4210-usb2phy";
reg = <0x125B0000 0x100>;
ranges;
+ clocks = <&clock 2>, <&clock 305>;
+ clock-names = "xusbxti", "otg";
+
usbphy-sys {
/* USB device and host PHY_CONTROL registers */
reg = <0x10020704 0x8>;
};
};
+
+
+** Samsung's usb 3.0 phy transceiver
+
+Starting exynso5250, Samsung's SoC have usb 3.0 phy transceiver
+which is used for controlling usb 3.0 phy for dwc3-exynos usb 3.0
+controllers across Samsung SOCs.
+
+Required properties:
+
+Exynos5250:
+- compatible : should be "samsung,exynos5250-usb3phy"
+- reg : base physical address of the phy registers and length of memory mapped
+ region.
+- clocks: Clock IDs array as required by the controller.
+- clock-names: names of clocks correseponding to IDs in the clock property
+ as requested by the controller driver.
+
+Optional properties:
+- #address-cells: should be '1' when usbphy node has a child node with 'reg'
+ property.
+- #size-cells: should be '1' when usbphy node has a child node with 'reg'
+ property.
+- ranges: allows valid translation between child's address space and parent's
+ address space.
+
+- The child node 'usbphy-sys' to the node 'usbphy' is for the system controller
+ interface for usb-phy. It should provide the following information required by
+ usb-phy controller to control phy.
+ - reg : base physical address of PHY_CONTROL registers.
+ The size of this register is the total sum of size of all PHY_CONTROL
+ registers that the SoC has. For example, the size will be
+ '0x4' in case we have only one PHY_CONTROL register (e.g.
+ OTHERS register in S3C64XX or USB_PHY_CONTROL register in S5PV210)
+ and, '0x8' in case we have two PHY_CONTROL registers (e.g.
+ USBDEVICE_PHY_CONTROL and USBHOST_PHY_CONTROL registers in exynos4x).
+ and so on.
+
+Example:
+ usbphy@12100000 {
+ compatible = "samsung,exynos5250-usb3phy";
+ reg = <0x12100000 0x100>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
+
+ clocks = <&clock 1>, <&clock 286>;
+ clock-names = "ext_xtal", "usbdrd30";
+
+ usbphy-sys {
+ /* USB device and host PHY_CONTROL registers */
+ reg = <0x10040704 0x8>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt b/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt
new file mode 100644
index 000000000000..d7e272671c7e
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt
@@ -0,0 +1,34 @@
+USB NOP PHY
+
+Required properties:
+- compatible: should be usb-nop-xceiv
+
+Optional properties:
+- clocks: phandle to the PHY clock. Use as per Documentation/devicetree
+ /bindings/clock/clock-bindings.txt
+ This property is required if clock-frequency is specified.
+
+- clock-names: Should be "main_clk"
+
+- clock-frequency: the clock frequency (in Hz) that the PHY clock must
+ be configured to.
+
+- vcc-supply: phandle to the regulator that provides RESET to the PHY.
+
+- reset-supply: phandle to the regulator that provides power to the PHY.
+
+Example:
+
+ hsusb1_phy {
+ compatible = "usb-nop-xceiv";
+ clock-frequency = <19200000>;
+ clocks = <&osc 0>;
+ clock-names = "main_clk";
+ vcc-supply = <&hsusb1_vcc_regulator>;
+ reset-supply = <&hsusb1_reset_regulator>;
+ };
+
+hsusb1_phy is a NOP USB PHY device that gets its clock from an oscillator
+and expects that clock to be configured to 19.2MHz by the NOP PHY driver.
+hsusb1_vcc_regulator provides power to the PHY and hsusb1_reset_regulator
+controls RESET.
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index ca608496507b..4d1919bf2332 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -5,6 +5,7 @@ using them to avoid name-space collisions.
ad Avionic Design GmbH
adi Analog Devices, Inc.
+aeroflexgaisler Aeroflex Gaisler AB
ak Asahi Kasei Corp.
amcc Applied Micro Circuits Corporation (APM, formally AMCC)
apm Applied Micro Circuits Corporation (APM)
diff --git a/Documentation/devicetree/bindings/video/via,vt8500-fb.txt b/Documentation/devicetree/bindings/video/via,vt8500-fb.txt
index c870b6478ec8..2871e218a0fb 100644
--- a/Documentation/devicetree/bindings/video/via,vt8500-fb.txt
+++ b/Documentation/devicetree/bindings/video/via,vt8500-fb.txt
@@ -5,58 +5,32 @@ Required properties:
- compatible : "via,vt8500-fb"
- reg : Should contain 1 register ranges(address and length)
- interrupts : framebuffer controller interrupt
-- display: a phandle pointing to the display node
+- bits-per-pixel : bit depth of framebuffer (16 or 32)
-Required nodes:
-- display: a display node is required to initialize the lcd panel
- This should be in the board dts.
-- default-mode: a videomode within the display with timing parameters
- as specified below.
+Required subnodes:
+- display-timings: see display-timing.txt for information
Example:
- fb@d800e400 {
+ fb@d8050800 {
compatible = "via,vt8500-fb";
reg = <0xd800e400 0x400>;
interrupts = <12>;
- display = <&display>;
- default-mode = <&mode0>;
- };
-
-VIA VT8500 Display
------------------------------------------------------
-Required properties (as per of_videomode_helper):
-
- - hactive, vactive: Display resolution
- - hfront-porch, hback-porch, hsync-len: Horizontal Display timing parameters
- in pixels
- vfront-porch, vback-porch, vsync-len: Vertical display timing parameters in
- lines
- - clock: displayclock in Hz
- - bpp: lcd panel bit-depth.
- <16> for RGB565, <32> for RGB888
-
-Optional properties (as per of_videomode_helper):
- - width-mm, height-mm: Display dimensions in mm
- - hsync-active-high (bool): Hsync pulse is active high
- - vsync-active-high (bool): Vsync pulse is active high
- - interlaced (bool): This is an interlaced mode
- - doublescan (bool): This is a doublescan mode
+ bits-per-pixel = <16>;
-Example:
- display: display@0 {
- modes {
- mode0: mode@0 {
+ display-timings {
+ native-mode = <&timing0>;
+ timing0: 800x480 {
+ clock-frequency = <0>; /* unused but required */
hactive = <800>;
vactive = <480>;
- hback-porch = <88>;
hfront-porch = <40>;
+ hback-porch = <88>;
hsync-len = <0>;
vback-porch = <32>;
vfront-porch = <11>;
vsync-len = <1>;
- clock = <0>; /* unused but required */
- bpp = <16>; /* non-standard but required */
};
};
};
+
diff --git a/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt b/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt
index 3d325e1d11ee..0bcadb2840a5 100644
--- a/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt
+++ b/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt
@@ -4,20 +4,30 @@ Wondermedia WM8505 Framebuffer
Required properties:
- compatible : "wm,wm8505-fb"
- reg : Should contain 1 register ranges(address and length)
-- via,display: a phandle pointing to the display node
+- bits-per-pixel : bit depth of framebuffer (16 or 32)
-Required nodes:
-- display: a display node is required to initialize the lcd panel
- This should be in the board dts. See definition in
- Documentation/devicetree/bindings/video/via,vt8500-fb.txt
-- default-mode: a videomode node as specified in
- Documentation/devicetree/bindings/video/via,vt8500-fb.txt
+Required subnodes:
+- display-timings: see display-timing.txt for information
Example:
- fb@d8050800 {
+ fb@d8051700 {
compatible = "wm,wm8505-fb";
- reg = <0xd8050800 0x200>;
- display = <&display>;
- default-mode = <&mode0>;
+ reg = <0xd8051700 0x200>;
+ bits-per-pixel = <16>;
+
+ display-timings {
+ native-mode = <&timing0>;
+ timing0: 800x480 {
+ clock-frequency = <0>; /* unused but required */
+ hactive = <800>;
+ vactive = <480>;
+ hfront-porch = <40>;
+ hback-porch = <88>;
+ hsync-len = <0>;
+ vback-porch = <32>;
+ vfront-porch = <11>;
+ vsync-len = <1>;
+ };
+ };
};
diff --git a/Documentation/hwmon/adt7410 b/Documentation/hwmon/adt7410
index 58150c480e56..9817941e5f19 100644
--- a/Documentation/hwmon/adt7410
+++ b/Documentation/hwmon/adt7410
@@ -12,29 +12,42 @@ Supported chips:
Addresses scanned: None
Datasheet: Publicly available at the Analog Devices website
http://www.analog.com/static/imported-files/data_sheets/ADT7420.pdf
+ * Analog Devices ADT7310
+ Prefix: 'adt7310'
+ Addresses scanned: None
+ Datasheet: Publicly available at the Analog Devices website
+ http://www.analog.com/static/imported-files/data_sheets/ADT7310.pdf
+ * Analog Devices ADT7320
+ Prefix: 'adt7320'
+ Addresses scanned: None
+ Datasheet: Publicly available at the Analog Devices website
+ http://www.analog.com/static/imported-files/data_sheets/ADT7320.pdf
Author: Hartmut Knaack <knaack.h@gmx.de>
Description
-----------
-The ADT7410 is a temperature sensor with rated temperature range of -55°C to
-+150°C. It has a high accuracy of +/-0.5°C and can be operated at a resolution
-of 13 bits (0.0625°C) or 16 bits (0.0078°C). The sensor provides an INT pin to
-indicate that a minimum or maximum temperature set point has been exceeded, as
-well as a critical temperature (CT) pin to indicate that the critical
-temperature set point has been exceeded. Both pins can be set up with a common
-hysteresis of 0°C - 15°C and a fault queue, ranging from 1 to 4 events. Both
-pins can individually set to be active-low or active-high, while the whole
-device can either run in comparator mode or interrupt mode. The ADT7410
-supports continous temperature sampling, as well as sampling one temperature
-value per second or even justget one sample on demand for power saving.
-Besides, it can completely power down its ADC, if power management is
-required.
-
-The ADT7420 is register compatible, the only differences being the package,
-a slightly narrower operating temperature range (-40°C to +150°C), and a
-better accuracy (0.25°C instead of 0.50°C.)
+The ADT7310/ADT7410 is a temperature sensor with rated temperature range of
+-55°C to +150°C. It has a high accuracy of +/-0.5°C and can be operated at a
+resolution of 13 bits (0.0625°C) or 16 bits (0.0078°C). The sensor provides an
+INT pin to indicate that a minimum or maximum temperature set point has been
+exceeded, as well as a critical temperature (CT) pin to indicate that the
+critical temperature set point has been exceeded. Both pins can be set up with a
+common hysteresis of 0°C - 15°C and a fault queue, ranging from 1 to 4 events.
+Both pins can individually set to be active-low or active-high, while the whole
+device can either run in comparator mode or interrupt mode. The ADT7410 supports
+continuous temperature sampling, as well as sampling one temperature value per
+second or even just get one sample on demand for power saving. Besides, it can
+completely power down its ADC, if power management is required.
+
+The ADT7320/ADT7420 is register compatible, the only differences being the
+package, a slightly narrower operating temperature range (-40°C to +150°C), and
+a better accuracy (0.25°C instead of 0.50°C.)
+
+The difference between the ADT7310/ADT7320 and ADT7410/ADT7420 is the control
+interface, the ADT7310 and ADT7320 use SPI while the ADT7410 and ADT7420 use
+I2C.
Configuration Notes
-------------------
diff --git a/Documentation/hwmon/lm25066 b/Documentation/hwmon/lm25066
index 26025e419d35..c1b57d72efc3 100644
--- a/Documentation/hwmon/lm25066
+++ b/Documentation/hwmon/lm25066
@@ -1,7 +1,13 @@
-Kernel driver max8688
+Kernel driver lm25066
=====================
Supported chips:
+ * TI LM25056
+ Prefix: 'lm25056'
+ Addresses scanned: -
+ Datasheets:
+ http://www.ti.com/lit/gpn/lm25056
+ http://www.ti.com/lit/gpn/lm25056a
* National Semiconductor LM25066
Prefix: 'lm25066'
Addresses scanned: -
@@ -25,8 +31,9 @@ Author: Guenter Roeck <linux@roeck-us.net>
Description
-----------
-This driver supports hardware montoring for National Semiconductor LM25066,
-LM5064, and LM5064 Power Management, Monitoring, Control, and Protection ICs.
+This driver supports hardware montoring for National Semiconductor / TI LM25056,
+LM25066, LM5064, and LM5064 Power Management, Monitoring, Control, and
+Protection ICs.
The driver is a client driver to the core PMBus driver. Please see
Documentation/hwmon/pmbus for details on PMBus client drivers.
@@ -60,14 +67,19 @@ in1_max Maximum input voltage.
in1_min_alarm Input voltage low alarm.
in1_max_alarm Input voltage high alarm.
-in2_label "vout1"
-in2_input Measured output voltage.
-in2_average Average measured output voltage.
-in2_min Minimum output voltage.
-in2_min_alarm Output voltage low alarm.
-
-in3_label "vout2"
-in3_input Measured voltage on vaux pin
+in2_label "vmon"
+in2_input Measured voltage on VAUX pin
+in2_min Minimum VAUX voltage (LM25056 only).
+in2_max Maximum VAUX voltage (LM25056 only).
+in2_min_alarm VAUX voltage low alarm (LM25056 only).
+in2_max_alarm VAUX voltage high alarm (LM25056 only).
+
+in3_label "vout1"
+ Not supported on LM25056.
+in3_input Measured output voltage.
+in3_average Average measured output voltage.
+in3_min Minimum output voltage.
+in3_min_alarm Output voltage low alarm.
curr1_label "iin"
curr1_input Measured input current.
diff --git a/Documentation/hwmon/lm75 b/Documentation/hwmon/lm75
index c91a1d15fa28..69af1c7db6b7 100644
--- a/Documentation/hwmon/lm75
+++ b/Documentation/hwmon/lm75
@@ -23,7 +23,7 @@ Supported chips:
Datasheet: Publicly available at the Maxim website
http://www.maxim-ic.com/
* Microchip (TelCom) TCN75
- Prefix: 'lm75'
+ Prefix: 'tcn75'
Addresses scanned: none
Datasheet: Publicly available at the Microchip website
http://www.microchip.com/
diff --git a/Documentation/hwmon/lm95234 b/Documentation/hwmon/lm95234
new file mode 100644
index 000000000000..a0e95ddfd372
--- /dev/null
+++ b/Documentation/hwmon/lm95234
@@ -0,0 +1,36 @@
+Kernel driver lm95234
+=====================
+
+Supported chips:
+ * National Semiconductor / Texas Instruments LM95234
+ Addresses scanned: I2C 0x18, 0x4d, 0x4e
+ Datasheet: Publicly available at the Texas Instruments website
+ http://www.ti.com/product/lm95234
+
+
+Author: Guenter Roeck <linux@roeck-us.net>
+
+Description
+-----------
+
+LM95234 is an 11-bit digital temperature sensor with a 2-wire System Management
+Bus (SMBus) interface and TrueTherm technology that can very accurately monitor
+the temperature of four remote diodes as well as its own temperature.
+The four remote diodes can be external devices such as microprocessors,
+graphics processors or diode-connected 2N3904s. The LM95234's TruTherm
+beta compensation technology allows sensing of 90 nm or 65 nm process
+thermal diodes accurately.
+
+All temperature values are given in millidegrees Celsius. Temperature
+is provided within a range of -127 to +255 degrees (+127.875 degrees for
+the internal sensor). Resolution depends on temperature input and range.
+
+Each sensor has its own maximum limit, but the hysteresis is common to all
+channels. The hysteresis is configurable with the tem1_max_hyst attribute and
+affects the hysteresis on all channels. The first two external sensors also
+have a critical limit.
+
+The lm95234 driver can change its update interval to a fixed set of values.
+It will round up to the next selectable interval. See the datasheet for exact
+values. Reading sensor values more often will do no harm, but will return
+'old' values.
diff --git a/Documentation/hwmon/ltc2978 b/Documentation/hwmon/ltc2978
index e4d75c606c97..dc0d08c61305 100644
--- a/Documentation/hwmon/ltc2978
+++ b/Documentation/hwmon/ltc2978
@@ -2,6 +2,10 @@ Kernel driver ltc2978
=====================
Supported chips:
+ * Linear Technology LTC2974
+ Prefix: 'ltc2974'
+ Addresses scanned: -
+ Datasheet: http://www.linear.com/product/ltc2974
* Linear Technology LTC2978
Prefix: 'ltc2978'
Addresses scanned: -
@@ -10,6 +14,10 @@ Supported chips:
Prefix: 'ltc3880'
Addresses scanned: -
Datasheet: http://www.linear.com/product/ltc3880
+ * Linear Technology LTC3883
+ Prefix: 'ltc3883'
+ Addresses scanned: -
+ Datasheet: http://www.linear.com/product/ltc3883
Author: Guenter Roeck <linux@roeck-us.net>
@@ -17,9 +25,9 @@ Author: Guenter Roeck <linux@roeck-us.net>
Description
-----------
-The LTC2978 is an octal power supply monitor, supervisor, sequencer and
-margin controller. The LTC3880 is a dual, PolyPhase DC/DC synchronous
-step-down switching regulator controller.
+LTC2974 is a quad digital power supply manager. LTC2978 is an octal power supply
+monitor. LTC3880 is a dual output poly-phase step-down DC/DC controller. LTC3883
+is a single phase step-down DC/DC controller.
Usage Notes
@@ -41,63 +49,90 @@ Sysfs attributes
in1_label "vin"
in1_input Measured input voltage.
in1_min Minimum input voltage.
-in1_max Maximum input voltage.
-in1_lcrit Critical minimum input voltage.
+in1_max Maximum input voltage. LTC2974 and LTC2978 only.
+in1_lcrit Critical minimum input voltage. LTC2974 and LTC2978
+ only.
in1_crit Critical maximum input voltage.
in1_min_alarm Input voltage low alarm.
-in1_max_alarm Input voltage high alarm.
-in1_lcrit_alarm Input voltage critical low alarm.
+in1_max_alarm Input voltage high alarm. LTC2974 and LTC2978 only.
+in1_lcrit_alarm Input voltage critical low alarm. LTC2974 and LTC2978
+ only.
in1_crit_alarm Input voltage critical high alarm.
-in1_lowest Lowest input voltage. LTC2978 only.
+in1_lowest Lowest input voltage. LTC2974 and LTC2978 only.
in1_highest Highest input voltage.
-in1_reset_history Reset history. Writing into this attribute will reset
- history for all attributes.
-
-in[2-9]_label "vout[1-8]". Channels 3 to 9 on LTC2978 only.
-in[2-9]_input Measured output voltage.
-in[2-9]_min Minimum output voltage.
-in[2-9]_max Maximum output voltage.
-in[2-9]_lcrit Critical minimum output voltage.
-in[2-9]_crit Critical maximum output voltage.
-in[2-9]_min_alarm Output voltage low alarm.
-in[2-9]_max_alarm Output voltage high alarm.
-in[2-9]_lcrit_alarm Output voltage critical low alarm.
-in[2-9]_crit_alarm Output voltage critical high alarm.
-in[2-9]_lowest Lowest output voltage. LTC2978 only.
-in[2-9]_highest Lowest output voltage.
-in[2-9]_reset_history Reset history. Writing into this attribute will reset
- history for all attributes.
-
-temp[1-3]_input Measured temperature.
+in1_reset_history Reset input voltage history.
+
+in[N]_label "vout[1-8]".
+ LTC2974: N=2-5
+ LTC2978: N=2-9
+ LTC3880: N=2-3
+ LTC3883: N=2
+in[N]_input Measured output voltage.
+in[N]_min Minimum output voltage.
+in[N]_max Maximum output voltage.
+in[N]_lcrit Critical minimum output voltage.
+in[N]_crit Critical maximum output voltage.
+in[N]_min_alarm Output voltage low alarm.
+in[N]_max_alarm Output voltage high alarm.
+in[N]_lcrit_alarm Output voltage critical low alarm.
+in[N]_crit_alarm Output voltage critical high alarm.
+in[N]_lowest Lowest output voltage. LTC2974 and LTC2978 only.
+in[N]_highest Highest output voltage.
+in[N]_reset_history Reset output voltage history.
+
+temp[N]_input Measured temperature.
+ On LTC2974, temp[1-4] report external temperatures,
+ and temp5 reports the chip temperature.
On LTC2978, only one temperature measurement is
- supported and reflects the internal temperature.
+ supported and reports the chip temperature.
On LTC3880, temp1 and temp2 report external
- temperatures, and temp3 reports the internal
- temperature.
-temp[1-3]_min Mimimum temperature.
-temp[1-3]_max Maximum temperature.
-temp[1-3]_lcrit Critical low temperature.
-temp[1-3]_crit Critical high temperature.
-temp[1-3]_min_alarm Chip temperature low alarm.
-temp[1-3]_max_alarm Chip temperature high alarm.
-temp[1-3]_lcrit_alarm Chip temperature critical low alarm.
-temp[1-3]_crit_alarm Chip temperature critical high alarm.
-temp[1-3]_lowest Lowest measured temperature. LTC2978 only.
-temp[1-3]_highest Highest measured temperature.
-temp[1-3]_reset_history Reset history. Writing into this attribute will reset
- history for all attributes.
-
-power[1-2]_label "pout[1-2]". LTC3880 only.
-power[1-2]_input Measured power.
-
-curr1_label "iin". LTC3880 only.
+ temperatures, and temp3 reports the chip temperature.
+ On LTC3883, temp1 reports an external temperature,
+ and temp2 reports the chip temperature.
+temp[N]_min Mimimum temperature. LTC2974 and LTC2978 only.
+temp[N]_max Maximum temperature.
+temp[N]_lcrit Critical low temperature.
+temp[N]_crit Critical high temperature.
+temp[N]_min_alarm Temperature low alarm. LTC2974 and LTC2978 only.
+temp[N]_max_alarm Temperature high alarm.
+temp[N]_lcrit_alarm Temperature critical low alarm.
+temp[N]_crit_alarm Temperature critical high alarm.
+temp[N]_lowest Lowest measured temperature. LTC2974 and LTC2978 only.
+ Not supported for chip temperature sensor on LTC2974.
+temp[N]_highest Highest measured temperature. Not supported for chip
+ temperature sensor on LTC2974.
+temp[N]_reset_history Reset temperature history. Not supported for chip
+ temperature sensor on LTC2974.
+
+power1_label "pin". LTC3883 only.
+power1_input Measured input power.
+
+power[N]_label "pout[1-4]".
+ LTC2974: N=1-4
+ LTC2978: Not supported
+ LTC3880: N=1-2
+ LTC3883: N=2
+power[N]_input Measured output power.
+
+curr1_label "iin". LTC3880 and LTC3883 only.
curr1_input Measured input current.
curr1_max Maximum input current.
curr1_max_alarm Input current high alarm.
-
-curr[2-3]_label "iout[1-2]". LTC3880 only.
-curr[2-3]_input Measured input current.
-curr[2-3]_max Maximum input current.
-curr[2-3]_crit Critical input current.
-curr[2-3]_max_alarm Input current high alarm.
-curr[2-3]_crit_alarm Input current critical high alarm.
+curr1_highest Highest input current. LTC3883 only.
+curr1_reset_history Reset input current history. LTC3883 only.
+
+curr[N]_label "iout[1-4]".
+ LTC2974: N=1-4
+ LTC2978: not supported
+ LTC3880: N=2-3
+ LTC3883: N=2
+curr[N]_input Measured output current.
+curr[N]_max Maximum output current.
+curr[N]_crit Critical high output current.
+curr[N]_lcrit Critical low output current. LTC2974 only.
+curr[N]_max_alarm Output current high alarm.
+curr[N]_crit_alarm Output current critical high alarm.
+curr[N]_lcrit_alarm Output current critical low alarm. LTC2974 only.
+curr[N]_lowest Lowest output current. LTC2974 only.
+curr[N]_highest Highest output current.
+curr[N]_reset_history Reset output current history.
diff --git a/Documentation/hwmon/nct6775 b/Documentation/hwmon/nct6775
new file mode 100644
index 000000000000..4e9ef60e8c6c
--- /dev/null
+++ b/Documentation/hwmon/nct6775
@@ -0,0 +1,188 @@
+Note
+====
+
+This driver supersedes the NCT6775F and NCT6776F support in the W83627EHF
+driver.
+
+Kernel driver NCT6775
+=====================
+
+Supported chips:
+ * Nuvoton NCT5572D/NCT6771F/NCT6772F/NCT6775F/W83677HG-I
+ Prefix: 'nct6775'
+ Addresses scanned: ISA address retrieved from Super I/O registers
+ Datasheet: Available from Nuvoton upon request
+ * Nuvoton NCT5577D/NCT6776D/NCT6776F
+ Prefix: 'nct6776'
+ Addresses scanned: ISA address retrieved from Super I/O registers
+ Datasheet: Available from Nuvoton upon request
+ * Nuvoton NCT5532D/NCT6779D
+ Prefix: 'nct6779'
+ Addresses scanned: ISA address retrieved from Super I/O registers
+ Datasheet: Available from Nuvoton upon request
+
+Authors:
+ Guenter Roeck <linux@roeck-us.net>
+
+Description
+-----------
+
+This driver implements support for the Nuvoton NCT6775F, NCT6776F, and NCT6779D
+and compatible super I/O chips.
+
+The chips support up to 25 temperature monitoring sources. Up to 6 of those are
+direct temperature sensor inputs, the others are special sources such as PECI,
+PCH, and SMBUS. Depending on the chip type, 2 to 6 of the temperature sources
+can be monitored and compared against minimum, maximum, and critical
+temperatures. The driver reports up to 10 of the temperatures to the user.
+There are 4 to 5 fan rotation speed sensors, 8 to 15 analog voltage sensors,
+one VID, alarms with beep warnings (control unimplemented), and some automatic
+fan regulation strategies (plus manual fan control mode).
+
+The temperature sensor sources on all chips are configurable. The configured
+source for each of the temperature sensors is provided in tempX_label.
+
+Temperatures are measured in degrees Celsius and measurement resolution is
+either 1 degC or 0.5 degC, depending on the temperature source and
+configuration. An alarm is triggered when the temperature gets higher than
+the high limit; it stays on until the temperature falls below the hysteresis
+value. Alarms are only supported for temp1 to temp6, depending on the chip type.
+
+Fan rotation speeds are reported in RPM (rotations per minute). An alarm is
+triggered if the rotation speed has dropped below a programmable limit. On
+NCT6775F, fan readings can be divided by a programmable divider (1, 2, 4, 8,
+16, 32, 64 or 128) to give the readings more range or accuracy; the other chips
+do not have a fan speed divider. The driver sets the most suitable fan divisor
+itself; specifically, it increases the divider value each time a fan speed
+reading returns an invalid value, and it reduces it if the fan speed reading
+is lower than optimal. Some fans might not be present because they share pins
+with other functions.
+
+Voltage sensors (also known as IN sensors) report their values in millivolts.
+An alarm is triggered if the voltage has crossed a programmable minimum
+or maximum limit.
+
+The driver supports automatic fan control mode known as Thermal Cruise.
+In this mode, the chip attempts to keep the measured temperature in a
+predefined temperature range. If the temperature goes out of range, fan
+is driven slower/faster to reach the predefined range again.
+
+The mode works for fan1-fan5.
+
+sysfs attributes
+----------------
+
+pwm[1-5] - this file stores PWM duty cycle or DC value (fan speed) in range:
+ 0 (lowest speed) to 255 (full)
+
+pwm[1-5]_enable - this file controls mode of fan/temperature control:
+ * 0 Fan control disabled (fans set to maximum speed)
+ * 1 Manual mode, write to pwm[0-5] any value 0-255
+ * 2 "Thermal Cruise" mode
+ * 3 "Fan Speed Cruise" mode
+ * 4 "Smart Fan III" mode (NCT6775F only)
+ * 5 "Smart Fan IV" mode
+
+pwm[1-5]_mode - controls if output is PWM or DC level
+ * 0 DC output
+ * 1 PWM output
+
+Common fan control attributes
+-----------------------------
+
+pwm[1-5]_temp_sel Temperature source. Value is temperature sensor index.
+ For example, select '1' for temp1_input.
+pwm[1-5]_weight_temp_sel
+ Secondary temperature source. Value is temperature
+ sensor index. For example, select '1' for temp1_input.
+ Set to 0 to disable secondary temperature control.
+
+If secondary temperature functionality is enabled, it is controlled with the
+following attributes.
+
+pwm[1-5]_weight_duty_step
+ Duty step size.
+pwm[1-5]_weight_temp_step
+ Temperature step size. With each step over
+ temp_step_base, the value of weight_duty_step is added
+ to the current pwm value.
+pwm[1-5]_weight_temp_step_base
+ Temperature at which secondary temperature control kicks
+ in.
+pwm[1-5]_weight_temp_step_tol
+ Temperature step tolerance.
+
+Thermal Cruise mode (2)
+-----------------------
+
+If the temperature is in the range defined by:
+
+pwm[1-5]_target_temp Target temperature, unit millidegree Celsius
+ (range 0 - 127000)
+pwm[1-5]_temp_tolerance
+ Target temperature tolerance, unit millidegree Celsius
+
+there are no changes to fan speed. Once the temperature leaves the interval, fan
+speed increases (if temperature is higher that desired) or decreases (if
+temperature is lower than desired), using the following limits and time
+intervals.
+
+pwm[1-5]_start fan pwm start value (range 1 - 255), to start fan
+ when the temperature is above defined range.
+pwm[1-5]_floor lowest fan pwm (range 0 - 255) if temperature is below
+ the defined range. If set to 0, the fan is expected to
+ stop if the temperature is below the defined range.
+pwm[1-5]_step_up_time milliseconds before fan speed is increased
+pwm[1-5]_step_down_time milliseconds before fan speed is decreased
+pwm[1-5]_stop_time how many milliseconds must elapse to switch
+ corresponding fan off (when the temperature was below
+ defined range).
+
+Speed Cruise mode (3)
+---------------------
+
+This modes tries to keep the fan speed constant.
+
+fan[1-5]_target Target fan speed
+fan[1-5]_tolerance
+ Target speed tolerance
+
+
+Untested; use at your own risk.
+
+Smart Fan IV mode (5)
+---------------------
+
+This mode offers multiple slopes to control the fan speed. The slopes can be
+controlled by setting the pwm and temperature attributes. When the temperature
+rises, the chip will calculate the DC/PWM output based on the current slope.
+There are up to seven data points depending on the chip type. Subsequent data
+points should be set to higher temperatures and higher pwm values to achieve
+higher fan speeds with increasing temperature. The last data point reflects
+critical temperature mode, in which the fans should run at full speed.
+
+pwm[1-5]_auto_point[1-7]_pwm
+ pwm value to be set if temperature reaches matching
+ temperature range.
+pwm[1-5]_auto_point[1-7]_temp
+ Temperature over which the matching pwm is enabled.
+pwm[1-5]_temp_tolerance
+ Temperature tolerance, unit millidegree Celsius
+pwm[1-5]_crit_temp_tolerance
+ Temperature tolerance for critical temperature,
+ unit millidegree Celsius
+
+pwm[1-5]_step_up_time milliseconds before fan speed is increased
+pwm[1-5]_step_down_time milliseconds before fan speed is decreased
+
+Usage Notes
+-----------
+
+On various ASUS boards with NCT6776F, it appears that CPUTIN is not really
+connected to anything and floats, or that it is connected to some non-standard
+temperature measurement device. As a result, the temperature reported on CPUTIN
+will not reflect a usable value. It often reports unreasonably high
+temperatures, and in some cases the reported temperature declines if the actual
+temperature increases (similar to the raw PECI temperature value - see PECI
+specification for details). CPUTIN should therefore be be ignored on ASUS
+boards. The CPU temperature on ASUS boards is reported from PECI 0.
diff --git a/Documentation/hwmon/sht15 b/Documentation/hwmon/sht15
index 02850bdfac18..778987d1856f 100644
--- a/Documentation/hwmon/sht15
+++ b/Documentation/hwmon/sht15
@@ -40,7 +40,7 @@ bits for humidity, or 12 bits for temperature and 8 bits for humidity.
The humidity calibration coefficients are programmed into an OTP memory on the
chip. These coefficients are used to internally calibrate the signals from the
sensors. Disabling the reload of those coefficients allows saving 10ms for each
-measurement and decrease power consumption, while loosing on precision.
+measurement and decrease power consumption, while losing on precision.
Some options may be set directly in the sht15_platform_data structure
or via sysfs attributes.
diff --git a/Documentation/hwmon/tmp401 b/Documentation/hwmon/tmp401
index 9fc447249212..f91e3fa7e5ec 100644
--- a/Documentation/hwmon/tmp401
+++ b/Documentation/hwmon/tmp401
@@ -8,8 +8,16 @@ Supported chips:
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html
* Texas Instruments TMP411
Prefix: 'tmp411'
- Addresses scanned: I2C 0x4c
+ Addresses scanned: I2C 0x4c, 0x4d, 0x4e
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html
+ * Texas Instruments TMP431
+ Prefix: 'tmp431'
+ Addresses scanned: I2C 0x4c, 0x4d
+ Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp431.html
+ * Texas Instruments TMP432
+ Prefix: 'tmp432'
+ Addresses scanned: I2C 0x4c, 0x4d
+ Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp432.html
Authors:
Hans de Goede <hdegoede@redhat.com>
@@ -18,19 +26,19 @@ Authors:
Description
-----------
-This driver implements support for Texas Instruments TMP401 and
-TMP411 chips. These chips implements one remote and one local
-temperature sensor. Temperature is measured in degrees
+This driver implements support for Texas Instruments TMP401, TMP411,
+TMP431, and TMP432 chips. These chips implement one or two remote and
+one local temperature sensors. Temperature is measured in degrees
Celsius. Resolution of the remote sensor is 0.0625 degree. Local
sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not
supported by the driver so far, so using the default resolution of 0.5
degree).
The driver provides the common sysfs-interface for temperatures (see
-/Documentation/hwmon/sysfs-interface under Temperatures).
+Documentation/hwmon/sysfs-interface under Temperatures).
-The TMP411 chip is compatible with TMP401. It provides some additional
-features.
+The TMP411 and TMP431 chips are compatible with TMP401. TMP411 provides
+some additional features.
* Minimum and Maximum temperature measured since power-on, chip-reset
@@ -40,3 +48,6 @@ features.
Exported via sysfs attribute temp_reset_history. Writing 1 to this
file triggers a reset.
+
+TMP432 is compatible with TMP401 and TMP431. It supports two external
+temperature sensors.
diff --git a/Documentation/hwmon/zl6100 b/Documentation/hwmon/zl6100
index 756b57c6b73e..33908a4d68ff 100644
--- a/Documentation/hwmon/zl6100
+++ b/Documentation/hwmon/zl6100
@@ -125,7 +125,7 @@ in2_label "vmon"
in2_input Measured voltage on VMON (ZL2004) or VDRV (ZL9101M,
ZL9117M) pin. Reported voltage is 16x the voltage on the
pin (adjusted internally by the chip).
-in2_lcrit Critical minumum VMON/VDRV Voltage.
+in2_lcrit Critical minimum VMON/VDRV Voltage.
in2_crit Critical maximum VMON/VDRV voltage.
in2_lcrit_alarm VMON/VDRV voltage critical low alarm.
in2_crit_alarm VMON/VDRV voltage critical high alarm.
diff --git a/Documentation/i2c/busses/i2c-diolan-u2c b/Documentation/i2c/busses/i2c-diolan-u2c
index 30fe4bb9a069..0d6018c316c7 100644
--- a/Documentation/i2c/busses/i2c-diolan-u2c
+++ b/Documentation/i2c/busses/i2c-diolan-u2c
@@ -5,7 +5,7 @@ Supported adapters:
Documentation:
http://www.diolan.com/i2c/u2c12.html
-Author: Guenter Roeck <guenter.roeck@ericsson.com>
+Author: Guenter Roeck <linux@roeck-us.net>
Description
-----------
diff --git a/Documentation/ia64/err_inject.txt b/Documentation/ia64/err_inject.txt
index 223e4f0582d0..9f651c181429 100644
--- a/Documentation/ia64/err_inject.txt
+++ b/Documentation/ia64/err_inject.txt
@@ -882,7 +882,7 @@ int err_inj()
cpu=parameters[i].cpu;
k = cpu%64;
j = cpu/64;
- mask[j]=1<<k;
+ mask[j] = 1UL << k;
if (sched_setaffinity(0, MASK_SIZE*8, mask)==-1) {
perror("Error sched_setaffinity:");
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 3210540f8bd3..237acab169dd 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -131,6 +131,7 @@ Code Seq#(hex) Include File Comments
'H' 40-4F sound/hdspm.h conflict!
'H' 40-4F sound/hdsp.h conflict!
'H' 90 sound/usb/usx2y/usb_stream.h
+'H' A0 uapi/linux/usb/cdc-wdm.h
'H' C0-F0 net/bluetooth/hci.h conflict!
'H' C0-DF net/bluetooth/hidp/hidp.h conflict!
'H' C0-DF net/bluetooth/cmtp/cmtp.h conflict!
diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 13f1aa09b938..9c7fd988e299 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -297,6 +297,7 @@ Boot into System Kernel
On ia64, 256M@256M is a generous value that typically works.
The region may be automatically placed on ia64, see the
dump-capture kernel config option notes above.
+ If use sparse memory, the size should be rounded to GRANULE boundaries.
On s390x, typically use "crashkernel=xxM". The value of xx is dependent
on the memory consumption of the kdump system. In general this is not
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9aa5562fed9c..365f7bd40eeb 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -321,6 +321,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
on: enable for both 32- and 64-bit processes
off: disable for both 32- and 64-bit processes
+ alloc_snapshot [FTRACE]
+ Allocate the ftrace snapshot buffer on boot up when the
+ main buffer is allocated. This is handy if debugging
+ and you need to use tracing_snapshot() on boot up, and
+ do not want to use tracing_snapshot_alloc() as it needs
+ to be done where GFP_KERNEL allocations are allowed.
+
amd_iommu= [HW,X86-64]
Pass parameters to the AMD IOMMU driver in the system.
Possible values are:
@@ -604,9 +611,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
is selected automatically. Check
Documentation/kdump/kdump.txt for further details.
- crashkernel_low=size[KMG]
- [KNL, x86] parts under 4G.
-
crashkernel=range1:size1[,range2:size2,...][@offset]
[KNL] Same as above, but depends on the memory
in the running system. The syntax of range is
@@ -614,6 +618,26 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
a memory unit (amount[KMG]). See also
Documentation/kdump/kdump.txt for an example.
+ crashkernel=size[KMG],high
+ [KNL, x86_64] range could be above 4G. Allow kernel
+ to allocate physical memory region from top, so could
+ be above 4G if system have more than 4G ram installed.
+ Otherwise memory region will be allocated below 4G, if
+ available.
+ It will be ignored if crashkernel=X is specified.
+ crashkernel=size[KMG],low
+ [KNL, x86_64] range under 4G. When crashkernel=X,high
+ is passed, kernel could allocate physical memory region
+ above 4G, that cause second kernel crash on system
+ that require some amount of low memory, e.g. swiotlb
+ requires at least 64M+32K low memory. Kernel would
+ try to allocate 72M below 4G automatically.
+ This one let user to specify own low range under 4G
+ for second kernel instead.
+ 0: to disable low allocation.
+ It will be ignored when crashkernel=X,high is not used
+ or memory reserved is below 4G.
+
cs89x0_dma= [HW,NET]
Format: <dma>
@@ -796,6 +820,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
+ efi_no_storage_paranoia [EFI; X86]
+ Using this parameter you can use more than 50% of
+ your efi variable storage. Use this parameter only if
+ you are really sure that your UEFI does sane gc and
+ fulfills the spec otherwise your board may brick.
+
eisa_irq_edge= [PARISC,HW]
See header of drivers/parisc/eisa.c.
diff --git a/Documentation/misc-devices/mei/mei-client-bus.txt b/Documentation/misc-devices/mei/mei-client-bus.txt
new file mode 100644
index 000000000000..f83910a8ce76
--- /dev/null
+++ b/Documentation/misc-devices/mei/mei-client-bus.txt
@@ -0,0 +1,138 @@
+Intel(R) Management Engine (ME) Client bus API
+===============================================
+
+
+Rationale
+=========
+MEI misc character device is useful for dedicated applications to send and receive
+data to the many FW appliance found in Intel's ME from the user space.
+However for some of the ME functionalities it make sense to leverage existing software
+stack and expose them through existing kernel subsystems.
+
+In order to plug seamlessly into the kernel device driver model we add kernel virtual
+bus abstraction on top of the MEI driver. This allows implementing linux kernel drivers
+for the various MEI features as a stand alone entities found in their respective subsystem.
+Existing device drivers can even potentially be re-used by adding an MEI CL bus layer to
+the existing code.
+
+
+MEI CL bus API
+===========
+A driver implementation for an MEI Client is very similar to existing bus
+based device drivers. The driver registers itself as an MEI CL bus driver through
+the mei_cl_driver structure:
+
+struct mei_cl_driver {
+ struct device_driver driver;
+ const char *name;
+
+ const struct mei_cl_device_id *id_table;
+
+ int (*probe)(struct mei_cl_device *dev, const struct mei_cl_id *id);
+ int (*remove)(struct mei_cl_device *dev);
+};
+
+struct mei_cl_id {
+ char name[MEI_NAME_SIZE];
+ kernel_ulong_t driver_info;
+};
+
+The mei_cl_id structure allows the driver to bind itself against a device name.
+
+To actually register a driver on the ME Client bus one must call the mei_cl_add_driver()
+API. This is typically called at module init time.
+
+Once registered on the ME Client bus, a driver will typically try to do some I/O on
+this bus and this should be done through the mei_cl_send() and mei_cl_recv()
+routines. The latter is synchronous (blocks and sleeps until data shows up).
+In order for drivers to be notified of pending events waiting for them (e.g.
+an Rx event) they can register an event handler through the
+mei_cl_register_event_cb() routine. Currently only the MEI_EVENT_RX event
+will trigger an event handler call and the driver implementation is supposed
+to call mei_recv() from the event handler in order to fetch the pending
+received buffers.
+
+
+Example
+=======
+As a theoretical example let's pretend the ME comes with a "contact" NFC IP.
+The driver init and exit routines for this device would look like:
+
+#define CONTACT_DRIVER_NAME "contact"
+
+static struct mei_cl_device_id contact_mei_cl_tbl[] = {
+ { CONTACT_DRIVER_NAME, },
+
+ /* required last entry */
+ { }
+};
+MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl);
+
+static struct mei_cl_driver contact_driver = {
+ .id_table = contact_mei_tbl,
+ .name = CONTACT_DRIVER_NAME,
+
+ .probe = contact_probe,
+ .remove = contact_remove,
+};
+
+static int contact_init(void)
+{
+ int r;
+
+ r = mei_cl_driver_register(&contact_driver);
+ if (r) {
+ pr_err(CONTACT_DRIVER_NAME ": driver registration failed\n");
+ return r;
+ }
+
+ return 0;
+}
+
+static void __exit contact_exit(void)
+{
+ mei_cl_driver_unregister(&contact_driver);
+}
+
+module_init(contact_init);
+module_exit(contact_exit);
+
+And the driver's simplified probe routine would look like that:
+
+int contact_probe(struct mei_cl_device *dev, struct mei_cl_device_id *id)
+{
+ struct contact_driver *contact;
+
+ [...]
+ mei_cl_enable_device(dev);
+
+ mei_cl_register_event_cb(dev, contact_event_cb, contact);
+
+ return 0;
+ }
+
+In the probe routine the driver first enable the MEI device and then registers
+an ME bus event handler which is as close as it can get to registering a
+threaded IRQ handler.
+The handler implementation will typically call some I/O routine depending on
+the pending events:
+
+#define MAX_NFC_PAYLOAD 128
+
+static void contact_event_cb(struct mei_cl_device *dev, u32 events,
+ void *context)
+{
+ struct contact_driver *contact = context;
+
+ if (events & BIT(MEI_EVENT_RX)) {
+ u8 payload[MAX_NFC_PAYLOAD];
+ int payload_size;
+
+ payload_size = mei_recv(dev, payload, MAX_NFC_PAYLOAD);
+ if (payload_size <= 0)
+ return;
+
+ /* Hook to the NFC subsystem */
+ nfc_hci_recv_frame(contact->hdev, payload, payload_size);
+ }
+}
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index f2a2488f1bf3..9573d0c48c6e 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -15,6 +15,13 @@ amemthresh - INTEGER
enabled and the variable is automatically set to 2, otherwise
the strategy is disabled and the variable is set to 1.
+backup_only - BOOLEAN
+ 0 - disabled (default)
+ not 0 - enabled
+
+ If set, disable the director function while the server is
+ in backup mode to avoid packet loops for DR/TUN methods.
+
conntrack - BOOLEAN
0 - disabled (default)
not 0 - enabled
diff --git a/Documentation/pinctrl.txt b/Documentation/pinctrl.txt
index a2b57e0a1db0..447fd4cd54ec 100644
--- a/Documentation/pinctrl.txt
+++ b/Documentation/pinctrl.txt
@@ -736,6 +736,13 @@ All the above functions are mandatory to implement for a pinmux driver.
Pin control interaction with the GPIO subsystem
===============================================
+Note that the following implies that the use case is to use a certain pin
+from the Linux kernel using the API in <linux/gpio.h> with gpio_request()
+and similar functions. There are cases where you may be using something
+that your datasheet calls "GPIO mode" but actually is just an electrical
+configuration for a certain device. See the section below named
+"GPIO mode pitfalls" for more details on this scenario.
+
The public pinmux API contains two functions named pinctrl_request_gpio()
and pinctrl_free_gpio(). These two functions shall *ONLY* be called from
gpiolib-based drivers as part of their gpio_request() and
@@ -774,6 +781,111 @@ obtain the function "gpioN" where "N" is the global GPIO pin number if no
special GPIO-handler is registered.
+GPIO mode pitfalls
+==================
+
+Sometime the developer may be confused by a datasheet talking about a pin
+being possible to set into "GPIO mode". It appears that what hardware
+engineers mean with "GPIO mode" is not necessarily the use case that is
+implied in the kernel interface <linux/gpio.h>: a pin that you grab from
+kernel code and then either listen for input or drive high/low to
+assert/deassert some external line.
+
+Rather hardware engineers think that "GPIO mode" means that you can
+software-control a few electrical properties of the pin that you would
+not be able to control if the pin was in some other mode, such as muxed in
+for a device.
+
+Example: a pin is usually muxed in to be used as a UART TX line. But during
+system sleep, we need to put this pin into "GPIO mode" and ground it.
+
+If you make a 1-to-1 map to the GPIO subsystem for this pin, you may start
+to think that you need to come up with something real complex, that the
+pin shall be used for UART TX and GPIO at the same time, that you will grab
+a pin control handle and set it to a certain state to enable UART TX to be
+muxed in, then twist it over to GPIO mode and use gpio_direction_output()
+to drive it low during sleep, then mux it over to UART TX again when you
+wake up and maybe even gpio_request/gpio_free as part of this cycle. This
+all gets very complicated.
+
+The solution is to not think that what the datasheet calls "GPIO mode"
+has to be handled by the <linux/gpio.h> interface. Instead view this as
+a certain pin config setting. Look in e.g. <linux/pinctrl/pinconf-generic.h>
+and you find this in the documentation:
+
+ PIN_CONFIG_OUTPUT: this will configure the pin in output, use argument
+ 1 to indicate high level, argument 0 to indicate low level.
+
+So it is perfectly possible to push a pin into "GPIO mode" and drive the
+line low as part of the usual pin control map. So for example your UART
+driver may look like this:
+
+#include <linux/pinctrl/consumer.h>
+
+struct pinctrl *pinctrl;
+struct pinctrl_state *pins_default;
+struct pinctrl_state *pins_sleep;
+
+pins_default = pinctrl_lookup_state(uap->pinctrl, PINCTRL_STATE_DEFAULT);
+pins_sleep = pinctrl_lookup_state(uap->pinctrl, PINCTRL_STATE_SLEEP);
+
+/* Normal mode */
+retval = pinctrl_select_state(pinctrl, pins_default);
+/* Sleep mode */
+retval = pinctrl_select_state(pinctrl, pins_sleep);
+
+And your machine configuration may look like this:
+--------------------------------------------------
+
+static unsigned long uart_default_mode[] = {
+ PIN_CONF_PACKED(PIN_CONFIG_DRIVE_PUSH_PULL, 0),
+};
+
+static unsigned long uart_sleep_mode[] = {
+ PIN_CONF_PACKED(PIN_CONFIG_OUTPUT, 0),
+};
+
+static struct pinctrl_map __initdata pinmap[] = {
+ PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo",
+ "u0_group", "u0"),
+ PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo",
+ "UART_TX_PIN", uart_default_mode),
+ PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo",
+ "u0_group", "gpio-mode"),
+ PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo",
+ "UART_TX_PIN", uart_sleep_mode),
+};
+
+foo_init(void) {
+ pinctrl_register_mappings(pinmap, ARRAY_SIZE(pinmap));
+}
+
+Here the pins we want to control are in the "u0_group" and there is some
+function called "u0" that can be enabled on this group of pins, and then
+everything is UART business as usual. But there is also some function
+named "gpio-mode" that can be mapped onto the same pins to move them into
+GPIO mode.
+
+This will give the desired effect without any bogus interaction with the
+GPIO subsystem. It is just an electrical configuration used by that device
+when going to sleep, it might imply that the pin is set into something the
+datasheet calls "GPIO mode" but that is not the point: it is still used
+by that UART device to control the pins that pertain to that very UART
+driver, putting them into modes needed by the UART. GPIO in the Linux
+kernel sense are just some 1-bit line, and is a different use case.
+
+How the registers are poked to attain the push/pull and output low
+configuration and the muxing of the "u0" or "gpio-mode" group onto these
+pins is a question for the driver.
+
+Some datasheets will be more helpful and refer to the "GPIO mode" as
+"low power mode" rather than anything to do with GPIO. This often means
+the same thing electrically speaking, but in this latter case the
+software engineers will usually quickly identify that this is some
+specific muxing/configuration rather than anything related to the GPIO
+API.
+
+
Board/machine configuration
==================================
diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt
index ae66f9b90a25..fcaf0b4efba2 100644
--- a/Documentation/s390/s390dbf.txt
+++ b/Documentation/s390/s390dbf.txt
@@ -143,7 +143,8 @@ Parameter: id: handle for debug log
Return Value: none
-Description: frees memory for a debug log
+Description: frees memory for a debug log and removes all registered debug
+ views.
Must not be called within an interrupt handler
---------------------------------------------------------------------------
diff --git a/Documentation/scsi/LICENSE.qla2xxx b/Documentation/scsi/LICENSE.qla2xxx
index 27a91cf43d6d..5020b7b5a244 100644
--- a/Documentation/scsi/LICENSE.qla2xxx
+++ b/Documentation/scsi/LICENSE.qla2xxx
@@ -1,4 +1,4 @@
-Copyright (c) 2003-2012 QLogic Corporation
+Copyright (c) 2003-2013 QLogic Corporation
QLogic Linux FC-FCoE Driver
This program includes a device driver for Linux 3.x.
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt
index ce6581c8ca26..95731a08f257 100644
--- a/Documentation/sound/alsa/ALSA-Configuration.txt
+++ b/Documentation/sound/alsa/ALSA-Configuration.txt
@@ -890,9 +890,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
enable_msi - Enable Message Signaled Interrupt (MSI) (default = off)
power_save - Automatic power-saving timeout (in second, 0 =
disable)
- power_save_controller - Support runtime D3 of HD-audio controller
- (-1 = on for supported chip (default), false = off,
- true = force to on even for unsupported hardware)
+ power_save_controller - Reset HD-audio controller in power-saving mode
+ (default = on)
align_buffer_size - Force rounding of buffer/period sizes to multiples
of 128 bytes. This is more efficient in terms of memory
access but isn't required by the HDA spec and prevents
@@ -912,7 +911,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
models depending on the codec chip. The list of available models
is found in HD-Audio-Models.txt
- The model name "genric" is treated as a special case. When this
+ The model name "generic" is treated as a special case. When this
model is given, the driver uses the generic codec parser without
"codec-patch". It's sometimes good for testing and debugging.
diff --git a/Documentation/sound/alsa/seq_oss.html b/Documentation/sound/alsa/seq_oss.html
index d9776cf60c07..9663b45f6fde 100644
--- a/Documentation/sound/alsa/seq_oss.html
+++ b/Documentation/sound/alsa/seq_oss.html
@@ -285,7 +285,7 @@ sample data.
<H4>
7.2.4 Close Callback</H4>
The <TT>close</TT> callback is called when this device is closed by the
-applicaion. If any private data was allocated in open callback, it must
+application. If any private data was allocated in open callback, it must
be released in the close callback. The deletion of ALSA port should be
done here, too. This callback must not be NULL.
<H4>
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index a372304aef10..bfe8c29b1f1d 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -8,6 +8,7 @@ Copyright 2008 Red Hat Inc.
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
John Kacur, and David Teigland.
Written for: 2.6.28-rc2
+Updated for: 3.10
Introduction
------------
@@ -17,13 +18,16 @@ designers of systems to find what is going on inside the kernel.
It can be used for debugging or analyzing latencies and
performance issues that take place outside of user-space.
-Although ftrace is the function tracer, it also includes an
-infrastructure that allows for other types of tracing. Some of
-the tracers that are currently in ftrace include a tracer to
-trace context switches, the time it takes for a high priority
-task to run after it was woken up, the time interrupts are
-disabled, and more (ftrace allows for tracer plugins, which
-means that the list of tracers can always grow).
+Although ftrace is typically considered the function tracer, it
+is really a frame work of several assorted tracing utilities.
+There's latency tracing to examine what occurs between interrupts
+disabled and enabled, as well as for preemption and from a time
+a task is woken to the task is actually scheduled in.
+
+One of the most common uses of ftrace is the event tracing.
+Through out the kernel is hundreds of static event points that
+can be enabled via the debugfs file system to see what is
+going on in certain parts of the kernel.
Implementation Details
@@ -61,7 +65,7 @@ the extended "/sys/kernel/debug/tracing" path name.
That's it! (assuming that you have ftrace configured into your kernel)
-After mounting the debugfs, you can see a directory called
+After mounting debugfs, you can see a directory called
"tracing". This directory contains the control and output files
of ftrace. Here is a list of some of the key files:
@@ -84,7 +88,9 @@ of ftrace. Here is a list of some of the key files:
This sets or displays whether writing to the trace
ring buffer is enabled. Echo 0 into this file to disable
- the tracer or 1 to enable it.
+ the tracer or 1 to enable it. Note, this only disables
+ writing to the ring buffer, the tracing overhead may
+ still be occurring.
trace:
@@ -109,7 +115,15 @@ of ftrace. Here is a list of some of the key files:
This file lets the user control the amount of data
that is displayed in one of the above output
- files.
+ files. Options also exist to modify how a tracer
+ or events work (stack traces, timestamps, etc).
+
+ options:
+
+ This is a directory that has a file for every available
+ trace option (also in trace_options). Options may also be set
+ or cleared by writing a "1" or "0" respectively into the
+ corresponding file with the option name.
tracing_max_latency:
@@ -121,10 +135,17 @@ of ftrace. Here is a list of some of the key files:
latency is greater than the value in this
file. (in microseconds)
+ tracing_thresh:
+
+ Some latency tracers will record a trace whenever the
+ latency is greater than the number in this file.
+ Only active when the file contains a number greater than 0.
+ (in microseconds)
+
buffer_size_kb:
This sets or displays the number of kilobytes each CPU
- buffer can hold. The tracer buffers are the same size
+ buffer holds. By default, the trace buffers are the same size
for each CPU. The displayed number is the size of the
CPU buffer and not total size of all buffers. The
trace buffers are allocated in pages (blocks of memory
@@ -133,16 +154,30 @@ of ftrace. Here is a list of some of the key files:
than requested, the rest of the page will be used,
making the actual allocation bigger than requested.
( Note, the size may not be a multiple of the page size
- due to buffer management overhead. )
+ due to buffer management meta-data. )
- This can only be updated when the current_tracer
- is set to "nop".
+ buffer_total_size_kb:
+
+ This displays the total combined size of all the trace buffers.
+
+ free_buffer:
+
+ If a process is performing the tracing, and the ring buffer
+ should be shrunk "freed" when the process is finished, even
+ if it were to be killed by a signal, this file can be used
+ for that purpose. On close of this file, the ring buffer will
+ be resized to its minimum size. Having a process that is tracing
+ also open this file, when the process exits its file descriptor
+ for this file will be closed, and in doing so, the ring buffer
+ will be "freed".
+
+ It may also stop tracing if disable_on_free option is set.
tracing_cpumask:
This is a mask that lets the user only trace
- on specified CPUS. The format is a hex string
- representing the CPUS.
+ on specified CPUs. The format is a hex string
+ representing the CPUs.
set_ftrace_filter:
@@ -183,6 +218,261 @@ of ftrace. Here is a list of some of the key files:
"set_ftrace_notrace". (See the section "dynamic ftrace"
below for more details.)
+ enabled_functions:
+
+ This file is more for debugging ftrace, but can also be useful
+ in seeing if any function has a callback attached to it.
+ Not only does the trace infrastructure use ftrace function
+ trace utility, but other subsystems might too. This file
+ displays all functions that have a callback attached to them
+ as well as the number of callbacks that have been attached.
+ Note, a callback may also call multiple functions which will
+ not be listed in this count.
+
+ If the callback registered to be traced by a function with
+ the "save regs" attribute (thus even more overhead), a 'R'
+ will be displayed on the same line as the function that
+ is returning registers.
+
+ function_profile_enabled:
+
+ When set it will enable all functions with either the function
+ tracer, or if enabled, the function graph tracer. It will
+ keep a histogram of the number of functions that were called
+ and if run with the function graph tracer, it will also keep
+ track of the time spent in those functions. The histogram
+ content can be displayed in the files:
+
+ trace_stats/function<cpu> ( function0, function1, etc).
+
+ trace_stats:
+
+ A directory that holds different tracing stats.
+
+ kprobe_events:
+
+ Enable dynamic trace points. See kprobetrace.txt.
+
+ kprobe_profile:
+
+ Dynamic trace points stats. See kprobetrace.txt.
+
+ max_graph_depth:
+
+ Used with the function graph tracer. This is the max depth
+ it will trace into a function. Setting this to a value of
+ one will show only the first kernel function that is called
+ from user space.
+
+ printk_formats:
+
+ This is for tools that read the raw format files. If an event in
+ the ring buffer references a string (currently only trace_printk()
+ does this), only a pointer to the string is recorded into the buffer
+ and not the string itself. This prevents tools from knowing what
+ that string was. This file displays the string and address for
+ the string allowing tools to map the pointers to what the
+ strings were.
+
+ saved_cmdlines:
+
+ Only the pid of the task is recorded in a trace event unless
+ the event specifically saves the task comm as well. Ftrace
+ makes a cache of pid mappings to comms to try to display
+ comms for events. If a pid for a comm is not listed, then
+ "<...>" is displayed in the output.
+
+ snapshot:
+
+ This displays the "snapshot" buffer and also lets the user
+ take a snapshot of the current running trace.
+ See the "Snapshot" section below for more details.
+
+ stack_max_size:
+
+ When the stack tracer is activated, this will display the
+ maximum stack size it has encountered.
+ See the "Stack Trace" section below.
+
+ stack_trace:
+
+ This displays the stack back trace of the largest stack
+ that was encountered when the stack tracer is activated.
+ See the "Stack Trace" section below.
+
+ stack_trace_filter:
+
+ This is similar to "set_ftrace_filter" but it limits what
+ functions the stack tracer will check.
+
+ trace_clock:
+
+ Whenever an event is recorded into the ring buffer, a
+ "timestamp" is added. This stamp comes from a specified
+ clock. By default, ftrace uses the "local" clock. This
+ clock is very fast and strictly per cpu, but on some
+ systems it may not be monotonic with respect to other
+ CPUs. In other words, the local clocks may not be in sync
+ with local clocks on other CPUs.
+
+ Usual clocks for tracing:
+
+ # cat trace_clock
+ [local] global counter x86-tsc
+
+ local: Default clock, but may not be in sync across CPUs
+
+ global: This clock is in sync with all CPUs but may
+ be a bit slower than the local clock.
+
+ counter: This is not a clock at all, but literally an atomic
+ counter. It counts up one by one, but is in sync
+ with all CPUs. This is useful when you need to
+ know exactly the order events occurred with respect to
+ each other on different CPUs.
+
+ uptime: This uses the jiffies counter and the time stamp
+ is relative to the time since boot up.
+
+ perf: This makes ftrace use the same clock that perf uses.
+ Eventually perf will be able to read ftrace buffers
+ and this will help out in interleaving the data.
+
+ x86-tsc: Architectures may define their own clocks. For
+ example, x86 uses its own TSC cycle clock here.
+
+ To set a clock, simply echo the clock name into this file.
+
+ echo global > trace_clock
+
+ trace_marker:
+
+ This is a very useful file for synchronizing user space
+ with events happening in the kernel. Writing strings into
+ this file will be written into the ftrace buffer.
+
+ It is useful in applications to open this file at the start
+ of the application and just reference the file descriptor
+ for the file.
+
+ void trace_write(const char *fmt, ...)
+ {
+ va_list ap;
+ char buf[256];
+ int n;
+
+ if (trace_fd < 0)
+ return;
+
+ va_start(ap, fmt);
+ n = vsnprintf(buf, 256, fmt, ap);
+ va_end(ap);
+
+ write(trace_fd, buf, n);
+ }
+
+ start:
+
+ trace_fd = open("trace_marker", WR_ONLY);
+
+ uprobe_events:
+
+ Add dynamic tracepoints in programs.
+ See uprobetracer.txt
+
+ uprobe_profile:
+
+ Uprobe statistics. See uprobetrace.txt
+
+ instances:
+
+ This is a way to make multiple trace buffers where different
+ events can be recorded in different buffers.
+ See "Instances" section below.
+
+ events:
+
+ This is the trace event directory. It holds event tracepoints
+ (also known as static tracepoints) that have been compiled
+ into the kernel. It shows what event tracepoints exist
+ and how they are grouped by system. There are "enable"
+ files at various levels that can enable the tracepoints
+ when a "1" is written to them.
+
+ See events.txt for more information.
+
+ per_cpu:
+
+ This is a directory that contains the trace per_cpu information.
+
+ per_cpu/cpu0/buffer_size_kb:
+
+ The ftrace buffer is defined per_cpu. That is, there's a separate
+ buffer for each CPU to allow writes to be done atomically,
+ and free from cache bouncing. These buffers may have different
+ size buffers. This file is similar to the buffer_size_kb
+ file, but it only displays or sets the buffer size for the
+ specific CPU. (here cpu0).
+
+ per_cpu/cpu0/trace:
+
+ This is similar to the "trace" file, but it will only display
+ the data specific for the CPU. If written to, it only clears
+ the specific CPU buffer.
+
+ per_cpu/cpu0/trace_pipe
+
+ This is similar to the "trace_pipe" file, and is a consuming
+ read, but it will only display (and consume) the data specific
+ for the CPU.
+
+ per_cpu/cpu0/trace_pipe_raw
+
+ For tools that can parse the ftrace ring buffer binary format,
+ the trace_pipe_raw file can be used to extract the data
+ from the ring buffer directly. With the use of the splice()
+ system call, the buffer data can be quickly transferred to
+ a file or to the network where a server is collecting the
+ data.
+
+ Like trace_pipe, this is a consuming reader, where multiple
+ reads will always produce different data.
+
+ per_cpu/cpu0/snapshot:
+
+ This is similar to the main "snapshot" file, but will only
+ snapshot the current CPU (if supported). It only displays
+ the content of the snapshot for a given CPU, and if
+ written to, only clears this CPU buffer.
+
+ per_cpu/cpu0/snapshot_raw:
+
+ Similar to the trace_pipe_raw, but will read the binary format
+ from the snapshot buffer for the given CPU.
+
+ per_cpu/cpu0/stats:
+
+ This displays certain stats about the ring buffer:
+
+ entries: The number of events that are still in the buffer.
+
+ overrun: The number of lost events due to overwriting when
+ the buffer was full.
+
+ commit overrun: Should always be zero.
+ This gets set if so many events happened within a nested
+ event (ring buffer is re-entrant), that it fills the
+ buffer and starts dropping events.
+
+ bytes: Bytes actually read (not overwritten).
+
+ oldest event ts: The oldest timestamp in the buffer
+
+ now ts: The current timestamp
+
+ dropped events: Events lost due to overwrite option being off.
+
+ read events: The number of events read.
The Tracers
-----------
@@ -234,11 +524,6 @@ Here is the list of current tracers that may be configured.
RT tasks (as the current "wakeup" does). This is useful
for those interested in wake up timings of RT tasks.
- "hw-branch-tracer"
-
- Uses the BTS CPU feature on x86 CPUs to traces all
- branches executed.
-
"nop"
This is the "trace nothing" tracer. To remove all
@@ -261,70 +546,100 @@ Here is an example of the output format of the file "trace"
--------
# tracer: function
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
- bash-4251 [01] 10152.583854: path_put <-path_walk
- bash-4251 [01] 10152.583855: dput <-path_put
- bash-4251 [01] 10152.583855: _atomic_dec_and_lock <-dput
+# entries-in-buffer/entries-written: 140080/250280 #P:4
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ bash-1977 [000] .... 17284.993652: sys_close <-system_call_fastpath
+ bash-1977 [000] .... 17284.993653: __close_fd <-sys_close
+ bash-1977 [000] .... 17284.993653: _raw_spin_lock <-__close_fd
+ sshd-1974 [003] .... 17284.993653: __srcu_read_unlock <-fsnotify
+ bash-1977 [000] .... 17284.993654: add_preempt_count <-_raw_spin_lock
+ bash-1977 [000] ...1 17284.993655: _raw_spin_unlock <-__close_fd
+ bash-1977 [000] ...1 17284.993656: sub_preempt_count <-_raw_spin_unlock
+ bash-1977 [000] .... 17284.993657: filp_close <-__close_fd
+ bash-1977 [000] .... 17284.993657: dnotify_flush <-filp_close
+ sshd-1974 [003] .... 17284.993658: sys_select <-system_call_fastpath
--------
A header is printed with the tracer name that is represented by
-the trace. In this case the tracer is "function". Then a header
-showing the format. Task name "bash", the task PID "4251", the
-CPU that it was running on "01", the timestamp in <secs>.<usecs>
-format, the function name that was traced "path_put" and the
-parent function that called this function "path_walk". The
-timestamp is the time at which the function was entered.
+the trace. In this case the tracer is "function". Then it shows the
+number of events in the buffer as well as the total number of entries
+that were written. The difference is the number of entries that were
+lost due to the buffer filling up (250280 - 140080 = 110200 events
+lost).
+
+The header explains the content of the events. Task name "bash", the task
+PID "1977", the CPU that it was running on "000", the latency format
+(explained below), the timestamp in <secs>.<usecs> format, the
+function name that was traced "sys_close" and the parent function that
+called this function "system_call_fastpath". The timestamp is the time
+at which the function was entered.
Latency trace format
--------------------
-When the latency-format option is enabled, the trace file gives
-somewhat more information to see why a latency happened.
-Here is a typical trace.
+When the latency-format option is enabled or when one of the latency
+tracers is set, the trace file gives somewhat more information to see
+why a latency happened. Here is a typical trace.
# tracer: irqsoff
#
-irqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: apic_timer_interrupt
- => ended at: do_softirq
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- <idle>-0 0d..1 0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
- <idle>-0 0d.s. 97us : __do_softirq (do_softirq)
- <idle>-0 0d.s1 98us : trace_hardirqs_on (do_softirq)
+# irqsoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 259 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: ps-6143 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: __lock_task_sighand
+# => ended at: _raw_spin_unlock_irqrestore
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ ps-6143 2d... 0us!: trace_hardirqs_off <-__lock_task_sighand
+ ps-6143 2d..1 259us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore
+ ps-6143 2d..1 263us+: time_hardirqs_on <-_raw_spin_unlock_irqrestore
+ ps-6143 2d..1 306us : <stack trace>
+ => trace_hardirqs_on_caller
+ => trace_hardirqs_on
+ => _raw_spin_unlock_irqrestore
+ => do_task_stat
+ => proc_tgid_stat
+ => proc_single_show
+ => seq_read
+ => vfs_read
+ => sys_read
+ => system_call_fastpath
This shows that the current tracer is "irqsoff" tracing the time
-for which interrupts were disabled. It gives the trace version
-and the version of the kernel upon which this was executed on
-(2.6.26-rc8). Then it displays the max latency in microsecs (97
-us). The number of trace entries displayed and the total number
-recorded (both are three: #3/3). The type of preemption that was
-used (PREEMPT). VP, KP, SP, and HP are always zero and are
-reserved for later use. #P is the number of online CPUS (#P:2).
+for which interrupts were disabled. It gives the trace version (which
+never changes) and the version of the kernel upon which this was executed on
+(3.10). Then it displays the max latency in microseconds (259 us). The number
+of trace entries displayed and the total number (both are four: #4/4).
+VP, KP, SP, and HP are always zero and are reserved for later use.
+#P is the number of online CPUs (#P:4).
The task is the process that was running when the latency
-occurred. (swapper pid: 0).
+occurred. (ps pid: 6143).
The start and stop (the functions in which the interrupts were
disabled and enabled respectively) that caused the latencies:
- apic_timer_interrupt is where the interrupts were disabled.
- do_softirq is where they were enabled again.
+ __lock_task_sighand is where the interrupts were disabled.
+ _raw_spin_unlock_irqrestore is where they were enabled again.
The next lines after the header are the trace itself. The header
explains which is which.
@@ -367,16 +682,43 @@ The above is mostly meaningful for kernel developers.
The rest is the same as the 'trace' file.
+ Note, the latency tracers will usually end with a back trace
+ to easily find where the latency occurred.
trace_options
-------------
-The trace_options file is used to control what gets printed in
-the trace output. To see what is available, simply cat the file:
+The trace_options file (or the options directory) is used to control
+what gets printed in the trace output, or manipulate the tracers.
+To see what is available, simply cat the file:
cat trace_options
- print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
- noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
+print-parent
+nosym-offset
+nosym-addr
+noverbose
+noraw
+nohex
+nobin
+noblock
+nostacktrace
+trace_printk
+noftrace_preempt
+nobranch
+annotate
+nouserstacktrace
+nosym-userobj
+noprintk-msg-only
+context-info
+latency-format
+sleep-time
+graph-time
+record-cmd
+overwrite
+nodisable_on_free
+irq-info
+markers
+function-trace
To disable one of the options, echo in the option prepended with
"no".
@@ -428,13 +770,34 @@ Here are the available options:
bin - This will print out the formats in raw binary.
- block - TBD (needs update)
+ block - When set, reading trace_pipe will not block when polled.
stacktrace - This is one of the options that changes the trace
itself. When a trace is recorded, so is the stack
of functions. This allows for back traces of
trace sites.
+ trace_printk - Can disable trace_printk() from writing into the buffer.
+
+ branch - Enable branch tracing with the tracer.
+
+ annotate - It is sometimes confusing when the CPU buffers are full
+ and one CPU buffer had a lot of events recently, thus
+ a shorter time frame, were another CPU may have only had
+ a few events, which lets it have older events. When
+ the trace is reported, it shows the oldest events first,
+ and it may look like only one CPU ran (the one with the
+ oldest events). When the annotate option is set, it will
+ display when a new CPU buffer started:
+
+ <idle>-0 [001] dNs4 21169.031481: wake_up_idle_cpu <-add_timer_on
+ <idle>-0 [001] dNs4 21169.031482: _raw_spin_unlock_irqrestore <-add_timer_on
+ <idle>-0 [001] .Ns4 21169.031484: sub_preempt_count <-_raw_spin_unlock_irqrestore
+##### CPU 2 buffer started ####
+ <idle>-0 [002] .N.1 21169.031484: rcu_idle_exit <-cpu_idle
+ <idle>-0 [001] .Ns3 21169.031484: _raw_spin_unlock <-clocksource_watchdog
+ <idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock
+
userstacktrace - This option changes the trace. It records a
stacktrace of the current userspace thread.
@@ -451,9 +814,13 @@ Here are the available options:
a.out-1623 [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0
x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
- sched-tree - trace all tasks that are on the runqueue, at
- every scheduling event. Will add overhead if
- there's a lot of tasks running at once.
+
+ printk-msg-only - When set, trace_printk()s will only show the format
+ and not their parameters (if trace_bprintk() or
+ trace_bputs() was used to save the trace_printk()).
+
+ context-info - Show only the event data. Hides the comm, PID,
+ timestamp, CPU, and other useful data.
latency-format - This option changes the trace. When
it is enabled, the trace displays
@@ -461,31 +828,61 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
latencies, as described in "Latency
trace format".
+ sleep-time - When running function graph tracer, to include
+ the time a task schedules out in its function.
+ When enabled, it will account time the task has been
+ scheduled out as part of the function call.
+
+ graph-time - When running function graph tracer, to include the
+ time to call nested functions. When this is not set,
+ the time reported for the function will only include
+ the time the function itself executed for, not the time
+ for functions that it called.
+
+ record-cmd - When any event or tracer is enabled, a hook is enabled
+ in the sched_switch trace point to fill comm cache
+ with mapped pids and comms. But this may cause some
+ overhead, and if you only care about pids, and not the
+ name of the task, disabling this option can lower the
+ impact of tracing.
+
overwrite - This controls what happens when the trace buffer is
full. If "1" (default), the oldest events are
discarded and overwritten. If "0", then the newest
events are discarded.
+ (see per_cpu/cpu0/stats for overrun and dropped)
-ftrace_enabled
---------------
+ disable_on_free - When the free_buffer is closed, tracing will
+ stop (tracing_on set to 0).
-The following tracers (listed below) give different output
-depending on whether or not the sysctl ftrace_enabled is set. To
-set ftrace_enabled, one can either use the sysctl function or
-set it via the proc file system interface.
+ irq-info - Shows the interrupt, preempt count, need resched data.
+ When disabled, the trace looks like:
- sysctl kernel.ftrace_enabled=1
+# tracer: function
+#
+# entries-in-buffer/entries-written: 144405/9452052 #P:4
+#
+# TASK-PID CPU# TIMESTAMP FUNCTION
+# | | | | |
+ <idle>-0 [002] 23636.756054: ttwu_do_activate.constprop.89 <-try_to_wake_up
+ <idle>-0 [002] 23636.756054: activate_task <-ttwu_do_activate.constprop.89
+ <idle>-0 [002] 23636.756055: enqueue_task <-activate_task
- or
- echo 1 > /proc/sys/kernel/ftrace_enabled
+ markers - When set, the trace_marker is writable (only by root).
+ When disabled, the trace_marker will error with EINVAL
+ on write.
+
+
+ function-trace - The latency tracers will enable function tracing
+ if this option is enabled (default it is). When
+ it is disabled, the latency tracers do not trace
+ functions. This keeps the overhead of the tracer down
+ when performing latency tests.
-To disable ftrace_enabled simply replace the '1' with '0' in the
-above commands.
+ Note: Some tracers have their own options. They only appear
+ when the tracer is active.
-When ftrace_enabled is set the tracers will also record the
-functions that are within the trace. The descriptions of the
-tracers will also show an example with ftrace enabled.
irqsoff
@@ -506,95 +903,133 @@ new trace is saved.
To reset the maximum, echo 0 into tracing_max_latency. Here is
an example:
+ # echo 0 > options/function-trace
# echo irqsoff > current_tracer
- # echo latency-format > trace_options
- # echo 0 > tracing_max_latency
# echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
# ls -ltr
[...]
# echo 0 > tracing_on
# cat trace
# tracer: irqsoff
#
-irqsoff latency trace v1.1.5 on 2.6.26
---------------------------------------------------------------------
- latency: 12 us, #3/3, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: bash-3730 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: sys_setpgid
- => ended at: sys_setpgid
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- bash-3730 1d... 0us : _write_lock_irq (sys_setpgid)
- bash-3730 1d..1 1us+: _write_unlock_irq (sys_setpgid)
- bash-3730 1d..2 14us : trace_hardirqs_on (sys_setpgid)
-
-
-Here we see that that we had a latency of 12 microsecs (which is
-very good). The _write_lock_irq in sys_setpgid disabled
-interrupts. The difference between the 12 and the displayed
-timestamp 14us occurred because the clock was incremented
+# irqsoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 16 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: swapper/0-0 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: run_timer_softirq
+# => ended at: run_timer_softirq
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ <idle>-0 0d.s2 0us+: _raw_spin_lock_irq <-run_timer_softirq
+ <idle>-0 0dNs3 17us : _raw_spin_unlock_irq <-run_timer_softirq
+ <idle>-0 0dNs3 17us+: trace_hardirqs_on <-run_timer_softirq
+ <idle>-0 0dNs3 25us : <stack trace>
+ => _raw_spin_unlock_irq
+ => run_timer_softirq
+ => __do_softirq
+ => call_softirq
+ => do_softirq
+ => irq_exit
+ => smp_apic_timer_interrupt
+ => apic_timer_interrupt
+ => rcu_idle_exit
+ => cpu_idle
+ => rest_init
+ => start_kernel
+ => x86_64_start_reservations
+ => x86_64_start_kernel
+
+Here we see that that we had a latency of 16 microseconds (which is
+very good). The _raw_spin_lock_irq in run_timer_softirq disabled
+interrupts. The difference between the 16 and the displayed
+timestamp 25us occurred because the clock was incremented
between the time of recording the max latency and the time of
recording the function that had that latency.
-Note the above example had ftrace_enabled not set. If we set the
-ftrace_enabled, we get a much larger output:
+Note the above example had function-trace not set. If we set
+function-trace, we get a much larger output:
+
+ with echo 1 > options/function-trace
# tracer: irqsoff
#
-irqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 50 us, #101/101, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: ls-4339 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: __alloc_pages_internal
- => ended at: __alloc_pages_internal
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- ls-4339 0...1 0us+: get_page_from_freelist (__alloc_pages_internal)
- ls-4339 0d..1 3us : rmqueue_bulk (get_page_from_freelist)
- ls-4339 0d..1 3us : _spin_lock (rmqueue_bulk)
- ls-4339 0d..1 4us : add_preempt_count (_spin_lock)
- ls-4339 0d..2 4us : __rmqueue (rmqueue_bulk)
- ls-4339 0d..2 5us : __rmqueue_smallest (__rmqueue)
- ls-4339 0d..2 5us : __mod_zone_page_state (__rmqueue_smallest)
- ls-4339 0d..2 6us : __rmqueue (rmqueue_bulk)
- ls-4339 0d..2 6us : __rmqueue_smallest (__rmqueue)
- ls-4339 0d..2 7us : __mod_zone_page_state (__rmqueue_smallest)
- ls-4339 0d..2 7us : __rmqueue (rmqueue_bulk)
- ls-4339 0d..2 8us : __rmqueue_smallest (__rmqueue)
+# irqsoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 71 us, #168/168, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: bash-2042 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: ata_scsi_queuecmd
+# => ended at: ata_scsi_queuecmd
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ bash-2042 3d... 0us : _raw_spin_lock_irqsave <-ata_scsi_queuecmd
+ bash-2042 3d... 0us : add_preempt_count <-_raw_spin_lock_irqsave
+ bash-2042 3d..1 1us : ata_scsi_find_dev <-ata_scsi_queuecmd
+ bash-2042 3d..1 1us : __ata_scsi_find_dev <-ata_scsi_find_dev
+ bash-2042 3d..1 2us : ata_find_dev.part.14 <-__ata_scsi_find_dev
+ bash-2042 3d..1 2us : ata_qc_new_init <-__ata_scsi_queuecmd
+ bash-2042 3d..1 3us : ata_sg_init <-__ata_scsi_queuecmd
+ bash-2042 3d..1 4us : ata_scsi_rw_xlat <-__ata_scsi_queuecmd
+ bash-2042 3d..1 4us : ata_build_rw_tf <-ata_scsi_rw_xlat
[...]
- ls-4339 0d..2 46us : __rmqueue_smallest (__rmqueue)
- ls-4339 0d..2 47us : __mod_zone_page_state (__rmqueue_smallest)
- ls-4339 0d..2 47us : __rmqueue (rmqueue_bulk)
- ls-4339 0d..2 48us : __rmqueue_smallest (__rmqueue)
- ls-4339 0d..2 48us : __mod_zone_page_state (__rmqueue_smallest)
- ls-4339 0d..2 49us : _spin_unlock (rmqueue_bulk)
- ls-4339 0d..2 49us : sub_preempt_count (_spin_unlock)
- ls-4339 0d..1 50us : get_page_from_freelist (__alloc_pages_internal)
- ls-4339 0d..2 51us : trace_hardirqs_on (__alloc_pages_internal)
-
-
-
-Here we traced a 50 microsecond latency. But we also see all the
+ bash-2042 3d..1 67us : delay_tsc <-__delay
+ bash-2042 3d..1 67us : add_preempt_count <-delay_tsc
+ bash-2042 3d..2 67us : sub_preempt_count <-delay_tsc
+ bash-2042 3d..1 67us : add_preempt_count <-delay_tsc
+ bash-2042 3d..2 68us : sub_preempt_count <-delay_tsc
+ bash-2042 3d..1 68us+: ata_bmdma_start <-ata_bmdma_qc_issue
+ bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
+ bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
+ bash-2042 3d..1 72us+: trace_hardirqs_on <-ata_scsi_queuecmd
+ bash-2042 3d..1 120us : <stack trace>
+ => _raw_spin_unlock_irqrestore
+ => ata_scsi_queuecmd
+ => scsi_dispatch_cmd
+ => scsi_request_fn
+ => __blk_run_queue_uncond
+ => __blk_run_queue
+ => blk_queue_bio
+ => generic_make_request
+ => submit_bio
+ => submit_bh
+ => __ext3_get_inode_loc
+ => ext3_iget
+ => ext3_lookup
+ => lookup_real
+ => __lookup_hash
+ => walk_component
+ => lookup_last
+ => path_lookupat
+ => filename_lookup
+ => user_path_at_empty
+ => user_path_at
+ => vfs_fstatat
+ => vfs_stat
+ => sys_newstat
+ => system_call_fastpath
+
+
+Here we traced a 71 microsecond latency. But we also see all the
functions that were called during that time. Note that by
enabling function tracing, we incur an added overhead. This
overhead may extend the latency times. But nevertheless, this
@@ -614,120 +1049,122 @@ Like the irqsoff tracer, it records the maximum latency for
which preemption was disabled. The control of preemptoff tracer
is much like the irqsoff tracer.
+ # echo 0 > options/function-trace
# echo preemptoff > current_tracer
- # echo latency-format > trace_options
- # echo 0 > tracing_max_latency
# echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
# ls -ltr
[...]
# echo 0 > tracing_on
# cat trace
# tracer: preemptoff
#
-preemptoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 29 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: do_IRQ
- => ended at: __do_softirq
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- sshd-4261 0d.h. 0us+: irq_enter (do_IRQ)
- sshd-4261 0d.s. 29us : _local_bh_enable (__do_softirq)
- sshd-4261 0d.s1 30us : trace_preempt_on (__do_softirq)
+# preemptoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 46 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: sshd-1991 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: do_IRQ
+# => ended at: do_IRQ
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ sshd-1991 1d.h. 0us+: irq_enter <-do_IRQ
+ sshd-1991 1d..1 46us : irq_exit <-do_IRQ
+ sshd-1991 1d..1 47us+: trace_preempt_on <-do_IRQ
+ sshd-1991 1d..1 52us : <stack trace>
+ => sub_preempt_count
+ => irq_exit
+ => do_IRQ
+ => ret_from_intr
This has some more changes. Preemption was disabled when an
-interrupt came in (notice the 'h'), and was enabled while doing
-a softirq. (notice the 's'). But we also see that interrupts
-have been disabled when entering the preempt off section and
-leaving it (the 'd'). We do not know if interrupts were enabled
-in the mean time.
+interrupt came in (notice the 'h'), and was enabled on exit.
+But we also see that interrupts have been disabled when entering
+the preempt off section and leaving it (the 'd'). We do not know if
+interrupts were enabled in the mean time or shortly after this
+was over.
# tracer: preemptoff
#
-preemptoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 63 us, #87/87, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: remove_wait_queue
- => ended at: __do_softirq
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- sshd-4261 0d..1 0us : _spin_lock_irqsave (remove_wait_queue)
- sshd-4261 0d..1 1us : _spin_unlock_irqrestore (remove_wait_queue)
- sshd-4261 0d..1 2us : do_IRQ (common_interrupt)
- sshd-4261 0d..1 2us : irq_enter (do_IRQ)
- sshd-4261 0d..1 2us : idle_cpu (irq_enter)
- sshd-4261 0d..1 3us : add_preempt_count (irq_enter)
- sshd-4261 0d.h1 3us : idle_cpu (irq_enter)
- sshd-4261 0d.h. 4us : handle_fasteoi_irq (do_IRQ)
+# preemptoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 83 us, #241/241, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: bash-1994 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: wake_up_new_task
+# => ended at: task_rq_unlock
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ bash-1994 1d..1 0us : _raw_spin_lock_irqsave <-wake_up_new_task
+ bash-1994 1d..1 0us : select_task_rq_fair <-select_task_rq
+ bash-1994 1d..1 1us : __rcu_read_lock <-select_task_rq_fair
+ bash-1994 1d..1 1us : source_load <-select_task_rq_fair
+ bash-1994 1d..1 1us : source_load <-select_task_rq_fair
[...]
- sshd-4261 0d.h. 12us : add_preempt_count (_spin_lock)
- sshd-4261 0d.h1 12us : ack_ioapic_quirk_irq (handle_fasteoi_irq)
- sshd-4261 0d.h1 13us : move_native_irq (ack_ioapic_quirk_irq)
- sshd-4261 0d.h1 13us : _spin_unlock (handle_fasteoi_irq)
- sshd-4261 0d.h1 14us : sub_preempt_count (_spin_unlock)
- sshd-4261 0d.h1 14us : irq_exit (do_IRQ)
- sshd-4261 0d.h1 15us : sub_preempt_count (irq_exit)
- sshd-4261 0d..2 15us : do_softirq (irq_exit)
- sshd-4261 0d... 15us : __do_softirq (do_softirq)
- sshd-4261 0d... 16us : __local_bh_disable (__do_softirq)
- sshd-4261 0d... 16us+: add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s4 20us : add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s4 21us : sub_preempt_count (local_bh_enable)
- sshd-4261 0d.s5 21us : sub_preempt_count (local_bh_enable)
+ bash-1994 1d..1 12us : irq_enter <-smp_apic_timer_interrupt
+ bash-1994 1d..1 12us : rcu_irq_enter <-irq_enter
+ bash-1994 1d..1 13us : add_preempt_count <-irq_enter
+ bash-1994 1d.h1 13us : exit_idle <-smp_apic_timer_interrupt
+ bash-1994 1d.h1 13us : hrtimer_interrupt <-smp_apic_timer_interrupt
+ bash-1994 1d.h1 13us : _raw_spin_lock <-hrtimer_interrupt
+ bash-1994 1d.h1 14us : add_preempt_count <-_raw_spin_lock
+ bash-1994 1d.h2 14us : ktime_get_update_offsets <-hrtimer_interrupt
[...]
- sshd-4261 0d.s6 41us : add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s6 42us : sub_preempt_count (local_bh_enable)
- sshd-4261 0d.s7 42us : sub_preempt_count (local_bh_enable)
- sshd-4261 0d.s5 43us : add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s5 43us : sub_preempt_count (local_bh_enable_ip)
- sshd-4261 0d.s6 44us : sub_preempt_count (local_bh_enable_ip)
- sshd-4261 0d.s5 44us : add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s5 45us : sub_preempt_count (local_bh_enable)
+ bash-1994 1d.h1 35us : lapic_next_event <-clockevents_program_event
+ bash-1994 1d.h1 35us : irq_exit <-smp_apic_timer_interrupt
+ bash-1994 1d.h1 36us : sub_preempt_count <-irq_exit
+ bash-1994 1d..2 36us : do_softirq <-irq_exit
+ bash-1994 1d..2 36us : __do_softirq <-call_softirq
+ bash-1994 1d..2 36us : __local_bh_disable <-__do_softirq
+ bash-1994 1d.s2 37us : add_preempt_count <-_raw_spin_lock_irq
+ bash-1994 1d.s3 38us : _raw_spin_unlock <-run_timer_softirq
+ bash-1994 1d.s3 39us : sub_preempt_count <-_raw_spin_unlock
+ bash-1994 1d.s2 39us : call_timer_fn <-run_timer_softirq
[...]
- sshd-4261 0d.s. 63us : _local_bh_enable (__do_softirq)
- sshd-4261 0d.s1 64us : trace_preempt_on (__do_softirq)
+ bash-1994 1dNs2 81us : cpu_needs_another_gp <-rcu_process_callbacks
+ bash-1994 1dNs2 82us : __local_bh_enable <-__do_softirq
+ bash-1994 1dNs2 82us : sub_preempt_count <-__local_bh_enable
+ bash-1994 1dN.2 82us : idle_cpu <-irq_exit
+ bash-1994 1dN.2 83us : rcu_irq_exit <-irq_exit
+ bash-1994 1dN.2 83us : sub_preempt_count <-irq_exit
+ bash-1994 1.N.1 84us : _raw_spin_unlock_irqrestore <-task_rq_unlock
+ bash-1994 1.N.1 84us+: trace_preempt_on <-task_rq_unlock
+ bash-1994 1.N.1 104us : <stack trace>
+ => sub_preempt_count
+ => _raw_spin_unlock_irqrestore
+ => task_rq_unlock
+ => wake_up_new_task
+ => do_fork
+ => sys_clone
+ => stub_clone
The above is an example of the preemptoff trace with
-ftrace_enabled set. Here we see that interrupts were disabled
+function-trace set. Here we see that interrupts were not disabled
the entire time. The irq_enter code lets us know that we entered
an interrupt 'h'. Before that, the functions being traced still
show that it is not in an interrupt, but we can see from the
functions themselves that this is not the case.
-Notice that __do_softirq when called does not have a
-preempt_count. It may seem that we missed a preempt enabling.
-What really happened is that the preempt count is held on the
-thread's stack and we switched to the softirq stack (4K stacks
-in effect). The code does not copy the preempt count, but
-because interrupts are disabled, we do not need to worry about
-it. Having a tracer like this is good for letting people know
-what really happens inside the kernel.
-
-
preemptirqsoff
--------------
@@ -762,38 +1199,57 @@ tracer.
Again, using this trace is much like the irqsoff and preemptoff
tracers.
+ # echo 0 > options/function-trace
# echo preemptirqsoff > current_tracer
- # echo latency-format > trace_options
- # echo 0 > tracing_max_latency
# echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
# ls -ltr
[...]
# echo 0 > tracing_on
# cat trace
# tracer: preemptirqsoff
#
-preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 293 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: ls-4860 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: apic_timer_interrupt
- => ended at: __do_softirq
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- ls-4860 0d... 0us!: trace_hardirqs_off_thunk (apic_timer_interrupt)
- ls-4860 0d.s. 294us : _local_bh_enable (__do_softirq)
- ls-4860 0d.s1 294us : trace_preempt_on (__do_softirq)
-
+# preemptirqsoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 100 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: ls-2230 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: ata_scsi_queuecmd
+# => ended at: ata_scsi_queuecmd
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ ls-2230 3d... 0us+: _raw_spin_lock_irqsave <-ata_scsi_queuecmd
+ ls-2230 3...1 100us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
+ ls-2230 3...1 101us+: trace_preempt_on <-ata_scsi_queuecmd
+ ls-2230 3...1 111us : <stack trace>
+ => sub_preempt_count
+ => _raw_spin_unlock_irqrestore
+ => ata_scsi_queuecmd
+ => scsi_dispatch_cmd
+ => scsi_request_fn
+ => __blk_run_queue_uncond
+ => __blk_run_queue
+ => blk_queue_bio
+ => generic_make_request
+ => submit_bio
+ => submit_bh
+ => ext3_bread
+ => ext3_dir_bread
+ => htree_dirblock_to_tree
+ => ext3_htree_fill_tree
+ => ext3_readdir
+ => vfs_readdir
+ => sys_getdents
+ => system_call_fastpath
The trace_hardirqs_off_thunk is called from assembly on x86 when
@@ -802,105 +1258,158 @@ function tracing, we do not know if interrupts were enabled
within the preemption points. We do see that it started with
preemption enabled.
-Here is a trace with ftrace_enabled set:
-
+Here is a trace with function-trace set:
# tracer: preemptirqsoff
#
-preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 105 us, #183/183, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0)
- -----------------
- => started at: write_chan
- => ended at: __do_softirq
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- ls-4473 0.N.. 0us : preempt_schedule (write_chan)
- ls-4473 0dN.1 1us : _spin_lock (schedule)
- ls-4473 0dN.1 2us : add_preempt_count (_spin_lock)
- ls-4473 0d..2 2us : put_prev_task_fair (schedule)
-[...]
- ls-4473 0d..2 13us : set_normalized_timespec (ktime_get_ts)
- ls-4473 0d..2 13us : __switch_to (schedule)
- sshd-4261 0d..2 14us : finish_task_switch (schedule)
- sshd-4261 0d..2 14us : _spin_unlock_irq (finish_task_switch)
- sshd-4261 0d..1 15us : add_preempt_count (_spin_lock_irqsave)
- sshd-4261 0d..2 16us : _spin_unlock_irqrestore (hrtick_set)
- sshd-4261 0d..2 16us : do_IRQ (common_interrupt)
- sshd-4261 0d..2 17us : irq_enter (do_IRQ)
- sshd-4261 0d..2 17us : idle_cpu (irq_enter)
- sshd-4261 0d..2 18us : add_preempt_count (irq_enter)
- sshd-4261 0d.h2 18us : idle_cpu (irq_enter)
- sshd-4261 0d.h. 18us : handle_fasteoi_irq (do_IRQ)
- sshd-4261 0d.h. 19us : _spin_lock (handle_fasteoi_irq)
- sshd-4261 0d.h. 19us : add_preempt_count (_spin_lock)
- sshd-4261 0d.h1 20us : _spin_unlock (handle_fasteoi_irq)
- sshd-4261 0d.h1 20us : sub_preempt_count (_spin_unlock)
-[...]
- sshd-4261 0d.h1 28us : _spin_unlock (handle_fasteoi_irq)
- sshd-4261 0d.h1 29us : sub_preempt_count (_spin_unlock)
- sshd-4261 0d.h2 29us : irq_exit (do_IRQ)
- sshd-4261 0d.h2 29us : sub_preempt_count (irq_exit)
- sshd-4261 0d..3 30us : do_softirq (irq_exit)
- sshd-4261 0d... 30us : __do_softirq (do_softirq)
- sshd-4261 0d... 31us : __local_bh_disable (__do_softirq)
- sshd-4261 0d... 31us+: add_preempt_count (__local_bh_disable)
- sshd-4261 0d.s4 34us : add_preempt_count (__local_bh_disable)
+# preemptirqsoff latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 161 us, #339/339, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: ls-2269 (uid:0 nice:0 policy:0 rt_prio:0)
+# -----------------
+# => started at: schedule
+# => ended at: mutex_unlock
+#
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+kworker/-59 3...1 0us : __schedule <-schedule
+kworker/-59 3d..1 0us : rcu_preempt_qs <-rcu_note_context_switch
+kworker/-59 3d..1 1us : add_preempt_count <-_raw_spin_lock_irq
+kworker/-59 3d..2 1us : deactivate_task <-__schedule
+kworker/-59 3d..2 1us : dequeue_task <-deactivate_task
+kworker/-59 3d..2 2us : update_rq_clock <-dequeue_task
+kworker/-59 3d..2 2us : dequeue_task_fair <-dequeue_task
+kworker/-59 3d..2 2us : update_curr <-dequeue_task_fair
+kworker/-59 3d..2 2us : update_min_vruntime <-update_curr
+kworker/-59 3d..2 3us : cpuacct_charge <-update_curr
+kworker/-59 3d..2 3us : __rcu_read_lock <-cpuacct_charge
+kworker/-59 3d..2 3us : __rcu_read_unlock <-cpuacct_charge
+kworker/-59 3d..2 3us : update_cfs_rq_blocked_load <-dequeue_task_fair
+kworker/-59 3d..2 4us : clear_buddies <-dequeue_task_fair
+kworker/-59 3d..2 4us : account_entity_dequeue <-dequeue_task_fair
+kworker/-59 3d..2 4us : update_min_vruntime <-dequeue_task_fair
+kworker/-59 3d..2 4us : update_cfs_shares <-dequeue_task_fair
+kworker/-59 3d..2 5us : hrtick_update <-dequeue_task_fair
+kworker/-59 3d..2 5us : wq_worker_sleeping <-__schedule
+kworker/-59 3d..2 5us : kthread_data <-wq_worker_sleeping
+kworker/-59 3d..2 5us : put_prev_task_fair <-__schedule
+kworker/-59 3d..2 6us : pick_next_task_fair <-pick_next_task
+kworker/-59 3d..2 6us : clear_buddies <-pick_next_task_fair
+kworker/-59 3d..2 6us : set_next_entity <-pick_next_task_fair
+kworker/-59 3d..2 6us : update_stats_wait_end <-set_next_entity
+ ls-2269 3d..2 7us : finish_task_switch <-__schedule
+ ls-2269 3d..2 7us : _raw_spin_unlock_irq <-finish_task_switch
+ ls-2269 3d..2 8us : do_IRQ <-ret_from_intr
+ ls-2269 3d..2 8us : irq_enter <-do_IRQ
+ ls-2269 3d..2 8us : rcu_irq_enter <-irq_enter
+ ls-2269 3d..2 9us : add_preempt_count <-irq_enter
+ ls-2269 3d.h2 9us : exit_idle <-do_IRQ
[...]
- sshd-4261 0d.s3 43us : sub_preempt_count (local_bh_enable_ip)
- sshd-4261 0d.s4 44us : sub_preempt_count (local_bh_enable_ip)
- sshd-4261 0d.s3 44us : smp_apic_timer_interrupt (apic_timer_interrupt)
- sshd-4261 0d.s3 45us : irq_enter (smp_apic_timer_interrupt)
- sshd-4261 0d.s3 45us : idle_cpu (irq_enter)
- sshd-4261 0d.s3 46us : add_preempt_count (irq_enter)
- sshd-4261 0d.H3 46us : idle_cpu (irq_enter)
- sshd-4261 0d.H3 47us : hrtimer_interrupt (smp_apic_timer_interrupt)
- sshd-4261 0d.H3 47us : ktime_get (hrtimer_interrupt)
+ ls-2269 3d.h3 20us : sub_preempt_count <-_raw_spin_unlock
+ ls-2269 3d.h2 20us : irq_exit <-do_IRQ
+ ls-2269 3d.h2 21us : sub_preempt_count <-irq_exit
+ ls-2269 3d..3 21us : do_softirq <-irq_exit
+ ls-2269 3d..3 21us : __do_softirq <-call_softirq
+ ls-2269 3d..3 21us+: __local_bh_disable <-__do_softirq
+ ls-2269 3d.s4 29us : sub_preempt_count <-_local_bh_enable_ip
+ ls-2269 3d.s5 29us : sub_preempt_count <-_local_bh_enable_ip
+ ls-2269 3d.s5 31us : do_IRQ <-ret_from_intr
+ ls-2269 3d.s5 31us : irq_enter <-do_IRQ
+ ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter
[...]
- sshd-4261 0d.H3 81us : tick_program_event (hrtimer_interrupt)
- sshd-4261 0d.H3 82us : ktime_get (tick_program_event)
- sshd-4261 0d.H3 82us : ktime_get_ts (ktime_get)
- sshd-4261 0d.H3 83us : getnstimeofday (ktime_get_ts)
- sshd-4261 0d.H3 83us : set_normalized_timespec (ktime_get_ts)
- sshd-4261 0d.H3 84us : clockevents_program_event (tick_program_event)
- sshd-4261 0d.H3 84us : lapic_next_event (clockevents_program_event)
- sshd-4261 0d.H3 85us : irq_exit (smp_apic_timer_interrupt)
- sshd-4261 0d.H3 85us : sub_preempt_count (irq_exit)
- sshd-4261 0d.s4 86us : sub_preempt_count (irq_exit)
- sshd-4261 0d.s3 86us : add_preempt_count (__local_bh_disable)
+ ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter
+ ls-2269 3d.s5 32us : add_preempt_count <-irq_enter
+ ls-2269 3d.H5 32us : exit_idle <-do_IRQ
+ ls-2269 3d.H5 32us : handle_irq <-do_IRQ
+ ls-2269 3d.H5 32us : irq_to_desc <-handle_irq
+ ls-2269 3d.H5 33us : handle_fasteoi_irq <-handle_irq
[...]
- sshd-4261 0d.s1 98us : sub_preempt_count (net_rx_action)
- sshd-4261 0d.s. 99us : add_preempt_count (_spin_lock_irq)
- sshd-4261 0d.s1 99us+: _spin_unlock_irq (run_timer_softirq)
- sshd-4261 0d.s. 104us : _local_bh_enable (__do_softirq)
- sshd-4261 0d.s. 104us : sub_preempt_count (_local_bh_enable)
- sshd-4261 0d.s. 105us : _local_bh_enable (__do_softirq)
- sshd-4261 0d.s1 105us : trace_preempt_on (__do_softirq)
-
-
-This is a very interesting trace. It started with the preemption
-of the ls task. We see that the task had the "need_resched" bit
-set via the 'N' in the trace. Interrupts were disabled before
-the spin_lock at the beginning of the trace. We see that a
-schedule took place to run sshd. When the interrupts were
-enabled, we took an interrupt. On return from the interrupt
-handler, the softirq ran. We took another interrupt while
-running the softirq as we see from the capital 'H'.
+ ls-2269 3d.s5 158us : _raw_spin_unlock_irqrestore <-rtl8139_poll
+ ls-2269 3d.s3 158us : net_rps_action_and_irq_enable.isra.65 <-net_rx_action
+ ls-2269 3d.s3 159us : __local_bh_enable <-__do_softirq
+ ls-2269 3d.s3 159us : sub_preempt_count <-__local_bh_enable
+ ls-2269 3d..3 159us : idle_cpu <-irq_exit
+ ls-2269 3d..3 159us : rcu_irq_exit <-irq_exit
+ ls-2269 3d..3 160us : sub_preempt_count <-irq_exit
+ ls-2269 3d... 161us : __mutex_unlock_slowpath <-mutex_unlock
+ ls-2269 3d... 162us+: trace_hardirqs_on <-mutex_unlock
+ ls-2269 3d... 186us : <stack trace>
+ => __mutex_unlock_slowpath
+ => mutex_unlock
+ => process_output
+ => n_tty_write
+ => tty_write
+ => vfs_write
+ => sys_write
+ => system_call_fastpath
+
+This is an interesting trace. It started with kworker running and
+scheduling out and ls taking over. But as soon as ls released the
+rq lock and enabled interrupts (but not preemption) an interrupt
+triggered. When the interrupt finished, it started running softirqs.
+But while the softirq was running, another interrupt triggered.
+When an interrupt is running inside a softirq, the annotation is 'H'.
wakeup
------
+One common case that people are interested in tracing is the
+time it takes for a task that is woken to actually wake up.
+Now for non Real-Time tasks, this can be arbitrary. But tracing
+it none the less can be interesting.
+
+Without function tracing:
+
+ # echo 0 > options/function-trace
+ # echo wakeup > current_tracer
+ # echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
+ # chrt -f 5 sleep 1
+ # echo 0 > tracing_on
+ # cat trace
+# tracer: wakeup
+#
+# wakeup latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 15 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: kworker/3:1H-312 (uid:0 nice:-20 policy:0 rt_prio:0)
+# -----------------
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ <idle>-0 3dNs7 0us : 0:120:R + [003] 312:100:R kworker/3:1H
+ <idle>-0 3dNs7 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up
+ <idle>-0 3d..3 15us : __schedule <-schedule
+ <idle>-0 3d..3 15us : 0:120:R ==> [003] 312:100:R kworker/3:1H
+
+The tracer only traces the highest priority task in the system
+to avoid tracing the normal circumstances. Here we see that
+the kworker with a nice priority of -20 (not very nice), took
+just 15 microseconds from the time it woke up, to the time it
+ran.
+
+Non Real-Time tasks are not that interesting. A more interesting
+trace is to concentrate only on Real-Time tasks.
+
+wakeup_rt
+---------
+
In a Real-Time environment it is very important to know the
wakeup time it takes for the highest priority task that is woken
up to the time that it executes. This is also known as "schedule
@@ -914,124 +1423,229 @@ Real-Time environments are interested in the worst case latency.
That is the longest latency it takes for something to happen,
and not the average. We can have a very fast scheduler that may
only have a large latency once in a while, but that would not
-work well with Real-Time tasks. The wakeup tracer was designed
+work well with Real-Time tasks. The wakeup_rt tracer was designed
to record the worst case wakeups of RT tasks. Non-RT tasks are
not recorded because the tracer only records one worst case and
tracing non-RT tasks that are unpredictable will overwrite the
-worst case latency of RT tasks.
+worst case latency of RT tasks (just run the normal wakeup
+tracer for a while to see that effect).
Since this tracer only deals with RT tasks, we will run this
slightly differently than we did with the previous tracers.
Instead of performing an 'ls', we will run 'sleep 1' under
'chrt' which changes the priority of the task.
- # echo wakeup > current_tracer
- # echo latency-format > trace_options
- # echo 0 > tracing_max_latency
+ # echo 0 > options/function-trace
+ # echo wakeup_rt > current_tracer
# echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
# chrt -f 5 sleep 1
# echo 0 > tracing_on
# cat trace
# tracer: wakeup
#
-wakeup latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 4 us, #2/2, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: sleep-4901 (uid:0 nice:0 policy:1 rt_prio:5)
- -----------------
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
- <idle>-0 1d.h4 0us+: try_to_wake_up (wake_up_process)
- <idle>-0 1d..4 4us : schedule (cpu_idle)
-
-
-Running this on an idle system, we see that it only took 4
-microseconds to perform the task switch. Note, since the trace
-marker in the schedule is before the actual "switch", we stop
-the tracing when the recorded task is about to schedule in. This
-may change if we add a new marker at the end of the scheduler.
-
-Notice that the recorded task is 'sleep' with the PID of 4901
+# tracer: wakeup_rt
+#
+# wakeup_rt latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 5 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: sleep-2389 (uid:0 nice:0 policy:1 rt_prio:5)
+# -----------------
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ <idle>-0 3d.h4 0us : 0:120:R + [003] 2389: 94:R sleep
+ <idle>-0 3d.h4 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up
+ <idle>-0 3d..3 5us : __schedule <-schedule
+ <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep
+
+
+Running this on an idle system, we see that it only took 5 microseconds
+to perform the task switch. Note, since the trace point in the schedule
+is before the actual "switch", we stop the tracing when the recorded task
+is about to schedule in. This may change if we add a new marker at the
+end of the scheduler.
+
+Notice that the recorded task is 'sleep' with the PID of 2389
and it has an rt_prio of 5. This priority is user-space priority
and not the internal kernel priority. The policy is 1 for
SCHED_FIFO and 2 for SCHED_RR.
-Doing the same with chrt -r 5 and ftrace_enabled set.
+Note, that the trace data shows the internal priority (99 - rtprio).
-# tracer: wakeup
+ <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep
+
+The 0:120:R means idle was running with a nice priority of 0 (120 - 20)
+and in the running state 'R'. The sleep task was scheduled in with
+2389: 94:R. That is the priority is the kernel rtprio (99 - 5 = 94)
+and it too is in the running state.
+
+Doing the same with chrt -r 5 and function-trace set.
+
+ echo 1 > options/function-trace
+
+# tracer: wakeup_rt
#
-wakeup latency trace v1.1.5 on 2.6.26-rc8
---------------------------------------------------------------------
- latency: 50 us, #60/60, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
- -----------------
- | task: sleep-4068 (uid:0 nice:0 policy:2 rt_prio:5)
- -----------------
-
-# _------=> CPU#
-# / _-----=> irqs-off
-# | / _----=> need-resched
-# || / _---=> hardirq/softirq
-# ||| / _--=> preempt-depth
-# |||| /
-# ||||| delay
-# cmd pid ||||| time | caller
-# \ / ||||| \ | /
-ksoftirq-7 1d.H3 0us : try_to_wake_up (wake_up_process)
-ksoftirq-7 1d.H4 1us : sub_preempt_count (marker_probe_cb)
-ksoftirq-7 1d.H3 2us : check_preempt_wakeup (try_to_wake_up)
-ksoftirq-7 1d.H3 3us : update_curr (check_preempt_wakeup)
-ksoftirq-7 1d.H3 4us : calc_delta_mine (update_curr)
-ksoftirq-7 1d.H3 5us : __resched_task (check_preempt_wakeup)
-ksoftirq-7 1d.H3 6us : task_wake_up_rt (try_to_wake_up)
-ksoftirq-7 1d.H3 7us : _spin_unlock_irqrestore (try_to_wake_up)
-[...]
-ksoftirq-7 1d.H2 17us : irq_exit (smp_apic_timer_interrupt)
-ksoftirq-7 1d.H2 18us : sub_preempt_count (irq_exit)
-ksoftirq-7 1d.s3 19us : sub_preempt_count (irq_exit)
-ksoftirq-7 1..s2 20us : rcu_process_callbacks (__do_softirq)
-[...]
-ksoftirq-7 1..s2 26us : __rcu_process_callbacks (rcu_process_callbacks)
-ksoftirq-7 1d.s2 27us : _local_bh_enable (__do_softirq)
-ksoftirq-7 1d.s2 28us : sub_preempt_count (_local_bh_enable)
-ksoftirq-7 1.N.3 29us : sub_preempt_count (ksoftirqd)
-ksoftirq-7 1.N.2 30us : _cond_resched (ksoftirqd)
-ksoftirq-7 1.N.2 31us : __cond_resched (_cond_resched)
-ksoftirq-7 1.N.2 32us : add_preempt_count (__cond_resched)
-ksoftirq-7 1.N.2 33us : schedule (__cond_resched)
-ksoftirq-7 1.N.2 33us : add_preempt_count (schedule)
-ksoftirq-7 1.N.3 34us : hrtick_clear (schedule)
-ksoftirq-7 1dN.3 35us : _spin_lock (schedule)
-ksoftirq-7 1dN.3 36us : add_preempt_count (_spin_lock)
-ksoftirq-7 1d..4 37us : put_prev_task_fair (schedule)
-ksoftirq-7 1d..4 38us : update_curr (put_prev_task_fair)
-[...]
-ksoftirq-7 1d..5 47us : _spin_trylock (tracing_record_cmdline)
-ksoftirq-7 1d..5 48us : add_preempt_count (_spin_trylock)
-ksoftirq-7 1d..6 49us : _spin_unlock (tracing_record_cmdline)
-ksoftirq-7 1d..6 49us : sub_preempt_count (_spin_unlock)
-ksoftirq-7 1d..4 50us : schedule (__cond_resched)
-
-The interrupt went off while running ksoftirqd. This task runs
-at SCHED_OTHER. Why did not we see the 'N' set early? This may
-be a harmless bug with x86_32 and 4K stacks. On x86_32 with 4K
-stacks configured, the interrupt and softirq run with their own
-stack. Some information is held on the top of the task's stack
-(need_resched and preempt_count are both stored there). The
-setting of the NEED_RESCHED bit is done directly to the task's
-stack, but the reading of the NEED_RESCHED is done by looking at
-the current stack, which in this case is the stack for the hard
-interrupt. This hides the fact that NEED_RESCHED has been set.
-We do not see the 'N' until we switch back to the task's
-assigned stack.
+# wakeup_rt latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 29 us, #85/85, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: sleep-2448 (uid:0 nice:0 policy:1 rt_prio:5)
+# -----------------
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ <idle>-0 3d.h4 1us+: 0:120:R + [003] 2448: 94:R sleep
+ <idle>-0 3d.h4 2us : ttwu_do_activate.constprop.87 <-try_to_wake_up
+ <idle>-0 3d.h3 3us : check_preempt_curr <-ttwu_do_wakeup
+ <idle>-0 3d.h3 3us : resched_task <-check_preempt_curr
+ <idle>-0 3dNh3 4us : task_woken_rt <-ttwu_do_wakeup
+ <idle>-0 3dNh3 4us : _raw_spin_unlock <-try_to_wake_up
+ <idle>-0 3dNh3 4us : sub_preempt_count <-_raw_spin_unlock
+ <idle>-0 3dNh2 5us : ttwu_stat <-try_to_wake_up
+ <idle>-0 3dNh2 5us : _raw_spin_unlock_irqrestore <-try_to_wake_up
+ <idle>-0 3dNh2 6us : sub_preempt_count <-_raw_spin_unlock_irqrestore
+ <idle>-0 3dNh1 6us : _raw_spin_lock <-__run_hrtimer
+ <idle>-0 3dNh1 6us : add_preempt_count <-_raw_spin_lock
+ <idle>-0 3dNh2 7us : _raw_spin_unlock <-hrtimer_interrupt
+ <idle>-0 3dNh2 7us : sub_preempt_count <-_raw_spin_unlock
+ <idle>-0 3dNh1 7us : tick_program_event <-hrtimer_interrupt
+ <idle>-0 3dNh1 7us : clockevents_program_event <-tick_program_event
+ <idle>-0 3dNh1 8us : ktime_get <-clockevents_program_event
+ <idle>-0 3dNh1 8us : lapic_next_event <-clockevents_program_event
+ <idle>-0 3dNh1 8us : irq_exit <-smp_apic_timer_interrupt
+ <idle>-0 3dNh1 9us : sub_preempt_count <-irq_exit
+ <idle>-0 3dN.2 9us : idle_cpu <-irq_exit
+ <idle>-0 3dN.2 9us : rcu_irq_exit <-irq_exit
+ <idle>-0 3dN.2 10us : rcu_eqs_enter_common.isra.45 <-rcu_irq_exit
+ <idle>-0 3dN.2 10us : sub_preempt_count <-irq_exit
+ <idle>-0 3.N.1 11us : rcu_idle_exit <-cpu_idle
+ <idle>-0 3dN.1 11us : rcu_eqs_exit_common.isra.43 <-rcu_idle_exit
+ <idle>-0 3.N.1 11us : tick_nohz_idle_exit <-cpu_idle
+ <idle>-0 3dN.1 12us : menu_hrtimer_cancel <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 12us : ktime_get <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 12us : tick_do_update_jiffies64 <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 13us : update_cpu_load_nohz <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 13us : _raw_spin_lock <-update_cpu_load_nohz
+ <idle>-0 3dN.1 13us : add_preempt_count <-_raw_spin_lock
+ <idle>-0 3dN.2 13us : __update_cpu_load <-update_cpu_load_nohz
+ <idle>-0 3dN.2 14us : sched_avg_update <-__update_cpu_load
+ <idle>-0 3dN.2 14us : _raw_spin_unlock <-update_cpu_load_nohz
+ <idle>-0 3dN.2 14us : sub_preempt_count <-_raw_spin_unlock
+ <idle>-0 3dN.1 15us : calc_load_exit_idle <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 15us : touch_softlockup_watchdog <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 15us : hrtimer_cancel <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 15us : hrtimer_try_to_cancel <-hrtimer_cancel
+ <idle>-0 3dN.1 16us : lock_hrtimer_base.isra.18 <-hrtimer_try_to_cancel
+ <idle>-0 3dN.1 16us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18
+ <idle>-0 3dN.1 16us : add_preempt_count <-_raw_spin_lock_irqsave
+ <idle>-0 3dN.2 17us : __remove_hrtimer <-remove_hrtimer.part.16
+ <idle>-0 3dN.2 17us : hrtimer_force_reprogram <-__remove_hrtimer
+ <idle>-0 3dN.2 17us : tick_program_event <-hrtimer_force_reprogram
+ <idle>-0 3dN.2 18us : clockevents_program_event <-tick_program_event
+ <idle>-0 3dN.2 18us : ktime_get <-clockevents_program_event
+ <idle>-0 3dN.2 18us : lapic_next_event <-clockevents_program_event
+ <idle>-0 3dN.2 19us : _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel
+ <idle>-0 3dN.2 19us : sub_preempt_count <-_raw_spin_unlock_irqrestore
+ <idle>-0 3dN.1 19us : hrtimer_forward <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward
+ <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward
+ <idle>-0 3dN.1 20us : hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11
+ <idle>-0 3dN.1 20us : __hrtimer_start_range_ns <-hrtimer_start_range_ns
+ <idle>-0 3dN.1 21us : lock_hrtimer_base.isra.18 <-__hrtimer_start_range_ns
+ <idle>-0 3dN.1 21us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18
+ <idle>-0 3dN.1 21us : add_preempt_count <-_raw_spin_lock_irqsave
+ <idle>-0 3dN.2 22us : ktime_add_safe <-__hrtimer_start_range_ns
+ <idle>-0 3dN.2 22us : enqueue_hrtimer <-__hrtimer_start_range_ns
+ <idle>-0 3dN.2 22us : tick_program_event <-__hrtimer_start_range_ns
+ <idle>-0 3dN.2 23us : clockevents_program_event <-tick_program_event
+ <idle>-0 3dN.2 23us : ktime_get <-clockevents_program_event
+ <idle>-0 3dN.2 23us : lapic_next_event <-clockevents_program_event
+ <idle>-0 3dN.2 24us : _raw_spin_unlock_irqrestore <-__hrtimer_start_range_ns
+ <idle>-0 3dN.2 24us : sub_preempt_count <-_raw_spin_unlock_irqrestore
+ <idle>-0 3dN.1 24us : account_idle_ticks <-tick_nohz_idle_exit
+ <idle>-0 3dN.1 24us : account_idle_time <-account_idle_ticks
+ <idle>-0 3.N.1 25us : sub_preempt_count <-cpu_idle
+ <idle>-0 3.N.. 25us : schedule <-cpu_idle
+ <idle>-0 3.N.. 25us : __schedule <-preempt_schedule
+ <idle>-0 3.N.. 26us : add_preempt_count <-__schedule
+ <idle>-0 3.N.1 26us : rcu_note_context_switch <-__schedule
+ <idle>-0 3.N.1 26us : rcu_sched_qs <-rcu_note_context_switch
+ <idle>-0 3dN.1 27us : rcu_preempt_qs <-rcu_note_context_switch
+ <idle>-0 3.N.1 27us : _raw_spin_lock_irq <-__schedule
+ <idle>-0 3dN.1 27us : add_preempt_count <-_raw_spin_lock_irq
+ <idle>-0 3dN.2 28us : put_prev_task_idle <-__schedule
+ <idle>-0 3dN.2 28us : pick_next_task_stop <-pick_next_task
+ <idle>-0 3dN.2 28us : pick_next_task_rt <-pick_next_task
+ <idle>-0 3dN.2 29us : dequeue_pushable_task <-pick_next_task_rt
+ <idle>-0 3d..3 29us : __schedule <-preempt_schedule
+ <idle>-0 3d..3 30us : 0:120:R ==> [003] 2448: 94:R sleep
+
+This isn't that big of a trace, even with function tracing enabled,
+so I included the entire trace.
+
+The interrupt went off while when the system was idle. Somewhere
+before task_woken_rt() was called, the NEED_RESCHED flag was set,
+this is indicated by the first occurrence of the 'N' flag.
+
+Latency tracing and events
+--------------------------
+As function tracing can induce a much larger latency, but without
+seeing what happens within the latency it is hard to know what
+caused it. There is a middle ground, and that is with enabling
+events.
+
+ # echo 0 > options/function-trace
+ # echo wakeup_rt > current_tracer
+ # echo 1 > events/enable
+ # echo 1 > tracing_on
+ # echo 0 > tracing_max_latency
+ # chrt -f 5 sleep 1
+ # echo 0 > tracing_on
+ # cat trace
+# tracer: wakeup_rt
+#
+# wakeup_rt latency trace v1.1.5 on 3.8.0-test+
+# --------------------------------------------------------------------
+# latency: 6 us, #12/12, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
+# -----------------
+# | task: sleep-5882 (uid:0 nice:0 policy:1 rt_prio:5)
+# -----------------
+#
+# _------=> CPU#
+# / _-----=> irqs-off
+# | / _----=> need-resched
+# || / _---=> hardirq/softirq
+# ||| / _--=> preempt-depth
+# |||| / delay
+# cmd pid ||||| time | caller
+# \ / ||||| \ | /
+ <idle>-0 2d.h4 0us : 0:120:R + [002] 5882: 94:R sleep
+ <idle>-0 2d.h4 0us : ttwu_do_activate.constprop.87 <-try_to_wake_up
+ <idle>-0 2d.h4 1us : sched_wakeup: comm=sleep pid=5882 prio=94 success=1 target_cpu=002
+ <idle>-0 2dNh2 1us : hrtimer_expire_exit: hrtimer=ffff88007796feb8
+ <idle>-0 2.N.2 2us : power_end: cpu_id=2
+ <idle>-0 2.N.2 3us : cpu_idle: state=4294967295 cpu_id=2
+ <idle>-0 2dN.3 4us : hrtimer_cancel: hrtimer=ffff88007d50d5e0
+ <idle>-0 2dN.3 4us : hrtimer_start: hrtimer=ffff88007d50d5e0 function=tick_sched_timer expires=34311211000000 softexpires=34311211000000
+ <idle>-0 2.N.2 5us : rcu_utilization: Start context switch
+ <idle>-0 2.N.2 5us : rcu_utilization: End context switch
+ <idle>-0 2d..3 6us : __schedule <-schedule
+ <idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep
+
function
--------
@@ -1039,6 +1653,7 @@ function
This tracer is the function tracer. Enabling the function tracer
can be done from the debug file system. Make sure the
ftrace_enabled is set; otherwise this tracer is a nop.
+See the "ftrace_enabled" section below.
# sysctl kernel.ftrace_enabled=1
# echo function > current_tracer
@@ -1048,23 +1663,23 @@ ftrace_enabled is set; otherwise this tracer is a nop.
# cat trace
# tracer: function
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
- bash-4003 [00] 123.638713: finish_task_switch <-schedule
- bash-4003 [00] 123.638714: _spin_unlock_irq <-finish_task_switch
- bash-4003 [00] 123.638714: sub_preempt_count <-_spin_unlock_irq
- bash-4003 [00] 123.638715: hrtick_set <-schedule
- bash-4003 [00] 123.638715: _spin_lock_irqsave <-hrtick_set
- bash-4003 [00] 123.638716: add_preempt_count <-_spin_lock_irqsave
- bash-4003 [00] 123.638716: _spin_unlock_irqrestore <-hrtick_set
- bash-4003 [00] 123.638717: sub_preempt_count <-_spin_unlock_irqrestore
- bash-4003 [00] 123.638717: hrtick_clear <-hrtick_set
- bash-4003 [00] 123.638718: sub_preempt_count <-schedule
- bash-4003 [00] 123.638718: sub_preempt_count <-preempt_schedule
- bash-4003 [00] 123.638719: wait_for_completion <-__stop_machine_run
- bash-4003 [00] 123.638719: wait_for_common <-wait_for_completion
- bash-4003 [00] 123.638720: _spin_lock_irq <-wait_for_common
- bash-4003 [00] 123.638720: add_preempt_count <-_spin_lock_irq
+# entries-in-buffer/entries-written: 24799/24799 #P:4
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ bash-1994 [002] .... 3082.063030: mutex_unlock <-rb_simple_write
+ bash-1994 [002] .... 3082.063031: __mutex_unlock_slowpath <-mutex_unlock
+ bash-1994 [002] .... 3082.063031: __fsnotify_parent <-fsnotify_modify
+ bash-1994 [002] .... 3082.063032: fsnotify <-fsnotify_modify
+ bash-1994 [002] .... 3082.063032: __srcu_read_lock <-fsnotify
+ bash-1994 [002] .... 3082.063032: add_preempt_count <-__srcu_read_lock
+ bash-1994 [002] ...1 3082.063032: sub_preempt_count <-__srcu_read_lock
+ bash-1994 [002] .... 3082.063033: __srcu_read_unlock <-fsnotify
[...]
@@ -1214,79 +1829,19 @@ int main (int argc, char **argv)
return 0;
}
+Or this simple script!
-hw-branch-tracer (x86 only)
----------------------------
-
-This tracer uses the x86 last branch tracing hardware feature to
-collect a branch trace on all cpus with relatively low overhead.
-
-The tracer uses a fixed-size circular buffer per cpu and only
-traces ring 0 branches. The trace file dumps that buffer in the
-following format:
-
-# tracer: hw-branch-tracer
-#
-# CPU# TO <- FROM
- 0 scheduler_tick+0xb5/0x1bf <- task_tick_idle+0x5/0x6
- 2 run_posix_cpu_timers+0x2b/0x72a <- run_posix_cpu_timers+0x25/0x72a
- 0 scheduler_tick+0x139/0x1bf <- scheduler_tick+0xed/0x1bf
- 0 scheduler_tick+0x17c/0x1bf <- scheduler_tick+0x148/0x1bf
- 2 run_posix_cpu_timers+0x9e/0x72a <- run_posix_cpu_timers+0x5e/0x72a
- 0 scheduler_tick+0x1b6/0x1bf <- scheduler_tick+0x1aa/0x1bf
-
-
-The tracer may be used to dump the trace for the oops'ing cpu on
-a kernel oops into the system log. To enable this,
-ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one
-can either use the sysctl function or set it via the proc system
-interface.
-
- sysctl kernel.ftrace_dump_on_oops=n
-
-or
-
- echo n > /proc/sys/kernel/ftrace_dump_on_oops
-
-If n = 1, ftrace will dump buffers of all CPUs, if n = 2 ftrace will
-only dump the buffer of the CPU that triggered the oops.
-
-Here's an example of such a dump after a null pointer
-dereference in a kernel module:
-
-[57848.105921] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
-[57848.106019] IP: [<ffffffffa0000006>] open+0x6/0x14 [oops]
-[57848.106019] PGD 2354e9067 PUD 2375e7067 PMD 0
-[57848.106019] Oops: 0002 [#1] SMP
-[57848.106019] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:20:05.0/local_cpus
-[57848.106019] Dumping ftrace buffer:
-[57848.106019] ---------------------------------
-[...]
-[57848.106019] 0 chrdev_open+0xe6/0x165 <- cdev_put+0x23/0x24
-[57848.106019] 0 chrdev_open+0x117/0x165 <- chrdev_open+0xfa/0x165
-[57848.106019] 0 chrdev_open+0x120/0x165 <- chrdev_open+0x11c/0x165
-[57848.106019] 0 chrdev_open+0x134/0x165 <- chrdev_open+0x12b/0x165
-[57848.106019] 0 open+0x0/0x14 [oops] <- chrdev_open+0x144/0x165
-[57848.106019] 0 page_fault+0x0/0x30 <- open+0x6/0x14 [oops]
-[57848.106019] 0 error_entry+0x0/0x5b <- page_fault+0x4/0x30
-[57848.106019] 0 error_kernelspace+0x0/0x31 <- error_entry+0x59/0x5b
-[57848.106019] 0 error_sti+0x0/0x1 <- error_kernelspace+0x2d/0x31
-[57848.106019] 0 page_fault+0x9/0x30 <- error_sti+0x0/0x1
-[57848.106019] 0 do_page_fault+0x0/0x881 <- page_fault+0x1a/0x30
-[...]
-[57848.106019] 0 do_page_fault+0x66b/0x881 <- is_prefetch+0x1ee/0x1f2
-[57848.106019] 0 do_page_fault+0x6e0/0x881 <- do_page_fault+0x67a/0x881
-[57848.106019] 0 oops_begin+0x0/0x96 <- do_page_fault+0x6e0/0x881
-[57848.106019] 0 trace_hw_branch_oops+0x0/0x2d <- oops_begin+0x9/0x96
-[...]
-[57848.106019] 0 ds_suspend_bts+0x2a/0xe3 <- ds_suspend_bts+0x1a/0xe3
-[57848.106019] ---------------------------------
-[57848.106019] CPU 0
-[57848.106019] Modules linked in: oops
-[57848.106019] Pid: 5542, comm: cat Tainted: G W 2.6.28 #23
-[57848.106019] RIP: 0010:[<ffffffffa0000006>] [<ffffffffa0000006>] open+0x6/0x14 [oops]
-[57848.106019] RSP: 0018:ffff880235457d48 EFLAGS: 00010246
-[...]
+------
+#!/bin/bash
+
+debugfs=`sed -ne 's/^debugfs \(.*\) debugfs.*/\1/p' /proc/mounts`
+echo nop > $debugfs/tracing/current_tracer
+echo 0 > $debugfs/tracing/tracing_on
+echo $$ > $debugfs/tracing/set_ftrace_pid
+echo function > $debugfs/tracing/current_tracer
+echo 1 > $debugfs/tracing/tracing_on
+exec "$@"
+------
function graph tracer
@@ -1473,16 +2028,18 @@ starts of pointing to a simple return. (Enabling FTRACE will
include the -pg switch in the compiling of the kernel.)
At compile time every C file object is run through the
-recordmcount.pl script (located in the scripts directory). This
-script will process the C object using objdump to find all the
-locations in the .text section that call mcount. (Note, only the
-.text section is processed, since processing other sections like
-.init.text may cause races due to those sections being freed).
+recordmcount program (located in the scripts directory). This
+program will parse the ELF headers in the C object to find all
+the locations in the .text section that call mcount. (Note, only
+white listed .text sections are processed, since processing other
+sections like .init.text may cause races due to those sections
+being freed unexpectedly).
A new section called "__mcount_loc" is created that holds
references to all the mcount call sites in the .text section.
-This section is compiled back into the original object. The
-final linker will add all these references into a single table.
+The recordmcount program re-links this section back into the
+original object. The final linking stage of the kernel will add all these
+references into a single table.
On boot up, before SMP is initialized, the dynamic ftrace code
scans this table and updates all the locations into nops. It
@@ -1493,13 +2050,25 @@ unloaded, it also removes its functions from the ftrace function
list. This is automatic in the module unload code, and the
module author does not need to worry about it.
-When tracing is enabled, kstop_machine is called to prevent
-races with the CPUS executing code being modified (which can
-cause the CPU to do undesirable things), and the nops are
+When tracing is enabled, the process of modifying the function
+tracepoints is dependent on architecture. The old method is to use
+kstop_machine to prevent races with the CPUs executing code being
+modified (which can cause the CPU to do undesirable things, especially
+if the modified code crosses cache (or page) boundaries), and the nops are
patched back to calls. But this time, they do not call mcount
(which is just a function stub). They now call into the ftrace
infrastructure.
+The new method of modifying the function tracepoints is to place
+a breakpoint at the location to be modified, sync all CPUs, modify
+the rest of the instruction not covered by the breakpoint. Sync
+all CPUs again, and then remove the breakpoint with the finished
+version to the ftrace call site.
+
+Some archs do not even need to monkey around with the synchronization,
+and can just slap the new code on top of the old without any
+problems with other CPUs executing it at the same time.
+
One special side-effect to the recording of the functions being
traced is that we can now selectively choose which functions we
wish to trace and which ones we want the mcount calls to remain
@@ -1530,20 +2099,28 @@ mutex_lock
If I am only interested in sys_nanosleep and hrtimer_interrupt:
- # echo sys_nanosleep hrtimer_interrupt \
- > set_ftrace_filter
+ # echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter
# echo function > current_tracer
# echo 1 > tracing_on
# usleep 1
# echo 0 > tracing_on
# cat trace
-# tracer: ftrace
+# tracer: function
+#
+# entries-in-buffer/entries-written: 5/5 #P:4
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
- usleep-4134 [00] 1317.070017: hrtimer_interrupt <-smp_apic_timer_interrupt
- usleep-4134 [00] 1317.070111: sys_nanosleep <-syscall_call
- <idle>-0 [00] 1317.070115: hrtimer_interrupt <-smp_apic_timer_interrupt
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ usleep-2665 [001] .... 4186.475355: sys_nanosleep <-system_call_fastpath
+ <idle>-0 [001] d.h1 4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt
+ usleep-2665 [001] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
+ <idle>-0 [003] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
+ <idle>-0 [002] d.h1 4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt
To see which functions are being traced, you can cat the file:
@@ -1571,20 +2148,25 @@ Note: It is better to use quotes to enclose the wild cards,
Produces:
-# tracer: ftrace
+# tracer: function
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
- bash-4003 [00] 1480.611794: hrtimer_init <-copy_process
- bash-4003 [00] 1480.611941: hrtimer_start <-hrtick_set
- bash-4003 [00] 1480.611956: hrtimer_cancel <-hrtick_clear
- bash-4003 [00] 1480.611956: hrtimer_try_to_cancel <-hrtimer_cancel
- <idle>-0 [00] 1480.612019: hrtimer_get_next_event <-get_next_timer_interrupt
- <idle>-0 [00] 1480.612025: hrtimer_get_next_event <-get_next_timer_interrupt
- <idle>-0 [00] 1480.612032: hrtimer_get_next_event <-get_next_timer_interrupt
- <idle>-0 [00] 1480.612037: hrtimer_get_next_event <-get_next_timer_interrupt
- <idle>-0 [00] 1480.612382: hrtimer_get_next_event <-get_next_timer_interrupt
-
+# entries-in-buffer/entries-written: 897/897 #P:4
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ <idle>-0 [003] dN.1 4228.547803: hrtimer_cancel <-tick_nohz_idle_exit
+ <idle>-0 [003] dN.1 4228.547804: hrtimer_try_to_cancel <-hrtimer_cancel
+ <idle>-0 [003] dN.2 4228.547805: hrtimer_force_reprogram <-__remove_hrtimer
+ <idle>-0 [003] dN.1 4228.547805: hrtimer_forward <-tick_nohz_idle_exit
+ <idle>-0 [003] dN.1 4228.547805: hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11
+ <idle>-0 [003] d..1 4228.547858: hrtimer_get_next_event <-get_next_timer_interrupt
+ <idle>-0 [003] d..1 4228.547859: hrtimer_start <-__tick_nohz_idle_enter
+ <idle>-0 [003] d..2 4228.547860: hrtimer_force_reprogram <-__rem
Notice that we lost the sys_nanosleep.
@@ -1651,19 +2233,29 @@ traced.
Produces:
-# tracer: ftrace
+# tracer: function
+#
+# entries-in-buffer/entries-written: 39608/39608 #P:4
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
- bash-4043 [01] 115.281644: finish_task_switch <-schedule
- bash-4043 [01] 115.281645: hrtick_set <-schedule
- bash-4043 [01] 115.281645: hrtick_clear <-hrtick_set
- bash-4043 [01] 115.281646: wait_for_completion <-__stop_machine_run
- bash-4043 [01] 115.281647: wait_for_common <-wait_for_completion
- bash-4043 [01] 115.281647: kthread_stop <-stop_machine_run
- bash-4043 [01] 115.281648: init_waitqueue_head <-kthread_stop
- bash-4043 [01] 115.281648: wake_up_process <-kthread_stop
- bash-4043 [01] 115.281649: try_to_wake_up <-wake_up_process
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ bash-1994 [000] .... 4342.324896: file_ra_state_init <-do_dentry_open
+ bash-1994 [000] .... 4342.324897: open_check_o_direct <-do_last
+ bash-1994 [000] .... 4342.324897: ima_file_check <-do_last
+ bash-1994 [000] .... 4342.324898: process_measurement <-ima_file_check
+ bash-1994 [000] .... 4342.324898: ima_get_action <-process_measurement
+ bash-1994 [000] .... 4342.324898: ima_match_policy <-ima_get_action
+ bash-1994 [000] .... 4342.324899: do_truncate <-do_last
+ bash-1994 [000] .... 4342.324899: should_remove_suid <-do_truncate
+ bash-1994 [000] .... 4342.324899: notify_change <-do_truncate
+ bash-1994 [000] .... 4342.324900: current_fs_time <-notify_change
+ bash-1994 [000] .... 4342.324900: current_kernel_time <-current_fs_time
+ bash-1994 [000] .... 4342.324900: timespec_trunc <-current_fs_time
We can see that there's no more lock or preempt tracing.
@@ -1729,6 +2321,28 @@ this special filter via:
echo > set_graph_function
+ftrace_enabled
+--------------
+
+Note, the proc sysctl ftrace_enable is a big on/off switch for the
+function tracer. By default it is enabled (when function tracing is
+enabled in the kernel). If it is disabled, all function tracing is
+disabled. This includes not only the function tracers for ftrace, but
+also for any other uses (perf, kprobes, stack tracing, profiling, etc).
+
+Please disable this with care.
+
+This can be disable (and enabled) with:
+
+ sysctl kernel.ftrace_enabled=0
+ sysctl kernel.ftrace_enabled=1
+
+ or
+
+ echo 0 > /proc/sys/kernel/ftrace_enabled
+ echo 1 > /proc/sys/kernel/ftrace_enabled
+
+
Filter commands
---------------
@@ -1763,12 +2377,58 @@ The following commands are supported:
echo '__schedule_bug:traceoff:5' > set_ftrace_filter
+ To always disable tracing when __schedule_bug is hit:
+
+ echo '__schedule_bug:traceoff' > set_ftrace_filter
+
These commands are cumulative whether or not they are appended
to set_ftrace_filter. To remove a command, prepend it by '!'
and drop the parameter:
+ echo '!__schedule_bug:traceoff:0' > set_ftrace_filter
+
+ The above removes the traceoff command for __schedule_bug
+ that have a counter. To remove commands without counters:
+
echo '!__schedule_bug:traceoff' > set_ftrace_filter
+- snapshot
+ Will cause a snapshot to be triggered when the function is hit.
+
+ echo 'native_flush_tlb_others:snapshot' > set_ftrace_filter
+
+ To only snapshot once:
+
+ echo 'native_flush_tlb_others:snapshot:1' > set_ftrace_filter
+
+ To remove the above commands:
+
+ echo '!native_flush_tlb_others:snapshot' > set_ftrace_filter
+ echo '!native_flush_tlb_others:snapshot:0' > set_ftrace_filter
+
+- enable_event/disable_event
+ These commands can enable or disable a trace event. Note, because
+ function tracing callbacks are very sensitive, when these commands
+ are registered, the trace point is activated, but disabled in
+ a "soft" mode. That is, the tracepoint will be called, but
+ just will not be traced. The event tracepoint stays in this mode
+ as long as there's a command that triggers it.
+
+ echo 'try_to_wake_up:enable_event:sched:sched_switch:2' > \
+ set_ftrace_filter
+
+ The format is:
+
+ <function>:enable_event:<system>:<event>[:count]
+ <function>:disable_event:<system>:<event>[:count]
+
+ To remove the events commands:
+
+
+ echo '!try_to_wake_up:enable_event:sched:sched_switch:0' > \
+ set_ftrace_filter
+ echo '!schedule:disable_event:sched:sched_switch' > \
+ set_ftrace_filter
trace_pipe
----------
@@ -1787,28 +2447,31 @@ different. The trace is live.
# cat trace
# tracer: function
#
-# TASK-PID CPU# TIMESTAMP FUNCTION
-# | | | | |
+# entries-in-buffer/entries-written: 0/0 #P:4
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
#
# cat /tmp/trace.out
- bash-4043 [00] 41.267106: finish_task_switch <-schedule
- bash-4043 [00] 41.267106: hrtick_set <-schedule
- bash-4043 [00] 41.267107: hrtick_clear <-hrtick_set
- bash-4043 [00] 41.267108: wait_for_completion <-__stop_machine_run
- bash-4043 [00] 41.267108: wait_for_common <-wait_for_completion
- bash-4043 [00] 41.267109: kthread_stop <-stop_machine_run
- bash-4043 [00] 41.267109: init_waitqueue_head <-kthread_stop
- bash-4043 [00] 41.267110: wake_up_process <-kthread_stop
- bash-4043 [00] 41.267110: try_to_wake_up <-wake_up_process
- bash-4043 [00] 41.267111: select_task_rq_rt <-try_to_wake_up
+ bash-1994 [000] .... 5281.568961: mutex_unlock <-rb_simple_write
+ bash-1994 [000] .... 5281.568963: __mutex_unlock_slowpath <-mutex_unlock
+ bash-1994 [000] .... 5281.568963: __fsnotify_parent <-fsnotify_modify
+ bash-1994 [000] .... 5281.568964: fsnotify <-fsnotify_modify
+ bash-1994 [000] .... 5281.568964: __srcu_read_lock <-fsnotify
+ bash-1994 [000] .... 5281.568964: add_preempt_count <-__srcu_read_lock
+ bash-1994 [000] ...1 5281.568965: sub_preempt_count <-__srcu_read_lock
+ bash-1994 [000] .... 5281.568965: __srcu_read_unlock <-fsnotify
+ bash-1994 [000] .... 5281.568967: sys_dup2 <-system_call_fastpath
Note, reading the trace_pipe file will block until more input is
-added. By changing the tracer, trace_pipe will issue an EOF. We
-needed to set the function tracer _before_ we "cat" the
-trace_pipe file.
-
+added.
trace entries
-------------
@@ -1817,31 +2480,50 @@ Having too much or not enough data can be troublesome in
diagnosing an issue in the kernel. The file buffer_size_kb is
used to modify the size of the internal trace buffers. The
number listed is the number of entries that can be recorded per
-CPU. To know the full size, multiply the number of possible CPUS
+CPU. To know the full size, multiply the number of possible CPUs
with the number of entries.
# cat buffer_size_kb
1408 (units kilobytes)
-Note, to modify this, you must have tracing completely disabled.
-To do that, echo "nop" into the current_tracer. If the
-current_tracer is not set to "nop", an EINVAL error will be
-returned.
+Or simply read buffer_total_size_kb
+
+ # cat buffer_total_size_kb
+5632
+
+To modify the buffer, simple echo in a number (in 1024 byte segments).
- # echo nop > current_tracer
# echo 10000 > buffer_size_kb
# cat buffer_size_kb
10000 (units kilobytes)
-The number of pages which will be allocated is limited to a
-percentage of available memory. Allocating too much will produce
-an error.
+It will try to allocate as much as possible. If you allocate too
+much, it can cause Out-Of-Memory to trigger.
# echo 1000000000000 > buffer_size_kb
-bash: echo: write error: Cannot allocate memory
# cat buffer_size_kb
85
+The per_cpu buffers can be changed individually as well:
+
+ # echo 10000 > per_cpu/cpu0/buffer_size_kb
+ # echo 100 > per_cpu/cpu1/buffer_size_kb
+
+When the per_cpu buffers are not the same, the buffer_size_kb
+at the top level will just show an X
+
+ # cat buffer_size_kb
+X
+
+This is where the buffer_total_size_kb is useful:
+
+ # cat buffer_total_size_kb
+12916
+
+Writing to the top level buffer_size_kb will reset all the buffers
+to be the same again.
+
Snapshot
--------
CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature
@@ -1925,7 +2607,188 @@ bash: echo: write error: Device or resource busy
# cat snapshot
cat: snapshot: Device or resource busy
+
+Instances
+---------
+In the debugfs tracing directory is a directory called "instances".
+This directory can have new directories created inside of it using
+mkdir, and removing directories with rmdir. The directory created
+with mkdir in this directory will already contain files and other
+directories after it is created.
+
+ # mkdir instances/foo
+ # ls instances/foo
+buffer_size_kb buffer_total_size_kb events free_buffer per_cpu
+set_event snapshot trace trace_clock trace_marker trace_options
+trace_pipe tracing_on
+
+As you can see, the new directory looks similar to the tracing directory
+itself. In fact, it is very similar, except that the buffer and
+events are agnostic from the main director, or from any other
+instances that are created.
+
+The files in the new directory work just like the files with the
+same name in the tracing directory except the buffer that is used
+is a separate and new buffer. The files affect that buffer but do not
+affect the main buffer with the exception of trace_options. Currently,
+the trace_options affect all instances and the top level buffer
+the same, but this may change in future releases. That is, options
+may become specific to the instance they reside in.
+
+Notice that none of the function tracer files are there, nor is
+current_tracer and available_tracers. This is because the buffers
+can currently only have events enabled for them.
+
+ # mkdir instances/foo
+ # mkdir instances/bar
+ # mkdir instances/zoot
+ # echo 100000 > buffer_size_kb
+ # echo 1000 > instances/foo/buffer_size_kb
+ # echo 5000 > instances/bar/per_cpu/cpu1/buffer_size_kb
+ # echo function > current_trace
+ # echo 1 > instances/foo/events/sched/sched_wakeup/enable
+ # echo 1 > instances/foo/events/sched/sched_wakeup_new/enable
+ # echo 1 > instances/foo/events/sched/sched_switch/enable
+ # echo 1 > instances/bar/events/irq/enable
+ # echo 1 > instances/zoot/events/syscalls/enable
+ # cat trace_pipe
+CPU:2 [LOST 11745 EVENTS]
+ bash-2044 [002] .... 10594.481032: _raw_spin_lock_irqsave <-get_page_from_freelist
+ bash-2044 [002] d... 10594.481032: add_preempt_count <-_raw_spin_lock_irqsave
+ bash-2044 [002] d..1 10594.481032: __rmqueue <-get_page_from_freelist
+ bash-2044 [002] d..1 10594.481033: _raw_spin_unlock <-get_page_from_freelist
+ bash-2044 [002] d..1 10594.481033: sub_preempt_count <-_raw_spin_unlock
+ bash-2044 [002] d... 10594.481033: get_pageblock_flags_group <-get_pageblock_migratetype
+ bash-2044 [002] d... 10594.481034: __mod_zone_page_state <-get_page_from_freelist
+ bash-2044 [002] d... 10594.481034: zone_statistics <-get_page_from_freelist
+ bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics
+ bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics
+ bash-2044 [002] .... 10594.481035: arch_dup_task_struct <-copy_process
+[...]
+
+ # cat instances/foo/trace_pipe
+ bash-1998 [000] d..4 136.676759: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000
+ bash-1998 [000] dN.4 136.676760: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000
+ <idle>-0 [003] d.h3 136.676906: sched_wakeup: comm=rcu_preempt pid=9 prio=120 success=1 target_cpu=003
+ <idle>-0 [003] d..3 136.676909: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=9 next_prio=120
+ rcu_preempt-9 [003] d..3 136.676916: sched_switch: prev_comm=rcu_preempt prev_pid=9 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ bash-1998 [000] d..4 136.677014: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000
+ bash-1998 [000] dN.4 136.677016: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000
+ bash-1998 [000] d..3 136.677018: sched_switch: prev_comm=bash prev_pid=1998 prev_prio=120 prev_state=R+ ==> next_comm=kworker/0:1 next_pid=59 next_prio=120
+ kworker/0:1-59 [000] d..4 136.677022: sched_wakeup: comm=sshd pid=1995 prio=120 success=1 target_cpu=001
+ kworker/0:1-59 [000] d..3 136.677025: sched_switch: prev_comm=kworker/0:1 prev_pid=59 prev_prio=120 prev_state=S ==> next_comm=bash next_pid=1998 next_prio=120
+[...]
+
+ # cat instances/bar/trace_pipe
+ migration/1-14 [001] d.h3 138.732674: softirq_raise: vec=3 [action=NET_RX]
+ <idle>-0 [001] dNh3 138.732725: softirq_raise: vec=3 [action=NET_RX]
+ bash-1998 [000] d.h1 138.733101: softirq_raise: vec=1 [action=TIMER]
+ bash-1998 [000] d.h1 138.733102: softirq_raise: vec=9 [action=RCU]
+ bash-1998 [000] ..s2 138.733105: softirq_entry: vec=1 [action=TIMER]
+ bash-1998 [000] ..s2 138.733106: softirq_exit: vec=1 [action=TIMER]
+ bash-1998 [000] ..s2 138.733106: softirq_entry: vec=9 [action=RCU]
+ bash-1998 [000] ..s2 138.733109: softirq_exit: vec=9 [action=RCU]
+ sshd-1995 [001] d.h1 138.733278: irq_handler_entry: irq=21 name=uhci_hcd:usb4
+ sshd-1995 [001] d.h1 138.733280: irq_handler_exit: irq=21 ret=unhandled
+ sshd-1995 [001] d.h1 138.733281: irq_handler_entry: irq=21 name=eth0
+ sshd-1995 [001] d.h1 138.733283: irq_handler_exit: irq=21 ret=handled
+[...]
+
+ # cat instances/zoot/trace
+# tracer: nop
+#
+# entries-in-buffer/entries-written: 18996/18996 #P:4
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ bash-1998 [000] d... 140.733501: sys_write -> 0x2
+ bash-1998 [000] d... 140.733504: sys_dup2(oldfd: a, newfd: 1)
+ bash-1998 [000] d... 140.733506: sys_dup2 -> 0x1
+ bash-1998 [000] d... 140.733508: sys_fcntl(fd: a, cmd: 1, arg: 0)
+ bash-1998 [000] d... 140.733509: sys_fcntl -> 0x1
+ bash-1998 [000] d... 140.733510: sys_close(fd: a)
+ bash-1998 [000] d... 140.733510: sys_close -> 0x0
+ bash-1998 [000] d... 140.733514: sys_rt_sigprocmask(how: 0, nset: 0, oset: 6e2768, sigsetsize: 8)
+ bash-1998 [000] d... 140.733515: sys_rt_sigprocmask -> 0x0
+ bash-1998 [000] d... 140.733516: sys_rt_sigaction(sig: 2, act: 7fff718846f0, oact: 7fff71884650, sigsetsize: 8)
+ bash-1998 [000] d... 140.733516: sys_rt_sigaction -> 0x0
+
+You can see that the trace of the top most trace buffer shows only
+the function tracing. The foo instance displays wakeups and task
+switches.
+
+To remove the instances, simply delete their directories:
+
+ # rmdir instances/foo
+ # rmdir instances/bar
+ # rmdir instances/zoot
+
+Note, if a process has a trace file open in one of the instance
+directories, the rmdir will fail with EBUSY.
+
+
+Stack trace
-----------
+Since the kernel has a fixed sized stack, it is important not to
+waste it in functions. A kernel developer must be conscience of
+what they allocate on the stack. If they add too much, the system
+can be in danger of a stack overflow, and corruption will occur,
+usually leading to a system panic.
+
+There are some tools that check this, usually with interrupts
+periodically checking usage. But if you can perform a check
+at every function call that will become very useful. As ftrace provides
+a function tracer, it makes it convenient to check the stack size
+at every function call. This is enabled via the stack tracer.
+
+CONFIG_STACK_TRACER enables the ftrace stack tracing functionality.
+To enable it, write a '1' into /proc/sys/kernel/stack_tracer_enabled.
+
+ # echo 1 > /proc/sys/kernel/stack_tracer_enabled
+
+You can also enable it from the kernel command line to trace
+the stack size of the kernel during boot up, by adding "stacktrace"
+to the kernel command line parameter.
+
+After running it for a few minutes, the output looks like:
+
+ # cat stack_max_size
+2928
+
+ # cat stack_trace
+ Depth Size Location (18 entries)
+ ----- ---- --------
+ 0) 2928 224 update_sd_lb_stats+0xbc/0x4ac
+ 1) 2704 160 find_busiest_group+0x31/0x1f1
+ 2) 2544 256 load_balance+0xd9/0x662
+ 3) 2288 80 idle_balance+0xbb/0x130
+ 4) 2208 128 __schedule+0x26e/0x5b9
+ 5) 2080 16 schedule+0x64/0x66
+ 6) 2064 128 schedule_timeout+0x34/0xe0
+ 7) 1936 112 wait_for_common+0x97/0xf1
+ 8) 1824 16 wait_for_completion+0x1d/0x1f
+ 9) 1808 128 flush_work+0xfe/0x119
+ 10) 1680 16 tty_flush_to_ldisc+0x1e/0x20
+ 11) 1664 48 input_available_p+0x1d/0x5c
+ 12) 1616 48 n_tty_poll+0x6d/0x134
+ 13) 1568 64 tty_poll+0x64/0x7f
+ 14) 1504 880 do_select+0x31e/0x511
+ 15) 624 400 core_sys_select+0x177/0x216
+ 16) 224 96 sys_select+0x91/0xb9
+ 17) 128 128 system_call_fastpath+0x16/0x1b
+
+Note, if -mfentry is being used by gcc, functions get traced before
+they set up the stack frame. This means that leaf level functions
+are not tested by the stack tracer when -mfentry is used.
+
+Currently, -mfentry is used by gcc 4.6.0 and above on x86 only.
+
+---------
More details can be found in the source code, in the
kernel/trace/*.c files.
diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt
index 4204eb01fd38..1392b61d6ebe 100644
--- a/Documentation/usb/power-management.txt
+++ b/Documentation/usb/power-management.txt
@@ -33,6 +33,10 @@ built with CONFIG_USB_SUSPEND enabled (which depends on
CONFIG_PM_RUNTIME). System PM support is present only if the kernel
was built with CONFIG_SUSPEND or CONFIG_HIBERNATION enabled.
+(Starting with the 3.10 kernel release, dynamic PM support for USB is
+present whenever the kernel was built with CONFIG_PM_RUNTIME enabled.
+The CONFIG_USB_SUSPEND option has been eliminated.)
+
What is Remote Wakeup?
----------------------
@@ -206,10 +210,8 @@ initialized to 5. (The idle-delay values for already existing devices
will not be affected.)
Setting the initial default idle-delay to -1 will prevent any
-autosuspend of any USB device. This is a simple alternative to
-disabling CONFIG_USB_SUSPEND and rebuilding the kernel, and it has the
-added benefit of allowing you to enable autosuspend for selected
-devices.
+autosuspend of any USB device. This has the benefit of allowing you
+then to enable autosuspend for selected devices.
Warnings