diff options
Diffstat (limited to 'Documentation')
155 files changed, 6991 insertions, 1566 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-mei b/Documentation/ABI/testing/sysfs-bus-mei new file mode 100644 index 000000000000..2066f0bbd453 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-mei @@ -0,0 +1,7 @@ +What: /sys/bus/mei/devices/.../modalias +Date: March 2013 +KernelVersion: 3.10 +Contact: Samuel Ortiz <sameo@linux.intel.com> + linux-mei@linux.intel.com +Description: Stores the same MODALIAS value emitted by uevent + Format: mei:<mei device name> diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb index c8baaf53594a..f093e59cbe5f 100644 --- a/Documentation/ABI/testing/sysfs-bus-usb +++ b/Documentation/ABI/testing/sysfs-bus-usb @@ -32,7 +32,7 @@ Date: January 2008 KernelVersion: 2.6.25 Contact: Sarah Sharp <sarah.a.sharp@intel.com> Description: - If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file + If CONFIG_PM_RUNTIME is enabled then this file is present. When read, it returns the total time (in msec) that the USB device has been connected to the machine. This file is read-only. @@ -45,7 +45,7 @@ Date: January 2008 KernelVersion: 2.6.25 Contact: Sarah Sharp <sarah.a.sharp@intel.com> Description: - If CONFIG_PM and CONFIG_USB_SUSPEND are enabled, then this file + If CONFIG_PM_RUNTIME is enabled then this file is present. When read, it returns the total time (in msec) that the USB device has been active, i.e. not in a suspended state. This file is read-only. @@ -187,7 +187,7 @@ What: /sys/bus/usb/devices/.../power/usb2_hardware_lpm Date: September 2011 Contact: Andiry Xu <andiry.xu@amd.com> Description: - If CONFIG_USB_SUSPEND is set and a USB 2.0 lpm-capable device + If CONFIG_PM_RUNTIME is set and a USB 2.0 lpm-capable device is plugged in to a xHCI host which support link PM, it will perform a LPM test; if the test is passed and host supports USB2 hardware LPM (xHCI 1.0 feature), USB2 hardware LPM will diff --git a/Documentation/ABI/testing/sysfs-class-net-mesh b/Documentation/ABI/testing/sysfs-class-net-mesh index bc41da61608d..bdcd8b4e38f2 100644 --- a/Documentation/ABI/testing/sysfs-class-net-mesh +++ b/Documentation/ABI/testing/sysfs-class-net-mesh @@ -67,6 +67,14 @@ Description: Defines the penalty which will be applied to an originator message's tq-field on every hop. +What: /sys/class/net/<mesh_iface>/mesh/network_coding +Date: Nov 2012 +Contact: Martin Hundeboll <martin@hundeboll.net> +Description: + Controls whether Network Coding (using some magic + to send fewer wifi packets but still the same + content) is enabled or not. + What: /sys/class/net/<mesh_iface>/mesh/orig_interval Date: May 2010 Contact: Marek Lindner <lindner_marek@yahoo.de> diff --git a/Documentation/ABI/testing/sysfs-devices-lpss_ltr b/Documentation/ABI/testing/sysfs-devices-lpss_ltr new file mode 100644 index 000000000000..ea9298d9bbaf --- /dev/null +++ b/Documentation/ABI/testing/sysfs-devices-lpss_ltr @@ -0,0 +1,44 @@ +What: /sys/devices/.../lpss_ltr/ +Date: March 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + The /sys/devices/.../lpss_ltr/ directory is only present for + devices included into the Intel Lynxpoint Low Power Subsystem + (LPSS). If present, it contains attributes containing the LTR + mode and the values of LTR registers of the device. + +What: /sys/devices/.../lpss_ltr/ltr_mode +Date: March 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + The /sys/devices/.../lpss_ltr/ltr_mode attribute contains an + integer number (0 or 1) indicating whether or not the devices' + LTR functionality is working in the software mode (1). + + This attribute is read-only. If the device's runtime PM status + is not "active", attempts to read from this attribute cause + -EAGAIN to be returned. + +What: /sys/devices/.../lpss_ltr/auto_ltr +Date: March 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + The /sys/devices/.../lpss_ltr/auto_ltr attribute contains the + current value of the device's AUTO_LTR register (raw) + represented as an 8-digit hexadecimal number. + + This attribute is read-only. If the device's runtime PM status + is not "active", attempts to read from this attribute cause + -EAGAIN to be returned. + +What: /sys/devices/.../lpss_ltr/sw_ltr +Date: March 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + The /sys/devices/.../lpss_ltr/auto_ltr attribute contains the + current value of the device's SW_LTR register (raw) represented + as an 8-digit hexadecimal number. + + This attribute is read-only. If the device's runtime PM status + is not "active", attempts to read from this attribute cause + -EAGAIN to be returned. diff --git a/Documentation/ABI/testing/sysfs-devices-power_resources_wakeup b/Documentation/ABI/testing/sysfs-devices-power_resources_wakeup new file mode 100644 index 000000000000..e0588feeb6e1 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-devices-power_resources_wakeup @@ -0,0 +1,13 @@ +What: /sys/devices/.../power_resources_wakeup/ +Date: April 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + The /sys/devices/.../power_resources_wakeup/ directory is only + present for device objects representing ACPI device nodes that + require ACPI power resources for wakeup signaling. + + If present, it contains symbolic links to device directories + representing ACPI power resources that need to be turned on for + the given device node to be able to signal wakeup. The names of + the links are the same as the names of the directories they + point to. diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 9c978dcae07d..2447698aed41 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -173,3 +173,15 @@ Description: Processor frequency boosting control Boosting allows the CPU and the firmware to run at a frequency beyound it's nominal limit. More details can be found in Documentation/cpu-freq/boost.txt + + +What: /sys/devices/system/cpu/cpu#/crash_notes + /sys/devices/system/cpu/cpu#/crash_notes_size +Date: April 2013 +Contact: kexec@lists.infradead.org +Description: address and size of the percpu note. + + crash_notes: the physical address of the memory that holds the + note of cpu#. + + crash_notes_size: size of the note of cpu#. diff --git a/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku b/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku index 9eca5a182e64..c601d0f2ac46 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku +++ b/Documentation/ABI/testing/sysfs-driver-hid-roccat-isku @@ -101,7 +101,8 @@ Date: June 2011 Contact: Stefan Achatz <erazor_de@users.sourceforge.net> Description: When written, this file lets one set the backlight intensity for a specific profile. Profile number is included in written data. - The data has to be 10 bytes long. + The data has to be 10 bytes long for Isku, IskuFX needs 16 bytes + of data. Before reading this file, control has to be written to select which profile to read. Users: http://roccat.sourceforge.net @@ -141,3 +142,12 @@ Description: When written, this file lets one trigger easyshift functionality The data has to be 16 bytes long. This file is writeonly. Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/isku/roccatisku<minor>/talkfx +Date: February 2013 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one trigger temporary color schemes + from the host. + The data has to be 16 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net diff --git a/Documentation/ABI/testing/sysfs-driver-hid-roccat-konepure b/Documentation/ABI/testing/sysfs-driver-hid-roccat-konepure new file mode 100644 index 000000000000..41a9b7fbfc79 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-hid-roccat-konepure @@ -0,0 +1,105 @@ +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/actual_profile +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The mouse can store 5 profiles which can be switched by the + press of a button. actual_profile holds number of actual profile. + This value is persistent, so its value determines the profile + that's active when the mouse is powered on next time. + When written, the mouse activates the set profile immediately. + The data has to be 3 bytes long. + The mouse will reject invalid data. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/control +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written, this file lets one select which data from which + profile will be read next. The data has to be 3 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/info +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When read, this file returns general data like firmware version. + When written, the device can be reset. + The data is 6 bytes long. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/macro +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The mouse can store a macro with max 500 key/button strokes + internally. + When written, this file lets one set the sequence for a specific + button for a specific profile. Button and profile numbers are + included in written data. The data has to be 2082 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/profile_buttons +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The mouse can store 5 profiles which can be switched by the + press of a button. A profile is split in settings and buttons. + profile_buttons holds information about button layout. + When written, this file lets one write the respective profile + buttons back to the mouse. The data has to be 59 bytes long. + The mouse will reject invalid data. + Which profile to write is determined by the profile number + contained in the data. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/profile_settings +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The mouse can store 5 profiles which can be switched by the + press of a button. A profile is split in settings and buttons. + profile_settings holds information like resolution, sensitivity + and light effects. + When written, this file lets one write the respective profile + settings back to the mouse. The data has to be 31 bytes long. + The mouse will reject invalid data. + Which profile to write is determined by the profile number + contained in the data. + Before reading this file, control has to be written to select + which profile to read. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/sensor +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: The mouse has a tracking- and a distance-control-unit. These + can be activated/deactivated and the lift-off distance can be + set. The data has to be 6 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/talk +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: Used to active some easy* functions of the mouse from outside. + The data has to be 16 bytes long. + This file is writeonly. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/tcu +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When written a calibration process for the tracking control unit + can be initiated/cancelled. Also lets one read/write sensor + registers. + The data has to be 4 bytes long. +Users: http://roccat.sourceforge.net + +What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/<hid-bus>:<vendor-id>:<product-id>.<num>/konepure/roccatkonepure<minor>/tcu_image +Date: December 2012 +Contact: Stefan Achatz <erazor_de@users.sourceforge.net> +Description: When read the mouse returns a 30x30 pixel image of the + sampled underground. This works only in the course of a + calibration process initiated with tcu. + The returned data is 1028 bytes in size. + This file is readonly. +Users: http://roccat.sourceforge.net diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi b/Documentation/ABI/testing/sysfs-firmware-acpi index dd930c8db41f..ce9bee98b43b 100644 --- a/Documentation/ABI/testing/sysfs-firmware-acpi +++ b/Documentation/ABI/testing/sysfs-firmware-acpi @@ -18,6 +18,32 @@ Description: yoffset: The number of pixels between the top of the screen and the top edge of the image. +What: /sys/firmware/acpi/hotplug/ +Date: February 2013 +Contact: Rafael J. Wysocki <rafael.j.wysocki@intel.com> +Description: + There are separate hotplug profiles for different classes of + devices supported by ACPI, such as containers, memory modules, + processors, PCI root bridges etc. A hotplug profile for a given + class of devices is a collection of settings defining the way + that class of devices will be handled by the ACPI core hotplug + code. Those profiles are represented in sysfs as subdirectories + of /sys/firmware/acpi/hotplug/. + + The following setting is available to user space for each + hotplug profile: + + enabled: If set, the ACPI core will handle notifications of + hotplug events associated with the given class of + devices and will allow those devices to be ejected with + the help of the _EJ0 control method. Unsetting it + effectively disables hotplug for the correspoinding + class of devices. + + The value of the above attribute is an integer number: 1 (set) + or 0 (unset). Attempts to write any other values to it will + cause -EINVAL to be returned. + What: /sys/firmware/acpi/interrupts/ Date: February 2008 Contact: Len Brown <lenb@kernel.org> diff --git a/Documentation/DocBook/80211.tmpl b/Documentation/DocBook/80211.tmpl index 284ced7a228f..0f6a3edcd44b 100644 --- a/Documentation/DocBook/80211.tmpl +++ b/Documentation/DocBook/80211.tmpl @@ -437,7 +437,7 @@ </section> !Finclude/net/mac80211.h ieee80211_get_buffered_bc !Finclude/net/mac80211.h ieee80211_beacon_get -!Finclude/net/mac80211.h ieee80211_sta_eosp_irqsafe +!Finclude/net/mac80211.h ieee80211_sta_eosp !Finclude/net/mac80211.h ieee80211_frame_release_type !Finclude/net/mac80211.h ieee80211_sta_ps_transition !Finclude/net/mac80211.h ieee80211_sta_ps_transition_ni diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 7514dbf0a679..c36892c072da 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -227,7 +227,7 @@ X!Isound/sound_firmware.c <chapter id="uart16x50"> <title>16x50 UART Driver</title> !Edrivers/tty/serial/serial_core.c -!Edrivers/tty/serial/8250/8250.c +!Edrivers/tty/serial/8250/8250_core.c </chapter> <chapter id="fbdev"> diff --git a/Documentation/DocBook/media/dvb/dvbproperty.xml b/Documentation/DocBook/media/dvb/dvbproperty.xml index 4a5eaeed0b9e..a9b15e34c5b2 100644 --- a/Documentation/DocBook/media/dvb/dvbproperty.xml +++ b/Documentation/DocBook/media/dvb/dvbproperty.xml @@ -1,6 +1,6 @@ <section id="FE_GET_SET_PROPERTY"> <title><constant>FE_GET_PROPERTY/FE_SET_PROPERTY</constant></title> -<para>This section describes the DVB version 5 extention of the DVB-API, also +<para>This section describes the DVB version 5 extension of the DVB-API, also called "S2API", as this API were added to provide support for DVB-S2. It was designed to be able to replace the old frontend API. Yet, the DISEQC and the capability ioctls weren't implemented yet via the new way.</para> @@ -903,14 +903,12 @@ enum fe_interleaving { <constant>svalue</constant> is for signed values of the measure (dB measures) and <constant>uvalue</constant> is for unsigned values (counters, relative scale)</para></listitem> <listitem><para><constant>scale</constant> - Scale for the value. It can be:</para> - <section id = "fecap-scale-params"> - <itemizedlist mark='bullet'> + <itemizedlist mark='bullet' id="fecap-scale-params"> <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - The parameter is supported by the frontend, but it was not possible to collect it (could be a transitory or permanent condition)</para></listitem> <listitem><para><constant>FE_SCALE_DECIBEL</constant> - parameter is a signed value, measured in 1/1000 dB</para></listitem> <listitem><para><constant>FE_SCALE_RELATIVE</constant> - parameter is a unsigned value, where 0 means 0% and 65535 means 100%.</para></listitem> <listitem><para><constant>FE_SCALE_COUNTER</constant> - parameter is a unsigned value that counts the occurrence of an event, like bit error, block error, or lapsed time.</para></listitem> </itemizedlist> - </section> </listitem> </itemizedlist> <section id="DTV-STAT-SIGNAL-STRENGTH"> @@ -918,9 +916,9 @@ enum fe_interleaving { <para>Indicates the signal strength level at the analog part of the tuner or of the demod.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_DECIBEL</constant> - signal strength is in 0.0001 dBm units, power measured in miliwatts. This value is generally negative.</listitem> - <listitem><constant>FE_SCALE_RELATIVE</constant> - The frontend provides a 0% to 100% measurement for power (actually, 0 to 65535).</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_DECIBEL</constant> - signal strength is in 0.0001 dBm units, power measured in miliwatts. This value is generally negative.</para></listitem> + <listitem><para><constant>FE_SCALE_RELATIVE</constant> - The frontend provides a 0% to 100% measurement for power (actually, 0 to 65535).</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-CNR"> @@ -928,9 +926,9 @@ enum fe_interleaving { <para>Indicates the Signal to Noise ratio for the main carrier.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_DECIBEL</constant> - Signal/Noise ratio is in 0.0001 dB units.</listitem> - <listitem><constant>FE_SCALE_RELATIVE</constant> - The frontend provides a 0% to 100% measurement for Signal/Noise (actually, 0 to 65535).</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_DECIBEL</constant> - Signal/Noise ratio is in 0.0001 dB units.</para></listitem> + <listitem><para><constant>FE_SCALE_RELATIVE</constant> - The frontend provides a 0% to 100% measurement for Signal/Noise (actually, 0 to 65535).</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-PRE-ERROR-BIT-COUNT"> @@ -943,8 +941,8 @@ enum fe_interleaving { The frontend may reset it when a channel/transponder is tuned.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of error bits counted before the inner coding.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of error bits counted before the inner coding.</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-PRE-TOTAL-BIT-COUNT"> @@ -952,14 +950,14 @@ enum fe_interleaving { <para>Measures the amount of bits received before the inner code block, during the same period as <link linkend="DTV-STAT-PRE-ERROR-BIT-COUNT"><constant>DTV_STAT_PRE_ERROR_BIT_COUNT</constant></link> measurement was taken.</para> <para>It should be noticed that this measurement can be smaller than the total amount of bits on the transport stream, - as the frontend may need to manually restart the measurement, loosing some data between each measurement interval.</para> + as the frontend may need to manually restart the measurement, losing some data between each measurement interval.</para> <para>This measurement is monotonically increased, as the frontend gets more bit count measurements. The frontend may reset it when a channel/transponder is tuned.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of bits counted while measuring - <link linkend="DTV-STAT-PRE-ERROR-BIT-COUNT"><constant>DTV_STAT_PRE_ERROR_BIT_COUNT</constant></link>.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of bits counted while measuring + <link linkend="DTV-STAT-PRE-ERROR-BIT-COUNT"><constant>DTV_STAT_PRE_ERROR_BIT_COUNT</constant></link>.</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-POST-ERROR-BIT-COUNT"> @@ -972,8 +970,8 @@ enum fe_interleaving { The frontend may reset it when a channel/transponder is tuned.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of error bits counted after the inner coding.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of error bits counted after the inner coding.</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-POST-TOTAL-BIT-COUNT"> @@ -981,14 +979,14 @@ enum fe_interleaving { <para>Measures the amount of bits received after the inner coding, during the same period as <link linkend="DTV-STAT-POST-ERROR-BIT-COUNT"><constant>DTV_STAT_POST_ERROR_BIT_COUNT</constant></link> measurement was taken.</para> <para>It should be noticed that this measurement can be smaller than the total amount of bits on the transport stream, - as the frontend may need to manually restart the measurement, loosing some data between each measurement interval.</para> + as the frontend may need to manually restart the measurement, losing some data between each measurement interval.</para> <para>This measurement is monotonically increased, as the frontend gets more bit count measurements. The frontend may reset it when a channel/transponder is tuned.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of bits counted while measuring - <link linkend="DTV-STAT-POST-ERROR-BIT-COUNT"><constant>DTV_STAT_POST_ERROR_BIT_COUNT</constant></link>.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of bits counted while measuring + <link linkend="DTV-STAT-POST-ERROR-BIT-COUNT"><constant>DTV_STAT_POST_ERROR_BIT_COUNT</constant></link>.</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-ERROR-BLOCK-COUNT"> @@ -998,8 +996,8 @@ enum fe_interleaving { The frontend may reset it when a channel/transponder is tuned.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of error blocks counted after the outer coding.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of error blocks counted after the outer coding.</para></listitem> </itemizedlist> </section> <section id="DTV-STAT-TOTAL-BLOCK-COUNT"> @@ -1011,9 +1009,9 @@ enum fe_interleaving { by <link linkend="DTV-STAT-TOTAL-BLOCK-COUNT"><constant>DTV-STAT-TOTAL-BLOCK-COUNT</constant></link>.</para> <para>Possible scales for this metric are:</para> <itemizedlist mark='bullet'> - <listitem><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</listitem> - <listitem><constant>FE_SCALE_COUNTER</constant> - Number of blocks counted while measuring - <link linkend="DTV-STAT-ERROR-BLOCK-COUNT"><constant>DTV_STAT_ERROR_BLOCK_COUNT</constant></link>.</listitem> + <listitem><para><constant>FE_SCALE_NOT_AVAILABLE</constant> - it failed to measure it, or the measurement was not complete yet.</para></listitem> + <listitem><para><constant>FE_SCALE_COUNTER</constant> - Number of blocks counted while measuring + <link linkend="DTV-STAT-ERROR-BLOCK-COUNT"><constant>DTV_STAT_ERROR_BLOCK_COUNT</constant></link>.</para></listitem> </itemizedlist> </section> </section> diff --git a/Documentation/DocBook/media/v4l/common.xml b/Documentation/DocBook/media/v4l/common.xml index ae06afbbb3a9..1ddf354aa997 100644 --- a/Documentation/DocBook/media/v4l/common.xml +++ b/Documentation/DocBook/media/v4l/common.xml @@ -750,15 +750,6 @@ header can be used to get the timings of the formats in the <xref linkend="cea86 <xref linkend="vesadmt" /> standards. </para> </listitem> - <listitem> - <para>DV Presets: Digital Video (DV) presets (<emphasis role="bold">deprecated</emphasis>). - These are IDs representing a -video timing at the input/output. Presets are pre-defined timings implemented -by the hardware according to video standards. A __u32 data type is used to represent -a preset unlike the bit mask that is used in &v4l2-std-id; allowing future extensions -to support as many different presets as needed. This API is deprecated in favor of the DV Timings -API.</para> - </listitem> </itemizedlist> <para>To enumerate and query the attributes of the DV timings supported by a device, applications use the &VIDIOC-ENUM-DV-TIMINGS; and &VIDIOC-DV-TIMINGS-CAP; ioctls. @@ -766,11 +757,6 @@ API.</para> &VIDIOC-S-DV-TIMINGS; ioctl and to get current DV timings they use the &VIDIOC-G-DV-TIMINGS; ioctl. To detect the DV timings as seen by the video receiver applications use the &VIDIOC-QUERY-DV-TIMINGS; ioctl.</para> - <para>To enumerate and query the attributes of DV presets supported by a device, -applications use the &VIDIOC-ENUM-DV-PRESETS; ioctl. To get the current DV preset, -applications use the &VIDIOC-G-DV-PRESET; ioctl and to set a preset they use the -&VIDIOC-S-DV-PRESET; ioctl. To detect the preset as seen by the video receiver applications -use the &VIDIOC-QUERY-DV-PRESET; ioctl.</para> <para>Applications can make use of the <xref linkend="input-capabilities" /> and <xref linkend="output-capabilities"/> flags to decide what ioctls are available to set the video timings for the device.</para> diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml index 104a1a2b8849..f43542ae2981 100644 --- a/Documentation/DocBook/media/v4l/compat.xml +++ b/Documentation/DocBook/media/v4l/compat.xml @@ -2310,6 +2310,9 @@ more information.</para> <listitem> <para>Added FM Modulator (FM TX) Extended Control Class: <constant>V4L2_CTRL_CLASS_FM_TX</constant> and their Control IDs.</para> </listitem> +<listitem> + <para>Added FM Receiver (FM RX) Extended Control Class: <constant>V4L2_CTRL_CLASS_FM_RX</constant> and their Control IDs.</para> + </listitem> <listitem> <para>Added Remote Controller chapter, describing the default Remote Controller mapping for media devices.</para> </listitem> @@ -2493,6 +2496,23 @@ that used it. It was originally scheduled for removal in 2.6.35. </orderedlist> </section> + <section> + <title>V4L2 in Linux 3.10</title> + <orderedlist> + <listitem> + <para>Removed obsolete and unused DV_PRESET ioctls + VIDIOC_G_DV_PRESET, VIDIOC_S_DV_PRESET, VIDIOC_QUERY_DV_PRESET and + VIDIOC_ENUM_DV_PRESET. Remove the related v4l2_input/output capability + flags V4L2_IN_CAP_PRESETS and V4L2_OUT_CAP_PRESETS. + </para> + </listitem> + <listitem> + <para>Added new debugging ioctl &VIDIOC-DBG-G-CHIP-INFO;. + </para> + </listitem> + </orderedlist> + </section> + <section id="other"> <title>Relation of V4L2 to other Linux multimedia APIs</title> @@ -2625,8 +2645,8 @@ interfaces and should not be implemented in new drivers.</para> <xref linkend="extended-controls" />.</para> </listitem> <listitem> - <para>&VIDIOC-G-DV-PRESET;, &VIDIOC-S-DV-PRESET;, &VIDIOC-ENUM-DV-PRESETS; and - &VIDIOC-QUERY-DV-PRESET; ioctls. Use the DV Timings API (<xref linkend="dv-timings" />).</para> + <para>VIDIOC_G_DV_PRESET, VIDIOC_S_DV_PRESET, VIDIOC_ENUM_DV_PRESETS and + VIDIOC_QUERY_DV_PRESET ioctls. Use the DV Timings API (<xref linkend="dv-timings" />).</para> </listitem> <listitem> <para><constant>VIDIOC_SUBDEV_G_CROP</constant> and diff --git a/Documentation/DocBook/media/v4l/controls.xml b/Documentation/DocBook/media/v4l/controls.xml index 9e8f85498678..8d7a77928d49 100644 --- a/Documentation/DocBook/media/v4l/controls.xml +++ b/Documentation/DocBook/media/v4l/controls.xml @@ -2300,6 +2300,12 @@ Possible values are:</entry> </row> <row><entry></entry></row> <row> + <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_REPEAT_SEQ_HEADER</constant> </entry> + <entry>boolean</entry> + </row><row><entry spanname="descr">Repeat the video sequence headers. Repeating these +headers makes random access to the video stream easier. Applicable to the MPEG1, 2 and 4 encoder.</entry> + </row> + <row> <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_DECODER_MPEG4_DEBLOCK_FILTER</constant> </entry> <entry>boolean</entry> </row><row><entry spanname="descr">Enabled the deblocking post processing filter for MPEG4 decoder. @@ -3136,6 +3142,13 @@ giving priority to the center of the metered area.</entry> <entry><constant>V4L2_EXPOSURE_METERING_SPOT</constant> </entry> <entry>Measure only very small area at the center of the frame.</entry> </row> + <row> + <entry><constant>V4L2_EXPOSURE_METERING_MATRIX</constant> </entry> + <entry>A multi-zone metering. The light intensity is measured +in several points of the frame and the the results are combined. The +algorithm of the zones selection and their significance in calculating the +final value is device dependant.</entry> + </row> </tbody> </entrytbl> </row> @@ -3848,7 +3861,7 @@ in Hz. The range and step are driver-specific.</entry> </row> <row> <entry spanname="id"><constant>V4L2_CID_TUNE_PREEMPHASIS</constant> </entry> - <entry>integer</entry> + <entry>enum v4l2_preemphasis</entry> </row> <row id="v4l2-preemphasis"><entry spanname="descr">Configures the pre-emphasis value for broadcasting. A pre-emphasis filter is applied to the broadcast to accentuate the high audio frequencies. @@ -4687,4 +4700,76 @@ interface and may change in the future.</para> </table> </section> + + <section id="fm-rx-controls"> + <title>FM Receiver Control Reference</title> + + <para>The FM Receiver (FM_RX) class includes controls for common features of + FM Reception capable devices.</para> + + <table pgwide="1" frame="none" id="fm-rx-control-id"> + <title>FM_RX Control IDs</title> + + <tgroup cols="4"> + <colspec colname="c1" colwidth="1*" /> + <colspec colname="c2" colwidth="6*" /> + <colspec colname="c3" colwidth="2*" /> + <colspec colname="c4" colwidth="6*" /> + <spanspec namest="c1" nameend="c2" spanname="id" /> + <spanspec namest="c2" nameend="c4" spanname="descr" /> + <thead> + <row> + <entry spanname="id" align="left">ID</entry> + <entry align="left">Type</entry> + </row><row rowsep="1"><entry spanname="descr" align="left">Description</entry> + </row> + </thead> + <tbody valign="top"> + <row><entry></entry></row> + <row> + <entry spanname="id"><constant>V4L2_CID_FM_RX_CLASS</constant> </entry> + <entry>class</entry> + </row><row><entry spanname="descr">The FM_RX class +descriptor. Calling &VIDIOC-QUERYCTRL; for this control will return a +description of this control class.</entry> + </row> + <row> + <entry spanname="id"><constant>V4L2_CID_RDS_RECEPTION</constant> </entry> + <entry>boolean</entry> + </row><row><entry spanname="descr">Enables/disables RDS + reception by the radio tuner</entry> + </row> + <row> + <entry spanname="id"><constant>V4L2_CID_TUNE_DEEMPHASIS</constant> </entry> + <entry>enum v4l2_deemphasis</entry> + </row> + <row id="v4l2-deemphasis"><entry spanname="descr">Configures the de-emphasis value for reception. +A de-emphasis filter is applied to the broadcast to accentuate the high audio frequencies. +Depending on the region, a time constant of either 50 or 75 useconds is used. The enum v4l2_deemphasis +defines possible values for de-emphasis. Here they are:</entry> + </row><row> + <entrytbl spanname="descr" cols="2"> + <tbody valign="top"> + <row> + <entry><constant>V4L2_DEEMPHASIS_DISABLED</constant> </entry> + <entry>No de-emphasis is applied.</entry> + </row> + <row> + <entry><constant>V4L2_DEEMPHASIS_50_uS</constant> </entry> + <entry>A de-emphasis of 50 uS is used.</entry> + </row> + <row> + <entry><constant>V4L2_DEEMPHASIS_75_uS</constant> </entry> + <entry>A de-emphasis of 75 uS is used.</entry> + </row> + </tbody> + </entrytbl> + + </row> + <row><entry></entry></row> + </tbody> + </tgroup> + </table> + + </section> </section> diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index e6c58559ca6b..2c4c068dde83 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -1145,6 +1145,12 @@ in which case caches have not been used.</entry> same clock outside V4L2, use <function>clock_gettime(2)</function> .</entry> </row> + <row> + <entry><constant>V4L2_BUF_FLAG_TIMESTAMP_COPY</constant></entry> + <entry>0x4000</entry> + <entry>The CAPTURE buffer timestamp has been taken from the + corresponding OUTPUT buffer. This flag applies only to mem2mem devices.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/media-ioc-enum-entities.xml b/Documentation/DocBook/media/v4l/media-ioc-enum-entities.xml index 576b68b33f2c..116c301656e0 100644 --- a/Documentation/DocBook/media/v4l/media-ioc-enum-entities.xml +++ b/Documentation/DocBook/media/v4l/media-ioc-enum-entities.xml @@ -272,6 +272,16 @@ <entry><constant>MEDIA_ENT_T_V4L2_SUBDEV_LENS</constant></entry> <entry>Lens controller</entry> </row> + <row> + <entry><constant>MEDIA_ENT_T_V4L2_SUBDEV_DECODER</constant></entry> + <entry>Video decoder, the basic function of the video decoder is to + accept analogue video from a wide variety of sources such as + broadcast, DVD players, cameras and video cassette recorders, in + either NTSC, PAL or HD format and still occasionally SECAM, separate + it into its component parts, luminance and chrominance, and output + it in some digital video standard, with appropriate embedded timing + signals.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/subdev-formats.xml b/Documentation/DocBook/media/v4l/subdev-formats.xml index cc51372ed5e0..adc61982df7b 100644 --- a/Documentation/DocBook/media/v4l/subdev-formats.xml +++ b/Documentation/DocBook/media/v4l/subdev-formats.xml @@ -93,19 +93,35 @@ <table pgwide="0" frame="none" id="v4l2-mbus-pixelcode-rgb"> <title>RGB formats</title> - <tgroup cols="11"> + <tgroup cols="27"> <colspec colname="id" align="left" /> <colspec colname="code" align="center"/> <colspec colname="bit" /> - <colspec colnum="4" colname="b07" align="center" /> - <colspec colnum="5" colname="b06" align="center" /> - <colspec colnum="6" colname="b05" align="center" /> - <colspec colnum="7" colname="b04" align="center" /> - <colspec colnum="8" colname="b03" align="center" /> - <colspec colnum="9" colname="b02" align="center" /> - <colspec colnum="10" colname="b01" align="center" /> - <colspec colnum="11" colname="b00" align="center" /> - <spanspec namest="b07" nameend="b00" spanname="b0" /> + <colspec colnum="4" colname="b23" align="center" /> + <colspec colnum="5" colname="b22" align="center" /> + <colspec colnum="6" colname="b21" align="center" /> + <colspec colnum="7" colname="b20" align="center" /> + <colspec colnum="8" colname="b19" align="center" /> + <colspec colnum="9" colname="b18" align="center" /> + <colspec colnum="10" colname="b17" align="center" /> + <colspec colnum="11" colname="b16" align="center" /> + <colspec colnum="12" colname="b15" align="center" /> + <colspec colnum="13" colname="b14" align="center" /> + <colspec colnum="14" colname="b13" align="center" /> + <colspec colnum="15" colname="b12" align="center" /> + <colspec colnum="16" colname="b11" align="center" /> + <colspec colnum="17" colname="b10" align="center" /> + <colspec colnum="18" colname="b09" align="center" /> + <colspec colnum="19" colname="b08" align="center" /> + <colspec colnum="20" colname="b07" align="center" /> + <colspec colnum="21" colname="b06" align="center" /> + <colspec colnum="22" colname="b05" align="center" /> + <colspec colnum="23" colname="b04" align="center" /> + <colspec colnum="24" colname="b03" align="center" /> + <colspec colnum="25" colname="b02" align="center" /> + <colspec colnum="26" colname="b01" align="center" /> + <colspec colnum="27" colname="b00" align="center" /> + <spanspec namest="b23" nameend="b00" spanname="b0" /> <thead> <row> <entry>Identifier</entry> @@ -117,6 +133,22 @@ <entry></entry> <entry></entry> <entry>Bit</entry> + <entry>23</entry> + <entry>22</entry> + <entry>21</entry> + <entry>20</entry> + <entry>19</entry> + <entry>18</entry> + <entry>17</entry> + <entry>16</entry> + <entry>15</entry> + <entry>14</entry> + <entry>13</entry> + <entry>12</entry> + <entry>11</entry> + <entry>10</entry> + <entry>9</entry> + <entry>8</entry> <entry>7</entry> <entry>6</entry> <entry>5</entry> @@ -132,6 +164,7 @@ <entry>V4L2_MBUS_FMT_RGB444_2X8_PADHI_BE</entry> <entry>0x1001</entry> <entry></entry> + &dash-ent-16; <entry>0</entry> <entry>0</entry> <entry>0</entry> @@ -145,6 +178,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>3</subscript></entry> <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> @@ -158,6 +192,7 @@ <entry>V4L2_MBUS_FMT_RGB444_2X8_PADHI_LE</entry> <entry>0x1002</entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>3</subscript></entry> <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> @@ -171,6 +206,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>0</entry> <entry>0</entry> <entry>0</entry> @@ -184,6 +220,7 @@ <entry>V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE</entry> <entry>0x1003</entry> <entry></entry> + &dash-ent-16; <entry>0</entry> <entry>r<subscript>4</subscript></entry> <entry>r<subscript>3</subscript></entry> @@ -197,6 +234,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -210,6 +248,7 @@ <entry>V4L2_MBUS_FMT_RGB555_2X8_PADHI_LE</entry> <entry>0x1004</entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -223,6 +262,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>0</entry> <entry>r<subscript>4</subscript></entry> <entry>r<subscript>3</subscript></entry> @@ -236,6 +276,7 @@ <entry>V4L2_MBUS_FMT_BGR565_2X8_BE</entry> <entry>0x1005</entry> <entry></entry> + &dash-ent-16; <entry>b<subscript>4</subscript></entry> <entry>b<subscript>3</subscript></entry> <entry>b<subscript>2</subscript></entry> @@ -249,6 +290,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -262,6 +304,7 @@ <entry>V4L2_MBUS_FMT_BGR565_2X8_LE</entry> <entry>0x1006</entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -275,6 +318,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>b<subscript>4</subscript></entry> <entry>b<subscript>3</subscript></entry> <entry>b<subscript>2</subscript></entry> @@ -288,6 +332,7 @@ <entry>V4L2_MBUS_FMT_RGB565_2X8_BE</entry> <entry>0x1007</entry> <entry></entry> + &dash-ent-16; <entry>r<subscript>4</subscript></entry> <entry>r<subscript>3</subscript></entry> <entry>r<subscript>2</subscript></entry> @@ -301,6 +346,7 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -314,6 +360,7 @@ <entry>V4L2_MBUS_FMT_RGB565_2X8_LE</entry> <entry>0x1008</entry> <entry></entry> + &dash-ent-16; <entry>g<subscript>2</subscript></entry> <entry>g<subscript>1</subscript></entry> <entry>g<subscript>0</subscript></entry> @@ -327,6 +374,27 @@ <entry></entry> <entry></entry> <entry></entry> + &dash-ent-16; + <entry>r<subscript>4</subscript></entry> + <entry>r<subscript>3</subscript></entry> + <entry>r<subscript>2</subscript></entry> + <entry>r<subscript>1</subscript></entry> + <entry>r<subscript>0</subscript></entry> + <entry>g<subscript>5</subscript></entry> + <entry>g<subscript>4</subscript></entry> + <entry>g<subscript>3</subscript></entry> + </row> + <row id="V4L2-MBUS-FMT-RGB666-1X18"> + <entry>V4L2_MBUS_FMT_RGB666_1X18</entry> + <entry>0x1009</entry> + <entry></entry> + <entry>-</entry> + <entry>-</entry> + <entry>-</entry> + <entry>-</entry> + <entry>-</entry> + <entry>-</entry> + <entry>r<subscript>5</subscript></entry> <entry>r<subscript>4</subscript></entry> <entry>r<subscript>3</subscript></entry> <entry>r<subscript>2</subscript></entry> @@ -335,6 +403,124 @@ <entry>g<subscript>5</subscript></entry> <entry>g<subscript>4</subscript></entry> <entry>g<subscript>3</subscript></entry> + <entry>g<subscript>2</subscript></entry> + <entry>g<subscript>1</subscript></entry> + <entry>g<subscript>0</subscript></entry> + <entry>b<subscript>5</subscript></entry> + <entry>b<subscript>4</subscript></entry> + <entry>b<subscript>3</subscript></entry> + <entry>b<subscript>2</subscript></entry> + <entry>b<subscript>1</subscript></entry> + <entry>b<subscript>0</subscript></entry> + </row> + <row id="V4L2-MBUS-FMT-RGB888-1X24"> + <entry>V4L2_MBUS_FMT_RGB888_1X24</entry> + <entry>0x100a</entry> + <entry></entry> + <entry>r<subscript>7</subscript></entry> + <entry>r<subscript>6</subscript></entry> + <entry>r<subscript>5</subscript></entry> + <entry>r<subscript>4</subscript></entry> + <entry>r<subscript>3</subscript></entry> + <entry>r<subscript>2</subscript></entry> + <entry>r<subscript>1</subscript></entry> + <entry>r<subscript>0</subscript></entry> + <entry>g<subscript>7</subscript></entry> + <entry>g<subscript>6</subscript></entry> + <entry>g<subscript>5</subscript></entry> + <entry>g<subscript>4</subscript></entry> + <entry>g<subscript>3</subscript></entry> + <entry>g<subscript>2</subscript></entry> + <entry>g<subscript>1</subscript></entry> + <entry>g<subscript>0</subscript></entry> + <entry>b<subscript>7</subscript></entry> + <entry>b<subscript>6</subscript></entry> + <entry>b<subscript>5</subscript></entry> + <entry>b<subscript>4</subscript></entry> + <entry>b<subscript>3</subscript></entry> + <entry>b<subscript>2</subscript></entry> + <entry>b<subscript>1</subscript></entry> + <entry>b<subscript>0</subscript></entry> + </row> + <row id="V4L2-MBUS-FMT-RGB888-2X12-BE"> + <entry>V4L2_MBUS_FMT_RGB888_2X12_BE</entry> + <entry>0x100b</entry> + <entry></entry> + &dash-ent-10; + <entry>-</entry> + <entry>-</entry> + <entry>r<subscript>7</subscript></entry> + <entry>r<subscript>6</subscript></entry> + <entry>r<subscript>5</subscript></entry> + <entry>r<subscript>4</subscript></entry> + <entry>r<subscript>3</subscript></entry> + <entry>r<subscript>2</subscript></entry> + <entry>r<subscript>1</subscript></entry> + <entry>r<subscript>0</subscript></entry> + <entry>g<subscript>7</subscript></entry> + <entry>g<subscript>6</subscript></entry> + <entry>g<subscript>5</subscript></entry> + <entry>g<subscript>4</subscript></entry> + </row> + <row> + <entry></entry> + <entry></entry> + <entry></entry> + &dash-ent-10; + <entry>-</entry> + <entry>-</entry> + <entry>g<subscript>3</subscript></entry> + <entry>g<subscript>2</subscript></entry> + <entry>g<subscript>1</subscript></entry> + <entry>g<subscript>0</subscript></entry> + <entry>b<subscript>7</subscript></entry> + <entry>b<subscript>6</subscript></entry> + <entry>b<subscript>5</subscript></entry> + <entry>b<subscript>4</subscript></entry> + <entry>b<subscript>3</subscript></entry> + <entry>b<subscript>2</subscript></entry> + <entry>b<subscript>1</subscript></entry> + <entry>b<subscript>0</subscript></entry> + </row> + <row id="V4L2-MBUS-FMT-RGB888-2X12-LE"> + <entry>V4L2_MBUS_FMT_RGB888_2X12_LE</entry> + <entry>0x100c</entry> + <entry></entry> + &dash-ent-10; + <entry>-</entry> + <entry>-</entry> + <entry>g<subscript>3</subscript></entry> + <entry>g<subscript>2</subscript></entry> + <entry>g<subscript>1</subscript></entry> + <entry>g<subscript>0</subscript></entry> + <entry>b<subscript>7</subscript></entry> + <entry>b<subscript>6</subscript></entry> + <entry>b<subscript>5</subscript></entry> + <entry>b<subscript>4</subscript></entry> + <entry>b<subscript>3</subscript></entry> + <entry>b<subscript>2</subscript></entry> + <entry>b<subscript>1</subscript></entry> + <entry>b<subscript>0</subscript></entry> + </row> + <row> + <entry></entry> + <entry></entry> + <entry></entry> + &dash-ent-10; + <entry>-</entry> + <entry>-</entry> + <entry>r<subscript>7</subscript></entry> + <entry>r<subscript>6</subscript></entry> + <entry>r<subscript>5</subscript></entry> + <entry>r<subscript>4</subscript></entry> + <entry>r<subscript>3</subscript></entry> + <entry>r<subscript>2</subscript></entry> + <entry>r<subscript>1</subscript></entry> + <entry>r<subscript>0</subscript></entry> + <entry>g<subscript>7</subscript></entry> + <entry>g<subscript>6</subscript></entry> + <entry>g<subscript>5</subscript></entry> + <entry>g<subscript>4</subscript></entry> </row> </tbody> </tgroup> diff --git a/Documentation/DocBook/media/v4l/v4l2.xml b/Documentation/DocBook/media/v4l/v4l2.xml index a3cce18384e9..bfc93cdcf696 100644 --- a/Documentation/DocBook/media/v4l/v4l2.xml +++ b/Documentation/DocBook/media/v4l/v4l2.xml @@ -124,6 +124,7 @@ Remote Controller chapter.</contrib> <year>2010</year> <year>2011</year> <year>2012</year> + <year>2013</year> <holder>Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak</holder> @@ -140,12 +141,22 @@ structs, ioctls) must be noted in more detail in the history chapter applications. --> <revision> + <revnumber>3.10</revnumber> + <date>2013-03-25</date> + <authorinitials>hv</authorinitials> + <revremark>Remove obsolete and unused DV_PRESET ioctls: + VIDIOC_G_DV_PRESET, VIDIOC_S_DV_PRESET, VIDIOC_QUERY_DV_PRESET and + VIDIOC_ENUM_DV_PRESET. Remove the related v4l2_input/output capability + flags V4L2_IN_CAP_PRESETS and V4L2_OUT_CAP_PRESETS. Added VIDIOC_DBG_G_CHIP_INFO. + </revremark> + </revision> + + <revision> <revnumber>3.9</revnumber> <date>2012-12-03</date> <authorinitials>sa, sn</authorinitials> <revremark>Added timestamp types to v4l2_buffer. - Added <constant>V4L2_EVENT_CTRL_CH_RANGE</constant> control - event changes flag, see <xref linkend="changes-flags"/>. + Added V4L2_EVENT_CTRL_CH_RANGE control event changes flag. </revremark> </revision> @@ -537,6 +548,7 @@ and discussions on the V4L mailing list.</revremark> &sub-create-bufs; &sub-cropcap; &sub-dbg-g-chip-ident; + &sub-dbg-g-chip-info; &sub-dbg-g-register; &sub-decoder-cmd; &sub-dqevent; @@ -544,7 +556,6 @@ and discussions on the V4L mailing list.</revremark> &sub-encoder-cmd; &sub-enumaudio; &sub-enumaudioout; - &sub-enum-dv-presets; &sub-enum-dv-timings; &sub-enum-fmt; &sub-enum-framesizes; @@ -558,7 +569,6 @@ and discussions on the V4L mailing list.</revremark> &sub-g-audioout; &sub-g-crop; &sub-g-ctrl; - &sub-g-dv-preset; &sub-g-dv-timings; &sub-g-enc-index; &sub-g-ext-ctrls; @@ -582,7 +592,6 @@ and discussions on the V4L mailing list.</revremark> &sub-querybuf; &sub-querycap; &sub-queryctrl; - &sub-query-dv-preset; &sub-query-dv-timings; &sub-querystd; &sub-reqbufs; diff --git a/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-ident.xml b/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-ident.xml index 4ecd966808de..921e18550d26 100644 --- a/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-ident.xml +++ b/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-ident.xml @@ -200,10 +200,10 @@ the values from <xref linkend="chip-ids" />.</entry> &cs-def; <tbody valign="top"> <row> - <entry><constant>V4L2_CHIP_MATCH_HOST</constant></entry> + <entry><constant>V4L2_CHIP_MATCH_BRIDGE</constant></entry> <entry>0</entry> <entry>Match the nth chip on the card, zero for the - host chip. Does not match &i2c; chips.</entry> + bridge chip. Does not match sub-devices.</entry> </row> <row> <entry><constant>V4L2_CHIP_MATCH_I2C_DRIVER</constant></entry> @@ -220,6 +220,11 @@ the values from <xref linkend="chip-ids" />.</entry> <entry>3</entry> <entry>Match the nth anciliary AC97 chip.</entry> </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_SUBDEV</constant></entry> + <entry>4</entry> + <entry>Match the nth sub-device. Can't be used with this ioctl.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-info.xml b/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-info.xml new file mode 100644 index 000000000000..e1cece6c5de1 --- /dev/null +++ b/Documentation/DocBook/media/v4l/vidioc-dbg-g-chip-info.xml @@ -0,0 +1,223 @@ +<refentry id="vidioc-dbg-g-chip-info"> + <refmeta> + <refentrytitle>ioctl VIDIOC_DBG_G_CHIP_INFO</refentrytitle> + &manvol; + </refmeta> + + <refnamediv> + <refname>VIDIOC_DBG_G_CHIP_INFO</refname> + <refpurpose>Identify the chips on a TV card</refpurpose> + </refnamediv> + + <refsynopsisdiv> + <funcsynopsis> + <funcprototype> + <funcdef>int <function>ioctl</function></funcdef> + <paramdef>int <parameter>fd</parameter></paramdef> + <paramdef>int <parameter>request</parameter></paramdef> + <paramdef>struct v4l2_dbg_chip_info +*<parameter>argp</parameter></paramdef> + </funcprototype> + </funcsynopsis> + </refsynopsisdiv> + + <refsect1> + <title>Arguments</title> + + <variablelist> + <varlistentry> + <term><parameter>fd</parameter></term> + <listitem> + <para>&fd;</para> + </listitem> + </varlistentry> + <varlistentry> + <term><parameter>request</parameter></term> + <listitem> + <para>VIDIOC_DBG_G_CHIP_INFO</para> + </listitem> + </varlistentry> + <varlistentry> + <term><parameter>argp</parameter></term> + <listitem> + <para></para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Description</title> + + <note> + <title>Experimental</title> + + <para>This is an <link +linkend="experimental">experimental</link> interface and may change in +the future.</para> + </note> + + <para>For driver debugging purposes this ioctl allows test +applications to query the driver about the chips present on the TV +card. Regular applications must not use it. When you found a chip +specific bug, please contact the linux-media mailing list (&v4l-ml;) +so it can be fixed.</para> + + <para>Additionally the Linux kernel must be compiled with the +<constant>CONFIG_VIDEO_ADV_DEBUG</constant> option to enable this ioctl.</para> + + <para>To query the driver applications must initialize the +<structfield>match.type</structfield> and +<structfield>match.addr</structfield> or <structfield>match.name</structfield> +fields of a &v4l2-dbg-chip-info; +and call <constant>VIDIOC_DBG_G_CHIP_INFO</constant> with a pointer to +this structure. On success the driver stores information about the +selected chip in the <structfield>name</structfield> and +<structfield>flags</structfield> fields. On failure the structure +remains unchanged.</para> + + <para>When <structfield>match.type</structfield> is +<constant>V4L2_CHIP_MATCH_BRIDGE</constant>, +<structfield>match.addr</structfield> selects the nth bridge 'chip' +on the TV card. You can enumerate all chips by starting at zero and +incrementing <structfield>match.addr</structfield> by one until +<constant>VIDIOC_DBG_G_CHIP_INFO</constant> fails with an &EINVAL;. +The number zero always selects the bridge chip itself, ⪚ the chip +connected to the PCI or USB bus. Non-zero numbers identify specific +parts of the bridge chip such as an AC97 register block.</para> + + <para>When <structfield>match.type</structfield> is +<constant>V4L2_CHIP_MATCH_SUBDEV</constant>, +<structfield>match.addr</structfield> selects the nth sub-device. This +allows you to enumerate over all sub-devices.</para> + + <para>On success, the <structfield>name</structfield> field will +contain a chip name and the <structfield>flags</structfield> field will +contain <constant>V4L2_CHIP_FL_READABLE</constant> if the driver supports +reading registers from the device or <constant>V4L2_CHIP_FL_WRITABLE</constant> +if the driver supports writing registers to the device.</para> + + <para>We recommended the <application>v4l2-dbg</application> +utility over calling this ioctl directly. It is available from the +LinuxTV v4l-dvb repository; see <ulink +url="http://linuxtv.org/repo/">http://linuxtv.org/repo/</ulink> for +access instructions.</para> + + <!-- Note for convenience vidioc-dbg-g-register.sgml + contains a duplicate of this table. --> + <table pgwide="1" frame="none" id="name-v4l2-dbg-match"> + <title>struct <structname>v4l2_dbg_match</structname></title> + <tgroup cols="4"> + &cs-ustr; + <tbody valign="top"> + <row> + <entry>__u32</entry> + <entry><structfield>type</structfield></entry> + <entry>See <xref linkend="name-chip-match-types" /> for a list of +possible types.</entry> + </row> + <row> + <entry>union</entry> + <entry>(anonymous)</entry> + </row> + <row> + <entry></entry> + <entry>__u32</entry> + <entry><structfield>addr</structfield></entry> + <entry>Match a chip by this number, interpreted according +to the <structfield>type</structfield> field.</entry> + </row> + <row> + <entry></entry> + <entry>char</entry> + <entry><structfield>name[32]</structfield></entry> + <entry>Match a chip by this name, interpreted according +to the <structfield>type</structfield> field.</entry> + </row> + </tbody> + </tgroup> + </table> + + <table pgwide="1" frame="none" id="v4l2-dbg-chip-info"> + <title>struct <structname>v4l2_dbg_chip_info</structname></title> + <tgroup cols="3"> + &cs-str; + <tbody valign="top"> + <row> + <entry>struct v4l2_dbg_match</entry> + <entry><structfield>match</structfield></entry> + <entry>How to match the chip, see <xref linkend="name-v4l2-dbg-match" />.</entry> + </row> + <row> + <entry>char</entry> + <entry><structfield>name[32]</structfield></entry> + <entry>The name of the chip.</entry> + </row> + <row> + <entry>__u32</entry> + <entry><structfield>flags</structfield></entry> + <entry>Set by the driver. If <constant>V4L2_CHIP_FL_READABLE</constant> +is set, then the driver supports reading registers from the device. If +<constant>V4L2_CHIP_FL_WRITABLE</constant> is set, then it supports writing registers.</entry> + </row> + <row> + <entry>__u32</entry> + <entry><structfield>reserved[8]</structfield></entry> + <entry>Reserved fields, both application and driver must set these to 0.</entry> + </row> + </tbody> + </tgroup> + </table> + + <!-- Note for convenience vidioc-dbg-g-register.sgml + contains a duplicate of this table. --> + <table pgwide="1" frame="none" id="name-chip-match-types"> + <title>Chip Match Types</title> + <tgroup cols="3"> + &cs-def; + <tbody valign="top"> + <row> + <entry><constant>V4L2_CHIP_MATCH_BRIDGE</constant></entry> + <entry>0</entry> + <entry>Match the nth chip on the card, zero for the + bridge chip. Does not match sub-devices.</entry> + </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_I2C_DRIVER</constant></entry> + <entry>1</entry> + <entry>Match an &i2c; chip by its driver name. Can't be used with this ioctl.</entry> + </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_I2C_ADDR</constant></entry> + <entry>2</entry> + <entry>Match a chip by its 7 bit &i2c; bus address. Can't be used with this ioctl.</entry> + </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_AC97</constant></entry> + <entry>3</entry> + <entry>Match the nth anciliary AC97 chip. Can't be used with this ioctl.</entry> + </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_SUBDEV</constant></entry> + <entry>4</entry> + <entry>Match the nth sub-device.</entry> + </row> + </tbody> + </tgroup> + </table> + </refsect1> + + <refsect1> + &return-value; + + <variablelist> + <varlistentry> + <term><errorcode>EINVAL</errorcode></term> + <listitem> + <para>The <structfield>match_type</structfield> is invalid or +no device could be matched.</para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> +</refentry> diff --git a/Documentation/DocBook/media/v4l/vidioc-dbg-g-register.xml b/Documentation/DocBook/media/v4l/vidioc-dbg-g-register.xml index a44aebc7997a..d13bac9e2445 100644 --- a/Documentation/DocBook/media/v4l/vidioc-dbg-g-register.xml +++ b/Documentation/DocBook/media/v4l/vidioc-dbg-g-register.xml @@ -87,7 +87,7 @@ written into the register.</para> <para>To read a register applications must initialize the <structfield>match.type</structfield>, -<structfield>match.chip</structfield> or <structfield>match.name</structfield> and +<structfield>match.addr</structfield> or <structfield>match.name</structfield> and <structfield>reg</structfield> fields, and call <constant>VIDIOC_DBG_G_REGISTER</constant> with a pointer to this structure. On success the driver stores the register value in the @@ -95,11 +95,11 @@ structure. On success the driver stores the register value in the unchanged.</para> <para>When <structfield>match.type</structfield> is -<constant>V4L2_CHIP_MATCH_HOST</constant>, -<structfield>match.addr</structfield> selects the nth non-&i2c; chip +<constant>V4L2_CHIP_MATCH_BRIDGE</constant>, +<structfield>match.addr</structfield> selects the nth non-sub-device chip on the TV card. The number zero always selects the host chip, ⪚ the chip connected to the PCI or USB bus. You can find out which chips are -present with the &VIDIOC-DBG-G-CHIP-IDENT; ioctl.</para> +present with the &VIDIOC-DBG-G-CHIP-INFO; ioctl.</para> <para>When <structfield>match.type</structfield> is <constant>V4L2_CHIP_MATCH_I2C_DRIVER</constant>, @@ -109,7 +109,7 @@ For instance supported by the saa7127 driver, regardless of its &i2c; bus address. When multiple chips supported by the same driver are present, the effect of these ioctls is undefined. Again with the -&VIDIOC-DBG-G-CHIP-IDENT; ioctl you can find out which &i2c; chips are +&VIDIOC-DBG-G-CHIP-INFO; ioctl you can find out which &i2c; chips are present.</para> <para>When <structfield>match.type</structfield> is @@ -122,19 +122,23 @@ bus address.</para> <structfield>match.addr</structfield> selects the nth AC97 chip on the TV card.</para> + <para>When <structfield>match.type</structfield> is +<constant>V4L2_CHIP_MATCH_SUBDEV</constant>, +<structfield>match.addr</structfield> selects the nth sub-device.</para> + <note> <title>Success not guaranteed</title> <para>Due to a flaw in the Linux &i2c; bus driver these ioctls may return successfully without actually reading or writing a register. To -catch the most likely failure we recommend a &VIDIOC-DBG-G-CHIP-IDENT; +catch the most likely failure we recommend a &VIDIOC-DBG-G-CHIP-INFO; call confirming the presence of the selected &i2c; chip.</para> </note> <para>These ioctls are optional, not all drivers may support them. However when a driver supports these ioctls it must also support -&VIDIOC-DBG-G-CHIP-IDENT;. Conversely it may support -<constant>VIDIOC_DBG_G_CHIP_IDENT</constant> but not these ioctls.</para> +&VIDIOC-DBG-G-CHIP-INFO;. Conversely it may support +<constant>VIDIOC_DBG_G_CHIP_INFO</constant> but not these ioctls.</para> <para><constant>VIDIOC_DBG_G_REGISTER</constant> and <constant>VIDIOC_DBG_S_REGISTER</constant> were introduced in Linux @@ -217,10 +221,10 @@ register.</entry> &cs-def; <tbody valign="top"> <row> - <entry><constant>V4L2_CHIP_MATCH_HOST</constant></entry> + <entry><constant>V4L2_CHIP_MATCH_BRIDGE</constant></entry> <entry>0</entry> <entry>Match the nth chip on the card, zero for the - host chip. Does not match &i2c; chips.</entry> + bridge chip. Does not match sub-devices.</entry> </row> <row> <entry><constant>V4L2_CHIP_MATCH_I2C_DRIVER</constant></entry> @@ -237,6 +241,11 @@ register.</entry> <entry>3</entry> <entry>Match the nth anciliary AC97 chip.</entry> </row> + <row> + <entry><constant>V4L2_CHIP_MATCH_SUBDEV</constant></entry> + <entry>4</entry> + <entry>Match the nth sub-device.</entry> + </row> </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/vidioc-enum-dv-presets.xml b/Documentation/DocBook/media/v4l/vidioc-enum-dv-presets.xml deleted file mode 100644 index fced5fb0dbf0..000000000000 --- a/Documentation/DocBook/media/v4l/vidioc-enum-dv-presets.xml +++ /dev/null @@ -1,240 +0,0 @@ -<refentry id="vidioc-enum-dv-presets"> - <refmeta> - <refentrytitle>ioctl VIDIOC_ENUM_DV_PRESETS</refentrytitle> - &manvol; - </refmeta> - - <refnamediv> - <refname>VIDIOC_ENUM_DV_PRESETS</refname> - <refpurpose>Enumerate supported Digital Video presets</refpurpose> - </refnamediv> - - <refsynopsisdiv> - <funcsynopsis> - <funcprototype> - <funcdef>int <function>ioctl</function></funcdef> - <paramdef>int <parameter>fd</parameter></paramdef> - <paramdef>int <parameter>request</parameter></paramdef> - <paramdef>struct v4l2_dv_enum_preset *<parameter>argp</parameter></paramdef> - </funcprototype> - </funcsynopsis> - </refsynopsisdiv> - - <refsect1> - <title>Arguments</title> - - <variablelist> - <varlistentry> - <term><parameter>fd</parameter></term> - <listitem> - <para>&fd;</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>request</parameter></term> - <listitem> - <para>VIDIOC_ENUM_DV_PRESETS</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>argp</parameter></term> - <listitem> - <para></para> - </listitem> - </varlistentry> - </variablelist> - </refsect1> - - <refsect1> - <title>Description</title> - - <para>This ioctl is <emphasis role="bold">deprecated</emphasis>. - New drivers and applications should use &VIDIOC-ENUM-DV-TIMINGS; instead. - </para> - - <para>To query the attributes of a DV preset, applications initialize the -<structfield>index</structfield> field and zero the reserved array of &v4l2-dv-enum-preset; -and call the <constant>VIDIOC_ENUM_DV_PRESETS</constant> ioctl with a pointer to this -structure. Drivers fill the rest of the structure or return an -&EINVAL; when the index is out of bounds. To enumerate all DV Presets supported, -applications shall begin at index zero, incrementing by one until the -driver returns <errorcode>EINVAL</errorcode>. Drivers may enumerate a -different set of DV presets after switching the video input or -output.</para> - - <table pgwide="1" frame="none" id="v4l2-dv-enum-preset"> - <title>struct <structname>v4l2_dv_enum_presets</structname></title> - <tgroup cols="3"> - &cs-str; - <tbody valign="top"> - <row> - <entry>__u32</entry> - <entry><structfield>index</structfield></entry> - <entry>Number of the DV preset, set by the -application.</entry> - </row> - <row> - <entry>__u32</entry> - <entry><structfield>preset</structfield></entry> - <entry>This field identifies one of the DV preset values listed in <xref linkend="v4l2-dv-presets-vals"/>.</entry> - </row> - <row> - <entry>__u8</entry> - <entry><structfield>name</structfield>[24]</entry> - <entry>Name of the preset, a NUL-terminated ASCII string, for example: "720P-60", "1080I-60". This information is -intended for the user.</entry> - </row> - <row> - <entry>__u32</entry> - <entry><structfield>width</structfield></entry> - <entry>Width of the active video in pixels for the DV preset.</entry> - </row> - <row> - <entry>__u32</entry> - <entry><structfield>height</structfield></entry> - <entry>Height of the active video in lines for the DV preset.</entry> - </row> - <row> - <entry>__u32</entry> - <entry><structfield>reserved</structfield>[4]</entry> - <entry>Reserved for future extensions. Drivers must set the array to zero.</entry> - </row> - </tbody> - </tgroup> - </table> - - <table pgwide="1" frame="none" id="v4l2-dv-presets-vals"> - <title>struct <structname>DV Presets</structname></title> - <tgroup cols="3"> - &cs-str; - <tbody valign="top"> - <row> - <entry>Preset</entry> - <entry>Preset value</entry> - <entry>Description</entry> - </row> - <row> - <entry></entry> - <entry></entry> - <entry></entry> - </row> - <row> - <entry>V4L2_DV_INVALID</entry> - <entry>0</entry> - <entry>Invalid preset value.</entry> - </row> - <row> - <entry>V4L2_DV_480P59_94</entry> - <entry>1</entry> - <entry>720x480 progressive video at 59.94 fps as per BT.1362.</entry> - </row> - <row> - <entry>V4L2_DV_576P50</entry> - <entry>2</entry> - <entry>720x576 progressive video at 50 fps as per BT.1362.</entry> - </row> - <row> - <entry>V4L2_DV_720P24</entry> - <entry>3</entry> - <entry>1280x720 progressive video at 24 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_720P25</entry> - <entry>4</entry> - <entry>1280x720 progressive video at 25 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_720P30</entry> - <entry>5</entry> - <entry>1280x720 progressive video at 30 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_720P50</entry> - <entry>6</entry> - <entry>1280x720 progressive video at 50 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_720P59_94</entry> - <entry>7</entry> - <entry>1280x720 progressive video at 59.94 fps as per SMPTE 274M.</entry> - </row> - <row> - <entry>V4L2_DV_720P60</entry> - <entry>8</entry> - <entry>1280x720 progressive video at 60 fps as per SMPTE 274M/296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080I29_97</entry> - <entry>9</entry> - <entry>1920x1080 interlaced video at 29.97 fps as per BT.1120/SMPTE 274M.</entry> - </row> - <row> - <entry>V4L2_DV_1080I30</entry> - <entry>10</entry> - <entry>1920x1080 interlaced video at 30 fps as per BT.1120/SMPTE 274M.</entry> - </row> - <row> - <entry>V4L2_DV_1080I25</entry> - <entry>11</entry> - <entry>1920x1080 interlaced video at 25 fps as per BT.1120.</entry> - </row> - <row> - <entry>V4L2_DV_1080I50</entry> - <entry>12</entry> - <entry>1920x1080 interlaced video at 50 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080I60</entry> - <entry>13</entry> - <entry>1920x1080 interlaced video at 60 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080P24</entry> - <entry>14</entry> - <entry>1920x1080 progressive video at 24 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080P25</entry> - <entry>15</entry> - <entry>1920x1080 progressive video at 25 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080P30</entry> - <entry>16</entry> - <entry>1920x1080 progressive video at 30 fps as per SMPTE 296M.</entry> - </row> - <row> - <entry>V4L2_DV_1080P50</entry> - <entry>17</entry> - <entry>1920x1080 progressive video at 50 fps as per BT.1120.</entry> - </row> - <row> - <entry>V4L2_DV_1080P60</entry> - <entry>18</entry> - <entry>1920x1080 progressive video at 60 fps as per BT.1120.</entry> - </row> - </tbody> - </tgroup> - </table> - </refsect1> - - <refsect1> - &return-value; - - <variablelist> - <varlistentry> - <term><errorcode>EINVAL</errorcode></term> - <listitem> - <para>The &v4l2-dv-enum-preset; <structfield>index</structfield> -is out of bounds.</para> - </listitem> - </varlistentry> - <varlistentry> - <term><errorcode>ENODATA</errorcode></term> - <listitem> - <para>Digital video presets are not supported for this input or output.</para> - </listitem> - </varlistentry> - </variablelist> - </refsect1> -</refentry> diff --git a/Documentation/DocBook/media/v4l/vidioc-enuminput.xml b/Documentation/DocBook/media/v4l/vidioc-enuminput.xml index 3c9a81305ad4..493a39a8ef21 100644 --- a/Documentation/DocBook/media/v4l/vidioc-enuminput.xml +++ b/Documentation/DocBook/media/v4l/vidioc-enuminput.xml @@ -278,11 +278,6 @@ input/output interface to linux-media@vger.kernel.org on 19 Oct 2009. &cs-def; <tbody valign="top"> <row> - <entry><constant>V4L2_IN_CAP_PRESETS</constant></entry> - <entry>0x00000001</entry> - <entry>This input supports setting DV presets by using VIDIOC_S_DV_PRESET.</entry> - </row> - <row> <entry><constant>V4L2_IN_CAP_DV_TIMINGS</constant></entry> <entry>0x00000002</entry> <entry>This input supports setting video timings by using VIDIOC_S_DV_TIMINGS.</entry> diff --git a/Documentation/DocBook/media/v4l/vidioc-enumoutput.xml b/Documentation/DocBook/media/v4l/vidioc-enumoutput.xml index f4ab0798545d..2654e097df39 100644 --- a/Documentation/DocBook/media/v4l/vidioc-enumoutput.xml +++ b/Documentation/DocBook/media/v4l/vidioc-enumoutput.xml @@ -163,11 +163,6 @@ input/output interface to linux-media@vger.kernel.org on 19 Oct 2009. &cs-def; <tbody valign="top"> <row> - <entry><constant>V4L2_OUT_CAP_PRESETS</constant></entry> - <entry>0x00000001</entry> - <entry>This output supports setting DV presets by using VIDIOC_S_DV_PRESET.</entry> - </row> - <row> <entry><constant>V4L2_OUT_CAP_DV_TIMINGS</constant></entry> <entry>0x00000002</entry> <entry>This output supports setting video timings by using VIDIOC_S_DV_TIMINGS.</entry> diff --git a/Documentation/DocBook/media/v4l/vidioc-g-dv-preset.xml b/Documentation/DocBook/media/v4l/vidioc-g-dv-preset.xml deleted file mode 100644 index b9ea37634f6c..000000000000 --- a/Documentation/DocBook/media/v4l/vidioc-g-dv-preset.xml +++ /dev/null @@ -1,113 +0,0 @@ -<refentry id="vidioc-g-dv-preset"> - <refmeta> - <refentrytitle>ioctl VIDIOC_G_DV_PRESET, VIDIOC_S_DV_PRESET</refentrytitle> - &manvol; - </refmeta> - - <refnamediv> - <refname>VIDIOC_G_DV_PRESET</refname> - <refname>VIDIOC_S_DV_PRESET</refname> - <refpurpose>Query or select the DV preset of the current input or output</refpurpose> - </refnamediv> - - <refsynopsisdiv> - <funcsynopsis> - <funcprototype> - <funcdef>int <function>ioctl</function></funcdef> - <paramdef>int <parameter>fd</parameter></paramdef> - <paramdef>int <parameter>request</parameter></paramdef> - <paramdef>struct v4l2_dv_preset *<parameter>argp</parameter></paramdef> - </funcprototype> - </funcsynopsis> - </refsynopsisdiv> - - <refsect1> - <title>Arguments</title> - - <variablelist> - <varlistentry> - <term><parameter>fd</parameter></term> - <listitem> - <para>&fd;</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>request</parameter></term> - <listitem> - <para>VIDIOC_G_DV_PRESET, VIDIOC_S_DV_PRESET</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>argp</parameter></term> - <listitem> - <para></para> - </listitem> - </varlistentry> - </variablelist> - </refsect1> - - <refsect1> - <title>Description</title> - - <para>These ioctls are <emphasis role="bold">deprecated</emphasis>. - New drivers and applications should use &VIDIOC-G-DV-TIMINGS; and &VIDIOC-S-DV-TIMINGS; - instead. - </para> - - <para>To query and select the current DV preset, applications -use the <constant>VIDIOC_G_DV_PRESET</constant> and <constant>VIDIOC_S_DV_PRESET</constant> -ioctls which take a pointer to a &v4l2-dv-preset; type as argument. -Applications must zero the reserved array in &v4l2-dv-preset;. -<constant>VIDIOC_G_DV_PRESET</constant> returns a dv preset in the field -<structfield>preset</structfield> of &v4l2-dv-preset;.</para> - - <para><constant>VIDIOC_S_DV_PRESET</constant> accepts a pointer to a &v4l2-dv-preset; -that has the preset value to be set. Applications must zero the reserved array in &v4l2-dv-preset;. -If the preset is not supported, it returns an &EINVAL; </para> - </refsect1> - - <refsect1> - &return-value; - - <variablelist> - <varlistentry> - <term><errorcode>EINVAL</errorcode></term> - <listitem> - <para>This ioctl is not supported, or the -<constant>VIDIOC_S_DV_PRESET</constant>,<constant>VIDIOC_S_DV_PRESET</constant> parameter was unsuitable.</para> - </listitem> - </varlistentry> - <varlistentry> - <term><errorcode>ENODATA</errorcode></term> - <listitem> - <para>Digital video presets are not supported for this input or output.</para> - </listitem> - </varlistentry> - <varlistentry> - <term><errorcode>EBUSY</errorcode></term> - <listitem> - <para>The device is busy and therefore can not change the preset.</para> - </listitem> - </varlistentry> - </variablelist> - - <table pgwide="1" frame="none" id="v4l2-dv-preset"> - <title>struct <structname>v4l2_dv_preset</structname></title> - <tgroup cols="3"> - &cs-str; - <tbody valign="top"> - <row> - <entry>__u32</entry> - <entry><structfield>preset</structfield></entry> - <entry>Preset value to represent the digital video timings</entry> - </row> - <row> - <entry>__u32</entry> - <entry><structfield>reserved[4]</structfield></entry> - <entry>Reserved fields for future use</entry> - </row> - </tbody> - </tgroup> - </table> - </refsect1> -</refentry> diff --git a/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml b/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml index 4e16112df992..b3bb9575b2e0 100644 --- a/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml +++ b/Documentation/DocBook/media/v4l/vidioc-g-ext-ctrls.xml @@ -319,6 +319,15 @@ These controls are described in <xref processing controls. These controls are described in <xref linkend="image-process-controls" />.</entry> </row> + + <row> + <entry><constant>V4L2_CTRL_CLASS_FM_RX</constant></entry> + <entry>0xa10000</entry> + <entry>The class containing FM Receiver (FM RX) controls. +These controls are described in <xref + linkend="fm-rx-controls" />.</entry> + </row> + </tbody> </tgroup> </table> diff --git a/Documentation/DocBook/media/v4l/vidioc-query-dv-preset.xml b/Documentation/DocBook/media/v4l/vidioc-query-dv-preset.xml deleted file mode 100644 index 68b49d09e245..000000000000 --- a/Documentation/DocBook/media/v4l/vidioc-query-dv-preset.xml +++ /dev/null @@ -1,78 +0,0 @@ -<refentry id="vidioc-query-dv-preset"> - <refmeta> - <refentrytitle>ioctl VIDIOC_QUERY_DV_PRESET</refentrytitle> - &manvol; - </refmeta> - - <refnamediv> - <refname>VIDIOC_QUERY_DV_PRESET</refname> - <refpurpose>Sense the DV preset received by the current -input</refpurpose> - </refnamediv> - - <refsynopsisdiv> - <funcsynopsis> - <funcprototype> - <funcdef>int <function>ioctl</function></funcdef> - <paramdef>int <parameter>fd</parameter></paramdef> - <paramdef>int <parameter>request</parameter></paramdef> - <paramdef>struct v4l2_dv_preset *<parameter>argp</parameter></paramdef> - </funcprototype> - </funcsynopsis> - </refsynopsisdiv> - - <refsect1> - <title>Arguments</title> - - <variablelist> - <varlistentry> - <term><parameter>fd</parameter></term> - <listitem> - <para>&fd;</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>request</parameter></term> - <listitem> - <para>VIDIOC_QUERY_DV_PRESET</para> - </listitem> - </varlistentry> - <varlistentry> - <term><parameter>argp</parameter></term> - <listitem> - <para></para> - </listitem> - </varlistentry> - </variablelist> - </refsect1> - - <refsect1> - <title>Description</title> - - <para>This ioctl is <emphasis role="bold">deprecated</emphasis>. - New drivers and applications should use &VIDIOC-QUERY-DV-TIMINGS; instead. - </para> - - <para>The hardware may be able to detect the current DV preset -automatically, similar to sensing the video standard. To do so, applications -call <constant> VIDIOC_QUERY_DV_PRESET</constant> with a pointer to a -&v4l2-dv-preset; type. Once the hardware detects a preset, that preset is -returned in the preset field of &v4l2-dv-preset;. If the preset could not be -detected because there was no signal, or the signal was unreliable, or the -signal did not map to a supported preset, then the value V4L2_DV_INVALID is -returned.</para> - </refsect1> - - <refsect1> - &return-value; - - <variablelist> - <varlistentry> - <term><errorcode>ENODATA</errorcode></term> - <listitem> - <para>Digital video presets are not supported for this input or output.</para> - </listitem> - </varlistentry> - </variablelist> - </refsect1> -</refentry> diff --git a/Documentation/DocBook/media_api.tmpl b/Documentation/DocBook/media_api.tmpl index 1f6593deb995..6a8b7158697f 100644 --- a/Documentation/DocBook/media_api.tmpl +++ b/Documentation/DocBook/media_api.tmpl @@ -23,6 +23,7 @@ <!-- LinuxTV v4l-dvb repository. --> <!ENTITY v4l-dvb "<ulink url='http://linuxtv.org/repo/'>http://linuxtv.org/repo/</ulink>"> <!ENTITY dash-ent-10 "<entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry>"> +<!ENTITY dash-ent-16 "<entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry><entry>-</entry>"> ]> <book id="media_api"> diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 31ef8fe07f82..79e789b8b8ea 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -217,9 +217,14 @@ over a rather long period of time, but improvements are always welcome! whether the increased speed is worth it. 8. Although synchronize_rcu() is slower than is call_rcu(), it - usually results in simpler code. So, unless update performance - is critically important or the updaters cannot block, - synchronize_rcu() should be used in preference to call_rcu(). + usually results in simpler code. So, unless update performance is + critically important, the updaters cannot block, or the latency of + synchronize_rcu() is visible from userspace, synchronize_rcu() + should be used in preference to call_rcu(). Furthermore, + kfree_rcu() usually results in even simpler code than does + synchronize_rcu() without synchronize_rcu()'s multi-millisecond + latency. So please take advantage of kfree_rcu()'s "fire and + forget" memory-freeing capabilities where it applies. An especially important property of the synchronize_rcu() primitive is that it automatically self-limits: if grace periods @@ -268,7 +273,8 @@ over a rather long period of time, but improvements are always welcome! e. Periodically invoke synchronize_rcu(), permitting a limited number of updates per grace period. - The same cautions apply to call_rcu_bh() and call_rcu_sched(). + The same cautions apply to call_rcu_bh(), call_rcu_sched(), + call_srcu(), and kfree_rcu(). 9. All RCU list-traversal primitives, which include rcu_dereference(), list_for_each_entry_rcu(), and @@ -296,9 +302,9 @@ over a rather long period of time, but improvements are always welcome! all currently executing rcu_read_lock()-protected RCU read-side critical sections complete. It does -not- necessarily guarantee that all currently running interrupts, NMIs, preempt_disable() - code, or idle loops will complete. Therefore, if you do not have - rcu_read_lock()-protected read-side critical sections, do -not- - use synchronize_rcu(). + code, or idle loops will complete. Therefore, if your + read-side critical sections are protected by something other + than rcu_read_lock(), do -not- use synchronize_rcu(). Similarly, disabling preemption is not an acceptable substitute for rcu_read_lock(). Code that attempts to use preemption @@ -401,9 +407,9 @@ over a rather long period of time, but improvements are always welcome! read-side critical sections. It is the responsibility of the RCU update-side primitives to deal with this. -17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and - the __rcu sparse checks to validate your RCU code. These - can help find problems as follows: +17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the + __rcu sparse checks (enabled by CONFIG_SPARSE_RCU_POINTER) to + validate your RCU code. These can help find problems as follows: CONFIG_PROVE_RCU: check that accesses to RCU-protected data structures are carried out under the proper RCU diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt index a102d4b3724b..cd83d2348fef 100644 --- a/Documentation/RCU/lockdep.txt +++ b/Documentation/RCU/lockdep.txt @@ -64,6 +64,11 @@ checking of rcu_dereference() primitives: but retain the compiler constraints that prevent duplicating or coalescsing. This is useful when when testing the value of the pointer itself, for example, against NULL. + rcu_access_index(idx): + Return the value of the index and omit all barriers, but + retain the compiler constraints that prevent duplicating + or coalescsing. This is useful when when testing the + value of the index itself, for example, against -1. The rcu_dereference_check() check expression can be any boolean expression, but would normally include a lockdep expression. However, diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt index 38428c125135..2e319d1b9ef2 100644 --- a/Documentation/RCU/rcubarrier.txt +++ b/Documentation/RCU/rcubarrier.txt @@ -79,7 +79,20 @@ complete. Pseudo-code using rcu_barrier() is as follows: 2. Execute rcu_barrier(). 3. Allow the module to be unloaded. -The rcutorture module makes use of rcu_barrier in its exit function +There are also rcu_barrier_bh(), rcu_barrier_sched(), and srcu_barrier() +functions for the other flavors of RCU, and you of course must match +the flavor of rcu_barrier() with that of call_rcu(). If your module +uses multiple flavors of call_rcu(), then it must also use multiple +flavors of rcu_barrier() when unloading that module. For example, if +it uses call_rcu_bh(), call_srcu() on srcu_struct_1, and call_srcu() on +srcu_struct_2(), then the following three lines of code will be required +when unloading: + + 1 rcu_barrier_bh(); + 2 srcu_barrier(&srcu_struct_1); + 3 srcu_barrier(&srcu_struct_2); + +The rcutorture module makes use of rcu_barrier() in its exit function as follows: 1 static void diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 1927151b386b..e38b8df3d727 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -92,14 +92,14 @@ If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set, more information is printed with the stall-warning message, for example: INFO: rcu_preempt detected stall on CPU - 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 + 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543 (t=65000 jiffies) In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is printed: INFO: rcu_preempt detected stall on CPU - 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending + 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D (t=65000 jiffies) The "(64628 ticks this GP)" indicates that this CPU has taken more @@ -116,13 +116,28 @@ number between the two "/"s is the value of the nesting, which will be a small positive number if in the idle loop and a very large positive number (as shown above) otherwise. -For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is -not in the process of trying to force itself into dyntick-idle state, the -"." indicates that the CPU has not given up forcing RCU into dyntick-idle -mode (it would be "H" otherwise), and the "timer not pending" indicates -that the CPU has not recently forced RCU into dyntick-idle mode (it -would otherwise indicate the number of microseconds remaining in this -forced state). +The "softirq=" portion of the message tracks the number of RCU softirq +handlers that the stalled CPU has executed. The number before the "/" +is the number that had executed since boot at the time that this CPU +last noted the beginning of a grace period, which might be the current +(stalled) grace period, or it might be some earlier grace period (for +example, if the CPU might have been in dyntick-idle mode for an extended +time period. The number after the "/" is the number that have executed +since boot until the current time. If this latter number stays constant +across repeated stall-warning messages, it is possible that RCU's softirq +handlers are no longer able to execute on this CPU. This can happen if +the stalled CPU is spinning with interrupts are disabled, or, in -rt +kernels, if a high-priority process is starving RCU's softirq handler. + +For CONFIG_RCU_FAST_NO_HZ kernels, the "last_accelerate:" prints the +low-order 16 bits (in hex) of the jiffies counter when this CPU last +invoked rcu_try_advance_all_cbs() from rcu_needs_cpu() or last invoked +rcu_accelerate_cbs() from rcu_prepare_for_idle(). The "nonlazy_posted:" +prints the number of non-lazy callbacks posted since the last call to +rcu_needs_cpu(). Finally, an "L" indicates that there are currently +no non-lazy callbacks ("." is printed otherwise, as shown above) and +"D" indicates that dyntick-idle processing is enabled ("." is printed +otherwise, for example, if disabled via the "nohz=" kernel boot parameter). Multiple Warnings From One Stall diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 0cc7820967f4..10df0b82f459 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -265,9 +265,9 @@ rcu_dereference() rcu_read_lock(); p = rcu_dereference(head.next); rcu_read_unlock(); - x = p->address; + x = p->address; /* BUG!!! */ rcu_read_lock(); - y = p->data; + y = p->data; /* BUG!!! */ rcu_read_unlock(); Holding a reference from one RCU read-side critical section diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index aa0c1e63f050..6e97e73d87b5 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -420,7 +420,7 @@ person it names. This tag documents that potentially interested parties have been included in the discussion -14) Using Reported-by:, Tested-by: and Reviewed-by: +14) Using Reported-by:, Tested-by:, Reviewed-by: and Suggested-by: If this patch fixes a problem reported by somebody else, consider adding a Reported-by: tag to credit the reporter for their contribution. Please @@ -468,6 +468,13 @@ done on the patch. Reviewed-by: tags, when supplied by reviewers known to understand the subject area and to perform thorough reviews, will normally increase the likelihood of your patch getting into the kernel. +A Suggested-by: tag indicates that the patch idea is suggested by the person +named and ensures credit to the person for the idea. Please note that this +tag should not be added without the reporter's permission, especially if the +idea was not posted in a public forum. That said, if we diligently credit our +idea reporters, they will, hopefully, be inspired to help us again in the +future. + 15) The canonical patch format diff --git a/Documentation/arm/sunxi/clocks.txt b/Documentation/arm/sunxi/clocks.txt new file mode 100644 index 000000000000..e09a88aa3136 --- /dev/null +++ b/Documentation/arm/sunxi/clocks.txt @@ -0,0 +1,56 @@ +Frequently asked questions about the sunxi clock system +======================================================= + +This document contains useful bits of information that people tend to ask +about the sunxi clock system, as well as accompanying ASCII art when adequate. + +Q: Why is the main 24MHz oscillator gatable? Wouldn't that break the + system? + +A: The 24MHz oscillator allows gating to save power. Indeed, if gated + carelessly the system would stop functioning, but with the right + steps, one can gate it and keep the system running. Consider this + simplified suspend example: + + While the system is operational, you would see something like + + 24MHz 32kHz + | + PLL1 + \ + \_ CPU Mux + | + [CPU] + + When you are about to suspend, you switch the CPU Mux to the 32kHz + oscillator: + + 24Mhz 32kHz + | | + PLL1 | + / + CPU Mux _/ + | + [CPU] + + Finally you can gate the main oscillator + + 32kHz + | + | + / + CPU Mux _/ + | + [CPU] + +Q: Were can I learn more about the sunxi clocks? + +A: The linux-sunxi wiki contains a page documenting the clock registers, + you can find it at + + http://linux-sunxi.org/A10/CCM + + The authoritative source for information at this time is the ccmu driver + released by Allwinner, you can find it at + + https://github.com/linux-sunxi/linux-sunxi/tree/sunxi-3.0/arch/arm/mach-sun4i/clock/ccmu diff --git a/Documentation/backlight/lp855x-driver.txt b/Documentation/backlight/lp855x-driver.txt index 18b06ca038ea..1c732f0c6758 100644 --- a/Documentation/backlight/lp855x-driver.txt +++ b/Documentation/backlight/lp855x-driver.txt @@ -32,14 +32,10 @@ Platform data for lp855x For supporting platform specific data, the lp855x platform data can be used. * name : Backlight driver name. If it is not defined, default name is set. -* mode : Brightness control mode. PWM or register based. * device_control : Value of DEVICE CONTROL register. * initial_brightness : Initial value of backlight brightness. * period_ns : Platform specific PWM period value. unit is nano. Only valid when brightness is pwm input mode. -* load_new_rom_data : - 0 : use default configuration data - 1 : update values of eeprom or eprom registers on loading driver * size_program : Total size of lp855x_rom_data. * rom_data : List of new eeprom/eprom registers. @@ -54,10 +50,8 @@ static struct lp855x_rom_data lp8552_eeprom_arr[] = { static struct lp855x_platform_data lp8552_pdata = { .name = "lcd-bl", - .mode = REGISTER_BASED, .device_control = I2C_CONFIG(LP8552), .initial_brightness = INITIAL_BRT, - .load_new_rom_data = 1, .size_program = ARRAY_SIZE(lp8552_eeprom_arr), .rom_data = lp8552_eeprom_arr, }; @@ -65,7 +59,6 @@ static struct lp855x_platform_data lp8552_pdata = { example 2) lp8556 platform data : pwm input mode with default rom data static struct lp855x_platform_data lp8556_pdata = { - .mode = PWM_BASED, .device_control = PWM_CONFIG(LP8556), .initial_brightness = INITIAL_BRT, .period_ns = 1000000, diff --git a/Documentation/cgroups/00-INDEX b/Documentation/cgroups/00-INDEX index f5635a09c3f6..bc461b6425a7 100644 --- a/Documentation/cgroups/00-INDEX +++ b/Documentation/cgroups/00-INDEX @@ -18,6 +18,8 @@ memcg_test.txt - Memory Resource Controller; implementation details. memory.txt - Memory Resource Controller; design, accounting, interface, testing. +net_cls.txt + - Network classifier cgroups details and usages. net_prio.txt - Network priority cgroups details and usages. resource_counter.txt diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt index bcf1a00b06a1..638bf17ff869 100644 --- a/Documentation/cgroups/cgroups.txt +++ b/Documentation/cgroups/cgroups.txt @@ -442,7 +442,7 @@ You can attach the current shell task by echoing 0: You can use the cgroup.procs file instead of the tasks file to move all threads in a threadgroup at once. Echoing the PID of any task in a threadgroup to cgroup.procs causes all tasks in that threadgroup to be -be attached to the cgroup. Writing 0 to cgroup.procs moves all tasks +attached to the cgroup. Writing 0 to cgroup.procs moves all tasks in the writing task's threadgroup. Note: Since every task is always a member of exactly one cgroup in each @@ -580,6 +580,7 @@ propagation along the hierarchy. See the comment on cgroup_for_each_descendant_pre() for details. void css_offline(struct cgroup *cgrp); +(cgroup_mutex held by caller) This is the counterpart of css_online() and called iff css_online() has succeeded on @cgrp. This signifies the beginning of the end of diff --git a/Documentation/cgroups/devices.txt b/Documentation/cgroups/devices.txt index 16624a7f8222..3c1095ca02ea 100644 --- a/Documentation/cgroups/devices.txt +++ b/Documentation/cgroups/devices.txt @@ -13,9 +13,7 @@ either an integer or * for all. Access is a composition of r The root device cgroup starts with rwm to 'all'. A child device cgroup gets a copy of the parent. Administrators can then remove devices from the whitelist or add new entries. A child cgroup can -never receive a device access which is denied by its parent. However -when a device access is removed from a parent it will not also be -removed from the child(ren). +never receive a device access which is denied by its parent. 2. User Interface @@ -50,3 +48,69 @@ task to a new cgroup. (Again we'll probably want to change that). A cgroup may not be granted more permissions than the cgroup's parent has. + +4. Hierarchy + +device cgroups maintain hierarchy by making sure a cgroup never has more +access permissions than its parent. Every time an entry is written to +a cgroup's devices.deny file, all its children will have that entry removed +from their whitelist and all the locally set whitelist entries will be +re-evaluated. In case one of the locally set whitelist entries would provide +more access than the cgroup's parent, it'll be removed from the whitelist. + +Example: + A + / \ + B + + group behavior exceptions + A allow "b 8:* rwm", "c 116:1 rw" + B deny "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm" + +If a device is denied in group A: + # echo "c 116:* r" > A/devices.deny +it'll propagate down and after revalidating B's entries, the whitelist entry +"c 116:2 rwm" will be removed: + + group whitelist entries denied devices + A all "b 8:* rwm", "c 116:* rw" + B "c 1:3 rwm", "b 3:* rwm" all the rest + +In case parent's exceptions change and local exceptions are not allowed +anymore, they'll be deleted. + +Notice that new whitelist entries will not be propagated: + A + / \ + B + + group whitelist entries denied devices + A "c 1:3 rwm", "c 1:5 r" all the rest + B "c 1:3 rwm", "c 1:5 r" all the rest + +when adding "c *:3 rwm": + # echo "c *:3 rwm" >A/devices.allow + +the result: + group whitelist entries denied devices + A "c *:3 rwm", "c 1:5 r" all the rest + B "c 1:3 rwm", "c 1:5 r" all the rest + +but now it'll be possible to add new entries to B: + # echo "c 2:3 rwm" >B/devices.allow + # echo "c 50:3 r" >B/devices.allow +or even + # echo "c *:3 rwm" >B/devices.allow + +Allowing or denying all by writing 'a' to devices.allow or devices.deny will +not be possible once the device cgroups has children. + +4.1 Hierarchy (internal implementation) + +device cgroups is implemented internally using a behavior (ALLOW, DENY) and a +list of exceptions. The internal state is controlled using the same user +interface to preserve compatibility with the previous whitelist-only +implementation. Removal or addition of exceptions that will reduce the access +to devices will be propagated down the hierarchy. +For every propagated exception, the effective rules will be re-evaluated based +on current parent's access rules. diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index 8b8c28b9864c..09027a9fece5 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -40,6 +40,7 @@ Features: - soft limit - moving (recharging) account at moving a task is selectable. - usage threshold notifier + - memory pressure notifier - oom-killer disable knob and oom-notifier - Root cgroup has no limit controls. @@ -65,6 +66,7 @@ Brief summary of control files. memory.stat # show various statistics memory.use_hierarchy # set/show hierarchical account enabled memory.force_empty # trigger forced move charge to parent + memory.pressure_level # set memory pressure notifications memory.swappiness # set/show swappiness parameter of vmscan (See sysctl's vm.swappiness) memory.move_charge_at_immigrate # set/show controls of moving charges @@ -194,7 +196,7 @@ the cgroup that brought it in -- this will happen on memory pressure). But see section 8.2: when moving a task to another cgroup, its pages may be recharged to the new cgroup, if move_charge_at_immigrate has been chosen. -Exception: If CONFIG_CGROUP_CGROUP_MEMCG_SWAP is not used. +Exception: If CONFIG_MEMCG_SWAP is not used. When you do swapoff and make swapped-out pages of shmem(tmpfs) to be backed into memory in force, charges for pages are accounted against the caller of swapoff rather than the users of shmem. @@ -762,7 +764,73 @@ At reading, current status of OOM is shown. under_oom 0 or 1 (if 1, the memory cgroup is under OOM, tasks may be stopped.) -11. TODO +11. Memory Pressure + +The pressure level notifications can be used to monitor the memory +allocation cost; based on the pressure, applications can implement +different strategies of managing their memory resources. The pressure +levels are defined as following: + +The "low" level means that the system is reclaiming memory for new +allocations. Monitoring this reclaiming activity might be useful for +maintaining cache level. Upon notification, the program (typically +"Activity Manager") might analyze vmstat and act in advance (i.e. +prematurely shutdown unimportant services). + +The "medium" level means that the system is experiencing medium memory +pressure, the system might be making swap, paging out active file caches, +etc. Upon this event applications may decide to further analyze +vmstat/zoneinfo/memcg or internal memory usage statistics and free any +resources that can be easily reconstructed or re-read from a disk. + +The "critical" level means that the system is actively thrashing, it is +about to out of memory (OOM) or even the in-kernel OOM killer is on its +way to trigger. Applications should do whatever they can to help the +system. It might be too late to consult with vmstat or any other +statistics, so it's advisable to take an immediate action. + +The events are propagated upward until the event is handled, i.e. the +events are not pass-through. Here is what this means: for example you have +three cgroups: A->B->C. Now you set up an event listener on cgroups A, B +and C, and suppose group C experiences some pressure. In this situation, +only group C will receive the notification, i.e. groups A and B will not +receive it. This is done to avoid excessive "broadcasting" of messages, +which disturbs the system and which is especially bad if we are low on +memory or thrashing. So, organize the cgroups wisely, or propagate the +events manually (or, ask us to implement the pass-through events, +explaining why would you need them.) + +The file memory.pressure_level is only used to setup an eventfd. To +register a notification, an application must: + +- create an eventfd using eventfd(2); +- open memory.pressure_level; +- write string like "<event_fd> <fd of memory.pressure_level> <level>" + to cgroup.event_control. + +Application will be notified through eventfd when memory pressure is at +the specific level (or higher). Read/write operations to +memory.pressure_level are no implemented. + +Test: + + Here is a small script example that makes a new cgroup, sets up a + memory limit, sets up a notification in the cgroup and then makes child + cgroup experience a critical pressure: + + # cd /sys/fs/cgroup/memory/ + # mkdir foo + # cd foo + # cgroup_event_listener memory.pressure_level low & + # echo 8000000 > memory.limit_in_bytes + # echo 8000000 > memory.memsw.limit_in_bytes + # echo $$ > tasks + # dd if=/dev/zero | read x + + (Expect a bunch of notifications, and eventually, the oom-killer will + trigger.) + +12. TODO 1. Add support for accounting huge pages (as a separate controller) 2. Make per-cgroup scanner reclaim not-shared pages first diff --git a/Documentation/cgroups/net_cls.txt b/Documentation/cgroups/net_cls.txt new file mode 100644 index 000000000000..9face6bb578a --- /dev/null +++ b/Documentation/cgroups/net_cls.txt @@ -0,0 +1,34 @@ +Network classifier cgroup +------------------------- + +The Network classifier cgroup provides an interface to +tag network packets with a class identifier (classid). + +The Traffic Controller (tc) can be used to assign +different priorities to packets from different cgroups. + +Creating a net_cls cgroups instance creates a net_cls.classid file. +This net_cls.classid value is initialized to 0. + +You can write hexadecimal values to net_cls.classid; the format for these +values is 0xAAAABBBB; AAAA is the major handle number and BBBB +is the minor handle number. +Reading net_cls.classid yields a decimal result. + +Example: +mkdir /sys/fs/cgroup/net_cls +mount -t cgroup -onet_cls net_cls /sys/fs/cgroup/net_cls +mkdir /sys/fs/cgroup/net_cls/0 +echo 0x100001 > /sys/fs/cgroup/net_cls/0/net_cls.classid + - setting a 10:1 handle. + +cat /sys/fs/cgroup/net_cls/0/net_cls.classid +1048577 + +configuring tc: +tc qdisc add dev eth0 root handle 10: htb + +tc class add dev eth0 parent 10: classid 10:1 htb rate 40mbit + - creating traffic class 10:1 + +tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup diff --git a/Documentation/clk.txt b/Documentation/clk.txt index 1943fae014fd..b9911c27f496 100644 --- a/Documentation/clk.txt +++ b/Documentation/clk.txt @@ -174,9 +174,9 @@ int clk_foo_enable(struct clk_hw *hw) }; Below is a matrix detailing which clk_ops are mandatory based upon the -hardware capbilities of that clock. A cell marked as "y" means +hardware capabilities of that clock. A cell marked as "y" means mandatory, a cell marked as "n" implies that either including that -callback is invalid or otherwise uneccesary. Empty cells are either +callback is invalid or otherwise unnecessary. Empty cells are either optional or must be evaluated on a case-by-case basis. clock hardware characteristics @@ -231,3 +231,14 @@ To better enforce this policy, always follow this simple rule: any statically initialized clock data MUST be defined in a separate file from the logic that implements its ops. Basically separate the logic from the data and all is well. + + Part 6 - Disabling clock gating of unused clocks + +Sometimes during development it can be useful to be able to bypass the +default disabling of unused clocks. For example, if drivers aren't enabling +clocks properly but rely on them being on from the bootloader, bypassing +the disabling means that the driver will remain functional while the issues +are sorted out. + +To bypass this disabling, include "clk_ignore_unused" in the bootargs to the +kernel. diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt index 72f70b16d299..a3585eac83b6 100644 --- a/Documentation/cpu-freq/cpu-drivers.txt +++ b/Documentation/cpu-freq/cpu-drivers.txt @@ -108,8 +108,9 @@ policy->governor must contain the "default policy" for cpufreq_driver.target is called with these values. -For setting some of these values, the frequency table helpers might be -helpful. See the section 2 for more information on them. +For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the +frequency table helpers might be helpful. See the section 2 for more information +on them. SMP systems normally have same clock source for a group of cpus. For these the .init() would be called only once for the first online cpu. Here the .init() @@ -184,10 +185,10 @@ the reference implementation in drivers/cpufreq/longrun.c As most cpufreq processors only allow for being set to a few specific frequencies, a "frequency table" with some functions might assist in some work of the processor driver. Such a "frequency table" consists -of an array of struct cpufreq_freq_table entries, with any value in +of an array of struct cpufreq_frequency_table entries, with any value in "index" you want to use, and the corresponding frequency in "frequency". At the end of the table, you need to add a -cpufreq_freq_table entry with frequency set to CPUFREQ_TABLE_END. And +cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END. And if you want to skip one entry in the table, set the frequency to CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending order. diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index c7a2eb8450c2..66f9cc310686 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt @@ -167,6 +167,27 @@ of load evaluation and helping the CPU stay at its top speed when truly busy, rather than shifting back and forth in speed. This tunable has no effect on behavior at lower speeds/lower CPU loads. +powersave_bias: this parameter takes a value between 0 to 1000. It +defines the percentage (times 10) value of the target frequency that +will be shaved off of the target. For example, when set to 100 -- 10%, +when ondemand governor would have targeted 1000 MHz, it will target +1000 MHz - (10% of 1000 MHz) = 900 MHz instead. This is set to 0 +(disabled) by default. +When AMD frequency sensitivity powersave bias driver -- +drivers/cpufreq/amd_freq_sensitivity.c is loaded, this parameter +defines the workload frequency sensitivity threshold in which a lower +frequency is chosen instead of ondemand governor's original target. +The frequency sensitivity is a hardware reported (on AMD Family 16h +Processors and above) value between 0 to 100% that tells software how +the performance of the workload running on a CPU will change when +frequency changes. A workload with sensitivity of 0% (memory/IO-bound) +will not perform any better on higher core frequency, whereas a +workload with sensitivity of 100% (CPU-bound) will perform better +higher the frequency. When the driver is loaded, this is set to 400 +by default -- for CPUs running workloads with sensitivity value below +40%, a lower frequency is chosen. Unloading the driver or writing 0 +will disable this feature. + 2.5 Conservative ---------------- @@ -191,6 +212,12 @@ governor but for the opposite direction. For example when set to its default value of '20' it means that if the CPU usage needs to be below 20% between samples to have the frequency decreased. +sampling_down_factor: similar functionality as in "ondemand" governor. +But in "conservative", it controls the rate at which the kernel makes +a decision on when to decrease the frequency while running in any +speed. Load for frequency increase is still evaluated every +sampling rate. + 3. The Governor Interface in the CPUfreq Core ============================================= diff --git a/Documentation/cpuidle/driver.txt b/Documentation/cpuidle/driver.txt index 7a9e09ece931..1b0d81d92583 100644 --- a/Documentation/cpuidle/driver.txt +++ b/Documentation/cpuidle/driver.txt @@ -15,11 +15,17 @@ has mechanisms in place to support actual entry-exit into CPU idle states. cpuidle driver initializes the cpuidle_device structure for each CPU device and registers with cpuidle using cpuidle_register_device. +If all the idle states are the same, the wrapper function cpuidle_register +could be used instead. + It can also support the dynamic changes (like battery <-> AC), by using cpuidle_pause_and_lock, cpuidle_disable_device and cpuidle_enable_device, cpuidle_resume_and_unlock. Interfaces: +extern int cpuidle_register(struct cpuidle_driver *drv, + const struct cpumask *const coupled_cpus); +extern int cpuidle_unregister(struct cpuidle_driver *drv); extern int cpuidle_register_driver(struct cpuidle_driver *drv); extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); extern int cpuidle_register_device(struct cpuidle_device *dev); diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt index b428556197c9..e9192283e5a5 100644 --- a/Documentation/device-mapper/dm-raid.txt +++ b/Documentation/device-mapper/dm-raid.txt @@ -1,10 +1,13 @@ dm-raid -------- +======= The device-mapper RAID (dm-raid) target provides a bridge from DM to MD. It allows the MD RAID drivers to be accessed using a device-mapper interface. + +Mapping Table Interface +----------------------- The target is named "raid" and it accepts the following parameters: <raid_type> <#raid_params> <raid_params> \ @@ -47,7 +50,7 @@ The target is named "raid" and it accepts the following parameters: followed by optional parameters (in any order): [sync|nosync] Force or prevent RAID initialization. - [rebuild <idx>] Rebuild drive number idx (first drive is 0). + [rebuild <idx>] Rebuild drive number 'idx' (first drive is 0). [daemon_sleep <ms>] Interval between runs of the bitmap daemon that @@ -56,9 +59,9 @@ The target is named "raid" and it accepts the following parameters: [min_recovery_rate <kB/sec/disk>] Throttle RAID initialization [max_recovery_rate <kB/sec/disk>] Throttle RAID initialization - [write_mostly <idx>] Drive index is write-mostly - [max_write_behind <sectors>] See '-write-behind=' (man mdadm) - [stripe_cache <sectors>] Stripe cache size (higher RAIDs only) + [write_mostly <idx>] Mark drive index 'idx' write-mostly. + [max_write_behind <sectors>] See '--write-behind=' (man mdadm) + [stripe_cache <sectors>] Stripe cache size (RAID 4/5/6 only) [region_size <sectors>] The region_size multiplied by the number of regions is the logical size of the array. The bitmap records the device @@ -122,7 +125,7 @@ The target is named "raid" and it accepts the following parameters: given for both the metadata and data drives for a given position. -Example tables +Example Tables -------------- # RAID4 - 4 data drives, 1 parity (no metadata devices) # No metadata devices specified to hold superblock/bitmap info @@ -141,26 +144,70 @@ Example tables raid4 4 2048 sync min_recovery_rate 20 \ 5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82 + +Status Output +------------- 'dmsetup table' displays the table used to construct the mapping. The optional parameters are always printed in the order listed above with "sync" or "nosync" always output ahead of the other arguments, regardless of the order used when originally loading the table. Arguments that can be repeated are ordered by value. -'dmsetup status' yields information on the state and health of the -array. -The output is as follows: + +'dmsetup status' yields information on the state and health of the array. +The output is as follows (normally a single line, but expanded here for +clarity): 1: <s> <l> raid \ -2: <raid_type> <#devices> <1 health char for each dev> <resync_ratio> +2: <raid_type> <#devices> <health_chars> \ +3: <sync_ratio> <sync_action> <mismatch_cnt> Line 1 is the standard output produced by device-mapper. -Line 2 is produced by the raid target, and best explained by example: - 0 1960893648 raid raid4 5 AAAAA 2/490221568 +Line 2 & 3 are produced by the raid target and are best explained by example: + 0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0 Here we can see the RAID type is raid4, there are 5 devices - all of -which are 'A'live, and the array is 2/490221568 complete with recovery. -Faulty or missing devices are marked 'D'. Devices that are out-of-sync -are marked 'a'. - +which are 'A'live, and the array is 2/490221568 complete with its initial +recovery. Here is a fuller description of the individual fields: + <raid_type> Same as the <raid_type> used to create the array. + <health_chars> One char for each device, indicating: 'A' = alive and + in-sync, 'a' = alive but not in-sync, 'D' = dead/failed. + <sync_ratio> The ratio indicating how much of the array has undergone + the process described by 'sync_action'. If the + 'sync_action' is "check" or "repair", then the process + of "resync" or "recover" can be considered complete. + <sync_action> One of the following possible states: + idle - No synchronization action is being performed. + frozen - The current action has been halted. + resync - Array is undergoing its initial synchronization + or is resynchronizing after an unclean shutdown + (possibly aided by a bitmap). + recover - A device in the array is being rebuilt or + replaced. + check - A user-initiated full check of the array is + being performed. All blocks are read and + checked for consistency. The number of + discrepancies found are recorded in + <mismatch_cnt>. No changes are made to the + array by this action. + repair - The same as "check", but discrepancies are + corrected. + reshape - The array is undergoing a reshape. + <mismatch_cnt> The number of discrepancies found between mirror copies + in RAID1/10 or wrong parity values found in RAID4/5/6. + This value is valid only after a "check" of the array + is performed. A healthy array has a 'mismatch_cnt' of 0. + +Message Interface +----------------- +The dm-raid target will accept certain actions through the 'message' interface. +('man dmsetup' for more information on the message interface.) These actions +include: + "idle" - Halt the current sync action. + "frozen" - Freeze the current sync action. + "resync" - Initiate/continue a resync. + "recover"- Initiate/continue a recover process. + "check" - Initiate a check (i.e. a "scrub") of the array. + "repair" - Initiate a repair of the array. + "reshape"- Currently unsupported (-EINVAL). Version History --------------- @@ -171,4 +218,7 @@ Version History 1.3.1 Allow device replacement/rebuild for RAID 10 1.3.2 Fix/improve redundancy checking for RAID10 1.4.0 Non-functional change. Removes arg from mapping function. -1.4.1 Add RAID10 "far" and "offset" algorithm support. +1.4.1 RAID10 fix redundancy validation checks (commit 55ebbb5). +1.4.2 Add RAID10 "far" and "offset" algorithm support. +1.5.0 Add message interface to allow manipulation of the sync_action. + New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt. diff --git a/Documentation/devicetree/bindings/arm/atmel-adc.txt b/Documentation/devicetree/bindings/arm/atmel-adc.txt index c63097d6afeb..16769d9cedd6 100644 --- a/Documentation/devicetree/bindings/arm/atmel-adc.txt +++ b/Documentation/devicetree/bindings/arm/atmel-adc.txt @@ -14,9 +14,19 @@ Required properties: - atmel,adc-status-register: Offset of the Interrupt Status Register - atmel,adc-trigger-register: Offset of the Trigger Register - atmel,adc-vref: Reference voltage in millivolts for the conversions + - atmel,adc-res: List of resolution in bits supported by the ADC. List size + must be two at least. + - atmel,adc-res-names: Contains one identifier string for each resolution + in atmel,adc-res property. "lowres" and "highres" + identifiers are required. Optional properties: - atmel,adc-use-external: Boolean to enable of external triggers + - atmel,adc-use-res: String corresponding to an identifier from + atmel,adc-res-names property. If not specified, the highest + resolution will be used. + - atmel,adc-sleep-mode: Boolean to enable sleep mode when no conversion + - atmel,adc-sample-hold-time: Sample and Hold Time in microseconds Optional trigger Nodes: - Required properties: @@ -40,6 +50,9 @@ adc0: adc@fffb0000 { atmel,adc-trigger-register = <0x08>; atmel,adc-use-external; atmel,adc-vref = <3300>; + atmel,adc-res = <8 10>; + atmel,adc-res-names = "lowres", "highres"; + atmel,adc-use-res = "lowres"; trigger@0 { trigger-name = "external-rising"; diff --git a/Documentation/devicetree/bindings/arm/msm/ssbi.txt b/Documentation/devicetree/bindings/arm/msm/ssbi.txt new file mode 100644 index 000000000000..54fd5ced3401 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/msm/ssbi.txt @@ -0,0 +1,18 @@ +* Qualcomm SSBI + +Some Qualcomm MSM devices contain a point-to-point serial bus used to +communicate with a limited range of devices (mostly power management +chips). + +These require the following properties: + +- compatible: "qcom,ssbi" + +- qcom,controller-type + indicates the SSBI bus variant the controller should use to talk + with the slave device. This should be one of "ssbi", "ssbi2", or + "pmic-arbiter". The type chosen is determined by the attached + slave. + +The slave device should be the single child node of the ssbi device +with a compatible field. diff --git a/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt b/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt new file mode 100644 index 000000000000..47ada1dff216 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/samsung/exynos-adc.txt @@ -0,0 +1,60 @@ +Samsung Exynos Analog to Digital Converter bindings + +The devicetree bindings are for the new ADC driver written for +Exynos4 and upward SoCs from Samsung. + +New driver handles the following +1. Supports ADC IF found on EXYNOS4412/EXYNOS5250 + and future SoCs from Samsung +2. Add ADC driver under iio/adc framework +3. Also adds the Documentation for device tree bindings + +Required properties: +- compatible: Must be "samsung,exynos-adc-v1" + for exynos4412/5250 controllers. + Must be "samsung,exynos-adc-v2" for + future controllers. +- reg: Contains ADC register address range (base address and + length) and the address of the phy enable register. +- interrupts: Contains the interrupt information for the timer. The + format is being dependent on which interrupt controller + the Samsung device uses. +- #io-channel-cells = <1>; As ADC has multiple outputs +- clocks From common clock binding: handle to adc clock. +- clock-names From common clock binding: Shall be "adc". +- vdd-supply VDD input supply. + +Note: child nodes can be added for auto probing from device tree. + +Example: adding device info in dtsi file + +adc: adc@12D10000 { + compatible = "samsung,exynos-adc-v1"; + reg = <0x12D10000 0x100>, <0x10040718 0x4>; + interrupts = <0 106 0>; + #io-channel-cells = <1>; + io-channel-ranges; + + clocks = <&clock 303>; + clock-names = "adc"; + + vdd-supply = <&buck5_reg>; +}; + + +Example: Adding child nodes in dts file + +adc@12D10000 { + + /* NTC thermistor is a hwmon device */ + ncp15wb473@0 { + compatible = "ntc,ncp15wb473"; + pullup-uV = <1800000>; + pullup-ohm = <47000>; + pulldown-ohm = <0>; + io-channels = <&adc 4>; + }; +}; + +Note: Does not apply to ADC driver under arch/arm/plat-samsung/ +Note: The child node can be added under the adc node or separately. diff --git a/Documentation/devicetree/bindings/ata/imx-pata.txt b/Documentation/devicetree/bindings/ata/imx-pata.txt new file mode 100644 index 000000000000..e38d73414b0d --- /dev/null +++ b/Documentation/devicetree/bindings/ata/imx-pata.txt @@ -0,0 +1,17 @@ +* Freescale i.MX PATA Controller + +Required properties: +- compatible: "fsl,imx27-pata" +- reg: Address range of the PATA Controller +- interrupts: The interrupt of the PATA Controller +- clocks: the clocks for the PATA Controller + +Example: + + pata: pata@83fe0000 { + compatible = "fsl,imx51-pata", "fsl,imx27-pata"; + reg = <0x83fe0000 0x4000>; + interrupts = <70>; + clocks = <&clks 161>; + status = "disabled"; + }; diff --git a/Documentation/devicetree/bindings/clock/axi-clkgen.txt b/Documentation/devicetree/bindings/clock/axi-clkgen.txt new file mode 100644 index 000000000000..028b493e97ff --- /dev/null +++ b/Documentation/devicetree/bindings/clock/axi-clkgen.txt @@ -0,0 +1,22 @@ +Binding for the axi-clkgen clock generator + +This binding uses the common clock binding[1]. + +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt + +Required properties: +- compatible : shall be "adi,axi-clkgen". +- #clock-cells : from common clock binding; Should always be set to 0. +- reg : Address and length of the axi-clkgen register set. +- clocks : Phandle and clock specifier for the parent clock. + +Optional properties: +- clock-output-names : From common clock binding. + +Example: + clock@0xff000000 { + compatible = "adi,axi-clkgen"; + #clock-cells = <0>; + reg = <0xff000000 0x1000>; + clocks = <&osc 1>; + }; diff --git a/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt b/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt new file mode 100644 index 000000000000..5757f9abfc26 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/fixed-factor-clock.txt @@ -0,0 +1,24 @@ +Binding for simple fixed factor rate clock sources. + +This binding uses the common clock binding[1]. + +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt + +Required properties: +- compatible : shall be "fixed-factor-clock". +- #clock-cells : from common clock binding; shall be set to 0. +- clock-div: fixed divider. +- clock-mult: fixed multiplier. +- clocks: parent clock. + +Optional properties: +- clock-output-names : From common clock binding. + +Example: + clock { + compatible = "fixed-factor-clock"; + clocks = <&parentclk>; + #clock-cells = <0>; + div = <2>; + mult = <1>; + }; diff --git a/Documentation/devicetree/bindings/clock/silabs,si5351.txt b/Documentation/devicetree/bindings/clock/silabs,si5351.txt new file mode 100644 index 000000000000..cc374651662c --- /dev/null +++ b/Documentation/devicetree/bindings/clock/silabs,si5351.txt @@ -0,0 +1,114 @@ +Binding for Silicon Labs Si5351a/b/c programmable i2c clock generator. + +Reference +[1] Si5351A/B/C Data Sheet + http://www.silabs.com/Support%20Documents/TechnicalDocs/Si5351.pdf + +The Si5351a/b/c are programmable i2c clock generators with upto 8 output +clocks. Si5351a also has a reduced pin-count package (MSOP10) where only +3 output clocks are accessible. The internal structure of the clock +generators can be found in [1]. + +==I2C device node== + +Required properties: +- compatible: shall be one of "silabs,si5351{a,a-msop,b,c}". +- reg: i2c device address, shall be 0x60 or 0x61. +- #clock-cells: from common clock binding; shall be set to 1. +- clocks: from common clock binding; list of parent clock + handles, shall be xtal reference clock or xtal and clkin for + si5351c only. +- #address-cells: shall be set to 1. +- #size-cells: shall be set to 0. + +Optional properties: +- silabs,pll-source: pair of (number, source) for each pll. Allows + to overwrite clock source of pll A (number=0) or B (number=1). + +==Child nodes== + +Each of the clock outputs can be overwritten individually by +using a child node to the I2C device node. If a child node for a clock +output is not set, the eeprom configuration is not overwritten. + +Required child node properties: +- reg: number of clock output. + +Optional child node properties: +- silabs,clock-source: source clock of the output divider stage N, shall be + 0 = multisynth N + 1 = multisynth 0 for output clocks 0-3, else multisynth4 + 2 = xtal + 3 = clkin (si5351c only) +- silabs,drive-strength: output drive strength in mA, shall be one of {2,4,6,8}. +- silabs,multisynth-source: source pll A(0) or B(1) of corresponding multisynth + divider. +- silabs,pll-master: boolean, multisynth can change pll frequency. + +==Example== + +/* 25MHz reference crystal */ +ref25: ref25M { + compatible = "fixed-clock"; + #clock-cells = <0>; + clock-frequency = <25000000>; +}; + +i2c-master-node { + + /* Si5351a msop10 i2c clock generator */ + si5351a: clock-generator@60 { + compatible = "silabs,si5351a-msop"; + reg = <0x60>; + #address-cells = <1>; + #size-cells = <0>; + #clock-cells = <1>; + + /* connect xtal input to 25MHz reference */ + clocks = <&ref25>; + + /* connect xtal input as source of pll0 and pll1 */ + silabs,pll-source = <0 0>, <1 0>; + + /* + * overwrite clkout0 configuration with: + * - 8mA output drive strength + * - pll0 as clock source of multisynth0 + * - multisynth0 as clock source of output divider + * - multisynth0 can change pll0 + * - set initial clock frequency of 74.25MHz + */ + clkout0 { + reg = <0>; + silabs,drive-strength = <8>; + silabs,multisynth-source = <0>; + silabs,clock-source = <0>; + silabs,pll-master; + clock-frequency = <74250000>; + }; + + /* + * overwrite clkout1 configuration with: + * - 4mA output drive strength + * - pll1 as clock source of multisynth1 + * - multisynth1 as clock source of output divider + * - multisynth1 can change pll1 + */ + clkout1 { + reg = <1>; + silabs,drive-strength = <4>; + silabs,multisynth-source = <1>; + silabs,clock-source = <0>; + pll-master; + }; + + /* + * overwrite clkout2 configuration with: + * - xtal as clock source of output divider + */ + clkout2 { + reg = <2>; + silabs,clock-source = <2>; + }; + }; +}; diff --git a/Documentation/devicetree/bindings/clock/sunxi.txt b/Documentation/devicetree/bindings/clock/sunxi.txt new file mode 100644 index 000000000000..729f52426fe1 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/sunxi.txt @@ -0,0 +1,151 @@ +Device Tree Clock bindings for arch-sunxi + +This binding uses the common clock binding[1]. + +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt + +Required properties: +- compatible : shall be one of the following: + "allwinner,sun4i-osc-clk" - for a gatable oscillator + "allwinner,sun4i-pll1-clk" - for the main PLL clock + "allwinner,sun4i-cpu-clk" - for the CPU multiplexer clock + "allwinner,sun4i-axi-clk" - for the AXI clock + "allwinner,sun4i-axi-gates-clk" - for the AXI gates + "allwinner,sun4i-ahb-clk" - for the AHB clock + "allwinner,sun4i-ahb-gates-clk" - for the AHB gates + "allwinner,sun4i-apb0-clk" - for the APB0 clock + "allwinner,sun4i-apb0-gates-clk" - for the APB0 gates + "allwinner,sun4i-apb1-clk" - for the APB1 clock + "allwinner,sun4i-apb1-mux-clk" - for the APB1 clock muxing + "allwinner,sun4i-apb1-gates-clk" - for the APB1 gates + +Required properties for all clocks: +- reg : shall be the control register address for the clock. +- clocks : shall be the input parent clock(s) phandle for the clock +- #clock-cells : from common clock binding; shall be set to 0 except for + "allwinner,sun4i-*-gates-clk" where it shall be set to 1 + +Additionally, "allwinner,sun4i-*-gates-clk" clocks require: +- clock-output-names : the corresponding gate names that the clock controls + +For example: + +osc24M: osc24M@01c20050 { + #clock-cells = <0>; + compatible = "allwinner,sun4i-osc-clk"; + reg = <0x01c20050 0x4>; + clocks = <&osc24M_fixed>; +}; + +pll1: pll1@01c20000 { + #clock-cells = <0>; + compatible = "allwinner,sun4i-pll1-clk"; + reg = <0x01c20000 0x4>; + clocks = <&osc24M>; +}; + +cpu: cpu@01c20054 { + #clock-cells = <0>; + compatible = "allwinner,sun4i-cpu-clk"; + reg = <0x01c20054 0x4>; + clocks = <&osc32k>, <&osc24M>, <&pll1>; +}; + + + +Gate clock outputs + +The "allwinner,sun4i-*-gates-clk" clocks provide several gatable outputs; +their corresponding offsets as present on sun4i are listed below. Note that +some of these gates are not present on sun5i. + + * AXI gates ("allwinner,sun4i-axi-gates-clk") + + DRAM 0 + + * AHB gates ("allwinner,sun4i-ahb-gates-clk") + + USB0 0 + EHCI0 1 + OHCI0 2* + EHCI1 3 + OHCI1 4* + SS 5 + DMA 6 + BIST 7 + MMC0 8 + MMC1 9 + MMC2 10 + MMC3 11 + MS 12** + NAND 13 + SDRAM 14 + + ACE 16 + EMAC 17 + TS 18 + + SPI0 20 + SPI1 21 + SPI2 22 + SPI3 23 + PATA 24 + SATA 25** + GPS 26* + + VE 32 + TVD 33 + TVE0 34 + TVE1 35 + LCD0 36 + LCD1 37 + + CSI0 40 + CSI1 41 + + HDMI 43 + DE_BE0 44 + DE_BE1 45 + DE_FE0 46 + DE_FE1 47 + + MP 50 + + MALI400 52 + + * APB0 gates ("allwinner,sun4i-apb0-gates-clk") + + CODEC 0 + SPDIF 1* + AC97 2 + IIS 3 + + PIO 5 + IR0 6 + IR1 7 + + KEYPAD 10 + + * APB1 gates ("allwinner,sun4i-apb1-gates-clk") + + I2C0 0 + I2C1 1 + I2C2 2 + + CAN 4 + SCR 5 + PS20 6 + PS21 7 + + UART0 16 + UART1 17 + UART2 18 + UART3 19 + UART4 20 + UART5 21 + UART6 22 + UART7 23 + +Notation: + [*]: The datasheet didn't mention these, but they are present on AW code + [**]: The datasheet had this marked as "NC" but they are used on AW code diff --git a/Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt b/Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt new file mode 100644 index 000000000000..0715695e94a9 --- /dev/null +++ b/Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt @@ -0,0 +1,65 @@ +Generic ARM big LITTLE cpufreq driver's DT glue +----------------------------------------------- + +This is DT specific glue layer for generic cpufreq driver for big LITTLE +systems. + +Both required and optional properties listed below must be defined +under node /cpus/cpu@x. Where x is the first cpu inside a cluster. + +FIXME: Cpus should boot in the order specified in DT and all cpus for a cluster +must be present contiguously. Generic DT driver will check only node 'x' for +cpu:x. + +Required properties: +- operating-points: Refer to Documentation/devicetree/bindings/power/opp.txt + for details + +Optional properties: +- clock-latency: Specify the possible maximum transition latency for clock, + in unit of nanoseconds. + +Examples: + +cpus { + #address-cells = <1>; + #size-cells = <0>; + + cpu@0 { + compatible = "arm,cortex-a15"; + reg = <0>; + next-level-cache = <&L2>; + operating-points = < + /* kHz uV */ + 792000 1100000 + 396000 950000 + 198000 850000 + >; + clock-latency = <61036>; /* two CLK32 periods */ + }; + + cpu@1 { + compatible = "arm,cortex-a15"; + reg = <1>; + next-level-cache = <&L2>; + }; + + cpu@100 { + compatible = "arm,cortex-a7"; + reg = <100>; + next-level-cache = <&L2>; + operating-points = < + /* kHz uV */ + 792000 950000 + 396000 750000 + 198000 450000 + >; + clock-latency = <61036>; /* two CLK32 periods */ + }; + + cpu@101 { + compatible = "arm,cortex-a7"; + reg = <101>; + next-level-cache = <&L2>; + }; +}; diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt index 4416ccc33472..051f764bedb8 100644 --- a/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt @@ -32,7 +32,7 @@ cpus { 396000 950000 198000 850000 >; - transition-latency = <61036>; /* two CLK32 periods */ + clock-latency = <61036>; /* two CLK32 periods */ }; cpu@1 { diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-exynos5440.txt b/Documentation/devicetree/bindings/cpufreq/cpufreq-exynos5440.txt new file mode 100644 index 000000000000..caff1a57436f --- /dev/null +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-exynos5440.txt @@ -0,0 +1,28 @@ + +Exynos5440 cpufreq driver +------------------- + +Exynos5440 SoC cpufreq driver for CPU frequency scaling. + +Required properties: +- interrupts: Interrupt to know the completion of cpu frequency change. +- operating-points: Table of frequencies and voltage CPU could be transitioned into, + in the decreasing order. Frequency should be in KHz units and voltage + should be in microvolts. + +Optional properties: +- clock-latency: Clock monitor latency in microsecond. + +All the required listed above must be defined under node cpufreq. + +Example: +-------- + cpufreq@160000 { + compatible = "samsung,exynos5440-cpufreq"; + reg = <0x160000 0x1000>; + interrupts = <0 57 0>; + operating-points = < + 1000000 975000 + 800000 925000>; + clock-latency = <100000>; + }; diff --git a/Documentation/devicetree/bindings/gpio/gpio.txt b/Documentation/devicetree/bindings/gpio/gpio.txt index a33628759d36..d933af370697 100644 --- a/Documentation/devicetree/bindings/gpio/gpio.txt +++ b/Documentation/devicetree/bindings/gpio/gpio.txt @@ -98,7 +98,7 @@ announce the pinrange to the pin ctrl subsystem. For example, compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank"; reg = <0x1460 0x18>; gpio-controller; - gpio-ranges = <&pinctrl1 20 10>, <&pinctrl2 50 20>; + gpio-ranges = <&pinctrl1 0 20 10>, <&pinctrl2 10 50 20>; } @@ -107,8 +107,8 @@ where, Next values specify the base pin and number of pins for the range handled by 'qe_pio_e' gpio. In the given example from base pin 20 to - pin 29 under pinctrl1 and pin 50 to pin 69 under pinctrl2 is handled - by this gpio controller. + pin 29 under pinctrl1 with gpio offset 0 and pin 50 to pin 69 under + pinctrl2 with gpio offset 10 is handled by this gpio controller. The pinctrl node must have "#gpio-range-cells" property to show number of arguments to pass with phandle from gpio controllers node. diff --git a/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt new file mode 100644 index 000000000000..c6f66674f19c --- /dev/null +++ b/Documentation/devicetree/bindings/hwmon/ntc_thermistor.txt @@ -0,0 +1,29 @@ +NTC Thermistor hwmon sensors +------------------------------- + +Requires node properties: +- "compatible" value : one of + "ntc,ncp15wb473" + "ntc,ncp18wb473" + "ntc,ncp21wb473" + "ntc,ncp03wb473" + "ntc,ncp15wl333" +- "pullup-uv" Pull up voltage in micro volts +- "pullup-ohm" Pull up resistor value in ohms +- "pulldown-ohm" Pull down resistor value in ohms +- "connected-positive" Always ON, If not specified. + Status change is possible. +- "io-channels" Channel node of ADC to be used for + conversion. + +Read more about iio bindings at + Documentation/devicetree/bindings/iio/iio-bindings.txt + +Example: + ncp15wb473@0 { + compatible = "ntc,ncp15wb473"; + pullup-uv = <1800000>; + pullup-ohm = <47000>; + pulldown-ohm = <0>; + io-channels = <&adc 3>; + }; diff --git a/Documentation/devicetree/bindings/i2c/i2c-s3c2410.txt b/Documentation/devicetree/bindings/i2c/i2c-s3c2410.txt index f98d4c5b5cca..296eb4536129 100644 --- a/Documentation/devicetree/bindings/i2c/i2c-s3c2410.txt +++ b/Documentation/devicetree/bindings/i2c/i2c-s3c2410.txt @@ -26,7 +26,7 @@ Required for all cases except "samsung,s3c2440-hdmiphy-i2c": - pinctrl-names: Should contain only one value - "default". Optional properties: - - samsung,i2c-slave-addr: Slave address in multi-master enviroment. If not + - samsung,i2c-slave-addr: Slave address in multi-master environment. If not specified, default value is 0. - samsung,i2c-max-bus-freq: Desired frequency in Hz of the bus. If not specified, the default value in Hz is 100000. diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt b/Documentation/devicetree/bindings/i2c/trivial-devices.txt index 446859fcdca4..ad6a73852f08 100644 --- a/Documentation/devicetree/bindings/i2c/trivial-devices.txt +++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt @@ -35,6 +35,8 @@ fsl,mc13892 MC13892: Power Management Integrated Circuit (PMIC) for i.MX35/51 fsl,mma8450 MMA8450Q: Xtrinsic Low-power, 3-axis Xtrinsic Accelerometer fsl,mpr121 MPR121: Proximity Capacitive Touch Sensor Controller fsl,sgtl5000 SGTL5000: Ultra Low-Power Audio Codec +infineon,slb9635tt Infineon SLB9635 (Soft-) I2C TPM (old protocol, max 100khz) +infineon,slb9645tt Infineon SLB9645 I2C TPM (new protocol, max 400khz) maxim,ds1050 5 Bit Programmable, Pulse-Width Modulator maxim,max1237 Low-Power, 4-/12-Channel, 2-Wire Serial, 12-Bit ADCs maxim,max6625 9-Bit/12-Bit Temperature Sensors with I²C-Compatible Serial Interface diff --git a/Documentation/devicetree/bindings/iio/iio-bindings.txt b/Documentation/devicetree/bindings/iio/iio-bindings.txt new file mode 100644 index 000000000000..0b447d9ad196 --- /dev/null +++ b/Documentation/devicetree/bindings/iio/iio-bindings.txt @@ -0,0 +1,97 @@ +This binding is derived from clock bindings, and based on suggestions +from Lars-Peter Clausen [1]. + +Sources of IIO channels can be represented by any node in the device +tree. Those nodes are designated as IIO providers. IIO consumer +nodes use a phandle and IIO specifier pair to connect IIO provider +outputs to IIO inputs. Similar to the gpio specifiers, an IIO +specifier is an array of one or more cells identifying the IIO +output on a device. The length of an IIO specifier is defined by the +value of a #io-channel-cells property in the IIO provider node. + +[1] http://marc.info/?l=linux-iio&m=135902119507483&w=2 + +==IIO providers== + +Required properties: +#io-channel-cells: Number of cells in an IIO specifier; Typically 0 for nodes + with a single IIO output and 1 for nodes with multiple + IIO outputs. + +Example for a simple configuration with no trigger: + + adc: voltage-sensor@35 { + compatible = "maxim,max1139"; + reg = <0x35>; + #io-channel-cells = <1>; + }; + +Example for a configuration with trigger: + + adc@35 { + compatible = "some-vendor,some-adc"; + reg = <0x35>; + + adc1: iio-device@0 { + #io-channel-cells = <1>; + /* other properties */ + }; + adc2: iio-device@1 { + #io-channel-cells = <1>; + /* other properties */ + }; + }; + +==IIO consumers== + +Required properties: +io-channels: List of phandle and IIO specifier pairs, one pair + for each IIO input to the device. Note: if the + IIO provider specifies '0' for #io-channel-cells, + then only the phandle portion of the pair will appear. + +Optional properties: +io-channel-names: + List of IIO input name strings sorted in the same + order as the io-channels property. Consumers drivers + will use io-channel-names to match IIO input names + with IIO specifiers. +io-channel-ranges: + Empty property indicating that child nodes can inherit named + IIO channels from this node. Useful for bus nodes to provide + and IIO channel to their children. + +For example: + + device { + io-channels = <&adc 1>, <&ref 0>; + io-channel-names = "vcc", "vdd"; + }; + +This represents a device with two IIO inputs, named "vcc" and "vdd". +The vcc channel is connected to output 1 of the &adc device, and the +vdd channel is connected to output 0 of the &ref device. + +==Example== + + adc: max1139@35 { + compatible = "maxim,max1139"; + reg = <0x35>; + #io-channel-cells = <1>; + }; + + ... + + iio_hwmon { + compatible = "iio-hwmon"; + io-channels = <&adc 0>, <&adc 1>, <&adc 2>, + <&adc 3>, <&adc 4>, <&adc 5>, + <&adc 6>, <&adc 7>, <&adc 8>, + <&adc 9>; + }; + + some_consumer { + compatible = "some-consumer"; + io-channels = <&adc 10>, <&adc 11>; + io-channel-names = "adc1", "adc2"; + }; diff --git a/Documentation/devicetree/bindings/input/ps2keyb-mouse-apbps2.txt b/Documentation/devicetree/bindings/input/ps2keyb-mouse-apbps2.txt new file mode 100644 index 000000000000..3029c5694cf6 --- /dev/null +++ b/Documentation/devicetree/bindings/input/ps2keyb-mouse-apbps2.txt @@ -0,0 +1,16 @@ +Aeroflex Gaisler APBPS2 PS/2 Core, supporting Keyboard or Mouse. + +The APBPS2 PS/2 core is available in the GRLIB VHDL IP core library. + +Note: In the ordinary environment for the APBPS2 core, a LEON SPARC system, +these properties are built from information in the AMBA plug&play and from +bootloader settings. + +Required properties: + +- name : Should be "GAISLER_APBPS2" or "01_060" +- reg : Address and length of the register set for the device +- interrupts : Interrupt numbers for this device + +For further information look in the documentation for the GLIB IP core library: +http://www.gaisler.com/products/grlib/grip.pdf diff --git a/Documentation/devicetree/bindings/input/touchscreen/auo_pixcir_ts.txt b/Documentation/devicetree/bindings/input/touchscreen/auo_pixcir_ts.txt new file mode 100644 index 000000000000..f40f21c642b9 --- /dev/null +++ b/Documentation/devicetree/bindings/input/touchscreen/auo_pixcir_ts.txt @@ -0,0 +1,30 @@ +* AUO in-cell touchscreen controller using Pixcir sensors + +Required properties: +- compatible: must be "auo,auo_pixcir_ts" +- reg: I2C address of the chip +- interrupts: interrupt to which the chip is connected +- gpios: gpios the chip is connected to + first one is the interrupt gpio and second one the reset gpio +- x-size: horizontal resolution of touchscreen +- y-size: vertical resolution of touchscreen + +Example: + + i2c@00000000 { + /* ... */ + + auo_pixcir_ts@5c { + compatible = "auo,auo_pixcir_ts"; + reg = <0x5c>; + interrupts = <2 0>; + + gpios = <&gpf 2 0 2>, /* INT */ + <&gpf 5 1 0>; /* RST */ + + x-size = <800>; + y-size = <600>; + }; + + /* ... */ + }; diff --git a/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt b/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt new file mode 100644 index 000000000000..64ad48b824a2 --- /dev/null +++ b/Documentation/devicetree/bindings/input/touchscreen/sitronix-st1232.txt @@ -0,0 +1,24 @@ +* Sitronix st1232 touchscreen controller + +Required properties: +- compatible: must be "sitronix,st1232" +- reg: I2C address of the chip +- interrupts: interrupt to which the chip is connected + +Optional properties: +- gpios: a phandle to the reset GPIO + +Example: + + i2c@00000000 { + /* ... */ + + touchscreen@55 { + compatible = "sitronix,st1232"; + reg = <0x55>; + interrupts = <2 0>; + gpios = <&gpio1 166 0>; + }; + + /* ... */ + }; diff --git a/Documentation/devicetree/bindings/leds/tca6507.txt b/Documentation/devicetree/bindings/leds/tca6507.txt index 2b6693b972fb..80ff3dfb1f32 100644 --- a/Documentation/devicetree/bindings/leds/tca6507.txt +++ b/Documentation/devicetree/bindings/leds/tca6507.txt @@ -1,4 +1,4 @@ -LEDs conected to tca6507 +LEDs connected to tca6507 Required properties: - compatible : should be : "ti,tca6507". diff --git a/Documentation/devicetree/bindings/marvell.txt b/Documentation/devicetree/bindings/marvell.txt index f1533d91953a..f7a0da6b4022 100644 --- a/Documentation/devicetree/bindings/marvell.txt +++ b/Documentation/devicetree/bindings/marvell.txt @@ -115,6 +115,9 @@ prefixed with the string "marvell,", for Marvell Technology Group Ltd. - compatible : "marvell,mv64360-eth-block" - reg : Offset and length of the register set for this block + Optional properties: + - clocks : Phandle to the clock control device and gate bit + Example Discovery Ethernet block node: ethernet-block@2000 { #address-cells = <1>; diff --git a/Documentation/devicetree/bindings/media/coda.txt b/Documentation/devicetree/bindings/media/coda.txt new file mode 100644 index 000000000000..2865d04e4030 --- /dev/null +++ b/Documentation/devicetree/bindings/media/coda.txt @@ -0,0 +1,30 @@ +Chips&Media Coda multi-standard codec IP +======================================== + +Coda codec IPs are present in i.MX SoCs in various versions, +called VPU (Video Processing Unit). + +Required properties: +- compatible : should be "fsl,<chip>-src" for i.MX SoCs: + (a) "fsl,imx27-vpu" for CodaDx6 present in i.MX27 + (b) "fsl,imx53-vpu" for CODA7541 present in i.MX53 + (c) "fsl,imx6q-vpu" for CODA960 present in i.MX6q +- reg: should be register base and length as documented in the + SoC reference manual +- interrupts : Should contain the VPU interrupt. For CODA960, + a second interrupt is needed for the MJPEG unit. +- clocks : Should contain the ahb and per clocks, in the order + determined by the clock-names property. +- clock-names : Should be "ahb", "per" +- iram : phandle pointing to the SRAM device node + +Example: + +vpu: vpu@63ff4000 { + compatible = "fsl,imx53-vpu"; + reg = <0x63ff4000 0x1000>; + interrupts = <9>; + clocks = <&clks 63>, <&clks 63>; + clock-names = "ahb", "per"; + iram = <&ocram>; +}; diff --git a/Documentation/devicetree/bindings/media/exynos-fimc-lite.txt b/Documentation/devicetree/bindings/media/exynos-fimc-lite.txt new file mode 100644 index 000000000000..3f62adfb3e0b --- /dev/null +++ b/Documentation/devicetree/bindings/media/exynos-fimc-lite.txt @@ -0,0 +1,14 @@ +Exynos4x12/Exynos5 SoC series camera host interface (FIMC-LITE) + +Required properties: + +- compatible : should be "samsung,exynos4212-fimc" for Exynos4212 and + Exynos4412 SoCs; +- reg : physical base address and size of the device memory mapped + registers; +- interrupts : should contain FIMC-LITE interrupt; +- clocks : FIMC LITE gate clock should be specified in this property. +- clock-names : should contain "flite" entry. + +Each FIMC device should have an alias in the aliases node, in the form of +fimc-lite<n>, where <n> is an integer specifying the IP block instance. diff --git a/Documentation/devicetree/bindings/media/exynos4-fimc-is.txt b/Documentation/devicetree/bindings/media/exynos4-fimc-is.txt new file mode 100644 index 000000000000..55c9ad6f9599 --- /dev/null +++ b/Documentation/devicetree/bindings/media/exynos4-fimc-is.txt @@ -0,0 +1,49 @@ +Exynos4x12 SoC series Imaging Subsystem (FIMC-IS) + +The FIMC-IS is a subsystem for processing image signal from an image sensor. +The Exynos4x12 SoC series FIMC-IS V1.5 comprises of a dedicated ARM Cortex-A5 +processor, ISP, DRC and FD IP blocks and peripheral devices such as UART, I2C +and SPI bus controllers, PWM and ADC. + +fimc-is node +------------ + +Required properties: +- compatible : should be "samsung,exynos4212-fimc-is" for Exynos4212 and + Exynos4412 SoCs; +- reg : physical base address and length of the registers set; +- interrupts : must contain two FIMC-IS interrupts, in order: ISP0, ISP1; +- clocks : list of clock specifiers, corresponding to entries in + clock-names property; +- clock-names : must contain "ppmuispx", "ppmuispx", "lite0", "lite1" + "mpll", "sysreg", "isp", "drc", "fd", "mcuisp", "uart", + "ispdiv0", "ispdiv1", "mcuispdiv0", "mcuispdiv1", "aclk200", + "div_aclk200", "aclk400mcuisp", "div_aclk400mcuisp" entries, + matching entries in the clocks property. +pmu subnode +----------- + +Required properties: + - reg : must contain PMU physical base address and size of the register set. + +The following are the FIMC-IS peripheral device nodes and can be specified +either standalone or as the fimc-is node child nodes. + +i2c-isp (ISP I2C bus controller) nodes +------------------------------------------ + +Required properties: + +- compatible : should be "samsung,exynos4212-i2c-isp" for Exynos4212 and + Exynos4412 SoCs; +- reg : physical base address and length of the registers set; +- clocks : must contain gate clock specifier for this controller; +- clock-names : must contain "i2c_isp" entry. + +For the above nodes it is required to specify a pinctrl state named "default", +according to the pinctrl bindings defined in ../pinctrl/pinctrl-bindings.txt. + +Device tree nodes of the image sensors' controlled directly by the FIMC-IS +firmware must be child nodes of their corresponding ISP I2C bus controller node. +The data link of these image sensors must be specified using the common video +interfaces bindings, defined in video-interfaces.txt. diff --git a/Documentation/devicetree/bindings/media/samsung-fimc.txt b/Documentation/devicetree/bindings/media/samsung-fimc.txt new file mode 100644 index 000000000000..51c776b7f7a3 --- /dev/null +++ b/Documentation/devicetree/bindings/media/samsung-fimc.txt @@ -0,0 +1,197 @@ +Samsung S5P/EXYNOS SoC Camera Subsystem (FIMC) +---------------------------------------------- + +The S5P/Exynos SoC Camera subsystem comprises of multiple sub-devices +represented by separate device tree nodes. Currently this includes: FIMC (in +the S5P SoCs series known as CAMIF), MIPI CSIS, FIMC-LITE and FIMC-IS (ISP). + +The sub-subdevices are defined as child nodes of the common 'camera' node which +also includes common properties of the whole subsystem not really specific to +any single sub-device, like common camera port pins or the CAMCLK clock outputs +for external image sensors attached to an SoC. + +Common 'camera' node +-------------------- + +Required properties: + +- compatible : must be "samsung,fimc", "simple-bus" +- clocks : list of clock specifiers, corresponding to entries in + the clock-names property; +- clock-names : must contain "sclk_cam0", "sclk_cam1", "pxl_async0", + "pxl_async1" entries, matching entries in the clocks property. + +The pinctrl bindings defined in ../pinctrl/pinctrl-bindings.txt must be used +to define a required pinctrl state named "default" and optional pinctrl states: +"idle", "active-a", active-b". These optional states can be used to switch the +camera port pinmux at runtime. The "idle" state should configure both the camera +ports A and B into high impedance state, especially the CAMCLK clock output +should be inactive. For the "active-a" state the camera port A must be activated +and the port B deactivated and for the state "active-b" it should be the other +way around. + +The 'camera' node must include at least one 'fimc' child node. + +'fimc' device nodes +------------------- + +Required properties: + +- compatible: "samsung,s5pv210-fimc" for S5PV210, "samsung,exynos4210-fimc" + for Exynos4210 and "samsung,exynos4212-fimc" for Exynos4x12 SoCs; +- reg: physical base address and length of the registers set for the device; +- interrupts: should contain FIMC interrupt; +- clocks: list of clock specifiers, must contain an entry for each required + entry in clock-names; +- clock-names: must contain "fimc", "sclk_fimc" entries. +- samsung,pix-limits: an array of maximum supported image sizes in pixels, for + details refer to Table 2-1 in the S5PV210 SoC User Manual; The meaning of + each cell is as follows: + 0 - scaler input horizontal size, + 1 - input horizontal size for the scaler bypassed, + 2 - REAL_WIDTH without input rotation, + 3 - REAL_HEIGHT with input rotation, +- samsung,sysreg: a phandle to the SYSREG node. + +Each FIMC device should have an alias in the aliases node, in the form of +fimc<n>, where <n> is an integer specifying the IP block instance. + +Optional properties: + +- clock-frequency: maximum FIMC local clock (LCLK) frequency; +- samsung,min-pix-sizes: an array specyfing minimum image size in pixels at + the FIMC input and output DMA, in the first and second cell respectively. + Default value when this property is not present is <16 16>; +- samsung,min-pix-alignment: minimum supported image height alignment (first + cell) and the horizontal image offset (second cell). The values are in pixels + and default to <2 1> when this property is not present; +- samsung,mainscaler-ext: a boolean property indicating whether the FIMC IP + supports extended image size and has CIEXTEN register; +- samsung,rotators: a bitmask specifying whether this IP has the input and + the output rotator. Bits 4 and 0 correspond to input and output rotator + respectively. If a rotator is present its corresponding bit should be set. + Default value when this property is not specified is 0x11. +- samsung,cam-if: a bolean property indicating whether the IP block includes + the camera input interface. +- samsung,isp-wb: this property must be present if the IP block has the ISP + writeback input. +- samsung,lcd-wb: this property must be present if the IP block has the LCD + writeback input. + + +'parallel-ports' node +--------------------- + +This node should contain child 'port' nodes specifying active parallel video +input ports. It includes camera A and camera B inputs. 'reg' property in the +port nodes specifies data input - 0, 1 indicates input A, B respectively. + +Optional properties + +- samsung,camclk-out : specifies clock output for remote sensor, + 0 - CAM_A_CLKOUT, 1 - CAM_B_CLKOUT; + +Image sensor nodes +------------------ + +The sensor device nodes should be added to their control bus controller (e.g. +I2C0) nodes and linked to a port node in the csis or the parallel-ports node, +using the common video interfaces bindings, defined in video-interfaces.txt. +The implementation of this bindings requires clock-frequency property to be +present in the sensor device nodes. + +Example: + + aliases { + fimc0 = &fimc_0; + }; + + /* Parallel bus IF sensor */ + i2c_0: i2c@13860000 { + s5k6aa: sensor@3c { + compatible = "samsung,s5k6aafx"; + reg = <0x3c>; + vddio-supply = <...>; + + clock-frequency = <24000000>; + clocks = <...>; + clock-names = "mclk"; + + port { + s5k6aa_ep: endpoint { + remote-endpoint = <&fimc0_ep>; + bus-width = <8>; + hsync-active = <0>; + vsync-active = <1>; + pclk-sample = <1>; + }; + }; + }; + }; + + /* MIPI CSI-2 bus IF sensor */ + s5c73m3: sensor@0x1a { + compatible = "samsung,s5c73m3"; + reg = <0x1a>; + vddio-supply = <...>; + + clock-frequency = <24000000>; + clocks = <...>; + clock-names = "mclk"; + + port { + s5c73m3_1: endpoint { + data-lanes = <1 2 3 4>; + remote-endpoint = <&csis0_ep>; + }; + }; + }; + + camera { + compatible = "samsung,fimc", "simple-bus"; + #address-cells = <1>; + #size-cells = <1>; + status = "okay"; + + pinctrl-names = "default"; + pinctrl-0 = <&cam_port_a_clk_active>; + + /* parallel camera ports */ + parallel-ports { + /* camera A input */ + port@0 { + reg = <0>; + fimc0_ep: endpoint { + remote-endpoint = <&s5k6aa_ep>; + bus-width = <8>; + hsync-active = <0>; + vsync-active = <1>; + pclk-sample = <1>; + }; + }; + }; + + fimc_0: fimc@11800000 { + compatible = "samsung,exynos4210-fimc"; + reg = <0x11800000 0x1000>; + interrupts = <0 85 0>; + status = "okay"; + }; + + csis_0: csis@11880000 { + compatible = "samsung,exynos4210-csis"; + reg = <0x11880000 0x1000>; + interrupts = <0 78 0>; + /* camera C input */ + port@3 { + reg = <3>; + csis0_ep: endpoint { + remote-endpoint = <&s5c73m3_ep>; + data-lanes = <1 2 3 4>; + samsung,csis-hs-settle = <12>; + }; + }; + }; + }; + +The MIPI-CSIS device binding is defined in samsung-mipi-csis.txt. diff --git a/Documentation/devicetree/bindings/media/samsung-mipi-csis.txt b/Documentation/devicetree/bindings/media/samsung-mipi-csis.txt new file mode 100644 index 000000000000..5f8e28e2484f --- /dev/null +++ b/Documentation/devicetree/bindings/media/samsung-mipi-csis.txt @@ -0,0 +1,81 @@ +Samsung S5P/EXYNOS SoC series MIPI CSI-2 receiver (MIPI CSIS) +------------------------------------------------------------- + +Required properties: + +- compatible : "samsung,s5pv210-csis" for S5PV210 (S5PC110), + "samsung,exynos4210-csis" for Exynos4210 (S5PC210), + "samsung,exynos4212-csis" for Exynos4212/Exynos4412 + SoC series; +- reg : offset and length of the register set for the device; +- interrupts : should contain MIPI CSIS interrupt; the format of the + interrupt specifier depends on the interrupt controller; +- bus-width : maximum number of data lanes supported (SoC specific); +- vddio-supply : MIPI CSIS I/O and PLL voltage supply (e.g. 1.8V); +- vddcore-supply : MIPI CSIS Core voltage supply (e.g. 1.1V); +- clocks : list of clock specifiers, corresponding to entries in + clock-names property; +- clock-names : must contain "csis", "sclk_csis" entries, matching entries + in the clocks property. + +Optional properties: + +- clock-frequency : The IP's main (system bus) clock frequency in Hz, default + value when this property is not specified is 166 MHz; +- samsung,csis-wclk : CSI-2 wrapper clock selection. If this property is present + external clock from CMU will be used, or the bus clock if + if it's not specified. + +The device node should contain one 'port' child node with one child 'endpoint' +node, according to the bindings defined in Documentation/devicetree/bindings/ +media/video-interfaces.txt. The following are properties specific to those nodes. + +port node +--------- + +- reg : (required) must be 3 for camera C input (CSIS0) or 4 for + camera D input (CSIS1); + +endpoint node +------------- + +- data-lanes : (required) an array specifying active physical MIPI-CSI2 + data input lanes and their mapping to logical lanes; the + array's content is unused, only its length is meaningful; + +- samsung,csis-hs-settle : (optional) differential receiver (HS-RX) settle time; + + +Example: + + reg0: regulator@0 { + }; + + reg1: regulator@1 { + }; + +/* SoC properties */ + + csis_0: csis@11880000 { + compatible = "samsung,exynos4210-csis"; + reg = <0x11880000 0x1000>; + interrupts = <0 78 0>; + #address-cells = <1>; + #size-cells = <0>; + }; + +/* Board properties */ + + csis_0: csis@11880000 { + clock-frequency = <166000000>; + vddio-supply = <®0>; + vddcore-supply = <®1>; + port { + reg = <3>; /* 3 - CSIS0, 4 - CSIS1 */ + csis0_ep: endpoint { + remote-endpoint = <...>; + data-lanes = <1>, <2>; + samsung,csis-hs-settle = <12>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/media/video-interfaces.txt b/Documentation/devicetree/bindings/media/video-interfaces.txt new file mode 100644 index 000000000000..e022d2dc4962 --- /dev/null +++ b/Documentation/devicetree/bindings/media/video-interfaces.txt @@ -0,0 +1,228 @@ +Common bindings for video receiver and transmitter interfaces + +General concept +--------------- + +Video data pipelines usually consist of external devices, e.g. camera sensors, +controlled over an I2C, SPI or UART bus, and SoC internal IP blocks, including +video DMA engines and video data processors. + +SoC internal blocks are described by DT nodes, placed similarly to other SoC +blocks. External devices are represented as child nodes of their respective +bus controller nodes, e.g. I2C. + +Data interfaces on all video devices are described by their child 'port' nodes. +Configuration of a port depends on other devices participating in the data +transfer and is described by 'endpoint' subnodes. + +device { + ... + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + ... + endpoint@0 { ... }; + endpoint@1 { ... }; + }; + port@1 { ... }; + }; +}; + +If a port can be configured to work with more than one remote device on the same +bus, an 'endpoint' child node must be provided for each of them. If more than +one port is present in a device node or there is more than one endpoint at a +port, or port node needs to be associated with a selected hardware interface, +a common scheme using '#address-cells', '#size-cells' and 'reg' properties is +used. + +All 'port' nodes can be grouped under optional 'ports' node, which allows to +specify #address-cells, #size-cells properties independently for the 'port' +and 'endpoint' nodes and any child device nodes a device might have. + +Two 'endpoint' nodes are linked with each other through their 'remote-endpoint' +phandles. An endpoint subnode of a device contains all properties needed for +configuration of this device for data exchange with other device. In most +cases properties at the peer 'endpoint' nodes will be identical, however they +might need to be different when there is any signal modifications on the bus +between two devices, e.g. there are logic signal inverters on the lines. + +It is allowed for multiple endpoints at a port to be active simultaneously, +where supported by a device. For example, in case where a data interface of +a device is partitioned into multiple data busses, e.g. 16-bit input port +divided into two separate ITU-R BT.656 8-bit busses. In such case bus-width +and data-shift properties can be used to assign physical data lines to each +endpoint node (logical bus). + + +Required properties +------------------- + +If there is more than one 'port' or more than one 'endpoint' node or 'reg' +property is present in port and/or endpoint nodes the following properties +are required in a relevant parent node: + + - #address-cells : number of cells required to define port/endpoint + identifier, should be 1. + - #size-cells : should be zero. + +Optional endpoint properties +---------------------------- + +- remote-endpoint: phandle to an 'endpoint' subnode of a remote device node. +- slave-mode: a boolean property indicating that the link is run in slave mode. + The default when this property is not specified is master mode. In the slave + mode horizontal and vertical synchronization signals are provided to the + slave device (data source) by the master device (data sink). In the master + mode the data source device is also the source of the synchronization signals. +- bus-width: number of data lines actively used, valid for the parallel busses. +- data-shift: on the parallel data busses, if bus-width is used to specify the + number of data lines, data-shift can be used to specify which data lines are + used, e.g. "bus-width=<8>; data-shift=<2>;" means, that lines 9:2 are used. +- hsync-active: active state of the HSYNC signal, 0/1 for LOW/HIGH respectively. +- vsync-active: active state of the VSYNC signal, 0/1 for LOW/HIGH respectively. + Note, that if HSYNC and VSYNC polarities are not specified, embedded + synchronization may be required, where supported. +- data-active: similar to HSYNC and VSYNC, specifies data line polarity. +- field-even-active: field signal level during the even field data transmission. +- pclk-sample: sample data on rising (1) or falling (0) edge of the pixel clock + signal. +- data-lanes: an array of physical data lane indexes. Position of an entry + determines the logical lane number, while the value of an entry indicates + physical lane, e.g. for 2-lane MIPI CSI-2 bus we could have + "data-lanes = <1 2>;", assuming the clock lane is on hardware lane 0. + This property is valid for serial busses only (e.g. MIPI CSI-2). +- clock-lanes: an array of physical clock lane indexes. Position of an entry + determines the logical lane number, while the value of an entry indicates + physical lane, e.g. for a MIPI CSI-2 bus we could have "clock-lanes = <0>;", + which places the clock lane on hardware lane 0. This property is valid for + serial busses only (e.g. MIPI CSI-2). Note that for the MIPI CSI-2 bus this + array contains only one entry. +- clock-noncontinuous: a boolean property to allow MIPI CSI-2 non-continuous + clock mode. + + +Example +------- + +The example snippet below describes two data pipelines. ov772x and imx074 are +camera sensors with a parallel and serial (MIPI CSI-2) video bus respectively. +Both sensors are on the I2C control bus corresponding to the i2c0 controller +node. ov772x sensor is linked directly to the ceu0 video host interface. +imx074 is linked to ceu0 through the MIPI CSI-2 receiver (csi2). ceu0 has a +(single) DMA engine writing captured data to memory. ceu0 node has a single +'port' node which may indicate that at any time only one of the following data +pipelines can be active: ov772x -> ceu0 or imx074 -> csi2 -> ceu0. + + ceu0: ceu@0xfe910000 { + compatible = "renesas,sh-mobile-ceu"; + reg = <0xfe910000 0xa0>; + interrupts = <0x880>; + + mclk: master_clock { + compatible = "renesas,ceu-clock"; + #clock-cells = <1>; + clock-frequency = <50000000>; /* Max clock frequency */ + clock-output-names = "mclk"; + }; + + port { + #address-cells = <1>; + #size-cells = <0>; + + /* Parallel bus endpoint */ + ceu0_1: endpoint@1 { + reg = <1>; /* Local endpoint # */ + remote = <&ov772x_1_1>; /* Remote phandle */ + bus-width = <8>; /* Used data lines */ + data-shift = <2>; /* Lines 9:2 are used */ + + /* If hsync-active/vsync-active are missing, + embedded BT.656 sync is used */ + hsync-active = <0>; /* Active low */ + vsync-active = <0>; /* Active low */ + data-active = <1>; /* Active high */ + pclk-sample = <1>; /* Rising */ + }; + + /* MIPI CSI-2 bus endpoint */ + ceu0_0: endpoint@0 { + reg = <0>; + remote = <&csi2_2>; + }; + }; + }; + + i2c0: i2c@0xfff20000 { + ... + ov772x_1: camera@0x21 { + compatible = "omnivision,ov772x"; + reg = <0x21>; + vddio-supply = <®ulator1>; + vddcore-supply = <®ulator2>; + + clock-frequency = <20000000>; + clocks = <&mclk 0>; + clock-names = "xclk"; + + port { + /* With 1 endpoint per port no need for addresses. */ + ov772x_1_1: endpoint { + bus-width = <8>; + remote-endpoint = <&ceu0_1>; + hsync-active = <1>; + vsync-active = <0>; /* Who came up with an + inverter here ?... */ + data-active = <1>; + pclk-sample = <1>; + }; + }; + }; + + imx074: camera@0x1a { + compatible = "sony,imx074"; + reg = <0x1a>; + vddio-supply = <®ulator1>; + vddcore-supply = <®ulator2>; + + clock-frequency = <30000000>; /* Shared clock with ov772x_1 */ + clocks = <&mclk 0>; + clock-names = "sysclk"; /* Assuming this is the + name in the datasheet */ + port { + imx074_1: endpoint { + clock-lanes = <0>; + data-lanes = <1 2>; + remote-endpoint = <&csi2_1>; + }; + }; + }; + }; + + csi2: csi2@0xffc90000 { + compatible = "renesas,sh-mobile-csi2"; + reg = <0xffc90000 0x1000>; + interrupts = <0x17a0>; + #address-cells = <1>; + #size-cells = <0>; + + port@1 { + compatible = "renesas,csi2c"; /* One of CSI2I and CSI2C. */ + reg = <1>; /* CSI-2 PHY #1 of 2: PHY_S, + PHY_M has port address 0, + is unused. */ + csi2_1: endpoint { + clock-lanes = <0>; + data-lanes = <2 1>; + remote-endpoint = <&imx074_1>; + }; + }; + port@2 { + reg = <2>; /* port 2: link to the CEU */ + + csi2_2: endpoint { + remote-endpoint = <&ceu0_0>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/metag/meta-intc.txt b/Documentation/devicetree/bindings/metag/meta-intc.txt index 8c47dcbfabc6..80994adab392 100644 --- a/Documentation/devicetree/bindings/metag/meta-intc.txt +++ b/Documentation/devicetree/bindings/metag/meta-intc.txt @@ -12,7 +12,7 @@ Required properties: handle 32 interrupt sources). - interrupt-controller: The presence of this property identifies the node - as an interupt controller. No property value shall be defined. + as an interrupt controller. No property value shall be defined. - #interrupt-cells: Specifies the number of cells needed to encode an interrupt source. The type shall be a <u32> and the value shall be 2. diff --git a/Documentation/devicetree/bindings/mfd/mc13xxx.txt b/Documentation/devicetree/bindings/mfd/mc13xxx.txt index baf07987ae68..abd9e3cb2db7 100644 --- a/Documentation/devicetree/bindings/mfd/mc13xxx.txt +++ b/Documentation/devicetree/bindings/mfd/mc13xxx.txt @@ -10,10 +10,40 @@ Optional properties: - fsl,mc13xxx-uses-touch : Indicate the touchscreen controller is being used Sub-nodes: -- regulators : Contain the regulator nodes. The MC13892 regulators are - bound using their names as listed below with their registers and bits - for enabling. +- regulators : Contain the regulator nodes. The regulators are bound using + their names as listed below with their registers and bits for enabling. +MC13783 regulators: + sw1a : regulator SW1A (register 24, bit 0) + sw1b : regulator SW1B (register 25, bit 0) + sw2a : regulator SW2A (register 26, bit 0) + sw2b : regulator SW2B (register 27, bit 0) + sw3 : regulator SW3 (register 29, bit 20) + vaudio : regulator VAUDIO (register 32, bit 0) + viohi : regulator VIOHI (register 32, bit 3) + violo : regulator VIOLO (register 32, bit 6) + vdig : regulator VDIG (register 32, bit 9) + vgen : regulator VGEN (register 32, bit 12) + vrfdig : regulator VRFDIG (register 32, bit 15) + vrfref : regulator VRFREF (register 32, bit 18) + vrfcp : regulator VRFCP (register 32, bit 21) + vsim : regulator VSIM (register 33, bit 0) + vesim : regulator VESIM (register 33, bit 3) + vcam : regulator VCAM (register 33, bit 6) + vrfbg : regulator VRFBG (register 33, bit 9) + vvib : regulator VVIB (register 33, bit 11) + vrf1 : regulator VRF1 (register 33, bit 12) + vrf2 : regulator VRF2 (register 33, bit 15) + vmmc1 : regulator VMMC1 (register 33, bit 18) + vmmc2 : regulator VMMC2 (register 33, bit 21) + gpo1 : regulator GPO1 (register 34, bit 6) + gpo2 : regulator GPO2 (register 34, bit 8) + gpo3 : regulator GPO3 (register 34, bit 10) + gpo4 : regulator GPO4 (register 34, bit 12) + pwgt1spi : regulator PWGT1SPI (register 34, bit 15) + pwgt2spi : regulator PWGT2SPI (register 34, bit 16) + +MC13892 regulators: vcoincell : regulator VCOINCELL (register 13, bit 23) sw1 : regulator SW1 (register 24, bit 0) sw2 : regulator SW2 (register 25, bit 0) diff --git a/Documentation/devicetree/bindings/misc/sram.txt b/Documentation/devicetree/bindings/misc/sram.txt new file mode 100644 index 000000000000..4d0a00e453a8 --- /dev/null +++ b/Documentation/devicetree/bindings/misc/sram.txt @@ -0,0 +1,16 @@ +Generic on-chip SRAM + +Simple IO memory regions to be managed by the genalloc API. + +Required properties: + +- compatible : mmio-sram + +- reg : SRAM iomem address range + +Example: + +sram: sram@5c000000 { + compatible = "mmio-sram"; + reg = <0x5c000000 0x40000>; /* 256 KiB SRAM at address 0x5c000000 */ +}; diff --git a/Documentation/devicetree/bindings/net/can/atmel-can.txt b/Documentation/devicetree/bindings/net/can/atmel-can.txt new file mode 100644 index 000000000000..72cf0c5daff4 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/atmel-can.txt @@ -0,0 +1,14 @@ +* AT91 CAN * + +Required properties: + - compatible: Should be "atmel,at91sam9263-can" or "atmel,at91sam9x5-can" + - reg: Should contain CAN controller registers location and length + - interrupts: Should contain IRQ line for the CAN controller + +Example: + + can0: can@f000c000 { + compatbile = "atmel,at91sam9x5-can"; + reg = <0xf000c000 0x300>; + interrupts = <40 4 5> + }; diff --git a/Documentation/devicetree/bindings/net/cpsw.txt b/Documentation/devicetree/bindings/net/cpsw.txt index ecfdf756d10f..4f2ca6b4a182 100644 --- a/Documentation/devicetree/bindings/net/cpsw.txt +++ b/Documentation/devicetree/bindings/net/cpsw.txt @@ -15,16 +15,22 @@ Required properties: - mac_control : Specifies Default MAC control register content for the specific platform - slaves : Specifies number for slaves -- cpts_active_slave : Specifies the slave to use for time stamping +- active_slave : Specifies the slave to use for time stamping, + ethtool and SIOCGMIIPHY - cpts_clock_mult : Numerator to convert input clock ticks into nanoseconds - cpts_clock_shift : Denominator to convert input clock ticks into nanoseconds -- phy_id : Specifies slave phy id -- mac-address : Specifies slave MAC address Optional properties: - ti,hwmods : Must be "cpgmac0" - no_bd_ram : Must be 0 or 1 - dual_emac : Specifies Switch to act as Dual EMAC + +Slave Properties: +Required properties: +- phy_id : Specifies slave phy id +- mac-address : Specifies slave MAC address + +Optional properties: - dual_emac_res_vlan : Specifies VID to be used to segregate the ports Note: "ti,hwmods" field is used to fetch the base address and irq @@ -47,7 +53,7 @@ Examples: rx_descs = <64>; mac_control = <0x20>; slaves = <2>; - cpts_active_slave = <0>; + active_slave = <0>; cpts_clock_mult = <0x80000000>; cpts_clock_shift = <29>; cpsw_emac0: slave@0 { @@ -73,7 +79,7 @@ Examples: rx_descs = <64>; mac_control = <0x20>; slaves = <2>; - cpts_active_slave = <0>; + active_slave = <0>; cpts_clock_mult = <0x80000000>; cpts_clock_shift = <29>; cpsw_emac0: slave@0 { diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt new file mode 100644 index 000000000000..49f4f7ae3f51 --- /dev/null +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt @@ -0,0 +1,91 @@ +Marvell Distributed Switch Architecture Device Tree Bindings +------------------------------------------------------------ + +Required properties: +- compatible : Should be "marvell,dsa" +- #address-cells : Must be 2, first cell is the address on the MDIO bus + and second cell is the address in the switch tree. + Second cell is used only when cascading/chaining. +- #size-cells : Must be 0 +- dsa,ethernet : Should be a phandle to a valid Ethernet device node +- dsa,mii-bus : Should be a phandle to a valid MDIO bus device node + +Optionnal properties: +- interrupts : property with a value describing the switch + interrupt number (not supported by the driver) + +A DSA node can contain multiple switch chips which are therefore child nodes of +the parent DSA node. The maximum number of allowed child nodes is 4 +(DSA_MAX_SWITCHES). +Each of these switch child nodes should have the following required properties: + +- reg : Describes the switch address on the MII bus +- #address-cells : Must be 1 +- #size-cells : Must be 0 + +A switch may have multiple "port" children nodes + +Each port children node must have the following mandatory properties: +- reg : Describes the port address in the switch +- label : Describes the label associated with this port, special + labels are "cpu" to indicate a CPU port and "dsa" to + indicate an uplink/downlink port. + +Note that a port labelled "dsa" will imply checking for the uplink phandle +described below. + +Optionnal property: +- link : Should be a phandle to another switch's DSA port. + This property is only used when switches are being + chained/cascaded together. + +Example: + + dsa@0 { + compatible = "marvell,dsa"; + #address-cells = <2>; + #size-cells = <0>; + + interrupts = <10>; + dsa,ethernet = <ðernet0>; + dsa,mii-bus = <&mii_bus0>; + + switch@0 { + #address-cells = <1>; + #size-cells = <0>; + reg = <16 0>; /* MDIO address 16, switch 0 in tree */ + + port@0 { + reg = <0>; + label = "lan1"; + }; + + port@1 { + reg = <1>; + label = "lan2"; + }; + + port@5 { + reg = <5>; + label = "cpu"; + }; + + switch0uplink: port@6 { + reg = <6>; + label = "dsa"; + link = <&switch1uplink>; + }; + }; + + switch@1 { + #address-cells = <1>; + #size-cells = <0>; + reg = <17 1>; /* MDIO address 17, switch 1 in tree */ + + switch1uplink: port@0 { + reg = <0>; + label = "dsa"; + link = <&switch0uplink>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt b/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt index 34e7aafa321c..9417e54c26c0 100644 --- a/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt +++ b/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt @@ -9,6 +9,10 @@ Required properties: - compatible: "marvell,orion-mdio" - reg: address and length of the SMI register +Optional properties: +- interrupts: interrupt line number for the SMI error/done interrupt +- clocks: Phandle to the clock control device and gate bit + The child nodes of the MDIO driver are the individual PHY devices connected to this MDIO bus. They must have a "reg" property given the PHY address on the MDIO bus. diff --git a/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt index bc50899e0c81..648d60eb9fd8 100644 --- a/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/atmel,at91-pinctrl.txt @@ -1,6 +1,6 @@ * Atmel AT91 Pinmux Controller -The AT91 Pinmux Controler, enables the IC +The AT91 Pinmux Controller, enables the IC to share one PAD to several functional blocks. The sharing is done by multiplexing the PAD input/output signals. For each PAD there are up to 8 muxing options (called periph modes). Since different modules require diff --git a/Documentation/devicetree/bindings/pinctrl/brcm,bcm2835-gpio.txt b/Documentation/devicetree/bindings/pinctrl/brcm,bcm2835-gpio.txt index 8edc20e1b09e..2569866c692f 100644 --- a/Documentation/devicetree/bindings/pinctrl/brcm,bcm2835-gpio.txt +++ b/Documentation/devicetree/bindings/pinctrl/brcm,bcm2835-gpio.txt @@ -5,7 +5,7 @@ controller, and pinmux/control device. Required properties: - compatible: "brcm,bcm2835-gpio" -- reg: Should contain the physical address of the GPIO module's registes. +- reg: Should contain the physical address of the GPIO module's registers. - gpio-controller: Marks the device node as a GPIO controller. - #gpio-cells : Should be two. The first cell is the pin number and the second cell is used to specify optional parameters: diff --git a/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt b/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt index 2c81e45f1374..08f0c3d01575 100644 --- a/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt +++ b/Documentation/devicetree/bindings/pinctrl/pinctrl-single.txt @@ -1,7 +1,9 @@ One-register-per-pin type device tree based pinctrl driver Required properties: -- compatible : "pinctrl-single" +- compatible : "pinctrl-single" or "pinconf-single". + "pinctrl-single" means that pinconf isn't supported. + "pinconf-single" means that generic pinconf is supported. - reg : offset and length of the register set for the mux registers @@ -14,9 +16,61 @@ Optional properties: - pinctrl-single,function-off : function off mode for disabled state if available and same for all registers; if not specified, disabling of pin functions is ignored + - pinctrl-single,bit-per-mux : boolean to indicate that one register controls more than one pin +- pinctrl-single,drive-strength : array of value that are used to configure + drive strength in the pinmux register. They're value of drive strength + current and drive strength mask. + + /* drive strength current, mask */ + pinctrl-single,power-source = <0x30 0xf0>; + +- pinctrl-single,bias-pullup : array of value that are used to configure the + input bias pullup in the pinmux register. + + /* input, enabled pullup bits, disabled pullup bits, mask */ + pinctrl-single,bias-pullup = <0 1 0 1>; + +- pinctrl-single,bias-pulldown : array of value that are used to configure the + input bias pulldown in the pinmux register. + + /* input, enabled pulldown bits, disabled pulldown bits, mask */ + pinctrl-single,bias-pulldown = <2 2 0 2>; + + * Two bits to control input bias pullup and pulldown: User should use + pinctrl-single,bias-pullup & pinctrl-single,bias-pulldown. One bit means + pullup, and the other one bit means pulldown. + * Three bits to control input bias enable, pullup and pulldown. User should + use pinctrl-single,bias-pullup & pinctrl-single,bias-pulldown. Input bias + enable bit should be included in pullup or pulldown bits. + * Although driver could set PIN_CONFIG_BIAS_DISABLE, there's no property as + pinctrl-single,bias-disable. Because pinctrl single driver could implement + it by calling pulldown, pullup disabled. + +- pinctrl-single,input-schmitt : array of value that are used to configure + input schmitt in the pinmux register. In some silicons, there're two input + schmitt value (rising-edge & falling-edge) in the pinmux register. + + /* input schmitt value, mask */ + pinctrl-single,input-schmitt = <0x30 0x70>; + +- pinctrl-single,input-schmitt-enable : array of value that are used to + configure input schmitt enable or disable in the pinmux register. + + /* input, enable bits, disable bits, mask */ + pinctrl-single,input-schmitt-enable = <0x30 0x40 0 0x70>; + +- pinctrl-single,gpio-range : list of value that are used to configure a GPIO + range. They're value of subnode phandle, pin base in pinctrl device, pin + number in this range, GPIO function value of this GPIO range. + The number of parameters is depend on #pinctrl-single,gpio-range-cells + property. + + /* pin base, nr pins & gpio function */ + pinctrl-single,gpio-range = <&range 0 3 0 &range 3 9 1>; + This driver assumes that there is only one register for each pin (unless the pinctrl-single,bit-per-mux is set), and uses the common pinctrl bindings as specified in the pinctrl-bindings.txt document in this directory. @@ -42,6 +96,20 @@ Where 0xdc is the offset from the pinctrl register base address for the device pinctrl register, 0x18 is the desired value, and 0xff is the sub mask to be used when applying this change to the register. + +Optional sub-node: In case some pins could be configured as GPIO in the pinmux +register, those pins could be defined as a GPIO range. This sub-node is required +by pinctrl-single,gpio-range property. + +Required properties in sub-node: +- #pinctrl-single,gpio-range-cells : the number of parameters after phandle in + pinctrl-single,gpio-range property. + + range: gpio-range { + #pinctrl-single,gpio-range-cells = <3>; + }; + + Example: /* SoC common file */ @@ -58,7 +126,7 @@ pmx_core: pinmux@4a100040 { /* second controller instance for pins in wkup domain */ pmx_wkup: pinmux@4a31e040 { - compatible = "pinctrl-single; + compatible = "pinctrl-single"; reg = <0x4a31e040 0x0038>; #address-cells = <1>; #size-cells = <0>; @@ -76,6 +144,29 @@ control_devconf0: pinmux@48002274 { pinctrl-single,function-mask = <0x5F>; }; +/* third controller instance for pins in gpio domain */ +pmx_gpio: pinmux@d401e000 { + compatible = "pinconf-single"; + reg = <0xd401e000 0x0330>; + #address-cells = <1>; + #size-cells = <1>; + ranges; + + pinctrl-single,register-width = <32>; + pinctrl-single,function-mask = <7>; + + /* sparse GPIO range could be supported */ + pinctrl-single,gpio-range = <&range 0 3 0 &range 3 9 1 + &range 12 1 0 &range 13 29 1 + &range 43 1 0 &range 44 49 1 + &range 94 1 1 &range 96 2 1>; + + range: gpio-range { + #pinctrl-single,gpio-range-cells = <3>; + }; +}; + + /* board specific .dts file */ &pmx_core { @@ -96,6 +187,15 @@ control_devconf0: pinmux@48002274 { >; }; + uart0_pins: pinmux_uart0_pins { + pinctrl-single,pins = < + 0x208 0 /* UART0_RXD (IOCFG138) */ + 0x20c 0 /* UART0_TXD (IOCFG139) */ + >; + pinctrl-single,bias-pulldown = <0 2 2>; + pinctrl-single,bias-pullup = <0 1 1>; + }; + /* map uart2 pins */ uart2_pins: pinmux_uart2_pins { pinctrl-single,pins = < @@ -122,6 +222,11 @@ control_devconf0: pinmux@48002274 { }; +&uart1 { + pinctrl-names = "default"; + pinctrl-0 = <&uart0_pins>; +}; + &uart2 { pinctrl-names = "default"; pinctrl-0 = <&uart2_pins>; diff --git a/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt b/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt index 4598a47aa0cd..c70fca146e91 100644 --- a/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt +++ b/Documentation/devicetree/bindings/pinctrl/samsung-pinctrl.txt @@ -7,6 +7,7 @@ on-chip controllers onto these pads. Required Properties: - compatible: should be one of the following. + - "samsung,s3c64xx-pinctrl": for S3C64xx-compatible pin-controller, - "samsung,exynos4210-pinctrl": for Exynos4210 compatible pin-controller. - "samsung,exynos4x12-pinctrl": for Exynos4x12 compatible pin-controller. - "samsung,exynos5250-pinctrl": for Exynos5250 compatible pin-controller. @@ -105,6 +106,8 @@ B. External Wakeup Interrupts: For supporting external wakeup interrupts, a - compatible: identifies the type of the external wakeup interrupt controller The possible values are: + - samsung,s3c64xx-wakeup-eint: represents wakeup interrupt controller + found on Samsung S3C64xx SoCs, - samsung,exynos4210-wakeup-eint: represents wakeup interrupt controller found on Samsung Exynos4210 SoC. - interrupt-parent: phandle of the interrupt parent to which the external diff --git a/Documentation/devicetree/bindings/power_supply/power_supply.txt b/Documentation/devicetree/bindings/power_supply/power_supply.txt new file mode 100644 index 000000000000..8391bfa0edac --- /dev/null +++ b/Documentation/devicetree/bindings/power_supply/power_supply.txt @@ -0,0 +1,23 @@ +Power Supply Core Support + +Optional Properties: + - power-supplies : This property is added to a supply in order to list the + devices which supply it power, referenced by their phandles. + +Example: + + usb-charger: power@e { + compatible = "some,usb-charger"; + ... + }; + + ac-charger: power@c { + compatible = "some,ac-charger"; + ... + }; + + battery@b { + compatible = "some,battery"; + ... + power-supplies = <&usb-charger>, <&ac-charger>; + }; diff --git a/Documentation/devicetree/bindings/power_supply/qnap-poweroff.txt b/Documentation/devicetree/bindings/power_supply/qnap-poweroff.txt index 9a599d27bd75..0347d8350d94 100644 --- a/Documentation/devicetree/bindings/power_supply/qnap-poweroff.txt +++ b/Documentation/devicetree/bindings/power_supply/qnap-poweroff.txt @@ -2,7 +2,7 @@ QNAP NAS devices have a microcontroller controlling the main power supply. This microcontroller is connected to UART1 of the Kirkwood and -Orion5x SoCs. Sending the charactor 'A', at 19200 baud, tells the +Orion5x SoCs. Sending the character 'A', at 19200 baud, tells the microcontroller to turn the power off. This driver adds a handler to pm_power_off which is called to turn the power off. diff --git a/Documentation/devicetree/bindings/power_supply/tps65090.txt b/Documentation/devicetree/bindings/power_supply/tps65090.txt new file mode 100644 index 000000000000..8e5e0d3910df --- /dev/null +++ b/Documentation/devicetree/bindings/power_supply/tps65090.txt @@ -0,0 +1,17 @@ +TPS65090 Frontend PMU with Switchmode Charger + +Required Properties: +-compatible: "ti,tps65090-charger" + +Optional Properties: +-ti,enable-low-current-chrg: Enables charging when a low current is detected + while the default logic is to stop charging. + +This node is a subnode of the tps65090 PMIC. + +Example: + + tps65090-charger { + compatible = "ti,tps65090-charger"; + ti,enable-low-current-chrg; + }; diff --git a/Documentation/devicetree/bindings/regulator/max8952.txt b/Documentation/devicetree/bindings/regulator/max8952.txt new file mode 100644 index 000000000000..866fcdd0f4eb --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/max8952.txt @@ -0,0 +1,52 @@ +Maxim MAX8952 voltage regulator + +Required properties: +- compatible: must be equal to "maxim,max8952" +- reg: I2C slave address, usually 0x60 +- max8952,dvs-mode-microvolt: array of 4 integer values defining DVS voltages + in microvolts. All values must be from range <770000, 1400000> +- any required generic properties defined in regulator.txt + +Optional properties: +- max8952,vid-gpios: array of two GPIO pins used for DVS voltage selection +- max8952,en-gpio: GPIO used to control enable status of regulator +- max8952,default-mode: index of default DVS voltage, from <0, 3> range +- max8952,sync-freq: sync frequency, must be one of following values: + - 0: 26 MHz + - 1: 13 MHz + - 2: 19.2 MHz + Defaults to 26 MHz if not specified. +- max8952,ramp-speed: voltage ramp speed, must be one of following values: + - 0: 32mV/us + - 1: 16mV/us + - 2: 8mV/us + - 3: 4mV/us + - 4: 2mV/us + - 5: 1mV/us + - 6: 0.5mV/us + - 7: 0.25mV/us + Defaults to 32mV/us if not specified. +- any available generic properties defined in regulator.txt + +Example: + + vdd_arm_reg: pmic@60 { + compatible = "maxim,max8952"; + reg = <0x60>; + + /* max8952-specific properties */ + max8952,vid-gpios = <&gpx0 3 0>, <&gpx0 4 0>; + max8952,en-gpio = <&gpx0 1 0>; + max8952,default-mode = <0>; + max8952,dvs-mode-microvolt = <1250000>, <1200000>, + <1050000>, <950000>; + max8952,sync-freq = <0>; + max8952,ramp-speed = <0>; + + /* generic regulator properties */ + regulator-name = "vdd_arm"; + regulator-min-microvolt = <770000>; + regulator-max-microvolt = <1400000>; + regulator-always-on; + regulator-boot-on; + }; diff --git a/Documentation/devicetree/bindings/regulator/max8997-regulator.txt b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt index 9fd69a18b0ba..9e5e51d78868 100644 --- a/Documentation/devicetree/bindings/regulator/max8997-regulator.txt +++ b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt @@ -28,7 +28,7 @@ Required properties: safe operating voltage). If either of the 'max8997,pmic-buck[1/2/5]-uses-gpio-dvs' optional - property is specified, then all the eigth voltage values for the + property is specified, then all the eight voltage values for the 'max8997,pmic-buck[1/2/5]-dvs-voltage' should be specified. Optional properties: diff --git a/Documentation/devicetree/bindings/rtc/atmel,at91rm9200-rtc.txt b/Documentation/devicetree/bindings/rtc/atmel,at91rm9200-rtc.txt new file mode 100644 index 000000000000..2a3feabd3b22 --- /dev/null +++ b/Documentation/devicetree/bindings/rtc/atmel,at91rm9200-rtc.txt @@ -0,0 +1,15 @@ +Atmel AT91RM9200 Real Time Clock + +Required properties: +- compatible: should be: "atmel,at91rm9200-rtc" +- reg: physical base address of the controller and length of memory mapped + region. +- interrupts: rtc alarm/event interrupt + +Example: + +rtc@fffffe00 { + compatible = "atmel,at91rm9200-rtc"; + reg = <0xfffffe00 0x100>; + interrupts = <1 4 7>; +}; diff --git a/Documentation/devicetree/bindings/serio/snps-arc_ps2.txt b/Documentation/devicetree/bindings/serio/snps-arc_ps2.txt new file mode 100644 index 000000000000..38c2f21e8044 --- /dev/null +++ b/Documentation/devicetree/bindings/serio/snps-arc_ps2.txt @@ -0,0 +1,16 @@ +* ARC PS/2 driver: PS/2 block used in some ARC FPGA's & nSIM OSCI model + +Required properties: +- compatible : "snps,arc_ps2" +- reg : offset and length (always 0x14) of registers +- interrupts : interrupt +- interrupt-names : name of interrupt, must be "arc_ps2_irq" + +Example: + +serio@c9000400 { + compatible = "snps,arc_ps2"; + reg = <0xc9000400 0x14>; + interrupts = <13>; + interrupt-names = "arc_ps2_irq"; +} diff --git a/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt b/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt new file mode 100644 index 000000000000..8bf89c643640 --- /dev/null +++ b/Documentation/devicetree/bindings/spi/brcm,bcm2835-spi.txt @@ -0,0 +1,22 @@ +Broadcom BCM2835 SPI0 controller + +The BCM2835 contains two forms of SPI master controller, one known simply as +SPI0, and the other known as the "Universal SPI Master"; part of the +auxilliary block. This binding applies to the SPI0 controller. + +Required properties: +- compatible: Should be "brcm,bcm2835-spi". +- reg: Should contain register location and length. +- interrupts: Should contain interrupt. +- clocks: The clock feeding the SPI controller. + +Example: + +spi@20204000 { + compatible = "brcm,bcm2835-spi"; + reg = <0x7e204000 0x1000>; + interrupts = <2 22>; + clocks = <&clk_spi>; + #address-cells = <1>; + #size-cells = <0>; +}; diff --git a/Documentation/devicetree/bindings/spi/fsl-spi.txt b/Documentation/devicetree/bindings/spi/fsl-spi.txt index 777abd7399d5..b032dd76e9d2 100644 --- a/Documentation/devicetree/bindings/spi/fsl-spi.txt +++ b/Documentation/devicetree/bindings/spi/fsl-spi.txt @@ -4,7 +4,7 @@ Required properties: - cell-index : QE SPI subblock index. 0: QE subblock SPI1 1: QE subblock SPI2 -- compatible : should be "fsl,spi". +- compatible : should be "fsl,spi" or "aeroflexgaisler,spictrl". - mode : the SPI operation mode, it can be "cpu" or "cpu-qe". - reg : Offset and length of the register set for the device - interrupts : <a b> where a is the interrupt number and b is a @@ -14,6 +14,7 @@ Required properties: controller you have. - interrupt-parent : the phandle for the interrupt controller that services interrupts for this device. +- clock-frequency : input clock frequency to non FSL_SOC cores Optional properties: - gpios : specifies the gpio pins to be used for chipselects. diff --git a/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt new file mode 100644 index 000000000000..91ff771c7e77 --- /dev/null +++ b/Documentation/devicetree/bindings/spi/nvidia,tegra114-spi.txt @@ -0,0 +1,26 @@ +NVIDIA Tegra114 SPI controller. + +Required properties: +- compatible : should be "nvidia,tegra114-spi". +- reg: Should contain SPI registers location and length. +- interrupts: Should contain SPI interrupts. +- nvidia,dma-request-selector : The Tegra DMA controller's phandle and + request selector for this SPI controller. +- This is also require clock named "spi" as per binding document + Documentation/devicetree/bindings/clock/clock-bindings.txt + +Recommended properties: +- spi-max-frequency: Definition as per + Documentation/devicetree/bindings/spi/spi-bus.txt +Example: + +spi@7000d600 { + compatible = "nvidia,tegra114-spi"; + reg = <0x7000d600 0x200>; + interrupts = <0 82 0x04>; + nvidia,dma-request-selector = <&apbdma 16>; + spi-max-frequency = <25000000>; + #address-cells = <1>; + #size-cells = <0>; + status = "disabled"; +}; diff --git a/Documentation/devicetree/bindings/spi/spi-samsung.txt b/Documentation/devicetree/bindings/spi/spi-samsung.txt index a15ffeddfba4..86aa061f069f 100644 --- a/Documentation/devicetree/bindings/spi/spi-samsung.txt +++ b/Documentation/devicetree/bindings/spi/spi-samsung.txt @@ -31,9 +31,6 @@ Required Board Specific Properties: - #address-cells: should be 1. - #size-cells: should be 0. -- gpios: The gpio specifier for clock, mosi and miso interface lines (in the - order specified). The format of the gpio specifier depends on the gpio - controller. Optional Board Specific Properties: @@ -86,9 +83,8 @@ Example: spi_0: spi@12d20000 { #address-cells = <1>; #size-cells = <0>; - gpios = <&gpa2 4 2 3 0>, - <&gpa2 6 2 3 0>, - <&gpa2 7 2 3 0>; + pinctrl-names = "default"; + pinctrl-0 = <&spi0_bus>; w25q80bw@0 { #address-cells = <1>; diff --git a/Documentation/devicetree/bindings/staging/dwc2.txt b/Documentation/devicetree/bindings/staging/dwc2.txt new file mode 100644 index 000000000000..1a1b7cfa4845 --- /dev/null +++ b/Documentation/devicetree/bindings/staging/dwc2.txt @@ -0,0 +1,15 @@ +Platform DesignWare HS OTG USB 2.0 controller +----------------------------------------------------- + +Required properties: +- compatible : "snps,dwc2" +- reg : Should contain 1 register range (address and length) +- interrupts : Should contain 1 interrupt + +Example: + + usb@101c0000 { + compatible = "ralink,rt3050-usb, snps,dwc2"; + reg = <0x101c0000 40000>; + interrupts = <18>; + }; diff --git a/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt b/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt index 07654f0338b6..8071ac20d4b3 100644 --- a/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt +++ b/Documentation/devicetree/bindings/staging/imx-drm/fsl-imx-drm.txt @@ -26,7 +26,7 @@ Required properties: - crtc: the crtc this display is connected to, see below Optional properties: - interface_pix_fmt: How this display is connected to the - crtc. Currently supported types: "rgb24", "rgb565" + crtc. Currently supported types: "rgb24", "rgb565", "bgr666" - edid: verbatim EDID data block describing attached display. - ddc: phandle describing the i2c bus handling the display data channel diff --git a/Documentation/devicetree/bindings/tty/serial/of-serial.txt b/Documentation/devicetree/bindings/tty/serial/of-serial.txt index 8f01cb190f25..1928a3e83cd0 100644 --- a/Documentation/devicetree/bindings/tty/serial/of-serial.txt +++ b/Documentation/devicetree/bindings/tty/serial/of-serial.txt @@ -33,6 +33,10 @@ Optional properties: RTAS and should not be registered. - no-loopback-test: set to indicate that the port does not implements loopback test mode +- fifo-size: the fifo size of the UART. +- auto-flow-control: one way to enable automatic flow control support. The + driver is allowed to detect support for the capability even without this + property. Example: diff --git a/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt b/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt index 5778b9c83bd8..1c04a4c9515f 100644 --- a/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt +++ b/Documentation/devicetree/bindings/usb/ci13xxx-imx.txt @@ -11,6 +11,7 @@ Optional properties: that indicate usb controller index - vbus-supply: regulator for vbus - disable-over-current: disable over current detect +- external-vbus-divider: enables off-chip resistor divider for Vbus Examples: usb@02184000 { /* USB OTG */ @@ -20,4 +21,5 @@ usb@02184000 { /* USB OTG */ fsl,usbphy = <&usbphy1>; fsl,usbmisc = <&usbmisc 0>; disable-over-current; + external-vbus-divider; }; diff --git a/Documentation/devicetree/bindings/usb/ehci-omap.txt b/Documentation/devicetree/bindings/usb/ehci-omap.txt new file mode 100644 index 000000000000..485a9a1efa7a --- /dev/null +++ b/Documentation/devicetree/bindings/usb/ehci-omap.txt @@ -0,0 +1,32 @@ +OMAP HS USB EHCI controller + +This device is usually the child of the omap-usb-host +Documentation/devicetree/bindings/mfd/omap-usb-host.txt + +Required properties: + +- compatible: should be "ti,ehci-omap" +- reg: should contain one register range i.e. start and length +- interrupts: description of the interrupt line + +Optional properties: + +- phys: list of phandles to PHY nodes. + This property is required if at least one of the ports are in + PHY mode i.e. OMAP_EHCI_PORT_MODE_PHY + +To specify the port mode, see +Documentation/devicetree/bindings/mfd/omap-usb-host.txt + +Example for OMAP4: + +usbhsehci: ehci@4a064c00 { + compatible = "ti,ehci-omap", "usb-ehci"; + reg = <0x4a064c00 0x400>; + interrupts = <0 77 0x4>; +}; + +&usbhsehci { + phys = <&hsusb1_phy 0 &hsusb3_phy>; +}; + diff --git a/Documentation/devicetree/bindings/usb/ohci-omap3.txt b/Documentation/devicetree/bindings/usb/ohci-omap3.txt new file mode 100644 index 000000000000..14ab42812a8e --- /dev/null +++ b/Documentation/devicetree/bindings/usb/ohci-omap3.txt @@ -0,0 +1,15 @@ +OMAP HS USB OHCI controller (OMAP3 and later) + +Required properties: + +- compatible: should be "ti,ohci-omap3" +- reg: should contain one register range i.e. start and length +- interrupts: description of the interrupt line + +Example for OMAP4: + +usbhsohci: ohci@4a064800 { + compatible = "ti,ohci-omap3", "usb-ohci"; + reg = <0x4a064800 0x400>; + interrupts = <0 76 0x4>; +}; diff --git a/Documentation/devicetree/bindings/usb/omap-usb.txt b/Documentation/devicetree/bindings/usb/omap-usb.txt index 1ef0ce71f8fa..662f0f1d2315 100644 --- a/Documentation/devicetree/bindings/usb/omap-usb.txt +++ b/Documentation/devicetree/bindings/usb/omap-usb.txt @@ -8,10 +8,10 @@ OMAP MUSB GLUE and disconnect. - multipoint : Should be "1" indicating the musb controller supports multipoint. This is a MUSB configuration-specific setting. - - num_eps : Specifies the number of endpoints. This is also a + - num-eps : Specifies the number of endpoints. This is also a MUSB configuration-specific setting. Should be set to "16" - - ram_bits : Specifies the ram address size. Should be set to "12" - - interface_type : This is a board specific setting to describe the type of + - ram-bits : Specifies the ram address size. Should be set to "12" + - interface-type : This is a board specific setting to describe the type of interface between the controller and the phy. It should be "0" or "1" specifying ULPI and UTMI respectively. - mode : Should be "3" to represent OTG. "1" signifies HOST and "2" @@ -29,18 +29,46 @@ usb_otg_hs: usb_otg_hs@4a0ab000 { ti,hwmods = "usb_otg_hs"; ti,has-mailbox; multipoint = <1>; - num_eps = <16>; - ram_bits = <12>; + num-eps = <16>; + ram-bits = <12>; ctrl-module = <&omap_control_usb>; }; Board specific device node entry &usb_otg_hs { - interface_type = <1>; + interface-type = <1>; mode = <3>; power = <50>; }; +OMAP DWC3 GLUE + - compatible : Should be "ti,dwc3" + - ti,hwmods : Should be "usb_otg_ss" + - reg : Address and length of the register set for the device. + - interrupts : The irq number of this device that is used to interrupt the + MPU + - #address-cells, #size-cells : Must be present if the device has sub-nodes + - utmi-mode : controls the source of UTMI/PIPE status for VBUS and OTG ID. + It should be set to "1" for HW mode and "2" for SW mode. + - ranges: the child address space are mapped 1:1 onto the parent address space + +Sub-nodes: +The dwc3 core should be added as subnode to omap dwc3 glue. +- dwc3 : + The binding details of dwc3 can be found in: + Documentation/devicetree/bindings/usb/dwc3.txt + +omap_dwc3 { + compatible = "ti,dwc3"; + ti,hwmods = "usb_otg_ss"; + reg = <0x4a020000 0x1ff>; + interrupts = <0 93 4>; + #address-cells = <1>; + #size-cells = <1>; + utmi-mode = <2>; + ranges; +}; + OMAP CONTROL USB Required properties: diff --git a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt index 033194934f64..33fd3543f3f8 100644 --- a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt +++ b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt @@ -1,20 +1,25 @@ -* Samsung's usb phy transceiver +SAMSUNG USB-PHY controllers -The Samsung's phy transceiver is used for controlling usb phy for -s3c-hsotg as well as ehci-s5p and ohci-exynos usb controllers -across Samsung SOCs. +** Samsung's usb 2.0 phy transceiver + +The Samsung's usb 2.0 phy transceiver is used for controlling +usb 2.0 phy for s3c-hsotg as well as ehci-s5p and ohci-exynos +usb controllers across Samsung SOCs. TODO: Adding the PHY binding with controller(s) according to the under -developement generic PHY driver. +development generic PHY driver. Required properties: Exynos4210: -- compatible : should be "samsung,exynos4210-usbphy" +- compatible : should be "samsung,exynos4210-usb2phy" - reg : base physical address of the phy registers and length of memory mapped region. +- clocks: Clock IDs array as required by the controller. +- clock-names: names of clock correseponding IDs clock property as requested + by the controller driver. Exynos5250: -- compatible : should be "samsung,exynos5250-usbphy" +- compatible : should be "samsung,exynos5250-usb2phy" - reg : base physical address of the phy registers and length of memory mapped region. @@ -44,12 +49,69 @@ Example: usbphy@125B0000 { #address-cells = <1>; #size-cells = <1>; - compatible = "samsung,exynos4210-usbphy"; + compatible = "samsung,exynos4210-usb2phy"; reg = <0x125B0000 0x100>; ranges; + clocks = <&clock 2>, <&clock 305>; + clock-names = "xusbxti", "otg"; + usbphy-sys { /* USB device and host PHY_CONTROL registers */ reg = <0x10020704 0x8>; }; }; + + +** Samsung's usb 3.0 phy transceiver + +Starting exynso5250, Samsung's SoC have usb 3.0 phy transceiver +which is used for controlling usb 3.0 phy for dwc3-exynos usb 3.0 +controllers across Samsung SOCs. + +Required properties: + +Exynos5250: +- compatible : should be "samsung,exynos5250-usb3phy" +- reg : base physical address of the phy registers and length of memory mapped + region. +- clocks: Clock IDs array as required by the controller. +- clock-names: names of clocks correseponding to IDs in the clock property + as requested by the controller driver. + +Optional properties: +- #address-cells: should be '1' when usbphy node has a child node with 'reg' + property. +- #size-cells: should be '1' when usbphy node has a child node with 'reg' + property. +- ranges: allows valid translation between child's address space and parent's + address space. + +- The child node 'usbphy-sys' to the node 'usbphy' is for the system controller + interface for usb-phy. It should provide the following information required by + usb-phy controller to control phy. + - reg : base physical address of PHY_CONTROL registers. + The size of this register is the total sum of size of all PHY_CONTROL + registers that the SoC has. For example, the size will be + '0x4' in case we have only one PHY_CONTROL register (e.g. + OTHERS register in S3C64XX or USB_PHY_CONTROL register in S5PV210) + and, '0x8' in case we have two PHY_CONTROL registers (e.g. + USBDEVICE_PHY_CONTROL and USBHOST_PHY_CONTROL registers in exynos4x). + and so on. + +Example: + usbphy@12100000 { + compatible = "samsung,exynos5250-usb3phy"; + reg = <0x12100000 0x100>; + #address-cells = <1>; + #size-cells = <1>; + ranges; + + clocks = <&clock 1>, <&clock 286>; + clock-names = "ext_xtal", "usbdrd30"; + + usbphy-sys { + /* USB device and host PHY_CONTROL registers */ + reg = <0x10040704 0x8>; + }; + }; diff --git a/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt b/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt new file mode 100644 index 000000000000..d7e272671c7e --- /dev/null +++ b/Documentation/devicetree/bindings/usb/usb-nop-xceiv.txt @@ -0,0 +1,34 @@ +USB NOP PHY + +Required properties: +- compatible: should be usb-nop-xceiv + +Optional properties: +- clocks: phandle to the PHY clock. Use as per Documentation/devicetree + /bindings/clock/clock-bindings.txt + This property is required if clock-frequency is specified. + +- clock-names: Should be "main_clk" + +- clock-frequency: the clock frequency (in Hz) that the PHY clock must + be configured to. + +- vcc-supply: phandle to the regulator that provides RESET to the PHY. + +- reset-supply: phandle to the regulator that provides power to the PHY. + +Example: + + hsusb1_phy { + compatible = "usb-nop-xceiv"; + clock-frequency = <19200000>; + clocks = <&osc 0>; + clock-names = "main_clk"; + vcc-supply = <&hsusb1_vcc_regulator>; + reset-supply = <&hsusb1_reset_regulator>; + }; + +hsusb1_phy is a NOP USB PHY device that gets its clock from an oscillator +and expects that clock to be configured to 19.2MHz by the NOP PHY driver. +hsusb1_vcc_regulator provides power to the PHY and hsusb1_reset_regulator +controls RESET. diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt index 19e1ef73ab0d..4d1919bf2332 100644 --- a/Documentation/devicetree/bindings/vendor-prefixes.txt +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt @@ -5,6 +5,7 @@ using them to avoid name-space collisions. ad Avionic Design GmbH adi Analog Devices, Inc. +aeroflexgaisler Aeroflex Gaisler AB ak Asahi Kasei Corp. amcc Applied Micro Circuits Corporation (APM, formally AMCC) apm Applied Micro Circuits Corporation (APM) @@ -48,6 +49,7 @@ samsung Samsung Semiconductor sbs Smart Battery System schindler Schindler sil Silicon Image +silabs Silicon Laboratories simtek sirf SiRF Technology, Inc. snps Synopsys, Inc. diff --git a/Documentation/devicetree/bindings/video/backlight/lp855x.txt b/Documentation/devicetree/bindings/video/backlight/lp855x.txt new file mode 100644 index 000000000000..1482103d288f --- /dev/null +++ b/Documentation/devicetree/bindings/video/backlight/lp855x.txt @@ -0,0 +1,41 @@ +lp855x bindings + +Required properties: + - compatible: "ti,lp8550", "ti,lp8551", "ti,lp8552", "ti,lp8553", + "ti,lp8556", "ti,lp8557" + - reg: I2C slave address (u8) + - dev-ctrl: Value of DEVICE CONTROL register (u8). It depends on the device. + +Optional properties: + - bl-name: Backlight device name (string) + - init-brt: Initial value of backlight brightness (u8) + - pwm-period: PWM period value. Set only PWM input mode used (u32) + - rom-addr: Register address of ROM area to be updated (u8) + - rom-val: Register value to be updated (u8) + +Example: + + /* LP8556 */ + backlight@2c { + compatible = "ti,lp8556"; + reg = <0x2c>; + + bl-name = "lcd-bl"; + dev-ctrl = /bits/ 8 <0x85>; + init-brt = /bits/ 8 <0x10>; + }; + + /* LP8557 */ + backlight@2c { + compatible = "ti,lp8557"; + reg = <0x2c>; + + dev-ctrl = /bits/ 8 <0x41>; + init-brt = /bits/ 8 <0x0a>; + + /* 4V OV, 4 output LED string enabled */ + rom_14h { + rom-addr = /bits/ 8 <0x14>; + rom-val = /bits/ 8 <0xcf>; + }; + }; diff --git a/Documentation/devicetree/bindings/video/backlight/tps65217-backlight.txt b/Documentation/devicetree/bindings/video/backlight/tps65217-backlight.txt new file mode 100644 index 000000000000..5fb9279ac287 --- /dev/null +++ b/Documentation/devicetree/bindings/video/backlight/tps65217-backlight.txt @@ -0,0 +1,27 @@ +TPS65217 family of regulators + +The TPS65217 chip contains a boost converter and current sinks which can be +used to drive LEDs for use as backlights. + +Required properties: +- compatible: "ti,tps65217" +- reg: I2C slave address +- backlight: node for specifying WLED1 and WLED2 lines in TPS65217 +- isel: selection bit, valid values: 1 for ISEL1 (low-level) and 2 for ISEL2 (high-level) +- fdim: PWM dimming frequency, valid values: 100, 200, 500, 1000 +- default-brightness: valid values: 0-100 + +Each regulator is defined using the standard binding for regulators. + +Example: + + tps: tps@24 { + reg = <0x24>; + compatible = "ti,tps65217"; + backlight { + isel = <1>; /* 1 - ISET1, 2 ISET2 */ + fdim = <100>; /* TPS65217_BL_FDIM_100HZ */ + default-brightness = <50>; + }; + }; + diff --git a/Documentation/devicetree/bindings/video/via,vt8500-fb.txt b/Documentation/devicetree/bindings/video/via,vt8500-fb.txt index c870b6478ec8..2871e218a0fb 100644 --- a/Documentation/devicetree/bindings/video/via,vt8500-fb.txt +++ b/Documentation/devicetree/bindings/video/via,vt8500-fb.txt @@ -5,58 +5,32 @@ Required properties: - compatible : "via,vt8500-fb" - reg : Should contain 1 register ranges(address and length) - interrupts : framebuffer controller interrupt -- display: a phandle pointing to the display node +- bits-per-pixel : bit depth of framebuffer (16 or 32) -Required nodes: -- display: a display node is required to initialize the lcd panel - This should be in the board dts. -- default-mode: a videomode within the display with timing parameters - as specified below. +Required subnodes: +- display-timings: see display-timing.txt for information Example: - fb@d800e400 { + fb@d8050800 { compatible = "via,vt8500-fb"; reg = <0xd800e400 0x400>; interrupts = <12>; - display = <&display>; - default-mode = <&mode0>; - }; - -VIA VT8500 Display ------------------------------------------------------ -Required properties (as per of_videomode_helper): - - - hactive, vactive: Display resolution - - hfront-porch, hback-porch, hsync-len: Horizontal Display timing parameters - in pixels - vfront-porch, vback-porch, vsync-len: Vertical display timing parameters in - lines - - clock: displayclock in Hz - - bpp: lcd panel bit-depth. - <16> for RGB565, <32> for RGB888 - -Optional properties (as per of_videomode_helper): - - width-mm, height-mm: Display dimensions in mm - - hsync-active-high (bool): Hsync pulse is active high - - vsync-active-high (bool): Vsync pulse is active high - - interlaced (bool): This is an interlaced mode - - doublescan (bool): This is a doublescan mode + bits-per-pixel = <16>; -Example: - display: display@0 { - modes { - mode0: mode@0 { + display-timings { + native-mode = <&timing0>; + timing0: 800x480 { + clock-frequency = <0>; /* unused but required */ hactive = <800>; vactive = <480>; - hback-porch = <88>; hfront-porch = <40>; + hback-porch = <88>; hsync-len = <0>; vback-porch = <32>; vfront-porch = <11>; vsync-len = <1>; - clock = <0>; /* unused but required */ - bpp = <16>; /* non-standard but required */ }; }; }; + diff --git a/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt b/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt index 3d325e1d11ee..0bcadb2840a5 100644 --- a/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt +++ b/Documentation/devicetree/bindings/video/wm,wm8505-fb.txt @@ -4,20 +4,30 @@ Wondermedia WM8505 Framebuffer Required properties: - compatible : "wm,wm8505-fb" - reg : Should contain 1 register ranges(address and length) -- via,display: a phandle pointing to the display node +- bits-per-pixel : bit depth of framebuffer (16 or 32) -Required nodes: -- display: a display node is required to initialize the lcd panel - This should be in the board dts. See definition in - Documentation/devicetree/bindings/video/via,vt8500-fb.txt -- default-mode: a videomode node as specified in - Documentation/devicetree/bindings/video/via,vt8500-fb.txt +Required subnodes: +- display-timings: see display-timing.txt for information Example: - fb@d8050800 { + fb@d8051700 { compatible = "wm,wm8505-fb"; - reg = <0xd8050800 0x200>; - display = <&display>; - default-mode = <&mode0>; + reg = <0xd8051700 0x200>; + bits-per-pixel = <16>; + + display-timings { + native-mode = <&timing0>; + timing0: 800x480 { + clock-frequency = <0>; /* unused but required */ + hactive = <800>; + vactive = <480>; + hfront-porch = <40>; + hback-porch = <88>; + hsync-len = <0>; + vback-porch = <32>; + vfront-porch = <11>; + vsync-len = <1>; + }; + }; }; diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt index 4966b1be42ac..0b23261561d2 100644 --- a/Documentation/dma-buf-sharing.txt +++ b/Documentation/dma-buf-sharing.txt @@ -52,14 +52,23 @@ The dma_buf buffer sharing API usage contains the following steps: associated with this buffer. Interface: - struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops, - size_t size, int flags) + struct dma_buf *dma_buf_export_named(void *priv, struct dma_buf_ops *ops, + size_t size, int flags, + const char *exp_name) If this succeeds, dma_buf_export allocates a dma_buf structure, and returns a pointer to the same. It also associates an anonymous file with this buffer, so it can be exported. On failure to allocate the dma_buf object, it returns NULL. + 'exp_name' is the name of exporter - to facilitate information while + debugging. + + Exporting modules which do not wish to provide any specific name may use the + helper define 'dma_buf_export()', with the same arguments as above, but + without the last argument; a __FILE__ pre-processor directive will be + inserted in place of 'exp_name' instead. + 2. Userspace gets a handle to pass around to potential buffer-users Userspace entity requests for a file-descriptor (fd) which is a handle to the diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 34ea4f1fa6ea..f7cbf574a875 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -494,6 +494,17 @@ Files in /sys/fs/ext4/<devname> session_write_kbytes This file is read-only and shows the number of kilobytes of data that have been written to this filesystem since it was mounted. + + reserved_clusters This is RW file and contains number of reserved + clusters in the file system which will be used + in the specific situations to avoid costly + zeroout, unexpected ENOSPC, or possible data + loss. The default is 2% or 4096 clusters, + whichever is smaller and this can be changed + however it can never exceed number of clusters + in the file system. If there is not enough space + for the reserved space when mounting the file + mount will _not_ fail. .............................................................................. Ioctls @@ -587,6 +598,16 @@ Table of Ext4 specific ioctls bitmaps and inode table, the userspace tool thus just passes the new number of blocks. +EXT4_IOC_SWAP_BOOT Swap i_blocks and associated attributes + (like i_blocks, i_size, i_flags, ...) from + the specified inode with inode + EXT4_BOOT_LOADER_INO (#5). This is typically + used to store a boot loader in a secure part of + the filesystem, where it can't be changed by a + normal user by accident. + The data blocks of the previous boot loader + will be associated with the given inode. + .............................................................................. References diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt index d230dd9c99b0..4a93e98b290a 100644 --- a/Documentation/filesystems/vfat.txt +++ b/Documentation/filesystems/vfat.txt @@ -150,12 +150,28 @@ discard -- If set, issues discard/TRIM commands to the block device when blocks are freed. This is useful for SSD devices and sparse/thinly-provisoned LUNs. -nfs -- This option maintains an index (cache) of directory - inodes by i_logstart which is used by the nfs-related code to - improve look-ups. +nfs=stale_rw|nostale_ro + Enable this only if you want to export the FAT filesystem + over NFS. + + stale_rw: This option maintains an index (cache) of directory + inodes by i_logstart which is used by the nfs-related code to + improve look-ups. Full file operations (read/write) over NFS is + supported but with cache eviction at NFS server, this could + result in ESTALE issues. + + nostale_ro: This option bases the inode number and filehandle + on the on-disk location of a file in the MS-DOS directory entry. + This ensures that ESTALE will not be returned after a file is + evicted from the inode cache. However, it means that operations + such as rename, create and unlink could cause filehandles that + previously pointed at one file to point at a different file, + potentially causing data corruption. For this reason, this + option also mounts the filesystem readonly. + + To maintain backward compatibility, '-o nfs' is also accepted, + defaulting to stale_rw - Enable this only if you want to export the FAT filesystem - over NFS <bool>: 0,1,yes,no,true,false diff --git a/Documentation/hwmon/ab8500 b/Documentation/hwmon/ab8500 new file mode 100644 index 000000000000..cf169c8ef4e3 --- /dev/null +++ b/Documentation/hwmon/ab8500 @@ -0,0 +1,22 @@ +Kernel driver ab8500 +==================== + +Supported chips: + * ST-Ericsson AB8500 + Prefix: 'ab8500' + Addresses scanned: - + Datasheet: http://www.stericsson.com/developers/documentation.jsp + +Authors: + Martin Persson <martin.persson@stericsson.com> + Hongbo Zhang <hongbo.zhang@linaro.org> + +Description +----------- + +See also Documentation/hwmon/abx500. This is the ST-Ericsson AB8500 specific +driver. + +Currently only the AB8500 internal sensor and one external sensor for battery +temperature are monitored. Other GPADC channels can also be monitored if needed +in future. diff --git a/Documentation/hwmon/abx500 b/Documentation/hwmon/abx500 new file mode 100644 index 000000000000..319a058cec7c --- /dev/null +++ b/Documentation/hwmon/abx500 @@ -0,0 +1,28 @@ +Kernel driver abx500 +==================== + +Supported chips: + * ST-Ericsson ABx500 series + Prefix: 'abx500' + Addresses scanned: - + Datasheet: http://www.stericsson.com/developers/documentation.jsp + +Authors: + Martin Persson <martin.persson@stericsson.com> + Hongbo Zhang <hongbo.zhang@linaro.org> + +Description +----------- + +Every ST-Ericsson Ux500 SOC consists of both ABx500 and DBx500 physically, +this is kernel hwmon driver for ABx500. + +There are some GPADCs inside ABx500 which are designed for connecting to +thermal sensors, and there is also a thermal sensor inside ABx500 too, which +raises interrupt when critical temperature reached. + +This abx500 is a common layer which can monitor all of the sensors, every +specific abx500 chip has its special configurations in its own file, e.g. some +sensors can be configured invisible if they are not available on that chip, and +the corresponding gpadc_addr should be set to 0, thus this sensor won't be +polled. diff --git a/Documentation/hwmon/adt7410 b/Documentation/hwmon/adt7410 index 58150c480e56..9817941e5f19 100644 --- a/Documentation/hwmon/adt7410 +++ b/Documentation/hwmon/adt7410 @@ -12,29 +12,42 @@ Supported chips: Addresses scanned: None Datasheet: Publicly available at the Analog Devices website http://www.analog.com/static/imported-files/data_sheets/ADT7420.pdf + * Analog Devices ADT7310 + Prefix: 'adt7310' + Addresses scanned: None + Datasheet: Publicly available at the Analog Devices website + http://www.analog.com/static/imported-files/data_sheets/ADT7310.pdf + * Analog Devices ADT7320 + Prefix: 'adt7320' + Addresses scanned: None + Datasheet: Publicly available at the Analog Devices website + http://www.analog.com/static/imported-files/data_sheets/ADT7320.pdf Author: Hartmut Knaack <knaack.h@gmx.de> Description ----------- -The ADT7410 is a temperature sensor with rated temperature range of -55°C to -+150°C. It has a high accuracy of +/-0.5°C and can be operated at a resolution -of 13 bits (0.0625°C) or 16 bits (0.0078°C). The sensor provides an INT pin to -indicate that a minimum or maximum temperature set point has been exceeded, as -well as a critical temperature (CT) pin to indicate that the critical -temperature set point has been exceeded. Both pins can be set up with a common -hysteresis of 0°C - 15°C and a fault queue, ranging from 1 to 4 events. Both -pins can individually set to be active-low or active-high, while the whole -device can either run in comparator mode or interrupt mode. The ADT7410 -supports continous temperature sampling, as well as sampling one temperature -value per second or even justget one sample on demand for power saving. -Besides, it can completely power down its ADC, if power management is -required. - -The ADT7420 is register compatible, the only differences being the package, -a slightly narrower operating temperature range (-40°C to +150°C), and a -better accuracy (0.25°C instead of 0.50°C.) +The ADT7310/ADT7410 is a temperature sensor with rated temperature range of +-55°C to +150°C. It has a high accuracy of +/-0.5°C and can be operated at a +resolution of 13 bits (0.0625°C) or 16 bits (0.0078°C). The sensor provides an +INT pin to indicate that a minimum or maximum temperature set point has been +exceeded, as well as a critical temperature (CT) pin to indicate that the +critical temperature set point has been exceeded. Both pins can be set up with a +common hysteresis of 0°C - 15°C and a fault queue, ranging from 1 to 4 events. +Both pins can individually set to be active-low or active-high, while the whole +device can either run in comparator mode or interrupt mode. The ADT7410 supports +continuous temperature sampling, as well as sampling one temperature value per +second or even just get one sample on demand for power saving. Besides, it can +completely power down its ADC, if power management is required. + +The ADT7320/ADT7420 is register compatible, the only differences being the +package, a slightly narrower operating temperature range (-40°C to +150°C), and +a better accuracy (0.25°C instead of 0.50°C.) + +The difference between the ADT7310/ADT7320 and ADT7410/ADT7420 is the control +interface, the ADT7310 and ADT7320 use SPI while the ADT7410 and ADT7420 use +I2C. Configuration Notes ------------------- diff --git a/Documentation/hwmon/lm25066 b/Documentation/hwmon/lm25066 index 26025e419d35..c1b57d72efc3 100644 --- a/Documentation/hwmon/lm25066 +++ b/Documentation/hwmon/lm25066 @@ -1,7 +1,13 @@ -Kernel driver max8688 +Kernel driver lm25066 ===================== Supported chips: + * TI LM25056 + Prefix: 'lm25056' + Addresses scanned: - + Datasheets: + http://www.ti.com/lit/gpn/lm25056 + http://www.ti.com/lit/gpn/lm25056a * National Semiconductor LM25066 Prefix: 'lm25066' Addresses scanned: - @@ -25,8 +31,9 @@ Author: Guenter Roeck <linux@roeck-us.net> Description ----------- -This driver supports hardware montoring for National Semiconductor LM25066, -LM5064, and LM5064 Power Management, Monitoring, Control, and Protection ICs. +This driver supports hardware montoring for National Semiconductor / TI LM25056, +LM25066, LM5064, and LM5064 Power Management, Monitoring, Control, and +Protection ICs. The driver is a client driver to the core PMBus driver. Please see Documentation/hwmon/pmbus for details on PMBus client drivers. @@ -60,14 +67,19 @@ in1_max Maximum input voltage. in1_min_alarm Input voltage low alarm. in1_max_alarm Input voltage high alarm. -in2_label "vout1" -in2_input Measured output voltage. -in2_average Average measured output voltage. -in2_min Minimum output voltage. -in2_min_alarm Output voltage low alarm. - -in3_label "vout2" -in3_input Measured voltage on vaux pin +in2_label "vmon" +in2_input Measured voltage on VAUX pin +in2_min Minimum VAUX voltage (LM25056 only). +in2_max Maximum VAUX voltage (LM25056 only). +in2_min_alarm VAUX voltage low alarm (LM25056 only). +in2_max_alarm VAUX voltage high alarm (LM25056 only). + +in3_label "vout1" + Not supported on LM25056. +in3_input Measured output voltage. +in3_average Average measured output voltage. +in3_min Minimum output voltage. +in3_min_alarm Output voltage low alarm. curr1_label "iin" curr1_input Measured input current. diff --git a/Documentation/hwmon/lm95234 b/Documentation/hwmon/lm95234 new file mode 100644 index 000000000000..a0e95ddfd372 --- /dev/null +++ b/Documentation/hwmon/lm95234 @@ -0,0 +1,36 @@ +Kernel driver lm95234 +===================== + +Supported chips: + * National Semiconductor / Texas Instruments LM95234 + Addresses scanned: I2C 0x18, 0x4d, 0x4e + Datasheet: Publicly available at the Texas Instruments website + http://www.ti.com/product/lm95234 + + +Author: Guenter Roeck <linux@roeck-us.net> + +Description +----------- + +LM95234 is an 11-bit digital temperature sensor with a 2-wire System Management +Bus (SMBus) interface and TrueTherm technology that can very accurately monitor +the temperature of four remote diodes as well as its own temperature. +The four remote diodes can be external devices such as microprocessors, +graphics processors or diode-connected 2N3904s. The LM95234's TruTherm +beta compensation technology allows sensing of 90 nm or 65 nm process +thermal diodes accurately. + +All temperature values are given in millidegrees Celsius. Temperature +is provided within a range of -127 to +255 degrees (+127.875 degrees for +the internal sensor). Resolution depends on temperature input and range. + +Each sensor has its own maximum limit, but the hysteresis is common to all +channels. The hysteresis is configurable with the tem1_max_hyst attribute and +affects the hysteresis on all channels. The first two external sensors also +have a critical limit. + +The lm95234 driver can change its update interval to a fixed set of values. +It will round up to the next selectable interval. See the datasheet for exact +values. Reading sensor values more often will do no harm, but will return +'old' values. diff --git a/Documentation/hwmon/ltc2978 b/Documentation/hwmon/ltc2978 index e4d75c606c97..dc0d08c61305 100644 --- a/Documentation/hwmon/ltc2978 +++ b/Documentation/hwmon/ltc2978 @@ -2,6 +2,10 @@ Kernel driver ltc2978 ===================== Supported chips: + * Linear Technology LTC2974 + Prefix: 'ltc2974' + Addresses scanned: - + Datasheet: http://www.linear.com/product/ltc2974 * Linear Technology LTC2978 Prefix: 'ltc2978' Addresses scanned: - @@ -10,6 +14,10 @@ Supported chips: Prefix: 'ltc3880' Addresses scanned: - Datasheet: http://www.linear.com/product/ltc3880 + * Linear Technology LTC3883 + Prefix: 'ltc3883' + Addresses scanned: - + Datasheet: http://www.linear.com/product/ltc3883 Author: Guenter Roeck <linux@roeck-us.net> @@ -17,9 +25,9 @@ Author: Guenter Roeck <linux@roeck-us.net> Description ----------- -The LTC2978 is an octal power supply monitor, supervisor, sequencer and -margin controller. The LTC3880 is a dual, PolyPhase DC/DC synchronous -step-down switching regulator controller. +LTC2974 is a quad digital power supply manager. LTC2978 is an octal power supply +monitor. LTC3880 is a dual output poly-phase step-down DC/DC controller. LTC3883 +is a single phase step-down DC/DC controller. Usage Notes @@ -41,63 +49,90 @@ Sysfs attributes in1_label "vin" in1_input Measured input voltage. in1_min Minimum input voltage. -in1_max Maximum input voltage. -in1_lcrit Critical minimum input voltage. +in1_max Maximum input voltage. LTC2974 and LTC2978 only. +in1_lcrit Critical minimum input voltage. LTC2974 and LTC2978 + only. in1_crit Critical maximum input voltage. in1_min_alarm Input voltage low alarm. -in1_max_alarm Input voltage high alarm. -in1_lcrit_alarm Input voltage critical low alarm. +in1_max_alarm Input voltage high alarm. LTC2974 and LTC2978 only. +in1_lcrit_alarm Input voltage critical low alarm. LTC2974 and LTC2978 + only. in1_crit_alarm Input voltage critical high alarm. -in1_lowest Lowest input voltage. LTC2978 only. +in1_lowest Lowest input voltage. LTC2974 and LTC2978 only. in1_highest Highest input voltage. -in1_reset_history Reset history. Writing into this attribute will reset - history for all attributes. - -in[2-9]_label "vout[1-8]". Channels 3 to 9 on LTC2978 only. -in[2-9]_input Measured output voltage. -in[2-9]_min Minimum output voltage. -in[2-9]_max Maximum output voltage. -in[2-9]_lcrit Critical minimum output voltage. -in[2-9]_crit Critical maximum output voltage. -in[2-9]_min_alarm Output voltage low alarm. -in[2-9]_max_alarm Output voltage high alarm. -in[2-9]_lcrit_alarm Output voltage critical low alarm. -in[2-9]_crit_alarm Output voltage critical high alarm. -in[2-9]_lowest Lowest output voltage. LTC2978 only. -in[2-9]_highest Lowest output voltage. -in[2-9]_reset_history Reset history. Writing into this attribute will reset - history for all attributes. - -temp[1-3]_input Measured temperature. +in1_reset_history Reset input voltage history. + +in[N]_label "vout[1-8]". + LTC2974: N=2-5 + LTC2978: N=2-9 + LTC3880: N=2-3 + LTC3883: N=2 +in[N]_input Measured output voltage. +in[N]_min Minimum output voltage. +in[N]_max Maximum output voltage. +in[N]_lcrit Critical minimum output voltage. +in[N]_crit Critical maximum output voltage. +in[N]_min_alarm Output voltage low alarm. +in[N]_max_alarm Output voltage high alarm. +in[N]_lcrit_alarm Output voltage critical low alarm. +in[N]_crit_alarm Output voltage critical high alarm. +in[N]_lowest Lowest output voltage. LTC2974 and LTC2978 only. +in[N]_highest Highest output voltage. +in[N]_reset_history Reset output voltage history. + +temp[N]_input Measured temperature. + On LTC2974, temp[1-4] report external temperatures, + and temp5 reports the chip temperature. On LTC2978, only one temperature measurement is - supported and reflects the internal temperature. + supported and reports the chip temperature. On LTC3880, temp1 and temp2 report external - temperatures, and temp3 reports the internal - temperature. -temp[1-3]_min Mimimum temperature. -temp[1-3]_max Maximum temperature. -temp[1-3]_lcrit Critical low temperature. -temp[1-3]_crit Critical high temperature. -temp[1-3]_min_alarm Chip temperature low alarm. -temp[1-3]_max_alarm Chip temperature high alarm. -temp[1-3]_lcrit_alarm Chip temperature critical low alarm. -temp[1-3]_crit_alarm Chip temperature critical high alarm. -temp[1-3]_lowest Lowest measured temperature. LTC2978 only. -temp[1-3]_highest Highest measured temperature. -temp[1-3]_reset_history Reset history. Writing into this attribute will reset - history for all attributes. - -power[1-2]_label "pout[1-2]". LTC3880 only. -power[1-2]_input Measured power. - -curr1_label "iin". LTC3880 only. + temperatures, and temp3 reports the chip temperature. + On LTC3883, temp1 reports an external temperature, + and temp2 reports the chip temperature. +temp[N]_min Mimimum temperature. LTC2974 and LTC2978 only. +temp[N]_max Maximum temperature. +temp[N]_lcrit Critical low temperature. +temp[N]_crit Critical high temperature. +temp[N]_min_alarm Temperature low alarm. LTC2974 and LTC2978 only. +temp[N]_max_alarm Temperature high alarm. +temp[N]_lcrit_alarm Temperature critical low alarm. +temp[N]_crit_alarm Temperature critical high alarm. +temp[N]_lowest Lowest measured temperature. LTC2974 and LTC2978 only. + Not supported for chip temperature sensor on LTC2974. +temp[N]_highest Highest measured temperature. Not supported for chip + temperature sensor on LTC2974. +temp[N]_reset_history Reset temperature history. Not supported for chip + temperature sensor on LTC2974. + +power1_label "pin". LTC3883 only. +power1_input Measured input power. + +power[N]_label "pout[1-4]". + LTC2974: N=1-4 + LTC2978: Not supported + LTC3880: N=1-2 + LTC3883: N=2 +power[N]_input Measured output power. + +curr1_label "iin". LTC3880 and LTC3883 only. curr1_input Measured input current. curr1_max Maximum input current. curr1_max_alarm Input current high alarm. - -curr[2-3]_label "iout[1-2]". LTC3880 only. -curr[2-3]_input Measured input current. -curr[2-3]_max Maximum input current. -curr[2-3]_crit Critical input current. -curr[2-3]_max_alarm Input current high alarm. -curr[2-3]_crit_alarm Input current critical high alarm. +curr1_highest Highest input current. LTC3883 only. +curr1_reset_history Reset input current history. LTC3883 only. + +curr[N]_label "iout[1-4]". + LTC2974: N=1-4 + LTC2978: not supported + LTC3880: N=2-3 + LTC3883: N=2 +curr[N]_input Measured output current. +curr[N]_max Maximum output current. +curr[N]_crit Critical high output current. +curr[N]_lcrit Critical low output current. LTC2974 only. +curr[N]_max_alarm Output current high alarm. +curr[N]_crit_alarm Output current critical high alarm. +curr[N]_lcrit_alarm Output current critical low alarm. LTC2974 only. +curr[N]_lowest Lowest output current. LTC2974 only. +curr[N]_highest Highest output current. +curr[N]_reset_history Reset output current history. diff --git a/Documentation/hwmon/nct6775 b/Documentation/hwmon/nct6775 new file mode 100644 index 000000000000..4e9ef60e8c6c --- /dev/null +++ b/Documentation/hwmon/nct6775 @@ -0,0 +1,188 @@ +Note +==== + +This driver supersedes the NCT6775F and NCT6776F support in the W83627EHF +driver. + +Kernel driver NCT6775 +===================== + +Supported chips: + * Nuvoton NCT5572D/NCT6771F/NCT6772F/NCT6775F/W83677HG-I + Prefix: 'nct6775' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: Available from Nuvoton upon request + * Nuvoton NCT5577D/NCT6776D/NCT6776F + Prefix: 'nct6776' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: Available from Nuvoton upon request + * Nuvoton NCT5532D/NCT6779D + Prefix: 'nct6779' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: Available from Nuvoton upon request + +Authors: + Guenter Roeck <linux@roeck-us.net> + +Description +----------- + +This driver implements support for the Nuvoton NCT6775F, NCT6776F, and NCT6779D +and compatible super I/O chips. + +The chips support up to 25 temperature monitoring sources. Up to 6 of those are +direct temperature sensor inputs, the others are special sources such as PECI, +PCH, and SMBUS. Depending on the chip type, 2 to 6 of the temperature sources +can be monitored and compared against minimum, maximum, and critical +temperatures. The driver reports up to 10 of the temperatures to the user. +There are 4 to 5 fan rotation speed sensors, 8 to 15 analog voltage sensors, +one VID, alarms with beep warnings (control unimplemented), and some automatic +fan regulation strategies (plus manual fan control mode). + +The temperature sensor sources on all chips are configurable. The configured +source for each of the temperature sensors is provided in tempX_label. + +Temperatures are measured in degrees Celsius and measurement resolution is +either 1 degC or 0.5 degC, depending on the temperature source and +configuration. An alarm is triggered when the temperature gets higher than +the high limit; it stays on until the temperature falls below the hysteresis +value. Alarms are only supported for temp1 to temp6, depending on the chip type. + +Fan rotation speeds are reported in RPM (rotations per minute). An alarm is +triggered if the rotation speed has dropped below a programmable limit. On +NCT6775F, fan readings can be divided by a programmable divider (1, 2, 4, 8, +16, 32, 64 or 128) to give the readings more range or accuracy; the other chips +do not have a fan speed divider. The driver sets the most suitable fan divisor +itself; specifically, it increases the divider value each time a fan speed +reading returns an invalid value, and it reduces it if the fan speed reading +is lower than optimal. Some fans might not be present because they share pins +with other functions. + +Voltage sensors (also known as IN sensors) report their values in millivolts. +An alarm is triggered if the voltage has crossed a programmable minimum +or maximum limit. + +The driver supports automatic fan control mode known as Thermal Cruise. +In this mode, the chip attempts to keep the measured temperature in a +predefined temperature range. If the temperature goes out of range, fan +is driven slower/faster to reach the predefined range again. + +The mode works for fan1-fan5. + +sysfs attributes +---------------- + +pwm[1-5] - this file stores PWM duty cycle or DC value (fan speed) in range: + 0 (lowest speed) to 255 (full) + +pwm[1-5]_enable - this file controls mode of fan/temperature control: + * 0 Fan control disabled (fans set to maximum speed) + * 1 Manual mode, write to pwm[0-5] any value 0-255 + * 2 "Thermal Cruise" mode + * 3 "Fan Speed Cruise" mode + * 4 "Smart Fan III" mode (NCT6775F only) + * 5 "Smart Fan IV" mode + +pwm[1-5]_mode - controls if output is PWM or DC level + * 0 DC output + * 1 PWM output + +Common fan control attributes +----------------------------- + +pwm[1-5]_temp_sel Temperature source. Value is temperature sensor index. + For example, select '1' for temp1_input. +pwm[1-5]_weight_temp_sel + Secondary temperature source. Value is temperature + sensor index. For example, select '1' for temp1_input. + Set to 0 to disable secondary temperature control. + +If secondary temperature functionality is enabled, it is controlled with the +following attributes. + +pwm[1-5]_weight_duty_step + Duty step size. +pwm[1-5]_weight_temp_step + Temperature step size. With each step over + temp_step_base, the value of weight_duty_step is added + to the current pwm value. +pwm[1-5]_weight_temp_step_base + Temperature at which secondary temperature control kicks + in. +pwm[1-5]_weight_temp_step_tol + Temperature step tolerance. + +Thermal Cruise mode (2) +----------------------- + +If the temperature is in the range defined by: + +pwm[1-5]_target_temp Target temperature, unit millidegree Celsius + (range 0 - 127000) +pwm[1-5]_temp_tolerance + Target temperature tolerance, unit millidegree Celsius + +there are no changes to fan speed. Once the temperature leaves the interval, fan +speed increases (if temperature is higher that desired) or decreases (if +temperature is lower than desired), using the following limits and time +intervals. + +pwm[1-5]_start fan pwm start value (range 1 - 255), to start fan + when the temperature is above defined range. +pwm[1-5]_floor lowest fan pwm (range 0 - 255) if temperature is below + the defined range. If set to 0, the fan is expected to + stop if the temperature is below the defined range. +pwm[1-5]_step_up_time milliseconds before fan speed is increased +pwm[1-5]_step_down_time milliseconds before fan speed is decreased +pwm[1-5]_stop_time how many milliseconds must elapse to switch + corresponding fan off (when the temperature was below + defined range). + +Speed Cruise mode (3) +--------------------- + +This modes tries to keep the fan speed constant. + +fan[1-5]_target Target fan speed +fan[1-5]_tolerance + Target speed tolerance + + +Untested; use at your own risk. + +Smart Fan IV mode (5) +--------------------- + +This mode offers multiple slopes to control the fan speed. The slopes can be +controlled by setting the pwm and temperature attributes. When the temperature +rises, the chip will calculate the DC/PWM output based on the current slope. +There are up to seven data points depending on the chip type. Subsequent data +points should be set to higher temperatures and higher pwm values to achieve +higher fan speeds with increasing temperature. The last data point reflects +critical temperature mode, in which the fans should run at full speed. + +pwm[1-5]_auto_point[1-7]_pwm + pwm value to be set if temperature reaches matching + temperature range. +pwm[1-5]_auto_point[1-7]_temp + Temperature over which the matching pwm is enabled. +pwm[1-5]_temp_tolerance + Temperature tolerance, unit millidegree Celsius +pwm[1-5]_crit_temp_tolerance + Temperature tolerance for critical temperature, + unit millidegree Celsius + +pwm[1-5]_step_up_time milliseconds before fan speed is increased +pwm[1-5]_step_down_time milliseconds before fan speed is decreased + +Usage Notes +----------- + +On various ASUS boards with NCT6776F, it appears that CPUTIN is not really +connected to anything and floats, or that it is connected to some non-standard +temperature measurement device. As a result, the temperature reported on CPUTIN +will not reflect a usable value. It often reports unreasonably high +temperatures, and in some cases the reported temperature declines if the actual +temperature increases (similar to the raw PECI temperature value - see PECI +specification for details). CPUTIN should therefore be be ignored on ASUS +boards. The CPU temperature on ASUS boards is reported from PECI 0. diff --git a/Documentation/hwmon/sht15 b/Documentation/hwmon/sht15 index 02850bdfac18..778987d1856f 100644 --- a/Documentation/hwmon/sht15 +++ b/Documentation/hwmon/sht15 @@ -40,7 +40,7 @@ bits for humidity, or 12 bits for temperature and 8 bits for humidity. The humidity calibration coefficients are programmed into an OTP memory on the chip. These coefficients are used to internally calibrate the signals from the sensors. Disabling the reload of those coefficients allows saving 10ms for each -measurement and decrease power consumption, while loosing on precision. +measurement and decrease power consumption, while losing on precision. Some options may be set directly in the sht15_platform_data structure or via sysfs attributes. diff --git a/Documentation/hwmon/tmp401 b/Documentation/hwmon/tmp401 index 9fc447249212..f91e3fa7e5ec 100644 --- a/Documentation/hwmon/tmp401 +++ b/Documentation/hwmon/tmp401 @@ -8,8 +8,16 @@ Supported chips: Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html * Texas Instruments TMP411 Prefix: 'tmp411' - Addresses scanned: I2C 0x4c + Addresses scanned: I2C 0x4c, 0x4d, 0x4e Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html + * Texas Instruments TMP431 + Prefix: 'tmp431' + Addresses scanned: I2C 0x4c, 0x4d + Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp431.html + * Texas Instruments TMP432 + Prefix: 'tmp432' + Addresses scanned: I2C 0x4c, 0x4d + Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp432.html Authors: Hans de Goede <hdegoede@redhat.com> @@ -18,19 +26,19 @@ Authors: Description ----------- -This driver implements support for Texas Instruments TMP401 and -TMP411 chips. These chips implements one remote and one local -temperature sensor. Temperature is measured in degrees +This driver implements support for Texas Instruments TMP401, TMP411, +TMP431, and TMP432 chips. These chips implement one or two remote and +one local temperature sensors. Temperature is measured in degrees Celsius. Resolution of the remote sensor is 0.0625 degree. Local sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not supported by the driver so far, so using the default resolution of 0.5 degree). The driver provides the common sysfs-interface for temperatures (see -/Documentation/hwmon/sysfs-interface under Temperatures). +Documentation/hwmon/sysfs-interface under Temperatures). -The TMP411 chip is compatible with TMP401. It provides some additional -features. +The TMP411 and TMP431 chips are compatible with TMP401. TMP411 provides +some additional features. * Minimum and Maximum temperature measured since power-on, chip-reset @@ -40,3 +48,6 @@ features. Exported via sysfs attribute temp_reset_history. Writing 1 to this file triggers a reset. + +TMP432 is compatible with TMP401 and TMP431. It supports two external +temperature sensors. diff --git a/Documentation/hwmon/zl6100 b/Documentation/hwmon/zl6100 index 756b57c6b73e..33908a4d68ff 100644 --- a/Documentation/hwmon/zl6100 +++ b/Documentation/hwmon/zl6100 @@ -125,7 +125,7 @@ in2_label "vmon" in2_input Measured voltage on VMON (ZL2004) or VDRV (ZL9101M, ZL9117M) pin. Reported voltage is 16x the voltage on the pin (adjusted internally by the chip). -in2_lcrit Critical minumum VMON/VDRV Voltage. +in2_lcrit Critical minimum VMON/VDRV Voltage. in2_crit Critical maximum VMON/VDRV voltage. in2_lcrit_alarm VMON/VDRV voltage critical low alarm. in2_crit_alarm VMON/VDRV voltage critical high alarm. diff --git a/Documentation/ia64/err_inject.txt b/Documentation/ia64/err_inject.txt index 223e4f0582d0..9f651c181429 100644 --- a/Documentation/ia64/err_inject.txt +++ b/Documentation/ia64/err_inject.txt @@ -882,7 +882,7 @@ int err_inj() cpu=parameters[i].cpu; k = cpu%64; j = cpu/64; - mask[j]=1<<k; + mask[j] = 1UL << k; if (sched_setaffinity(0, MASK_SIZE*8, mask)==-1) { perror("Error sched_setaffinity:"); diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 3210540f8bd3..237acab169dd 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt @@ -131,6 +131,7 @@ Code Seq#(hex) Include File Comments 'H' 40-4F sound/hdspm.h conflict! 'H' 40-4F sound/hdsp.h conflict! 'H' 90 sound/usb/usx2y/usb_stream.h +'H' A0 uapi/linux/usb/cdc-wdm.h 'H' C0-F0 net/bluetooth/hci.h conflict! 'H' C0-DF net/bluetooth/hidp/hidp.h conflict! 'H' C0-DF net/bluetooth/cmtp/cmtp.h conflict! diff --git a/Documentation/iostats.txt b/Documentation/iostats.txt index c76c21d87e85..65f694f2d1c9 100644 --- a/Documentation/iostats.txt +++ b/Documentation/iostats.txt @@ -71,6 +71,8 @@ Field 4 -- # of milliseconds spent reading measured from __make_request() to end_that_request_last()). Field 5 -- # of writes completed This is the total number of writes completed successfully. +Field 6 -- # of writes merged + See the description of field 2. Field 7 -- # of sectors written This is the total number of sectors written successfully. Field 8 -- # of milliseconds spent writing diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt index 13f1aa09b938..9c7fd988e299 100644 --- a/Documentation/kdump/kdump.txt +++ b/Documentation/kdump/kdump.txt @@ -297,6 +297,7 @@ Boot into System Kernel On ia64, 256M@256M is a generous value that typically works. The region may be automatically placed on ia64, see the dump-capture kernel config option notes above. + If use sparse memory, the size should be rounded to GRANULE boundaries. On s390x, typically use "crashkernel=xxM". The value of xx is dependent on the memory consumption of the kdump system. In general this is not diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4609e81dbc37..8c01a0218a1e 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -44,6 +44,7 @@ parameter is applicable: AVR32 AVR32 architecture is enabled. AX25 Appropriate AX.25 support is enabled. BLACKFIN Blackfin architecture is enabled. + CLK Common clock infrastructure is enabled. DRM Direct Rendering Management support is enabled. DYNAMIC_DEBUG Build in debug messages and enable them at runtime EDD BIOS Enhanced Disk Drive Services (EDD) is enabled @@ -320,6 +321,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted. on: enable for both 32- and 64-bit processes off: disable for both 32- and 64-bit processes + alloc_snapshot [FTRACE] + Allocate the ftrace snapshot buffer on boot up when the + main buffer is allocated. This is handy if debugging + and you need to use tracing_snapshot() on boot up, and + do not want to use tracing_snapshot_alloc() as it needs + to be done where GFP_KERNEL allocations are allowed. + amd_iommu= [HW,X86-64] Pass parameters to the AMD IOMMU driver in the system. Possible values are: @@ -465,6 +473,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted. cio_ignore= [S390] See Documentation/s390/CommonIO for details. + clk_ignore_unused + [CLK] + Keep all clocks already enabled by bootloader on, + even if no driver has claimed them. This is useful + for debug and development, but should not be + needed on a platform with proper driver support. + For more information, see Documentation/clk.txt. clock= [BUGS=X86-32, HW] gettimeofday clocksource override. [Deprecated] @@ -596,9 +611,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. is selected automatically. Check Documentation/kdump/kdump.txt for further details. - crashkernel_low=size[KMG] - [KNL, x86] parts under 4G. - crashkernel=range1:size1[,range2:size2,...][@offset] [KNL] Same as above, but depends on the memory in the running system. The syntax of range is @@ -606,6 +618,26 @@ bytes respectively. Such letter suffixes can also be entirely omitted. a memory unit (amount[KMG]). See also Documentation/kdump/kdump.txt for an example. + crashkernel=size[KMG],high + [KNL, x86_64] range could be above 4G. Allow kernel + to allocate physical memory region from top, so could + be above 4G if system have more than 4G ram installed. + Otherwise memory region will be allocated below 4G, if + available. + It will be ignored if crashkernel=X is specified. + crashkernel=size[KMG],low + [KNL, x86_64] range under 4G. When crashkernel=X,high + is passed, kernel could allocate physical memory region + above 4G, that cause second kernel crash on system + that require some amount of low memory, e.g. swiotlb + requires at least 64M+32K low memory. Kernel would + try to allocate 72M below 4G automatically. + This one let user to specify own low range under 4G + for second kernel instead. + 0: to disable low allocation. + It will be ignored when crashkernel=X,high is not used + or memory reserved is below 4G. + cs89x0_dma= [HW,NET] Format: <dma> @@ -757,19 +789,31 @@ bytes respectively. Such letter suffixes can also be entirely omitted. (mmio) or 32-bit (mmio32). The options are the same as for ttyS, above. - earlyprintk= [X86,SH,BLACKFIN] + earlyprintk= [X86,SH,BLACKFIN,ARM] earlyprintk=vga earlyprintk=xen earlyprintk=serial[,ttySn[,baudrate]] + earlyprintk=serial[,0x...[,baudrate]] earlyprintk=ttySn[,baudrate] earlyprintk=dbgp[debugController#] + earlyprintk is useful when the kernel crashes before + the normal console is initialized. It is not enabled by + default because it has some cosmetic problems. + Append ",keep" to not disable it when the real console takes over. Only vga or serial or usb debug port at a time. - Currently only ttyS0 and ttyS1 are supported. + Currently only ttyS0 and ttyS1 may be specified by + name. Other I/O ports may be explicitly specified + on some architectures (x86 and arm at least) by + replacing ttySn with an I/O port address, like this: + earlyprintk=serial,0x1008,115200 + You can find the port for a given device in + /proc/tty/driver/serial: + 2: uart:ST16650V2 port:00001008 irq:18 ... Interaction with the standard serial driver is not very good. @@ -788,6 +832,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. edd= [EDD] Format: {"off" | "on" | "skip[mbr]"} + efi_no_storage_paranoia [EFI; X86] + Using this parameter you can use more than 50% of + your efi variable storage. Use this parameter only if + you are really sure that your UEFI does sane gc and + fulfills the spec otherwise your board may brick. + eisa_irq_edge= [PARISC,HW] See header of drivers/parisc/eisa.c. @@ -1625,7 +1675,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. module.sig_enforce [KNL] When CONFIG_MODULE_SIG is set, this means that modules without (valid) signatures will fail to load. - Note that if CONFIG_MODULE_SIG_ENFORCE is set, that + Note that if CONFIG_MODULE_SIG_FORCE is set, that is always true, so this option does nothing. mousedev.tap_time= @@ -1974,8 +2024,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. noreplace-smp [X86-32,SMP] Don't replace SMP instructions with UP alternatives - noresidual [PPC] Don't use residual data on PReP machines. - nordrand [X86] Disable the direct use of the RDRAND instruction even if it is supported by the processor. RDRAND is still available to user @@ -2461,9 +2509,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. In kernels built with CONFIG_RCU_NOCB_CPU=y, set the specified list of CPUs to be no-callback CPUs. Invocation of these CPUs' RCU callbacks will - be offloaded to "rcuoN" kthreads created for - that purpose. This reduces OS jitter on the + be offloaded to "rcuox/N" kthreads created for + that purpose, where "x" is "b" for RCU-bh, "p" + for RCU-preempt, and "s" for RCU-sched, and "N" + is the CPU number. This reduces OS jitter on the offloaded CPUs, which can be useful for HPC and + real-time workloads. It can also improve energy efficiency for asymmetric multiprocessors. @@ -2487,6 +2538,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. leaf rcu_node structure. Useful for very large systems. + rcutree.jiffies_till_first_fqs= [KNL,BOOT] + Set delay from grace-period initialization to + first attempt to force quiescent states. + Units are jiffies, minimum value is zero, + and maximum value is HZ. + + rcutree.jiffies_till_next_fqs= [KNL,BOOT] + Set delay between subsequent attempts to force + quiescent states. Units are jiffies, minimum + value is one, and maximum value is HZ. + rcutree.qhimark= [KNL,BOOT] Set threshold of queued RCU callbacks over which batch limiting is disabled. @@ -2501,16 +2563,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted. rcutree.rcu_cpu_stall_timeout= [KNL,BOOT] Set timeout for RCU CPU stall warning messages. - rcutree.jiffies_till_first_fqs= [KNL,BOOT] - Set delay from grace-period initialization to - first attempt to force quiescent states. - Units are jiffies, minimum value is zero, - and maximum value is HZ. + rcutree.rcu_idle_gp_delay= [KNL,BOOT] + Set wakeup interval for idle CPUs that have + RCU callbacks (RCU_FAST_NO_HZ=y). - rcutree.jiffies_till_next_fqs= [KNL,BOOT] - Set delay between subsequent attempts to force - quiescent states. Units are jiffies, minimum - value is one, and maximum value is HZ. + rcutree.rcu_idle_lazy_gp_delay= [KNL,BOOT] + Set wakeup interval for idle CPUs that have + only "lazy" RCU callbacks (RCU_FAST_NO_HZ=y). + Lazy RCU callbacks are those which RCU can + prove do nothing more than free memory. rcutorture.fqs_duration= [KNL,BOOT] Set duration of force_quiescent_state bursts. @@ -3222,6 +3283,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted. or other driver-specific files in the Documentation/watchdog/ directory. + workqueue.disable_numa + By default, all work items queued to unbound + workqueues are affine to the NUMA nodes they're + issued on, which results in better behavior in + general. If NUMA affinity needs to be disabled for + whatever reason, this option can be used. Note + that this also can be controlled per-workqueue for + workqueues visible under /sys/bus/workqueue/. + x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of default x2apic cluster mode on platforms supporting x2apic. diff --git a/Documentation/md.txt b/Documentation/md.txt index 993fba37b7d1..e0ddd327632d 100644 --- a/Documentation/md.txt +++ b/Documentation/md.txt @@ -119,7 +119,7 @@ device to add. The array is started with the RUN_ARRAY ioctl. Once started, new devices can be added. They should have an -appropriate superblock written to them, and then passed be in with +appropriate superblock written to them, and then be passed in with ADD_NEW_DISK. Devices that have failed or are not yet active can be detached from an @@ -131,7 +131,7 @@ Specific Rules that apply to format-0 super block arrays, and ------------------------------------------------------------- An array can be 'created' by describing the array (level, chunksize -etc) in a SET_ARRAY_INFO ioctl. This must has major_version==0 and +etc) in a SET_ARRAY_INFO ioctl. This must have major_version==0 and raid_disks != 0. Then uninitialized devices can be added with ADD_NEW_DISK. The @@ -426,7 +426,7 @@ Each directory contains: offset This gives the location in the device (in sectors from the start) where data from the array will be stored. Any part of - the device before this offset us not touched, unless it is + the device before this offset is not touched, unless it is used for storing metadata (Formats 1.1 and 1.2). size @@ -440,7 +440,7 @@ Each directory contains: When the device is not 'in_sync', this records the number of sectors from the start of the device which are known to be correct. This is normally zero, but during a recovery - operation is will steadily increase, and if the recovery is + operation it will steadily increase, and if the recovery is interrupted, restoring this value can cause recovery to avoid repeating the earlier blocks. With v1.x metadata, this value is saved and restored automatically. @@ -468,7 +468,7 @@ Each directory contains: -An active md device will also contain and entry for each active device +An active md device will also contain an entry for each active device in the array. These are named rdNN @@ -482,7 +482,7 @@ will show 'in_sync' on every line. -Active md devices for levels that support data redundancy (1,4,5,6) +Active md devices for levels that support data redundancy (1,4,5,6,10) also have sync_action @@ -494,7 +494,7 @@ also have failed/missing device idle - nothing is happening check - A full check of redundancy was requested and is - happening. This reads all block and checks + happening. This reads all blocks and checks them. A repair may also happen for some raid levels. repair - A full check and repair is happening. This is @@ -522,7 +522,7 @@ also have degraded This contains a count of the number of devices by which the - arrays is degraded. So an optimal array with show '0'. A + arrays is degraded. So an optimal array will show '0'. A single failed/missing drive will show '1', etc. This file responds to select/poll, any increase or decrease in the count of missing devices will trigger an event. diff --git a/Documentation/misc-devices/mei/mei-client-bus.txt b/Documentation/misc-devices/mei/mei-client-bus.txt new file mode 100644 index 000000000000..f83910a8ce76 --- /dev/null +++ b/Documentation/misc-devices/mei/mei-client-bus.txt @@ -0,0 +1,138 @@ +Intel(R) Management Engine (ME) Client bus API +=============================================== + + +Rationale +========= +MEI misc character device is useful for dedicated applications to send and receive +data to the many FW appliance found in Intel's ME from the user space. +However for some of the ME functionalities it make sense to leverage existing software +stack and expose them through existing kernel subsystems. + +In order to plug seamlessly into the kernel device driver model we add kernel virtual +bus abstraction on top of the MEI driver. This allows implementing linux kernel drivers +for the various MEI features as a stand alone entities found in their respective subsystem. +Existing device drivers can even potentially be re-used by adding an MEI CL bus layer to +the existing code. + + +MEI CL bus API +=========== +A driver implementation for an MEI Client is very similar to existing bus +based device drivers. The driver registers itself as an MEI CL bus driver through +the mei_cl_driver structure: + +struct mei_cl_driver { + struct device_driver driver; + const char *name; + + const struct mei_cl_device_id *id_table; + + int (*probe)(struct mei_cl_device *dev, const struct mei_cl_id *id); + int (*remove)(struct mei_cl_device *dev); +}; + +struct mei_cl_id { + char name[MEI_NAME_SIZE]; + kernel_ulong_t driver_info; +}; + +The mei_cl_id structure allows the driver to bind itself against a device name. + +To actually register a driver on the ME Client bus one must call the mei_cl_add_driver() +API. This is typically called at module init time. + +Once registered on the ME Client bus, a driver will typically try to do some I/O on +this bus and this should be done through the mei_cl_send() and mei_cl_recv() +routines. The latter is synchronous (blocks and sleeps until data shows up). +In order for drivers to be notified of pending events waiting for them (e.g. +an Rx event) they can register an event handler through the +mei_cl_register_event_cb() routine. Currently only the MEI_EVENT_RX event +will trigger an event handler call and the driver implementation is supposed +to call mei_recv() from the event handler in order to fetch the pending +received buffers. + + +Example +======= +As a theoretical example let's pretend the ME comes with a "contact" NFC IP. +The driver init and exit routines for this device would look like: + +#define CONTACT_DRIVER_NAME "contact" + +static struct mei_cl_device_id contact_mei_cl_tbl[] = { + { CONTACT_DRIVER_NAME, }, + + /* required last entry */ + { } +}; +MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl); + +static struct mei_cl_driver contact_driver = { + .id_table = contact_mei_tbl, + .name = CONTACT_DRIVER_NAME, + + .probe = contact_probe, + .remove = contact_remove, +}; + +static int contact_init(void) +{ + int r; + + r = mei_cl_driver_register(&contact_driver); + if (r) { + pr_err(CONTACT_DRIVER_NAME ": driver registration failed\n"); + return r; + } + + return 0; +} + +static void __exit contact_exit(void) +{ + mei_cl_driver_unregister(&contact_driver); +} + +module_init(contact_init); +module_exit(contact_exit); + +And the driver's simplified probe routine would look like that: + +int contact_probe(struct mei_cl_device *dev, struct mei_cl_device_id *id) +{ + struct contact_driver *contact; + + [...] + mei_cl_enable_device(dev); + + mei_cl_register_event_cb(dev, contact_event_cb, contact); + + return 0; + } + +In the probe routine the driver first enable the MEI device and then registers +an ME bus event handler which is as close as it can get to registering a +threaded IRQ handler. +The handler implementation will typically call some I/O routine depending on +the pending events: + +#define MAX_NFC_PAYLOAD 128 + +static void contact_event_cb(struct mei_cl_device *dev, u32 events, + void *context) +{ + struct contact_driver *contact = context; + + if (events & BIT(MEI_EVENT_RX)) { + u8 payload[MAX_NFC_PAYLOAD]; + int payload_size; + + payload_size = mei_recv(dev, payload, MAX_NFC_PAYLOAD); + if (payload_size <= 0) + return; + + /* Hook to the NFC subsystem */ + nfc_hci_recv_frame(contact->hdev, payload, payload_size); + } +} diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt index 703cf4370c79..67a9cb259d40 100644 --- a/Documentation/networking/ieee802154.txt +++ b/Documentation/networking/ieee802154.txt @@ -71,8 +71,9 @@ submits skb to qdisc), so if you need something from that cb later, you should store info in the skb->data on your own. To hook the MLME interface you have to populate the ml_priv field of your -net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are -required. +net_device with a pointer to struct ieee802154_mlme_ops instance. The fields +assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional. +All other fields are required. We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index dc2dc87d2557..f98ca633b528 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -29,7 +29,7 @@ route/max_size - INTEGER neigh/default/gc_thresh1 - INTEGER Minimum number of entries to keep. Garbage collector will not purge entries if there are fewer than this number. - Default: 256 + Default: 128 neigh/default/gc_thresh3 - INTEGER Maximum number of neighbor entries allowed. Increase this @@ -175,14 +175,6 @@ tcp_congestion_control - STRING is inherited. [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ] -tcp_cookie_size - INTEGER - Default size of TCP Cookie Transactions (TCPCT) option, that may be - overridden on a per socket basis by the TCPCT socket option. - Values greater than the maximum (16) are interpreted as the maximum. - Values greater than zero and less than the minimum (8) are interpreted - as the minimum. Odd values are interpreted as the next even value. - Default: 0 (off). - tcp_dsack - BOOLEAN Allows TCP to send "duplicate" SACKs. @@ -190,7 +182,9 @@ tcp_early_retrans - INTEGER Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold for triggering fast retransmit when the amount of outstanding data is small and when no previously unsent data can be transmitted (such - that limited transmit could be used). + that limited transmit could be used). Also controls the use of + Tail loss probe (TLP) that converts RTOs occuring due to tail + losses into fast recovery (draft-dukkipati-tcpm-tcp-loss-probe-01). Possible values: 0 disables ER 1 enables ER @@ -198,7 +192,9 @@ tcp_early_retrans - INTEGER by a fourth of RTT. This mitigates connection falsely recovers when network has a small degree of reordering (less than 3 packets). - Default: 2 + 3 enables delayed ER and TLP. + 4 enables TLP only. + Default: 3 tcp_ecn - INTEGER Control use of Explicit Congestion Notification (ECN) by TCP. @@ -229,36 +225,13 @@ tcp_fin_timeout - INTEGER Default: 60 seconds tcp_frto - INTEGER - Enables Forward RTO-Recovery (F-RTO) defined in RFC4138. + Enables Forward RTO-Recovery (F-RTO) defined in RFC5682. F-RTO is an enhanced recovery algorithm for TCP retransmission - timeouts. It is particularly beneficial in wireless environments - where packet loss is typically due to random radio interference - rather than intermediate router congestion. F-RTO is sender-side - only modification. Therefore it does not require any support from - the peer. - - If set to 1, basic version is enabled. 2 enables SACK enhanced - F-RTO if flow uses SACK. The basic version can be used also when - SACK is in use though scenario(s) with it exists where F-RTO - interacts badly with the packet counting of the SACK enabled TCP - flow. - -tcp_frto_response - INTEGER - When F-RTO has detected that a TCP retransmission timeout was - spurious (i.e, the timeout would have been avoided had TCP set a - longer retransmission timeout), TCP has several options what to do - next. Possible values are: - 0 Rate halving based; a smooth and conservative response, - results in halved cwnd and ssthresh after one RTT - 1 Very conservative response; not recommended because even - though being valid, it interacts poorly with the rest of - Linux TCP, halves cwnd and ssthresh immediately - 2 Aggressive response; undoes congestion control measures - that are now known to be unnecessary (ignoring the - possibility of a lost retransmission that would require - TCP to be more cautious), cwnd and ssthresh are restored - to the values prior timeout - Default: 0 (rate halving based) + timeouts. It is particularly beneficial in networks where the + RTT fluctuates (e.g., wireless). F-RTO is sender-side only + modification. It does not require any support from the peer. + + By default it's enabled with a non-zero value. 0 disables F-RTO. tcp_keepalive_time - INTEGER How often TCP sends out keepalive messages when keepalive is enabled. diff --git a/Documentation/networking/netlink_mmap.txt b/Documentation/networking/netlink_mmap.txt new file mode 100644 index 000000000000..1c2dab409625 --- /dev/null +++ b/Documentation/networking/netlink_mmap.txt @@ -0,0 +1,339 @@ +This file documents how to use memory mapped I/O with netlink. + +Author: Patrick McHardy <kaber@trash.net> + +Overview +-------- + +Memory mapped netlink I/O can be used to increase throughput and decrease +overhead of unicast receive and transmit operations. Some netlink subsystems +require high throughput, these are mainly the netfilter subsystems +nfnetlink_queue and nfnetlink_log, but it can also help speed up large +dump operations of f.i. the routing database. + +Memory mapped netlink I/O used two circular ring buffers for RX and TX which +are mapped into the processes address space. + +The RX ring is used by the kernel to directly construct netlink messages into +user-space memory without copying them as done with regular socket I/O, +additionally as long as the ring contains messages no recvmsg() or poll() +syscalls have to be issued by user-space to get more message. + +The TX ring is used to process messages directly from user-space memory, the +kernel processes all messages contained in the ring using a single sendmsg() +call. + +Usage overview +-------------- + +In order to use memory mapped netlink I/O, user-space needs three main changes: + +- ring setup +- conversion of the RX path to get messages from the ring instead of recvmsg() +- conversion of the TX path to construct messages into the ring + +Ring setup is done using setsockopt() to provide the ring parameters to the +kernel, then a call to mmap() to map the ring into the processes address space: + +- setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, ¶ms, sizeof(params)); +- setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, ¶ms, sizeof(params)); +- ring = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0) + +Usage of either ring is optional, but even if only the RX ring is used the +mapping still needs to be writable in order to update the frame status after +processing. + +Conversion of the reception path involves calling poll() on the file +descriptor, once the socket is readable the frames from the ring are +processsed in order until no more messages are available, as indicated by +a status word in the frame header. + +On kernel side, in order to make use of memory mapped I/O on receive, the +originating netlink subsystem needs to support memory mapped I/O, otherwise +it will use an allocated socket buffer as usual and the contents will be + copied to the ring on transmission, nullifying most of the performance gains. +Dumps of kernel databases automatically support memory mapped I/O. + +Conversion of the transmit path involves changing message contruction to +use memory from the TX ring instead of (usually) a buffer declared on the +stack and setting up the frame header approriately. Optionally poll() can +be used to wait for free frames in the TX ring. + +Structured and definitions for using memory mapped I/O are contained in +<linux/netlink.h>. + +RX and TX rings +---------------- + +Each ring contains a number of continous memory blocks, containing frames of +fixed size dependant on the parameters used for ring setup. + +Ring: [ block 0 ] + [ frame 0 ] + [ frame 1 ] + [ block 1 ] + [ frame 2 ] + [ frame 3 ] + ... + [ block n ] + [ frame 2 * n ] + [ frame 2 * n + 1 ] + +The blocks are only visible to the kernel, from the point of view of user-space +the ring just contains the frames in a continous memory zone. + +The ring parameters used for setting up the ring are defined as follows: + +struct nl_mmap_req { + unsigned int nm_block_size; + unsigned int nm_block_nr; + unsigned int nm_frame_size; + unsigned int nm_frame_nr; +}; + +Frames are grouped into blocks, where each block is a continous region of memory +and holds nm_block_size / nm_frame_size frames. The total number of frames in +the ring is nm_frame_nr. The following invariants hold: + +- frames_per_block = nm_block_size / nm_frame_size + +- nm_frame_nr = frames_per_block * nm_block_nr + +Some parameters are constrained, specifically: + +- nm_block_size must be a multiple of the architectures memory page size. + The getpagesize() function can be used to get the page size. + +- nm_frame_size must be equal or larger to NL_MMAP_HDRLEN, IOW a frame must be + able to hold at least the frame header + +- nm_frame_size must be smaller or equal to nm_block_size + +- nm_frame_size must be a multiple of NL_MMAP_MSG_ALIGNMENT + +- nm_frame_nr must equal the actual number of frames as specified above. + +When the kernel can't allocate phsyically continous memory for a ring block, +it will fall back to use physically discontinous memory. This might affect +performance negatively, in order to avoid this the nm_frame_size parameter +should be chosen to be as small as possible for the required frame size and +the number of blocks should be increased instead. + +Ring frames +------------ + +Each frames contain a frame header, consisting of a synchronization word and some +meta-data, and the message itself. + +Frame: [ header message ] + +The frame header is defined as follows: + +struct nl_mmap_hdr { + unsigned int nm_status; + unsigned int nm_len; + __u32 nm_group; + /* credentials */ + __u32 nm_pid; + __u32 nm_uid; + __u32 nm_gid; +}; + +- nm_status is used for synchronizing processing between the kernel and user- + space and specifies ownership of the frame as well as the operation to perform + +- nm_len contains the length of the message contained in the data area + +- nm_group specified the destination multicast group of message + +- nm_pid, nm_uid and nm_gid contain the netlink pid, UID and GID of the sending + process. These values correspond to the data available using SOCK_PASSCRED in + the SCM_CREDENTIALS cmsg. + +The possible values in the status word are: + +- NL_MMAP_STATUS_UNUSED: + RX ring: frame belongs to the kernel and contains no message + for user-space. Approriate action is to invoke poll() + to wait for new messages. + + TX ring: frame belongs to user-space and can be used for + message construction. + +- NL_MMAP_STATUS_RESERVED: + RX ring only: frame is currently used by the kernel for message + construction and contains no valid message yet. + Appropriate action is to invoke poll() to wait for + new messages. + +- NL_MMAP_STATUS_VALID: + RX ring: frame contains a valid message. Approriate action is + to process the message and release the frame back to + the kernel by setting the status to + NL_MMAP_STATUS_UNUSED or queue the frame by setting the + status to NL_MMAP_STATUS_SKIP. + + TX ring: the frame contains a valid message from user-space to + be processed by the kernel. After completing processing + the kernel will release the frame back to user-space by + setting the status to NL_MMAP_STATUS_UNUSED. + +- NL_MMAP_STATUS_COPY: + RX ring only: a message is ready to be processed but could not be + stored in the ring, either because it exceeded the + frame size or because the originating subsystem does + not support memory mapped I/O. Appropriate action is + to invoke recvmsg() to receive the message and release + the frame back to the kernel by setting the status to + NL_MMAP_STATUS_UNUSED. + +- NL_MMAP_STATUS_SKIP: + RX ring only: user-space queued the message for later processing, but + processed some messages following it in the ring. The + kernel should skip this frame when looking for unused + frames. + +The data area of a frame begins at a offset of NL_MMAP_HDRLEN relative to the +frame header. + +TX limitations +-------------- + +Kernel processing usually involves validation of the message received by +user-space, then processing its contents. The kernel must assure that +userspace is not able to modify the message contents after they have been +validated. In order to do so, the message is copied from the ring frame +to an allocated buffer if either of these conditions is false: + +- only a single mapping of the ring exists +- the file descriptor is not shared between processes + +This means that for threaded programs, the kernel will fall back to copying. + +Example +------- + +Ring setup: + + unsigned int block_size = 16 * getpagesize(); + struct nl_mmap_req req = { + .nm_block_size = block_size, + .nm_block_nr = 64, + .nm_frame_size = 16384, + .nm_frame_nr = 64 * block_size / 16384, + }; + unsigned int ring_size; + void *rx_ring, *tx_ring; + + /* Configure ring parameters */ + if (setsockopt(fd, NETLINK_RX_RING, &req, sizeof(req)) < 0) + exit(1); + if (setsockopt(fd, NETLINK_TX_RING, &req, sizeof(req)) < 0) + exit(1) + + /* Calculate size of each invididual ring */ + ring_size = req.nm_block_nr * req.nm_block_size; + + /* Map RX/TX rings. The TX ring is located after the RX ring */ + rx_ring = mmap(NULL, 2 * ring_size, PROT_READ | PROT_WRITE, + MAP_SHARED, fd, 0); + if ((long)rx_ring == -1L) + exit(1); + tx_ring = rx_ring + ring_size: + +Message reception: + +This example assumes some ring parameters of the ring setup are available. + + unsigned int frame_offset = 0; + struct nl_mmap_hdr *hdr; + struct nlmsghdr *nlh; + unsigned char buf[16384]; + ssize_t len; + + while (1) { + struct pollfd pfds[1]; + + pfds[0].fd = fd; + pfds[0].events = POLLIN | POLLERR; + pfds[0].revents = 0; + + if (poll(pfds, 1, -1) < 0 && errno != -EINTR) + exit(1); + + /* Check for errors. Error handling omitted */ + if (pfds[0].revents & POLLERR) + <handle error> + + /* If no new messages, poll again */ + if (!(pfds[0].revents & POLLIN)) + continue; + + /* Process all frames */ + while (1) { + /* Get next frame header */ + hdr = rx_ring + frame_offset; + + if (hdr->nm_status == NL_MMAP_STATUS_VALID) + /* Regular memory mapped frame */ + nlh = (void *hdr) + NL_MMAP_HDRLEN; + len = hdr->nm_len; + + /* Release empty message immediately. May happen + * on error during message construction. + */ + if (len == 0) + goto release; + } else if (hdr->nm_status == NL_MMAP_STATUS_COPY) { + /* Frame queued to socket receive queue */ + len = recv(fd, buf, sizeof(buf), MSG_DONTWAIT); + if (len <= 0) + break; + nlh = buf; + } else + /* No more messages to process, continue polling */ + break; + + process_msg(nlh); +release: + /* Release frame back to the kernel */ + hdr->nm_status = NL_MMAP_STATUS_UNUSED; + + /* Advance frame offset to next frame */ + frame_offset = (frame_offset + frame_size) % ring_size; + } + } + +Message transmission: + +This example assumes some ring parameters of the ring setup are available. +A single message is constructed and transmitted, to send multiple messages +at once they would be constructed in consecutive frames before a final call +to sendto(). + + unsigned int frame_offset = 0; + struct nl_mmap_hdr *hdr; + struct nlmsghdr *nlh; + struct sockaddr_nl addr = { + .nl_family = AF_NETLINK, + }; + + hdr = tx_ring + frame_offset; + if (hdr->nm_status != NL_MMAP_STATUS_UNUSED) + /* No frame available. Use poll() to avoid. */ + exit(1); + + nlh = (void *)hdr + NL_MMAP_HDRLEN; + + /* Build message */ + build_message(nlh); + + /* Fill frame header: length and status need to be set */ + hdr->nm_len = nlh->nlmsg_len; + hdr->nm_status = NL_MMAP_STATUS_VALID; + + if (sendto(fd, NULL, 0, 0, &addr, sizeof(addr)) < 0) + exit(1); + + /* Advance frame offset to next frame */ + frame_offset = (frame_offset + frame_size) % ring_size; diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index 94444b152fbc..23dd80e82b8e 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -685,14 +685,342 @@ int main(int argc, char **argp) } ------------------------------------------------------------------------------- ++ AF_PACKET TPACKET_V3 example +------------------------------------------------------------------------------- + +AF_PACKET's TPACKET_V3 ring buffer can be configured to use non-static frame +sizes by doing it's own memory management. It is based on blocks where polling +works on a per block basis instead of per ring as in TPACKET_V2 and predecessor. + +It is said that TPACKET_V3 brings the following benefits: + *) ~15 - 20% reduction in CPU-usage + *) ~20% increase in packet capture rate + *) ~2x increase in packet density + *) Port aggregation analysis + *) Non static frame size to capture entire packet payload + +So it seems to be a good candidate to be used with packet fanout. + +Minimal example code by Daniel Borkmann based on Chetan Loke's lolpcap (compile +it with gcc -Wall -O2 blob.c, and try things like "./a.out eth0", etc.): + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <string.h> +#include <assert.h> +#include <net/if.h> +#include <arpa/inet.h> +#include <netdb.h> +#include <poll.h> +#include <unistd.h> +#include <signal.h> +#include <inttypes.h> +#include <sys/socket.h> +#include <sys/mman.h> +#include <linux/if_packet.h> +#include <linux/if_ether.h> +#include <linux/ip.h> + +#define BLOCK_SIZE (1 << 22) +#define FRAME_SIZE 2048 + +#define NUM_BLOCKS 64 +#define NUM_FRAMES ((BLOCK_SIZE * NUM_BLOCKS) / FRAME_SIZE) + +#define BLOCK_RETIRE_TOV_IN_MS 64 +#define BLOCK_PRIV_AREA_SZ 13 + +#define ALIGN_8(x) (((x) + 8 - 1) & ~(8 - 1)) + +#define BLOCK_STATUS(x) ((x)->h1.block_status) +#define BLOCK_NUM_PKTS(x) ((x)->h1.num_pkts) +#define BLOCK_O2FP(x) ((x)->h1.offset_to_first_pkt) +#define BLOCK_LEN(x) ((x)->h1.blk_len) +#define BLOCK_SNUM(x) ((x)->h1.seq_num) +#define BLOCK_O2PRIV(x) ((x)->offset_to_priv) +#define BLOCK_PRIV(x) ((void *) ((uint8_t *) (x) + BLOCK_O2PRIV(x))) +#define BLOCK_HDR_LEN (ALIGN_8(sizeof(struct block_desc))) +#define BLOCK_PLUS_PRIV(sz_pri) (BLOCK_HDR_LEN + ALIGN_8((sz_pri))) + +#ifndef likely +# define likely(x) __builtin_expect(!!(x), 1) +#endif +#ifndef unlikely +# define unlikely(x) __builtin_expect(!!(x), 0) +#endif + +struct block_desc { + uint32_t version; + uint32_t offset_to_priv; + struct tpacket_hdr_v1 h1; +}; + +struct ring { + struct iovec *rd; + uint8_t *map; + struct tpacket_req3 req; +}; + +static unsigned long packets_total = 0, bytes_total = 0; +static sig_atomic_t sigint = 0; + +void sighandler(int num) +{ + sigint = 1; +} + +static int setup_socket(struct ring *ring, char *netdev) +{ + int err, i, fd, v = TPACKET_V3; + struct sockaddr_ll ll; + + fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); + if (fd < 0) { + perror("socket"); + exit(1); + } + + err = setsockopt(fd, SOL_PACKET, PACKET_VERSION, &v, sizeof(v)); + if (err < 0) { + perror("setsockopt"); + exit(1); + } + + memset(&ring->req, 0, sizeof(ring->req)); + ring->req.tp_block_size = BLOCK_SIZE; + ring->req.tp_frame_size = FRAME_SIZE; + ring->req.tp_block_nr = NUM_BLOCKS; + ring->req.tp_frame_nr = NUM_FRAMES; + ring->req.tp_retire_blk_tov = BLOCK_RETIRE_TOV_IN_MS; + ring->req.tp_sizeof_priv = BLOCK_PRIV_AREA_SZ; + ring->req.tp_feature_req_word |= TP_FT_REQ_FILL_RXHASH; + + err = setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &ring->req, + sizeof(ring->req)); + if (err < 0) { + perror("setsockopt"); + exit(1); + } + + ring->map = mmap(NULL, ring->req.tp_block_size * ring->req.tp_block_nr, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED, + fd, 0); + if (ring->map == MAP_FAILED) { + perror("mmap"); + exit(1); + } + + ring->rd = malloc(ring->req.tp_block_nr * sizeof(*ring->rd)); + assert(ring->rd); + for (i = 0; i < ring->req.tp_block_nr; ++i) { + ring->rd[i].iov_base = ring->map + (i * ring->req.tp_block_size); + ring->rd[i].iov_len = ring->req.tp_block_size; + } + + memset(&ll, 0, sizeof(ll)); + ll.sll_family = PF_PACKET; + ll.sll_protocol = htons(ETH_P_ALL); + ll.sll_ifindex = if_nametoindex(netdev); + ll.sll_hatype = 0; + ll.sll_pkttype = 0; + ll.sll_halen = 0; + + err = bind(fd, (struct sockaddr *) &ll, sizeof(ll)); + if (err < 0) { + perror("bind"); + exit(1); + } + + return fd; +} + +#ifdef __checked +static uint64_t prev_block_seq_num = 0; + +void assert_block_seq_num(struct block_desc *pbd) +{ + if (unlikely(prev_block_seq_num + 1 != BLOCK_SNUM(pbd))) { + printf("prev_block_seq_num:%"PRIu64", expected seq:%"PRIu64" != " + "actual seq:%"PRIu64"\n", prev_block_seq_num, + prev_block_seq_num + 1, (uint64_t) BLOCK_SNUM(pbd)); + exit(1); + } + + prev_block_seq_num = BLOCK_SNUM(pbd); +} + +static void assert_block_len(struct block_desc *pbd, uint32_t bytes, int block_num) +{ + if (BLOCK_NUM_PKTS(pbd)) { + if (unlikely(bytes != BLOCK_LEN(pbd))) { + printf("block:%u with %upackets, expected len:%u != actual len:%u\n", + block_num, BLOCK_NUM_PKTS(pbd), bytes, BLOCK_LEN(pbd)); + exit(1); + } + } else { + if (unlikely(BLOCK_LEN(pbd) != BLOCK_PLUS_PRIV(BLOCK_PRIV_AREA_SZ))) { + printf("block:%u, expected len:%lu != actual len:%u\n", + block_num, BLOCK_HDR_LEN, BLOCK_LEN(pbd)); + exit(1); + } + } +} + +static void assert_block_header(struct block_desc *pbd, const int block_num) +{ + uint32_t block_status = BLOCK_STATUS(pbd); + + if (unlikely((block_status & TP_STATUS_USER) == 0)) { + printf("block:%u, not in TP_STATUS_USER\n", block_num); + exit(1); + } + + assert_block_seq_num(pbd); +} +#else +static inline void assert_block_header(struct block_desc *pbd, const int block_num) +{ +} +static void assert_block_len(struct block_desc *pbd, uint32_t bytes, int block_num) +{ +} +#endif + +static void display(struct tpacket3_hdr *ppd) +{ + struct ethhdr *eth = (struct ethhdr *) ((uint8_t *) ppd + ppd->tp_mac); + struct iphdr *ip = (struct iphdr *) ((uint8_t *) eth + ETH_HLEN); + + if (eth->h_proto == htons(ETH_P_IP)) { + struct sockaddr_in ss, sd; + char sbuff[NI_MAXHOST], dbuff[NI_MAXHOST]; + + memset(&ss, 0, sizeof(ss)); + ss.sin_family = PF_INET; + ss.sin_addr.s_addr = ip->saddr; + getnameinfo((struct sockaddr *) &ss, sizeof(ss), + sbuff, sizeof(sbuff), NULL, 0, NI_NUMERICHOST); + + memset(&sd, 0, sizeof(sd)); + sd.sin_family = PF_INET; + sd.sin_addr.s_addr = ip->daddr; + getnameinfo((struct sockaddr *) &sd, sizeof(sd), + dbuff, sizeof(dbuff), NULL, 0, NI_NUMERICHOST); + + printf("%s -> %s, ", sbuff, dbuff); + } + + printf("rxhash: 0x%x\n", ppd->hv1.tp_rxhash); +} + +static void walk_block(struct block_desc *pbd, const int block_num) +{ + int num_pkts = BLOCK_NUM_PKTS(pbd), i; + unsigned long bytes = 0; + unsigned long bytes_with_padding = BLOCK_PLUS_PRIV(BLOCK_PRIV_AREA_SZ); + struct tpacket3_hdr *ppd; + + assert_block_header(pbd, block_num); + + ppd = (struct tpacket3_hdr *) ((uint8_t *) pbd + BLOCK_O2FP(pbd)); + for (i = 0; i < num_pkts; ++i) { + bytes += ppd->tp_snaplen; + if (ppd->tp_next_offset) + bytes_with_padding += ppd->tp_next_offset; + else + bytes_with_padding += ALIGN_8(ppd->tp_snaplen + ppd->tp_mac); + + display(ppd); + + ppd = (struct tpacket3_hdr *) ((uint8_t *) ppd + ppd->tp_next_offset); + __sync_synchronize(); + } + + assert_block_len(pbd, bytes_with_padding, block_num); + + packets_total += num_pkts; + bytes_total += bytes; +} + +void flush_block(struct block_desc *pbd) +{ + BLOCK_STATUS(pbd) = TP_STATUS_KERNEL; + __sync_synchronize(); +} + +static void teardown_socket(struct ring *ring, int fd) +{ + munmap(ring->map, ring->req.tp_block_size * ring->req.tp_block_nr); + free(ring->rd); + close(fd); +} + +int main(int argc, char **argp) +{ + int fd, err; + socklen_t len; + struct ring ring; + struct pollfd pfd; + unsigned int block_num = 0; + struct block_desc *pbd; + struct tpacket_stats_v3 stats; + + if (argc != 2) { + fprintf(stderr, "Usage: %s INTERFACE\n", argp[0]); + return EXIT_FAILURE; + } + + signal(SIGINT, sighandler); + + memset(&ring, 0, sizeof(ring)); + fd = setup_socket(&ring, argp[argc - 1]); + assert(fd > 0); + + memset(&pfd, 0, sizeof(pfd)); + pfd.fd = fd; + pfd.events = POLLIN | POLLERR; + pfd.revents = 0; + + while (likely(!sigint)) { + pbd = (struct block_desc *) ring.rd[block_num].iov_base; +retry_block: + if ((BLOCK_STATUS(pbd) & TP_STATUS_USER) == 0) { + poll(&pfd, 1, -1); + goto retry_block; + } + + walk_block(pbd, block_num); + flush_block(pbd); + block_num = (block_num + 1) % NUM_BLOCKS; + } + + len = sizeof(stats); + err = getsockopt(fd, SOL_PACKET, PACKET_STATISTICS, &stats, &len); + if (err < 0) { + perror("getsockopt"); + exit(1); + } + + fflush(stdout); + printf("\nReceived %u packets, %lu bytes, %u dropped, freeze_q_cnt: %u\n", + stats.tp_packets, bytes_total, stats.tp_drops, + stats.tp_freeze_q_cnt); + + teardown_socket(&ring, fd); + return 0; +} + +------------------------------------------------------------------------------- + PACKET_TIMESTAMP ------------------------------------------------------------------------------- The PACKET_TIMESTAMP setting determines the source of the timestamp in -the packet meta information. If your NIC is capable of timestamping -packets in hardware, you can request those hardware timestamps to used. -Note: you may need to enable the generation of hardware timestamps with -SIOCSHWTSTAMP. +the packet meta information for mmap(2)ed RX_RING and TX_RINGs. If your +NIC is capable of timestamping packets in hardware, you can request those +hardware timestamps to be used. Note: you may need to enable the generation +of hardware timestamps with SIOCSHWTSTAMP (see related information from +Documentation/networking/timestamping.txt). PACKET_TIMESTAMP accepts the same integer bit field as SO_TIMESTAMPING. However, only the SOF_TIMESTAMPING_SYS_HARDWARE @@ -704,8 +1032,36 @@ SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set. req |= SOF_TIMESTAMPING_SYS_HARDWARE; setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, (void *) &req, sizeof(req)) -If PACKET_TIMESTAMP is not set, a software timestamp generated inside -the networking stack is used (the behavior before this setting was added). +For the mmap(2)ed ring buffers, such timestamps are stored in the +tpacket{,2,3}_hdr structure's tp_sec and tp_{n,u}sec members. To determine +what kind of timestamp has been reported, the tp_status field is binary |'ed +with the following possible bits ... + + TP_STATUS_TS_SYS_HARDWARE + TP_STATUS_TS_RAW_HARDWARE + TP_STATUS_TS_SOFTWARE + +... that are equivalent to its SOF_TIMESTAMPING_* counterparts. For the +RX_RING, if none of those 3 are set (i.e. PACKET_TIMESTAMP is not set), +then this means that a software fallback was invoked *within* PF_PACKET's +processing code (less precise). + +Getting timestamps for the TX_RING works as follows: i) fill the ring frames, +ii) call sendto() e.g. in blocking mode, iii) wait for status of relevant +frames to be updated resp. the frame handed over to the application, iv) walk +through the frames to pick up the individual hw/sw timestamps. + +Only (!) if transmit timestamping is enabled, then these bits are combined +with binary | with TP_STATUS_AVAILABLE, so you must check for that in your +application (e.g. !(tp_status & (TP_STATUS_SEND_REQUEST | TP_STATUS_SENDING)) +in a first step to see if the frame belongs to the application, and then +one can extract the type of timestamp in a second step from tp_status)! + +If you don't care about them, thus having it disabled, checking for +TP_STATUS_AVAILABLE resp. TP_STATUS_WRONG_FORMAT is sufficient. If in the +TX_RING part only TP_STATUS_AVAILABLE is set, then the tp_sec and tp_{n,u}sec +members do not contain a valid value. For TX_RINGs, by default no timestamp +is generated! See include/linux/net_tstamp.h and Documentation/networking/timestamping for more information on hardware timestamps. diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt index f9fa6db40a52..654d2e55c8cb 100644 --- a/Documentation/networking/stmmac.txt +++ b/Documentation/networking/stmmac.txt @@ -1,6 +1,6 @@ STMicroelectronics 10/100/1000 Synopsys Ethernet driver -Copyright (C) 2007-2010 STMicroelectronics Ltd +Copyright (C) 2007-2013 STMicroelectronics Ltd Author: Giuseppe Cavallaro <peppe.cavallaro@st.com> This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers @@ -10,7 +10,7 @@ Currently this network device driver is for all STM embedded MAC/GMAC (i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XLINX XC2V3000 FF1152AMT0221 D1215994A VIRTEX FPGA board. -DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether +DWC Ether MAC 10/100/1000 Universal version 3.70a (and older) and DWC Ether MAC 10/100 Universal version 4.0 have been used for developing this driver. This driver supports both the platform bus and PCI. @@ -32,6 +32,8 @@ The kernel configuration option is STMMAC_ETH: watchdog: transmit timeout (in milliseconds); flow_ctrl: Flow control ability [on/off]; pause: Flow Control Pause Time; + eee_timer: tx EEE timer; + chain_mode: select chain mode instead of ring. 3) Command line options Driver parameters can be also passed in command line by using: @@ -164,12 +166,12 @@ Where: o bus_setup: perform HW setup of the bus. For example, on some ST platforms this field is used to configure the AMBA bridge to generate more efficient STBus traffic. - o init/exit: callbacks used for calling a custom initialisation; + o init/exit: callbacks used for calling a custom initialization; this is sometime necessary on some platforms (e.g. ST boxes) where the HW needs to have set some PIO lines or system cfg registers. o custom_cfg/custom_data: this is a custom configuration that can be passed - while initialising the resources. + while initializing the resources. o bsp_priv: another private poiter. For MDIO bus The we have: @@ -273,6 +275,8 @@ reset procedure etc). o norm_desc.c: functions for handling normal descriptors; o chain_mode.c/ring_mode.c:: functions to manage RING/CHAINED modes; o mmc_core.c/mmc.h: Management MAC Counters; + o stmmac_hwtstamp.c: HW timestamp support for PTP + o stmmac_ptp.c: PTP 1588 clock 5) Debug Information @@ -326,6 +330,35 @@ To enter in Tx LPI mode the driver needs to have a software timer that enable and disable the LPI mode when there is nothing to be transmitted. -7) TODO: +7) Extended descriptors +The extended descriptors give us information about the receive Ethernet payload +when it is carrying PTP packets or TCP/UDP/ICMP over IP. +These are not available on GMAC Synopsys chips older than the 3.50. +At probe time the driver will decide if these can be actually used. +This support also is mandatory for PTPv2 because the extra descriptors 6 and 7 +are used for saving the hardware timestamps. + +8) Precision Time Protocol (PTP) +The driver supports the IEEE 1588-2002, Precision Time Protocol (PTP), +which enables precise synchronization of clocks in measurement and +control systems implemented with technologies such as network +communication. + +In addition to the basic timestamp features mentioned in IEEE 1588-2002 +Timestamps, new GMAC cores support the advanced timestamp features. +IEEE 1588-2008 that can be enabled when configure the Kernel. + +9) SGMII/RGMII supports +New GMAC devices provide own way to manage RGMII/SGMII. +This information is available at run-time by looking at the +HW capability register. This means that the stmmac can manage +auto-negotiation and link status w/o using the PHYLIB stuff +In fact, the HW provides a subset of extended registers to +restart the ANE, verify Full/Half duplex mode and Speed. +Also thanks to these registers it is possible to look at the +Auto-negotiated Link Parter Ability. + +10) TODO: o XGMAC is not supported. - o Add the PTP - precision time protocol + o Complete the TBI & RTBI support. + o extened VLAN support for 3.70a SYNP GMAC. diff --git a/Documentation/pinctrl.txt b/Documentation/pinctrl.txt index a2b57e0a1db0..447fd4cd54ec 100644 --- a/Documentation/pinctrl.txt +++ b/Documentation/pinctrl.txt @@ -736,6 +736,13 @@ All the above functions are mandatory to implement for a pinmux driver. Pin control interaction with the GPIO subsystem =============================================== +Note that the following implies that the use case is to use a certain pin +from the Linux kernel using the API in <linux/gpio.h> with gpio_request() +and similar functions. There are cases where you may be using something +that your datasheet calls "GPIO mode" but actually is just an electrical +configuration for a certain device. See the section below named +"GPIO mode pitfalls" for more details on this scenario. + The public pinmux API contains two functions named pinctrl_request_gpio() and pinctrl_free_gpio(). These two functions shall *ONLY* be called from gpiolib-based drivers as part of their gpio_request() and @@ -774,6 +781,111 @@ obtain the function "gpioN" where "N" is the global GPIO pin number if no special GPIO-handler is registered. +GPIO mode pitfalls +================== + +Sometime the developer may be confused by a datasheet talking about a pin +being possible to set into "GPIO mode". It appears that what hardware +engineers mean with "GPIO mode" is not necessarily the use case that is +implied in the kernel interface <linux/gpio.h>: a pin that you grab from +kernel code and then either listen for input or drive high/low to +assert/deassert some external line. + +Rather hardware engineers think that "GPIO mode" means that you can +software-control a few electrical properties of the pin that you would +not be able to control if the pin was in some other mode, such as muxed in +for a device. + +Example: a pin is usually muxed in to be used as a UART TX line. But during +system sleep, we need to put this pin into "GPIO mode" and ground it. + +If you make a 1-to-1 map to the GPIO subsystem for this pin, you may start +to think that you need to come up with something real complex, that the +pin shall be used for UART TX and GPIO at the same time, that you will grab +a pin control handle and set it to a certain state to enable UART TX to be +muxed in, then twist it over to GPIO mode and use gpio_direction_output() +to drive it low during sleep, then mux it over to UART TX again when you +wake up and maybe even gpio_request/gpio_free as part of this cycle. This +all gets very complicated. + +The solution is to not think that what the datasheet calls "GPIO mode" +has to be handled by the <linux/gpio.h> interface. Instead view this as +a certain pin config setting. Look in e.g. <linux/pinctrl/pinconf-generic.h> +and you find this in the documentation: + + PIN_CONFIG_OUTPUT: this will configure the pin in output, use argument + 1 to indicate high level, argument 0 to indicate low level. + +So it is perfectly possible to push a pin into "GPIO mode" and drive the +line low as part of the usual pin control map. So for example your UART +driver may look like this: + +#include <linux/pinctrl/consumer.h> + +struct pinctrl *pinctrl; +struct pinctrl_state *pins_default; +struct pinctrl_state *pins_sleep; + +pins_default = pinctrl_lookup_state(uap->pinctrl, PINCTRL_STATE_DEFAULT); +pins_sleep = pinctrl_lookup_state(uap->pinctrl, PINCTRL_STATE_SLEEP); + +/* Normal mode */ +retval = pinctrl_select_state(pinctrl, pins_default); +/* Sleep mode */ +retval = pinctrl_select_state(pinctrl, pins_sleep); + +And your machine configuration may look like this: +-------------------------------------------------- + +static unsigned long uart_default_mode[] = { + PIN_CONF_PACKED(PIN_CONFIG_DRIVE_PUSH_PULL, 0), +}; + +static unsigned long uart_sleep_mode[] = { + PIN_CONF_PACKED(PIN_CONFIG_OUTPUT, 0), +}; + +static struct pinctrl_map __initdata pinmap[] = { + PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo", + "u0_group", "u0"), + PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo", + "UART_TX_PIN", uart_default_mode), + PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo", + "u0_group", "gpio-mode"), + PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo", + "UART_TX_PIN", uart_sleep_mode), +}; + +foo_init(void) { + pinctrl_register_mappings(pinmap, ARRAY_SIZE(pinmap)); +} + +Here the pins we want to control are in the "u0_group" and there is some +function called "u0" that can be enabled on this group of pins, and then +everything is UART business as usual. But there is also some function +named "gpio-mode" that can be mapped onto the same pins to move them into +GPIO mode. + +This will give the desired effect without any bogus interaction with the +GPIO subsystem. It is just an electrical configuration used by that device +when going to sleep, it might imply that the pin is set into something the +datasheet calls "GPIO mode" but that is not the point: it is still used +by that UART device to control the pins that pertain to that very UART +driver, putting them into modes needed by the UART. GPIO in the Linux +kernel sense are just some 1-bit line, and is a different use case. + +How the registers are poked to attain the push/pull and output low +configuration and the muxing of the "u0" or "gpio-mode" group onto these +pins is a question for the driver. + +Some datasheets will be more helpful and refer to the "GPIO mode" as +"low power mode" rather than anything to do with GPIO. This often means +the same thing electrically speaking, but in this latter case the +software engineers will usually quickly identify that this is some +specific muxing/configuration rather than anything related to the GPIO +API. + + Board/machine configuration ================================== diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt index 6e953564de03..3af5ae6c9c11 100644 --- a/Documentation/printk-formats.txt +++ b/Documentation/printk-formats.txt @@ -17,6 +17,8 @@ Symbols/Function Pointers: %pF versatile_init+0x0/0x110 %pf versatile_init %pS versatile_init+0x0/0x110 + %pSR versatile_init+0x9/0x110 + (with __builtin_extract_return_addr() translation) %ps versatile_init %pB prev_fn_of_versatile_init+0x88/0x88 diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt index ae66f9b90a25..fcaf0b4efba2 100644 --- a/Documentation/s390/s390dbf.txt +++ b/Documentation/s390/s390dbf.txt @@ -143,7 +143,8 @@ Parameter: id: handle for debug log Return Value: none -Description: frees memory for a debug log +Description: frees memory for a debug log and removes all registered debug + views. Must not be called within an interrupt handler --------------------------------------------------------------------------- diff --git a/Documentation/scsi/LICENSE.qla2xxx b/Documentation/scsi/LICENSE.qla2xxx index 27a91cf43d6d..5020b7b5a244 100644 --- a/Documentation/scsi/LICENSE.qla2xxx +++ b/Documentation/scsi/LICENSE.qla2xxx @@ -1,4 +1,4 @@ -Copyright (c) 2003-2012 QLogic Corporation +Copyright (c) 2003-2013 QLogic Corporation QLogic Linux FC-FCoE Driver This program includes a device driver for Linux 3.x. diff --git a/Documentation/security/Smack.txt b/Documentation/security/Smack.txt index 8a177e4b6e21..7a2d30c132e3 100644 --- a/Documentation/security/Smack.txt +++ b/Documentation/security/Smack.txt @@ -117,6 +117,17 @@ access2 ambient This contains the Smack label applied to unlabeled network packets. +change-rule + This interface allows modification of existing access control rules. + The format accepted on write is: + "%s %s %s %s" + where the first string is the subject label, the second the + object label, the third the access to allow and the fourth the + access to deny. The access strings may contain only the characters + "rwxat-". If a rule for a given subject and object exists it will be + modified by enabling the permissions in the third string and disabling + those in the fourth string. If there is no such rule it will be + created using the access specified in the third and the fourth strings. cipso This interface allows a specific CIPSO header to be assigned to a Smack label. The format accepted on write is: diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 4499bd948860..95731a08f257 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -890,9 +890,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. enable_msi - Enable Message Signaled Interrupt (MSI) (default = off) power_save - Automatic power-saving timeout (in second, 0 = disable) - power_save_controller - Support runtime D3 of HD-audio controller - (-1 = on for supported chip (default), false = off, - true = force to on even for unsupported hardware) + power_save_controller - Reset HD-audio controller in power-saving mode + (default = on) align_buffer_size - Force rounding of buffer/period sizes to multiples of 128 bytes. This is more efficient in terms of memory access but isn't required by the HDA spec and prevents diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 078701fdbd4d..dcc75a9ed919 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -18,6 +18,7 @@ files can be found in mm/swap.c. Currently, these files are in /proc/sys/vm: +- admin_reserve_kbytes - block_dump - compact_memory - dirty_background_bytes @@ -53,11 +54,41 @@ Currently, these files are in /proc/sys/vm: - percpu_pagelist_fraction - stat_interval - swappiness +- user_reserve_kbytes - vfs_cache_pressure - zone_reclaim_mode ============================================================== +admin_reserve_kbytes + +The amount of free memory in the system that should be reserved for users +with the capability cap_sys_admin. + +admin_reserve_kbytes defaults to min(3% of free pages, 8MB) + +That should provide enough for the admin to log in and kill a process, +if necessary, under the default overcommit 'guess' mode. + +Systems running under overcommit 'never' should increase this to account +for the full Virtual Memory Size of programs used to recover. Otherwise, +root may not be able to log in to recover the system. + +How do you calculate a minimum useful reserve? + +sshd or login + bash (or some other shell) + top (or ps, kill, etc.) + +For overcommit 'guess', we can sum resident set sizes (RSS). +On x86_64 this is about 8MB. + +For overcommit 'never', we can take the max of their virtual sizes (VSZ) +and add the sum of their RSS. +On x86_64 this is about 128MB. + +Changing this takes effect whenever an application requests memory. + +============================================================== + block_dump block_dump enables block I/O debugging when set to a nonzero value. More @@ -542,6 +573,7 @@ memory until it actually runs out. When this flag is 2, the kernel uses a "never overcommit" policy that attempts to prevent any overcommit of memory. +Note that user_reserve_kbytes affects this policy. This feature can be very useful because there are a lot of programs that malloc() huge amounts of memory "just-in-case" @@ -645,6 +677,24 @@ The default value is 60. ============================================================== +- user_reserve_kbytes + +When overcommit_memory is set to 2, "never overommit" mode, reserve +min(3% of current process size, user_reserve_kbytes) of free memory. +This is intended to prevent a user from starting a single memory hogging +process, such that they cannot recover (kill the hog). + +user_reserve_kbytes defaults to min(3% of the current process size, 128MB). + +If this is reduced to zero, then the user will be allowed to allocate +all free memory with a single process, minus admin_reserve_kbytes. +Any subsequent attempts to execute a command will result in +"fork: Cannot allocate memory". + +Changing this takes effect whenever an application requests memory. + +============================================================== + vfs_cache_pressure ------------------ diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt index 2a4cdda4828e..8cb4d7842a5f 100644 --- a/Documentation/sysrq.txt +++ b/Documentation/sysrq.txt @@ -129,9 +129,9 @@ On all - write a character to /proc/sysrq-trigger. e.g.: * Okay, so what can I use them for? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Well, un'R'aw is very handy when your X server or a svgalib program crashes. +Well, unraw(r) is very handy when your X server or a svgalib program crashes. -sa'K' (Secure Access Key) is useful when you want to be sure there is no +sak(k) (Secure Access Key) is useful when you want to be sure there is no trojan program running at console which could grab your password when you would try to login. It will kill all programs on given console, thus letting you make sure that the login prompt you see is actually @@ -143,20 +143,20 @@ IMPORTANT: such. :IMPORTANT useful when you want to exit a program that will not let you switch consoles. (For example, X or a svgalib program.) -re'B'oot is good when you're unable to shut down. But you should also 'S'ync -and 'U'mount first. +reboot(b) is good when you're unable to shut down. But you should also +sync(s) and umount(u) first. -'C'rash can be used to manually trigger a crashdump when the system is hung. +crash(c) can be used to manually trigger a crashdump when the system is hung. Note that this just triggers a crash if there is no dump mechanism available. -'S'ync is great when your system is locked up, it allows you to sync your +sync(s) is great when your system is locked up, it allows you to sync your disks and will certainly lessen the chance of data loss and fscking. Note that the sync hasn't taken place until you see the "OK" and "Done" appear on the screen. (If the kernel is really in strife, you may not ever get the OK or Done message...) -'U'mount is basically useful in the same ways as 'S'ync. I generally 'S'ync, -'U'mount, then re'B'oot when my system locks. It's saved me many a fsck. +umount(u) is basically useful in the same ways as sync(s). I generally sync(s), +umount(u), then reboot(b) when my system locks. It's saved me many a fsck. Again, the unmount (remount read-only) hasn't taken place until you see the "OK" and "Done" message appear on the screen. @@ -165,11 +165,11 @@ kernel messages you do not want to see. Selecting '0' will prevent all but the most urgent kernel messages from reaching your console. (They will still be logged if syslogd/klogd are alive, though.) -t'E'rm and k'I'll are useful if you have some sort of runaway process you +term(e) and kill(i) are useful if you have some sort of runaway process you are unable to kill any other way, especially if it's spawning other processes. -"'J'ust thaw it" is useful if your system becomes unresponsive due to a frozen +"just thaw it(j)" is useful if your system becomes unresponsive due to a frozen (probably root) filesystem via the FIFREEZE ioctl. * Sometimes SysRq seems to get 'stuck' after using it, what can I do? diff --git a/Documentation/this_cpu_ops.txt b/Documentation/this_cpu_ops.txt new file mode 100644 index 000000000000..1a4ce7e3e05f --- /dev/null +++ b/Documentation/this_cpu_ops.txt @@ -0,0 +1,205 @@ +this_cpu operations +------------------- + +this_cpu operations are a way of optimizing access to per cpu +variables associated with the *currently* executing processor through +the use of segment registers (or a dedicated register where the cpu +permanently stored the beginning of the per cpu area for a specific +processor). + +The this_cpu operations add a per cpu variable offset to the processor +specific percpu base and encode that operation in the instruction +operating on the per cpu variable. + +This means there are no atomicity issues between the calculation of +the offset and the operation on the data. Therefore it is not +necessary to disable preempt or interrupts to ensure that the +processor is not changed between the calculation of the address and +the operation on the data. + +Read-modify-write operations are of particular interest. Frequently +processors have special lower latency instructions that can operate +without the typical synchronization overhead but still provide some +sort of relaxed atomicity guarantee. The x86 for example can execute +RMV (Read Modify Write) instructions like inc/dec/cmpxchg without the +lock prefix and the associated latency penalty. + +Access to the variable without the lock prefix is not synchronized but +synchronization is not necessary since we are dealing with per cpu +data specific to the currently executing processor. Only the current +processor should be accessing that variable and therefore there are no +concurrency issues with other processors in the system. + +On x86 the fs: or the gs: segment registers contain the base of the +per cpu area. It is then possible to simply use the segment override +to relocate a per cpu relative address to the proper per cpu area for +the processor. So the relocation to the per cpu base is encoded in the +instruction via a segment register prefix. + +For example: + + DEFINE_PER_CPU(int, x); + int z; + + z = this_cpu_read(x); + +results in a single instruction + + mov ax, gs:[x] + +instead of a sequence of calculation of the address and then a fetch +from that address which occurs with the percpu operations. Before +this_cpu_ops such sequence also required preempt disable/enable to +prevent the kernel from moving the thread to a different processor +while the calculation is performed. + +The main use of the this_cpu operations has been to optimize counter +operations. + + this_cpu_inc(x) + +results in the following single instruction (no lock prefix!) + + inc gs:[x] + +instead of the following operations required if there is no segment +register. + + int *y; + int cpu; + + cpu = get_cpu(); + y = per_cpu_ptr(&x, cpu); + (*y)++; + put_cpu(); + +Note that these operations can only be used on percpu data that is +reserved for a specific processor. Without disabling preemption in the +surrounding code this_cpu_inc() will only guarantee that one of the +percpu counters is correctly incremented. However, there is no +guarantee that the OS will not move the process directly before or +after the this_cpu instruction is executed. In general this means that +the value of the individual counters for each processor are +meaningless. The sum of all the per cpu counters is the only value +that is of interest. + +Per cpu variables are used for performance reasons. Bouncing cache +lines can be avoided if multiple processors concurrently go through +the same code paths. Since each processor has its own per cpu +variables no concurrent cacheline updates take place. The price that +has to be paid for this optimization is the need to add up the per cpu +counters when the value of the counter is needed. + + +Special operations: +------------------- + + y = this_cpu_ptr(&x) + +Takes the offset of a per cpu variable (&x !) and returns the address +of the per cpu variable that belongs to the currently executing +processor. this_cpu_ptr avoids multiple steps that the common +get_cpu/put_cpu sequence requires. No processor number is +available. Instead the offset of the local per cpu area is simply +added to the percpu offset. + + + +Per cpu variables and offsets +----------------------------- + +Per cpu variables have *offsets* to the beginning of the percpu +area. They do not have addresses although they look like that in the +code. Offsets cannot be directly dereferenced. The offset must be +added to a base pointer of a percpu area of a processor in order to +form a valid address. + +Therefore the use of x or &x outside of the context of per cpu +operations is invalid and will generally be treated like a NULL +pointer dereference. + +In the context of per cpu operations + + x is a per cpu variable. Most this_cpu operations take a cpu + variable. + + &x is the *offset* a per cpu variable. this_cpu_ptr() takes + the offset of a per cpu variable which makes this look a bit + strange. + + + +Operations on a field of a per cpu structure +-------------------------------------------- + +Let's say we have a percpu structure + + struct s { + int n,m; + }; + + DEFINE_PER_CPU(struct s, p); + + +Operations on these fields are straightforward + + this_cpu_inc(p.m) + + z = this_cpu_cmpxchg(p.m, 0, 1); + + +If we have an offset to struct s: + + struct s __percpu *ps = &p; + + z = this_cpu_dec(ps->m); + + z = this_cpu_inc_return(ps->n); + + +The calculation of the pointer may require the use of this_cpu_ptr() +if we do not make use of this_cpu ops later to manipulate fields: + + struct s *pp; + + pp = this_cpu_ptr(&p); + + pp->m--; + + z = pp->n++; + + +Variants of this_cpu ops +------------------------- + +this_cpu ops are interrupt safe. Some architecture do not support +these per cpu local operations. In that case the operation must be +replaced by code that disables interrupts, then does the operations +that are guaranteed to be atomic and then reenable interrupts. Doing +so is expensive. If there are other reasons why the scheduler cannot +change the processor we are executing on then there is no reason to +disable interrupts. For that purpose the __this_cpu operations are +provided. For example. + + __this_cpu_inc(x); + +Will increment x and will not fallback to code that disables +interrupts on platforms that cannot accomplish atomicity through +address relocation and a Read-Modify-Write operation in the same +instruction. + + + +&this_cpu_ptr(pp)->n vs this_cpu_ptr(&pp->n) +-------------------------------------------- + +The first operation takes the offset and forms an address and then +adds the offset of the n field. + +The second one first adds the two offsets and then does the +relocation. IMHO the second form looks cleaner and has an easier time +with (). The second form also is consistent with the way +this_cpu_read() and friends are used. + + +Christoph Lameter, April 3rd, 2013 diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index a372304aef10..bfe8c29b1f1d 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -8,6 +8,7 @@ Copyright 2008 Red Hat Inc. Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton, John Kacur, and David Teigland. Written for: 2.6.28-rc2 +Updated for: 3.10 Introduction ------------ @@ -17,13 +18,16 @@ designers of systems to find what is going on inside the kernel. It can be used for debugging or analyzing latencies and performance issues that take place outside of user-space. -Although ftrace is the function tracer, it also includes an -infrastructure that allows for other types of tracing. Some of -the tracers that are currently in ftrace include a tracer to -trace context switches, the time it takes for a high priority -task to run after it was woken up, the time interrupts are -disabled, and more (ftrace allows for tracer plugins, which -means that the list of tracers can always grow). +Although ftrace is typically considered the function tracer, it +is really a frame work of several assorted tracing utilities. +There's latency tracing to examine what occurs between interrupts +disabled and enabled, as well as for preemption and from a time +a task is woken to the task is actually scheduled in. + +One of the most common uses of ftrace is the event tracing. +Through out the kernel is hundreds of static event points that +can be enabled via the debugfs file system to see what is +going on in certain parts of the kernel. Implementation Details @@ -61,7 +65,7 @@ the extended "/sys/kernel/debug/tracing" path name. That's it! (assuming that you have ftrace configured into your kernel) -After mounting the debugfs, you can see a directory called +After mounting debugfs, you can see a directory called "tracing". This directory contains the control and output files of ftrace. Here is a list of some of the key files: @@ -84,7 +88,9 @@ of ftrace. Here is a list of some of the key files: This sets or displays whether writing to the trace ring buffer is enabled. Echo 0 into this file to disable - the tracer or 1 to enable it. + the tracer or 1 to enable it. Note, this only disables + writing to the ring buffer, the tracing overhead may + still be occurring. trace: @@ -109,7 +115,15 @@ of ftrace. Here is a list of some of the key files: This file lets the user control the amount of data that is displayed in one of the above output - files. + files. Options also exist to modify how a tracer + or events work (stack traces, timestamps, etc). + + options: + + This is a directory that has a file for every available + trace option (also in trace_options). Options may also be set + or cleared by writing a "1" or "0" respectively into the + corresponding file with the option name. tracing_max_latency: @@ -121,10 +135,17 @@ of ftrace. Here is a list of some of the key files: latency is greater than the value in this file. (in microseconds) + tracing_thresh: + + Some latency tracers will record a trace whenever the + latency is greater than the number in this file. + Only active when the file contains a number greater than 0. + (in microseconds) + buffer_size_kb: This sets or displays the number of kilobytes each CPU - buffer can hold. The tracer buffers are the same size + buffer holds. By default, the trace buffers are the same size for each CPU. The displayed number is the size of the CPU buffer and not total size of all buffers. The trace buffers are allocated in pages (blocks of memory @@ -133,16 +154,30 @@ of ftrace. Here is a list of some of the key files: than requested, the rest of the page will be used, making the actual allocation bigger than requested. ( Note, the size may not be a multiple of the page size - due to buffer management overhead. ) + due to buffer management meta-data. ) - This can only be updated when the current_tracer - is set to "nop". + buffer_total_size_kb: + + This displays the total combined size of all the trace buffers. + + free_buffer: + + If a process is performing the tracing, and the ring buffer + should be shrunk "freed" when the process is finished, even + if it were to be killed by a signal, this file can be used + for that purpose. On close of this file, the ring buffer will + be resized to its minimum size. Having a process that is tracing + also open this file, when the process exits its file descriptor + for this file will be closed, and in doing so, the ring buffer + will be "freed". + + It may also stop tracing if disable_on_free option is set. tracing_cpumask: This is a mask that lets the user only trace - on specified CPUS. The format is a hex string - representing the CPUS. + on specified CPUs. The format is a hex string + representing the CPUs. set_ftrace_filter: @@ -183,6 +218,261 @@ of ftrace. Here is a list of some of the key files: "set_ftrace_notrace". (See the section "dynamic ftrace" below for more details.) + enabled_functions: + + This file is more for debugging ftrace, but can also be useful + in seeing if any function has a callback attached to it. + Not only does the trace infrastructure use ftrace function + trace utility, but other subsystems might too. This file + displays all functions that have a callback attached to them + as well as the number of callbacks that have been attached. + Note, a callback may also call multiple functions which will + not be listed in this count. + + If the callback registered to be traced by a function with + the "save regs" attribute (thus even more overhead), a 'R' + will be displayed on the same line as the function that + is returning registers. + + function_profile_enabled: + + When set it will enable all functions with either the function + tracer, or if enabled, the function graph tracer. It will + keep a histogram of the number of functions that were called + and if run with the function graph tracer, it will also keep + track of the time spent in those functions. The histogram + content can be displayed in the files: + + trace_stats/function<cpu> ( function0, function1, etc). + + trace_stats: + + A directory that holds different tracing stats. + + kprobe_events: + + Enable dynamic trace points. See kprobetrace.txt. + + kprobe_profile: + + Dynamic trace points stats. See kprobetrace.txt. + + max_graph_depth: + + Used with the function graph tracer. This is the max depth + it will trace into a function. Setting this to a value of + one will show only the first kernel function that is called + from user space. + + printk_formats: + + This is for tools that read the raw format files. If an event in + the ring buffer references a string (currently only trace_printk() + does this), only a pointer to the string is recorded into the buffer + and not the string itself. This prevents tools from knowing what + that string was. This file displays the string and address for + the string allowing tools to map the pointers to what the + strings were. + + saved_cmdlines: + + Only the pid of the task is recorded in a trace event unless + the event specifically saves the task comm as well. Ftrace + makes a cache of pid mappings to comms to try to display + comms for events. If a pid for a comm is not listed, then + "<...>" is displayed in the output. + + snapshot: + + This displays the "snapshot" buffer and also lets the user + take a snapshot of the current running trace. + See the "Snapshot" section below for more details. + + stack_max_size: + + When the stack tracer is activated, this will display the + maximum stack size it has encountered. + See the "Stack Trace" section below. + + stack_trace: + + This displays the stack back trace of the largest stack + that was encountered when the stack tracer is activated. + See the "Stack Trace" section below. + + stack_trace_filter: + + This is similar to "set_ftrace_filter" but it limits what + functions the stack tracer will check. + + trace_clock: + + Whenever an event is recorded into the ring buffer, a + "timestamp" is added. This stamp comes from a specified + clock. By default, ftrace uses the "local" clock. This + clock is very fast and strictly per cpu, but on some + systems it may not be monotonic with respect to other + CPUs. In other words, the local clocks may not be in sync + with local clocks on other CPUs. + + Usual clocks for tracing: + + # cat trace_clock + [local] global counter x86-tsc + + local: Default clock, but may not be in sync across CPUs + + global: This clock is in sync with all CPUs but may + be a bit slower than the local clock. + + counter: This is not a clock at all, but literally an atomic + counter. It counts up one by one, but is in sync + with all CPUs. This is useful when you need to + know exactly the order events occurred with respect to + each other on different CPUs. + + uptime: This uses the jiffies counter and the time stamp + is relative to the time since boot up. + + perf: This makes ftrace use the same clock that perf uses. + Eventually perf will be able to read ftrace buffers + and this will help out in interleaving the data. + + x86-tsc: Architectures may define their own clocks. For + example, x86 uses its own TSC cycle clock here. + + To set a clock, simply echo the clock name into this file. + + echo global > trace_clock + + trace_marker: + + This is a very useful file for synchronizing user space + with events happening in the kernel. Writing strings into + this file will be written into the ftrace buffer. + + It is useful in applications to open this file at the start + of the application and just reference the file descriptor + for the file. + + void trace_write(const char *fmt, ...) + { + va_list ap; + char buf[256]; + int n; + + if (trace_fd < 0) + return; + + va_start(ap, fmt); + n = vsnprintf(buf, 256, fmt, ap); + va_end(ap); + + write(trace_fd, buf, n); + } + + start: + + trace_fd = open("trace_marker", WR_ONLY); + + uprobe_events: + + Add dynamic tracepoints in programs. + See uprobetracer.txt + + uprobe_profile: + + Uprobe statistics. See uprobetrace.txt + + instances: + + This is a way to make multiple trace buffers where different + events can be recorded in different buffers. + See "Instances" section below. + + events: + + This is the trace event directory. It holds event tracepoints + (also known as static tracepoints) that have been compiled + into the kernel. It shows what event tracepoints exist + and how they are grouped by system. There are "enable" + files at various levels that can enable the tracepoints + when a "1" is written to them. + + See events.txt for more information. + + per_cpu: + + This is a directory that contains the trace per_cpu information. + + per_cpu/cpu0/buffer_size_kb: + + The ftrace buffer is defined per_cpu. That is, there's a separate + buffer for each CPU to allow writes to be done atomically, + and free from cache bouncing. These buffers may have different + size buffers. This file is similar to the buffer_size_kb + file, but it only displays or sets the buffer size for the + specific CPU. (here cpu0). + + per_cpu/cpu0/trace: + + This is similar to the "trace" file, but it will only display + the data specific for the CPU. If written to, it only clears + the specific CPU buffer. + + per_cpu/cpu0/trace_pipe + + This is similar to the "trace_pipe" file, and is a consuming + read, but it will only display (and consume) the data specific + for the CPU. + + per_cpu/cpu0/trace_pipe_raw + + For tools that can parse the ftrace ring buffer binary format, + the trace_pipe_raw file can be used to extract the data + from the ring buffer directly. With the use of the splice() + system call, the buffer data can be quickly transferred to + a file or to the network where a server is collecting the + data. + + Like trace_pipe, this is a consuming reader, where multiple + reads will always produce different data. + + per_cpu/cpu0/snapshot: + + This is similar to the main "snapshot" file, but will only + snapshot the current CPU (if supported). It only displays + the content of the snapshot for a given CPU, and if + written to, only clears this CPU buffer. + + per_cpu/cpu0/snapshot_raw: + + Similar to the trace_pipe_raw, but will read the binary format + from the snapshot buffer for the given CPU. + + per_cpu/cpu0/stats: + + This displays certain stats about the ring buffer: + + entries: The number of events that are still in the buffer. + + overrun: The number of lost events due to overwriting when + the buffer was full. + + commit overrun: Should always be zero. + This gets set if so many events happened within a nested + event (ring buffer is re-entrant), that it fills the + buffer and starts dropping events. + + bytes: Bytes actually read (not overwritten). + + oldest event ts: The oldest timestamp in the buffer + + now ts: The current timestamp + + dropped events: Events lost due to overwrite option being off. + + read events: The number of events read. The Tracers ----------- @@ -234,11 +524,6 @@ Here is the list of current tracers that may be configured. RT tasks (as the current "wakeup" does). This is useful for those interested in wake up timings of RT tasks. - "hw-branch-tracer" - - Uses the BTS CPU feature on x86 CPUs to traces all - branches executed. - "nop" This is the "trace nothing" tracer. To remove all @@ -261,70 +546,100 @@ Here is an example of the output format of the file "trace" -------- # tracer: function # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - bash-4251 [01] 10152.583854: path_put <-path_walk - bash-4251 [01] 10152.583855: dput <-path_put - bash-4251 [01] 10152.583855: _atomic_dec_and_lock <-dput +# entries-in-buffer/entries-written: 140080/250280 #P:4 +# +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + bash-1977 [000] .... 17284.993652: sys_close <-system_call_fastpath + bash-1977 [000] .... 17284.993653: __close_fd <-sys_close + bash-1977 [000] .... 17284.993653: _raw_spin_lock <-__close_fd + sshd-1974 [003] .... 17284.993653: __srcu_read_unlock <-fsnotify + bash-1977 [000] .... 17284.993654: add_preempt_count <-_raw_spin_lock + bash-1977 [000] ...1 17284.993655: _raw_spin_unlock <-__close_fd + bash-1977 [000] ...1 17284.993656: sub_preempt_count <-_raw_spin_unlock + bash-1977 [000] .... 17284.993657: filp_close <-__close_fd + bash-1977 [000] .... 17284.993657: dnotify_flush <-filp_close + sshd-1974 [003] .... 17284.993658: sys_select <-system_call_fastpath -------- A header is printed with the tracer name that is represented by -the trace. In this case the tracer is "function". Then a header -showing the format. Task name "bash", the task PID "4251", the -CPU that it was running on "01", the timestamp in <secs>.<usecs> -format, the function name that was traced "path_put" and the -parent function that called this function "path_walk". The -timestamp is the time at which the function was entered. +the trace. In this case the tracer is "function". Then it shows the +number of events in the buffer as well as the total number of entries +that were written. The difference is the number of entries that were +lost due to the buffer filling up (250280 - 140080 = 110200 events +lost). + +The header explains the content of the events. Task name "bash", the task +PID "1977", the CPU that it was running on "000", the latency format +(explained below), the timestamp in <secs>.<usecs> format, the +function name that was traced "sys_close" and the parent function that +called this function "system_call_fastpath". The timestamp is the time +at which the function was entered. Latency trace format -------------------- -When the latency-format option is enabled, the trace file gives -somewhat more information to see why a latency happened. -Here is a typical trace. +When the latency-format option is enabled or when one of the latency +tracers is set, the trace file gives somewhat more information to see +why a latency happened. Here is a typical trace. # tracer: irqsoff # -irqsoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: apic_timer_interrupt - => ended at: do_softirq - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - <idle>-0 0d..1 0us+: trace_hardirqs_off_thunk (apic_timer_interrupt) - <idle>-0 0d.s. 97us : __do_softirq (do_softirq) - <idle>-0 0d.s1 98us : trace_hardirqs_on (do_softirq) +# irqsoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 259 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: ps-6143 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: __lock_task_sighand +# => ended at: _raw_spin_unlock_irqrestore +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + ps-6143 2d... 0us!: trace_hardirqs_off <-__lock_task_sighand + ps-6143 2d..1 259us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore + ps-6143 2d..1 263us+: time_hardirqs_on <-_raw_spin_unlock_irqrestore + ps-6143 2d..1 306us : <stack trace> + => trace_hardirqs_on_caller + => trace_hardirqs_on + => _raw_spin_unlock_irqrestore + => do_task_stat + => proc_tgid_stat + => proc_single_show + => seq_read + => vfs_read + => sys_read + => system_call_fastpath This shows that the current tracer is "irqsoff" tracing the time -for which interrupts were disabled. It gives the trace version -and the version of the kernel upon which this was executed on -(2.6.26-rc8). Then it displays the max latency in microsecs (97 -us). The number of trace entries displayed and the total number -recorded (both are three: #3/3). The type of preemption that was -used (PREEMPT). VP, KP, SP, and HP are always zero and are -reserved for later use. #P is the number of online CPUS (#P:2). +for which interrupts were disabled. It gives the trace version (which +never changes) and the version of the kernel upon which this was executed on +(3.10). Then it displays the max latency in microseconds (259 us). The number +of trace entries displayed and the total number (both are four: #4/4). +VP, KP, SP, and HP are always zero and are reserved for later use. +#P is the number of online CPUs (#P:4). The task is the process that was running when the latency -occurred. (swapper pid: 0). +occurred. (ps pid: 6143). The start and stop (the functions in which the interrupts were disabled and enabled respectively) that caused the latencies: - apic_timer_interrupt is where the interrupts were disabled. - do_softirq is where they were enabled again. + __lock_task_sighand is where the interrupts were disabled. + _raw_spin_unlock_irqrestore is where they were enabled again. The next lines after the header are the trace itself. The header explains which is which. @@ -367,16 +682,43 @@ The above is mostly meaningful for kernel developers. The rest is the same as the 'trace' file. + Note, the latency tracers will usually end with a back trace + to easily find where the latency occurred. trace_options ------------- -The trace_options file is used to control what gets printed in -the trace output. To see what is available, simply cat the file: +The trace_options file (or the options directory) is used to control +what gets printed in the trace output, or manipulate the tracers. +To see what is available, simply cat the file: cat trace_options - print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \ - noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj +print-parent +nosym-offset +nosym-addr +noverbose +noraw +nohex +nobin +noblock +nostacktrace +trace_printk +noftrace_preempt +nobranch +annotate +nouserstacktrace +nosym-userobj +noprintk-msg-only +context-info +latency-format +sleep-time +graph-time +record-cmd +overwrite +nodisable_on_free +irq-info +markers +function-trace To disable one of the options, echo in the option prepended with "no". @@ -428,13 +770,34 @@ Here are the available options: bin - This will print out the formats in raw binary. - block - TBD (needs update) + block - When set, reading trace_pipe will not block when polled. stacktrace - This is one of the options that changes the trace itself. When a trace is recorded, so is the stack of functions. This allows for back traces of trace sites. + trace_printk - Can disable trace_printk() from writing into the buffer. + + branch - Enable branch tracing with the tracer. + + annotate - It is sometimes confusing when the CPU buffers are full + and one CPU buffer had a lot of events recently, thus + a shorter time frame, were another CPU may have only had + a few events, which lets it have older events. When + the trace is reported, it shows the oldest events first, + and it may look like only one CPU ran (the one with the + oldest events). When the annotate option is set, it will + display when a new CPU buffer started: + + <idle>-0 [001] dNs4 21169.031481: wake_up_idle_cpu <-add_timer_on + <idle>-0 [001] dNs4 21169.031482: _raw_spin_unlock_irqrestore <-add_timer_on + <idle>-0 [001] .Ns4 21169.031484: sub_preempt_count <-_raw_spin_unlock_irqrestore +##### CPU 2 buffer started #### + <idle>-0 [002] .N.1 21169.031484: rcu_idle_exit <-cpu_idle + <idle>-0 [001] .Ns3 21169.031484: _raw_spin_unlock <-clocksource_watchdog + <idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock + userstacktrace - This option changes the trace. It records a stacktrace of the current userspace thread. @@ -451,9 +814,13 @@ Here are the available options: a.out-1623 [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0 x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] - sched-tree - trace all tasks that are on the runqueue, at - every scheduling event. Will add overhead if - there's a lot of tasks running at once. + + printk-msg-only - When set, trace_printk()s will only show the format + and not their parameters (if trace_bprintk() or + trace_bputs() was used to save the trace_printk()). + + context-info - Show only the event data. Hides the comm, PID, + timestamp, CPU, and other useful data. latency-format - This option changes the trace. When it is enabled, the trace displays @@ -461,31 +828,61 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] latencies, as described in "Latency trace format". + sleep-time - When running function graph tracer, to include + the time a task schedules out in its function. + When enabled, it will account time the task has been + scheduled out as part of the function call. + + graph-time - When running function graph tracer, to include the + time to call nested functions. When this is not set, + the time reported for the function will only include + the time the function itself executed for, not the time + for functions that it called. + + record-cmd - When any event or tracer is enabled, a hook is enabled + in the sched_switch trace point to fill comm cache + with mapped pids and comms. But this may cause some + overhead, and if you only care about pids, and not the + name of the task, disabling this option can lower the + impact of tracing. + overwrite - This controls what happens when the trace buffer is full. If "1" (default), the oldest events are discarded and overwritten. If "0", then the newest events are discarded. + (see per_cpu/cpu0/stats for overrun and dropped) -ftrace_enabled --------------- + disable_on_free - When the free_buffer is closed, tracing will + stop (tracing_on set to 0). -The following tracers (listed below) give different output -depending on whether or not the sysctl ftrace_enabled is set. To -set ftrace_enabled, one can either use the sysctl function or -set it via the proc file system interface. + irq-info - Shows the interrupt, preempt count, need resched data. + When disabled, the trace looks like: - sysctl kernel.ftrace_enabled=1 +# tracer: function +# +# entries-in-buffer/entries-written: 144405/9452052 #P:4 +# +# TASK-PID CPU# TIMESTAMP FUNCTION +# | | | | | + <idle>-0 [002] 23636.756054: ttwu_do_activate.constprop.89 <-try_to_wake_up + <idle>-0 [002] 23636.756054: activate_task <-ttwu_do_activate.constprop.89 + <idle>-0 [002] 23636.756055: enqueue_task <-activate_task - or - echo 1 > /proc/sys/kernel/ftrace_enabled + markers - When set, the trace_marker is writable (only by root). + When disabled, the trace_marker will error with EINVAL + on write. + + + function-trace - The latency tracers will enable function tracing + if this option is enabled (default it is). When + it is disabled, the latency tracers do not trace + functions. This keeps the overhead of the tracer down + when performing latency tests. -To disable ftrace_enabled simply replace the '1' with '0' in the -above commands. + Note: Some tracers have their own options. They only appear + when the tracer is active. -When ftrace_enabled is set the tracers will also record the -functions that are within the trace. The descriptions of the -tracers will also show an example with ftrace enabled. irqsoff @@ -506,95 +903,133 @@ new trace is saved. To reset the maximum, echo 0 into tracing_max_latency. Here is an example: + # echo 0 > options/function-trace # echo irqsoff > current_tracer - # echo latency-format > trace_options - # echo 0 > tracing_max_latency # echo 1 > tracing_on + # echo 0 > tracing_max_latency # ls -ltr [...] # echo 0 > tracing_on # cat trace # tracer: irqsoff # -irqsoff latency trace v1.1.5 on 2.6.26 --------------------------------------------------------------------- - latency: 12 us, #3/3, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: bash-3730 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: sys_setpgid - => ended at: sys_setpgid - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - bash-3730 1d... 0us : _write_lock_irq (sys_setpgid) - bash-3730 1d..1 1us+: _write_unlock_irq (sys_setpgid) - bash-3730 1d..2 14us : trace_hardirqs_on (sys_setpgid) - - -Here we see that that we had a latency of 12 microsecs (which is -very good). The _write_lock_irq in sys_setpgid disabled -interrupts. The difference between the 12 and the displayed -timestamp 14us occurred because the clock was incremented +# irqsoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 16 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: swapper/0-0 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: run_timer_softirq +# => ended at: run_timer_softirq +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + <idle>-0 0d.s2 0us+: _raw_spin_lock_irq <-run_timer_softirq + <idle>-0 0dNs3 17us : _raw_spin_unlock_irq <-run_timer_softirq + <idle>-0 0dNs3 17us+: trace_hardirqs_on <-run_timer_softirq + <idle>-0 0dNs3 25us : <stack trace> + => _raw_spin_unlock_irq + => run_timer_softirq + => __do_softirq + => call_softirq + => do_softirq + => irq_exit + => smp_apic_timer_interrupt + => apic_timer_interrupt + => rcu_idle_exit + => cpu_idle + => rest_init + => start_kernel + => x86_64_start_reservations + => x86_64_start_kernel + +Here we see that that we had a latency of 16 microseconds (which is +very good). The _raw_spin_lock_irq in run_timer_softirq disabled +interrupts. The difference between the 16 and the displayed +timestamp 25us occurred because the clock was incremented between the time of recording the max latency and the time of recording the function that had that latency. -Note the above example had ftrace_enabled not set. If we set the -ftrace_enabled, we get a much larger output: +Note the above example had function-trace not set. If we set +function-trace, we get a much larger output: + + with echo 1 > options/function-trace # tracer: irqsoff # -irqsoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 50 us, #101/101, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: ls-4339 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: __alloc_pages_internal - => ended at: __alloc_pages_internal - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - ls-4339 0...1 0us+: get_page_from_freelist (__alloc_pages_internal) - ls-4339 0d..1 3us : rmqueue_bulk (get_page_from_freelist) - ls-4339 0d..1 3us : _spin_lock (rmqueue_bulk) - ls-4339 0d..1 4us : add_preempt_count (_spin_lock) - ls-4339 0d..2 4us : __rmqueue (rmqueue_bulk) - ls-4339 0d..2 5us : __rmqueue_smallest (__rmqueue) - ls-4339 0d..2 5us : __mod_zone_page_state (__rmqueue_smallest) - ls-4339 0d..2 6us : __rmqueue (rmqueue_bulk) - ls-4339 0d..2 6us : __rmqueue_smallest (__rmqueue) - ls-4339 0d..2 7us : __mod_zone_page_state (__rmqueue_smallest) - ls-4339 0d..2 7us : __rmqueue (rmqueue_bulk) - ls-4339 0d..2 8us : __rmqueue_smallest (__rmqueue) +# irqsoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 71 us, #168/168, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: bash-2042 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: ata_scsi_queuecmd +# => ended at: ata_scsi_queuecmd +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + bash-2042 3d... 0us : _raw_spin_lock_irqsave <-ata_scsi_queuecmd + bash-2042 3d... 0us : add_preempt_count <-_raw_spin_lock_irqsave + bash-2042 3d..1 1us : ata_scsi_find_dev <-ata_scsi_queuecmd + bash-2042 3d..1 1us : __ata_scsi_find_dev <-ata_scsi_find_dev + bash-2042 3d..1 2us : ata_find_dev.part.14 <-__ata_scsi_find_dev + bash-2042 3d..1 2us : ata_qc_new_init <-__ata_scsi_queuecmd + bash-2042 3d..1 3us : ata_sg_init <-__ata_scsi_queuecmd + bash-2042 3d..1 4us : ata_scsi_rw_xlat <-__ata_scsi_queuecmd + bash-2042 3d..1 4us : ata_build_rw_tf <-ata_scsi_rw_xlat [...] - ls-4339 0d..2 46us : __rmqueue_smallest (__rmqueue) - ls-4339 0d..2 47us : __mod_zone_page_state (__rmqueue_smallest) - ls-4339 0d..2 47us : __rmqueue (rmqueue_bulk) - ls-4339 0d..2 48us : __rmqueue_smallest (__rmqueue) - ls-4339 0d..2 48us : __mod_zone_page_state (__rmqueue_smallest) - ls-4339 0d..2 49us : _spin_unlock (rmqueue_bulk) - ls-4339 0d..2 49us : sub_preempt_count (_spin_unlock) - ls-4339 0d..1 50us : get_page_from_freelist (__alloc_pages_internal) - ls-4339 0d..2 51us : trace_hardirqs_on (__alloc_pages_internal) - - - -Here we traced a 50 microsecond latency. But we also see all the + bash-2042 3d..1 67us : delay_tsc <-__delay + bash-2042 3d..1 67us : add_preempt_count <-delay_tsc + bash-2042 3d..2 67us : sub_preempt_count <-delay_tsc + bash-2042 3d..1 67us : add_preempt_count <-delay_tsc + bash-2042 3d..2 68us : sub_preempt_count <-delay_tsc + bash-2042 3d..1 68us+: ata_bmdma_start <-ata_bmdma_qc_issue + bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd + bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd + bash-2042 3d..1 72us+: trace_hardirqs_on <-ata_scsi_queuecmd + bash-2042 3d..1 120us : <stack trace> + => _raw_spin_unlock_irqrestore + => ata_scsi_queuecmd + => scsi_dispatch_cmd + => scsi_request_fn + => __blk_run_queue_uncond + => __blk_run_queue + => blk_queue_bio + => generic_make_request + => submit_bio + => submit_bh + => __ext3_get_inode_loc + => ext3_iget + => ext3_lookup + => lookup_real + => __lookup_hash + => walk_component + => lookup_last + => path_lookupat + => filename_lookup + => user_path_at_empty + => user_path_at + => vfs_fstatat + => vfs_stat + => sys_newstat + => system_call_fastpath + + +Here we traced a 71 microsecond latency. But we also see all the functions that were called during that time. Note that by enabling function tracing, we incur an added overhead. This overhead may extend the latency times. But nevertheless, this @@ -614,120 +1049,122 @@ Like the irqsoff tracer, it records the maximum latency for which preemption was disabled. The control of preemptoff tracer is much like the irqsoff tracer. + # echo 0 > options/function-trace # echo preemptoff > current_tracer - # echo latency-format > trace_options - # echo 0 > tracing_max_latency # echo 1 > tracing_on + # echo 0 > tracing_max_latency # ls -ltr [...] # echo 0 > tracing_on # cat trace # tracer: preemptoff # -preemptoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 29 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: do_IRQ - => ended at: __do_softirq - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - sshd-4261 0d.h. 0us+: irq_enter (do_IRQ) - sshd-4261 0d.s. 29us : _local_bh_enable (__do_softirq) - sshd-4261 0d.s1 30us : trace_preempt_on (__do_softirq) +# preemptoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 46 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: sshd-1991 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: do_IRQ +# => ended at: do_IRQ +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + sshd-1991 1d.h. 0us+: irq_enter <-do_IRQ + sshd-1991 1d..1 46us : irq_exit <-do_IRQ + sshd-1991 1d..1 47us+: trace_preempt_on <-do_IRQ + sshd-1991 1d..1 52us : <stack trace> + => sub_preempt_count + => irq_exit + => do_IRQ + => ret_from_intr This has some more changes. Preemption was disabled when an -interrupt came in (notice the 'h'), and was enabled while doing -a softirq. (notice the 's'). But we also see that interrupts -have been disabled when entering the preempt off section and -leaving it (the 'd'). We do not know if interrupts were enabled -in the mean time. +interrupt came in (notice the 'h'), and was enabled on exit. +But we also see that interrupts have been disabled when entering +the preempt off section and leaving it (the 'd'). We do not know if +interrupts were enabled in the mean time or shortly after this +was over. # tracer: preemptoff # -preemptoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 63 us, #87/87, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: remove_wait_queue - => ended at: __do_softirq - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - sshd-4261 0d..1 0us : _spin_lock_irqsave (remove_wait_queue) - sshd-4261 0d..1 1us : _spin_unlock_irqrestore (remove_wait_queue) - sshd-4261 0d..1 2us : do_IRQ (common_interrupt) - sshd-4261 0d..1 2us : irq_enter (do_IRQ) - sshd-4261 0d..1 2us : idle_cpu (irq_enter) - sshd-4261 0d..1 3us : add_preempt_count (irq_enter) - sshd-4261 0d.h1 3us : idle_cpu (irq_enter) - sshd-4261 0d.h. 4us : handle_fasteoi_irq (do_IRQ) +# preemptoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 83 us, #241/241, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: bash-1994 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: wake_up_new_task +# => ended at: task_rq_unlock +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + bash-1994 1d..1 0us : _raw_spin_lock_irqsave <-wake_up_new_task + bash-1994 1d..1 0us : select_task_rq_fair <-select_task_rq + bash-1994 1d..1 1us : __rcu_read_lock <-select_task_rq_fair + bash-1994 1d..1 1us : source_load <-select_task_rq_fair + bash-1994 1d..1 1us : source_load <-select_task_rq_fair [...] - sshd-4261 0d.h. 12us : add_preempt_count (_spin_lock) - sshd-4261 0d.h1 12us : ack_ioapic_quirk_irq (handle_fasteoi_irq) - sshd-4261 0d.h1 13us : move_native_irq (ack_ioapic_quirk_irq) - sshd-4261 0d.h1 13us : _spin_unlock (handle_fasteoi_irq) - sshd-4261 0d.h1 14us : sub_preempt_count (_spin_unlock) - sshd-4261 0d.h1 14us : irq_exit (do_IRQ) - sshd-4261 0d.h1 15us : sub_preempt_count (irq_exit) - sshd-4261 0d..2 15us : do_softirq (irq_exit) - sshd-4261 0d... 15us : __do_softirq (do_softirq) - sshd-4261 0d... 16us : __local_bh_disable (__do_softirq) - sshd-4261 0d... 16us+: add_preempt_count (__local_bh_disable) - sshd-4261 0d.s4 20us : add_preempt_count (__local_bh_disable) - sshd-4261 0d.s4 21us : sub_preempt_count (local_bh_enable) - sshd-4261 0d.s5 21us : sub_preempt_count (local_bh_enable) + bash-1994 1d..1 12us : irq_enter <-smp_apic_timer_interrupt + bash-1994 1d..1 12us : rcu_irq_enter <-irq_enter + bash-1994 1d..1 13us : add_preempt_count <-irq_enter + bash-1994 1d.h1 13us : exit_idle <-smp_apic_timer_interrupt + bash-1994 1d.h1 13us : hrtimer_interrupt <-smp_apic_timer_interrupt + bash-1994 1d.h1 13us : _raw_spin_lock <-hrtimer_interrupt + bash-1994 1d.h1 14us : add_preempt_count <-_raw_spin_lock + bash-1994 1d.h2 14us : ktime_get_update_offsets <-hrtimer_interrupt [...] - sshd-4261 0d.s6 41us : add_preempt_count (__local_bh_disable) - sshd-4261 0d.s6 42us : sub_preempt_count (local_bh_enable) - sshd-4261 0d.s7 42us : sub_preempt_count (local_bh_enable) - sshd-4261 0d.s5 43us : add_preempt_count (__local_bh_disable) - sshd-4261 0d.s5 43us : sub_preempt_count (local_bh_enable_ip) - sshd-4261 0d.s6 44us : sub_preempt_count (local_bh_enable_ip) - sshd-4261 0d.s5 44us : add_preempt_count (__local_bh_disable) - sshd-4261 0d.s5 45us : sub_preempt_count (local_bh_enable) + bash-1994 1d.h1 35us : lapic_next_event <-clockevents_program_event + bash-1994 1d.h1 35us : irq_exit <-smp_apic_timer_interrupt + bash-1994 1d.h1 36us : sub_preempt_count <-irq_exit + bash-1994 1d..2 36us : do_softirq <-irq_exit + bash-1994 1d..2 36us : __do_softirq <-call_softirq + bash-1994 1d..2 36us : __local_bh_disable <-__do_softirq + bash-1994 1d.s2 37us : add_preempt_count <-_raw_spin_lock_irq + bash-1994 1d.s3 38us : _raw_spin_unlock <-run_timer_softirq + bash-1994 1d.s3 39us : sub_preempt_count <-_raw_spin_unlock + bash-1994 1d.s2 39us : call_timer_fn <-run_timer_softirq [...] - sshd-4261 0d.s. 63us : _local_bh_enable (__do_softirq) - sshd-4261 0d.s1 64us : trace_preempt_on (__do_softirq) + bash-1994 1dNs2 81us : cpu_needs_another_gp <-rcu_process_callbacks + bash-1994 1dNs2 82us : __local_bh_enable <-__do_softirq + bash-1994 1dNs2 82us : sub_preempt_count <-__local_bh_enable + bash-1994 1dN.2 82us : idle_cpu <-irq_exit + bash-1994 1dN.2 83us : rcu_irq_exit <-irq_exit + bash-1994 1dN.2 83us : sub_preempt_count <-irq_exit + bash-1994 1.N.1 84us : _raw_spin_unlock_irqrestore <-task_rq_unlock + bash-1994 1.N.1 84us+: trace_preempt_on <-task_rq_unlock + bash-1994 1.N.1 104us : <stack trace> + => sub_preempt_count + => _raw_spin_unlock_irqrestore + => task_rq_unlock + => wake_up_new_task + => do_fork + => sys_clone + => stub_clone The above is an example of the preemptoff trace with -ftrace_enabled set. Here we see that interrupts were disabled +function-trace set. Here we see that interrupts were not disabled the entire time. The irq_enter code lets us know that we entered an interrupt 'h'. Before that, the functions being traced still show that it is not in an interrupt, but we can see from the functions themselves that this is not the case. -Notice that __do_softirq when called does not have a -preempt_count. It may seem that we missed a preempt enabling. -What really happened is that the preempt count is held on the -thread's stack and we switched to the softirq stack (4K stacks -in effect). The code does not copy the preempt count, but -because interrupts are disabled, we do not need to worry about -it. Having a tracer like this is good for letting people know -what really happens inside the kernel. - - preemptirqsoff -------------- @@ -762,38 +1199,57 @@ tracer. Again, using this trace is much like the irqsoff and preemptoff tracers. + # echo 0 > options/function-trace # echo preemptirqsoff > current_tracer - # echo latency-format > trace_options - # echo 0 > tracing_max_latency # echo 1 > tracing_on + # echo 0 > tracing_max_latency # ls -ltr [...] # echo 0 > tracing_on # cat trace # tracer: preemptirqsoff # -preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 293 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: ls-4860 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: apic_timer_interrupt - => ended at: __do_softirq - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - ls-4860 0d... 0us!: trace_hardirqs_off_thunk (apic_timer_interrupt) - ls-4860 0d.s. 294us : _local_bh_enable (__do_softirq) - ls-4860 0d.s1 294us : trace_preempt_on (__do_softirq) - +# preemptirqsoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 100 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: ls-2230 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: ata_scsi_queuecmd +# => ended at: ata_scsi_queuecmd +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + ls-2230 3d... 0us+: _raw_spin_lock_irqsave <-ata_scsi_queuecmd + ls-2230 3...1 100us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd + ls-2230 3...1 101us+: trace_preempt_on <-ata_scsi_queuecmd + ls-2230 3...1 111us : <stack trace> + => sub_preempt_count + => _raw_spin_unlock_irqrestore + => ata_scsi_queuecmd + => scsi_dispatch_cmd + => scsi_request_fn + => __blk_run_queue_uncond + => __blk_run_queue + => blk_queue_bio + => generic_make_request + => submit_bio + => submit_bh + => ext3_bread + => ext3_dir_bread + => htree_dirblock_to_tree + => ext3_htree_fill_tree + => ext3_readdir + => vfs_readdir + => sys_getdents + => system_call_fastpath The trace_hardirqs_off_thunk is called from assembly on x86 when @@ -802,105 +1258,158 @@ function tracing, we do not know if interrupts were enabled within the preemption points. We do see that it started with preemption enabled. -Here is a trace with ftrace_enabled set: - +Here is a trace with function-trace set: # tracer: preemptirqsoff # -preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 105 us, #183/183, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: sshd-4261 (uid:0 nice:0 policy:0 rt_prio:0) - ----------------- - => started at: write_chan - => ended at: __do_softirq - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - ls-4473 0.N.. 0us : preempt_schedule (write_chan) - ls-4473 0dN.1 1us : _spin_lock (schedule) - ls-4473 0dN.1 2us : add_preempt_count (_spin_lock) - ls-4473 0d..2 2us : put_prev_task_fair (schedule) -[...] - ls-4473 0d..2 13us : set_normalized_timespec (ktime_get_ts) - ls-4473 0d..2 13us : __switch_to (schedule) - sshd-4261 0d..2 14us : finish_task_switch (schedule) - sshd-4261 0d..2 14us : _spin_unlock_irq (finish_task_switch) - sshd-4261 0d..1 15us : add_preempt_count (_spin_lock_irqsave) - sshd-4261 0d..2 16us : _spin_unlock_irqrestore (hrtick_set) - sshd-4261 0d..2 16us : do_IRQ (common_interrupt) - sshd-4261 0d..2 17us : irq_enter (do_IRQ) - sshd-4261 0d..2 17us : idle_cpu (irq_enter) - sshd-4261 0d..2 18us : add_preempt_count (irq_enter) - sshd-4261 0d.h2 18us : idle_cpu (irq_enter) - sshd-4261 0d.h. 18us : handle_fasteoi_irq (do_IRQ) - sshd-4261 0d.h. 19us : _spin_lock (handle_fasteoi_irq) - sshd-4261 0d.h. 19us : add_preempt_count (_spin_lock) - sshd-4261 0d.h1 20us : _spin_unlock (handle_fasteoi_irq) - sshd-4261 0d.h1 20us : sub_preempt_count (_spin_unlock) -[...] - sshd-4261 0d.h1 28us : _spin_unlock (handle_fasteoi_irq) - sshd-4261 0d.h1 29us : sub_preempt_count (_spin_unlock) - sshd-4261 0d.h2 29us : irq_exit (do_IRQ) - sshd-4261 0d.h2 29us : sub_preempt_count (irq_exit) - sshd-4261 0d..3 30us : do_softirq (irq_exit) - sshd-4261 0d... 30us : __do_softirq (do_softirq) - sshd-4261 0d... 31us : __local_bh_disable (__do_softirq) - sshd-4261 0d... 31us+: add_preempt_count (__local_bh_disable) - sshd-4261 0d.s4 34us : add_preempt_count (__local_bh_disable) +# preemptirqsoff latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 161 us, #339/339, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: ls-2269 (uid:0 nice:0 policy:0 rt_prio:0) +# ----------------- +# => started at: schedule +# => ended at: mutex_unlock +# +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / +kworker/-59 3...1 0us : __schedule <-schedule +kworker/-59 3d..1 0us : rcu_preempt_qs <-rcu_note_context_switch +kworker/-59 3d..1 1us : add_preempt_count <-_raw_spin_lock_irq +kworker/-59 3d..2 1us : deactivate_task <-__schedule +kworker/-59 3d..2 1us : dequeue_task <-deactivate_task +kworker/-59 3d..2 2us : update_rq_clock <-dequeue_task +kworker/-59 3d..2 2us : dequeue_task_fair <-dequeue_task +kworker/-59 3d..2 2us : update_curr <-dequeue_task_fair +kworker/-59 3d..2 2us : update_min_vruntime <-update_curr +kworker/-59 3d..2 3us : cpuacct_charge <-update_curr +kworker/-59 3d..2 3us : __rcu_read_lock <-cpuacct_charge +kworker/-59 3d..2 3us : __rcu_read_unlock <-cpuacct_charge +kworker/-59 3d..2 3us : update_cfs_rq_blocked_load <-dequeue_task_fair +kworker/-59 3d..2 4us : clear_buddies <-dequeue_task_fair +kworker/-59 3d..2 4us : account_entity_dequeue <-dequeue_task_fair +kworker/-59 3d..2 4us : update_min_vruntime <-dequeue_task_fair +kworker/-59 3d..2 4us : update_cfs_shares <-dequeue_task_fair +kworker/-59 3d..2 5us : hrtick_update <-dequeue_task_fair +kworker/-59 3d..2 5us : wq_worker_sleeping <-__schedule +kworker/-59 3d..2 5us : kthread_data <-wq_worker_sleeping +kworker/-59 3d..2 5us : put_prev_task_fair <-__schedule +kworker/-59 3d..2 6us : pick_next_task_fair <-pick_next_task +kworker/-59 3d..2 6us : clear_buddies <-pick_next_task_fair +kworker/-59 3d..2 6us : set_next_entity <-pick_next_task_fair +kworker/-59 3d..2 6us : update_stats_wait_end <-set_next_entity + ls-2269 3d..2 7us : finish_task_switch <-__schedule + ls-2269 3d..2 7us : _raw_spin_unlock_irq <-finish_task_switch + ls-2269 3d..2 8us : do_IRQ <-ret_from_intr + ls-2269 3d..2 8us : irq_enter <-do_IRQ + ls-2269 3d..2 8us : rcu_irq_enter <-irq_enter + ls-2269 3d..2 9us : add_preempt_count <-irq_enter + ls-2269 3d.h2 9us : exit_idle <-do_IRQ [...] - sshd-4261 0d.s3 43us : sub_preempt_count (local_bh_enable_ip) - sshd-4261 0d.s4 44us : sub_preempt_count (local_bh_enable_ip) - sshd-4261 0d.s3 44us : smp_apic_timer_interrupt (apic_timer_interrupt) - sshd-4261 0d.s3 45us : irq_enter (smp_apic_timer_interrupt) - sshd-4261 0d.s3 45us : idle_cpu (irq_enter) - sshd-4261 0d.s3 46us : add_preempt_count (irq_enter) - sshd-4261 0d.H3 46us : idle_cpu (irq_enter) - sshd-4261 0d.H3 47us : hrtimer_interrupt (smp_apic_timer_interrupt) - sshd-4261 0d.H3 47us : ktime_get (hrtimer_interrupt) + ls-2269 3d.h3 20us : sub_preempt_count <-_raw_spin_unlock + ls-2269 3d.h2 20us : irq_exit <-do_IRQ + ls-2269 3d.h2 21us : sub_preempt_count <-irq_exit + ls-2269 3d..3 21us : do_softirq <-irq_exit + ls-2269 3d..3 21us : __do_softirq <-call_softirq + ls-2269 3d..3 21us+: __local_bh_disable <-__do_softirq + ls-2269 3d.s4 29us : sub_preempt_count <-_local_bh_enable_ip + ls-2269 3d.s5 29us : sub_preempt_count <-_local_bh_enable_ip + ls-2269 3d.s5 31us : do_IRQ <-ret_from_intr + ls-2269 3d.s5 31us : irq_enter <-do_IRQ + ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter [...] - sshd-4261 0d.H3 81us : tick_program_event (hrtimer_interrupt) - sshd-4261 0d.H3 82us : ktime_get (tick_program_event) - sshd-4261 0d.H3 82us : ktime_get_ts (ktime_get) - sshd-4261 0d.H3 83us : getnstimeofday (ktime_get_ts) - sshd-4261 0d.H3 83us : set_normalized_timespec (ktime_get_ts) - sshd-4261 0d.H3 84us : clockevents_program_event (tick_program_event) - sshd-4261 0d.H3 84us : lapic_next_event (clockevents_program_event) - sshd-4261 0d.H3 85us : irq_exit (smp_apic_timer_interrupt) - sshd-4261 0d.H3 85us : sub_preempt_count (irq_exit) - sshd-4261 0d.s4 86us : sub_preempt_count (irq_exit) - sshd-4261 0d.s3 86us : add_preempt_count (__local_bh_disable) + ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter + ls-2269 3d.s5 32us : add_preempt_count <-irq_enter + ls-2269 3d.H5 32us : exit_idle <-do_IRQ + ls-2269 3d.H5 32us : handle_irq <-do_IRQ + ls-2269 3d.H5 32us : irq_to_desc <-handle_irq + ls-2269 3d.H5 33us : handle_fasteoi_irq <-handle_irq [...] - sshd-4261 0d.s1 98us : sub_preempt_count (net_rx_action) - sshd-4261 0d.s. 99us : add_preempt_count (_spin_lock_irq) - sshd-4261 0d.s1 99us+: _spin_unlock_irq (run_timer_softirq) - sshd-4261 0d.s. 104us : _local_bh_enable (__do_softirq) - sshd-4261 0d.s. 104us : sub_preempt_count (_local_bh_enable) - sshd-4261 0d.s. 105us : _local_bh_enable (__do_softirq) - sshd-4261 0d.s1 105us : trace_preempt_on (__do_softirq) - - -This is a very interesting trace. It started with the preemption -of the ls task. We see that the task had the "need_resched" bit -set via the 'N' in the trace. Interrupts were disabled before -the spin_lock at the beginning of the trace. We see that a -schedule took place to run sshd. When the interrupts were -enabled, we took an interrupt. On return from the interrupt -handler, the softirq ran. We took another interrupt while -running the softirq as we see from the capital 'H'. + ls-2269 3d.s5 158us : _raw_spin_unlock_irqrestore <-rtl8139_poll + ls-2269 3d.s3 158us : net_rps_action_and_irq_enable.isra.65 <-net_rx_action + ls-2269 3d.s3 159us : __local_bh_enable <-__do_softirq + ls-2269 3d.s3 159us : sub_preempt_count <-__local_bh_enable + ls-2269 3d..3 159us : idle_cpu <-irq_exit + ls-2269 3d..3 159us : rcu_irq_exit <-irq_exit + ls-2269 3d..3 160us : sub_preempt_count <-irq_exit + ls-2269 3d... 161us : __mutex_unlock_slowpath <-mutex_unlock + ls-2269 3d... 162us+: trace_hardirqs_on <-mutex_unlock + ls-2269 3d... 186us : <stack trace> + => __mutex_unlock_slowpath + => mutex_unlock + => process_output + => n_tty_write + => tty_write + => vfs_write + => sys_write + => system_call_fastpath + +This is an interesting trace. It started with kworker running and +scheduling out and ls taking over. But as soon as ls released the +rq lock and enabled interrupts (but not preemption) an interrupt +triggered. When the interrupt finished, it started running softirqs. +But while the softirq was running, another interrupt triggered. +When an interrupt is running inside a softirq, the annotation is 'H'. wakeup ------ +One common case that people are interested in tracing is the +time it takes for a task that is woken to actually wake up. +Now for non Real-Time tasks, this can be arbitrary. But tracing +it none the less can be interesting. + +Without function tracing: + + # echo 0 > options/function-trace + # echo wakeup > current_tracer + # echo 1 > tracing_on + # echo 0 > tracing_max_latency + # chrt -f 5 sleep 1 + # echo 0 > tracing_on + # cat trace +# tracer: wakeup +# +# wakeup latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 15 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: kworker/3:1H-312 (uid:0 nice:-20 policy:0 rt_prio:0) +# ----------------- +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + <idle>-0 3dNs7 0us : 0:120:R + [003] 312:100:R kworker/3:1H + <idle>-0 3dNs7 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up + <idle>-0 3d..3 15us : __schedule <-schedule + <idle>-0 3d..3 15us : 0:120:R ==> [003] 312:100:R kworker/3:1H + +The tracer only traces the highest priority task in the system +to avoid tracing the normal circumstances. Here we see that +the kworker with a nice priority of -20 (not very nice), took +just 15 microseconds from the time it woke up, to the time it +ran. + +Non Real-Time tasks are not that interesting. A more interesting +trace is to concentrate only on Real-Time tasks. + +wakeup_rt +--------- + In a Real-Time environment it is very important to know the wakeup time it takes for the highest priority task that is woken up to the time that it executes. This is also known as "schedule @@ -914,124 +1423,229 @@ Real-Time environments are interested in the worst case latency. That is the longest latency it takes for something to happen, and not the average. We can have a very fast scheduler that may only have a large latency once in a while, but that would not -work well with Real-Time tasks. The wakeup tracer was designed +work well with Real-Time tasks. The wakeup_rt tracer was designed to record the worst case wakeups of RT tasks. Non-RT tasks are not recorded because the tracer only records one worst case and tracing non-RT tasks that are unpredictable will overwrite the -worst case latency of RT tasks. +worst case latency of RT tasks (just run the normal wakeup +tracer for a while to see that effect). Since this tracer only deals with RT tasks, we will run this slightly differently than we did with the previous tracers. Instead of performing an 'ls', we will run 'sleep 1' under 'chrt' which changes the priority of the task. - # echo wakeup > current_tracer - # echo latency-format > trace_options - # echo 0 > tracing_max_latency + # echo 0 > options/function-trace + # echo wakeup_rt > current_tracer # echo 1 > tracing_on + # echo 0 > tracing_max_latency # chrt -f 5 sleep 1 # echo 0 > tracing_on # cat trace # tracer: wakeup # -wakeup latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 4 us, #2/2, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: sleep-4901 (uid:0 nice:0 policy:1 rt_prio:5) - ----------------- - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / - <idle>-0 1d.h4 0us+: try_to_wake_up (wake_up_process) - <idle>-0 1d..4 4us : schedule (cpu_idle) - - -Running this on an idle system, we see that it only took 4 -microseconds to perform the task switch. Note, since the trace -marker in the schedule is before the actual "switch", we stop -the tracing when the recorded task is about to schedule in. This -may change if we add a new marker at the end of the scheduler. - -Notice that the recorded task is 'sleep' with the PID of 4901 +# tracer: wakeup_rt +# +# wakeup_rt latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 5 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: sleep-2389 (uid:0 nice:0 policy:1 rt_prio:5) +# ----------------- +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + <idle>-0 3d.h4 0us : 0:120:R + [003] 2389: 94:R sleep + <idle>-0 3d.h4 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up + <idle>-0 3d..3 5us : __schedule <-schedule + <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep + + +Running this on an idle system, we see that it only took 5 microseconds +to perform the task switch. Note, since the trace point in the schedule +is before the actual "switch", we stop the tracing when the recorded task +is about to schedule in. This may change if we add a new marker at the +end of the scheduler. + +Notice that the recorded task is 'sleep' with the PID of 2389 and it has an rt_prio of 5. This priority is user-space priority and not the internal kernel priority. The policy is 1 for SCHED_FIFO and 2 for SCHED_RR. -Doing the same with chrt -r 5 and ftrace_enabled set. +Note, that the trace data shows the internal priority (99 - rtprio). -# tracer: wakeup + <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep + +The 0:120:R means idle was running with a nice priority of 0 (120 - 20) +and in the running state 'R'. The sleep task was scheduled in with +2389: 94:R. That is the priority is the kernel rtprio (99 - 5 = 94) +and it too is in the running state. + +Doing the same with chrt -r 5 and function-trace set. + + echo 1 > options/function-trace + +# tracer: wakeup_rt # -wakeup latency trace v1.1.5 on 2.6.26-rc8 --------------------------------------------------------------------- - latency: 50 us, #60/60, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) - ----------------- - | task: sleep-4068 (uid:0 nice:0 policy:2 rt_prio:5) - ----------------- - -# _------=> CPU# -# / _-----=> irqs-off -# | / _----=> need-resched -# || / _---=> hardirq/softirq -# ||| / _--=> preempt-depth -# |||| / -# ||||| delay -# cmd pid ||||| time | caller -# \ / ||||| \ | / -ksoftirq-7 1d.H3 0us : try_to_wake_up (wake_up_process) -ksoftirq-7 1d.H4 1us : sub_preempt_count (marker_probe_cb) -ksoftirq-7 1d.H3 2us : check_preempt_wakeup (try_to_wake_up) -ksoftirq-7 1d.H3 3us : update_curr (check_preempt_wakeup) -ksoftirq-7 1d.H3 4us : calc_delta_mine (update_curr) -ksoftirq-7 1d.H3 5us : __resched_task (check_preempt_wakeup) -ksoftirq-7 1d.H3 6us : task_wake_up_rt (try_to_wake_up) -ksoftirq-7 1d.H3 7us : _spin_unlock_irqrestore (try_to_wake_up) -[...] -ksoftirq-7 1d.H2 17us : irq_exit (smp_apic_timer_interrupt) -ksoftirq-7 1d.H2 18us : sub_preempt_count (irq_exit) -ksoftirq-7 1d.s3 19us : sub_preempt_count (irq_exit) -ksoftirq-7 1..s2 20us : rcu_process_callbacks (__do_softirq) -[...] -ksoftirq-7 1..s2 26us : __rcu_process_callbacks (rcu_process_callbacks) -ksoftirq-7 1d.s2 27us : _local_bh_enable (__do_softirq) -ksoftirq-7 1d.s2 28us : sub_preempt_count (_local_bh_enable) -ksoftirq-7 1.N.3 29us : sub_preempt_count (ksoftirqd) -ksoftirq-7 1.N.2 30us : _cond_resched (ksoftirqd) -ksoftirq-7 1.N.2 31us : __cond_resched (_cond_resched) -ksoftirq-7 1.N.2 32us : add_preempt_count (__cond_resched) -ksoftirq-7 1.N.2 33us : schedule (__cond_resched) -ksoftirq-7 1.N.2 33us : add_preempt_count (schedule) -ksoftirq-7 1.N.3 34us : hrtick_clear (schedule) -ksoftirq-7 1dN.3 35us : _spin_lock (schedule) -ksoftirq-7 1dN.3 36us : add_preempt_count (_spin_lock) -ksoftirq-7 1d..4 37us : put_prev_task_fair (schedule) -ksoftirq-7 1d..4 38us : update_curr (put_prev_task_fair) -[...] -ksoftirq-7 1d..5 47us : _spin_trylock (tracing_record_cmdline) -ksoftirq-7 1d..5 48us : add_preempt_count (_spin_trylock) -ksoftirq-7 1d..6 49us : _spin_unlock (tracing_record_cmdline) -ksoftirq-7 1d..6 49us : sub_preempt_count (_spin_unlock) -ksoftirq-7 1d..4 50us : schedule (__cond_resched) - -The interrupt went off while running ksoftirqd. This task runs -at SCHED_OTHER. Why did not we see the 'N' set early? This may -be a harmless bug with x86_32 and 4K stacks. On x86_32 with 4K -stacks configured, the interrupt and softirq run with their own -stack. Some information is held on the top of the task's stack -(need_resched and preempt_count are both stored there). The -setting of the NEED_RESCHED bit is done directly to the task's -stack, but the reading of the NEED_RESCHED is done by looking at -the current stack, which in this case is the stack for the hard -interrupt. This hides the fact that NEED_RESCHED has been set. -We do not see the 'N' until we switch back to the task's -assigned stack. +# wakeup_rt latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 29 us, #85/85, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: sleep-2448 (uid:0 nice:0 policy:1 rt_prio:5) +# ----------------- +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + <idle>-0 3d.h4 1us+: 0:120:R + [003] 2448: 94:R sleep + <idle>-0 3d.h4 2us : ttwu_do_activate.constprop.87 <-try_to_wake_up + <idle>-0 3d.h3 3us : check_preempt_curr <-ttwu_do_wakeup + <idle>-0 3d.h3 3us : resched_task <-check_preempt_curr + <idle>-0 3dNh3 4us : task_woken_rt <-ttwu_do_wakeup + <idle>-0 3dNh3 4us : _raw_spin_unlock <-try_to_wake_up + <idle>-0 3dNh3 4us : sub_preempt_count <-_raw_spin_unlock + <idle>-0 3dNh2 5us : ttwu_stat <-try_to_wake_up + <idle>-0 3dNh2 5us : _raw_spin_unlock_irqrestore <-try_to_wake_up + <idle>-0 3dNh2 6us : sub_preempt_count <-_raw_spin_unlock_irqrestore + <idle>-0 3dNh1 6us : _raw_spin_lock <-__run_hrtimer + <idle>-0 3dNh1 6us : add_preempt_count <-_raw_spin_lock + <idle>-0 3dNh2 7us : _raw_spin_unlock <-hrtimer_interrupt + <idle>-0 3dNh2 7us : sub_preempt_count <-_raw_spin_unlock + <idle>-0 3dNh1 7us : tick_program_event <-hrtimer_interrupt + <idle>-0 3dNh1 7us : clockevents_program_event <-tick_program_event + <idle>-0 3dNh1 8us : ktime_get <-clockevents_program_event + <idle>-0 3dNh1 8us : lapic_next_event <-clockevents_program_event + <idle>-0 3dNh1 8us : irq_exit <-smp_apic_timer_interrupt + <idle>-0 3dNh1 9us : sub_preempt_count <-irq_exit + <idle>-0 3dN.2 9us : idle_cpu <-irq_exit + <idle>-0 3dN.2 9us : rcu_irq_exit <-irq_exit + <idle>-0 3dN.2 10us : rcu_eqs_enter_common.isra.45 <-rcu_irq_exit + <idle>-0 3dN.2 10us : sub_preempt_count <-irq_exit + <idle>-0 3.N.1 11us : rcu_idle_exit <-cpu_idle + <idle>-0 3dN.1 11us : rcu_eqs_exit_common.isra.43 <-rcu_idle_exit + <idle>-0 3.N.1 11us : tick_nohz_idle_exit <-cpu_idle + <idle>-0 3dN.1 12us : menu_hrtimer_cancel <-tick_nohz_idle_exit + <idle>-0 3dN.1 12us : ktime_get <-tick_nohz_idle_exit + <idle>-0 3dN.1 12us : tick_do_update_jiffies64 <-tick_nohz_idle_exit + <idle>-0 3dN.1 13us : update_cpu_load_nohz <-tick_nohz_idle_exit + <idle>-0 3dN.1 13us : _raw_spin_lock <-update_cpu_load_nohz + <idle>-0 3dN.1 13us : add_preempt_count <-_raw_spin_lock + <idle>-0 3dN.2 13us : __update_cpu_load <-update_cpu_load_nohz + <idle>-0 3dN.2 14us : sched_avg_update <-__update_cpu_load + <idle>-0 3dN.2 14us : _raw_spin_unlock <-update_cpu_load_nohz + <idle>-0 3dN.2 14us : sub_preempt_count <-_raw_spin_unlock + <idle>-0 3dN.1 15us : calc_load_exit_idle <-tick_nohz_idle_exit + <idle>-0 3dN.1 15us : touch_softlockup_watchdog <-tick_nohz_idle_exit + <idle>-0 3dN.1 15us : hrtimer_cancel <-tick_nohz_idle_exit + <idle>-0 3dN.1 15us : hrtimer_try_to_cancel <-hrtimer_cancel + <idle>-0 3dN.1 16us : lock_hrtimer_base.isra.18 <-hrtimer_try_to_cancel + <idle>-0 3dN.1 16us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18 + <idle>-0 3dN.1 16us : add_preempt_count <-_raw_spin_lock_irqsave + <idle>-0 3dN.2 17us : __remove_hrtimer <-remove_hrtimer.part.16 + <idle>-0 3dN.2 17us : hrtimer_force_reprogram <-__remove_hrtimer + <idle>-0 3dN.2 17us : tick_program_event <-hrtimer_force_reprogram + <idle>-0 3dN.2 18us : clockevents_program_event <-tick_program_event + <idle>-0 3dN.2 18us : ktime_get <-clockevents_program_event + <idle>-0 3dN.2 18us : lapic_next_event <-clockevents_program_event + <idle>-0 3dN.2 19us : _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel + <idle>-0 3dN.2 19us : sub_preempt_count <-_raw_spin_unlock_irqrestore + <idle>-0 3dN.1 19us : hrtimer_forward <-tick_nohz_idle_exit + <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward + <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward + <idle>-0 3dN.1 20us : hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11 + <idle>-0 3dN.1 20us : __hrtimer_start_range_ns <-hrtimer_start_range_ns + <idle>-0 3dN.1 21us : lock_hrtimer_base.isra.18 <-__hrtimer_start_range_ns + <idle>-0 3dN.1 21us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18 + <idle>-0 3dN.1 21us : add_preempt_count <-_raw_spin_lock_irqsave + <idle>-0 3dN.2 22us : ktime_add_safe <-__hrtimer_start_range_ns + <idle>-0 3dN.2 22us : enqueue_hrtimer <-__hrtimer_start_range_ns + <idle>-0 3dN.2 22us : tick_program_event <-__hrtimer_start_range_ns + <idle>-0 3dN.2 23us : clockevents_program_event <-tick_program_event + <idle>-0 3dN.2 23us : ktime_get <-clockevents_program_event + <idle>-0 3dN.2 23us : lapic_next_event <-clockevents_program_event + <idle>-0 3dN.2 24us : _raw_spin_unlock_irqrestore <-__hrtimer_start_range_ns + <idle>-0 3dN.2 24us : sub_preempt_count <-_raw_spin_unlock_irqrestore + <idle>-0 3dN.1 24us : account_idle_ticks <-tick_nohz_idle_exit + <idle>-0 3dN.1 24us : account_idle_time <-account_idle_ticks + <idle>-0 3.N.1 25us : sub_preempt_count <-cpu_idle + <idle>-0 3.N.. 25us : schedule <-cpu_idle + <idle>-0 3.N.. 25us : __schedule <-preempt_schedule + <idle>-0 3.N.. 26us : add_preempt_count <-__schedule + <idle>-0 3.N.1 26us : rcu_note_context_switch <-__schedule + <idle>-0 3.N.1 26us : rcu_sched_qs <-rcu_note_context_switch + <idle>-0 3dN.1 27us : rcu_preempt_qs <-rcu_note_context_switch + <idle>-0 3.N.1 27us : _raw_spin_lock_irq <-__schedule + <idle>-0 3dN.1 27us : add_preempt_count <-_raw_spin_lock_irq + <idle>-0 3dN.2 28us : put_prev_task_idle <-__schedule + <idle>-0 3dN.2 28us : pick_next_task_stop <-pick_next_task + <idle>-0 3dN.2 28us : pick_next_task_rt <-pick_next_task + <idle>-0 3dN.2 29us : dequeue_pushable_task <-pick_next_task_rt + <idle>-0 3d..3 29us : __schedule <-preempt_schedule + <idle>-0 3d..3 30us : 0:120:R ==> [003] 2448: 94:R sleep + +This isn't that big of a trace, even with function tracing enabled, +so I included the entire trace. + +The interrupt went off while when the system was idle. Somewhere +before task_woken_rt() was called, the NEED_RESCHED flag was set, +this is indicated by the first occurrence of the 'N' flag. + +Latency tracing and events +-------------------------- +As function tracing can induce a much larger latency, but without +seeing what happens within the latency it is hard to know what +caused it. There is a middle ground, and that is with enabling +events. + + # echo 0 > options/function-trace + # echo wakeup_rt > current_tracer + # echo 1 > events/enable + # echo 1 > tracing_on + # echo 0 > tracing_max_latency + # chrt -f 5 sleep 1 + # echo 0 > tracing_on + # cat trace +# tracer: wakeup_rt +# +# wakeup_rt latency trace v1.1.5 on 3.8.0-test+ +# -------------------------------------------------------------------- +# latency: 6 us, #12/12, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) +# ----------------- +# | task: sleep-5882 (uid:0 nice:0 policy:1 rt_prio:5) +# ----------------- +# +# _------=> CPU# +# / _-----=> irqs-off +# | / _----=> need-resched +# || / _---=> hardirq/softirq +# ||| / _--=> preempt-depth +# |||| / delay +# cmd pid ||||| time | caller +# \ / ||||| \ | / + <idle>-0 2d.h4 0us : 0:120:R + [002] 5882: 94:R sleep + <idle>-0 2d.h4 0us : ttwu_do_activate.constprop.87 <-try_to_wake_up + <idle>-0 2d.h4 1us : sched_wakeup: comm=sleep pid=5882 prio=94 success=1 target_cpu=002 + <idle>-0 2dNh2 1us : hrtimer_expire_exit: hrtimer=ffff88007796feb8 + <idle>-0 2.N.2 2us : power_end: cpu_id=2 + <idle>-0 2.N.2 3us : cpu_idle: state=4294967295 cpu_id=2 + <idle>-0 2dN.3 4us : hrtimer_cancel: hrtimer=ffff88007d50d5e0 + <idle>-0 2dN.3 4us : hrtimer_start: hrtimer=ffff88007d50d5e0 function=tick_sched_timer expires=34311211000000 softexpires=34311211000000 + <idle>-0 2.N.2 5us : rcu_utilization: Start context switch + <idle>-0 2.N.2 5us : rcu_utilization: End context switch + <idle>-0 2d..3 6us : __schedule <-schedule + <idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep + function -------- @@ -1039,6 +1653,7 @@ function This tracer is the function tracer. Enabling the function tracer can be done from the debug file system. Make sure the ftrace_enabled is set; otherwise this tracer is a nop. +See the "ftrace_enabled" section below. # sysctl kernel.ftrace_enabled=1 # echo function > current_tracer @@ -1048,23 +1663,23 @@ ftrace_enabled is set; otherwise this tracer is a nop. # cat trace # tracer: function # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - bash-4003 [00] 123.638713: finish_task_switch <-schedule - bash-4003 [00] 123.638714: _spin_unlock_irq <-finish_task_switch - bash-4003 [00] 123.638714: sub_preempt_count <-_spin_unlock_irq - bash-4003 [00] 123.638715: hrtick_set <-schedule - bash-4003 [00] 123.638715: _spin_lock_irqsave <-hrtick_set - bash-4003 [00] 123.638716: add_preempt_count <-_spin_lock_irqsave - bash-4003 [00] 123.638716: _spin_unlock_irqrestore <-hrtick_set - bash-4003 [00] 123.638717: sub_preempt_count <-_spin_unlock_irqrestore - bash-4003 [00] 123.638717: hrtick_clear <-hrtick_set - bash-4003 [00] 123.638718: sub_preempt_count <-schedule - bash-4003 [00] 123.638718: sub_preempt_count <-preempt_schedule - bash-4003 [00] 123.638719: wait_for_completion <-__stop_machine_run - bash-4003 [00] 123.638719: wait_for_common <-wait_for_completion - bash-4003 [00] 123.638720: _spin_lock_irq <-wait_for_common - bash-4003 [00] 123.638720: add_preempt_count <-_spin_lock_irq +# entries-in-buffer/entries-written: 24799/24799 #P:4 +# +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + bash-1994 [002] .... 3082.063030: mutex_unlock <-rb_simple_write + bash-1994 [002] .... 3082.063031: __mutex_unlock_slowpath <-mutex_unlock + bash-1994 [002] .... 3082.063031: __fsnotify_parent <-fsnotify_modify + bash-1994 [002] .... 3082.063032: fsnotify <-fsnotify_modify + bash-1994 [002] .... 3082.063032: __srcu_read_lock <-fsnotify + bash-1994 [002] .... 3082.063032: add_preempt_count <-__srcu_read_lock + bash-1994 [002] ...1 3082.063032: sub_preempt_count <-__srcu_read_lock + bash-1994 [002] .... 3082.063033: __srcu_read_unlock <-fsnotify [...] @@ -1214,79 +1829,19 @@ int main (int argc, char **argv) return 0; } +Or this simple script! -hw-branch-tracer (x86 only) ---------------------------- - -This tracer uses the x86 last branch tracing hardware feature to -collect a branch trace on all cpus with relatively low overhead. - -The tracer uses a fixed-size circular buffer per cpu and only -traces ring 0 branches. The trace file dumps that buffer in the -following format: - -# tracer: hw-branch-tracer -# -# CPU# TO <- FROM - 0 scheduler_tick+0xb5/0x1bf <- task_tick_idle+0x5/0x6 - 2 run_posix_cpu_timers+0x2b/0x72a <- run_posix_cpu_timers+0x25/0x72a - 0 scheduler_tick+0x139/0x1bf <- scheduler_tick+0xed/0x1bf - 0 scheduler_tick+0x17c/0x1bf <- scheduler_tick+0x148/0x1bf - 2 run_posix_cpu_timers+0x9e/0x72a <- run_posix_cpu_timers+0x5e/0x72a - 0 scheduler_tick+0x1b6/0x1bf <- scheduler_tick+0x1aa/0x1bf - - -The tracer may be used to dump the trace for the oops'ing cpu on -a kernel oops into the system log. To enable this, -ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one -can either use the sysctl function or set it via the proc system -interface. - - sysctl kernel.ftrace_dump_on_oops=n - -or - - echo n > /proc/sys/kernel/ftrace_dump_on_oops - -If n = 1, ftrace will dump buffers of all CPUs, if n = 2 ftrace will -only dump the buffer of the CPU that triggered the oops. - -Here's an example of such a dump after a null pointer -dereference in a kernel module: - -[57848.105921] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 -[57848.106019] IP: [<ffffffffa0000006>] open+0x6/0x14 [oops] -[57848.106019] PGD 2354e9067 PUD 2375e7067 PMD 0 -[57848.106019] Oops: 0002 [#1] SMP -[57848.106019] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:20:05.0/local_cpus -[57848.106019] Dumping ftrace buffer: -[57848.106019] --------------------------------- -[...] -[57848.106019] 0 chrdev_open+0xe6/0x165 <- cdev_put+0x23/0x24 -[57848.106019] 0 chrdev_open+0x117/0x165 <- chrdev_open+0xfa/0x165 -[57848.106019] 0 chrdev_open+0x120/0x165 <- chrdev_open+0x11c/0x165 -[57848.106019] 0 chrdev_open+0x134/0x165 <- chrdev_open+0x12b/0x165 -[57848.106019] 0 open+0x0/0x14 [oops] <- chrdev_open+0x144/0x165 -[57848.106019] 0 page_fault+0x0/0x30 <- open+0x6/0x14 [oops] -[57848.106019] 0 error_entry+0x0/0x5b <- page_fault+0x4/0x30 -[57848.106019] 0 error_kernelspace+0x0/0x31 <- error_entry+0x59/0x5b -[57848.106019] 0 error_sti+0x0/0x1 <- error_kernelspace+0x2d/0x31 -[57848.106019] 0 page_fault+0x9/0x30 <- error_sti+0x0/0x1 -[57848.106019] 0 do_page_fault+0x0/0x881 <- page_fault+0x1a/0x30 -[...] -[57848.106019] 0 do_page_fault+0x66b/0x881 <- is_prefetch+0x1ee/0x1f2 -[57848.106019] 0 do_page_fault+0x6e0/0x881 <- do_page_fault+0x67a/0x881 -[57848.106019] 0 oops_begin+0x0/0x96 <- do_page_fault+0x6e0/0x881 -[57848.106019] 0 trace_hw_branch_oops+0x0/0x2d <- oops_begin+0x9/0x96 -[...] -[57848.106019] 0 ds_suspend_bts+0x2a/0xe3 <- ds_suspend_bts+0x1a/0xe3 -[57848.106019] --------------------------------- -[57848.106019] CPU 0 -[57848.106019] Modules linked in: oops -[57848.106019] Pid: 5542, comm: cat Tainted: G W 2.6.28 #23 -[57848.106019] RIP: 0010:[<ffffffffa0000006>] [<ffffffffa0000006>] open+0x6/0x14 [oops] -[57848.106019] RSP: 0018:ffff880235457d48 EFLAGS: 00010246 -[...] +------ +#!/bin/bash + +debugfs=`sed -ne 's/^debugfs \(.*\) debugfs.*/\1/p' /proc/mounts` +echo nop > $debugfs/tracing/current_tracer +echo 0 > $debugfs/tracing/tracing_on +echo $$ > $debugfs/tracing/set_ftrace_pid +echo function > $debugfs/tracing/current_tracer +echo 1 > $debugfs/tracing/tracing_on +exec "$@" +------ function graph tracer @@ -1473,16 +2028,18 @@ starts of pointing to a simple return. (Enabling FTRACE will include the -pg switch in the compiling of the kernel.) At compile time every C file object is run through the -recordmcount.pl script (located in the scripts directory). This -script will process the C object using objdump to find all the -locations in the .text section that call mcount. (Note, only the -.text section is processed, since processing other sections like -.init.text may cause races due to those sections being freed). +recordmcount program (located in the scripts directory). This +program will parse the ELF headers in the C object to find all +the locations in the .text section that call mcount. (Note, only +white listed .text sections are processed, since processing other +sections like .init.text may cause races due to those sections +being freed unexpectedly). A new section called "__mcount_loc" is created that holds references to all the mcount call sites in the .text section. -This section is compiled back into the original object. The -final linker will add all these references into a single table. +The recordmcount program re-links this section back into the +original object. The final linking stage of the kernel will add all these +references into a single table. On boot up, before SMP is initialized, the dynamic ftrace code scans this table and updates all the locations into nops. It @@ -1493,13 +2050,25 @@ unloaded, it also removes its functions from the ftrace function list. This is automatic in the module unload code, and the module author does not need to worry about it. -When tracing is enabled, kstop_machine is called to prevent -races with the CPUS executing code being modified (which can -cause the CPU to do undesirable things), and the nops are +When tracing is enabled, the process of modifying the function +tracepoints is dependent on architecture. The old method is to use +kstop_machine to prevent races with the CPUs executing code being +modified (which can cause the CPU to do undesirable things, especially +if the modified code crosses cache (or page) boundaries), and the nops are patched back to calls. But this time, they do not call mcount (which is just a function stub). They now call into the ftrace infrastructure. +The new method of modifying the function tracepoints is to place +a breakpoint at the location to be modified, sync all CPUs, modify +the rest of the instruction not covered by the breakpoint. Sync +all CPUs again, and then remove the breakpoint with the finished +version to the ftrace call site. + +Some archs do not even need to monkey around with the synchronization, +and can just slap the new code on top of the old without any +problems with other CPUs executing it at the same time. + One special side-effect to the recording of the functions being traced is that we can now selectively choose which functions we wish to trace and which ones we want the mcount calls to remain @@ -1530,20 +2099,28 @@ mutex_lock If I am only interested in sys_nanosleep and hrtimer_interrupt: - # echo sys_nanosleep hrtimer_interrupt \ - > set_ftrace_filter + # echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter # echo function > current_tracer # echo 1 > tracing_on # usleep 1 # echo 0 > tracing_on # cat trace -# tracer: ftrace +# tracer: function +# +# entries-in-buffer/entries-written: 5/5 #P:4 # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - usleep-4134 [00] 1317.070017: hrtimer_interrupt <-smp_apic_timer_interrupt - usleep-4134 [00] 1317.070111: sys_nanosleep <-syscall_call - <idle>-0 [00] 1317.070115: hrtimer_interrupt <-smp_apic_timer_interrupt +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + usleep-2665 [001] .... 4186.475355: sys_nanosleep <-system_call_fastpath + <idle>-0 [001] d.h1 4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt + usleep-2665 [001] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt + <idle>-0 [003] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt + <idle>-0 [002] d.h1 4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt To see which functions are being traced, you can cat the file: @@ -1571,20 +2148,25 @@ Note: It is better to use quotes to enclose the wild cards, Produces: -# tracer: ftrace +# tracer: function # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - bash-4003 [00] 1480.611794: hrtimer_init <-copy_process - bash-4003 [00] 1480.611941: hrtimer_start <-hrtick_set - bash-4003 [00] 1480.611956: hrtimer_cancel <-hrtick_clear - bash-4003 [00] 1480.611956: hrtimer_try_to_cancel <-hrtimer_cancel - <idle>-0 [00] 1480.612019: hrtimer_get_next_event <-get_next_timer_interrupt - <idle>-0 [00] 1480.612025: hrtimer_get_next_event <-get_next_timer_interrupt - <idle>-0 [00] 1480.612032: hrtimer_get_next_event <-get_next_timer_interrupt - <idle>-0 [00] 1480.612037: hrtimer_get_next_event <-get_next_timer_interrupt - <idle>-0 [00] 1480.612382: hrtimer_get_next_event <-get_next_timer_interrupt - +# entries-in-buffer/entries-written: 897/897 #P:4 +# +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + <idle>-0 [003] dN.1 4228.547803: hrtimer_cancel <-tick_nohz_idle_exit + <idle>-0 [003] dN.1 4228.547804: hrtimer_try_to_cancel <-hrtimer_cancel + <idle>-0 [003] dN.2 4228.547805: hrtimer_force_reprogram <-__remove_hrtimer + <idle>-0 [003] dN.1 4228.547805: hrtimer_forward <-tick_nohz_idle_exit + <idle>-0 [003] dN.1 4228.547805: hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11 + <idle>-0 [003] d..1 4228.547858: hrtimer_get_next_event <-get_next_timer_interrupt + <idle>-0 [003] d..1 4228.547859: hrtimer_start <-__tick_nohz_idle_enter + <idle>-0 [003] d..2 4228.547860: hrtimer_force_reprogram <-__rem Notice that we lost the sys_nanosleep. @@ -1651,19 +2233,29 @@ traced. Produces: -# tracer: ftrace +# tracer: function +# +# entries-in-buffer/entries-written: 39608/39608 #P:4 # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | - bash-4043 [01] 115.281644: finish_task_switch <-schedule - bash-4043 [01] 115.281645: hrtick_set <-schedule - bash-4043 [01] 115.281645: hrtick_clear <-hrtick_set - bash-4043 [01] 115.281646: wait_for_completion <-__stop_machine_run - bash-4043 [01] 115.281647: wait_for_common <-wait_for_completion - bash-4043 [01] 115.281647: kthread_stop <-stop_machine_run - bash-4043 [01] 115.281648: init_waitqueue_head <-kthread_stop - bash-4043 [01] 115.281648: wake_up_process <-kthread_stop - bash-4043 [01] 115.281649: try_to_wake_up <-wake_up_process +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + bash-1994 [000] .... 4342.324896: file_ra_state_init <-do_dentry_open + bash-1994 [000] .... 4342.324897: open_check_o_direct <-do_last + bash-1994 [000] .... 4342.324897: ima_file_check <-do_last + bash-1994 [000] .... 4342.324898: process_measurement <-ima_file_check + bash-1994 [000] .... 4342.324898: ima_get_action <-process_measurement + bash-1994 [000] .... 4342.324898: ima_match_policy <-ima_get_action + bash-1994 [000] .... 4342.324899: do_truncate <-do_last + bash-1994 [000] .... 4342.324899: should_remove_suid <-do_truncate + bash-1994 [000] .... 4342.324899: notify_change <-do_truncate + bash-1994 [000] .... 4342.324900: current_fs_time <-notify_change + bash-1994 [000] .... 4342.324900: current_kernel_time <-current_fs_time + bash-1994 [000] .... 4342.324900: timespec_trunc <-current_fs_time We can see that there's no more lock or preempt tracing. @@ -1729,6 +2321,28 @@ this special filter via: echo > set_graph_function +ftrace_enabled +-------------- + +Note, the proc sysctl ftrace_enable is a big on/off switch for the +function tracer. By default it is enabled (when function tracing is +enabled in the kernel). If it is disabled, all function tracing is +disabled. This includes not only the function tracers for ftrace, but +also for any other uses (perf, kprobes, stack tracing, profiling, etc). + +Please disable this with care. + +This can be disable (and enabled) with: + + sysctl kernel.ftrace_enabled=0 + sysctl kernel.ftrace_enabled=1 + + or + + echo 0 > /proc/sys/kernel/ftrace_enabled + echo 1 > /proc/sys/kernel/ftrace_enabled + + Filter commands --------------- @@ -1763,12 +2377,58 @@ The following commands are supported: echo '__schedule_bug:traceoff:5' > set_ftrace_filter + To always disable tracing when __schedule_bug is hit: + + echo '__schedule_bug:traceoff' > set_ftrace_filter + These commands are cumulative whether or not they are appended to set_ftrace_filter. To remove a command, prepend it by '!' and drop the parameter: + echo '!__schedule_bug:traceoff:0' > set_ftrace_filter + + The above removes the traceoff command for __schedule_bug + that have a counter. To remove commands without counters: + echo '!__schedule_bug:traceoff' > set_ftrace_filter +- snapshot + Will cause a snapshot to be triggered when the function is hit. + + echo 'native_flush_tlb_others:snapshot' > set_ftrace_filter + + To only snapshot once: + + echo 'native_flush_tlb_others:snapshot:1' > set_ftrace_filter + + To remove the above commands: + + echo '!native_flush_tlb_others:snapshot' > set_ftrace_filter + echo '!native_flush_tlb_others:snapshot:0' > set_ftrace_filter + +- enable_event/disable_event + These commands can enable or disable a trace event. Note, because + function tracing callbacks are very sensitive, when these commands + are registered, the trace point is activated, but disabled in + a "soft" mode. That is, the tracepoint will be called, but + just will not be traced. The event tracepoint stays in this mode + as long as there's a command that triggers it. + + echo 'try_to_wake_up:enable_event:sched:sched_switch:2' > \ + set_ftrace_filter + + The format is: + + <function>:enable_event:<system>:<event>[:count] + <function>:disable_event:<system>:<event>[:count] + + To remove the events commands: + + + echo '!try_to_wake_up:enable_event:sched:sched_switch:0' > \ + set_ftrace_filter + echo '!schedule:disable_event:sched:sched_switch' > \ + set_ftrace_filter trace_pipe ---------- @@ -1787,28 +2447,31 @@ different. The trace is live. # cat trace # tracer: function # -# TASK-PID CPU# TIMESTAMP FUNCTION -# | | | | | +# entries-in-buffer/entries-written: 0/0 #P:4 +# +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | # # cat /tmp/trace.out - bash-4043 [00] 41.267106: finish_task_switch <-schedule - bash-4043 [00] 41.267106: hrtick_set <-schedule - bash-4043 [00] 41.267107: hrtick_clear <-hrtick_set - bash-4043 [00] 41.267108: wait_for_completion <-__stop_machine_run - bash-4043 [00] 41.267108: wait_for_common <-wait_for_completion - bash-4043 [00] 41.267109: kthread_stop <-stop_machine_run - bash-4043 [00] 41.267109: init_waitqueue_head <-kthread_stop - bash-4043 [00] 41.267110: wake_up_process <-kthread_stop - bash-4043 [00] 41.267110: try_to_wake_up <-wake_up_process - bash-4043 [00] 41.267111: select_task_rq_rt <-try_to_wake_up + bash-1994 [000] .... 5281.568961: mutex_unlock <-rb_simple_write + bash-1994 [000] .... 5281.568963: __mutex_unlock_slowpath <-mutex_unlock + bash-1994 [000] .... 5281.568963: __fsnotify_parent <-fsnotify_modify + bash-1994 [000] .... 5281.568964: fsnotify <-fsnotify_modify + bash-1994 [000] .... 5281.568964: __srcu_read_lock <-fsnotify + bash-1994 [000] .... 5281.568964: add_preempt_count <-__srcu_read_lock + bash-1994 [000] ...1 5281.568965: sub_preempt_count <-__srcu_read_lock + bash-1994 [000] .... 5281.568965: __srcu_read_unlock <-fsnotify + bash-1994 [000] .... 5281.568967: sys_dup2 <-system_call_fastpath Note, reading the trace_pipe file will block until more input is -added. By changing the tracer, trace_pipe will issue an EOF. We -needed to set the function tracer _before_ we "cat" the -trace_pipe file. - +added. trace entries ------------- @@ -1817,31 +2480,50 @@ Having too much or not enough data can be troublesome in diagnosing an issue in the kernel. The file buffer_size_kb is used to modify the size of the internal trace buffers. The number listed is the number of entries that can be recorded per -CPU. To know the full size, multiply the number of possible CPUS +CPU. To know the full size, multiply the number of possible CPUs with the number of entries. # cat buffer_size_kb 1408 (units kilobytes) -Note, to modify this, you must have tracing completely disabled. -To do that, echo "nop" into the current_tracer. If the -current_tracer is not set to "nop", an EINVAL error will be -returned. +Or simply read buffer_total_size_kb + + # cat buffer_total_size_kb +5632 + +To modify the buffer, simple echo in a number (in 1024 byte segments). - # echo nop > current_tracer # echo 10000 > buffer_size_kb # cat buffer_size_kb 10000 (units kilobytes) -The number of pages which will be allocated is limited to a -percentage of available memory. Allocating too much will produce -an error. +It will try to allocate as much as possible. If you allocate too +much, it can cause Out-Of-Memory to trigger. # echo 1000000000000 > buffer_size_kb -bash: echo: write error: Cannot allocate memory # cat buffer_size_kb 85 +The per_cpu buffers can be changed individually as well: + + # echo 10000 > per_cpu/cpu0/buffer_size_kb + # echo 100 > per_cpu/cpu1/buffer_size_kb + +When the per_cpu buffers are not the same, the buffer_size_kb +at the top level will just show an X + + # cat buffer_size_kb +X + +This is where the buffer_total_size_kb is useful: + + # cat buffer_total_size_kb +12916 + +Writing to the top level buffer_size_kb will reset all the buffers +to be the same again. + Snapshot -------- CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature @@ -1925,7 +2607,188 @@ bash: echo: write error: Device or resource busy # cat snapshot cat: snapshot: Device or resource busy + +Instances +--------- +In the debugfs tracing directory is a directory called "instances". +This directory can have new directories created inside of it using +mkdir, and removing directories with rmdir. The directory created +with mkdir in this directory will already contain files and other +directories after it is created. + + # mkdir instances/foo + # ls instances/foo +buffer_size_kb buffer_total_size_kb events free_buffer per_cpu +set_event snapshot trace trace_clock trace_marker trace_options +trace_pipe tracing_on + +As you can see, the new directory looks similar to the tracing directory +itself. In fact, it is very similar, except that the buffer and +events are agnostic from the main director, or from any other +instances that are created. + +The files in the new directory work just like the files with the +same name in the tracing directory except the buffer that is used +is a separate and new buffer. The files affect that buffer but do not +affect the main buffer with the exception of trace_options. Currently, +the trace_options affect all instances and the top level buffer +the same, but this may change in future releases. That is, options +may become specific to the instance they reside in. + +Notice that none of the function tracer files are there, nor is +current_tracer and available_tracers. This is because the buffers +can currently only have events enabled for them. + + # mkdir instances/foo + # mkdir instances/bar + # mkdir instances/zoot + # echo 100000 > buffer_size_kb + # echo 1000 > instances/foo/buffer_size_kb + # echo 5000 > instances/bar/per_cpu/cpu1/buffer_size_kb + # echo function > current_trace + # echo 1 > instances/foo/events/sched/sched_wakeup/enable + # echo 1 > instances/foo/events/sched/sched_wakeup_new/enable + # echo 1 > instances/foo/events/sched/sched_switch/enable + # echo 1 > instances/bar/events/irq/enable + # echo 1 > instances/zoot/events/syscalls/enable + # cat trace_pipe +CPU:2 [LOST 11745 EVENTS] + bash-2044 [002] .... 10594.481032: _raw_spin_lock_irqsave <-get_page_from_freelist + bash-2044 [002] d... 10594.481032: add_preempt_count <-_raw_spin_lock_irqsave + bash-2044 [002] d..1 10594.481032: __rmqueue <-get_page_from_freelist + bash-2044 [002] d..1 10594.481033: _raw_spin_unlock <-get_page_from_freelist + bash-2044 [002] d..1 10594.481033: sub_preempt_count <-_raw_spin_unlock + bash-2044 [002] d... 10594.481033: get_pageblock_flags_group <-get_pageblock_migratetype + bash-2044 [002] d... 10594.481034: __mod_zone_page_state <-get_page_from_freelist + bash-2044 [002] d... 10594.481034: zone_statistics <-get_page_from_freelist + bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics + bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics + bash-2044 [002] .... 10594.481035: arch_dup_task_struct <-copy_process +[...] + + # cat instances/foo/trace_pipe + bash-1998 [000] d..4 136.676759: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000 + bash-1998 [000] dN.4 136.676760: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000 + <idle>-0 [003] d.h3 136.676906: sched_wakeup: comm=rcu_preempt pid=9 prio=120 success=1 target_cpu=003 + <idle>-0 [003] d..3 136.676909: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=9 next_prio=120 + rcu_preempt-9 [003] d..3 136.676916: sched_switch: prev_comm=rcu_preempt prev_pid=9 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120 + bash-1998 [000] d..4 136.677014: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000 + bash-1998 [000] dN.4 136.677016: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000 + bash-1998 [000] d..3 136.677018: sched_switch: prev_comm=bash prev_pid=1998 prev_prio=120 prev_state=R+ ==> next_comm=kworker/0:1 next_pid=59 next_prio=120 + kworker/0:1-59 [000] d..4 136.677022: sched_wakeup: comm=sshd pid=1995 prio=120 success=1 target_cpu=001 + kworker/0:1-59 [000] d..3 136.677025: sched_switch: prev_comm=kworker/0:1 prev_pid=59 prev_prio=120 prev_state=S ==> next_comm=bash next_pid=1998 next_prio=120 +[...] + + # cat instances/bar/trace_pipe + migration/1-14 [001] d.h3 138.732674: softirq_raise: vec=3 [action=NET_RX] + <idle>-0 [001] dNh3 138.732725: softirq_raise: vec=3 [action=NET_RX] + bash-1998 [000] d.h1 138.733101: softirq_raise: vec=1 [action=TIMER] + bash-1998 [000] d.h1 138.733102: softirq_raise: vec=9 [action=RCU] + bash-1998 [000] ..s2 138.733105: softirq_entry: vec=1 [action=TIMER] + bash-1998 [000] ..s2 138.733106: softirq_exit: vec=1 [action=TIMER] + bash-1998 [000] ..s2 138.733106: softirq_entry: vec=9 [action=RCU] + bash-1998 [000] ..s2 138.733109: softirq_exit: vec=9 [action=RCU] + sshd-1995 [001] d.h1 138.733278: irq_handler_entry: irq=21 name=uhci_hcd:usb4 + sshd-1995 [001] d.h1 138.733280: irq_handler_exit: irq=21 ret=unhandled + sshd-1995 [001] d.h1 138.733281: irq_handler_entry: irq=21 name=eth0 + sshd-1995 [001] d.h1 138.733283: irq_handler_exit: irq=21 ret=handled +[...] + + # cat instances/zoot/trace +# tracer: nop +# +# entries-in-buffer/entries-written: 18996/18996 #P:4 +# +# _-----=> irqs-off +# / _----=> need-resched +# | / _---=> hardirq/softirq +# || / _--=> preempt-depth +# ||| / delay +# TASK-PID CPU# |||| TIMESTAMP FUNCTION +# | | | |||| | | + bash-1998 [000] d... 140.733501: sys_write -> 0x2 + bash-1998 [000] d... 140.733504: sys_dup2(oldfd: a, newfd: 1) + bash-1998 [000] d... 140.733506: sys_dup2 -> 0x1 + bash-1998 [000] d... 140.733508: sys_fcntl(fd: a, cmd: 1, arg: 0) + bash-1998 [000] d... 140.733509: sys_fcntl -> 0x1 + bash-1998 [000] d... 140.733510: sys_close(fd: a) + bash-1998 [000] d... 140.733510: sys_close -> 0x0 + bash-1998 [000] d... 140.733514: sys_rt_sigprocmask(how: 0, nset: 0, oset: 6e2768, sigsetsize: 8) + bash-1998 [000] d... 140.733515: sys_rt_sigprocmask -> 0x0 + bash-1998 [000] d... 140.733516: sys_rt_sigaction(sig: 2, act: 7fff718846f0, oact: 7fff71884650, sigsetsize: 8) + bash-1998 [000] d... 140.733516: sys_rt_sigaction -> 0x0 + +You can see that the trace of the top most trace buffer shows only +the function tracing. The foo instance displays wakeups and task +switches. + +To remove the instances, simply delete their directories: + + # rmdir instances/foo + # rmdir instances/bar + # rmdir instances/zoot + +Note, if a process has a trace file open in one of the instance +directories, the rmdir will fail with EBUSY. + + +Stack trace ----------- +Since the kernel has a fixed sized stack, it is important not to +waste it in functions. A kernel developer must be conscience of +what they allocate on the stack. If they add too much, the system +can be in danger of a stack overflow, and corruption will occur, +usually leading to a system panic. + +There are some tools that check this, usually with interrupts +periodically checking usage. But if you can perform a check +at every function call that will become very useful. As ftrace provides +a function tracer, it makes it convenient to check the stack size +at every function call. This is enabled via the stack tracer. + +CONFIG_STACK_TRACER enables the ftrace stack tracing functionality. +To enable it, write a '1' into /proc/sys/kernel/stack_tracer_enabled. + + # echo 1 > /proc/sys/kernel/stack_tracer_enabled + +You can also enable it from the kernel command line to trace +the stack size of the kernel during boot up, by adding "stacktrace" +to the kernel command line parameter. + +After running it for a few minutes, the output looks like: + + # cat stack_max_size +2928 + + # cat stack_trace + Depth Size Location (18 entries) + ----- ---- -------- + 0) 2928 224 update_sd_lb_stats+0xbc/0x4ac + 1) 2704 160 find_busiest_group+0x31/0x1f1 + 2) 2544 256 load_balance+0xd9/0x662 + 3) 2288 80 idle_balance+0xbb/0x130 + 4) 2208 128 __schedule+0x26e/0x5b9 + 5) 2080 16 schedule+0x64/0x66 + 6) 2064 128 schedule_timeout+0x34/0xe0 + 7) 1936 112 wait_for_common+0x97/0xf1 + 8) 1824 16 wait_for_completion+0x1d/0x1f + 9) 1808 128 flush_work+0xfe/0x119 + 10) 1680 16 tty_flush_to_ldisc+0x1e/0x20 + 11) 1664 48 input_available_p+0x1d/0x5c + 12) 1616 48 n_tty_poll+0x6d/0x134 + 13) 1568 64 tty_poll+0x64/0x7f + 14) 1504 880 do_select+0x31e/0x511 + 15) 624 400 core_sys_select+0x177/0x216 + 16) 224 96 sys_select+0x91/0xb9 + 17) 128 128 system_call_fastpath+0x16/0x1b + +Note, if -mfentry is being used by gcc, functions get traced before +they set up the stack frame. This means that leaf level functions +are not tested by the stack tracer when -mfentry is used. + +Currently, -mfentry is used by gcc 4.6.0 and above on x86 only. + +--------- More details can be found in the source code, in the kernel/trace/*.c files. diff --git a/Documentation/trace/tracepoints.txt b/Documentation/trace/tracepoints.txt index c0e1ceed75a4..da49437d5aeb 100644 --- a/Documentation/trace/tracepoints.txt +++ b/Documentation/trace/tracepoints.txt @@ -81,7 +81,6 @@ tracepoint_synchronize_unregister() must be called before the end of the module exit function to make sure there is no caller left using the probe. This, and the fact that preemption is disabled around the probe call, make sure that probe removal and module unload are safe. -See the "Probe example" section below for a sample probe module. The tracepoint mechanism supports inserting multiple instances of the same tracepoint, but a single definition must be made of a given @@ -100,17 +99,3 @@ core kernel image or in modules. If the tracepoint has to be used in kernel modules, an EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be used to export the defined tracepoints. - -* Probe / tracepoint example - -See the example provided in samples/tracepoints - -Compile them with your kernel. They are built during 'make' (not -'make modules') when CONFIG_SAMPLE_TRACEPOINTS=m. - -Run, as root : -modprobe tracepoint-sample (insmod order is not important) -modprobe tracepoint-probe-sample -cat /proc/tracepoint-sample (returns an expected error) -rmmod tracepoint-sample tracepoint-probe-sample -dmesg diff --git a/Documentation/trace/uprobetracer.txt b/Documentation/trace/uprobetracer.txt index 24ce6823a09e..d9c3e682312c 100644 --- a/Documentation/trace/uprobetracer.txt +++ b/Documentation/trace/uprobetracer.txt @@ -1,6 +1,8 @@ - Uprobe-tracer: Uprobe-based Event Tracing - ========================================= - Documentation written by Srikar Dronamraju + Uprobe-tracer: Uprobe-based Event Tracing + ========================================= + + Documentation written by Srikar Dronamraju + Overview -------- @@ -13,78 +15,94 @@ current_tracer. Instead of that, add probe points via /sys/kernel/debug/tracing/events/uprobes/<EVENT>/enabled. However unlike kprobe-event tracer, the uprobe event interface expects the -user to calculate the offset of the probepoint in the object +user to calculate the offset of the probepoint in the object. Synopsis of uprobe_tracer ------------------------- - p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a probe + p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe + r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe) + -:[GRP/]EVENT : Clear uprobe or uretprobe event - GRP : Group name. If omitted, use "uprobes" for it. - EVENT : Event name. If omitted, the event name is generated - based on SYMBOL+offs. - PATH : path to an executable or a library. - SYMBOL[+offs] : Symbol+offset where the probe is inserted. + GRP : Group name. If omitted, "uprobes" is the default value. + EVENT : Event name. If omitted, the event name is generated based + on SYMBOL+offs. + PATH : Path to an executable or a library. + SYMBOL[+offs] : Symbol+offset where the probe is inserted. - FETCHARGS : Arguments. Each probe can have up to 128 args. - %REG : Fetch register REG + FETCHARGS : Arguments. Each probe can have up to 128 args. + %REG : Fetch register REG Event Profiling --------------- - You can check the total number of probe hits and probe miss-hits via +You can check the total number of probe hits and probe miss-hits via /sys/kernel/debug/tracing/uprobe_profile. - The first column is event name, the second is the number of probe hits, +The first column is event name, the second is the number of probe hits, the third is the number of probe miss-hits. Usage examples -------------- -To add a probe as a new event, write a new definition to uprobe_events -as below. + * Add a probe as a new uprobe event, write a new definition to uprobe_events +as below: (sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash) + + echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events + + * Add a probe as a new uretprobe event: + + echo 'r: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events + + * Unset registered event: - echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events + echo '-:bash_0x4245c0' >> /sys/kernel/debug/tracing/uprobe_events - This sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash + * Print out the events that are registered: - echo > /sys/kernel/debug/tracing/uprobe_events + cat /sys/kernel/debug/tracing/uprobe_events - This clears all probe points. + * Clear all events: -The following example shows how to dump the instruction pointer and %ax -a register at the probed text address. Here we are trying to probe -function zfree in /bin/zsh + echo > /sys/kernel/debug/tracing/uprobe_events + +Following example shows how to dump the instruction pointer and %ax register +at the probed text address. Probe zfree function in /bin/zsh: # cd /sys/kernel/debug/tracing/ - # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp + # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp 00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh # objdump -T /bin/zsh | grep -w zfree 0000000000446420 g DF .text 0000000000000012 Base zfree -0x46420 is the offset of zfree in object /bin/zsh that is loaded at -0x00400000. Hence the command to probe would be : + 0x46420 is the offset of zfree in object /bin/zsh that is loaded at + 0x00400000. Hence the command to uprobe would be: + + # echo 'p:zfree_entry /bin/zsh:0x46420 %ip %ax' > uprobe_events + + And the same for the uretprobe would be: - # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events + # echo 'r:zfree_exit /bin/zsh:0x46420 %ip %ax' >> uprobe_events -Please note: User has to explicitly calculate the offset of the probepoint +Please note: User has to explicitly calculate the offset of the probe-point in the object. We can see the events that are registered by looking at the uprobe_events file. # cat uprobe_events - p:uprobes/p_zsh_0x46420 /bin/zsh:0x00046420 arg1=%ip arg2=%ax + p:uprobes/zfree_entry /bin/zsh:0x00046420 arg1=%ip arg2=%ax + r:uprobes/zfree_exit /bin/zsh:0x00046420 arg1=%ip arg2=%ax -The format of events can be seen by viewing the file events/uprobes/p_zsh_0x46420/format +Format of events can be seen by viewing the file events/uprobes/zfree_entry/format - # cat events/uprobes/p_zsh_0x46420/format - name: p_zsh_0x46420 + # cat events/uprobes/zfree_entry/format + name: zfree_entry ID: 922 format: - field:unsigned short common_type; offset:0; size:2; signed:0; - field:unsigned char common_flags; offset:2; size:1; signed:0; - field:unsigned char common_preempt_count; offset:3; size:1; signed:0; - field:int common_pid; offset:4; size:4; signed:1; - field:int common_padding; offset:8; size:4; signed:1; + field:unsigned short common_type; offset:0; size:2; signed:0; + field:unsigned char common_flags; offset:2; size:1; signed:0; + field:unsigned char common_preempt_count; offset:3; size:1; signed:0; + field:int common_pid; offset:4; size:4; signed:1; + field:int common_padding; offset:8; size:4; signed:1; - field:unsigned long __probe_ip; offset:12; size:4; signed:0; - field:u32 arg1; offset:16; size:4; signed:0; - field:u32 arg2; offset:20; size:4; signed:0; + field:unsigned long __probe_ip; offset:12; size:4; signed:0; + field:u32 arg1; offset:16; size:4; signed:0; + field:u32 arg2; offset:20; size:4; signed:0; print fmt: "(%lx) arg1=%lx arg2=%lx", REC->__probe_ip, REC->arg1, REC->arg2 @@ -94,6 +112,7 @@ events, you need to enable it by: # echo 1 > events/uprobes/enable Lets disable the event after sleeping for some time. + # sleep 20 # echo 0 > events/uprobes/enable @@ -104,10 +123,11 @@ And you can see the traced information via /sys/kernel/debug/tracing/trace. # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | - zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 - zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 - zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 - zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79 - -Each line shows us probes were triggered for a pid 24842 with ip being -0x446421 and contents of ax register being 79. + zsh-24842 [006] 258544.995456: zfree_entry: (0x446420) arg1=446420 arg2=79 + zsh-24842 [007] 258545.000270: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0 + zsh-24842 [002] 258545.043929: zfree_entry: (0x446420) arg1=446420 arg2=79 + zsh-24842 [004] 258547.046129: zfree_exit: (0x446540 <- 0x446420) arg1=446540 arg2=0 + +Output shows us uprobe was triggered for a pid 24842 with ip being 0x446420 +and contents of ax register being 79. And uretprobe was triggered with ip at +0x446540 with counterpart function entry at 0x446420. diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt index 4204eb01fd38..1392b61d6ebe 100644 --- a/Documentation/usb/power-management.txt +++ b/Documentation/usb/power-management.txt @@ -33,6 +33,10 @@ built with CONFIG_USB_SUSPEND enabled (which depends on CONFIG_PM_RUNTIME). System PM support is present only if the kernel was built with CONFIG_SUSPEND or CONFIG_HIBERNATION enabled. +(Starting with the 3.10 kernel release, dynamic PM support for USB is +present whenever the kernel was built with CONFIG_PM_RUNTIME enabled. +The CONFIG_USB_SUSPEND option has been eliminated.) + What is Remote Wakeup? ---------------------- @@ -206,10 +210,8 @@ initialized to 5. (The idle-delay values for already existing devices will not be affected.) Setting the initial default idle-delay to -1 will prevent any -autosuspend of any USB device. This is a simple alternative to -disabling CONFIG_USB_SUSPEND and rebuilding the kernel, and it has the -added benefit of allowing you to enable autosuspend for selected -devices. +autosuspend of any USB device. This has the benefit of allowing you +then to enable autosuspend for selected devices. Warnings diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 3f12865b2a88..e81864405102 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -76,7 +76,7 @@ 76 -> KWorld PlusTV 340U or UB435-Q (ATSC) (em2870) [1b80:a340] 77 -> EM2874 Leadership ISDBT (em2874) 78 -> PCTV nanoStick T2 290e (em28174) - 79 -> Terratec Cinergy H5 (em2884) [0ccd:10a2,0ccd:10ad] + 79 -> Terratec Cinergy H5 (em2884) [0ccd:10a2,0ccd:10ad,0ccd:10b6] 80 -> PCTV DVB-S2 Stick (460e) (em28174) 81 -> Hauppauge WinTV HVR 930C (em2884) [2040:1605] 82 -> Terratec Cinergy HTC Stick (em2884) [0ccd:00b2] @@ -85,3 +85,4 @@ 85 -> PCTV QuatroStick (510e) (em2884) [2304:0242] 86 -> PCTV QuatroStick nano (520e) (em2884) [2013:0251] 87 -> Terratec Cinergy HTC USB XS (em2884) [0ccd:008e,0ccd:00ac] + 88 -> C3 Tech Digital Duo HDTV/SDTV USB (em2884) [1b80:e755] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index c83f6e418879..5b83a3ff15c2 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -86,3 +86,6 @@ tuner=85 - Philips FQ1236 MK5 tuner=86 - Tena TNF5337 MFD tuner=87 - Xceive 4000 tuner tuner=88 - Xceive 5000C tuner +tuner=89 - Sony PAL+SECAM (BTF-PG472Z) +tuner=90 - Sony NTSC-M-JP (BTF-PK467Z) +tuner=91 - Sony NTSC-M (BTF-PB463Z) diff --git a/Documentation/video4linux/si476x.txt b/Documentation/video4linux/si476x.txt new file mode 100644 index 000000000000..d1a08db2cbd9 --- /dev/null +++ b/Documentation/video4linux/si476x.txt @@ -0,0 +1,187 @@ +SI476x Driver Readme +------------------------------------------------ + Copyright (C) 2013 Andrey Smirnov <andrew.smirnov@gmail.com> + +TODO for the driver +------------------------------ + +- According to the SiLabs' datasheet it is possible to update the + firmware of the radio chip in the run-time, thus bringing it to the + most recent version. Unfortunately I couldn't find any mentioning of + the said firmware update for the old chips that I tested the driver + against, so for chips like that the driver only exposes the old + functionality. + + +Parameters exposed over debugfs +------------------------------- +SI476x allow user to get multiple characteristics that can be very +useful for EoL testing/RF performance estimation, parameters that have +very little to do with V4L2 subsystem. Such parameters are exposed via +debugfs and can be accessed via regular file I/O operations. + +The drivers exposes following files: + +* /sys/kernel/debug/<device-name>/acf + This file contains ACF(Automatically Controlled Features) status + information. The contents of the file is binary data of the + following layout: + + Offset | Name | Description + ==================================================================== + 0x00 | blend_int | Flag, set when stereo separation has + | | crossed below the blend threshold + -------------------------------------------------------------------- + 0x01 | hblend_int | Flag, set when HiBlend cutoff + | | frequency is lower than threshold + -------------------------------------------------------------------- + 0x02 | hicut_int | Flag, set when HiCut cutoff + | | frequency is lower than threshold + -------------------------------------------------------------------- + 0x03 | chbw_int | Flag, set when channel filter + | | bandwidth is less than threshold + -------------------------------------------------------------------- + 0x04 | softmute_int | Flag indicating that softmute + | | attenuation has increased above + | | softmute threshold + -------------------------------------------------------------------- + 0x05 | smute | 0 - Audio is not soft muted + | | 1 - Audio is soft muted + -------------------------------------------------------------------- + 0x06 | smattn | Soft mute attenuation level in dB + -------------------------------------------------------------------- + 0x07 | chbw | Channel filter bandwidth in kHz + -------------------------------------------------------------------- + 0x08 | hicut | HiCut cutoff frequency in units of + | | 100Hz + -------------------------------------------------------------------- + 0x09 | hiblend | HiBlend cutoff frequency in units + | | of 100 Hz + -------------------------------------------------------------------- + 0x10 | pilot | 0 - Stereo pilot is not present + | | 1 - Stereo pilot is present + -------------------------------------------------------------------- + 0x11 | stblend | Stereo blend in % + -------------------------------------------------------------------- + + +* /sys/kernel/debug/<device-name>/rds_blckcnt + This file contains statistics about RDS receptions. It's binary data + has the following layout: + + Offset | Name | Description + ==================================================================== + 0x00 | expected | Number of expected RDS blocks + -------------------------------------------------------------------- + 0x02 | received | Number of received RDS blocks + -------------------------------------------------------------------- + 0x04 | uncorrectable | Number of uncorrectable RDS blocks + -------------------------------------------------------------------- + +* /sys/kernel/debug/<device-name>/agc + This file contains information about parameters pertaining to + AGC(Automatic Gain Control) + + The layout is: + Offset | Name | Description + ==================================================================== + 0x00 | mxhi | 0 - FM Mixer PD high threshold is + | | not tripped + | | 1 - FM Mixer PD high threshold is + | | tripped + -------------------------------------------------------------------- + 0x01 | mxlo | ditto for FM Mixer PD low + -------------------------------------------------------------------- + 0x02 | lnahi | ditto for FM LNA PD high + -------------------------------------------------------------------- + 0x03 | lnalo | ditto for FM LNA PD low + -------------------------------------------------------------------- + 0x04 | fmagc1 | FMAGC1 attenuator resistance + | | (see datasheet for more detail) + -------------------------------------------------------------------- + 0x05 | fmagc2 | ditto for FMAGC2 + -------------------------------------------------------------------- + 0x06 | pgagain | PGA gain in dB + -------------------------------------------------------------------- + 0x07 | fmwblang | FM/WB LNA Gain in dB + -------------------------------------------------------------------- + +* /sys/kernel/debug/<device-name>/rsq + This file contains information about parameters pertaining to + RSQ(Received Signal Quality) + + The layout is: + Offset | Name | Description + ==================================================================== + 0x00 | multhint | 0 - multipath value has not crossed + | | the Multipath high threshold + | | 1 - multipath value has crossed + | | the Multipath high threshold + -------------------------------------------------------------------- + 0x01 | multlint | ditto for Multipath low threshold + -------------------------------------------------------------------- + 0x02 | snrhint | 0 - received signal's SNR has not + | | crossed high threshold + | | 1 - received signal's SNR has + | | crossed high threshold + -------------------------------------------------------------------- + 0x03 | snrlint | ditto for low threshold + -------------------------------------------------------------------- + 0x04 | rssihint | ditto for RSSI high threshold + -------------------------------------------------------------------- + 0x05 | rssilint | ditto for RSSI low threshold + -------------------------------------------------------------------- + 0x06 | bltf | Flag indicating if seek command + | | reached/wrapped seek band limit + -------------------------------------------------------------------- + 0x07 | snr_ready | Indicates that SNR metrics is ready + -------------------------------------------------------------------- + 0x08 | rssiready | ditto for RSSI metrics + -------------------------------------------------------------------- + 0x09 | injside | 0 - Low-side injection is being used + | | 1 - High-side injection is used + -------------------------------------------------------------------- + 0x10 | afcrl | Flag indicating if AFC rails + -------------------------------------------------------------------- + 0x11 | valid | Flag indicating if channel is valid + -------------------------------------------------------------------- + 0x12 | readfreq | Current tuned frequency + -------------------------------------------------------------------- + 0x14 | freqoff | Singed frequency offset in units of + | | 2ppm + -------------------------------------------------------------------- + 0x15 | rssi | Signed value of RSSI in dBuV + -------------------------------------------------------------------- + 0x16 | snr | Signed RF SNR in dB + -------------------------------------------------------------------- + 0x17 | issi | Signed Image Strength Signal + | | indicator + -------------------------------------------------------------------- + 0x18 | lassi | Signed Low side adjacent Channel + | | Strength indicator + -------------------------------------------------------------------- + 0x19 | hassi | ditto fpr High side + -------------------------------------------------------------------- + 0x20 | mult | Multipath indicator + -------------------------------------------------------------------- + 0x21 | dev | Frequency deviation + -------------------------------------------------------------------- + 0x24 | assi | Adjascent channel SSI + -------------------------------------------------------------------- + 0x25 | usn | Ultrasonic noise indicator + -------------------------------------------------------------------- + 0x26 | pilotdev | Pilot deviation in units of 100 Hz + -------------------------------------------------------------------- + 0x27 | rdsdev | ditto for RDS + -------------------------------------------------------------------- + 0x28 | assidev | ditto for ASSI + -------------------------------------------------------------------- + 0x29 | strongdev | Frequency deviation + -------------------------------------------------------------------- + 0x30 | rdspi | RDS PI code + -------------------------------------------------------------------- + +* /sys/kernel/debug/<device-name>/rsq_primary + This file contains information about parameters pertaining to + RSQ(Received Signal Quality) for primary tuner only. Layout is as + the one above. diff --git a/Documentation/virtual/virtio-spec.txt b/Documentation/virtual/virtio-spec.txt index 0d6ec85481cb..eb094039b50d 100644 --- a/Documentation/virtual/virtio-spec.txt +++ b/Documentation/virtual/virtio-spec.txt @@ -1389,7 +1389,7 @@ segmentation, if both guests are amenable. Packets are transmitted by placing them in the transmitq, and buffers for incoming packets are placed in the receiveq. In each -case, the packet itself is preceeded by a header: +case, the packet itself is preceded by a header: struct virtio_net_hdr { @@ -1631,7 +1631,7 @@ struct virtio_net_ctrl_mac { The device can filter incoming packets by any number of destination MAC addresses.[footnote: -Since there are no guarentees, it can use a hash filter +Since there are no guarantees, it can use a hash filter orsilently switch to allmulti or promiscuous mode if it is given too many addresses. ] This table is set using the class VIRTIO_NET_CTRL_MAC and the @@ -1822,7 +1822,7 @@ the FLUSH and FLUSH_OUT types are equivalent, the device does not distinguish between them ]). If the device has VIRTIO_BLK_F_BARRIER feature the high bit (VIRTIO_BLK_T_BARRIER) indicates that this request acts as a -barrier and that all preceeding requests must be complete before +barrier and that all preceding requests must be complete before this one, and all following requests must not be started until this is complete. Note that a barrier does not flush caches in the underlying backend device in host, and thus does not serve as diff --git a/Documentation/vm/overcommit-accounting b/Documentation/vm/overcommit-accounting index 706d7ed9d8d2..8eaa2fc4b8fa 100644 --- a/Documentation/vm/overcommit-accounting +++ b/Documentation/vm/overcommit-accounting @@ -8,7 +8,9 @@ The Linux kernel supports the following overcommit handling modes default. 1 - Always overcommit. Appropriate for some scientific - applications. + applications. Classic example is code using sparse arrays + and just relying on the virtual memory consisting almost + entirely of zero pages. 2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a @@ -18,6 +20,10 @@ The Linux kernel supports the following overcommit handling modes pages but will receive errors on memory allocation as appropriate. + Useful for applications that want to guarantee their + memory allocations will be available in the future + without having to initialize every page. + The overcommit policy is set via the sysctl `vm.overcommit_memory'. The overcommit percentage is set via `vm.overcommit_ratio'. diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index e015a83c3996..e9e8ddbbf376 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -91,20 +91,6 @@ APICs apicmaintimer. Useful when your PIT timer is totally broken. -Early Console - - syntax: earlyprintk=vga - earlyprintk=serial[,ttySn[,baudrate]] - - The early console is useful when the kernel crashes before the - normal console is initialized. It is not enabled by - default because it has some cosmetic problems. - Append ,keep to not disable it when the real console takes over. - Only vga or serial at a time, not both. - Currently only ttyS0 and ttyS1 are supported. - Interaction with the standard serial driver is not very good. - The VGA output is eventually overwritten by the real console. - Timing notsc diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index d6498e3cd713..881582f75c9c 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -13,7 +13,9 @@ ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB) ... unused hole ... ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0 -ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space +ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space +ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls +ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole The direct mapping covers all memory in the system up to the highest memory address (this means in some cases it can also include PCI memory |