Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/ABI/testing/sysfs-block | 13
-rw-r--r-- Documentation/ABI/testing/sysfs-bus-rbd | 7
-rw-r--r-- Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff | 2
-rw-r--r-- Documentation/DocBook/debugobjects.tmpl | 50
-rw-r--r-- Documentation/DocBook/uio-howto.tmpl | 7
-rw-r--r-- Documentation/HOWTO | 4
-rw-r--r-- Documentation/RCU/checklist.txt | 6
-rw-r--r-- Documentation/RCU/rcu.txt | 10
-rw-r--r-- Documentation/RCU/stallwarn.txt | 16
-rw-r--r-- Documentation/RCU/torture.txt | 13
-rw-r--r-- Documentation/RCU/trace.txt | 4
-rw-r--r-- Documentation/RCU/whatisRCU.txt | 19
-rw-r--r-- Documentation/arm/memory.txt | 11
-rw-r--r-- Documentation/atomic_ops.txt | 87
-rw-r--r-- Documentation/blockdev/cciss.txt | 14
-rw-r--r-- Documentation/cgroups/memory.txt | 28
-rw-r--r-- Documentation/cgroups/net_prio.txt | 53
-rw-r--r-- Documentation/cpu-freq/governors.txt | 4
-rw-r--r-- Documentation/development-process/5.Posting | 8
-rw-r--r-- Documentation/devices.txt | 2
-rw-r--r-- Documentation/devicetree/bindings/arm/gic.txt | 4
-rw-r--r-- Documentation/devicetree/bindings/arm/vic.txt | 29
-rw-r--r-- Documentation/devicetree/bindings/i2c/i2c-designware.txt | 22
-rw-r--r-- Documentation/devicetree/bindings/i2c/trivial-devices.txt | 58
-rw-r--r-- Documentation/devicetree/bindings/net/calxeda-xgmac.txt | 15
-rw-r--r-- Documentation/devicetree/bindings/net/can/cc770.txt | 53
-rw-r--r-- Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt | 163
-rw-r--r-- Documentation/devicetree/bindings/powerpc/fsl/srio.txt | 103
-rw-r--r-- Documentation/devicetree/bindings/vendor-prefixes.txt | 4
-rw-r--r-- Documentation/dma-buf-sharing.txt | 224
-rw-r--r-- Documentation/driver-model/devres.txt | 1
-rw-r--r-- Documentation/feature-removal-schedule.txt | 14
-rw-r--r-- Documentation/filesystems/Locking | 8
-rw-r--r-- Documentation/filesystems/btrfs.txt | 4
-rw-r--r-- Documentation/filesystems/configfs/configfs.txt | 2
-rw-r--r-- Documentation/filesystems/debugfs.txt | 56
-rw-r--r-- Documentation/filesystems/sysfs.txt | 2
-rw-r--r-- Documentation/filesystems/vfs.txt | 8
-rw-r--r-- Documentation/hwmon/pmbus | 5
-rw-r--r-- Documentation/hwmon/zl6100 | 15
-rw-r--r-- Documentation/i2c/ten-bit-addresses | 36
-rw-r--r-- Documentation/kernel-parameters.txt | 18
-rw-r--r-- Documentation/lockdep-design.txt | 63
-rw-r--r-- Documentation/md.txt | 22
-rw-r--r-- Documentation/networking/00-INDEX | 2
-rw-r--r-- Documentation/networking/batman-adv.txt | 7
-rw-r--r-- Documentation/networking/bonding.txt | 17
-rw-r--r-- Documentation/networking/ieee802154.txt | 27
-rw-r--r-- Documentation/networking/ifenslave.c | 2
-rw-r--r-- Documentation/networking/ip-sysctl.txt | 25
-rw-r--r-- Documentation/networking/openvswitch.txt | 195
-rw-r--r-- Documentation/networking/packet_mmap.txt | 2
-rw-r--r-- Documentation/networking/scaling.txt | 8
-rw-r--r-- Documentation/networking/stmmac.txt | 16
-rw-r--r-- Documentation/networking/team.txt | 2
-rw-r--r-- Documentation/power/devices.txt | 118
-rw-r--r-- Documentation/power/freezing-of-tasks.txt | 39
-rw-r--r-- Documentation/power/runtime_pm.txt | 158
-rw-r--r-- Documentation/scsi/53c700.txt | 21
-rw-r--r-- Documentation/serial/serial-rs485.txt | 14
-rw-r--r-- Documentation/sound/alsa/HD-Audio-Models.txt | 1
-rw-r--r-- Documentation/sound/alsa/HD-Audio.txt | 8
-rw-r--r-- Documentation/sound/alsa/soc/machine.txt | 6
-rw-r--r-- Documentation/trace/events.txt | 2
-rw-r--r-- Documentation/usb/linux-cdc-acm.inf | 4
-rw-r--r-- Documentation/vgaarbiter.txt | 2
-rw-r--r-- Documentation/virtual/kvm/api.txt | 16
67 files changed, 1663 insertions, 316 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 2b5d56127fce..c1eb41cb9876 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -206,16 +206,3 @@ Description:
when a discarded area is read the discard_zeroes_data
parameter will be set to one. Otherwise it will be 0 and
the result of reading a discarded area is undefined.
-What: /sys/block/<disk>/alias
-Date: Aug 2011
-Contact: Nao Nishijima <nao.nishijima.xt@hitachi.com>
-Description:
- A raw device name of a disk does not always point a same disk
- each boot-up time. Therefore, users have to use persistent
- device names, which udev creates when the kernel finds a disk,
- instead of raw device name. However, kernel doesn't show those
- persistent names on its messages (e.g. dmesg).
- This file can store an alias of the disk and it would be
- appeared in kernel messages if it is set. A disk can have an
- alias which length is up to 255bytes. Users can use alphabets,
- numbers, "-" and "_" in alias name. This file is writeonce.
diff --git a/Documentation/ABI/testing/sysfs-bus-rbd b/Documentation/ABI/testing/sysfs-bus-rbd
index fa72ccb2282e..dbedafb095e2 100644
--- a/Documentation/ABI/testing/sysfs-bus-rbd
+++ b/Documentation/ABI/testing/sysfs-bus-rbd
@@ -57,13 +57,6 @@ create_snap
$ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create
-rollback_snap
-
- Rolls back data to the specified snapshot. This goes over the entire
- list of rados blocks and sends a rollback command to each.
-
- $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_rollback
-
snap_*
A directory per each snapshot
diff --git a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff
index 9aec8ef228b0..167d9032b970 100644
--- a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff
+++ b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff
@@ -1,7 +1,7 @@
What: /sys/module/hid_logitech/drivers/hid:logitech/<dev>/range.
Date: July 2011
KernelVersion: 3.2
-Contact: Michal Malư <madcatxster@gmail.com>
+Contact: Michal Malý <madcatxster@gmail.com>
Description: Display minimum, maximum and current range of the steering
wheel. Writing a value within min and max boundaries sets the
range of the wheel.
diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl
index 08ff908aa7a2..24979f691e3e 100644
--- a/Documentation/DocBook/debugobjects.tmpl
+++ b/Documentation/DocBook/debugobjects.tmpl
@@ -96,6 +96,7 @@
<listitem><para>debug_object_deactivate</para></listitem>
<listitem><para>debug_object_destroy</para></listitem>
<listitem><para>debug_object_free</para></listitem>
+ <listitem><para>debug_object_assert_init</para></listitem>
</itemizedlist>
Each of these functions takes the address of the real object and
a pointer to the object type specific debug description
@@ -273,6 +274,26 @@
debug checks.
</para>
</sect1>
+
+ <sect1 id="debug_object_assert_init">
+ <title>debug_object_assert_init</title>
+ <para>
+ This function is called to assert that an object has been
+ initialized.
+ </para>
+ <para>
+ When the real object is not tracked by debugobjects, it calls
+ fixup_assert_init of the object type description structure
+ provided by the caller, with the hardcoded object state
+ ODEBUG_STATE_NOTAVAILABLE. The fixup function can correct the problem
+ by calling debug_object_init and other specific initializing
+ functions.
+ </para>
+ <para>
+ When the real object is already tracked by debugobjects it is
+ ignored.
+ </para>
+ </sect1>
</chapter>
<chapter id="fixupfunctions">
<title>Fixup functions</title>
@@ -381,6 +402,35 @@
statistics.
</para>
</sect1>
+ <sect1 id="fixup_assert_init">
+ <title>fixup_assert_init</title>
+ <para>
+ This function is called from the debug code whenever a problem
+ in debug_object_assert_init is detected.
+ </para>
+ <para>
+ Called from debug_object_assert_init() with a hardcoded state
+ ODEBUG_STATE_NOTAVAILABLE when the object is not found in the
+ debug bucket.
+ </para>
+ <para>
+ The function returns 1 when the fixup was successful,
+ otherwise 0. The return value is used to update the
+ statistics.
+ </para>
+ <para>
+ Note, this function should make sure debug_object_init() is
+ called before returning.
+ </para>
+ <para>
+ The handling of statically initialized objects is a special
+ case. The fixup function should check if this is a legitimate
+ case of a statically initialized object or not. In this case only
+ debug_object_init() should be called to make the object known to
+ the tracker. Then the function should return 0 because this is not
+ a real fixup.
+ </para>
+ </sect1>
</chapter>
<chapter id="bugs">
<title>Known Bugs And Assumptions</title>
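The fixup_assert_init contract described in the debugobjects.tmpl hunks above can be sketched in user space. This is a minimal mock, not the kernel implementation: every mock_* name, the struct layout and the is_static marker are assumptions made for illustration. Only the return-value convention (1 for a real fixup, 0 otherwise) and the special case for statically initialized objects are taken from the text.

```c
/* Mock object states; only the one named in the text matters here. */
enum mock_obj_state { MOCK_STATE_NOTAVAILABLE, MOCK_STATE_INIT };

/* Hypothetical object type used only for this sketch. */
struct mock_obj {
	int is_static;   /* assumed marker left by a static initializer */
	int tracked;     /* stands in for "known to the tracker" */
	int initialized; /* stands in for type-specific initialization */
};

/* Stand-in for debug_object_init(): makes the object known to the tracker. */
static void mock_debug_object_init(struct mock_obj *obj)
{
	obj->tracked = 1;
}

/* Returns 1 when a real fixup was performed, otherwise 0. */
static int mock_fixup_assert_init(void *addr, enum mock_obj_state state)
{
	struct mock_obj *obj = addr;

	if (state != MOCK_STATE_NOTAVAILABLE)
		return 0;

	if (obj->is_static) {
		/* Legitimate static object: register it only, not a real fixup. */
		mock_debug_object_init(obj);
		return 0;
	}

	/* Uninitialized object: initialize it, register it, report a fixup. */
	obj->initialized = 1;
	mock_debug_object_init(obj);
	return 1;
}
```

In both branches the object ends up registered before the function returns, which matches the note that debug_object_init() should be called before returning.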
diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index 54883de5d5f9..ac3d0018140c 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
@@ -521,6 +521,11 @@ Here's a description of the fields of <varname>struct uio_mem</varname>:
<itemizedlist>
<listitem><para>
+<varname>const char *name</varname>: Optional. Set this to help identify
+the memory region; it will show up in the corresponding sysfs node.
+</para></listitem>
+
+<listitem><para>
<varname>int memtype</varname>: Required if the mapping is used. Set this to
<varname>UIO_MEM_PHYS</varname> if you have physical memory on your
card to be mapped. Use <varname>UIO_MEM_LOGICAL</varname> for logical
@@ -553,7 +558,7 @@ instead to remember such an address.
</itemizedlist>
<para>
-Please do not touch the <varname>kobj</varname> element of
+Please do not touch the <varname>map</varname> element of
<varname>struct uio_mem</varname>! It is used by the UIO framework
to set up sysfs files for this mapping. Simply leave it alone.
</para>
diff --git a/Documentation/HOWTO b/Documentation/HOWTO
index 81bc1a9ab9d8..f7ade3b3b40d 100644
--- a/Documentation/HOWTO
+++ b/Documentation/HOWTO
@@ -275,8 +275,8 @@ versions.
If no 2.6.x.y kernel is available, then the highest numbered 2.6.x
kernel is the current stable kernel.
-2.6.x.y are maintained by the "stable" team <stable@kernel.org>, and are
-released as needs dictate. The normal release period is approximately
+2.6.x.y are maintained by the "stable" team <stable@vger.kernel.org>, and
+are released as needs dictate. The normal release period is approximately
two weeks, but it can be longer if there are no pressing problems. A
security-related problem, instead, can cause a release to happen almost
instantly.
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 0c134f8afc6f..bff2d8be1e18 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -328,6 +328,12 @@ over a rather long period of time, but improvements are always welcome!
RCU rather than SRCU, because RCU is almost always faster and
easier to use than is SRCU.
+ If you need to enter your read-side critical section in a
+ hardirq or exception handler, and then exit that same read-side
+ critical section in the task that was interrupted, then you need
+ to use srcu_read_lock_raw() and srcu_read_unlock_raw(), which avoid
+ the lockdep checking that would otherwise flag this practice as illegal.
+
Also unlike other forms of RCU, explicit initialization
and cleanup is required via init_srcu_struct() and
cleanup_srcu_struct(). These are passed a "struct srcu_struct"
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt
index 31852705b586..bf778332a28f 100644
--- a/Documentation/RCU/rcu.txt
+++ b/Documentation/RCU/rcu.txt
@@ -38,11 +38,11 @@ o How can the updater tell when a grace period has completed
Preemptible variants of RCU (CONFIG_TREE_PREEMPT_RCU) get the
same effect, but require that the readers manipulate CPU-local
- counters. These counters allow limited types of blocking
- within RCU read-side critical sections. SRCU also uses
- CPU-local counters, and permits general blocking within
- RCU read-side critical sections. These two variants of
- RCU detect grace periods by sampling these counters.
+ counters. These counters allow limited types of blocking within
+ RCU read-side critical sections. SRCU also uses CPU-local
+ counters, and permits general blocking within RCU read-side
+ critical sections. These variants of RCU detect grace periods
+ by sampling these counters.
o If I am running on a uniprocessor kernel, which can only do one
thing at a time, why should I wait for a grace period?
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 4e959208f736..083d88cbc089 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -101,6 +101,11 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
messages.
+o A hardware or software issue shuts off the scheduler-clock
+ interrupt on a CPU that is not in dyntick-idle mode. This
+ problem really has happened, and seems to be most likely to
+ result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.
+
o A bug in the RCU implementation.
o A hardware failure. This is quite unlikely, but has occurred
@@ -109,12 +114,11 @@ o A hardware failure. This is quite unlikely, but has occurred
This resulted in a series of RCU CPU stall warnings, eventually
leading to the realization that the CPU had failed.
-The RCU, RCU-sched, and RCU-bh implementations have CPU stall
-warning. SRCU does not have its own CPU stall warnings, but its
-calls to synchronize_sched() will result in RCU-sched detecting
-RCU-sched-related CPU stalls. Please note that RCU only detects
-CPU stalls when there is a grace period in progress. No grace period,
-no CPU stall warnings.
+The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning.
+SRCU does not have its own CPU stall warnings, but its calls to
+synchronize_sched() will result in RCU-sched detecting RCU-sched-related
+CPU stalls. Please note that RCU only detects CPU stalls when there is
+a grace period in progress. No grace period, no CPU stall warnings.
To diagnose the cause of the stall, inspect the stack traces.
The offending function will usually be near the top of the stack.
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 783d6c134d3f..d67068d0d2b9 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -61,11 +61,24 @@ nreaders This is the number of RCU reading threads supported.
To properly exercise RCU implementations with preemptible
read-side critical sections.
+onoff_interval
+ The number of seconds between each attempt to execute a
+ randomly selected CPU-hotplug operation. Defaults to
+ zero, which disables CPU hotplugging. In HOTPLUG_CPU=n
+ kernels, rcutorture will silently refuse to do any
+ CPU-hotplug operations regardless of what value is
+ specified for onoff_interval.
+
shuffle_interval
The number of seconds to keep the test threads affinitied
to a particular subset of the CPUs, defaults to 3 seconds.
Used in conjunction with test_no_idle_hz.
+shutdown_secs The number of seconds to run the test before terminating
+ the test and powering off the system. The default is
+ zero, which disables test termination and system shutdown.
+ This capability is useful for automated testing.
+
stat_interval The number of seconds between output of torture
statistics (via printk()). Regardless of the interval,
statistics are printed when the module is unloaded.
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index aaf65f6c6cd7..49587abfc2f7 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -105,14 +105,10 @@ o "dt" is the current value of the dyntick counter that is incremented
or one greater than the interrupt-nesting depth otherwise.
The number after the second "/" is the NMI nesting depth.
- This field is displayed only for CONFIG_NO_HZ kernels.
-
o "df" is the number of times that some other CPU has forced a
quiescent state on behalf of this CPU due to this CPU being in
dynticks-idle state.
- This field is displayed only for CONFIG_NO_HZ kernels.
-
o "of" is the number of times that some other CPU has forced a
quiescent state on behalf of this CPU due to this CPU being
offline. In a perfect world, this might never happen, but it
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 6ef692667e2f..6bbe8dcdc3da 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -4,6 +4,7 @@ to start learning about RCU:
1. What is RCU, Fundamentally? http://lwn.net/Articles/262464/
2. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/
3. RCU part 3: the RCU API http://lwn.net/Articles/264090/
+4. The RCU API, 2010 Edition http://lwn.net/Articles/418853/
What is RCU?
@@ -834,6 +835,8 @@ SRCU: Critical sections Grace period Barrier
srcu_read_lock synchronize_srcu N/A
srcu_read_unlock synchronize_srcu_expedited
+ srcu_read_lock_raw
+ srcu_read_unlock_raw
srcu_dereference
SRCU: Initialization/cleanup
@@ -855,27 +858,33 @@ list can be helpful:
a. Will readers need to block? If so, you need SRCU.
-b. What about the -rt patchset? If readers would need to block
+b. Is it necessary to start a read-side critical section in a
+ hardirq handler or exception handler, and then to complete
+ this read-side critical section in the task that was
+ interrupted? If so, you need SRCU's srcu_read_lock_raw() and
+ srcu_read_unlock_raw() primitives.
+
+c. What about the -rt patchset? If readers would need to block
in a non-rt kernel, you need SRCU. If readers would block
in a -rt kernel, but not in a non-rt kernel, SRCU is not
necessary.
-c. Do you need to treat NMI handlers, hardirq handlers,
+d. Do you need to treat NMI handlers, hardirq handlers,
and code segments with preemption disabled (whether
via preempt_disable(), local_irq_save(), local_bh_disable(),
or some other mechanism) as if they were explicit RCU readers?
If so, you need RCU-sched.
-d. Do you need RCU grace periods to complete even in the face
+e. Do you need RCU grace periods to complete even in the face
of softirq monopolization of one or more of the CPUs? For
example, is your code subject to network-based denial-of-service
attacks? If so, you need RCU-bh.
-e. Is your workload too update-intensive for normal use of
+f. Is your workload too update-intensive for normal use of
RCU, but inappropriate for other synchronization mechanisms?
If so, consider SLAB_DESTROY_BY_RCU. But please be careful!
-f. Otherwise, use RCU.
+g. Otherwise, use RCU.
Of course, this all assumes that you have determined that RCU is in fact
the right tool for your job.
diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 771d48d3b335..208a2d465b92 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -51,15 +51,14 @@ ffc00000 ffefffff DMA memory mapping region. Memory returned
ff000000 ffbfffff Reserved for future expansion of DMA
mapping region.
-VMALLOC_END feffffff Free for platform use, recommended.
- VMALLOC_END must be aligned to a 2MB
- boundary.
-
VMALLOC_START VMALLOC_END-1 vmalloc() / ioremap() space.
Memory returned by vmalloc/ioremap will
be dynamically placed in this region.
- VMALLOC_START may be based upon the value
- of the high_memory variable.
+ Machine specific static mappings are also
+ located here through iotable_init().
+ VMALLOC_START is based upon the value
+ of the high_memory variable, and VMALLOC_END
+ is equal to 0xff000000.
PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region.
This maps the platforms RAM, and typically
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 3bd585b44927..27f2b21a9d5c 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -84,6 +84,93 @@ compiler optimizes the section accessing atomic_t variables.
*** YOU HAVE BEEN WARNED! ***
+Properly aligned pointers, longs, ints, and chars (and unsigned
+equivalents) may be atomically loaded from and stored to in the same
+sense as described for atomic_read() and atomic_set(). The ACCESS_ONCE()
+macro should be used to prevent the compiler from using optimizations
+that might otherwise optimize accesses out of existence on the one hand,
+or that might create unsolicited accesses on the other.
+
+For example, consider the following code:
+
+	while (a > 0)
+		do_something();
+
+If the compiler can prove that do_something() does not store to the
+variable a, then the compiler is within its rights transforming this to
+the following:
+
+	tmp = a;
+	if (tmp > 0)
+		for (;;)
+			do_something();
+
+If you don't want the compiler to do this (and you probably don't), then
+you should use something like the following:
+
+	while (ACCESS_ONCE(a) > 0)
+		do_something();
+
+Alternatively, you could place a barrier() call in the loop.
+
+For another example, consider the following code:
+
+ tmp_a = a;
+ do_something_with(tmp_a);
+ do_something_else_with(tmp_a);
+
+If the compiler can prove that do_something_with() does not store to the
+variable a, then the compiler is within its rights to manufacture an
+additional load as follows:
+
+ tmp_a = a;
+ do_something_with(tmp_a);
+ tmp_a = a;
+ do_something_else_with(tmp_a);
+
+This could fatally confuse your code if it expected the same value
+to be passed to do_something_with() and do_something_else_with().
+
+The compiler would be likely to manufacture this additional load if
+do_something_with() was an inline function that made very heavy use
+of registers: reloading from variable a could save a flush to the
+stack and later reload. To prevent the compiler from attacking your
+code in this manner, write the following:
+
+ tmp_a = ACCESS_ONCE(a);
+ do_something_with(tmp_a);
+ do_something_else_with(tmp_a);
+
+For a final example, consider the following code, assuming that the
+variable a is set at boot time before the second CPU is brought online
+and never changed later, so that memory barriers are not needed:
+
+	if (a)
+		b = 9;
+	else
+		b = 42;
+
+The compiler is within its rights to manufacture an additional store
+by transforming the above code into the following:
+
+	b = 42;
+	if (a)
+		b = 9;
+
+This could come as a fatal surprise to other code running concurrently
+that expected b to never have the value 42 if a was non-zero. To prevent
+the compiler from doing this, write something like:
+
+	if (a)
+		ACCESS_ONCE(b) = 9;
+	else
+		ACCESS_ONCE(b) = 42;
+
+Don't even -think- about doing this without proper use of memory barriers,
+locks, or atomic operations if variable a can change at runtime!
+
+*** WARNING: ACCESS_ONCE() DOES NOT IMPLY A BARRIER! ***
+
Now, we move onto the atomic operation interfaces typically implemented with
the help of assembly code.
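The load and store examples from the atomic_ops.txt hunk above can be tried in user space. This is a minimal sketch, assuming a GCC or Clang compiler (for __typeof__) and assuming ACCESS_ONCE() is a cast through a volatile-qualified pointer. The variables a and b and the helper names are illustrative only. As the text warns, the macro constrains the compiler but does not imply a memory barrier.

```c
/* User-space stand-in for the kernel macro (assumed definition):
 * forces exactly one access to x at this point in the code. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int a; /* illustrative shared variable */
static int b; /* illustrative shared variable */

/* Load a once and hand the same snapshot to both uses, so the
 * compiler may not manufacture a second load of a. */
static int snapshot_twice(void)
{
	int tmp_a = ACCESS_ONCE(a);

	return tmp_a + tmp_a;
}

/* Store to b exactly once per call, so the compiler may not emit a
 * speculative "b = 42" before testing a. */
static void store_b(void)
{
	if (ACCESS_ONCE(a))
		ACCESS_ONCE(b) = 9;
	else
		ACCESS_ONCE(b) = 42;
}
```

Calling store_b() with a set to zero leaves 42 in b, and with a non-zero leaves 9, with no intermediate value written in either case.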
diff --git a/Documentation/blockdev/cciss.txt b/Documentation/blockdev/cciss.txt
index 71464e09ec18..b79d0a13e7cd 100644
--- a/Documentation/blockdev/cciss.txt
+++ b/Documentation/blockdev/cciss.txt
@@ -98,14 +98,12 @@ You must enable "SCSI tape drive support for Smart Array 5xxx" and
"SCSI support" in your kernel configuration to be able to use SCSI
tape drives with your Smart Array 5xxx controller.
-Additionally, note that the driver will not engage the SCSI core at init
-time. The driver must be directed to dynamically engage the SCSI core via
-the /proc filesystem entry which the "block" side of the driver creates as
-/proc/driver/cciss/cciss* at runtime. This is because at driver init time,
-the SCSI core may not yet be initialized (because the driver is a block
-driver) and attempting to register it with the SCSI core in such a case
-would cause a hang. This is best done via an initialization script
-(typically in /etc/init.d, but could vary depending on distribution).
+Additionally, note that the driver will engage the SCSI core at init
+time if any tape drives or medium changers are detected. The driver may
+also be directed to dynamically engage the SCSI core via the /proc filesystem
+entry which the "block" side of the driver creates as
+/proc/driver/cciss/cciss* at runtime. This is best done via a script.
+
For example:
for x in /proc/driver/cciss/cciss[0-9]*
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index cc0ebc5241b3..4d8774f6f48a 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -44,8 +44,8 @@ Features:
- oom-killer disable knob and oom-notifier
- Root cgroup has no limit controls.
- Kernel memory and Hugepages are not under control yet. We just manage
- pages on LRU. To add more controls, we have to take care of performance.
+ Kernel memory support is a work in progress, and the current version provides
+ basic functionality. (See Section 2.7)
Brief summary of control files.
@@ -72,6 +72,9 @@ Brief summary of control files.
memory.oom_control # set/show oom controls.
memory.numa_stat # show the number of memory usage per numa node
+ memory.kmem.tcp.limit_in_bytes # set/show hard limit for tcp buf memory
+ memory.kmem.tcp.usage_in_bytes # show current tcp buf memory allocation
+
1. History
The memory controller has a long history. A request for comments for the memory
@@ -255,6 +258,27 @@ When oom event notifier is registered, event will be delivered.
per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
zone->lru_lock, it has no lock of its own.
+2.7 Kernel Memory Extension (CONFIG_CGROUP_MEM_RES_CTLR_KMEM)
+
+With the Kernel memory extension, the Memory Controller is able to limit
+the amount of kernel memory used by the system. Kernel memory is fundamentally
+different than user memory, since it can't be swapped out, which makes it
+possible to DoS the system by consuming too much of this precious resource.
+
+Kernel memory limits are not imposed for the root cgroup. Usage for the root
+cgroup may or may not be accounted.
+
+Currently no soft limit is implemented for kernel memory. It is future work
+to trigger slab reclaim when those limits are reached.
+
+2.7.1 Current Kernel Memory resources accounted
+
+* sockets memory pressure: some sockets protocols have memory pressure
+thresholds. The Memory Controller allows them to be controlled individually
+per cgroup, instead of globally.
+
+* tcp memory pressure: sockets memory pressure for the tcp protocol.
+
3. User Interface
0. Configuration
diff --git a/Documentation/cgroups/net_prio.txt b/Documentation/cgroups/net_prio.txt
new file mode 100644
index 000000000000..01b322635591
--- /dev/null
+++ b/Documentation/cgroups/net_prio.txt
@@ -0,0 +1,53 @@
+Network priority cgroup
+-------------------------
+
+The Network priority cgroup provides an interface to allow an administrator to
+dynamically set the priority of network traffic generated by various
+applications.
+
+Nominally, an application would set the priority of its traffic via the
+SO_PRIORITY socket option. This, however, is not always possible because:
+
+1) The application may not have been coded to set this value
+2) The priority of application traffic is often a site-specific administrative
+ decision rather than an application defined one.
+
+This cgroup allows an administrator to assign a process to a group which defines
+the priority of egress traffic on a given interface. Network priority groups can
+be created by first mounting the cgroup filesystem.
+
+# mount -t cgroup -onet_prio none /sys/fs/cgroup/net_prio
+
+With the above step, the initial group acting as the parent accounting group
+becomes visible at '/sys/fs/cgroup/net_prio'. This group includes all tasks in
+the system. '/sys/fs/cgroup/net_prio/tasks' lists the tasks in this cgroup.
+
+Each net_prio cgroup contains two files that are subsystem specific
+
+net_prio.prioidx
+This file is read-only, and is simply informative. It contains a unique integer
+value that the kernel uses as an internal representation of this cgroup.
+
+net_prio.ifpriomap
+This file contains a map of the priorities assigned to traffic originating from
+processes in this group and egressing the system on various interfaces. It
+contains a list of tuples in the form <ifname priority>. Contents of this file
+can be modified by echoing a string into the file using the same tuple format.
+For example:
+
+echo "eth0 5" > /sys/fs/cgroup/net_prio/iscsi/net_prio.ifpriomap
+
+This command would force any traffic originating from processes belonging to the
+iscsi net_prio cgroup and egressing on interface eth0 to have the priority of
+said traffic set to the value 5. The parent accounting group also has a
+writeable 'net_prio.ifpriomap' file that can be used to set a system default
+priority.
+
+Priorities are set immediately prior to queueing a frame to the device
+queueing discipline (qdisc) so priorities will be assigned prior to the hardware
+queue selection being made.
+
+One usage for the net_prio cgroup is with mqprio qdisc allowing application
+traffic to be steered to hardware/driver based traffic classes. These mappings
+can then be managed by administrators or other networking protocols such as
+DCBX.
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index d221781dabaa..c7a2eb8450c2 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -127,7 +127,7 @@ in the bash (as said, 1000 is default), do:
echo $(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
>ondemand/sampling_rate
-show_sampling_rate_min:
+sampling_rate_min:
The sampling rate is limited by the HW transition latency:
transition_latency * 100
Or by kernel restrictions:
@@ -140,8 +140,6 @@ HZ=100: min=200000us (200ms)
The highest value of kernel and HW latency restrictions is shown and
used as the minimum sampling rate.
-show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT.
-
up_threshold: defines what the average CPU usage between the samplings
of 'sampling_rate' needs to be for the kernel to make a decision on
whether it should increase the frequency. For example when it is set
diff --git a/Documentation/development-process/5.Posting b/Documentation/development-process/5.Posting
index 903a2546f138..8a48c9b62864 100644
--- a/Documentation/development-process/5.Posting
+++ b/Documentation/development-process/5.Posting
@@ -271,10 +271,10 @@ copies should go to:
the linux-kernel list.
- If you are fixing a bug, think about whether the fix should go into the
- next stable update. If so, stable@kernel.org should get a copy of the
- patch. Also add a "Cc: stable@kernel.org" to the tags within the patch
- itself; that will cause the stable team to get a notification when your
- fix goes into the mainline.
+ next stable update. If so, stable@vger.kernel.org should get a copy of
+ the patch. Also add a "Cc: stable@vger.kernel.org" to the tags within
+ the patch itself; that will cause the stable team to get a notification
+ when your fix goes into the mainline.
When selecting recipients for a patch, it is good to have an idea of who
you think will eventually accept the patch and get it merged. While it
diff --git a/Documentation/devices.txt b/Documentation/devices.txt
index eccffe715229..cec8864ce4e8 100644
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -379,7 +379,7 @@ Your cooperation is appreciated.
162 = /dev/smbus System Management Bus
163 = /dev/lik Logitech Internet Keyboard
164 = /dev/ipmo Intel Intelligent Platform Management
- 165 = /dev/vmmon VMWare virtual machine monitor
+ 165 = /dev/vmmon VMware virtual machine monitor
166 = /dev/i2o/ctl I2O configuration manager
167 = /dev/specialix_sxctl Specialix serial control
168 = /dev/tcldrv Technology Concepts serial control
diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index 52916b4aa1fe..9b4b82a721b6 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -42,6 +42,10 @@ Optional
- interrupts : Interrupt source of the parent interrupt controller. Only
present on secondary GICs.
+- cpu-offset : per-cpu offset within the distributor and cpu interface
+ regions, used when the GIC doesn't have banked registers. The offset is
+ cpu-offset * cpu-nr.
+
Example:
intc: interrupt-controller@fff11000 {
diff --git a/Documentation/devicetree/bindings/arm/vic.txt b/Documentation/devicetree/bindings/arm/vic.txt
new file mode 100644
index 000000000000..266716b23437
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/vic.txt
@@ -0,0 +1,29 @@
+* ARM Vectored Interrupt Controller
+
+One or more Vectored Interrupt Controllers (VICs) can be connected in an ARM
+system for interrupt routing. For multiple controllers they can either be
+nested or have the outputs wire-OR'd together.
+
+Required properties:
+
+- compatible : should be one of
+ "arm,pl190-vic"
+ "arm,pl192-vic"
+- interrupt-controller : Identifies the node as an interrupt controller
+- #interrupt-cells : The number of cells to define the interrupts. Must be 1 as
+ the VIC has no configuration options for interrupt sources. The cell is a u32
+ and defines the interrupt number.
+- reg : The register bank for the VIC.
+
+Optional properties:
+
+- interrupts : Interrupt source for parent controllers if the VIC is nested.
+
+Example:
+
+ vic0: interrupt-controller@60000 {
+ compatible = "arm,pl192-vic";
+ interrupt-controller;
+ #interrupt-cells = <1>;
+ reg = <0x60000 0x1000>;
+ };
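+
+A nested VIC additionally names its parent controller and the parent input
+that it drives. The node below is a sketch only; the vic1 label, the address
+and interrupt number 31 are assumptions made for illustration:
+
+	vic1: interrupt-controller@70000 {
+		compatible = "arm,pl192-vic";
+		interrupt-controller;
+		#interrupt-cells = <1>;
+		reg = <0x70000 0x1000>;
+		interrupt-parent = <&vic0>;	/* hypothetical: nested under vic0 */
+		interrupts = <31>;		/* hypothetical input on the parent */
+	};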
diff --git a/Documentation/devicetree/bindings/i2c/i2c-designware.txt b/Documentation/devicetree/bindings/i2c/i2c-designware.txt
new file mode 100644
index 000000000000..e42a2ee233e6
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/i2c-designware.txt
@@ -0,0 +1,22 @@
+* Synopsys DesignWare I2C
+
+Required properties :
+
+ - compatible : should be "snps,designware-i2c"
+ - reg : Offset and length of the register set for the device
+ - interrupts : <IRQ> where IRQ is the interrupt number.
+
+Recommended properties :
+
+ - clock-frequency : desired I2C bus clock frequency in Hz.
+
+Example :
+
+ i2c@f0000 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "snps,designware-i2c";
+ reg = <0xf0000 0x1000>;
+ interrupts = <11>;
+ clock-frequency = <400000>;
+ };
diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
new file mode 100644
index 000000000000..1a85f986961b
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
@@ -0,0 +1,58 @@
+This is a list of trivial i2c devices that have simple device tree
+bindings, consisting only of a compatible field, an address and
+possibly an interrupt line.
+
+If a device needs more specific bindings, such as properties to
+describe some aspect of it, there needs to be a specific binding
+document for it, just like any other device.
+
+
+Compatible Vendor / Chip
+========== =============
+ad,ad7414 SMBus/I2C Digital Temperature Sensor in 6-Pin SOT with SMBus Alert and Over Temperature Pin
+ad,adm9240 ADM9240: Complete System Hardware Monitor for uProcessor-Based Systems
+adi,adt7461 +/-1C TDM Extended Temp Range I.C
+adt7461 +/-1C TDM Extended Temp Range I.C
+at,24c08 i2c serial eeprom (24cxx)
+atmel,24c02 i2c serial eeprom (24cxx)
+catalyst,24c32 i2c serial eeprom
+dallas,ds1307 64 x 8, Serial, I2C Real-Time Clock
+dallas,ds1338 I2C RTC with 56-Byte NV RAM
+dallas,ds1339 I2C Serial Real-Time Clock
+dallas,ds1340 I2C RTC with Trickle Charger
+dallas,ds1374 I2C, 32-Bit Binary Counter Watchdog RTC with Trickle Charger and Reset Input/Output
+dallas,ds1631 High-Precision Digital Thermometer
+dallas,ds1682 Total-Elapsed-Time Recorder with Alarm
+dallas,ds1775 Tiny Digital Thermometer and Thermostat
+dallas,ds3232 Extremely Accurate I²C RTC with Integrated Crystal and SRAM
+dallas,ds4510 CPU Supervisor with Nonvolatile Memory and Programmable I/O
+dallas,ds75 Digital Thermometer and Thermostat
+dialog,da9053 DA9053: flexible system level PMIC with multicore support
+epson,rx8025 High-Stability. I2C-Bus INTERFACE REAL TIME CLOCK MODULE
+epson,rx8581 I2C-BUS INTERFACE REAL TIME CLOCK MODULE
+fsl,mag3110 MAG3110: Xtrinsic High Accuracy, 3D Magnetometer
+fsl,mc13892 MC13892: Power Management Integrated Circuit (PMIC) for i.MX35/51
+fsl,mma8450 MMA8450Q: Xtrinsic Low-power, 3-axis Accelerometer
+fsl,mpr121 MPR121: Proximity Capacitive Touch Sensor Controller
+fsl,sgtl5000 SGTL5000: Ultra Low-Power Audio Codec
+maxim,ds1050 5 Bit Programmable, Pulse-Width Modulator
+maxim,max1237 Low-Power, 4-/12-Channel, 2-Wire Serial, 12-Bit ADCs
+maxim,max6625 9-Bit/12-Bit Temperature Sensors with I²C-Compatible Serial Interface
+mc,rv3029c2 Real Time Clock Module with I2C-Bus
+national,lm75 I2C TEMP SENSOR
+national,lm80 Serial Interface ACPI-Compatible Microprocessor System Hardware Monitor
+national,lm92 ±0.33°C Accurate, 12-Bit + Sign Temperature Sensor and Thermal Window Comparator with Two-Wire Interface
+nxp,pca9556 Octal SMBus and I2C registered interface
+nxp,pca9557 8-bit I2C-bus and SMBus I/O port with reset
+nxp,pcf8563 Real-time clock/calendar
+ovti,ov5642 OV5642: Color CMOS QSXGA (5-megapixel) Image Sensor with OmniBSI and Embedded TrueFocus
+pericom,pt7c4338 Real-time Clock Module
+plx,pex8648 48-Lane, 12-Port PCI Express Gen 2 (5.0 GT/s) Switch
+ramtron,24c64 i2c serial eeprom (24cxx)
+ricoh,rs5c372a I2C bus SERIAL INTERFACE REAL-TIME CLOCK IC
+samsung,24ad0xd1 S524AD0XF1 (128K/256K-bit Serial EEPROM for Low Power)
+st-micro,24c256 i2c serial eeprom (24cxx)
+stm,m41t00 Serial Access TIMEKEEPER
+stm,m41t62 Serial real-time clock (RTC) with alarm
+stm,m41t80 M41T80 - SERIAL ACCESS RTC WITH ALARMS
+ti,tsc2003 I2C Touch-Screen Controller
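+
+A node for one of these devices might look like the following sketch. It
+assumes the node sits under an I2C bus node that declares
+#address-cells = <1> and #size-cells = <0>; the node name is illustrative,
+and 0x68 is the usual DS1307 slave address, which should be checked against
+the board in question:
+
+	rtc@68 {
+		compatible = "dallas,ds1307";
+		reg = <0x68>;	/* I2C slave address of the device */
+	};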
diff --git a/Documentation/devicetree/bindings/net/calxeda-xgmac.txt b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
new file mode 100644
index 000000000000..411727a3f82d
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
@@ -0,0 +1,15 @@
+* Calxeda Highbank 10Gb XGMAC Ethernet
+
+Required properties:
+- compatible : Should be "calxeda,hb-xgmac"
+- reg : Address and length of the register set for the device
+- interrupts : Should contain 3 xgmac interrupts. The 1st is the main
+ interrupt. The 2nd is the power management interrupt. The 3rd is the low
+ power state interrupt.
+
+Example:
+
+ethernet@fff50000 {
+ compatible = "calxeda,hb-xgmac";
+ reg = <0xfff50000 0x1000>;
+ interrupts = <0 77 4 0 78 4 0 79 4>;
+};
diff --git a/Documentation/devicetree/bindings/net/can/cc770.txt b/Documentation/devicetree/bindings/net/can/cc770.txt
new file mode 100644
index 000000000000..77027bf6460a
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/cc770.txt
@@ -0,0 +1,53 @@
+Memory mapped Bosch CC770 and Intel AN82527 CAN controller
+
+Note: The CC770 is a CAN controller from Bosch, which is 100%
+compatible with the old AN82527 from Intel, but with "bugs" being fixed.
+
+Required properties:
+
+- compatible : should be "bosch,cc770" for the CC770 and "intc,82527"
+ for the AN82527.
+
+- reg : should specify the chip select, address offset and size required
+ to map the registers of the controller. The size is usually 0x80.
+
+- interrupts : property with a value describing the interrupt source
+ (number and sensitivity) required for the controller.
+
+Optional properties:
+
+- bosch,external-clock-frequency : frequency of the external oscillator
+ clock in Hz. Note that the internal clock frequency used by the
+ controller is half of that value. If not specified, a default
+ value of 16000000 (16 MHz) is used.
+
+- bosch,clock-out-frequency : clock frequency in Hz on the CLKOUT pin.
+ If not specified or if the specified value is 0, the CLKOUT pin
+ will be disabled.
+
+- bosch,slew-rate : slew rate of the CLKOUT signal. If not specified,
+ a reasonable value will be calculated.
+
+- bosch,disconnect-rx0-input : see data sheet.
+
+- bosch,disconnect-rx1-input : see data sheet.
+
+- bosch,disconnect-tx1-output : see data sheet.
+
+- bosch,polarity-dominant : see data sheet.
+
+- bosch,divide-memory-clock : see data sheet.
+
+- bosch,iso-low-speed-mux : see data sheet.
+
+For further information, please have a look at the CC770 or AN82527 data sheet.
+
+Examples:
+
+can@3,100 {
+ compatible = "bosch,cc770";
+ reg = <3 0x100 0x80>;
+ interrupts = <2 0>;
+ interrupt-parent = <&mpic>;
+ bosch,external-clock-frequency = <16000000>;
+};
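+
+As a further sketch (the values are assumptions, not taken from a real
+board), an AN82527 fed by a 16 MHz oscillator runs its internal clock at
+16000000 / 2 = 8000000 Hz. The optional CLKOUT property could be set like
+this:
+
+can@3,100 {
+	compatible = "intc,82527";
+	reg = <3 0x100 0x80>;
+	interrupts = <2 0>;
+	interrupt-parent = <&mpic>;
+	bosch,external-clock-frequency = <16000000>;
+	bosch,clock-out-frequency = <8000000>;	/* hypothetical CLKOUT rate */
+};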
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt
new file mode 100644
index 000000000000..b9a8a2bcfae7
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt
@@ -0,0 +1,163 @@
+Message unit node:
+
+For SRIO controllers that implement the message unit as part of the controller
+this node is required. For devices with RMAN this node should NOT exist. The
+node is composed of three types of sub-nodes ("fsl-srio-msg-unit",
+"fsl-srio-dbell-unit" and "fsl-srio-port-write-unit").
+
+See srio.txt for more details about the generic SRIO controller.
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: Must include "fsl,srio-rmu-vX.Y", "fsl,srio-rmu".
+
+ The version X.Y should match the general SRIO controller's IP Block
+ revision register's Major(X) and Minor (Y) value.
+
+ - reg
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Specifies the physical address and
+ length of the SRIO configuration registers for message units
+ and doorbell units.
+
+ - fsl,liodn
+ Usage: optional-but-recommended (for devices with PAMU)
+ Value type: <prop-encoded-array>
+ Definition: The logical I/O device number for the PAMU (IOMMU) to be
+ correctly configured for SRIO accesses. The property should
+ not exist on devices that do not support PAMU.
+
+ The LIODN value is associated with all RMU transactions
+ (msg-unit, doorbell, port-write).
+
+Sub-Nodes for RMU: The RMU node is composed of multiple sub-nodes that
+correspond to the actual sub-controllers in the RMU. The manual for a given
+SoC will detail which and how many of these sub-controllers are implemented.
+
+Message Unit:
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: Must include "fsl,srio-msg-unit-vX.Y", "fsl,srio-msg-unit".
+
+ The version X.Y should match the general SRIO controller's IP Block
+ revision register's Major(X) and Minor (Y) value.
+
+ - reg
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Specifies the physical address and
+ length of the SRIO configuration registers for message units
+ and doorbell units.
+
+ - interrupts
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: Specifies the interrupts generated by this device. The
+ value of the interrupts property consists of one interrupt
+ specifier. The format of the specifier is defined by the
+ binding document describing the node's interrupt parent.
+
+ A pair of IRQs are specified in this property. The first
+ element is associated with the transmit (TX) interrupt and the
+ second element is associated with the receive (RX) interrupt.
+
+Doorbell Unit:
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: Must include:
+ "fsl,srio-dbell-unit-vX.Y", "fsl,srio-dbell-unit"
+
+ The version X.Y should match the general SRIO controller's IP Block
+ revision register's Major(X) and Minor (Y) value.
+
+ - reg
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Specifies the physical address and
+ length of the SRIO configuration registers for message units
+ and doorbell units.
+
+ - interrupts
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: Specifies the interrupts generated by this device. The
+ value of the interrupts property consists of one interrupt
+ specifier. The format of the specifier is defined by the
+ binding document describing the node's interrupt parent.
+
+ A pair of IRQs are specified in this property. The first
+ element is associated with the transmit (TX) interrupt and the
+ second element is associated with the receive (RX) interrupt.
+
+Port-Write Unit:
+
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: Must include:
+ "fsl,srio-port-write-unit-vX.Y", "fsl,srio-port-write-unit"
+
+ The version X.Y should match the general SRIO controller's IP Block
+ revision register's Major(X) and Minor (Y) value.
+
+ - reg
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Specifies the physical address and
+ length of the SRIO configuration registers for message units
+ and doorbell units.
+
+ - interrupts
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: Specifies the interrupts generated by this device. The
+ value of the interrupts property consists of one interrupt
+ specifier. The format of the specifier is defined by the
+ binding document describing the node's interrupt parent.
+
+ A single IRQ that handles port-write conditions is
+ specified by this property. (Typically shared with error).
+
+ Note: All other standard properties (see the ePAPR) are allowed
+ but are optional.
+
+Example:
+ rmu: rmu@d3000 {
+ compatible = "fsl,srio-rmu";
+ reg = <0xd3000 0x400>;
+ ranges = <0x0 0xd3000 0x400>;
+ fsl,liodn = <0xc8>;
+
+ message-unit@0 {
+ compatible = "fsl,srio-msg-unit";
+ reg = <0x0 0x100>;
+ interrupts = <
+ 60 2 0 0 /* msg1_tx_irq */
+ 61 2 0 0>;/* msg1_rx_irq */
+ };
+ message-unit@100 {
+ compatible = "fsl,srio-msg-unit";
+ reg = <0x100 0x100>;
+ interrupts = <
+ 62 2 0 0 /* msg2_tx_irq */
+ 63 2 0 0>;/* msg2_rx_irq */
+ };
+ doorbell-unit@400 {
+ compatible = "fsl,srio-dbell-unit";
+ reg = <0x400 0x80>;
+ interrupts = <
+ 56 2 0 0 /* bell_outb_irq */
+ 57 2 0 0>;/* bell_inb_irq */
+ };
+ port-write-unit@4e0 {
+ compatible = "fsl,srio-port-write-unit";
+ reg = <0x4e0 0x20>;
+ interrupts = <16 2 1 11>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt
new file mode 100644
index 000000000000..b039bcbee134
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt
@@ -0,0 +1,103 @@
+* Freescale Serial RapidIO (SRIO) Controller
+
+RapidIO port node:
+Properties:
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: Must include "fsl,srio" for IP blocks with IP Block
+ Revision Register (SRIO IPBRR1) Major ID equal to 0x01c0.
+
+ Optionally, a compatible string of "fsl,srio-vX.Y" where X is Major
+ version in IP Block Revision Register and Y is Minor version. If this
+ compatible is provided it should be ordered before "fsl,srio".
+
+ - reg
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Specifies the physical address and
+ length of the SRIO configuration registers. The size should
+ be set to 0x11000.
+
+ - interrupts
+ Usage: required
+ Value type: <prop-encoded-array>
+ Definition: Specifies the interrupts generated by this device. The
+ value of the interrupts property consists of one interrupt
+ specifier. The format of the specifier is defined by the
+ binding document describing the node's interrupt parent.
+
+ A single IRQ that handles error conditions is specified by this
+ property. (Typically shared with port-write).
+
+ - fsl,srio-rmu-handle:
+ Usage: required if rmu node is defined
+ Value type: <phandle>
+ Definition: A single <phandle> value that points to the RMU.
+ (See srio-rmu.txt for more details on RMU node binding)
+
+Port Child Nodes: There should be a port child node for each port that exists in
+the controller. The ports are numbered starting at one (1) and should have
+the following properties:
+
+ - cell-index
+ Usage: required
+ Value type: <u32>
+ Definition: A standard property. Matches the port id.
+
+ - ranges
+ Usage: required if local access windows present
+ Value type: <prop-encoded-array>
+ Definition: A standard property. Utilized to describe the memory mapped
+ IO space utilized by the controller. This corresponds to the
+ setting of the local access windows that are targeted to this
+ SRIO port.
+
+ - fsl,liodn
+ Usage: optional-but-recommended (for devices with PAMU)
+ Value type: <prop-encoded-array>
+ Definition: The logical I/O device number for the PAMU (IOMMU) to be
+ correctly configured for SRIO accesses. The property should
+ not exist on devices that do not support PAMU.
+
+ For HW (e.g., the P4080) that supports only a single LIODN for both
+ memory and maintenance transactions, a single LIODN is
+ represented in the property for both transactions.
+
+ For HW (e.g., the P304x/P5020, etc.) that supports an LIODN for
+ memory transactions and a unique LIODN for maintenance
+ transactions, a pair of LIODNs is represented in the
+ property. Within the pair, the first element represents the
+ LIODN associated with memory transactions and the second element
+ represents the LIODN associated with maintenance transactions
+ for the port.
+
+Note: All other standard properties (see ePAPR) are allowed but are optional.
+
+Example:
+
+ rapidio: rapidio@ffe0c0000 {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ reg = <0xf 0xfe0c0000 0 0x11000>;
+ compatible = "fsl,srio";
+ interrupts = <16 2 1 11>; /* err_irq */
+ fsl,srio-rmu-handle = <&rmu>;
+ ranges;
+
+ port1 {
+ cell-index = <1>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ fsl,liodn = <34>;
+ ranges = <0 0 0xc 0x20000000 0 0x10000000>;
+ };
+
+ port2 {
+ cell-index = <2>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ fsl,liodn = <48>;
+ ranges = <0 0 0xc 0x30000000 0 0x10000000>;
+ };
+ };
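+
+On hardware with separate LIODNs for memory and maintenance transactions, a
+port node would carry a pair rather than a single value. A sketch, with
+LIODN values that are purely illustrative:
+
+	port1 {
+		cell-index = <1>;
+		#address-cells = <2>;
+		#size-cells = <2>;
+		fsl,liodn = <34 35>;	/* hypothetical: memory, then maintenance */
+		ranges = <0 0 0xc 0x20000000 0 0x10000000>;
+	};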
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index e8552782b440..18626965159e 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -8,7 +8,9 @@ amcc Applied Micro Circuits Corporation (APM, formerly AMCC)
apm Applied Micro Circuits Corporation (APM)
arm ARM Ltd.
atmel Atmel Corporation
+cavium Cavium, Inc.
chrp Common Hardware Reference Platform
+cortina Cortina Systems, Inc.
dallas Maxim Integrated Products (formerly Dallas Semiconductor)
denx Denx Software Engineering
epson Seiko Epson Corp.
@@ -33,8 +35,10 @@ qcom Qualcomm, Inc.
ramtron Ramtron International
samsung Samsung Semiconductor
schindler Schindler
+sil Silicon Image
simtek
sirf SiRF Technology, Inc.
+st STMicroelectronics
stericsson ST-Ericsson
ti Texas Instruments
xlnx Xilinx
diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
new file mode 100644
index 000000000000..510eab32f392
--- /dev/null
+++ b/Documentation/dma-buf-sharing.txt
@@ -0,0 +1,224 @@
+ DMA Buffer Sharing API Guide
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ Sumit Semwal
+ <sumit dot semwal at linaro dot org>
+ <sumit dot semwal at ti dot com>
+
+This document serves as a guide for device-driver writers on what the dma-buf
+buffer sharing API is, and how to use it for exporting and using shared buffers.
+
+Any device driver which wishes to be a part of DMA buffer sharing can do so as
+either the 'exporter' of buffers, or the 'user' of buffers.
+
+If a driver A wants to use buffers created by driver B, then we call B the
+exporter, and A the buffer-user.
+
+The exporter
+- implements and manages operations[1] for the buffer,
+- allows other users to share the buffer by using dma_buf sharing APIs,
+- manages the details of buffer allocation,
+- decides about the actual backing storage where this allocation happens,
+- takes care of any migration of scatterlist - for all (shared) users of this
+ buffer.
+The buffer-user
+- is one of (many) sharing users of the buffer.
+- doesn't need to worry about how the buffer is allocated, or where.
+- needs a mechanism to get access to the scatterlist that makes up this buffer
+ in memory, mapped into its own address space, so it can access the same area
+ of memory.
+
+*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
+For this first version, a buffer shared using the dma_buf sharing API:
+- *may* be exported to user space using "mmap" *ONLY* by exporter, outside of
+ this framework.
+- may be used *ONLY* by importers that do not need CPU access to the buffer.
+
+The dma_buf buffer sharing API usage contains the following steps:
+
+1. Exporter announces that it wishes to export a buffer
+2. Userspace gets the file descriptor associated with the exported buffer, and
+ passes it around to potential buffer-users based on use case
+3. Each buffer-user 'connects' itself to the buffer
+4. When needed, buffer-user requests access to the buffer from exporter
+5. When finished with its use, the buffer-user notifies end-of-DMA to exporter
+6. When the buffer-user is done using this buffer completely, it 'disconnects'
+ itself from the buffer.
+
+
+1. Exporter's announcement of buffer export
+
+ The buffer exporter announces its wish to export a buffer. In this, it
+ connects its own private buffer data, provides implementation for operations
+ that can be performed on the exported dma_buf, and flags for the file
+ associated with this buffer.
+
+ Interface:
+ struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
+ size_t size, int flags)
+
+ If this succeeds, dma_buf_export allocates a dma_buf structure, and returns a
+ pointer to the same. It also associates an anonymous file with this buffer,
+ so it can be exported. On failure to allocate the dma_buf object, it returns
+ NULL.
+
+2. Userspace gets a handle to pass around to potential buffer-users
+
+ A userspace entity requests a file descriptor (fd), which is a handle to the
+ anonymous file associated with the buffer. It can then share the fd with other
+ drivers and/or processes.
+
+ Interface:
+ int dma_buf_fd(struct dma_buf *dmabuf)
+
+ This API installs an fd for the anonymous file associated with this buffer;
+ returns either 'fd', or error.
+
+3. Each buffer-user 'connects' itself to the buffer
+
+ Each buffer-user now gets a reference to the buffer, using the fd passed to
+ it.
+
+ Interface:
+ struct dma_buf *dma_buf_get(int fd)
+
+ This API will return a reference to the dma_buf, and increment refcount for
+ it.
+
+ After this, the buffer-user needs to attach its device to the buffer, which
+ lets the exporter learn the device's buffer constraints.
+
+ Interface:
+ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
+ struct device *dev)
+
+ This API returns a reference to an attachment structure, which is then used
+ for scatterlist operations. It will optionally call the 'attach' dma_buf
+ operation, if provided by the exporter.
+
+ The dma-buf sharing framework does the bookkeeping bits related to managing
+ the list of all attachments to a buffer.
+
+Up to this stage, the buffer-exporter may choose not to actually allocate the
+backing storage for this buffer, and instead wait until the first buffer-user
+requests use of the buffer before allocating it.
+
+
+4. When needed, buffer-user requests access to the buffer
+
+ Whenever a buffer-user wants to use the buffer for any DMA, it asks for
+ access to the buffer using dma_buf_map_attachment API. At least one attach to
+ the buffer must have happened before map_dma_buf can be called.
+
+ Interface:
+ struct sg_table * dma_buf_map_attachment(struct dma_buf_attachment *,
+ enum dma_data_direction);
+
+ This is a wrapper to dma_buf->ops->map_dma_buf operation, which hides the
+ "dma_buf->ops->" indirection from the users of this interface.
+
+ In struct dma_buf_ops, map_dma_buf is defined as
+ struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *,
+ enum dma_data_direction);
+
+ It is one of the buffer operations that must be implemented by the exporter.
+ It should return the sg_table containing scatterlist for this buffer, mapped
+ into caller's address space.
+
+ If this is being called for the first time, the exporter can now choose to
+ scan through the list of attachments for this buffer, collate the requirements
+ of the attached devices, and choose an appropriate backing storage for the
+ buffer.
+
+ Based on enum dma_data_direction, it might be possible to have multiple users
+ accessing at the same time (for reading, maybe), or any other kind of sharing
+ that the exporter might wish to make available to buffer-users.
+
+ map_dma_buf() operation can return -EINTR if it is interrupted by a signal.
+
+
+5. When finished, the buffer-user notifies end-of-DMA to exporter
+
+ Once the DMA for the current buffer-user is over, it signals 'end-of-DMA' to
+ the exporter using the dma_buf_unmap_attachment API.
+
+ Interface:
+ void dma_buf_unmap_attachment(struct dma_buf_attachment *,
+ struct sg_table *);
+
+ This is a wrapper to dma_buf->ops->unmap_dma_buf() operation, which hides the
+ "dma_buf->ops->" indirection from the users of this interface.
+
+ In struct dma_buf_ops, unmap_dma_buf is defined as
+ void (*unmap_dma_buf)(struct dma_buf_attachment *, struct sg_table *);
+
+ unmap_dma_buf signifies the end-of-DMA for the attachment provided. Like
+ map_dma_buf, this API also must be implemented by the exporter.
+
+
+6. When the buffer-user is done using this buffer, it 'disconnects' itself from the
+ buffer.
+
+ After the buffer-user has no more interest in using this buffer, it should
+ disconnect itself from the buffer:
+
+ - it first detaches itself from the buffer.
+
+ Interface:
+ void dma_buf_detach(struct dma_buf *dmabuf,
+ struct dma_buf_attachment *dmabuf_attach);
+
+ This API removes the attachment from the list in dmabuf, and optionally calls
+ dma_buf->ops->detach(), if provided by exporter, for any housekeeping bits.
+
+ - Then, the buffer-user returns the buffer reference to exporter.
+
+ Interface:
+ void dma_buf_put(struct dma_buf *dmabuf);
+
+ This API then reduces the refcount for this buffer.
+
+ If, as a result of this call, the refcount becomes 0, the 'release' file
+ operation related to this fd is called. It calls the dmabuf->ops->release()
+ operation in turn, and frees the memory allocated for dmabuf when exported.
+
+NOTES:
+- Importance of attach-detach and {map,unmap}_dma_buf operation pairs
+ The attach-detach calls allow the exporter to figure out backing-storage
+ constraints for the currently-interested devices. This allows preferential
+ allocation, and/or migration of pages across different types of storage
+ available, if possible.
+
+ Bracketing of DMA access with {map,unmap}_dma_buf operations is essential
+ to allow just-in-time backing of storage, and migration mid-way through a
+ use-case.
+
+- Migration of backing storage if needed
+ If after
+ - at least one map_dma_buf has happened,
+ - and the backing storage has been allocated for this buffer,
+ another new buffer-user intends to attach itself to this buffer, it might
+ be allowed, if possible for the exporter.
+
+ In case it is allowed by the exporter:
+ if the new buffer-user has stricter 'backing-storage constraints', and the
+ exporter can handle these constraints, the exporter can just stall on the
+ map_dma_buf until all outstanding access is completed (as signalled by
+ unmap_dma_buf).
+ Once all users have finished accessing and have unmapped this buffer, the
+ exporter could potentially move the buffer to the stricter backing-storage,
+ and then allow further {map,unmap}_dma_buf operations from any buffer-user
+ from the migrated backing-storage.
+
+ If the exporter cannot fulfil the backing-storage constraints of the new
+ buffer-user device as requested, dma_buf_attach() would return an error to
+ denote non-compatibility of the new buffer-sharing request with the current
+ buffer.
+
+ If the exporter chooses not to allow an attach() operation once a
+ map_dma_buf() API has been called, it simply returns an error.
+
+References:
+[1] struct dma_buf_ops in include/linux/dma-buf.h
+[2] All interfaces mentioned above are defined in include/linux/dma-buf.h
diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt
index d79aead9418b..10c64c8a13d4 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -262,6 +262,7 @@ IOMAP
devm_ioremap()
devm_ioremap_nocache()
devm_iounmap()
+ devm_request_and_ioremap() : checks resource, requests region, ioremaps
pcim_iomap()
pcim_iounmap()
pcim_iomap_table() : array of mapped addresses indexed by BAR
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 3d849122b5b1..a1e7f3eec98f 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -85,17 +85,6 @@ Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
---------------------------
-What: Deprecated snapshot ioctls
-When: 2.6.36
-
-Why: The ioctls in kernel/power/user.c were marked as deprecated long time
- ago. Now they notify users about that so that they need to replace
- their userspace. After some more time, remove them completely.
-
-Who: Jiri Slaby <jirislaby@gmail.com>
-
----------------------------
-
What: The ieee80211_regdom module parameter
When: March 2010 / desktop catchup
@@ -263,8 +252,7 @@ Who: Ravikiran Thirumalai <kiran@scalex86.org>
What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS
(in net/core/net-sysfs.c)
-When: After the only user (hal) has seen a release with the patches
- for enough time, probably some time in 2010.
+When: 3.5
Why: Over 1K .text/.data size reduction, data is available in other
ways (ioctls)
Who: Johannes Berg <johannes@sipsolutions.net>
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index d819ba16a0c7..4fca82e5276e 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -37,15 +37,15 @@ d_manage: no no yes (ref-walk) maybe
--------------------------- inode_operations ---------------------------
prototypes:
- int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
+ int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *);
struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
int (*symlink) (struct inode *,struct dentry *,const char *);
- int (*mkdir) (struct inode *,struct dentry *,int);
+ int (*mkdir) (struct inode *,struct dentry *,umode_t);
int (*rmdir) (struct inode *,struct dentry *);
- int (*mknod) (struct inode *,struct dentry *,int,dev_t);
+ int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
int (*rename) (struct inode *, struct dentry *,
struct inode *, struct dentry *);
int (*readlink) (struct dentry *, char __user *,int);
@@ -117,7 +117,7 @@ prototypes:
int (*statfs) (struct dentry *, struct kstatfs *);
int (*remount_fs) (struct super_block *, int *, char *);
void (*umount_begin) (struct super_block *);
- int (*show_options)(struct seq_file *, struct vfsmount *);
+ int (*show_options)(struct seq_file *, struct dentry *);
ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt
index 64087c34327f..7671352216f1 100644
--- a/Documentation/filesystems/btrfs.txt
+++ b/Documentation/filesystems/btrfs.txt
@@ -63,8 +63,8 @@ IRC network.
Userspace tools for creating and manipulating Btrfs file systems are
available from the git repository at the following location:
- http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git
- git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
+ http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
+ git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
These include the following tools:
diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt
index dd57bb6bb390..b40fec9d3f53 100644
--- a/Documentation/filesystems/configfs/configfs.txt
+++ b/Documentation/filesystems/configfs/configfs.txt
@@ -192,7 +192,7 @@ attribute value uses the store_attribute() method.
struct configfs_attribute {
char *ca_name;
struct module *ca_owner;
- mode_t ca_mode;
+ umode_t ca_mode;
};
When a config_item wants an attribute to appear as a file in the item's
diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt
index 742cc06e138f..6872c91bce35 100644
--- a/Documentation/filesystems/debugfs.txt
+++ b/Documentation/filesystems/debugfs.txt
@@ -35,7 +35,7 @@ described below will work.
The most general way to create a file within a debugfs directory is with:
- struct dentry *debugfs_create_file(const char *name, mode_t mode,
+ struct dentry *debugfs_create_file(const char *name, umode_t mode,
struct dentry *parent, void *data,
const struct file_operations *fops);
@@ -53,13 +53,13 @@ actually necessary; the debugfs code provides a number of helper functions
for simple situations. Files containing a single integer value can be
created with any of:
- struct dentry *debugfs_create_u8(const char *name, mode_t mode,
+ struct dentry *debugfs_create_u8(const char *name, umode_t mode,
struct dentry *parent, u8 *value);
- struct dentry *debugfs_create_u16(const char *name, mode_t mode,
+ struct dentry *debugfs_create_u16(const char *name, umode_t mode,
struct dentry *parent, u16 *value);
- struct dentry *debugfs_create_u32(const char *name, mode_t mode,
+ struct dentry *debugfs_create_u32(const char *name, umode_t mode,
struct dentry *parent, u32 *value);
- struct dentry *debugfs_create_u64(const char *name, mode_t mode,
+ struct dentry *debugfs_create_u64(const char *name, umode_t mode,
struct dentry *parent, u64 *value);
These files support both reading and writing the given value; if a specific
@@ -67,13 +67,13 @@ file should not be written to, simply set the mode bits accordingly. The
values in these files are in decimal; if hexadecimal is more appropriate,
the following functions can be used instead:
- struct dentry *debugfs_create_x8(const char *name, mode_t mode,
+ struct dentry *debugfs_create_x8(const char *name, umode_t mode,
struct dentry *parent, u8 *value);
- struct dentry *debugfs_create_x16(const char *name, mode_t mode,
+ struct dentry *debugfs_create_x16(const char *name, umode_t mode,
struct dentry *parent, u16 *value);
- struct dentry *debugfs_create_x32(const char *name, mode_t mode,
+ struct dentry *debugfs_create_x32(const char *name, umode_t mode,
struct dentry *parent, u32 *value);
- struct dentry *debugfs_create_x64(const char *name, mode_t mode,
+ struct dentry *debugfs_create_x64(const char *name, umode_t mode,
struct dentry *parent, u64 *value);
These functions are useful as long as the developer knows the size of the
@@ -81,7 +81,7 @@ value to be exported. Some types can have different widths on different
architectures, though, complicating the situation somewhat. There is a
function meant to help out in one special case:
- struct dentry *debugfs_create_size_t(const char *name, mode_t mode,
+ struct dentry *debugfs_create_size_t(const char *name, umode_t mode,
struct dentry *parent,
size_t *value);
@@ -90,21 +90,22 @@ a variable of type size_t.
Boolean values can be placed in debugfs with:
- struct dentry *debugfs_create_bool(const char *name, mode_t mode,
+ struct dentry *debugfs_create_bool(const char *name, umode_t mode,
struct dentry *parent, u32 *value);
A read on the resulting file will yield either Y (for non-zero values) or
N, followed by a newline. If written to, it will accept either upper- or
lower-case values, or 1 or 0. Any other input will be silently ignored.
-Finally, a block of arbitrary binary data can be exported with:
+Another option is exporting a block of arbitrary binary data, with
+this structure and function:
struct debugfs_blob_wrapper {
void *data;
unsigned long size;
};
- struct dentry *debugfs_create_blob(const char *name, mode_t mode,
+ struct dentry *debugfs_create_blob(const char *name, umode_t mode,
struct dentry *parent,
struct debugfs_blob_wrapper *blob);
@@ -115,6 +116,35 @@ can be used to export binary information, but there does not appear to be
any code which does so in the mainline. Note that all files created with
debugfs_create_blob() are read-only.
+If you want to dump a block of registers (something that happens quite
+often during development, even if little such code reaches mainline),
+debugfs offers two functions: one to make a registers-only file, and
+another to insert a register block in the middle of another sequential
+file.
+
+ struct debugfs_reg32 {
+ char *name;
+ unsigned long offset;
+ };
+
+ struct debugfs_regset32 {
+ struct debugfs_reg32 *regs;
+ int nregs;
+ void __iomem *base;
+ };
+
+ struct dentry *debugfs_create_regset32(const char *name, mode_t mode,
+ struct dentry *parent,
+ struct debugfs_regset32 *regset);
+
+ int debugfs_print_regs32(struct seq_file *s, struct debugfs_reg32 *regs,
+ int nregs, void __iomem *base, char *prefix);
+
+The "base" argument may be 0, but you may want to build the reg32 array
+using __stringify, since many register names (macros) are actually byte
+offsets from the base of the register block.
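
The __stringify remark can be illustrated with a small userspace sketch. This is not kernel code: struct demo_reg32 only mirrors the debugfs_reg32 layout shown above, the DEMO_* offsets, the DUMP_REG() macro and demo_print_regs32() are hypothetical names invented for this example, and a plain byte array stands in for the __iomem base that a driver would read with readl(). The output format is an assumption as well.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for struct debugfs_reg32 from the text above. */
struct demo_reg32 {
    const char *name;
    unsigned long offset;
};

/* Hypothetical register map: each macro is a byte offset from the base. */
#define DEMO_CTRL   0x00
#define DEMO_STATUS 0x04
#define DEMO_IRQEN  0x08

/* Stringify the short name and pair it with the matching offset macro. */
#define DUMP_REG(nm) { #nm, DEMO_##nm }

static const struct demo_reg32 demo_regs[] = {
    DUMP_REG(CTRL),
    DUMP_REG(STATUS),
    DUMP_REG(IRQEN),
};

/*
 * Format one "prefix+name = 0xvalue" line per register into buf.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
static int demo_print_regs32(char *buf, size_t len,
                             const struct demo_reg32 *regs, int nregs,
                             const unsigned char *base, const char *prefix)
{
    size_t used = 0;
    int i;

    if (len == 0)
        return 0;
    buf[0] = '\0';
    for (i = 0; i < nregs; i++) {
        uint32_t val;
        int n;

        /* memcpy avoids alignment assumptions; a driver would use readl(). */
        memcpy(&val, base + regs[i].offset, sizeof(val));
        n = snprintf(buf + used, len - used, "%s%s = 0x%08x\n",
                     prefix, regs[i].name, (unsigned int)val);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* out of space: keep only full lines */
            break;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Keeping the name and the offset in one macro invocation means the two cannot drift apart when registers are added or renamed.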
+
+
There are a couple of other directory-oriented helper functions:
struct dentry *debugfs_rename(struct dentry *old_dir,
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt
index 07235caec22c..a6619b7064b9 100644
--- a/Documentation/filesystems/sysfs.txt
+++ b/Documentation/filesystems/sysfs.txt
@@ -70,7 +70,7 @@ An attribute definition is simply:
struct attribute {
char * name;
struct module *owner;
- mode_t mode;
+ umode_t mode;
};
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 43cbd0821721..3d9393b845b8 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -225,7 +225,7 @@ struct super_operations {
void (*clear_inode) (struct inode *);
void (*umount_begin) (struct super_block *);
- int (*show_options)(struct seq_file *, struct vfsmount *);
+ int (*show_options)(struct seq_file *, struct dentry *);
ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
@@ -341,14 +341,14 @@ This describes how the VFS can manipulate an inode in your
filesystem. As of kernel 2.6.22, the following members are defined:
struct inode_operations {
- int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
+ int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *);
struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
int (*symlink) (struct inode *,struct dentry *,const char *);
- int (*mkdir) (struct inode *,struct dentry *,int);
+ int (*mkdir) (struct inode *,struct dentry *,umode_t);
int (*rmdir) (struct inode *,struct dentry *);
- int (*mknod) (struct inode *,struct dentry *,int,dev_t);
+ int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
int (*rename) (struct inode *, struct dentry *,
struct inode *, struct dentry *);
int (*readlink) (struct dentry *, char __user *,int);
diff --git a/Documentation/hwmon/pmbus b/Documentation/hwmon/pmbus
index 15ac911ce51b..d28b591753d1 100644
--- a/Documentation/hwmon/pmbus
+++ b/Documentation/hwmon/pmbus
@@ -2,9 +2,8 @@ Kernel driver pmbus
====================
Supported chips:
- * Ericsson BMR45X series
- DC/DC Converter
- Prefixes: 'bmr450', 'bmr451', 'bmr453', 'bmr454'
+ * Ericsson BMR453, BMR454
+ Prefixes: 'bmr453', 'bmr454'
Addresses scanned: -
Datasheet:
http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146395
diff --git a/Documentation/hwmon/zl6100 b/Documentation/hwmon/zl6100
index 7617798b5c97..51f76a189fee 100644
--- a/Documentation/hwmon/zl6100
+++ b/Documentation/hwmon/zl6100
@@ -6,6 +6,10 @@ Supported chips:
Prefix: 'zl2004'
Addresses scanned: -
Datasheet: http://www.intersil.com/data/fn/fn6847.pdf
+ * Intersil / Zilker Labs ZL2005
+ Prefix: 'zl2005'
+ Addresses scanned: -
+ Datasheet: http://www.intersil.com/data/fn/fn6848.pdf
* Intersil / Zilker Labs ZL2006
Prefix: 'zl2006'
Addresses scanned: -
@@ -30,6 +34,17 @@ Supported chips:
Prefix: 'zl6105'
Addresses scanned: -
Datasheet: http://www.intersil.com/data/fn/fn6906.pdf
+ * Ericsson BMR450, BMR451
+ Prefix: 'bmr450', 'bmr451'
+ Addresses scanned: -
+ Datasheet:
+http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146401
+ * Ericsson BMR462, BMR463, BMR464
+ Prefixes: 'bmr462', 'bmr463', 'bmr464'
+ Addresses scanned: -
+ Datasheet:
+http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146256
+
Author: Guenter Roeck <guenter.roeck@ericsson.com>
diff --git a/Documentation/i2c/ten-bit-addresses b/Documentation/i2c/ten-bit-addresses
index e9890709c508..cdfe13901b99 100644
--- a/Documentation/i2c/ten-bit-addresses
+++ b/Documentation/i2c/ten-bit-addresses
@@ -1,22 +1,24 @@
The I2C protocol knows about two kinds of device addresses: normal 7 bit
addresses, and an extended set of 10 bit addresses. The sets of addresses
do not intersect: the 7 bit address 0x10 is not the same as the 10 bit
-address 0x10 (though a single device could respond to both of them). You
-select a 10 bit address by adding an extra byte after the address
-byte:
- S Addr7 Rd/Wr ....
-becomes
- S 11110 Addr10 Rd/Wr
-S is the start bit, Rd/Wr the read/write bit, and if you count the number
-of bits, you will see the there are 8 after the S bit for 7 bit addresses,
-and 16 after the S bit for 10 bit addresses.
+address 0x10 (though a single device could respond to both of them).
-WARNING! The current 10 bit address support is EXPERIMENTAL. There are
-several places in the code that will cause SEVERE PROBLEMS with 10 bit
-addresses, even though there is some basic handling and hooks. Also,
-almost no supported adapter handles the 10 bit addresses correctly.
+I2C messages to and from 10-bit address devices have a different format.
+See the I2C specification for the details.
-As soon as a real 10 bit address device is spotted 'in the wild', we
-can and will add proper support. Right now, 10 bit address devices
-are defined by the I2C protocol, but we have never seen a single device
-which supports them.
+The current 10 bit address support is minimal. It should work, however
+you can expect some problems along the way:
+* Not all bus drivers support 10-bit addresses. Some don't because the
+ hardware doesn't support them (SMBus doesn't require 10-bit address
+ support for example), some don't because nobody bothered adding the
+ code (or it's there but not working properly.) Software implementation
+ (i2c-algo-bit) is known to work.
+* Some optional features do not support 10-bit addresses. This is the
+ case of automatic detection and instantiation of devices by their
+ drivers, for example.
+* Many user-space packages (for example i2c-tools) lack support for
+ 10-bit addresses.
+
+Note that 10-bit address devices are still pretty rare, so the limitations
+listed above could stay for a long time, maybe even forever if nobody
+needs them to be fixed.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a0c5c5f4fce6..e229769606f2 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -315,12 +315,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
CPU-intensive style benchmark, and it can vary highly in
a microbenchmark depending on workload and compiler.
- 1: only for 32-bit processes
- 2: only for 64-bit processes
+ 32: only for 32-bit processes
+ 64: only for 64-bit processes
on: enable for both 32- and 64-bit processes
off: disable for both 32- and 64-bit processes
- amd_iommu= [HW,X86-84]
+ amd_iommu= [HW,X86-64]
Pass parameters to the AMD IOMMU driver in the system.
Possible values are:
fullflush - enable flushing of IO/TLB entries when
@@ -1885,6 +1885,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
arch_perfmon: [X86] Force use of architectural
perfmon on Intel CPUs instead of the
CPU specific event set.
+ timer: [X86] Force use of architectural NMI
+ timer mode (see also oprofile.timer
+ for generic hr timer mode)
+ [s390] Force legacy basic mode sampling
+ (report cpu_type "timer")
oops=panic Always panic on oopses. Default is to just kill the
process, but there is a small probability of
@@ -2750,11 +2755,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
functions are at fixed addresses, they make nice
targets for exploits that can control RIP.
- emulate Vsyscalls turn into traps and are emulated
- reasonably safely.
+ emulate [default] Vsyscalls turn into traps and are
+ emulated reasonably safely.
- native [default] Vsyscalls are native syscall
- instructions.
+ native Vsyscalls are native syscall instructions.
This is a little bit faster than trapping
and makes a few dynamic recompilers work
better than they would in emulation mode.
diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
index abf768c681e2..5dbc99c04f6e 100644
--- a/Documentation/lockdep-design.txt
+++ b/Documentation/lockdep-design.txt
@@ -221,3 +221,66 @@ when the chain is validated for the first time, is then put into a hash
table, which hash-table can be checked in a lockfree manner. If the
locking chain occurs again later on, the hash table tells us that we
dont have to validate the chain again.
+
+Troubleshooting:
+----------------
+
+The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
+Exceeding this number will trigger the following lockdep warning:
+
+ (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
+
+By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
+desktop systems have fewer than 1,000 lock classes, so this warning
+normally results from lock-class leakage or failure to properly
+initialize locks. These two problems are illustrated below:
+
+1. Repeated module loading and unloading while running the validator
+ will result in lock-class leakage. The issue here is that each
+ load of the module will create a new set of lock classes for
+ that module's locks, but module unloading does not remove old
+ classes (see below discussion of reuse of lock classes for why).
+ Therefore, if that module is loaded and unloaded repeatedly,
+ the number of lock classes will eventually reach the maximum.
+
+2. Using structures such as arrays that have large numbers of
+ locks that are not explicitly initialized. For example,
+ a hash table with 8192 buckets where each bucket has its own
+ spinlock_t will consume 8192 lock classes -unless- each spinlock
+ is explicitly initialized at runtime, for example, using the
+ run-time spin_lock_init() as opposed to compile-time initializers
+ such as __SPIN_LOCK_UNLOCKED(). Failure to properly initialize
+ the per-bucket spinlocks would guarantee lock-class overflow.
+ In contrast, a loop that called spin_lock_init() on each lock
+ would place all 8192 locks into a single lock class.
+
+ The moral of this story is that you should always explicitly
+ initialize your locks.
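
The difference between the two initialization styles in item 2 can be modelled in userspace. The sketch below is only a simulation under stated assumptions: demo_lock, demo_lock_init() and the helper functions are hypothetical names, and the model assumes that a run-time init macro plants one static key per call site while a lock that was never run-time initialized falls back to its own address as its class key.

```c
#include <stddef.h>

#define NBUCKETS 8192

/* Stand-in for a lock; the validator would derive a class from 'key'. */
struct demo_lock {
    const void *key;    /* NULL means "never run-time initialized" */
};

/*
 * Modelled on a run-time init macro: one static key object per call
 * site, so every lock initialized from that site shares the same key.
 */
#define demo_lock_init(l)               \
    do {                                \
        static char demo_key;           \
        (l)->key = &demo_key;           \
    } while (0)

/* Assumed fallback: a lock without a key is keyed by its own address. */
static const void *demo_class_key(const struct demo_lock *l)
{
    return l->key ? l->key : (const void *)l;
}

/* Count distinct class keys among n locks (quadratic, fine for a demo). */
static size_t demo_count_classes(const struct demo_lock *locks, size_t n)
{
    size_t i, j, classes = 0;

    for (i = 0; i < n; i++) {
        for (j = 0; j < i; j++)
            if (demo_class_key(&locks[j]) == demo_class_key(&locks[i]))
                break;
        if (j == i)
            classes++;  /* first time this key is seen */
    }
    return classes;
}

/* Initialize every bucket lock from a single call site. */
static void demo_init_all(struct demo_lock *locks, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        demo_lock_init(&locks[i]);
}
```

In this model a zero-initialized table of NBUCKETS locks counts as NBUCKETS classes, and the same table counts as one class after demo_init_all() has run.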
+
+One might argue that the validator should be modified to allow
+lock classes to be reused. However, if you are tempted to make this
+argument, first review the code and think through the changes that would
+be required, keeping in mind that the lock classes to be removed are
+likely to be linked into the lock-dependency graph. This turns out to
+be harder to do than to say.
+
+Of course, if you do run out of lock classes, the next thing to do is
+to find the offending lock classes. First, the following command gives
+you the number of lock classes currently in use along with the maximum:
+
+ grep "lock-classes" /proc/lockdep_stats
+
+This command produces the following output on a modest system:
+
+ lock-classes: 748 [max: 8191]
+
+If the number allocated (748 above) increases continually over time,
+then there is likely a leak. The following command can be used to
+identify the leaking lock classes:
+
+ grep "BD" /proc/lockdep
+
+Run the command and save the output, then compare against the output from
+a later run of this command to identify the leakers. This same output
+can also help you find situations where runtime lock initialization has
+been omitted.
diff --git a/Documentation/md.txt b/Documentation/md.txt
index fc94770f44ab..993fba37b7d1 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -357,14 +357,14 @@ Each directory contains:
written to, that device.
state
- A file recording the current state of the device in the array
+ A file recording the current state of the device in the array
which can be a comma separated list of
faulty - device has been kicked from active use due to
- a detected fault or it has unacknowledged bad
- blocks
+ a detected fault, or it has unacknowledged bad
+ blocks
in_sync - device is a fully in-sync member of the array
writemostly - device will only be subject to read
- requests if there are no other options.
+ requests if there are no other options.
This applies only to raid1 arrays.
blocked - device has failed, and the failure hasn't been
acknowledged yet by the metadata handler.
@@ -374,6 +374,13 @@ Each directory contains:
This includes spares that are in the process
of being recovered to
write_error - device has ever seen a write error.
+ want_replacement - device is (mostly) working but probably
+ should be replaced, either due to errors or
+ due to user request.
+ replacement - device is a replacement for another active
+ device with same raid_disk.
+
+
This list may grow in future.
This can be written to.
Writing "faulty" simulates a failure on the device.
@@ -386,6 +393,13 @@ Each directory contains:
Writing "in_sync" sets the in_sync flag.
Writing "write_error" sets writeerrorseen flag.
Writing "-write_error" clears writeerrorseen flag.
+ Writing "want_replacement" is allowed at any time except to a
+ replacement device or a spare. It sets the flag.
+ Writing "-want_replacement" is allowed at any time. It clears
+ the flag.
+ Writing "replacement" or "-replacement" is only allowed before
+ starting the array. It sets or clears the flag.
+
This file responds to select/poll. Any change to 'faulty'
or 'blocked' causes an event.
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index bbce1215434a..9ad9ddeb384c 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -144,6 +144,8 @@ nfc.txt
- The Linux Near Field Communication (NFC) subsystem.
olympic.txt
- IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
+openvswitch.txt
+ - Open vSwitch developer documentation.
operstates.txt
- Overview of network interface operational states.
packet_mmap.txt
diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt
index c86d03f18a5b..221ad0cdf11f 100644
--- a/Documentation/networking/batman-adv.txt
+++ b/Documentation/networking/batman-adv.txt
@@ -200,15 +200,16 @@ abled during run time. Following log_levels are defined:
0 - All debug output disabled
1 - Enable messages related to routing / flooding / broadcasting
-2 - Enable route or tt entry added / changed / deleted
-3 - Enable all messages
+2 - Enable messages related to route added / changed / deleted
+4 - Enable messages related to translation table operations
+7 - Enable all messages
The debug output can be changed at runtime using the file
/sys/class/net/bat0/mesh/log_level. e.g.
# echo 2 > /sys/class/net/bat0/mesh/log_level
-will enable debug messages for when routes or TTs change.
+will enable debug messages for when routes change.
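
The values 1, 2, 4 and 7 suggest that the log levels are bit flags that can be combined, although the text above does not say so outright. The following sketch rests on that assumption; the DEMO_LOG_* names and demo_log_enabled() are hypothetical and only the numeric values come from the list above.

```c
/* Hypothetical flag names; the values are taken from the log_level list. */
enum demo_log_level {
    DEMO_LOG_ROUTING = 1,   /* routing / flooding / broadcasting */
    DEMO_LOG_ROUTES  = 2,   /* route added / changed / deleted */
    DEMO_LOG_TT      = 4,   /* translation table operations */
    DEMO_LOG_ALL     = 7    /* all of the above */
};

/* Return nonzero when messages of 'type' are enabled by 'log_level'. */
static int demo_log_enabled(unsigned int log_level, unsigned int type)
{
    return (log_level & type) != 0;
}
```

Under this reading, writing 6 to log_level would enable route and translation table messages but not the routing ones.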
BATCTL
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 91df678fb7f8..080ad26690ae 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -196,6 +196,23 @@ or, for backwards compatibility, the option value. E.g.,
The parameters are as follows:
+active_slave
+
+ Specifies the new active slave for modes that support it
+ (active-backup, balance-alb and balance-tlb). Possible values
+ are the name of any currently enslaved interface, or an empty
+ string. If a name is given, the slave and its link must be up in order
+ to be selected as the new active slave. If an empty string is
+ specified, the current active slave is cleared, and a new active
+ slave is selected automatically.
+
+ Note that this is only available through the sysfs interface. No module
+ parameter by this name exists.
+
+ The normal value of this option is the name of the currently
+ active slave, or the empty string if there is no active slave or
+ the current mode does not use an active slave.
+
ad_select
Specifies the 802.3ad aggregation selection logic to use. The
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index f41ea2405220..1dc1c24a7547 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -78,3 +78,30 @@ in software. This is currently WIP.
See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
+6LoWPAN Linux implementation
+============================
+
+The IEEE 802.15.4 standard specifies an MTU of 127 bytes, yielding about 80
+octets of actual MAC payload once security is turned on, on a wireless link
+with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
+[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
+taking into account limited bandwidth, memory, or energy resources that are
+expected in applications such as wireless Sensor Networks. [RFC4944] defines
+a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
+to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
+compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
+relatively large IPv6 and UDP headers down to (in the best case) several bytes.
+
+In September 2011 an update to the standard was published: [RFC6282].
+It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
+used in this Linux implementation.
+
+All the code related to 6lowpan can be found in: net/ieee802154/6lowpan.*
+
+To set up a 6lowpan interface you need (busybox release > 1.17.0):
+1. Add IEEE802.15.4 interface and initialize PANid;
+2. Add 6lowpan interface by command like:
+ # ip link add link wpan0 name lowpan0 type lowpan
+3. Set MAC (if needed):
+ # ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
+4. Bring up 'lowpan0' interface
diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c
index 65968fbf1e49..ac5debb2f16c 100644
--- a/Documentation/networking/ifenslave.c
+++ b/Documentation/networking/ifenslave.c
@@ -539,12 +539,14 @@ static int if_getconfig(char *ifname)
metric = 0;
} else
metric = ifr.ifr_metric;
+ printf("The result of SIOCGIFMETRIC is %d\n", metric);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0)
mtu = 0;
else
mtu = ifr.ifr_mtu;
+ printf("The result of SIOCGIFMTU is %d\n", mtu);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) {
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index cb7f3148035d..ad3e80e17b4f 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -20,7 +20,7 @@ ip_no_pmtu_disc - BOOLEAN
default FALSE
min_pmtu - INTEGER
- default 562 - minimum discovered Path MTU
+ default 552 - minimum discovered Path MTU
route/max_size - INTEGER
Maximum number of routes allowed in the kernel. Increase
@@ -31,6 +31,16 @@ neigh/default/gc_thresh3 - INTEGER
when using large numbers of interfaces and when communicating
with large numbers of directly-connected peers.
+neigh/default/unres_qlen_bytes - INTEGER
+ The maximum number of bytes which may be used by packets
+ queued for each unresolved address by other network layers.
+ (added in linux 3.3)
+
+neigh/default/unres_qlen - INTEGER
+ The maximum number of packets which may be queued for each
+ unresolved address by other network layers.
+ (deprecated in linux 3.3) : use unres_qlen_bytes instead.
+
mtu_expires - INTEGER
Time, in seconds, that cached PMTU information is kept.
@@ -165,6 +175,9 @@ tcp_congestion_control - STRING
connections. The algorithm "reno" is always available, but
additional choices may be available based on kernel configuration.
Default is set as part of kernel configuration.
+ For passive connections, the listener congestion control choice
+ is inherited.
+ [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
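
The setsockopt() call referenced above can be wrapped as follows. This is a minimal sketch for Linux userspace, not part of the kernel tree: demo_set_congestion() and demo_get_congestion() are hypothetical helper names, and IPPROTO_TCP is used as the level because it has the same value as SOL_TCP.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Select congestion control algorithm 'name' on a TCP socket, for
 * example a listening socket whose accepted connections should inherit
 * the choice. Returns 0 on success, -1 with errno set on failure.
 */
static int demo_set_congestion(int fd, const char *name)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name,
                      (socklen_t)strlen(name));
}

/* Read back the algorithm in use; buf should hold at least 16 bytes. */
static int demo_get_congestion(int fd, char *buf, socklen_t len)
{
    memset(buf, 0, len);
    return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
}
```

Calling demo_set_congestion(listenfd, "reno") before listen() is the usual pattern; algorithms other than "reno" depend on the kernel configuration, as noted above.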
tcp_cookie_size - INTEGER
Default size of TCP Cookie Transactions (TCPCT) option, that may be
@@ -282,11 +295,11 @@ tcp_max_ssthresh - INTEGER
Default: 0 (off)
tcp_max_syn_backlog - INTEGER
- Maximal number of remembered connection requests, which are
- still did not receive an acknowledgment from connecting client.
- Default value is 1024 for systems with more than 128Mb of memory,
- and 128 for low memory machines. If server suffers of overload,
- try to increase this number.
+ Maximal number of remembered connection requests, which have not
+ received an acknowledgment from connecting client.
+ The minimal value is 128 for low memory machines, and it will
+ increase in proportion to the memory of machine.
+ If server suffers from overload, try increasing this number.
tcp_max_tw_buckets - INTEGER
Maximal number of timewait sockets held by system simultaneously.
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
new file mode 100644
index 000000000000..b8a048b8df3a
--- /dev/null
+++ b/Documentation/networking/openvswitch.txt
@@ -0,0 +1,195 @@
+Open vSwitch datapath developer documentation
+=============================================
+
+The Open vSwitch kernel module allows flexible userspace control over
+flow-level packet processing on selected network devices. It can be
+used to implement a plain Ethernet switch, network device bonding,
+VLAN processing, network access control, flow-based network control,
+and so on.
+
+The kernel module implements multiple "datapaths" (analogous to
+bridges), each of which can have multiple "vports" (analogous to ports
+within a bridge). Each datapath also has associated with it a "flow
+table" that userspace populates with "flows" that map from keys based
+on packet headers and metadata to sets of actions. The most common
+action forwards the packet to another vport; other actions are also
+implemented.
+
+When a packet arrives on a vport, the kernel module processes it by
+extracting its flow key and looking it up in the flow table. If there
+is a matching flow, it executes the associated actions. If there is
+no match, it queues the packet to userspace for processing (as part of
+its processing, userspace will likely set up a flow to handle further
+packets of the same type entirely in-kernel).
+
+
+Flow key compatibility
+----------------------
+
+Network protocols evolve over time. New protocols become important
+and existing protocols lose their prominence. For the Open vSwitch
+kernel module to remain relevant, it must be possible for newer
+versions to parse additional protocols as part of the flow key. It
+might even be desirable, someday, to drop support for parsing
+protocols that have become obsolete. Therefore, the Netlink interface
+to Open vSwitch is designed to allow carefully written userspace
+applications to work with any version of the flow key, past or future.
+
+To support this forward and backward compatibility, whenever the
+kernel module passes a packet to userspace, it also passes along the
+flow key that it parsed from the packet. Userspace then extracts its
+own notion of a flow key from the packet and compares it against the
+kernel-provided version:
+
+ - If userspace's notion of the flow key for the packet matches the
+ kernel's, then nothing special is necessary.
+
+ - If the kernel's flow key includes more fields than the userspace
+ version of the flow key, for example if the kernel decoded IPv6
+ headers but userspace stopped at the Ethernet type (because it
+ does not understand IPv6), then again nothing special is
+ necessary. Userspace can still set up a flow in the usual way,
+ as long as it uses the kernel-provided flow key to do it.
+
+ - If the userspace flow key includes more fields than the
+ kernel's, for example if userspace decoded an IPv6 header but
+ the kernel stopped at the Ethernet type, then userspace can
+ forward the packet manually, without setting up a flow in the
+ kernel. This case is bad for performance because every packet
+ that the kernel considers part of the flow must go to userspace,
+ but the forwarding behavior is correct. (If userspace can
+ determine that the values of the extra fields would not affect
+ forwarding behavior, then it could set up a flow anyway.)
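
The three cases above reduce to one question: did userspace parse any field that the kernel-provided key lacks? The sketch below models that decision with one bit per attribute type. The DEMO_ATTR_* bits, enum demo_action and demo_decide() are hypothetical and are not the real OVS_KEY_ATTR_* values; the optional optimisation from the last case (installing a flow when the extra fields cannot affect forwarding) is left out.

```c
/* Hypothetical bit per flow key attribute type. */
enum demo_key_attr {
    DEMO_ATTR_IN_PORT  = 1 << 0,
    DEMO_ATTR_ETH      = 1 << 1,
    DEMO_ATTR_ETH_TYPE = 1 << 2,
    DEMO_ATTR_IPV4     = 1 << 3,
    DEMO_ATTR_IPV6     = 1 << 4,
    DEMO_ATTR_TCP      = 1 << 5
};

enum demo_action {
    DEMO_INSTALL_FLOW,      /* set up a flow using the kernel-provided key */
    DEMO_FORWARD_MANUALLY   /* handle each packet in userspace, no flow */
};

/* Decide how userspace should react to a packet passed up by the kernel. */
static enum demo_action demo_decide(unsigned int kernel_attrs,
                                    unsigned int user_attrs)
{
    /* Fields userspace parsed that the kernel key does not contain. */
    unsigned int user_only = user_attrs & ~kernel_attrs;

    if (user_only)
        return DEMO_FORWARD_MANUALLY;
    /* Keys match, or the kernel saw more: the kernel key is safe to use. */
    return DEMO_INSTALL_FLOW;
}
```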
+
+How flow keys evolve over time is important to making this work, so
+the following sections go into detail.
+
+
+Flow key format
+---------------
+
+A flow key is passed over a Netlink socket as a sequence of Netlink
+attributes. Some attributes represent packet metadata, defined as any
+information about a packet that cannot be extracted from the packet
+itself, e.g. the vport on which the packet was received. Most
+attributes, however, are extracted from headers within the packet,
+e.g. source and destination addresses from Ethernet, IP, or TCP
+headers.
+
+The <linux/openvswitch.h> header file defines the exact format of the
+flow key attributes. For informal explanatory purposes here, we write
+them as comma-separated strings, with parentheses indicating arguments
+and nesting. For example, the following could represent a flow key
+corresponding to a TCP packet that arrived on vport 1:
+
+ in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
+ eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=6, tos=0,
+ frag=no), tcp(src=49163, dst=80)
+
+Often we ellipsize arguments not important to the discussion, e.g.:
+
+ in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
+
+
+Basic rule for evolving flow keys
+---------------------------------
+
+Some care is needed to really maintain forward and backward
+compatibility for applications that follow the rules listed under
+"Flow key compatibility" above.
+
+The basic rule is obvious:
+
+ ------------------------------------------------------------------
+ New network protocol support must only supplement existing flow
+ key attributes. It must not change the meaning of already defined
+ flow key attributes.
+ ------------------------------------------------------------------
+
+This rule does have less-obvious consequences so it is worth working
+through a few examples. Suppose, for example, that the kernel module
+did not already implement VLAN parsing. Instead, it just interpreted
+the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
+packet. The flow key for any packet with an 802.1Q header would look
+essentially like this, ignoring metadata:
+
+ eth(...), eth_type(0x8100)
+
+Naively, to add VLAN support, it makes sense to add a new "vlan" flow
+key attribute to contain the VLAN tag, then continue to decode the
+encapsulated headers beyond the VLAN tag using the existing field
+definitions. With this change, a TCP packet in VLAN 10 would have a
+flow key much like this:
+
+ eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
+
+But this change would negatively affect a userspace application that
+has not been updated to understand the new "vlan" flow key attribute.
+The application could, following the flow compatibility rules above,
+ignore the "vlan" attribute that it does not understand and therefore
+assume that the flow contained IP packets. This is a bad assumption
+(the flow only contains IP packets if one parses and skips over the
+802.1Q header) and it could cause the application's behavior to change
+across kernel versions even though it follows the compatibility rules.
+
+The solution is to use a set of nested attributes. This is, for
+example, why 802.1Q support uses nested attributes. A TCP packet in
+VLAN 10 is actually expressed as:
+
+ eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
+ ip(proto=6, ...), tcp(...))
+
+Notice how the "eth_type", "ip", and "tcp" flow key attributes are
+nested inside the "encap" attribute. Thus, an application that does
+not understand the "vlan" key will not see any of those attributes
+and therefore will not misinterpret them. (Also, the outer eth_type
+is still 0x8100, not changed to 0x0800.)
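The difference between the flat and the nested layout can be sketched with a toy model. This is only an illustration under stated assumptions: the attribute enum, struct attr and the function names are hypothetical and do not correspond to the real netlink attribute encoding. It models an application that predates "vlan" and "encap" and skips whatever it does not recognize.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types for this sketch; not the real netlink values. */
enum attr_type { A_ETH, A_ETH_TYPE, A_VLAN, A_ENCAP, A_IP, A_TCP };

struct attr {
	enum attr_type type;
	uint16_t value;		/* used by A_ETH_TYPE (and the VLAN id) only */
};

/*
 * An application written before "vlan" and "encap" existed: it walks one
 * nesting level, skips attribute types it does not know, and concludes that
 * the flow carries IP if it sees eth_type(0x0800) together with ip(...).
 */
int old_app_sees_ip(const struct attr *key, size_t n)
{
	int ip_ethertype = 0, has_ip = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (key[i].type == A_ETH_TYPE)
			ip_ethertype = (key[i].value == 0x0800);
		else if (key[i].type == A_IP)
			has_ip = 1;
		/* A_VLAN and A_ENCAP are unknown to this application: skipped. */
	}
	return ip_ethertype && has_ip;
}

/* Naive flat layout: eth, vlan, eth_type(0x0800), ip, tcp. */
int flat_layout_misleads(void)
{
	const struct attr key[] = {
		{ A_ETH, 0 }, { A_VLAN, 10 }, { A_ETH_TYPE, 0x0800 },
		{ A_IP, 0 }, { A_TCP, 0 },
	};
	return old_app_sees_ip(key, sizeof(key) / sizeof(key[0]));
}

/*
 * Nested layout: eth, eth_type(0x8100), vlan, encap(...). The inner
 * attributes live inside encap's payload and are not visible at this level.
 */
int nested_layout_misleads(void)
{
	const struct attr key[] = {
		{ A_ETH, 0 }, { A_ETH_TYPE, 0x8100 }, { A_VLAN, 10 }, { A_ENCAP, 0 },
	};
	return old_app_sees_ip(key, sizeof(key) / sizeof(key[0]));
}
```

With the flat layout the old application wrongly concludes that the flow carries plain IP; with the nested layout it only sees an Ethertype of 0x8100 and draws no such conclusion.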
+
+Handling malformed packets
+--------------------------
+
+Don't drop packets in the kernel for malformed protocol headers, bad
+checksums, etc. This would prevent userspace from implementing a
+simple Ethernet switch that forwards every packet.
+
+Instead, in such a case, include an attribute with "empty" content.
+It doesn't matter if the empty content could be valid protocol values,
+as long as those values are rarely seen in practice, because userspace
+can always arrange for all packets with those values to be sent up to
+userspace and handle them individually.
+
+For example, consider a packet that contains an IP header that
+indicates protocol 6 for TCP, but which is truncated just after the IP
+header, so that the TCP header is missing. The flow key for this
+packet would include a tcp attribute with all-zero src and dst, like
+this:
+
+ eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
+
+As another example, consider a packet with an Ethernet type of 0x8100,
+indicating that a VLAN TCI should follow, but which is truncated just
+after the Ethernet type. The flow key for this packet would include
+an all-zero-bits vlan and an empty encap attribute, like this:
+
+ eth(...), eth_type(0x8100), vlan(0), encap()
+
+Unlike a TCP packet with source and destination ports 0, an
+all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
+VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
+attribute expressly to allow this situation to be distinguished.
+Thus, the flow key in this second example unambiguously indicates a
+missing or malformed VLAN TCI.
+
+Other rules
+-----------
+
+The other rules for flow keys are much less subtle:
+
+ - Duplicate attributes are not allowed at a given nesting level.
+
+ - Ordering of attributes is not significant.
+
+ - When the kernel sends a given flow key to userspace, it always
+ composes it the same way. This allows userspace to hash and
+ compare entire flow keys that it may not be able to fully
+ interpret.
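Because the kernel always composes a given flow key the same way, userspace can treat a key as an opaque byte string. A minimal sketch of that idea follows; the function names are hypothetical and the choice of FNV-1a as the hash is an assumption made for illustration, not something the text above prescribes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a over the raw bytes of a flow key (hash choice is illustrative). */
uint32_t flow_key_hash(const void *key, size_t len)
{
	const unsigned char *p = key;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* Two keys match only if they are byte-for-byte identical. */
int flow_key_equal(const void *a, size_t a_len, const void *b, size_t b_len)
{
	return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

Neither function needs to understand any attribute inside the key, which is the point of the last rule: a hash table of flows keeps working even when the key contains attributes the application cannot interpret.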
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 4acea6603720..1c08a4b0981f 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -155,7 +155,7 @@ As capture, each frame contains two parts:
/* fill sockaddr_ll struct to prepare binding */
my_addr.sll_family = AF_PACKET;
- my_addr.sll_protocol = ETH_P_ALL;
+ my_addr.sll_protocol = htons(ETH_P_ALL);
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
/* bind socket to eth0 */
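The fix above matters because sll_protocol is expected in network byte order. As a small self-contained sketch (the helper name fill_bind_addr is hypothetical, and the interface index is assumed to have been obtained elsewhere, for example via SIOCGIFINDEX), the address could be prepared like this:

```c
#include <arpa/inet.h>		/* htons() */
#include <linux/if_ether.h>	/* ETH_P_ALL */
#include <linux/if_packet.h>	/* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>		/* AF_PACKET */

/* Prepare a sockaddr_ll for bind() on a packet socket. */
void fill_bind_addr(struct sockaddr_ll *addr, int ifindex)
{
	memset(addr, 0, sizeof(*addr));
	addr->sll_family = AF_PACKET;
	addr->sll_protocol = htons(ETH_P_ALL);	/* network byte order */
	addr->sll_ifindex = ifindex;
}
```

On a little-endian machine, leaving out htons() stores a different 16-bit value than the one intended, so the bound protocol would not be ETH_P_ALL.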
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index a177de21d28e..579994afbe06 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current
CPU's backlog when a packet in this flow was last enqueued. Each backlog
queue has a head counter that is incremented on dequeue. A tail counter
is computed as head counter + queue length. In other words, the counter
-in rps_dev_flow_table[i] records the last element in flow i that has
+in rps_dev_flow[i] records the last element in flow i that has
been enqueued onto the currently designated CPU for flow i (of course,
entry i is actually selected by hash and multiple flows may hash to the
same entry i).
@@ -224,7 +224,7 @@ following is true:
- The current CPU's queue head counter >= the recorded tail counter
value in rps_dev_flow[i]
-- The current CPU is unset (equal to NR_CPUS)
+- The current CPU is unset (equal to RPS_NO_CPU)
- The current CPU is offline
After this check, the packet is sent to the (possibly updated) current
@@ -235,7 +235,7 @@ CPU.
==== RFS Configuration
-RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on
+RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
by default for SMP). The functionality remains disabled until explicitly
configured. The number of entries in the global flow table is set through:
@@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue
would normally be configured to the same value as rps_sock_flow_entries.
For a multi-queue device, the rps_flow_cnt for each queue might be
configured as rps_sock_flow_entries / N, where N is the number of
-queues. So for instance, if rps_flow_entries is set to 32768 and there
+queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
are 16 configured receive queues, rps_flow_cnt for each queue might be
configured as 2048.
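The sizing rule and the CPU-switch conditions touched by this patch can be written down as a short sketch. Everything here is illustrative: the function names are hypothetical, SKETCH_NO_CPU merely stands in for RPS_NO_CPU (its value is an assumption), and the wraparound-tolerant counter comparison is a design choice of this sketch rather than something scaling.txt states.

```c
#include <stdbool.h>

#define SKETCH_NO_CPU 0xffffu	/* stand-in for RPS_NO_CPU; value assumed */

/* rps_flow_cnt for each queue: rps_sock_flow_entries / N. */
unsigned int per_queue_flow_cnt(unsigned int sock_flow_entries,
				unsigned int n_queues)
{
	return n_queues ? sock_flow_entries / n_queues : 0;
}

/*
 * A flow may move to the desired CPU if any of the listed conditions holds:
 * the current CPU is unset, it is offline, or its queue head counter has
 * reached the tail counter recorded for the flow.
 */
bool may_switch_cpu(unsigned int cur_cpu, bool cur_cpu_online,
		    unsigned int head, unsigned int recorded_tail)
{
	if (cur_cpu == SKETCH_NO_CPU || !cur_cpu_online)
		return true;
	/* Signed difference so the check tolerates counter wraparound. */
	return (int)(head - recorded_tail) >= 0;
}
```

For the example in the text, per_queue_flow_cnt(32768, 16) yields 2048.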
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 8d67980fabe8..d0aeeadd264b 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -4,14 +4,16 @@ Copyright (C) 2007-2010 STMicroelectronics Ltd
Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
-(Synopsys IP blocks); it has been fully tested on STLinux platforms.
+(Synopsys IP blocks).
Currently this network device driver is for all STM embedded MAC/GMAC
-(i.e. 7xxx/5xxx SoCs) and it's known working on other platforms i.e. ARM SPEAr.
+(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XILINX XC2V3000
+FF1152AMT0221 D1215994A VIRTEX FPGA board.
-DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100
-Universal version 4.0 have been used for developing the first code
-implementation.
+DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether MAC 10/100
+Universal version 4.0 have been used for developing this driver.
+
+This driver supports both the platform bus and PCI.
Please, for more information also visit: www.stlinux.com
@@ -277,5 +279,5 @@ In fact, these can generate an huge amount of debug messages.
6) TODO:
o XGMAC is not supported.
- o Review the timer optimisation code to use an embedded device that will be
- available in new chip generations.
+ o Add the EEE - Energy Efficient Ethernet
+ o Add the PTP - precision time protocol
diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 000000000000..5a013686b9ea
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via the libteam library, available here:
+ https://github.com/jpirko/libteam
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
index 646a89e0c07d..20af7def23c8 100644
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
@@ -123,9 +123,12 @@ please refer directly to the source code for more information about it.
Subsystem-Level Methods
-----------------------
The core methods to suspend and resume devices reside in struct dev_pm_ops
-pointed to by the pm member of struct bus_type, struct device_type and
-struct class. They are mostly of interest to the people writing infrastructure
-for buses, like PCI or USB, or device type and device class drivers.
+pointed to by the ops member of struct dev_pm_domain, or by the pm member of
+struct bus_type, struct device_type and struct class. They are mostly of
+interest to the people writing infrastructure for platforms and buses, like PCI
+or USB, or device type and device class drivers. They also are relevant to the
+writers of device drivers whose subsystems (PM domains, device types, device
+classes and bus types) don't provide all power management methods.
Bus drivers implement these methods as appropriate for the hardware and the
drivers using it; PCI works differently from USB, and so on. Not many people
@@ -139,41 +142,57 @@ sequencing in the driver model tree.
/sys/devices/.../power/wakeup files
-----------------------------------
-All devices in the driver model have two flags to control handling of wakeup
-events (hardware signals that can force the device and/or system out of a low
-power state). These flags are initialized by bus or device driver code using
+All device objects in the driver model contain fields that control the handling
+of system wakeup events (hardware signals that can force the system out of a
+sleep state). These fields are initialized by bus or device driver code using
device_set_wakeup_capable() and device_set_wakeup_enable(), defined in
include/linux/pm_wakeup.h.
-The "can_wakeup" flag just records whether the device (and its driver) can
+The "power.can_wakeup" flag just records whether the device (and its driver) can
physically support wakeup events. The device_set_wakeup_capable() routine
-affects this flag. The "should_wakeup" flag controls whether the device should
-try to use its wakeup mechanism. device_set_wakeup_enable() affects this flag;
-for the most part drivers should not change its value. The initial value of
-should_wakeup is supposed to be false for the majority of devices; the major
-exceptions are power buttons, keyboards, and Ethernet adapters whose WoL
-(wake-on-LAN) feature has been set up with ethtool. It should also default
-to true for devices that don't generate wakeup requests on their own but merely
-forward wakeup requests from one bus to another (like PCI bridges).
+affects this flag. The "power.wakeup" field is a pointer to an object of type
+struct wakeup_source used for controlling whether or not the device should use
+its system wakeup mechanism and for notifying the PM core of system wakeup
+events signaled by the device. This object is only present for wakeup-capable
+devices (i.e. devices whose "can_wakeup" flags are set) and is created (or
+removed) by device_set_wakeup_capable().
Whether or not a device is capable of issuing wakeup events is a hardware
matter, and the kernel is responsible for keeping track of it. By contrast,
whether or not a wakeup-capable device should issue wakeup events is a policy
decision, and it is managed by user space through a sysfs attribute: the
-power/wakeup file. User space can write the strings "enabled" or "disabled" to
-set or clear the "should_wakeup" flag, respectively. This file is only present
-for wakeup-capable devices (i.e. devices whose "can_wakeup" flags are set)
-and is created (or removed) by device_set_wakeup_capable(). Reads from the
-file will return the corresponding string.
-
-The device_may_wakeup() routine returns true only if both flags are set.
+"power/wakeup" file. User space can write the strings "enabled" or "disabled"
+to it to indicate whether or not, respectively, the device is supposed to signal
+system wakeup. This file is only present if the "power.wakeup" object exists
+for the given device and is created (or removed) along with that object, by
+device_set_wakeup_capable(). Reads from the file will return the corresponding
+string.
+
+The "power/wakeup" file is supposed to contain the "disabled" string initially
+for the majority of devices; the major exceptions are power buttons, keyboards,
+and Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with
+ethtool. It should also default to "enabled" for devices that don't generate
+wakeup requests on their own but merely forward wakeup requests from one bus to
+another (like PCI Express ports).
+
+The device_may_wakeup() routine returns true only if the "power.wakeup" object
+exists and the corresponding "power/wakeup" file contains the string "enabled".
This information is used by subsystems, like the PCI bus type code, to see
whether or not to enable the devices' wakeup mechanisms. If device wakeup
mechanisms are enabled or disabled directly by drivers, they also should use
device_may_wakeup() to decide what to do during a system sleep transition.
-However for runtime power management, wakeup events should be enabled whenever
-the device and driver both support them, regardless of the should_wakeup flag.
-
+Device drivers, however, are not supposed to call device_set_wakeup_enable()
+directly in any case.
+
+It ought to be noted that system wakeup is conceptually different from "remote
+wakeup" used by runtime power management, although it may be supported by the
+same physical mechanism. Remote wakeup is a feature allowing devices in
+low-power states to trigger specific interrupts to signal conditions in which
+they should be put into the full-power state. Those interrupts may or may not
+be used to signal system wakeup events, depending on the hardware design. On
+some systems it is impossible to trigger them from system sleep states. In any
+case, remote wakeup should always be enabled for runtime power management for
+all devices and drivers that support it.
/sys/devices/.../power/control files
------------------------------------
@@ -249,23 +268,37 @@ for every device before the next phase begins. Not all busses or classes
support all these callbacks and not all drivers use all the callbacks. The
various phases always run after tasks have been frozen and before they are
unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
-been disabled (except for those marked with the IRQ_WAKEUP flag).
+been disabled (except for those marked with the IRQF_NO_SUSPEND flag).
+
+All phases use PM domain, bus, type, class or driver callbacks (that is, methods
+defined in dev->pm_domain->ops, dev->bus->pm, dev->type->pm, dev->class->pm or
+dev->driver->pm). These callbacks are regarded by the PM core as mutually
+exclusive. Moreover, PM domain callbacks always take precedence over all of the
+other callbacks and, for example, type callbacks take precedence over bus, class
+and driver callbacks. To be precise, the following rules are used to determine
+which callback to execute in the given phase:
+
+ 1. If dev->pm_domain is present, the PM core will choose the callback
+ included in dev->pm_domain->ops for execution.
+
+ 2. Otherwise, if both dev->type and dev->type->pm are present, the callback
+ included in dev->type->pm will be chosen for execution.
+
+ 3. Otherwise, if both dev->class and dev->class->pm are present, the
+ callback included in dev->class->pm will be chosen for execution.
+
+ 4. Otherwise, if both dev->bus and dev->bus->pm are present, the callback
+ included in dev->bus->pm will be chosen for execution.
+
+This allows PM domains and device types to override callbacks provided by bus
+types or device classes if necessary.
-All phases use bus, type, or class callbacks (that is, methods defined in
-dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually
-exclusive, so if the device type provides a struct dev_pm_ops object pointed to
-by its pm field (i.e. both dev->type and dev->type->pm are defined), the
-callbacks included in that object (i.e. dev->type->pm) will be used. Otherwise,
-if the class provides a struct dev_pm_ops object pointed to by its pm field
-(i.e. both dev->class and dev->class->pm are defined), the PM core will use the
-callbacks from that object (i.e. dev->class->pm). Finally, if the pm fields of
-both the device type and class objects are NULL (or those objects do not exist),
-the callbacks provided by the bus (that is, the callbacks from dev->bus->pm)
-will be used (this allows device types to override callbacks provided by bus
-types or classes if necessary).
+The PM domain, type, class and bus callbacks may in turn invoke device- or
+driver-specific methods stored in dev->driver->pm, but they don't have to do
+that.
-These callbacks may in turn invoke device- or driver-specific methods stored in
-dev->driver->pm, but they don't have to.
+If the subsystem callback chosen for execution is not present, the PM core will
+execute the corresponding method from dev->driver->pm instead if there is one.
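The selection rules above can be modeled in a few lines of userspace C. This is a sketch under stated assumptions, not the PM core's code: the struct layouts are simplified stand-ins (the field is named cls rather than class, and only one callback is modeled), and pick_suspend() is a hypothetical name.

```c
#include <stddef.h>

typedef int (*pm_cb)(void);

struct pm_ops { pm_cb suspend; };		/* one callback only, for brevity */
struct pm_holder { const struct pm_ops *pm; };	/* models type, class, bus, driver */
struct pm_domain { struct pm_ops ops; };

struct device {
	const struct pm_domain *pm_domain;
	const struct pm_holder *type, *cls, *bus, *driver;
};

/* Pick the callback for one phase following rules 1-4, then the driver fallback. */
pm_cb pick_suspend(const struct device *dev)
{
	pm_cb cb = NULL;

	if (dev->pm_domain)
		cb = dev->pm_domain->ops.suspend;
	else if (dev->type && dev->type->pm)
		cb = dev->type->pm->suspend;
	else if (dev->cls && dev->cls->pm)
		cb = dev->cls->pm->suspend;
	else if (dev->bus && dev->bus->pm)
		cb = dev->bus->pm->suspend;

	/* Chosen subsystem callback absent: use the driver's, if there is one. */
	if (!cb && dev->driver && dev->driver->pm)
		cb = dev->driver->pm->suspend;
	return cb;
}

static int type_suspend(void) { return 2; }
static int bus_suspend(void) { return 4; }
static int driver_suspend(void) { return 5; }

/* Type and bus both provide a callback: the type one is chosen (rule 2). */
int demo_type_over_bus(void)
{
	const struct pm_ops type_ops = { type_suspend }, bus_ops = { bus_suspend };
	const struct pm_holder type = { &type_ops }, bus = { &bus_ops };
	const struct device dev = { NULL, &type, NULL, &bus, NULL };

	return pick_suspend(&dev)();
}

/* A PM domain without this callback: the bus is not consulted, the driver is. */
int demo_driver_fallback(void)
{
	const struct pm_domain domain = { { NULL } };
	const struct pm_ops bus_ops = { bus_suspend }, drv_ops = { driver_suspend };
	const struct pm_holder bus = { &bus_ops }, drv = { &drv_ops };
	const struct device dev = { &domain, NULL, NULL, &bus, &drv };

	return pick_suspend(&dev)();
}
```

The second demo shows what "mutually exclusive" means in practice: once a PM domain is present, the bus callback is never considered, even if the domain itself does not implement the callback in question.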
Entering System Suspend
@@ -283,9 +316,8 @@ When the system goes into the standby or memory sleep state, the phases are:
After the prepare callback method returns, no new children may be
registered below the device. The method may also prepare the device or
- driver in some way for the upcoming system power transition (for
- example, by allocating additional memory required for this purpose), but
- it should not put the device into a low-power state.
+ driver in some way for the upcoming system power transition, but it
+ should not put the device into a low-power state.
2. The suspend methods should quiesce the device to stop it from performing
I/O. They also may save the device registers and put it into the
diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt
index 316c2ba187f4..6ccb68f68da6 100644
--- a/Documentation/power/freezing-of-tasks.txt
+++ b/Documentation/power/freezing-of-tasks.txt
@@ -21,7 +21,7 @@ freeze_processes() (defined in kernel/power/process.c) is called. It executes
try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and
either wakes them up, if they are kernel threads, or sends fake signals to them,
if they are user space processes. A task that has TIF_FREEZE set, should react
-to it by calling the function called refrigerator() (defined in
+to it by calling the function called __refrigerator() (defined in
kernel/freezer.c), which sets the task's PF_FROZEN flag, changes its state
to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it.
Then, we say that the task is 'frozen' and therefore the set of functions
@@ -29,10 +29,10 @@ handling this mechanism is referred to as 'the freezer' (these functions are
defined in kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h).
User space processes are generally frozen before kernel threads.
-It is not recommended to call refrigerator() directly. Instead, it is
-recommended to use the try_to_freeze() function (defined in
-include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the
-task enter refrigerator() if the flag is set.
+__refrigerator() must not be called directly. Instead, use the
+try_to_freeze() function (defined in include/linux/freezer.h), which checks
+the task's TIF_FREEZE flag and makes the task enter __refrigerator() if the
+flag is set.
For user space processes try_to_freeze() is called automatically from the
signal-handling code, but the freezable kernel threads need to call it
@@ -61,13 +61,13 @@ wait_event_freezable() and wait_event_freezable_timeout() macros.
After the system memory state has been restored from a hibernation image and
devices have been reinitialized, the function thaw_processes() is called in
order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that
-have been frozen leave refrigerator() and continue running.
+have been frozen leave __refrigerator() and continue running.
III. Which kernel threads are freezable?
Kernel threads are not freezable by default. However, a kernel thread may clear
PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
-directly is strongly discouraged). From this point it is regarded as freezable
+directly is not allowed). From this point it is regarded as freezable
and must call try_to_freeze() in a suitable place.
IV. Why do we do that?
@@ -176,3 +176,28 @@ tasks, since it generally exists anyway.
A driver must have all firmwares it may need in RAM before suspend() is called.
If keeping them is not practical, for example due to their size, they must be
requested early enough using the suspend notifier API described in notifiers.txt.
+
+VI. Are there any precautions to be taken to prevent freezing failures?
+
+Yes, there are.
+
+First of all, grabbing the 'pm_mutex' lock to mutually exclude a piece of code
+from system-wide sleep such as suspend/hibernation is not encouraged.
+If possible, that piece of code must instead hook onto the suspend/hibernation
+notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code
+(kernel/cpu.c) for an example.
+
+However, if that is not feasible, and grabbing 'pm_mutex' is deemed necessary,
+it is strongly discouraged to directly call mutex_[un]lock(&pm_mutex) since
+that could lead to freezing failures, because if the suspend/hibernate code
+successfully acquired the 'pm_mutex' lock, and hence that other entity failed
+to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE
+state. As a consequence, the freezer would not be able to freeze that task,
+leading to freezing failure.
+
+However, the [un]lock_system_sleep() APIs are safe to use in this scenario:
+they ask the freezer to skip freezing this task, because it is already
+"frozen enough" while it is blocked on 'pm_mutex', which will be released
+only after the entire suspend/hibernation sequence is complete.
+So, to summarize, use [un]lock_system_sleep() instead of directly using
+mutex_[un]lock(&pm_mutex). That would prevent freezing failures.
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 5336149f831b..4abe83e1045a 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -44,98 +44,112 @@ struct dev_pm_ops {
};
The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks
-are executed by the PM core for either the power domain, or the device type
-(if the device power domain's struct dev_pm_ops does not exist), or the class
-(if the device power domain's and type's struct dev_pm_ops object does not
-exist), or the bus type (if the device power domain's, type's and class'
-struct dev_pm_ops objects do not exist) of the given device, so the priority
-order of callbacks from high to low is that power domain callbacks, device
-type callbacks, class callbacks and bus type callbacks, and the high priority
-one will take precedence over low priority one. The bus type, device type and
-class callbacks are referred to as subsystem-level callbacks in what follows,
-and generally speaking, the power domain callbacks are used for representing
-power domains within a SoC.
+are executed by the PM core for the device's subsystem, which may be one of
+the following:
+
+ 1. PM domain of the device, if the device's PM domain object, dev->pm_domain,
+ is present.
+
+ 2. Device type of the device, if both dev->type and dev->type->pm are present.
+
+ 3. Device class of the device, if both dev->class and dev->class->pm are
+ present.
+
+ 4. Bus type of the device, if both dev->bus and dev->bus->pm are present.
+
+If the subsystem chosen by applying the above rules doesn't provide the relevant
+callback, the PM core will invoke the corresponding driver callback stored in
+dev->driver->pm directly (if present).
+
+The PM core always checks which callback to use in the order given above, so the
+priority order of callbacks from high to low is: PM domain, device type, class
+and bus type. Moreover, the high-priority one will always take precedence over
+a low-priority one. The PM domain, bus type, device type and class callbacks
+are referred to as subsystem-level callbacks in what follows.
By default, the callbacks are always invoked in process context with interrupts
-enabled. However, subsystems can use the pm_runtime_irq_safe() helper function
-to tell the PM core that a device's ->runtime_suspend() and ->runtime_resume()
-callbacks should be invoked in atomic context with interrupts disabled.
-This implies that these callback routines must not block or sleep, but it also
-means that the synchronous helper functions listed at the end of Section 4 can
-be used within an interrupt handler or in an atomic context.
-
-The subsystem-level suspend callback is _entirely_ _responsible_ for handling
-the suspend of the device as appropriate, which may, but need not include
-executing the device driver's own ->runtime_suspend() callback (from the
+enabled. However, the pm_runtime_irq_safe() helper function can be used to tell
+the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume()
+and ->runtime_idle() callbacks for the given device in atomic context with
+interrupts disabled. This implies that the callback routines in question must
+not block or sleep, but it also means that the synchronous helper functions
+listed at the end of Section 4 may be used for that device within an interrupt
+handler or generally in an atomic context.
+
+The subsystem-level suspend callback, if present, is _entirely_ _responsible_
+for handling the suspend of the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
PM core's point of view it is not necessary to implement a ->runtime_suspend()
callback in a device driver as long as the subsystem-level suspend callback
knows what to do to handle the device).
- * Once the subsystem-level suspend callback has completed successfully
- for given device, the PM core regards the device as suspended, which need
- not mean that the device has been put into a low power state. It is
- supposed to mean, however, that the device will not process data and will
- not communicate with the CPU(s) and RAM until the subsystem-level resume
- callback is executed for it. The runtime PM status of a device after
- successful execution of the subsystem-level suspend callback is 'suspended'.
-
- * If the subsystem-level suspend callback returns -EBUSY or -EAGAIN,
- the device's runtime PM status is 'active', which means that the device
- _must_ be fully operational afterwards.
-
- * If the subsystem-level suspend callback returns an error code different
- from -EBUSY or -EAGAIN, the PM core regards this as a fatal error and will
- refuse to run the helper functions described in Section 4 for the device,
- until the status of it is directly set either to 'active', or to 'suspended'
- (the PM core provides special helper functions for this purpose).
-
-In particular, if the driver requires remote wake-up capability (i.e. hardware
+ * Once the subsystem-level suspend callback (or the driver suspend callback,
+ if invoked directly) has completed successfully for the given device, the PM
+ core regards the device as suspended, which need not mean that it has been
+ put into a low power state. It is supposed to mean, however, that the
+ device will not process data and will not communicate with the CPU(s) and
+ RAM until the appropriate resume callback is executed for it. The runtime
+ PM status of a device after successful execution of the suspend callback is
+ 'suspended'.
+
+ * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM
+ status remains 'active', which means that the device _must_ be fully
+ operational afterwards.
+
+ * If the suspend callback returns an error code different from -EBUSY and
+ -EAGAIN, the PM core regards this as a fatal error and will refuse to run
+ the helper functions described in Section 4 for the device until its status
+ is directly set to either 'active' or 'suspended' (the PM core provides
+ special helper functions for this purpose).
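The three cases above amount to a small mapping from the suspend callback's return value to the resulting runtime PM status. The sketch below models it in userspace C; the enum, its RPM_STATE_ERROR member (standing for "helpers refused until the status is set directly") and the function name are assumptions of this illustration, not the PM core's definitions.

```c
#include <errno.h>

enum rpm_state {
	RPM_STATE_ACTIVE,	/* device must be fully operational */
	RPM_STATE_SUSPENDED,	/* device regarded as suspended */
	RPM_STATE_ERROR		/* fatal: helper functions are refused */
};

/* Status of the device after the suspend callback returned ret. */
enum rpm_state state_after_suspend(int ret)
{
	if (ret == 0)
		return RPM_STATE_SUSPENDED;
	if (ret == -EBUSY || ret == -EAGAIN)
		return RPM_STATE_ACTIVE;	/* not fatal, status unchanged */
	return RPM_STATE_ERROR;
}
```

A driver that needs remote wakeup but cannot get it would therefore return -EBUSY, leaving the device 'active', rather than some other error code that would block further runtime PM for the device.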
+
+In particular, if the driver requires remote wakeup capability (i.e. hardware
mechanism allowing the device to request a change of its power state, such as
PCI PME) for proper functioning and device_run_wake() returns 'false' for the
device, then ->runtime_suspend() should return -EBUSY. On the other hand, if
-device_run_wake() returns 'true' for the device and the device is put into a low
-power state during the execution of the subsystem-level suspend callback, it is
-expected that remote wake-up will be enabled for the device. Generally, remote
-wake-up should be enabled for all input devices put into a low power state at
-run time.
-
-The subsystem-level resume callback is _entirely_ _responsible_ for handling the
-resume of the device as appropriate, which may, but need not include executing
-the device driver's own ->runtime_resume() callback (from the PM core's point of
-view it is not necessary to implement a ->runtime_resume() callback in a device
-driver as long as the subsystem-level resume callback knows what to do to handle
-the device).
-
- * Once the subsystem-level resume callback has completed successfully, the PM
- core regards the device as fully operational, which means that the device
- _must_ be able to complete I/O operations as needed. The runtime PM status
- of the device is then 'active'.
-
- * If the subsystem-level resume callback returns an error code, the PM core
- regards this as a fatal error and will refuse to run the helper functions
- described in Section 4 for the device, until its status is directly set
- either to 'active' or to 'suspended' (the PM core provides special helper
- functions for this purpose).
-
-The subsystem-level idle callback is executed by the PM core whenever the device
-appears to be idle, which is indicated to the PM core by two counters, the
-device's usage counter and the counter of 'active' children of the device.
+device_run_wake() returns 'true' for the device and the device is put into a
+low-power state during the execution of the suspend callback, it is expected
+that remote wakeup will be enabled for the device. Generally, remote wakeup
+should be enabled for all input devices put into low-power states at run time.
+
+The subsystem-level resume callback, if present, is _entirely_ _responsible_ for
+handling the resume of the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the subsystem-level resume callback knows
+what to do to handle the device).
+
+ * Once the subsystem-level resume callback (or the driver resume callback, if
+ invoked directly) has completed successfully, the PM core regards the device
+ as fully operational, which means that the device _must_ be able to complete
+ I/O operations as needed. The runtime PM status of the device is then
+ 'active'.
+
+ * If the resume callback returns an error code, the PM core regards this as a
+ fatal error and will refuse to run the helper functions described in Section
+ 4 for the device, until its status is directly set to either 'active' or
+ 'suspended' (by means of special helper functions provided by the PM core
+ for this purpose).
+
+The idle callback (a subsystem-level one, if present, or the driver one) is
+executed by the PM core whenever the device appears to be idle, which is
+indicated to the PM core by two counters, the device's usage counter and the
+counter of 'active' children of the device.
* If any of these counters is decreased using a helper function provided by
the PM core and it turns out to be equal to zero, the other counter is
checked. If that counter also is equal to zero, the PM core executes the
- subsystem-level idle callback with the device as an argument.
+ idle callback with the device as its argument.
-The action performed by a subsystem-level idle callback is totally dependent on
-the subsystem in question, but the expected and recommended action is to check
+The action performed by the idle callback is totally dependent on the subsystem
+(or driver) in question, but the expected and recommended action is to check
if the device can be suspended (i.e. if all of the conditions necessary for
suspending the device are satisfied) and to queue up a suspend request for the
device in that case. The value returned by this callback is ignored by the PM
core.
The helper functions provided by the PM core, described in Section 4, guarantee
-that the following constraints are met with respect to the bus type's runtime
-PM callbacks:
+that the following constraints are met with respect to runtime PM callbacks for
+one device:
(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
->runtime_suspend() in parallel with ->runtime_resume() or with another
diff --git a/Documentation/scsi/53c700.txt b/Documentation/scsi/53c700.txt
index 0da681d497a2..e31aceb6df15 100644
--- a/Documentation/scsi/53c700.txt
+++ b/Documentation/scsi/53c700.txt
@@ -16,32 +16,13 @@ fill in to get the driver working.
Compile Time Flags
==================
-The driver may be either io mapped or memory mapped. This is
-selectable by configuration flags:
-
-CONFIG_53C700_MEM_MAPPED
-
-define if the driver is memory mapped.
-
-CONFIG_53C700_IO_MAPPED
-
-define if the driver is to be io mapped.
-
-One or other of the above flags *must* be defined.
-
-Other flags are:
+A compile time flag is:
CONFIG_53C700_LE_ON_BE
define if the chipset must be supported in little endian mode on a big
endian architecture (used for the 700 on parisc).
-CONFIG_53C700_USE_CONSISTENT
-
-allocate consistent memory (should only be used if your architecture
-has a mixture of consistent and inconsistent memory). Fully
-consistent or fully inconsistent architectures should not define this.
-
Using the Chip Core Driver
==========================
diff --git a/Documentation/serial/serial-rs485.txt b/Documentation/serial/serial-rs485.txt
index 079cb3df62cf..41c8378c0b2f 100644
--- a/Documentation/serial/serial-rs485.txt
+++ b/Documentation/serial/serial-rs485.txt
@@ -97,15 +97,23 @@
struct serial_rs485 rs485conf;
- /* Set RS485 mode: */
+ /* Enable RS485 mode: */
rs485conf.flags |= SER_RS485_ENABLED;
+ /* Set logical level for RTS pin equal to 1 when sending: */
+ rs485conf.flags |= SER_RS485_RTS_ON_SEND;
+ /* or, set logical level for RTS pin equal to 0 when sending: */
+ rs485conf.flags &= ~(SER_RS485_RTS_ON_SEND);
+
+ /* Set logical level for RTS pin equal to 1 after sending: */
+ rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
+ /* or, set logical level for RTS pin equal to 0 after sending: */
+ rs485conf.flags &= ~(SER_RS485_RTS_AFTER_SEND);
+
/* Set rts delay before send, if needed: */
- rs485conf.flags |= SER_RS485_RTS_BEFORE_SEND;
rs485conf.delay_rts_before_send = ...;
/* Set rts delay after send, if needed: */
- rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
rs485conf.delay_rts_after_send = ...;
/* Set this flag if you want to receive data even whilst sending data */
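
For orientation, the flag handling shown in this hunk could be wrapped up roughly as follows. This is only a sketch against the userspace header <linux/serial.h>: the helper names rs485_fill_config() and rs485_apply() are hypothetical and do not appear in serial-rs485.txt, and the chosen polarity (RTS at logical 1 while sending, 0 afterwards) is just one of the combinations the text describes.

```c
#include <linux/serial.h>   /* struct serial_rs485, SER_RS485_* flags */
#include <string.h>         /* memset() */
#include <sys/ioctl.h>      /* ioctl(), TIOCSRS485 */

/*
 * Hypothetical helper (not part of the kernel documentation): fill in an
 * RS485 configuration with RTS at logical level 1 while sending and
 * logical level 0 after sending.
 */
static void rs485_fill_config(struct serial_rs485 *conf,
                              unsigned int delay_before,
                              unsigned int delay_after)
{
	memset(conf, 0, sizeof(*conf));

	/* Enable RS485 mode. */
	conf->flags |= SER_RS485_ENABLED;

	/* RTS pin equal to 1 when sending. */
	conf->flags |= SER_RS485_RTS_ON_SEND;

	/* RTS pin equal to 0 after sending. */
	conf->flags &= ~SER_RS485_RTS_AFTER_SEND;

	/* The delays are plain fields; no flag is needed to enable them. */
	conf->delay_rts_before_send = delay_before;
	conf->delay_rts_after_send = delay_after;
}

/*
 * Hypothetical helper: apply the configuration to an already opened
 * serial device. Returns the ioctl() result (-1 on error, errno set).
 */
static int rs485_apply(int fd, unsigned int delay_before,
                       unsigned int delay_after)
{
	struct serial_rs485 conf;

	rs485_fill_config(&conf, delay_before, delay_after);
	return ioctl(fd, TIOCSRS485, &conf);
}
```

Whether a given driver honours both RTS flags and both delays depends on the hardware, so the return value of the ioctl should be checked.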
diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt
index 4f3443230d89..edad99abec21 100644
--- a/Documentation/sound/alsa/HD-Audio-Models.txt
+++ b/Documentation/sound/alsa/HD-Audio-Models.txt
@@ -349,6 +349,7 @@ STAC92HD83*
ref Reference board
mic-ref Reference board with power management for ports
dell-s14 Dell laptop
+ dell-vostro-3500 Dell Vostro 3500 laptop
hp HP laptops with (inverted) mute-LED
hp-dv7-4000 HP dv-7 4000
auto BIOS setup (default)
diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt
index 03e2771ddeef..91fee3b45fb8 100644
--- a/Documentation/sound/alsa/HD-Audio.txt
+++ b/Documentation/sound/alsa/HD-Audio.txt
@@ -579,7 +579,7 @@ Development Tree
~~~~~~~~~~~~~~~~
The latest development codes for HD-audio are found on sound git tree:
-- git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6.git
+- git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
The master branch or for-next branches can be used as the main
development branches in general while the HD-audio specific patches
@@ -594,7 +594,7 @@ is, installed via the usual spells: configure, make and make
install(-modules). See INSTALL in the package. The snapshot tarballs
are found at:
-- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/snapshot/
+- ftp://ftp.suse.com/pub/people/tiwai/snapshot/
Sending a Bug Report
@@ -696,7 +696,7 @@ via hda-verb won't change the mixer value.
The hda-verb program is found in the ftp directory:
-- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/misc/
+- ftp://ftp.suse.com/pub/people/tiwai/misc/
Also a git repository is available:
@@ -764,7 +764,7 @@ operation, the jack plugging simulation, etc.
The package is found in:
-- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/misc/
+- ftp://ftp.suse.com/pub/people/tiwai/misc/
A git repository is available:
diff --git a/Documentation/sound/alsa/soc/machine.txt b/Documentation/sound/alsa/soc/machine.txt
index 3e2ec9cbf397..d50c14df3411 100644
--- a/Documentation/sound/alsa/soc/machine.txt
+++ b/Documentation/sound/alsa/soc/machine.txt
@@ -50,8 +50,7 @@ Machine DAI Configuration
The machine DAI configuration glues all the codec and CPU DAIs together. It can
also be used to set up the DAI system clock and for any machine related DAI
initialisation e.g. the machine audio map can be connected to the codec audio
-map, unconnected codec pins can be set as such. Please see corgi.c, spitz.c
-for examples.
+map, unconnected codec pins can be set as such.
struct snd_soc_dai_link is used to set up each DAI in your machine. e.g.
@@ -83,8 +82,7 @@ Machine Power Map
The machine driver can optionally extend the codec power map and to become an
audio power map of the audio subsystem. This allows for automatic power up/down
of speaker/HP amplifiers, etc. Codec pins can be connected to the machines jack
-sockets in the machine init function. See soc/pxa/spitz.c and dapm.txt for
-details.
+sockets in the machine init function.
Machine Controls
diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index b510564aac7e..bb24c2a0e870 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -191,8 +191,6 @@ And for string fields they are:
Currently, only exact string matches are supported.
-Currently, the maximum number of predicates in a filter is 16.
-
5.2 Setting filters
-------------------
diff --git a/Documentation/usb/linux-cdc-acm.inf b/Documentation/usb/linux-cdc-acm.inf
index 37a02ce54841..f0ffc27d4c0a 100644
--- a/Documentation/usb/linux-cdc-acm.inf
+++ b/Documentation/usb/linux-cdc-acm.inf
@@ -90,10 +90,10 @@ ServiceBinary=%12%\USBSER.sys
[SourceDisksFiles]
[SourceDisksNames]
[DeviceList]
-%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02
+%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00
[DeviceList.NTamd64]
-%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02
+%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00
;------------------------------------------------------------------------------
diff --git a/Documentation/vgaarbiter.txt b/Documentation/vgaarbiter.txt
index b7d401e0eae9..014423e2824c 100644
--- a/Documentation/vgaarbiter.txt
+++ b/Documentation/vgaarbiter.txt
@@ -177,7 +177,7 @@ II. Credits
Benjamin Herrenschmidt (IBM?) started this work when he discussed such design
with the Xorg community in 2005 [1, 2]. In the end of 2007, Paulo Zanoni and
-Tiago Vignatti (both of C3SL/Federal University of Paraná) proceeded his work
+Tiago Vignatti (both of C3SL/Federal University of Paraná) proceeded his work
enhancing the kernel code to adapt as a kernel module and also did the
implementation of the user space side [3]. Now (2009) Tiago Vignatti and Dave
Airlie finally put this work in shape and queued to Jesse Barnes' PCI tree.
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 7945b0bd35e2..e2a4b5287361 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1100,6 +1100,15 @@ emulate them efficiently. The fields in each entry are defined as follows:
eax, ebx, ecx, edx: the values returned by the cpuid instruction for
this function/index combination
+The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
+as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
+support. Instead it is reported via
+
+ ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
+
+if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
+feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
+
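
As a rough userspace illustration of the capability check described above (a sketch only: the helper name kvm_has_tsc_deadline_timer() is hypothetical, and it assumes the KVM device node is /dev/kvm and that <linux/kvm.h> is installed):

```c
#include <fcntl.h>       /* open(), O_RDWR */
#include <linux/kvm.h>   /* KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER */
#include <sys/ioctl.h>   /* ioctl() */
#include <unistd.h>      /* close() */

/*
 * Hypothetical helper (not part of the KVM API): ask KVM whether the TSC
 * deadline timer capability is available.
 *
 * Returns -1 if /dev/kvm cannot be opened or the ioctl fails, 0 if the
 * capability is not reported, and a positive value if it is.
 */
static int kvm_has_tsc_deadline_timer(void)
{
	int kvm_fd = open("/dev/kvm", O_RDWR);
	int ret;

	if (kvm_fd < 0)
		return -1;

	ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER);
	close(kvm_fd);
	return ret;
}
```

If this returns a positive value and the VM uses KVM_CREATE_IRQCHIP (or userspace emulates the feature), the caller may then set ecx bit 24 of CPUID leaf 1 in the entries passed to KVM_SET_CPUID2, as the text above describes.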
4.47 KVM_PPC_GET_PVINFO
Capability: KVM_CAP_PPC_GET_PVINFO
@@ -1151,6 +1160,13 @@ following flags are specified:
/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
+The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
+isolation of the device. Usages not specifying this flag are deprecated.
+
+Only PCI header type 0 devices with PCI BAR resources are supported by
+device assignment. The user requesting this ioctl must have read/write
+access to the PCI sysfs resource files associated with the device.
+
4.49 KVM_DEASSIGN_PCI_DEVICE
Capability: KVM_CAP_DEVICE_DEASSIGNMENT