<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include, branch v3.12.17</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.12.17</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.12.17'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2014-04-03T08:32:30+00:00</updated>
<entry>
<title>mm: close PageTail race</title>
<updated>2014-04-03T08:32:30+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2014-03-03T23:38:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9a110858ed2e494b8be683c6959113f73685eb1f'/>
<id>urn:sha1:9a110858ed2e494b8be683c6959113f73685eb1f</id>
<content type='text'>
commit 668f9abbd4334e6c29fa8acd71635c4f9101caa7 upstream.

Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page-&gt;first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page).  Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page-&gt;first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation.  This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling.  The patch then adds a store memory barrier to
prep_compound_page() to ensure page-&gt;first_page is set.

This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.

Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Holger Kiehl &lt;Holger.Kiehl@dwd.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>ext4: atomically set inode-&gt;i_flags in ext4_set_inode_flags()</title>
<updated>2014-04-03T08:32:22+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2014-03-30T14:20:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=710e5cbeeb6869bb0f1227ce3cba7374449fdeb4'/>
<id>urn:sha1:710e5cbeeb6869bb0f1227ce3cba7374449fdeb4</id>
<content type='text'>
commit 00a1a053ebe5febcfc2ec498bd894f035ad2aa06 upstream.

Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.

Reported-by: John Sullivan &lt;jsrhbz@kanargh.force9.co.uk&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping</title>
<updated>2014-04-03T08:32:17+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2014-01-27T18:46:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a18d760264c9ff07b35ec501403f037c2654cded'/>
<id>urn:sha1:a18d760264c9ff07b35ec501403f037c2654cded</id>
<content type='text'>
commit d529ef83c355f97027ff85298a9709fe06216a66 upstream.

There is a possible race in how the nfs_invalidate_mapping function is
handled.  Currently, we go and invalidate the pages in the file and then
clear NFS_INO_INVALID_DATA.

The problem is that it's possible for a stale page to creep into the
mapping after the page was invalidated (i.e., via readahead). If another
writer comes along and sets the flag after that happens but before
invalidate_inode_pages2 returns then we could clear the flag
without the cache having been properly invalidated.

So, we must clear the flag first and then invalidate the pages. Doing
this however, opens another race:

It's possible to have two concurrent read() calls that end up in
nfs_revalidate_mapping at the same time. The first one clears the
NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.

Just before calling that though, the other task races in, checks the
flag and finds it cleared. At that point, it trusts that the mapping is
good and gets the lock on the page, allowing the read() to be satisfied
from the cache even though the data is no longer valid.

These effects are easily manifested by running diotest3 from the LTP
test suite on NFS. That program does a series of DIO writes and buffered
reads. The operations are serialized and page-aligned but the existing
code fails the test since it occasionally allows a read to come out of
the cache incorrectly. While mixing direct and buffered I/O isn't
recommended, I believe it's possible to hit this in other ways that just
use buffered I/O, though that situation is much harder to reproduce.

The problem is that the checking/clearing of that flag and the
invalidation of the mapping really need to be atomic. Fix this by
serializing concurrent invalidations with a bitlock.

At the same time, we also need to allow other places that check
NFS_INO_INVALID_DATA to check whether we might be in the middle of
invalidating the file, so fix up a couple of places that do that
to look for the new NFS_INO_INVALIDATING flag.

Doing this requires us to be careful not to set the bitlock
unnecessarily, so this code only does that if it believes it will
be doing an invalidation.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>libceph: rename ceph_msg::front_max to front_alloc_len</title>
<updated>2014-03-31T12:22:35+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-01-09T18:08:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7335817378e3e4c3325c65b0e15b2918cc04f5c1'/>
<id>urn:sha1:7335817378e3e4c3325c65b0e15b2918cc04f5c1</id>
<content type='text'>
commit 3cea4c3071d4e55e9d7356efe9d0ebf92f0c2204 upstream.

Rename front_max field of struct ceph_msg to front_alloc_len to make
its purpose more clear.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>libceph: block I/O when PAUSE or FULL osd map flags are set</title>
<updated>2014-03-31T12:22:27+00:00</updated>
<author>
<name>Josh Durgin</name>
<email>josh.durgin@inktank.com</email>
</author>
<published>2013-12-03T03:11:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7388cfc1876b7caf0747d6864b74e7bef59a4ad2'/>
<id>urn:sha1:7388cfc1876b7caf0747d6864b74e7bef59a4ad2</id>
<content type='text'>
commit d29adb34a94715174c88ca93e8aba955850c9bde upstream.

The PAUSEWR and PAUSERD flags are meant to stop the cluster from
processing writes and reads, respectively. The FULL flag is set when
the cluster determines that it is out of space, and will no longer
process writes.  PAUSEWR and PAUSERD are purely client-side settings
already implemented in userspace clients. The osd does nothing special
with these flags.

When the FULL flag is set, however, the osd responds to all writes
with -ENOSPC. For cephfs, this makes sense, but for rbd the block
layer translates this into EIO.  If a cluster goes from full to
non-full quickly, a filesystem on top of rbd will not behave well,
since some writes succeed while others get EIO.

Fix this by blocking any writes when the FULL flag is set in the osd
client. This is the same strategy used by userspace, so apply it by
default.  A follow-on patch makes this configurable.

__map_request() is called to re-target osd requests in case the
available osds changed.  Add a paused field to a ceph_osd_request, and
set it whenever an appropriate osd map flag is set.  Avoid queueing
paused requests in __map_request(), but force them to be resent if
they become unpaused.

Also subscribe to the next osd map from the monitor if any of these
flags are set, so paused requests can be unblocked as soon as
possible.

Fixes: http://tracker.ceph.com/issues/6079

Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
Signed-off-by: Josh Durgin &lt;josh.durgin@inktank.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>tracing: Fix array size mismatch in format string</title>
<updated>2014-03-31T12:22:25+00:00</updated>
<author>
<name>Vaibhav Nagarnaik</name>
<email>vnagarnaik@google.com</email>
</author>
<published>2014-02-14T03:51:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c0316f975b1a29a1161f86d524150f493188e9b2'/>
<id>urn:sha1:c0316f975b1a29a1161f86d524150f493188e9b2</id>
<content type='text'>
commit 87291347c49dc40aa339f587b209618201c2e527 upstream.

In event format strings, the array size is reported in two locations.
One in array subscript and then via the "size:" attribute. The values
reported there have a mismatch.

For e.g., in sched:sched_switch the prev_comm and next_comm character
arrays have subscript values as [32] where as the actual field size is
16.

name: sched_switch
ID: 301
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1;signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:char prev_comm[32];       offset:8;       size:16;        signed:1;
        field:pid_t prev_pid;   offset:24;      size:4; signed:1;
        field:int prev_prio;    offset:28;      size:4; signed:1;
        field:long prev_state;  offset:32;      size:8; signed:1;
        field:char next_comm[32];       offset:40;      size:16;        signed:1;
        field:pid_t next_pid;   offset:56;      size:4; signed:1;
        field:int next_prio;    offset:60;      size:4; signed:1;

After bisection, the following commit was blamed:
92edca0 tracing: Use direct field, type and system names

This commit removes the duplication of strings for field-&gt;name and
field-&gt;type assuming that all the strings passed in
__trace_define_field() are immutable. This is not true for arrays, where
the type string is created in event_storage variable and field-&gt;type for
all array fields points to event_storage.

Use __stringify() to create a string constant for the type string.

Also, get rid of event_storage and event_storage_mutex that are not
needed anymore.

also, an added benefit is that this reduces the overhead of events a bit more:

   text    data     bss     dec     hex filename
8424787 2036472 1302528 11763787         b3804b vmlinux
8420814 2036408 1302528 11759750         b37086 vmlinux.patched

Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com

Cc: Laurent Chavey &lt;chavey@google.com&gt;
Signed-off-by: Vaibhav Nagarnaik &lt;vnagarnaik@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>cgroup: protect modifications to cgroup_idr with cgroup_mutex</title>
<updated>2014-03-26T11:24:36+00:00</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-02-11T08:05:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=59d9c5f94655a3cc63e2772e847509010598b7f1'/>
<id>urn:sha1:59d9c5f94655a3cc63e2772e847509010598b7f1</id>
<content type='text'>
commit 0ab02ca8f887908152d1a96db5130fc661d36a1e upstream.

Setup cgroupfs like this:
  # mount -t cgroup -o cpuacct xxx /cgroup
  # mkdir /cgroup/sub1
  # mkdir /cgroup/sub2

Then run these two commands:
  # for ((; ;)) { mkdir /cgroup/sub1/tmp &amp;&amp; rmdir /mnt/sub1/tmp; } &amp;
  # for ((; ;)) { mkdir /cgroup/sub2/tmp &amp;&amp; rmdir /mnt/sub2/tmp; } &amp;

After seconds you may see this warning:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
idr_remove called for id=6 which is not allocated.
...
Call Trace:
 [&lt;ffffffff8156063c&gt;] dump_stack+0x7a/0x96
 [&lt;ffffffff810591ac&gt;] warn_slowpath_common+0x8c/0xc0
 [&lt;ffffffff81059296&gt;] warn_slowpath_fmt+0x46/0x50
 [&lt;ffffffff81300aa7&gt;] sub_remove+0x87/0x1b0
 [&lt;ffffffff810f3f02&gt;] ? css_killed_work_fn+0x32/0x1b0
 [&lt;ffffffff81300bf5&gt;] idr_remove+0x25/0xd0
 [&lt;ffffffff810f2bab&gt;] cgroup_destroy_css_killed+0x5b/0xc0
 [&lt;ffffffff810f4000&gt;] css_killed_work_fn+0x130/0x1b0
 [&lt;ffffffff8107cdbc&gt;] process_one_work+0x26c/0x550
 [&lt;ffffffff8107eefe&gt;] worker_thread+0x12e/0x3b0
 [&lt;ffffffff81085f96&gt;] kthread+0xe6/0xf0
 [&lt;ffffffff81570bac&gt;] ret_from_fork+0x7c/0xb0
---[ end trace 2d1577ec10cf80d0 ]---

It's because allocating/removing cgroup ID is not properly synchronized.

The bug was introduced when we converted cgroup_ida to cgroup_idr.
While synchronization is already done inside ida_simple_{get,remove}(),
users are responsible for concurrent calls to idr_{alloc,remove}().

[mhocko@suse.cz: ported to 3.12]
Fixes: 4e96ee8e981b ("cgroup: convert cgroup_ida to cgroup_idr")
Cc: &lt;stable@vger.kernel.org&gt; #3.12+
Reported-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>NFSv4: Fix another nfs4_sequence corruptor</title>
<updated>2014-03-26T08:44:13+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2014-03-22T14:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dec2e6e6bd7cc728a9b3091075dd17aef6b4c33c'/>
<id>urn:sha1:dec2e6e6bd7cc728a9b3091075dd17aef6b4c33c</id>
<content type='text'>
commit b7e63a1079b266866a732cf699d8c4d61391bbda upstream.

nfs4_release_lockowner needs to set the rpc_message reply to point to
the nfs4_sequence_res in order to avoid another Oopsable situation
in nfs41_assign_slot.

Fixes: fbd4bfd1d9d21 (NFS: Add nfs4_sequence calls for RELEASE_LOCKOWNER)
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>iscsi/iser-target: Fix isert_conn-&gt;state hung shutdown issues</title>
<updated>2014-03-22T21:01:58+00:00</updated>
<author>
<name>Nicholas Bellinger</name>
<email>nab@linux-iscsi.org</email>
</author>
<published>2014-02-03T20:54:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2f82fa987df6b060241fc760eadf5dd02c51a979'/>
<id>urn:sha1:2f82fa987df6b060241fc760eadf5dd02c51a979</id>
<content type='text'>
commit defd884845297fd5690594bfe89656b01f16d87e upstream.

This patch addresses a couple of different hug shutdown issues
related to wait_event() + isert_conn-&gt;state.  First, it changes
isert_conn-&gt;conn_wait + isert_conn-&gt;conn_wait_comp_err from
waitqueues to completions, and sets ISER_CONN_TERMINATING from
within isert_disconnect_work().

Second, it splits isert_free_conn() into isert_wait_conn() that
is called earlier in iscsit_close_connection() to ensure that
all outstanding commands have completed before continuing.

Finally, it breaks isert_cq_comp_err() into seperate TX / RX
related code, and adds logic in isert_cq_rx_comp_err() to wait
for outstanding commands to complete before setting ISER_CONN_DOWN
and calling complete(&amp;isert_conn-&gt;conn_wait_comp_err).

Acked-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>firewire: don't use PREPARE_DELAYED_WORK</title>
<updated>2014-03-22T21:01:56+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-03-07T15:19:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f72d5c490bc4e5b84829cce251266d485663cb6'/>
<id>urn:sha1:0f72d5c490bc4e5b84829cce251266d485663cb6</id>
<content type='text'>
commit 70044d71d31d6973665ced5be04ef39ac1c09a48 upstream.

PREPARE_[DELAYED_]WORK() are being phased out.  They have few users
and a nasty surprise in terms of reentrancy guarantee as workqueue
considers work items to be different if they don't have the same work
function.

firewire core-device and sbp2 have been been multiplexing work items
with multiple work functions.  Introduce fw_device_workfn() and
sbp2_lu_workfn() which invoke fw_device-&gt;workfn and
sbp2_logical_unit-&gt;workfn respectively and always use the two
functions as the work functions and update the users to set the
-&gt;workfn fields instead of overriding work functions using
PREPARE_DELAYED_WORK().

This fixes a variety of possible regressions since a2c1c57be8d9
"workqueue: consider work function when searching for busy work items"
due to which fw_workqueue lost its required non-reentrancy property.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Stefan Richter &lt;stefanr@s5r6.in-berlin.de&gt;
Cc: linux1394-devel@lists.sourceforge.net
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
</feed>
