<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git, branch v3.12.24</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.12.24</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.12.24'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2014-07-04T08:06:59+00:00</updated>
<entry>
<title>Linux 3.12.24</title>
<updated>2014-07-04T08:06:59+00:00</updated>
<author>
<name>Jiri Slaby</name>
<email>jslaby@suse.cz</email>
</author>
<published>2014-06-30T11:49:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8097be3bc58315b14498000585299e3c06fb13ce'/>
<id>urn:sha1:8097be3bc58315b14498000585299e3c06fb13ce</id>
<content type='text'>
</content>
</entry>
<entry>
<title>xfs: don't perform discard if the given range length is less than block size</title>
<updated>2014-07-04T08:06:58+00:00</updated>
<author>
<name>Jie Liu</name>
<email>jeff.liu@oracle.com</email>
</author>
<published>2013-11-20T08:08:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0af3f13634228f13b0298551326a51b89d95a7e4'/>
<id>urn:sha1:0af3f13634228f13b0298551326a51b89d95a7e4</id>
<content type='text'>
commit f9fd0135610084abef6867d984e9951c3099950d upstream.

For a discard operation, we should return EINVAL if the given range
length is less than the block size. Otherwise the request goes through
the file system and discards data blocks, because the end of the range
might evaluate to -1, e.g.:
/xfs7: 9811378176 bytes were trimmed

This issue can be triggered via xfstests/generic/288.

Also, get the request queue pointer via bdev_get_queue() instead of
using a hard-coded pointer dereference.

Signed-off-by: Jie Liu &lt;jeff.liu@oracle.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: xfs_remove deadlocks due to inverted AGF vs AGI lock ordering</title>
<updated>2014-07-04T08:06:58+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2013-10-29T11:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f7009499bc593c23ec4fe9525f4e545ad0c16691'/>
<id>urn:sha1:f7009499bc593c23ec4fe9525f4e545ad0c16691</id>
<content type='text'>
commit 273203699f82667296e1f14344c5a5a6c4600470 upstream.

Removing an inode from the namespace involves removing the directory
entry and dropping the link count on the inode. Removing the
directory entry can result in locking an AGF (directory blocks were
freed) and removing a link count can result in placing the inode on
an unlinked list which results in locking an AGI.

The big problem here is that we have an ordering constraint on AGF
and AGI locking - inode allocation locks the AGI, then can allocate
a new extent for new inodes, locking the AGF after the AGI.
Similarly, freeing the inode removes the inode from the unlinked
list, requiring that we lock the AGI first, and then freeing the
inode can result in an inode chunk being freed and hence freeing
disk space requiring that we lock an AGF.

Hence the ordering that is imposed by other parts of the code is AGI
before AGF. This means we cannot remove the directory entry before
we drop the inode reference count and put it on the unlinked list as
this results in a lock order of AGF then AGI, and this can deadlock
against inode allocation and freeing. Therefore we must drop the
link counts before we remove the directory entry.

This is still safe from a transactional point of view - it is not
until we get to xfs_bmap_finish() that we have the possibility of
multiple transactions in this operation. Hence as long as we remove
the directory entry and drop the link count in the first transaction
of the remove operation, there are no transactional constraints on
the ordering here.

Change the ordering of the operations in the xfs_remove() function
to align the ordering of AGI and AGF locking to match that of the
rest of the code.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Ben Myers &lt;bpm@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: fix the extent count when allocating a new indirection array entry</title>
<updated>2014-07-04T08:06:58+00:00</updated>
<author>
<name>Jie Liu</name>
<email>jeff.liu@oracle.com</email>
</author>
<published>2013-10-25T06:52:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=67820ad00235119639737a7a4bcf4e526de43d54'/>
<id>urn:sha1:67820ad00235119639737a7a4bcf4e526de43d54</id>
<content type='text'>
commit bb86d21cba22a045b09d11b71decf5ca7c3d5def upstream.

At xfs_iext_add(), if extent(s) are being appended to the last page in
the indirection array and the new extent(s) don't fit in the page, the
number of extents (erp-&gt;er_extcount) in a newly allocated entry should
be the minimum of count and XFS_LINEAR_EXTS, instead of count.
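
As a rough illustration only (this is not the kernel code), the clamp
described above can be sketched in Python. The name new_entry_extcount
is hypothetical, and the value 256 for XFS_LINEAR_EXTS is an assumption
made for the example:

```python
# Illustrative sketch only, not the kernel implementation.
# LINEAR_EXTS stands in for XFS_LINEAR_EXTS; 256 is an assumed value.
LINEAR_EXTS = 256


def new_entry_extcount(count, linear_exts=LINEAR_EXTS):
    """Extent count to record in a freshly allocated indirection entry."""
    # One entry can hold at most linear_exts records, so the stored
    # count must be clamped instead of copied from the request.
    return min(count, linear_exts)


print(new_entry_extcount(10))
print(new_entry_extcount(300))
```

A request that fits in one entry is stored unchanged, while a larger
request is capped at the capacity of the entry.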

For now, no existing test case demonstrates a problem with er_extcount
being set incorrectly here, but it is obviously a bug.

Signed-off-by: Jie Liu &lt;jeff.liu@oracle.com&gt;
Reviewed-by: Ben Myers &lt;bpm@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: fix possible NULL dereference in xlog_verify_iclog</title>
<updated>2014-07-04T08:06:57+00:00</updated>
<author>
<name>Geyslan G. Bem</name>
<email>geyslan@gmail.com</email>
</author>
<published>2013-10-30T21:01:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9af76725ffcc8517643fd07d379f4a1232f6ae92'/>
<id>urn:sha1:9af76725ffcc8517643fd07d379f4a1232f6ae92</id>
<content type='text'>
commit 643f7c4e5656bd18c769211f933190f7bb738245 upstream.

In xlog_verify_iclog a debug check of the incore log buffers prints an
error if icptr is null and then goes on to dereference the pointer
regardless.  Convert this to an assert so that the intention is clear.
This was reported by Coverity.

Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;
Reviewed-by: Eric Sandeen &lt;sandeen@redhat.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: prevent stack overflows from page cache allocation</title>
<updated>2014-07-04T08:06:56+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2013-10-29T11:11:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=710246604e1d8b6be253d056c922bdf7f57cc243'/>
<id>urn:sha1:710246604e1d8b6be253d056c922bdf7f57cc243</id>
<content type='text'>
commit ad22c7a043c2cc6792820e6c5da699935933e87d upstream.

Page cache allocation doesn't always go through -&gt;begin_write and
hence we don't always get the opportunity to set the allocation
context to GFP_NOFS. Failing to do this means we open up the direct
reclaim stack to recurse into the filesystem and consume a
significant amount of stack.

On RHEL6.4 kernels we are seeing ra_submit() and
generic_file_splice_read() from an nfsd context recursing into the
filesystem via the inode cache shrinker and evicting inodes. This is
causing truncation to be run (e.g. EOF block freeing) and causing
bmap btree block merges and free space btree block splits to occur.
These btree manipulations are occurring with the call chain already
30 functions deep and hence there is not enough stack space to
complete such operations.

To avoid these specific overruns, we need to prevent the page cache
allocation from recursing via direct reclaim. We can do that because
the allocation functions take the allocation context from that which
is stored in the mapping for the inode. We don't set that right now,
so the default is GFP_HIGHUSER_MOVABLE, which is effectively a
GFP_KERNEL context. We need it to be the equivalent of GFP_NOFS, so
when we initialise an inode, set the mapping gfp mask appropriately.

This makes the use of AOP_FLAG_NOFS redundant from other parts of
the XFS IO path, so get rid of it.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: don't break from growfs ag update loop on error</title>
<updated>2014-07-04T08:06:56+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@sandeen.net</email>
</author>
<published>2013-10-11T19:14:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e465c9574dc68d85ec16663b062c93763a6435e4'/>
<id>urn:sha1:e465c9574dc68d85ec16663b062c93763a6435e4</id>
<content type='text'>
commit 59e5a0e821d838854b3afd030d31f82cee3ecd58 upstream.

When xfs_growfs_data_private() is updating backup superblocks,
it bails out on the first error encountered, whether reading or
writing:

* If we get an error writing out the alternate superblocks,
* just issue a warning and continue.  The real work is
* already done and committed.

This can cause a problem later during repair, because repair
looks at all superblocks, and picks the most prevalent one
as correct.  If we bail out early in the backup superblock
loop, we can end up with more "bad" matching superblocks than
good, and a post-growfs repair may revert the filesystem to
the old geometry.
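
The effect of bailing out early can be sketched with a toy model in
Python. This is only an illustration under stated assumptions: the
function names, the "old"/"new" labels, and the counts are hypothetical,
and real repair logic is more involved than a simple majority vote.

```python
from collections import Counter


def update_superblocks(n_backups, fail_at, stop_on_error):
    """Toy model of the backup superblock update loop."""
    sbs = ["new"]  # the primary superblock is already updated
    stopped = False
    for agno in range(1, n_backups + 1):
        if agno == fail_at:
            sbs.append("old")  # this update failed
            stopped = stop_on_error
        elif stopped:
            sbs.append("old")  # never reached after bailing out
        else:
            sbs.append("new")
    return sbs


def prevalent_geometry(sbs):
    """Geometry held by the largest number of superblocks."""
    return Counter(sbs).most_common(1)[0][0]


print(prevalent_geometry(update_superblocks(7, 2, stop_on_error=True)))
print(prevalent_geometry(update_superblocks(7, 2, stop_on_error=False)))
```

In this model, stopping at the first error leaves the stale geometry in
the majority, while continuing past the error leaves only one stale
superblock.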

With the combination of superblock verifiers and old bugs,
we're more likely to encounter read errors due to verification.

And perhaps even worse, we don't even properly write any of the
newly-added superblocks in the new AGs.

Even with this change, growfs will still say:

  xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Structure needs cleaning
  data blocks changed from 319815680 to 335216640

which might be confusing to the user, but it at least communicates
that something has gone wrong, and dmesg will probably highlight
the need for an xfs_repair.

And this is still best-effort; if verifiers fail on more than
half the backup supers, they may still "win" - but that's probably
best left to repair to more gracefully handle by doing its own
strict verification as part of the backup super "voting."

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Acked-by: Dave Chinner &lt;david@fromorbit.com&gt;
Reviewed-by: Mark Tinguely &lt;tinguely@sgi.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: don't emit corruption noise on fs probes</title>
<updated>2014-07-04T08:06:55+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@sandeen.net</email>
</author>
<published>2013-10-11T19:12:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22a16208d5e73c9b84629790c28df0df4183bba4'/>
<id>urn:sha1:22a16208d5e73c9b84629790c28df0df4183bba4</id>
<content type='text'>
commit 31625f28ad7be67701dc4cefcf52087addd88af4 upstream.

If we get EWRONGFS due to probing of non-xfs filesystems,
there's no need to issue the scary corruption error and backtrace.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Reviewed-by: Mark Tinguely &lt;tinguely@sgi.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: prevent deadlock trying to cover an active log</title>
<updated>2014-07-04T08:06:55+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2013-10-14T22:17:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=192fedc5d3840a89a2d31f0f9cb7195f52a11b9d'/>
<id>urn:sha1:192fedc5d3840a89a2d31f0f9cb7195f52a11b9d</id>
<content type='text'>
commit 2c6e24ce1aa6b3b147c75d488c2797ee258eb22b upstream.

Recent analysis of a deadlocked XFS filesystem from a kernel
crash dump indicated that the filesystem was stuck waiting for log
space. The short story of the hang on the RHEL6 kernel is this:

	- the tail of the log is pinned by an inode
	- the inode has been pushed by the xfsaild
	- the inode has been flushed to its backing buffer and is
	  currently flush locked and hence waiting for backing
	  buffer IO to complete and remove it from the AIL
	- the backing buffer is marked for write - it is on the
	  delayed write queue
	- the inode buffer has been modified directly and logged
	  recently due to unlinked inode list modification
	- the backing buffer is pinned in memory as it is in the
	  active CIL context.
	- the xfsbufd won't start buffer writeback because it is
	  pinned
	- xfssyncd won't force the log because it sees the log as
	  needing to be covered and hence wants to issue a dummy
	  transaction to move the log covering state machine along.

Hence there is no trigger to force the CIL to the log and hence
unpin the inode buffer and therefore complete the inode IO, remove
it from the AIL and hence move the tail of the log along, allowing
transactions to start again.

Mainline kernels also have the same deadlock, though the signature
is slightly different - the inode buffer never reaches the delayed
write lists because xfs_buf_item_push() sees that it is pinned and
hence never adds it to the delayed write list that the xfsaild
flushes.

There are two possible solutions here. The first is to simply force
the log before trying to cover the log and so ensure that the CIL is
emptied before we try to reserve space for the dummy transaction in
the xfs_log_worker(). While this might work most of the time, it is
still racy and is no guarantee that we don't get stuck in
xfs_trans_reserve waiting for log space to come free. Hence it's not
the best way to solve the problem.

The second solution is to modify xfs_log_need_covered() to be aware
of the CIL. We only should be attempting to cover the log if there
is no current activity in the log - covering the log is the process
of ensuring that the head and tail in the log on disk are identical
(i.e. the log is clean and at idle). Hence, by definition, if there
are items in the CIL then the log is not at idle and so we don't
need to attempt to cover it.

When we don't need to cover the log because it is active or idle, we
issue a log force from xfs_log_worker() - if the log is idle, then
this does nothing.  However, if the log is active due to there being
items in the CIL, it will force the items in the CIL to the log and
unpin them.

In the case of the above deadlock scenario, instead of
xfs_log_worker() getting stuck in xfs_trans_reserve() attempting to
cover the log, it will instead force the log, thereby unpinning the
inode buffer, allowing IO to be issued and complete and hence
removing the inode that was pinning the tail of the log from the
AIL. At that point, everything will start moving along again. i.e.
the xfs_log_worker turns back into a watchdog that can alleviate
deadlocks based around pinned items that prevent the tail of the log
from being moved...

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>xfs: fix the wrong new_size/rnew_size at xfs_iext_realloc_direct()</title>
<updated>2014-07-04T08:06:54+00:00</updated>
<author>
<name>Jie Liu</name>
<email>jeff.liu@oracle.com</email>
</author>
<published>2013-09-22T08:25:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5b09381286f17ce95e334c32629584a1cad13775'/>
<id>urn:sha1:5b09381286f17ce95e334c32629584a1cad13775</id>
<content type='text'>
commit 17ec81c15fd022842f9bc947841ba9fb9eb52591 upstream.

At xfs_iext_realloc_direct(), new_size is wrongly increased by if_bytes
when the extent records were originally stored in the inline extent
buffer and we have to switch to a direct extent list for the newly
allocated extents. For example:

Create a file with three extents as follows:

xfs_io -f -c "truncate 100m" /xfs/testme

for i in $(seq 0 5 10); do
	offset=$(($i * $((1 &lt;&lt; 20))))
	xfs_io -c "pwrite $offset 1m" /xfs/testme
done

Inline
------
irec:	if_bytes	bytes_diff	new_size
1st	0		16		16
2nd	16		16		32

Switching
---------						rnew_size
3rd	32		16		48 + 32 = 80	roundup=128

In this case, the desired value of new_size is 48, which is then
rounded up to 64 and assigned to rnew_size.
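
The arithmetic in the table above can be reproduced with a small Python
sketch. It is an illustration, not the kernel code: the helper names are
hypothetical, and the 16-byte record size is taken from the bytes_diff
column.

```python
REC_SIZE = 16  # bytes per extent record, per the bytes_diff column


def roundup_pow_of_two(n):
    """Smallest power of two that is not below n (n must be positive)."""
    return 2 ** (n - 1).bit_length()


def sizes_buggy(if_bytes, nextents):
    # Old behaviour: if_bytes is added on top of the full new size.
    new_size = nextents * REC_SIZE + if_bytes
    return new_size, roundup_pow_of_two(new_size)


def sizes_fixed(nextents):
    # Desired behaviour: new_size already covers every extent record.
    new_size = nextents * REC_SIZE
    return new_size, roundup_pow_of_two(new_size)


# Third extent, with two records (32 bytes) already held inline.
print(sizes_buggy(32, 3))
print(sizes_fixed(3))
```

The first call yields new_size 80 and rnew_size 128, as in the
"Switching" row; the second yields the desired 48 and 64.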

However, this issue has been masked: before returning, xfs_iext_add()
resets if_bytes to the new_size calculated at its beginning, which in
turn makes rnew_size correct again. Hence, this cannot be detected via
xfstests.

This patch fixes the above problem and revises the new_size comments at
xfs_iext_realloc_direct() to make it more readable.  Also, fix the
comments while switching from the inline extent buffer to a direct
extent list to reflect this change.

Signed-off-by: Jie Liu &lt;jeff.liu@oracle.com&gt;
Reviewed-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Signed-off-by: Ben Myers &lt;bpm@sgi.com&gt;

Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
</feed>
