kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2017-12-20	ceph: drop negative child dentries before try pruning inode's alias	Yan, Zheng	1	-4/+38
	commit 040d786032bf59002d374b86d75b04d97624005c upstream. Negative child dentry holds reference on inode's alias, it makes d_prune_aliases() do nothing. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20	ovl: update ctx->pos on impure dir iteration	Amir Goldstein	1	-1/+4
	commit b02a16e6413a2f782e542ef60bad9ff6bf212f8a upstream. This fixes a regression with readdir of impure dir in overlayfs that is shared to VM via 9p fs. Reported-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Fixes: 4edb83bb1041 ("ovl: constant d_ino for non-merge dirs") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Tested-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20	ovl: Pass ovl_get_nlink() parameters in right order	Vivek Goyal	1	-1/+1
	commit 08d8f8a5b094b66b29936e8751b4a818b8db1207 upstream. Right now we seem to be passing index as "lowerdentry" and origin.dentry as "upperdentry". IIUC, we should pass these parameters in reversed order and this looks like a bug. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Fixes: caf70cb2ba5d ("ovl: cleanup orphan index entries") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20	kernel: make groups_sort calling a responsibility group_info allocators	Thiago Rafael Becker	1	-0/+3
	commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream. In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20	autofs: fix careless error in recent commit	NeilBrown	1	-1/+0
	commit 302ec300ef8a545a7fc7f667e5fd743b091c2eeb upstream. Commit ecc0c469f277 ("autofs: don't fail mount for transient error") was meant to replace an 'if' with a 'switch', but instead added the 'switch' leaving the case in place. Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name Fixes: ecc0c469f277 ("autofs: don't fail mount for transient error") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: NeilBrown <neilb@suse.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20	cifs: fix NULL deref in SMB2_read	Ronnie Sahlberg	1	-15/+15
	commit a821df3f1af72aa6a0d573eea94a7dd2613e9f4e upstream. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-17	fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall	Jeff Layton	1	-6/+5
	commit 4d2dc2cc766c3b51929658cacbc6e34fc8e242fb upstream. Currently, we're capping the values too low in the F_GETLK64 case. The fields in that structure are 64-bit values, so we shouldn't need to do any sort of fixup there. Make sure we check that assumption at build time in the future however by ensuring that the sizes we're copying will fit. With this, we no longer need COMPAT_LOFF_T_MAX either, so remove it. Fixes: 94073ad77fff2 (fs/locks: don't mess with the address limit in compat_fcntl64) Reported-by: Vitaly Lipatov <lav@etersoft.ru> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	afs: Connect up the CB.ProbeUuid	David Howells	1	-0/+3
	[ Upstream commit f4b3526d83c40dd8bf5948b9d7a1b2c340f0dcc8 ] The handler for the CB.ProbeUuid operation in the cache manager is implemented, but isn't listed in the switch-statement of operation selection, so won't be used. Fix this by adding it. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	afs: Fix total-length calculation for multiple-page send	David Howells	1	-2/+11
	[ Upstream commit 1199db603511d7463d9d3840f96f61967affc766 ] Fix the total-length calculation in afs_make_call() when the operation being dispatched has data from a series of pages attached. Despite the patched code looking like that it should reduce mathematically to the current code, it doesn't because the 32-bit unsigned arithmetic being used to calculate the page-offset-difference doesn't correctly extend to a 64-bit value when the result is effectively negative. Without this, some FS.StoreData operations that span multiple pages fail, reporting too little or too much data. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	f2fs: fix to clear FI_NO_PREALLOC	Chao Yu	1	-0/+1
	[ Upstream commit 28cfafb73853f0494b06649716687a3ea07681d5 ] We need to clear FI_NO_PREALLOC flag in error path of f2fs_file_write_iter, otherwise we will lose the chance to preallocate blocks in latter write() at one time. Fixes: dc91de78e5e1 ("f2fs: do not preallocate blocks which has wrong buffer") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	fcntl: don't leak fd reference when fixup_compat_flock fails	Jeff Layton	1	-3/+2
	[ Upstream commit 9280a601e6080c9ff658468c1c775ff6514099a6 ] Currently we just return err here, but we need to put the fd reference first. Fixes: 94073ad77fff (fs/locks: don't mess with the address limit in compat_fcntl64) Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	xfs: fix forgotten rcu read unlock when skipping inode reclaim	Darrick J. Wong	1	-0/+1
	[ Upstream commit 962cc1ad6caddb5abbb9f0a43e5abe7131a71f18 ] In commit f2e9ad21 ("xfs: check for race with xfs_reclaim_inode"), we skip an inode if we're racing with freeing the inode via xfs_reclaim_inode, but we forgot to release the rcu read lock when dumping the inode, with the result that we exit to userspace with a lock held. Don't do that; generic/320 with a 1k block size fails this very occasionally. ================================================ WARNING: lock held when returning to user space! 4.14.0-rc6-djwong #4 Tainted: G W ------------------------------------------------ rm/30466 is leaving the kernel with locks still held! 1 lock held by rm/30466: #0: (rcu_read_lock){....}, at: [<ffffffffa01364d3>] xfs_ifree_cluster.isra.17+0x2c3/0x6f0 [xfs] ------------[ cut here ]------------ WARNING: CPU: 1 PID: 30466 at kernel/rcu/tree_plugin.h:329 rcu_note_context_switch+0x71/0x700 Modules linked in: deadline_iosched dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: scsi_debug] CPU: 1 PID: 30466 Comm: rm Tainted: G W 4.14.0-rc6-djwong #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014 task: ffff880037680000 task.stack: ffffc90001064000 RIP: 0010:rcu_note_context_switch+0x71/0x700 RSP: 0000:ffffc90001067e50 EFLAGS: 00010002 RAX: 0000000000000001 RBX: ffff880037680000 RCX: ffff88003e73d200 RDX: 0000000000000002 RSI: ffffffff819e53e9 RDI: ffffffff819f4375 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff880062c900d0 R10: 0000000000000000 R11: 0000000000000000 R12: ffff880037680000 R13: 0000000000000000 R14: ffffc90001067eb8 R15: ffff880037680690 FS: 00007fa3b8ce8700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f69bf77c000 CR3: 000000002450a000 CR4: 00000000000006e0 Call Trace: __schedule+0xb8/0xb10 schedule+0x40/0x90 exit_to_usermode_loop+0x6b/0xa0 prepare_exit_to_usermode+0x7a/0x90 retint_user+0x8/0x20 RIP: 0033:0x7fa3b87fda87 RSP: 002b:00007ffe41206568 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff02 RAX: 0000000000000000 RBX: 00000000010e88c0 RCX: 00007fa3b87fda87 RDX: 0000000000000000 RSI: 00000000010e89c8 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000 R10: 000000000000015e R11: 0000000000000246 R12: 00000000010c8060 R13: 00007ffe41206690 R14: 0000000000000000 R15: 0000000000000000 ---[ end trace e88f83bf0cfbd07d ]--- Fixes: f2e9ad212def50bcf4c098c6288779dd97fff0f0 Cc: Omar Sandoval <osandov@fb.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	NFS: Fix a typo in nfs_rename()	Trond Myklebust	1	-1/+1
	[ Upstream commit d803224c84be067754db7fa58a93f36f61566493 ] On successful rename, the "old_dentry" is retained and is attached to the "new_dir", so we need to call nfs_set_verifier() accordingly. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	pipe: match pipe_max_size data type with procfs	Joe Lawrence	1	-1/+1
	[ Upstream commit 98159d977f71c3b3dee898d1c34e56f520b094e7 ] Patch series "A few round_pipe_size() and pipe-max-size fixups", v3. While backporting Michael's "pipe: fix limit handling" patchset to a distro-kernel, Mikulas noticed that current upstream pipe limit handling contains a few problems: 1 - procfs signed wrap: echo'ing a large number into /proc/sys/fs/pipe-max-size and then cat'ing it back out shows a negative value. 2 - round_pipe_size() nr_pages overflow on 32bit: this would subsequently try roundup_pow_of_two(0), which is undefined. 3 - visible non-rounded pipe-max-size value: there is no mutual exclusion or protection between the time pipe_max_size is assigned a raw value from proc_dointvec_minmax() and when it is rounded. 4 - unsigned long -> unsigned int conversion makes for potential odd return errors from do_proc_douintvec_minmax_conv() and do_proc_dopipe_max_size_conv(). This version underwent the same testing as v1: https://marc.info/?l=linux-kernel&m=150643571406022&w=2 This patch (of 4): pipe_max_size is defined as an unsigned int: unsigned int pipe_max_size = 1048576; but its procfs/sysctl representation is an integer: static struct ctl_table fs_table[] = { ... { .procname = "pipe-max-size", .data = &pipe_max_size, .maxlen = sizeof(int), .mode = 0644, .proc_handler = &pipe_proc_fn, .extra1 = &pipe_min_size, }, ... that is signed: int pipe_proc_fn(struct ctl_table table, int write, void __user buf, size_t lenp, loff_t ppos) { ... ret = proc_dointvec_minmax(table, write, buf, lenp, ppos) This leads to signed results via procfs for large values of pipe_max_size: % echo 2147483647 >/proc/sys/fs/pipe-max-size % cat /proc/sys/fs/pipe-max-size -2147483648 Use unsigned operations on this variable to avoid such negative values. Link: http://lkml.kernel.org/r/1507658689-11669-2-git-send-email-joe.lawrence@redhat.com Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	btrfs: handle errors while updating refcounts in update_ref_for_cow	Jeff Mahoney	1	-6/+12
	commit 692826b2738101549f032a761a9191636e83be4e upstream. Since commit fb235dc06fa (btrfs: qgroup: Move half of the qgroup accounting time out of commit trans) the assumption that btrfs_add_delayed_{data,tree}_ref can only return 0 or -ENOMEM has been false. The qgroup operations call into btrfs_search_slot and friends and can now return the full spectrum of error codes. Fortunately, the fix here is easy since update_ref_for_cow failing is already handled so we just need to bail early with the error code. Fixes: fb235dc06fa (btrfs: qgroup: Move half of the qgroup accounting ...) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Edmund Nadolski <enadolski@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-14	btrfs: fix missing error return in btrfs_drop_snapshot	Jeff Mahoney	1	-0/+1
	commit e19182c0fff451e3744c1107d98f072e7ca377a0 upstream. If btrfs_del_root fails in btrfs_drop_snapshot, we'll pick up the error but then return 0 anyway due to mixing err and ret. Fixes: 79787eaab4612 ("btrfs: replace many BUG_ONs with proper error handling") Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	lockd: lost rollback of set_grace_period() in lockd_down_net()	Vasily Averin	1	-0/+2
	commit 3a2b19d1ee5633f76ae8a88da7bc039a5d1732aa upstream. Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect, it removes lockd_manager and disarm grace_period_end for init_net only. If nfsd was started from another net namespace lockd_up_net() calls set_grace_period() that adds lockd_manager into per-netns list and queues grace_period_end delayed work. These action should be reverted in lockd_down_net(). Otherwise it can lead to double list_add on after restart nfsd in netns, and to use-after-free if non-disarmed delayed work will be executed after netns destroy. Fixes: efda760fe95e ("lockd: fix lockd shutdown race") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat	Naofumi Honda	1	-2/+2
	commit 64ebe12494fd5d193f014ce38e1fd83cc57883c8 upstream. From kernel 4.9, my two nfsv4 servers sometimes suffer from "panic: unable to handle kernel page request" in posix_unblock_lock() called from nfs4_laundromat(). These panics diseappear if we revert the commit "nfsd: add a LRU list for blocked locks". The cause appears to be a typo in nfs4_laundromat(), which is also present in nfs4_state_shutdown_net(). Fixes: 7919d0a27f1e "nfsd: add a LRU list for blocked locks" Cc: jlayton@redhat.com Reveiwed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	nfsd: Fix another OPEN stateid race	Trond Myklebust	1	-15/+13
	commit d8a1a000555ecd1b824ac1ed6df8fe364dfbbbb0 upstream. If nfsd4_process_open2() is initialising a new stateid, and yet the call to nfs4_get_vfs_file() fails for some reason, then we must declare the stateid closed, and unhash it before dropping the mutex. Right now, we unhash the stateid after dropping the mutex, and without changing the stateid type, meaning that another OPEN could theoretically look it up and attempt to use it. Reported-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	nfsd: Fix stateid races between OPEN and CLOSE	Trond Myklebust	1	-8/+59
	commit 15ca08d3299682dc49bad73251677b2c5017ef08 upstream. Open file stateids can linger on the nfs4_file list of stateids even after they have been closed. In order to avoid reusing such a stateid, and confusing the client, we need to recheck the nfs4_stid's type after taking the mutex. Otherwise, we risk reusing an old stateid that was already closed, which will confuse clients that expect new stateids to conform to RFC7530 Sections 9.1.4.2 and 16.2.5 or RFC5661 Sections 8.2.2 and 18.2.4. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	btrfs: clear space cache inode generation always	Josef Bacik	1	-7/+7
	commit 8e138e0d92c6c9d3d481674fb14e3439b495be37 upstream. We discovered a box that had double allocations, and suspected the space cache may be to blame. While auditing the write out path I noticed that if we've already setup the space cache we will just carry on. This means that any error we hit after cache_save_setup before we go to actually write the cache out we won't reset the inode generation, so whatever was already written will be considered correct, except it'll be stale. Fix this by _always_ resetting the generation on the block group inode, this way we only ever have valid or invalid cache. With this patch I was no longer able to reproduce cache corruption with dm-log-writes and my bpf error injection tool. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored"	Ian Kent	1	-12/+3
	commit 5d38f049cee1e1c4a7ac55aa79d37d01ddcc3860 upstream. Commit 42f461482178 ("autofs: fix AT_NO_AUTOMOUNT not being honored") allowed the fstatat(2) system call to properly honor the AT_NO_AUTOMOUNT flag but introduced a semantic change. In order to honor AT_NO_AUTOMOUNT a semantic change was made to the negative dentry case for stat family system calls in follow_automount(). This changed the unconditional triggering of an automount in this case to no longer be done and an error returned instead. This has caused more problems than I expected so reverting the change is needed. In a discussion with Neil Brown it was concluded that the automount(8) daemon can implement this change without kernel modifications. So that will be done instead and the autofs module documentation updated with a description of the problem and what needs to be done by module users for this specific case. Link: http://lkml.kernel.org/r/151174730120.6162.3848002191530283984.stgit@pluto.themaw.net Fixes: 42f4614821 ("autofs: fix AT_NO_AUTOMOUNT not being honored") Signed-off-by: Ian Kent <raven@themaw.net> Cc: Neil Brown <neilb@suse.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Colin Walters <walters@redhat.com> Cc: Ondrej Holy <oholy@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	autofs: revert "autofs: take more care to not update last_used on path walk"	Ian Kent	1	-11/+6
	commit 43694d4bf843ddd34519e8e9de983deefeada699 upstream. While commit 092a53452bb7 ("autofs: take more care to not update last_used on path walk") helped (partially) resolve a problem where automounts were not expiring due to aggressive accesses from user space it has a side effect for very large environments. This change helps with the expire problem by making the expire more aggressive but, for very large environments, that means more mount requests from clients. When there are a lot of clients that can mean fairly significant server load increases. It turns out I put the last_used in this position to solve this very problem and failed to update my own thinking of the autofs expire policy. So the patch being reverted introduces a regression which should be fixed. Link: http://lkml.kernel.org/r/151174729420.6162.1832622523537052460.stgit@pluto.themaw.net Fixes: 092a53452b ("autofs: take more care to not update last_used on path walk") Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: NeilBrown <neilb@suse.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Colin Walters <walters@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Ondrej Holy <oholy@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	fs/fat/inode.c: fix sb_rdonly() change	OGAWA Hirofumi	1	-1/+1
	commit b6e8e12c0aeb5fbf1bf46c84d58cc93aedede385 upstream. Commit bc98a42c1f7d ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") converted fat_remount():new_rdonly from a bool to an int. However fat_remount() depends upon the compiler's conversion of a non-zero integer into boolean `true'. Fix it by switching `new_rdonly' back into a bool. Link: http://lkml.kernel.org/r/87mv3d5x51.fsf@mail.parknet.co.jp Fixes: bc98a42c1f7d0f8 ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Joe Perches <joe@perches.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05	exec: avoid RLIMIT_STACK races with prlimit()	Kees Cook	1	-1/+6
	commit 04e35f4495dd560db30c25efca4eecae8ec8c375 upstream. While the defense-in-depth RLIMIT_STACK limit on setuid processes was protected against races from other threads calling setrlimit(), I missed protecting it against races from external processes calling prlimit(). This adds locking around the change and makes sure that rlim_max is set too. Link: http://lkml.kernel.org/r/20171127193457.GA11348@beast Fixes: 64701dee4178e ("exec: Use sane stack rlimit under secureexec") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Reported-by: Brad Spengler <spender@grsecurity.net> Acked-by: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	dax: fix PMD faults on zero-length files	Jeff Moyer	1	-3/+3
	commit 957ac8c421ad8b5eef9b17fe98e146d8311a541e upstream. PMD faults on a zero length file on a file system mounted with -o dax will not generate SIGBUS as expected. fd = open(...O_TRUNC); addr = mmap(NULL, 210241024, PROT_READ\|PROT_WRITE, MAP_SHARED, fd, 0); *addr = 'a'; <expect SIGBUS> The problem is this code in dax_iomap_pmd_fault: max_pgoff = (i_size_read(inode) - 1) >> PAGE_SHIFT; If the inode size is zero, we end up with a max_pgoff that is way larger than 0. :) Fix it by using DIV_ROUND_UP, as is done elsewhere in the kernel. I tested this with some simple test code that ensured that SIGBUS was received where expected. Fixes: 642261ac995e ("dax: add struct iomap based DAX PMD support") Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	lockd: double unregister of inetaddr notifiers	Vasily Averin	1	-11/+9
	commit dc3033e16c59a2c4e62b31341258a5786cbcee56 upstream. lockd_up() can call lockd_unregister_notifiers twice: inside lockd_start_svc() when it calls lockd_svc_exit_thread() and then in error path of lockd_up() Patch forces lockd_start_svc() to unregister notifiers in all error cases and removes extra unregister in error path of lockd_up(). Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..." Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fs/9p: Compare qid.path in v9fs_test_inode	Tuomas Tynkkynen	2	-0/+6
	commit 8ee031631546cf2f7859cc69593bd60bbdd70b46 upstream. Commit fd2421f54423 ("fs/9p: When doing inode lookup compare qid details and inode mode bits.") transformed v9fs_qid_iget() to use iget5_locked() instead of iget_locked(). However, the test() callback is not checking fid.path at all, which means that a lookup in the inode cache can now accidentally locate a completely wrong inode from the same inode hash bucket if the other fields (qid.type and qid.version) match. Fixes: fd2421f54423 ("fs/9p: When doing inode lookup compare qid details and inode mode bits.") Reviewed-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	ext4: prevent data corruption with journaling + DAX	Ross Zwisler	2	-8/+13
	commit e9072d859df3e0f2c3ba450f0d1739595c2d5d13 upstream. The current code has the potential for data corruption when changing an inode's journaling mode, as that can result in a subsequent unsafe change in S_DAX. I've captured an instance of this data corruption in the following fstest: https://patchwork.kernel.org/patch/9948377/ Prevent this data corruption from happening by disallowing changes to the journaling mode if the '-o dax' mount option was used. This means that for a given filesystem we could have a mix of inodes using either DAX or data journaling, but whatever state the inodes are in will be held for the duration of the mount. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	ext4: prevent data corruption with inline data + DAX	Ross Zwisler	2	-10/+5
	commit 559db4c6d784ceedc2a5418ced4d357cb843e221 upstream. If an inode has inline data it is currently prevented from using DAX by a check in ext4_set_inode_flags(). When the inode grows inline data via ext4_create_inline_data() or removes its inline data via ext4_destroy_inline_data_nolock(), the value of S_DAX can change. Currently these changes are unsafe because we don't hold off page faults and I/O, write back dirty radix tree entries and invalidate all mappings. There are also issues with mm-level races when changing the value of S_DAX, as well as issues with the VM_MIXEDMAP flag: https://www.spinics.net/lists/linux-xfs/msg09859.html The unsafe transition of S_DAX can reliably cause data corruption, as shown by the following fstest: https://patchwork.kernel.org/patch/9948381/ Fix this issue by preventing the DAX mount option from being used on filesystems that were created to support inline data. Inline data is an option given to mkfs.ext4. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	ext4: fix interaction between i_size, fallocate, and delalloc after a crash	Theodore Ts'o	1	-2/+4
	commit 51e3ae81ec58e95f10a98ef3dd6d7bce5d8e35a2 upstream. If there are pending writes subject to delayed allocation, then i_size will show size after the writes have completed, while i_disksize contains the value of i_size on the disk (since the writes have not been persisted to disk). If fallocate(2) is called with the FALLOC_FL_KEEP_SIZE flag, either with or without the FALLOC_FL_ZERO_RANGE flag set, and the new size after the fallocate(2) is between i_size and i_disksize, then after a crash, if a journal commit has resulted in the changes made by the fallocate() call to be persisted after a crash, but the delayed allocation write has not resolved itself, i_size would not be updated, and this would cause the following e2fsck complaint: Inode 12, end of extent exceeds allowed value (logical block 33, physical block 33441, len 7) This can only take place on a sparse file, where the fallocate(2) call is allocating blocks in a range which is before a pending delayed allocation write which is extending i_size. Since this situation is quite rare, and the window in which the crash must take place is typically < 30 seconds, in practice this condition will rarely happen. Nevertheless, it can be triggered in testing, and in particular by xfstests generic/456. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fsnotify: fix pinning group in fsnotify_prepare_user_wait()	Miklos Szeredi	1	-14/+11
	commit 9a31d7ad997f55768c687974ce36b759065b49e5 upstream. Blind increment of group's user_waits is not enough, we could be far enough in the group's destruction that it isn't taken into account (i.e. grabbing the mark ref afterwards doesn't guarantee that it was the ref coming from the _group_ that was grabbed). Instead we need to check (under lock) that the mark is still attached to the group after having obtained a ref to the mark. If not, skip it. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 9385a84d7e1f ("fsnotify: Pass fsnotify_iter_info into handle_event handler") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fsnotify: pin both inode and vfsmount mark	Miklos Szeredi	1	-3/+7
	commit 0d6ec079d6aaa098b978d6395973bb027c752a03 upstream. We may fail to pin one of the marks in fsnotify_prepare_user_wait() when dropping the srcu read lock, resulting in use after free at the next iteration. Solution is to store both marks in iter_info instead of just the one we'll be sending the event for. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 9385a84d7e1f ("fsnotify: Pass fsnotify_iter_info into handle_event handler") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fsnotify: clean up fsnotify_prepare/finish_user_wait()	Miklos Szeredi	1	-48/+48
	commit 24c20305c7fc8959836211cb8c50aab93ae0e54f upstream. This patch doesn't actually fix any bug, just paves the way for fixing mark and group pinning. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	nfsd: deal with revoked delegations appropriately	Andrew Elble	1	-1/+24
	commit 95da1b3a5aded124dd1bda1e3cdb876184813140 upstream. If a delegation has been revoked by the server, operations using that delegation should error out with NFS4ERR_DELEG_REVOKED in the >4.1 case, and NFS4ERR_BAD_STATEID otherwise. The server needs NFSv4.1 clients to explicitly free revoked delegations. If the server returns NFS4ERR_DELEG_REVOKED, the client will do that; otherwise it may just forget about the delegation and be unable to recover when it later sees SEQ4_STATUS_RECALLABLE_STATE_REVOKED set on a SEQUENCE reply. That can cause the Linux 4.1 client to loop in its stage manager. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	NFS: revalidate "." etc correctly on "open".	NeilBrown	1	-2/+2
	commit b688741cb06695312f18b730653d6611e1bad28d upstream. For correct close-to-open semantics, NFS must validate the change attribute of a directory (or file) on open. Since commit ecf3d1f1aa74 ("vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op"), open() of "." or a path ending ".." is not revalidated reliably (except when that direct is a mount point). Prior to that commit, "." was revalidated using nfs_lookup_revalidate() which checks the LOOKUP_OPEN flag and forces revalidation if the flag is set. Since that commit, nfs_weak_revalidate() is used for NFSv3 (which ignores the flags) and nothing is used for NFSv4. This is fixed by using nfs_lookup_verify_inode() in nfs_weak_revalidate(). This does the revalidation exactly when needed. Also, add a definition of .d_weak_revalidate for NFSv4. The incorrect behavior is easily demonstrated by running "echo " in some non-mountpoint NFS directory while watching network traffic. Without this patch, "echo " sometimes doesn't produce any traffic. With the patch it always does. Fixes: ecf3d1f1aa74 ("vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	NFS: Avoid RCU usage in tracepoints	Anna Schumaker	1	-18/+6
	commit 3944369db701f075092357b511fd9f5755771585 upstream. There isn't an obvious way to acquire and release the RCU lock during a tracepoint, so we can't use the rpc_peeraddr2str() function here. Instead, rely on the client's cl_hostname, which should have similar enough information without needing an rcu_dereference(). Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	nfs: Fix ugly referral attributes	Chuck Lever	1	-10/+8
	commit c05cefcc72416a37eba5a2b35f0704ed758a9145 upstream. Before traversing a referral and performing a mount, the mounted-on directory looks strange: dr-xr-xr-x. 2 4294967294 4294967294 0 Dec 31 1969 dir.0 nfs4_get_referral is wiping out any cached attributes with what was returned via GETATTR(fs_locations), but the bit mask for that operation does not request any file attributes. Retrieve owner and timestamp information so that the memcpy in nfs4_get_referral fills in more attributes. Changes since v1: - Don't request attributes that the client unconditionally replaces - Request only MOUNTED_ON_FILEID or FILEID attribute, not both - encode_fs_locations() doesn't use the third bitmask word Fixes: 6b97fd3da1ea ("NFSv4: Follow a referral") Suggested-by: Pradeep Thomas <pradeepthomas@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	NFS: Revert "NFS: Move the flock open mode check into nfs_flock()"	Benjamin Coddington	2	-16/+16
	commit fcfa447062b2061e11f68b846d61cbfe60d0d604 upstream. Commit e12937279c8b "NFS: Move the flock open mode check into nfs_flock()" changed NFSv3 behavior for flock() such that the open mode must match the lock type, however that requirement shouldn't be enforced for flock(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	NFS: Fix typo in nomigration mount option	Joshua Watt	1	-1/+1
	commit f02fee227e5f21981152850744a6084ff3fa94ee upstream. The option was incorrectly masking off all other options. Signed-off-by: Joshua Watt <JPEWhacker@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	f2fs: expose some sectors to user in inline data or dentry case	Jaegeuk Kim	1	-0/+6
	commit 5b4267d195dd887c4412e34b5a7365baa741b679 upstream. If there's some data written through inline data or dentry, we need to shouw st_blocks. This fixes reporting zero blocks even though there is small written data. Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: avoid link file for quotacheck] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	btrfs: change how we decide to commit transactions during flushing	Josef Bacik	1	-14/+28
	commit 996478ca9c460886ac147eb0d00e99841b71d31b upstream. Nikolay reported that generic/273 was failing currently with ENOSPC. Turns out this is because we get to the point where the outstanding reservations are greater than the pinned space on the fs. This is a mistake, previously we used the current reservation amount in may_commit_transaction, not the entire outstanding reservation amount. Fix this to find the minimum byte size needed to make progress in flushing, and pass that into may_commit_transaction. From there we can make a smarter decision on whether to commit the transaction or not. This fixes the failure in generic/273. From Nikolai, IOW: when we go to the final stage of deciding whether to do trans commit, instead of passing all the reservations from all tickets we just pass the reservation for the current ticket. Otherwise, in case all reservations exceed pinned space, then we don't commit transaction and fail prematurely. Before we passed num_bytes from flush_space, where num_bytes was the sum of all pending reserverations, but now all we do is take the first ticket and commit the trans if we can satisfy that. Fixes: 957780eb2788 ("Btrfs: introduce ticketed enospc infrastructure") Reported-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Tested-by: Nikolay Borisov <nborisov@suse.com> [ added Nikolai's comment ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	isofs: fix timestamps beyond 2027	Arnd Bergmann	3	-3/+3
	commit 34be4dbf87fc3e474a842305394534216d428f5d upstream. isofs uses a 'char' variable to load the number of years since 1900 for an inode timestamp. On architectures that use a signed char type by default, this results in an invalid date for anything beyond 2027. This changes the function argument to a 'u8' array, which is defined the same way on all architectures, and unambiguously lets us use years until 2155. This should be backported to all kernels that might still be in use by that date. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fanotify: fix fsnotify_prepare_user_wait() failure	Miklos Szeredi	1	-13/+20
	commit f37650f1c7c71cf5180b43229d13b421d81e7170 upstream. If fsnotify_prepare_user_wait() fails, we leave the event on the notification list. Which will result in a warning in fsnotify_destroy_event() and later use-after-free. Instead of adding a new helper to remove the event from the list in this case, I opted to move the prepare/finish up into fanotify_handle_event(). This will allow these to be moved further out into the generic code later, and perhaps let us move to non-sleeping RCU. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 05f0e38724e8 ("fanotify: Release SRCU lock when waiting for userspace response") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fs: guard_bio_eod() needs to consider partitions	Greg Edwards	1	-1/+9
	commit 67f2519fe2903c4041c0e94394d14d372fe51399 upstream. guard_bio_eod() needs to look at the partition capacity, not just the capacity of the whole device, when determining if truncation is necessary. [ 60.268688] attempt to access beyond end of device [ 60.268690] unknown-block(9,1): rw=0, want=67103509, limit=67103506 [ 60.268693] buffer_io_error: 2 callbacks suppressed [ 60.268696] Buffer I/O error on dev md1p7, logical block 4524305, async page read Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	eCryptfs: use after free in ecryptfs_release_messaging()	Dan Carpenter	1	-3/+4
	commit db86be3a12d0b6e5c5b51c2ab2a48f06329cb590 upstream. We're freeing the list iterator so we should be using the _safe() version of hlist_for_each_entry(). Fixes: 88b4a07e6610 ("[PATCH] eCryptfs: Public key transport mechanism") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	fscrypt: lock mutex before checking for bounce page pool	Eric Biggers	1	-5/+2
	commit a0b3bc855374c50b5ea85273553485af48caf2f7 upstream. fscrypt_initialize(), which allocates the global bounce page pool when an encrypted file is first accessed, uses "double-checked locking" to try to avoid locking fscrypt_init_mutex. However, it doesn't use any memory barriers, so it's theoretically possible for a thread to observe a bounce page pool which has not been fully initialized. This is a classic bug with "double-checked locking". While "only a theoretical issue" in the latest kernel, in pre-4.8 kernels the pointer that was checked was not even the last to be initialized, so it was easily possible for a crash (NULL pointer dereference) to happen. This was changed only incidentally by the large refactor to use fs/crypto/. Solve both problems in a trivial way that can easily be backported: just always take the mutex. It's theoretically less efficient, but it shouldn't be noticeable in practice as the mutex is only acquired very briefly once per encrypted file. Later I'd like to make this use a helper macro like DO_ONCE(). However, DO_ONCE() runs in atomic context, so we'd need to add a new macro that allows blocking. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	nilfs2: fix race condition that causes file system corruption	Andreas Rohner	1	-2/+4
	commit 31ccb1f7ba3cfe29631587d451cf5bb8ab593550 upstream. There is a race condition between nilfs_dirty_inode() and nilfs_set_file_dirty(). When a file is opened, nilfs_dirty_inode() is called to update the access timestamp in the inode. It calls __nilfs_mark_inode_dirty() in a separate transaction. __nilfs_mark_inode_dirty() caches the ifile buffer_head in the i_bh field of the inode info structure and marks it as dirty. After some data was written to the file in another transaction, the function nilfs_set_file_dirty() is called, which adds the inode to the ns_dirty_files list. Then the segment construction calls nilfs_segctor_collect_dirty_files(), which goes through the ns_dirty_files list and checks the i_bh field. If there is a cached buffer_head in i_bh it is not marked as dirty again. Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate transactions, it is possible that a segment construction that writes out the ifile occurs in-between the two. If this happens the inode is not on the ns_dirty_files list, but its ifile block is still marked as dirty and written out. In the next segment construction, the data for the file is written out and nilfs_bmap_propagate() updates the b-tree. Eventually the bmap root is written into the i_bh block, which is not dirty, because it was written out in another segment construction. As a result the bmap update can be lost, which leads to file system corruption. Either the virtual block address points to an unallocated DAT block, or the DAT entry will be reused for something different. The error can remain undetected for a long time. A typical error message would be one of the "bad btree" errors or a warning that a DAT entry could not be found. This bug can be reproduced reliably by a simple benchmark that creates and overwrites millions of 4k files. Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Andreas Rohner <andreas.rohner@gmx.net> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	autofs: don't fail mount for transient error	NeilBrown	1	-1/+14
	commit ecc0c469f27765ed1e2b967be0aa17cee1a60b76 upstream. Currently if the autofs kernel module gets an error when writing to the pipe which links to the daemon, then it marks the whole moutpoint as catatonic, and it will stop working. It is possible that the error is transient. This can happen if the daemon is slow and more than 16 requests queue up. If a subsequent process tries to queue a request, and is then signalled, the write to the pipe will return -ERESTARTSYS and autofs will take that as total failure. So change the code to assess -ERESTARTSYS and -ENOMEM as transient failures which only abort the current request, not the whole mountpoint. It isn't a crash or a data corruption, but having autofs mountpoints suddenly stop working is rather inconvenient. Ian said: : And given the problems with a half dozen (or so) user space applications : consuming large amounts of CPU under heavy mount and umount activity this : could happen more easily than we expect. Link: http://lkml.kernel.org/r/87y3norvgp.fsf@notabene.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30	ovl: Put upperdentry if ovl_check_origin() fails	Vivek Goyal	1	-1/+1
	commit 5455f92b54e516995a9ca45bbf790d3629c27a93 upstream. If ovl_check_origin() fails, we should put upperdentry. We have a reference on it by now. So goto out_put_upper instead of out. Fixes: a9d019573e88 ("ovl: lookup non-dir copy-up-origin by file handle") Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>