summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2016-09-19NFS pnfs data server multipath session trunkingAndy Adamson4-14/+86
Try all multipath addresses for a data server. The first address that successfully connects and creates a session is the DS mount address. All subsequent addresses are tested for session trunking and added as aliases. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS test session trunking with exchange idAndy Adamson2-11/+44
Use an async exchange id call to test for session trunking To conform with RFC 5661 section 18.35.4, the Non-Update on Existing Clientid case, save the exchange id verifier in cl_confirm and use it for the session trunking exhange id test. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS add xprt switch addrs test to match clientAndy Adamson1-1/+4
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS detect session trunkingAndy Adamson2-0/+92
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS refactor nfs4_check_serverowner_major_idAndy Adamson1-6/+5
For session trunking, to compare nfs41_exchange_id_res with existing nfs_client Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS refactor nfs4_match_clientidsAndy Adamson1-5/+5
For session trunking, to compare nfs41_exchange_id_res with exiting nfs_client. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFS setup async exchange_idAndy Adamson1-81/+134
Testing an rpc_xprt for session trunking should not delay application progress over already established transports. Setup exchange_id to be able to be an async call to test an rpc_xprt for session trunking use. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFSv4.x: Add kernel parameter to control the callback serverTrond Myklebust6-6/+21
Add support for the kernel parameter nfs.callback_nr_threads to set the number of threads that will be assigned to the callback channel. Add support for the kernel parameter nfs.nfs.max_session_cb_slots to set the maximum size of the callback channel slot table. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFSv4.x: Switch to using svc_set_num_threads() to manage the callback threadsTrond Myklebust2-66/+15
This will allow us to bump the number of callback threads at will. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFSv4.x: Fix up the global tracking of the callback serverTrond Myklebust1-5/+9
Ensure that the nfs_callback_info[] array correctly tracks the struct svc_serv. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19SUNRPC: Initialise struct svc_serv backchannel fields during __svc_create()Trond Myklebust1-3/+0
Clean up. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19NFSv4.x: Set up struct svc_serv_ops for the callback channelTrond Myklebust1-18/+39
In order to manage the threads using svc_set_num_threads, we need to fill in a few extra fields. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-19pnfs: track multiple layout types in fsinfo structureJeff Layton4-27/+29
Current NFSv4.1/pNFS client assumes that MDS supports only one layout type. While it's true for most existing servers, nevertheless, this can be change in the near future. For now, this patch just plumbs in the ability to track a list of layouts in the fsinfo structure. The existing behavior of the client is preserved, by having it just select the first entry in the list. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-17Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds3-20/+42
Pull cifs fixes from Steve French: "Small set of cifs fixes" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Move check for prefix path to within cifs_get_root() Compare prepaths when comparing superblocks Fix memory leaks in cifs_do_mount()
2016-09-16aio: mark AIO pseudo-fs noexecJann Horn1-1/+6
This ensures that do_mmap() won't implicitly make AIO memory mappings executable if the READ_IMPLIES_EXEC personality flag is set. Such behavior is problematic because the security_mmap_file LSM hook doesn't catch this case, potentially permitting an attacker to bypass a W^X policy enforced by SELinux. I have tested the patch on my machine. To test the behavior, compile and run this: #define _GNU_SOURCE #include <unistd.h> #include <sys/personality.h> #include <linux/aio_abi.h> #include <err.h> #include <stdlib.h> #include <stdio.h> #include <sys/syscall.h> int main(void) { personality(READ_IMPLIES_EXEC); aio_context_t ctx = 0; if (syscall(__NR_io_setup, 1, &ctx)) err(1, "io_setup"); char cmd[1000]; sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'", (int)getpid()); system(cmd); return 0; } In the output, "rw-s" is good, "rwxs" is bad. Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15vfs: cap dedupe request structure size at PAGE_SIZEDarrick J. Wong1-0/+4
Kirill A Shutemov reports that the kernel doesn't try to cap dest_count in any way, and uses the number to allocate kernel memory. This causes high order allocation warnings in the kernel log if someone passes in a big enough value. We should clamp the allocation at PAGE_SIZE to avoid stressing the VM. The two existing users of the dedupe ioctl never send more than 120 requests, so we can safely clamp dest_range at PAGE_SIZE, because with 4k pages we can handle up to 127 dedupe candidates. Given the max extent length of 16MB, we can end up doing 2GB of IO which is plenty. [ Note: the "offsetof()" can't overflow, because 'count' is just a 16-bit integer. That's not obvious in the limited context of the patch, so I'm noting it here because it made me go look. - Linus ] Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15vfs: fix return type of ioctl_file_dedupe_rangeDarrick J. Wong1-1/+1
All the VFS functions in the dedupe ioctl path return int status, so the ioctl handler ought to as well. Found by Coverity, CID 1350952. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-13Merge tag 'nfs-for-4.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds3-24/+44
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable patches: - We must serialise LAYOUTGET and LAYOUTRETURN to ensure correct state accounting - Fix the CREATE_SESSION slot number Bugfixes: - sunrpc: fix a UDP memory accounting regression - NFS: Fix an error reporting regression in nfs_file_write() - pNFS: Fix further layout stateid issues - RPC/rdma: Revert 3d4cf35bd4fa ("xprtrdma: Reply buffer exhaustion...") - RPC/rdma: Fix receive buffer accounting" * tag 'nfs-for-4.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.1: Fix the CREATE_SESSION slot number accounting xprtrdma: Fix receive buffer accounting xprtrdma: Revert 3d4cf35bd4fa ("xprtrdma: Reply buffer exhaustion...") pNFS: Don't forget the layout stateid if there are outstanding LAYOUTGETs pNFS: Clear out all layout segments if the server unsets lrp->res.lrs_present pNFS: Fix pnfs_set_layout_stateid() to clear NFS_LAYOUT_INVALID_STID pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised NFS: Fix error reporting in nfs_file_write() sunrpc: fix UDP memory accounting
2016-09-11NFSv4.1: Fix the CREATE_SESSION slot number accountingTrond Myklebust1-2/+10
Ensure that we conform to the algorithm described in RFC5661, section 18.36.4 for when to bump the sequence id. In essence we do it for all cases except when the RPC call timed out, or in case of the server returning NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org
2016-09-10Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "nvdimm fixes for v4.8, two of them are tagged for -stable: - Fix devm_memremap_pages() to use track_pfn_insert(). Otherwise, DAX pmd mappings end up with an uncached pgprot, and unusable performance for the device-dax interface. The device-dax interface appeared in 4.7 so this is tagged for -stable. - Fix a couple VM_BUG_ON() checks in the show_smaps() path to understand DAX pmd entries. This fix is tagged for -stable. - Fix a mis-merge of the nfit machine-check handler to flip the polarity of an if() to match the final version of the patch that Vishal sent for 4.8-rc1. Without this the nfit machine check handler never detects / inserts new 'badblocks' entries which applications use to identify lost portions of files. - For test purposes, fix the nvdimm_clear_poison() path to operate on legacy / simulated nvdimm memory ranges. Without this fix a test can set badblocks, but never clear them on these ranges. - Fix the range checking done by dax_dev_pmd_fault(). This is not tagged for -stable since this problem is mitigated by specifying aligned resources at device-dax setup time. These patches have appeared in a next release over the past week. The recent rebase you can see in the timestamps was to drop an invalid fix as identified by the updated device-dax unit tests [1]. The -mm touches have an ack from Andrew" [1]: "[ndctl PATCH 0/3] device-dax test for recent kernel bugs" https://lists.01.org/pipermail/linux-nvdimm/2016-September/006855.html * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm: allow legacy (e820) pmem region to clear bad blocks nfit, mce: Fix SPA matching logic in MCE handler mm: fix cache mode of dax pmd mappings mm: fix show_smap() for zone_device-pmd ranges dax: fix mapping size check
2016-09-10Merge tag 'for_linus_stable' of ↵Linus Torvalds3-21/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull fscrypto fixes fromTed Ts'o: "Fix some brown-paper-bag bugs for fscrypto, including one one which allows a malicious user to set an encryption policy on an empty directory which they do not own" * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fscrypto: require write access to mount to set encryption policy fscrypto: only allow setting encryption policy on directories fscrypto: add authorization check for setting encryption policy
2016-09-10fscrypto: require write access to mount to set encryption policyEric Biggers3-22/+27
Since setting an encryption policy requires writing metadata to the filesystem, it should be guarded by mnt_want_write/mnt_drop_write. Otherwise, a user could cause a write to a frozen or readonly filesystem. This was handled correctly by f2fs but not by ext4. Make fscrypt_process_policy() handle it rather than relying on the filesystem to get it right. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-10Move check for prefix path to within cifs_get_root()Sachin Prabhu1-5/+4
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>
2016-09-10Compare prepaths when comparing superblocksSachin Prabhu1-1/+20
The patch fs/cifs: make share unaccessible at root level mountable makes use of prepaths when any component of the underlying path is inaccessible. When mounting 2 separate shares having different prepaths but are other wise similar in other respects, we end up sharing superblocks when we shouldn't be doing so. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>
2016-09-10Fix memory leaks in cifs_do_mount()Sachin Prabhu3-14/+18
Fix memory leaks introduced by the patch fs/cifs: make share unaccessible at root level mountable Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb(). Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Tested-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>
2016-09-10fscrypto: only allow setting encryption policy on directoriesEric Biggers1-0/+2
The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption policy on nondirectory files. This was unintentional, and in the case of nonempty regular files did not behave as expected because existing data was not actually encrypted by the ioctl. In the case of ext4, the user could also trigger filesystem errors in ->empty_dir(), e.g. due to mismatched "directory" checksums when the kernel incorrectly tried to interpret a regular file as a directory. This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with kernels v4.6 and later. It appears that older kernels only permitted directories and that the check was accidentally lost during the refactoring to share the file encryption code between ext4 and f2fs. This patch restores the !S_ISDIR() check that was present in older kernels. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-09-10fscrypto: add authorization check for setting encryption policyEric Biggers1-0/+3
On an ext4 or f2fs filesystem with file encryption supported, a user could set an encryption policy on any empty directory(*) to which they had readonly access. This is obviously problematic, since such a directory might be owned by another user and the new encryption policy would prevent that other user from creating files in their own directory (for example). Fix this by requiring inode_owner_or_capable() permission to set an encryption policy. This means that either the caller must own the file, or the caller must have the capability CAP_FOWNER. (*) Or also on any regular file, for f2fs v4.6 and later and ext4 v4.8-rc1 and later; a separate bug fix is coming for that. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-09-10mm: fix show_smap() for zone_device-pmd rangesDan Williams1-0/+2
Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings currently results in the following VM_BUG_ONs: kernel BUG at mm/huge_memory.c:1105! task: ffff88045f16b140 task.stack: ffff88045be14000 RIP: 0010:[<ffffffff81268f9b>] [<ffffffff81268f9b>] follow_trans_huge_pmd+0x2cb/0x340 [..] Call Trace: [<ffffffff81306030>] smaps_pte_range+0xa0/0x4b0 [<ffffffff814c2755>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307656>] show_smap+0xa6/0x2b0 kernel BUG at fs/proc/task_mmu.c:585! RIP: 0010:[<ffffffff81306469>] [<ffffffff81306469>] smaps_pte_range+0x499/0x4b0 Call Trace: [<ffffffff814c2795>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307696>] show_smap+0xa6/0x2b0 These locations are sanity checking page flags that must be set for an anonymous transparent huge page, but are not set for the zone_device pages associated with dax mappings. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-09Merge branch 'for-linus' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fix from Miklos Szeredi: "This fixes a deadlock when fuse, direct I/O and loop device are combined" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: direct-io: don't dirty ITER_BVEC pages
2016-09-09Merge branch 'overlayfs-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fix from Miklos Szeredi: "This fixes a regression caused by the last pull request" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix workdir creation
2016-09-09Merge branch 'for-linus-4.8' of ↵Linus Torvalds3-8/+17
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm not proud of how long it took me to track down that one liner in btrfs_sync_log(), but the good news is the patches I was trying to blame for these problems were actually fine (sorry Filipe)" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns btrfs: do not decrease bytes_may_use when replaying extents
2016-09-07Merge branch 'for-chris' of ↵Chris Mason2-8/+16
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.8
2016-09-06btrfs: introduce tickets_id to determine whether asynchronous metadata ↵Wang Xiaoguang2-5/+7
reclaim work makes progress In btrfs_async_reclaim_metadata_space(), we use ticket's address to determine whether asynchronous metadata reclaim work is making progress. ticket = list_first_entry(&space_info->tickets, struct reserve_ticket, list); if (last_ticket == ticket) { flush_state++; } else { last_ticket = ticket; flush_state = FLUSH_DELAYED_ITEMS_NR; if (commit_cycles) commit_cycles--; } But indeed it's wrong, we should not rely on local variable's address to do this check, because addresses may be same. In my test environment, I dd one 168MB file in a 256MB fs, found that for this file, every time wait_reserve_ticket() called, local variable ticket's address is same, For above codes, assume a previous ticket's address is addrA, last_ticket is addrA. Btrfs_async_reclaim_metadata_space() finished this ticket and wake up it, then another ticket is added, but with the same address addrA, now last_ticket will be same to current ticket, then current ticket's flush work will start from current flush_state, not initial FLUSH_DELAYED_ITEMS_NR, which may result in some enospc issues(I have seen this in my test machine). Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-06Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returnsChris Mason1-0/+1
We use a btrfs_log_ctx structure to pass information into the tree log commit, and get error values out. It gets added to a per log-transaction list which we walk when things go bad. Commit d1433debe added an optimization to skip waiting for the log commit, but didn't take root_log_ctx out of the list. This patch makes sure we remove things before exiting. Signed-off-by: Chris Mason <clm@fb.com> Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4 cc: stable@vger.kernel.org # 3.15+
2016-09-05btrfs: do not decrease bytes_may_use when replaying extentsWang Xiaoguang1-3/+9
When replaying extents, there is no need to update bytes_may_use in btrfs_alloc_logged_file_extent(), otherwise it'll trigger a WARN_ON about bytes_may_use. Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely") Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-05ceph: do not modify fi->frag in need_reset_readdir()Nicolas Iooss1-1/+1
Commit f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset") modified "if (fpos_frag(new_pos) != fi->frag)" to "if (fi->frag |= fpos_frag(new_pos))" in need_reset_readdir(), thus replacing a comparison operator with an assignment one. This looks like a typo which is reported by clang when building the kernel with some warning flags: fs/ceph/dir.c:600:22: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses] } else if (fi->frag |= fpos_frag(new_pos)) { ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ fs/ceph/dir.c:600:22: note: place parentheses around the assignment to silence this warning } else if (fi->frag |= fpos_frag(new_pos)) { ^ ( ) fs/ceph/dir.c:600:22: note: use '!=' to turn this compound assignment into an inequality comparison } else if (fi->frag |= fpos_frag(new_pos)) { ^~ != Fixes: f3c4ebe65ea1 ("ceph: using hash value to compose dentry offset") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-09-05ovl: fix workdir creationMiklos Szeredi1-2/+2
Workdir creation fails in latest kernel. Fix by allowing EOPNOTSUPP as a valid return value from vfs_removexattr(XATTR_NAME_POSIX_ACL_*). Upper filesystem may not support ACL and still be perfectly able to support overlayfs. Reported-by: Martin Ziegler <ziegler@uni-freiburg.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: c11b9fdd6a61 ("ovl: remove posix_acl_default from workdir") Cc: <stable@vger.kernel.org>
2016-09-04pNFS: Don't forget the layout stateid if there are outstanding LAYOUTGETsTrond Myklebust1-1/+2
If there are outstanding LAYOUTGET rpc calls, then we want to ensure that we keep the layout stateid around so we that don't inadvertently pick up an old/misordered sequence id. The race is as follows: Client Server ====== ====== LAYOUTGET(seqid) LAYOUTGET(seqid) return LAYOUTGET(seqid+1) return LAYOUTGET(seqid+2) process LAYOUTGET(seqid+2) forget layout process LAYOUTGET(seqid+1) If it forgets the layout stateid before processing seqid+1, then the client will not check the layout->plh_barrier, and so will set the stateid with seqid+1. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-09-03Merge branch 'for-linus-4.8' of ↵Linus Torvalds3-11/+15
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm still prepping a set of fixes for btrfs fsync, just nailing down a hard to trigger memory corruption. For now, these are tested and ready." * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix one bug that process may endlessly wait for ticket in wait_reserve_ticket() Btrfs: fix endless loop in balancing block groups Btrfs: kill invalid ASSERT() in process_all_refs()
2016-09-03Merge tag 'driver-core-4.8-rc5' of ↵Linus Torvalds2-8/+28
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are three small fixes for 4.8-rc5. One for sysfs, one for kernfs, and one documentation fix, all for reported issues. All of these have been in linux-next for a while" * tag 'driver-core-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: correctly handle read offset on PREALLOC attrs documentation: drivers/core/of: fix name of of_node symlink kernfs: don't depend on d_find_any_alias() when generating notifications
2016-09-03devpts: return NULL pts 'priv' entry for non-devpts nodesLinus Torvalds1-1/+2
In commit 8ead9dd54716 ("devpts: more pty driver interface cleanups") I made devpts_get_priv() just return the dentry->fs_data directly. And because I thought it wouldn't happen, I added a warning if you ever saw a pts node that wasn't on devpts. And no, that warning never triggered under any actual real use, but you can trigger it by creating nonsensical pts nodes by hand. So just revert the warning, and make devpts_get_priv() return NULL for that case like it used to. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org # 4.6+ Cc: Eric W Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-03pNFS: Clear out all layout segments if the server unsets lrp->res.lrs_presentTrond Myklebust1-3/+6
If the server fails to set lrp->res.lrs_present in the LAYOUTRETURN reply, then that means it believes the client holds no more layout state for that file, and that the layout stateid is now invalid. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-09-03pNFS: Fix pnfs_set_layout_stateid() to clear NFS_LAYOUT_INVALID_STIDTrond Myklebust1-17/+19
If the layout was marked as invalid, we want to ensure to initialise the layout header fields correctly. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-09-03pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialisedTrond Myklebust1-0/+3
According to RFC5661, the client is responsible for serialising LAYOUTGET and LAYOUTRETURN to avoid ambiguity. Consider the case where we send both in parallel. Client Server ====== ====== LAYOUTGET(seqid=X) LAYOUTRETURN(seqid=X) LAYOUTGET return seqid=X+1 LAYOUTRETURN return seqid=X+2 Process LAYOUTRETURN Forget layout stateid Process LAYOUTGET Set seqid=X+1 The client processes the layoutget/layoutreturn in the wrong order, and since the result of the layoutreturn was to clear the only existing layout segment, the client forgets the layout stateid. When the LAYOUTGET comes in, it is treated as having a completely new stateid, and so the client sets the wrong sequence id... Fix is to check if there are outstanding LAYOUTGET requests before we send the LAYOUTRETURN (note that LAYOUGET will already wait if it sees an outstanding LAYOUTRETURN). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-09-03NFS: Fix error reporting in nfs_file_write()Trond Myklebust1-1/+4
When doing O_DSYNC writes, the actual write errors are reported through generic_write_sync(), so we must test the result. Reported-by: J. R. Okajima <hooanon05g@gmail.com> Fixes: 18290650b1c8 ("NFS: Move buffered I/O locking into nfs_file_write()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-09-02Merge branch 'overlayfs-linus' of ↵Linus Torvalds6-99/+242
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Most of this is regression fixes for posix acl behavior introduced in 4.8-rc1 (these were caught by the pjd-fstest suite). The are also miscellaneous fixes marked as stable material and cleanups. Other than overlayfs code, it touches <linux/fs.h> to add a constant with which to disable posix acl caching. No changes needed to the actual caching code, it automatically does the right thing, although later we may want to optimize this case. I'm now testing overlayfs with the following test suites to catch regressions: - unionmount-testsuite - xfstests - pjd-fstest" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: update doc ovl: listxattr: use strnlen() ovl: Switch to generic_getxattr ovl: copyattr after setting POSIX ACL ovl: Switch to generic_removexattr ovl: Get rid of ovl_xattr_noacl_handlers array ovl: Fix OVL_XATTR_PREFIX ovl: fix spelling mistake: "directries" -> "directories" ovl: don't cache acl on overlay layer ovl: use cached acl on underlying layer ovl: proper cleanup of workdir ovl: remove posix_acl_default from workdir ovl: handle umask and posix_acl_default correctly on creation ovl: don't copy up opaqueness
2016-09-02Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds1-6/+1
Pull audit fixes from Paul Moore: "Two small patches to fix some bugs with the audit-by-executable functionality we introduced back in v4.3 (both patches are marked for the stable folks)" * 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit: audit: fix exe_file access in audit_exe_compare mm: introduce get_task_exe_file
2016-09-02Merge tag 'xfs-iomap-for-linus-4.8-rc5' of ↵Linus Torvalds10-22/+40
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs and iomap fixes from Dave Chinner: "Most of these changes are small regression fixes that address problems introduced in the 4.8-rc1 window. The two fixes that aren't (IO completion fix and superblock inprogress check) are fixes for problems introduced some time ago and need to be pushed back to stable kernels. Changes in this update: - iomap FIEMAP_EXTENT_MERGED usage fix - additional mount-time feature restrictions - rmap btree query fixes - freeze/unmount io completion workqueue fix - memory corruption fix for deferred operations handling" * tag 'xfs-iomap-for-linus-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: track log done items directly in the deferred pending work item iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystems xfs: prevent dropping ioend completions during buftarg wait xfs: fix superblock inprogress check xfs: simple btree query range should look right if LE lookup fails xfs: fix some key handling problems in _btree_simple_query_range xfs: don't log the entire end of the AGF xfs: disallow mounting of realtime + rmap filesystems xfs: don't perform lookups on zero-height btrees
2016-09-01btrfs: fix one bug that process may endlessly wait for ticket in ↵Wang Xiaoguang1-5/+5
wait_reserve_ticket() If can_overcommit() in btrfs_calc_reclaim_metadata_size() returns true, btrfs_async_reclaim_metadata_space() will not reclaim metadata space, just return directly and also forget to wake up process which are waiting for their tickets, so these processes will wait endlessly. Fstests case generic/172 with mount option "-o compress=lzo" have revealed this bug in my test machine. Here if we have tickets to handle, we must handle them first. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01Btrfs: fix endless loop in balancing block groupsLiu Bo1-3/+5
Qgroup function may overwrite the saved error 'err' with 0 in case quota is not enabled, and this ends up with a endless loop in balance because we keep going back to balance the same block group. It really should use 'ret' instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>