path: root/fs
Age | Commit message | Author | Files | Lines
2019-03-13 | f2fs: remove wrong comment in f2fs_invalidate_page() | Chao Yu | 1 | -1/+0
Since 8c242db9b8c0 ("f2fs: fix stale ATOMIC_WRITTEN_PAGE private pointer"), we've started to not skip clear private flag for atomic_write page truncation, so removing old wrong comment in f2fs_invalidate_page(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix to use kvfree instead of kzfree | Chao Yu | 1 | -5/+5
As Jiqun Li reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=202747
The system can panic because the xattr interface uses a mismatched allocate/free function pair:
- use kvmalloc to allocate memory
- use kzfree to free memory
Let's fix this by using kvfree instead of kzfree. BTW, it is safe to get rid of kzfree: no confidential data is stored as xattr, so we don't need to zero the memory before freeing it.
Fixes: 5222595d093e ("f2fs: use kvmalloc, if kmalloc is failed")
Reported-by: Jiqun Li <jiqun.li@unisoc.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: trace f2fs_ioc_shutdown | Chao Yu | 1 | -0/+3
This patch adds tracing support for f2fs_ioc_shutdown. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix to avoid deadlock of atomic file operations | Chao Yu | 1 | -12/+31
Thread A                          Thread B
- __fput
- f2fs_release_file
- drop_inmem_pages
- mutex_lock(&fi->inmem_lock)
- __revoke_inmem_pages
- lock_page(page)
                                  - open
                                  - f2fs_setattr
                                  - truncate_setsize
                                  - truncate_inode_pages_range
                                  - lock_page(page)
                                  - truncate_cleanup_page
                                  - f2fs_invalidate_page
                                  - drop_inmem_page
                                  - mutex_lock(&fi->inmem_lock);

We may encounter above ABBA deadlock as reported by Kyungtae Kim:

I'm reporting a bug in linux-4.17.19: "INFO: task hung in drop_inmem_page" (no reproducer)

I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ

=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
 Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D27024 10822 6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
 mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
 drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
 f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
 do_invalidatepage mm/truncate.c:165 [inline]
 truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
 truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
 truncate_inode_pages mm/truncate.c:478 [inline]
 truncate_pagecache+0x6d/0x90 mm/truncate.c:801
 truncate_setsize+0x81/0xa0 mm/truncate.c:826
 f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
 notify_change+0xa62/0xe80 fs/attr.c:313
 do_truncate+0x12e/0x1e0 fs/open.c:63
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
 Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D28880 10858 6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
 rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0x58/0xa0 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:713 [inline]
 do_truncate+0x120/0x1e0 fs/open.c:61
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
 Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5 D28760 10829 6308 0x80000002
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 io_schedule+0x21/0x80 kernel/sched/core.c:5179
 wait_on_page_bit_common mm/filemap.c:1100 [inline]
 __lock_page+0x2b5/0x390 mm/filemap.c:1273
 lock_page include/linux/pagemap.h:483 [inline]
 __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
 drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
 f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
 __fput+0x2c7/0x780 fs/file_table.c:209
 ____fput+0x1a/0x20 fs/file_table.c:243
 task_work_run+0x151/0x1d0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8ba/0x30a0 kernel/exit.c:865
 do_group_exit+0x13b/0x3a0 kernel/exit.c:968
 get_signal+0x6bb/0x1650 kernel/signal.c:2482
 do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700

This patch tries to use trylock_page to mitigate such deadlock condition for fix.
Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
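The trylock idea in the commit above can be sketched in userspace. This is only an analogy under stated assumptions, not the f2fs patch: `struct demo_locks` and `revoke_with_trylock()` are hypothetical names, and two pthread mutexes stand in for fi->inmem_lock and the page lock. The path that already holds the first lock only tries the second one, so it can never sit blocked on it while another thread waits for the first.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for fi->inmem_lock and the page lock. */
struct demo_locks {
	pthread_mutex_t inmem_lock;
	pthread_mutex_t page_lock;
};

/*
 * Path that takes inmem_lock first and then wants the page lock.
 * Blocking on page_lock here could complete an ABBA cycle with a thread
 * that holds page_lock and waits for inmem_lock, so only try it once and
 * report failure; the caller is expected to back off and retry later.
 */
static bool revoke_with_trylock(struct demo_locks *l)
{
	bool done = false;

	pthread_mutex_lock(&l->inmem_lock);
	if (pthread_mutex_trylock(&l->page_lock) == 0) {
		/* ... work on the page would happen here ... */
		done = true;
		pthread_mutex_unlock(&l->page_lock);
	}
	pthread_mutex_unlock(&l->inmem_lock);
	return done;
}
```

A caller would typically loop on a false return after releasing the CPU for a moment; whether and how the real patch retries is not stated in the commit message.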
2019-03-13 | f2fs: fix to dirty inode for i_mode recovery | Chao Yu | 1 | -4/+1
As Seulbae Kim reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202637 We didn't recover the permission field correctly after a sudden power cut. The reason is that setattr didn't add the inode to the global dirty list once i_mode was changed, so a later checkpoint triggered by fsync would not flush the last i_mode to disk. Fix it. Reported-by: Seulbae Kim <seulbae@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: give random value to i_generation | Jaegeuk Kim | 3 | -4/+2
This follows to give random number to i_generation along with commit 232530680290b ("ext4: improve smp scalability for inode generation") This can be used for DUN for UFS HW encryption. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: no need to take page lock in readdir | Gao Xiang | 1 | -3/+3
VFS will take inode_lock for readdir, therefore no need to take page lock in readdir at all just as the majority of other generic filesystems. This patch improves concurrency since .iterate_shared was introduced to VFS years ago. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix to update iostat correctly in IPU path | Chao Yu | 1 | -3/+3
In error path of IPU, we didn't account iostat correctly, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix encrypted page memory leak | Chao Yu | 1 | -2/+7
For IPU path of f2fs_do_write_data_page(), in its error path, we need to release encrypted page and fscrypt context, otherwise it will cause memory leak. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: make fault injection covering __submit_flush_wait() | Chao Yu | 1 | -1/+5
This patch changes to allow failure of f2fs_bio_alloc() in __submit_flush_wait(), which can simulate flush error in checkpoint() for covering more error paths. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix to retry fill_super only if recovery failed | Chao Yu | 1 | -8/+11
With current retry mechanism in f2fs_fill_super, first fill_super fails due to no memory, then second fill_super runs w/o recovery, if we succeed, we may lose fsynced data, it doesn't make sense. Let's retry fill_super only if it occurs non-ENOMEM error during recovery. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
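The retry rule described above reduces to a small predicate. The sketch below is an illustration only: `should_retry_fill_super()` and its parameters are hypothetical names, and the "retry at most once" condition is an assumption that the commit message does not spell out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether a failed mount attempt should be retried.
 * failed_in_recovery: the failure happened while replaying fsynced data
 * err:                negative errno from the failed attempt
 * already_retried:    assumed guard so that the retry happens only once
 */
static bool should_retry_fill_super(bool failed_in_recovery, int err,
				    bool already_retried)
{
	if (already_retried)
		return false;
	/* An out-of-memory failure is not retried, per the commit text. */
	return failed_in_recovery && err != -ENOMEM;
}
```

The point of the rule is that a retry which skips recovery is only acceptable when recovery itself was what failed; retrying after an unrelated -ENOMEM could mount successfully while silently dropping fsynced data.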
2019-03-13 | f2fs: silence VM_WARN_ON_ONCE in mempool_alloc | Gao Xiang | 1 | -1/+2
Note that __GFP_ZERO is not supported for mempool_alloc, which is also documented in the mempool_alloc comments. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | f2fs: fix wrong #endif | Jaegeuk Kim | 1 | -2/+2
We have to cover whole headerfile with last #endif. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-03-13 | Merge tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux | Linus Torvalds | 5 | -11/+26
Pull NFS server updates from Bruce Fields: "Miscellaneous NFS server fixes. Probably the most visible bug is one that could artificially limit NFSv4.1 performance by limiting the number of outstanding rpcs from a single client. Neil Brown also gets a special mention for fixing a 14.5-year-old memory-corruption bug in the encoding of NFSv3 readdir responses"
* tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux:
  nfsd: allow nfsv3 readdir request to be larger.
  nfsd: fix wrong check in write_v4_end_grace()
  nfsd: fix memory corruption caused by readdir
  nfsd: fix performance-limiting session calculation
  svcrpc: fix UDP on servers with lots of threads
  svcrdma: Remove syslog warnings in work completion handlers
  svcrdma: Squelch compiler warning when SUNRPC_DEBUG is disabled
  svcrdma: Use struct_size() in kmalloc()
  svcrpc: fix unlikely races preventing queueing of sockets
  svcrpc: svc_xprt_has_something_to_do seems a little long
  SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot()
  nfsd: fix an IS_ERR() vs NULL check
2019-03-13 | Merge tag 'ext4_for_linus' of ↵ | Linus Torvalds | 17 | -147/+250
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "A large number of bug fixes and cleanups. One new feature to allow users to more easily find the jbd2 journal thread for a particular ext4 file system" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits) jbd2: jbd2_get_transaction does not need to return a value jbd2: fix invalid descriptor block checksum ext4: fix bigalloc cluster freeing when hole punching under load ext4: add sysfs attr /sys/fs/ext4/<disk>/journal_task ext4: Change debugging support help prefix from EXT4 to Ext4 ext4: fix compile error when using BUFFER_TRACE jbd2: fix compile warning when using JBUFFER_TRACE ext4: fix some error pointer dereferences ext4: annotate more implicit fall throughs ext4: annotate implicit fall throughs ext4: don't update s_rev_level if not required jbd2: fold jbd2_superblock_csum_{verify,set} into their callers jbd2: fix race when writing superblock ext4: fix crash during online resizing ext4: disallow files with EXT4_JOURNAL_DATA_FL from EXT4_IOC_SWAP_BOOT ext4: add mask of ext4 flags to swap ext4: update quota information while swapping boot loader inode ext4: cleanup pagecache before swap i_data ext4: fix check of inode in swap_inode_boot_loader ext4: unlock unused_pages timely when doing writeback ...
2019-03-13 | Merge tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client | Linus Torvalds | 11 | -314/+1289
Pull ceph updates from Ilya Dryomov: "The highlights are: - rbd will now ignore discards that aren't aligned and big enough to actually free up some space (myself). This is controlled by the new alloc_size map option and can be disabled if needed. - support for rbd deep-flatten feature (myself). Deep-flatten allows "rbd flatten" to fully disconnect the clone image and its snapshots from the parent and make the parent snapshot removable. - a new round of cap handling improvements (Zheng Yan). The kernel client should now be much more prompt about releasing its caps and it is possible to put a limit on the number of caps held. - support for getting ceph.dir.pin extended attribute (Zheng Yan)" * tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client: (26 commits) Documentation: modern versions of ceph are not backed by btrfs rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN rbd: whole-object write and zeroout should copyup when snapshots exist rbd: copyup with an empty snapshot context (aka deep-copyup) rbd: introduce rbd_obj_issue_copyup_ops() rbd: stop copying num_osd_ops in rbd_obj_issue_copyup() rbd: factor out __rbd_osd_req_create() rbd: clear ->xferred on error from rbd_obj_issue_copyup() rbd: remove experimental designation from kernel layering ceph: add mount option to limit caps count ceph: periodically trim stale dentries ceph: delete stale dentry when last reference is dropped ceph: remove dentry_lru file from debugfs ceph: touch existing cap when handling reply ceph: pass inclusive lend parameter to filemap_write_and_wait_range() rbd: round off and ignore discards that are too small rbd: handle DISCARD and WRITE_ZEROES separately rbd: get rid of obj_req->obj_request_count libceph: use struct_size() for kmalloc() in crush_decode() ceph: send cap releases more aggressively ...
2019-03-13 | Merge tag 'for-5.1-part2-tag' of ↵ | Linus Torvalds | 7 | -38/+90
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Correctness and a deadlock fixes" * tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zstd: ensure reclaim timer is properly cleaned up btrfs: move ulist allocation out of transaction in quota enable btrfs: save drop_progress if we drop refs at all btrfs: check for refs on snapshot delete resume Btrfs: fix deadlock between clone/dedupe and rename Btrfs: fix corruption reading shared and compressed extents after hole punching
2019-03-13 | Merge tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds | 40 | -1095/+1291
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fixes for NFS I/O request leakages - Fix error handling paths in the NFS I/O recoalescing code - Reinitialise NFSv4.1 sequence results before retransmitting a request - Fix a soft lockup in the delegation recovery code - Bulk destroy of layouts needs to be safe w.r.t. umount - Prevent thundering herd issues when the SUNRPC socket is not connected - Respect RPC call timeouts when retrying transmission Features: - Convert rpc auth layer to use xdr_streams - Config option to disable insecure RPCSEC_GSS crypto types - Reduce size of RPC receive buffers - Readdirplus optimization by cache mechanism - Convert SUNRPC socket send code to use iov_iter() - SUNRPC micro-optimisations to avoid indirect calls - Add support for the pNFS LAYOUTERROR operation and use it with the pNFS/flexfiles driver - Add trace events to report non-zero NFS status codes - Various removals of unnecessary dprintks Bugfixes and cleanups: - Fix a number of sparse warnings and documentation format warnings - Fix nfs_parse_devname to not modify it's argument - Fix potential corruption of page being written through pNFS/blocks - fix xfstest generic/099 failures on nfsv3 - Avoid NFSv4.1 "false retries" when RPC calls are interrupted - Abort I/O early if the pNFS/flexfiles layout segment was invalidated - Avoid unnecessary pNFS/flexfiles layout invalidations" * tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits) SUNRPC: Take the transport send lock before binding+connecting SUNRPC: Micro-optimise when the task is known not to be sleeping SUNRPC: Check whether the task was transmitted before rebind/reconnect SUNRPC: Remove redundant calls to RPC_IS_QUEUED() SUNRPC: Clean up SUNRPC: Respect RPC call timeouts when retrying transmission SUNRPC: Fix up RPC back channel transmission SUNRPC: Prevent thundering herd when the socket is not connected SUNRPC: Allow dynamic allocation of back 
channel slots NFSv4.1: Bump the default callback session slot count to 16 SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc NFS/flexfiles: Clean up mirror DS initialisation NFS/flexfiles: Remove dead code in ff_layout_mirror_valid() NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid() NFS/flexfile: Simplify nfs4_ff_layout_ds_version() NFS/flexfiles: Simplify ff_layout_get_ds_cred() NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client() NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh() NFS/flexfiles: Speed up read failover when DSes are down NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive ...
2019-03-13 | Merge tag 'ovl-update-5.1' of ↵ | Linus Torvalds | 3 | -35/+81
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "Fix copy up of security related xattrs" * tag 'ovl-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: Do not lose security.capability xattr over metadata file copy-up ovl: During copy up, first copy up data and then xattrs
2019-03-13 | Merge tag 'fuse-update-5.1' of ↵ | Linus Torvalds | 8 | -259/+321
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: "Scalability and performance improvements, as well as minor bug fixes and cleanups" * tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits) fuse: cache readdir calls if filesystem opts out of opendir fuse: support clients that don't implement 'opendir' fuse: lift bad inode checks into callers fuse: multiplex cached/direct_io file operations fuse add copy_file_range to direct io fops fuse: use iov_iter based generic splice helpers fuse: Switch to using async direct IO for FOPEN_DIRECT_IO fuse: use atomic64_t for khctr fuse: clean up aborted fuse: Protect ff->reserved_req via corresponding fi->lock fuse: Protect fi->nlookup with fi->lock fuse: Introduce fi->lock to protect write related fields fuse: Convert fc->attr_version into atomic64_t fuse: Add fuse_inode argument to fuse_prepare_release() fuse: Verify userspace asks to requeue interrupt that we really sent fuse: Do some refactoring in fuse_dev_do_write() fuse: Wake up req->waitq of only if not background fuse: Optimize request_end() by not taking fiq->waitq.lock fuse: Kill fasync only if interrupt is queued in queue_interrupt() fuse: Remove stale comment in end_requests() ...
2019-03-13 | Merge branch 'work.mount' of ↵ | Linus Torvalds | 23 | -886/+2421
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount infrastructure updates from Al Viro: "The rest of core infrastructure; no new syscalls in that pile, but the old parts are switched to new infrastructure. At that point conversions of individual filesystems can happen independently; some are done here (afs, cgroup, procfs, etc.), there's also a large series outside of that pile dealing with NFS (quite a bit of option-parsing stuff is getting used there - it's one of the most convoluted filesystems in terms of mount-related logics), but NFS bits are the next cycle fodder. It got seriously simplified since the last cycle; documentation is probably the weakest bit at the moment - I considered dropping the commit introducing Documentation/filesystems/mount_api.txt (cutting the size increase by quarter ;-), but decided that it would be better to fix it up after -rc1 instead. That pile allows to do followup work in independent branches, which should make life much easier for the next cycle. 
fs/super.c size increase is unpleasant; there's a followup series that allows to shrink it considerably, but I decided to leave that until the next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits) afs: Use fs_context to pass parameters over automount afs: Add fs_context support vfs: Add some logging to the core users of the fs_context log vfs: Implement logging through fs_context vfs: Provide documentation for new mount API vfs: Remove kern_mount_data() hugetlbfs: Convert to fs_context cpuset: Use fs_context kernfs, sysfs, cgroup, intel_rdt: Support fs_context cgroup: store a reference to cgroup_ns into cgroup_fs_context cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper cgroup_do_mount(): massage calling conventions cgroup: stash cgroup_root reference into cgroup_fs_context cgroup2: switch to option-by-option parsing cgroup1: switch to option-by-option parsing cgroup: take options parsing into ->parse_monolithic() cgroup: fold cgroup1_mount() into cgroup1_get_tree() cgroup: start switching to fs_context ipc: Convert mqueue fs to fs_context proc: Add fs_context support to procfs ...
2019-03-12 | Merge branch 'work.misc' of ↵ | Linus Torvalds | 6 | -19/+51
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted fixes (really no common topic here)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Make __vfs_write() static vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1 pipe: stop using ->can_merge splice: don't merge into linked buffers fs: move generic stat response attr handling to vfs_getattr_nosec orangefs: don't reinitialize result_mask in ->getattr fs/devpts: always delete dcache dentry-s in dput()
2019-03-12 | pNFS: Fix a typo in pnfs_update_layout | Trond Myklebust | 1 | -1/+1
We're supposed to wait for the outstanding layout count to go to zero, but that got lost somehow. Fixes: d03360aaf5cca ("pNFS: Ensure we return the error if someone...") Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-12 | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds | 6 | -42/+30
Merge misc updates from Andrew Morton: - a few misc things - the rest of MM - remove flex_arrays, replace with new simple radix-tree implementation * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (38 commits) Drop flex_arrays sctp: convert to genradix proc: commit to genradix generic radix trees selinux: convert to kvmalloc md: convert to kvmalloc openvswitch: convert to kvmalloc of: fix kmemleak crash caused by imbalance in early memory reservation mm: memblock: update comments and kernel-doc memblock: split checks whether a region should be skipped to a helper function memblock: remove memblock_{set,clear}_region_flags memblock: drop memblock_alloc_*_nopanic() variants memblock: memblock_alloc_try_nid: don't panic treewide: add checks for the return value of memblock_alloc*() swiotlb: add checks for the return value of memblock_alloc*() init/main: add checks for the return value of memblock_alloc*() mm/percpu: add checks for the return value of memblock_alloc*() sparc: add checks for the return value of memblock_alloc*() ia64: add checks for the return value of memblock_alloc*() arch: don't memset(0) memory returned by memblock_alloc() ...
2019-03-12 | proc: commit to genradix | Kent Overstreet | 1 | -28/+15
The new generic radix trees have a simpler API and implementation, and no limitations on number of elements, so all flex_array users are being converted Link: http://lkml.kernel.org/r/20181217131929.11727-6-kent.overstreet@gmail.com Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pravin B Shelar <pshelar@ovn.org> Cc: Shaohua Li <shli@kernel.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-12 | proc: calculate end pointer for /proc/*/* lookup at compile time | Alexey Dobriyan | 1 | -9/+10
Compilers like to transform loops like

	for (i = 0; i < n; i++) {
		[use p[i]]
	}

into

	for (p = p0; p < end; p++) {
		...
	}

Do it by hand, so that it results in overall simpler loop and smaller code.

Space savings:

	$ ./scripts/bloat-o-meter ../vmlinux-001 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-9 (-5)
	Function                  old     new   delta
	proc_tid_base_lookup       17      19      +2
	proc_tgid_base_lookup      17      19      +2
	proc_pident_lookup        179     170      -9

The same could be done to proc_pident_readdir(), but the code becomes bigger for some reason.

[sfr@canb.auug.org.au: merge fix for proc_pident_lookup() API change]
Link: http://lkml.kernel.org/r/20190131160135.4a8ae70b@canb.auug.org.au
Link: http://lkml.kernel.org/r/20190114200422.GB9680@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: James Morris <jmorris@namei.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
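The two loop shapes from the commit message above can be written out as complete functions. This is a minimal userspace sketch, not the fs/proc code: `struct entry`, `lookup_indexed()` and `lookup_pointer()` are hypothetical names chosen for illustration. Both functions return the same result; the second takes the end pointer from the caller, which can compute it once from the table size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry, loosely modelled on a name lookup table. */
struct entry {
	const char *name;
	int value;
};

/* Index-based form: for (i = 0; i < n; i++) { use p[i] } */
static const struct entry *lookup_indexed(const struct entry *p, size_t n,
					  const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(p[i].name, name) == 0)
			return &p[i];
	}
	return NULL;
}

/* Pointer-based form: for (p = p0; p < end; p++) { ... } */
static const struct entry *lookup_pointer(const struct entry *p,
					  const struct entry *end,
					  const char *name)
{
	for (; p < end; p++) {
		if (strcmp(p->name, name) == 0)
			return p;
	}
	return NULL;
}
```

For a static table `tbl`, a caller could pass `tbl + sizeof(tbl) / sizeof(tbl[0])` as `end`, which is a constant the compiler can fold at build time.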
2019-03-12 | mm: refactor readahead defines in mm.h | Nikolay Borisov | 5 | -5/+5
All users of VM_MAX_READAHEAD actually convert it to kbytes and then to pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This simplifies the expression in every filesystem. Also rename the macro to VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused VM_MIN_READAHEAD [akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen] Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: David Howells <dhowells@redhat.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
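The arithmetic behind the new macro can be checked with a small sketch. Assumptions are labeled: `DEMO_SZ_128K` mirrors the value of the kernel's SZ_128K, the page size is passed in as a parameter because userspace has no PAGE_SIZE constant, and the "old" helper assumes the previous macro held 128 as a kilobyte count, which the commit message implies but does not quote.

```c
#include <assert.h>

#define DEMO_SZ_128K 0x20000UL	/* 128 KiB, same value as SZ_128K */

/* Assumed old style: a KiB count converted to bytes, then to pages. */
static unsigned long readahead_pages_old(unsigned long page_size)
{
	unsigned long max_readahead_kb = 128;	/* assumption, see lead-in */

	assert(page_size != 0);
	return max_readahead_kb * 1024 / page_size;
}

/* New style from the commit text: (SZ_128K / PAGE_SIZE) directly. */
static unsigned long readahead_pages_new(unsigned long page_size)
{
	assert(page_size != 0);
	return DEMO_SZ_128K / page_size;
}
```

With 4 KiB pages both helpers give 32 pages, and with 64 KiB pages both give 2, so the rename changes how the value is expressed rather than what it is.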
2019-03-12 | hpfs: fix spelling mistake "partion" -> "partition" | Colin Ian King | 1 | -4/+4
Trivial fix to spelling mistakes in comments Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-12 | xfs: clean up xfs_dir2_leaf_addname | Darrick J. Wong | 1 | -18/+15
Remove typedefs and consolidate local variable initialization. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-03-12 | Merge tag 'xarray-5.1-rc1' of git://git.infradead.org/users/willy/linux-dax | Linus Torvalds | 1 | -1/+1
Pull XArray updates from Matthew Wilcox: "This pull request changes the xa_alloc() API. I'm only aware of one subsystem that has started trying to use it, and we agree on the fixup as part of the merge. The xa_insert() error code also changed to match xa_alloc() (EEXIST to EBUSY), and I added xa_alloc_cyclic(). Beyond that, the usual bugfixes, optimisations and tweaking. I now have a git tree with all users of the radix tree and IDR converted over to the XArray that I'll be feeding to maintainers over the next few weeks" * tag 'xarray-5.1-rc1' of git://git.infradead.org/users/willy/linux-dax: XArray: Fix xa_reserve for 2-byte aligned entries XArray: Fix xa_erase of 2-byte aligned entries XArray: Use xa_cmpxchg to implement xa_reserve XArray: Fix xa_release in allocating arrays XArray: Mark xa_insert and xa_reserve as must_check XArray: Add cyclic allocation XArray: Redesign xa_alloc API XArray: Add support for 1s-based allocation XArray: Change xa_insert to return -EBUSY XArray: Update xa_erase family descriptions XArray tests: RCU lock prohibits GFP_KERNEL
2019-03-11 | inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch() | ZhangXiaoxu | 1 | -2/+5
Commit 4d97f7d53da7dc83 ("inotify: Add flag IN_MASK_CREATE for inotify_add_watch()") forgot to call fsnotify_put_mark() with IN_MASK_CREATE after fsnotify_find_mark() Fixes: 4d97f7d53da7dc83 ("inotify: Add flag IN_MASK_CREATE for inotify_add_watch()") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
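The class of bug described above, an early return that skips the "put" matching an earlier "find", can be illustrated outside the kernel. The sketch below is not the inotify code: every `demo_` name is hypothetical, the refcount is a plain int, and the error values are assumptions chosen for illustration. The shape it shows is routing every exit taken after a successful lookup through one label that drops the reference.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MASK_CREATE 0x1u	/* hypothetical stand-in for IN_MASK_CREATE */

struct demo_mark {
	int refcount;
	unsigned int mask;
};

/* Lookup helper: returns the mark with an extra reference, or NULL. */
static struct demo_mark *demo_find_mark(struct demo_mark *m)
{
	if (m)
		m->refcount++;
	return m;
}

/* Drop a reference taken by demo_find_mark(). */
static void demo_put_mark(struct demo_mark *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

static int demo_update_existing_watch(struct demo_mark *existing,
				      unsigned int arg)
{
	struct demo_mark *m = demo_find_mark(existing);
	int ret = 0;

	if (!m)
		return -ENOENT;		/* nothing was found, nothing to put */
	if (arg & DEMO_MASK_CREATE) {
		ret = -EEXIST;		/* create-only request, watch exists */
		goto out;		/* a bare return here would leak a ref */
	}
	m->mask = arg;
out:
	demo_put_mark(m);
	return ret;
}
```

After either path through the function, the reference count is back to its value before the call, which is the invariant the fix restores.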
2019-03-11 | Merge branch 'next-integrity' of ↵ | Linus Torvalds | 1 | -0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull integrity updates from James Morris: "Mimi Zohar says: 'Linux 5.0 introduced the platform keyring to allow verifying the IMA kexec kernel image signature using the pre-boot keys. This pull request similarly makes keys on the platform keyring accessible for verifying the PE kernel image signature. Also included in this pull request is a new IMA hook that tags tmp files, in policy, indicating the file hash needs to be calculated. The remaining patches are cleanup'" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: evm: Use defined constant for UUID representation ima: define ima_post_create_tmpfile() hook and add missing call evm: remove set but not used variable 'xattr' encrypted-keys: fix Opt_err/Opt_error = -1 kexec, KEYS: Make use of platform keyring for signature verify integrity, KEYS: add a reference to platform keyring
2019-03-10 | xfs: zero initialize highstale and lowstale in xfs_dir2_leaf_addname | Darrick J. Wong | 1 | -2/+2
Smatch complains about the following:
fs/xfs/libxfs/xfs_dir2_leaf.c:848 xfs_dir2_leaf_addname() error: uninitialized symbol 'lowstale'.
fs/xfs/libxfs/xfs_dir2_leaf.c:849 xfs_dir2_leaf_addname() error: uninitialized symbol 'highstale'.
I don't think there's any incorrect behavior associated with the uninitialized variable, but as the author of the previous zero-init patch points out, it's best not to be passing around pointers to uninitialized stack areas.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
2019-03-10Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds17-6411/+0
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. This is a major simplification for block and mq in particular" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits) scsi: cxgb4i: validate tcp sequence number only if chip version <= T5 scsi: cxgb4i: get pf number from lldi->pf scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c scsi: mpt3sas: Add missing breaks in switch statements scsi: aacraid: Fix missing break in switch statement scsi: kill command serial number scsi: csiostor: drop serial_number usage scsi: mvumi: use request tag instead of serial_number scsi: dpt_i2o: remove serial number usage scsi: st: osst: Remove negative constant left-shifts scsi: ufs-bsg: Allow reading descriptors scsi: ufs: Allow reading descriptor via raw upiu scsi: ufs-bsg: Change the calling convention for write descriptor scsi: ufs: Remove unused device quirks Revert "scsi: ufs: disable vccq if it's not needed by UFS device" scsi: megaraid_sas: Remove a bunch of set but not used variables scsi: clean obsolete return values of eh_timed_out scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: MAINTAINERS: SCSI initiator and target tweaks scsi: fcoe: make use of fip_mode enum complete ...
2019-03-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-1/+1
Pull rdma updates from Jason Gunthorpe: "This has been a slightly more active cycle than normal with ongoing core changes and quite a lot of collected driver updates. - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe - A new data transfer mode for HFI1 giving higher performance - Significant functional and bug fix update to the mlx5 On-Demand-Paging MR feature - A chip hang reset recovery system for hns - Change mm->pinned_vm to an atomic64 - Update bnxt_re to support a new 57500 chip - A sane netlink 'rdma link add' method for creating rxe devices and fixing the various unregistration race conditions in rxe's unregister flow - Allow lookup up objects by an ID over netlink - Various reworking of the core to driver interface: - drivers should not assume umem SGLs are in PAGE_SIZE chunks - ucontext is accessed via udata not other means - start to make the core code responsible for object memory allocation - drivers should convert struct device to struct ib_device via a helper - drivers have more tools to avoid use after unregister problems" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (280 commits) net/mlx5: ODP support for XRC transport is not enabled by default in FW IB/hfi1: Close race condition on user context disable and close RDMA/umem: Revert broken 'off by one' fix RDMA/umem: minor bug fix in error handling path RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp cxgb4: kfree mhp after the debug print IB/rdmavt: Fix concurrency panics in QP post_send and modify to error IB/rdmavt: Fix loopback send with invalidate ordering IB/iser: Fix dma_nents type definition IB/mlx5: Set correct write permissions for implicit ODP MR bnxt_re: Clean cq for kernel consumers only RDMA/uverbs: Don't do double free of allocated PD RDMA: Handle ucontext allocations by IB/core RDMA/core: Fix a WARN() message bnxt_re: fix the regression due to changes in alloc_pbl IB/mlx4: Increase the timeout for CM cache IB/core: Abort page fault 
handler silently during owning process exit IB/mlx5: Validate correct PD before prefetch MR IB/mlx5: Protect against prefetch of invalid MR RDMA/uverbs: Store PR pointer before it is overwritten ...
2019-03-09Merge tag 'gfs2-5.1.fixes' of ↵Linus Torvalds5-69/+20
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "We've only got three patches ready for this merge window: - Fix a hang related to missed wakeups for glocks from Andreas Gruenbacher - Rework of how gfs2 manages its debugfs files from Greg K-H - An incorrect assert when truncating or deleting files from Tim Smith" * tag 'gfs2-5.1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix missed wakeups in find_insert_glock gfs2: Fix an incorrect gfs2_assert() gfs: no need to check return value of debugfs_create functions
2019-03-09Merge tag '5.1-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds21-337/+929
Pull smb3 updates from Steve French: - smb3/cifs fixes including for large i/o error cases - fixes for three xfstests - improved crediting (smb3 flow control) - improved tracing * tag '5.1-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (44 commits) fs: cifs: Kconfig: pedantic formatting smb3: request more credits on normal (non-large read/write) ops CIFS: Mask off signals when sending SMB packets CIFS: Return -EAGAIN instead of -ENOTSOCK CIFS: Only send SMB2_NEGOTIATE command on new TCP connections CIFS: Fix read after write for files with read caching smb3: for kerberos mounts display the credential uid used cifs: use correct format characters smb3: add dynamic trace point for query_info_enter/done smb3: add dynamic trace point for smb3_cmd_enter smb3: improve dynamic tracing of open and posix mkdir smb3: add missing read completion trace point smb3: Add tracepoints for read, write and query_dir enter smb3: add tracepoints for query dir smb3: Update POSIX negotiate context with POSIX ctxt GUID cifs: update internal module version number CIFS: Try to acquire credits at once for compound requests CIFS: Return error code when getting file handle for writeback CIFS: Move open file handling to writepages CIFS: Move unlocking pages from wdata_send_pages() ...
2019-03-09Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds33-145/+88
Pull fscrypt updates from Eric Biggers: "First: Ted, Jaegeuk, and I have decided to add me as a co-maintainer for fscrypt, and we're now using a shared git tree. So we've updated MAINTAINERS accordingly, and I'm doing the pull request this time. The actual changes for v5.1 are: - Remove the fs-specific kconfig options like CONFIG_EXT4_ENCRYPTION and make fscrypt support for all fscrypt-capable filesystems be controlled by CONFIG_FS_ENCRYPTION, similar to how CONFIG_QUOTA works. - Improve error code for rename() and link() into encrypted directories. - Various cleanups" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: MAINTAINERS: add Eric Biggers as an fscrypt maintainer fscrypt: return -EXDEV for incompatible rename or link into encrypted dir fscrypt: remove filesystem specific build config option f2fs: use IS_ENCRYPTED() to check encryption status ext4: use IS_ENCRYPTED() to check encryption status fscrypt: remove CRYPTO_CTR dependency
2019-03-09Merge tag 'pstore-v5.1-rc1' of ↵Linus Torvalds2-35/+32
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore cleanups from Kees Cook: - Remove some needless memory allocations (Yue Hu, Kees Cook) - Add zero-length checks to avoid no-op calls (Yue Hu) * tag 'pstore-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Avoid needless alloc during header write pstore/ram: Add kmsg hlen zero check to ramoops_pstore_write() pstore/ram: Move initialization earlier pstore: Avoid writing records with zero size pstore/ram: Replace dummy_data heap memory with stack memory
2019-03-09Merge tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-blockLinus Torvalds4-7/+2989
Pull io_uring IO interface from Jens Axboe: "Second attempt at adding the io_uring interface. Since the first one, we've added basic unit testing of the three system calls, that resides in liburing like the other unit tests that we have so far. It'll take a while to get full coverage of it, but we're working towards it. I've also added two basic test programs to tools/io_uring. One uses the raw interface and has support for all the various features that io_uring supports outside of standard IO, like fixed files, fixed IO buffers, and polled IO. The other uses the liburing API, and is a simplified version of cp(1). This adds support for a new IO interface, io_uring. io_uring allows an application to communicate with the kernel through two rings, the submission queue (SQ) and completion queue (CQ) ring. This allows for very efficient handling of IOs, see the v5 posting for some basic numbers: https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/ Outside of just efficiency, the interface is also flexible and extendable, and allows for future use cases like the upcoming NVMe key-value store API, networked IO, and so on. It also supports async buffered IO, something that we've always failed to support in the kernel. Outside of basic IO features, it supports async polled IO as well. This particular feature has already been tested at Facebook months ago for flash storage boxes, with 25-33% improvements. It makes polled IO actually useful for real world use cases, where even basic flash sees a nice win in terms of efficiency, latency, and performance. These boxes were IOPS bound before, now they are not. This series adds three new system calls. One for setting up an io_uring instance (io_uring_setup(2)), one for submitting/completing IO (io_uring_enter(2)), and one for aux functions like registering file sets, buffers, etc (io_uring_register(2)). Through the help of Arnd, I've coordinated the syscall numbers so merge on that front should be painless. 
Jon did a writeup of the interface a while back, which (except for minor details that have been tweaked) is still accurate. Find that here: https://lwn.net/Articles/776703/ Huge thanks to Al Viro for helping getting the reference cycle code correct, and to Jann Horn for his extensive reviews focused on both security and bugs in general. There's a userspace library that provides basic functionality for applications that don't need or want to care about how to fiddle with the rings directly. It has helpers to allow applications to easily set up an io_uring instance, and submit/complete IO through it without knowing about the intricacies of the rings. It also includes man pages (thanks to Jeff Moyer), and will continue to grow support helper functions and features as time progresses. Find it here: git://git.kernel.dk/liburing Fio has full support for the raw interface, both in the form of an IO engine (io_uring), but also with a small test application (t/io_uring) that can exercise and benchmark the interface" * tag 'io_uring-2019-03-06' of git://git.kernel.dk/linux-block: io_uring: add a few test tools io_uring: allow workqueue item to handle multiple buffered requests io_uring: add support for IORING_OP_POLL io_uring: add io_kiocb ref count io_uring: add submission polling io_uring: add file set registration net: split out functions related to registering inflight socket files io_uring: add support for pre-mapped user IO buffers block: implement bio helper to add iter bvec pages to bio io_uring: batch io_kiocb allocation io_uring: use fget/fput_many() for file references fs: add fget_many() and fput_many() io_uring: support for IO polling io_uring: add fsync support Add io_uring IO interface
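The submission and completion rings mentioned above are, conceptually, fixed-size queues indexed by free-running head and tail counters. The following is a simplified, single-threaded sketch of that idea only. It is not the io_uring ABI: the real rings are shared between kernel and userspace and need memory barriers, and `struct ring`, `ring_push`, and `ring_pop` are hypothetical names.

```c
#include <stdint.h>

#define RING_ENTRIES 8u /* power of two, so (index & mask) wraps around */

/* Conceptual ring: the producer advances tail, the consumer advances head. */
struct ring {
    uint32_t head;
    uint32_t tail;
    uint64_t entries[RING_ENTRIES];
};

/* Queue one value; returns 0 on success, -1 if the ring is full. */
static int ring_push(struct ring *r, uint64_t value)
{
    if (r->tail - r->head == RING_ENTRIES)
        return -1;
    r->entries[r->tail & (RING_ENTRIES - 1)] = value;
    r->tail++;
    return 0;
}

/* Dequeue the oldest value; returns 0 on success, -1 if the ring is empty. */
static int ring_pop(struct ring *r, uint64_t *value)
{
    if (r->head == r->tail)
        return -1;
    *value = r->entries[r->head & (RING_ENTRIES - 1)];
    r->head++;
    return 0;
}
```

Because the counters are unsigned and only ever masked when indexing, `tail - head` gives the fill level even after the counters wrap.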
2019-03-09xfs: clean up xfs_dir2_leafn_addDarrick J. Wong1-12/+8
Remove typedefs and consolidate local variable initialization. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2019-03-09xfs: Zero initialize highstale and lowstale in xfs_dir2_leafn_addNathan Chancellor1-0/+2
When building with -Wsometimes-uninitialized, Clang warns: fs/xfs/libxfs/xfs_dir2_node.c:481:6: warning: variable 'lowstale' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] fs/xfs/libxfs/xfs_dir2_node.c:481:6: warning: variable 'highstale' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] While it isn't technically wrong, it isn't a problem in practice because highstale and lowstale are only initialized in xfs_dir2_leafn_add when compact is not zero then they are passed to xfs_dir3_leaf_find_entry, where they are initialized before use when compact is zero. Regardless, it's better not to be passing around uninitialized stack memory so zero initialize these variables, which silences this warning. Link: https://github.com/ClangBuiltLinux/linux/issues/393 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
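The shape of the code the warning refers to can be sketched in a few lines. This is a toy illustration under stated assumptions, not the XFS code: `find_entry` and `add_entry` are hypothetical stand-ins for `xfs_dir3_leaf_find_entry` and `xfs_dir2_leafn_add`, and the index values are arbitrary.

```c
/* Hypothetical callee: when compact is zero it assigns both indexes
 * itself before reading them, so it never depends on the caller's
 * values in that case. */
static int find_entry(int compact, int *lowstale, int *highstale)
{
    if (!compact) {
        *lowstale = -1;
        *highstale = 8;
    }
    return *highstale - *lowstale;
}

/* Hypothetical caller: only the compact branch assigns the variables,
 * which is what triggers -Wsometimes-uninitialized. Zero-initializing
 * them means no indeterminate stack value is ever passed by pointer. */
static int add_entry(int compact)
{
    int lowstale = 0;
    int highstale = 0;

    if (compact) {
        lowstale = 2;
        highstale = 5;
    }
    return find_entry(compact, &lowstale, &highstale);
}
```

Both paths behave the same with or without the initializers; the initializers only make that obvious to the compiler and to readers.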
2019-03-09Merge tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-blockLinus Torvalds21-53/+123
Pull block layer updates from Jens Axboe: "Not a huge amount of changes in this round, the biggest one is that we finally have Ming's multi-page bvec support merged. Apart from that, this pull request contains: - Small series that avoids quiescing the queue for sysfs changes that match what we currently have (Aleksei) - Series of bcache fixes (via Coly) - Series of lightnvm fixes (via Mathias) - NVMe pull request from Christoph. Nothing major, just SPDX/license cleanups, RR mp policy (Hannes), and little fixes (Bart, Chaitanya). - BFQ series (Paolo) - Save blk-mq cpu -> hw queue mapping, removing a pointer indirection for the fast path (Jianchao) - fops->iopoll() added for async IO polling, this is a feature that the upcoming io_uring interface will use (Christoph, me) - Partition scan loop fixes (Dongli) - mtip32xx conversion from managed resource API (Christoph) - cdrom registration race fix (Guenter) - MD pull from Song, two minor fixes. - Various documentation fixes (Marcos) - Multi-page bvec feature. This brings a lot of nice improvements with it, like more efficient splitting, larger IOs can be supported without growing the bvec table size, and so on. 
(Ming) - Various little fixes to core and drivers" * tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block: (117 commits) block: fix updating bio's front segment size block: Replace function name in string with __func__ nbd: propagate genlmsg_reply return code floppy: remove set but not used variable 'q' null_blk: fix checking for REQ_FUA block: fix NULL pointer dereference in register_disk fs: fix guard_bio_eod to check for real EOD errors blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map block: optimize bvec iteration in bvec_iter_advance block: introduce mp_bvec_for_each_page() for iterating over page block: optimize blk_bio_segment_split for single-page bvec block: optimize __blk_segment_map_sg() for single-page bvec block: introduce bvec_nth_page() iomap: wire up the iopoll method block: add bio_set_polled() helper block: wire up block device iopoll method fs: add an iopoll method to struct file_operations loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part() loop: do not print warn message if partition scan is successful block: bounce: make sure that bvec table is updated ...
2019-03-08nfsd: allow nfsv3 readdir request to be larger.NeilBrown2-2/+4
nfsd currently reports the NFSv3 dtpref FSINFO parameter to be PAGE_SIZE, so NFS clients will typically ask for one page of directory entries at a time. This is needlessly restrictive as nfsd can handle larger replies easily. Also, a READDIR request (but not a READDIRPLUS request) has the count size clipped to PAGE_SIZE, again unnecessary. This patch lifts these limits so that larger readdir requests can be used. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-03-08gfs2: Fix missed wakeups in find_insert_glockAndreas Gruenbacher1-1/+1
Mark Syms has reported seeing tasks that are stuck waiting in find_insert_glock. It turns out that struct lm_lockname contains four padding bytes on 64-bit architectures that function glock_waitqueue doesn't skip when hashing the glock name. As a result, we can end up waking up the wrong waitqueue, and the waiting tasks may be stuck forever. Fix that by using ht_parms.key_len instead of sizeof(struct lm_lockname) for the key length. Reported-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
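The padding problem can be reproduced in userspace. The sketch below assumes a hypothetical `struct lockname` laid out like a key with a trailing 32-bit member (so that 64-bit ABIs typically append four padding bytes), and uses FNV-1a as a stand-in for the kernel's hash. `LOCKNAME_KEY_LEN` plays the role that `ht_parms.key_len` plays in the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key struct: on LP64 targets sizeof() is typically 24,
 * while only the first 20 bytes carry meaningful data. */
struct lockname {
    uint64_t ln_number;
    void *ln_sbd;
    uint32_t ln_type;
};

/* Number of meaningful key bytes, excluding any trailing padding. */
#define LOCKNAME_KEY_LEN (offsetof(struct lockname, ln_type) + sizeof(uint32_t))

/* FNV-1a over the first len bytes; a stand-in for the kernel's hash. */
static uint32_t hash_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;

    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash only the meaningful bytes, so two keys with equal fields always
 * hash the same regardless of what the padding bytes contain. */
static uint32_t hash_lockname(const struct lockname *name)
{
    return hash_bytes(name, LOCKNAME_KEY_LEN);
}
```

Hashing `sizeof(struct lockname)` bytes instead would fold the padding into the result, so two logically equal keys could select different buckets, which is the mismatch the commit describes between the hash table and the waitqueue lookup.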
2019-03-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds8-105/+168
Merge more updates from Andrew Morton: - some of the rest of MM - various misc things - dynamic-debug updates - checkpatch - some epoll speedups - autofs - rapidio - lib/, lib/lzo/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits) samples/mic/mpssd/mpssd.h: remove duplicate header kernel/fork.c: remove duplicated include include/linux/relay.h: fix percpu annotation in struct rchan arch/nios2/mm/fault.c: remove duplicate include unicore32: stop printing the virtual memory layout MAINTAINERS: fix GTA02 entry and mark as orphan mm: create the new vm_fault_t type arm, s390, unicore32: remove oneliner wrappers for memblock_alloc() arch: simplify several early memory allocations openrisc: simplify pte_alloc_one_kernel() sh: prefer memblock APIs returning virtual address microblaze: prefer memblock API returning virtual address powerpc: prefer memblock APIs returning virtual address lib/lzo: separate lzo-rle from lzo lib/lzo: implement run-length encoding lib/lzo: fast 8-byte copy on arm64 lib/lzo: 64-bit CTZ on arm64 lib/lzo: tidy-up ifdefs ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size ipc: annotate implicit fall through ...
2019-03-08exec: increase BINPRM_BUF_SIZE to 256Oleg Nesterov1-1/+1
Large enterprise clients often run applications out of networked file systems where the IT mandated layout of project volumes can end up leading to paths that are longer than 128 characters. Bumping this up to the next order of two solves this problem in all but the most egregious case while still fitting into a 512b slab. [oleg@redhat.com: update comment, per Kees] Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Ben Woodard <woodard@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-08fs/exec.c: replace opencoded set_mask_bits()Vineet Gupta1-6/+1
Link: http://lkml.kernel.org/r/1548275584-18096-2-git-send-email-vgupta@synopsys.com Link: http://lkml.kernel.org/g/20150807115710.GA16897@redhat.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
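The commit message carries no description, so the following is only a sketch of what a `set_mask_bits()`-style helper does: atomically clear the bits in a mask and set new ones with a compare-and-exchange loop. It uses C11 atomics in userspace rather than the kernel's `cmpxchg()`, the name `set_mask_bits_sketch` is hypothetical, and returning the old value is a choice made here rather than a statement about the kernel macro.

```c
#include <stdatomic.h>

/* Atomically replace the bits selected by mask with bits.
 * Returns the value observed before the update. */
static unsigned long set_mask_bits_sketch(_Atomic unsigned long *ptr,
                                          unsigned long mask,
                                          unsigned long bits)
{
    unsigned long old = atomic_load(ptr);
    unsigned long new_val;

    do {
        new_val = (old & ~mask) | bits;
        /* On failure, old is refreshed with the current value and the
         * new value is recomputed on the next iteration. */
    } while (!atomic_compare_exchange_weak(ptr, &old, new_val));

    return old;
}
```

An open-coded version of this loop at each call site is easy to get subtly wrong, which is the usual motivation for routing it through one helper.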
2019-03-08fat: enable .splice_write to support splice on O_DIRECT fileHou Tao1-0/+1
Now splice() on O_DIRECT-opened fat file will return -EFAULT, that is because the default .splice_write, namely default_file_splice_write(), will construct an ITER_KVEC iov_iter and dio_refill_pages() in dio path can not handle it. Fix it by implementing .splice_write through iter_file_splice_write(). Spotted by xfs-tests generic/091. Link: http://lkml.kernel.org/r/20190210094754.56355-1-houtao1@huawei.com Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-08autofs: clear O_NONBLOCK on the pipeNeilBrown1-0/+2
autofs does not expect the pipe it is given to have O_NONBLOCK set - specifically if __kernel_write() in autofs_write() returns -EAGAIN, this is treated as a fatal error and the pipe is closed. For safety autofs should, therefore, clear the O_NONBLOCK flag. Releases of systemd prior to 8th February 2019 used pipe2(p, O_NONBLOCK|O_CLOEXEC) and thus (inadvertently) set this flag. Link: http://lkml.kernel.org/r/154993550902.3321.1183632970046073478.stgit@pluto-themaw-net Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
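The userspace side of this can be sketched with plain `fcntl(2)`. The helpers below are hypothetical names: `make_nonblocking_pipe` reproduces the state a `pipe2(p, O_NONBLOCK|O_CLOEXEC)` caller would hand over (minus the close-on-exec flag), and `clear_nonblock` does the equivalent of what the patch does on the kernel side, expressed with file status flags.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Create a pipe and set O_NONBLOCK on both ends.
 * Returns 0 on success, -1 on error (errno set). */
static int make_nonblocking_pipe(int fds[2])
{
    assert(fds != NULL);
    if (pipe(fds) < 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFL);
        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
}

/* Clear O_NONBLOCK on fd so writes block instead of failing with EAGAIN.
 * Returns 0 on success, -1 on error (errno set). */
static int clear_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    if (!(flags & O_NONBLOCK))
        return 0;
    return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
}
```

A daemon that cannot rely on a patched kernel could call something like `clear_nonblock()` on the write end before passing it to the mount.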