summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2012-07-29mknod: take sanity checks on mode into the very beginningAl Viro1-5/+3
Note that applying umask can't affect their results. While that affects errno in cases like mknod("/no_such_directory/a", 030000) yielding -EINVAL (due to impossible mode_t) instead of -ENOENT (due to inexistent directory), IMO that makes a lot more sense, POSIX allows to return either and any software that relies on getting -ENOENT instead of -EINVAL in that case deserves everything it gets. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-29new helper: done_path_create()Al Viro2-15/+13
releases what needs to be released after {kern,user}_path_create() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()Al Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()Al Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23switch dentry_open() to struct path, make it grab references itselfAl Viro9-63/+45
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23ecryptfs: don't reinvent the wheels, please - use struct completionAl Viro3-65/+26
... and keep the sodding requests on stack - they are small enough. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23don't expose I_NEW inodes via dentry->d_inodeAl Viro7-19/+19
d_instantiate(dentry, inode); unlock_new_inode(inode); is a bad idea; do it the other way round... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23tidy up namei.c a bitAl Viro1-18/+21
locking/unlocking for rcu walk taken to a couple of inline helpers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23unobfuscate follow_up() a bitAl Viro1-1/+1
really convoluted test in there has grown up during struct mount introduction; what it checks is that we'd reached the root of mount tree.
2012-07-23ext3: pass custom EOF to generic_file_llseek_size()Eric Sandeen1-2/+2
Use the new custom EOF argument to generic_file_llseek_size so that SEEK_END will go to the max hash value for htree dirs in ext3 rather than to i_size_read() Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23ext4: use core vfs llseek code for dir seeksEric Sandeen2-64/+17
Use the new functionality in generic_file_llseek_size() to accept a custom EOF position, and un-cut-and-paste all the vfs llseek code from ext4. Also fix up comments on ext4_llseek() to reflect reality. Signed-off-by: Eric Sandeen <sandeen@redaht.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23vfs: allow custom EOF in generic_file_llseek codeEric Sandeen3-10/+14
For ext3/4 htree directories, using the vfs llseek function with SEEK_END goes to i_size like for any other file, but in reality we want the maximum possible hash value. Recent changes in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c, but replicating this core code seems like a bad idea, especially since the copy has already diverged from the vfs. This patch updates generic_file_llseek_size to accept both a custom maximum offset, and a custom EOF position. With this in place, ext4_dir_llseek can pass in the appropriate maximum hash position for both maxsize and eof, and get what it wants. As far as I know, this does not fix any bugs - nfs in the kernel doesn't use SEEK_END, and I don't know of any user who does. But some ext4 folks seem keen on doing the right thing here, and I can't really argue. (Patch also fixes up some comments slightly) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder ↵Jan Kara1-10/+9
sync passes wakeup_flusher_threads(0) will queue work doing complete writeback for each flusher thread. Thus there is not much point in submitting another work doing full inode WB_SYNC_NONE writeback by writeback_inodes_sb(). After this change it does not make sense to call nonblocking ->sync_fs and block device flush before calling sync_inodes_sb() because wakeup_flusher_threads() is completely asynchronous and thus these functions would be called in parallel with inode writeback running which will effectively void any work they do. So we move sync_inodes_sb() call before these two functions. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Remove unnecessary flushing of block devicesJan Kara1-8/+8
It is not necessary to write block devices twice. The reason why we first did flush and then proper sync is that for_each_bdev() { write_bdev() wait_for_completion() } is much slower than for_each_bdev() write_bdev() for_each_bdev() wait_for_completion() when there is bigger amount of data. But as is seen in the above, there's no real need to scan pages and submit them twice. We just need to separate the submission and waiting part. This patch does that. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Make sys_sync writeout also block device inodesJan Kara1-7/+11
In case block device does not have filesystem mounted on it, sys_sync will just ignore it and doesn't writeout its dirty pages. This is because writeback code avoids writing inodes from superblock without backing device and blockdev_superblock is such a superblock. Since it's unexpected that sync doesn't writeout dirty data for block devices be nice to users and change the behavior to do so. So now we iterate over all block devices on blockdev_super instead of iterating over all superblocks when syncing block devices. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Create function for iterating over block devicesJan Kara1-0/+36
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Reorder operations during sys_syncJan Kara1-12/+34
Change the order of operations during sync from for_each_sb { writeback_inodes_sb(); sync_fs(nowait); __sync_blockdev(nowait); } for_each_sb { sync_inodes_sb(); sync_fs(wait); __sync_blockdev(wait); } to for_each_sb writeback_inodes_sb(); for_each_sb sync_fs(nowait); for_each_sb __sync_blockdev(nowait); for_each_sb sync_inodes_sb(); for_each_sb sync_fs(wait); for_each_sb __sync_blockdev(wait); This is a preparation for the following patches in this series. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22quota: Move quota syncing to ->sync_fs methodJan Kara7-3/+28
Since the moment writes to quota files are using block device page cache and space for quota structures is reserved at the moment they are first accessed we have no reason to sync quota before inode writeback. In fact this order is now only harmful since quota information can easily change during inode writeback (either because conversion of delayed-allocated extents or simply because of allocation of new blocks for simple filesystems not using page_mkwrite). So move syncing of quota information after writeback of inodes into ->sync_fs method. This way we do not have to use ->quota_sync callback which is primarily intended for use by quotactl syscall anyway and we get rid of calling ->sync_fs() twice unnecessarily. We skip quota syncing for OCFS2 since it does proper quota journalling in all cases (unlike ext3, ext4, and reiserfs which also support legacy non-journalled quotas) and thus there are no dirty quota structures. CC: "Theodore Ts'o" <tytso@mit.edu> CC: Joel Becker <jlbec@evilplan.org> CC: reiserfs-devel@vger.kernel.org Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Dave Kleikamp <shaggy@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22quota: Split dquot_quota_sync() to writeback and cache flushing partJan Kara7-11/+29
Split off part of dquot_quota_sync() which writes dquots into a quota file to a separate function. In the next patch we will use the function from filesystems and we do not want to abuse ->quota_sync quotactl callback more than necessary. Acked-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22vfs: Move noop_backing_dev_info check from sync into writebackJan Kara2-7/+5
In principle, a filesystem may want to have ->sync_fs() called during sync(1) although it does not have a bdi (i.e. s_bdi is set to noop_backing_dev_info). Only writeback code really needs bdi set to something reasonable. So move the checks where they are more logical. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/ufs: get rid of write_superArtem Bityutskiy5-16/+42
This patch makes UFS stop using the VFS '->write_super()' method along with the 's_dirt' superblock flag, because they are on their way out. The way we implement this is that we schedule a delay job instead relying on 's_dirt' and '->write_super()'. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblocks using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds, even if there are no diry superblocks, or there are no client file-systems which would need this (e.g., btrfs does not use '->write_super()'). So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super()' VFS service, and then remove it together with the kernel thread. Tested using fsstress from the LTP project. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/ufs: re-arrange the code a bitArtem Bityutskiy1-59/+58
This patch does not do any functional changes. It only moves 3 functions in fs/ufs/super.c a little bit up in order to prepare for further changes where I'll need this new arrangement to avoid forward declarations. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/ufs: remove extra superblock write on unmountArtem Bityutskiy1-3/+0
UFS calls 'ufs_write_super()' from 'ufs_put_super()' in order to write the superblocks to the media. However, it is not needed because VFS calls '->sync_fs()' before calling '->put_super()' - so by the time we are in 'ufs_write_super()', the superblocks are already synchronized. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/sysv: stop using write_super and s_dirtArtem Bityutskiy2-11/+0
It does not look like sysv FS needs 'write_super()' at all, because all it does is a timestamp update. I cannot test this patch, because this file-system is so old and probably has not been used by anyone for years, so there are no tools to create it in Linux. But from the code I see that marking the superblock as dirty is basically marking the superblock buffers as drity and then setting the s_dirt flag. And when 'write_super()' is executed to handle the s_dirt flag, we just update the timestamp and again mark the superblock buffer as dirty. Seems pointless. It looks like we can update the timestamp more opprtunistically - on unmount or remount of sync, and nothing should change. Thus, this patch removes 'sysv_write_super()' and 's_dirt'. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/sysv: remove another useless write_super callArtem Bityutskiy1-4/+1
We do not need to call 'sysv_write_super()' from 'sysv_remount()', because VFS has called 'sysv_sync_fs()' before calling '->remount()'. So remove it. Remove also '(un)lock_super()' which obvioulsy is becoming useless in this function. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22fs/sysv: remove useless write_super callArtem Bityutskiy1-3/+0
We do not need to call 'sysv_write_super()' from 'sysv_put_super()', because VFS has called 'sysv_sync_fs()' before calling '->put_super()'. So remove it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: get rid of hfs_sync_superArtem Bityutskiy4-39/+48
This patch makes hfs stop using the VFS '->write_super()' method along with the 's_dirt' superblock flag, because they are on their way out. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblocks using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds, even if there are no diry superblocks, or there are no client file-systems which would need this (e.g., btrfs does not use '->write_super()'). So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super()' VFS service, and then remove it together with the kernel thread. Tested using fsstress from the LTP project. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: introduce VFS superblock object back-referenceArtem Bityutskiy2-5/+2
Add an 'sb' VFS superblock back-reference to the 'struct hfs_sb_info' data structure - we will need to find the VFS superblock from a 'struct hfs_sb_info' object in the next patch, so this change is jut a preparation. Remove few useless newlines while on it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: simplify a bit checking for R/OArtem Bityutskiy3-4/+5
We have the following pattern in 2 places in HFS if (!RDONLY) hfs_mdb_commit(); This patch pushes the RDONLY check down to 'hfs_mdb_commit()'. This will make the following patches a bit simpler. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: remove extra mdb write on unmountArtem Bityutskiy1-2/+0
HFS calls 'hfs_write_super()' from 'hfs_put_super()' in order to write the MDB to the media. However, it is not needed because VFS calls '->sync_fs()' before calling '->put_super()' - so by the time we are in 'hfs_write_super()', the MDB is already synchronized. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: get rid of lock_superArtem Bityutskiy1-2/+10
Stop using lock_super for serializing the MDB changes - use the buffer-head own lock instead. Tested with fsstress. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfs: push lock_super downArtem Bityutskiy3-6/+2
HFS uses 'lock_super()'/'unlock_super()' around 'hfs_mdb_commit()' in order to serialize MDB (Master Directory Block) changes. Push it down to 'hfs_mdb_commit()' in order to simplify the code a bit. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfsplus: get rid of write_superArtem Bityutskiy5-16/+43
This patch makes hfsplus stop using the VFS '->write_super()' method along with the 's_dirt' superblock flag, because they are on their way out. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblocks using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds, even if there are no diry superblocks, or there are no client file-systems which would need this (e.g., btrfs does not use '->write_super()'). So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super()' VFS service, and then remove it together with the kernel thread. Tested using fsstress from the LTP project. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfsplus: remove useless checkArtem Bityutskiy1-3/+0
This check is useless because we always have 'sb->s_fs_info' to be non-NULL. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfsplus: amend debugging printArtem Bityutskiy1-1/+1
Print correct function name in the debugging print of the 'hfsplus_sync_fs()' function. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22hfsplus: make hfsplus_sync_fs staticArtem Bityutskiy2-2/+1
... because it is used only in fs/hfsplus/super.c. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22aio: now fput() is OK from interrupt context; get rid of manual delayed __fput()Al Viro1-70/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22switch fput to task_work_addAl Viro1-2/+70
... and schedule_work() for interrupt/kernel_thread callers (and yes, now it *is* OK to call from interrupt). We are guaranteed that __fput() will be done before we return to userland (or exit). Note that for fput() from a kernel thread we get an async behaviour; it's almost always OK, but sometimes you might need to have __fput() completed before you do anything else. There are two mechanisms for that - a general barrier (flush_delayed_fput()) and explicit __fput_sync(). Both should be used with care (as was the case for fput() from kernel threads all along). See comments in fs/file_table.c for details. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22use __lookup_hash() in kern_path_parent()Al Viro1-1/+1
No need to bother with lookup_one_len() here - it's an overkill Signed-off-by Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Split inode_permission()David Howells2-17/+54
Split inode_permission() into inode- and superblock-dependent parts. This is aimed at unionmounts where the superblock from the upper layer has to be checked rather than the superblock from the lower layer as the upper layer may be writable, thus allowing an unwritable file from the lower layer to be copied up and modified. Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> (Further development) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Pass mount flags to sget()David Howells17-45/+37
Pass mount flags to sget() so that it can use them in initialising a new superblock before the set function is called. They could also be passed to the compare function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Comment mount following codeDavid Howells2-2/+24
Add comments describing what the directions "up" and "down" mean and ref count handling to the VFS mount following family of functions. Signed-off-by: Valerie Aurora <vaurora@redhat.com> (Original author) Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errorsDavid Howells2-57/+68
copy_tree() can theoretically fail in a case other than ENOMEM, but always returns NULL which is interpreted by callers as -ENOMEM. Change it to return an explicit error. Also change clone_mnt() for consistency and because union mounts will add new error cases. Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix. [AV: folded braino fix by Dan Carpenter] Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: Valerie Aurora <valerie.aurora@gmail.com> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Make chown() and lchown() call fchownat()David Howells1-34/+7
Make the chown() and lchown() syscalls jump to the fchownat() syscall with the appropriate extra arguments. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14do_dentry_open(): close the race with mark_files_ro() in failure exitAl Viro1-1/+1
we want to take it out of mark_files_ro() reach *before* we start checking if we ought to drop write access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14mark_files_ro(): don't bother with mntget/mntputAl Viro1-8/+1
mnt_drop_write_file() is safe under any lock Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14notify_change(): check that i_mutex is heldAndrew Morton1-1/+2
Cc: Djalal Harouni <tixxdz@opendz.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14fs: add nd_jump_linkChristoph Hellwig2-12/+18
Add a helper that abstracts out the jump to an already parsed struct path from ->follow_link operation from procfs. Not only does this clean up the code by moving the two sides of this game into a single helper, but it also prepares for making struct nameidata private to namei.c Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14fs: move path_put on failure out of ->follow_linkChristoph Hellwig2-6/+9
Currently the non-nd_set_link based versions of ->follow_link are expected to do a path_put(&nd->path) on failure. This calling convention is unexpected, undocumented and doesn't match what the nd_set_link-based instances do. Move the path_put out of the only non-nd_set_link based ->follow_link instance into the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14debugfs: get rid of useless arguments to debugfs_{mkdir,symlink}Al Viro1-11/+9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>