path: root/fs
Age | Commit message | Author | Files | Lines
2016-05-13 | gfs2: switch to ->iterate_shared() | Al Viro | 1 | -2/+2
protected by glock and already used without locking the directory by gfs2_get_name()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-10 | f2fs: switch to ->iterate_shared() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-10 | afs: switch to ->iterate_shared() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-10 | befs: switch to ->iterate_shared() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-10 | befs: constify stuff a bit | Al Viro | 6 | -31/+34
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | isofs: switch to ->iterate_shared() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | get_acorn_filename(): deobfuscate a bit | Al Viro | 1 | -1/+1
Lots of Idiotic Silly Parentheses is -> that way... What that condition checks is that there are exactly 32 bytes between the end of the name and the end of the entire directory record.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | btrfs: switch to ->iterate_shared() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | logfs: no need to lock directory in lseek | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | switch ecryptfs to ->iterate_shared | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | Merge branch 'for-linus' into work.lookups | Al Viro | 2 | -19/+65
2016-05-09 | 9p: switch to ->iterate_shared() | Al Viro | 1 | -2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | fat: switch to ->iterate_shared() | Al Viro | 1 | -3/+3
... and make that weird ioctl lock directory only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | romfs, squashfs: switch to ->iterate_shared() | Al Viro | 2 | -4/+4
don't need to lock directory in ->llseek(), either
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | more trivial ->iterate_shared conversions | Al Viro | 16 | -17/+16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | kernfs: no point locking directory around that generic_file_llseek() | Al Viro | 1 | -15/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | configfs_readdir(): make safe under shared lock | Al Viro | 1 | -13/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 | nfs: per-name sillyunlink exclusion | Al Viro | 5 | -154/+56
use d_alloc_parallel() for sillyunlink/lookup exclusion and explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink() - a reader) for rmdir/sillyunlink one. That ought to make lookup/readdir/!O_CREAT atomic_open really parallel on NFS.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-08 | get_rock_ridge_filename(): handle malformed NM entries | Al Viro | 1 | -3/+10
Payloads of NM entries are not supposed to contain NUL. When we run into such, only the part prior to the first NUL goes into the concatenation (i.e. the directory entry name being encoded by a bunch of NM entries). We do stop when the amount collected so far + the claimed amount in the current NM entry exceed 254.

So far, so good, but what we return as the total length is the sum of *claimed* sizes, not the actual amount collected. And that can grow pretty large - not unlimited, since you'd need to put CE entries in between to be able to get more than the maximum that could be contained in one isofs directory entry / continuation chunk and we stop once we've encountered 32 CEs, but you can get about 8Kb easily. And that's what will be passed to readdir callback as the name length. The result is an 8Kb __copy_to_user() from a buffer allocated by __get_free_page().

Cc: stable@vger.kernel.org # 0.98pl6+ (yes, really)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
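The length accounting described in the get_rock_ridge_filename() message above can be illustrated with a small userspace sketch. This is not the kernel code: struct nm_chunk, collect_name() and NAME_LIMIT are hypothetical names invented for the illustration, and the only things taken from the commit message are the 254-byte limit, the truncation at the first NUL, and the point that the returned length must be the amount actually collected rather than the sum of claimed sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LIMIT 254 /* limit quoted in the commit message */

/* Hypothetical stand-in for one NM entry: payload plus its claimed size. */
struct nm_chunk {
    const char *payload;
    size_t claimed_len;
};

/*
 * Concatenate chunks into out, which must hold NAME_LIMIT + 1 bytes.
 * Returns the number of bytes actually stored, not the sum of claimed sizes.
 */
size_t collect_name(const struct nm_chunk *chunks, size_t n, char *out)
{
    size_t collected = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = chunks[i].claimed_len;
        const char *nul;

        /* Stop once collected + claimed would exceed the limit. */
        if (collected + len > NAME_LIMIT)
            break;

        /* A malformed payload may contain NUL: keep only the part before it. */
        nul = memchr(chunks[i].payload, '\0', len);
        if (nul)
            len = (size_t)(nul - chunks[i].payload);

        memcpy(out + collected, chunks[i].payload, len);
        collected += len; /* count what was copied, not what was claimed */
    }

    assert(collected <= NAME_LIMIT);
    out[collected] = '\0';
    return collected;
}
```

With a first chunk claiming 5 bytes but containing a NUL after 2, the sketch advances by 2, so the length handed to a caller can never exceed what is in the buffer.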
2016-05-04 | ecryptfs: fix handling of directory opening | Al Viro | 1 | -16/+55
First of all, trying to open them r/w is idiocy; it's guaranteed to fail. Moreover, assigning ->f_pos and assuming that everything will work is blatantly broken - try that with e.g. tmpfs as underlying layer and watch the fireworks. There may be a non-trivial amount of state associated with current IO position, well beyond the numeric offset. Using the single struct file associated with underlying inode is really not a good idea; we ought to open one for each ecryptfs directory struct file.

Additionally, file_operations both for directories and non-directories are full of pointless methods; non-directories should *not* have ->iterate(), directories should not have ->flush(), ->fasync() and ->splice_read().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | nfs: switch to ->iterate_shared() | Al Viro | 1 | -28/+43
aside of the usual care about seeding dcache from readdir, we need to be careful about the pagecache evictions here.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | lookup_open(): lock the parent shared unless O_CREAT is given | Al Viro | 1 | -3/+9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | lookup_open(): put the dentry fed to ->lookup() or ->atomic_open() into in-lookup hash | Al Viro | 1 | -11/+26
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | lookup_open(): expand the call of real_lookup() | Al Viro | 1 | -3/+10
... and lose the duplicate IS_DEADDIR() - we'd already checked that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): reorder and clean up a bit | Al Viro | 1 | -34/+27
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | lookup_open(): lift the "fallback to !O_CREAT" logics from atomic_open() | Al Viro | 1 | -89/+55
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): be paranoid about may_open() return value | Al Viro | 1 | -0/+2
It should never return positives; however, with Linux S&M crowd involved, no bogosity is impossible. Results would be unpleasant...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): delay open_to_namei_flags() until the method call | Al Viro | 1 | -3/+4
nobody else needs that transformation.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | do_last(): take fput() on error after opening to out: | Al Viro | 1 | -17/+5
make it conditional on *opened & FILE_OPENED; in addition to getting rid of exit_fput: thing, it simplifies atomic_open() cleanup on may_open() failure.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | do_last(): get rid of duplicate ELOOP check | Al Viro | 1 | -4/+0
may_open() will catch it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): massage the create_error logics a bit | Al Viro | 1 | -23/+20
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): consolidate "overridden ENOENT" in open-yourself cases | Al Viro | 1 | -8/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | atomic_open(): don't bother with EEXIST check - it's done in do_last() | Al Viro | 1 | -5/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | Merge branch 'for-linus' into work.lookups | Al Viro | 2 | -16/+7
2016-05-03 | lookup_open(): expand the call of vfs_create() | Al Viro | 1 | -9/+12
Lift IS_DEADDIR handling up into the part common with atomic_open(), remove it from the latter. Collapse permission checks into the call of may_o_create(), getting it closer to atomic_open() case.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | path_openat(): take O_PATH handling out of do_last() | Al Viro | 1 | -7/+24
do_last() and lookup_open() are simpler that way and so is O_PATH itself. As it bloody well should be: we find what the pathname resolves to, same way as in stat() et al., and associate it with FMODE_PATH struct file.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | simple local filesystems: switch to ->iterate_shared() | Al Viro | 6 | -6/+6
no changes needed (XFS isn't simple, but it has the same parallelism in the interesting parts exercised from CXFS).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | dcache_{readdir,dir_lseek}() users: switch to ->iterate_shared | Al Viro | 2 | -6/+3
no need to lock directory in dcache_dir_lseek(), while we are at it - per-struct file exclusion is enough.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | cifs: switch to ->iterate_shared() | Al Viro | 2 | -27/+30
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | fuse: switch to ->iterate_shared() | Al Viro | 1 | -49/+45
Switch dcache pre-seeding on readdir to d_alloc_parallel(); nothing else is needed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | switch all procfs directories ->iterate_shared() | Al Viro | 7 | -20/+21
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | proc_sys_fill_cache(): switch to d_alloc_parallel() | Al Viro | 1 | -7/+8
make it usable with directory locked shared
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | proc_fill_cache(): switch to d_alloc_parallel() | Al Viro | 1 | -5/+10
... making it usable with directory locked shared
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | introduce a parallel variant of ->iterate() | Al Viro | 3 | -11/+29
New method: ->iterate_shared(). Same arguments as in ->iterate(), called with the directory locked only shared. Once all filesystems switch, the old one will be gone.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
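One way to picture the transition period described in "introduce a parallel variant of ->iterate()" is a dispatcher that prefers the new method and takes the directory lock shared only for converted filesystems. The sketch below is a userspace miniature under that assumption, not the kernel's readdir code: struct mini_dir_ops, choose_iterate(), demo_readdir() and the two example tables are hypothetical, and the -ENOTDIR result for a table with neither method is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum lock_mode { LOCK_EXCLUSIVE, LOCK_SHARED };

typedef int (*iterate_fn)(void *dir, void *ctx);

/* Hypothetical miniature of the two directory-iteration methods. */
struct mini_dir_ops {
    iterate_fn iterate;        /* old method: directory locked exclusive */
    iterate_fn iterate_shared; /* new method: directory locked shared */
};

struct iterate_choice {
    iterate_fn fn;
    enum lock_mode mode;
};

/* Prefer ->iterate_shared(); fall back to ->iterate() under exclusive lock. */
int choose_iterate(const struct mini_dir_ops *ops, struct iterate_choice *out)
{
    assert(ops && out);

    if (ops->iterate_shared) {
        out->fn = ops->iterate_shared;
        out->mode = LOCK_SHARED;
        return 0;
    }
    if (ops->iterate) {
        out->fn = ops->iterate;
        out->mode = LOCK_EXCLUSIVE;
        return 0;
    }
    return -ENOTDIR; /* assumption: no iteration method means not a directory */
}

/* Placeholder readdir body used by both example tables. */
int demo_readdir(void *dir, void *ctx)
{
    (void)dir;
    (void)ctx;
    return 0;
}

const struct mini_dir_ops converted_ops = { .iterate_shared = demo_readdir };
const struct mini_dir_ops legacy_ops = { .iterate = demo_readdir };
```

Because both methods take the same arguments, a filesystem whose readdir is already safe under a shared lock can convert by moving one function pointer, which matches the many -1/+1 entries in this log.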
2016-05-03 | give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() | Al Viro | 5 | -25/+18
same as read() on regular files has, and for the same reason.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | parallel lookups: actual switch to rwsem | Al Viro | 9 | -26/+34
ta-da! The main issue is the lack of down_write_killable(), so the places like readdir.c switched to plain inode_lock(); once killable variants of rwsem primitives appear, that'll be dealt with. lockdep side also might need more work
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | parallel lookups machinery, part 4 (and last) | Al Viro | 2 | -21/+76
If we *do* run into an in-lookup match, we need to wait for it to cease being in-lookup. Fortunately, we do have unused space in in-lookup dentries - d_lru is never looked at until it stops being in-lookup. So we can stash a pointer to wait_queue_head from stack frame of the caller of ->lookup().

Some precautions are needed while waiting, but it's not that hard - we do hold a reference to dentry we are waiting for, so it can't go away. If it's found to be in-lookup the wait_queue_head is still alive and will remain so at least while ->d_lock is held. Moreover, the condition we are waiting for becomes true at the same point where everything on that wq gets woken up, so we can just add ourselves to the queue once.

d_alloc_parallel() gets a pointer to wait_queue_head_t from its caller; lookup_slow() adjusted, d_add_ci() taught to use d_alloc_parallel() if the dentry passed to it happens to be in-lookup one (i.e. if it's been called from the parallel lookup).

That's pretty much it - all that remains is to switch ->i_mutex to rwsem and have lookup_slow() take it shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | parallel lookups machinery, part 3 | Al Viro | 2 | -25/+123
We will need to be able to check if there is an in-lookup dentry with matching parent/name. Right now it's impossible, but as soon as we start locking directories shared such beasts will appear.

Add a secondary hash for locating those. Hash chains go through the same space where d_alias will be once it's not in-lookup anymore. Search is done under the same bitlock we use for modifications - with the primary hash we can rely on d_rehash() into the wrong chain being the worst that could happen, but here the pointers are buggered once it's removed from the chain. On the other hand, the chains are not going to be long and normally we'll end up adding to the chain anyway. That allows us to avoid bothering with ->d_lock when doing the comparisons - everything is stable until removed from chain.

New helper: d_alloc_parallel(). Right now it allocates, verifies that no hashed and in-lookup matches exist and adds to in-lookup hash. Returns ERR_PTR() for error, hashed match (in the unlikely case it's been found) or new dentry. In-lookup matches trigger BUG() for now; that will change in the next commit when we introduce waiting for ongoing lookup to finish. Note that in-lookup matches won't be possible until we actually go for shared locking.

lookup_slow() switched to use of d_alloc_parallel().

Again, these commits are separated only for making it easier to review. All this machinery will start doing something useful only when we go for shared locking; it's just that the combination is too large for my taste.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03 | parallel lookups machinery, part 2 | Al Viro | 2 | -2/+33
We'll need to verify that there's neither a hashed nor in-lookup dentry with desired parent/name before adding to in-lookup set.

One possible solution would be to hold the parent's ->d_lock through both checks, but while the in-lookup set is relatively small at any time, dcache is not. And holding the parent's ->d_lock through something like __d_lookup_rcu() would suck too badly.

So we leave the parent's ->d_lock alone, which means that we watch out for the following scenario:
* we verify that there's no hashed match
* existing in-lookup match gets hashed by another process
* we verify that there's no in-lookup matches and decide that everything's fine.

Solution: per-directory kinda-sorta seqlock, bumped around the times we hash something that used to be in-lookup or move (and hash) something in place of in-lookup. Then the above would turn into
* read the counter
* do dcache lookup
* if no matches found, check for in-lookup matches
* if there had been none of those either, check if the counter has changed; repeat if it has.

The "kinda-sorta" part is due to the fact that we don't have much spare space in inode. There is a spare word (shared with i_bdev/i_cdev/i_pipe), so the counter part is not a problem, but spinlock is a different story.

We could use the parent's ->d_lock, and it would be less painful in terms of contention, but for __d_add() it would be rather inconvenient to grab; we could do that (using lock_parent()), but...

Fortunately, we can get serialization on the counter itself, and it might be a good idea in general; we can use cmpxchg() in a loop to get from even to odd and smp_store_release() from odd to even.

This commit adds the counter and updating logics; the readers will be added in the next commit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
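The counter protocol from "parallel lookups machinery, part 2" (even to odd via a compare-and-swap loop, odd to even via a release store, readers retrying if the counter moved) can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions rather than the kernel implementation: struct dir_seq, start_dir_add(), end_dir_add(), check_no_match() and the probe callbacks are hypothetical names, and C11 atomic_compare_exchange_weak_explicit() and atomic_store_explicit() stand in for the kernel's cmpxchg() and smp_store_release().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-directory counter: even = idle, odd = update in progress. */
struct dir_seq {
    atomic_uint seq;
};

/* Writer entry: spin until we move the counter from even to odd. */
unsigned start_dir_add(struct dir_seq *d)
{
    for (;;) {
        unsigned n = atomic_load_explicit(&d->seq, memory_order_relaxed);

        if (!(n & 1) &&
            atomic_compare_exchange_weak_explicit(&d->seq, &n, n + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return n; /* the even value we started from */
    }
}

/* Writer exit: publish the next even value with release semantics. */
void end_dir_add(struct dir_seq *d, unsigned n)
{
    atomic_store_explicit(&d->seq, n + 2, memory_order_release);
}

enum outcome { FOUND_HASHED, FOUND_IN_LOOKUP, NOT_FOUND };

typedef bool (*probe_fn)(void *arg);

/* Reader: read counter, check both sets, retry if the counter has changed. */
enum outcome check_no_match(struct dir_seq *d, probe_fn hashed,
                            probe_fn in_lookup, void *arg)
{
    for (;;) {
        /* Masking the low bit makes an in-progress update force a retry. */
        unsigned seq = atomic_load_explicit(&d->seq, memory_order_acquire) & ~1u;

        if (hashed(arg))
            return FOUND_HASHED;
        if (in_lookup(arg))
            return FOUND_IN_LOOKUP;
        if (atomic_load_explicit(&d->seq, memory_order_acquire) == seq)
            return NOT_FOUND;
    }
}

/* Placeholder probe that never finds a match. */
bool demo_no_match(void *arg)
{
    (void)arg;
    return false;
}
```

Serializing writers on the counter itself is what avoids the need for a separate spinlock word, which is the space constraint the commit message calls out.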
2016-05-03 | beginning of transition to parallel lookups - marking in-lookup dentries | Al Viro | 2 | -0/+17
marked as such when (would be) parallel lookup is about to pass them to actual ->lookup(); unmarked when
* __d_add() is about to make it hashed, positive or not.
* __d_move() (from d_splice_alias(), directly or via __d_unalias()) puts a preexisting dentry in its place
* in caller of ->lookup() if it has escaped all of the above.

Bug (WARN_ON, actually) if it reaches the final dput() or d_instantiate() while still marked such.

As the result, we are guaranteed that for as long as the flag is set, dentry will
* remain negative unhashed with positive refcount
* never have its ->d_alias looked at
* never have its ->d_lru looked at
* never have its ->d_parent and ->d_name changed

Right now we have at most one such for any given parent directory. With parallel lookups that restriction will weaken to
* only exist when parent is locked shared
* at most one with given (parent,name) pair (comparison of names is according to ->d_compare())
* only exist when there's no hashed dentry with the same (parent,name)

Transition will take the next several commits; unfortunately, we'll only be able to switch to rwsem at the end of this series. The reason for not making it a single patch is to simplify review.

New primitives: d_in_lookup() (a predicate checking if dentry is in the in-lookup state) and d_lookup_done() (tells the system that we are done with lookup and if it's still marked as in-lookup, it should cease to be such).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
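The two primitives named in the last message, a predicate and an idempotent "done" call, amount to a flag with a one-way transition. The sketch below shows that shape in userspace C; it assumes nothing about the kernel's actual flag value or locking, and struct mini_dentry, MINI_PAR_LOOKUP, mini_mark_in_lookup(), mini_in_lookup() and mini_lookup_done() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MINI_PAR_LOOKUP 0x1u /* hypothetical flag bit for the sketch */

/* Hypothetical miniature of a dentry: only the flags word matters here. */
struct mini_dentry {
    unsigned flags;
};

/* Mark the dentry just before it is handed to a lookup method. */
void mini_mark_in_lookup(struct mini_dentry *d)
{
    d->flags |= MINI_PAR_LOOKUP;
}

/* Predicate: is the dentry still in the in-lookup state? */
bool mini_in_lookup(const struct mini_dentry *d)
{
    return (d->flags & MINI_PAR_LOOKUP) != 0;
}

/* Safe to call whether or not the mark was already cleared elsewhere. */
void mini_lookup_done(struct mini_dentry *d)
{
    if (mini_in_lookup(d))
        d->flags &= ~MINI_PAR_LOOKUP;
}
```

Making the "done" call a no-op on an unmarked dentry is what lets the caller of the lookup method invoke it unconditionally, covering the "escaped all of the above" case from the list.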