summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-05-03lookup_open(): lift the "fallback to !O_CREAT" logics from atomic_open()Al Viro1-89/+55
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03atomic_open(): be paranoid about may_open() return valueAl Viro1-0/+2
It should never return positives; however, with Linux S&M crowd involved, no bogosity is impossible. Results would be unpleasant... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03atomic_open(): delay open_to_namei_flags() until the method callAl Viro1-3/+4
nobody else needs that transformation. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03do_last(): take fput() on error after opening to out:Al Viro1-17/+5
make it conditional on *opened & FILE_OPENED; in addition to getting rid of exit_fput: thing, it simplifies atomic_open() cleanup on may_open() failure. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03do_last(): get rid of duplicate ELOOP checkAl Viro1-4/+0
may_open() will catch it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03atomic_open(): massage the create_error logics a bitAl Viro1-23/+20
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03atomic_open(): consolidate "overridden ENOENT" in open-yourself casesAl Viro1-8/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03atomic_open(): don't bother with EEXIST check - it's done in do_last()Al Viro1-5/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03Merge branch 'for-linus' into work.lookupsAl Viro5-55/+35
2016-05-03lookup_open(): expand the call of vfs_create()Al Viro1-9/+12
Lift IS_DEADDIR handling up into the part common with atomic_open(), remove it from the latter. Collapse permission checks into the call of may_o_create(), getting it closer to atomic_open() case. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03path_openat(): take O_PATH handling out of do_last()Al Viro1-7/+24
do_last() and lookup_open() simpler that way and so does O_PATH itself. As it bloody well should: we find what the pathname resolves to, same way as in stat() et.al. and associate it with FMODE_PATH struct file. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03simple local filesystems: switch to ->iterate_shared()Al Viro6-6/+6
no changes needed (XFS isn't simple, but it has the same parallelism in the interesting parts exercised from CXFS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03dcache_{readdir,dir_lseek}() users: switch to ->iterate_sharedAl Viro3-7/+4
no need to lock directory in dcache_dir_lseek(), while we are at it - per-struct file exclusion is enough. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03cifs: switch to ->iterate_shared()Al Viro2-27/+30
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03fuse: switch to ->iterate_shared()Al Viro1-49/+45
Switch dcache pre-seeding on readdir to d_alloc_parallel(); nothing else is needed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03switch all procfs directories ->iterate_shared()Al Viro7-20/+21
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03proc_sys_fill_cache(): switch to d_alloc_parallel()Al Viro1-7/+8
make it usable with directory locked shared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03proc_fill_cache(): switch to d_alloc_parallel()Al Viro1-5/+10
... making it usable with directory locked shared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03introduce a parallel variant of ->iterate()Al Viro5-11/+48
New method: ->iterate_shared(). Same arguments as in ->iterate(), called with the directory locked only shared. Once all filesystems switch, the old one will be gone. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03give readdir(2)/getdents(2)/etc. uniform exclusion with lseek()Al Viro7-27/+33
same as read() on regular files has, and for the same reason. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03parallel lookups: actual switch to rwsemAl Viro11-32/+73
ta-da! The main issue is the lack of down_write_killable(), so the places like readdir.c switched to plain inode_lock(); once killable variants of rwsem primitives appear, that'll be dealt with. lockdep side also might need more work Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03parallel lookups machinery, part 4 (and last)Al Viro3-23/+82
If we *do* run into an in-lookup match, we need to wait for it to cease being in-lookup. Fortunately, we do have unused space in in-lookup dentries - d_lru is never looked at until it stops being in-lookup. So we can stash a pointer to wait_queue_head from stack frame of the caller of ->lookup(). Some precautions are needed while waiting, but it's not that hard - we do hold a reference to dentry we are waiting for, so it can't go away. If it's found to be in-lookup the wait_queue_head is still alive and will remain so at least while ->d_lock is held. Moreover, the condition we are waiting for becomes true at the same point where everything on that wq gets woken up, so we can just add ourselves to the queue once. d_alloc_parallel() gets a pointer to wait_queue_head_t from its caller; lookup_slow() adjusted, d_add_ci() taught to use d_alloc_parallel() if the dentry passed to it happens to be in-lookup one (i.e. if it's been called from the parallel lookup). That's pretty much it - all that remains is to switch ->i_mutex to rwsem and have lookup_slow() take it shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03parallel lookups machinery, part 3Al Viro3-25/+125
We will need to be able to check if there is an in-lookup dentry with matching parent/name. Right now it's impossible, but as soon as start locking directories shared such beasts will appear. Add a secondary hash for locating those. Hash chains go through the same space where d_alias will be once it's not in-lookup anymore. Search is done under the same bitlock we use for modifications - with the primary hash we can rely on d_rehash() into the wrong chain being the worst that could happen, but here the pointers are buggered once it's removed from the chain. On the other hand, the chains are not going to be long and normally we'll end up adding to the chain anyway. That allows us to avoid bothering with ->d_lock when doing the comparisons - everything is stable until removed from chain. New helper: d_alloc_parallel(). Right now it allocates, verifies that no hashed and in-lookup matches exist and adds to in-lookup hash. Returns ERR_PTR() for error, hashed match (in the unlikely case it's been found) or new dentry. In-lookup matches trigger BUG() for now; that will change in the next commit when we introduce waiting for ongoing lookup to finish. Note that in-lookup matches won't be possible until we actually go for shared locking. lookup_slow() switched to use of d_alloc_parallel(). Again, these commits are separated only for making it easier to review. All this machinery will start doing something useful only when we go for shared locking; it's just that the combination is too large for my taste. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03parallel lookups machinery, part 2Al Viro5-3/+44
We'll need to verify that there's neither a hashed nor in-lookup dentry with desired parent/name before adding to in-lookup set. One possible solution would be to hold the parent's ->d_lock through both checks, but while the in-lookup set is relatively small at any time, dcache is not. And holding the parent's ->d_lock through something like __d_lookup_rcu() would suck too badly. So we leave the parent's ->d_lock alone, which means that we watch out for the following scenario: * we verify that there's no hashed match * existing in-lookup match gets hashed by another process * we verify that there's no in-lookup matches and decide that everything's fine. Solution: per-directory kinda-sorta seqlock, bumped around the times we hash something that used to be in-lookup or move (and hash) something in place of in-lookup. Then the above would turn into * read the counter * do dcache lookup * if no matches found, check for in-lookup matches * if there had been none of those either, check if the counter has changed; repeat if it has. The "kinda-sorta" part is due to the fact that we don't have much spare space in inode. There is a spare word (shared with i_bdev/i_cdev/i_pipe), so the counter part is not a problem, but spinlock is a different story. We could use the parent's ->d_lock, and it would be less painful in terms of contention, for __d_add() it would be rather inconvenient to grab; we could do that (using lock_parent()), but... Fortunately, we can get serialization on the counter itself, and it might be a good idea in general; we can use cmpxchg() in a loop to get from even to odd and smp_store_release() from odd to even. This commit adds the counter and updating logics; the readers will be added in the next commit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03beginning of transition to parallel lookups - marking in-lookup dentriesAl Viro3-0/+35
marked as such when (would be) parallel lookup is about to pass them to actual ->lookup(); unmarked when * __d_add() is about to make it hashed, positive or not. * __d_move() (from d_splice_alias(), directly or via __d_unalias()) puts a preexisting dentry in its place * in caller of ->lookup() if it has escaped all of the above. Bug (WARN_ON, actually) if it reaches the final dput() or d_instantiate() while still marked such. As the result, we are guaranteed that for as long as the flag is set, dentry will * remain negative unhashed with positive refcount * never have its ->d_alias looked at * never have its ->d_lru looked at * never have its ->d_parent and ->d_name changed Right now we have at most one such for any given parent directory. With parallel lookups that restriction will weaken to * only exist when parent is locked shared * at most one with given (parent,name) pair (comparison of names is according to ->d_compare()) * only exist when there's no hashed dentry with the same (parent,name) Transition will take the next several commits; unfortunately, we'll only be able to switch to rwsem at the end of this series. The reason for not making it a single patch is to simplify review. New primitives: d_in_lookup() (a predicate checking if dentry is in the in-lookup state) and d_lookup_done() (tells the system that we are done with lookup and if it's still marked as in-lookup, it should cease to be such). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03__d_add(): don't drop/regain ->d_lockAl Viro1-3/+11
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03lookup_slow(): bugger off on IS_DEADDIR() from the very beginningAl Viro1-6/+17
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03nfs: missing wakeup in nfs_unblock_sillyrename()Al Viro1-0/+1
will be needed as soon as lookups are not serialized by ->i_mutex Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03make ext2_get_page() and friends work without external serializationAl Viro5-35/+35
Right now ext2_get_page() (and its analogues in a bunch of other filesystems) relies upon the directory being locked - the way it sets and tests Checked and Error bits would be racy without that. Switch to a slightly different scheme, _not_ setting Checked in case of failure. That way the logics becomes if Checked => OK else if Error => fail else if !validate => fail else => OK with validation setting Checked or Error on success and failure resp. and returning which one had happened. Equivalent to the current logics, but unlike the current logics not sensitive to the order of set_bit, test_bit getting reordered by CPU, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03ovl_lookup_real(): use lookup_one_len_unlocked()Al Viro1-3/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03reconnect_one(): use lookup_one_len_unlocked()Al Viro1-3/+7
... and explain the non-obvious logics in case when lookup yields a different dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03reiserfs: open-code reiserfs_mutex_lock_safe() in reiserfs_unpack()Al Viro1-1/+5
... and have it use inode_lock() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03orangefs: don't open-code inode_lock/inode_unlockAl Viro2-4/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03ocfs2: don't open-code inode_lock/inode_unlockAl Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03configfs_detach_prep(): make sure that wait_mutex won't go awayAl Viro1-8/+9
grab a reference to dentry we'd got the sucker from, and return that dentry via *wait, rather than just returning the address of ->i_mutex. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03kernfs: use lookup_one_len_unlocked()Al Viro1-3/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03security_d_instantiate(): move to the point prior to attaching dentry to inodeAl Viro1-8/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03Merge getxattr prototype change into work.lookupsAl Viro107-458/+414
The rest of work.xattr stuff isn't needed for this branch
2016-04-30atomic_open(): fix the handling of create_errorAl Viro1-16/+4
* if we have a hashed negative dentry and either CREAT|EXCL on r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL with failing may_o_create(), we should fail with EROFS or the error may_o_create() has returned, but not ENOENT. Which is what the current code ends up returning. * if we have CREAT|TRUNC hitting a regular file on a read-only filesystem, we can't fail with EROFS here. At the very least, not until we'd done follow_managed() - we might have a writable file (or a device, for that matter) bound on top of that one. Moreover, the code downstream will see that O_TRUNC and attempt to grab the write access (*after* following possible mount), so if we really should fail with EROFS, it will happen. No need to do that inside atomic_open(). The real logics is much simpler than what the current code is trying to do - if we decided to go for simple lookup, ended up with a negative dentry *and* had create_error set, fail with create_error. No matter whether we'd got that negative dentry from lookup_real() or had found it in dcache. Cc: stable@vger.kernel.org # v3.6+ Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11->getxattr(): pass dentry and inode as separate argumentsAl Viro34-85/+94
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11Linux 4.6-rc3v4.6-rc3Linus Torvalds1-1/+1
2016-04-11xattr_handler: pass dentry and inode as separate arguments of ->get()Al Viro31-114/+113
... and do not assume they are already attached to each other Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds5-7/+13
Pull ARM fixes from Russell King: "A couple of small fixes, and wiring up the new syscalls which appeared during the merge window" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8550/1: protect idiv patching against undefined gcc behavior ARM: wire up preadv2 and pwritev2 syscalls ARM: SMP enable of cache maintanence broadcast
2016-04-11Merge tag 'mmc-v4.6-rc1' of git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds5-9/+84
Pull MMC fixes from Ulf Hansson: "Here are a couple of mmc fixes intended for v4.6 rc3: MMC host: - sdhci: Fix regression setting power on Trats2 board - sdhci-pci: Add support and PCI IDs for more Broxton host controllers" * tag 'mmc-v4.6-rc1' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: sdhci-pci: Add support and PCI IDs for more Broxton host controllers mmc: sdhci: Fix regression setting power on Trats2 board
2016-04-11Merge branch 'i2c/for-current' of ↵Linus Torvalds4-36/+49
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some bugfixes from I2C: - fix a uevent triggered boot problem by removing a useless debug print - fix sysfs-attributes of the new i2c-demux-pinctrl driver to follow standard kernel behaviour - fix a potential division-by-zero error (needed two takes)" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: jz4780: really prevent potential division by zero Revert "i2c: jz4780: prevent potential division by zero" i2c: jz4780: prevent potential division by zero i2c: mux: demux-pinctrl: Update docs to new sysfs-attributes i2c: mux: demux-pinctrl: Clean up sysfs attributes i2c: prevent endless uevent loop with CONFIG_I2C_DEBUG_CORE
2016-04-11Revert "ext4: allow readdir()'s of large empty directories to be interrupted"Linus Torvalds2-10/+0
This reverts commit 1028b55bafb7611dda1d8fed2aeca16a436b7dff. It's broken: it makes ext4 return an error at an invalid point, causing the readdir wrappers to write the the position of the last successful directory entry into the position field, which means that the next readdir will now return that last successful entry _again_. You can only return fatal errors (that terminate the readdir directory walk) from within the filesystem readdir functions, the "normal" errors (that happen when the readdir buffer fills up, for example) happen in the iterorator where we know the position of the actual failing entry. I do have a very different patch that does the "signal_pending()" handling inside the iterator function where it is allowable, but while that one passes all the sanity checks, I screwed up something like four times while emailing it out, so I'm not going to commit it today. So my track record is not good enough, and the stars will have to align better before that one gets committed. And it would be good to get some review too, of course, since celestial alignments are always an iffy debugging model. IOW, let's just revert the commit that caused the problem for now. Reported-by: Greg Thelen <gthelen@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-11reiserfs: switch to generic_{get,set,remove}xattr()Al Viro7-98/+31
reiserfs_xattr_[sg]et() will fail with -EOPNOTSUPP for V1 inodes anyway, and all reiserfs instances of ->[sg]et() call it and so does ->set_acl(). Checks for name length in the instances had been bogus; they should've been "bugger off if it's _exactly_ the prefix" (as generic would do on its own) and not "bugger off if it's shorter than the prefix" - that can't happen. xattr_full_name() is needed to adjust for the fact that generic instances will skip the prefix in the name passed to ->[gs]et(); reiserfs homegrown analogues didn't. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11cifs: kill more bogus checks in ->...xattr() methodsAl Viro1-36/+6
none of that stuff can ever be called for NULL or negative dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11don't bother with ->d_inode->i_sb - it's always equal to ->d_sbAl Viro30-50/+41
... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-10Merge branch 'parisc-4.6-3' of ↵Linus Torvalds7-11/+29
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Since commit 0de798584bde ("parisc: Use generic extable search and sort routines") module loading is boken on parisc, because the parisc module loader wasn't prepared for the new R_PARISC_PCREL32 relocations. In addition, due to that breakage, Mikulas Patocka noticed that handling exceptions from modules probably never worked on parisc. It was just masked by the fact that exceptions from modules don't happen during normal use. This patch series fixes those issues and survives the tests of the lib/test_user_copy kernel module test. Some patches are tagged for stable" * 'parisc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Update comment regarding relative extable support parisc: Unbreak handling exceptions from kernel modules parisc: Fix kernel crash with reversed copy_from_user() parisc: Avoid function pointers for kernel exception routines parisc: Handle R_PARISC_PCREL32 relocations in kernel modules