summaryrefslogtreecommitdiff
path: root/fs/fuse/dir.c
AgeCommit message (Collapse)AuthorFilesLines
2016-07-05Use the right predicate in ->atomic_open() instancesAl Viro1-1/+1
->atomic_open() can be given an in-lookup dentry *or* a negative one found in dcache. Use d_in_lookup() to tell one from another, rather than d_unhashed(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-30fuse: serialize dirops by defaultMiklos Szeredi1-0/+4
Negotiate with userspace filesystems whether they support parallel readdir and lookup. Disable parallelism by default for fear of breaking fuse filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 9902af79c01a ("parallel lookups: actual switch to rwsem") Fixes: d9b3dbdcfd62 ("fuse: switch to ->iterate_shared()")
2016-05-28switch ->setxattr() to passing dentry and inode separatelyAl Viro1-3/+3
smack ->d_instantiate() uses ->setxattr(), so to be able to call it before we'd hashed the new dentry and attached it to inode, we need ->setxattr() instances getting the inode as an explicit argument rather than obtaining it from dentry. Similar change for ->getxattr() had been done in commit ce23e64. Unlike ->getxattr() (which is used by both selinux and smack instances of ->d_instantiate()) ->setxattr() is used only by smack one and unfortunately it got missed back then. Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Tested-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03fuse: switch to ->iterate_shared()Al Viro1-49/+45
Switch dcache pre-seeding on readdir to d_alloc_parallel(); nothing else is needed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11->getxattr(): pass dentry and inode as separate argumentsAl Viro1-3/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-23wrappers for ->i_mutex accessAl Viro1-5/+5
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-30switch ->get_link() to delayed_call, kill ->put_link()Al Viro1-3/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-30kill free_page_put_link()Al Viro1-3/+3
all callers are better off with kfree_put_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-09replace ->follow_link() with new method that could stay in RCU modeAl Viro1-3/+6
new method: ->get_link(); replacement of ->follow_link(). The differences are: * inode and dentry are passed separately * might be called both in RCU and non-RCU mode; the former is indicated by passing it a NULL dentry. * when called that way it isn't allowed to block and should return ERR_PTR(-ECHILD) if it needs to be called in non-RCU mode. It's a flagday change - the old method is gone, all in-tree instances converted. Conversion isn't hard; said that, so far very few instances do not immediately bail out when called in RCU mode. That'll change in the next commits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11new helper: free_page_put_link()Al Viro1-6/+1
similar to kfree_put_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11switch ->put_link() from dentry to inodeAl Viro1-1/+1
only one instance looks at that argument at all; that sole exception wants inode rather than dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11don't pass nameidata to ->follow_link()Al Viro1-1/+1
its only use is getting passed to nd_jump_link(), which can obtain it from current->nameidata Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-05-11new ->follow_link() and ->put_link() calling conventionsAl Viro1-15/+4
a) instead of storing the symlink body (via nd_set_link()) and returning an opaque pointer later passed to ->put_link(), ->follow_link() _stores_ that opaque pointer (into void * passed by address by caller) and returns the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic symlinks) and pointer to symlink body for normal symlinks. Stored pointer is ignored in all cases except the last one. Storing NULL for opaque pointer (or not storing it at all) means no call of ->put_link(). b) the body used to be passed to ->put_link() implicitly (via nameidata). Now only the opaque pointer is. In the cases when we used the symlink body to free stuff, ->follow_link() now should store it as opaque pointer in addition to returning it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells1-30/+30
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)David Howells1-1/+1
Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really *is* negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') || die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") || die $coccifile; print($fd "$_\n") || die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 || die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-06fuse: fix LOOKUP vs INIT compat handlingMiklos Szeredi1-24/+7
Analysis from Marc: "Commit 7078187a795f ("fuse: introduce fuse_simple_request() helper") from the above pull request triggers some EIO errors for me in some tests that rely on fuse Looking at the code changes and a bit of debugging info I think there's a general problem here that fuse_get_req checks and possibly waits for fc->initialized, and this was always called first. But this commit changes the ordering and in many places fc->minor is now possibly used before fuse_get_req, and we can't be sure that fc has been initialized. In my case fuse_lookup_init sets req->out.args[0].size to the wrong size because fc->minor at that point is still 0, leading to the EIO error." Fix by moving the compat adjustments into fuse_simple_request() to after fuse_get_req(). This is also more readable than the original, since now compatibility is handled in a single function instead of cluttering each operation. Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Tested-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Fixes: 7078187a795f ("fuse: introduce fuse_simple_request() helper")
2014-12-17Merge branch 'for-linus' of ↵Linus Torvalds1-323/+215
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse update from Miklos Szeredi: "The first part makes sure we don't hold up umount with pending async requests. In addition to being a cleanup, this is a small behavioral change (for the better) and unlikely to break anything. The second part prepares for a cleanup of the fuse device I/O code by adding a helper for simple request submission, with some savings in line numbers already realized" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: use file_inode() in fuse_file_fallocate() fuse: introduce fuse_simple_request() helper fuse: reduce max out args fuse: hold inode instead of path after release fuse: flush requests on umount fuse: don't wake up reserved req in fuse_conn_kill()
2014-12-12fuse: introduce fuse_simple_request() helperMiklos Szeredi1-323/+215
The following pattern is repeated many times: req = fuse_get_req_nopages(fc); /* Initialize req->(in|out).args */ fuse_request_send(fc, req); err = req->out.h.error; fuse_put_request(req); Create a new replacement helper: /* Initialize args */ err = fuse_simple_request(fc, &args); In addition to reducing the code size, this will ease moving from the complex arg-based to a simpler page-based I/O on the fuse device. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-11-19switch d_materialise_unique() users to d_splice_alias()Al Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Make d_invalidate return voidEric W. Biederman1-3/+1
Now that d_invalidate can no longer fail, stop returning a useless return code. For the few callers that checked the return code update remove the handling of d_invalidate failure. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Remove unnecessary calls of check_submounts_and_dropEric W. Biederman1-3/+0
Now that check_submounts_and_drop can not fail and is called from d_invalidate there is no longer a need to call check_submounts_and_drom from filesystem d_revalidate methods so remove it. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-08-07fs: call rename2 if existsMiklos Szeredi1-7/+0
Christoph Hellwig suggests: 1) make vfs_rename call ->rename2 if it exists instead of ->rename 2) switch all filesystems that you're adding NOREPLACE support for to use ->rename2 3) see how many ->rename instances we'll have left after a few iterations of 2. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-07-10fuse: restructure ->rename2()Miklos Szeredi1-14/+20
Make ->rename2() universal, i.e. able to handle zero flags. This is to make future change of the API easier. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-07-07fuse: ignore entry-timeout on LOOKUP_REVALAnand Avati1-1/+2
The following test case demonstrates the bug: sh# mount -t glusterfs localhost:meta-test /mnt/one sh# mount -t glusterfs localhost:meta-test /mnt/two sh# echo stuff > /mnt/one/file; rm -f /mnt/two/file; echo stuff > /mnt/one/file bash: /mnt/one/file: Stale file handle sh# echo stuff > /mnt/one/file; rm -f /mnt/two/file; sleep 1; echo stuff > /mnt/one/file On the second open() on /mnt/one, FUSE would have used the old nodeid (file handle) trying to re-open it. Gluster is returning -ESTALE. The ESTALE propagates back to namei.c:filename_lookup() where lookup is re-attempted with LOOKUP_REVAL. The right behavior now, would be for FUSE to ignore the entry-timeout and and do the up-call revalidation. Instead FUSE is ignoring LOOKUP_REVAL, succeeding the revalidation (because entry-timeout has not passed), and open() is again retried on the old file handle and finally the ESTALE is going back to the application. Fix: if revalidation is happening with LOOKUP_REVAL, then ignore entry-timeout and always do the up-call. Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2014-07-07fuse: timeout comparison fixMiklos Szeredi1-3/+3
As suggested by checkpatch.pl, use time_before64() instead of direct comparison of jiffies64 values. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@vger.kernel.org>
2014-04-28fuse: add renameat2 supportMiklos Szeredi1-8/+47
Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: clear FUSE_I_CTIME_DIRTY flag on setattrMaxim Patlasov1-9/+17
The patch addresses two use-cases when the flag may be safely cleared: 1. fuse_do_setattr() is called with ATTR_CTIME flag set in attr->ia_valid. In this case attr->ia_ctime bears actual value. In-kernel fuse must send it to the userspace server and then assign the value to inode->i_ctime. 2. fuse_do_setattr() is called with ATTR_SIZE flag set in attr->ia_valid, whereas ATTR_CTIME is not set (truncate(2)). In this case in-kernel fuse must sent "now" to the userspace server and then assign the value to inode->i_ctime. In both cases we could clear I_DIRTY_SYNC, but that needs more thought. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: trust kernel i_ctime onlyMaxim Patlasov1-2/+20
Let the kernel maintain i_ctime locally: update i_ctime explicitly on truncate, fallocate, open(O_TRUNC), setxattr, removexattr, link, rename, unlink. The inode flag I_DIRTY_SYNC serves as indication that local i_ctime should be flushed to the server eventually. The patch sets the flag and updates i_ctime in course of operations listed above. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: remove .update_timeMiklos Szeredi1-12/+0
This implements updating ctime as well as mtime on file_update_time(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: allow ctime flushing to userspaceMaxim Patlasov1-2/+7
The patch extends fuse_setattr_in, and extends the flush procedure (fuse_flush_times()) called on ->write_inode() to send the ctime as well as mtime. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: add .write_inodeMiklos Szeredi1-17/+11
...and flush mtime from this. This allows us to use the kernel infrastructure for writing out dirty metadata (mtime at this point, but ctime in the next patches and also maybe atime). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: update mtime on truncate(2)Maxim Patlasov1-0/+2
Handling truncate(2), VFS doesn't set ATTR_MTIME bit in iattr structure; only ATTR_SIZE bit is set. In-kernel fuse must handle the case by setting mtime fields of struct fuse_setattr_in to "now" and set FATTR_MTIME bit even though ATTR_MTIME was not set. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Trust kernel i_mtime onlyMaxim Patlasov1-17/+91
Let the kernel maintain i_mtime locally: - clear S_NOCMTIME - implement i_op->update_time() - flush mtime on fsync and last close - update i_mtime explicitly on truncate and fallocate Fuse inode flag FUSE_I_MTIME_DIRTY serves as indication that local i_mtime should be flushed to the server eventually. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Trust kernel i_size onlyPavel Emelyanov1-2/+11
Make fuse think that when writeback is on the inode's i_size is always up-to-date and not update it with the value received from the userspace. This is done because the page cache code may update i_size without letting the FS know. This assumption implies fixing the previously introduced short-read helper -- when a short read occurs the 'hole' is filled with zeroes. fuse_file_fallocate() is also fixed because now we should keep i_size up to date, so it must be updated if FUSE_FALLOCATE request succeeded. Signed-off-by: Maxim V. Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-01-22fuse: don't invalidate attrs when not using atimeAndrew Gallagher1-2/+12
Various read operations (e.g. readlink, readdir) invalidate the cached attrs for atime changes. This patch adds a new function 'fuse_invalidate_atime', which checks for a read-only super block and avoids the attr invalidation in that case. Signed-off-by: Andrew Gallagher <andrewjcg@fb.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-10-25vfs: introduce d_instantiate_no_diralias()Miklos Szeredi1-35/+5
...which just returns -EBUSY if a directory alias would be created. This is to be used by fuse mkdir to make sure that a buggy or malicious userspace filesystem doesn't do anything nasty. Previously fuse used a private mutex for this purpose, which can now go away. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-10-01fuse: no RCU mode in fuse_access()Miklos Szeredi1-3/+2
fuse_access() is never called in RCU walk, only on the final component of access(2) and chdir(2)... Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-10-01fuse: readdirplus: fix RCU walkMiklos Szeredi1-3/+9
Doing dput(parent) is not valid in RCU walk mode. In RCU mode it would probably be okay to update the parent flags, but it's actually not necessary most of the time... So only set the FUSE_I_ADVISE_RDPLUS flag on the parent when the entry was recently initialized by READDIRPLUS. This is achieved by setting FUSE_I_INIT_RDPLUS on entries added by READDIRPLUS and only dropping out of RCU mode if this flag is set. FUSE_I_INIT_RDPLUS is cleared once the FUSE_I_ADVISE_RDPLUS flag is set in the parent. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2013-10-01fuse: don't check_submounts_and_drop() in RCU walkMiklos Szeredi1-1/+2
If revalidate finds an invalid dentry in RCU walk mode, let the VFS deal with it instead of calling check_submounts_and_drop() which is not prepared for being called from RCU walk. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2013-09-13truncate: drop 'oldsize' truncate_pagecache() parameterKirill A. Shutemov1-1/+1
truncate_pagecache() doesn't care about old size since commit cedabed49b39 ("vfs: Fix vmtruncate() regression"). Let's drop it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-09Merge branch 'for-linus' of ↵Linus Torvalds1-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse bugfixes from Miklos Szeredi: "Just a bunch of bugfixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: use list_for_each_entry() for list traversing fuse: readdir: check for slash in names fuse: hotfix truncate_pagecache() issue fuse: invalidate inode attributes on xattr modification fuse: postpone end_page_writeback() in fuse_writepage_locked()
2013-09-06fuse: drop dentry on failed revalidateAnand Avati1-0/+2
Drop a subtree when we find that it has moved or been delated. This can be done as long as there are no submounts under this location. If the directory was moved and we come across the same directory in a future lookup it will be reconnected by d_materialise_unique(). Signed-off-by: Anand Avati <avati@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-06fuse: clean up return in fuse_dentry_revalidate()Miklos Szeredi1-8/+18
On errors unrelated to the filesystem's state (ENOMEM, ENOTCONN) return the error itself from ->d_revalidate() insted of returning zero (invalid). Also make a common label for invalidating the dentry. This will be used by the next patch. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-06fuse: use d_materialise_unique()Miklos Szeredi1-43/+26
Use d_materialise_unique() instead of d_splice_alias(). This allows dentry subtrees to be moved to a new place if there moved, even if something is referencing a dentry in the subtree (open fd, cwd, etc..). This will also allow us to drop a subtree if it is found to be replaced by something else. In this case the disconnected subtree can later be reconnected to its new location. d_materialise_unique() ensures that a directory entry only ever has one alias. We keep fc->inst_mutex around the calls for d_materialise_unique() on directories to prevent a race with mkdir "stealing" the inode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-03fuse: readdir: check for slash in namesMiklos Szeredi1-0/+4
Userspace can add names containing a slash character to the directory listing. Don't allow this as it could cause all sorts of trouble. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2013-09-03fuse: hotfix truncate_pagecache() issueMaxim Patlasov1-1/+6
The way how fuse calls truncate_pagecache() from fuse_change_attributes() is completely wrong. Because, w/o i_mutex held, we never sure whether 'oldsize' and 'attr->size' are valid by the time of execution of truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we released fc->lock in the middle of fuse_change_attributes(), we completely loose control of actions which may happen with given inode until we reach truncate_pagecache. The list of potentially dangerous actions includes mmap-ed reads and writes, ftruncate(2) and write(2) extending file size. The typical outcome of doing truncate_pagecache() with outdated arguments is data corruption from user point of view. This is (in some sense) acceptable in cases when the issue is triggered by a change of the file on the server (i.e. externally wrt fuse operation), but it is absolutely intolerable in scenarios when a single fuse client modifies a file without any external intervention. A real life case I discovered by fsx-linux looked like this: 1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends FUSE_SETATTR to the server synchronously, but before getting fc->lock ... 2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP to the server synchronously, then calls fuse_change_attributes(). The latter updates i_size, releases fc->lock, but before comparing oldsize vs attr->size.. 3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and updating attributes and i_size, but now oldsize is equal to outarg.attr.size because i_size has just been updated (step 2). Hence, fuse_do_setattr() returns w/o calling truncate_pagecache(). 4. As soon as ftruncate(2) completes, the user extends file size by write(2) making a hole in the middle of file, then reads data from the hole either by read(2) or mmap-ed read. The user expects to get zero data from the hole, but gets stale data because truncate_pagecache() is not executed yet. The scenario above illustrates one side of the problem: not truncating the page cache even though we should. Another side corresponds to truncating page cache too late, when the state of inode changed significantly. Theoretically, the following is possible: 1. As in the previous scenario fuse_dentry_revalidate() discovered that i_size changed (due to our own fuse_do_setattr()) and is going to call truncate_pagecache() for some 'new_size' it believes valid right now. But by the time that particular truncate_pagecache() is called ... 2. fuse_do_setattr() returns (either having called truncate_pagecache() or not -- it doesn't matter). 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2). 4. mmap-ed write makes a page in the extended region dirty. The result will be the lost of data user wrote on the fourth step. The patch is a hotfix resolving the issue in a simplistic way: let's skip dangerous i_size update and truncate_pagecache if an operation changing file size is in progress. This simplistic approach looks correct for the cases w/o external changes. And to handle them properly, more sophisticated and intrusive techniques (e.g. NFS-like one) would be required. I'd like to postpone it until the issue is well discussed on the mailing list(s). Changed in v2: - improved patch description to cover both sides of the issue. Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2013-09-03fuse: invalidate inode attributes on xattr modificationAnand Avati1-0/+4
Calls like setxattr and removexattr result in updation of ctime. Therefore invalidate inode attributes to force a refresh. Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
2013-07-17fuse: readdirplus: cleanupMiklos Szeredi1-3/+1
Niels noted that we don't need the 'dentry = NULL' line. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Niels de Vos <ndevos@redhat.com>
2013-07-17fuse: readdirplus: change attributes onceMiklos Szeredi1-3/+4
If we got the inode through fuse_iget() then the attributes are already up-to-date. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-07-17fuse: readdirplus: fix instantiateMiklos Szeredi1-4/+13
Fuse does instantiation slightly differently from NFS/CIFS which use d_materialise_unique(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@vger.kernel.org