summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-06-29[readdir] convert hpfsAl Viro1-27/+29
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29reiserfs: switch reiserfs_readdir_dentry to inodeAl Viro3-17/+15
... and clean the callers up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29reiserfs: is_privroot_deh() needs only directory inode, actuallyAl Viro1-5/+4
... and that - only to get the superblock. Privroot is a directory and we don't allow hardlinks to those... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert reiserfsAl Viro3-23/+19
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert ntfsAl Viro1-57/+27
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert isofsAl Viro1-22/+20
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert jffs2Al Viro1-36/+16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert f2fsAl Viro2-35/+22
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert 9pAl Viro1-44/+28
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert affsAl Viro1-45/+24
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert adfsAl Viro1-24/+18
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert logfsAl Viro1-34/+15
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert jfsAl Viro3-36/+31
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert cephAl Viro1-51/+48
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert nfsAl Viro1-26/+25
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert ext4Al Viro3-190/+134
and trim the living hell out bogosities in inline dir case Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert qnx6Al Viro1-17/+14
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert qnx4Al Viro1-35/+31
... and use strnlen() instead of strlen() - it's done on untrusted data, after all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert omfsAl Viro1-56/+38
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert nilfs2Al Viro1-30/+18
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert sysfsAl Viro1-48/+18
get rid of the kludges in sysfs_readdir() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert gfs2Al Viro4-51/+38
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert exofsAl Viro1-22/+16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert bfsAl Viro1-21/+14
... and get rid of that ridiculous mutex in bfs_readdir() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert procfsAl Viro9-489/+284
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert openpromfsAl Viro1-51/+44
what the hell is op_mutex for, BTW? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert efsAl Viro1-42/+33
* sanity checks belong before risky operation, not after it * don't quit as soon as we'd found an entry Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert configfsAl Viro1-70/+52
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert romfsAl Viro1-12/+9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert squashfsAl Viro1-28/+12
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert ubifsAl Viro1-41/+16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert udfAl Viro1-37/+26
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] convert ext3Al Viro1-93/+64
new helper: dir_relax(inode). Call when you are in location that will _not_ be invalidated by directory modifications (block boundary, in case of ext*). Returns whether the directory has survived (dropping i_mutex allows rmdir to kill the sucker; if it returns false to us, ->iterate() is obviously done) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] switch dcache_readdir() users to ->iterate()Al Viro2-52/+32
new helpers - dir_emit_dot(file, ctx, dentry), dir_emit_dotdot(file, ctx), dir_emit_dots(file, ctx). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] simple local unixlike: switch to ->iterate()Al Viro4-75/+59
ext2, ufs, minix, sysv Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] introduce ->iterate(), ctx->pos, dir_emit()Al Viro5-17/+37
New method - ->iterate(file, ctx). That's the replacement for ->readdir(); it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not* update file->f_pos (or look at it, for that matter); iterate_dir() does the update. Note that dir_emit() takes the offset from ctx->pos (and eventually filldir_t will lose that argument). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29[readdir] introduce iterate_dir() and dir_contextAl Viro6-18/+40
iterate_dir(): new helper, replacing vfs_readdir(). struct dir_context: contains the readdir callback (and will get more stuff in it), embedded into whatever data that callback wants to deal with; eventually, we'll be passing it to ->readdir() replacement instead of (data,filldir) pair. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29compat.c: LOOP_CLR_FD is taken care of in loop.c itself...Al Viro1-3/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29UBIFS: fix a horrid bugArtem Bityutskiy1-3/+27
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are in the middle of 'ubifs_readdir()'. This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage, but this may corrupt memory and lead to all kinds of problems like crashes an security holes. This patch fixes the problem by using the 'file->f_version' field, which '->llseek()' always unconditionally sets to zero. We set it to 1 in 'ubifs_readdir()' and whenever we detect that it became 0, we know there was a seek and it is time to clear the state saved in 'file->private_data'. I tested this patch by writing a user-space program which runds readdir and seek in parallell. I could easily crash the kernel without these patches, but could not crash it with these patches. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29UBIFS: prepare to fix a horrid bugArtem Bityutskiy1-12/+12
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are in the middle of 'ubifs_readdir()'. First of all, this means that 'file->private_data' can be freed while 'ubifs_readdir()' uses it. But this particular patch does not fix the problem. This patch is only a preparation, and the fix will follow next. In this patch we make 'ubifs_readdir()' stop using 'file->f_pos' directly, because 'file->f_pos' can be changed by '->llseek()' at any point. This may lead 'ubifs_readdir()' to returning inconsistent data: directory entry names may correspond to incorrect file positions. So here we introduce a local variable 'pos', read 'file->f_pose' once at very the beginning, and then stick to 'pos'. The result of this is that when 'ubifs_dir_llseek()' changes 'file->f_pos' while we are in the middle of 'ubifs_readdir()', the latter "wins". Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-20splice: don't pass the address of ->f_pos to methodsAl Viro3-21/+40
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-15Merge branch 'for-linus' of ↵Linus Torvalds3-20/+12
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "Several fixes + obvious cleanup (you've missed a couple of open-coded can_lookup() back then)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: snd_pcm_link(): fix a leak... use can_lookup() instead of direct checks of ->i_op->lookup move exit_task_namespaces() outside of exit_notify() fput: task_work_add() can fail if the caller has passed exit_task_work() ncpfs: fix rmdir returns Device or resource busy
2013-06-15Merge tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfsLinus Torvalds5-11/+42
Pull xfs fixes from Ben Myers: - Remove noisy warnings about experimental support which spams the logs - Add padding to align directory and attr structures correctly - Set block number on child buffer on a root btree split - Disable verifiers during log recovery for non-CRC filesystems * tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs: xfs: don't shutdown log recovery on validation errors xfs: ensure btree root split sets blkno correctly xfs: fix implicit padding in directory and attr CRC formats xfs: don't emit v5 superblock warnings on write
2013-06-15use can_lookup() instead of direct checks of ->i_op->lookupAl Viro1-2/+2
a couple of places got missed back when Linus has introduced that one... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-15fput: task_work_add() can fail if the caller has passed exit_task_work()Oleg Nesterov1-9/+10
fput() assumes that it can't be called after exit_task_work() but this is not true, for example free_ipc_ns()->shm_destroy() can do this. In this case fput() silently leaks the file. Change it to fallback to delayed_fput_work if task_work_add() fails. The patch looks complicated but it is not, it changes the code from if (PF_KTHREAD) { schedule_work(...); return; } task_work_add(...) to if (!PF_KTHREAD) { if (!task_work_add(...)) return; /* fallback */ } schedule_work(...); As for shm_destroy() in particular, we could make another fix but I think this change makes sense anyway. There could be another similar user, it is not safe to assume that task_work_add() can't fail. Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-15xfs: don't shutdown log recovery on validation errorsDave Chinner1-2/+17
Unfortunately, we cannot guarantee that items logged multiple times and replayed by log recovery do not take objects back in time. When they are taken back in time, the go into an intermediate state which is corrupt, and hence verification that occurs on this intermediate state causes log recovery to abort with a corruption shutdown. Instead of causing a shutdown and unmountable filesystem, don't verify post-recovery items before they are written to disk. This is less than optimal, but there is no way to detect this issue for non-CRC filesystems If log recovery successfully completes, this will be undone and the object will be consistent by subsequent transactions that are replayed, so in most cases we don't need to take drastic action. For CRC enabled filesystems, leave the verifiers in place - we need to call them to recalculate the CRCs on the objects anyway. This recovery problem can be solved for such filesystems - we have a LSN stamped in all metadata at writeback time that we can to determine whether the item should be replayed or not. This is a separate piece of work, so is not addressed by this patch. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 9222a9cf86c0d64ffbedf567412b55da18763aa3)
2013-06-15xfs: ensure btree root split sets blkno correctlyDave Chinner1-0/+10
For CRC enabled filesystems, the BMBT is rooted in an inode, so it passes through a different code path on root splits than the freespace and inode btrees. This is much less traversed by xfstests than the other trees. When testing on a 1k block size filesystem, I've been seeing ASSERT failures in generic/234 like: XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_private.b.allocated == 0, file: fs/xfs/xfs_btree.c, line: 317 which are generally preceded by a lblock check failure. I noticed this in the bmbt stats: $ pminfo -f xfs.btree.block_map xfs.btree.block_map.lookup value 39135 xfs.btree.block_map.compare value 268432 xfs.btree.block_map.insrec value 15786 xfs.btree.block_map.delrec value 13884 xfs.btree.block_map.newroot value 2 xfs.btree.block_map.killroot value 0 ..... Very little coverage of root splits and merges. Indeed, on a 4k filesystem, block_map.newroot and block_map.killroot are both zero. i.e. the code is not exercised at all, and it's the only generic btree infrastructure operation that is not exercised by a default run of xfstests. Turns out that on a 1k filesystem, generic/234 accounts for one of those two root splits, and that is somewhat of a smoking gun. In fact, it's the same problem we saw in the directory/attr code where headers are memcpy()d from one block to another without updating the self describing metadata. Simple fix - when copying the header out of the root block, make sure the block number is updated correctly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit ade1335afef556df6538eb02e8c0dc91fbd9cc37)
2013-06-15xfs: fix implicit padding in directory and attr CRC formatsDave Chinner2-2/+4
Michael L. Semon has been testing CRC patches on a 32 bit system and been seeing assert failures in the directory code from xfs/080. Thanks to Michael's heroic efforts with printk debugging, we found that the problem was that the last free space being left in the directory structure was too small to fit a unused tag structure and it was being corrupted and attempting to log a region out of bounds. Hence the assert failure looked something like: ..... #5 calling xfs_dir2_data_log_unused() 36 32 #1 4092 4095 4096 #2 8182 8183 4096 XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568 Where #1 showed the first region of the dup being logged (i.e. the last 4 bytes of a directory buffer) and #2 shows the corrupt values being calculated from the length of the dup entry which overflowed the size of the buffer. It turns out that the problem was not in the logging code, nor in the freespace handling code. It is an initial condition bug that only shows up on 32 bit systems. When a new buffer is initialised, where's the freespace that is set up: [ 172.316249] calling xfs_dir2_leaf_addname() from xfs_dir_createname() [ 172.316346] #9 calling xfs_dir2_data_log_unused() [ 172.316351] #1 calling xfs_trans_log_buf() 60 63 4096 [ 172.316353] #2 calling xfs_trans_log_buf() 4094 4095 4096 Note the offset of the first region being logged? It's 60 bytes into the buffer. Once I saw that, I pretty much knew that the bug was going to be caused by this. Essentially, all direct entries are rounded to 8 bytes in length, and all entries start with an 8 byte alignment. This means that we can decode inplace as variables are naturally aligned. With the directory data supposedly starting on a 8 byte boundary, and all entries padded to 8 bytes, the minimum freespace in a directory block is supposed to be 8 bytes, which is large enough to fit a unused data entry structure (6 bytes in size). The fact we only have 4 bytes of free space indicates a directory data block alignment problem. And what do you know - there's an implicit hole in the directory data block header for the CRC format, which means the header is 60 byte on 32 bit intel systems and 64 bytes on 64 bit systems. Needs padding. And while looking at the structures, I found the same problem in the attr leaf header. Fix them both. Note that this only affects 32 bit systems with CRCs enabled. Everything else is just fine. Note that CRC enabled filesystems created before this fix on such systems will not be readable with this fix applied. Reported-by: Michael L. Semon <mlsemon35@gmail.com> Debugged-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 8a1fd2950e1fe267e11fc8c85dcaa6b023b51b60)
2013-06-15xfs: don't emit v5 superblock warnings on writeDave Chinner1-7/+11
We write the superblock every 30s or so which results in the verifier being called. Right now that results in this output every 30s: XFS (vda): Version 5 superblock detected. This kernel has EXPERIMENTAL support enabled! Use of these features in this kernel is at your own risk! And spamming the logs. We don't need to check for whether we support v5 superblocks or whether there are feature bits we don't support set as these are only relevant when we first mount the filesytem. i.e. on superblock read. Hence for the write verification we can just skip all the checks (and hence verbose output) altogether. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 34510185abeaa5be9b178a41c0a03d30aec3db7e)
2013-06-14Merge branch 'for-linus' of ↵Linus Torvalds3-9/+13
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "This is an assortment of crash fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: stop all workers before cleaning up roots Btrfs: fix use-after-free bug during umount Btrfs: init relocate extent_io_tree with a mapping btrfs: Drop inode if inode root is NULL Btrfs: don't delete fs_roots until after we cleanup the transaction