summaryrefslogtreecommitdiff
path: root/fs/ufs
AgeCommit message (Collapse)AuthorFilesLines
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds2-2/+2
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-23ufs: fix function declaration for ufs_truncate_blocksJeff Layton1-1/+1
sparse says: fs/ufs/inode.c:1195:6: warning: symbol 'ufs_truncate_blocks' was not declared. Should it be static? Note that the forward declaration in the file is already marked static. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-11-04fs: Add helper to clean bdev aliases under a bh and use itJan Kara2-4/+2
Add a helper function that clears buffer heads from a block device aliasing passed bh. Use this helper function from filesystems instead of the original unmap_underlying_metadata() to save some boiler plate code and also have a better name for the functionalily since it is not unmapping anything for a *long* time. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-01block,fs: untangle fs.h and blk_types.hChristoph Hellwig1-0/+1
Nothing in fs.h should require blk_types.h to be included. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-11Merge branch 'for-linus' of ↵Linus Torvalds4-11/+15
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning
2016-10-11Merge remote-tracking branch 'ovl/rename2' into for-linusAl Viro1-1/+5
2016-09-28fs: Replace CURRENT_TIME_SEC with current_time() for inode timestampsDeepa Dinamani4-10/+10
CURRENT_TIME_SEC is not y2038 safe. current_time() will be transitioned to use 64 bit time along with vfs in a separate patch. There is no plan to transistion CURRENT_TIME_SEC to use y2038 safe time interfaces. current_time() will also be extended to use superblock range checking parameters when range checking is introduced. This works because alloc_super() fills in the the s_time_gran in super block to NSEC_PER_SEC. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27fs: rename "rename2" i_op to "rename"Miklos Szeredi1-1/+1
Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27fs: support RENAME_NOREPLACE for local filesystemsMiklos Szeredi1-2/+6
This is trivial to do: - add flags argument to foo_rename() - check if flags doesn't have any other than RENAME_NOREPLACE - assign foo_rename() to .rename2 instead of .rename Filesystems converted: affs, bfs, exofs, ext2, hfs, hfsplus, jffs2, jfs, logfs, minix, msdos, nilfs2, omfs, reiserfs, sysvfs, ubifs, udf, ufs, vfat. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Bob Copeland <me@bobcopeland.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>
2016-09-22fs: Give dentry to inode_change_ok() instead of inodeJan Kara1-1/+1
inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2016-07-28Merge branch 'work.misc' of ↵Linus Torvalds1-16/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted cleanups and fixes. Probably the most interesting part long-term is ->d_init() - that will have a bunch of followups in (at least) ceph and lustre, but we'll need to sort the barrier-related rules before it can get used for really non-trivial stuff. Another fun thing is the merge of ->d_iput() callers (dentry_iput() and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all except the one in __d_lookup_lru())" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) fs/dcache.c: avoid soft-lockup in dput() vfs: new d_init method vfs: Update lookup_dcache() comment bdev: get rid of ->bd_inodes Remove last traces of ->sync_page new helper: d_same_name() dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends() vfs: clean up documentation vfs: document ->d_real() vfs: merge .d_select_inode() into .d_real() unify dentry_iput() and dentry_unlink_inode() binfmt_misc: ->s_root is not going anywhere drop redundant ->owner initializations ufs: get rid of redundant checks orangefs: constify inode_operations missed comment updates from ->direct_IO() prototype change file_inode(f)->i_mapping is f->f_mapping trim fsnotify hooks a bit 9p: new helper - v9fs_parent_fid() debugfs: ->d_parent is never NULL or negative ...
2016-06-07fs: have ll_rw_block users pass in op and flags separatelyMike Christie1-1/+1
This has ll_rw_block users pass in the operation and flags separately, so ll_rw_block can setup the bio op and bi_rw flags on the bio that is submitted. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07fs: have submit_bh users pass in op and flags separatelyMike Christie1-1/+1
This has submit_bh users pass in the operation and flags separately, so submit_bh_wbc can setup the bio op and bi_rw flags on the bio that is submitted. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-30ufs: get rid of redundant checksAl Viro1-16/+1
ufs_check_page() makes sure there's no entries with zero ->reclen Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03simple local filesystems: switch to ->iterate_shared()Al Viro1-1/+1
no changes needed (XFS isn't simple, but it has the same parallelism in the interesting parts exercised from CXFS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03make ext2_get_page() and friends work without external serializationAl Viro1-7/+7
Right now ext2_get_page() (and its analogues in a bunch of other filesystems) relies upon the directory being locked - the way it sets and tests Checked and Error bits would be racy without that. Switch to a slightly different scheme, _not_ setting Checked in case of failure. That way the logics becomes if Checked => OK else if Error => fail else if !validate => fail else => OK with validation setting Checked or Error on success and failure resp. and returning which one had happened. Equivalent to the current logics, but unlike the current logics not sensitive to the order of set_bit, test_bit getting reordered by CPU, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-03Merge getxattr prototype change into work.lookupsAl Viro1-1/+1
The rest of work.xattr stuff isn't needed for this branch
2016-04-11don't bother with ->d_inode->i_sb - it's always equal to ->d_sbAl Viro1-1/+1
... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov6-27/+27
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15kmemcg: account certain kmem allocations to memcgVladimir Davydov1-1/+1
Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-09don't put symlink bodies in pagecache into highmemAl Viro2-0/+2
kmap() in page_follow_link_light() needed to go - allowing to hold an arbitrary number of kmaps for long is a great way to deadlocking the system. new helper (inode_nohighmem(inode)) needs to be used for pagecache symlinks inodes; done for all in-tree cases. page_follow_link_light() instrumented to yell about anything missed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07ufs: get rid of ->setattr() for symlinksAl Viro5-51/+5
It was to needed for a couple of months in 2010, until UFS quota support got dropped. Since then it's equivalent to simple_setattr() (i.e. the default) for everything except the regular files. And dropping it there allows to convert all UFS symlinks to {page,simple}_symlink_inode_operations, getting rid of fs/ufs/symlink.c completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-09-09fix ufs write vs readpage race when writing into a holeAl Viro1-2/+2
Followup to the UFS series - with the way we clear the new blocks (via buffer cache, possibly on more than a page worth of file) we really should not insert a reference to new block into inode block tree until after we'd cleared it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-07ufs_inode_get{frag,block}(): get rid of 'phys' argumentAl Viro1-15/+8
Just pass NULL as locked_page in case of first block in the indirect chain. Old calling conventions aside, a reason for having 'phys' was that ufs_inode_getfrag() used to be able to do _two_ allocations - indirect block and extending/reallocating a tail. We needed locked_page for the latter (it's a data), but we also needed to figure out that indirect block is metadata. So we used to pass non-NULL locked_page in all cases *and* used NULL phys as indication of being asked to allocate an indirect. With tail unpacking taken into a separate function we don't need those convolutions anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_getfrag_block(): tidy up a bitAl Viro1-33/+15
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_getblock(): failure to read an indirect block is -EIOAl Viro1-2/+3
... and not "write to beginning of the disk", TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_getfrag_block(): turn following indirects into a loopAl Viro1-24/+8
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_getfrag(): pass index instead of 'fragment'Al Viro1-33/+17
same story as with ufs_inode_getblock() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_getfrag(): split extending the partial blocks offAl Viro1-63/+65
ufs_extend_tail() is handling that now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_getblock(): pass indirect block number and full indexAl Viro1-46/+16
... instead of messing with buffer_head. We can bloody well do sb_bread() in there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_getblock(): pass index instead of 'fragment'Al Viro1-19/+13
The value passed to ufs_inode_getblock() as the 3rd argument had lower bits ignored; the upper bits were shifted down and used and they actually make sense - those are _lower_ bits of index in indirect block (i.e. they form the index within a fragment within an indirect block). Pass those as argument. Upper bits of index (i.e. the number of fragment within indirect block) will join them shortly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_get{frag,block}(): leave sb_getblk() to callerAl Viro1-33/+55
just return the damn block number Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_getfrag_block(): get rid of macro junglesAl Viro1-29/+22
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_get{frag,block}(): consolidate success exitsAl Viro1-28/+22
These calling conventions are rudiments of pre-2.3 times; they really need to be sanitized. This is the first step; next will be _always_ returning a block number, instead of this "return a pointer to buffer_head, except when we get to the actual data" crap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs: use the branch depth in ufs_getfrag_block()Al Viro1-6/+4
we'd already calculated it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs: move calculation of offsets into ufs_getfrag_block()Al Viro1-8/+9
... and massage ufs_frag_map() to take those instead of fragment number. As it is, we duplicate the damn thing on the write side, open-coded and bloody hard to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_inode_get{frag,block}(): get rid of retriesAl Viro1-35/+8
We are holding ->truncate_mutex, so nobody else can alter our block pointers. Rechecks/retries were needed back when we only held BKL there, and had to cope with write_begin/writepage and writepage/truncate races. Can't happen anymore... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07__ufs_truncate_blocks(): avoid excessive dirtying of indirect blocksAl Viro1-3/+1
There's a case when an indirect block gets dirtied for no good reason - when there's a hole starting in the middle of area covered by it and spanning past its end, and truncate() is done precisely to the beginning of the hole. The block is obviously not modified at all - all removals happen beyond it. However, existing code ends up dirtying it just in case. It's trivial to fix and while it's not a real bug by any stretch of imagination, it makes the damn thing harder to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07free_full_branch(): don't bother modifying the block we are going to freeAl Viro1-12/+2
Note that it's already made unreachable from the inode, so we don't have to worry about ufs_frag_map() walking into something already freed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07move marking inode dirty to the end of __ufs_truncate_blocks()Al Viro1-6/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07free_full_branch(): saner calling conventionsAl Viro1-49/+51
Have caller fetch the block number *and* remove it from wherever it was. Pass the block number instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_trunc_branch(): kill recursionAl Viro1-26/+26
turn recursion into a pair of loops Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_trunc_branch(): massage towards killing recursionAl Viro1-5/+5
We always have 0 < depth2 <= depth in there, so if (--depth) { if (--depth2) A B } else { C // not using depth2 } D // not using depth2 is equivalent to if (--depth2) A with s/depth/depth - 1/ if (--depth) B else C D Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07split ufs_truncate_branch() into full- and partial-branch variantsAl Viro1-16/+58
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs: unify the logics for collecting adjacent data blocks to freeAl Viro1-34/+22
open-coded in several places... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_trunc_branch(): separate the calls with non-NULL offsetsAl Viro1-4/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07ufs_trunc_branch(): never call with offsets != NULL && depth2 == 0Al Viro1-3/+6
For calls in __ufs_truncate_blocks() it's just a matter of not incrementing offsets[0] and not making that call - immediately following loop will be executed one extra time and we'll be just fine. For recursive call in ufs_trunc_branch() itself, just assing NULL to offsets if we would be about to make such call. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07__ufs_trunc_blocks(): turn the part after switch into a loopAl Viro1-25/+10
... and turn the switch into if (), since all cases with depth != 1 have just become identical. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07__ufs_truncate_blocks(): unify freeing the full branchesAl Viro1-15/+14
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-07unify ufs_trunc_..indirect()Al Viro1-138/+60
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>