summaryrefslogtreecommitdiff
path: root/fs/btrfs/extent_io.h
AgeCommit message (Collapse)AuthorFilesLines
2016-07-27Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-blockLinus Torvalds1-4/+4
Pull core block updates from Jens Axboe: - the big change is the cleanup from Mike Christie, cleaning up our uses of command types and modified flags. This is what will throw some merge conflicts - regression fix for the above for btrfs, from Vincent - following up to the above, better packing of struct request from Christoph - a 2038 fix for blktrace from Arnd - a few trivial/spelling fixes from Bart Van Assche - a front merge check fix from Damien, which could cause issues on SMR drives - Atari partition fix from Gabriel - convert cfq to highres timers, since jiffies isn't granular enough for some devices these days. From Jan and Jeff - CFQ priority boost fix idle classes, from me - cleanup series from Ming, improving our bio/bvec iteration - a direct issue fix for blk-mq from Omar - fix for plug merging not involving the IO scheduler, like we do for other types of merges. From Tahsin - expose DAX type internally and through sysfs. From Toshi and Yigal * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits) block: Fix front merge check block: do not merge requests without consulting with io scheduler block: Fix spelling in a source code comment block: expose QUEUE_FLAG_DAX in sysfs block: add QUEUE_FLAG_DAX for devices to advertise their DAX support Btrfs: fix comparison in __btrfs_map_block() block: atari: Return early for unsupported sector size Doc: block: Fix a typo in queue-sysfs.txt cfq-iosched: Charge at least 1 jiffie instead of 1 ns cfq-iosched: Fix regression in bonnie++ rewrite performance cfq-iosched: Convert slice_resid from u64 to s64 block: Convert fifo_time from ulong to u64 blktrace: avoid using timespec block/blk-cgroup.c: Declare local symbols static block/bio-integrity.c: Add #include "blk.h" block/partition-generic.c: Remove a set-but-not-used variable block: bio: kill BIO_MAX_SIZE cfq-iosched: temporarily boost queue priority for idle classes block: drbd: avoid to use BIO_MAX_SIZE block: bio: remove BIO_MAX_SECTORS ...
2016-06-07btrfs: use bio fields for op and flagsMike Christie1-4/+4
The bio REQ_OP and bi_rw rq_flag_bits are now always setup, so there is no need to pass around the rq_flag_bits bits too. btrfs users should should access the bio insead. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-02Btrfs: self-tests: Support non-4k page sizeFeifei Xu1-2/+2
self-tests code assumes 4k as the sectorsize and nodesize. This commit fix hardcoded 4K. Enables the self-tests code to be executed on non-4k page sized systems (e.g. ppc64). Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-05-25Merge branch 'cleanups-4.7' into for-chris-4.7-20160525David Sterba1-17/+17
2016-05-06btrfs: kill unused writepage_io_hook callbackDavid Sterba1-1/+0
It seems to be long time unused, since 2008 and 6885f308b5570 ("Btrfs: Misc 2.6.25 updates"). Propagating the removal touches some code but has no functional effect. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to convert_extent_bitDavid Sterba1-1/+1
Single caller passes GFP_NOFS. We can get rid of the gfpflags_allow_blocking checks as NOFS can block but does not recurse to filesystem through reclaim. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_record_extent_bitsDavid Sterba1-2/+1
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_newDavid Sterba1-2/+3
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_defragDavid Sterba1-2/+2
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_delallocDavid Sterba1-2/+2
Callers pass GFP_NOFS and tests pass GFP_KERNEL, but using NOFS there does not hurt. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_extent_dirtyDavid Sterba1-2/+2
Callers pass GFP_NOFS. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_record_extent_bitsDavid Sterba1-2/+1
Callers pass GFP_NOFS. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_extent_bitsDavid Sterba1-2/+3
Callers pass GFP_NOFS and GFP_KERNEL. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_bitsDavid Sterba1-2/+2
All callers pass GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov1-3/+3
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-26Merge branch 'cleanups-4.6' into for-chris-4.6David Sterba1-3/+2
2016-02-18btrfs: use proper type for failrec in extent_stateDavid Sterba1-3/+2
We use the private member of extent_state to store the failrec and play pointless pointer games. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-03Btrfs: remove no longer used function extent_read_full_page_nolock()Filipe Manana1-3/+0
Not needed after the previous patch named "Btrfs: fix page reading in extent_same ioctl leading to csum errors". Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-12-24Merge branch 'freespace-4.5' into for-linus-4.5Chris Mason1-1/+9
2015-12-24Merge branch 'dev/simplify-set-bit' of ↵Chris Mason1-23/+91
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 Signed-off-by: Chris Mason <clm@fb.com>
2015-12-18Merge branch 'freespace-tree' into for-linus-4.5Chris Mason1-1/+9
Signed-off-by: Chris Mason <clm@fb.com>
2015-12-17Btrfs: add extent buffer bitmap sanity testsOmar Sandoval1-1/+3
Sanity test the extent buffer bitmap operations (test, set, and clear) against the equivalent standard kernel operations. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-12-17Btrfs: add extent buffer bitmap operationsOmar Sandoval1-0/+6
These are going to be used for the free space tree bitmap items. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-12-07btrfs: make extent_range_redirty_for_io return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_range_clear_dirty_for_io return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make end_extent_writepage return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. The branch in end_bio_extent_writepage has been skipped since 5fd02043553b ("Btrfs: finish ordered extents in their own thread"). Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_clear_unlock_delalloc return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make clear_extent_buffer_uptodate return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make set_extent_buffer_uptodate return voidDavid Sterba1-1/+1
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-03btrfs: make lock_extent static inlineDavid Sterba1-1/+6
One call less reduces stack usage, code slightly reduced as well. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-03btrfs: drop unused parameter from lock_extent_bitsDavid Sterba1-1/+1
We've always passed 0. Stack usage will slightly decrease. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-03btrfs: make clear_extent_bit helpers static inlineDavid Sterba1-9/+38
The funcions just wrap the clear_extent_bit API and generate function calls. This increases stack consumption and may negatively affect performance due to icache misses. We can simply make the helpers static inline and keep the type checking and API untouched. The code slightly decreases: text data bss dec hex filename 938667 43670 23144 1005481 f57a9 fs/btrfs/btrfs.ko.before 939651 43670 23144 1006465 f5b81 fs/btrfs/btrfs.ko.after Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-03btrfs: make set_extent_bit helpers static inlineDavid Sterba1-12/+46
The funcions just wrap the set_extent_bit API and generate function calls. This increases stack consumption and may negatively affect performance due to icache misses. We can simply make the helpers static inline and keep the type checking and API untouched. The code slightly increases: text data bss dec hex filename 938427 43670 23144 1005241 f56b9 fs/btrfs/btrfs.ko.before 938667 43670 23144 1005481 f57a9 fs/btrfs/btrfs.ko Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-22btrfs: qgroup: Introduce btrfs_qgroup_reserve_data functionQu Wenruo1-0/+1
Introduce a new function, btrfs_qgroup_reserve_data(), which will use io_tree to accurate qgroup reserve, to avoid reserved space leaking. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-10-22btrfs: extent_io: Introduce new function clear_record_extent_bits()Qu Wenruo1-0/+3
Introduce new function clear_record_extent_bits(), which will clear bits for given range and record the details about which ranges are cleared and how many bytes in total it changes. This provides the basis for later qgroup reserve codes. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-10-22btrfs: extent_io: Introduce new function set_record_extent_bitsQu Wenruo1-0/+3
Introduce new function set_record_extent_bits(), which will not only set given bits, but also record how many bytes are changed, and detailed range info. This is quite important for later qgroup reserve framework. The number of bytes will be used to do qgroup reserve, and detailed range info will be used to cleanup for EQUOT case. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-10-22btrfs: extent_io: Introduce needed structure for recoding set/clear bitsQu Wenruo1-0/+12
Add a new structure, extent_change_set, to record how many bytes are changed in one set/clear_extent_bits() operation, with detailed changed ranges info. This provides the needed facilities for later qgroup reserve framework. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-02-16btrfs: constify structs with op functions or static definitionsDavid Sterba1-1/+1
There are some op tables that can be easily made const, similarly the sysfs feature and raid tables. This is motivated by PaX CONSTIFY plugin. Signed-off-by: David Sterba <dsterba@suse.cz>
2015-01-22btrfs: switch extent_state state to unsignedDavid Sterba1-29/+29
Currently there's a 4B hole in the structure between refs and state and there are only 16 bits used so we can make it unsigned. This will get a better packing and may save some stack space for local variables. The size of extent_state gets reduced by 8B and there are usually a lot of slab objects. struct extent_state { u64 start; /* 0 8 */ u64 end; /* 8 8 */ struct rb_node rb_node; /* 16 24 */ wait_queue_head_t wq; /* 40 24 */ /* --- cacheline 1 boundary (64 bytes) --- */ atomic_t refs; /* 64 4 */ /* XXX 4 bytes hole, try to pack */ long unsigned int state; /* 72 8 */ u64 private; /* 80 8 */ /* size: 88, cachelines: 2, members: 7 */ /* sum members: 84, holes: 1, sum holes: 4 */ /* last cacheline: 24 bytes */ }; Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-12-12btrfs: sink parameter len to alloc_extent_bufferDavid Sterba1-2/+2
Because we're using globally known nodesize. Do the same for the sanity test function variant. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12btrfs: unify extent buffer allocation apiDavid Sterba1-1/+2
Make the extent buffer allocation interface consistent. Cloned eb will set a valid fs_info. For dummy eb, we can drop the length parameter and set it from fs_info. The built-in sanity checks may pass a NULL fs_info that's queried for nodesize, but we know it's 4096. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-11-21Btrfs: set page and mapping error on compressed write failureFilipe Manana1-0/+1
If we fail in submit_compressed_extents() before calling btrfs_submit_compressed_write(), we start and end the writeback for the pages (clear their dirty flag, unlock them, etc) but we don't tag the pages, nor the inode's mapping, with an error. This makes it impossible for a caller of filemap_fdatawait_range() (fsync, or transaction commit for e.g.) know that there was an error. Note that the return value of submit_compressed_extents() is useless, as that function is executed by a workqueue task and not directly by the fill_delalloc callback. This means the writepage/s callbacks of the inode's address space operations don't get that return value. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-08Btrfs: fix compiles when CONFIG_BTRFS_FS_RUN_SANITY_TESTS is offChris Mason1-1/+1
Commit fccb84c94 moved added some helpers to cleanup our sanity tests, but it looks like both Dave and I always compile with the tests enabled. This fixes things to work when they are turned off too. Signed-off-by: Chris Mason <clm@fb.com>
2014-10-04Merge branch 'cleanup/misc-for-3.18' of ↵Chris Mason1-10/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus Signed-off-by: Chris Mason <clm@fb.com> Conflicts: fs/btrfs/extent_io.c
2014-10-04Btrfs: be aware of btree inode write errors to avoid fs corruptionFilipe Manana1-2/+5
While we have a transaction ongoing, the VM might decide at any time to call btree_inode->i_mapping->a_ops->writepages(), which will start writeback of dirty pages belonging to btree nodes/leafs. This call might return an error or the writeback might finish with an error before we attempt to commit the running transaction. If this happens, we might have no way of knowing that such error happened when we are committing the transaction - because the pages might no longer be marked dirty nor tagged for writeback (if a subsequent modification to the extent buffer didn't happen before the transaction commit) which makes filemap_fdata[write|wait]_range unable to find such pages (even if they're marked with SetPageError). So if this happens we must abort the transaction, otherwise we commit a super block with btree roots that point to btree nodes/leafs whose content on disk is invalid - either garbage or the content of some node/leaf from a past generation that got cowed or deleted and is no longer valid (for this later case we end up getting error messages like "parent transid verify failed on 10826481664 wanted 25748 found 29562" when reading btree nodes/leafs from disk). Note that setting and checking AS_EIO/AS_ENOSPC in the btree inode's i_mapping would not be enough because we need to distinguish between log tree extents (not fatal) vs non-log tree extents (fatal) and because the next call to filemap_fdatawait_range() will catch and clear such errors in the mapping - and that call might be from a log sync and not from a transaction commit, which means we would not know about the error at transaction commit time. Also, checking for the eb flag EXTENT_BUFFER_IOERR at transaction commit time isn't done and would not be completely reliable, as the eb might be removed from memory and read back when trying to get it, which clears that flag right before reading the eb's pages from disk, making us not know about the previous write error. Using the new 3 flags for the btree inode also makes us achieve the goal of AS_EIO/AS_ENOSPC when writepages() returns success, started writeback for all dirty pages and before filemap_fdatawait_range() is called, the writeback for all dirty pages had already finished with errors - because we were not using AS_EIO/AS_ENOSPC, filemap_fdatawait_range() would return success, as it could not know that writeback errors happened (the pages were no longer tagged for writeback). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-02btrfs: kill extent_buffer_page helperDavid Sterba1-6/+0
It used to be more complex but now it's just a simple array access. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-10-02btrfs: remove unused extent state bitsDavid Sterba1-4/+0
The last users are long gone. Signed-off-by: David Sterba <dsterba@suse.cz>
2014-09-18Btrfs: cleanup the read failure record after write or when the inode is freeingMiao Xie1-0/+1
After the data is written successfully, we should cleanup the read failure record in that range because - If we set data COW for the file, the range that the failure record pointed to is mapped to a new place, so it is invalid. - If we set no data COW for the file, and if there is no error during writting, the corrupted data is corrected, so the failure record can be removed. And if some errors happen on the mirrors, we also needn't worry about it because the failure record will be recreated if we read the same place again. Sometimes, we may fail to correct the data, so the failure records will be left in the tree, we need free them when we free the inode or the memory leak happens. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-09-18Btrfs: implement repair function when direct read failsMiao Xie1-1/+4
This patch implement data repair function when direct read fails. The detail of the implementation is: - When we find the data is not right, we try to read the data from the other mirror. - When the io on the mirror ends, we will insert the endio work into the dedicated btrfs workqueue, not common read endio workqueue, because the original endio work is still blocked in the btrfs endio workqueue, if we insert the endio work of the io on the mirror into that workqueue, deadlock would happen. - After we get right data, we write it back to the corrupted mirror. - And if the data on the new mirror is still corrupted, we will try next mirror until we read right data or all the mirrors are traversed. - After the above work, we set the uptodate flag according to the result. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-09-18Btrfs: modify clean_io_failure and make it suit direct ioMiao Xie1-3/+3
We could not use clean_io_failure in the direct IO path because it got the filesystem information from the page structure, but the page in the direct IO bio didn't have the filesystem information in its structure. So we need modify it and pass all the information it need by parameters. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>