summaryrefslogtreecommitdiff
path: root/fs/f2fs/node.c
AgeCommit message (Collapse)AuthorFilesLines
2014-04-02f2fs: use list_for_each_entry{_safe} for simplyfying codeChao Yu1-10/+6
This patch use list_for_each_entry{_safe} instead of list_for_each{_safe} for simplfying code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02f2fs: avoid free slab cache under spinlockChao Yu1-1/+16
Move kmem_cache_free out of spinlock protection region for better performance. Change log from v1: o remove spinlock protection for kmem_cache_free in destroy_node_manager suggested by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-01f2fs: return -EIO when node id is not matchedJaegeuk Kim1-2/+1
During the cleaing of node segments, F2FS can get errored node blocks due to data race between node page lock and its valid bitmap operations. In that case, it needs to return an error to skip such the obsolete block copy. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20f2fs: skip unnecessary node writes during fsyncJaegeuk Kim1-9/+28
If multiple redundant fsync calls are triggered, we don't need to write its node pages with fsync mark continuously. So, this patch adds FI_NEED_FSYNC to track whether the latest node block is written with the fsync mark or not. If the mark was set, a new fsync doesn't need to write a node block. Otherwise, we should do a new node block with the mark for roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20f2fs: remove unnecessary thresholdJaegeuk Kim1-4/+1
The NM_WOUT_THRESHOLD is now obsolete since f2fs starts to control on a basis of the memory footprint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20f2fs: throttle the memory footprint with a sysfs entryJaegeuk Kim1-3/+20
This patch introduces ram_thresh, a sysfs entry, which controls the memory footprint used by the free nid list and the nat cache. Previously, the free nid list was controlled by MAX_FREE_NIDS, while the nat cache was managed by NM_WOUT_THRESHOLD. However, this approach cannot be applied dynamically according to the system. So, this patch adds ram_thresh that users can specify the threshold, which is in order of 1 / 1024. For example, if the total ram size is 4GB and the value is set to 10 by default, f2fs tries to control the number of free nids and nat caches not to consume over 10 * (4GB / 1024) = 10MB. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20f2fs: avoid to drop nat entries due to the negative nr_shrinkJaegeuk Kim1-1/+1
The try_to_free_nats should not receive the negative nr_shrink. Otherwise, it can drop all the nat entries by the while loop. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-20f2fs: call f2fs_wait_on_page_writeback instead of native functionJaegeuk Kim1-5/+7
If a page is on writeback, f2fs can face with deadlock due to under writepages. This is caused by merging IOs inside f2fs, so if it comes to detect, let's throw merged IOs, which is implemented by f2fs_wait_on_page_writeback. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18f2fs: introduce nr_pages_to_write for segment alignmentJaegeuk Kim1-5/+3
This patch introduces nr_pages_to_write to align page writes to the segment or other operational unit size, which can be tuned according to the system environment. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18f2fs: increase pages_skipped when skipping writepagesJaegeuk Kim1-1/+5
This patch increases pages_skipped when skipping writepages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18f2fs: avoid small data writes by skipping writepagesJaegeuk Kim1-7/+1
This patch introduces nr_pages_to_skip(sbi, type) to determine writepages can be skipped. The dentry, node, and meta pages can be conrolled by F2FS without breaking the FS consistency. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-18f2fs: introduce f2fs_has_xattr_block for better readabilityChao Yu1-2/+2
This patch introduces a help function f2fs_has_xattr_block for better readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-12f2fs: introduce f2fs_has_inline_xattr for better readabilityChao Yu1-1/+1
This patch introduces a help function f2fs_has_inline_xattr for better readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-11f2fs: recover inline xattr data in roll-forward processChao Yu1-0/+33
Previously we do not recover inline xattr data of inode after power-cut, so inline xattr data may be lost. We should recover the data during the roll-forward process. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: optimize restore_node_summary slightlyGu Zheng1-16/+12
Previously, we ra_sum_pages to pre-read contiguous pages as more as possible, and if we fail to alloc more pages, an ENOMEM error will be reported upstream, even though we have alloced some pages yet. In fact, we can use the available pages to do the job partly, and continue the rest in the following circle. Only reporting ENOMEM upstream if we really can not alloc any available page. And another fix is ignoring dealing with the following pages if an EIO occurs when reading page from page_list. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: modify the flow for better neat code] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: remove the unused ctor argument of f2fs_kmem_cache_create()Gu Zheng1-2/+2
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: update start nid only once each circleGu Zheng1-5/+3
Integrated a couple of minor changes for better readability suggested by Chao Yu. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-28f2fs: fix dirty page accounting when redirtyChao Yu1-0/+1
We should de-account dirty counters for page when redirty in ->writepage(). Wu Fengguang described in 'commit 971767caf632190f77a40b4011c19948232eed75': "writeback: fix dirtied pages accounting on redirty De-account the accumulative dirty counters on page redirty. Page redirties (very common in ext4) will introduce mismatch between counters (a) and (b) a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied b) NR_WRITTEN, BDI_WRITTEN This will introduce systematic errors in balanced_rate and result in dirty page position errors (ie. the dirty pages are no longer balanced around the global/bdi setpoints)." Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: introduce a radix_tree for the free_nid listJaegeuk Kim1-19/+17
This patch introduces a radix tree for the list of free_nids, which enhances the performance on free nid management. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: introduce help macro on_build_free_nids()Gu Zheng1-3/+3
Introduce help macro on_build_free_nids() which just uses build_lock to judge whether the building free nid is going, so that we can remove the on_build_free_nids field from f2fs_sb_info. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: remove an unnecessary white line removal] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: fix to mark the checkpointed nat entry correctlyJaegeuk Kim1-6/+1
The nat cache entry maintains a status whether it is checkpointed or not. So, if a new cache entry is loaded from the last checkpoint, nat_entry->checkpointed should be true. If the cache entry is modified as being dirty, nat_entry->checkpoint should be false. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix the calculation of max_nidsJaegeuk Kim1-1/+3
Total nids that f2fs can use should not include 0, nid for node inode, and nid for meta inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pagesChao Yu1-37/+1
This patch help us to cleanup the readahead code by merging ra_{sit,nat}_pages function into ra_meta_pages. Additionally the new function is used to readahead cp block in recover_orphan_inodes. Change log from v1: o fix a deadloop bug pointed by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix to recover xattr node blockJaegeuk Kim1-0/+40
If a new xattr node page was allocated and its inode is fsynced, we should recover the xattr node page during the roll-forward process after power-cut. But, previously, f2fs didn't handle that case, resulting in kernel panic as follows reported by Tom Li. BUG: unable to handle kernel paging request at ffffc9001c861a98 IP: [<ffffffffa0295236>] check_index_in_prev_nodes+0x86/0x2d0 [f2fs] Call Trace: [<ffffffff815ece9b>] ? printk+0x48/0x4a [<ffffffffa029626a>] recover_fsync_data+0xdca/0xf50 [f2fs] [<ffffffffa02873ae>] f2fs_fill_super+0x92e/0x970 [f2fs] [<ffffffff8112c9f8>] mount_bdev+0x1b8/0x200 [<ffffffffa0286a80>] ? f2fs_remount+0x130/0x130 [f2fs] [<ffffffffa0285e40>] f2fs_mount+0x10/0x20 [f2fs] [<ffffffff8112d4de>] mount_fs+0x3e/0x1b0 [<ffffffff810ef4eb>] ? __alloc_percpu+0xb/0x10 [<ffffffff8114761f>] vfs_kern_mount+0x6f/0x120 [<ffffffff811497b9>] do_mount+0x259/0xa90 [<ffffffff810ead1d>] ? memdup_user+0x3d/0x80 [<ffffffff810eadb3>] ? strndup_user+0x53/0x70 [<ffffffff8114a2c9>] SyS_mount+0x89/0xd0 [<ffffffff815feae2>] system_call_fastpath+0x16/0x1b This patch adds a recovery function of xattr node pages. Reported-by: Tom Li <biergaizi@members.fsf.org> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-23f2fs: drop obsolete node page when it is truncatedJaegeuk Kim1-0/+4
If a node page is trucated, we'd better drop the page in the node_inode's page cache for better memory footprint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: introduce NODE_MAPPING for code consistencyJaegeuk Kim1-28/+22
This patch adds NODE_MAPPING which is similar as META_MAPPING introduced by Gu Zheng. Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: add help function META_MAPPINGGu Zheng1-1/+1
Introduce help function META_MAPPING() to get the cache meta blocks' address space. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-20f2fs: clean checkpatch warningsChris Fries1-1/+1
Fixed a variety of trivial checkpatch warnings. The only delta should be some minor formatting on log strings that were split / too long. Signed-off-by: Chris Fries <cfries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-08f2fs: improve write performance under frequent fsync callsJaegeuk Kim1-1/+6
When considering a bunch of data writes with very frequent fsync calls, we are able to think the following performance regression. N: Node IO, D: Data IO, IO scheduler: cfq Issue pending IOs D1 D2 D3 D4 D1 D2 D3 D4 N1 D2 D3 D4 N1 N2 N1 D3 D4 N2 D1 --> N1 can be selected by cfq becase of the same priority of N and D. Then D3 and D4 would be delayed, resuling in performance degradation. So, when processing the fsync call, it'd better give higher priority to data IOs than node IOs by assigning WRITE and WRITE_SYNC respectively. This patch improves the random wirte performance with frequent fsync calls by up to 10%. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06f2fs: fix truncate_partial_nodes bugshifei10.ge1-6/+7
The truncate_partial_nodes puts pages incorrectly in the following two cases. Note that the value for argc 'depth' can only be 2 or 3. Please see truncate_inode_blocks() and truncate_partial_nodes(). 1) An err is occurred in the first 'for' loop When err is occurred with depth = 2, pages[0] is invalid, so this page doesn't need to be put. There is no problem, however, when depth is 3, it doesn't put the pages correctly where pages[0] is valid and pages[1] is invalid. In this case, depth is set to 2 (ref to statemnt depth = i + 1), and then 'goto fail'. In label 'fail', for (i = depth - 3; i >= 0; i--) cannot meet the condition because i = -1, so pages[0] cann't be put. 2) An err happened in the second 'for' loop Now we've got pages[0] with depth = 2, or we've got pages[0] and pages[1] with depth = 3. When an err is detected, we need 'goto fail' to put such the pages. When depth is 2, in label 'fail', for (i = depth - 3; i >= 0; i--) cann't meet the condition because i = -1, so pages[0] cann't be put. When depth is 3, in label 'fail', for (i = depth - 3; i >= 0; i--) can only put pages[0], pages[1] also cann't be put. Note that 'depth' has been changed before first 'goto fail' (ref to statemnt depth = i + 1), so passing this modified 'depth' to the tracepoint, trace_f2fs_truncate_partial_nodes, is also incorrect. Signed-off-by: Shifei Ge <shifei10.ge@samsung.com> [Jaegeuk Kim: modify the description and fix one bug] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-26f2fs: introduce F2FS_INODE macro to get f2fs_inodeJaegeuk Kim1-15/+15
This patch introduces F2FS_INODE that returns struct f2fs_inode * from the inode page. By using this macro, we can remove unnecessary casting codes like below. struct f2fs_inode *ri = &F2FS_NODE(inode_page)->i; -> struct f2fs_inode *ri = F2FS_INODE(inode_page); Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: update several commentsChao Yu1-4/+4
Update several comments: 1. use f2fs_{un}lock_op install of mutex_{un}lock_op. 2. update comment of get_data_block(). 3. update description of node offset. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: remove the rw_flag domain from f2fs_io_infoGu Zheng1-4/+2
When using the f2fs_io_info in the low level, we still need to merge the rw and rw_flag, so use the rw to hold all the io flags directly, and remove the rw_flag field. ps.It is based on the previous patch: f2fs: move all the bio initialization into __bio_alloc Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: refactor bio->rw handlingJaegeuk Kim1-6/+16
This patch introduces f2fs_io_info to mitigate the complex parameter list. struct f2fs_io_info { enum page_type type; /* contains DATA/NODE/META/META_FLUSH */ int rw; /* contains R/RS/W/WS */ int rw_flag; /* contains REQ_META/REQ_PRIO */ } 1. f2fs_write_data_pages - DATA - WRITE_SYNC is set when wbc->WB_SYNC_ALL. 2. sync_node_pages - NODE - WRITE_SYNC all the time 3. sync_meta_pages - META - WRITE_SYNC all the time - REQ_META | REQ_PRIO all the time ** f2fs_submit_merged_bio() handles META_FLUSH. 4. ra_nat_pages, ra_sit_pages, ra_sum_pages - META - READ_SYNC Cc: Fan Li <fanofcode.li@samsung.com> Cc: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: add unlikely() macro for compiler more aggressivelyJaegeuk Kim1-14/+14
This patch adds unlikely() macro into the most of codes. The basic rule is to add that when: - checking unusual errors, - checking page mappings, - and the other unlikely conditions. Change log from v1: - Don't add unlikely for the NULL test and error test: advised by Andi Kleen. Cc: Chao Yu <chao2.yu@samsung.com> Cc: Andi Kleen <andi@firstfloor.org> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: add unlikely() macro for compiler optimizationChao Yu1-8/+8
As we know, some of our branch condition will rarely be true. So we could add 'unlikely' to let compiler optimize these code, by this way we could drop unneeded 'jump' assemble code to improve performance. change log: o add *unlikely* as many as possible across the whole source files at once suggested by Jaegeuk Kim. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: use inner macro GFP_F2FS_ZERO for simplificationChao Yu1-1/+1
Use inner macro GFP_F2FS_ZERO to instead of GFP_NOFS | __GFP_ZERO for simplification of code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: readahead contiguous pages for restore_node_summaryChao Yu1-26/+63
If cp has no CP_UMOUNT_FLAG, we will read all pages in whole node segment one by one, it makes low performance. So let's merge contiguous pages and readahead for better performance. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: adjust the new bio operations] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: refactor bio-related operationsJaegeuk Kim1-7/+7
This patch integrates redundant bio operations on read and write IOs. 1. Move bio-related codes to the top of data.c. 2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read bios additionally. 3. Introduce __submit_merged_bio to submit the merged bio. 4. Change f2fs_readpage to f2fs_submit_page_bio. 5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and submit_write_page. Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: use true and false for boolean variableChao Yu1-1/+1
The inode_page_locked should be a boolean variable. struct dnode_of_data { struct inode *inode; /* vfs inode pointer */ struct page *inode_page; /* its inode page, NULL is possible */ struct page *node_page; /* cached direct node page */ nid_t nid; /* node id of the direct node block */ unsigned int ofs_in_node; /* data offset in the node page */ ==> bool inode_page_locked; /* inode page is locked or not */ block_t data_blkaddr; /* block address of the node block */ }; Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: add description] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: send REQ_META or REQ_PRIO when reading meta areaChangman Lee1-2/+2
Let's send REQ_META or REQ_PRIO when reading meta area such as NAT/SIT etc. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: merge read IOs at ra_nat_pages()Jaegeuk Kim1-7/+4
Change log from v1: o add mark_page_accessed() not to reclaim the nat pages. This patch changes the policy of submitting read bios at ra_nat_pages. Previously, f2fs submits small read bios with block plugging. But, with this patch, f2fs itself merges read bios first and then submits a large bio, which can reduce the bio handling overheads. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: convert inc/dec_valid_node_count to inc/dec one countGu Zheng1-3/+3
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-12-23f2fs: convert remove_inode_page to voidGu Zheng1-8/+4
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-11-04f2fs: remove unnecessary TestClearPageError when wait pages writebackChao Yu1-3/+4
In wait_on_node_pages_writeback we will test and clear error flag for all pages in radix tree, but not necessary. So we only do this for pages belong to the specified inode. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-31f2fs: avoid to wait all the node blocks during fsyncJaegeuk Kim1-0/+40
Previously, f2fs_sync_file() waits for all the node blocks to be written. But, we don't need to do that, but wait only the inode-related node blocks. This patch adds wait_on_node_pages_writeback() in which waits inode-related node blocks that are on writeback. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-29f2fs: add an option to avoid unnecessary BUG_ONsJaegeuk Kim1-21/+21
If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS in your kernel config. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25f2fs: add tracepoint for set_page_dirtyJaegeuk Kim1-0/+2
This patch adds a tracepoint for set_page_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25f2fs: introduce f2fs_balance_fs_bg for some background jobsJaegeuk Kim1-7/+3
This patch merges some background jobs into this new function. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-10-25f2fs: reclaim prefree segments periodicallyJaegeuk Kim1-1/+2
Previously, f2fs postpones reclaiming prefree segments into free segments as much as possible. However, if user writes and deletes a bunch of data without any sync or fsync calls, some flash storages can suffer from garbage collections. So, this patch adds the reclaiming codes to f2fs_write_node_pages and background GC thread. If there are a lot of prefree segments, let's do checkpoint so that f2fs submits discard commands for the prefree regions to the flash storage. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>