summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJaegeuk Kim <jaegeuk@kernel.org>2017-07-28 12:29:12 +0300
committerJaegeuk Kim <jaegeuk@kernel.org>2017-08-01 02:48:34 +0300
commitb6a245eb34cd935f48235e1160d8b7539b228e8e (patch)
tree0e3043f3a255a41a0cd838aca265a832aef66f18
parentdc6b20551044a05cd4d8ad2356a6bd888570f52a (diff)
downloadlinux-b6a245eb34cd935f48235e1160d8b7539b228e8e.tar.xz
f2fs: don't need to wait for node writes for atomic write
We have a node chain to serialize node block writes, so if any IOs for node block writes are reordered, we'll get broken node chain. IOWs, roll-forward recovery will see all or none node blocks given fsync mark. E.g., Node chain consists of: N1 -> N2 -> N3 -> NFSYNC -> N1' -> N2' -> N'FSYNC Reordered to: 1) N1 -> N2 -> N3 -> N2' -> NFSYNC -> N'FSYNC -> power-cut 2) N1 -> N2 -> N3 -> N1' -> NFSYNC -> power-cut 3) N1 -> N2 -> NFSYNC -> N1' -> N'FSYNC -> N3 -> power-cut 4) N1 -> NFSYNC -> N1' -> N2' -> N'FSYNC -> N3 -> power-cut Roll-forward recovery can proceed to: 1) N1 -> N2 -> N3 -> NFSYNC -> X 2) N1 -> N2 -> N3 -> NFSYNC -> N1' -> X 3) N1 -> N2 -> N3 -> FSYNC -> N1' -> X 4) N1 -> X Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-rw-r--r--fs/f2fs/file.c16
1 files changed, 13 insertions, 3 deletions
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index a0413c951458..0246d19d96c3 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -274,9 +274,19 @@ sync_nodes:
goto sync_nodes;
}
- ret = wait_on_node_pages_writeback(sbi, ino);
- if (ret)
- goto out;
+ /*
+ * If it's atomic_write, it's just fine to keep write ordering. So
+ * here we don't need to wait for node write completion, since we use
+ * node chain which serializes node blocks. If one of node writes are
+ * reordered, we can see simply broken chain, resulting in stopping
+ * roll-forward recovery. It means we'll recover all or none node blocks
+ * given fsync mark.
+ */
+ if (!atomic) {
+ ret = wait_on_node_pages_writeback(sbi, ino);
+ if (ret)
+ goto out;
+ }
/* once recovery info is written, don't need to tack this */
remove_ino_entry(sbi, ino, APPEND_INO);