Age | Commit message (Collapse) | Author | Files | Lines |
|
writepages() needs to write dirty pages to OSD in strict order of
snapshot context. It must first write dirty pages associated with
the oldest snapshot context. In the write range case, dirty pages
in the specified range can be associated with newer snapc. They
are not writeable until we write all dirty pages associated with
the oldest snapc.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
In range cyclic mode, writepages() should first write dirty pages
in range [writeback_index, (pgoff_t)-1], then write pages in range
[0, writeback_index -1]. Besides, if writepages() encounters a page
that beyond EOF, it should restart from the beginning.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Remove two variables and define variables of same type together.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
ceph_writepages_start() supports writing non-continuous pages.
If it encounters a non-dirty or non-writeable page in pagevec,
it can continue to check the rest pages in pagevec.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Otherwise, the page left in state that page is associated with a
snapc, but (PageDirty(page) || PageWriteback(page)) is false.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
capsnap's size is set by __ceph_finish_cap_snap(). If capsnap is under
writing, its size is zero. In this case, get_oldest_context() should
read i_size. Besides, ceph_writepages_start() should re-check capsnap's
size after dirty pages get locked.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Both set_page_dirty and truncate_complete_page should be called
for locked page, they can't race with each other.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If we create capsnap when snap realm's context does not change, the
new capsnap's snapc is equal to ci->i_head_snapc. Page writeback code
can't differentiates dirty pages associated with the new capsnap from
dirty pages associated with i_head_snapc.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
It's possible that we create a cap snap while there is pending
vmtruncate (truncate hasn't been processed by worker thread).
We should truncate dirty pages beyond capsnap->size in that case.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If caps for importer mds exists, but cap id mismatch, client should
have received corresponding import message. Because cap ID does not
change as long as client holds the caps.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The script “checkpatch.pl” pointed information out like the following.
Comparison to NULL could be written ...
Thus fix the affected source code places.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The script "checkpatch.pl" pointed information out like the following.
WARNING: void function return statements are not generally useful
Thus remove such a statement in the affected function.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
When a user requests SEEK_HOLE or SEEK_DATA with a negative offset
ceph_llseek should return -ENXIO. Currently -EINVAL is being returned for
SEEK_DATA and 0 for SEEK_HOLE.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Improve accuracy of statfs reporting for Ceph filesystems comprising
exactly one data pool. In this case, the Ceph monitor can now report
the space usage for the single data pool instead of the global data
for the entire Ceph cluster. Include support for this message in
mon_client and leverage it in ceph/super.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Inode can be moved between snap realms. It's possible inode is moved
into a snap realm whose seq number is smaller than old snap realm's.
So there is no guarantee that seq number inode's snap context always
increases.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Before sending new flushsnap message, check if there are old
flushsnap messages that need to be re-sent. If there are, re-send
old messages first. This guarantees ordering of flushsnap messages.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Need to drop cap reference before retry. Besides, it's better to
redo file write checks for each retry because we re-lock inode.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Snapdir inode has no capability. __choose_mds() should choose mds
base on capabilities of snapdir's parent inode.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
In LSSNAP case, req->r_dentry is already set to snapdir dentry.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Ensure that when writeback errors are marked that we report those to all
file descriptions that were open at the time of the error.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
These flags tell mds if there is pending capsnap explicitly.
Without this explicit notification, mds can only conclude if
client has pending capsnap. The method mds use is inefficient
and error-prone.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
startsync is a no-op, has been for years. Remove it.
Link: http://tracker.ceph.com/issues/20604
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
OSD has a configurable limitation of max write size. OSD return
error if write request size is larger than the limitation. For now,
set max write size to CEPH_MSG_MAX_DATA_LEN. It should be small
enough.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
libceph returns -EIO when read size > CEPH_MSG_MAX_DATA_LEN.
Link: http://tracker.ceph.com/issues/20528
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
This field hasn't been used since commit 57b691819ee2 ("NFS: Cache
access checks more aggressively").
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
When a byte range lock (or flock) is taken out on an NFS file, the
validity of the cached data is checked and the inode is marked
NFS_INODE_INVALID_DATA. However the cached data isn't flushed from
the page cache.
This is sufficient for future read() requests or mmap() requests as
they call nfs_revalidate_mapping() which performs the flush if
necessary.
However an existing mapping is not affected. Accessing data through
that mapping will continue to return old data even though the inode is
marked NFS_INODE_INVALID_DATA.
This can easily be confirmed using the 'nfs' tool in
git://github.com/okirch/twopence-nfs.git
and running
nfs coherence FILENAME
on one client, and
nfs coherence -r FILENAME
on another client.
It appears that prior to Linux 2.6.0 this worked correctly.
However commit:
http://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ca9268fe3ddd075714005adecd4afbd7f9ab87d0
removed the call to inode_invalidate_pages() from nfs_zap_caches(). I
haven't tested this code, but inspection suggests that prior to this
commit, file locking would invalidate all inode pages.
This patch adds a call to nfs_revalidate_mapping() after a
successful SETLK so that invalid data is flushed. With this patch the
above test passes. To minimize impact (and possibly avoid a GETATTR
call) this only happens if the mapping might be mapped into
userspace.
Cc: Olaf Kirch <okir@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Commit fbe77c30e9ab ("NFS: move rw_mode to nfs_pageio_header")
reintroduced some pointless code that commit 518662e0fcb9 ("NFS: fix
usage of mempools.") had recently removed.
Remove it again.
Cc: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
This patch renames functions regarding to buffer management via META_MAPPING
used for encrypted blocks especially. We can actually use them in generic way.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch replaces (f2fs_encrypted_inode() && S_ISREG()) with
f2fs_encrypted_file(), which gives no functional change.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Since the beginning, svcsock has built a received RPC Call message
by populating the xdr_buf's head, then placing the remaining
message bytes in the xdr_buf's page list. The xdr_buf's tail is
never populated.
This means that an NFSv4 COMPOUND containing an NFS WRITE operation
plus trailing operations has a page list that contains the WRITE
data payload followed by the trailing operations. NFSv4 XDR decoders
will not look in the xdr_buf's tail, ever, because svcsock never put
anything there.
To support transports that can pass the write payload in the
xdr_buf's pagelist and trailing content in the xdr_buf's tail,
introduce logic in READ_BUF that switches to the xdr_buf's tail vec
when the decoder runs out of content in rq_arg.pages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver update for 4.14-rc1.
Lots of different stuff in here, it's been an active development cycle
for some reason. Highlights are:
- updated binder driver, this brings binder up to date with what
shipped in the Android O release, plus some more changes that
happened since then that are in the Android development trees.
- coresight updates and fixes
- mux driver file renames to be a bit "nicer"
- intel_th driver updates
- normal set of hyper-v updates and changes
- small fpga subsystem and driver updates
- lots of const code changes all over the driver trees
- extcon driver updates
- fmc driver subsystem upadates
- w1 subsystem minor reworks and new features and drivers added
- spmi driver updates
Plus a smattering of other minor driver updates and fixes.
All of these have been in linux-next with no reported issues for a
while"
* tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits)
ANDROID: binder: don't queue async transactions to thread.
ANDROID: binder: don't enqueue death notifications to thread todo.
ANDROID: binder: Don't BUG_ON(!spin_is_locked()).
ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl
ANDROID: binder: push new transactions to waiting threads.
ANDROID: binder: remove proc waitqueue
android: binder: Add page usage in binder stats
android: binder: fixup crash introduced by moving buffer hdr
drivers: w1: add hwmon temp support for w1_therm
drivers: w1: refactor w1_slave_show to make the temp reading functionality separate
drivers: w1: add hwmon support structures
eeprom: idt_89hpesx: Support both ACPI and OF probing
mcb: Fix an error handling path in 'chameleon_parse_cells()'
MCB: add support for SC31 to mcb-lpc
mux: make device_type const
char: virtio: constify attribute_group structures.
Documentation/ABI: document the nvmem sysfs files
lkdtm: fix spelling mistake: "incremeted" -> "incremented"
perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file
nvmem: include linux/err.h from header
...
|
|
This reverts commit b7b7c4cf1c9ef0272a65f1480457cbfdadcda19d.
se->ckpt_valid_blocks will never be smaller than se->valid_blocks, so just
remove get_ssr_cost.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
super_operations are not supposed to change at runtime.
"struct super_block" working with super_operations provided
by <linux/fs.h> work with const super_operations. So mark
the non-const structs as const
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In scenario of remount_ro vs flush, after flush_thread exits in
->remount_fs, flusher will only clean up golbal issue_list, but
without waking up flushers waiting on that list, result in hang
related user threads.
In order to fix this issue, this patch enables the flusher to
take charge of issue_flush thread: executes merged flush command,
and wake up all sleeping flushers.
Fixes: 5eba8c5d1fb3 ("f2fs: fix to access nullified flush_cmd_control pointer")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Previously, we will miss merging flush command during fsync due to below
race condition:
Thread A Thread B Thread C
- f2fs_issue_flush
- atomic_read(&issing_flush)
- f2fs_issue_flush
- atomic_read(&issing_flush)
- f2fs_issue_flush
- atomic_read(&issing_flush)
- atomic_inc(&issing_flush)
- atomic_inc(&issing_flush)
- atomic_inc(&issing_flush)
- submit_flush_wait
- submit_flush_wait
- submit_flush_wait
It needs to use atomic_inc_return instead to avoid such race.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
allocate_segment_by_default is the only caller of change_curseg passing
@reuse with 'false', but commit 763bfe1bc575 ("f2fs: remove reusing any
prefree segments") removes the calling, after that, @reuse in
change_curseg always be true, so, let's clean up the unneeded parameter.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
f2fs enables hash-indexed directory by default, so we need to tag
FS_INDEX_FL in inode::i_flags during directory creataion, in order
to show correct status of inode in lsattr:
Before:
------------------- /mnt/f2fs/dir/
After:
-----------I------- /mnt/f2fs/dir/
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If another thread already made the page dirtied or writebacked, we must avoid
to verify checksum. If we got an error, we need to remove its uptodate as well.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch fixes "f2fs: support inode checksum".
The recovered inode page will be rewritten with valid checksum.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core update from Greg KH:
"Here is the "big" driver core update for 4.14-rc1.
It's really not all that big, the largest thing here being some
firmware tests to help ensure that that crazy api is working properly.
There's also a new uevent for when a driver is bound or unbound from a
device, fixing a hole in the driver model that's been there since the
very beginning. Many thanks to Dmitry for being persistent and
pointing out how wrong I was about this all along :)
Patches for the new uevents are already in the systemd tree, if people
want to play around with them.
Otherwise just a number of other small api changes and updates here,
nothing major. All of these patches have been in linux-next for a
while with no reported issues"
* tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
driver core: bus: Fix a potential double free
Do not disable driver and bus shutdown hook when class shutdown hook is set.
base: topology: constify attribute_group structures.
base: Convert to using %pOF instead of full_name
kernfs: Clarify lockdep name for kn->count
fbdev: uvesafb: remove DRIVER_ATTR() usage
xen: xen-pciback: remove DRIVER_ATTR() usage
driver core: Document struct device:dma_ops
mod_devicetable: Remove excess description from structured comment
test_firmware: add batched firmware tests
firmware: enable a debug print for batched requests
firmware: define pr_fmt
firmware: send -EINTR on signal abort on fallback mechanism
test_firmware: add test case for SIGCHLD on sync fallback
initcall_debug: add deferred probe times
Input: axp20x-pek - switch to using devm_device_add_group()
Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01
Input: gpio_keys - use devm_device_add_group() for attributes
driver core: add devm_device_add_group() and friends
driver core: add device_{add|remove}_group() helpers
...
|
|
In the case of a kzalloc failure when allocating sbi we end up
with a null pointer dereference on sbi when assigning sbi->s_daxdev.
Fix this by moving the assignment of sbi->s_daxdev to after the
null pointer check of sbi.
Detected by CoverityScan CID#1455379 ("Dereference before null check")
Fixes: 5e405595e5bf ("ext4: perform dax_device lookup at mount")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|