kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2015-04-22	libceph: announce support for straw2 buckets	Ilya Dryomov	1	-1/+15
	Sync up feature bits and enable CEPH_FEATURE_CRUSH_V4. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22	ceph: rename snapshot support	Yan, Zheng	1	-0/+1
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-20	libceph: simplify our debugfs attr macro	Ilya Dryomov	1	-7/+1
	No need to do single_open()'s job ourselves. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-20	libceph: expose client options through debugfs	Ilya Dryomov	1	-0/+1
	Add a client_options attribute for showing libceph options. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-20	libceph, ceph: split ceph_show_options()	Ilya Dryomov	1	-0/+1
	Split ceph_show_options() into two pieces and move the piece responsible for printing client (libceph) options into net/ceph. This way people adding a libceph option wouldn't have to remember to update code in fs/ceph. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-20	libceph: osdmap.h: Add missing format newlines	Joe Perches	1	-3/+2
	To avoid possible interleaving, add missing '\n' to formats. Convert pr_warning to pr_warn while there. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-02-19	libceph: tcp_nodelay support	Chaitanya Huilgol	2	-2/+5
	TCP_NODELAY socket option set on connection sockets, disables Nagle’s algorithm and improves latency characteristics. tcp_nodelay(default)/notcp_nodelay option flags provided to enable/disable setting the socket option. Signed-off-by: Chaitanya Huilgol <chaitanya.huilgol@sandisk.com> [idryomov@redhat.com: NO_TCP_NODELAY -> TCP_NODELAY, minor adjustments] Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2015-02-19	ceph: handle SESSION_FORCE_RO message	Yan, Zheng	1	-0/+1
	mark session as readonly and wake up all cap waiters. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	libceph: nuke pool op infrastructure	Ilya Dryomov	2	-44/+1
	On Mon, Dec 22, 2014 at 5:35 PM, Sage Weil <sage@newdream.net> wrote: > On Mon, 22 Dec 2014, Ilya Dryomov wrote: >> Actually, pool op stuff has been unused for over two years - looks like >> it was added for rbd create_snap and that got ripped out in 2012. It's >> unlikely we'd ever need to manage pools or snaps from the kernel client >> so I think it makes sense to nuke it. Sage? > > Yep! Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2015-01-08	libceph: fix sparse endianness warnings	Ilya Dryomov	1	-2/+2
	The only real issue is the one in auth_x.c and it came with 3.19-rc1 merge. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: fixup includes in pagelist.h	Ilya Dryomov	1	-1/+3
	pagelist.h needs to include linux/types.h and asm/byteorder.h and not rely on other headers pulling yet another set of headers. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	ceph: use getattr request to fetch inline data	Yan, Zheng	1	-0/+2
	Add a new parameter 'locked_page' to ceph_do_getattr(). If inline data in getattr reply will be copied to the page. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: add inline data to pagecache	Yan, Zheng	1	-0/+1
	Request reply and cap message can contain inline data. add inline data to the page cache if there is Fc cap. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	libceph: specify position of extent operation	Yan, Zheng	1	-1/+2
	allow specifying position of extent operation in multi-operations osd request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object). Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: add SETXATTR/CMPXATTR osd operations support	Yan, Zheng	1	-0/+10
	Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: require cephx message signature by default	Yan, Zheng	1	-0/+1
	Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: update ceph_msg_header structure	John Spray	1	-1/+2
	2 bytes of what was reserved space is now used by userspace for the compat_version field. Signed-off-by: John Spray <john.spray@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-17	libceph: message signature support	Yan, Zheng	4	-1/+43
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	libceph: nuke ceph_kvfree()	Ilya Dryomov	2	-3/+1
	Use kvfree() from linux/mm.h instead, which is identical. Also fix the ceph_buffer comment: we will allocate with kmalloc() up to 32k - the value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an implementation detail so don't mention it at all. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	ceph: fix file lock interruption	Yan, Zheng	1	-2/+5
	When a lock operation is interrupted, current code sends a unlock request to MDS to undo the lock operation. This method does not work as expected because the unlock request can drop locks that have already been acquired. The fix is use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR requests to interrupt blocked file lock request. These requests do not drop locks that have alread been acquired, they only interrupt blocked file lock request. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-10-14	libceph: sync osd op definitions in rados.h	Ilya Dryomov	1	-96/+129
	Bring in missing osd ops and strings, use macros to eliminate multiple points of maintenance. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-14	libceph: remove redundant declaration	Fabian Frederick	1	-1/+0
	ceph_release_page_vector was defined twice in libceph.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-10-14	libceph: reference counting pagelist	Yan, Zheng	1	-1/+4
	this allow pagelist to present data that may be sent multiple times. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-08	libceph: nuke ceph_osdc_unregister_linger_request()	Ilya Dryomov	1	-2/+0
	Remove now unused ceph_osdc_unregister_linger_request(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-07-08	libceph: introduce ceph_osdc_cancel_request()	Ilya Dryomov	1	-0/+1
	Introduce ceph_osdc_cancel_request() intended for canceling requests from the higher layers (rbd and cephfs). Because higher layers are in charge and are supposed to know what and when they are canceling, the request is not completed, only unref'ed and removed from the libceph data structures. __cancel_request() is no longer called before __unregister_request(), because __unregister_request() unconditionally revokes r_request and there is no point in trying to do it twice. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-07-08	libceph: move and add dout()s to ceph_osdc_request_{get,put}()	Ilya Dryomov	1	-9/+2
	Add dout()s to ceph_osdc_request_{get,put}(). Also move them to .c and turn kref release callback into a static function. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-07-08	libceph: move and add dout()s to ceph_msg_{get,put}()	Ilya Dryomov	1	-12/+2
	Add dout()s to ceph_msg_{get,put}(). Also move them to .c and turn kref release callback into a static function. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-07-08	libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item	Ilya Dryomov	1	-1/+1
	So that: req->r_osd_item --> osd->o_requests list req->r_linger_osd_item --> osd->o_linger_requests list Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-06-13	Merge branch 'for-linus' of ↵	Linus Torvalds	2	-3/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "This has a mix of bug fixes and cleanups. Alex's patch fixes a rare race in RBD. Ilya's patches fix an ENOENT check when a second rbd image is mapped and a couple memory leaks. Zheng fixes several issues with fragmented directories and multiple MDSs. Josh fixes a spin/sleep issue, and Josh and Guangliang's patches fix setting and unsetting RBD images read-only. Naturally there are several other cleanups mixed in for good measure" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits) rbd: only set disk to read-only once rbd: move calls that may sleep out of spin lock range rbd: add ioctl for rbd ceph: use truncate_pagecache() instead of truncate_inode_pages() ceph: include time stamp in every MDS request rbd: fix ida/idr memory leak rbd: use reference counts for image requests rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync() rbd: make sure we have latest osdmap on 'rbd map' libceph: add ceph_monc_wait_osdmap() libceph: mon_get_version request infrastructure libceph: recognize poolop requests in debugfs ceph: refactor readpage_nounlock() to make the logic clearer mds: check cap ID when handling cap export message ceph: remember subtree root dirfrag's auth MDS ceph: introduce ceph_fill_fragtree() ceph: handle cap import atomically ceph: pre-allocate ceph_cap struct for ceph_add_cap() ceph: update inode fields according to issued caps rbd: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO ...
2014-06-06	libceph: add ceph_monc_wait_osdmap()	Ilya Dryomov	1	-0/+2
	Add ceph_monc_wait_osdmap(), which will block until the osdmap with the specified epoch is received or timeout occurs. Export both of these as they are going to be needed by rbd. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-06-06	libceph: mon_get_version request infrastructure	Ilya Dryomov	1	-3/+6
	Add support for mon_get_version requests to libceph. This reuses much of the ceph_mon_generic_request infrastructure, with one exception. Older OSDs don't set mon_get_version reply hdr->tid even if the original request had a non-zero tid, which makes it impossible to lookup ceph_mon_generic_request contexts by tid in get_generic_reply() for such replies. As a workaround, we allocate a reply message on the reply path. This can probably interfere with revoke, but I don't see a better way. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-06-06	ceph: update inode fields according to issued caps	Yan, Zheng	1	-0/+2
	Cap message and request reply from non-auth MDS may carry stale information (corresponding locks are in LOCK states) even they have the newest inode version. So client should update inode fields according to issued caps. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-05-07	ceph_sync_read: stop poking into iov_iter guts	Al Viro	1	-2/+0
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-05	libceph: enable PRIMARY_AFFINITY feature bit	Ilya Dryomov	1	-1/+2
	Announce our support for osdmaps with non-default primary affinity values. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: return primary from ceph_calc_pg_acting()	Ilya Dryomov	1	-1/+1
	In preparation for adding support for primary_temp, stop assuming primaryness: add a primary out parameter to ceph_calc_pg_acting() and change call sites accordingly. Primary is now specified separately from the order of osds in the set. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: switch ceph_calc_pg_acting() to new helpers	Ilya Dryomov	1	-1/+1
	Switch ceph_calc_pg_acting() to new helpers: pg_to_raw_osds(), raw_to_up_osds() and apply_temps(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: ceph_can_shift_osds(pool) and pool type defines	Ilya Dryomov	2	-2/+15
	Bring in pg_pool_t::can_shift_osds() counterpart along with pool type defines. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: ceph_osd_{exists,is_up,is_down}(osd) definitions	Ilya Dryomov	1	-1/+13
	Sync up with ceph.git definitions. Bring in ceph_osd_is_down(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: enable OSDMAP_ENC feature bit	Ilya Dryomov	1	-0/+1
	Announce our support for "new" (v7 - split and separately versioned client and osd sections) osdmap enconding. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: primary_affinity infrastructure	Ilya Dryomov	2	-0/+7
	Add primary_affinity infrastructure. primary_affinity values are stored in an max_osd-sized array, hanging off ceph_osdmap, similar to a osd_weight array. Introduce {get,set}_primary_affinity() helpers, primarily to return CEPH_OSD_DEFAULT_PRIMARY_AFFINITY when no affinity has been set and to abstract out osd_primary_affinity array allocation and initialization. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: primary_temp infrastructure	Ilya Dryomov	1	-0/+5
	Add primary_temp mappings infrastructure. struct ceph_pg_mapping is overloaded, primary_temp mappings are stored in an rb-tree, rooted at ceph_osdmap, in a manner similar to pg_temp mappings. Dump primary_temp mappings to /sys/kernel/debug/ceph/<client>/osdmap, one 'primary_temp <pgid> <osd>' per line, e.g: primary_temp 2.6 4 Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: generalize ceph_pg_mapping	Ilya Dryomov	1	-2/+7
	In preparation for adding support for primary_temp mappings, generalize struct ceph_pg_mapping so it can hold mappings other than pg_temp. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	libceph: split osdmap allocation and decode steps	Ilya Dryomov	1	-1/+1
	Split osdmap allocation and initialization into a separate function, ceph_osdmap_decode(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-05	crush: support chooseleaf_vary_r tunable (tunables3) by default	Ilya Dryomov	1	-1/+9
	Add TUNABLES3 feature (chooseleaf_vary_r tunable) to a set of features supported by default. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2014-04-05	ceph: use fl->fl_file as owner identifier of flock and posix lock	Yan, Zheng	1	-2/+2
	flock and posix lock should use fl->fl_file instead of process ID as owner identifier. (posix lock uses fl->fl_owner. fl->fl_owner is usually equal to fl->fl_file, but it also can be a customized value). The process ID of who holds the lock is just for F_GETLK fcntl(2). The fix is rename the 'pid' fields of struct ceph_mds_request_args and struct ceph_filelock to 'owner', rename 'pid_namespace' fields to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages. We also set the most significant bit of the 'owner' field. MDS can use that bit to distinguish between old and new clients. The MDS counterpart of this patch modifies the flock code to not take the 'pid_namespace' into consideration when checking conflict locks. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03	ceph: add get_name() NFS export callback	Yan, Zheng	1	-0/+1
	Use the newly introduced LOOKUPNAME MDS request to connect child inode to its parent directory. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-03	libceph: bump CEPH_OSD_MAX_OP to 3	Ilya Dryomov	1	-1/+1
	Our longest osd request now contains 3 ops: copyup+hint+write. Also, CEPH_OSD_MAX_OP value in a BUG_ON in rbd_osd_req_callback() was hard-coded to 2. Fix it, and switch to rbd_assert while at it. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-03	libceph: add support for CEPH_OSD_OP_SETALLOCHINT osd op	Ilya Dryomov	2	-0/+15
	This is primarily for rbd's benefit and is supposed to combat fragmentation: "... knowing that rbd images have a 4m size, librbd can pass a hint that will let the osd do the xfs allocation size ioctl on new files so that they are allocated in 1m or 4m chunks. We've seen cases where users with rbd workloads have very high levels of fragmentation in xfs and this would mitigate that and probably have a pretty nice performance benefit." SETALLOCHINT is considered advisory, so our backwards compatibility mechanism here is to set FAILOK flag for all SETALLOCHINT ops. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-03	libceph: encode CEPH_OSD_OP_FLAG_* op flags	Ilya Dryomov	2	-1/+2
	Encode ceph_osd_op::flags field so that it gets sent over the wire. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-04-03	libceph: a per-osdc crush scratch buffer	Ilya Dryomov	1	-0/+3
	With the addition of erasure coding support in the future, scratch variable-length array in crush_do_rule_ary() is going to grow to at least 200 bytes on average, on top of another 128 bytes consumed by rawosd/osd arrays in the call chain. Replace it with a buffer inside struct osdmap and a mutex. This shouldn't result in any contention, because all osd requests were already serialized by request_mutex at that point; the only unlocked caller was ceph_ioctl_get_dataloc(). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>