summaryrefslogtreecommitdiff
path: root/include/linux/ceph
AgeCommit message (Collapse)AuthorFilesLines
2013-06-20libceph: wrap auth methods in a mutexSage Weil1-0/+2
commit e9966076cdd952e19f2dd4854cd719be0d7cbebc upstream. The auth code is called from a variety of contexts, include the mon_client (protected by the monc's mutex) and the messenger callbacks (currently protected by nothing). Avoid chaos by protecting all auth state with a mutex. Nothing is blocking, so this should be simple and lightweight. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20libceph: wrap auth ops in wrapper functionsSage Weil1-0/+13
commit 27859f9773e4a0b2042435b13400ee2c891a61f4 upstream. Use wrapper functions that check whether the auth op exists so that callers do not need a bunch of conditional checks. Simplifies the external interface. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20libceph: add update_authorizer auth methodSage Weil1-0/+3
commit 0bed9b5c523d577378b6f83eab5835fe30c27208 upstream. Currently the messenger calls out to a get_authorizer con op, which will create a new authorizer if it doesn't yet have one. In the meantime, when we rotate our service keys, the authorizer doesn't get updated. Eventually it will be rejected by the server on a new connection attempt and get invalidated, and we will then rebuild a new authorizer, but this is not ideal. Instead, if we do have an authorizer, call a new update_authorizer op that will verify that the current authorizer is using the latest secret. If it is not, we will build a new one that does. This avoids the transient failure. This fixes one of the sorry sequence of events for bug http://tracker.ceph.com/issues/4282 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: remove 'osdtimeout' optionSage Weil1-2/+0
This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (cherry picked from commit 83aff95eb9d60aff5497e9f44a2ae906b86d8e88) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: drop declaration of ceph_con_get()Alex Elder1-2/+0
commit 261030215d970c62f799e6e508e3c68fc7ec2aa9 upstream. For some reason the declaration of ceph_con_get() and ceph_con_put() did not get deleted in this commit: d59315ca libceph: drop ceph_con_get/put helpers and nref member Clean that up. Signed-off-by: Alex Elder <elder@inktank.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: check for invalid mappingSage Weil2-4/+4
(cherry picked from commit d63b77f4c552cc3a20506871046ab0fcbc332609) If we encounter an invalid (e.g., zeroed) mapping, return an error and avoid a divide by zero. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: clean up con flagsSage Weil1-10/+0
(cherry picked from commit 4a8616920860920abaa51193146fe36b38ef09aa) Rename flags with CON_FLAG prefix, move the definitions into the c file, and (better) document their meaning. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: replace connection state bits with statesSage Weil1-12/+0
(cherry picked from commit 8dacc7da69a491c515851e68de6036f21b5663ce) Use a simple set of 6 enumerated values for the socket states (CON_STATE_*) and use those instead of the state bits. All of the con->state checks are now under the protection of the con mutex, so this is safe. It also simplifies many of the state checks because we can check for anything other than the expected state instead of various bits for races we can think of. This appears to hold up well to stress testing both with and without socket failure injection on the server side. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: prevent the race of incoming work during teardownGuanjun He1-0/+1
(cherry picked from commit a2a3258417eb6a1799cf893350771428875a8287) Add an atomic variable 'stopping' as flag in struct ceph_messenger, set this flag to 1 in function ceph_destroy_client(), and add the condition code in function ceph_data_ready() to test the flag value, if true(1), just return. Signed-off-by: Guanjun He <gjhe@suse.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: initialize msgpool message typesSage Weil1-1/+2
(cherry picked from commit d50b409fb8698571d8209e5adfe122e287e31290) Initialize the type field for messages in a msgpool. The caller was doing this for osd ops, but not for the reply messages. Reported-by: Alex Elder <elder@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: set peer name on con_open, not initSage Weil1-2/+2
(cherry picked from commit b7a9e5dd40f17a48a72f249b8bbc989b63bae5fd) The peer name may change on each open attempt, even when the connection is reused. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: define and use an explicit CONNECTED stateAlex Elder1-0/+1
(cherry picked from commit e27947c767f5bed15048f4e4dad3e2eb69133697) There is no state explicitly defined when a ceph connection is fully operational. So define one. It's set when the connection sequence completes successfully, and is cleared when the connection gets closed. Be a little more careful when examining the old state when a socket disconnect event is reported. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: drop ceph_con_get/put helpers and nref memberSage Weil1-1/+0
(cherry picked from commit d59315ca8c0de00df9b363f94a2641a30961ca1c) These are no longer used. Every ceph_connection instance is embedded in another structure, and refcounts manipulated via the get/put ops. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: make ceph_con_revoke_message() a msg opAlex Elder1-2/+2
(cherry picked from commit 8921d114f5574c6da2cdd00749d185633ecf88f3) ceph_con_revoke_message() is passed both a message and a ceph connection. A ceph_msg allocated for incoming messages on a connection always has a pointer to that connection, so there's no need to provide the connection when revoking such a message. Note that the existing logic does not preclude the message supplied being a null/bogus message pointer. The only user of this interface is the OSD client, and the only value an osd client passes is a request's r_reply field. That is always non-null (except briefly in an error path in ceph_osdc_alloc_request(), and that drops the only reference so the request won't ever have a reply to revoke). So we can safely assume the passed-in message is non-null, but add a BUG_ON() to make it very obvious we are imposing this restriction. Rename the function ceph_msg_revoke_incoming() to reflect that it is really an operation on an incoming message. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: make ceph_con_revoke() a msg operationAlex Elder1-1/+2
(cherry picked from commit 6740a845b2543cc46e1902ba21bac743fbadd0dc) ceph_con_revoke() is passed both a message and a ceph connection. Now that any message associated with a connection holds a pointer to that connection, there's no need to provide the connection when revoking a message. This has the added benefit of precluding the possibility of the providing the wrong connection pointer. If the message's connection pointer is null, it is not being tracked by any connection, so revoking it is a no-op. This is supported as a convenience for upper layers, so they can revoke a message that is not actually "in flight." Rename the function ceph_msg_revoke() to reflect that it is really an operation on a message, not a connection. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: have messages point to their connectionAlex Elder1-0/+3
(cherry picked from commit 38941f8031bf042dba3ced6394ba3a3b16c244ea) When a ceph message is queued for sending it is placed on a list of pending messages (ceph_connection->out_queue). When they are actually sent over the wire, they are moved from that list to another (ceph_connection->out_sent). When acknowledgement for the message is received, it is removed from the sent messages list. During that entire time the message is "in the possession" of a single ceph connection. Keep track of that connection in the message. This will be used in the next patch (and is a helpful bit of information for debugging anyway). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: fully initialize connection in con_init()Alex Elder1-2/+4
(cherry picked from commit 1bfd89f4e6e1adc6a782d94aa5d4c53be1e404d7) Move the initialization of a ceph connection's private pointer, operations vector pointer, and peer name information into ceph_con_init(). Rearrange the arguments so the connection pointer is first. Hide the byte-swapping of the peer entity number inside ceph_con_init() Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: embed ceph connection structure in mon_clientAlex Elder1-1/+1
(cherry picked from commit 67130934fb579fdf0f2f6d745960264378b57dc8) A monitor client has a pointer to a ceph connection structure in it. This is the only one of the three ceph client types that do it this way; the OSD and MDS clients embed the connection into their main structures. There is always exactly one ceph connection for a monitor client, so there is no need to allocate it separate from the monitor client structure. So switch the ceph_mon_client structure to embed its ceph_connection structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: start tracking connection socket stateAlex Elder1-2/+6
(cherry picked from commit ce2c8903e76e690846a00a0284e4bd9ee954d680) Start explicitly keeping track of the state of a ceph connection's socket, separate from the state of the connection itself. Create placeholder functions to encapsulate the state transitions. -------- | NEW* | transient initial state -------- | con_sock_state_init() v ---------- | CLOSED | initialized, but no socket (and no ---------- TCP connection) ^ \ | \ con_sock_state_connecting() | ---------------------- | \ + con_sock_state_closed() \ |\ \ | \ \ | ----------- \ | | CLOSING | socket event; \ | ----------- await close \ | ^ | | | | | + con_sock_state_closing() | | / \ | | / --------------- | | / \ v | / -------------- | / -----------------| CONNECTING | socket created, TCP | | / -------------- connect initiated | | | con_sock_state_connected() | | v ------------- | CONNECTED | TCP connection established ------------- Make the socket state an atomic variable, reinforcing that it's a distinct transtion with no possible "intermediate/both" states. This is almost certainly overkill at this point, though the transitions into CONNECTED and CLOSING state do get called via socket callback (the rest of the transitions occur with the connection mutex held). We can back out the atomicity later. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil<sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: start separating connection flags from stateAlex Elder1-6/+12
(cherry picked from commit 928443cd9644e7cfd46f687dbeffda2d1a357ff9) A ceph_connection holds a mixture of connection state (as in "state machine" state) and connection flags in a single "state" field. To make the distinction more clear, define a new "flags" field and use it rather than the "state" field to hold Boolean flag values. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil<sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: embed ceph messenger structure in ceph_clientAlex Elder2-5/+6
(cherry picked from commit 15d9882c336db2db73ccf9871ae2398e452f694c) A ceph client has a pointer to a ceph messenger structure in it. There is always exactly one ceph messenger for a ceph client, so there is no need to allocate it separate from the ceph client structure. Switch the ceph_client structure to embed its ceph_messenger structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: kill bad_proto ceph connection opAlex Elder1-3/+0
(cherry picked from commit 6384bb8b8e88a9c6bf2ae0d9517c2c0199177c34) No code sets a bad_proto method in its ceph connection operations vector, so just get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: eliminate connection state "DEAD"Alex Elder1-1/+0
(cherry picked from commit e5e372da9a469dfe3ece40277090a7056c566838) The ceph connection state "DEAD" is never set and is therefore not needed. Eliminate it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26libceph: fix messenger retrySage Weil1-10/+2
(cherry picked from commit 5bdca4e0768d3e0f4efa43d9a2cc8210aeb91ab9) In ancient times, the messenger could both initiate and accept connections. An artifact if that was data structures to store/process an incoming ceph_msg_connect request and send an outgoing ceph_msg_connect_reply. Sadly, the negotiation code was referencing those structures and ignoring important information (like the peer's connect_seq) from the correct ones. Among other things, this fixes tight reconnect loops where the server sends RETRY_SESSION and we (the client) retries with the same connect_seq as last time. This bug pretty easily triggered by injecting socket failures on the MDS and running some fs workload like workunits/direct_io/test_sync_io. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26ceph: use info returned by get_authorizerAlex Elder1-3/+1
(cherry picked from commit 8f43fb53894079bf0caab6e348ceaffe7adc651a) Rather than passing a bunch of arguments to be filled in with the content of the ceph_auth_handshake buffer now returned by the get_authorizer method, just use the returned information in the caller, and drop the unnecessary arguments. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26ceph: have get_authorizer methods return pointersAlex Elder1-3/+5
(cherry picked from commit a3530df33eb91d787d08c7383a0a9982690e42d0) Have the get_authorizer auth_client method return a ceph_auth pointer rather than an integer, pointer-encoding any returned error value. This is to pave the way for making use of the returned value in an upcoming patch. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26ceph: messenger: reduce args to create_authorizerAlex Elder1-3/+1
(cherry picked from commit 74f1869f76d043bad12ec03b4d5f04a8c3d1f157) Make use of the new ceph_auth_handshake structure in order to reduce the number of arguments passed to the create_authorizor method in ceph_auth_client_ops. Use a local variable of that type as a shorthand in the get_authorizer method definitions. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-26ceph: define ceph_auth_handshake typeAlex Elder2-6/+13
(cherry picked from commit 6c4a19158b96ea1fb8acbe0c1d5493d9dcd2f147) The definitions for the ceph_mds_session and ceph_osd both contain five fields related only to "authorizers." Encapsulate those fields into their own struct type, allowing for better isolation in some upcoming patches. Fix the #includes in "linux/ceph/osd_client.h" to lay out their more complete canonical path. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-28Merge branch 'for-linus' of ↵Linus Torvalds2-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates for 3.4-rc1 from Sage Weil: "Alex has been busy. There are a range of rbd and libceph cleanups, especially surrounding device setup and teardown, and a few critical fixes in that code. There are more cleanups in the messenger code, virtual xattrs, a fix for CRC calculation/checks, and lots of other miscellaneous stuff. There's a patch from Amon Ott to make inos behave a bit better on 32-bit boxes, some decode check fixes from Xi Wang, and network throttling fix from Jim Schutt, and a couple RBD fixes from Josh Durgin. No new functionality, just a lot of cleanup and bug fixing." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (65 commits) rbd: move snap_rwsem to the device, rename to header_rwsem ceph: fix three bugs, two in ceph_vxattrcb_file_layout() libceph: isolate kmap() call in write_partial_msg_pages() libceph: rename "page_shift" variable to something sensible libceph: get rid of zero_page_address libceph: only call kernel_sendpage() via helper libceph: use kernel_sendpage() for sending zeroes libceph: fix inverted crc option logic libceph: some simple changes libceph: small refactor in write_partial_kvec() libceph: do crc calculations outside loop libceph: separate CRC calculation from byte swapping libceph: use "do" in CRC-related Boolean variables ceph: ensure Boolean options support both senses libceph: a few small changes libceph: make ceph_tcp_connect() return int libceph: encapsulate some messenger cleanup code libceph: make ceph_msgr_wq private libceph: encapsulate connection kvec operations libceph: move prepare_write_banner() ...
2012-03-22libceph: use "do" in CRC-related Boolean variablesAlex Elder1-1/+1
Change the name (and type) of a few CRC-related Boolean local variables so they contain the word "do", to distingish their purpose from variables used for holding an actual CRC value. Note that in the process of doing this I identified a fairly serious logic error in write_partial_msg_pages(): the value of "do_crc" assigned appears to be the opposite of what it should be. No attempt to fix this is made here; this change preserves the erroneous behavior. The problem I found is documented here: http://tracker.newdream.net/issues/2064 Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22libceph: make ceph_msgr_wq privateAlex Elder1-2/+0
The messenger workqueue has no need to be public. So give it static scope. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22rbd: make ceph_parse_options() return a pointerAlex Elder1-1/+1
ceph_parse_options() takes the address of a pointer as an argument and uses it to return the address of an allocated structure if successful. With this interface is not evident at call sites that the pointer is always initialized. Change the interface to return the address instead (or a pointer-coded error code) to make the validity of the returned pointer obvious. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22ceph: use a shared zero page rather than one per messengerAlex Elder1-1/+0
Each messenger allocates a page to be used when writing zeroes out in the event of error or other abnormal condition. Instead, use the kernel ZERO_PAGE() for that purpose. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-05BUG: headers with BUG/BUG_ON etc. need linux/bug.hPaul Gortmaker3-1/+4
If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any other BUG variant in a static inline (i.e. not in a #define) then that header really should be including <linux/bug.h> and not just expecting it to be implicitly present. We can make this change risk-free, since if the files using these headers didn't have exposure to linux/bug.h already, they would have been causing compile failures/warnings. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-11-11libceph: Allocate larger oid buffer in request msgsStratos Psomadakis1-1/+7
ceph_osd_request struct allocates a 40-byte buffer for object names. RBD image names can be up to 96 chars long (100 with the .rbd suffix), which results in the object name for the image being truncated, and a subsequent map failure. Increase the oid buffer in request messages, in order to avoid the truncation. Signed-off-by: Stratos Psomadakis <psomas@grnet.gr> Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-29Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-clientLinus Torvalds2-2/+5
* 'for-linus' of git://ceph.newdream.net/git/ceph-client: libceph: fix double-free of page vector ceph: fix 32-bit ino numbers libceph: force resend of osd requests if we skip an osdmap ceph: use kernel DNS resolver ceph: fix ceph_monc_init memory leak ceph: let the set_layout ioctl set single traits Revert "ceph: don't truncate dirty pages in invalidate work thread" ceph: replace leading spaces with tabs libceph: warn on msg allocation failures libceph: don't complain on msgpool alloc failures libceph: always preallocate mon connection libceph: create messenger with client ceph: document ioctls ceph: implement (optional) max read size ceph: rename rsize -> rasize ceph: make readpages fully async
2011-10-26libceph: don't complain on msgpool alloc failuresSage Weil1-1/+2
The pool allocation failures are masked by the pool; there is no need to spam the console about them. (That's the whole point of having the pool in the first place.) Mark msg allocations whose failure is safely handled as such. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-26libceph: create messenger with clientSage Weil1-1/+3
This simplifies the init/shutdown paths, and makes client->msgr available during the rest of the setup process. Signed-off-by: Sage Weil <sage@newdream.net>
2011-09-15Merge branch 'master' into for-nextJiri Kosina1-0/+1
Fast-forward merge with Linus to be able to merge patches based on more recent version of the tree.
2011-09-15Remove unneeded version.h includes from include/Jesper Juhl1-1/+0
It was pointed out by 'make versioncheck' that some includes of linux/version.h are not needed in include/. This patch removes them. When I last posted the patch, the ceph bit was ACK'ed by Sage Weil, so I've added that below. The pwc-ioctl change generated quite a bit of discussion about V4L version numbers in general, but as far as I can tell, no concensus was reached on what the long term solution should be, so in the mean time I think we could start by just removing the unneeded include, which is why I'm resending the patch with that hunk still included. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Sage Weil <sage@newdream.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-27Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits) ceph: document unlocked d_parent accesses ceph: explicitly reference rename old_dentry parent dir in request ceph: document locking for ceph_set_dentry_offset ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug ceph: protect d_parent access in ceph_d_revalidate ceph: protect access to d_parent ceph: handle racing calls to ceph_init_dentry ceph: set dir complete frag after adding capability rbd: set blk_queue request sizes to object size ceph: set up readahead size when rsize is not passed rbd: cancel watch request when releasing the device ceph: ignore lease mask ceph: fix ceph_lookup_open intent usage ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC ceph: fix bad parent_inode calc in ceph_lookup_open ceph: avoid carrying Fw cap during write into page cache libceph: don't time out osd requests that haven't been received ceph: report f_bfree based on kb_avail rather than diffing. ceph: only queue capsnap if caps are dirty ceph: fix snap writeback when racing with writes ...
2011-07-26libceph: don't time out osd requests that haven't been receivedSage Weil1-0/+1
Keep track of when an outgoing message is ACKed (i.e., the server fully received it and, presumably, queued it for processing). Time out OSD requests only if it's been too long since they've been received. This prevents timeouts and connection thrashing when the OSDs are simply busy and are throttling the requests they read off the network. Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2011-07-21treewide: fix potentially dangerous trailing ';' in #defined values/expressionsPhil Carmody1-1/+1
All these are instances of #define NAME value; or #define NAME(params_opt) value; These of course fail to build when used in contexts like if(foo $OP NAME) while(bar $OP NAME) and may silently generate the wrong code in contexts such as foo = NAME + 1; /* foo = value; + 1; */ bar = NAME - 1; /* bar = value; - 1; */ baz = NAME & quux; /* baz = value; & quux; */ Reported on comp.lang.c, Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com> Initial analysis of the dangers provided by Keith Thompson in that thread. There are many more instances of more complicated macros having unnecessary trailing semicolons, but this pile seems to be all of the cases of simple values suffering from the problem. (Thus things that are likely to be found in one of the contexts above, more complicated ones aren't.) Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-24ceph: use LOOKUPINO to make unconnected nfs fh more reliableSage Weil1-0/+1
If we are unable to locate an inode by ino, ask the MDS using the new LOOKUPINO command. Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29ceph: Move secret key parsing earlier.Tommi Virtanen2-3/+3
This makes the base64 logic be contained in mount option parsing, and prepares us for replacing the homebew key management with the kernel key retention service. Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-22libceph: add lingering request and watch/notify event frameworkYehuda Sadeh1-0/+52
Lingering requests are requests that are sent to the OSD normally but tracked also after we get a successful request. This keeps the OSD connection open and resends the original request if the object moves to another OSD. The OSD can then send notification messages back to us if another client initiates a notify. This framework will be used by RBD so that the client gets notification when a snapshot is created by another node or tool. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21ceph: move readahead default to fs/ceph from libcephSage Weil1-1/+0
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21ceph: update common header filesYehuda Sadeh2-13/+45
This updates the common header files used by the different ceph related modules. Specifically it adds definitions required by the rbd watch/notify feature. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2011-03-21libceph: fix osd request queuing on osdmap updatesSage Weil1-2/+3
If we send a request to osd A, and the request's pg remaps to osd B and then back to A in quick succession, we need to resend the request to A. The old code was only calling kick_requests after processing all incremental maps in a message, so it was very possible to not resend a request that needed to be resent. This would make the osd eventually time out (at least with the current default of osd timeouts enabled). The correct approach is to scan requests on every map incremental. This patch refactors the kick code in a few ways: - all requests are either on req_lru (in flight), req_unsent (ready to send), or req_notarget (currently map to no up osd) - mapping always done by map_request (previous map_osds) - if the mapping changes, we requeue. requests are resent only after all map incrementals are processed. - some osd reset code is moved out of kick_requests into a separate function - the "kick this osd" functionality is moved to kick_osd_requests, as it is unrelated to scanning for request->pg->osd mapping changes Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-04libceph: fix msgr keepalive flagSage Weil1-1/+0
There was some broken keepalive code using a dead variable. Shift to using the proper bit flag. Signed-off-by: Sage Weil <sage@newdream.net>