diff options
author | David Howells <dhowells@redhat.com> | 2020-06-04 00:21:16 +0300 |
---|---|---|
committer | David Howells <dhowells@redhat.com> | 2020-06-05 15:36:35 +0300 |
commit | 5ac0d62226a07849b1a5233af8c800a19cecab83 (patch) | |
tree | 0a527105916a03109d2aac6880bf9dc470e40c4a /net/rxrpc/recvmsg.c | |
parent | 3067bf8c596d59164f48569a2d362de5b4c42f59 (diff) | |
download | linux-5ac0d62226a07849b1a5233af8c800a19cecab83.tar.xz |
rxrpc: Fix missing notification
Under some circumstances, rxrpc will fail a transmit a packet through the
underlying UDP socket (ie. UDP sendmsg returns an error). This may result
in a call getting stuck.
In the instance being seen, where AFS tries to send a probe to the Volume
Location server, tracepoints show the UDP Tx failure (in this case returing
error 99 EADDRNOTAVAIL) and then nothing more:
afs_make_vl_call: c=0000015d VL.GetCapabilities
rxrpc_call: c=0000015d NWc u=1 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000dd89ee8a
rxrpc_call: c=0000015d Gus u=2 sp=rxrpc_new_client_call+0x14f/0x580 [rxrpc] a=00000000e20e4b08
rxrpc_call: c=0000015d SEE u=2 sp=rxrpc_activate_one_channel+0x7b/0x1c0 [rxrpc] a=00000000e20e4b08
rxrpc_call: c=0000015d CON u=2 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000e20e4b08
rxrpc_tx_fail: c=0000015d r=1 ret=-99 CallDataNofrag
The problem is that if the initial packet fails and the retransmission
timer hasn't been started, the call is set to completed and an error is
returned from rxrpc_send_data_packet() to rxrpc_queue_packet(). Though
rxrpc_instant_resend() is called, this does nothing because the call is
marked completed.
So rxrpc_notify_socket() isn't called and the error is passed back up to
rxrpc_send_data(), rxrpc_kernel_send_data() and thence to afs_make_call()
and afs_vl_get_capabilities() where it is simply ignored because it is
assumed that the result of a probe will be collected asynchronously.
Fileserver probing is similarly affected via afs_fs_get_capabilities().
Fix this by always issuing a notification in __rxrpc_set_call_completion()
if it shifts a call to the completed state, even if an error is also
returned to the caller through the function return value.
Also put in a little bit of optimisation to avoid taking the call
state_lock and disabling softirqs if the call is already in the completed
state and remove some now redundant rxrpc_notify_socket() calls.
Fixes: f5c17aaeb2ae ("rxrpc: Calls should only have one terminal state")
Reported-by: Gerry Seidman <gerry@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Diffstat (limited to 'net/rxrpc/recvmsg.c')
-rw-r--r-- | net/rxrpc/recvmsg.c | 21 |
1 files changed, 13 insertions, 8 deletions
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c index 6c4ba4224ddc..2989742a4aa1 100644 --- a/net/rxrpc/recvmsg.c +++ b/net/rxrpc/recvmsg.c @@ -73,6 +73,7 @@ bool __rxrpc_set_call_completion(struct rxrpc_call *call, call->state = RXRPC_CALL_COMPLETE; trace_rxrpc_call_complete(call); wake_up(&call->waitq); + rxrpc_notify_socket(call); return true; } return false; @@ -83,11 +84,13 @@ bool rxrpc_set_call_completion(struct rxrpc_call *call, u32 abort_code, int error) { - bool ret; + bool ret = false; - write_lock_bh(&call->state_lock); - ret = __rxrpc_set_call_completion(call, compl, abort_code, error); - write_unlock_bh(&call->state_lock); + if (call->state < RXRPC_CALL_COMPLETE) { + write_lock_bh(&call->state_lock); + ret = __rxrpc_set_call_completion(call, compl, abort_code, error); + write_unlock_bh(&call->state_lock); + } return ret; } @@ -101,11 +104,13 @@ bool __rxrpc_call_completed(struct rxrpc_call *call) bool rxrpc_call_completed(struct rxrpc_call *call) { - bool ret; + bool ret = false; - write_lock_bh(&call->state_lock); - ret = __rxrpc_call_completed(call); - write_unlock_bh(&call->state_lock); + if (call->state < RXRPC_CALL_COMPLETE) { + write_lock_bh(&call->state_lock); + ret = __rxrpc_call_completed(call); + write_unlock_bh(&call->state_lock); + } return ret; } |