summaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp/rtrs/rtrs-clt.c
AgeCommit message (Collapse)AuthorFilesLines
2021-01-15RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_stateGuoqing Jiang1-5/+5
Let's rename it to rtrs_clt_change_state since the previous one is killed. Also update the comment to make it more clear. Link: https://lore.kernel.org/r/20201217141915.56989-13-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Kill rtrs_clt_change_stateGuoqing Jiang1-17/+10
It is just a wrapper of rtrs_clt_change_state_get_old, and we can reuse rtrs_clt_change_state_get_old with add the checking of 'old_state' is valid or not. Link: https://lore.kernel.org/r/20201217141915.56989-12-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Remove unnecessary 'goto out'Guoqing Jiang1-1/+0
This is not needed since the label is just after the place. Link: https://lore.kernel.org/r/20201217141915.56989-11-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Kill wait_for_inflight_permitsGuoqing Jiang1-11/+6
Let's wait the inflight permits before free it. Link: https://lore.kernel.org/r/20201217141915.56989-10-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files}Guoqing Jiang1-4/+2
Since the two functions are called together, let's consolidate them in a new function rtrs_clt_destroy_sysfs_root. Link: https://lore.kernel.org/r/20201217141915.56989-9-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs-clt: Set mininum limit when create QPJack Wang1-7/+12
Currently rtrs when create_qp use a coarse numbers (bigger in general), which leads to hardware create more resources which only waste memory with no benefits. - SERVICE con, For max_send_wr/max_recv_wr, it's 2 times SERVICE_CON_QUEUE_DEPTH + 2 - IO con For max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1 Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201217141915.56989-6-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15RDMA/rtrs: Extend ibtrs_cq_qp_createJack Wang1-2/+2
rtrs does not have same limit for both max_send_wr and max_recv_wr, To allow client and server set different values, export in a separate parameter for rtrs_cq_qp_create. Also fix the type accordingly, u32 should be used instead of u16. Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules") Link: https://lore.kernel.org/r/20201217141915.56989-2-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-17block/rnbd-clt: Does not request pdu to rtrs-cltGioh Kim1-6/+0
Previously the rnbd client requested the rtrs to allocate rnbd_iu just after the rtrs_iu. So the rnbd client passes the size of rnbd_iu for rtrs_clt_open() and rtrs creates an array of rnbd_iu and rtrs_iu. For IO handling, rnbd_iu exists after the request because we pass the size of rnbd_iu when setting the tag-set. Therefore we do not use the rnbd_iu allocated by rtrs for IO handling. We only use the rnbd_iu allocated by rtrs when doing session initialization. Almost all rnbd_iu allocated by rtrs are wasted. By this patch the rnbd client does not request rnbd_iu allocation to rtrs but allocate it for itself when doing session initialization. Also remove unused rtrs_permit_to_pdu from rtrs. Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-17Merge branch 'for-rc' into rdma.gitJason Gunthorpe1-2/+2
From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git The rc RDMA branch is needed due to dependencies on the next patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Remove 'addr' from rtrs_clt_add_path_to_arrGuoqing Jiang1-3/+2
Remove the argument since it is not used in the function. Link: https://lore.kernel.org/r/20201023074353.21946-13-jinpu.wang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Remove duplicated codeGioh Kim1-6/+5
process_info_rsp checks that sg_cnt is zero twice. Link: https://lore.kernel.org/r/20201023074353.21946-10-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Remove duplicated switch-case handling for CM error eventsGioh Kim1-5/+7
The events returning the same error value are put together. Link: https://lore.kernel.org/r/20201023074353.21946-9-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs: Remove unnecessary argument dir of rtrs_iu_freeGioh Kim1-8/+6
The direction of DMA operation is already in the rtrs_iu Link: https://lore.kernel.org/r/20201023074353.21946-8-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Missing error from rtrs_rdma_conn_establishedGioh Kim1-2/+2
When rtrs_rdma_conn_established returns error (non-zero value), the error value is stored in con->cm_err and it cannot trigger rtrs_rdma_error_recovery. Finally the error of rtrs_rdma_con_established will be forgot. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201023074353.21946-5-jinpu.wang@cloud.ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Avoid run destroy_con_cq_qp/create_con_cq_qp in parallelJack Wang1-2/+13
It could happen two kworkers race with each other: CPU0 CPU1 addr_resolver kworker reconnect kworker rtrs_clt_rdma_cm_handler rtrs_rdma_addr_resolved create_con_cq_qp: s.dev_ref++ "s.dev_ref is 1" wait in create_cm fails with TIMEOUT destroy_con_cq_qp: --s.dev_ref "s.dev_ref is 0" destroy_con_cq_qp: sess->s.dev = NULL rtrs_cq_qp_create -> create_qp(con, sess->dev->ib_pd...) sess->dev is NULL, panic. To fix the problem using mutex to serialize create_con_cq_qp and destroy_con_cq_qp. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201023074353.21946-4-jinpu.wang@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Remove outdated comment in create_con_cq_qpJack Wang1-9/+0
As run destroy_con_cq_qp many times doesn't work, remove the comments. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201023074353.21946-3-jinpu.wang@cloud.ionos.com Suggested-by: Gioh Kim <gi-oh.kim@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA/rtrs-clt: Remove destroy_con_cq_qp in case route resolving failedDanil Kipnis1-3/+1
We call destroy_con_cq_qp(con) in rtrs_rdma_addr_resolved() in case route couldn't be resolved and then again in create_cm() because nothing happens. Don't call destroy_con_cq_qp from rtrs_rdma_addr_resolved, create_cm() does the clean up already. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20201023074353.21946-2-jinpu.wang@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28RDMA: Add rdma_connect_locked()Jason Gunthorpe1-2/+2
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the handler triggers a completion and another thread does rdma_connect() or the handler directly calls rdma_connect(). In all cases rdma_connect() needs to hold the handler_mutex, but when handler's are invoked this is already held by the core code. This causes ULPs using the 2nd method to deadlock. Provide a rdma_connect_locked() and have all ULPs call it from their handlers. Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state") Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wqJack Wang1-1/+1
lockdep triggers a warning from time to time when running a regression test: rnbd_client L685: </dev/nullb0@bla> Device disconnected. rnbd_client L1756: Unloading module workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core] WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130 The root cause is workqueue core expect flushing should not be done for a !WQ_MEM_RECLAIM wq from a WQ_MEM_RECLAIM workqueue. In above case ib_addr workqueue without WQ_MEM_RECLAIM, but rtrs_wq WQ_MEM_RECLAIM. To avoid the warning, remove the WQ_MEM_RECLAIM flag. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29RDMA/rtrs-clt: add an additional random 8 seconds before reconnectingDanil Kipnis1-2/+12
In order to avoid all the clients to start reconnecting at the same time schedule the reconnect dwork with a random jitter of +[0,8] seconds. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-05-22RDMA/rtrs: Get rid of the do_next_path while_next_path macrosDanil Kipnis1-16/+13
The macros do_each_path/while_each_path lead to a smatch warning: drivers/infiniband/ulp/rtrs/rtrs-clt.c:1196 rtrs_clt_failover_req() warn: inconsistent indenting drivers/infiniband/ulp/rtrs/rtrs-clt.c:2890 rtrs_clt_request() warn: inconsistent indenting Also checkpatch complains: ERROR: Macros with multiple statements should be enclosed in a do - while loop The macros are used only in two places: for a normal IO path and for the failover path triggered after errors. Get rid of the macros and just use a for loop iterating over the list of paths in both places. It is easier to read and also less lines of code. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200522053924.528980-1-danil.kipnis@cloud.ionos.com Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-20rnbd/rtrs: Pass max segment size from blk user to the rdma libraryDanil Kipnis1-6/+11
When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined. The rtrs is a transport library and should compile independently of the block layer. The desired max segment size should be passed down by the user. Introduce max_segment_size parameter for the rtrs_clt_open() call. Fixes: f7a7a5c228d4 ("block/rnbd: client: main functionality") Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Fixes: cb80329c9434 ("RDMA/rtrs: client: private header with client structs and functions") Fixes: b5c27cdb094e ("RDMA/rtrs: public interface header to establish RDMA connections") Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-20RDMA/rtrs: client: Fix function return on successGustavo A. R. Silva1-3/+0
Remove the if-statement and return the value contained in _err_, unconditionally. Link: https://lore.kernel.org/r/20200519163612.GA6043@embeddedor Addresses-Coverity-ID: 1493753 ("Identical code for different branches") Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-20RDMA/rtrs: Fix some signedness bugs in error handlingDan Carpenter1-4/+3
The problem is that "req->sg_cnt" is an unsigned int so if "nr" is negative, it gets type promoted to a high positive value and the condition is false. This patch fixes it by handling negatives separately. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200519133223.GN2078@kadam Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-18RDMA/rtrs: client: main functionalityJack Wang1-0/+2994
This is main functionality of rtrs-client module, which manages set of RDMA connections for each rtrs session, does multipathing, load balancing and failover of RDMA requests. Link: https://lore.kernel.org/r/20200511135131.27580-7-danil.kipnis@cloud.ionos.com Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>