diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-04-07 03:35:43 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-04-07 03:35:43 +0300 |
commit | 19fd08b85bc7e0502b55cd726f466df82ee7e777 (patch) | |
tree | b042de4b9a8a9478c528ea950b14d34487375695 /drivers/infiniband/sw/rxe | |
parent | 28da7be5ebc096ada5e6bc526c623bdd8c47800a (diff) | |
parent | efc365e7290d040fbd43f60b0e97653489a739d4 (diff) | |
download | linux-19fd08b85bc7e0502b55cd726f466df82ee7e777.tar.xz |
Merge tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Doug and I are at a conference next week so if another PR is sent I
expect it to only be bug fixes. Parav noted yesterday that there are
some fringe case behavior changes in his work that he would like to
fix, and I see that Intel has a number of rc looking patches for HFI1
they posted yesterday.
Parav is again the biggest contributor by patch count with his ongoing
work to enable container support in the RDMA stack, followed by Leon
doing syzkaller inspired cleanups, though most of the actual fixing
went to RC.
There is one uncomfortable series here fixing the user ABI to actually
work as intended in 32 bit mode. There are lots of notes in the commit
messages, but the basic summary is we don't think there is an actual
32 bit kernel user of drivers/infiniband for several good reasons.
However we are seeing people want to use a 32 bit user space with 64
bit kernel, which didn't completely work today. So in fixing it we
required a 32 bit rxe user to upgrade their userspace. rxe users are
still already quite rare and we think a 32 bit one is non-existing.
- Fix RDMA uapi headers to actually compile in userspace and be more
complete
- Three shared with netdev pull requests from Mellanox:
* 7 patches, mostly to net with 1 IB related one at the back).
This series addresses an IRQ performance issue (patch 1),
cleanups related to the fix for the IRQ performance problem
(patches 2-6), and then extends the fragmented completion queue
support that already exists in the net side of the driver to the
ib side of the driver (patch 7).
* Mostly IB, with 5 patches to net that are needed to support the
remaining 10 patches to the IB subsystem. This series extends
the current 'representor' framework when the mlx5 driver is in
switchdev mode from being a netdev only construct to being a
netdev/IB dev construct. The IB dev is limited to raw Eth queue
pairs only, but by having an IB dev of this type attached to the
representor for a switchdev port, it enables DPDK to work on the
switchdev device.
* All net related, but needed as infrastructure for the rdma
driver
- Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers
- SRP performance updates
- IB uverbs write path cleanup patch series from Leon
- Add RDMA_CM support to ib_srpt. This is disabled by default. Users
need to set the port for ib_srpt to listen on in configfs in order
for it to be enabled
(/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port)
- TSO and Scatter FCS support in mlx4
- Refactor of modify_qp routine to resolve problems seen while
working on new code that is forthcoming
- More refactoring and updates of RDMA CM for containers support from
Parav
- mlx5 'fine grained packet pacing', 'ipsec offload' and 'device
memory' user API features
- Infrastructure updates for the new IOCTL interface, based on
increased usage
- ABI compatibility bug fixes to fully support 32 bit userspace on 64
bit kernel as was originally intended. See the commit messages for
extensive details
- Syzkaller bugs and code cleanups motivated by them"
* tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits)
IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
IB/mlx5: Device memory mr registration support
net/mlx5: Mkey creation command adjustments
IB/mlx5: Device memory support in mlx5_ib
net/mlx5: Query device memory capabilities
IB/uverbs: Add device memory registration ioctl support
IB/uverbs: Add alloc/free dm uverbs ioctl support
IB/uverbs: Add device memory capabilities reporting
IB/uverbs: Expose device memory capabilities to user
RDMA/qedr: Fix wmb usage in qedr
IB/rxe: Removed GID add/del dummy routines
RDMA/qedr: Zero stack memory before copying to user space
IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR
IB/mlx5: Add information for querying IPsec capabilities
IB/mlx5: Add IPsec support for egress and ingress
{net,IB}/mlx5: Add ipsec helper
IB/mlx5: Add modify_flow_action_esp verb
IB/mlx5: Add implementation for create and destroy action_xfrm
IB/uverbs: Introduce ESP steering match filter
IB/uverbs: Add modify ESP flow_action
...
Diffstat (limited to 'drivers/infiniband/sw/rxe')
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe.c | 4 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe.h | 6 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_av.c | 5 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_cq.c | 15 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_loc.h | 20 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_net.c | 56 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_qp.c | 35 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_queue.c | 24 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_queue.h | 5 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_recv.c | 16 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_resp.c | 15 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_srq.c | 44 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_verbs.c | 101 | ||||
-rw-r--r-- | drivers/infiniband/sw/rxe/rxe_verbs.h | 2 |
14 files changed, 171 insertions, 177 deletions
diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c index b7debb6f2eac..e493fdbd61c6 100644 --- a/drivers/infiniband/sw/rxe/rxe.c +++ b/drivers/infiniband/sw/rxe/rxe.c @@ -78,7 +78,7 @@ void rxe_release(struct kref *kref) } /* initialize rxe device parameters */ -static int rxe_init_device_param(struct rxe_dev *rxe) +static void rxe_init_device_param(struct rxe_dev *rxe) { rxe->max_inline_data = RXE_MAX_INLINE_DATA; @@ -122,8 +122,6 @@ static int rxe_init_device_param(struct rxe_dev *rxe) rxe->attr.local_ca_ack_delay = RXE_LOCAL_CA_ACK_DELAY; rxe->max_ucontext = RXE_MAX_UCONTEXT; - - return 0; } /* initialize port attributes */ diff --git a/drivers/infiniband/sw/rxe/rxe.h b/drivers/infiniband/sw/rxe/rxe.h index 7d232611303f..561ad307c6ec 100644 --- a/drivers/infiniband/sw/rxe/rxe.h +++ b/drivers/infiniband/sw/rxe/rxe.h @@ -59,7 +59,11 @@ #include "rxe_verbs.h" #include "rxe_loc.h" -#define RXE_UVERBS_ABI_VERSION (1) +/* + * Version 1 and Version 2 are identical on 64 bit machines, but on 32 bit + * machines Version 2 has a different struct layout. + */ +#define RXE_UVERBS_ABI_VERSION 2 #define IB_PHYS_STATE_LINK_UP (5) #define IB_PHYS_STATE_LINK_DOWN (3) diff --git a/drivers/infiniband/sw/rxe/rxe_av.c b/drivers/infiniband/sw/rxe/rxe_av.c index 7522d1af3ae2..7f1ae364088a 100644 --- a/drivers/infiniband/sw/rxe/rxe_av.c +++ b/drivers/infiniband/sw/rxe/rxe_av.c @@ -74,8 +74,9 @@ void rxe_av_fill_ip_info(struct rxe_av *av, struct ib_gid_attr *sgid_attr, union ib_gid *sgid) { - rdma_gid2ip(&av->sgid_addr._sockaddr, sgid); - rdma_gid2ip(&av->dgid_addr._sockaddr, &rdma_ah_read_grh(attr)->dgid); + rdma_gid2ip((struct sockaddr *)&av->sgid_addr, sgid); + rdma_gid2ip((struct sockaddr *)&av->dgid_addr, + &rdma_ah_read_grh(attr)->dgid); av->network_type = ib_gid_to_network_type(sgid_attr->gid_type, sgid); } diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c index c4aabf78dc90..2ee4b08b00ea 100644 --- a/drivers/infiniband/sw/rxe/rxe_cq.c +++ b/drivers/infiniband/sw/rxe/rxe_cq.c @@ -36,7 +36,7 @@ #include "rxe_queue.h" int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq, - int cqe, int comp_vector, struct ib_udata *udata) + int cqe, int comp_vector) { int count; @@ -83,7 +83,7 @@ static void rxe_send_complete(unsigned long data) int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe, int comp_vector, struct ib_ucontext *context, - struct ib_udata *udata) + struct rxe_create_cq_resp __user *uresp) { int err; @@ -94,15 +94,15 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe, return -ENOMEM; } - err = do_mmap_info(rxe, udata, false, context, cq->queue->buf, - cq->queue->buf_size, &cq->queue->ip); + err = do_mmap_info(rxe, uresp ? &uresp->mi : NULL, context, + cq->queue->buf, cq->queue->buf_size, &cq->queue->ip); if (err) { kvfree(cq->queue->buf); kfree(cq->queue); return err; } - if (udata) + if (uresp) cq->is_user = 1; cq->is_dying = false; @@ -114,14 +114,15 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe, return 0; } -int rxe_cq_resize_queue(struct rxe_cq *cq, int cqe, struct ib_udata *udata) +int rxe_cq_resize_queue(struct rxe_cq *cq, int cqe, + struct rxe_resize_cq_resp __user *uresp) { int err; err = rxe_queue_resize(cq->queue, (unsigned int *)&cqe, sizeof(struct rxe_cqe), cq->queue->ip ? cq->queue->ip->context : NULL, - udata, NULL, &cq->cq_lock); + uresp ? &uresp->mi : NULL, NULL, &cq->cq_lock); if (!err) cq->ibcq.cqe = cqe; diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 4ef75d5b729b..b71023c1c58b 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -52,13 +52,14 @@ struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt); /* rxe_cq.c */ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq, - int cqe, int comp_vector, struct ib_udata *udata); + int cqe, int comp_vector); int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe, int comp_vector, struct ib_ucontext *context, - struct ib_udata *udata); + struct rxe_create_cq_resp __user *uresp); -int rxe_cq_resize_queue(struct rxe_cq *cq, int new_cqe, struct ib_udata *udata); +int rxe_cq_resize_queue(struct rxe_cq *cq, int new_cqe, + struct rxe_resize_cq_resp __user *uresp); int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited); @@ -143,8 +144,7 @@ int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); /* rxe_net.c */ int rxe_loopback(struct sk_buff *skb); -int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, - struct sk_buff *skb); +int rxe_send(struct rxe_pkt_info *pkt, struct sk_buff *skb); struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av, int paylen, struct rxe_pkt_info *pkt); int rxe_prepare(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, @@ -159,7 +159,8 @@ int rxe_mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid); int rxe_qp_chk_init(struct rxe_dev *rxe, struct ib_qp_init_attr *init); int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd, - struct ib_qp_init_attr *init, struct ib_udata *udata, + struct ib_qp_init_attr *init, + struct rxe_create_qp_resp __user *uresp, struct ib_pd *ibpd); int rxe_qp_to_init(struct rxe_qp *qp, struct ib_qp_init_attr *init); @@ -227,11 +228,12 @@ int rxe_srq_chk_attr(struct rxe_dev *rxe, struct rxe_srq *srq, int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq, struct ib_srq_init_attr *init, - struct ib_ucontext *context, struct ib_udata *udata); + struct ib_ucontext *context, + struct rxe_create_srq_resp __user *uresp); int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq, struct ib_srq_attr *attr, enum ib_srq_attr_mask mask, - struct ib_udata *udata); + struct rxe_modify_srq_cmd *ucmd); void rxe_release(struct kref *kref); @@ -268,7 +270,7 @@ static inline int rxe_xmit_packet(struct rxe_dev *rxe, struct rxe_qp *qp, memcpy(SKB_TO_PKT(skb), pkt, sizeof(*pkt)); err = rxe_loopback(skb); } else { - err = rxe_send(rxe, pkt, skb); + err = rxe_send(pkt, skb); } if (err) { diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index 159246b03867..9da6e37fb70c 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -182,11 +182,39 @@ static struct dst_entry *rxe_find_route6(struct net_device *ndev, #endif +/* + * Derive the net_device from the av. + * For physical devices, this will just return rxe->ndev. + * But for VLAN devices, it will return the vlan dev. + * Caller should dev_put() the returned net_device. + */ +static struct net_device *rxe_netdev_from_av(struct rxe_dev *rxe, + int port_num, + struct rxe_av *av) +{ + union ib_gid gid; + struct ib_gid_attr attr; + struct net_device *ndev = rxe->ndev; + + if (ib_get_cached_gid(&rxe->ib_dev, port_num, av->grh.sgid_index, + &gid, &attr) == 0 && + attr.ndev && attr.ndev != ndev) + ndev = attr.ndev; + else + /* Only to ensure that caller may call dev_put() */ + dev_hold(ndev); + + return ndev; +} + static struct dst_entry *rxe_find_route(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_av *av) { struct dst_entry *dst = NULL; + struct net_device *ndev; + + ndev = rxe_netdev_from_av(rxe, qp->attr.port_num, av); if (qp_type(qp) == IB_QPT_RC) dst = sk_dst_get(qp->sk->sk); @@ -201,14 +229,14 @@ static struct dst_entry *rxe_find_route(struct rxe_dev *rxe, saddr = &av->sgid_addr._sockaddr_in.sin_addr; daddr = &av->dgid_addr._sockaddr_in.sin_addr; - dst = rxe_find_route4(rxe->ndev, saddr, daddr); + dst = rxe_find_route4(ndev, saddr, daddr); } else if (av->network_type == RDMA_NETWORK_IPV6) { struct in6_addr *saddr6; struct in6_addr *daddr6; saddr6 = &av->sgid_addr._sockaddr_in6.sin6_addr; daddr6 = &av->dgid_addr._sockaddr_in6.sin6_addr; - dst = rxe_find_route6(rxe->ndev, saddr6, daddr6); + dst = rxe_find_route6(ndev, saddr6, daddr6); #if IS_ENABLED(CONFIG_IPV6) if (dst) qp->dst_cookie = @@ -217,6 +245,7 @@ static struct dst_entry *rxe_find_route(struct rxe_dev *rxe, } } + dev_put(ndev); return dst; } @@ -224,9 +253,14 @@ static int rxe_udp_encap_recv(struct sock *sk, struct sk_buff *skb) { struct udphdr *udph; struct net_device *ndev = skb->dev; + struct net_device *rdev = ndev; struct rxe_dev *rxe = net_to_rxe(ndev); struct rxe_pkt_info *pkt = SKB_TO_PKT(skb); + if (!rxe && is_vlan_dev(rdev)) { + rdev = vlan_dev_real_dev(ndev); + rxe = net_to_rxe(rdev); + } if (!rxe) goto drop; @@ -450,7 +484,7 @@ static void rxe_skb_tx_dtor(struct sk_buff *skb) rxe_drop_ref(qp); } -int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb) +int rxe_send(struct rxe_pkt_info *pkt, struct sk_buff *skb) { struct rxe_av *av; int err; @@ -498,6 +532,10 @@ struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av, { unsigned int hdr_len; struct sk_buff *skb; + struct net_device *ndev; + const int port_num = 1; + + ndev = rxe_netdev_from_av(rxe, port_num, av); if (av->network_type == RDMA_NETWORK_IPV4) hdr_len = ETH_HLEN + sizeof(struct udphdr) + @@ -506,26 +544,30 @@ struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av, hdr_len = ETH_HLEN + sizeof(struct udphdr) + sizeof(struct ipv6hdr); - skb = alloc_skb(paylen + hdr_len + LL_RESERVED_SPACE(rxe->ndev), + skb = alloc_skb(paylen + hdr_len + LL_RESERVED_SPACE(ndev), GFP_ATOMIC); - if (unlikely(!skb)) + + if (unlikely(!skb)) { + dev_put(ndev); return NULL; + } skb_reserve(skb, hdr_len + LL_RESERVED_SPACE(rxe->ndev)); - skb->dev = rxe->ndev; + skb->dev = ndev; if (av->network_type == RDMA_NETWORK_IPV4) skb->protocol = htons(ETH_P_IP); else skb->protocol = htons(ETH_P_IPV6); pkt->rxe = rxe; - pkt->port_num = 1; + pkt->port_num = port_num; pkt->hdr = skb_put(skb, paylen); pkt->mask |= RXE_GRH_MASK; memset(pkt->hdr, 0, paylen); + dev_put(ndev); return skb; } diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c index 2fcf1cab7678..b9f7aa1114b2 100644 --- a/drivers/infiniband/sw/rxe/rxe_qp.c +++ b/drivers/infiniband/sw/rxe/rxe_qp.c @@ -40,15 +40,6 @@ #include "rxe_queue.h" #include "rxe_task.h" -char *rxe_qp_state_name[] = { - [QP_STATE_RESET] = "RESET", - [QP_STATE_INIT] = "INIT", - [QP_STATE_READY] = "READY", - [QP_STATE_DRAIN] = "DRAIN", - [QP_STATE_DRAINED] = "DRAINED", - [QP_STATE_ERROR] = "ERROR", -}; - static int rxe_qp_chk_cap(struct rxe_dev *rxe, struct ib_qp_cap *cap, int has_srq) { @@ -225,7 +216,8 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp, struct ib_qp_init_attr *init, - struct ib_ucontext *context, struct ib_udata *udata) + struct ib_ucontext *context, + struct rxe_create_qp_resp __user *uresp) { int err; int wqe_size; @@ -250,9 +242,9 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp, if (!qp->sq.queue) return -ENOMEM; - err = do_mmap_info(rxe, udata, true, - context, qp->sq.queue->buf, - qp->sq.queue->buf_size, &qp->sq.queue->ip); + err = do_mmap_info(rxe, uresp ? &uresp->sq_mi : NULL, context, + qp->sq.queue->buf, qp->sq.queue->buf_size, + &qp->sq.queue->ip); if (err) { kvfree(qp->sq.queue->buf); @@ -283,7 +275,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp, static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp, struct ib_qp_init_attr *init, - struct ib_ucontext *context, struct ib_udata *udata) + struct ib_ucontext *context, + struct rxe_create_qp_resp __user *uresp) { int err; int wqe_size; @@ -303,9 +296,8 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp, if (!qp->rq.queue) return -ENOMEM; - err = do_mmap_info(rxe, udata, false, context, - qp->rq.queue->buf, - qp->rq.queue->buf_size, + err = do_mmap_info(rxe, uresp ? &uresp->rq_mi : NULL, context, + qp->rq.queue->buf, qp->rq.queue->buf_size, &qp->rq.queue->ip); if (err) { kvfree(qp->rq.queue->buf); @@ -331,14 +323,15 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp, /* called by the create qp verb */ int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd, - struct ib_qp_init_attr *init, struct ib_udata *udata, + struct ib_qp_init_attr *init, + struct rxe_create_qp_resp __user *uresp, struct ib_pd *ibpd) { int err; struct rxe_cq *rcq = to_rcq(init->recv_cq); struct rxe_cq *scq = to_rcq(init->send_cq); struct rxe_srq *srq = init->srq ? to_rsrq(init->srq) : NULL; - struct ib_ucontext *context = udata ? ibpd->uobject->context : NULL; + struct ib_ucontext *context = ibpd->uobject ? ibpd->uobject->context : NULL; rxe_add_ref(pd); rxe_add_ref(rcq); @@ -353,11 +346,11 @@ int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd, rxe_qp_init_misc(rxe, qp, init); - err = rxe_qp_init_req(rxe, qp, init, context, udata); + err = rxe_qp_init_req(rxe, qp, init, context, uresp); if (err) goto err1; - err = rxe_qp_init_resp(rxe, qp, init, context, udata); + err = rxe_qp_init_resp(rxe, qp, init, context, uresp); if (err) goto err2; diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c index d14bf496d62d..f84ab4469261 100644 --- a/drivers/infiniband/sw/rxe/rxe_queue.c +++ b/drivers/infiniband/sw/rxe/rxe_queue.c @@ -37,35 +37,21 @@ #include "rxe_queue.h" int do_mmap_info(struct rxe_dev *rxe, - struct ib_udata *udata, - bool is_req, + struct mminfo __user *outbuf, struct ib_ucontext *context, struct rxe_queue_buf *buf, size_t buf_size, struct rxe_mmap_info **ip_p) { int err; - u32 len, offset; struct rxe_mmap_info *ip = NULL; - if (udata) { - if (is_req) { - len = udata->outlen - sizeof(struct mminfo); - offset = sizeof(struct mminfo); - } else { - len = udata->outlen; - offset = 0; - } - - if (len < sizeof(ip->info)) - goto err1; - + if (outbuf) { ip = rxe_create_mmap_info(rxe, buf_size, context, buf); if (!ip) goto err1; - err = copy_to_user(udata->outbuf + offset, &ip->info, - sizeof(ip->info)); + err = copy_to_user(outbuf, &ip->info, sizeof(ip->info)); if (err) goto err2; @@ -171,7 +157,7 @@ int rxe_queue_resize(struct rxe_queue *q, unsigned int *num_elem_p, unsigned int elem_size, struct ib_ucontext *context, - struct ib_udata *udata, + struct mminfo __user *outbuf, spinlock_t *producer_lock, spinlock_t *consumer_lock) { @@ -184,7 +170,7 @@ int rxe_queue_resize(struct rxe_queue *q, if (!new_q) return -ENOMEM; - err = do_mmap_info(new_q->rxe, udata, false, context, new_q->buf, + err = do_mmap_info(new_q->rxe, outbuf, context, new_q->buf, new_q->buf_size, &new_q->ip); if (err) { vfree(new_q->buf); diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h index 8c8641c87817..79ba4b320054 100644 --- a/drivers/infiniband/sw/rxe/rxe_queue.h +++ b/drivers/infiniband/sw/rxe/rxe_queue.h @@ -77,8 +77,7 @@ struct rxe_queue { }; int do_mmap_info(struct rxe_dev *rxe, - struct ib_udata *udata, - bool is_req, + struct mminfo __user *outbuf, struct ib_ucontext *context, struct rxe_queue_buf *buf, size_t buf_size, @@ -94,7 +93,7 @@ int rxe_queue_resize(struct rxe_queue *q, unsigned int *num_elem_p, unsigned int elem_size, struct ib_ucontext *context, - struct ib_udata *udata, + struct mminfo __user *outbuf, /* Protect producers while resizing queue */ spinlock_t *producer_lock, /* Protect consumers while resizing queue */ diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c index 4c3f899241d4..dd80c7d9074a 100644 --- a/drivers/infiniband/sw/rxe/rxe_recv.c +++ b/drivers/infiniband/sw/rxe/rxe_recv.c @@ -276,7 +276,6 @@ static void rxe_rcv_mcast_pkt(struct rxe_dev *rxe, struct sk_buff *skb) { struct rxe_pkt_info *pkt = SKB_TO_PKT(skb); struct rxe_mc_grp *mcg; - struct sk_buff *skb_copy; struct rxe_mc_elem *mce; struct rxe_qp *qp; union ib_gid dgid; @@ -309,18 +308,14 @@ static void rxe_rcv_mcast_pkt(struct rxe_dev *rxe, struct sk_buff *skb) continue; /* if *not* the last qp in the list - * make a copy of the skb to post to the next qp + * increase the users of the skb then post to the next qp */ - skb_copy = (mce->qp_list.next != &mcg->qp_list) ? - skb_clone(skb, GFP_ATOMIC) : NULL; + if (mce->qp_list.next != &mcg->qp_list) + refcount_inc(&skb->users); pkt->qp = qp; rxe_add_ref(qp); rxe_rcv_pkt(rxe, pkt, skb); - - skb = skb_copy; - if (!skb) - break; } spin_unlock_bh(&mcg->mcg_lock); @@ -328,8 +323,7 @@ static void rxe_rcv_mcast_pkt(struct rxe_dev *rxe, struct sk_buff *skb) rxe_drop_ref(mcg); /* drop ref from rxe_pool_get_key. */ err1: - if (skb) - kfree_skb(skb); + kfree_skb(skb); } static int rxe_match_dgid(struct rxe_dev *rxe, struct sk_buff *skb) @@ -347,7 +341,7 @@ static int rxe_match_dgid(struct rxe_dev *rxe, struct sk_buff *skb) return ib_find_cached_gid_by_port(&rxe->ib_dev, pdgid, IB_GID_TYPE_ROCE_UDP_ENCAP, - 1, rxe->ndev, NULL); + 1, skb->dev, NULL); } /* rxe_rcv is called from the interface driver */ diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index d37bb9b97569..a65c9969f7fc 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -969,7 +969,6 @@ static int send_atomic_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt, int rc = 0; struct rxe_pkt_info ack_pkt; struct sk_buff *skb; - struct sk_buff *skb_copy; struct rxe_dev *rxe = to_rdev(qp->ibqp.device); struct resp_res *res; @@ -981,14 +980,7 @@ static int send_atomic_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt, goto out; } - skb_copy = skb_clone(skb, GFP_ATOMIC); - if (skb_copy) - rxe_add_ref(qp); /* for the new SKB */ - else { - pr_warn("Could not clone atomic response\n"); - rc = -ENOMEM; - goto out; - } + rxe_add_ref(qp); res = &qp->resp.resources[qp->resp.res_head]; free_rd_atomic_resource(qp, res); @@ -998,19 +990,18 @@ static int send_atomic_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt, memset((unsigned char *)SKB_TO_PKT(skb) + sizeof(ack_pkt), 0, sizeof(skb->cb) - sizeof(ack_pkt)); + refcount_inc(&skb->users); res->type = RXE_ATOMIC_MASK; res->atomic.skb = skb; res->first_psn = ack_pkt.psn; res->last_psn = ack_pkt.psn; res->cur_psn = ack_pkt.psn; - rc = rxe_xmit_packet(rxe, qp, &ack_pkt, skb_copy); + rc = rxe_xmit_packet(rxe, qp, &ack_pkt, skb); if (rc) { pr_err_ratelimited("Failed sending ack\n"); rxe_drop_ref(qp); - kfree_skb(skb_copy); } - out: return rc; } diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c index efc832a2d7c6..0d6c04ba7fc3 100644 --- a/drivers/infiniband/sw/rxe/rxe_srq.c +++ b/drivers/infiniband/sw/rxe/rxe_srq.c @@ -99,7 +99,8 @@ err1: int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq, struct ib_srq_init_attr *init, - struct ib_ucontext *context, struct ib_udata *udata) + struct ib_ucontext *context, + struct rxe_create_srq_resp __user *uresp) { int err; int srq_wqe_size; @@ -126,55 +127,41 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq, srq->rq.queue = q; - err = do_mmap_info(rxe, udata, false, context, q->buf, + err = do_mmap_info(rxe, uresp ? &uresp->mi : NULL, context, q->buf, q->buf_size, &q->ip); if (err) return err; - if (udata && udata->outlen >= sizeof(struct mminfo) + sizeof(u32)) { - if (copy_to_user(udata->outbuf + sizeof(struct mminfo), - &srq->srq_num, sizeof(u32))) + if (uresp) { + if (copy_to_user(&uresp->srq_num, &srq->srq_num, + sizeof(uresp->srq_num))) return -EFAULT; } + return 0; } int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq, struct ib_srq_attr *attr, enum ib_srq_attr_mask mask, - struct ib_udata *udata) + struct rxe_modify_srq_cmd *ucmd) { int err; struct rxe_queue *q = srq->rq.queue; - struct mminfo mi = { .offset = 1, .size = 0}; + struct mminfo __user *mi = NULL; if (mask & IB_SRQ_MAX_WR) { - /* Check that we can write the mminfo struct to user space */ - if (udata && udata->inlen >= sizeof(__u64)) { - __u64 mi_addr; - - /* Get address of user space mminfo struct */ - err = ib_copy_from_udata(&mi_addr, udata, - sizeof(mi_addr)); - if (err) - goto err1; - - udata->outbuf = (void __user *)(unsigned long)mi_addr; - udata->outlen = sizeof(mi); - - if (!access_ok(VERIFY_WRITE, - (void __user *)udata->outbuf, - udata->outlen)) { - err = -EFAULT; - goto err1; - } - } + /* + * This is completely screwed up, the response is supposed to + * be in the outbuf not like this. + */ + mi = u64_to_user_ptr(ucmd->mmap_info_addr); err = rxe_queue_resize(q, &attr->max_wr, rcv_wqe_size(srq->rq.max_sge), srq->rq.queue->ip ? srq->rq.queue->ip->context : NULL, - udata, &srq->rq.producer_lock, + mi, &srq->rq.producer_lock, &srq->rq.consumer_lock); if (err) goto err2; @@ -188,6 +175,5 @@ int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq, err2: rxe_queue_cleanup(q); srq->rq.queue = NULL; -err1: return err; } diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index f4bab2cd0ec2..2cb52fd48cf1 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -77,40 +77,6 @@ out: return rc; } -static int rxe_query_gid(struct ib_device *device, - u8 port_num, int index, union ib_gid *gid) -{ - int ret; - - if (index > RXE_PORT_GID_TBL_LEN) - return -EINVAL; - - ret = ib_get_cached_gid(device, port_num, index, gid, NULL); - if (ret == -EAGAIN) { - memcpy(gid, &zgid, sizeof(*gid)); - return 0; - } - - return ret; -} - -static int rxe_add_gid(struct ib_device *device, u8 port_num, unsigned int - index, const union ib_gid *gid, - const struct ib_gid_attr *attr, void **context) -{ - if (index >= RXE_PORT_GID_TBL_LEN) - return -EINVAL; - return 0; -} - -static int rxe_del_gid(struct ib_device *device, u8 port_num, unsigned int - index, void **context) -{ - if (index >= RXE_PORT_GID_TBL_LEN) - return -EINVAL; - return 0; -} - static struct net_device *rxe_get_netdev(struct ib_device *device, u8 port_num) { @@ -273,9 +239,7 @@ static int rxe_init_av(struct rxe_dev *rxe, struct rdma_ah_attr *attr, rxe_av_from_attr(rdma_ah_get_port_num(attr), av, attr); rxe_av_fill_ip_info(av, attr, &sgid_attr, &sgid); - - if (sgid_attr.ndev) - dev_put(sgid_attr.ndev); + dev_put(sgid_attr.ndev); return 0; } @@ -407,6 +371,13 @@ static struct ib_srq *rxe_create_srq(struct ib_pd *ibpd, struct rxe_pd *pd = to_rpd(ibpd); struct rxe_srq *srq; struct ib_ucontext *context = udata ? ibpd->uobject->context : NULL; + struct rxe_create_srq_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) + return ERR_PTR(-EINVAL); + uresp = udata->outbuf; + } err = rxe_srq_chk_attr(rxe, NULL, &init->attr, IB_SRQ_INIT_MASK); if (err) @@ -422,7 +393,7 @@ static struct ib_srq *rxe_create_srq(struct ib_pd *ibpd, rxe_add_ref(pd); srq->pd = pd; - err = rxe_srq_from_init(rxe, srq, init, context, udata); + err = rxe_srq_from_init(rxe, srq, init, context, uresp); if (err) goto err2; @@ -443,12 +414,22 @@ static int rxe_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, int err; struct rxe_srq *srq = to_rsrq(ibsrq); struct rxe_dev *rxe = to_rdev(ibsrq->device); + struct rxe_modify_srq_cmd ucmd = {}; + + if (udata) { + if (udata->inlen < sizeof(ucmd)) + return -EINVAL; + + err = ib_copy_from_udata(&ucmd, udata, sizeof(ucmd)); + if (err) + return err; + } err = rxe_srq_chk_attr(rxe, srq, attr, mask); if (err) goto err1; - err = rxe_srq_from_attr(rxe, srq, attr, mask, udata); + err = rxe_srq_from_attr(rxe, srq, attr, mask, &ucmd); if (err) goto err1; @@ -517,6 +498,13 @@ static struct ib_qp *rxe_create_qp(struct ib_pd *ibpd, struct rxe_dev *rxe = to_rdev(ibpd->device); struct rxe_pd *pd = to_rpd(ibpd); struct rxe_qp *qp; + struct rxe_create_qp_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) + return ERR_PTR(-EINVAL); + uresp = udata->outbuf; + } err = rxe_qp_chk_init(rxe, init); if (err) @@ -538,7 +526,7 @@ static struct ib_qp *rxe_create_qp(struct ib_pd *ibpd, rxe_add_index(qp); - err = rxe_qp_from_init(rxe, qp, pd, init, udata, ibpd); + err = rxe_qp_from_init(rxe, qp, pd, init, uresp, ibpd); if (err) goto err3; @@ -711,9 +699,8 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr, memcpy(wqe->dma.sge, ibwr->sg_list, num_sge * sizeof(struct ib_sge)); - wqe->iova = (mask & WR_ATOMIC_MASK) ? - atomic_wr(ibwr)->remote_addr : - rdma_wr(ibwr)->remote_addr; + wqe->iova = mask & WR_ATOMIC_MASK ? atomic_wr(ibwr)->remote_addr : + mask & WR_READ_OR_WRITE_MASK ? rdma_wr(ibwr)->remote_addr : 0; wqe->mask = mask; wqe->dma.length = length; wqe->dma.resid = length; @@ -889,11 +876,18 @@ static struct ib_cq *rxe_create_cq(struct ib_device *dev, int err; struct rxe_dev *rxe = to_rdev(dev); struct rxe_cq *cq; + struct rxe_create_cq_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) + return ERR_PTR(-EINVAL); + uresp = udata->outbuf; + } if (attr->flags) return ERR_PTR(-EINVAL); - err = rxe_cq_chk_attr(rxe, NULL, attr->cqe, attr->comp_vector, udata); + err = rxe_cq_chk_attr(rxe, NULL, attr->cqe, attr->comp_vector); if (err) goto err1; @@ -904,7 +898,7 @@ static struct ib_cq *rxe_create_cq(struct ib_device *dev, } err = rxe_cq_from_init(rxe, cq, attr->cqe, attr->comp_vector, - context, udata); + context, uresp); if (err) goto err2; @@ -931,12 +925,19 @@ static int rxe_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata) int err; struct rxe_cq *cq = to_rcq(ibcq); struct rxe_dev *rxe = to_rdev(ibcq->device); + struct rxe_resize_cq_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) + return -EINVAL; + uresp = udata->outbuf; + } - err = rxe_cq_chk_attr(rxe, cq, cqe, 0, udata); + err = rxe_cq_chk_attr(rxe, cq, cqe, 0); if (err) goto err1; - err = rxe_cq_resize_queue(cq, cqe, udata); + err = rxe_cq_resize_queue(cq, cqe, uresp); if (err) goto err1; @@ -1207,7 +1208,7 @@ int rxe_register_device(struct rxe_dev *rxe) rxe->ndev->dev_addr); dev->dev.dma_ops = &dma_virt_ops; dma_coerce_mask_and_coherent(&dev->dev, - dma_get_required_mask(dev->dev.parent)); + dma_get_required_mask(&dev->dev)); dev->uverbs_abi_ver = RXE_UVERBS_ABI_VERSION; dev->uverbs_cmd_mask = BIT_ULL(IB_USER_VERBS_CMD_GET_CONTEXT) @@ -1248,10 +1249,7 @@ int rxe_register_device(struct rxe_dev *rxe) dev->query_port = rxe_query_port; dev->modify_port = rxe_modify_port; dev->get_link_layer = rxe_get_link_layer; - dev->query_gid = rxe_query_gid; dev->get_netdev = rxe_get_netdev; - dev->add_gid = rxe_add_gid; - dev->del_gid = rxe_del_gid; dev->query_pkey = rxe_query_pkey; dev->alloc_ucontext = rxe_alloc_ucontext; dev->dealloc_ucontext = rxe_dealloc_ucontext; @@ -1298,6 +1296,7 @@ int rxe_register_device(struct rxe_dev *rxe) } rxe->tfm = tfm; + dev->driver_id = RDMA_DRIVER_RXE; err = ib_register_device(dev, NULL); if (err) { pr_warn("%s failed with error %d\n", __func__, err); diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 1019f5e7dbdd..af1470d29391 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -139,8 +139,6 @@ enum rxe_qp_state { QP_STATE_ERROR }; -extern char *rxe_qp_state_name[]; - struct rxe_req_info { enum rxe_qp_state state; int wqe_index; |