summaryrefslogtreecommitdiff
path: root/drivers/infiniband/ulp
AgeCommit message (Collapse)AuthorFilesLines
2017-06-29IB/opa_vnic: Use spinlock instead of mutex for stats_lockVishwanathapura, Niranjana4-12/+10
Stats can be read from atomic context, hence make stats_lock as a spinlock. Fix the following trace with debug kernel. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238 in_atomic(): 1, irqs_disabled(): 0, pid: 6487, name: sadc Call Trace: dump_stack+0x63/0x90 ___might_sleep+0xda/0x130 __might_sleep+0x4a/0x90 mutex_lock+0x20/0x50 opa_vnic_get_stats64+0x56/0x140 [opa_vnic] dev_get_stats+0x74/0x130 dev_seq_printf_stats+0x37/0x120 dev_seq_show+0x14/0x30 seq_read+0x26d/0x3d0 Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-29IB/opa_vnic: Use GFP_ATOMIC while sending trapVishwanathapura, Niranjana1-1/+1
Pass GFP_ATOMIC flag to ib_create_send_mad() while sending trap as it can be triggered from the atomic context. Fix the following trace with debug kernel. BUG: sleeping function called from invalid context at mm/slab.h:432 in_atomic(): 1, irqs_disabled(): 0, pid: 1771, name: NetworkManager Call Trace: dump_stack+0x63/0x90 ___might_sleep+0xda/0x130 __might_sleep+0x4a/0x90 __kmalloc+0x19e/0x220 ? ib_create_send_mad+0xea/0x390 [ib_core] ib_create_send_mad+0xea/0x390 [ib_core] opa_vnic_vema_send_trap+0x17b/0x460 [opa_vnic] opa_vnic_vema_report_event+0x57/0x80 [opa_vnic] opa_vnic_mac_send_event+0xaa/0xf0 [opa_vnic] opa_vnic_set_rx_mode+0x17/0x30 [opa_vnic] __dev_set_rx_mode+0x52/0x90 dev_set_rx_mode+0x26/0x40 __dev_open+0xe8/0x140 Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-27IB/core,rdmavt,hfi1,opa-vnic: Send OPA cap_mask3 in trapVishwanathapura, Niranjana1-1/+26
Provide the ability for IB clients to modify the OPA specific capability mask and include this mask in the subsequent trap data. Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Michael N. Henry <michael.n.henry@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-27net: add netlink_ext_ack argument to rtnl_link_ops.changelinkMatthias Schiffer1-3/+4
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-27net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer1-1/+2
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-7/+20
Two entries being added at the same time to the IFLA policy table, whilst parallel bug fixes to decnet routing dst handling overlapping with the dst gc removal in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16networking: make skb_push & __skb_push return void pointersJohannes Berg3-4/+4
It seems like a historic accident that these return unsigned char *, and in many places that means casts are required, more often than not. Make these functions return void * and remove all the casts across the tree, adding a (u8 *) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - *(fn(SKB, LEN)) + *(u8 *)fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + *(u8 *)fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic *(u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14IB/ipoib: Fix memory leak in create child syscallFeras Daoud1-3/+3
The flow of creating a new child goes through ipoib_vlan_add which allocates a new interface and checks the rtnl_lock. If the lock is taken, restart_syscall will be called to restart the system call again. In this case we are not releasing the already allocated interface, causing a leak. Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-14IB/ipoib: Fix access to un-initialized napi structAlex Vesker1-1/+0
There is no need to re-enable napi since we set the initialized flag before calling ipoib_ib_dev_stop which will disable napi, disabling napi twice is harmless in case it was already disabled. One more reason for this fix is that when using IPoIB new device driver napi is not added to priv, this can lead to kernel panic when rn_ops ndo_open fails. [ 289.755840] invalid opcode: 0000 [#1] SMP [ 289.757111] task: ffff880036964440 ti: ffff880178ee8000 task.ti: ffff880178ee8000 [ 289.757111] RIP: 0010:[<ffffffffa05368d6>] [<ffffffffa05368d6>] napi_enable.part.24+0x4/0x6 [ib_ipoib] [ 289.757111] RSP: 0018:ffff880178eeb6d8 EFLAGS: 00010246 [ 289.757111] RAX: 0000000000000000 RBX: ffff880177a80010 RCX: 000000007fffffff [ 289.757111] RDX: ffffffff81d5f118 RSI: 0000000000000000 RDI: ffff880177a80010 [ 289.757111] RBP: ffff880178eeb6d8 R08: 0000000000000082 R09: 0000000000000283 [ 289.757111] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880175a00000 [ 289.757111] R13: ffff880177a80080 R14: 0000000000000000 R15: 0000000000000001 [ 289.757111] FS: 00007fe2ee346880(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000 [ 289.757111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 289.757111] CR2: 00007fffca979020 CR3: 00000001792e4000 CR4: 00000000000006f0 [ 289.757111] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 289.757111] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 289.757111] Stack: [ 289.796027] ffff880178eeb6f0 ffffffffa05251f5 ffff880177a80000 ffff880178eeb718 [ 289.796027] ffffffffa0528505 ffff880175a00000 ffff880177a80000 0000000000000000 [ 289.796027] ffff880178eeb748 ffffffffa051f0ab ffff880175a00000 ffffffffa0537d60 [ 289.796027] Call Trace: [ 289.796027] [<ffffffffa05251f5>] napi_enable+0x25/0x30 [ib_ipoib] [ 289.796027] [<ffffffffa0528505>] ipoib_ib_dev_open+0x175/0x190 [ib_ipoib] [ 289.796027] [<ffffffffa051f0ab>] ipoib_open+0x4b/0x160 [ib_ipoib] [ 289.796027] [<ffffffff814fe33f>] _dev_open+0xbf/0x130 [ 289.796027] [<ffffffff814fe62d>] __dev_change_flags+0x9d/0x170 [ 289.796027] [<ffffffff814fe729>] dev_change_flags+0x29/0x60 [ 289.796027] [<ffffffff8150caf7>] do_setlink+0x397/0xa40 Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-14IB/ipoib: Delete napi in device uninit defaultAlex Vesker1-0/+3
This patch mekas init_default and uninit_default symmetric with a call to delete napi. Additionally, the uninit_default gained delete napi call in case of init_default fails. Fixes: 515ed4f3aab4 ('IB/IPoIB: Separate control and data related initializations') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-14IB/ipoib: Limit call to free rdma_netdev for capable devicesAlex Vesker1-1/+4
Limit calls to free_rdma_netdev() for capable devices only. Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-14IB/ipoib: Fix memory leaks for child interfaces privAlex Vesker2-2/+10
There is a need to free priv explicitly and not just to release the device, child priv is freed explicitly on remove flow and this patch also includes priv free on error flow in P_key creation and also in add_port. Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-02RDMA/SA: Fix kernel panic in CMA request handler flowMajd Dibbiny1-1/+1
Commit 9fdca4da4d8c (IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields) moved the service_id to be specific attribute for IB and OPA SA Path Record, and thus wasn't assigned for RoCE. This caused to the following kernel panic in the CMA request handler flow: [ 27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 27.074731] IP: __radix_tree_lookup+0x1d/0xe0 ... [ 27.075356] Workqueue: ib_cm cm_work_handler [ib_cm] [ 27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000 [ 27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0 ... [ 27.075979] Call Trace: [ 27.076015] radix_tree_lookup+0xd/0x10 [ 27.076055] cma_ps_find+0x59/0x70 [rdma_cm] [ 27.076097] cma_id_from_event+0xd2/0x470 [rdma_cm] [ 27.076144] ? ib_init_ah_from_path+0x39a/0x590 [ib_core] [ 27.076193] cma_req_handler+0x25/0x480 [rdma_cm] [ 27.076237] cm_process_work+0x25/0x120 [ib_cm] [ 27.076280] ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm] [ 27.076350] cm_req_handler+0xb03/0xd40 [ib_cm] [ 27.076430] ? sched_clock_cpu+0x11/0xb0 [ 27.076478] cm_work_handler+0x194/0x1588 [ib_cm] [ 27.076525] process_one_work+0x160/0x410 [ 27.076565] worker_thread+0x137/0x4a0 [ 27.076614] kthread+0x112/0x150 [ 27.076684] ? max_active_store+0x60/0x60 [ 27.077642] ? kthread_park+0x90/0x90 [ 27.078530] ret_from_fork+0x2c/0x40 This patch moves it back to the common SA Path Record structure and removes the redundant setter and getter. Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively. Fixes: 9fdca4da4d8c (IB/SA: Split struct sa_path_rec based on IB ands ROCE specific fields) Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-02RDMA/srp: Fix NULL deref at srp_destroy_qp()Israel Rukshin1-1/+1
If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq may be NULL. Calling directly to ib_destroy_qp() is sufficient because no work requests were posted on the created qp. Fixes: 9294000d6d89 ("IB/srp: Drain the send queue before destroying a QP") Cc: <stable@vger.kernel.org> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>-- Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-02RDMA/IPoIB: Limit the ipoib_dev_uninit_default scopeLeon Romanovsky1-1/+1
ipoib_dev_uninit_default() call is used in ipoib_main.c file only and it generates the following warning from smatch tool: drivers/infiniband/ulp/ipoib/ipoib_main.c:1593:6: warning: symbol 'ipoib_dev_uninit_default' was not declared. Should it be static? so let's declare that function as static. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-02RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettingsHonggang Li1-1/+1
ipoib_dev_init accesses the wrong private data for the IPoIB device. Commit cd565b4b51e5 (IB/IPoIB: Support acceleration options callbacks) changed ipoib_priv from being identical to netdev_priv to being an area inside of, but not the same pointer as, the netdev_priv pointer. As such, the struct we want is the ipoib_priv area, not the netdev_priv area, so use the right accessor, otherwise we kernel panic. [ 27.271938] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready [ 28.156790] BUG: unable to handle kernel NULL pointer dereference at 000000000000067c [ 28.166309] IP: ib_query_port+0x30/0x180 [ib_core] ... [ 28.306282] RIP: 0010:ib_query_port+0x30/0x180 [ib_core] ... [ 28.393337] Call Trace: [ 28.397594] ipoib_get_link_ksettings+0x66/0xe0 [ib_ipoib] [ 28.405274] __ethtool_get_link_ksettings+0xa0/0x1c0 [ 28.412353] speed_show+0x74/0xa0 [ 28.417503] dev_attr_show+0x20/0x50 [ 28.422922] ? mutex_lock+0x12/0x40 [ 28.428179] sysfs_kf_seq_show+0xbf/0x1a0 [ 28.434002] kernfs_seq_show+0x21/0x30 [ 28.439470] seq_read+0x116/0x3b0 [ 28.444445] ? do_filp_open+0xa5/0x100 [ 28.449774] kernfs_fop_read+0xff/0x180 [ 28.455220] __vfs_read+0x37/0x150 [ 28.460167] ? security_file_permission+0x9d/0xc0 [ 28.466560] vfs_read+0x8c/0x130 [ 28.471318] SyS_read+0x55/0xc0 [ 28.475950] do_syscall_64+0x67/0x150 [ 28.481163] entry_SYSCALL64_slow_path+0x25/0x25 ... [ 28.584493] ---[ end trace 3549968a4bf0aa5d ]--- Fixes: cd565b4b51e5 (IB/IPoIB: Support acceleration options callbacks) Fixes: 0d7e2d2166f6 (IB/ipoib: add get_link_ksettings in ethtool) Signed-off-by: Honggang Li <honli@redhat.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-12Merge branch 'for-next' of ↵Linus Torvalds1-6/+3
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Things were a lot more calm than previously expected. It's primarily fixes in various areas, with most of the new functionality centering around TCMU backend driver work that Xiubo Li has been driving. Here's the summary on the feature side: - Make T10-PI verify configurable for emulated (FILEIO + RD) backends (Dmitry Monakhov) - Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic (Bryant Ly + MNC) - Add TCMU support for growing ring buffer size (Xiubo Li + MNC) - Add TCMU support for global block data pool (Xiubo Li + MNC) and on the bug-fix side: - Fix COMPARE_AND_WRITE non GOOD status handling for READ phase failures (Gary Guo + nab) - Fix iscsi-target hang with explicitly changing per NodeACL CmdSN number depth with concurrent login driven session reinstatement. (Gary Guo + nab) - Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly) - Fix target-core/FILEIO zero length handling (Bart Van Assche) Also, there was an OOPs introduced with the WRITE_VERIFY changes that I ended up reverting at the last minute, because as not unusual Bart and I could not agree on the fix in time for -rc1. Since it's specific to a conformance test, it's been reverted for now. There is a separate patch in the queue to address the underlying control CDB write overflow regression in >= v4.3 separate from the WRITE_VERIFY revert here, that will be pushed post -rc1" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits) Revert "target: Fix VERIFY and WRITE VERIFY command parsing" IB/srpt: Avoid that aborting a command triggers a kernel warning IB/srpt: Fix abort handling target/fileio: Fix zero-length READ and WRITE handling ibmvscsis: Do not send aborted task response tcmu: fix module removal due to stuck thread target: Don't force session reset if queue_depth does not change iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement target: Fix compare_and_write_callback handling for non GOOD status tcmu: Recalculate the tcmu_cmd size to save cmd area memories tcmu: Add global data block pool support tcmu: Add dynamic growing data area feature support target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store() target: fixup error message in target_tg_pt_gp_alua_access_type_store() target/user: PGR Support target: Add WRITE_VERIFY_16 Documentation/target: add an example script to configure an iSCSI target target: Use kmalloc_array() in transport_kmap_data_sg() target: Use kmalloc_array() in compare_and_write_callback() target: Improve size determinations in two functions ...
2017-05-08IB/srpt: Avoid that aborting a command triggers a kernel warningBart Van Assche1-1/+2
Avoid that the following warning is triggered: WARNING: CPU: 10 PID: 166 at ../drivers/infiniband/ulp/srpt/ib_srpt.c:2674 srpt_release_cmd+0x139/0x140 [ib_srpt] CPU: 10 PID: 166 Comm: kworker/u24:8 Not tainted 4.9.4-1-default #1 Workqueue: tmr-fileio target_tmr_work [target_core_mod] Call Trace: [<ffffffffaa3c4f70>] dump_stack+0x63/0x83 [<ffffffffaa0844eb>] __warn+0xcb/0xf0 [<ffffffffaa0845dd>] warn_slowpath_null+0x1d/0x20 [<ffffffffc06ba429>] srpt_release_cmd+0x139/0x140 [ib_srpt] [<ffffffffc06e4377>] target_release_cmd_kref+0xb7/0x120 [target_core_mod] [<ffffffffc06e4d7f>] target_put_sess_cmd+0x2f/0x60 [target_core_mod] [<ffffffffc06e15e0>] core_tmr_lun_reset+0x340/0x790 [target_core_mod] [<ffffffffc06e4816>] target_tmr_work+0xe6/0x140 [target_core_mod] [<ffffffffaa09e4d3>] process_one_work+0x1f3/0x4d0 [<ffffffffaa09e7f8>] worker_thread+0x48/0x4e0 [<ffffffffaa09e7b0>] ? process_one_work+0x4d0/0x4d0 [<ffffffffaa0a46da>] kthread+0xca/0xe0 [<ffffffffaa0a4610>] ? kthread_park+0x60/0x60 [<ffffffffaa71b775>] ret_from_fork+0x25/0x30 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-08IB/srpt: Fix abort handlingBart Van Assche1-5/+1
Let the target core check the CMD_T_ABORTED flag instead of the SRP target driver. Hence remove the transport_check_aborted_status() call. Since state == SRPT_STATE_CMD_RSP_SENT is something that really should not happen, do not try to recover if srpt_queue_response() is called for an I/O context that is in that state. This patch is a bug fix because the srpt_abort_cmd() call is misplaced - if that function is called from srpt_queue_response() it should either be called before the command state is changed or after the response has been sent. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-05IB/ipoib: add get_link_ksettings in ethtoolZhu Yanjun1-0/+59
In order to let the bonding driver report the correct speed of the underlaying interfaces, when they are IPoIB, the ethtool function get_link_ksettings() in the IPoIB driver is implemented. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Suggested-by: Håkon Bugge <Haakon.Bugge@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Add support to query OPA path recordsDasaratharaman Chandramouli2-2/+9
When the bit 26 of capmask2 field in OPA classport info query is set, SA will query for OPA path records instead of querying for IB path records. Note that OPA path records can only be queried by kernel ULPs. Userspace clients continue to query IB path records. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Add OPA path record typeDasaratharaman Chandramouli3-9/+9
Add opa_sa_path_rec to sa_path_rec data structure. The 'type' field in sa_path_rec identifies the type of the path record. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Split struct sa_path_rec based on IB and ROCE specific fieldsDasaratharaman Chandramouli3-10/+14
sa_path_rec now contains a union of sa_path_rec_ib and sa_path_rec_roce based on the type of the path record. Note that fields applicable to path record type ROCE v1 and ROCE v2 fall under sa_path_rec_roce. Accessor functions are added to these fields so the caller doesn't have to know the type. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Rename ib_sa_path_rec to sa_path_recDasaratharaman Chandramouli5-7/+7
Rename ib_sa_path_rec to a more generic sa_path_rec. This is part of extending ib_sa to also support OPA path records in addition to the IB defined path records. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Define 'ib' and 'roce' rdma_ah_attr typesDasaratharaman Chandramouli2-0/+2
rdma_ah_attr can now be either ib or roce allowing core components to use one type or the other and also to define attributes unique to a specific type. struct ib_ah is also initialized with the type when its first created. This ensures that calls such as modify_ah dont modify the type of the address handle attribute. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Use rdma_ah_attr accessor functionsDasaratharaman Chandramouli2-36/+33
Modify core and driver components to use accessor functions introduced to access individual fields of rdma_ah_attr Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Rename ib_destroy_ah to rdma_destroy_ahDasaratharaman Chandramouli3-6/+6
Rename ib_destroy_ah to rdma_destroy_ah so its in sync with the rename of the ib address handle attribute Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Rename ib_create_ah to rdma_create_ahDasaratharaman Chandramouli2-2/+2
Rename ib_create_ah to rdma_create_ah so its in sync with the rename of the ib address handle attribute Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/core: Rename struct ib_ah_attr to rdma_ah_attrDasaratharaman Chandramouli5-5/+5
This patch simply renames struct ib_ah_attr to rdma_ah_attr as these fields specify attributes that are not necessarily specific to IB. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/IPoIB: Remove 'else' when the 'if' has a return.Dasaratharaman Chandramouli1-10/+9
This patch fixes a checkpatch issue related to not having to use an 'else' if the 'if' path returns from the function. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-29IB/core: Move opa_class_port_info definition to header fileDasaratharaman Chandramouli1-25/+0
Both opa_vnic and the hfi driver use the same opa_classport_info definition. We will also have ib_sa capable of querying opa class port info and would need this definition. Move it to ib_mad.h for everyone to use. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28IB/SA: Modify SA to implicitly cache Class Port infoDasaratharaman Chandramouli3-78/+3
SA will query and cache class port info as part of its initialization. SA will also invalidate and refresh the cache based on specific events. Callers such as IPoIB and CM can query the SA to get the classportinfo information. Apart from making the caller code much simpler, this change puts the onus on the SA to query and maintain classportinfo much like how it maitains the address handle to the SM. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25IB/iser: fix spelling mistake: "unexepected" -> "unexpected"Colin Ian King1-1/+1
trivial fix to spelling mistake in iser_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/ipoib: Fix deadlock between ipoib_stop and mcast join flowFeras Daoud1-6/+5
Before calling ipoib_stop, rtnl_lock should be taken, then the flow clears the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP flags, and waits for mcast completion if IPOIB_MCAST_FLAG_BUSY is set. On the other hand, the flow of multicast join task initializes a mcast completion, sets the IPOIB_MCAST_FLAG_BUSY and calls ipoib_mcast_join. If IPOIB_FLAG_OPER_UP flag is not set, this call returns EINVAL without setting the mcast completion and leads to a deadlock. ipoib_stop | | | clear_bit(IPOIB_FLAG_ADMIN_UP) | | | Context Switch | | ipoib_mcast_join_task | | | spin_lock_irq(lock) | | | init_completion(mcast) | | | set_bit(IPOIB_MCAST_FLAG_BUSY) | | | Context Switch | | clear_bit(IPOIB_FLAG_OPER_UP) | | | spin_lock_irqsave(lock) | | | Context Switch | | ipoib_mcast_join | return (-EINVAL) | | | spin_unlock_irq(lock) | | | Context Switch | | ipoib_mcast_dev_flush | wait_for_completion(mcast) | ipoib_stop will wait for mcast completion for ever, and will not release the rtnl_lock. As a result panic occurs with the following trace: [13441.639268] Call Trace: [13441.640150] [<ffffffff8168b579>] schedule+0x29/0x70 [13441.641038] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0 [13441.641914] [<ffffffff810bc017>] ? complete+0x47/0x50 [13441.642765] [<ffffffff810a690d>] ? flush_workqueue_prep_pwqs+0x16d/0x200 [13441.643580] [<ffffffff8168b956>] wait_for_completion+0x116/0x170 [13441.644434] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20 [13441.645293] [<ffffffffa05af170>] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib] [13441.646159] [<ffffffffa05ac967>] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib] [13441.647013] [<ffffffffa05a4805>] ipoib_stop+0x75/0x150 [ib_ipoib] Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/ipoib: Update broadcast object if PKey value was changed in index 0Feras Daoud1-0/+13
Update the broadcast address in the priv->broadcast object when the Pkey value changes in index 0, otherwise the multicast GID value will keep the previous value of the PKey, and will not be updated. This leads to interface state down because the interface will keep the old PKey value. For example, in SR-IOV environment, if the PF changes the value of PKey index 0 for one of the VFs, then the VF receives PKey change event that triggers heavy flush. This flush calls update_parent_pkey that update the broadcast object and its relevant members. If in this case the multicast GID will not be updated, the interface state will be down. Fixes: c2904141696e ("IPoIB: Fix pkey change flow for virtualization environments") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Support acceleration options callbacksErez Shitrit7-76/+188
IPoIB driver now uses the new set of callback functions. If the hardware provider supports the new ipoib_options implementation, the driver uses the callbacks in its data path flows, otherwise it uses the driver default implementation for all data flows in its code. The default implementation wasn't change and it is exactly as it was before introduction of acceleration support. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Use defined function for netdev_priv functionErez Shitrit10-129/+136
Make ipoib_priv point to netdev_priv where the code calls netdev_priv. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Rename qpn to be dqpn in ipoib_send and post_send functionsErez Shitrit2-7/+8
Change of function parameter name from qpn to be dqpn. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Separate control from HW operation on ipoib_open/stop ndoErez Shitrit3-103/+131
This patch is preparing the netdev part at the IPoIB driver to be able to use the ipoib_options. It deals with the two flows from the .ndo: ipoib_open and ipoib_stop. The code is rearranged as follows: * All operations which deal with the hardware resources, (for example change QP state, post-receive etc.) are performed in one place. * All operations that are control oriented (like restart multicast task, start the reap_ah etc.) are performed in separate place. The functions that deal with the hardware resources now located at __ipoib_ib_dev_open for the ipoib_open flow and __ipoib_ib_dev_stop for ipoib_stop. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/IPoIB: Separate control and data related initializationsErez Shitrit4-88/+109
This patch prepares init and teardown flows so we can call them through ipoib_options function pointers. It arranges that area of code as the following: * All operations which deal with the resource allocation/deletion are performed in one place. * All operations that are control oriented, meaning that they are not connected to a specific hardware, are performed in a separate place. The operations for allocation of hardware resources are now in the function ipoib_dev_init_default, and the deletion of all the resources are in ipoib_dev_uninit_default The only exception is the creation of the PD object, which is used both for resource allocation (create QP etc.) and for control flows like creating AH. It also does: * Move creation of rx_ring and tx_ring to be in the resources allocation area. * Move the function ipoib_ib_dev_open that does the open device to the control area instead of the dev_init which creates resources. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management Agent (VEMA) functionVishwanathapura, Niranjana5-5/+1106
OPA VEMA function interfaces with the Infiniband MAD stack to exchange the management information packets with the Ethernet Manager (EM). It interfaces with the OPA VNIC netdev function to SET/GET the management information. The information exchanged with the EM includes class port details, encapsulation configuration, various counters, unicast and multicast MAC list and the MAC table. It also supports sending traps to the EM. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management Agent (VEMA) interfaceVishwanathapura, Niranjana5-2/+581
OPA VNIC EMA interface functions are the management interfaces to the OPA VNIC netdev. Add support to add and remove VNIC ports. Implement the required GET/SET management interface functions and processing of new management information. Add support to send trap notifications upon various events like interface status change, unicast/multicast mac list update and mac address change. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC MAC table supportVishwanathapura, Niranjana3-0/+291
OPA VNIC MAC table contains the MAC address to DLID mappings provided by the Ethernet manager. During transmission, the MAC table provides the MAC address to DLID translation. Implement MAC table using simple hash list. Also provide support to update/query the MAC table by Ethernet manager. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC statistics supportVishwanathapura, Niranjana3-0/+132
OPA VNIC driver statistics support maintains various counters including standard netdev counters and the Ethernet manager defined counters. Add the Ethtool hook to read the counters. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: VNIC Ethernet Management (EM) structure definitionsVishwanathapura, Niranjana2-0/+456
Define VNIC EM MAD structures and the associated macros. These structures are used for information exchange between VNIC EM agent (EMA) on the host and the Ethernet manager. These include the virtual ethernet switch (vesw) port information, vesw port mac table, summay and error counters, vesw port interface mac lists and the EMA trap. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20IB/opa-vnic: Virtual Network Interface Controller (VNIC) netdevVishwanathapura, Niranjana8-0/+794
OPA VNIC netdev function supports Ethernet functionality over Omni-Path fabric by encapsulating Ethernet packets inside Omni-Path packet header. It allocates a rdma netdev device and interfaces with the network stack to provide standard Ethernet network interfaces. It overrides HFI1 device's netdev operations where it is required. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdeviceDoug Ledford3-8/+42
2017-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2-19/+49
Pull SCSI target fixes from Nicholas Bellinger: "There has been work in a number of different areas over the last weeks, including: - Fix target-core-user (TCMU) back-end bi-directional handling (Xiubo Li + Mike Christie + Ilias Tsitsimpis) - Fix iscsi-target TMR reference leak during session shutdown (Rob Millner + Chu Yuan Lin) - Fix target_core_fabric_configfs.c race between LUN shutdown + mapped LUN creation (James Shen) - Fix target-core unknown fabric callback queue-full errors (Potnuri Bharat Teja) - Fix iscsi-target + iser-target queue-full handling in order to support iw_cxgb4 RNICs. (Potnuri Bharat Teja + Sagi Grimberg) - Fix ALUA transition state race between multiple initiator (Mike Christie) - Drop work-around for legacy GlobalSAN initiator, to allow QLogic 57840S + 579xx offload HBAs to work out-of-the-box in MSFT environments. (Martin Svec + Arun Easi) Note that a number are CC'ed for stable, and although the queue-full bug-fixes required for iser-target to work with iw_cxgb4 aren't CC'ed here, they'll be posted to Greg-KH separately" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case iscsi-target: Drop work-around for legacy GlobalSAN initiator target: Fix ALUA transition state race between multiple initiators iser-target: avoid posting a recv buffer twice iser-target: Fix queue-full response handling iscsi-target: Propigate queue_data_in + queue_status errors target: Fix unknown fabric callback queue-full errors tcmu: Fix wrongly calculating of the base_command_size tcmu: Fix possible overwrite of t_data_sg's last iov[] target: Avoid mappedlun symlink creation during lun shutdown iscsi-target: Fix TMR reference leak during session shutdown usb: gadget: Correct usb EP argument for BOT status request tcmu: Allow cmd_time_out to be set to zero (disabled)
2017-04-05IB/IPoIB: ibX: failed to create mcg debug fileShamir Rabinovitch3-8/+42
When udev renames the netdev devices, ipoib debugfs entries does not get renamed. As a result, if subsequent probe of ipoib device reuse the name then creating a debugfs entry for the new device would fail. Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part of ipoib event handling in order to avoid any race condition between these. Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs) Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-31iser-target: avoid posting a recv buffer twiceSagi Grimberg2-1/+14
We pre-allocate our send-queues and might overflow them in case we have multi work-request operations which tend to occur for large RDMA transfers over devices with limited allowed sg elements. When we get to a queue-full condition we might retry again later, so track our receive buffers so we don't repost them for a retry case. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>