summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)AuthorFilesLines
2016-02-27iw_cxgb3: Fix incorrectly returning error on successHariprasad S1-2/+2
commit 67f1aee6f45059fd6b0f5b0ecb2c97ad0451f6b3 upstream. The cxgb3_*_send() functions return NET_XMIT_ values, which are positive integers values. So don't treat positive return values as an error. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-02-13IB/mlx4: Initialize hop_limit when creating address handleMatan Barak1-0/+1
commit 4e4081673445485aa6bc90383bdb83e7a96cc48a upstream. Hop limit value wasn't copied from attributes when ah was created. This may influence packets for unconnected services to get dropped in routers when endpoints are not in the same subnet. Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-02-13IB/qib: fix mcast detach when qp not attachedMike Marciniszyn1-20/+15
commit 09dc9cd6528f5b52bcbd3292a6312e762c85260f upstream. The code produces the following trace: [1750924.419007] general protection fault: 0000 [#3] SMP [1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4 dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core [1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D 3.13.0-39-generic #66-Ubuntu [1750924.420364] Hardware name: Dell Computer Corporation PowerEdge 860/0XM089, BIOS A04 07/24/2007 [1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti: ffff88007af1c000 [1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50 [ib_qib] [1750924.420364] RSP: 0018:ffff88007af1dd70 EFLAGS: 00010246 [1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX: 000000000000000f [1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI: 6764697200000000 [1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09: 0000000000000000 [1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12: ffff88007baa1d98 [1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15: 0000000000000000 [1750924.420364] FS: 00007ffff7fd8740(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000 [1750924.420364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4: 00000000000007e0 [1750924.420364] Stack: [1750924.420364] ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429 000000007af1de20 [1750924.420364] ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70 ffffffffa00cb313 [1750924.420364] 00007fffffffde88 0000000000000000 0000000000000008 ffff88003ecab000 [1750924.420364] Call Trace: [1750924.420364] [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350 [ib_qib] [1750924.568035] [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0 [ib_uverbs] [1750924.568035] [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core] [1750924.568035] [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170 [ib_uverbs] [1750924.568035] [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs] [1750924.568035] [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20 [1750924.568035] [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0 [1750924.568035] [<ffffffff811bd214>] vfs_write+0xb4/0x1f0 [1750924.568035] [<ffffffff811bdc49>] SyS_write+0x49/0xa0 [1750924.568035] [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f [1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10 <f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f [1750924.568035] RIP [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50 [ib_qib] [1750924.568035] RSP <ffff88007af1dd70> [1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ] The fix is to note the qib_mcast_qp that was found. If none is found, then return EINVAL indicating the error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-10-13IB/mlx4: Use correct SL on AH query under RoCENoa Osherovich1-1/+5
commit 5e99b139f1b68acd65e36515ca347b03856dfb5a upstream. The mlx4 IB driver implementation for ib_query_ah used a wrong offset (28 instead of 29) when link type is Ethernet. Fixed to use the correct one. Fixes: fa417f7b520e ('IB/mlx4: Add support for IBoE') Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-10-13IB/qib: Change lkey table allocation to support more MRsMike Marciniszyn4-4/+20
commit d6f1c17e162b2a11e708f28fa93f2f79c164b442 upstream. The lkey table is allocated with with a get_user_pages() with an order based on a number of index bits from a module parameter. The underlying kernel code cannot allocate that many contiguous pages. There is no reason the underlying memory needs to be physically contiguous. This patch: - switches the allocation/deallocation to vmalloc/vfree - caps the number of bits to 23 to insure at least 1 generation bit o this matches the module parameter description Reviewed-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [bwh: Backported to 3.2: - Adjust context - Add definition of qib_dev_warn(), added upstream by commit ddb887658970 ("IB/qib: Convert opcode counters to per-context")] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-08-07IB/mlx4: Fix WQE LSO segment calculationErez Shitrit1-2/+1
commit ca9b590caa17bcbbea119594992666e96cde9c2f upstream. The current code decreases from the mss size (which is the gso_size from the kernel skb) the size of the packet headers. It shouldn't do that because the mss that comes from the stack (e.g IPoIB) includes only the tcp payload without the headers. The result is indication to the HW that each packet that the HW sends is smaller than what it could be, and too many packets will be sent for big messages. An easy way to demonstrate one more aspect of the problem is by configuring the ipoib mtu to be less than 2*hlen (2*56) and then run app sending big TCP messages. This will tell the HW to send packets with giant (negative value which under unsigned arithmetics becomes a huge positive one) length and the QP moves to SQE state. Fixes: b832be1e4007 ('IB/mlx4: Add IPoIB LSO support') Reported-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10IB/mlx4: Saturate RoCE port PMA counters in case of overflowMajd Dibbiny1-4/+16
commit 61a3855bb726cbb062ef02a31a832dea455456e0 upstream. For RoCE ports, we set the u32 PMA values based on u64 HCA counters. In case of overflow, according to the IB spec, we have to saturate a counter to its max value, do that. Fixes: c37791349cc7 ('IB/mlx4: Support PMA counters for IBoE') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: - Adjust context - Open-code U32_MAX] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10IB/qib: Do not write EEPROMMitko Haralanov7-219/+1
commit 18c0b82a3e4501511b08d0e8676fb08ac08734a3 upstream. This changeset removes all the code that allows the driver to write to the EEPROM and update the recorded error counters and power on hours. These two stats are unused and writing them exposes a timing risk which could leave the EEPROM in a bad state preventing further normal operation of the HCA. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-07-11RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_respYann Droneaud2-2/+3
commit b6f04d3d21458818073a2f5af5339f958864bf71 upstream. The i386 ABI disagrees with most other ABIs regarding alignment of data types larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABI struct c4iw_create_cq_resp gets implicitly padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. The tool pahole can be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name c4iw_create_cq_resp \ drivers/infiniband/hw/cxgb4/iw_cxgb4.o Then, structure layout can be compared between i386 and x86_64: # +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 11:43:05.547432195 +0100 # --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100 @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp { __u32 size; /* 28 4 */ __u32 qid_mask; /* 32 4 */ - /* size: 36, cachelines: 1, members: 6 */ - /* last cacheline: 36 bytes */ + /* size: 40, cachelines: 1, members: 6 */ + /* padding: 4 */ + /* last cacheline: 40 bytes */ }; This ABI disagreement will make an x86_64 kernel try to write past the buffer provided by an i386 binary. When boundary check will be implemented, the x86_64 kernel will refuse to write past the i386 userspace provided buffer and the uverbs will fail. If the structure is on a page boundary and the next page is not mapped, ib_copy_to_udata() will fail and the uverb will fail. This patch adds an explicit padding at end of structure c4iw_create_cq_resp, and, like 92b0ca7cb149 ("IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq() not writting this padding field to userspace. This way, x86_64 kernel will be able to write struct c4iw_create_cq_resp as expected by unpatched and patched i386 libcxgb4. Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com Fixes: cfdda9d764362 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Fixes: e24a72a3302a6 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()") Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-07-11RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()Dan Carpenter1-0/+1
commit e24a72a3302a638d4c6e77f0b40c45cc61c3f089 upstream. There is a four byte hole at the end of the "uresp" struct after the ->qid_mask member. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-07-11IB/ipath: Translate legacy diagpkt into newer extended diagpktDennis Dalessandro1-0/+4
commit 7e6d3e5c70f13874fb06e6b67696ed90ce79bd48 upstream. This patch addresses an issue where the legacy diagpacket is sent in from the user, but the driver operates on only the extended diagpkt. This patch specifically initializes the extended diagpkt based on the legacy packet. Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-07-11IB/qib: Fix port in pkey change eventMike Marciniszyn1-1/+1
commit 911eccd284d13d78c92ec4f1f1092c03457d732a upstream. The code used a literal 1 in dispatching an IB_EVENT_PKEY_CHANGE. As of the dual port qib QDR card, this is not necessarily correct. Change to use the port as specified in the call. Reported-by: Alex Estrin <alex.estrin@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-30IB/ehca: Returns an error on ib_copy_to_udata() failureYann Droneaud1-0/+1
commit 5bdb0f02add5994b0bc17494f4726925ca5d6ba1 upstream. In case of error when writing to userspace, function ehca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-30IB/mthca: Return an error on ib_copy_to_udata() failureYann Droneaud1-0/+1
commit 08e74c4b00c30c232d535ff368554959403d0432 upstream. In case of error when writing to userspace, the function mthca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-30IB/nes: Return an error on ib_copy_from_udata() failure instead of NULLYann Droneaud1-1/+1
commit 9d194d1025f463392feafa26ff8c2d8247f71be1 upstream. In case of error while accessing to userspace memory, function nes_create_qp() returns NULL instead of an error code wrapped through ERR_PTR(). But NULL is not expected by ib_uverbs_create_qp(), as it check for error with IS_ERR(). As page 0 is likely not mapped, it is going to trigger an Oops when the kernel will try to dereference NULL pointer to access to struct ib_qp's fields. In some rare cases, page 0 could be mapped by userspace, which could turn this bug to a vulnerability that could be exploited: the function pointers in struct ib_device will be under userspace total control. This was caught when using spatch (aka. coccinelle) to rewrite calls to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/ib-hw-nes-create-qp-null Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-30IB/ipath: Fix potential buffer overrun in sending diag packet routineDennis Dalessandro1-41/+25
commit a2cb0eb8a64adb29a99fd864013de957028f36ae upstream. Guard against a potential buffer overrun. The size to read from the user is passed in, and due to the padding that needs to be taken into account, as well as the place holder for the ICRC it is possible to overflow the 32bit value which would cause more data to be copied from user space than is allocated in the buffer. Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-02IB/qib: Add missing serdes init sequenceMike Marciniszyn1-0/+5
commit 2f75e12c4457a9b3d042c0a0d748fa198dc2ffaf upstream. Research has shown that commit a77fcf895046 ("IB/qib: Use a single txselect module parameter for serdes tuning") missed a key serdes init sequence. This patch add that sequence. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-02IB/qib: Fix QP check when looping back to/from QP1Ira Weiny1-1/+8
commit 6e0ea9e6cbcead7fa8c76e3e3b9de4a50c5131c5 upstream. The GSI QP type is compatible with and should be allowed to send data to/from any UD QP. This was found when testing ibacm on the same node as an SA. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-01-03IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast()Jan Kara1-5/+1
commit 603e7729920e42b3c2f4dbfab9eef4878cb6e8fa upstream. qib_user_sdma_queue_pkts() gets called with mmap_sem held for writing. Except for get_user_pages() deep down in qib_user_sdma_pin_pages() we don't seem to need mmap_sem at all. Even more interestingly the function qib_user_sdma_queue_pkts() (and also qib_user_sdma_coalesce() called somewhat later) call copy_from_user() which can hit a page fault and we deadlock on trying to get mmap_sem when handling that fault. So just make qib_user_sdma_pin_pages() use get_user_pages_fast() and leave mmap_sem locking for mm. This deadlock has actually been observed in the wild when the node is under memory pressure. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Roland Dreier <roland@purestorage.com> [bwh: Backported to 3.2: - Adjust context - Adjust indentation and nr_pages argument in qib_user_sdma_pin_pages()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-01-03IB/ipath: Convert ipath_user_sdma_pin_pages() to use get_user_pages_fast()Jan Kara1-6/+1
commit 4adcf7fb6783e354aab38824d803fa8c4f8e8a27 upstream. ipath_user_sdma_queue_pkts() gets called with mmap_sem held for writing. Except for get_user_pages() deep down in ipath_user_sdma_pin_pages() we don't seem to need mmap_sem at all. Even more interestingly the function ipath_user_sdma_queue_pkts() (and also ipath_user_sdma_coalesce() called somewhat later) call copy_from_user() which can hit a page fault and we deadlock on trying to get mmap_sem when handling that fault. So just make ipath_user_sdma_pin_pages() use get_user_pages_fast() and leave mmap_sem locking for mm. This deadlock has actually been observed in the wild when the node is under memory pressure. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> [ Merged in fix for call to get_user_pages_fast from Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-16RDMA/nes: Fix for terminate timer crashTatyana Nikolova3-8/+6
commit 7bfcfa51c35cdd2d37e0d70fc11790642dd11fb3 upstream. The terminate timer needs to be initialized just once. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-16RDMA/nes: Fix for crash when registering zero length MR for CQTatyana Nikolova1-0/+5
commit 7d9c199a55200c9b9fcad08e150470d02fb385be upstream. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31RDMA/cxgb4: Drop peer_abort when no endpoint foundSteve Wise1-0/+6
commit 14b9222808bb8bfefc71f72bc0dbdcf3b2f0140f upstream. Log a warning and drop the abort message. Otherwise we will do a bogus wake_up() and crash. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-31RDMA/cxgb4: Always wake up waiters in c4iw_peer_abort_intr()Steve Wise1-4/+1
commit 0f1dcfae6bc5563424346ad3a03282b8235a4c33 upstream. This fixes a race where an ingress abort fails to wake up the thread blocked in rdma_init() causing the app to hang. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-02-13IB/mlx4: pass SMP vendor-specific attribute MADs to firmwareJack Morgenstein1-5/+2
commit a6f7feae6d19e84253918d88b04153af09d3a243 upstream. In the current code, vendor-specific MADs (e.g with the FDR-10 attribute) are silently dropped by the driver, resulting in timeouts at the sending side and inability to query/configure the relevant feature. However, the ConnectX firmware is able to handle such MADs. For unsupported attributes, the firmware returns a GET_RESPONSE MAD containing an error status. For example, for a FDR-10 node with LID 11: # ibstat mlx4_0 1 CA: 'mlx4_0' Port 1: State: Active Physical state: LinkUp Rate: 40 (FDR10) Base lid: 11 LMC: 0 SM lid: 24 Capability mask: 0x02514868 Port GUID: 0x0002c903002e65d1 Link layer: InfiniBand Extended Port Query (EPI) vendor mad timeouts before the patch: # smpquery MEPI 11 -d ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11 ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms) ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms) ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11) smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed EPI query works OK with the patch: # smpquery MEPI 11 -d ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11 ibwarn: [6548] mad_rpc: data offs 64 sz 64 mad data 0000 0000 0000 0001 0000 0001 0000 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 # Ext Port info: Lid 11 port 0 StateChangeEnable:...............0x00 LinkSpeedSupported:..............0x01 LinkSpeedEnabled:................0x01 LinkSpeedActive:.................0x01 Signed-off-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Ira Weiny <weiny2@llnl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-01-12IB/qib: Fix a possible data corruption when receiving packetsRam Vepa3-4/+10
commit eddfb675256f49d14e8c5763098afe3eb2c93701 upstream. Prevent a receive data corruption by ensuring that the write to update the rcvhdrheadn register to generate an interrupt is at the very end of the receive processing. Signed-off-by: Ramkrishna Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-19Merge branches 'cma', 'mlx4' and 'qib' into for-nextRoland Dreier2-4/+6
2011-12-19IB/qib: Correct sense on freectxts increment and decrementMike Marciniszyn1-2/+2
Commit 53ab1c64983 ("IB/qib: Correct nfreectxts for multiple HCAs") reversed the increments and decrements of dd->nfreectxts. Fix it. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-12-06IB/mlx4: Fix shutdown crash accessing a non-existent bitmapRoland Dreier1-2/+4
Commit cfcde11c3d7a ("IB/mlx4: Use flow counters on IBoE ports") added code that sets elements of counters[] to -1 if no counter is allocated, but then goes ahead and passes every entry to mlx4_counter_free() on shutdown. This is a bad idea, especially if MLX4_DEV_CAP_FLAG_COUNTERS isn't set so there isn't even an underlying bitmap to free from. Tested-by: Sean Hefty <sean.hefty@intel.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-30Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-nextRoland Dreier5-23/+23
2011-11-30IB: Fix RCU lockdep splatsEric Dumazet3-2/+14
Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()") forgot to take care of infiniband uses of dst neighbours. Many thanks to Marc Aurele who provided a nice bug report and feedback. Reported-by: Marc Aurele La France <tsi@ualberta.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29IB/qib: Fix over-scheduling of QSFP workMike Marciniszyn2-20/+8
Don't over-schedule QSFP work on driver initialization. It could end up being run simultaneously on two different CPUs resulting in bad EEPROM reads. In combination with setting the physical IB link state prior to the IBC being brought out of reset, this can cause the link state machine to start training early with wrong settings. Signed-off-by: Mitko Haralanov <mitko@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2Kumar Sanghvi1-1/+3
Fix logic so that we don't retry with MPAv1 once we have done that already. Otherwise, we end up retrying with MPAv1 even when its not needed on getting peer aborts - and this could lead to kernel panic. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logicJonathan Lallinger1-1/+1
Fix another place in the code where logic dealing with the t4_cqe was using the wrong QID. This fixes the counting logic so that it tests against the SQ QID instead of the RQ QID when counting RCQES. Signed-off by: Jonathan Lallinger <jonathan@ogc.us> Signed-off by: Steve Wise <swise@ogc.us> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-08IB/qib: Don't use schedule_work()Mike Marciniszyn1-1/+1
It was mistakenly introduced by dde05cbdf8b1 ("IB/qib: Hold links until tuning data is available"). Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-07Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds21-0/+24
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
2011-11-04Merge branches 'iser', 'mthca' and 'qib' into for-nextRoland Dreier2-8/+4
2011-11-04IB/qib: Fix panic in RC error flushing logicMike Marciniszyn1-7/+3
The following panic can occur when flushing a QP: RIP: 0010:[<ffffffffa0168e8b>] [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib] RSP: 0018:ffff8803cdc6fc90 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff8803d84ba000 RCX: 0000000000000000 RDX: 0000000000000005 RSI: ffffc90015a53430 RDI: ffff8803d84ba000 RBP: ffff8803cdc6fce0 R08: ffff8803cdc6fc90 R09: 0000000000000001 R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8803d84ba0c0 R13: ffff8803d84ba5cc R14: 0000000000000800 R15: 0000000000000246 FS: 0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000034 CR3: 00000003e44f9000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process qib/0 (pid: 1350, threadinfo ffff8803cdc6e000, task ffff88042728a100) Stack: 53544c5553455201 0000000100000005 0000000000000000 ffff8803d84ba000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000001 ffff8803cdc6fd30 ffffffffa0165d7a Call Trace: [<ffffffffa0165d7a>] qib_make_rc_req+0x36a/0xe80 [ib_qib] [<ffffffffa0165a10>] ? qib_make_rc_req+0x0/0xe80 [ib_qib] [<ffffffffa01698b3>] qib_do_send+0xf3/0xb60 [ib_qib] [<ffffffff814db757>] ? thread_return+0x4e/0x777 [<ffffffffa01697c0>] ? qib_do_send+0x0/0xb60 [ib_qib] [<ffffffff81088bf0>] worker_thread+0x170/0x2a0 [<ffffffff8108e530>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81088a80>] ? worker_thread+0x0/0x2a0 [<ffffffff8108e1c6>] kthread+0x96/0xa0 [<ffffffff8100c1ca>] child_rip+0xa/0x20 [<ffffffff8108e130>] ? kthread+0x0/0xa0 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 RIP [<ffffffffa0168e8b>] qib_send_complete+0x3b/0x190 [ib_qib] The RC error state flush logic in qib_make_rc_req() could return all of the acked wqes and potentially have emptied the queue. It would then unconditionally try return a flush completion via qib_send_complete() for an invalid wqe, or worse a valid one that is not queued. The panic results when the completion code tries to maintain an MR reference count for a NULL MR. This fix modifies logic to only send one completion per qib_make_rc_req() call and changing the completion status from IB_WC_SUCCESS to IB_WC_WR_FLUSH_ERR as the completions progress. The outer loop will call as many times as necessary to flush the queue. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-04IB/mthca: Fix buddy->num_free allocation sizeRoland Dreier1-1/+1
The num_free field of mthca_buddy has a type of array of unsigned int while it was allocated as an array of pointers. On 64-bit platforms this allocates twice more than required. Fix this by allocating the correct size for the type. This is the same bug just fixed in mlx4 by Eli Cohen <eli@mellanox.co.il>. Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-01Merge branch 'for-linus' of ↵Linus Torvalds53-784/+3236
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits) mlx4_core: Deprecate log_num_vlan module param IB/mlx4: Don't set VLAN in IBoE WQEs' control segment IB/mlx4: Enable 4K mtu for IBoE RDMA/cxgb4: Mark QP in error before disabling the queue in firmware RDMA/cxgb4: Serialize calls to CQ's comp_handler RDMA/cxgb3: Serialize calls to CQ's comp_handler IB/qib: Fix issue with link states and QSFP cables IB/mlx4: Configure extended active speeds mlx4_core: Add extended port capabilities support IB/qib: Hold links until tuning data is available IB/qib: Clean up checkpatch issue IB/qib: Remove s_lock around header validation IB/qib: Precompute timeout jiffies to optimize latency IB/qib: Use RCU for qpn lookup IB/qib: Eliminate divide/mod in converting idx to egr buf pointer IB/qib: Decode path MTU optimization IB/qib: Optimize RC/UC code by IB operation IPoIB: Use the right function to do DMA unmap pages RDMA/cxgb4: Use correct QID in insert_recv_cqe() RDMA/cxgb4: Make sure flush CQ entries are collected on connection close ...
2011-11-01Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', ↵Roland Dreier52-780/+3235
'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next
2011-11-01mm: distinguish between mlocked and pinned pagesChristoph Lameter2-5/+5
Some kernel components pin user space memory (infiniband and perf) (by increasing the page count) and account that memory as "mlocked". The difference between mlocking and pinning is: A. mlocked pages are marked with PG_mlocked and are exempt from swapping. Page migration may move them around though. They are kept on a special LRU list. B. Pinned pages cannot be moved because something needs to directly access physical memory. They may not be on any LRU list. I recently saw an mlockalled process where mm->locked_vm became bigger than the virtual size of the process (!) because some memory was accounted for twice: Once when the page was mlocked and once when the Infiniband layer increased the refcount because it needt to pin the RDMA memory. This patch introduces a separate counter for pinned pages and accounts them seperately. Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Mike Marciniszyn <infinipath@qlogic.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-01infiniband: add moduleparam.h to drivers/infiniband as requiredPaul Gortmaker3-0/+3
These files were getting the moduleparam infrastructure from the implicit presence of module.h being everywhere, but that is going away soon. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-11-01infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULEPaul Gortmaker5-0/+5
These were getting it implicitly via device.h --> module.h but we are going to stop that when we clean up the headers. Fix these in advance so the tree remains biscect-clean. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-11-01infiniband: Fix up module files that need to include module.hPaul Gortmaker12-0/+14
They had been getting it implicitly via device.h but we can't rely on that for the future, due to a pending cleanup so fix it now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-11-01infiniband: Fix up users implicitly relying on getting stat.hPaul Gortmaker3-0/+3
They get it via module.h (via device.h) but we want to clean that up. When we do, we'll get things like: CC [M] drivers/infiniband/core/sysfs.o sysfs.c:361: error: 'S_IRUGO' undeclared here (not in a function) sysfs.c:654: error: 'S_IWUSR' undeclared here (not in a function) so add in the stat header it is using explicitly in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31IB/mlx4: Don't set VLAN in IBoE WQEs' control segmentOr Gerlitz1-9/+2
There's no need to set the vlan-related fields in an IBoE send WQE control segment: - the vlan to be used by a UD QP is set in the datagram segment. - for GSI (CM) QP, all the headers down to 8021q and MAC are built by the software anyway. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31IB/mlx4: Enable 4K mtu for IBoEOr Gerlitz1-1/+1
The IBoE port MTU is derived from the corresponding Ethernet netdevice MTU, which can support jumbo frames of 9K, and hence surely supports the max IB mtu of 4K. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31RDMA/cxgb4: Mark QP in error before disabling the queue in firmwareTom Tucker1-0/+4
QPs need to be moved to error before telling the firwmare to shutdown the queue. Otherwise, the application can submit WRs that will never get fetched by the hardware and never flushed by the driver. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Acked-by: Steve Wise <swsie@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31RDMA/cxgb4: Serialize calls to CQ's comp_handlerKumar Sanghvi4-4/+23
Commit 01e7da6ba53c ("RDMA/cxgb4: Make sure flush CQ entries are collected on connection close") introduced a potential problem where a CQ's comp_handler can get called simultaneously from different places in the iw_cxgb4 driver. This does not comply with Documentation/infiniband/core_locking.txt, which states that at a given point of time, there should be only one callback per CQ should be active. This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>. Based on discussion between Parav Pandit and Steve Wise, this patch fixes the above problem by serializing the calls to a CQ's comp_handler using a spin_lock. Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>