Age  Commit message  Author  Files  Lines
2016-03-11staging/rdma/hfi1: add s_avail to qp_statsMike Marciniszyn1-1/+2
This diagnostic capability was missed in the dual lock series. Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Destroy SMI AH before de-allocating the protection domainHarish Chegondi2-2/+3
If SMI AH is not destroyed before de-allocating the PD, it would result in non-zero PD use count when de-allocating the PD, triggering a WARN_ON() at drivers/infiniband/core/verbs.c:284 ib_dealloc_pd+0x69/0xb0 [ib_core]() when unloading the qib driver on systems with dual-port card. This problem has always been there in qib and was detected only after the commit 7dd78647a2c2 ("IB/core: Make ib_dealloc_pd return void") introduced a WARN_ON in ib_dealloc_pd() that triggers if a PD's use count is non-zero before de-allocating the PD. Below is the call trace from the dmesg log.
[ 7264.966129] Call Trace:
[ 7264.969652] [<ffffffff81338470>] dump_stack+0x44/0x64
[ 7264.976181] [<ffffffff81086bb6>] warn_slowpath_common+0x86/0xc0
[ 7264.983656] [<ffffffff81086cfa>] warn_slowpath_null+0x1a/0x20
[ 7264.990961] [<ffffffffa025c2d9>] ib_dealloc_pd+0x69/0xb0 [ib_core]
[ 7264.998717] [<ffffffffa0044de8>] ib_mad_port_close+0xb8/0x120 [ib_mad]
[ 7265.006866] [<ffffffffa0044ebf>] ib_mad_remove_device+0x6f/0xc0 [ib_mad]
[ 7265.015224] [<ffffffffa025fc87>] ib_unregister_device+0xa7/0x140 [ib_core]
[ 7265.023738] [<ffffffffa04b5b79>] rvt_unregister_device+0x29/0x80 [rdmavt]
[ 7265.032181] [<ffffffffa088d2a2>] qib_unregister_ib_device+0x22/0x210 [ib_qib]
[ 7265.040993] [<ffffffffa085f73f>] qib_remove_one+0x1f/0x250 [ib_qib]
[ 7265.048823] [<ffffffff8137a319>] pci_device_remove+0x39/0xc0
[ 7265.055984] [<ffffffff81466a1a>] __device_release_driver+0x9a/0x140
[ 7265.063821] [<ffffffff81466bc8>] driver_detach+0xb8/0xc0
[ 7265.070579] [<ffffffff81465a15>] bus_remove_driver+0x55/0xd0
[ 7265.077717] [<ffffffff8146732c>] driver_unregister+0x2c/0x50
[ 7265.084849] [<ffffffff813789ba>] pci_unregister_driver+0x2a/0x80
[ 7265.092366] [<ffffffffa08921bd>] qib_ib_cleanup+0x37/0x65 [ib_qib]
[ 7265.100068] [<ffffffff811096d0>] SyS_delete_module+0x190/0x220
[ 7265.107379] [<ffffffff816a7bae>] entry_SYSCALL_64_fastpath+0x12/0x71
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
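The ordering problem described in the commit above can be modeled in a few lines. This is a userspace sketch only, not the kernel code: the class and function names are hypothetical, and the use count simply mirrors the idea that every address handle created against a PD holds a reference until it is destroyed.

```python
import warnings


class ProtectionDomain:
    """Hypothetical stand-in for a PD with a use count."""

    def __init__(self):
        self.usecnt = 0
        self.freed = False


class AddressHandle:
    """Hypothetical stand-in for an AH; it pins its PD while it exists."""

    def __init__(self, pd):
        self.pd = pd
        pd.usecnt += 1

    def destroy(self):
        self.pd.usecnt -= 1


def dealloc_pd(pd):
    # Models the WARN_ON: complain if anything still references the PD.
    if pd.usecnt != 0:
        warnings.warn("PD de-allocated with non-zero use count")
    pd.freed = True
```

Destroying the handle first leaves the count at zero, so the de-allocation is silent; skipping that step triggers the warning.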
2016-03-11IB/rdmavt: Remove unnecessary exported functionsDennis Dalessandro7-155/+128
Remove exported functions which are no longer required as the functionality has moved into rdmavt. This also requires re-ordering some of the functions since their prototype no longer appears in a header file. Rather than add forward declarations it is just cleaner to re-order some of the functions. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Remove signal_supported and commentsDennis Dalessandro1-18/+0
Initially it was intended that rdmavt would support some signaling between the underlying driver and itself. However this turned out to be unnecessary for qib and hfi1. If we need to add something like this later to support another driver we should do it then. As of now this is essentially dead code, so remove it. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Remove RVT_FLAGsDennis Dalessandro4-28/+0
While hfi1 and qib were still supporting bits and pieces of core verbs components there needed to be a way to convey if rdmavt should handle allocation and initialization of resources like the queue pair table. Now that all of this is moved into rdmavt there is no need for these flags. They are no longer used in the drivers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib,rdmavt: Move smi_ah to qibDennis Dalessandro3-6/+6
Rdmavt adopted an smi_ah from qib which is not needed by hfi1. Move this back to qib and get it out of the common library. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Setup notify free/create mad agent callbacks for rdmavtDennis Dalessandro1-0/+4
Qib needs to be notified when mad agents are created and freed because there is some counter maintenance that needs to be performed. Add those callbacks at registration time with rdmavt. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Add per verb driver callback checkingDennis Dalessandro2-75/+489
For each verb validate that all requirements for driver callbacks are met. If a function is called without checking for a valid pointer, it is a required function. Also document what each callback function does. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Clean up comments and add more documentationDennis Dalessandro11-78/+280
Add, remove, and otherwise clean up existing comments that are leftover from the initial code postings of rdmavt. Many of the comments were added to provide an idea on the direction we were thinking of going. Now that the design is solidified make a pass over and clean everything up. Also add details where lacking. Ensure all non static functions have nano comments. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Put QPs into error state after SL->SC table changesKaike Wan3-2/+64
If an SL->SC mapping table change occurs after an RC/UC QP is created, there is no mechanism to change the SC or the VL for that QP. The fix is to place the QP into error state so that ULP can recreate the QP with the new SL->SC mapping. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Add trace and error print statements in post_one_wrHarish Chegondi2-1/+77
These trace and error print statements would help in debugging issues which are caused due to messed up QP ring buffer pointers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib, staging/rdma/hfi1: add s_hlock for use in post sendMike Marciniszyn18-221/+319
This patch adds an additional lock to reduce contention on the s_lock. This lock is used in post_send() so that the post_send is not serialized with the send engine and other send related processing. To do this the s_next_psn is now maintained on post_send() while post_send() related fields are moved to a new cache line. There is an s_avail maintained for the post_send() to mitigate trading cache lines with the send engine. The lock is released/acquired around releasing the just built packet to the egress mechanism. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
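The split between the two locks can be pictured with a small ring-buffer model. This is a sketch under assumptions, not the driver code: the class, the method names, and the one-slot-kept-empty convention are illustrative, while `s_hlock`, `s_lock`, `s_avail`, and `s_last` are the names used in the commit message. The posting side decrements a cached `s_avail` and only looks at `s_last` (owned by the send engine side) when the cache runs out.

```python
import threading


class SendQueue:
    """Hypothetical model of a send ring guarded by two locks."""

    def __init__(self, size):
        self.size = size                 # ring slots; one is kept empty
        self.s_head = 0                  # next slot to fill (under s_hlock)
        self.s_last = 0                  # oldest uncompleted slot (under s_lock)
        self.s_avail = size - 1          # cached free-slot count (under s_hlock)
        self.s_hlock = threading.Lock()  # taken by post_send only
        self.s_lock = threading.Lock()   # taken by the send engine side

    def _free_slots(self):
        used = (self.s_head - self.s_last) % self.size
        return self.size - 1 - used

    def post_send(self):
        with self.s_hlock:
            if self.s_avail == 0:
                # Cache exhausted: consult s_last once to refresh it.
                self.s_avail = self._free_slots()
                if self.s_avail == 0:
                    return False         # queue is full
            self.s_head = (self.s_head + 1) % self.size
            self.s_avail -= 1
            return True

    def complete_one(self):
        with self.s_lock:
            self.s_last = (self.s_last + 1) % self.size
```

In this model a post only touches state owned by the completion side when its cached count reaches zero, which is the cache-line trading the commit message aims to reduce.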
2016-03-11IB/qib: Rename several functions by adding a "qib_" prefixHarish Chegondi4-39/+39
This would avoid conflict with the functions in hfi1 that have similar names when both qib and hfi1 drivers are configured to be built into the kernel. This issue came up in the 0-day build report. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt, staging/rdma/hfi1: use qps to dynamically scale timeout valueVennila Megavannan4-3/+32
A busy_jiffies variable is maintained and updated when rc qps are created and deleted. busy_jiffies is a scaled value of the number of rc qps in the device. busy_jiffies is incremented every rc qp scaling interval. busy_jiffies is added to the rc timeout in add_retry_timer and mod_retry_timer. The rc qp scaling interval is selected based on extensive performance evaluation of targeted workloads. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
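A minimal sketch of the scaling described above, assuming integer division by the scaling interval. The interval value and the function names are placeholders chosen for illustration; the commit message only says the interval was picked through performance evaluation.

```python
RC_QP_SCALING_INTERVAL = 5  # assumed value, for illustration only


def busy_jiffies(n_rc_qps, interval=RC_QP_SCALING_INTERVAL):
    """Grows by one jiffy for every `interval` RC QPs on the device."""
    return n_rc_qps // interval


def scaled_rc_timeout(base_timeout_jiffies, n_rc_qps):
    """Timeout passed to the retry timer: base value plus the busy term."""
    return base_timeout_jiffies + busy_jiffies(n_rc_qps)
```

With these assumptions, a device with 12 RC QPs adds 2 jiffies to every RC retry timeout, and a device with fewer than 5 adds none.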
2016-03-11staging/rdma/hfi1: Turning off LED without checking if stepping is AxSebastian Sanchez2-4/+3
It prevents the LED from staying on when the QSFP module is not present. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: actually use new RNR timer API in loopback pathMike Marciniszyn3-6/+5
The patch series which added a new API for the RNR timer did not include an updated call in the loopback path. RC/UC RNR loopback would be broken without this. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Tune for unknown channel if configuration file is absentEaswar Hariharan4-52/+99
Currently, the driver fails to tune the SerDes and therefore prevents link up if the configuration file is missing or fails parsing or validation. This patch adds a fallback option so that the 8051 is asked to tune for an unknown channel and possibly get the link up if tuning succeeds. It also adds a user-friendly message to update the configuration file if it is out-of-date. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fetch platform configuration data from EFI variableEaswar Hariharan7-22/+121
The platform configuration data has been moved into the EFI variable store where it is populated by the HFI1 option ROM. This patch pulls the configuration data from the new location, retaining a fallback to request_firmware. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib,staging/rdma/hfi1: use setup_timer apiHari Prasath Gujulan Elango2-6/+2
Replace the separate calls that initialize a timer and then assign its callback function with a single setup_timer() call. Signed-off-by: Hari Prasath Gujulan Elango <hgujulan@visteon.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: remove unused qp fieldMike Marciniszyn1-1/+0
The field is a vestige from ipath. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Insure last cursor is updated prior to completeMike Marciniszyn2-9/+23
This patch is a prerequisite for adding a separate lock for post send. The timing of updating s_last needs to be before returning any send completion to avoid a race between a poll cq seeing a completion and the post send checking for a full queue. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Insure last cursor is updated prior to completeMike Marciniszyn2-9/+23
This patch is a prerequisite for adding a separate lock for post send. The timing of updating s_last needs to be before returning any send completion to avoid a race between a poll cq seeing a completion and the post send checking for a full queue. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: add s_retry to diagnosticsMike Marciniszyn1-1/+2
This is needed to debug ULP issues with getting retry attributes correctly specified. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: remove duplicate timeout printMike Marciniszyn1-2/+1
The qp->timeout field is duplicated in the seqfile print. Remove it. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use new RNR timerMike Marciniszyn5-10/+19
Use the new RNR timer for hfi1. For qib, this timer doesn't exist, so exploit driver callbacks to use the new timer as appropriate. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: add unique rnr timerMike Marciniszyn3-2/+4
Add a new rnr timer to hfi1. This allows for future optimizations having the retry and rnr timers separate. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use mod_timer when appropriateMike Marciniszyn1-20/+22
Use new timer API to optimize maintenance of timers during ACK processing. When we are still expecting ACKs, mod the timer to avoid a heavyweight delete/add. Otherwise, ensure do_rc_ack() maintains the timer as it did before. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use new timer routinesMike Marciniszyn1-29/+10
Use the new timer routines. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: centralize timer routines into rcMike Marciniszyn1-0/+107
Centralize disparate timer maintenance. This allows for central control and changes to the RC timer handling including future optimizations. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Removing unused struct hfi1_verbs_countersSebastian Sanchez1-16/+0
It removes the unused struct hfi1_verbs_counters from verbs.h. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Adding support for hfi counters via sysfsSebastian Sanchez1-54/+252
It enables access to counters in /sys/class/infiniband/hfi1_0/ports/1/counters by providing infrastructure when PMA queries occur. Counters symbol_error and VL15_dropped are not supported in OPA, therefore, 0 will always be returned. In addition, two common routines (pma_get_opa_port_dctrs, pma_get_opa_port_ectrs) were created to query counters to avoid code duplication. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Replacement of goto's for break/returnsSebastian Sanchez1-26/+30
It replaces gotos with break and return statements in process_perf_opa(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Change for data type of port numberSebastian Sanchez1-7/+7
This commit changes the data type for port_num in pma_get_opa_porterrors() from unsigned long to u8. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fix bug that could block the process on context exitMitko Haralanov1-1/+4
A race was discovered in the user SDMA code, which could result in a process being stuck in the kernel call indefinitely in certain error conditions. If, during the processing of a user SDMA request, there was an error *and* all outstanding SDMA descriptors had been completed by the time that error case was handled in the calling function, the state of the packet queue would not get correctly updated, resulting in the process subsequently getting stuck, thinking that there are more descriptors to be completed. To handle this scenario, the driver now checks the submitted packet count vs. the completed. If all submitted packets have also been completed, the driver can safely free the request and signal user level. Otherwise, this will be handled by the completion callback. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
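The decision made on the error path can be sketched as a comparison of two counters. This is an illustrative model only: the class, field names, and return strings are hypothetical and stand in for the request state, the free, and the user-level signal mentioned in the commit message.

```python
class SdmaRequest:
    """Hypothetical stand-in for a user SDMA request."""

    def __init__(self):
        self.submitted = 0     # packets handed to the SDMA engine
        self.completed = 0     # packets whose completion callback has run
        self.freed = False
        self.signaled = False


def handle_submit_error(req):
    """Error path in the submitting function."""
    if req.submitted == req.completed:
        # No completion callback is still pending, so nobody else will
        # clean up: free the request and signal user level here.
        req.freed = True
        req.signaled = True
        return "freed_by_caller"
    # Completions are still outstanding; the last callback cleans up.
    return "left_for_callback"
```

Before the fix, the first branch was effectively missing, so a request whose descriptors had all completed was never released.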
2016-03-11staging/rdma/hfi1: Remove unused variable nsbrDean Luick1-7/+0
Remove unused nsbr count from PCIe Gen3 code. Reviewed-by: Stuart Summers <john.s.summers@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Make EPROM check per deviceDean Luick2-11/+6
Add a variable eprom_available to each device, replacing the global of the same name. This is to allow multiple HFI devices with different EPROM availability to operate correctly on the same system. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Add credits for VL0 to VL7 in snoop modeSadanand Warrier3-2/+66
Add a new option to the snoop ioctl which allows credits to be allocated across all VLs. Previously only VL0 and VL15 had credits allocated. The new option used in the ioctl HFI1_SNOOP_IOCSET_OPTS allows credits to be allocated so that VL15 will have at least 8.5KB credits and the other VLs will have the rest of the credits divided equally across themselves. The total number of credits is stored in the upper 16 bits of the integer passed and the cumulative value should ensure that VL0 has at least 8.5KB and each VL a minimum of 2KB + 128 bytes. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Improve performance of user SDMAMitko Haralanov2-40/+22
To facilitate locked page counting, the user SDMA routines would maintain a list of io vectors, which were freed in the completion callback and then unpin the associated pages during the next call into the kernel. Since the size of this list was unbounded, doing this was bad for performance because the driver ended up spending too much time freeing the io vectors. This commit changes how the io vector freeing is done by moving the actual page unpinning into the callback and maintaining a count of unpinned pages. This count can then be used during the next call into the kernel to update the mm->pinned_vm variable (since that requires process context and the ability to sleep.) Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
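The deferred accounting in this commit can be sketched as a counter that is bumped in the callback and drained later. The classes and function names below are hypothetical; only `pinned_vm` comes from the commit message, and the real driver would need an atomic counter, which this single-threaded model does not show.

```python
class PacketQueue:
    """Hypothetical per-context state shared with the completion callback."""

    def __init__(self):
        self.unpinned_pages = 0


class Mm:
    """Hypothetical stand-in for the process memory descriptor."""

    def __init__(self, pinned_vm):
        self.pinned_vm = pinned_vm


def completion_callback(pq, npages):
    # Pages are unpinned here; only the count is kept for later, because
    # the accounting update cannot be done from this context.
    pq.unpinned_pages += npages


def account_unpinned(pq, mm):
    # Runs on the next call into the kernel, in process context.
    mm.pinned_vm -= pq.unpinned_pages
    pq.unpinned_pages = 0
```

The work left for the next system call is a single subtraction, regardless of how many io vectors completed in between.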
2016-03-11staging/rdma/hfi1, IB/core: Fix LinkDownReason define for consistencyEaswar Hariharan3-4/+4
LinkDownReason LocalMediaNotInstalled lacked an underscore and was inconsistent with other defines in the same family. This patch fixes this. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Remove modify_port and port_immutable functionsHarish Chegondi4-55/+28
Delete code from query_port which has been moved into rvt_query_port. Create a call back function to shut down a port which may be called from rvt_modify_port. Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Support query gid in rdmavtDennis Dalessandro1-20/+12
Query gid is in rdmavt, but still relies on the driver to maintain the guid table. Add the necessary driver call back and remove the existing verb handler. Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Clean up init_cntrs()Jubin John1-18/+0
Clean up init_cntrs() by removing unnecessary memsets and debug statements. Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fix snoop packet length calculationDean Luick1-5/+4
The LRH has a 12 bit packet length field, not 11 bit. This caused a snoop packet length miscalculation leading to a crash when sending a large ping over IPoIB while running opapacketcapture. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
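The effect of the one-bit difference is easy to see with a mask. This is a sketch with hypothetical names; it treats the length as a raw field value and does not model the rest of the header or the units of the field.

```python
LRH_LEN_MASK_11 = 0x7FF  # the too-narrow mask
LRH_LEN_MASK_12 = 0xFFF  # mask matching a 12 bit field


def lrh_packet_length(lrh_len_word, mask):
    """Extract the packet length field from the header word holding it."""
    return lrh_len_word & mask
```

Any length of 2048 or more loses its top bit under the 11 bit mask, so a field value of 2100 is read as 52, which is consistent with the failure showing up only for large packets.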
2016-03-11staging/rdma/hfi1: Correct TWSI resetDean Luick3-45/+36
Change the TWSI reset function so it will stop the reset once the lines are in an expected state. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Pablo Cacho <pablo.cacho@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Remove PCIe AER diagnostic messageDean Luick1-7/+1
There are several reasons why PCIe AER cannot be enabled. Do not report the failure to enable as an error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Implement LED beaconing for maintenanceEaswar Hariharan4-61/+64
This patch implements LED beaconing for maintenance. A MAD packet with the LEDInfo attribute set to 1 will enable LED beaconing with a duty cycle of 2s on and 1.5s off. A MAD packet with the LEDInfo attribute set to 0 will disable beaconing and return the LED to normal operation. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
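The stated duty cycle gives a 3.5 s period. A minimal sketch of the resulting on/off pattern follows, assuming the cycle starts with the on phase (the commit message does not say which phase comes first) and using hypothetical names.

```python
BEACON_ON_MS = 2000   # 2 s on, from the commit message
BEACON_OFF_MS = 1500  # 1.5 s off, from the commit message


def beacon_led_is_on(elapsed_ms):
    """True while the LED is in the on phase of the beacon cycle."""
    period_ms = BEACON_ON_MS + BEACON_OFF_MS
    return elapsed_ms % period_ms < BEACON_ON_MS
```

Working in integer milliseconds avoids floating-point edge cases at the phase boundaries.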
2016-03-11staging/rdma/hfi1: Split last 8 bytes of copy to user bufferDean Luick6-20/+57
Copy the last 8 bytes of user mode RC WRITE_ONLY and WRITE_LAST opcodes separately from the rest of the data. It is a de-facto standard for some MPI implementations to use a poll on the last few bytes of a verbs message to indicate that the message has been received rather than follow the required function method. The driver uses the kernel memcpy routine, which becomes "rep movsb" on modern machines. This copy, while very fast, does not guarantee in-order copy completion and the result is an occasional perceived corrupted packet. Avoid the issue by splitting the last 8 bytes to copy from the verbs opcodes where it matters and performing an in-order byte copy. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
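The split copy can be sketched as a bulk copy followed by a byte-by-byte tail. Python cannot reproduce the memory-ordering behavior that motivates the change, so this only illustrates the structure; the function name and the `tail` parameter are hypothetical.

```python
def copy_with_ordered_tail(dst, src, tail=8):
    """Copy src into dst, writing the final `tail` bytes one at a time."""
    n = len(src)
    bulk = n - tail if n > tail else 0
    # Bulk part: in the driver this is the fast copy whose completion
    # order is not guaranteed.
    dst[:bulk] = src[:bulk]
    # Tail part: strictly ascending byte order, so a reader polling the
    # last bytes sees them only after everything before them was written.
    for i in range(bulk, n):
        dst[i] = src[i]
    return dst
```

Here `dst` is expected to be a `bytearray` at least as long as `src`; messages shorter than the tail are copied entirely by the in-order loop.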
2016-03-11staging/rdma/hfi1: Fix fabric serdes reset by re-downloading firmwareDean Luick1-13/+44
A host fabric serdes reset is required to go back to polling. However, access to the fabric serdes may have been invalidated by the sibling HFI when it downloads its fabric serdes firmware. Work around this by re-downloading and re-validating the serdes firmware at reset time on Bx hardware. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Report physical state changes per device instead of globallyDean Luick2-3/+4
Make physical state change reporting per-device, not global, to reduce excessive reports of "physical state changed". Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Properly determine error status of SDMA slotsMitko Haralanov2-7/+14
To ensure correct operation between the driver and PSM with respect to managing the SDMA request ring, it is important that the status for a particular request slot is set at the correct time. Otherwise, PSM can get out of sync with the driver, which could lead to hangs or errors on new requests. Properly determining when to set the error status of an SDMA slot depends on knowing exactly when the last txreq for that request has been completed. This in turn requires that the driver knows exactly how many requests have been generated and how many of those requests have been successfully submitted to the SDMA queue. The previous implementation of the mid-layer SDMA API did not provide a way for the caller of sdma_send_txlist() to know how many of the txreqs in the input list have actually been submitted without traversing the list and counting. Since sdma_send_txlist() already traverses the list in order to process it, requiring such traversal in the caller is completely unnecessary. Therefore, it is much easier to enhance sdma_send_txlist() to return the number of successfully submitted txreqs. This, in turn, allows the caller to accurately determine the progress of the SDMA request and, therefore, correctly set the error status at the right time. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
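The interface change can be sketched as a submit routine that reports how far it got. This is a simplified model, not the signature of sdma_send_txlist(): the functions, the `free_slots` parameter, and the returned dictionary are hypothetical.

```python
def send_txlist(free_slots, txreqs):
    """Submit txreqs in order until the ring is full.

    Returns the number of txreqs actually submitted, so the caller
    does not have to walk the list again to count them.
    """
    submitted = 0
    for _ in txreqs:
        if submitted == free_slots:
            break
        submitted += 1
    return submitted


def submit_request(free_slots, txreqs):
    """Caller side: track generated vs. submitted for status decisions."""
    count = send_txlist(free_slots, txreqs)
    return {
        "generated": len(txreqs),
        "submitted": count,
        "all_submitted": count == len(txreqs),
    }
```

With the count in hand, the caller knows exactly how many completions to expect before it may set the final status of the request slot.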