summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-03-11IB/rdmavt: Remove signal_supported and commentsDennis Dalessandro1-18/+0
Initially it was intended that rdmavt would support some signaling between the underlying driver and itself. However this turned out to be unnecessary for qib and hfi1. If we need to add something like this in later to support another driver we should do it then. As of now this essentially dead code so remove it. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Remove RVT_FLAGsDennis Dalessandro4-28/+0
While hfi1 and qib were still supporting bits and pieces of core verbs components there needed to be a way to convey if rdmavt should handle allocation and initialize of resources like the queue pair table. Now that all of this is moved into rdmavt there is no need for these flags. They are no longer used in the drivers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib,rdmavt: Move smi_ah to qibDennis Dalessandro3-6/+6
Rdmavt adopted an smi_ah from qib which is not needed by hfi1. Move this back to qib and get it out of the common library. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Setup notify free/create mad agent callbacks for rdmavtDennis Dalessandro1-0/+4
Qib needs to be notified when mad agents are created and freed, there is some counter maintenance that needs to be performed. Add those callbacks at registration time with rdmavt. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Add per verb driver callback checkingDennis Dalessandro2-75/+489
For each verb validate that all requirements for driver callbacks are met. If a function is called without checking for a valid pointer, it is a required function. Also document what each callback function does. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Clean up comments and add more documentationDennis Dalessandro11-78/+280
Add, remove, and otherwise clean up existing comments that are leftover from the initial code postings of rdmavt. Many of the comments were added to provide an idea on the direction we were thinking of going. Now that the design is solidified make a pass over and clean everything up. Also add details where lacking. Ensure all non static functions have nano comments. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Put QPs into error state after SL->SC table changesKaike Wan3-2/+64
If an SL->SC mapping table change occurs after an RC/UC QP is created, there is no mechanism to change the SC nor the VL for that QP. The fix is to place the QP into error state so that ULP can recreate the QP with the new SL->SC mapping. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: Add trace and error print statements in post_one_wrHarish Chegondi2-1/+77
These trace and error print statements would help in debugging issues which are caused due to messed up QP ring buffer pointers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib, staging/rdma/hfi1: add s_hlock for use in post sendMike Marciniszyn18-221/+319
This patch adds an additional lock to reduce contention on the s_lock. This lock is used in post_send() so that the post_send is not serialized with the send engine and other send related processing. To do this the s_next_psn is now maintained on post_send() while post_send() related fields are moved to a new cache line. There is an s_avail maintained for the post_send() to mitigate trading cache lines with the send engine. The lock is released/acquired around releasing the just built packet to the egress mechanism. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Rename several functions by adding a "qib_" prefixHarish Chegondi4-39/+39
This would avoid conflict with the functions in hfi1 that have similar names when both qib and hfi1 drivers are configured to be built into the kernel. This issue came up in the 0-day build report. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt, staging/rdma/hfi1: use qps to dynamically scale timeout valueVennila Megavannan4-3/+32
A busy_jiffies variable is maintained and updated when rc qps are created and deleted. busy_jiffies is a scaled value of the number of rc qps in the device. busy_jiffies is incremented every rc qp scaling interval. busy_jiffies is added to the rc timeout in add_retry_timer and mod_retry_timer. The rc qp scaling interval is selected based on extensive performance evaluation of targeted workloads. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Turning off LED without checking if stepping is AxSebastian Sanchez2-4/+3
It prevents the LED from staying on when the QSFP module is not present. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: actually use new RNR timer API in loopback pathMike Marciniszyn3-6/+5
The patch series which added a new API for the RNR timer did not include an updated call in the loopback path. RC/UC RNR loopback would be broken without this. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Tune for unknown channel if configuration file is absentEaswar Hariharan4-52/+99
Currently, the driver fails to tune the SerDes and therefore prevents link up if the configuration file is missing or fails parsing or validation. This patch adds a fallback option so that the 8051 is asked to tune for an unknown channel and possibly get the link up if tuning succeeds. It also adds a user-friendly message to update the configuration file if it is out-of-date. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fetch platform configuration data from EFI variableEaswar Hariharan7-22/+121
The platform configuration data has been moved into the EFI variable store where it is populated by the HFI1 option ROM. This patch pulls the configuration data from the new location, retaining a fallback to request_firmware. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib,staging/rdma/hfi1: use setup_timer apiHari Prasath Gujulan Elango2-6/+2
Replace the timer API's to initialize a timer & then assign the callback function by the setup_timer() API. Signed-off-by: Hari Prasath Gujulan Elango <hgujulan@visteon.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/rdmavt: remove unused qp fieldMike Marciniszyn1-1/+0
The field is a vestige from ipath. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11IB/qib: Insure last cursor is updated prior to completeMike Marciniszyn2-9/+23
This patch is a prerequisite for adding a separate lock for post send. The timing of updating s_last needs to be before returning any send completion to avoid a race between a poll cq seeing a completion and the post send checking for a full queue. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Insure last cursor is updated prior to completeMike Marciniszyn2-9/+23
This patch is a prerequisite for adding a separate lock for post send. The timing of updating s_last needs to be before returning any send completion to avoid a race between a poll cq seeing a completion and the post send checking for a full queue. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: add s_retry to diagnosticsMike Marciniszyn1-1/+2
This is needed to debug ULP issues with getting retry attributes correctly specified. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: remove duplicate timeout printMike Marciniszyn1-2/+1
The qp->timeout field is duplicated in the seqfile print. Remove it. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use new RNR timerMike Marciniszyn5-10/+19
Use the new RNR timer for hfi1. For qib, this timer doesn't exist, so exploit driver callbacks to use the new timer as appropriate. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: add unique rnr timerMike Marciniszyn3-2/+4
Add a new rnr timer to hfi1. This allows for future optimizations having the retry and rnr timers separate. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use mod_timer when appropriateMike Marciniszyn1-20/+22
Use new timer API to optimize maintenance of timers during ACK processing. When we are still expecting ACKs, mod the timer to avoid a heavyweight delete/add. Otherwise, insure do_rc_ack() maintains the timer as it had. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: use new timer routinesMike Marciniszyn1-29/+10
Use the new timer routines. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: centralize timer routines into rcMike Marciniszyn1-0/+107
Centralize disparate timer maintenance. This allow for central control and changes to the RC timer handling including future optimizations. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Removing unused struct hfi1_verbs_countersSebastian Sanchez1-16/+0
It removes the unused struct hfi1_verbs_counters from verbs.h Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Adding support for hfi counters via sysfsSebastian Sanchez1-54/+252
It enables access to counters in /sys/class/infiniband/hfi1_0/ports/1/counters by providing infrastructure when PMA queries occur. Counters symbol_error and VL15_dropped are not supported in OPA, therefore, 0 will always be returned. In addition, two common routines (pma_get_opa_port_dctrs, pma_get_opa_port_ectrs) were created to query counters to avoid code duplication. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Replacement of goto's for break/returnsSebastian Sanchez1-26/+30
It replaces goto's for break and return statements in process_perf_opa(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Change for data type of port numberSebastian Sanchez1-7/+7
This commit changes the data type for port_num in pma_get_opa_porterrors() from unsigned long to u8. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fix bug that could block the process on context exitMitko Haralanov1-1/+4
A race was discovred in the user SDMA code, which could result in an process being stuck in the kernel call indefinitely in certain error conditions. If, during the processing of a user SDMA request, there was an error *and* all outstanding SDMA descriptor had been completed by the time the that error case was handled in the calling function, the state of the packet queue would not get correctly updated resulting in the process subsequently getting stuck, thinking that there are more descriptors to be completed. To handle this scenario, the driver now checks the submitted packet count vs. the completed. If all submitted packets have also been completed, the driver can safely free the request and signal user level. Otherwise, this will be handled by the completion callback. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Remove unused variable nsbrDean Luick1-7/+0
Remove unused nsbr count from PCIe Gen3 code Reviewed-by: Stuart Summers <john.s.summers@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Make EPROM check per deviceDean Luick2-11/+6
Add a variable eprom_available to each device, replacing the global of the same name. This is to allow multiple HFI devices with different EPROM availability to operate correctly on the the same system. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Add credits for VL0 to VL7 in snoop modeSadanand Warrier3-2/+66
Add a new option to the snoop ioctl which allows credits to be allocated across all VLs. Previously only VL0 and VL15 had credits allocated. The new option used in the ioctl HFI1_SNOOP_IOCSET_OPTS allows credits to be allocated so that VL15 will have at least 8.5KB credits and the other VLs will have the rest of the credits divided equally across themselves. The total number of credits are stored in the upper 16 bits of the integer passed and the cumulative value should ensure that VL0 has at least 8.5KB and each VL a minimum of 2KB + 128 bytes Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Improve performance of user SDMAMitko Haralanov2-40/+22
To facilitate locked page counting, the user SDMA routines would maintain a list of io vectors, which were freed in the completion callback and then unpin the associated pages during the next call into the kernel. Since the size of this list was unbounded, doing this was bad for performance because the driver ended up spending too much time freeing the io vectors. This commit changes how the io vector freeing is done by moving the actual page unpinning in the callback and maintaining a count of unpinned pages. This count can then be used during the next call into the kernel to update the mm->pinned_vm variable (since that requires process context and the ability to sleep.) Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1, IB/core: Fix LinkDownReason define for consistencyEaswar Hariharan3-4/+4
LinkDownReason LocalMediaNotInstalled lacked an underscore and was inconsistent with other defines in the same family. This patch fixes this. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Remove modify_port and port_immutable functionsHarish Chegondi4-55/+28
Delete code from query_port which has been moved into rvt_query_port Create a call back function to shut down a port which may be called from rvt_modify_port Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Support query gid in rdmavtDennis Dalessandro1-20/+12
Query gid is in rdmavt, but still relies on the driver to maintain the guid table. Add the necessary driver call back and remove the existing verb handler. Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Clean up init_cntrs()Jubin John1-18/+0
Clean up init_cntrs() by removing unnecessary memsets and debug statements Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fix snoop packet length calculationDean Luick1-5/+4
The LRH has a 12 bit packet length field, not 11 bit. This caused a snoop packet length miscalculation leading to a crash when sending a large ping over IPoIB while running opapacketcapture. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Correct TWSI resetDean Luick3-45/+36
Change the TWSI reset function so it will stop the reset once the lines are in an expected state. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Pablo Cacho <pablo.cacho@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Remove PCIe AER diagnostic messageDean Luick1-7/+1
There are several reasons why PCIE AER cannot be enabled. Do not report the failure to enable as an error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Implement LED beaconing for maintenanceEaswar Hariharan4-61/+64
This patch implements LED beaconing for maintenance. A MAD packet with the LEDInfo attribute set to 1 will enable LED beaconing with a duty cycle of 2s on and 1.5s off. A MAD packet with the LEDInfo attribute set to 0 will disable beaconing and return the LED to normal operation. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Split last 8 bytes of copy to user bufferDean Luick6-20/+57
Copy the last 8 bytes of user mode RC WRITE_ONLY and WRITE_LAST opcodes separately from the rest of the data. It is a de-facto standard for some MPI implementations to use a poll on the last few bytes of a verbs message to indicate that the message has been received rather than follow the required function method. The driver uses the kernel memcpy routine, which becomes "rep movsb" on modern machines. This copy, while very fast, does not guarantee in-order copy completion and the result is an occasional perceived corrupted packet. Avoid the issue by splitting the last 8 bytes to copy from the verbs opcodes where it matters and performing an in-order byte copy. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Fix fabric serdes reset by re-downloading firmwareDean Luick1-13/+44
A host fabric serdes reset is required to go back to polling. However, access to the fabric serdes may have been invalidated by the sibling HFI when it downloads its fabric serdes firmware. Work around this by re-downloading and re-validating the serdes firmware at reset time on Bx hardware. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Report physical state changes per device instead of globallyDean Luick2-3/+4
Make physical state change reporting be per-device, not global to reduce excessive reports of "physical state changed" Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Properly determine error status of SDMA slotsMitko Haralanov2-7/+14
To ensure correct operation between the driver and PSM with respect to managing the SDMA request ring, it is important that the status for a particular request slot is set at the correct time. Otherwise, PSM can get out of sync with the driver, which could lead to hangs or errors on new requests. Properly determining of when to set the error status of a SDMA slot depends on knowing exactly when the last txreq for that request has been completed. This in turn requires that the driver knows exactly how many requests have been generated and how many of those requests have been successfully submitted to the SDMA queue. The previous implementation of the mid-layer SDMA API did not provide a way for the caller of sdma_send_txlist() to know how many of the txreqs in the input list have actually been submitted without traversing the list and counting. Since sdma_send_txlist() already traverses the list in order to process it, requiring such traversal in the caller is completely unnecessary. Therefore, it is much easier to enhance sdma_send_txlist() to return the number of successfully submitted txreqs. This, in turn, allows the caller to accurately determine the progress of the SDMA request and, therefore, correctly set the error status at the right time. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: correctly check for post-interrupt packetsDean Luick1-7/+25
At the end of the packet processing interrupt and thread handler, the RcvAvail interrupt is finally cleared down. There is a window between the last packet check (via DMA to memory) and interrupt clear-down. The code to recheck for a packet once the RcvAVail interrupt is enabled must ultimately use a CSR read of RcvHdrTail rather than depend on DMA'ed memory. This change adds a CSR read of RcvHdrTail if the memory check does not show a packet preset. The memory check is retained as a quick test before doing the more expensive, but always correct, CSR read. In the ASIC, the CSR read used to force the RcvAvail clear-down write to complete may bypass queued DMA writes to memory. The only correct way to decide if a packet has arrived without an interrupt to push DMA to memory ahead of itself is to read the tail directly after RcvAvail has been cleared down. It is not sufficient to just read the tail and skip pushing the clear-down. Both must be done. The tail read will not push clear-down write due to it being in a different area of the chip. At this point, it is OK to have packet data still being DMA'ed to memory. This is the end of packet processing for previous packets. If the driver detects a new packet has arrived before interrputs were re-enabled, it will force a new interrupt and the interrupt will push the packet DMAs to memory, where the driver will then react to the interrupt and do normal packet processing. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Improve performance of SDMA transfersMitko Haralanov2-168/+128
Commit a0d406934a46 ("staging/rdma/hfi1: Add page lock limit check for SDMA requests") added a mechanism to delay the clean-up of user SDMA requests in order to facilitate proper locked page counting. This delayed processing was done using a kernel workqueue, which meant that a kernel thread would have to spin up and take CPU cycles to do the clean-up. This proved detrimental to performance because now there are two execution threads (the kernel workqueue and the user process) needing cycles on the same CPU. Performance-wise, it is much better to do as much of the clean-up as can be done in interrupt context (during the callback) and do the remaining work in-line during subsequent calls of the user process into the driver. The changes required to implement the above also significantly simplify the entire SDMA completion processing code and eliminate a memory corruption causing the following observed crash: [ 2881.703362] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2881.703389] IP: [<ffffffffa02897e4>] user_sdma_send_pkts+0xcd4/0x18e0 [hfi1] [ 2881.703422] PGD 7d4d25067 PUD 77d96d067 PMD 0 [ 2881.703427] Oops: 0000 [#1] SMP [ 2881.703431] Modules linked in: [ 2881.703504] CPU: 28 PID: 6668 Comm: mpi_stress Tainted: G OENX 3.12.28-4-default #1 [ 2881.703508] Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0044.090 [ 2881.703512] task: ffff88077da8e0c0 ti: ffff880856772000 task.ti: ffff880856772000 [ 2881.703515] RIP: 0010:[<ffffffffa02897e4>] [<ffffffffa02897e4>] user_sdma_send_pkts+0xcd4/0x [ 2881.703529] RSP: 0018:ffff880856773c48 EFLAGS: 00010287 [ 2881.703531] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000002000 [ 2881.703534] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000002000 [ 2881.703537] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 2881.703540] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 2881.703543] R13: 0000000000000000 R14: ffff88071e782e68 R15: ffff8810532955c0 [ 2881.703546] FS: 00007f9c4375e700(0000) GS:ffff88107eec0000(0000) knlGS:0000000000000000 [ 2881.703549] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2881.703551] CR2: 0000000000000000 CR3: 00000007d4cba000 CR4: 00000000003407e0 [ 2881.703554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2881.703556] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2881.703558] Stack: [ 2881.703559] ffffffff00002000 ffff881000001800 ffffffff00000000 00000000000080d0 [ 2881.703570] 0000000000000000 0000200000000000 0000000000000000 ffff88071e782db8 [ 2881.703580] ffff8807d4d08d80 ffff881053295600 0000000000000008 ffff88071e782fc8 [ 2881.703589] Call Trace: [ 2881.703691] [<ffffffffa028b5da>] hfi1_user_sdma_process_request+0x84a/0xab0 [hfi1] [ 2881.703777] [<ffffffffa0255412>] hfi1_aio_write+0xd2/0x110 [hfi1] [ 2881.703828] [<ffffffff8119e3d8>] do_sync_readv_writev+0x48/0x80 [ 2881.703837] [<ffffffff8119f78b>] do_readv_writev+0xbb/0x230 [ 2881.703843] [<ffffffff8119fab8>] SyS_writev+0x48/0xc0 This commit also addresses issues related to notification of user processes of SDMA request slot availability. The slot should be cleaned up first before the user processes is notified of its availability. Reviewed-by: Arthur Kepner <arthur.kepner@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-11staging/rdma/hfi1: Use device file minor to identify EPROMDean Luick3-7/+12
When writing to the EPROM, the driver will always use the "first" device. This is incorrect for multiple cards. Use the device file minor to determine the device to use. Reject the generic device file. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>