summaryrefslogtreecommitdiff
path: root/drivers/ntb/ntb_transport.c
AgeCommit message (Collapse)AuthorFilesLines
2015-09-07NTB: Use unique DMA channels for TX and RXDave Jiang1-19/+58
Allocate two DMA channels, one for TX operation and one for RX operation, instead of having one DMA channel for everything. This provides slightly better performance, and also will make error handling cleaner later on. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07NTB: Remove dma_sync_wait from ntb_async_rxAllen Hubbe1-9/+3
The dma_sync_wait can hurt the performance of workloads mixed with both large and small frames. Large frames will be copied using the dma engine. Small frames will be copied by the cpu. The dma_sync_wait prevents the cpu and dma engine copying in parallel. In the period where the cpu is copying, the dma engine is stopped. The dma engine is not doing any useful work to copy large frames during that time, and the additional time to restart the dma engine for the next large frame. This will decrease the throughput for the portion of a workload with large frames. In the period where the dma engine is copying, the cpu is held up waiting for dma to complete. The small frames processing will be delayed until the dma is complete. The RX frames are completed in-order, and the processing of small frames takes very little time, so dma_sync_wait may have an insignificant impact on the respose time of frames. The more significant impact is to the system, because the delay in dma_sync_wait is implemented as busy non-blocking wait. This can prevent the delayed core from doing any useful work, even if it could be processing work for other drivers, unrelated to transport RX processing. After applying the earlier patch to fix out-of-order RX acknoledgement, the dma_sync_wait is no longer necessary. Remove it, so that cpu memcpy will proceed immediately for small frames, in parallel with ongoing dma for large frames. Do not hold up the cpu from doing work while dma is in progress. The prior fix will continue to ensure in-order completion of the RX frames to the upper layer, and in-order delivery of the RX acknoledgement. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07NTB: Clean up QP stats infoDave Jiang1-9/+16
Make QP stats info more readable for debugging purposes. Also add an entry to indicate whether DMA is being used. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07NTB: Make the transport list in order of discoveryDave Jiang1-1/+1
The list should be added from the bottom and not the top in order to ensure the transport is provided in the same order to clients as ntb devices are discovered. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07NTB: Add flow control to the ntb_netdevDave Jiang1-1/+17
Right now if we push the NTB really hard, we start dropping packets due to not able to process the packets fast enough. We need to st:qop the upper layer from flooding us when that happens. A timer is necessary in order to restart the queue once the resource has been processed on the receive side. Due to the way NTB is setup, the resources on the tx side are tied to the processing of the rx side and there's no async way to know when the rx side has released those resources. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Fix dereference before checkAllen Hubbe1-2/+1
Remove early dereference of a pointer that is checked later in the code. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Fix zero size or integer overflow in ntb_set_mwAllen Hubbe1-3/+6
A plain 32 bit integer will overflow for values over 4GiB. Change the plain integer size to the appropriate size type in ntb_set_mw. Change the type of the size parameter and two local variables used for size. Even if there is no overflow, a size of zero is invalid here. Reported-by: Juyoung Jung <jjung@micron.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Schedule to receive on QP link upAllen Hubbe1-0/+2
Schedule to receive on QP link up, to make sure that the doorbell is properly cleared for interrupts. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Fix oops in debugfs when transport is half-upDave Jiang1-1/+5
When the remote side is not up, we do not have all the context for the transport, and that causes NULL ptr access. Have the debugfs reads check to see if transport is up before we make access. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Fix transport stats for multiple devicesDave Jiang1-2/+10
Currently the debugfs does not have files for all NTB transport queue pairs. When there are multiple NTBs present in a system, the QP names of the last transport clobber the names of previously added transport QPs. Only the last added QPs can be observed via debugfs. Create a directory per NTB transport to associate the QPs with that transport. Name the directory the same as the PCI device. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09NTB: Fix ntb_transport out-of-order RX updateAllen Hubbe1-69/+100
It was possible for a synchronous update of the RX index in the error case to get ahead of the asynchronous RX index update in the normal case. Change the RX processing to preserve an RX completion order. There were two error cases. First, if a buffer is not present to receive data, there would be no queue entry to preserve the RX completion order. Instead of dropping the RX frame, leave the RX frame in the ring. Schedule RX processing when RX entries are enqueued, in case there are RX frames waiting in the ring to be received. Second, if a buffer is too small to receive data, drop the frame in the ring, mark the RX entry as done, and indicate the error in the RX entry length. Check for a negative length in the receive callback in ntb_netdev, and count occurrences as rx_length_errors. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Print driver name and version in module initDave Jiang1-0/+2
Printouts driver name and version to indicate what is being loaded. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Increase transport MTU to 64k from 16kDave Jiang1-1/+1
Benchmarking showed a significant performance increase with the MTU size to 64k instead of 16k. Change the driver default to 64k. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Default to CPU memcpy for performanceDave Jiang1-4/+13
Disable DMA usage by default, since the CPU provides much better performance with write combining. Provide a module parameter to enable DMA usage when offloading the memcpy is preferred. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Improve performance with write combiningDave Jiang1-1/+10
Changing the memory window BAR mappings to write combining significantly boosts the performance. We will also use memcpy that uses non-temporal store, which showed performance improvement when doing non-cached memcpys. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Use NUMA memory and DMA chan in transportAllen Hubbe1-14/+32
Allocate memory and request the DMA channel for the same NUMA node as the NTB device. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Rate limit ntb_qp_link_workAllen Hubbe1-1/+1
When the ntb transport is connecting and waiting for the peer, the debug console receives lots of debug level messages about the remote qp link status being down. Rate limit those messages. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Reset transport QP link stats on downAllen Hubbe1-8/+28
Reset the link stats when the link goes down. In particular, the TX and RX index and count must be reset, or else the TX side will be sending packets to the RX side where the RX side is not expecting them. Reset all the stats, to be consistent. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Do not advance transport RX on link downAllen Hubbe1-2/+1
On link down, don't advance RX index to the next entry. The next entry should never be valid after receiving the link down flag. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Differentiate transport link down messagesAllen Hubbe1-2/+2
The same message "qp %d: Link Down\n" was printed at two locations in ntb_transport. Change the messages so they are distinct. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Read peer info from local SPAD in transportDave Jiang1-5/+5
The transport was writing and then reading the peer scratch pad, essentially reading what it just wrote instead of exchanging any information with the peer. The transport expects the peer values to be the same as the local values, so this issue was not obvious. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04NTB: Split ntb_hw_intel and ntb_transport driversAllen Hubbe1-389/+547
Change ntb_hw_intel to use the new NTB hardware abstraction layer. Split ntb_transport into its own driver. Change it to use the new NTB hardware abstraction layer. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-02NTB: Move files in preparation for NTB abstractionAllen Hubbe1-7/+7
This patch only moves files to their new locations, before applying the next two patches adding the NTB Abstraction layer. Splitting this patch from the next is intended make distinct which code is changed only due to moving the files, versus which are substantial code changes in adding the NTB Abstraction layer. Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2014-09-14ntb: Add alignment check to meet hardware requirementDave Jiang1-0/+13
The NTB translate register must have the value to be BAR size aligned. This alignment check make sure that the DMA memory allocated has the proper alignment. Another requirement for NTB to function properly with memory window BAR size greater or equal to 4M is to use the CMA feature in 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and CONFIG_CMA_SIZE_MBYTES set. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2014-09-14NTB: correct the spread of queues over mw'sJon Mason1-2/+2
The detection of an uneven number of queues on the given memory windows was not correct. The mw_num is zero based and the mod should be division to spread them evenly over the mw's. Signed-off-by: Jon Mason <jon.mason@intel.com>
2014-04-07NTB: Code Style Clean-upJon Mason1-10/+9
Some white space and 80 char overruns corrected. Signed-off-by: Jon Mason <jon.mason@intel.com>
2014-04-07NTB: client event cleanupJon Mason1-1/+0
Provide a better event interface between the client and transport Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-11-26Merge tag 'ntb-3.13' of git://github.com/jonmason/ntbLinus Torvalds1-36/+41
Pull non-transparent bridge updates from Jon Mason: "NTB driver bug fixes to address a missed call to pci_enable_msix, NTB-RP Link Up issue, Xeon Doorbell errata workaround, ntb_transport link down race, and correct dmaengine_get/put usage. Also, clean-ups to remove duplicate defines and document a hardware errata. Finally, some changes to improve performance" * tag 'ntb-3.13' of git://github.com/jonmason/ntb: NTB: Disable interrupts and poll under high load NTB: Enable Snoop on Primary Side NTB: Document HW errata NTB: remove duplicate defines NTB: correct dmaengine_get/put usage NTB: Fix ntb_transport link down race ntb: Fix missed call to pci_enable_msix() NTB: Fix NTB-RP Link Up NTB: Xeon Doorbell errata workaround
2013-11-20NTB: Disable interrupts and poll under high loadJon Mason1-18/+7
Disable interrupts and poll under high load Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-11-20NTB: correct dmaengine_get/put usageJon Mason1-3/+6
dmaengine_get() causes the initialization of the per-cpu channel tables. It needs to be called prior to dma_find_channel(). Initial version by Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-11-20NTB: Fix ntb_transport link down raceJon Mason1-15/+28
A WARN_ON is being hit in ntb_qp_link_work due to the NTB transport link being down while the ntb qp link is still active. This is caused by the transport link being brought down prior to the qp link worker thread being terminated. To correct this, shutdown the qp's prior to bringing the transport link down. Also, only call the qp worker thread if it is in interrupt context, otherwise call the function directly. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-11-14dmaengine: remove DMA unmap flagsBartlomiej Zolnierkiewicz1-8/+3
Remove no longer needed DMA unmap flags: - DMA_COMPL_SKIP_SRC_UNMAP - DMA_COMPL_SKIP_DEST_UNMAP - DMA_COMPL_SRC_UNMAP_SINGLE - DMA_COMPL_DEST_UNMAP_SINGLE Cc: Vinod Koul <vinod.koul@intel.com> Cc: Tomasz Figa <t.figa@samsung.com> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Jon Mason <jon.mason@intel.com> Acked-by: Mark Brown <broonie@linaro.org> [djbw: clean up straggling skip unmap flags in ntb] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2013-11-14NTB: convert to dmaengine_unmap_dataBartlomiej Zolnierkiewicz1-27/+58
Use the generic unmap object to unmap dma buffers. Cc: Vinod Koul <vinod.koul@intel.com> Cc: Tomasz Figa <t.figa@samsung.com> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Jon Mason <jon.mason@intel.com> [djbw: fix up unmap len, and GFP flags] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2013-09-05NTB: Comment FixJon Mason1-2/+2
Add "data" ntb_register_db_callback parameter description comment and correct poor spelling. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-09-05NTB: Remove unused variableJon Mason1-3/+0
Remove unused pci_dev variable from ntb_transport_free() Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-09-05NTB: Rename Variables for NTB-RPJon Mason1-1/+1
Many variable names in the NTB driver refer to the primary or secondary side. However, these variables will be used to access the reverse case when in NTB-RP mode. Make these names more generic in anticipation of NTB-RP support. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-09-05NTB: Use DMA Engine to Transmit and ReceiveJon Mason1-47/+277
Allocate and use a DMA engine channel to transmit and receive data over NTB. If none is allocated, fall back to using the CPU to transfer data. Signed-off-by: Jon Mason <jon.mason@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
2013-09-04NTB: Xeon Errata WorkaroundJon Mason1-29/+49
There is a Xeon hardware errata related to writes to SDOORBELL or B2BDOORBELL in conjunction with inbound access to NTB MMIO Space, which may hang the system. To workaround this issue, use one of the memory windows to access the interrupt and scratch pad registers on the remote system. This bypasses the issue, but removes one of the memory windows from use by the transport. This reduction of MWs necessitates adding some logic to determine the number of available MWs. Since some NTB usage methodologies may have unidirectional traffic, the ability to disable the workaround via modparm has been added. See BF113 in http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-c5500-c3500-spec-update.pdf See BT119 in http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-09-03NTB: Correct debugfs to work with more than 1 NTB DeviceJon Mason1-12/+5
Debugfs was setup in NTB to only have a single debugfs directory. This resulted in the leaking of debugfs directories and files when multiple NTB devices were present, due to each device stomping on the variables containing the previous device's values (thus preventing them from being freed on cleanup). Correct this by creating a secondary directory of the PCI BDF for each device present, and nesting the previously existing information in those directories. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: Multiple NTB client fixJon Mason1-2/+3
Fix issue with adding multiple ntb client devices to the ntb virtual bus. Previously, multiple devices would be added with the same name, resulting in crashes. To get around this issue, add a unique number to the device when it is added. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: memcpy lockup workaroundJon Mason1-3/+8
The system will appear to lockup for long periods of time due to the NTB driver spending too much time in memcpy. Avoid this by reducing the number of packets that can be serviced on a given interrupt. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: Correctly handle receive buffers of the minimal sizeJon Mason1-3/+5
The ring logic of the NTB receive buffer/transmit memory window requires there to be at least 2 payload sized allotments. For the minimal size case, split the buffer into two and set the transport_mtu to the appropriate size. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: reset tx_index on link toggleJon Mason1-1/+1
If the NTB link toggles, the driver could stop receiving due to the tx_index not being set to 0 on the transmitting size on a link-up event. This is due to the driver expecting the incoming data to start at the beginning of the receive buffer and not at a random place. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: Link toggle memory leakJon Mason1-12/+20
Each link-up will allocate a new NTB receive buffer when the NTB properties are negotiated with the remote system. These allocations did not check for existing buffers and thus did not free them. Now, the driver will check for an existing buffer and free it if not of the correct size, before trying to alloc a new one. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: Handle 64bit BAR sizesJon Mason1-48/+73
64bit BAR sizes are permissible with an NTB device. To support them various modifications and clean-ups were required, most significantly using 2 32bit scratch pad registers for each BAR. Also, modify the driver to allow more than 2 Memory Windows. Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: fix pointer math issuesDan Carpenter1-2/+2
->remote_rx_info and ->rx_info are struct ntb_rx_info pointers. If we add sizeof(struct ntb_rx_info) then it goes too far. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-05-15NTB: variable dereferenced before checkJon Mason1-2/+14
Correct instances of variable dereferencing before checking its value on the functions exported to the client drivers. Also, add sanity checks for all exported functions. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jon Mason <jon.mason@intel.com>
2013-01-22NTB: Fix Sparse WarningsJon Mason1-16/+16
Address the sparse warnings and resulting fallout Signed-off-by: Jon Mason <jon.mason@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21NTB: Out of free receive entries issueJon Mason1-3/+6
If the NTB client driver enqueues the maximum number of rx buffers, it will not be able to re-enqueue another in its callback handler due to a lack of free entries. This can be avoided by adding the current entry to the free queue prior to calling the client callback handler. With this change, ntb_netdev will no longer encounter a rx error on its first packet. Signed-off-by: Jon Mason <jon.mason@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21NTB: Remove reads across NTBJon Mason1-74/+63
CPU reads across NTB are slow(er) and can hang the local system if an ECC error is encountered on the remote. To work around the need for a read, have the remote side write its current position in the rx buffer to the local system's buffer and use that to see if there is room when transmitting. Signed-off-by: Jon Mason <jon.mason@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>