diff options
Diffstat (limited to 'Documentation/mic/scif_overview.rst')
-rw-r--r-- | Documentation/mic/scif_overview.rst | 108 |
1 files changed, 108 insertions, 0 deletions
diff --git a/Documentation/mic/scif_overview.rst b/Documentation/mic/scif_overview.rst new file mode 100644 index 000000000000..4c8ad9e43706 --- /dev/null +++ b/Documentation/mic/scif_overview.rst @@ -0,0 +1,108 @@ +======================================== +Symmetric Communication Interface (SCIF) +======================================== + +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low +level communications API across PCIe currently implemented for MIC. Currently +SCIF provides inter-node communication within a single host platform, where a +node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of +communicating over the PCIe bus while providing an API that is symmetric +across all the nodes in the PCIe network. An important design objective for SCIF +is to deliver the maximum possible performance given the communication +abilities of the hardware. SCIF has been used to implement an offload compiler +runtime and OFED support for MPI implementations for MIC coprocessors. + +SCIF API Components +=================== + +The SCIF API has the following parts: + +1. Connection establishment using a client server model +2. Byte stream messaging intended for short messages +3. Node enumeration to determine online nodes +4. Poll semantics for detection of incoming connections and messages +5. Memory registration to pin down pages +6. Remote memory mapping for low latency CPU accesses via mmap +7. Remote DMA (RDMA) for high bandwidth DMA transfers +8. Fence APIs for RDMA synchronization + +SCIF exposes the notion of a connection which can be used by peer processes on +nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A +process in a SCIF node initiates a SCIF connection to a peer process on a +different node via a SCIF "endpoint". SCIF endpoints support messaging APIs +which are similar to connection oriented socket APIs. Connected SCIF endpoints +can also register local memory which is followed by data transfer using either +DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and +kernel mode clients which are functionally equivalent. + +SCIF Performance for MIC +======================== + +DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus +SCIF shows the performance advantages of SCIF for HPC applications and +runtimes:: + + Comparison of TCP and SCIF based BW + + Throughput (GB/sec) + 8 + PCIe Bandwidth ****** + + TCP ###### + 7 + ************************************** SCIF %%%%%% + | %%%%%%%%%%%%%%%%%%% + 6 + %%%% + | %% + | %%% + 5 + %% + | %% + 4 + %% + | %% + 3 + %% + | % + 2 + %% + | %% + | % + 1 + + + ###################################### + 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+- + 1 10 100 1000 10000 100000 + Transfer Size (KBytes) + +SCIF allows memory sharing via mmap(..) between processes on different PCIe +nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap +latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs. + +SCIF has a user space library which is a thin IOCTL wrapper providing a user +space API similar to the kernel API in scif.h. The SCIF user space library +is distributed @ https://software.intel.com/en-us/mic-developer + +Here is some pseudo code for an example of how two applications on two PCIe +nodes would typically use the SCIF API:: + + Process A (on node A) Process B (on node B) + + /* get online node information */ + scif_get_node_ids(..) scif_get_node_ids(..) + scif_open(..) scif_open(..) + scif_bind(..) scif_bind(..) + scif_listen(..) + scif_accept(..) scif_connect(..) + /* SCIF connection established */ + + /* Send and receive short messages */ + scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..) + + /* Register memory */ + scif_register(..) scif_register(..) + + /* RDMA */ + scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..) + + /* Fence DMAs */ + scif_fence_signal(..) scif_fence_signal(..) + + mmap(..) mmap(..) + + /* Access remote registered memory */ + + /* Close the endpoints */ + scif_close(..) scif_close(..) |