Four IBA Transfer Protocol Flavors

QPs (in separate CAs) that will be sending messages to each other must each be set up to use the same IBA protocol “flavor.” In the specification, the four IBA transfer types are referred to as service types, service classes, or transport types.

Connected Service Types

General
Earlier Example Assumed RC Service Type

The example scenario described in “Sending a Message to a Destination CA” on page 47 and pictured in Figure 3-4 on page 58 and Figure 3-5 on page 59 made the assumption that the QPs in the two CAs had been set up to use the Reliable Connected (RC) IBA transport service.

Two Connected Service Types

The specification defines two connected service types:

- Reliable Connected (RC). The responder QP's RQ Logic must respond to each request packet with an Ack or Nak packet. See “Reliable Connected Service Type” on page 64.

- Unreliable Connected (UC). UC works the same as RC except that the responder QP's RQ Logic does not send back a response to each request packet. See “UC Transport Service” on page 443.

RC/UC Necessitate Initial QP Connection Establishment

In both cases, the two QPs are initialized at setup time with information about each other and this information is stored in the QP Context of each of the two QPs.

RC/UC Are Private, Point-to-Point Comm Channels

Figure 4-1 on page 63 illustrates a pair of RC or UC QPs, one QP in each of the illustrated CAs (QP7 and QP4). When software associated with a RC or UC QP wishes to send a message from its local CA's memory to the memory of the CA containing its companion RC or UC QP, it posts a WR to the SQ of its local RC or UC QP. It's important to note that the WR does not specify the destination CA port or QP to which the message is to be sent. This is because, once it has been set up, the QP inherently knows the addresses of the target port and QP in the other CA (from the information stored in its QP Context).

Figure 4-1. QP Basics


The two QPs (in two separate CAs) using the RC or UC service type to communicate with each other are therefore limited in the following manner: they can only transfer messages to each other (and not with any other QP).
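
The relationship between a WR and the QP Context can be illustrated with a minimal Python sketch. It is a toy model, not an implementation of any real verbs interface: the class names, the method names, and the port address 0x12 are all hypothetical. It only shows that the destination is recorded once, at connection setup, and that a WR posted afterward carries no addressing information.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QPContext:
    """Per-QP state; the remote_* fields are filled in at connection setup."""
    local_qpn: int
    remote_port_addr: Optional[int] = None  # port behind which the remote QP resides
    remote_qpn: Optional[int] = None        # identity of the companion QP


@dataclass
class ConnectedQP:
    """Toy model of an RC or UC QP (hypothetical names throughout)."""
    ctx: QPContext
    sq: List[bytes] = field(default_factory=list)  # Send Queue of posted WRs

    def connect(self, remote_port_addr: int, remote_qpn: int) -> None:
        # Connection establishment: record the companion QP in the QP Context.
        self.ctx.remote_port_addr = remote_port_addr
        self.ctx.remote_qpn = remote_qpn

    def destination(self) -> Tuple[int, int]:
        # Every message goes to the one destination held in the QP Context.
        if self.ctx.remote_port_addr is None or self.ctx.remote_qpn is None:
            raise RuntimeError("QP is not connected")
        return (self.ctx.remote_port_addr, self.ctx.remote_qpn)

    def post_send(self, message: bytes) -> None:
        # The WR carries only the message; it names no destination port or QP.
        self.destination()  # raises if the connection has not been set up
        self.sq.append(message)


# Usage: QP7 is connected to QP4 (the port address is made up for illustration).
qp7 = ConnectedQP(QPContext(local_qpn=7))
qp7.connect(remote_port_addr=0x12, remote_qpn=4)
qp7.post_send(b"hello")
```

Because the destination lives in the QP Context rather than in the WR, this QP has no way to address any QP other than its companion.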

Reliable Connected Service Type

This service type provides a high degree of reliability, but, due to the required response packets, it generates a substantial amount of traffic over the network. The basic characteristics of the RC service type are:

  • Connection needed. Before any messages may be transferred, a connection must be established between the RC QPs in the two CAs, and the QP Contexts of the two QPs are each programmed with the identity of the remote QP as well as the address of the port behind which the remote QP resides.

  • Private communications channel. The two RC QPs may then be used to send messages to each other (but not to any other QP in the same or any other target adapter).

  • Message size. Each message transfer WR can specify a message transfer anywhere from zero to 2GB in size. Messages larger than one PMTU are segmented into multi-packet transfers.

  • An Ack/Nak protocol permits the requester (i.e., the QP SQ Logic sending the message) to verify that all packets are delivered to the responder QP's RQ Logic. It also permits the responder QP's RQ Logic to detect missing, duplicate, or invalid packets.

  • Software completion notification. Software will not be alerted that a message transfer has completed until the QP's SQ Logic has received the Ack corresponding to the message's final request packet (in the case of a Send or RDMA Write), or the final RDMA Read response packet (in the case of an RDMA Read), or the Atomic response packet (in the case of an Atomic RMW).

  • High traffic. Because the responder QP's RQ Logic must Ack or Nak each request packet received, this service class generates considerably more network traffic than the “unreliable” service types.

  • Packet PSN checking. Each request packet contains a PSN that the target QP's RQ Logic uses to verify that all request packets are received in order (and that each is only processed once, even if it should be received multiple times; more on this later).

  • Two CRC fields in each packet are used to verify the integrity of the packet.

  • Completes on receipt of final response. The SQ Logic cannot create a CQE on the SCQ (Send Completion Queue) to signal the completion of a message transfer until the response for the final packet of the message transfer has been received.

  • Operations supported. Supports the following types of message transfer operations (they are explained in the next chapter, “Intro to Send/Receive Operations” on page 77):

    - RDMA Read support is required.

    - RDMA Write support is required.

    - It is optional whether or not the CI (see Channel Interface in the glossary) supports Atomic operations. If it does, then it must support both the Atomic Fetch and Add and Atomic Compare and Swap If Equal operations on the RC transport service type.

    - Send support is required.

    - Bind Memory Window support is required (memory windows are covered in “Memory Windows” on page 308).
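
The segmentation, PSN checking, and Ack/Nak behavior listed above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the PMTU of 256 bytes is chosen only for the example, PSN wraparound is ignored, a duplicate request is simply acknowledged again without being reprocessed, and a Nak is assumed to carry the PSN the responder was expecting. The function and class names are hypothetical.

```python
from typing import List, Tuple

PMTU = 256  # assumed path MTU in bytes, chosen only for this sketch

Packet = Tuple[int, str, bytes]  # (PSN, opcode, payload)


def segment(message: bytes, start_psn: int, pmtu: int = PMTU) -> List[Packet]:
    """Split a message into request packets of at most pmtu payload bytes."""
    # A zero-length message still produces one (empty) packet.
    chunks = [message[i:i + pmtu] for i in range(0, len(message), pmtu)] or [b""]
    packets = []
    for i, chunk in enumerate(chunks):
        if len(chunks) == 1:
            opcode = "only"
        elif i == 0:
            opcode = "first"
        elif i == len(chunks) - 1:
            opcode = "last"
        else:
            opcode = "middle"
        packets.append((start_psn + i, opcode, chunk))
    return packets


class RCResponder:
    """Toy RQ Logic: answers every request packet with an Ack or a Nak."""

    def __init__(self, expected_psn: int) -> None:
        self.expected_psn = expected_psn
        self.received = bytearray()

    def receive(self, packet: Packet) -> Tuple[str, int]:
        psn, _opcode, payload = packet
        if psn == self.expected_psn:
            self.received += payload          # in order: process exactly once
            self.expected_psn += 1
            return ("ack", psn)
        if psn < self.expected_psn:
            return ("ack", psn)               # duplicate: ack, do not reprocess
        return ("nak", self.expected_psn)     # gap: a request packet is missing


# Usage: a 600-byte Send becomes three packets; the transfer completes on the
# requester's side only when the Ack for the final packet has arrived.
packets = segment(b"x" * 600, start_psn=100)
responder = RCResponder(expected_psn=100)
responses = [responder.receive(p) for p in packets]
completed = responses[-1] == ("ack", packets[-1][0])
```

The one-response-per-request pattern visible in `responses` is also the source of the extra network traffic that the RC service type generates.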

Unreliable Connected Service Type

UC is a subset of the RC protocol. The responder's RQ Logic doesn't Ack or Nak each request packet it receives.

  • Disadvantage: The requester has no verification that each request packet has been received by the remote QP's RQ Logic.

  • Advantages:

    - This protocol generates considerably less network traffic than RC.

    - From the sender's perspective, the message transfer completes quickly (because it doesn't have to wait for the receipt of all of the responses for the transfer to complete).

The basic characteristics of the UC service type are:

  • Connection needed. Just like the RC protocol, before any messages may be transferred a connection must be established between the UC QPs in the two adapters, and the QP Contexts of the two QPs are each programmed with the identity of the remote QP as well as the address of the port behind which it resides.

  • Private communications channel. Just like the RC protocol, the two UC QPs may then be used to send messages to each other (but not to any other QP in the same or any other target adapter).

  • Message size. Just like the RC protocol, each message can be anywhere from 0 to 2GB in size. Messages larger than one PMTU are segmented into multi-packet transfers.

  • No Ack/Nak protocol. Unlike the RC protocol, there is no Ack/Nak protocol. The requester (i.e., the QP SQ Logic sending the message) therefore cannot verify that each packet is delivered to the responder (the target QP's RQ Logic).

  • Software completion notification. There are no responses returned by the remote QP's RQ Logic when using the UC transport service type. That being the case, software on the sender's side is alerted that a message transfer has completed immediately upon the transmission of the last request packet of the Send or RDMA Write (RDMA Reads and Atomic requests are not supported).

  • Low traffic. Because the responder QP's RQ Logic doesn't Ack or Nak each request packet received, this service class generates considerably less network traffic than the “reliable” service types.

  • Packet PSN checking. Just like the RC protocol, each request packet contains a PSN that the target QP's RQ Logic uses to verify that all request packets are received in order (and that each is only processed once even if it should be received multiple times; more on this later). Unlike the RC protocol, however, if the responder QP's RQ Logic detects out-of-order request packets (i.e., one or more missing request packets), it is not permitted to send a Nak back to the requester QP's SQ Logic. If this should happen:

    - The responder QP's RQ Logic ignores the remaining request packets of the current message.

    - The responder QP's RQ Logic awaits the beginning of a new message (i.e., the receipt of a request packet with a “first” or “only” opcode).

    - The responder QP's RQ Logic may (or may not) inform its local client (e.g., software) of the problem.

  • Just like the RC protocol, two CRC fields in each packet are used to verify the integrity of the packet.

  • Operations supported. Supports the following types of message transfer operations (they are explained in the next chapter, “Intro to Send/Receive Operations” on page 77):

    - RDMA Write support is required.

    - Send support is required.

    - Bind Memory Window support is required.
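
The UC responder's handling of a missing request packet, as described in the PSN checking item above, can be sketched as a small state machine. This is an illustrative model only: the class and attribute names are hypothetical, PSN wraparound is ignored, and the sketch assumes that the responder resynchronizes its expected PSN whenever a packet with a "first" or "only" opcode arrives. The `errors` counter stands in for the optional report to the local client.

```python
from typing import List, Tuple

Packet = Tuple[int, str, bytes]  # (PSN, opcode, payload)


class UCResponder:
    """Toy UC RQ Logic: no Ack/Nak; a damaged message is silently dropped."""

    def __init__(self, expected_psn: int) -> None:
        self.expected_psn = expected_psn
        self.partial = bytearray()         # payload of the message in progress
        self.assembling = False            # True between "first" and "last"
        self.dropping = False              # True while ignoring a damaged message
        self.delivered: List[bytes] = []   # complete messages
        self.errors = 0                    # optional report to the local client

    def receive(self, packet: Packet) -> None:
        psn, opcode, payload = packet
        if opcode in ("first", "only"):
            if self.assembling:
                self.errors += 1           # the previous message never finished
            # A new message begins: resynchronize (assumption of this sketch).
            self.partial = bytearray(payload)
            self.assembling = opcode == "first"
            self.dropping = False
            self.expected_psn = psn + 1
            if opcode == "only":
                self.delivered.append(bytes(self.partial))
            return
        if self.dropping:
            return                         # ignore the rest of the damaged message
        if not self.assembling or psn != self.expected_psn:
            self.errors += 1               # out-of-order: no Nak is sent
            self.dropping = True
            self.assembling = False
            return
        self.partial += payload
        self.expected_psn += 1
        if opcode == "last":
            self.delivered.append(bytes(self.partial))
            self.assembling = False


# Usage: the middle packet (PSN 11) of the first message is lost in transit,
# so that message is dropped; the following single-packet message is delivered.
rq = UCResponder(expected_psn=10)
for pkt in [(10, "first", b"a"), (12, "last", b"c"), (13, "only", b"ok")]:
    rq.receive(pkt)
```

Note that nothing in this exchange flows back to the requester, which is why the sender's side completes as soon as the last request packet has been transmitted.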

Datagram Service Types

Datagram QPs Can Exchange Messages With Multiple QPs

As previously discussed, the two connected service types have the following basic characteristics:

  • The RC/UC service types both necessitate the initial establishment of a connection between the two QPs.

  • The RC/UC service types are both private, point-to-point communications channels between two QPs.

The characteristic that both of the datagram service types, RD and UD, have in common and that differentiates them from RC and UC is that an RD or UD QP can transfer messages with multiple QPs residing in other CAs.

The RD and UD service types permit a QP to send and receive messages with any of a number of multiple remote QPs that have also been set up to use the same type of datagram service (RD or UD). Unlike the connected service types, when posting each message transfer WR to a datagram QP's SQ, software must specify the delivery address for each “datagram” message in its respective WR.

RD Requires Initial Connection Establishment, UD Does Not

Refer to Figure 4-1 on page 63. The UD service type does not require the initial establishment of communications between UD QPs in different CAs before message transfers can be initiated between UD QPs.

The RD service type, on the other hand, requires the initial establishment of communications between two CAs (not QPs) before message transfers can be initiated between RD QPs in the two CAs.

Reliable Datagram (RD) Service Type
Messages Pass Through Pipeline Between Two CA Ports

Refer to Figure 4-1 on page 63. Rather than passing messages directly to each other, a RD QP in a CA (e.g., QP6) can use one or more pipelines (EEC1 or EEC2 in the example CA) to exchange messages with RD QPs in one or more remote CAs.

A Pipeline Is Referred to as a Reliable Datagram Channel

Each of these RD pipelines is referred to as a Reliable Datagram Channel, or RDC.

Each End of the RDC Is Referred to as an EEC

Each end of the pipeline (i.e., RDC) is referred to as an End-to-End Context, or EEC.

EEC Must Know Its Own ID and Its Partner's ID

It should be obvious that when setting up a RDC, software must supply each end of the RDC with its own identifier, as well as the ID (i.e., EECN) of the EEC at the other end. In other words, at RDC creation/setup time, each of the two EECs must be programmed with:

- The EECN (EEC Number) assigned to the local EEC (e.g., EEC2).

- The port number of the local port on its CA that the local EEC will use to send and receive messages with another EEC in a remote CA.

- The port address of the port on the other CA (behind which the remote EEC resides).

- The EECN (EEC Number) of the remote EEC.

RD WR Contains Local EEC ID and QPN of Remote RD QP

It should be obvious that, when posting a message transfer request (i.e., a WR) to the SQ of a RD QP, the following addressing elements must be supplied in the WR:

- The identifier of the local EEC through which the message packets will be passed. In other words, the EECN (e.g., EEC1) of an EEC in the local CA through which the message packets will be passed.

- The QPN (QP Number) that identifies the target RD QP (e.g., QP6) in the remote CA.

Once a RD QP's SQ Logic sends a packet to the local end of the RDC (i.e., the local EEC), the EEC takes responsibility for delivering it to the other end of the RDC (the remote EEC) and, ultimately, to the target RD QP in the other CA. The WR therefore does not contain the following elements:

- The port number to which the local EEC is connected.

- The port address of the port on the remote CA.

- The EECN of the remote end of the pipeline (i.e., the EEC in the remote CA).
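
The split of addressing information between the EEC and the WR can be summarized in a short Python sketch. The record and function names are hypothetical, and the port numbers, port addresses, and EECNs in the usage lines are made-up values. The sketch only shows which fields live where and how the two are combined when a message is sent.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class EECContext:
    """What each end of an RDC is programmed with at setup time."""
    local_eecn: int        # EECN assigned to the local EEC
    local_port_num: int    # local CA port the EEC sends and receives through
    remote_port_addr: int  # address of the port on the other CA
    remote_eecn: int       # EECN of the EEC at the far end of the RDC


@dataclass(frozen=True)
class RDWorkRequest:
    """Addressing carried by a WR posted to the SQ of an RD QP."""
    local_eecn: int   # which local EEC (and therefore which RDC) to use
    remote_qpn: int   # target RD QP in the remote CA
    message: bytes


def resolve(wr: RDWorkRequest,
            eecs: Dict[int, EECContext]) -> Tuple[int, int, int, int]:
    """Combine the WR with the EEC it names to get the full delivery address."""
    eec = eecs[wr.local_eecn]
    return (eec.local_port_num, eec.remote_port_addr, eec.remote_eecn,
            wr.remote_qpn)


# Usage: one RD QP posts two WRs that travel through two different RDCs.
eecs = {
    1: EECContext(local_eecn=1, local_port_num=1, remote_port_addr=0x20, remote_eecn=7),
    2: EECContext(local_eecn=2, local_port_num=2, remote_port_addr=0x30, remote_eecn=9),
}
route_a = resolve(RDWorkRequest(local_eecn=1, remote_qpn=6, message=b"a"), eecs)
route_b = resolve(RDWorkRequest(local_eecn=2, remote_qpn=3, message=b"b"), eecs)
```

The WR supplies only the last element of each route; everything else comes from the EEC selected by `local_eecn`.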

WRs Posted to RD SQ Can Specify Different RDCs

In light of the discussion in the preceding section, it's obvious that a RD QP can therefore send and receive messages through any number of RDCs to other CAs.

EEC Contains Send and Receive Logic (but not a SQ and RQ)

An EEC does not contain a SQ and a RQ to which WRs can be posted. Rather, the EEC acts as the surrogate used to send and receive messages for its local, client RD QPs. The EEC contains Send Logic and Receive Logic to achieve this end.

Basic RD Operational Characteristics

The following are the basic characteristics of the RD service type:

- Connection needed. Before any messages may be transferred, a connection must be established between the EECs in the two CAs, and the EECs are each programmed with the identity of the remote EEC as well as the address of the port behind which the remote EEC resides.

- Message Size. Each message can be anywhere from zero to 2GB in size. Messages larger than one PMTU are segmented into multi-packet transfers.

- Ack/Nak protocol. An Ack/Nak protocol permits the requester (i.e., the Send Logic of the EEC sending the message) to verify that all packets are delivered to the remote EEC. It also permits the responder EEC's Receive Logic to detect missing or invalid packets.

- Software completion notification. Software on the sender's side will not be alerted that a message transfer has completed until the QP's SQ Logic has received the Ack corresponding to the message's final request packet (in the case of a Send or RDMA Write), or all of the expected RDMA Read response packets (in the case of an RDMA Read), or the Atomic response packet (in the case of an Atomic RMW).

- High traffic. Because the responder EEC must Ack or Nak each request packet received, this service class generates considerably more network traffic than the “unreliable” service types.

- Packet PSN checking. Each packet contains a PSN (Packet Sequence Number) that the target EEC's Receive Logic uses to verify that all packets are received in order (and that each is only received and processed once) and that there are no missing packets.

- Two CRC fields in each packet are used to verify the integrity of the packet.

- Local EEC forwards request packets to EEC in remote CA. Upon receipt of each request packet from a local RD QP's SQ Logic, the local EEC's Send Logic transmits the packets to the EEC in the remote CA.

- Remote EEC forwards request packets to remote QP. Upon receipt of each request packet, the EEC's Receive Logic in the remote CA forwards the request packet to the targeted remote RD QP's RQ Logic for processing.

- Operations supported. It supports the following types of message transfer operations (they are explained in the next chapter, “Intro to Send/Receive Operations” on page 77):

    - RDMA Read support is required.

    - RDMA Write support is required.

    - It is optional whether or not the CI supports Atomic operations. If it does, then it must support both the Atomic Fetch and Add and Atomic Compare and Swap If Equal operations on the RD transport service type.

    - Send support is required.

    - Bind Memory Window support is required.

Basic RD Operational Description

The following is a basic description of RD operation:

  1. Prior to any message transfers, software causes an EEC to be created in the local HCA as well as in the remote adapter.

  2. Software then causes a connection to be established between the EECs in the two adapters, and the Send Logic of the two EECs are each programmed with the identity of the remote EEC (e.g., EEC1). In other words, each end of the RDC knows the identity of the other end.

  3. The EEC's Send Logic is initialized with a Start PSN that it will insert into the first request packet it transmits for one of its client RD QPs. The packet PSN is not supplied by the client RD QP's SQ Logic.

  4. The EEC's Receive Logic is initialized with the expected PSN (ePSN) that will be used to check the PSN in the next request packet received from the remote EEC's Send Logic to determine if it's the next expected request packet, a duplicate request packet (more on this later), or a stale request packet (more on this later).

  5. The two EECs may then be used to send messages to each other (but not to any other EEC in the same or any other target adapter).

  6. To send a message from a local RD QP's SQ to a remote RD QP's RQ Logic, software posts a WR to the local QP's SQ. The WR contains:

    - The identity of the local EEC to be used to transmit the message.

    - The identity of the remote target RD QP.

  7. The local EEC identified in the WR reads the WQE from the local RD QP's SQ and performs the message transfer request it identifies with the remote EEC. Each packet of a message contains:

    - The address of the port on the remote CA.

    - The EECN of the EEC in the remote CA.

    - The QPN of the target RD QP in the remote CA.

  8. Upon receipt of a request packet, the remote EEC's Receive Logic verifies that the packet contains the next expected PSN (ePSN).

  9. The actions then taken by the EEC's Receive Logic depend on the request packet type:

    - Send request packet. In this case, the EEC's Receive Logic returns an Ack packet with the same PSN as the request packet. It also writes the request packet's data payload to local memory using the pointer stored in a WQE on the target RD QP's RQ.

    - RDMA Write request packet. In this case, the EEC's Receive Logic returns an Ack packet with the same PSN as the request packet. It also writes the request packet's data payload to local memory using the pointer supplied in the first request packet of the RDMA Write operation.

    - RDMA Read request packet. The EEC's Receive Logic forwards the request to the RQ Logic of the destination RD QP. The QP's RQ Logic reads the requested read data from local memory and forwards the resultant RDMA Read response packets back to the local EEC's Receive Logic for transport to the request originator.

    - Atomic RMW request packet. The EEC's Receive Logic forwards the request to the RQ Logic of the destination RD QP. The QP's RQ Logic performs the RMW operation on the indicated local memory locations and forwards a response packet containing the data read from local memory back to the local EEC's Receive Logic for transport to the request originator.

  10. Back at the originating end of the RDC, the EEC's Send Logic verifies the PSN in each Ack packet, RDMA Read response packet, or Atomic Response packet it receives. In addition, it takes the following actions (based on the type of response packet):

    - Send Ack packet. In this case, no action is taken (assuming that the packet's PSN makes sense).

    - RDMA Write Ack packet. In this case, no action is taken (assuming that the packet's PSN makes sense).

    - RDMA Read response packet. The EEC's Send Logic forwards the packet's data payload containing the read data to the SQ Logic of the source RD QP. The QP's SQ Logic writes the requested read data into local memory using the Scatter Buffer List from the WQE in the top entry on its SQ.

    - Atomic RMW response packet. The EEC's Send Logic forwards the packet's data payload containing the read data to the SQ Logic of the source RD QP. The QP's SQ Logic writes the requested read data into local memory using the Scatter Buffer List from the WQE in the top entry on its SQ.
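
The PSN handling in steps 3 through 10 can be sketched for the simplest case, a single-packet Send. This is a toy model with hypothetical names and made-up addresses: it ignores PSN wraparound, does not distinguish duplicate from stale packets (anything other than the expected PSN is just rejected), and omits RDMA and Atomic operations entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class RDPacket:
    remote_port_addr: int  # address of the port on the remote CA
    remote_eecn: int       # EECN of the EEC in the remote CA
    remote_qpn: int        # QPN of the target RD QP in the remote CA
    psn: int               # supplied by the EEC, not by the client RD QP
    payload: bytes


class EEC:
    """Toy End-to-End Context with Send Logic and Receive Logic."""

    def __init__(self, remote_port_addr: int, remote_eecn: int,
                 start_psn: int, expected_psn: int) -> None:
        self.remote_port_addr = remote_port_addr
        self.remote_eecn = remote_eecn
        self.next_psn = start_psn          # Send Logic: PSN for the next request
        self.expected_psn = expected_psn   # Receive Logic: ePSN
        self.unacked: Set[int] = set()     # requests still awaiting an Ack
        self.delivered: List[Tuple[int, bytes]] = []  # (target QPN, payload)

    def send(self, target_qpn: int, payload: bytes) -> RDPacket:
        # Send Logic stamps the addressing from its own context plus the WR's QPN.
        packet = RDPacket(self.remote_port_addr, self.remote_eecn,
                          target_qpn, self.next_psn, payload)
        self.unacked.add(packet.psn)
        self.next_psn += 1
        return packet

    def receive(self, packet: RDPacket) -> Optional[int]:
        # Receive Logic: accept only the next expected PSN.
        if packet.psn != self.expected_psn:
            return None                    # duplicate/stale handling omitted
        self.expected_psn += 1
        self.delivered.append((packet.remote_qpn, packet.payload))
        return packet.psn                  # the Ack carries the request's PSN

    def handle_ack(self, ack_psn: int) -> bool:
        # Back at the originator: does the Ack's PSN make sense?
        if ack_psn in self.unacked:
            self.unacked.remove(ack_psn)
            return True
        return False


# Usage: each end's ePSN matches the other end's Start PSN.
local_eec = EEC(remote_port_addr=0x20, remote_eecn=2, start_psn=500, expected_psn=900)
remote_eec = EEC(remote_port_addr=0x10, remote_eecn=1, start_psn=900, expected_psn=500)
request = local_eec.send(target_qpn=6, payload=b"data")
ack_psn = remote_eec.receive(request)
acked = local_eec.handle_ack(ack_psn)
```

In this model the client RD QP never sees a PSN; sequencing belongs entirely to the two ends of the RDC.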

Unreliable Datagram (UD) Service Type

In Figure 4-1 on page 63, refer to QPs 4 and 5 in one CA and QP 2 in the other CA. The UD service has the following basic characteristics:

  • No connection setup. No initial connection setup with a remote QP is necessary prior to sending or receiving messages.

  • QP attached to specific port. The QP is associated with a specific local CA port through which it sends and receives messages with remote UD QPs.

  • Port association may limit destinations. The QP can only send and receive messages with remote UD QPs that can be reached through the local CA port with which the QP is associated.

  • Destination specified in WR. Each WR posted to the QP's SQ can target a different remote UD QP.

  • Message size. Each message must fit in the data payload field of a single packet (and therefore is a maximum of 4KB in size).

  • No Ack/Nak protocol. There is no Ack/Nak protocol, so there is no guarantee that packets are delivered.

  • Software completion notification. There are no responses returned by the remote QP's RQ Logic when using the UD transport service type. That being the case, software on the sender's side is alerted that a message transfer has completed immediately upon the transmission of the one and only request packet of the Send (RDMA Writes, RDMA Reads, and Atomic requests are not supported).

  • No packet PSN checking. Although there is a PSN in each packet, it's not meaningful because the entire message is encapsulated in a single packet.

  • CRCs protect each packet.

  • Support mandatory. Because management messages (referred to as Management Datagrams, or MADs) are sent using UD, all CAs, switches, and routers must support the UD protocol on the QPs used to send and receive management messages. The data payload field of a MAD is always exactly 256 bytes in length.

  • Operations supported. Only supports the Send message transfer operation.
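
A short Python sketch can make the UD characteristics concrete: each WR names its own destination, the message must fit in a single packet, and the send completes as soon as the packet has been transmitted. The class, field, and function names are hypothetical, and the destination addresses in the usage lines are made up; the 4KB and 256-byte limits are the ones stated above.

```python
from dataclasses import dataclass
from typing import List

MAX_UD_PAYLOAD = 4096   # 4KB: a UD message must fit in one packet's payload
MAD_PAYLOAD_LEN = 256   # a MAD's data payload is always exactly 256 bytes


@dataclass(frozen=True)
class UDWorkRequest:
    dest_port_addr: int  # destination port, specified in every WR
    dest_qpn: int        # destination UD QP, specified in every WR
    message: bytes


class UDQueuePair:
    """Toy UD QP attached to one local CA port."""

    def __init__(self, local_port_num: int) -> None:
        self.local_port_num = local_port_num
        self.completions: List[UDWorkRequest] = []

    def post_send(self, wr: UDWorkRequest) -> None:
        if len(wr.message) > MAX_UD_PAYLOAD:
            raise ValueError("UD message does not fit in a single packet")
        # The single request packet would be transmitted here. There is no
        # response to wait for, so the send completes immediately.
        self.completions.append(wr)


def is_mad_payload(data: bytes) -> bool:
    """True if data has the fixed size of a MAD data payload."""
    return len(data) == MAD_PAYLOAD_LEN


# Usage: consecutive WRs on the same QP target different remote UD QPs.
qp = UDQueuePair(local_port_num=1)
qp.post_send(UDWorkRequest(dest_port_addr=0x20, dest_qpn=2, message=b"ping"))
qp.post_send(UDWorkRequest(dest_port_addr=0x30, dest_qpn=5, message=bytes(256)))
```

A completion here means only that the packet left the sender; without an Ack/Nak protocol it says nothing about delivery.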

IBA Service Type Support Requirements

Table 4-1 on this page defines whether or not an HCA or TCA must support each of the IBA service types. Note that both HCAs and TCAs must support the UD service on QPs 0 and 1 (because these two QPs are used to send and receive management datagrams). On an HCA, the Query HCA verb call may be used to discover the HCA's supported capabilities.

Table 4-1. CA Service Type Support Requirements
Service Type    HCA Support?    TCA Support?
RC              Required        Optional
UC              Required        Optional
RD              Optional        Optional
UD              Required        Required on QP0 and QP1
