CHAPTER 19

Descriptor Type and Operations

packetC Descriptor Types

packetC provides data types that do not appear in standard C but offer significant support for packet-processing applications. These data types are often extensions of familiar C types. The extended data type described in this chapter is the descriptor. The chapter takes two approaches. The first part covers examples of descriptors and the packetC standard include file protocols.ph. The second part gives an in-depth view of the background of descriptors and how they operate under the hood, since they are new to packetC.

Descriptors

descriptor type_name identifier  at  offset_expression
descriptor type_name  { field_decl_list } identifier  at  offset_expression

A descriptor is a structure that describes a network protocol and that is mapped to whatever location contains that structure within the current packet.

Each time a packetC program is triggered with a new packet, information about the packet and the location of certain packet protocols is computed and placed in the Packet Information Block (PIB). This information includes the location of standardized protocols within the current packet. These locations (offsets) may change their values from packet to packet. A descriptor declaration associates three things:

  • a structure definition (usually that of a standard network protocol)
  • a user-specified name for the descriptor
  • a packet offset (typically a layer offset computed automatically by the system for each packet)

A descriptor declaration specifies the name of a structure type (or defines it in-place), the name of the descriptor based on that structure (a variable), and the location of the descriptor in terms of a packet offset expression that will be computed at run time.

// the envisioned use
descriptor TcpStruct
{
    short     sourcePort;
    short     destPort;
    int       sequenceNum;
    int       ackNum;
    …
} tcp at pib.l4Offset;

mySourcePort = tcp.sourcePort;

// idiosyncratic use
struct StructTag {
    short    sourcePort;
    short    destPort;
    byte     mysteryItem1;
    …
};
typedef StructTag MyCustomProtocol;

descriptor MyCustomProtocol  myProtocolVar at customOffset;
myByte = myProtocolVar.mysteryItem1;

Because a descriptor is associated with a particular portion of a packet (the specific location can change from packet to packet), it cannot be part of a larger structure, since that could create impossible layouts. Thus, a descriptor cannot be a field within a structure, a field within another descriptor, a union member, or an element type of an array. However, a descriptor can be based on a structure type that includes nested definitions of structures, unions, and enumeration types.

Descriptors are effectively packet-scope objects, in that the values accessed through a descriptor apply only to the current context and packet. Descriptors can be defined anywhere in the packet module, including outside of main(), even though they are packet scope and may reference the pib. Even when a descriptor is defined at global scope, making it accessible to all contexts, the values returned are unique to the context and packet, and are subject to any local or block scope constraints on the elements of its at clause.

An entire descriptor, as well as individual descriptor fields, can be assigned values from and to variables with compatible types.

struct StructType { short sourcePort; short destPort; };
descriptor StructType myDescr at pib.l2Offset;
StructType myStruct = myDescr;  // struct gets descriptor contents

A descriptor declaration does not include an initialization clause, because the descriptor's contents are effectively initialized when a context is provisioned with a new packet. Note that a descriptor's at clause may contain variables, so the descriptor's offset can change at run time. Furthermore, the contents of a packet may change while it is being processed, so the value returned for a given field can differ over the course of the program's execution in the context.
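
The run-time behavior described above can be emulated in plain C. The following is a minimal sketch, not packetC and not the platform's implementation: MockPib, read_be16, and tcp_dest_port are hypothetical names, and the packet is modeled as a plain byte array. It shows that a field read through an offset-based view is resolved on every access, so it follows both a changing offset and changing packet contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PIB: only the layer 4 offset is modeled. */
typedef struct { size_t l4Offset; } MockPib;

/* Read a big-endian (network order) 16-bit value from the packet array. */
static uint16_t read_be16(const uint8_t *pkt, size_t len, size_t at)
{
    assert(at + 2 <= len);
    return (uint16_t)((pkt[at] << 8) | pkt[at + 1]);
}

/* Emulates reading a destination-port field from a descriptor declared
   "at pib.l4Offset": the offset is looked up on every access and the
   bytes are fetched from the packet each time, never cached. */
static uint16_t tcp_dest_port(const uint8_t *pkt, size_t len, const MockPib *pib)
{
    return read_be16(pkt, len, pib->l4Offset + 2);   /* field sits 2 bytes in */
}
```

Calling tcp_dest_port twice on the same buffer gives different results if either pib->l4Offset or the underlying bytes changed between the calls, which is the behavior the paragraph above describes for descriptors.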

Descriptor Example Application

The following is a simple example of descriptors in use.

packet module telnetPackets;

#include "cloudshield.ph"
#include "protocols.ph"

int    totalPkts_;
int    telnetPkts_;
int    nonTelnetPkts_;

void main($PACKET pkt, $PIB pib, $SYS sys)
{
   ++totalPkts_;

   if ( tcp.destinationPort == 23 )
   {
        // Telnet Packets get dropped
        ++telnetPkts_;
        pib.action = DROP_PACKET;
   }
   else
   {
        // Forward any other packets
        ++nonTelnetPkts_;
        pib.action = FORWARD_PACKET;
   }
}

The example above references a predefined Transmission Control Protocol (TCP) descriptor within protocols.ph (shown in Chapter 25). The full TCP descriptor is included below for quick reference. In the example, tcp.destinationPort represents the 16-bit field containing the destination port number for the current packet, found 2 bytes into the TCP header. A decimal value of 23 in a TCP destination port generally refers to packets communicating using the Telnet protocol. While the length of Ethernet and IP headers may differ between packets with different options and tags, the decoded start of layer 4 is used to represent the start of the TCP header in the descriptor defined below. This allows a simple reference to a field by name that is not only easy to read but also adjusts to varying offsets from one packet to another.

It should be noted, however, that the sample application should also include a few more statements to verify that the packet is an IP packet and that the ipv4 header's protocol field specifies TCP as the enveloped protocol.
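
As an illustration of those missing checks, here is a minimal C sketch (not packetC). It assumes a hypothetical MockPib holding layer 3 and layer 4 offsets and a packet held in a plain byte array; is_telnet is an illustrative name. The IPv4 version nibble, the protocol byte at offset 9, and protocol number 6 for TCP are standard IPv4 facts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PIB offsets used by the ipv4 and tcp descriptors. */
typedef struct { size_t l3Offset; size_t l4Offset; } MockPib;

enum { IP_PROTO_TCP = 6, TELNET_PORT = 23 };

/* Returns 1 only when the packet is IPv4, carries TCP, and targets port 23. */
static int is_telnet(const uint8_t *pkt, size_t len, const MockPib *pib)
{
    assert(pkt != NULL && pib != NULL);
    if (pib->l3Offset + 20 > len) return 0;                 /* no room for IPv4 header */
    if ((pkt[pib->l3Offset] >> 4) != 4) return 0;           /* version field must be 4 */
    if (pkt[pib->l3Offset + 9] != IP_PROTO_TCP) return 0;   /* protocol must be TCP */
    if (pib->l4Offset + 4 > len) return 0;                  /* no room for both ports */
    uint16_t dport = (uint16_t)((pkt[pib->l4Offset + 2] << 8) | pkt[pib->l4Offset + 3]);
    return dport == TELNET_PORT;
}
```

Without the first three checks, a UDP packet or a non-IP frame whose bytes happen to contain 23 at the same position would be counted and dropped as Telnet.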

//==============================================================================
//  Standard TCP Descriptor
//
//  A common layer 4 TCP header utilized in networks per RFC 793.  TCP Options
//  vary in size based upon the option type, often from 1 to 4 bytes.  As these
//  options trail the fixed TCP header, they can be developed as descriptors
//  that sit at location pib.l4Offset+20 or, if nested, with 20 changed as
//  appropriate based upon a runtime variable.
//
//==============================================================================

descriptor TcpStruct
{
  short     sourcePort;                  // Identifies the sending port
  short     destinationPort;             // Identifies the receiving port
  int       sequenceNumber;              // Sequence Number
  int       acknowledgementNumber;       // If the ACK flag is set then the value of
                                         // this field is the next sequence number that
                                         // the receiver is expecting.
  bits byte
  {
    length   :4;                         // # of 32-bit words in TCP Header, including Options
    reserved :4;
  } header;

  bits byte
  {
    cwr:1;   // Congestion window reduced per RFC 3168
    ece:1;   // ECN-Echo per RFC 3168
    urg:1;   // Urgent
    ack:1;   // Acknowledgement
    psh:1;   // Push
    rst:1;   // Reset
    syn:1;   // Synchronize
    fin:1;   // Finish
  } flags;

  short     windowSize;         // The size of the receive window
  short     checksum;           // Used for error-checking of the header and data
  short     urgentPointer;      // If the URG flag is set, then this is an offset from
                                // the sequence number indicating the last urgent byte

} tcp at pib.l4Offset;
typedef byte TcpStructBytes[sizeof(TcpStruct)];
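
To make the two bitfield containers concrete, the following C sketch (not packetC) decodes the same bytes by hand. It assumes the RFC 793 wire layout, in which the first-declared field of each container (length, cwr) occupies the most significant bits of its byte; TcpBits and decode_tcp_bits are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A few of the fields from the "header" and "flags" containers above. */
typedef struct {
    unsigned headerWords;   /* header.length: TCP header size in 32-bit words */
    unsigned ack, rst, syn, fin;
} TcpBits;

/* Decodes bytes 12 and 13 of a TCP header that starts at l4Offset. */
static TcpBits decode_tcp_bits(const uint8_t *pkt, size_t len, size_t l4Offset)
{
    TcpBits b;
    assert(l4Offset + 14 <= len);
    uint8_t hdr   = pkt[l4Offset + 12];   /* length:4, reserved:4 */
    uint8_t flags = pkt[l4Offset + 13];   /* cwr ece urg ack psh rst syn fin */
    b.headerWords = hdr >> 4;
    b.ack = (flags >> 4) & 1u;
    b.rst = (flags >> 2) & 1u;
    b.syn = (flags >> 1) & 1u;
    b.fin = flags & 1u;
    return b;
}
```

A header with no options has headerWords equal to 5 (20 bytes), so the number of option bytes that follow the fixed header is headerWords * 4 - 20.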

Detailed View and Description of Descriptors

Since network packet processing applications increasingly execute at speeds of 10-40 Gigabits per second, they are often programmed for a specific network processor in assembly language or a C variant that exposes processor specifics. Applications typically search packet contents for the presence of packet protocol headers. Determining which protocols are present and where they are located can be computationally expensive. This encourages developers to exploit machine-specific features to increase speed. Hence, finding protocol headers poses performance burdens and encourages coding practices that hamper application portability.

Our approach uses a parallel packet-processing model and a new language, packetC, to enable coding packet applications at a high level. The model requires the host system to represent the incoming packet as a byte array, to locate the protocol headers, and to capture that information in a user-accessible packet information block (PIB). The packetC language redefines C bitfields to provide layouts that will predictably match headers in the packet array. packetC also introduces a descriptor data type, a C-style structure that is superimposed on the packet array at a user-specified offset. By defining a standard protocol in terms of a descriptor and locating it at the appropriate PIB offset value, programmers can access header data in a machine-independent way. These capabilities are applicable to a variety of embedded systems, ranging from routers and switches to blades for larger-scale networking systems.

Pressure for faster network packet processing continues to increase as transmission media become faster (e.g., those specified by SONET/SDH [1, 2] and 10GbE [3] offer speeds in the 10-40 Gigabits per second range) and the volume of data to be transmitted continues its own relentless increase.

Packets contain protocol headers for communications standards such as IPv4 (Figure 19-1). A header is a contiguous set of fields that provide routing, service, and standards data. There are a variety of protocols, each with its own distinctive header content. Since multiple protocol headers may be present in a given packet, and since their offsets from the packet's start vary from packet to packet, a key aspect of packet processing is to determine which headers are present and where they are.

Figure 19-1. IPv4 Protocol (first 80 bits)

The search for headers occurs in a programming environment where applications are often partitioned into lightweight threads that swap themselves out for each memory access. This encourages exploiting low-level machine features to minimize the overhead of locating protocol headers. The resulting machine-specific code can require extensive redesign and recoding when the application is ported.

Complex Descriptor Structure and Union Usage

As an exploration of descriptors will show, packetC pushes C structures and unions to their limits, but in doing so it tightens the rules. One of the most important features of packetC is the ability to cast back and forth between complex data types, such as a structure, and a byte array. Special operators that work on structures, such as offset, and special structure types, such as descriptors, open up a new set of features: casting to a byte array makes array slicing available, including memcopy- and memset-style operations, all on a single data element. Mapping unions onto the structure gives multiple structured views of a data element, which provides significant flexibility in viewing data elements such as headers. Exploring these conversions, and torturing the data a little through type conversion, highlights subtle but critical nuances unique to packetC that support data transformation.

descriptor Ipv4Struct
{
  bits byte { version:4; headerLength:4; } bf;
  bits byte { precedence:3; delay:1; throughput:1; reliability:1; reserved:2; } tos;
  short totalLength;
  short identification;
  bits short { evil:1; dont:1; more:1; fragmentOffset:13; } fragment;
  byte  ttl;
  byte  protocol;
  short checksum;
  int   sourceAddress; 
  int   destinationAddress;
} ipv4 at pib.l3Offset;
typedef byte HeaderType[sizeof(ipv4)];
HeaderType header;        // An array of bytes equal in size to IPv4 header.
int addresses[2];         // Holds both sourceAddress and destinationAddress

// The following code shall be within packet scope

header = (HeaderType) ipv4;   // Copies entire IPv4 header into array header.
addresses[0]  = (int) header[offset(ipv4.sourceAddress):offset(ipv4.sourceAddress)+3];
addresses[1]  = (int) header[offset(ipv4.destinationAddress):offset(ipv4.destinationAddress)+3];

Numerous methods can be used to work with the data in question. While many of these aspects could have been addressed through the use of pointers in C, the methodology provided by packetC does so with strict type enforcement and named fields making it easier to audit code.
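
For readers more familiar with C, the copy-then-slice idea can be approximated as follows. This is a hedged sketch, not a translation the packetC compiler is known to perform: the constants and function names are hypothetical, and the offsets 12 and 16 simply follow from the sizes of the fields that precede sourceAddress and destinationAddress in Ipv4Struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { IPV4_HEADER_BYTES = 20, SRC_ADDR_OFFSET = 12, DST_ADDR_OFFSET = 16 };

/* Assemble a 32-bit value from a 4-byte big-endian slice. */
static uint32_t be32_from_slice(const uint8_t *bytes)
{
    return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
}

/* Copies the fixed IPv4 header out of the packet (the "cast to byte array"
   step), then slices both addresses out of the copy by field offset. */
static void copy_header_and_addresses(const uint8_t *pkt, size_t len, size_t l3Offset,
                                      uint8_t header[IPV4_HEADER_BYTES],
                                      uint32_t addresses[2])
{
    assert(l3Offset + IPV4_HEADER_BYTES <= len);
    memcpy(header, pkt + l3Offset, IPV4_HEADER_BYTES);
    addresses[0] = be32_from_slice(header + SRC_ADDR_OFFSET);
    addresses[1] = be32_from_slice(header + DST_ADDR_OFFSET);
}
```

The C version has to spell out the offsets and the byte order by hand, which is the bookkeeping that named fields and the offset operator take over in the packetC listing.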

Background on Parallel Processing Paradigm and Relation to Descriptors

The packetC approach has three major elements: a model of parallel packet processing, a specialized language to express the model, and an ensemble of heterogeneous processors to implement the language in an embedded hardware product. This section focuses on the specialized language features for protocol processing.

Figure 19-2. A threaded packet processing model vs. a Single Program Multiple Data (SPMD) model

The model's key characteristics for locating a given packet's header locations are as follows:

  • Task granularity is at the level of a complete program that processes a packet (Single Program Multiple Data paradigm shown in Figure 19-2).
  • The host system locates protocol headers in a packet before a copy of the program is executed on that packet.
  • Each program copy operates with copies of its current packet and system-provided information about the presence and location of the packet's layer offsets.

This model is expressed with specialized features for packet manipulation and protocol header processing:

  • A packet main construct corresponds to the model's parallel program copy.
  • Each program copy works on a packet stored as a byte array in big-endian byte form (matching network order).
  • A packet information block (PIB) structure holds predetermined packet offset values and protocol flags.
  • A revamped bitfield construct provides predictable matches with standard protocol fields with bit widths smaller than typical storage units. Note that packetC follows little-endian bit order.

For descriptors, it is important to understand that each executing context has only one packet assigned and that a packet information block (PIB) provides layer offsets specific to that packet. While a descriptor's definition does not change, the location it points to within the packet's byte array depends on the pib and on any other elements specified in the descriptor's at clause.

The host system (CPOS operating systems) manages program copies and ensures that a program copy has two kinds of pre-processed data each time it processes a packet, namely:

  • A copy of the packet in the form of an array of unsigned bytes (in big-endian byte order). (pkt)
  • A collection of values that indicate whether a standard protocol header is present in the packet and, if it is, its offset from the packet array's start. (pib)
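
To give a feel for the second kind of pre-processed data, the sketch below shows in C how a host might locate the layer 3 header before the program runs. It is an assumption-laden illustration only: MockPib and mock_host_parse are hypothetical, the real PIB contents and the way the host computes them are platform-specific, and only Ethernet with optional 802.1Q VLAN tags and IPv4 is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced PIB: the real PIB layout is defined by the platform. */
typedef struct {
    size_t l2Offset;   /* start of the Ethernet header */
    size_t l3Offset;   /* start of the layer 3 header, valid when hasIpv4 != 0 */
    int    hasIpv4;    /* presence flag for an IPv4 header */
} MockPib;

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walks the Ethernet header, skipping any 802.1Q VLAN tags, and records
   where layer 3 starts. Returns 0 on success, -1 if the frame is truncated. */
static int mock_host_parse(const uint8_t *pkt, size_t len, MockPib *pib)
{
    size_t pos = 12;    /* EtherType follows the two 6-byte MAC addresses */

    assert(pkt != NULL && pib != NULL);
    pib->l2Offset = 0;
    pib->l3Offset = 0;
    pib->hasIpv4 = 0;

    if (len < 14) return -1;
    while (read_be16(pkt + pos) == 0x8100) {    /* 802.1Q tag: 4 more bytes */
        pos += 4;
        if (pos + 2 > len) return -1;
    }
    if (read_be16(pkt + pos) == 0x0800) {       /* EtherType for IPv4 */
        pib->l3Offset = pos + 2;
        pib->hasIpv4 = 1;
    }
    return 0;
}
```

An untagged frame yields a layer 3 offset of 14, while a frame with one VLAN tag yields 18. This per-packet variation is exactly what a program no longer has to compute once the host supplies the offsets.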

With this model, developers design a program to process a single packet, instead of designing a set of discrete tasks. However, the particular language constructs used to implement the model greatly influence ease of programming and performance.

The Descriptor Construct

Given your understanding of bit fields and the packet information block, descriptors are easy. The packetC descriptor construct is a structure that corresponds to a portion of the packet array with the same size. Think of it as an alias for an array-slice within the packet.

descriptor typeTagName {
    short   source;
    short   dest;
} descripName at offsetExpression;

A descriptor declaration consists of its structure base type, the descriptor name, and its location—an integer value that defines its offset from the start of the packet array. The key ingredient is the offset location or at clause, which may contain three kinds of elements: compile-time constants, variables with values known only at run time, and PIB offset values.

Figure 19-3. Positioning descriptors on the basis of header offset values in the PIB.

By combining a descriptor's structure definitions with an offset location based on a PIB offset value, we can create a precise, high-level descriptor of a protocol header that gravitates to the correct location each time a new packet is prepared for a packetC program (see Figure 19-3).

descriptor Ipv4Struct {
     bits byte  {
         version : 4;
         headerLength: 4;
     };
     byte      typeOfService;
     short     totalLength;
     short     ipv4_identification;
     short     ipv4_fragmentOffset;
     byte      ipv4_ttl;
     byte      ipv4_protocol;
     short     ipv4_checksum;
     int       ipv4_sourceAddress;
     int       ipv4_destaddress;
     int       ipv4_payload;
} ipv4Header at pib.l3Offset;

Consider the IPv4 protocol shown above. First, the descriptor defines a structure that matches the fields of an IPv4 header. The location clause then states that it will always be found at the packet's layer 3 offset (when a valid layer 3 header is present).

In addition, descriptor at clauses can be constant or can be arbitrarily complex expressions. The latter is especially relevant when the start of one header depends on the presence of optional fields in a preceding header.

For example, if we did not provide Layer 4 offsets, it would be possible to calculate them in terms of an IPv4 Layer 3 header as follows:

descriptor layer4Descr {
      …
} layer4header at pib.l3Offset +
     ( ipv4Header.headerLength * 4 );
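
The arithmetic in that at clause can be checked with a small C sketch (not packetC). The function name is hypothetical, and the validity checks are additions for illustration: an IPv4 headerLength below 5 words cannot describe a valid header, and the computed offset must fall inside the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the layer 4 offset derived from the IPv4 header at l3Offset,
   or -1 when the header length field is invalid or exceeds the packet. */
static long l4_offset_from_ipv4(const uint8_t *pkt, size_t len, size_t l3Offset)
{
    assert(pkt != NULL);
    if (l3Offset >= len) return -1;
    unsigned ihl = pkt[l3Offset] & 0x0Fu;      /* headerLength, in 32-bit words */
    if (ihl < 5) return -1;                    /* shorter than the fixed header */
    if (l3Offset + ihl * 4u > len) return -1;  /* header runs past the packet */
    return (long)(l3Offset + ihl * 4u);        /* l3 offset + headerLength * 4 */
}
```

With a layer 3 offset of 14, a header without options (headerLength 5) places layer 4 at offset 34, and a header with one 4-byte option (headerLength 6) places it at 38.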

The descriptor construct is also useful for describing stacks of protocols, i.e., groups of interrelated protocol headers that appear in a packet as a group, such as:

  • Layer 2: Ethernet
  • Layer 3: IP (e.g., IPv4 or IPv6)
  • Layer 4: TCP or UDP

The descriptor construct provides a clear way to define protocol headers, find them, and connect them. However, the greatest value of this language feature may be that, combined with PIB information, it makes possible a concise, readable, and maintainable coding style.

When descriptors are combined with PIB enumeration types and layer information, a very readable, maintainable kind of packet-processing application code can be created.

// Process Layer 4 scenarios
if ( pib.l4Offset != NULL ) {
     switch ( pib.L4_type ) {
     case L4TYPE_TCP: {…}
     case L4TYPE_UDP: {…}
     …
     }   // end switch
}
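
The same dispatch pattern can be written out in C as a rough sketch. MockL4Type, MockPib, and l4_header_bytes are hypothetical names, not the platform's PIB definitions; the example returns the layer 4 header size, using the TCP length nibble at byte 12 and the fixed 8-byte UDP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer 4 type tag and PIB subset, for illustration only. */
typedef enum { MOCK_L4_NONE, MOCK_L4_TCP, MOCK_L4_UDP, MOCK_L4_OTHER } MockL4Type;
typedef struct { size_t l4Offset; MockL4Type l4Type; } MockPib;

/* Returns the layer 4 header size in bytes, or 0 when it is not known. */
static size_t l4_header_bytes(const uint8_t *pkt, size_t len, const MockPib *pib)
{
    switch (pib->l4Type) {
    case MOCK_L4_TCP:
        assert(pib->l4Offset + 13 <= len);
        return (size_t)(pkt[pib->l4Offset + 12] >> 4) * 4;  /* words to bytes */
    case MOCK_L4_UDP:
        return 8;                                           /* fixed UDP header */
    default:
        return 0;                                           /* none or unhandled */
    }
}
```

Each case returns directly, so no case falls through into the next; a version that uses statements instead of returns would need a break at the end of each case.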

Although clear, streamlined application coding is a non-trivial achievement, these applications are deployed in embedded environments where performance has great importance.

Figure 19-4. Mapping packetC language features to code samples

Figure 19-4 highlights examples of packetC language features, with an emphasis on their relationship to packet processing and on specific code examples. Portability, automation, and clarity are not novel concepts, although applying them consistently yields more intuitive code. Specifically, descriptors hide complexity and remove error-prone pointer usage when accessing protocol fields within packets. Because layer offsets can vary from one packet to another and packet headers can have complex construction, descriptors hide and automate both of these computations while providing an added benefit of security.

Impacts on Performance

Two kinds of performance impacts characterize the packet protocol approach embodied in CloudShield's model and in the packetC language constructs reviewed above:

  • Pre-calculating selected Layer offsets and type data offers an opportunity for application speed-up; this will only be realized if the calculations are done faster than equivalent functionality the application would have provided.
  • Knowing the locations of entire headers within the packet affords opportunities for rapidly extracting the entire header or individual fields, depending on how the PIB and packet array are implemented—and on whether the host system's instruction set can be exploited to speed header reading and writing.

Pre-calculating headers' presence and characteristics can be done in three basic ways: by using much the same coding approach a high-level language user might employ, by using a machine-specific instruction level, or by using specialized hardware or firmware.

In CloudShield fielded systems, dedicated FPGAs perform the pre-calculations, provisioning each packet main with a packet array and with PIB contents.

The second impact concerns the mechanics of reading or writing individual header fields, which are heavily influenced by how a descriptor is implemented. For example, a system could implement descriptors as ordinary structures and implement the packet array as a buffer or an ordinary 1-dimensional array. Pieces of the array could then be read into or written from structure locations, accessed in terms of field offsets from the structure's starting address.

Alternatively, a descriptor can be treated purely as an alias for a slice of the array holding the packet. Thus, descriptor access can be treated either in terms of chained offsets or of array indexing. In either case, packetC implementers can exploit the host platform ISA and addressing modes to speed access operations.
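
The difference between the two implementation strategies can be sketched in C. This is an illustration under stated assumptions, not a description of any shipping implementation: Ports and the four functions are hypothetical, and the "descriptor" is a two-field structure of 16-bit ports stored in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t source; uint16_t dest; } Ports;

/* Strategy 1: copy the packet slice into an ordinary structure... */
static Ports ports_load(const uint8_t *pkt, size_t len, size_t at)
{
    Ports p;
    assert(at + 4 <= len);
    p.source = (uint16_t)((pkt[at] << 8) | pkt[at + 1]);
    p.dest   = (uint16_t)((pkt[at + 2] << 8) | pkt[at + 3]);
    return p;
}

/* ...and copy the structure back once the fields have been updated. */
static void ports_store(uint8_t *pkt, size_t len, size_t at, Ports p)
{
    assert(at + 4 <= len);
    pkt[at]     = (uint8_t)(p.source >> 8);
    pkt[at + 1] = (uint8_t)(p.source & 0xFF);
    pkt[at + 2] = (uint8_t)(p.dest >> 8);
    pkt[at + 3] = (uint8_t)(p.dest & 0xFF);
}

/* Strategy 2: the descriptor is only an alias, so each field access
   indexes the packet array directly at (descriptor offset + field offset). */
static uint16_t alias_get_dest(const uint8_t *pkt, size_t len, size_t at)
{
    assert(at + 4 <= len);
    return (uint16_t)((pkt[at + 2] << 8) | pkt[at + 3]);
}

static void alias_set_dest(uint8_t *pkt, size_t len, size_t at, uint16_t v)
{
    assert(at + 4 <= len);
    pkt[at + 2] = (uint8_t)(v >> 8);
    pkt[at + 3] = (uint8_t)(v & 0xFF);
}
```

With the copying strategy, a change to the structure is not visible in the packet until ports_store runs; with the alias strategy, every write lands in the packet immediately and no separate copy can go stale.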

In current implementations, both the PIB and the packet array are accorded special storage and a descriptor is treated as a complex alias of a packet array slice. CloudShield systems also manipulate ISA specifics to speed up reads and writes wherever feasible.
