Using GAWK for network programming

Networking has been available in GAWK since version 3.1, following the addition of two-way pipelines to a coprocess on the same system. Networking extends that idea: it is a two-way connection to a process on another system, made over TCP/IP. Before we go further with networking in GAWK, we must understand the basic model of network communication, in which one system acts as a client and another as a server.

The server is the system that provides the service, such as a web server or email server. It is the system to which the connection is made. The server keeps waiting in a listening state to receive requests for connections.

The client is the system that requests a service and initiates the connection. In the TCP/IP model, each end of a connection is identified by an IP address and port pair. While the connection is in place, the port used at each end is unique and cannot be used by other processes on the same system.

The AWK programming language was developed as a pattern-matching language for text manipulation; however, GAWK has advanced features, such as file-like handling of network connections. We can perform simple TCP/IP connection handling in GAWK with the help of special filenames. GAWK extends the two-way I/O mechanism of the |& operator to simple networking through these special filenames, which hide the complex details of socket programming from the programmer.

The special filename for network communication is made up of multiple fields, all of which are mandatory. The syntax of the filename is as follows:

/net-type/protocol/local-port/remote-host/remote-port

The fields are separated from one another by forward slashes, and every field must be present. If a field does not apply to the chosen protocol, or you want the system to pick a default value for it, set it to 0. The following list explains the fields that make up the special filename:

  • net-type: Its value is inet4 for IPv4, inet6 for IPv6, or inet to use the system default (which is generally IPv4).
  • protocol: It is either tcp or udp, for a TCP or UDP connection over IP. TCP is recommended for most networking tasks; UDP is used when low overhead is a priority.
  • local-port: Its value decides which port on the local machine is used for communication with the remote system. On the client side, it is generally set to 0 so that the system picks any free port. On the server side, it is a value other than 0, because the service is provided on a specific, publicly known port number or service name, such as http, smtp, and so on.
  • remote-host: It is the name of the remote host at the other end of the connection. On the server side, it is set to 0 to indicate that the server accepts connections from any host. On the client side, it names one specific remote host and hence is always different from 0. The host can be given either as a symbolic name, such as www.google.com, or as a numeric address, such as 123.45.67.89.
  • remote-port: It is the port on the remote machine used for the communication. For clients, it is a value other than 0 that identifies the port on the remote machine to connect to. For servers, it is usually set to 0 so that clients can connect from any port. We can use a service name here, such as ftp or http, or a port number, such as 21 or 80.
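
To show how these fields come together, here is a minimal sketch of an echo exchange over TCP. It is an illustration under stated assumptions, not a definitive implementation: the port number 8080, the host localhost, the role variable, and the file name demo.awk are all chosen for this example only.

```awk
# demo.awk: start one copy with -v role=server, then another with -v role=client.
# Port 8080 and host "localhost" are assumptions made for this illustration.
BEGIN {
    port = 8080

    if (role == "server") {
        # Server: fixed local port; remote host and remote port are 0,
        # so a client on any host and any port may connect.
        conn = "/inet/tcp/" port "/0/0"
        # The first read waits until a client connects and sends a line.
        if ((conn |& getline request) > 0)
            print "echo: " request |& conn
    } else {
        # Client: local port 0 lets the system pick any free port;
        # the remote host and remote port are fixed.
        conn = "/inet/tcp/0/localhost/" port
        print "hello" |& conn
        # Read one line of reply over the same two-way connection.
        if ((conn |& getline reply) > 0)
            print reply
    }
    close(conn)
}
```

Run the server first with gawk -v role=server -f demo.awk, then the client in a second terminal with gawk -v role=client -f demo.awk. If the port is free and the two processes can reach each other, the client is expected to print the line echoed back by the server. If no server is listening, GAWK by default stops with an error when the client tries to open the connection.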