5.8. The Web Request/Response Model

In this section we'll walk through the steps necessary to access a page over the Internet. We use the WebRequest, WebResponse, and Uri classes. WebRequest and WebResponse are defined within the System.Net namespace; Uri is defined within System. Let's create a class called PageReader. First, we need a Uniform Resource Locator (URL) address. The user may optionally pass a string representation of the address to the PageReader constructor; otherwise, by default, we use the address of my company's home page:

public class PageReader
{
    private Uri uri;
    private bool validateUrl( string url ) { ... }

    public PageReader( string url )
    {
        if ( url != null && url != String.Empty &&
             validateUrl( url ))
                uri = new Uri( url );
    }

    public PageReader()
          : this( "http://www.objectwrite.com" ) {}

    // ...
}

The Uri constructor parses the string and translates it into a canonical notation: the scheme and host name are converted to lowercase, and characters that cannot appear literally are escaped. (For example, C# is turned into C%23.) An invalid string results in a UriFormatException being thrown. The string must represent an absolute address. For example, passing it www.amazon.com is not acceptable. A scheme prefix such as http:// or file:// must be present in the string.
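As a minimal sketch of the absolute-address requirement, the following helper wraps the constructor in a try block and reports whether a string is accepted. The class name UriCheck and the method name ParseOrNull are hypothetical, chosen for illustration; this is not the validateUrl() method whose body is elided above.

```csharp
using System;

public class UriCheck
{
    // Hypothetical helper: returns the parsed Uri, or null if the
    // string is empty or is not an acceptable absolute address.
    public static Uri ParseOrNull( string url )
    {
        if ( url == null || url == String.Empty )
             return null;

        try
        {
            return new Uri( url );
        }
        catch ( UriFormatException )
        {
            // the string could not be parsed as an absolute URI
            return null;
        }
    }

    public static void Main()
    {
        string[] candidates = {
            "http://www.amazon.com",   // scheme present
            "www.amazon.com"           // scheme missing
        };

        foreach ( string s in candidates )
        {
            Uri u = ParseOrNull( s );
            Console.WriteLine( "{0}: {1}", s,
                u == null ? "rejected" : "accepted, host " + u.Host );
        }
    }
}
```

Returning null rather than letting the exception propagate is a design choice made for this sketch; a caller that needs to report why the string failed would catch the UriFormatException itself.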

The Uri class provides read-only access to various aspects of the resource, such as Host, HostNameType, Port, Query, and so on. For example, given the following url string with an embedded query:

string url = "http://www.amazon.com//exec/obidos/search-handle-form/" +
             "002-9257402-0511232?index=book&field-keywords=C#";

we can query various properties of the resource within the Uri as follows:

Uri uri = new Uri( url );
Console.WriteLine( "Uri: "           + uri.AbsoluteUri );
Console.WriteLine( "Uri host: "      + uri.Host );
Console.WriteLine( "Uri host type: " + uri.HostNameType );
Console.WriteLine( "Uri port: "      + uri.Port );
Console.WriteLine( "Uri path: "      + uri.AbsolutePath );
Console.WriteLine( "Uri query: "     + uri.Query );
Console.WriteLine( "Uri ToString: "  + uri.ToString() );

When compiled and executed, this code generates the following output:

Uri: http://www.amazon.com/exec/obidos/search-handle-form/002-9257402-0511232?index=book&field-keywords=C%23
Uri host: www.amazon.com
Uri host type: Dns
Uri port: 80
Uri path: /exec/obidos/search-handle-form/002-9257402-0511232
Uri query: ?index=book&field-keywords=C%23
Uri ToString: http://www.amazon.com/exec/obidos/search-handle-form/002-9257402-0511232?index=book&field-keywords=C#

The representation within the Uri object is immutable. If we wish to modify the properties, we should use the UriBuilder utility class. Uri and UriBuilder are analogous to String and StringBuilder in terms of when we use each.
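The division of labor between the two classes can be sketched as follows. The class and method names (UriBuilderSketch, WithPortAndPath) and the port and path values are invented for illustration; only the UriBuilder constructor and its Port, Path, and Uri properties come from the library.

```csharp
using System;

public class UriBuilderSketch
{
    // Hypothetical helper: returns a new Uri that differs from the
    // original only in its port and path. The original is unchanged.
    public static Uri WithPortAndPath( Uri original, int port, string path )
    {
        // copy the components of the immutable Uri into a builder
        UriBuilder builder = new UriBuilder( original );

        builder.Port = port;
        builder.Path = path;

        // the Uri property produces a new, immutable Uri object
        return builder.Uri;
    }

    public static void Main()
    {
        Uri original = new Uri( "http://www.objectwrite.com/" );
        Uri modified = WithPortAndPath( original, 8080, "/books/index.html" );

        Console.WriteLine( original.AbsoluteUri );
        Console.WriteLine( modified.AbsoluteUri );
    }
}
```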

Once we have the Uri object, the next step is to create a WebRequest object. We do this through the static member function Create(), passing it the Uri object—for example,

WebRequest wreq = WebRequest.Create( uri );

The WebRequest object returned by Create() is an instance of a derived class that supports a specific protocol, such as HTTP or FTP. The details of that protocol, however, are encapsulated within the class hierarchy that WebRequest represents.

We program against a simpler, less error-prone, protocol-neutral set of general operations that shield us from the low-level intricacies of making the Internet connection. The network expertise is encapsulated within the specialized derived classes, which by default are hidden from us.

If the high-level interface is not flexible enough for our request requirements, we can explicitly downcast the WebRequest object to one of its derived classes. For example, the HttpWebRequest class manages the details of an HTTP Internet connection:

WebRequest wreq = WebRequest.Create( uri );

// the downcast to the specific derived instance
HttpWebRequest hwreq = ( HttpWebRequest )wreq;
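An explicit cast throws an InvalidCastException if the request turns out not to be an HTTP request; a file:// address, for example, yields a FileWebRequest. A more defensive sketch uses the as operator and tests the result. The class and method names, the user-agent string, and the timeout value below are assumptions made for illustration.

```csharp
using System;
using System.Net;

public class DowncastSketch
{
    // Applies HTTP-specific settings when the request supports them,
    // and returns the name of the protocol-specific class.
    public static string Configure( WebRequest wreq )
    {
        // as yields null, rather than throwing, on a failed downcast
        HttpWebRequest hwreq = wreq as HttpWebRequest;

        if ( hwreq != null )
        {
            hwreq.UserAgent = "PageReaderSketch/1.0"; // invented value
            hwreq.Timeout   = 10000;                  // milliseconds
        }

        return wreq.GetType().Name;
    }

    public static void Main()
    {
        string[] urls = {
            "http://www.objectwrite.com",
            "file:///C:/temp/page.html"   // hypothetical local file
        };

        // Create() only builds the request object; nothing is sent
        foreach ( string url in urls )
        {
            WebRequest wreq = WebRequest.Create( new Uri( url ));
            Console.WriteLine( "{0} -> {1}", url, Configure( wreq ));
        }
    }
}
```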

GetResponse() sends the request from the client application to the server identified in the Uri object. It returns a WebResponse object that provides access to the data returned by the server—for example,

WebResponse wresp = wreq.GetResponse();

The data returned by the server is accessed through a System.IO.Stream class object that is returned from GetResponseStream(). At this point, we're back to our input-from-a-stream processing model. For example, in this fragment we tuck away each line within an ArrayList object:

Stream       wrespStream = wresp.GetResponseStream();
StreamReader wsrdr       = new StreamReader(wrespStream);
ArrayList    webData     = new ArrayList();
string       data        = null;

while (( data = wsrdr.ReadLine()) != null )
          webData.Add( data );
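The fragment above leaves the response and its stream open, and it does not allow for a failed request. One way to handle both, sketched here under the assumption that a failed request should simply yield null, is to wrap the response and reader in using statements and to catch WebException. The class name PageLines and the method name Read are hypothetical.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Net;

public class PageLines
{
    // Returns the lines of the resource, or null if the request fails.
    public static ArrayList Read( Uri uri )
    {
        ArrayList webData = new ArrayList();

        try
        {
            WebRequest wreq = WebRequest.Create( uri );

            // both objects are released when the block exits,
            // whether normally or through an exception
            using ( WebResponse wresp = wreq.GetResponse() )
            using ( StreamReader wsrdr =
                        new StreamReader( wresp.GetResponseStream() ))
            {
                string data;
                while (( data = wsrdr.ReadLine()) != null )
                         webData.Add( data );
            }
        }
        catch ( WebException e )
        {
            // Status distinguishes timeouts, name-resolution
            // failures, protocol errors, and so on
            Console.WriteLine( "request failed: {0}", e.Status );
            return null;
        }

        return webData;
    }

    public static void Main()
    {
        ArrayList lines = Read( new Uri( "http://www.objectwrite.com" ));
        if ( lines != null )
             Console.WriteLine( "read {0} lines", lines.Count );
    }
}
```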

The following code segment examines each line stored in the ArrayList, looking for an explicit URL address beginning with http://:

Console.WriteLine( "read {0} lines of text from {1}",
                       webData.Count, uri.AbsoluteUri );

ArrayList webUrls = new ArrayList();
foreach ( string s in webData )
{
      int  pos, nextPos;
      int  spacePos, quotePos;
      char space = ' ', quote = '"';

      if (( pos = s.IndexOf( "http://" )) != -1 )
      {
          spacePos = s.IndexOf( space, pos );
          quotePos = s.IndexOf( quote, pos );
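          // IndexOf() returns -1 when a delimiter is absent; use
          // whichever delimiter is present and comes first, or the
          // end of the line when neither one is found
          if ( spacePos == -1 )
               nextPos = quotePos;
          else if ( quotePos == -1 )
               nextPos = spacePos;
          else nextPos = Math.Min( spacePos, quotePos );

          if ( nextPos == -1 )
               nextPos = s.Length;

          string surl = s.Substring( pos, nextPos - pos );
          if ( ! webUrls.Contains( surl ))
                 webUrls.Add( surl );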
      }
}

Console.WriteLine("There are {0} url references",webUrls.Count);
webUrls.Sort();

foreach ( string s in webUrls )
          Console.WriteLine( "\t{0}", s );

When this code is executed against my home Web page, the following output is generated:

read 117 lines of text from http://www.objectwrite.com
There are 5 url references
http://cseng.awl.com/bookdetail.qry?ISBN=0-201-30993-9&ptype=0
http://www.amazon.com/exec/obidos/ASIN/0135705819/qid%3D902875557/sr%3D1-6/002-5252584-4839230
http://www.awl.com/cseng/titles/0-201-82470-1/
http://www.awl.com/cseng/titles/0-201-83454-5/
http://www.objectwrite.com/
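
The scan above records at most one address per line. A variation that collects every occurrence can be sketched by restarting the search at the point where the previous address ended. The class name UrlScan, the method name Collect, and the sample line are invented for illustration, and the sketch keeps the same simple delimiters (a space or a double quote) rather than attempting a full HTML parse.

```csharp
using System;
using System.Collections;

public class UrlScan
{
    // Adds every distinct "http://" address found in line to urls.
    public static void Collect( string line, ArrayList urls )
    {
        char[] delimiters = { ' ', '"' };
        int    pos = 0;

        while (( pos = line.IndexOf( "http://", pos )) != -1 )
        {
            // the address runs to the nearest delimiter,
            // or to the end of the line if there is none
            int end = line.IndexOfAny( delimiters, pos );
            if ( end == -1 )
                 end = line.Length;

            string surl = line.Substring( pos, end - pos );
            if ( ! urls.Contains( surl ))
                   urls.Add( surl );

            // resume the search just past this address
            pos = end;
        }
    }

    public static void Main()
    {
        // invented sample line, not taken from the page above
        string line =
            "<a href=\"http://a.example/x\">a</a> see http://b.example/y";

        ArrayList urls = new ArrayList();
        Collect( line, urls );

        foreach ( string s in urls )
                  Console.WriteLine( "\t{0}", s );
    }
}
```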
