.NET Protocol Classes

Several classes in the System.Net namespace make it easy to connect to a Web site. These classes include WebRequest, WebResponse, HttpWebRequest, HttpWebResponse, FileWebRequest, FileWebResponse, and WebClient.

These classes provide the following features:

  • Support for HTTP, HTTPS, FILE, and so on

  • Asynchronous development

  • Simple methods for uploading and downloading data

This might seem like too many classes to handle; however, with the exception of WebClient, every request class derives from WebRequest and every response class derives from WebResponse.

Support for HTTP, HTTPS, and FILE

The request object is generated by the static WebRequest.Create method, and the WebResponse object is generated by the request object's GetResponse method. The type of request that is generated (HTTP or file) depends on the scheme of the URL that is passed in. If you use http://www.microsoft.com as your URL, then WebRequest.Create generates an HttpWebRequest object. Using SSL/TLS with https:// is handled automatically by the HttpWebRequest object. In contrast, if you are trying to access a file such as file:///C:\temp\sample.txt, then a FileWebRequest object is generated.
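As a minimal sketch of this scheme-based selection, the following program asks WebRequest.Create for a request object for three URLs and prints the runtime type of each one. The class name SchemeDemo and the helper RequestTypeFor are illustrative names that are not part of the samples for this chapter. Creating a request does not contact the server, so the sketch should run without a network connection. On recent .NET releases, these classes may produce an obsolescence warning at compile time.

```csharp
using System;
using System.Net;

public static class SchemeDemo
{
    // Returns the name of the concrete request class chosen for the URL's scheme.
    public static string RequestTypeFor(string url)
    {
        WebRequest request = WebRequest.Create(url);
        return request.GetType().Name;
    }

    public static void Main()
    {
        // No connection is made here; only the request objects are created.
        Console.WriteLine(RequestTypeFor("http://www.microsoft.com"));   // HttpWebRequest
        Console.WriteLine(RequestTypeFor("https://www.microsoft.com"));  // HttpWebRequest
        Console.WriteLine(RequestTypeFor("file:///C:/temp/sample.txt")); // FileWebRequest
    }
}
```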

The WebClient class is similar to the WebRequest classes in that the protocol is handled automatically. If a URL such as https://www.microsoft.com/net/ is given to the WebClient, then the SSL/TLS communication is handled automatically. Similarly, if you use a “file” URL, it is also transparently handled. Try using the following URL:

file:///D:/Program%20Files/Microsoft.Net/FrameworkSDK/Samples/StartSamples.htm.

The WebClient instance recognizes that the file protocol is to be used and reads the file directly rather than going through IIS or another HTTP server.
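To try this without depending on a particular SDK installation path, a sketch along the following lines writes a small temporary file and reads it back through WebClient by way of its file URL. The class name FileUrlDemo, the helper ReadViaWebClient, and the sample text are assumptions made for illustration only.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class FileUrlDemo
{
    // Reads a local file through WebClient by converting its path to a file:// URL.
    public static string ReadViaWebClient(string path)
    {
        string fileUrl = new Uri(Path.GetFullPath(path)).AbsoluteUri;
        using (WebClient wc = new WebClient())
        {
            return Encoding.ASCII.GetString(wc.DownloadData(fileUrl));
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello from a file URL");
            Console.WriteLine(ReadViaWebClient(path));
        }
        finally
        {
            // Remove the temporary file whether or not the read succeeded.
            File.Delete(path);
        }
    }
}
```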

One of the easiest ways to interact with a Web page is using the WebClient class. Listing 12.37 shows an example of using the WebClient class. The complete source for this sample is included with the other source in the directory WebPage/WebBuffer.

Listing 12.37. Retrieving the Content of a Web Page Using WebClient
using System;
using System.Net;
using System.Text;
using System.IO;

public class WebBuffer
{
    public static void Main(string[] args)
    {
        try
        {
            string address = "http://www.microsoft.com";
            if(args.Length == 1)
                address = args[0];

            WebClient wc = new WebClient();

            // Open the stream once and hand that same stream to the reader.
            Stream stream = wc.OpenRead(address);

            StreamReader reader = new StreamReader(stream, Encoding.ASCII);
            Console.WriteLine(reader.ReadToEnd());
        }
        catch(Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}

Even simpler, the three lines in Listing 12.37

Stream stream = wc.OpenRead(address);
StreamReader reader = new StreamReader(stream, Encoding.ASCII);
Console.WriteLine(reader.ReadToEnd());

can be replaced with a single line:

Console.WriteLine(Encoding.ASCII.GetString(wc.DownloadData(address)));

This achieves the same effect: DownloadData returns the entire response as a byte array, which is then decoded as ASCII text.

Alternatively, but just as easy, you can use the WebRequest/WebResponse classes. Listing 12.38 shows how to use the WebRequest/WebResponse classes to retrieve the contents of a Web page. The source for this listing is under the directory WebPage/HttpBuffer.

Listing 12.38. Retrieving the Content of a Web Page Using WebRequest/WebResponse Classes
using System;
using System.Net;
using System.Text;
using System.IO;
public class HttpBuffer
{
    public static void Main(string[] args)
    {
        try
        {
            string address = "http://www.microsoft.com";
            if(args.Length == 1)
                address = args[0];

            WebRequest request = WebRequest.Create(address);
            WebResponse response = request.GetResponse();
            StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.ASCII);
            Console.WriteLine(reader.ReadToEnd());
            response.Close();
        }
        catch(Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}

You can choose from so many different options! Remember that because the code shown in Listings 12.37 and 12.38 uses either WebClient or WebRequest, you are not restricted to just http:// schemes.

Asynchronous Development

The request object does not do anything until either GetResponse (or its asynchronous version BeginGetResponse discussed later) or GetRequestStream is called. By calling GetResponse, you are telling the request object to connect to the URL given and download the contents that the URL specifies when the request is generated (typically, this is HTML). If a Stream is retrieved via GetRequestStream, commands can be written directly to the server, and GetResponse is called to retrieve the results of those commands.

Because of this separation between the request and the response, you can easily build an asynchronous Web page content extractor. Listings 12.39 through 12.42 illustrate how to use the asynchronous methods to obtain the contents of a Web page. The full source for this sample can be found in WebPage/WebAsynch. Listing 12.39 shows the declaration of the state classes that are used in this sample.

Listing 12.39. Retrieving the Content of a Web Page Using Asynchronous WebRequest/WebResponse Classes
using System;
using System.Net;
using System.Text;
using System.IO;
using System.Threading;

public class AsyncResponseData
{
    public AsyncResponseData (WebRequest webRequest)
    {
        this.webRequest = webRequest;
        this.responseDone = new ManualResetEvent(false);
    }
    public WebRequest Request
    {
        get
        {
            return webRequest;
        }
    }
    public ManualResetEvent ResponseEvent
    {
        get
        {
            return responseDone;
        }
    }
    private WebRequest webRequest;
    private ManualResetEvent responseDone;
}
public class AsyncReadData
{
    public AsyncReadData (Stream stream)
    {
        this.stream = stream;
        bytesRead = -1;
        readDone = new ManualResetEvent(false);
        page = new StringBuilder();
    }
    public Stream GetStream
    {
        get
        {
            return stream;
        }
    }
    public byte [] Buffer
    {
        get
        {
            return buffer;
        }
        set
        {
            buffer = value;
        }
    }
    public int ReadCount
    {
        get
        {
            return bytesRead;
        }
        set
        {
            bytesRead = value;
        }
    }
    public StringBuilder Page
    {
        get
        {
            return page;
        }
        set
        {
            page = value;
        }
    }
    public ManualResetEvent ReadEvent
    {
        get
        {
            return readDone;
        }
    }
    private Stream stream;
    private byte[] buffer = new byte[4096];
    private int bytesRead = 0;
    private StringBuilder page;
    private ManualResetEvent readDone;
}

This first part declares the state classes for the response and the read of the data. Compare this to the AsyncData class in the AsynchSrv sample illustrated in Listings 12.17 through 12.24. These classes have considerably more code than in the AsynchSrv sample.

The only difference is that synchronization events have been encapsulated and accessors have been added for better programming practice. Continue this sample with Listing 12.40.

Listing 12.40. Retrieving the Content of a Web Page Using Asynchronous WebRequest/WebResponse Classes
public class WebAsynch
{
    public static void Main(string[] args)
    {
        try
        {
            string address = "http://localhost/QuickStart/HowTo/";
            if(args.Length == 1)
                address = args[0];

            WebRequest request = WebRequest.Create(address);

            AsyncResponseData ad = new AsyncResponseData (request);
            IAsyncResult responseResult = request.BeginGetResponse(
                                 new AsyncCallback(ResponseCallback),
                                 ad);
            ad.ResponseEvent.WaitOne();
        }
        catch(Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }

Here is the main entry point for the sample. First, a WebRequest object is constructed. Next, an AsyncResponseData object is constructed with a single argument of the WebRequest object. This constructor initializes its members and constructs a synchronization object. With the construction of these two objects, the process of getting the Web page is started with BeginGetResponse. The call to BeginGetResponse immediately returns in all cases. The case that has not been accounted for is if the asynchronous event completed synchronously—in other words, if it were such a short operation that the OS decided it was not worth it to call the callback specified. It has been assumed here that the callback will always occur.

This listing waits for the operation to complete by waiting for the ResponseEvent to be signaled. This event is signaled in the ResponseCallback routine, which is illustrated in Listing 12.41.

Listing 12.41. Retrieving the Content of a Web Page Using Asynchronous WebRequest/WebResponse Classes
private static void ResponseCallback(IAsyncResult result)
{
    AsyncResponseData ar = (AsyncResponseData)result.AsyncState;
    WebRequest request = ar.Request;
    WebResponse response = request.EndGetResponse(result);
    Stream stream = response.GetResponseStream();
    AsyncReadData ad = new AsyncReadData(stream);
    IAsyncResult readResult = stream.BeginRead(ad.Buffer,
                                           0,
                                           ad.Buffer.Length,
                                           new AsyncCallback(ReadCallback),
                                           ad);
    ad.ReadEvent.WaitOne();
    ar.ResponseEvent.Set();
}

Much of this code should be familiar to you by now. The state object that was passed (AsyncResponseData) is retrieved with AsyncState. The WebRequest object is retrieved from it, and EndGetResponse is called to retrieve the WebResponse object. Although the types might change, this sequence is repeated for just about every AsyncCallback delegate that is called. Now you are in position to read the data.

Caution

The stream that is associated with the WebResponse is retrieved using GetResponseStream(). When I first built this example, I was trying to keep it as simple as possible. I tried to read all of the data from the stream by building a StreamReader class and calling the ReadToEnd() as was done with the code illustrated in the synchronous version in Listing 12.38. Doing this caused the application to lock up, and the read never completed. Upon further investigation and after some advice, I concluded that mixing asynchronous code with synchronous code wasn't a good idea. After I made the read asynchronous as well, the read completed and the application worked just fine.

Don't mix synchronous code with asynchronous code. In particular, don't call synchronous methods on objects that were constructed asynchronously (in a callback for instance).


With that lesson learned, I called BeginRead to start an asynchronous read from the stream. Next, I waited for the read to complete. Listing 12.42 continues from Listing 12.41 illustrating the read callback. After the read has completed, you can set the ResponseEvent so that the main thread can fall through and the application can terminate cleanly.

Listing 12.42. Retrieving the Content of a Web Page Using Asynchronous WebRequest/WebResponse Classes
    private static void ReadCallback(IAsyncResult result)
    {
        AsyncReadData ad = (AsyncReadData)result.AsyncState;
        Stream stream = ad.GetStream;
        int bytesRead = stream.EndRead(result);
        if(bytesRead == 0)
        {
            // The end of the read
            Console.WriteLine(ad.Page.ToString());
            ad.ReadEvent.Set();
        }
        else
        {
            ad.Page.Append(Encoding.ASCII.GetString(ad.Buffer, 0, bytesRead));
            IAsyncResult readResult = stream.BeginRead(ad.Buffer,
                                           0,
                                           ad.Buffer.Length,
                                           new AsyncCallback(ReadCallback),
                                           ad);
        }
    }
}

This code retrieves the AsyncReadData object using the AsyncState property of the IAsyncResult interface. From there, EndRead returns the number of bytes read. If the number of bytes is not zero, then the current read is appended to the contents of the page and another read request is queued up. If the number of bytes read is zero, then no more data is available to be read. When no more data is to be read, the page contents are written out to the Console and the ReadEvent is set to indicate that reading has finished. This process is a little more involved than synchronously reading the contents of a Web page, but it is nice to know that the option is available.
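The same BeginRead/EndRead loop can be tried without a Web server. The following sketch applies the pattern to an in-memory stream; the names AsyncReadLoop, ReadState, ReadAll, and OnRead are hypothetical and are not part of the WebAsynch sample, and the 16-byte buffer is deliberately small so that several callbacks are needed. Error handling is left out: if EndRead throws, the event is never set and the caller keeps waiting.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

public static class AsyncReadLoop
{
    // State handed from one read callback to the next.
    private sealed class ReadState
    {
        public readonly Stream Source;
        public readonly byte[] Buffer = new byte[16];
        public readonly StringBuilder Text = new StringBuilder();
        public readonly ManualResetEvent Done = new ManualResetEvent(false);

        public ReadState(Stream source)
        {
            Source = source;
        }
    }

    // Starts the first asynchronous read and blocks until the chain of callbacks finishes.
    public static string ReadAll(Stream source)
    {
        ReadState state = new ReadState(source);
        source.BeginRead(state.Buffer, 0, state.Buffer.Length, OnRead, state);
        state.Done.WaitOne();
        return state.Text.ToString();
    }

    private static void OnRead(IAsyncResult result)
    {
        ReadState state = (ReadState)result.AsyncState;
        int bytesRead = state.Source.EndRead(result);
        if (bytesRead == 0)
        {
            // Zero bytes means end of stream: release the waiting caller.
            state.Done.Set();
            return;
        }
        // Keep what was read, then queue the next read with the same state object.
        state.Text.Append(Encoding.ASCII.GetString(state.Buffer, 0, bytesRead));
        state.Source.BeginRead(state.Buffer, 0, state.Buffer.Length, OnRead, state);
    }

    public static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("<html><body>Asynchronous read loop</body></html>");
        Console.WriteLine(ReadAll(new MemoryStream(data)));
    }
}
```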

Simple Methods for Uploading and Downloading Data

The final key feature of the .NET Protocol classes is that they provide simple and effective methods for uploading and downloading data.

Download to a Buffer

You have already seen many examples of downloading data from a Web page into a buffer. You have looked at methods of reading data from a Web page a chunk at a time, and you have seen methods that put the entire page into a buffer at one time. (Look at the discussion following Listing 12.37 on the DownloadData method of WebClient as an alternative.) There is no need to belabor this issue any further.

Download to a File

After data is in memory, a programmer can write code to write that data to a file with relative ease. Be aware, however, that a method of the WebClient class can do that for you. If you want the data to go directly to a file, then replace code like this:

byte[] buffer = wc.DownloadData(address);
Console.WriteLine(Encoding.ASCII.GetString(buffer));

with

wc.DownloadFile(address, "default.aspx");

The data will be downloaded to a file called default.aspx. Alternatively, you could supply a full path to the file, and the file would be created at the path that is specified.
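A minimal sketch of a complete program built around DownloadFile might look like the following. The class name DownloadToFileDemo, the helper Save, and the choice of the temporary directory as the target location are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Net;

public static class DownloadToFileDemo
{
    // Downloads the resource at the address to targetPath and returns the file size in bytes.
    public static long Save(string address, string targetPath)
    {
        using (WebClient wc = new WebClient())
        {
            wc.DownloadFile(address, targetPath);
        }
        return new FileInfo(targetPath).Length;
    }

    public static void Main(string[] args)
    {
        string address = "http://www.microsoft.com";
        if (args.Length == 1)
            address = args[0];

        // A full path is supplied, so the file is created at that location.
        string targetPath = Path.Combine(Path.GetTempPath(), "default.aspx");
        try
        {
            long size = Save(address, targetPath);
            Console.WriteLine("{0} bytes written to {1}", size, targetPath);
        }
        catch (WebException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Because WebClient selects the protocol from the URL, the address passed to Save can also be a file URL.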

Download with a Stream

If you need a little more control over the read process, you can replace the DownloadFile method in the previous section with the code shown in Listing 12.43.

Listing 12.43. Retrieving the Content of a Web Page Using a Stream
Stream stream = wc.OpenRead(address);

// Now read the stream into a byte buffer, one chunk at a time.
byte[] bytes = new byte[1000];
int numBytesToRead = bytes.Length;
int numBytesRead = 0;
while ((numBytesRead = stream.Read(bytes, 0, numBytesToRead)) > 0)
{
    // Read may return anything from 1 to numBytesToRead;
    // a return value of 0 means end of stream and ends the loop.
    // Console.Write avoids inserting a line break between chunks.
    Console.Write(Encoding.ASCII.GetString(bytes, 0, numBytesRead));
}

stream.Close();
							

Uploading Data

The methods for uploading data to a server include UploadFile, UploadData, and OpenWrite. These methods take arguments similar to those of the download methods. Before testing these methods, make sure that the server is prepared to handle the upload; otherwise, you will get an exception:

System.Net.WebException: The remote server returned an error:  (405) Method Not Allowed.
   at System.Net.HttpWebRequest.CheckFinalStatus()
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   at System.Net.HttpWebRequest.GetResponse()
   at System.Net.WebClient.UploadData(String address, String method, Byte[] data)
   at System.Net.WebClient.UploadData(String address, Byte[] data)
   at WebUpload.Main(String[] args) in webupload.cs:line 18
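An upload that reports this kind of failure more readably could be sketched as follows. This is not the WebUpload sample from the book's source; the class name UploadDemo, the helper DescribeFailure, and the address http://localhost/upload.aspx are hypothetical and stand in for whatever server you have prepared to accept the upload.

```csharp
using System;
using System.Net;
using System.Text;

public static class UploadDemo
{
    // Turns a WebException into a one-line description.
    public static string DescribeFailure(WebException ex)
    {
        HttpWebResponse http = ex.Response as HttpWebResponse;
        if (http != null)
        {
            // The server answered, but with an error status such as 405.
            return "HTTP " + (int)http.StatusCode + " " + http.StatusDescription;
        }
        // No HTTP response at all (for example, the connection failed).
        return "No HTTP response (" + ex.Status + "): " + ex.Message;
    }

    public static void Main(string[] args)
    {
        // Hypothetical upload endpoint; pass a real one on the command line.
        string address = "http://localhost/upload.aspx";
        if (args.Length == 1)
            address = args[0];

        byte[] data = Encoding.ASCII.GetBytes("sample upload data");
        try
        {
            using (WebClient wc = new WebClient())
            {
                // UploadData sends the bytes and returns the body of the server's reply.
                byte[] reply = wc.UploadData(address, data);
                Console.WriteLine(Encoding.ASCII.GetString(reply));
            }
        }
        catch (WebException ex)
        {
            Console.WriteLine(DescribeFailure(ex));
        }
    }
}
```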

Windows Applications

Before concluding, this section will present two more samples. These samples show that all of the APIs and classes presented earlier in this chapter can be integrated easily into a Windows application.

The first sample gets the source for a Web page and displays some rudimentary properties of that connection. Most of this sample is code to display the result, so that code will not be discussed here. The sample is available as part of the source to this book. The core functionality boils down to the few lines in Listing 12.44. The full source to this sample can be found in the directory WebPage/HTMLViewer. This listing comes from HTMLViewer.cs.

Listing 12.44. Retrieving the Content of a Web Page
string strAddress = url.Text.Trim();
strAddress = strAddress.ToLower();
.  .  .  .
// create the HTMLPageGet object
page = new HTMLPageGet(strAddress);
strSource = page.Source;
showSource();

The HTMLPageGet is a class that has been defined and will be shown next. This class is constructed with the URL that the user inputs into the text box. As part of the construction, the class retrieves the Web page associated with the URL given. The Web page is then passed back to the user via the page.Source property. Finally, showSource() is called to display the page.

Now look at the HTMLPageGet class, which does all of the work. Listing 12.45 shows the essential parts of the source. The full source can be found in WebPage/HTMLViewer. This listing comes from the file HTMLPageGet.cs.

Listing 12.45. HTMLPageGet Class
        private WebRequest request;
        private WebResponse response;
        private StringBuilder strSource = new StringBuilder();
        public HTMLPageGet(string url)
        {
            try
            {
                request = WebRequest.Create(url);
                response = request.GetResponse();
                // get the stream of data
                StreamReader sr = new StreamReader(response.GetResponseStream(), Encoding.ASCII);
                string strTemp;
                while ((strTemp = sr.ReadLine()) != null)
                {
                    strSource.Append(strTemp + "\r\n");
                }
                sr.Close();
            }
            catch (WebException ex)
            {
                strSource.Append(ex.Message);
            }

This code snippet shows the essential portions of the class: the request and response objects and a portion of the constructor. It shows how easy it is to communicate with a Web server. The request object is constructed with WebRequest.Create. This method returns either an HttpWebRequest (if the scheme is http:// or https://) or a FileWebRequest (if the scheme is file://). Both of these classes are derived from WebRequest, which is why WebRequest.Create can return either one through a WebRequest reference. If you really want to know which type of object has been returned, then you can run a test like this in C#:

if(request is HttpWebRequest)
{
.  .  .  .  .
}

After the request object has been constructed, a call is made to GetResponse to return the contents of the file or Web page. The contents are read back using an instance of the StreamReader class and appended a line at a time to a StringBuilder.

The application starts out like Figure 12.8.

Figure 12.8. Initial Web application startup.


After the URL has been entered and the Source button has been activated, the application looks like Figure 12.9.

Figure 12.9. Web application after entering a URL.


This application demonstrates that a small amount of code can do a job that used to take substantially more custom code.

One common practice today is to pull information from a Web site by parsing the Web page and extracting the information desired. This is known as screen-scraping. It is far from an ideal solution because the producers of the Web page have no obligation to keep the format the same. Changing the font color or style or even changing the order of the rendered items could radically alter the format of the HTML that is presenting the information, causing the parse or the screen-scrape to fail. Using Web services or SOAP yields a much better way to present information to the user.
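The fragility is easy to demonstrate. In the following sketch, a regular expression pulls a price out of a table cell; the markup, the symbol, and the price are made-up sample data, and the names ScrapeDemo and ExtractLastPrice are hypothetical. Wrapping the price in a bold tag, which changes nothing about the information itself, is enough to make the extraction fail.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ScrapeDemo
{
    // Returns the price text, or null when the expected markup is not found.
    public static string ExtractLastPrice(string html)
    {
        Match m = Regex.Match(html, @"<td class=""last"">\s*(\d+\.\d+)\s*</td>");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Made-up sample markup, not taken from any real site.
        string original = "<tr><td>MSFT</td><td class=\"last\">27.50</td></tr>";
        string restyled = "<tr><td>MSFT</td><td class=\"last\"><b>27.50</b></td></tr>";

        Console.WriteLine(ExtractLastPrice(original) ?? "(not found)"); // 27.50
        Console.WriteLine(ExtractLastPrice(restyled) ?? "(not found)"); // (not found)
    }
}
```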

Now look at a simple application that pulls stock information from a Web site. The application pulls the interesting data from the page and caches it away so that it can be retrieved and displayed. Again, most of the code displays the UI, so that code will not be presented here. Figure 12.10 shows the initial appearance of the stock-quote application. The complete source for this application is in the WebPage/StockInfo directory.

Figure 12.10. Initial screen of stock quote application.


Figure 12.11 shows the raw HTML that forms the source of the stock quote information.

Figure 12.11. Source of the Web page supplying the stock quote information.


Figure 12.12 shows the stock quote information that has been successfully extracted from the Web page source.

Figure 12.12. The quote information.


Figure 12.13 shows what the Web page looks like when viewed with Internet Explorer.

Figure 12.13. Viewing the Web page with Internet Explorer.


The complete source for this project is in the WebPage/StockInfo directory.
