Chapter 4. HTTP: Communicating with Web Servers

Introduction

This chapter demonstrates how to pull data from the Web and use it within your own applications. As mentioned in Chapter 1, Web pages are hosted on computers that run Web server software such as Microsoft Internet Information Services (IIS) or Apache. Hypertext transfer protocol (HTTP) is used to communicate with these applications and retrieve Web sites.

There are many reasons why an application may interact with a Web site, such as the following:

To check for updates and to download patches and upgrades
To retrieve information on data that changes from hour to hour (e.g., share prices, currency conversion rates, weather)
To automatically query data from services operated by third parties (e.g., zip code lookup, phone directories, language translation services)
To build a search engine
To cache Web pages for faster access or to act as a proxy

The first half of this chapter describes how to send data to, and receive data from, Web servers. This includes an example of how to manipulate the HTML data received from the Web server. The chapter concludes with an implementation of a custom Web server, which could be used instead of IIS.

Data mining

Data mining is where an application downloads a Web page and extracts specific information from it automatically. It generally refers to the retrieval of large amounts of data from Web pages that were never designed for automated reading.

A sample application could be a TV guide program that would download scheduling information from TV Web sites and store it in a database for quick reference.

Note

You should always check with Web site administrators whether they permit data mining on their sites because it may infringe copyright or put excessive load on their servers. Unauthorized data mining can result in a Web administrator blocking your IP address or worse!

In order to extract useful data from this HTML, you will need to be well acquainted with the language and good at spotting the patterns of HTML that contain the data required; however, several good commercial products aid developers with data mining from HTML pages, and home-brewed solutions are not always the best idea.

HTTP

HTTP operates on TCP port 80 by default and is described definitively in RFC 2616. The protocol is quite straightforward: the client opens a TCP connection to port 80 on the server, the client sends an HTTP request, the server sends back an HTTP response, and the server closes the TCP connection (unless both sides have agreed to keep it alive).

The HTTP request

The simplest HTTP request is as follows:

GET /
<enter><enter>

Tip

On some servers, it is necessary to specify the DNS name of the server in the GET request.

This request will instruct the server to return the default Web page; however, HTTP requests are generally more complex, such as the following:

GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-powerpoint, application/vnd.ms-excel,
application/msword, */*
Accept-Language: en-gb
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT
5.1; .NET CLR 1.0.3705)
Host: 127.0.0.1:90
Connection: Keep-Alive

This tells the server several things about the client, such as the type of browser and what sort of data the browser can render.
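To make the layout of a request concrete, the following is a minimal sketch that assembles the text of a GET request. The class name `RequestSketch`, the method `BuildGet`, and the user-agent string are hypothetical names chosen for illustration; only the request line, the header names, and the blank line that ends the headers come from the protocol itself.

```csharp
using System;
using System.Text;

public static class RequestSketch
{
    // Builds the text of a GET request. Each line ends with CR LF, and an
    // empty line (the "double-line feed") marks the end of the headers.
    public static string BuildGet(string path, string host, string userAgent)
    {
        StringBuilder request = new StringBuilder();
        request.Append("GET " + path + " HTTP/1.1\r\n");
        request.Append("Host: " + host + "\r\n");
        request.Append("User-Agent: " + userAgent + "\r\n");
        request.Append("Connection: Keep-Alive\r\n");
        request.Append("\r\n");
        return request.ToString();
    }

    public static void Main()
    {
        // The host and user agent are illustrative values only.
        Console.Write(BuildGet("/", "127.0.0.1:90", "SketchClient/1.0"));
    }
}
```

The string returned by `BuildGet` is what would be written to the TCP connection. The sketch only builds the text; it does not open a socket.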

Table 4.1 lists the standard HTTP request headers.

Table 4.1. Standard HTTP request headers.

HTTP header	Meaning
`Accept`	Used to specify which media (MIME) types are acceptable for the response. The value `*/*` indicates all media types, and `type/*` indicates all subtypes of that type. In the example above, `application/msword` indicates that the browser can display Word documents.
`Accept-Charset`	Used to specify which character sets are acceptable in the response. In the case where a client issues `Accept-Charset: iso-8859-5`, the server should be aware that the client cannot render Japanese (Unicode) characters.
`Accept-Encoding`	Used to specify if the client can handle compressed data. In the above example, the browser is capable of interpreting GZIP compressed data.
`Accept-Language`	Used to indicate the language preference of the user. This can be used to estimate the geographic location of a client; `en-gb` in the above example may indicate that the client is from the United Kingdom.
`Authorization`	Used to provide authentication between clients and servers. Refer to RFC 2617 or Chapter 9 for more details.
`Host`	Indicates the intended server host name or IP address, as typed in at the client. This could differ from the actual destination IP address if the request were to go via a proxy. The host address `127.0.0.1:90` in the above example indicates that the client was on the same computer as the server, which was running on port 90.
`If-Modified-Since`	Indicates that the page is not to be returned if it has not been changed since a certain date. This permits a caching mechanism to work effectively. An example is `If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT`.
`Proxy-Authorization`	This provides for authentication between clients and proxies. Refer to RFC 2617 or Chapter 9 for more details.
`Range`	This provides for a mechanism to retrieve a section of a Web page by specifying which ranges of bytes the server should return; this may not be implemented on all servers. An example is `bytes=500-600,601-999`.
`Referer`	This indicates the last page the client had visited before going to this specific URL. An example is `Referer: http://www.w3.org/index.html`. (The misspelling of “referrer” is not a typing mistake!)
`TE`	Transfer encoding (TE) indicates which extension transfer encodings the client can accept in the response and whether it can accept trailer fields in a chunked transfer encoding.
`User-Agent`	Indicates the type of device the client is running from. In the above example, the browser was Internet Explorer 6.
`Content-Type`	Used in `POST` requests. It indicates the MIME type of the posted data, which is usually `application/x-www-form-urlencoded`.
`Content-Length`	Used in `POST` requests. It indicates the length of the data immediately following the double-line feed.

Note

Device-specific HTTP request headers are prefixed with “x-”.

GET and POST are the most common HTTP commands. There are others, such as HEAD, OPTIONS, PUT, DELETE, and TRACE, and interested readers can refer to RFC 2616 for information on these HTTP commands.

Web developers may be familiar with GET and POST from the HTML form tag, which takes the form:

<form name="myForm" action="someDynamicPage" method="POST">

The difference from a user’s point of view is that form parameters do not appear in the URL bar of the browser when submitting this form. These parameters are contained in the region immediately following the double-line feed. A POST request resembles the following:

POST / HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 17

myField=some+text
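The relationship between the posted data and the `Content-Length` header can be sketched as follows. This is an illustrative fragment rather than a complete client: `PostSketch` and `BuildPost` are hypothetical names, and the request deliberately mirrors the short example above (a real HTTP/1.1 request would normally carry a `Host` header as well).

```csharp
using System;
using System.Text;

public static class PostSketch
{
    // Builds the text of a POST request for already URL-encoded form data.
    public static string BuildPost(string path, string formData)
    {
        // Content-Length counts the bytes that follow the blank line.
        int length = Encoding.ASCII.GetByteCount(formData);
        return "POST " + path + " HTTP/1.1\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"
             + formData;
    }

    public static void Main()
    {
        // "myField=some+text" is 17 bytes long, matching the example above.
        Console.WriteLine(BuildPost("/", "myField=some+text"));
    }
}
```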

The HTTP response

When the server receives an HTTP request, it retrieves the requested page and returns it along with an HTTP header. This is known as the HTTP response.

A sample HTTP response is as follows:

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Sun, 05 Jan 2003 20:59:47 GMT
Connection: Keep-Alive
Content-Length: 25
Content-Type: text/html
Set-Cookie: ASPSESSIONIDQGGQQFCO=MEPLJPHDAGAEHENKAHIHGHGH;
path=/
Cache-control: private

This is a test html page!

Table 4.2. Standard HTTP response headers.

HTTP response header	Meaning
`ETag`	The entity tag is used in conjunction with the `If-` prefixed HTTP request headers (e.g., `If-None-Match`). Servers rarely return it.
`Location`	It is used in redirects, where the browser is requested to load a different page. Used in conjunction with HTTP 3xx responses.
`Proxy-Authenticate`	This provides for authentication between clients and proxies. Refer to RFC 2616 Section 14.33, RFC 2617, or Chapter 9 for more details.
`Server`	Indicates the server version and vendor. In the above example, the server was IIS running on Windows XP.
`WWW-Authenticate`	This provides for authentication between clients and servers. Refer to RFC 2616 Section 14.47, RFC 2617, or Chapter 9 for more details.
`Content-Type`	Indicates the MIME type of the content returned. In the above example, the type is HTML.
`Content-Length`	Indicates the amount of data following the double-line feed. The server will close the connection once it has sent all of the data; therefore, it is not always necessary to process this header.
`Set-Cookie`	A cookie is a small file that resides on the client. A cookie has a name and value. In the above example, the cookie name is `ASPSESSIONIDQGGQQFCO`.

The client would display the message “This is a test html page!” on screen in response to this command.
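As a rough illustration of how a client might separate the headers from the body of such a response, consider the sketch below. `ResponseSketch` and `ParseHeaders` are hypothetical names, and the parser is intentionally simple: it assumes CR LF line endings and keeps only the last value when a header is repeated.

```csharp
using System;
using System.Collections.Generic;

public static class ResponseSketch
{
    // Splits a raw response into its headers and body at the first blank line.
    public static Dictionary<string, string> ParseHeaders(string response, out string body)
    {
        // Header names are case-insensitive, so the dictionary is too.
        Dictionary<string, string> headers =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        int split = response.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        string head = split >= 0 ? response.Substring(0, split) : response;
        body = split >= 0 ? response.Substring(split + 4) : "";

        string[] lines = head.Split(new string[] { "\r\n" }, StringSplitOptions.None);
        for (int i = 1; i < lines.Length; i++)   // line 0 is the status line
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue;            // skip anything that is not "Name: value"
            string name = lines[i].Substring(0, colon).Trim();
            headers[name] = lines[i].Substring(colon + 1).Trim();
        }
        return headers;
    }

    public static void Main()
    {
        string raw = "HTTP/1.1 200 OK\r\n"
                   + "Server: Microsoft-IIS/5.1\r\n"
                   + "Content-Length: 25\r\n"
                   + "Content-Type: text/html\r\n"
                   + "\r\n"
                   + "This is a test html page!";
        string body;
        Dictionary<string, string> headers = ParseHeaders(raw, out body);
        Console.WriteLine(headers["Content-Type"]);
        Console.WriteLine(body);
    }
}
```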

Every HTTP response has a response code. In the above example, the response code was 200. This number is followed by some human-readable text (i.e., OK).

The response codes fall into five main categories shown in Table 4.3.

Table 4.3. HTTP response codes.

HTTP response code range	Meaning
`100–199`	Informational: Request was received; continuing the process.
`200–299`	Success: The action was successfully received, understood, and accepted.
`300–399`	Redirection: Further action must be taken in order to complete the request.
`400–499`	Client error: The request contains bad syntax or cannot be fulfilled.
`500–599`	Server error: The server failed to fulfill an apparently valid request.
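A client can determine the category of a response from the first digit of the code alone. The following sketch shows one way to do this; `StatusSketch`, `ParseStatusCode`, and `Classify` are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class StatusSketch
{
    // Extracts the numeric code from a status line such as "HTTP/1.1 200 OK".
    public static int ParseStatusCode(string statusLine)
    {
        // Split into at most three parts: version, code, human-readable text.
        string[] parts = statusLine.Split(new char[] { ' ' }, 3);
        return int.Parse(parts[1], CultureInfo.InvariantCulture);
    }

    // Maps a response code to its category using the hundreds digit.
    public static string Classify(int code)
    {
        switch (code / 100)
        {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client error";
            case 5: return "Server error";
            default: return "Unknown";
        }
    }

    public static void Main()
    {
        int code = ParseStatusCode("HTTP/1.1 200 OK");
        Console.WriteLine(code + " " + Classify(code));
    }
}
```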

MIME types

Multipurpose Internet Mail Extensions (MIME) types are a means of describing the type of data, such that another computer will know how to handle the data and how to display it effectively to the user.

To illustrate, if you changed the extension of a JPEG image (.JPG) to .TXT, and clicked on it, you would see a jumble of strange characters, not the image. This is because Windows contains a mapping from file extension to file type, and .JPG and .TXT are mapped to different file types: image/jpeg for .JPG and text/plain for .TXT.

To find a MIME type for a particular file, such as .mp3, you can open the registry editor by clicking on Start > Run, then typing REGEDIT. Then click on HKEY_CLASSES_ROOT, scroll down to .mp3, and the MIME type is written next to Content Type.

Note

Not all file types have a MIME type (e.g., .hlp help files).

System.Web

One of the most common uses of HTTP within applications is the ability to download the HTML content of a page into a string. The following application demonstrates this concept.

It is certainly possible to implement HTTP at the socket level, but there is a wealth of objects ready for use in HTTP client applications, and it makes little sense to reinvent the wheel. The HTTP client in this section is implemented using HttpWebRequest.

Start a new project in Visual Studio .NET, and drag on two textboxes, tbResult and tbUrl. tbResult should be set with multiline=true. A button, btnCapture, should also be added. Click on the Capture button, and enter the following code:

private void btnCapture_Click(object sender, System.EventArgs e)
{
 tbResult.Text = getHTTP(tbUrl.Text);
}

VB.NET

Private Sub btnCapture_Click(ByVal sender As Object, _
 ByVal e As System.EventArgs) Handles btnCapture.Click
 tbResult.Text = getHTTP(tbUrl.Text)
End Sub

Then implement the getHTTP function:

public string getHTTP(string szURL)
{
  HttpWebRequest  httpRequest;
  HttpWebResponse httpResponse;
  string          bodyText = "";
  Stream          responseStream;
  // Byte.MaxValue is 255, so each read returns at most 255 bytes
  Byte[] RecvBytes = new Byte[Byte.MaxValue];
  Int32 bytes;
  httpRequest = (HttpWebRequest) WebRequest.Create(szURL);
  httpResponse = (HttpWebResponse) httpRequest.GetResponse();
  responseStream = httpResponse.GetResponseStream();
  while(true)
  {
    // Read returns 0 once the server has sent everything
    bytes = responseStream.Read(RecvBytes, 0, RecvBytes.Length);
    if (bytes<=0) break;
    bodyText += System.Text.Encoding.UTF8.GetString(RecvBytes, 0, bytes);
  }
  // Release the connection once the body has been read
  httpResponse.Close();
  return bodyText;
}

VB.NET

Public Function getHTTP(ByVal szURL As String) As String
  Dim httprequest As HttpWebRequest
  Dim httpresponse As HttpWebResponse
  Dim bodytext As String = ""
  Dim responsestream As Stream
  Dim bytes As Int32
  ' The array is declared by upper bound, so it holds 256 bytes
  Dim RecvBytes(Byte.MaxValue) As Byte
  httprequest = CType(WebRequest.Create(szURL), _
  HttpWebRequest)
  httpresponse = CType(httprequest.GetResponse(), _
  HttpWebResponse)
  responsestream = httpresponse.GetResponseStream()
  Do While (True)
   ' Read returns 0 once the server has sent everything
   bytes = responsestream.Read(RecvBytes, 0, _
   RecvBytes.Length)
   If bytes <= 0 Then Exit Do
   bodytext += System.Text.Encoding.UTF8.GetString _
   (RecvBytes, 0, bytes)
  Loop
  ' Release the connection once the body has been read
  httpresponse.Close()
  Return bodytext
End Function

Taking a closer look at this code, it should be relatively easy to identify how it operates. The first action taken as this code is executed is that a static method on the WebRequest class is called and passed the string szURL as a parameter. This creates a WebRequest object that can be cast to an HttpWebRequest object, which will handle outgoing HTTP connections.

Once we have an HttpWebRequest object, we can then send the HTTP request to the server and start receiving data back from the server by calling the GetResponse method. The return value is then cast to an HttpWebResponse object, which is then held in the httpResponse variable.

A response from a Web server is asynchronous by nature, so it is natural to create a stream from this returning data and read it in as it becomes available. To do this, we can create a stream by calling the GetResponseStream method. Once the stream is obtained, we can read bytes from it in small chunks: up to 255 bytes (Byte.MaxValue) in the C# listing, or 256 bytes in the VB.NET listing, where the array is declared by its upper bound. Reading data in chunks improves performance. The chunk size can be chosen arbitrarily; a small buffer such as this one is adequate for a demonstration, although a larger buffer reduces the number of read calls.
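One subtlety of reading text in chunks is that a multibyte UTF-8 character can straddle two chunks, in which case decoding each chunk independently may corrupt it. The sketch below shows one possible way around this, using a `Decoder` that carries partial characters over from one read to the next. `ChunkSketch` and `ReadAllText` are hypothetical names, and a `MemoryStream` stands in for the network stream so that the example can run without a server.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkSketch
{
    // Reads a stream to the end in fixed-size chunks and decodes it as UTF-8.
    public static string ReadAllText(Stream source, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        char[] chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];
        // The decoder remembers incomplete byte sequences between calls.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        StringBuilder text = new StringBuilder();
        int bytesRead;
        // Read returns 0 when the end of the stream has been reached.
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            int charCount = decoder.GetChars(buffer, 0, bytesRead, chars, 0);
            text.Append(chars, 0, charCount);
        }
        return text.ToString();
    }

    public static void Main()
    {
        // With 4-byte chunks, the two-byte character falls across a boundary.
        byte[] data = Encoding.UTF8.GetBytes("caf\u00e9 au lait");
        string result = ReadAllText(new MemoryStream(data), 4);
        Console.WriteLine(result == "caf\u00e9 au lait");
    }
}
```

Using a `StringBuilder` rather than repeated string concatenation also avoids copying the accumulated text on every iteration, which matters for large pages.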

The code sits in an infinite loop until all of the incoming data is received. In a production environment, therefore, this type of action should be contained within a separate thread. Once we have a string containing all of the HTML, we can simply dump it to screen. No other processing is required. You will also need some extra namespaces:

using System.Net;
using System.IO;

VB.NET

Imports System.Net
Imports System.IO

To test the application, run it from Visual Studio, type in a Web site address (not forgetting the http:// prefix), and press Capture. The HTML source will appear in the body (Figure 4.1).

Figure 4.1. HTTP client application.

This is a very simple HTTP client, with no error handling, and is single threaded; however, it should suffice for simpler applications.

Table 4.4 shows the significant members of HttpWebResponse.

Table 4.4. Significant members of the HttpWebResponse class.

Method or property	Meaning
`ContentEncoding`	Gets the method used to encode the body of the response. Returns `String`.
`ContentLength`	Gets the length of the content returned by the request. Returns `Long`.
`ContentType`	Gets the content type of the response. Returns `String`.
`Cookies`	Gets or sets the cookies associated with this response. May be used thus: `Cookies["name"].ToString()`.
`Headers`	Gets the headers associated with this response from the server. May be invoked thus: `Headers["Content-Type"].ToString()`.
`ResponseUri`	Gets the URI of the Internet resource that responded to the request. May be invoked thus: `ResponseUri.ToString()`.
`Server`	Gets the name of the server that sent the response. Returns `String`.
`StatusCode`	Gets the status of the response. Returns the `HttpStatusCode` enumerated type. The `StatusDescription` returns a descriptive `String`.
`GetResponseHeader`	Gets the specified header contents that were returned with the response. Returns `String`.
`GetResponseStream`	Gets the stream used to read the body of the response. No asynchronous variant. Returns `Stream`.

Posting data

Many dynamic Web sites contain forms for login details, search criteria, or similar data. These forms are usually submitted via the POST method. This poses a problem, however, for any application that needs to query a page that lies behind such a form because you cannot specify posted data in the URL line.

First, prepare a page that handles POST requests. In this case, type the following lines into a file called postTest.aspx in c:\inetpub\wwwroot (your HTTP root):

ASP.NET

<%@ Page language="c#" Debug="true"%>
<script language="C#" runat="server">
  public void Page_Load(Object sender, EventArgs E)
  {
   if (Request.Form["tbPost"]!=null)
   {
    Response.Write(Request.Form["tbPost"].ToString());
   }
  }
</script>

<form method="post">
 <input type="text" name="tbPost">
 <input type="submit">
</form>

ASP.NET is a vast subject that lies outside the scope of this book; however, for the sake of explaining the above example, a quick introduction is necessary. ASP.NET is an extension to IIS that enables .NET code to be executed on receipt of requests for Web pages. This also provides means for .NET code to dynamically generate responses to clients in the form of HTML, viewable on Web browsers.

Incoming requests and outgoing data are mapped to objects in .NET, which can easily be read and manipulated. The most fundamental of these objects are the Request and Response objects. The Request object encapsulates the data sent from the Web browser to the server; of its properties, two of the most important are the Form and QueryString collections. The Form collection reads data sent from the client via the POST method, whereas the QueryString collection reads data sent from the client via the GET method.

The Response object places data on the outgoing HTTP stream to be sent to the client. One of its most important methods is Write. This method is passed a string that will be rendered as HTML on the client.

One of the features that makes ASP.NET more powerful than its predecessor, classic ASP, is its ability to model HTML elements as objects, not merely as input and output streams. For example, an input box would be typically written in ASP.NET as <ASP:TEXTBOX id="tbText" runat="server"/>, and the properties of this textbox could then be modified from code by accessing the tbText object. In classic ASP, the only way to achieve such an effect would be to include code within the textbox declaration, such as <input type="text" <%=someCode%>>, which is less desirable because functional code is intermixed with HTML.

ASP.NET provides better performance than classic ASP because it is compiled on first access (in-line model) or precompiled (code-behind model). It also leverages the .NET framework, which is much richer than the scripting languages available to ASP.

The example above is appropriate for demonstrating the posting method. Every Web scripting language handles posted data in much the same way, so the technique is applicable to interfacing with any Web form.

Web scripting languages share a common feature: some sections of the page are rendered on the browser screen as HTML, and some are processed by the server and not displayed on the client. In the example, anything marked runat="server" or prefixed <% will be processed by the server.

When the user presses the submit button (<input type="submit">), the browser packages any user-entered data that was contained within the <form> tags and passes it back to the server as a POST request.

The server parses out the data in the POST request once it is received. The server-side script can retrieve this data by accessing the Request.Form collection. The Response.Write command prints this data back out to the browser.

To try the page out, open a browser and point it at http://localhost/postTest.aspx; type something into the textbox, and press Submit. Then you will see the page refresh, and the text you typed appears above the form.

Reopen the previous example and add a new textbox named tbPost. Click on the Capture button and modify the code as follows:

private void btnCapture_Click(object sender, System.EventArgs e)
{
    tbPost.Text = HttpUtility.UrlEncode(tbPost.Text);
    tbResult.Text = getHTTP(tbUrl.Text,"tbPost="+tbPost.Text);
}

VB.NET

Private Sub btnCapture_Click(ByVal sender As Object, _
ByVal e As System.EventArgs) Handles btnCapture.Click
    tbPost.Text = HttpUtility.UrlEncode(tbPost.Text)
    tbResult.Text = getHTTP(tbUrl.Text,"tbPost="+tbPost.Text)
End Sub

The reason for the call to HttpUtility.UrlEncode is to convert the text entered by the user into a string that is safe for transport by HTTP. This means the removal of white space (spaces are converted to “+”) and the conversion of most other nonalphanumeric characters into percent-encoded hexadecimal escapes (e.g., “&” becomes “%26”), which is a requirement of the application/x-www-form-urlencoded format.
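The effect of URL encoding can be seen in a short console sketch. Note the substitution: the sketch uses `System.Net.WebUtility.UrlEncode` rather than `HttpUtility.UrlEncode`, so that it runs without a reference to System.Web.dll; this method is only available in later versions of the framework than the one described here. `EncodeSketch` and `BuildFormData` are hypothetical names.

```csharp
using System;
using System.Net;

public static class EncodeSketch
{
    // Builds a single name=value pair in URL-encoded form.
    public static string BuildFormData(string name, string value)
    {
        // Both the field name and its value are encoded; the "=" between
        // them is a separator and must be left as it is.
        return WebUtility.UrlEncode(name) + "=" + WebUtility.UrlEncode(value);
    }

    public static void Main()
    {
        // Spaces become "+", and "&" becomes "%26" so that it is not
        // mistaken for the separator between two form fields.
        Console.WriteLine(BuildFormData("tbPost", "some text & more"));
    }
}
```

Encoding the value before concatenation, rather than encoding the whole `name=value` string, is what keeps the separator intact.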

Once the data to post is encoded, it can be passed to the getHTTP function, which is described below. It is a modified version of the code previously listed.

public string getHTTP(string szURL,string szPost)
{
  HttpWebRequest  httprequest;
  HttpWebResponse httpresponse;
  StreamReader    bodyreader;
  string          bodytext = "";
  Stream          responsestream;
  Stream          requestStream;
  // Encode once so that the header and the body agree on the byte count
  byte[]          postBytes = Encoding.ASCII.GetBytes(szPost);

  httprequest = (HttpWebRequest) WebRequest.Create(szURL);
  httprequest.Method = "POST";
  httprequest.ContentType = "application/x-www-form-urlencoded";
  httprequest.ContentLength = postBytes.Length;
  requestStream = httprequest.GetRequestStream();
  requestStream.Write(postBytes, 0, postBytes.Length);
  requestStream.Close();
  httpresponse = (HttpWebResponse) httprequest.GetResponse();
  responsestream = httpresponse.GetResponseStream();
  bodyreader = new StreamReader(responsestream);
  bodytext = bodyreader.ReadToEnd();
  // Closing the reader also closes the underlying response stream
  bodyreader.Close();
  return bodytext;
}

VB.NET

Public Function getHTTP(ByVal szURL As String, _
ByVal szPost As String) As String
  Dim httprequest As HttpWebRequest
  Dim httpresponse As HttpWebResponse
  Dim bodyreader As StreamReader
  Dim bodytext As String =  ""
  Dim responsestream As Stream
  Dim requestStream As Stream
  ' Encode once so that the header and the body agree on the byte count
  Dim postBytes As Byte() = Encoding.ASCII.GetBytes(szPost)

  httprequest = CType(WebRequest.Create(szURL), _
  HttpWebRequest)
  httprequest.Method = "POST"
  httprequest.ContentType = _
  "application/x-www-form-urlencoded"
  httprequest.ContentLength = postBytes.Length
  requestStream = httprequest.GetRequestStream()
  requestStream.Write(postBytes, 0, postBytes.Length)
  requestStream.Close()
  httpresponse = CType(httprequest.GetResponse(), _
  HttpWebResponse)
  responsestream = httpresponse.GetResponseStream()
  bodyreader = New StreamReader(responsestream)
  bodytext = bodyreader.ReadToEnd()
  ' Closing the reader also closes the underlying response stream
  bodyreader.Close()
  Return bodytext
End Function

This differs from the code to simply retrieve a Web page in that once the HttpWebRequest has been created, several parameters are set such that the request also includes the posted data. The chunked reader loop is also replaced with the ReadToEnd() method of StreamReader. This method may be elegant, but it is not compatible with binary data.

The three settings that need to be changed are the request method, content type, and content length. The request method defaults to GET but now must be set to POST. The content type should be set to the MIME type application/x-www-form-urlencoded; some servers accept the request without it, but others rely on this header to recognize form data. The content length is simply the length of the data being posted, including the variable names, and after URL encoding.

The data to be posted must then be sent to the server using the Write method on the request stream. Once the request has been created, it is simply a matter of receiving the stream from the remote server and reading to the end of the stream.

Finally, we need namespaces for the HttpUtility and Encoding objects. You will need to make a reference to System.Web.dll by selecting Project→Add Reference, as shown in Figure 4.2.

Figure 4.2. Visual Studio .NET, Add Reference dialog.

using System.Web;
using System.Text;
using System.IO;
using System.Net;

VB.NET

Imports System.Web
Imports System.Text
Imports System.IO
Imports System.Net

To test the application, run it through Visual Studio .NET, enter http://localhost/postTest.aspx into the URL textbox, and add some other text into the POST textbox. When you press Capture, you will see that the posted text appears as part of the Web page (Figure 4.3).

Figure 4.3. HTTP client application with POST facility.

Table 4.5 shows the significant members of HttpWebRequest.

Table 4.5. Significant members of HttpWebRequest.

Method or Property	Meaning
`Accept`	Gets or sets the value of the `Accept` HTTP header. Returns `String`.
`AllowAutoRedirect`	Gets or sets a Boolean value that indicates whether the request should follow redirection (3xx) responses.
`ContentLength`	Gets or sets the `Content-length` HTTP header.
`ContentType`	Gets or sets the value of the `Content-type` HTTP header.
`CookieContainer`	Gets or sets the cookies associated with the request. May be invoked thus: `CookieContainer.GetCookies(uri)["name"].ToString()`, where `uri` is the `Uri` of the site.
`Headers`	Gets a collection of strings that are contained in the HTTP header. May be invoked thus: `Headers["Content-Type"].ToString()`.
`Method`	Gets or sets the method for the request. Can be set to `GET`, `HEAD`, `POST`, `PUT`, `DELETE`, `TRACE`, or `OPTIONS`.
`Proxy`	Gets or sets proxy information for the request. Returns `WebProxy`.
`Referer`	Gets or sets the value of the `Referer` HTTP header. Returns `String`.
`RequestUri`	Gets the original URI of the request. The `Address` property holds the URI after redirections. May be invoked thus: `RequestUri.ToString()`.
`Timeout`	Gets or sets the time-out value. May be invoked thus: `Timeout=(int) new TimeSpan(0,0,30).TotalMilliseconds`.
`TransferEncoding`	Gets or sets the value of the `Transfer-encoding` HTTP header. Returns `String`.
`UserAgent`	Gets or sets the value of the `User-agent` HTTP header. Returns `String`.
`GetResponse`	Returns a `WebResponse` from an Internet resource. Its asynchronous variants are `BeginGetResponse` and `EndGetResponse`.

A note on cookies

HTTP does not maintain state information. It is therefore difficult to differentiate between two users accessing a server or one user making two requests. From the server’s point of view, it is possible for both users to have the same IP address (e.g., if they are both going through the same proxy server). If the service being accessed contained personal information, the user to whom this data pertains is legally entitled to view this data, but other users should not be allowed access.

In this situation, the client side of the connection needs to differentiate itself from other clients. This can be done in several ways, but for Web sites, cookies are the best solution.

Cookies are small files stored in c:\windows\cookies (depending on your Windows installation). They are placed there in one of two ways: by the JavaScript document.cookie object, or by the Set-Cookie header in HTTP responses. These cookies remain on the client’s machine for a set time and can be retrieved in JavaScript or sent back to the server in the Cookie header of subsequent HTTP requests.

Cookies are supported in .NET via the HttpWebResponse.Cookies and the HttpWebRequest.CookieContainer objects.

Cookies are domain specific; therefore, a cookie stored on www.library.com cannot be retrieved by www.bookshop.com. In circumstances where both sites are affiliated with each other, the two sites might need to share session state information. In this example, it would be advantageous for bookshop.com to know a user’s reading preferences, so that it could advertise the most relevant titles.

The trick to copying cookies across domains is to convert the cookies into text, pass the text between the servers, and pass the cookies back to the client from the foreign server. .NET offers a facility to serialize cookies, which is ideal for the purpose.
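The domain-specific behavior of cookies can be observed with the CookieContainer class without making any network requests. The following is a minimal sketch; the class `CookieSketch`, the cookie name `session`, and its value are hypothetical, and the two host names are taken from the example above.

```csharp
using System;
using System.Net;

public static class CookieSketch
{
    // Creates a container holding one cookie that belongs to www.library.com.
    public static CookieContainer BuildJar()
    {
        CookieContainer jar = new CookieContainer();
        // The cookie's domain is taken from the URI it is added under.
        jar.Add(new Uri("http://www.library.com/"),
                new Cookie("session", "abc123", "/"));
        return jar;
    }

    // Returns the Cookie header that would accompany a request to the URL.
    public static string CookieHeaderFor(CookieContainer jar, string url)
    {
        return jar.GetCookieHeader(new Uri(url));
    }

    public static void Main()
    {
        CookieContainer jar = BuildJar();
        // The cookie is offered to the site that set it...
        Console.WriteLine(CookieHeaderFor(jar, "http://www.library.com/books"));
        // ...but not to a different domain, so this prints an empty line.
        Console.WriteLine(CookieHeaderFor(jar, "http://www.bookshop.com/"));
    }
}
```

In a real client, the same container would be assigned to HttpWebRequest.CookieContainer so that cookies set by one response are returned on later requests to the same site.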

A WYSIWYG editor

WYSIWYG (what you see is what you get) is a term used to describe Web and graphics editors that enable you to naturally manipulate graphical output, without having to be concerned with the underlying code. This feature is a handy way to let users be more creative in the type of textual messages or documents they create, without requiring them to take a crash course in HTML.

Internet Explorer can run in a special design mode, which is acceptable as a WYSIWYG editor. The trick to accessing design mode in Internet Explorer is simply to set the property WebBrowser.Document.designMode to On. Users can type directly into the Internet Explorer window and use well-known shortcut keys to format text (e.g., Ctrl + B, Bold; Ctrl + I, Italic; Ctrl + U, Underline). By right-clicking on Internet Explorer in design mode, a user can include images, add hyperlinks, and switch to browser mode. When an image is included in the design view, it can be moved and scaled by clicking and dragging on the edge of the image.

More advanced features can be accessed via Internet Explorer’s execCommand function. Only FontName, FontSize, and ForeColor are used in the following sample program, but Table 4.6 lists the commands used by Internet Explorer.

Table 4.6. Parameters of Internet Explorer’s execCommand function.

Command	Meaning
`Bold`	Inserts a `<B>` tag in HTML
`Copy`	Copies text into the clipboard
`Paste`	Pastes text from the clipboard
`InsertUnorderedList`	Creates a bulleted list, `<UL>` in HTML
`Indent`	Tabulates text farther right on the page
`Outdent`	Retabulates text left on the page
`Italic`	Inserts an `<I>` tag in HTML
`Underline`	Inserts a `<U>` tag in HTML
`CreateLink`	Creates a hyperlink to another Web page
`UnLink`	Removes a hyperlink from text
`FontName`	Sets the font family of a piece of text
`FontSize`	Sets the font size of a piece of text
`CreateBookmark`	Creates a bookmark on a piece of text
`ForeColor`	Sets the color of the selected text
`SelectAll`	Is equivalent to pressing CTRL + A
`JustifyLeft`	Moves all text as far left as space allows
`JustifyRight`	Moves all text as far right as space allows
`JustifyCenter`	Moves all selected text as close to the center as possible
`SaveAs`	Saves the page to disk

Other functionality not included in this list can be implemented by dynamically modifying the underlying HTML.

To start coding this application, open a new project in Visual Studio .NET. Add a reference to Microsoft.mshtml by clicking Project→Add Reference. Scroll down the list until you find Microsoft.mshtml, highlight it, and press OK. If you have not already done so from Chapter 1’s example, add Internet Explorer to the toolbox. To do this, right-click on the toolbox and select Customize Toolbox. Scroll down the list under the COM components tab until you see Microsoft Web Browser. Check the box opposite it, and press OK.

Draw a Tab control on the form named tabControl. Click on the tabPages property in the properties window and add two tab pages, labeled Preview and HTML. Draw the Microsoft Web Browser control onto the preview tab page and name the control WebBrowser. Add three buttons to the Preview tab page, named btnViewHTML, btnFont, and btnColor. In the HTML tab page, add a textbox named tbHTML, and set its multiline property to true. Also add a button to the HTML tab page named btnPreview. Drag a Color Dialog control onto the form, and name it colorDialog. Drag a Font Dialog control onto the form and name it fontDialog.

Double-click on the form, and add the following code:

private void Form1_Load(object sender, System.EventArgs e)
{
  object any = null;
  object url = "about:blank";
  WebBrowser.Navigate2(ref url,ref any,ref any,ref any,ref any);
  Application.DoEvents();
  ((HTMLDocument)WebBrowser.Document).designMode="On";
}

VB.NET

Private Sub Form1_Load(ByVal sender As Object, _
ByVal e As System.EventArgs) Handles MyBase.Load
  Dim url As Object = "about:blank"
  WebBrowser.Navigate2(url)
  Application.DoEvents()
  ' A VB.NET statement cannot begin with a parenthesis, so CType leads
  CType(WebBrowser.Document, HTMLDocument).designMode = "On"
End Sub

In order to access the HTML contained within the Web browser page, it must first point to a valid URL that contains some HTML source. In this case, the URL about:blank is used. This page contains nothing more than <HTML></HTML>, but is sufficient for the needs of this application. The DoEvents method releases a little processor time to allow the Web browser to load this page. The Document property of the Web browser contains the object model for the page, but it must first be cast to an HTMLDocument object to be of use. The designMode property of Internet Explorer is then set to On to enable WYSIWYG editing.

Click on the view HTML button on the Preview tab page and enter the following code:

private void btnViewHTML_Click(object sender,
System.EventArgs e)
{
  tbHTML.Text=(
  (HTMLDocument)WebBrowser.Document).body.innerHTML;
}

VB.NET

Private  Sub btnViewHTML_Click(ByVal sender As Object, _
  ByVal e As System.EventArgs)
  tbHTML.Text= _
  (CType(WebBrowser.Document, HTMLDocument)).body.innerHTML
End Sub

This button extracts the HTML from the Web Browser control and places it into the HTML-viewer textbox. Again, the Document property must be cast to an HTMLDocument object in order to access the page object model. In this case, the body.innerHTML property contains the page source. If you required the page source less the HTML tags, then body.innerText would be of interest.

Click on the corresponding Preview button on the HTML tab page, and enter the following code:

private void btnPreview_Click(object sender, System.EventArgs
e)
{
 ((HTMLDocument)WebBrowser.Document).body.innerHTML=
 tbHTML.Text;
}

VB.NET

Private Sub btnPreview_Click(ByVal sender As Object, _
ByVal e As System.EventArgs)
 CType(WebBrowser.Document, _
 HTMLDocument).body.innerHTML = tbHTML.Text
End Sub

This code simply performs the reverse of the preceding code, replacing the HTML behind the Web browser with the HTML typed into the textbox.

Click on the Font button on the Preview tab page, and enter the following code:

private void btnFont_Click(object sender, System.EventArgs e)
{
  fontDialog.ShowDialog();
  HTMLDocument doc = (HTMLDocument)WebBrowser.Document;
  object selection= doc.selection.createRange();
  doc.execCommand("FontName",false,
  fontDialog.Font.FontFamily.Name);
  doc.execCommand("FontSize",false,fontDialog.Font.Size);
  ((IHTMLTxtRange)selection).select();
}

VB.NET

Private  Sub btnFont_Click(ByVal sender As Object, _
 ByVal e As System.EventArgs)
  fontDialog.ShowDialog()
  Dim doc As HTMLDocument = CType(WebBrowser.Document, _
  HTMLDocument)
  Dim selection As Object =  doc.selection.createRange()
  doc.execCommand("FontName", False, _
  fontDialog.Font.FontFamily.Name)
  doc.execCommand("FontSize", False, fontDialog.Font.Size)
  CType(selection, IHTMLTxtRange).select()
End Sub

Pressing the Font button will bring up the standard font dialog box (Figure 4.4), which allows the user to select any font held on the system and its size. Other properties that may be available on this screen, such as subscript, strikethrough, and so on, are not reflected in the WYSIWYG editor. This works by first capturing a reference to any selected text on the screen using the selection.createRange() method. The execCommand method is called twice, first to apply the font family to the selected text and then the font size. The selection is then cast to an IHTMLTxtRange interface, which exposes the select method; calling it makes the range the current selection again, so the newly formatted text stays highlighted.

Figure 4.4. Font-chooser dialog box.
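
One caveat with the font size: the FontSize command expects an HTML font size from 1 to 7, whereas Font.Size is reported in points. The following is a minimal sketch of a conversion, not part of the original example. The helper name PointsToHtmlSize is hypothetical, and the point values are the commonly quoted approximate equivalents of HTML sizes 1 to 7.

```csharp
using System;

// Hypothetical helper: maps a point size to the nearest HTML font size (1-7).
// The point values are commonly quoted approximations for sizes 1..7.
static int PointsToHtmlSize(float points)
{
    float[] approxPoints = { 8f, 10f, 12f, 14f, 18f, 24f, 36f };
    int best = 0;
    for (int i = 1; i < approxPoints.Length; i++)
    {
        // Keep whichever entry lies closest to the requested point size
        if (Math.Abs(approxPoints[i] - points) < Math.Abs(approxPoints[best] - points))
        {
            best = i;
        }
    }
    return best + 1; // HTML sizes are 1-based
}
```

With such a helper, the second call in btnFont_Click could pass PointsToHtmlSize(fontDialog.Font.Size) instead of the raw point size.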

Now click on the Color button on the Preview tab page, and enter the following code:

private void btnColor_Click(object sender, System.EventArgs
e)
{
  colorDialog.ShowDialog();
  string colorCode = "#" +
      toHex(colorDialog.Color.R) +
      toHex(colorDialog.Color.G) +
      toHex(colorDialog.Color.B);
  HTMLDocument doc = (HTMLDocument)WebBrowser.Document;
  object selection = doc.selection.createRange();
  doc.execCommand("ForeColor",false,colorCode);
  ((IHTMLTxtRange)selection).select();
}

VB.NET

Private  Sub btnColor_Click(ByVal sender As Object, _
  ByVal e As System.EventArgs)
  colorDialog.ShowDialog()
  Dim colorCode As String = "#" + _
      toHex(colorDialog.Color.R) + _
      toHex(colorDialog.Color.G) + _
      toHex(colorDialog.Color.B)
  Dim doc As HTMLDocument = CType(WebBrowser.Document, _
  HTMLDocument)
  Dim selection As Object =  doc.selection.createRange()
  doc.execCommand("ForeColor",False,colorCode)
  CType(selection, IHTMLTxtRange).select()
End Sub

Pressing the Color button brings up the standard Color dialog box (Figure 4.5). When a color is chosen, the selected color is applied to any selected text. This code brings up the Color dialog box by calling the ShowDialog method. The color returned can be expressed in terms of its red (R), green (G), and blue (B) constituents. These values are in decimal format, in the range 0 (least intense) to 255 (most intense). HTML expresses colors in the form #RRGGBB, where RR, GG, and BB are hexadecimal equivalents of the R, G, and B values. To give a few examples, #FF0000 is bright red, #FFFFFF is white, and #000000 is black.

Figure 4.5. Color-picker dialog box.

Once again, a handle to the selected text is obtained in the same way as before. The execCommand method is called and passed ForeColor, along with the HTML color code. The selected text is then cast to an IHTMLTxtRange interface and re-selected with the select method, as before.

The above code calls the function toHex to convert the numeric values returned from the colorDialog control to hexadecimal values, which are required by Internet Explorer. Enter the following code:

public string toHex(int digit)
{
  string hexDigit = digit.ToString("X");
  if (hexDigit.Length == 1){
  hexDigit = "0" + hexDigit;
  }
  return hexDigit;
}

VB.NET

Public Function toHex(ByVal number As Integer) As String
     Dim hexByte As String
     hexByte = Hex(number).ToString()
     If hexByte.Length = 1 Then
         hexByte = "0" & hexByte
     End If
     Return hexByte
 End Function
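
As an alternative to padding by hand, the "X2" format specifier always produces two hexadecimal digits. The sketch below builds the whole #RRGGBB string in one step; the name ToHtmlColor is hypothetical and is not used elsewhere in this example.

```csharp
using System;

// Hypothetical alternative to toHex: "X2" pads each value to two
// uppercase hexadecimal digits, so no manual zero-prefixing is needed.
static string ToHtmlColor(int red, int green, int blue)
{
    return "#" + red.ToString("X2") + green.ToString("X2") + blue.ToString("X2");
}
```

In btnColor_Click this could be called as ToHtmlColor(colorDialog.Color.R, colorDialog.Color.G, colorDialog.Color.B).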

Finally, the relevant namespaces are required:

using mshtml;

VB.NET

Imports mshtml

To test this application, run it from Visual Studio .NET. Type into the Web Browser control under the Preview tab. Press the Font button to change the style and size of any text that is selected. Press the Color button to change the color of selected text. You can insert images by right-clicking and selecting Insert image (special thanks to Bella for posing for this photograph!). Press the view HTML button, then switch to the HTML tab page to view the autogenerated HTML (Figure 4.6).

Figure 4.6. HTML editor application.

Web servers

One may ask why you should develop a server in .NET when IIS is freely available. An in-house-developed server has some advantages, such as the following:

The Web server can be installed as part of an application, without requiring the user to install IIS manually from the Windows installation CD.
IIS cannot be installed on Windows XP Home Edition, which is used by a significant portion of Windows users.

Implementing a Web server

Start a new Visual Studio .NET project as usual. Draw two textboxes, tbPath and tbPort, onto the form, followed by a button, btnStart, and a list box named lbConnections, which has its view set to list.

At the heart of an HTTP server is a TCP server, and you may notice an overlap of code between this example and the TCP server in the previous chapter. The server has to be multithreaded, so the first step is to declare an Array List of sockets:

public class Form1 : System.Windows.Forms.Form
{
  private ArrayList alSockets;
  ...

VB.NET

Public Class Form1
    Inherits System.Windows.Forms.Form

    Private alSockets As ArrayList
  ...

Every HTTP server has an HTTP root, which is a path to a folder on your hard disk from which the server will retrieve Web pages. IIS has a default HTTP root of C:\inetpub\wwwroot; in this case, we shall use the path in which the application is saved.

To obtain the application path, we can use Application.ExecutablePath, which returns not only the path but also the filename, and thus we can trim off all characters after the last backslash.

private void Form1_Load(object sender, System.EventArgs e)
{
  tbPath.Text = Application.ExecutablePath;
  // trim off filename, to get the path
  tbPath.Text =
  tbPath.Text.Substring(0,tbPath.Text.LastIndexOf("\\"));
}

VB.NET

Private  Sub Form1_Load(ByVal sender As Object, _
ByVal e As System.EventArgs)
  tbPath.Text = Application.ExecutablePath
  ' trim off filename, to get the path
  tbPath.Text = _
  tbPath.Text.Substring(0,tbPath.Text.LastIndexOf("\"))
End Sub
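
The same trimming can be left to the framework. The sketch below uses Path.GetDirectoryName, which returns everything before the final path separator; the wrapper name FolderOf is hypothetical.

```csharp
using System.IO;

// Hypothetical helper: returns the folder part of a full file path,
// equivalent to trimming off everything after the last separator.
static string FolderOf(string fullPath)
{
    return Path.GetDirectoryName(fullPath);
}
```

In Form1_Load this would read tbPath.Text = FolderOf(Application.ExecutablePath). Windows Forms also exposes Application.StartupPath, which returns the folder of the executable directly.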

Clicking the Start button will initialize the Array List of sockets and start the main server thread. Click btnStart:

private void btnStart_Click(object sender, System.EventArgs e)
{
  alSockets = new ArrayList();
  Thread thdListener =
  new Thread(new ThreadStart(listenerThread));
  thdListener.Start();
}

VB.NET

Private  Sub btnStart_Click(ByVal sender As Object, _
 ByVal e As System.EventArgs)
  alSockets = New ArrayList()
  Dim thdListener As Thread =  New Thread(New _
  ThreadStart( AddressOf listenerThread))
  thdListener.Start()
End Sub

The listenerThread function manages new incoming connections, allocating each new connection to a new thread, where the client’s requests will be handled.

HTTP operates over port 80 by default, but if another application (such as IIS) is already listening on port 80, the call to Start will throw a SocketException. Therefore, the port for this server is configurable. The first step is to start the TcpListener on the port specified in tbPort.Text.

This thread runs in an infinite loop, constantly blocking on the AcceptSocket method. Once the socket is connected, some text is written to the screen, and a new thread is started to run the handlerThread function.

The reason for the lock(this) command is that handlerThread retrieves the socket by reading the last entry in the ArrayList. If two connections arrive in quick succession, two entries will be written to the ArrayList, and one of the calls to handlerThread may use the wrong socket. The lock is intended to ensure that the spawning of the new thread cannot happen at the same time as the acceptance of a new socket.

public void listenerThread()
{
  // Ports range up to 65535, which does not fit in an Int16
  int port = Convert.ToInt32(tbPort.Text);
  TcpListener tcpListener = new TcpListener(port);
  tcpListener.Start();
  while(true)
  {
    Socket handlerSocket = tcpListener.AcceptSocket();
    if (handlerSocket.Connected)
    {
     lbConnections.Items.Add(
     handlerSocket.RemoteEndPoint.ToString() + " connected."
     );
     lock(this)
     {
      alSockets.Add(handlerSocket);
      ThreadStart thdstHandler = new
      ThreadStart(handlerThread);
      Thread thdHandler = new Thread(thdstHandler);
      thdHandler.Start();
     }
    }
  }
}

VB.NET

Public  Sub listenerThread()
  ' Ports range up to 65535, which does not fit in an Int16
  Dim port As Integer = Convert.ToInt32(tbPort.Text)
  Dim tcpListener As TcpListener =  New TcpListener(port)
  tcpListener.Start()
  do
    Dim handlerSocket As Socket =  tcpListener.AcceptSocket()
     If handlerSocket.Connected = true then
      lbConnections.Items.Add( _
      handlerSocket.RemoteEndPoint.ToString() & _
      " connected.")
      syncLock(me)
        alSockets.Add(handlerSocket)
        Dim thdstHandler As ThreadStart =  New  _
          ThreadStart(AddressOf handlerThread)
        Dim thdHandler As Thread =  New  _
          Thread(thdstHandler)
        thdHandler.Start()
      end syncLock
    end if
  loop
End sub
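
Because the handler runs on its own thread and reads the list without taking the lock, there is still a window in which a second connection could be added before the first handler has fetched its socket. One way to sidestep the shared list altogether, assuming .NET 2.0 or later (where ParameterizedThreadStart is available), is to hand each thread its own socket. This is a sketch of that alternative rather than the approach used in this chapter; the helper name StartHandler is hypothetical.

```csharp
using System.Threading;

// Hypothetical helper: starts a worker thread and passes it its own
// connection object, so no shared list has to be consulted.
static Thread StartHandler(object connection, ParameterizedThreadStart handler)
{
    Thread worker = new Thread(handler);
    worker.Start(connection); // delivered as the handler's single argument
    return worker;
}
```

The listener loop would then call StartHandler(handlerSocket, handler), where the handler method takes an object parameter and casts it back to Socket.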

The handlerThread function is where HTTP is implemented, albeit minimally. Taking a closer look at the code should better explain what is happening here.

The first task this thread must perform, before it can communicate with the client to which it has been allocated, is to retrieve its socket, which is the last entry in the alSockets ArrayList. Once this socket has been obtained, it can then create a stream to this client by passing the socket to the constructor of a NetworkStream.

To make processing of the stream easier, a StreamReader is used to read one line from the incoming NetworkStream. This line is assumed to be:

GET <some URL path> HTTP/1.1

HTTP posts will be handled identically to HTTP gets. Because this server has no support for server-side scripting, there is no use for anything else in the HTTP POST data, or anything else in the HTTP Request header for that matter.

Assuming that the HTTP request is properly formatted, we can extract the requested page URL from this line by splitting it into an array of strings (verbs[]), delimited by the space character.
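
The server in this chapter trusts the request line to be well formed. A slightly more defensive version of the split might look like the sketch below, which returns null rather than throwing when the line does not have the expected three parts. The helper name ExtractPath is hypothetical.

```csharp
// Hypothetical helper: returns the URL path from a request line such as
// "GET /index.htm HTTP/1.1", or null if the line is not in that shape.
static string ExtractPath(string requestLine)
{
    if (requestLine == null)
    {
        return null; // the client closed the connection without sending data
    }
    string[] verbs = requestLine.Split(' ');
    if (verbs.Length != 3)
    {
        return null; // expected: method, path, protocol version
    }
    return verbs[1];
}
```

A caller could simply close the socket when ExtractPath returns null.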

The next task is to convert a URL path into a physical path on the local hard drive. This involves four steps:

Converting forward slashes to backslashes
Trimming off any query string (i.e., everything after the question mark)
Appending a default page, if none is specified; in this case, “index.htm”
Prefixing the URL path with the HTTP root
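
The four steps above can be sketched as a standalone function, which makes them easy to try out in isolation. The name ResolvePath and its parameters are hypothetical; the logic mirrors the steps as listed.

```csharp
using System;

// Hypothetical helper: maps a URL path to a file path beneath httpRoot,
// following the four steps listed above.
static string ResolvePath(string urlPath, string httpRoot)
{
    // 1. Convert forward slashes to backslashes
    string filename = urlPath.Replace("/", "\\");
    // 2. Trim off any query string
    int queryStart = filename.IndexOf('?');
    if (queryStart != -1)
    {
        filename = filename.Substring(0, queryStart);
    }
    // 3. Append the default page if only a folder was requested
    if (filename.EndsWith("\\", StringComparison.Ordinal))
    {
        filename += "index.htm";
    }
    // 4. Prefix with the HTTP root
    return httpRoot + filename;
}
```

Note that this sketch, like the server itself, does not guard against ".." sequences in the URL; a server exposed to untrusted clients would need to reject paths that escape the HTTP root.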

Once the physical path is resolved, it can be read from disk and sent out on the network stream. It is reported on screen, and then the socket is closed. This server does not return any HTTP headers, which means the client will have to determine how to display the data being sent to it.

public void handlerThread()
{
  Socket handlerSocket = (
  Socket)alSockets[alSockets.Count-1];
  String streamData = "";
  String filename = "";
  String[] verbs;
  StreamReader    quickRead;
  NetworkStream networkStream =
  new NetworkStream(handlerSocket);
  quickRead = new StreamReader(networkStream);
  streamData = quickRead.ReadLine();
  verbs = streamData.Split(" ".ToCharArray());
  // Assume verbs[0]=GET
  filename = verbs[1].Replace("/","\\");
  if (filename.IndexOf("?")!=-1)
  {
    // Trim off anything after a question mark (query string)
    filename = filename.Substring(0,filename.IndexOf("?"));
  }

  if (filename.EndsWith("\\"))
  {
    // Add a default page if not specified
    filename+="index.htm";
  }
  filename = tbPath.Text + filename;
  FileStream  fs = new FileStream(filename,
  FileMode.OpenOrCreate);
  fs.Seek(0, SeekOrigin.Begin);
  byte[] fileContents= new byte[fs.Length];
  fs.Read(fileContents, 0, (int)fs.Length);
  fs.Close();

  // optional: modify fileContents to include HTTP header.

  handlerSocket.Send(fileContents);
  lbConnections.Items.Add(filename);
  handlerSocket.Close();
}

VB.NET

Public  Sub handlerThread()
  Dim handlerSocket As Socket = _
  CType(alSockets(alSockets.Count-1), Socket)
  Dim streamData As String =  ""
  Dim filename As String =  ""
  Dim verbs() As String
  Dim quickRead As StreamReader
  Dim networkStream As NetworkStream = New _
  NetworkStream(handlerSocket)
  quickRead = New StreamReader(networkStream)
  streamData = quickRead.ReadLine()
  verbs = streamData.Split(" ".ToCharArray())
  ' Assume verbs[0]=GET
  filename = verbs(1).Replace("/","\")
  If filename.IndexOf("?")<>-1 Then
    ' Trim off anything after a question mark (query string)
    filename = filename.Substring(0,filename.IndexOf("?"))
  End If

  If filename.EndsWith("\") Then
    ' Add a default page if not specified
    filename+="index.htm"
  End If
  filename = tbPath.Text + filename
  Dim fs As FileStream =  New _
  FileStream(filename,FileMode.OpenOrCreate)
  fs.Seek(0, SeekOrigin.Begin)
  ' VB arrays are sized by upper bound, hence Length - 1
  Dim fileContents() As Byte =  New Byte(CInt(fs.Length) - 1) {}
  fs.Read(fileContents, 0, CType(fs.Length, Integer))
  fs.Close()
  ' optional: modify fileContents to include HTTP header.
  handlerSocket.Send(fileContents)
  lbConnections.Items.Add(filename)
  handlerSocket.Close()
End Sub

Most modern browsers can determine how best to display the data being sent to them, without the need for Content-Type headers. For instance, Internet Explorer can tell the difference between JPEG image data and HTML by looking for the standard JPEG header in the received data; however, this system is not perfect.

A simple example is the difference between how XML is rendered in a browser window and how HTML is displayed. Without the Content-Type header, Internet Explorer will mistake all XML (excluding the <?xml?> tag) for HTML. You can see this by viewing a simple XML file containing the text <a><b/></a> through this server.

And, the usual namespaces are thrown in:

using System.Threading;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.IO;

VB.NET

Imports System.Threading
Imports System.Net
Imports System.Net.Sockets
Imports System.Text
Imports System.IO

To test the server, you will need a simple HTML page. Save the following text as index.htm in the same folder where the executable is built (the HTTP root).

HTML
<html>
 Hello world!
</html>

Run the server from Visual Studio .NET, change the port to 90 if you are running IIS, and press Start. Open a browser and type in http://localhost:90. Localhost should be replaced by the IP address of the server, if you are running the server on a second computer (Figure 4.7).

Figure 4.7. HTTP server application.

As mentioned previously, the server does not return HTTP headers. It is worthwhile to extend the example to include one of the more important headers, Content-Type, to save data from being misinterpreted at the client.

First, implement a new function called getMime(). This looks up a file's MIME type in the computer's registry, based on its file extension:

public string getMime(string filename)
{
  FileInfo thisFile = new FileInfo(filename);
  RegistryKey key = Registry.ClassesRoot;
  key = key.OpenSubKey(thisFile.Extension);
  return key.GetValue("Content Type").ToString();
}

VB.NET

Public Function getMime(ByVal filename As String) As String
  Dim thisFile As FileInfo =  New FileInfo(filename)
  Dim key As RegistryKey =  Registry.ClassesRoot
  key = key.OpenSubKey(thisFile.Extension)
  Return key.GetValue("Content Type").ToString()
End Function

If you have never used the Windows registry before, this code may need a little explaining. The Windows registry is a repository that holds the vast number of settings and preferences that keep Windows ticking over. You can view and edit the registry using Registry Editor (Figure 4.8); start this by clicking Start→Run and typing regedit or regedt32.

Figure 4.8. Registry Editor utility.

To view MIME types that correspond with file type extensions, click on HKEY_CLASSES_ROOT, scroll down to the file extension in question, and look at the Content Type key on the right-hand side of the screen.

This data is accessed programmatically by first extracting the file type extension using the Extension property of a FileInfo object. The first step in drilling down through the registry data is to open the root key. In this case, it is Registry.ClassesRoot.

The subkey named after the file extension (for example, .html) is then opened using the OpenSubKey method. Finally, the Content Type value is retrieved using the GetValue method and returned as a string to the calling function.
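
As written, getMime assumes that the extension is registered and has a Content Type value. OpenSubKey and GetValue both return null when the item is missing, so an unregistered extension would end in a NullReferenceException. The following sketch shows one possible fallback for that case. The name GuessMime and the small lookup table are assumptions for illustration, not part of the original server, and the table is far from exhaustive.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical fallback for extensions the registry does not know about.
// The table entries are illustrative only.
static string GuessMime(string filename)
{
    Dictionary<string, string> known = new Dictionary<string, string>();
    known[".htm"] = "text/html";
    known[".html"] = "text/html";
    known[".txt"] = "text/plain";
    known[".xml"] = "text/xml";
    known[".jpg"] = "image/jpeg";

    string extension = Path.GetExtension(filename).ToLowerInvariant();
    string mime;
    if (known.TryGetValue(extension, out mime))
    {
        return mime;
    }
    return "application/octet-stream"; // generic type for unknown content
}
```

getMime could return GuessMime(filename) whenever the registry key or its Content Type value turns out to be null.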

Now the final call to the Send method must be replaced by a slightly more elaborate sending procedure, which issues correct HTTP headers:

handlerSocket.Send(fileContents);

VB.NET

handlerSocket.Send(fileContents)

These become:

string responseString = "HTTP/1.1 200 OK\r\n" +
        "Content-Type: " + getMime(filename) + "\r\n\r\n";
System.Collections.ArrayList al = new ArrayList();
al.AddRange(Encoding.ASCII.GetBytes(responseString));
al.AddRange(fileContents);
handlerSocket.Send((byte[])al.ToArray(typeof(byte)));

VB.NET

Dim responseString As String
responseString = "HTTP/1.1 200 OK" + vbCrLf + _
"Content-Type: " + getMime(filename) + vbCrLf + vbCrLf
Dim al As System.Collections.ArrayList = New ArrayList
al.AddRange(Encoding.ASCII.GetBytes(responseString))
al.AddRange(fileContents)
handlerSocket.Send(CType( _
al.ToArray((New Byte).GetType()), Byte()))
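
The same response can be assembled without an ArrayList by copying the header and body into one byte array. The sketch below also adds a Content-Length header, which is an addition not present in the code above; the helper name BuildResponse is hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper: prepends a status line and headers to a response body.
static byte[] BuildResponse(string contentType, byte[] body)
{
    string header = "HTTP/1.1 200 OK\r\n" +
                    "Content-Type: " + contentType + "\r\n" +
                    "Content-Length: " + body.Length + "\r\n" +
                    "\r\n"; // blank line separates headers from body
    byte[] headerBytes = Encoding.ASCII.GetBytes(header);
    byte[] response = new byte[headerBytes.Length + body.Length];
    Buffer.BlockCopy(headerBytes, 0, response, 0, headerBytes.Length);
    Buffer.BlockCopy(body, 0, response, headerBytes.Length, body.Length);
    return response;
}
```

In handlerThread this would be used as handlerSocket.Send(BuildResponse(getMime(filename), fileContents)).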

Finally, to support the registry access functionality, we need to include an extra namespace:

using Microsoft.Win32;

VB.NET

Imports Microsoft.Win32

To demonstrate the difference this makes to running the server, create two files, test.txt and test.xml, both containing the text <a><b/></a>. Save them both in the HTTP root of your server and browse to http://localhost/test.xml and http://localhost/test.txt (append the port number to localhost, for example localhost:90, if the server is not listening on port 80). You will notice that test.xml is rendered as a collapsible tree, whereas the text file is shown as a plain series of characters.

System.Net.HttpWebListener

In .NET 2.0 (code-named Whidbey), a more elegant solution for implementing Web servers exists, namely the HttpWebListener class. This class uses the kernel-mode Http.sys driver (where available) for high performance, and integrates features such as SSL encryption and authentication, which would be difficult to develop from the ground up.

The HttpWebListener class consists of the significant methods and properties shown in Table 4.7.

Table 4.7. Significant members of the HttpWebListener class.

Method or Property	Description
`Abort / Close`	Destroys the request queue.
`AddPrefix`	Adds a prefix to the Web listener.
`BeginGetRequest`	Awaits a client request asynchronously. Returns `IAsyncResult`.
`EndGetRequest`	Handles client request. Returns `ListenerWebRequest`.
`GetPrefixes`	Retrieves all handled prefixes. Returns `String[]`.
`GetRequest`	Awaits a client request synchronously. Returns `ListenerWebRequest`.
`RemoveAll`	Removes all prefixes.
`RemovePrefix`	Removes a specified prefix.
`Start`	Starts the Web server.
`Stop`	Stops the Web server.
`AuthenticationScheme`	Sets the means by which the server authenticates clients. Returns `AuthenticationScheme` (i.e., Basic, Digest, NTLM).
`IsListening`	Determines if the server is running. Returns `Boolean`.
`Realm`	If Basic or Digest authentication schemes are selected, gets the realm directive. Returns `String`.

The ListenerWebRequest returned by GetRequest contains the significant methods and properties shown in Table 4.8.

Table 4.8. Significant members of the ListenerWebRequest class.

Method or Property	Description
`Abort / Close`	Closes the client connection.
`GetRequestStream`	Retrieves a reference to the stream sent from the client. Returns `Stream`.
`GetResponse`	Retrieves a reference to the response to be sent to the client. Returns `ListenerWebResponse`.
`Accept`	Gets the `Accept` HTTP header sent in the client request. Returns `String`.
`ClientCertificate`	Gets the digital certificate sent with the client request. Returns `X509Certificate`.
`ClientCertificateError`	Determines if any errors were present in the client certificate. Returns `Int32`.
`Connection`	Gets the `Connection` HTTP header sent in the client request. Returns `String`.
`ContentLength`	Gets the length of any data posted in the client request. Returns `Int64`.
`ContentType`	Gets the `Content-Type` HTTP header sent in the client request. Returns `String`.
`Expect`	Gets the `Expect` HTTP header sent in the client request. Returns `String`.
`HasEntityBody`	Determines if the client request had an `Entity` body. Returns `Boolean`.
`Headers`	Gets a reference to the set of HTTP headers sent from the client. Returns `WebHeaderCollection`.
`Host`	Gets the `Host` HTTP header sent in the client request. Returns `String`.
`Identity`	Determines the identity credentials in the client request. Returns `Identity`.
`IfModifiedSince`	Gets the `If-Modified-Since` header sent in the client request. Returns `DateTime`.
`KeepAlive`	Determines if the client sent `Connection: Keep-Alive` in its request. Returns `Boolean`.
`LocalEndPoint`	Determines the local logical endpoint of the communication. Returns `IPEndPoint`.
`Method`	Gets the HTTP send method (i.e., `GET`, `POST`) in the client request. Returns `String`.
`ProtocolVersion`	Determines the HTTP version used by the client. Returns `Version`.
`RawUri`	Gets the URI requested by the client. Returns `String`.
`Referer`	Gets the `Referer` HTTP header sent in the client request. Returns `String`.
`RemoteEndPoint`	Determines the remote logical endpoint of the communication. Returns `IPEndPoint`.
`RequestUri`	Gets the URI requested by the client. Returns `Uri`.
`UserAgent`	Gets the `User-Agent` HTTP header sent in the client request. Returns `String`.

The ListenerWebResponse returned by GetResponse contains the significant methods and properties listed in Table 4.9.

Table 4.9. Significant members of the ListenerWebResponse class.

Method or Property	Description
`Abort / Close`	Disconnects the client.
`GetResponseStream`	Retrieves a reference to the stream to be returned to the client. Returns `Stream`.
`ContentLength`	Sets the length of data to be sent back to the client. Returns `Int64`.
`ContentType`	Sets the `Content-Type` HTTP header to be sent back to the client. Returns `String`.
`Date`	Sets the `Date` HTTP header to be sent back to the client. Returns `DateTime`.
`EntityDelimitation`	Determines how the response content should be delimited (i.e., `ContentLength, Chunked, Raw`). Returns `EntityDelimitation`.
`Headers`	Retrieves a reference to the HTTP headers to be sent back to the client. Returns `WebHeaderCollection`.
`KeepAlive`	Determines if `Connection: Keep-Alive` should be set in the HTTP headers returned to the client. Returns `Boolean`.
`LastModified`	Sets the `Last-Modified` HTTP header to be sent back to the client. Returns `DateTime`.
`ProtocolVersion`	Sets the HTTP protocol version to be used in communicating with the client. Returns `Version`.
`RawHeaders`	Retrieves a reference to the HTTP headers to be sent back to the client. Returns `Byte[]`.
`Request`	Retrieves a reference to the request that initiated the response. Returns `ListenerWebRequest`.
`Server`	Sets the `Server` HTTP header to be sent back to the client. Returns `String`.
`StatusCode`	Sets the HTTP status code to be sent to the client. Returns `HttpStatusCode` (e.g., `OK`, `Moved`, `NotFound`).
`StatusDescription`	Sets the HTTP status description to be sent to the client. Returns `String`.
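
The member names in the tables above describe a prerelease API. In the released .NET Framework 2.0, this functionality shipped as System.Net.HttpListener, with each request and its response exposed through an HttpListenerContext. The following is a minimal sketch assuming that released API; the helper names are hypothetical.

```csharp
using System.Net;
using System.Text;

// Hypothetical helper: creates a listener for one URI prefix without starting it.
static HttpListener CreateListener(string prefix)
{
    HttpListener listener = new HttpListener();
    listener.Prefixes.Add(prefix); // prefixes must end with a slash
    return listener;
}

// Hypothetical helper: blocks until one request arrives, then answers it.
static void ServeOne(HttpListener listener, string html)
{
    HttpListenerContext context = listener.GetContext();
    byte[] body = Encoding.UTF8.GetBytes(html);
    context.Response.ContentType = "text/html";
    context.Response.ContentLength64 = body.Length;
    context.Response.OutputStream.Write(body, 0, body.Length);
    context.Response.Close();
}
```

A caller would create the listener with a prefix such as http://localhost:8090/ (the port is an arbitrary choice), call Start, call ServeOne in a loop, and finally call Stop.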

Mobile Web browsers

Not all HTTP clients are PCs. Many people use their mobile phones to access the Internet. Some applications are infinitely more useful when available wirelessly. Even though mobile phones ferry data in a totally different way from wired networks, a wireless application protocol (WAP) phone will communicate via a WAP gateway, which converts mobile phone signals into TCP/IP and accesses servers in much the same way as browsers.

WAP runs over HTTP and the Wireless Transaction Protocol (WTP), with a few extra headers thrown into the HTTP request. The following is a sample HTTP request generated by a WAP phone:

GET / HTTP/1.1
Accept-Charset: ISO-8859-1
Accept-Language: en
Content-Type: application/x-www-form-urlencoded
x-up-subno: Fiach_hop
x-upfax-accepts: none
x-up-uplink: none
x-up-devcap-smartdialing: 1
x-up-devcap-screendepth: 1
x-up-devcap-iscolor: 0
x-up-devcap-immed-alert: 1
x-up-devcap-numsoftkeys: 3
x-up-devcap-screenchars: 15,4
Accept: application/x-hdmlc, application/x-up-alert,
application/x-up-cacheop, application/x-up-device,
application/x-up-digestentry, text/x-hdml;version=3.1,
text/x-hdml;version=3.0, text/x-hdml;version=2.0, text/x-wap.wml,
text/vnd.wap.wml, */*, image/bmp, text/html
User-Agent: UP.Browser/3.1-ALAV UP.Link/3.2
Host: 127.0.0.1:50

Note

x-up-subno is set to the computer username followed by the computer name.

WAP clients and PC browsers differ most in the response. WAP clients cannot read HTML and use a simpler language, wireless markup language (WML), which has a MIME type text/vnd.wap.wml.

A minimal page in WML is as follows:

WML
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
      "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
 <card>
  <p align="left">
   <b>Title</b><br/>
     body
  </p>
  </card>
 </wml>

To view this page on a WAP phone, save the above text to index.wml. Ensure that the MIME type is registered on your computer by adding a string value named Content Type, set to text/vnd.wap.wml, under the registry key HKEY_CLASSES_ROOT\.wml.

Run the server as described in the previous section, and copy index.wml into the HTTP root as displayed. Ensure that your computer is online and has an externally visible IP address. Connect your mobile phone to the Internet and type your IP address into it, followed by /index.wml (Figure 4.9).

Figure 4.9. Sample WML page.

Note

If you do not have a WAP phone, you can use a WAP emulator such as the UP.SDK from www.openwave.com.

Not all wireless HTTP clients read WML. A competing technology, iMode, which is the most widely used technology in Asia, offers a similar, yet incompatible, system. iMode reads compact HTML (cHTML), which is a stripped-down version of the language with features such as frames, tables, and even JPEG images explicitly unsupported; however, iMode has good support for Unicode and can adequately display many Web pages designed for PCs.

An iMode browser can be recognized by the word DoCoMo in the user agent HTTP request header.
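
A server that supports several kinds of client could choose its markup from the request headers. The sketch below checks the User-Agent header for DoCoMo, as described above. Detecting WAP clients through text/vnd.wap.wml in the Accept header is an assumption based on the sample request shown earlier, and the helper name ChooseMarkup and its return values are hypothetical.

```csharp
using System;

// Hypothetical helper: picks a markup language from request header values.
static string ChooseMarkup(string userAgent, string accept)
{
    if (userAgent != null &&
        userAgent.IndexOf("DoCoMo", StringComparison.Ordinal) != -1)
    {
        return "chtml"; // iMode browser
    }
    if (accept != null &&
        accept.IndexOf("text/vnd.wap.wml", StringComparison.Ordinal) != -1)
    {
        return "wml"; // assumed: WAP browsers advertise the WML MIME type
    }
    return "html"; // otherwise assume a desktop browser
}
```

The result could then decide whether index.htm, index.wml, or a cHTML page is returned as the default document.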

Mobile Web SDK

When implementing WAP compatibility in a Web application, it is worth considering the .NET Mobile Web SDK. This enables you to develop applications for WAP in the same way as an ASP.NET Web application. Therefore, there is no need to learn WML.

Note

Utilities are available to convert HTML to WML on-the-fly, but the .NET Mobile Web SDK is freely available.

A sample page could be as follows:

ASP.NET

<%@ Page Inherits="System.Web.UI.MobileControls.MobilePage" Language="C#" %>
<%@ Register TagPrefix="mobile" Namespace="System.Web.UI.MobileControls"
    Assembly="System.Web.Mobile" %>
<mobile:Form runat="server">
<mobile:Label runat="server">
 Hello world!
</mobile:Label>
</mobile:Form>

To try this page out, save it as mobile.aspx in your IIS root (usually c:\inetpub\wwwroot). Ensure that your computer is online and has an externally visible IP address. Connect your mobile phone to the Internet, and type your IP address into it, followed by /mobile.aspx.

Conclusion

This chapter should have provided enough information to link your .NET application to data from the Web, and to illustrate that HTTP is used for more than desktop Web browsing, WAP being one example.

The next chapter deals with sending and receiving email from .NET applications.
