The Internet

The Internet is the world’s largest IP-based network. It is an amorphous group of computers in many different countries on all seven continents (Antarctica included) that talk to each other using the IP protocol. Each computer on the Internet has at least one unique IP address by which it can be identified. Most of them also have at least one name that maps to that IP address. The Internet is not owned by anyone, though pieces of it are. It is not governed by anyone, which is not to say that some governments don’t try. It is simply a very large collection of computers that have agreed to talk to each other in a standard way.

The Internet is not the only IP-based network, but it is the largest one. Other IP networks are called internets with a little i: for example, a corporate IP network that is not connected to the Internet. Intranet is a current buzzword that loosely describes corporate practices of putting lots of data on internal web servers. Since web browsers use IP, most intranets do too (though a few tunnel it through existing AppleTalk or IPX installations).

Almost certainly the internet that you’ll be using is the Internet. To make sure that hosts on different networks on the Internet can communicate with each other, a few rules need to be followed that don’t apply to purely internal internets. The most important rules deal with the assignment of addresses to different organizations, companies, and individuals. If everyone picked the Internet addresses she wanted at random, conflicts would arise almost immediately when different computers showed up on the Internet with the same address.

Internet Address Classes

To avoid this problem, Internet addresses are assigned to different organizations by the Internet Assigned Numbers Authority (IANA),[7] generally acting through intermediaries called ISPs. When a company or an organization wants to set up an IP-based network connected to the Internet, its ISP gives it a block of addresses. Currently, these blocks are available in two sizes called Class B and Class C. A Class C address block specifies the first 3 bytes of the address, for example, 199.1.32. This allows room for 254 individual addresses from 199.1.32.1 to 199.1.32.254.[8] A Class B address block specifies only the first 2 bytes of the addresses an organization may use, for instance, 167.1. Thus a Class B address has room for 65,024 different hosts (256 Class C-sized blocks times 254 hosts per Class C block).

Numeric addressing becomes important when you want to restrict access to your site. For instance, you may want to prevent a competing company from having access to your web site. In this case, you would find out your competitor’s address block and throw away all requests that come from that block of addresses. More commonly, you might want to make sure that only people within your organization can access your internal web server. In this case, you would deny access to all requests except those that come from within your own address block.
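The address-block test described above can be sketched in a few lines of Java. This is only an illustration, not how any particular web server implements access control: the class name AddressBlockCheck and the methods parse and inBlock are hypothetical, and the block bytes 199.1.32 and 167.1 are simply the examples used earlier in this section.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class AddressBlockCheck {

    // Parses a dotted-quad literal. No DNS lookup happens for numeric literals.
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
    }

    // True if the leading bytes of the address equal the block's fixed bytes.
    static boolean inBlock(InetAddress address, int... blockBytes) {
        byte[] raw = address.getAddress();
        if (blockBytes.length > raw.length) {
            return false;
        }
        for (int i = 0; i < blockBytes.length; i++) {
            if ((raw[i] & 0xFF) != blockBytes[i]) {   // mask because Java bytes are signed
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(inBlock(parse("199.1.32.17"), 199, 1, 32)); // true
        System.out.println(inBlock(parse("199.1.33.17"), 199, 1, 32)); // false
        System.out.println(inBlock(parse("167.1.200.9"), 167, 1));     // true
    }
}
```

A server that wanted to admit only its own organization would accept a request when inBlock returns true for its own block and reject it otherwise.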

There’s no block with a size between a Class B and a Class C. This has become a problem because there are many organizations with more than 254 computers connected to the Internet but fewer than 65,024 of them. If each of these organizations gets a full Class B block, a lot of IP addresses are wasted. This is a problem since there’s a limited number of addresses, about 4.2 billion to be precise. That sounds like a lot, but it gets crowded quickly when you can easily waste 50,000 or 60,000 addresses at a shot.

There are also many networks, such as the author’s own personal basement area network, that have a few to a few dozen computers but not 255 of them. To more efficiently allocate the limited address space, Classless Inter-Domain Routing (CIDR) was invented. CIDR mostly (though not completely) replaces the whole A, B, C addressing scheme with one based on a specified number of prefix bits. These are generally written as /24 or /19. The number after the / indicates the number of fixed prefix bits. Thus a /24 fixes the first 24 bits in the address, leaving 8 bits available to distinguish individual nodes. This allows 256 nodes and is equivalent to an old-style Class C. A /19 fixes 19 bits, leaving 13 for individual nodes within the network. It’s equivalent to 32 separate Class C networks or an eighth of a Class B. A /28, generally the smallest you’re likely to encounter in practice, leaves only four bits for identifying local nodes. It can handle networks with up to 16 nodes. CIDR also carefully specifies which address blocks are associated with which ISPs. This helps keep the Internet routing tables smaller and more manageable than they would be under the old system.
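The prefix arithmetic is easy to check in code. The following is a minimal sketch, assuming IPv4 addresses packed into a 32-bit int; the class CidrMath and its methods are hypothetical names introduced only for this illustration.

```java
final class CidrMath {

    // Number of addresses in a block whose first prefixBits bits are fixed.
    static long addressCount(int prefixBits) {
        if (prefixBits < 0 || prefixBits > 32) {
            throw new IllegalArgumentException("Prefix must be 0..32: " + prefixBits);
        }
        return 1L << (32 - prefixBits);
    }

    // Packs four bytes of a dotted quad into one 32-bit value.
    static int toInt(int a, int b, int c, int d) {
        return (a << 24) | (b << 16) | (c << 8) | d;
    }

    // True if address shares the first prefixBits bits with network.
    static boolean contains(int network, int prefixBits, int address) {
        // A shift by 32 is a no-op on Java ints, so /0 needs its own case.
        int mask = (prefixBits == 0) ? 0 : (-1 << (32 - prefixBits));
        return (network & mask) == (address & mask);
    }

    public static void main(String[] args) {
        System.out.println(addressCount(24)); // 256
        System.out.println(addressCount(19)); // 8192, that is, 32 blocks of 256
        System.out.println(addressCount(28)); // 16
        System.out.println(contains(toInt(199, 1, 32, 0), 24, toInt(199, 1, 32, 200))); // true
    }
}
```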

Several address blocks and patterns are special. All Internet addresses beginning with 10., 172.16. through 172.31., and 192.168. are deliberately unassigned. They can be used on internal networks, but no host using addresses in these blocks is allowed onto the global Internet. These nonroutable addresses are useful for building private networks that can’t be seen from the rest of the Internet or for building a large network when you’ve been assigned only a Class C address block. Addresses beginning with 127 (most commonly 127.0.0.1) always mean the local loopback address. That is, these addresses always point to the local computer, no matter which computer you’re running on. The hostname for this address is generally localhost. The address 0.0.0.0 always refers to the originating host but may be used only as a source address, not a destination. Similarly, any address that begins with 0.0 is assumed to refer to a host on the same local network.
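Java's java.net.InetAddress class (Java 1.4 and later) has methods that recognize several of these special blocks. The sketch below wraps three of them; the class SpecialAddresses and the labels it returns are hypothetical, chosen only for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class SpecialAddresses {

    // Labels a numeric IPv4 literal according to the special blocks described above.
    static String classify(String literal) {
        InetAddress address;
        try {
            address = InetAddress.getByName(literal); // no DNS lookup for numeric literals
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
        if (address.isLoopbackAddress()) {
            return "loopback";      // 127.x.x.x
        }
        if (address.isAnyLocalAddress()) {
            return "unspecified";   // 0.0.0.0
        }
        if (address.isSiteLocalAddress()) {
            return "private";       // 10., 172.16. through 172.31., and 192.168.
        }
        return "other";
    }

    public static void main(String[] args) {
        String[] samples = {"127.0.0.1", "10.1.2.3", "172.20.0.1", "192.168.1.1", "0.0.0.0", "199.1.32.1"};
        for (String sample : samples) {
            System.out.println(sample + " is " + classify(sample));
        }
    }
}
```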

Firewalls

There are some naughty people on the Internet. To keep them out, it’s often helpful to set up one point of access to a local network and check all traffic into or out of that access point. The hardware and software that sits between the Internet and the local network, checking all the data that comes in or out to make sure it’s kosher, is called a firewall.

The most basic firewall is a packet filter that inspects each packet coming into or out of a network and uses a set of rules to determine whether that traffic is allowed. Filtering is usually based on network addresses and ports. For example, all traffic coming from the Class C network 193.28.25 may be rejected because you had bad experiences with hackers from that net in the past. Outgoing Telnet connections may be allowed, but incoming Telnet connections may not be. Incoming connections on port 80 (Web) may be allowed but only to the corporate web server. The exact configuration of a firewall—which packets of data are and are not allowed to pass through—depends on the security needs of an individual site. Java doesn’t have much to do with firewalls except insofar as they often get in your way.
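To make the idea of a rule set concrete, here is a toy model of the three example rules in Java. It is a sketch only: real packet filters run in the operating system or in dedicated hardware, not in application code, and every name here (PacketFilterSketch, Rule, allowed) as well as the web server address 199.1.32.80 is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class PacketFilterSketch {

    enum Direction { INCOMING, OUTGOING }

    // One hypothetical rule. A null field or a port of -1 matches anything.
    static final class Rule {
        final Direction direction;
        final String sourcePrefix;
        final String destination;
        final int port;
        final boolean allow;

        Rule(Direction direction, String sourcePrefix, String destination, int port, boolean allow) {
            this.direction = direction;
            this.sourcePrefix = sourcePrefix;
            this.destination = destination;
            this.port = port;
            this.allow = allow;
        }

        boolean matches(Direction d, String source, String dest, int destPort) {
            return (direction == null || direction == d)
                && (sourcePrefix == null || source.startsWith(sourcePrefix))
                && (destination == null || destination.equals(dest))
                && (port == -1 || port == destPort);
        }
    }

    // Rules are checked in order and the first match decides.
    // Traffic that matches no rule is dropped.
    static boolean allowed(List<Rule> rules, Direction d, String source, String dest, int destPort) {
        for (Rule rule : rules) {
            if (rule.matches(d, source, dest, destPort)) {
                return rule.allow;
            }
        }
        return false;
    }

    static List<Rule> exampleRules() {
        return Arrays.asList(
            new Rule(null, "193.28.25.", null, -1, false),                // reject the troublesome network
            new Rule(Direction.OUTGOING, null, null, 23, true),           // outgoing Telnet is allowed
            new Rule(Direction.INCOMING, null, "199.1.32.80", 80, true)); // Web traffic only to the web server
    }
}
```

Because no rule allows incoming connections on port 23, incoming Telnet falls through to the default and is rejected.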

Proxy Servers

Proxy servers are related to firewalls. If a firewall prevents hosts on a network from making direct connections to the outside world, a proxy server can act as a go-between. Thus a machine that is prevented from connecting to the external network by a firewall would make a request for a web page from the local proxy server instead of requesting the web page directly from the remote web server. The proxy server would then request the page from the web server and forward the response to the original requester. Proxies can also be used for FTP services and other connections. One of the security advantages of using a proxy server is that external hosts find out only about the proxy server. They do not learn the names and IP addresses of the internal machines, making it more difficult to hack into internal systems.

While firewalls generally operate at the level of the transport or internet layer, proxy servers operate at the application layer. A proxy server has detailed understanding of some application-level protocols, like HTTP and FTP. Packets that pass through the proxy server can be examined to ensure that they contain data appropriate for their type. For instance, FTP packets that seem to contain Telnet data can be rejected. Figure 2-3 shows how proxy servers fit into the layer model.


Figure 2-3. Layered connections through a proxy server

As long as all access to the Internet is forwarded through the proxy server, access can be tightly controlled. For instance, a company might choose to block access to http://www.playboy.com but allow access to http://www.microsoft.com. Some companies allow incoming FTP but disallow outgoing FTP so that confidential data cannot be as easily smuggled out of the company. Some companies have begun using proxy servers to track their employees’ web usage so that they can see who’s using the Internet to get tech support and who’s using it to check out the Playmate of the Month. Such monitoring of employee behavior is controversial and not exactly an indicator of enlightened management techniques.

Proxy servers can also be used to implement local caching. When a file is requested from a web server, the proxy server will first check to see whether the file is in its cache. If the file is in the cache, then the proxy will serve the file from the cache rather than from the Internet. If the file is not in the cache, then the proxy server will retrieve the file, forward it to the requester, and store it in the cache for the next time it is requested. This scheme can significantly reduce load on an Internet connection and greatly improve response time. America Online (AOL) runs one of the largest farms of proxy servers in the world to speed the transfer of data to its users. If you look at a web server log file, you’ll probably find some hits from clients with names like www-d1.proxy.aol.com, but not as many as you’d expect given the more than 20 million AOL subscribers. That’s because AOL requests only pages it doesn’t already have in its cache. Many other large ISPs do similarly.
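The check-then-fetch logic of a caching proxy can be sketched as follows. This is a deliberately simplified model, not a working proxy: the class CachingProxySketch is hypothetical, the origin function stands in for the real network request, and the sketch ignores expiration, so a cached file is never refreshed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CachingProxySketch {

    private final Map<String, byte[]> cache = new HashMap<>();
    private final Function<String, byte[]> origin; // hypothetical stand-in for the remote fetch
    private int originRequests = 0;

    CachingProxySketch(Function<String, byte[]> origin) {
        this.origin = origin;
    }

    // Serves from the cache when possible; otherwise fetches, stores, and returns.
    byte[] get(String url) {
        byte[] cached = cache.get(url);
        if (cached != null) {
            return cached;               // cache hit: nothing goes out to the Internet
        }
        byte[] fresh = origin.apply(url);
        originRequests++;
        cache.put(url, fresh);           // remember it for the next requester
        return fresh;
    }

    // How many requests actually reached the origin server.
    int originRequests() {
        return originRequests;
    }
}
```

If many clients ask for the same URL, originRequests grows by only one, which is why a web server log shows fewer hits from a large caching proxy than its user count would suggest.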

The biggest problem with proxy servers is their inability to cope with all but a few protocols. Generally established protocols like HTTP, FTP, and SMTP are allowed to pass through, while newer protocols like Napster are not. (Some network administrators would consider that a feature.) In the rapidly changing world of the Internet, this is a significant disadvantage. It’s a particular disadvantage for Java programmers because it limits the effectiveness of custom protocols. In Java, it’s easy and often useful to create a new protocol that is optimized for your application. However, no proxy server will ever understand these one-of-a-kind protocols.

Applets that run in web browsers will generally use the proxy server settings of the web browser itself. This is generally set in a dialog box (possibly hidden several levels deep in the preferences) like the one shown in Figure 2-4. Standalone Java applications can indicate the proxy server to use by setting the socksProxyHost and socksProxyPort properties (if you’re using a SOCKS proxy server), or http.proxySet, http.proxyHost, http.proxyPort, https.proxySet, https.proxyHost, https.proxyPort, ftpProxySet, ftpProxyHost, ftpProxyPort, gopherProxySet, gopherProxyHost, and gopherProxyPort system properties (if you’re using protocol-specific proxies). You can set system properties from the command line using the -D flag like this:

java -DsocksProxyHost=socks.cloud9.net -DsocksProxyPort=1080 MyClass

These can also be set by any other convenient means to set system properties, such as including them in the appletviewer.properties file like this:

ftpProxySet=true
ftpProxyHost=ftp.proxy.cloud9.net
ftpProxyPort=1000
gopherProxySet=true
gopherProxyHost=gopher.proxy.cloud9.net
gopherProxyPort=9800
http.proxySet=true
http.proxyHost=web.proxy.cloud9.net
http.proxyPort=8000
https.proxySet=true
https.proxyHost=web.proxy.cloud9.net
https.proxyPort=8001
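System properties can also be set from inside a program with System.setProperty(). The following sketch uses the property names and example hosts given above; the class ProxySettings and its two methods are hypothetical helpers. It assumes the properties are set before the program opens its first network connection.

```java
final class ProxySettings {

    // Equivalent to -DsocksProxyHost=... -DsocksProxyPort=... on the command line.
    static void useSocksProxy(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    // Equivalent to -Dhttp.proxyHost=... -Dhttp.proxyPort=... on the command line.
    static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        useSocksProxy("socks.cloud9.net", 1080);
        useHttpProxy("web.proxy.cloud9.net", 8000);
        System.out.println(System.getProperty("socksProxyHost"));
        System.out.println(System.getProperty("http.proxyPort"));
    }
}
```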

Figure 2-4. Netscape Navigator proxy server settings



[7] In the near future, this function will be assumed by the Internet Corporation for Assigned Names and Numbers (ICANN).

[8] Addresses with the last byte either .0 or .255 are reserved and should never actually be assigned to hosts.
