Discover How Live Score Applications Work

Practically any baseball site that presents live score data is probably accessing a hidden data source that you can use, too—provided you know where to look.

Many sports web sites include features for following along with games in real time. These features are usually implemented as either Java applets or Flash applications. In either case, they’re miniature programs that your web browser downloads on the fly and runs on your local machine. These applications then request data (images, plays, and statistics) from the Internet and display it on your computer. If you just want to collect the data on a baseball game for analysis, but you don’t want to follow along in real time, you need to figure out where the application gets its data.

Today, most live score applications simply get data through HTTP requests to web servers. This is the same way your web browser retrieves web pages from the Internet. This makes it very easy for the baseball hacker to get what he wants from those servers. In the future, these applications might get more sophisticated and start using web services mechanisms like SOAP, or they might start implementing digital rights management (DRM) features to make your life harder. For now, you can use a few simple tricks to figure out where live score applications get data.
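To make the idea of "the same kind of HTTP request your browser makes" concrete, here is a minimal Python sketch using only the standard library. The URL in the usage comment and the User-Agent string are hypothetical placeholders, not anything a real score application is known to use; treat this as an illustration under those assumptions rather than a finished tool.

```python
import urllib.request


def build_request(url):
    """Build (but do not send) a plain HTTP GET request for the given URL."""
    # The User-Agent value is an arbitrary placeholder chosen for this sketch.
    return urllib.request.Request(url, headers={"User-Agent": "score-fetcher-sketch"})


def fetch(url, timeout=10):
    """Send the GET request and return the raw response body as bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


# Usage (hypothetical URL, needs network access):
#   body = fetch("http://example.com/scores.txt")
#   print(body[:200])
```

Once you know the real address a live score application uses, a function like `fetch` is all it takes to retrieve the same document yourself.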

Use Your Router’s Content Filtering Feature

Many home cable/DSL routers or wireless routers have a URL filtering function to keep children from viewing certain content. (If you’re a parent, I hope you’re not using this feature to block your children from reading about baseball!) My $50 wireless router does this (I didn’t even know it had this function when I bought it). Products from Linksys and D-Link also offer this functionality. Figure 3-8, for example, shows the log page for the content filtering function in my NETGEAR router.

As you can see, when I launched the MLB.com Gameday application, an entry appeared for gd2.mlb.com, indicating that this application was fetching data from this host.

Figure 3-8. Log page from a wireless router

Use a Proxy Server

A better option is to install a proxy server on your Windows PC and to use this to look at what documents your browser gets from the Internet. I recommend Proxy Sniffer, available from http://www.proxy-sniffer.com. (I downloaded my copy from CNET’s site, http://www.download.com.) This software logs every interaction between your web browser and other servers in a readable format. Plus, a free edition is available that works perfectly for this application.

To use the proxy server, begin by downloading and installing the software. Next, start the Proxy Sniffer server by opening the program from your Start menu (select Start → Programs → Proxy Sniffer → Run Proxy Sniffer Server). By default, the server uses port 7999.

Next, configure the browser to use the proxy server. Open the Options… dialog box from the Tools menu in your browser (either Internet Explorer or Firefox). In Internet Explorer, click the Connections tab, click the LAN Settings button, and then enter localhost for the address and 7999 for the port. In Firefox, click the Connection Settings button under the Options tab; then select “Manual proxy configuration” and enter localhost for the HTTP proxy address and 7999 for the port.
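If you later want a script's requests to pass through the same proxy (so they show up in the same recording as your browser traffic), the equivalent settings can be sketched in Python. This assumes the proxy is listening on localhost port 7999, as described above; the function names are my own, invented for illustration.

```python
import urllib.request

PROXY_HOST = "localhost"  # same address entered in the browser dialog
PROXY_PORT = 7999         # the default Proxy Sniffer port mentioned in the text


def proxy_map(host=PROXY_HOST, port=PROXY_PORT):
    """Return the scheme-to-proxy mapping that urllib expects."""
    return {"http": f"http://{host}:{port}"}


def make_proxy_opener(host=PROXY_HOST, port=PROXY_PORT):
    """Build an opener whose HTTP requests are routed through the proxy."""
    handler = urllib.request.ProxyHandler(proxy_map(host, port))
    return urllib.request.build_opener(handler)


# Usage (requires the proxy to be running; the URL is a placeholder):
#   opener = make_proxy_opener()
#   body = opener.open("http://example.com/", timeout=10).read()
```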

Open the management interface by selecting Start → Programs → Proxy Sniffer → Proxy Sniffer Web Admin. Click the Start Record link in the upper-left corner. Now, start running the baseball score application. Once the application loads and is displaying data, return to the admin interface and click Stop Record. As shown in Figure 3-9, you will see a record of every document fetched by your web browser, which you can use to decipher what the baseball score application is doing.

Figure 3-9. Proxy sniffer

Use a Packet Sniffer

If you’re really hardcore, you can try using a packet sniffer to snoop on your network traffic. Packet sniffers monitor your network interfaces, showing all traffic moving over the network. This technique guarantees that you will see how an application communicates with other servers on the Internet. (However, it doesn’t guarantee that you’ll be able to understand what’s going on.)

The easiest tool to use for this purpose is Ethereal. Ethereal is a GUI-driven packet sniffer. It looks at every packet sent over a network interface, decomposing the network traffic into a nice, organized format. Ethereal is an open source application, so you can download it free of charge. For more information about this application, see http://www.ethereal.com.

Warning

This hack is probably the most technical one in this book. I don’t go through all the steps involved in using Ethereal or tcpdump—this is a book about baseball, not computer networking! If you know something about computer networks and network applications, you’ll probably be comfortable reading the manpages and help files for these tools to figure out what they do.

When I did my investigations, I actually used a different tool, called tcpdump, which is another packet sniffer. This one is text driven (you run it from the command line) and is included with most Linux and BSD Unix distributions. It’s not as user friendly as Ethereal, though you can use Ethereal to read its output files. Here’s the call I used to monitor the traffic, dumping the results into the packets.dump file:

	root# tcpdump -A -s 0 -w packets.dump ip

As the Flash application loads, it requests information from several different web sites. Looking for GET requests (where an application was fetching a document from an HTTP server), I found that the application was fetching items in a components directory on host 216.74.142.22. (Further investigation revealed that this was gd2.mlb.com.)

Here is a sample of what it was showing:

	10:17:32.992840 IP (tos 0x0, ttl  64, id 65008, offset 0, flags [DF], length: 40) 192.168.0.2.52835 > 216.74.142.22.http: . [tcp sum ok] 1020:1020(0) ack 245 win 65535

	.   [S.P..$/[l..E..(..@[email protected][N..v..P…k…
	10:17:33.021167 IP (tos 0x0, ttl  64, id 65009, offset 0, flags [DF], length: 1064) 192.168.0.2.52837 > 216.74.142.22.http: P [bad tcp cksum 2b26 (->faef)!] 4284516538:4284517562(1024) ack 1752587775 win 65535

	.   [S.P..$/[l..E..(..@[email protected].`..hv].P…+&..GET /components/game/year_2004/month_07/day_01/gid_2004_07_01_bosmlb_nyamlb_1/players.txt HTTP/1.1
	Host: gd2.mlb.com
	User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
	Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
	Accept-Language: en-us,en;q=0.5
	Accept-Encoding: gzip,deflate
	Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
	Keep-Alive: 300
	Connection: keep-alive

Notice the GET request and the Host header that follows it in the preceding output: this is where the application is fetching data from this server. If you want to use another data source or if this one changes in the future, you can use this method and a little detective work to find where to get the data.
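Scanning a long capture by eye gets tedious, so here is a rough Python sketch of the same detective work: it looks for GET request lines in text output (for example, what `tcpdump -A -r packets.dump` prints) and pairs each one with the Host header that follows it. The function name and regular expressions are my own, and the sketch assumes each request line and header sits on a single unwrapped line, which will not hold for every capture.

```python
import re

# Matches a request line such as "GET /some/path HTTP/1.1", even when it is
# preceded by binary junk on the same line.
REQUEST_LINE = re.compile(r"GET (\S+) HTTP/1\.[01]")
# Matches a "Host: name" header at the start of a line.
HOST_HEADER = re.compile(r"^\s*Host:\s*(\S+)", re.MULTILINE)


def find_get_urls(capture_text):
    """Return the URLs of all GET requests found in ASCII capture text."""
    urls = []
    matches = list(REQUEST_LINE.finditer(capture_text))
    for i, match in enumerate(matches):
        # Only look for the Host header before the next request begins.
        limit = matches[i + 1].start() if i + 1 < len(matches) else len(capture_text)
        host = HOST_HEADER.search(capture_text, match.end(), limit)
        if host:
            urls.append(f"http://{host.group(1)}{match.group(1)}")
    return urls


# Usage, assuming the capture was first converted to text:
#   with open("packets.txt", errors="replace") as f:
#       for url in find_get_urls(f.read()):
#           print(url)
```

Requests with no Host header are skipped rather than guessed at, so the list may be incomplete, but every URL it returns comes straight from the capture.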
