Extending HTTP servlets for GET/POST methods

Now that we have our web scraper written, we need a way to take the VideoGame objects that are returned and actually store them in our database. Furthermore, we need a way to communicate with our server once it's up and running and tell it to scrape the site and insert the results into our JDO database. Our gateway for communicating with our server is through what's called an HTTP servlet—something that we briefly mentioned earlier in the book.

Setting up your backend in this way will be especially useful when we talk later about CRON jobs which, in order to automatically run some kind of function, require a servlet to communicate with (more on this soon). For now though, let's see how we can extend the HttpServlet class and implement its doGet() method, which will listen and handle all HTTP GET requests sent to it. But first, what exactly is an HTTP GET request? Well, an HTTP web request is simply a user making a request to some server that will be sent over the network (that is, the Internet). Depending on the type of request, the server will then handle and send an HTTP response back to the user. There are then two common types of HTTP requests:

  • GET request—web requests that are only meant to retrieve data. These web requests will typically ask the server to query for some kind of data to be returned.
  • POST request—web requests that submit data to be processed. Typically, this will ask the server to insert some kind of data that was submitted by the user.
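The difference is easiest to see from the client's side. As a quick illustration (a minimal sketch using the standard java.net classes; the URL shown is just a placeholder, and no connection is actually opened), a GET request simply names the resource to retrieve, while a POST additionally carries a body to be processed:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class RequestTypes {
    public static void main(String[] args) throws Exception {
        // A GET request just names the resource to retrieve
        HttpURLConnection get = (HttpURLConnection)
                new URL("http://example.com/videoGameScrapeServlet").openConnection();
        get.setRequestMethod("GET"); // the default, shown here for clarity
        System.out.println(get.getRequestMethod()); // GET

        // A POST request also submits data in a request body
        HttpURLConnection post = (HttpURLConnection)
                new URL("http://example.com/videoGameScrapeServlet").openConnection();
        post.setRequestMethod("POST");
        post.setDoOutput(true); // signals that we intend to write a body
        System.out.println(post.getRequestMethod()); // POST
        // post.getOutputStream() is where the submitted data would be written
    }
}
```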

In this case, since we aren't getting any data for a user or submitting any data from a user (in fact we're not really interacting with any users at all), it really doesn't make a difference which type of request we use, so we'll stick with the simpler GET request as follows:

import java.io.IOException;
import java.util.ArrayList;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;


// EXTEND THE HTTPSERVLET CLASS TO MAKE THIS METHOD AVAILABLE
// TO EXTERNAL WEB REQUESTS, NAMELY CLIENTS AND CRON JOBS
public class VideoGameScrapeServlet extends HttpServlet {

  private ArrayList<VideoGame> games;

  /**
   * METHOD THAT IS HIT WHEN HTTP GET REQUEST IS MADE
   * 
   * @param request
   * a servlet request object (any params passed can be retrieved
   * with this)
   * @param response
   * a servlet response that you can embed data back to user
   * @throws IOException
   * @throws ServletException
   */
  public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
    games = new ArrayList<VideoGame>();
    String message = "Success";
    try {
      // GRAB GAMES FROM ALL PLATFORMS
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.DS));
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.PS2));
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.PS3));
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.PSP));
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.WII));
      games.addAll(VideoGameScraper.getVideoGamesByConsole(VideoGameConsole.XBOX));
    } catch (Exception e) {
      e.printStackTrace();
      message = "Failed";
    }

    // HERE WE ADD ALL GAMES TO OUR VIDEOGAME JDO WRAPPER
    VideoGameJDOWrapper.batchInsertGames(games);

    // WRITE A RESPONSE BACK TO ORIGINAL HTTP REQUESTER
    response.setContentType("text/html");
    response.setHeader("Cache-Control", "no-cache");
    response.getWriter().write(message);
  }
}

So the method itself is quite simple. We already have our getVideoGamesByConsole() method from earlier, which does all the scraping and returns a list of VideoGame objects as a result. We simply run it for every console that we want, and at the end use our nifty JDO database wrapper class, calling its batchInsertGames() method for quicker insertions. Once that's done, we take the HTTP response object that is passed in and quickly write some kind of message back to the user to let them know whether or not the scraping was successful. In this case, we don't make use of the HttpServletRequest object that gets passed in, but that object will come in very handy if the requester passes parameters into the URL. For instance, say you wanted to write your servlet in a way that only scrapes one specific game platform instead of all of them. In that case, you would need some way of passing a platform-type parameter to your servlet, and then extracting that passed-in parameter value within the servlet. Well, just as we saw earlier that Yahoo! Finance allows you to pass in tickers as key-value pairs, to pass in a platform type we could simply do the following:

http://{your-GAE-base-url}.appspot.com/videoGameScrapeServlet?type=Xbox

Then, on the servlet side do:

  public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
    String type = request.getParameter("type");
    games = new ArrayList<VideoGame>();
    String message = "Success";
    try {
      // GRAB GAMES FROM SPECIFIC PLATFORM
      games.addAll(VideoGameScraper.getVideoGamesByConsole(type));
    } catch (Exception e) {
      e.printStackTrace();
      message = "Failed";
    }

    // ADD GAMES TO JDO DATABASE
    VideoGameJDOWrapper.batchInsertGames(games);

    // WRITE A RESPONSE BACK TO ORIGINAL HTTP REQUESTER
    response.setContentType("text/html");
    response.setHeader("Cache-Control", "no-cache");
    response.getWriter().write(message);
  }
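One thing to watch for here: getParameter() returns null when the key is absent from the URL. A small defensive helper (hypothetical—platformOrDefault is not part of the book's code) keeps the servlet from scraping nothing when the caller forgets the type parameter:

```java
public class ParamDefaults {

    // Hypothetical helper: fall back to a default platform when the
    // "type" parameter is missing or blank
    static String platformOrDefault(String type, String fallback) {
        return (type == null || type.trim().isEmpty()) ? fallback : type;
    }

    public static void main(String[] args) {
        // getParameter("type") returns null if ?type=... was omitted
        System.out.println(platformOrDefault(null, "Xbox"));  // Xbox
        System.out.println(platformOrDefault("Wii", "Xbox")); // Wii
    }
}
```

Inside doGet(), you would then hand platformOrDefault(request.getParameter("type"), "Xbox") to the scraper instead of the raw parameter value.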

Pretty simple, right? You just have to make sure that the key used in the URL matches the parameter you request within the servlet class. Now, the last and final step in getting this all hooked together is defining the URL path in your GAE project—namely, making sure your GAE project knows that the URL pattern actually points to this class you just wrote. This is done in your GAE project's /war/WEB-INF/ directory, specifically in the web.xml file. There you'll need to add the following, making sure that the servlet name and class path match the given URL pattern:

<?xml version="1.0" encoding="utf-8"?>

<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <servlet>
    <servlet-name>videoGameScrapeServlet</servlet-name>
    <servlet-class>app.httpservlets.VideoGameScrapeServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>videoGameScrapeServlet</servlet-name>
    <url-pattern>/videoGameScrapeServlet</url-pattern>
  </servlet-mapping>
</web-app>
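With the mapping in place, any client can trigger the scrape by requesting the mapped path with the matching query key. As a sketch (the base host below is a placeholder, and the buildScrapeUrl helper is hypothetical), building such a request URL in Java might look like this—URLEncoder only really matters once parameter values contain spaces or special characters:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ScrapeUrlBuilder {

    // Hypothetical helper: append a single ?key=value query parameter,
    // URL-encoding the value so spaces and special characters survive
    static String buildScrapeUrl(String base, String key, String value)
            throws UnsupportedEncodingException {
        return base + "?" + key + "=" + URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String base = "http://your-app.appspot.com/videoGameScrapeServlet";
        System.out.println(buildScrapeUrl(base, "type", "Xbox"));
        // http://your-app.appspot.com/videoGameScrapeServlet?type=Xbox
    }
}
```

Note that the key passed here ("type") has to match the string your servlet hands to request.getParameter().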

At this point, we have our scraper, we have our JDO database, and we even have our first servlet all hooked up and ready to go. The last part is scheduling your scraper to run periodically; that way, your database has the latest and most up-to-date data, without you having to sit in front of your computer every day and manually call your scraper. In this next section, we'll see how we can use CRON jobs to accomplish just this.
