The synchronous scraper uses only Python standard library modules such as urllib. It downloads the home pages of three popular sites plus a fourth site whose loading time can be artificially delayed to simulate a slow connection. It prints each page's size and the total running time.
Here's the code for the synchronous scraper located at src/extras/sync.py:
"""Synchronously download a list of webpages and time it""" from urllib.request import Request, urlopen from time import time sites = [ "http://news.ycombinator.com/", "https://www.yahoo.com/", "http://www.aliexpress.com/", "http://deelay.me/5000/http://deelay.me/", ] def find_size(url): req = Request(url) with urlopen(req) as response: page = response.read() return len(page) def main(): for site in sites: size = find_size(site) print("Read {:8d} chars from {}".format(size, site)) if __name__ == '__main__': start_time = time() main() print("Ran in {:6.3f} secs".format(time() - start_time))
On a test laptop, this code took 17.1 seconds to run. Since each download starts only after the previous one finishes, this total is roughly the cumulative loading time of all the sites. Let's see how the asynchronous code runs.