13.1 Data and the Internet

The internet has revolutionized the way businesses interact with customers, suppliers, and contractors, as well as the way we use social media and other web-based applications in our everyday lives. The terms internet and World Wide Web are sometimes used interchangeably, but they have different meanings, and the internet predates the World Wide Web by a few decades. The internet is a large network composed of many smaller interconnected networks (an internetwork). It developed from ARPANET, a communications network created in the late 1960s with funds from the Advanced Research Projects Agency (ARPA, later renamed the Defense Advanced Research Projects Agency, or DARPA) for the purpose of linking government and academic research institutions. The network came to use a common protocol suite, TCP/IP (Transmission Control Protocol/Internet Protocol), which it adopted as its standard in 1983, to facilitate communications between sites. Later, the National Science Foundation took over responsibility for managing the network, which became known as the internet.

Although the internet allowed access to resources at distant sites, it required considerable sophistication on the part of the user, who had to find the resources of interest, log on to the remote system containing them, navigate the directories of that system to find the desired files, copy the files to a local directory, and then display them correctly.

In 1989, Tim Berners-Lee proposed a method of simplifying access to resources that led to the development of the World Wide Web (referred to here as the web). His proposal included the following:

  • A method to identify the name of a web resource, called a Uniform Resource Identifier (URI), with a Uniform Resource Locator (URL) being a type of URI

  • A standard protocol, Hypertext Transfer Protocol (HTTP), for transferring documents over the internet

  • A language, Hypertext Markup Language (HTML), that integrates instructions for displaying documents into the documents themselves

Berners-Lee’s proposal made it possible to automate the complicated process of finding, downloading, and displaying files on the internet.
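As a rough illustration of how the first two pieces fit together, the following Python sketch splits a URL into its components and builds the text of a minimal HTTP GET request from them. The URL, the function name build_get_request, and the choice of headers are assumptions made for this example only; nothing is sent over the network.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://www.example.com:8080/docs/index.html?lang=en#intro"

# urlsplit breaks the URL into scheme, network location, path, query, and fragment.
parts = urlsplit(url)


def build_get_request(parts):
    """Return the text of a minimal HTTP/1.1 GET request for a parsed URL.

    The fragment ("#intro") is deliberately left out: it is handled by the
    client after the document arrives and is not sent to the server.
    """
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


request_text = build_get_request(parts)

print("scheme:  ", parts.scheme)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("path:    ", parts.path)
print("query:   ", parts.query)
print("fragment:", parts.fragment)
print(request_text)
```

The URL tells the client which protocol to use, which host to contact, and which resource to request. The HTTP request carries only the path and query to that host, and the HTML document returned in the response tells the browser how to display the content.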

Today, the web can be viewed as a massive collection of data resources. The data can be structured, unstructured, or semistructured. Structured data has a predefined data model, such as the relational or object-oriented model, and is the type of data stored in traditional databases. The model defines how the data is stored, accessed, and processed, making it readily accessible with predictable performance. Unstructured data has no predefined model and no apparent organization. Examples include pictures, video files, audio files, sensor data, and PDF documents. Much of the data used in ordinary business transactions and on the web is unstructured. Such data is difficult to analyze, although tools are available for handling it. Semistructured data falls between these two categories. It does not require a predefined schema, but it has tags or markers that make it self-describing, separating and identifying its elements and making it easier to analyze than unstructured data. Common examples of semistructured data are JavaScript Object Notation (JSON) and Extensible Markup Language (XML) documents.
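The difference between structured and semistructured data can be sketched in a few lines of Python. The customer table, the sample records, and all field names below are invented for illustration. The relational table must declare its columns before any row is stored, while the JSON and XML records carry their own field names and need not all have the same fields.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET

# Structured: the schema is declared first, and every row must conform to it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ana Lopez', 'Boston')")
row = conn.execute("SELECT name, city FROM customer WHERE id = 1").fetchone()

# Semistructured (JSON): each record names its own fields, and the two
# records below do not share the same set of fields.
json_text = """
[
  {"id": 1, "name": "Ana Lopez", "city": "Boston"},
  {"id": 2, "name": "Raj Patel", "phones": ["555-0100", "555-0101"]}
]
"""
records = json.loads(json_text)
json_fields = [sorted(record.keys()) for record in records]

# Semistructured (XML): tags mark and identify each element.
xml_text = """
<customers>
  <customer id="1"><name>Ana Lopez</name><city>Boston</city></customer>
  <customer id="2"><name>Raj Patel</name><phone>555-0100</phone></customer>
</customers>
"""
root = ET.fromstring(xml_text)
xml_fields = [[child.tag for child in customer] for customer in root.findall("customer")]

print("relational row:", row)
print("JSON fields per record:", json_fields)
print("XML tags per record:", xml_fields)
```

Because the field names travel with the data, a program can discover the structure of each JSON or XML record as it reads it, which is what makes such data easier to analyze than unstructured content like images or audio.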
