Redefining the web

The web, built on the HTTP protocol, consists of an enormous set of technologies. A full overview of these is beyond the scope of this book, but the reader should know that many of the original ideas, definitions, and abstractions have since been loosened and remade.

Originally, resources on the web were considered to be web pages, that is, pages for human consumption. Resources were identified using Uniform Resource Locators, or URLs. It was quickly realized that this was not a good abstraction, and the URL construct was generalized into the Uniform Resource Identifier, or URI. Not all identifiers point to actual resources that can be retrieved from the web; XML namespaces are one example. Furthermore, resources do not need to be pages. They can be data items, as is the case in the Semantic Web and Linked Data, where each data subject can be identified by a URI.

We can also access dynamic content and Application Programming Interfaces (APIs) using URLs. The first popular web service architecture was based on the Simple Object Access Protocol, or SOAP, and focused on semantics rather than content. It became one of the cornerstones of the emerging paradigm that came to be known as Service-Oriented Architecture, or SOA.

Today, a new service-oriented architecture, based on Representational State Transfer, or REST, has become very popular. One of the reasons is that RESTful interfaces are more loosely coupled than SOAP interfaces, which makes them easier to extend and adapt. Another reason is that the representation is separated from the resource: you can extend a service to support multiple content types, or representations of the data, without modifying the interface itself. This, for instance, permits the use of JSON, which simplifies consumption from JavaScript clients. RESTful APIs focus more on content and resources than on semantics, and URLs play an important part in this. RESTful interfaces also enforce server statelessness and require each client request to contain all the information needed to process it. This is done so that the service can scale, by allowing multiple servers to process incoming requests. To facilitate interaction with resources, each response is assumed to be self-describing and to contain a set of links specifying the operations the client can perform on the corresponding resource.
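To make these ideas concrete, the following is a minimal sketch, in TypeScript, of what such a self-describing JSON representation might look like and how a client could consume it. The endpoint https://example.com/things/42 and the field names (links, rel, href) are hypothetical illustrations; REST does not prescribe a particular link format.

```typescript
// Hypothetical self-describing representation of a resource: data plus links
// telling the client which operations are available next.
interface Link {
  rel: string;   // relation, for example "self", "update", or "delete"
  href: string;  // URL of the operation's target resource
}

interface ThingRepresentation {
  id: string;
  temperature: number;
  links: Link[]; // hypermedia links describing what the client may do next
}

async function readThing(url: string): Promise<ThingRepresentation> {
  // Content negotiation: ask for a JSON representation of the resource.
  // The same URL could also serve XML or HTML without changing the interface.
  const response = await fetch(url, { headers: { Accept: "application/json" } });
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }
  return (await response.json()) as ThingRepresentation;
}

// Usage: the request carries everything the server needs (statelessness),
// and the response tells the client which further operations exist.
readThing("https://example.com/things/42").then((thing) => {
  const update = thing.links.find((l) => l.rel === "update");
  console.log(`Temperature: ${thing.temperature}, update via: ${update?.href}`);
});
```

Note how the request itself carries all required information (the URL and the Accept header), so any server in a farm can answer it, and the links in the response guide the client to its next action.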

Earlier models focusing on related, or linked, content were RSS and Atom. Based on syndication principles, and closely related to Semantic Web technologies, they use URLs to link to related content, a mix of machine-readable and human-readable data.

Authentication is another field where the web is seeing rapid restructuring. Authentication is the means of verifying that a claim is true, particularly a claim of identity. Originally, servers were assumed to know which users had the right to access which content, so client authentication could be done locally on each server. This model worked when content providers published their own content and wanted to control who had access to it in their local environment. But it breaks down in modern architectures that require interoperability between different online entities. This is especially the case for the Internet of Things, where servers are small and typically unaware of external entities, and transactions span a multitude of servers. Clients need to be able to identify themselves in the same manner regardless of the server. New, distributed authentication methods such as JSON Web Tokens, or JWT, and OAuth2 are being defined for this purpose. A distributed authorization framework is also being worked on, under the name User-Managed Access, or UMA.
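The following is a minimal sketch, in TypeScript on Node.js, of what makes a JWT useful in a distributed setting: the token is a signed, self-contained claim, so any server that holds the verification key can check it without keeping its own user database. The subject, the shared secret, and the use of an HMAC (rather than a public/private key pair) are illustrative assumptions, not requirements of the standard.

```typescript
// Minimal JWT sketch (HS256). The subject "sensor-17" and the shared secret are
// hypothetical; a real deployment would typically use a vetted JWT library and,
// often, asymmetric keys so that verifiers never hold the signing secret.
import { createHmac } from "crypto";

const base64url = (data: string | Buffer): string =>
  Buffer.from(data).toString("base64url");

function signJwt(claims: object, secret: string): string {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = base64url(JSON.stringify(claims));
  const signature = base64url(
    createHmac("sha256", secret).update(`${header}.${payload}`).digest()
  );
  return `${header}.${payload}.${signature}`;
}

function verifyJwt(token: string, secret: string): object | null {
  const [header, payload, signature] = token.split(".");
  const expected = base64url(
    createHmac("sha256", secret).update(`${header}.${payload}`).digest()
  );
  // Any server that knows the key can verify the claim locally, without
  // contacting the issuer or keeping per-user state. (Production code would
  // use a constant-time comparison here.)
  return signature === expected
    ? (JSON.parse(Buffer.from(payload, "base64url").toString()) as object)
    : null;
}

// Usage: the same token identifies the client in the same way to every server.
const token = signJwt(
  { sub: "sensor-17", exp: Math.floor(Date.now() / 1000) + 3600 },
  "shared-secret"
);
console.log(verifyJwt(token, "shared-secret"));
```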

Optimization and security are other fields where great change is occurring. The traditional request/response mechanism is limiting in many regards. This has been addressed in efforts such as WebSockets and HTTP/2. With WebSockets, an HTTP connection is negotiated, or upgraded, to become more like a normal full-duplex socket. HTTP/2, an IETF standard, removes the limitation of only being able to pose one question at a time over a connection, greatly enhancing HTTP performance.
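As a minimal, browser-side sketch of the WebSocket idea, the snippet below (TypeScript, with the hypothetical endpoint wss://example.com/events and a made-up subscription message) shows how, once the HTTP upgrade completes, either side can send data at any time, without the client having to poll.

```typescript
// The connection starts as an ordinary HTTP request; after the upgrade
// handshake it behaves like a full-duplex socket.
const socket = new WebSocket("wss://example.com/events");

socket.addEventListener("open", () => {
  // The client can push as well as receive; no more polling.
  socket.send(JSON.stringify({ subscribe: "temperature" }));
});

socket.addEventListener("message", (event) => {
  // The server can push data whenever it likes, without waiting for a request.
  console.log("Server pushed:", event.data);
});
```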

The topology problem of HTTP is still a major obstacle. Clients can connect to servers, but servers normally cannot connect back to clients, since most clients reside behind firewalls. It is even more difficult for actors behind separate firewalls to interconnect. WebRTC aims to provide interconnectivity between actors behind separate firewalls using web technologies. Other alternatives, such as transporting HTTP over XMPP instead of TCP, also solve this problem.
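To illustrate, here is a minimal browser-side sketch of WebRTC's data channel API in TypeScript. The STUN server URL and channel name are hypothetical, and the offer/answer exchange, which each peer must relay through some mutually reachable signaling service (for example a web server or XMPP), is only indicated in comments.

```typescript
// Two peers behind separate firewalls can exchange data directly once ICE
// negotiation has completed; the STUN server helps them discover a usable path.
const peer = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.example.com" }],
});

const channel = peer.createDataChannel("sensor-data");
channel.onopen = () => channel.send("hello from behind a firewall");
channel.onmessage = (event) => console.log("Peer sent:", event.data);

// The offer/answer exchange would normally travel over a signaling service
// that both peers can reach; only the local half is shown here.
peer.createOffer()
  .then((offer) => peer.setLocalDescription(offer))
  .then(() => {
    /* send peer.localDescription to the other peer via signaling */
  });
```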
