2015 Oct 01

A brief discussion of the history of the web, APIs and how it relates to the work we're doing with wicket.io.

This article uses a lot of initialisms and acronyms, but with the intent to clarify their use. To start, an API (Application Programming Interface) is a set of tools or instructions that determine how software components should interact.

These days, connecting distributed systems and applications over the internet is essential in the modern web application space. Connecting two different technologies in distinct locations, possibly across the globe, is a challenging endeavour. Without the amazing capabilities of the internet and its associated technologies - and the foresight of its inventors and stewards - it would be all but impossible.

Before diving into how to connect distributed systems over the internet, we'll have to understand how the internet was founded, built, and subsequently maintained.

So, in the beginning...

The Internet (inter-network)

Well, maybe not the very beginning. The internet was the child of the U.S. Department of Defense and various college and university campuses. It originally referred to a large interconnected network used primarily for defence and research, spearheaded by DARPA (Defense Advanced Research Projects Agency). This original internet was called the ARPANET.

ARPANet was powered by a number of different technologies and communication protocols, but this variance made adoption difficult for newcomers. It was realized that the methods that computers used to communicate needed to be standardized and thus, the IETF (Internet Engineering Task Force) was born.

The IETF established a formal process for these communication protocols to be developed and ratified. The process relies on RFCs (Requests for Comments), which are memoranda written and peer-reviewed by engineers and computer scientists. The IETF formally adopts RFCs into standards, which form the basis of the protocols and technologies used by the internet today.

TCP/IP (Transmission Control Protocol/Internet Protocol)

TCP/IP is a collection of protocols that has evolved from its inception in the 1970s until today. These protocols form the foundation of the internet, and of almost any network made up of more than one computer. Included among their ranks are the following:

  • SMTP -- Simple Mail Transfer Protocol. This is the protocol that determines how email is sent.
  • NTP -- Network Time Protocol. NTP is how time synchronization happens across networked computers. 
  • DHCP -- Dynamic Host Configuration Protocol. A specialized protocol that provides a unique identifier (usually an Internet Protocol, or IP, address) to each connected network device.
  • DNS -- Domain Name System. The distributed and decentralized network that provides the translation between domain names (such as industrialagency.ca) and the associated IP address.
  • FTP -- File Transfer Protocol. A very well named protocol specifically targeted at transferring files of any size between computers.

These protocols are called Application Layer protocols. TCP/IP itself is made up of a number of layers, with a different set of protocols in use at each one and the application layer being the top-most. They are called Application Layer protocols because they are predominantly used directly by your operating system's services or the software you install, while the lower layers are handled beneath them by the networking stack and, at the lowest level, your network hardware.

HTTP (Hypertext Transfer Protocol)

In 1991, the first version of HTTP arrived on the scene and with it, the World Wide Web (WWW) was born. At its core, HTTP is a protocol designed to transfer text, but how this text is transferred and how it is structured are what make HTTP so powerful. Like a lot of protocols, particularly ones that have wide adoption, it is versioned. Currently, the web is powered by versions 1.0 and 1.1 of the HTTP protocol. Version 2.0 (or HTTP/2) was introduced in May 2015 and is not widely adopted (yet!).

HTTP is a request-response protocol. This means that a client (e.g. a web browser) will initiate a request and a server (e.g. web server) will respond. The structure of the request and subsequent response is dictated by the HTTP protocol and the version it is using, which is documented in a number of RFCs.
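
To make the request-response cycle concrete, here is a minimal sketch using only Python's standard library. It starts a throwaway web server on the local machine, has a client send it a single GET request, and returns what the server sent back. The HelloHandler class, the round_trip function and the /hello path are invented for illustration; a real web server does considerably more.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = f"You asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default per-request logging


def round_trip(path):
    """Start a local server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)       # the client initiates the request
        response = conn.getresponse()   # the server responds
        result = (response.status, response.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(round_trip("/hello"))
```

The client never receives anything it did not ask for: every response is the answer to exactly one request, which is the defining trait of a request-response protocol.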

HTTP Request

An HTTP request and response have similar structures. Imagine them as document templates: they both contain a header and a body. The header in a request contains important information such as the request method, the URI (Uniform Resource Identifier) of the web page or image being requested, and the version of the HTTP protocol being used.
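
As a rough illustration of that structure, the sketch below assembles the text of an HTTP/1.1 request by hand and then reads the method, URI and version back out of its first line. The function names build_request and parse_request_line are made up for this example, and it glosses over details (such as encoding and header validation) that a real HTTP library handles.

```python
def build_request(method, uri, headers, body=""):
    """Assemble the text of an HTTP/1.1 request from its parts."""
    lines = [f"{method} {uri} HTTP/1.1"]  # method, URI and protocol version
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line separates the header section from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request_line(raw):
    """Split the first line of a request back into (method, uri, version)."""
    first_line = raw.split("\r\n", 1)[0]
    method, uri, version = first_line.split(" ")
    return method, uri, version


if __name__ == "__main__":
    raw = build_request("GET", "/index.html", {"Host": "example.com"})
    print(raw)
    print(parse_request_line(raw))
```

A GET request like this one usually has an empty body; a form submission would carry its fields after the blank line.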

The request method is very important, particularly in an API being used over the internet. It is often referred to as the verb of the request, because the method name usually is a verb.

The following are the verbs used in HTTP requests today:

  • GET - Requests using GET should only retrieve data and should have no other effect. This is the most common verb in use on the web: whenever you click on a link, your browser issues a GET request to the web server to retrieve that web page. The web page is contained in the body of the response in a structured markup language called HTML (HyperText Markup Language).
  • POST - A POST request is commonly used during a web form submission. It tells the web server that the document enclosed in the body (the form fields) should be treated as a new object relative to the URI (web address) being requested. For example, if you submit a contact form on a web page, it issues a web request to a contact resource (/contact) with a POST verb, which tells the server that this is a new contact form submission.
  • PUT/PATCH - Like a POST request, a PUT request carries a document in its body. The key difference is that it is typically used to update an existing object with that document, rather than create a new one; PATCH is a related verb used to apply a partial update. For example, updating your email address on a service you've subscribed to would typically be handled via a PUT or PATCH request.
  • DELETE - Self-explanatory. This indicates that the object referenced by the URI should be deleted.

There are a few more verbs such as HEAD, TRACE, OPTIONS and CONNECT that are extremely important to the standard HTTP request/response, but outside of the scope of this discussion.

HTTP Response

As mentioned, the HTTP response is structured in a similar manner to the request.

It also contains a header and a body. The difference is that the response contains a status that indicates the result of the request. You have probably come across some of the codes used within the status, such as 200, which indicates OK; 404, which indicates that the requested resource cannot be found; and the 500-series errors, which mean something has gone wrong on the server while handling the request.
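
The first digit of a status code identifies its series, which is why "500-series" is a meaningful phrase. As a small sketch, the hypothetical describe_status function below uses the HTTPStatus enumeration from Python's standard library to look up the standard phrase for a code and label the series it belongs to.

```python
from http import HTTPStatus

# The first digit of a status code identifies the series it belongs to.
SERIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return a readable description such as '404 Not Found (client error)'."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for codes Python does not know
    return f"{code} {phrase} ({SERIES[code // 100]})"


if __name__ == "__main__":
    for code in (200, 404, 500):
        print(describe_status(code))
```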

The response header and body are consumed by the client (e.g. a web browser) and, depending on the type of request made, either a page is rendered or a message is displayed based on actions performed on the server.

REST (REpresentational State Transfer)

At this point we're going to leap ahead in time to the year 2000, when a new architecture was proposed for the design of network-based software. This architecture was named REST (REpresentational State Transfer). It uses the HTTP verbs, as specified in HTTP 1.0/1.1, to provide sensible defaults for the structure of requests and a framework for processing those requests on the server side.

For example, if you were designing a web-based system to manage a user object, you would define the following types of requests on a web server for a client to use:

Note: All of these requests are relative to a web address (URL). Imagine all of the / URIs as prefixed with http://example.com.

  • GET /users -- Retrieves a list of users
  • GET /users/1 -- Retrieves the user identified by 1, where 1 could be any unique identifier for a user
  • POST /users -- Create a new user
  • PUT or PATCH /users/1 -- Update the user identified by 1
  • DELETE /users/1 -- Delete the user identified by 1
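
The requests listed above can be sketched as a tiny in-memory dispatcher. This is only an illustration of how the verb and the URI together select an action: the UserApi class is hypothetical, nothing is sent over a network, PUT and PATCH are both treated as a simple update, and the 201, 204 and 405 status codes are common conventions rather than anything prescribed in this article.

```python
import re


class UserApi:
    """Hypothetical in-memory stand-in for a REST server that manages users."""

    def __init__(self):
        self.users = {}    # id -> user dictionary
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch on the verb and URI; return (status code, response body)."""
        match = re.fullmatch(r"/users(?:/(\d+))?", path)
        if match is None:
            return 404, None
        user_id = int(match.group(1)) if match.group(1) else None

        if user_id is None:                       # the collection: /users
            if method == "GET":
                return 200, list(self.users.values())
            if method == "POST":
                user = {**(body or {}), "id": self.next_id}
                self.users[user["id"]] = user
                self.next_id += 1
                return 201, user                  # 201 Created
            return 405, None                      # verb not supported here

        if user_id not in self.users:             # a single user: /users/<id>
            return 404, None
        if method == "GET":
            return 200, self.users[user_id]
        if method in ("PUT", "PATCH"):
            self.users[user_id].update(body or {})
            return 200, self.users[user_id]
        if method == "DELETE":
            del self.users[user_id]
            return 204, None                      # 204 No Content
        return 405, None


if __name__ == "__main__":
    api = UserApi()
    print(api.handle("POST", "/users", {"name": "Ada"}))
    print(api.handle("GET", "/users/1"))
    print(api.handle("DELETE", "/users/1"))
    print(api.handle("GET", "/users/1"))
```

Notice that the URIs only name things (users, or one user), while the verbs say what to do with them. That split is what makes a REST API predictable to a developer who has never seen it before.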

Prior to REST, a number of standards had been proposed and codified to handle the request-response cycle. These standards really highlight the versatility of HTTP as the body and header used in the request/response cycle were heavily customized, but remained within the specification to handle network-based software requests.

For example, SOA (Service Oriented Architecture), which is still heavily used today, uses a protocol called SOAP (Simple Object Access Protocol). SOAP relies on the body of an HTTP request being written in XML (eXtensible Markup Language), a structured markup language similar to HTML. The web server consumes and parses this XML to determine what actions to take upon the request.
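
To give a feel for what that looks like, here is a rough sketch of a server-side step: parsing a SOAP 1.1 request body to work out which action is being requested. The GetUser operation, its Id parameter, the http://example.com/users namespace and the requested_operation function are all invented for illustration; real SOAP services define their own operations.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical request body: the GetUser operation and its namespace are invented.
REQUEST_BODY = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUser xmlns="http://example.com/users">
      <Id>1</Id>
    </GetUser>
  </soap:Body>
</soap:Envelope>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}", 1)[-1]


def requested_operation(xml_text):
    """Return (operation name, parameters) found in a SOAP request body."""
    root = ET.fromstring(xml_text.strip())
    body = root.find(f"{{{SOAP_NS}}}Body")
    operation = body[0]  # the first child of Body names the requested action
    params = {local_name(child.tag): child.text for child in operation}
    return local_name(operation.tag), params


if __name__ == "__main__":
    print(requested_operation(REQUEST_BODY))
```

Here the action lives inside the XML document rather than in the HTTP verb and URI, so the server has to parse the body before it knows what is being asked of it.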

While extremely powerful, SOA implementations were largely left to the implementors and varied widely in their capabilities, which made them difficult for developers to digest. For example, a language called WSDL (Web Services Description Language) was developed, which was itself a structured XML document that would detail the capabilities of a SOAP endpoint (which in turn exchanged yet more XML documents).

If you're deciding to hop off this post at this point, your reaction is analogous to that of a lot of developers who would regularly work with SOA, SOAP and WSDL...

In any event, REST at its most basic form took the guesswork out of understanding what a web-based request structure would look like, what it would do, and how it would do it, thus speeding up development and providing some standardization across disparate services.

Today

Today, web-based software architectures flourish. The work done throughout the last few decades has built an extremely powerful, robust foundation of protocols and standards for these services to talk to each other.

They communicate using a standardized structure and language determined by features that exist in protocols such as HTTP and TCP/IP, which are well documented, peer-reviewed and available to everyone who wants to interact with the internet.

At its core, wicket.io will be built upon REST, as the vast majority of the services it will be interacting with leverage the same architecture. For example, take a look at the API documentation for Mailchimp, Eventbrite and Stripe. You'll notice remarkable similarities between their APIs and what we've discussed here. This is because they are all built upon REST.

Our core goal with wicket.io is to manage the person and/or member lifecycle and interact with best-in-class applications for services such as Email Marketing, Event Management, and more. Speaking the same language is imperative to ensuring interoperability with these services and, when required, to speeding up new service integration.

Most importantly, though, by leveraging the decades of work that have brought us the modern-day internet, wicket.io is standing on the shoulders of giants while simultaneously walking among them.