How Does The Web Work? Part 3
Three weeks ago, I began researching how the Web works and was sidetracked by the Internet and the multitude of acronyms and terms used when talking about both.
As I mentioned in Part 1 of this series, the most basic response to ‘How does the web work?’ is that the web is a series of data exchanges: requests from clients (usually browsers) and responses from servers. A client can be any of your devices that is connected to the internet. Servers, which are also connected to the internet, are the machines that store the files for applications, programs, websites, etc. Although we can run local servers on our own personal computers, when we talk about servers we usually mean the offsite ones that hold all the information for the sites we visit every day, all day.
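To make the request and response idea concrete, here is a minimal sketch that runs both sides on one machine using only Python's standard library. The names HelloHandler and fetch_once are made up for this illustration, and the server is a toy on 127.0.0.1, not how a production site is hosted.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET request with a tiny HTML page."""

    def do_GET(self):
        body = b"<h1>Hello from the server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_once(path="/"):
    """Start a local server, send it one request, return (status, content type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client side: open a connection, send a GET request, read the response.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, body = fetch_once()
    print(status, content_type, body.decode())
```

The client here plays the role of the browser and the handler plays the role of the site. A real browser does far more with the response, but the exchange has the same shape.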
So now that we have all these server farms across the globe holding everyone’s applications, what actually happens when we want to visit one of these sites? Most people are satisfied knowing that they can just hop on a browser and, as long as they’re connected to the Internet, type in an address like https://www.google.com/ and be taken immediately to Google’s site. One of the first things I found fascinating about this process is that you aren’t really going to that human-readable address. You’re first directed to one of their IP addresses, which looks something like 2001:4860:4860::8888 (that particular address belongs to Google’s public DNS service; the address you get for www.google.com depends on where and when you ask). Amazingly enough, the device that you’re working on, like every device connected to a network, also has its own IP address. To find out your own, you can visit Whatismyipaddress. This site will give you the IP address that your ISP has allotted to you, but keep in mind that it might change, for example if you restart your router.
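If you'd like to see the name-to-address lookup for yourself, here is a small sketch using Python's standard socket and ipaddress modules. The helper names resolve and ip_version are my own, and the live lookup is left commented out because it needs a network connection and returns different addresses depending on where you are.

```python
import ipaddress
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver finds."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def ip_version(address):
    """Report whether a textual address is IPv4 (4) or IPv6 (6)."""
    return ipaddress.ip_address(address).version


if __name__ == "__main__":
    print(ip_version("2001:4860:4860::8888"))  # 6
    print(ip_version("127.0.0.1"))  # 4
    # Needs a network connection; results vary by location and time.
    # print(resolve("www.google.com"))
```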
We open our browser, type in a URL that we’d like to visit, and a set of processes is fired off. The URL is first broken down into 3 parts: the protocol (HTTP or HTTPS), the server name (domain name), and the files (any paths that are listed after the main domain name). The protocol determines how the content of the request will need to be transmitted. Through DNS, the server name is translated from human-readable words to an IP address. The IP address determines the route the request needs to follow through the network to get to the server holding the files of the site we have requested. Once the request has reached the server and the correct application hosted on it, any paths listed need to be addressed. These paths determine whether specific files or assets have been requested directly. Sidenote: There’s also a 4th part, the .com or .org in the URL. This is called the top-level domain and indicates what type of domain the site is, for example: .com is short for commercial².
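The breakdown described above can be sketched with urlsplit from Python's standard library. The function name break_down and the dictionary keys are my own labels, and taking the text after the last dot as the top-level domain is a simplification that works for addresses like the ones in this post.

```python
from urllib.parse import urlsplit


def break_down(url):
    """Split a URL into the parts described above (a simplified view)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "protocol": parts.scheme,        # e.g. "https"
        "server_name": host,             # the domain name that DNS will look up
        # Simplification: treat whatever follows the last dot as the top-level domain.
        "top_level_domain": host.rsplit(".", 1)[-1] if "." in host else "",
        "path": parts.path or "/",       # files or assets requested on that server
    }


if __name__ == "__main__":
    print(break_down("https://www.google.com/"))
    # A made-up address, just to show a longer path:
    print(break_down("https://www.example.org/blog/part-3.html"))
```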
The server then sends back the data in chunks of information, aka packets. Because the data is too large to send all in one chunk, and so that multiple users can visit (or really, download³ information from) the same site at the same time, there can be thousands of packets sent back in a response. These network packets all ‘provide data for delivering the payload (e.g., source and destination network addresses, error detection codes, or sequencing information)’⁴. You can think of these packets as envelopes, and each one has a header with this delivery information. The HTTP response carried inside the packets has headers of its own that describe the data being sent, for example Content-Type. These packets are delivered to the client, and the client, your device, then needs to reassemble them and convert the data into a human-readable web page.
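Here is a toy model of the envelope idea, not the real TCP/IP packet format: the data is cut into numbered chunks, the chunks arrive in a scrambled order, and the receiver uses the sequence numbers to put them back together. The Packet class and both function names are invented for this sketch.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    sequence: int   # position of this chunk in the original data
    total: int      # how many packets make up the full response
    payload: bytes  # the slice of data this packet carries


def split_into_packets(data, size):
    """Cut data into chunks of at most `size` bytes, each labeled with its position."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [Packet(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets):
    """Put packets back in order and join their payloads; fail if any are missing."""
    if not packets:
        return b""
    ordered = sorted(packets, key=lambda p: p.sequence)
    if len(ordered) != ordered[0].total:
        raise ValueError("some packets are missing")
    return b"".join(p.payload for p in ordered)


if __name__ == "__main__":
    page = b"<html><body>How does the web work?</body></html>"
    packets = split_into_packets(page, size=8)
    random.shuffle(packets)  # simulate packets arriving out of order
    print(reassemble(packets) == page)  # True
```

Real networks also handle lost and corrupted packets, which this sketch only hints at with the missing-packet check.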
A fascinating tool that each of us has available actually allows us to inspect this activity. In Chrome, for example, navigate to Google and open your DevTools (on a Mac, ‘Command-Option-J’). Go to the ‘Network’ tab. You can see so many interesting things here, for example, the name of the site visited and the type of content that was sent and received. In Part 2 of this series, I listed a site that was great to visit to see all of the content types that can be sent through HTTP.
There are so many other steps that happen here, from understanding how TCP works to truly understanding HTTP requests, so please know this is just a quick overview of how Web requests and responses happen. But you might be in an interview one day, and your interviewer may just ask you how the Web works. I believe the steps listed above will give you the structure they’re asking for, and then all you need to worry about is how you’re going to solve that algorithm they’re about to give you. Good luck! You got this!