VPN is a wonderful thing that you all have probably heard about. I assume it was something like this: “Using a VPN you can visit websites blocked by state services and engage in any network activity without fear of revealing your actual IP address.”
If you thought that a Virtual Private Network is a magic tool designed specifically for staying anonymous, it is not. It is not even a separate technology, like SSH. In general, a VPN is a virtual network, built with the help of software, on top of a regular network (the Internet).
Data packets in a properly organized VPN are disguised as regular data packets (they even carry a regular header) so that the Internet Service Provider equipment these packets go through recognizes them and transports them correctly. However, these packets contain encrypted data, which the VPN server will decrypt.
In addition, VPN is a means to protect personal data from interception, which is important when working with public networks.
In order to understand how VPNs work, let us turn to the theory.
A network protocol is a set of rules and a sequence of actions that allow two or more connected devices to establish a connection and exchange data.
Encapsulation is the “inclusion” of a packet into a frame, or of one packet into another packet. On Ethernet networks, packets are almost always framed due to the nature of the equipment through which these packets must pass. Encapsulating packets into other packets is not so rare either: some VPN protocols, like IPsec, need it for additional security.
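To make packet-in-packet encapsulation more concrete, here is a minimal sketch. The header layout (4-byte source, 4-byte destination, 2-byte length) is a made-up toy format, not the real IP or IPsec header, and the helper names `make_packet`, `parse_packet` and `ip` are hypothetical. The addresses are placeholders from private and documentation ranges.

```python
import struct

# Hypothetical toy header: 4-byte source, 4-byte destination, 2-byte payload length.
HEADER = struct.Struct("!4s4sH")


def ip(text: str) -> bytes:
    """Convert dotted-quad text such as '10.0.0.2' into 4 raw bytes."""
    return bytes(int(part) for part in text.split("."))


def make_packet(src: bytes, dst: bytes, payload: bytes) -> bytes:
    """Prepend the toy header to the payload."""
    return HEADER.pack(src, dst, len(payload)) + payload


def parse_packet(packet: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a toy packet back into source, destination and payload."""
    src, dst, length = HEADER.unpack_from(packet)
    return src, dst, packet[HEADER.size:HEADER.size + length]


# The inner packet is what the application actually wants to send.
inner = make_packet(ip("10.0.0.2"), ip("10.0.0.9"), b"hello")
# The outer packet carries the whole inner packet as its payload.
outer = make_packet(ip("198.51.100.7"), ip("203.0.113.5"), inner)

# The far end of the tunnel strips the outer header and recovers the inner packet.
_, _, recovered = parse_packet(outer)
print(recovered == inner)
```

Equipment along the way only needs to understand the outer header. In a real VPN the inner packet would also be encrypted before being wrapped, which this sketch leaves out.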
The checksum is a value calculated by passing all the packet data through a specific checksum function. It is used to confirm the integrity of the transmitted data. If data gets changed along the way, re-calculating the checksum will not give the original result calculated by the sender.
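As an illustration, the sketch below implements one well-known checksum function, the 16-bit ones' complement sum described in RFC 1071 and used in IP headers. It is only one possible choice (Ethernet frames, for instance, use a CRC instead), and the sample message is invented for the example.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF


message = b"transfer 100 to alice"
sent_checksum = internet_checksum(message)

# The receiver recomputes the checksum over what actually arrived.
intact = b"transfer 100 to alice"
tampered = b"transfer 900 to alice"
print(internet_checksum(intact) == sent_checksum)
print(internet_checksum(tampered) == sent_checksum)
```

A checksum like this catches accidental corruption. It is not a defense against deliberate tampering, since anyone who changes the data can recompute the checksum as well.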
Our entire global network and the system of protocols that devices use to interact with each other are usually described in terms of the Open Systems Interconnection model (OSI model). Protocols are divided into layers, lined up in a strict hierarchy.
The OSI model is divided into 7 layers:
- Physical layer
- Data link layer
- Network layer
- Transport layer
- Session layer
- Presentation layer
- Application layer
The physical layer (L1) deals with the electrical signals transmitted over the network cable, that is, with bits.
Data link layer
The data link layer (L2) operates on frames. Frames are sets of bytes of information containing the physical addresses of the devices (MAC addresses), the data itself, and the checksum to verify the integrity of the transmitted data. Sometimes frames also contain information on the priority of the transmitted data. The data link layer interacts with the physics of the device (ports) and is responsible for providing access. It forms data transfer channels. Switches work at this level. Switches are devices that receive data over a network cable and distribute it across ports.
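For a feel of what a frame looks like in bytes, here is a small sketch that reads the 14-byte header of an Ethernet II frame: destination MAC, source MAC, and a 2-byte type field that says what the payload is. The function name and the sample frame are made up for illustration. The trailing checksum is left out because software usually receives frames with it already verified and removed by the network card.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Read the 14-byte Ethernet II header from the start of a frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst_mac": dst.hex(":"),
        "src_mac": src.hex(":"),
        "ethertype": ethertype,  # 0x0800 means the payload is an IPv4 packet
        "payload": frame[14:],
    }


# Invented sample: broadcast destination, made-up source address, IPv4 payload.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"packet bytes"
header = parse_ethernet_header(frame)
print(header["dst_mac"], header["src_mac"], hex(header["ethertype"]))
```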
The network layer (L3) is the routing layer. It is the most interesting thing in the whole model, as here the transition begins from simple and clear electric signals to complex and tangled network magic. The network layer protocols determine the paths of the data, select the best delivery route, and check for problems along the way (all this is called routing).
The unit of data at the network layer is called a packet, and although in its structure it looks like a frame, a packet is definitely cooler. Packets can be divided into two parts: the header and the payload. The header includes everything that is needed for proper routing: the IP addresses of the sender and the receiver, the packet length, the checksum, the protocol of the data being carried, and so on.
In order to be transferred over a wired Ethernet network, packets are almost always packed into frames, so that they can go through all the equipment (and this is usually L2 equipment) along their long way through the network. When a frame reaches a switch along the way, the switch sends it further according to the data in the frame header. When a frame reaches a router, the router extracts the packet from the frame and sends it further based on the packet header.
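The header fields listed above can be read straight out of the first 20 bytes of an IPv4 packet. The sketch below assumes a header without options, and the function name and sample bytes are chosen only for illustration.

```python
import ipaddress
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,  # stored as a count of 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,  # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Sample header bytes: a UDP packet from 192.168.0.1 to 192.168.0.199.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
for field, value in parse_ipv4_header(sample).items():
    print(field, value)
```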
The transport layer (L4) is entirely responsible for the delivery of the data: this is the driver of the FedEx truck. Its main task is to guarantee the delivery of data to the correct addresses in the correct sequence. The main protocol at this layer is TCP, from the TCP/IP protocol stack (although there are others, like UDP, which does not guarantee delivery).
Thanks to TCP, you can be sure that no transmitted data is silently lost along the way: anything that goes missing is sent again. If at the 2nd level there are physical ports, and at the 3rd there are IP addresses, then at the 4th level the recipient is a virtual port. For example, the IP address is the house number, the virtual port is the apartment number, and the physical port and the MAC address are the person receiving the letter. Each network process (and the application using it) has its own port on your computer (from 1 to 65,535); port 22 is reserved for SSH, 80 for HTTP (with 8080 as a common alternative), and so on.
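Ports are just 16-bit numbers at the very start of a transport header: in both TCP and UDP the first two bytes are the source port and the next two are the destination port. The sketch below reads them and looks the destination up in a small table. The table `WELL_KNOWN`, the function name and the sample segment are illustrative assumptions, not a complete registry.

```python
import struct

# A tiny, hand-written subset of well-known ports, for illustration only.
WELL_KNOWN = {22: "SSH", 80: "HTTP", 443: "HTTPS"}


def read_ports(segment: bytes) -> tuple[int, int]:
    """Return (source port, destination port) from a TCP or UDP header."""
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    return src_port, dst_port


# Invented segment: a client on port 50514 talking to port 22 on the server.
segment = struct.pack("!HH", 50514, 22) + b"rest of the header and data"
src_port, dst_port = read_ports(segment)
service = WELL_KNOWN.get(dst_port, "unknown")
print(src_port, dst_port, service)
```

Because the field is 16 bits wide, 65,535 is the largest port number that fits.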
The session layer (L5), as the name implies, is responsible for creating and maintaining communication sessions (connections or dialogues). Its protocols, to put it simply, check whether the application is showing signs of life, whether data is being transmitted, and whether it is possible to suspend a session. They also try to re-establish communication if the application is working and is still waiting for data that has not arrived.
The presentation layer (L6) is the level of data transformation and compression needed for the transmitted data (bits, frames, packets) to be correctly displayed as text, image, or music, and vice versa. The presentation layer transforms data into the format that applications understand.
The application layer (L7), the last in the model, is responsible for the interaction between the end user and the network. More precisely, the application layer protocols do not process the packets themselves but put the data obtained from the presentation layer protocols into a format that is convenient and understandable for the end user. The data output itself is handled by applications (browser, mail program, etc.), which by themselves are in no way connected with the network.
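A rough way to picture this transformation is text being encoded to bytes and compressed before sending, then decompressed and decoded on arrival. The sketch below uses UTF-8 and zlib purely as stand-ins. The function names `to_wire` and `from_wire` are hypothetical, and real protocols pick their own encodings and compression schemes.

```python
import zlib


def to_wire(text: str) -> bytes:
    """Encode text as UTF-8 bytes and compress them for transmission."""
    return zlib.compress(text.encode("utf-8"))


def from_wire(data: bytes) -> str:
    """Reverse the process: decompress, then decode back into text."""
    return zlib.decompress(data).decode("utf-8")


original = "café and music, " * 50
wire_bytes = to_wire(original)
restored = from_wire(wire_bytes)

print(len(original.encode("utf-8")), len(wire_bytes))
print(restored == original)
```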
How all layers work
Here is an example of how all layers work. You clicked on an image link in the browser. Application layer protocols like HTTP made this possible by delivering the link to the browser so that it could display it to you. Your click took you to the presentation layer, where a certain set of bytes was formed, containing a request to the server where the picture is located.
Then, at the network layer, a packet is formed, where a header is added to the information created by the browser: who sent it, where it is going, the checksum, and so on.
Then we go down again, to the data link layer. Here, the packet is given a header containing the sender’s and recipient’s MAC addresses, the checksum, and so on: the packet is placed in a frame and flies out of the network card. If any problems arise, the transport layer protocols will come into play. (The session layer in this example is not involved since the communication session is not established). When the packet reaches the addressee, the staircase turns and goes the other way around. All this happens in a fraction of a second.
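The trip down the staircase and back up can be mimicked with a toy model in which each layer simply sticks a text label in front of the data, and the receiver peels the labels off in reverse order. The labels and function names are invented for illustration. Real headers are binary structures, not bracketed words.

```python
# Layers in the order the sender passes through them on the way down.
LAYERS = ["transport", "network", "data link"]


def send(data: bytes) -> bytes:
    """Wrap the data with one toy header per layer, top to bottom."""
    for layer in LAYERS:
        data = f"[{layer}]".encode() + data
    return data


def receive(frame: bytes) -> bytes:
    """Strip the toy headers in the opposite order, bottom to top."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


request = b"GET /cat.png"
on_the_wire = send(request)
print(on_the_wire)
print(receive(on_the_wire) == request)
```

The outermost label belongs to the lowest layer, which is why the frame header is the first thing the receiving network card sees.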