TCP/UDP: a little reminder

Greg
6 min readJan 3, 2021


In the last blog post, we studied the Internet Protocol and how computers can transmit packets between them.

Now let’s look at the two most commonly used upper-layer protocols: UDP and TCP.

Plan

  • UDP
  • TCP
  • Socket abstraction

UDP (User Datagram Protocol)

UDP is very close to the underlying protocol (IP), as it adds few features and little complexity.
It is made to send IP packets without caring whether the destination host receives them or not; a packet sent with UDP may never be received at all.
One benefit of this simplicity is transmission speed, which is useful for cases such as video or audio streaming.
With streaming, a lot of data needs to be transmitted and it is acceptable to lose a small amount of it (such as a single frame of a video).
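To make the "fire and forget" behaviour described above concrete, here is a minimal sketch in Python that sends a single datagram over the loopback interface. The function name `udp_roundtrip` and the payload are made up for illustration, and the timeout is there precisely because UDP never promises that anything will arrive.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # no delivery guarantee, so never wait forever
        # No handshake and no connection: the datagram is simply sent.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(udp_roundtrip(b"frame-001"))
```

On loopback the datagram practically always arrives; on a real network the `recvfrom` call could simply time out, and the sender would never know.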

UDP anatomy

With UDP, the data field of an IP packet is split into two parts: the UDP header and the data.

Figure 1 — UDP anatomy

The header carries two port numbers, one for the source host and one for the destination host, along with a length and a checksum.
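As a rough illustration of that layout, the sketch below builds and parses the 8-byte UDP header (four 16-bit fields: source port, destination port, length, checksum) with Python's `struct` module. The helper names are hypothetical, and the checksum is left at 0 (allowed for UDP over IPv4, where it means "not computed") to keep the example short.

```python
import struct

# Four unsigned 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with a UDP header (checksum left at 0)."""
    length = UDP_HEADER.size + len(payload)  # the length covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a datagram back into its header fields and its data."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[UDP_HEADER.size:length],
    }

if __name__ == "__main__":
    print(parse_udp_datagram(build_udp_datagram(5000, 53, b"hi")))
```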

TCP (Transmission Control Protocol)

TCP is the most widely used protocol (websites, emails, file transfers, …) because it provides a connection-based and reliable delivery service.

With TCP, whatever is sent from host 1 to host 2 must arrive, and host 2 has to be able to put the segments back in the exact order in which host 1 sent them.
TCP establishes a “logical connection” between two peers.
Once the connection is created, a stream of bytes can be exchanged through it.

Figure 2 — Encapsulation of TCP segments into IP packets
Figure 3 — TCP Header

The sequence number is a field used by the source host to specify the position of each segment in the stream sent to the destination host.

The acknowledgment number is a field used by the destination host to inform the source host of what has been received (= acknowledged): it holds the next sequence number the destination expects.

Flags is a field used to specify the type of a TCP segment (open, close, acknowledge, …).
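To see how these three fields sit in the header, here is a small sketch that packs and unpacks the fixed 20-byte part of a TCP header with `struct`. The function names are invented for this example, TCP options are ignored, and the checksum is left at 0, so this is only a way to inspect the fields, not something a real stack would accept.

```python
import struct

# Fixed 20-byte TCP header: source port, destination port, sequence number,
# acknowledgment number, data offset + flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_tcp_header(src_port, dst_port, seq, ack, flags, window=65535):
    """Pack a header without options (checksum and urgent pointer set to 0)."""
    flag_value = 0
    for name in flags:
        flag_value |= FLAG_BITS[name]
    # Data offset = 5 32-bit words (20 bytes), stored in the top 4 bits.
    offset_and_flags = (5 << 12) | flag_value
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, offset_and_flags, window, 0, 0)

def parse_tcp_header(header: bytes) -> dict:
    """Extract the sequence number, acknowledgment number and flags."""
    src, dst, seq, ack, offset_and_flags, _win, _csum, _urg = TCP_HEADER.unpack_from(header)
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_and_flags >> 12) * 4,
        "flags": [name for name, bit in FLAG_BITS.items() if offset_and_flags & bit],
    }

if __name__ == "__main__":
    print(parse_tcp_header(build_tcp_header(40000, 80, 1000, 0, ["SYN"])))
```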

TCP handshake

When a TCP connection is created, a handshake has to take place between the two hosts:

Figure 4 — TCP 3-way handshaking
  1. The client that wants to establish a connection first creates and sends a special initialization TCP segment: a ‘SYN’ flagged segment with a random number ‘m’ in the sequence number field.
  2. The server then returns a special ‘SYN/ACK’ segment to the client. This segment contains a random sequence number ‘n’ and m+1 as acknowledgment number.
  3. Finally, the client acknowledges the server response by sending an ‘ACK’ segment with an acknowledgment number of n+1.

The purpose of this handshake is to make sure both hosts are up and ready and want to speak with each other.
1 = ‘I would like to speak with you’
2 = ‘Thanks for contacting me, I am ok and ready to speak with you’
3 = ‘Great, I’ll begin exchanging data’
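The arithmetic of the three steps above can be sketched in a few lines of Python. Nothing is sent on the network here: the segments are plain dictionaries, the function name is made up, and `m` and `n` can be passed in so the result is predictable. Sequence numbers are 32-bit values, hence the modulo.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence and acknowledgment numbers are 32-bit values

def three_way_handshake(m=None, n=None):
    """Return the three handshake segments as dictionaries (simulation only)."""
    m = random.randrange(SEQ_SPACE) if m is None else m  # client's initial number
    n = random.randrange(SEQ_SPACE) if n is None else n  # server's initial number
    syn = {"flags": "SYN", "seq": m}
    syn_ack = {"flags": "SYN/ACK", "seq": n, "ack": (m + 1) % SEQ_SPACE}
    ack = {"flags": "ACK", "seq": (m + 1) % SEQ_SPACE, "ack": (n + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]

if __name__ == "__main__":
    for segment in three_way_handshake(m=100, n=300):
        print(segment)
```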

  • If host 1 doesn’t receive a response to its ‘SYN’ segment, it will send another one as a new attempt.
  • If host 1 sends a SYN to a port on which no application is listening, the receiving host will respond with an ‘RST’ flagged segment. The ‘RST’ flag signals a hard error, so host 1 should stop trying to connect to this port.
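The ‘RST’ case is easy to observe from a program: in Python it surfaces as a `ConnectionRefusedError`. The sketch below is one way to provoke it on loopback; the function name is made up, and the trick of binding a socket without calling `listen()` is just a convenient way to get a port that is guaranteed to have no listening application.

```python
import socket

def connect_to_non_listening_port() -> str:
    """Try to connect to a port where nothing is listening."""
    # Bound but never put in listening mode: the port is reserved,
    # yet no application accepts connections on it.
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))
    port = holder.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(("127.0.0.1", port))  # the kernel sends the SYN
        return "connected"
    except ConnectionRefusedError:
        # The peer answered the SYN with an RST flagged segment.
        return "refused"
    finally:
        client.close()
        holder.close()

if __name__ == "__main__":
    print(connect_to_non_listening_port())
```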

Once the handshake is complete, the two hosts can exchange data (segments).
Thanks to the sequence and acknowledgment numbers of the segments, the two hosts can put the data back in order and retransmit any segment that was not acknowledged.
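As a simplified picture of what the receiving side does with sequence numbers, the sketch below reorders segments, drops duplicates, and stops at the first gap. It is a toy model, not how a kernel implements it: segments are `(sequence number, payload)` tuples, and the function name `reassemble` is an assumption of this example.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, payload) tuples.

    Returns the contiguous bytes received so far and the next expected
    sequence number, which is what would go in the acknowledgment field.
    """
    by_seq = {}
    for seq, payload in segments:
        if payload:                          # ignore empty segments
            by_seq.setdefault(seq, payload)  # a duplicate keeps the first copy

    stream = bytearray()
    expected = initial_seq
    while expected in by_seq:                # stop at the first missing segment
        payload = by_seq[expected]
        stream += payload
        expected += len(payload)             # each byte consumes one sequence number
    return bytes(stream), expected

if __name__ == "__main__":
    # Segments arrive out of order, and one is duplicated.
    print(reassemble([(105, b"world"), (100, b"hello"), (100, b"hello")], 100))
    # A gap: the segment starting at 102 is missing, so only b"ab" is delivered.
    print(reassemble([(100, b"ab"), (104, b"ef")], 100))
```

In the second call the returned number stays at 102, so the sender would keep seeing acknowledgments for 102 and eventually retransmit the missing segment.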

TCP is quite fascinating and complex: it has to tackle several issues to make the IP protocol reliable and connection-oriented while also maximizing transfer speed (congestion, retransmission, number of segments sent in a given time window, …).

If you want to understand the TCP algorithms in more detail, you should check out the TCP algorithm page (in French):
https://fr.wikipedia.org/wiki/Algorithme_TCP

TCP Connection shutdown

Figure 5 — TCP connection shutdown

Once one of the two hosts is done, it can gracefully close the connection by sending a special FIN flagged segment, which will be acknowledged by the other host (figure 5).
Gracefully closing the connection allows both the server and the client to release the resources (memory, buffers, …) the OS may have allocated to exchange data.
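From a program's point of view, the FIN shows up as an end-of-stream marker: the peer's `recv` returns an empty byte string. Here is a minimal loopback sketch of that behaviour; the function name is invented, and both ends live in the same process purely to keep the example self-contained.

```python
import socket

def graceful_close_demo():
    """Send data, half-close the connection, and observe the end of stream."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    server, _addr = listener.accept()
    server.settimeout(2.0)
    try:
        client.sendall(b"bye")
        client.shutdown(socket.SHUT_WR)    # sends FIN: "I have nothing more to send"
        received = server.recv(1024)
        end_of_stream = server.recv(1024)  # b"" means the peer's FIN has arrived
        return received, end_of_stream
    finally:
        client.close()
        server.close()
        listener.close()

if __name__ == "__main__":
    print(graceful_close_demo())
```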

As the TCP layer is managed by the OS kernel, when a process exits (for whatever reason, even SIGKILL), the OS will automatically close the TCP connections it had open.

However, some connections are never closed this way.
One of the two hosts could suddenly become unavailable (abrupt computer shutdown, firewall rule added, …) without sending anything.
That’s why a TCP connection is called a “logical” connection ;)

“When TCP sends data, it expects an ACK in reply. If none comes within some amount of time, it re-transmits the data and waits again. The time it waits between transmissions generally increases exponentially.
After some number of retransmissions or some amount of total time with no ACK, TCP will consider the connection “broken”. How many times or how long depends on the OS and its configuration but it typically times-out on the order of many minutes.” (Brian White, on Stack Overflow)
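To get a feel for how exponentially growing waits add up to minutes, here is a small calculation. The initial timeout, retry count and cap are purely illustrative parameters chosen for this example; as the quote says, the real values depend on the OS and its configuration.

```python
def retransmission_schedule(initial_timeout, max_retries, max_timeout):
    """Return the wait (in seconds) before each successive retransmission."""
    waits = []
    timeout = initial_timeout
    for _ in range(max_retries):
        waits.append(timeout)
        timeout = min(timeout * 2, max_timeout)  # double each time, up to a cap
    return waits

if __name__ == "__main__":
    schedule = retransmission_schedule(initial_timeout=1.0, max_retries=8, max_timeout=60.0)
    print(schedule)       # waits grow 1, 2, 4, ... until they hit the cap
    print(sum(schedule))  # total time before giving up, in seconds
```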

TCP Keep-alive

Keep-alive is an optional TCP feature. Its purpose is to detect whether an idle established connection is broken by sending a special TCP segment (a probe) after every X seconds of inactivity. More details on keep-alive on Linux here.
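Keep-alive is switched on per socket. The sketch below shows one way to do it from Python; the helper name and the timing values are arbitrary examples. `SO_KEEPALIVE` is the portable switch, while `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are not available on every platform, which is why they are guarded with `hasattr`.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keep-alive and, where supported, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below exist on Linux; other systems may lack some of them.
    if hasattr(socket, "TCP_KEEPIDLE"):   # idle seconds before the first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between two probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):    # unanswered probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```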

Socket abstraction

The operating system implements and handles the layer 4 (TCP/UDP) stack in kernel space using multiple data structures, but the most important one for programmers is the “socket”.
The user makes system calls to create sockets with specific parameters, such as the address family and the type (UDP/TCP/…), which is chosen at creation time.
The destination IP address and port number are then supplied when connecting or sending.

Figure 6 — Socket creation

Then the user only has to write data to that socket (through the file descriptor returned by the socket creation call). The OS kernel handles everything else for the user :)
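Here is a small sketch of that idea on a Unix-like system: a stream socket is created, connected over loopback, and then written to through its raw file descriptor with `os.write`, exactly as one would write to a file. The function name is made up, and writing through the descriptor like this is an assumption that holds on Linux and macOS but not on Windows.

```python
import os
import socket

def write_through_fd(message: bytes):
    """Write to a connected TCP socket via its file descriptor (Unix-like only)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # Socket creation: only the address family and the type are given here.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The destination IP address and port come with the connect call.
    client.connect(listener.getsockname())
    server, _addr = listener.accept()
    server.settimeout(2.0)
    try:
        fd = client.fileno()   # the descriptor the kernel handed back
        os.write(fd, message)  # the kernel does the segmenting, ACKs, retransmits
        return fd, server.recv(1024)
    finally:
        client.close()
        server.close()
        listener.close()

if __name__ == "__main__":
    print(write_through_fd(b"hello"))
```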

Socket types

  • Datagram socket for UDP
  • Stream socket for TCP
  • Raw socket (without any protocol-specific transport layer formatting).
    One cool use case for this type of socket is implementing new transport-layer protocols in user space. Network equipment can also use it for protocols that sit directly on top of IP, such as OSPF or ICMP.
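The three types in the list above map directly to constants in Python's `socket` module. The sketch below creates one socket of each kind; the helper names are invented, and the raw socket is wrapped in a `try` because most systems only allow privileged processes to open one.

```python
import socket

SOCKET_TYPES = {"datagram": socket.SOCK_DGRAM, "stream": socket.SOCK_STREAM}

def open_socket(kind: str) -> socket.socket:
    """Create an IPv4 datagram (UDP) or stream (TCP) socket."""
    return socket.socket(socket.AF_INET, SOCKET_TYPES[kind])

def try_raw_icmp_socket():
    """Try to open a raw ICMP socket; return None if the OS refuses."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:  # includes PermissionError when not running as root
        return None

if __name__ == "__main__":
    for kind in ("datagram", "stream"):
        s = open_socket(kind)
        print(kind, s.type)
        s.close()
    raw = try_raw_icmp_socket()
    print("raw socket available:", raw is not None)
    if raw is not None:
        raw.close()
```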

For TCP, a stream socket can be in different states.

Figure 7 — Possible states of a stream socket
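One way to read such a state diagram is as a lookup table from (state, event) to the next state. The sketch below encodes a simplified subset of the standard TCP states, enough to follow a connection from opening to closing. The event names are made up for this example, and several real transitions (simultaneous open or close, resets, timeouts during the handshake) are left out.

```python
# (current state, event) -> next state; a simplified subset of the TCP states.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",         # client sends SYN
    ("CLOSED", "passive_open"): "LISTEN",          # server starts listening
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",        # server replies SYN/ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",   # client replies ACK
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",        # this side sends FIN first
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",     # the peer sent FIN first
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
    ("CLOSE_WAIT", "close"): "LAST_ACK",           # this side sends its own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}

def run(events, state="CLOSED"):
    """Apply events in order and return every state visited.

    Raises KeyError if an event is not valid in the current state.
    """
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path

if __name__ == "__main__":
    # A client that opens the connection and is also the first to close it.
    print(run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout"]))
```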

The end!

I hope you enjoyed this blog post. If you see any errors, don’t hesitate to let me know.

You can also follow me on Twitter and check out the references below if you want to dive deeper into the subject.

References

Books
VPNs Illustrated: Tunnels, VPNs, and IPsec (English Edition) — Jon C. Snader
TCP
http://www-sop.inria.fr/members/Vincenzo.Mancuso/ReteInternet/06_tcp_part1.pdf
https://jameshfisher.com/2018/02/24/what-are-tcp-sequence-numbers/
https://en.wikipedia.org/wiki/Transmission_Control_Protocol
UDP
https://en.wikipedia.org/wiki/User_Datagram_Protocol
Socket
https://en.wikipedia.org/wiki/Network_socket
http://www.qnx.com/developers/docs/qnx_4.25_docs/tcpip50/prog_guide/sock_advanced_tut.html

Images

Figure 1
https://www.imperva.com/learn/wp-content/uploads/sites/13/2019/01/UDP-packet-1024x375.jpg
Figure 2
https://www.researchgate.net/figure/Packet-encapsulation-TCP-IP-architecture-encapsulates-the-data-from-the-upper-layer-by_fig4_49288737
Figure 3
https://cdn.networklessons.com/wp-content/uploads/2015/07/xtcp-header.png.pagespeed.ic.dqm79W8xl8.png
Figure 4
https://afteracademy.com/blog/what-is-a-tcp-3-way-handshake-process
Figure 5
VPNs Illustrated: Tunnels, VPNs, and IPsec (English Edition) — Jon C. Snader
Figure 6
https://en.wikipedia.org/wiki/Network_socket
Figure 7
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.htm
