- Stream of bytes that will arrive in the right order
A large block of application data is split into TCP segments so that it can be transmitted over the wire.
The largest payload an Ethernet frame can carry is 1500 bytes; this is known as the MTU.
IP layer requires at least 20 bytes for its header.
TCP layer requires at least 20 bytes for its header.
Because each layer encapsulates the one above it, the maximum payload of a TCP segment is 1500 - 20 - 20 = 1460 bytes.
- Maximum Transmission Unit (MTU) - Up to 1500 bytes with Ethernet
- Maximum Segment Size (MSS) - Up to 1460 bytes (MTU - 20 - 20)
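The arithmetic above can be checked with a short sketch. The constant and function names are illustrative, and the header sizes assume no IP or TCP options are present.

```python
# Illustrative constants; header sizes assume no IP or TCP options.
ETHERNET_MTU = 1500      # bytes of payload an Ethernet frame can carry
IP_HEADER_MIN = 20       # minimum IP header size in bytes
TCP_HEADER_MIN = 20      # minimum TCP header size in bytes


def max_segment_size(mtu: int, ip_header: int = IP_HEADER_MIN,
                     tcp_header: int = TCP_HEADER_MIN) -> int:
    """Return the largest TCP payload that fits in one MTU-sized packet."""
    return mtu - ip_header - tcp_header


def segments_needed(payload_bytes: int, mss: int) -> int:
    """Return how many segments are needed to carry payload_bytes."""
    return -(-payload_bytes // mss)  # ceiling division


mss = max_segment_size(ETHERNET_MTU)
print(mss)                          # 1460
print(segments_needed(4000, mss))   # 3
```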
- 2 bytes - Source port
- 2 bytes - Destination port
- 4 bytes - Sequence number
- 4 bytes - Acknowledgement number
- 2 bytes - Data offset, reserved bits and flags
- 2 bytes - Receive Window
- 2 bytes - Checksum
- 2 bytes - Urgent Pointer
- Optional n bytes - Options
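As a rough illustration of the layout listed above, the fixed 20-byte part of the header can be packed and unpacked with Python's `struct` module. The function name and the example values are hypothetical; the sketch ignores options and does not compute a real checksum.

```python
import struct

# Fixed 20-byte TCP header layout in network byte order:
# source port, destination port, sequence number, acknowledgement number,
# offset/flags, receive window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed part of a TCP header into named fields."""
    (src, dst, seq, ack, off_flags, window,
     checksum, urgent) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset is in 32-bit words
        "flags": off_flags & 0x00FF,          # low 8 bits hold the flag bits
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# Hypothetical example values: a SYN from port 50000 to port 80.
SYN = 0x002
raw = TCP_HEADER.pack(50000, 80, 1000, 0, (5 << 12) | SYN, 4096, 0, 0)
fields = parse_tcp_header(raw)
print(TCP_HEADER.size, fields["header_len"], fields["flags"] == SYN)
```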
The sequence number of a TCP segment is the position of the first byte in the segment.
SeqNo = ISN + K, where ISN is the initial sequence number and K is the offset of that byte within the stream
ACK Sequence Number
The acknowledgement number is the sequence number of the next byte the host expects to receive from the other host
ACKNo = SeqNo + Length
If a packet is lost and later packets arrive, the receiver will keep sending the ACK for the byte number of the lost packet. Refer to TCP Packet Buffer
Seq=42, ACK=79, data='C' -> This packet starts at byte #42, and I am next expecting a reply starting at byte #79
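The relationship between sequence numbers, lengths and repeated ACKs can be sketched as follows. The function names are illustrative, and the sketch assumes the receiver buffers out-of-order segments and always acknowledges the next in-order byte it is waiting for.

```python
def ack_for(seq_no: int, length: int) -> int:
    """ACKNo = SeqNo + Length: the next byte the receiver expects."""
    return seq_no + length


def cumulative_acks(segments, expected):
    """Return the ACK number sent in response to each arriving segment.

    segments is a list of (seq_no, length) pairs in arrival order, and
    expected is the next in-order byte the receiver is waiting for.
    Out-of-order segments are buffered; the ACK repeats until the gap fills.
    """
    buffered = {}
    acks = []
    for seq_no, length in segments:
        buffered[seq_no] = length
        # Advance past every contiguous segment that has now arrived.
        while expected in buffered:
            expected = ack_for(expected, buffered.pop(expected))
        acks.append(expected)
    return acks


# The segment starting at byte 100 is lost at first and arrives last.
arrivals = [(0, 100), (200, 100), (300, 100), (100, 100)]
print(cumulative_acks(arrivals, expected=0))  # [100, 100, 100, 400]
```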
Multiple payloads can be carried within the same packet,
e.g. Response + ACK
TCP Packet Buffer
RTT Time and Timeout
- If timeout too short - premature timeout, unnecessary retransmission
- If timeout too long - slow reaction to segment loss and lower throughput
SampleRTT - Time from segment transmission until last ACK receipt (ignore retransmission).
- EstimatedRTT = (1-α)*EstimatedRTT_prev + α*SampleRTT
- DevRTT = (1-β)*DevRTT_prev + β*|SampleRTT-EstimatedRTT|
- Timeout Interval = EstimatedRTT + 4*DevRTT
The 4*DevRTT term is a safety margin to allow for deviations from the EstimatedRTT.
Typical values are:
α = 0.125
β = 0.25
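A minimal sketch of these update rules is shown below. The function name and the starting values are hypothetical, and the sketch assumes DevRTT is computed against the freshly updated EstimatedRTT, following the order in which the formulas are listed.

```python
ALPHA = 0.125  # weight of the newest sample in EstimatedRTT
BETA = 0.25    # weight of the newest deviation in DevRTT


def update_rtt(estimated_rtt, dev_rtt, sample_rtt):
    """Apply one SampleRTT and return (EstimatedRTT, DevRTT, TimeoutInterval)."""
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    timeout = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout


# Hypothetical starting point: EstimatedRTT = 100 ms, DevRTT = 10 ms,
# followed by a new sample of 120 ms.
print(update_rtt(100.0, 10.0, 120.0))  # (102.5, 11.875, 150.0)
```

A single slow sample moves the estimate only slightly, while the deviation term widens the timeout so that one outlier does not cause a premature retransmission.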
TCP Fast Retransmit
Duplicate ACKs are used to trigger early retransmission.
Trigger: Triple Duplicate ACK - When the same ACK is received 4 times (3 times extra).
This mechanism bypasses the timeout (which is often relatively long).
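The trigger condition can be sketched as a simple counter on the sender side. The function name is illustrative; the sketch only detects the trigger and does not model the retransmission itself.

```python
DUP_ACK_THRESHOLD = 3  # duplicates beyond the first ACK


def fast_retransmit_points(acks):
    """Return the ACK numbers that trigger a fast retransmit.

    acks is the sequence of ACK numbers seen by the sender, in order.
    """
    triggers = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == DUP_ACK_THRESHOLD:
                triggers.append(ack)  # resend the segment starting at ack
        else:
            last_ack = ack
            duplicates = 0
    return triggers


print(fast_retransmit_points([100, 200, 200, 200, 200, 500]))  # [200]
```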
Sockets usually have a RcvBuffer value of 4096 bytes.
This controls the size of the buffer.
The Receive Window (rwnd) value in the TCP header field of receiver-to-sender segments is the free space of the RcvBuffer.
By sending the size of the free space in the buffer, the receiver assures the sender that the next data it transmits will not overflow the receiver’s buffer.
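A toy model of this bookkeeping is sketched below. The class and method names are hypothetical; the only behaviour taken from the notes is that rwnd equals the free space left in the RcvBuffer.

```python
class ReceiveBuffer:
    """Toy model of a receiver's buffer; sizes are in bytes."""

    def __init__(self, rcv_buffer: int = 4096):
        self.rcv_buffer = rcv_buffer  # total buffer size (RcvBuffer)
        self.buffered = 0             # bytes received but not yet read

    @property
    def rwnd(self) -> int:
        """Free space advertised to the sender in the Receive Window field."""
        return self.rcv_buffer - self.buffered

    def receive(self, nbytes: int) -> int:
        """Accept up to rwnd bytes and return how many were accepted."""
        accepted = min(nbytes, self.rwnd)
        self.buffered += accepted
        return accepted

    def read(self, nbytes: int) -> int:
        """Application reads data, freeing buffer space."""
        consumed = min(nbytes, self.buffered)
        self.buffered -= consumed
        return consumed


buf = ReceiveBuffer()
buf.receive(3000)
print(buf.rwnd)   # 1096
buf.read(1000)
print(buf.rwnd)   # 2096
```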
- Handshake - Agreement on connection and connection parameters.
Three Way Handshake
- Client sends TCP SYN (contains initial sequence number)
- Server sends TCP SYNACK (contains their own initial sequence number)
- Client sends TCP ACK
There is a timer (usually 3 seconds) that waits for a SYNACK.
It will retransmit if needed.
A SYN could also be transmitted quickly by creating a new connection
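The sequence and acknowledgement numbers exchanged during the handshake can be sketched as below. The function name and dictionary layout are illustrative; the sketch relies on the standard behaviour that a SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as dictionaries."""
    if client_isn is None:
        client_isn = random.randrange(SEQ_SPACE)
    if server_isn is None:
        server_isn = random.randrange(SEQ_SPACE)

    syn = {"flags": "SYN", "seq": client_isn}
    # The SYN consumes one sequence number, so the server acknowledges ISN + 1.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"flags": "ACK", "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```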
When the FIN bit in a packet is sent, the host can no longer transmit data (*).
It can, however, still receive data.
*: ACK packets are not counted as data, and can still be sent
- A sends TCP segment with FIN
- B sends ACK and possibly its own FIN
- A sends ACK
- A sends a reset packet RST to B to tell B to stop communicating
- B does not acknowledge the RST
- A will keep transmitting RST packets for each reply it keeps receiving from B
TCP SYN Attack (SYN Flooding)
The SYN Flooding attack causes a server to allocate state for spoofed, half-open connections, wasting server resources.
Whilst the server will eventually close and garbage collect the socket, large numbers of spoofed SYN packets will overwhelm the server, increasing server load.
- Increase connection queue size
- Decrease timeout time for the 3 way handshake
- TCP SYN Cookies
TCP SYN Cookies
The server's initial sequence number is computed as a hash of the source and destination IPs and ports, keyed with a secret known only to the server.
When receiving a SYN, the server will reply with a SYNACK containing this initial sequence number.
The client will then have to reply with an ACK, and the server checks that the acknowledgement number is equal to the initial sequence number + 1.
Only if this check is true will the server create the connection.
This stops attackers from tying up resources with many spoofed SYN packets - instead they must wait for the server’s SYNACK, and send a second packet (ACK) with the right details.
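The check described above can be sketched with a keyed hash. This is a simplified illustration, not how any particular operating system implements it: the function names, the secret and the addresses are hypothetical, and real implementations also fold values such as a time counter and an encoded MSS into the cookie.

```python
import hashlib
import hmac

SEQ_SPACE = 2 ** 32


def syn_cookie(secret: bytes, src_ip: str, src_port: int,
               dst_ip: str, dst_port: int) -> int:
    """Derive a 32-bit initial sequence number from the connection 4-tuple."""
    message = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(secret: bytes, src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int, ack_no: int) -> bool:
    """Recompute the cookie and check the client's ACK equals cookie + 1."""
    cookie = syn_cookie(secret, src_ip, src_port, dst_ip, dst_port)
    return ack_no == (cookie + 1) % SEQ_SPACE


secret = b"hypothetical-server-secret"
cookie = syn_cookie(secret, "203.0.113.7", 40000, "198.51.100.1", 443)
print(ack_is_valid(secret, "203.0.113.7", 40000, "198.51.100.1", 443,
                   (cookie + 1) % SEQ_SPACE))   # True
```

Because the server can recompute the cookie from the ACK's addresses and ports, it does not need to store anything between sending the SYNACK and receiving the ACK.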
Congestion increases delivery latency, loss rate, and leads to retransmissions.
- Knee Point - Point where throughput increases slowly, delay increases fast
- Cliff Point - Point where throughput drops to zero (congestion collapse), delay approaches infinity
End to End Congestion Control
- Congestion inferred from observed loss and delay
Network-assisted Congestion Control
- Feedback from networking infrastructure
- SNA, DECbit, TCP/IP ECN, ATM
- Authoritative transmission rate
TCP Send Rate is roughly (cwnd / RTT) (bytes/sec)
Congestion Window (cwnd) - How many bytes can be sent without overflowing routers
- Computed by the sender using algorithms
Receive Window (rwnd) - How many bytes can be sent without overflowing the receiver’s buffers
- Determined by the receiver (notified to the sender)
- When ACK (of new data) is received, increase rate
- When loss is detected, decrease rate
TCP incorporates two algorithms, TCP Slow Start then AIMD.
TCP Slow Start (Bandwidth Discovery)
Increase transmission rate exponentially (doubled) until the first loss event.
- Initial rate slow, but ramps up exponentially
Additive Increase Multiplicative Decrease (AIMD)
- Additive Increase - Increases cwnd every RTT (ACK received) until loss is detected
- Multiplicative Decrease - Halves the cwnd when loss is detected
Slow Start Threshold (ssthresh)
The slow start threshold is the cwnd value at which the protocol changes from TCP Slow Start to AIMD.
It is often initialised to some high value, and is decreased (to half of the current cwnd) on loss.
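The switch from exponential to additive growth at the threshold described above can be sketched as follows. The function name is illustrative, and the sketch assumes cwnd is counted in MSS units, grows by 1 MSS per RTT during additive increase, is capped at the threshold when doubling, and that no loss occurs.

```python
def cwnd_growth(rounds: int, ssthresh: int, initial_cwnd: int = 1):
    """Return cwnd (in MSS units) at the start of each RTT, assuming no loss."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double every RTT
        else:
            cwnd += 1                       # additive increase: +1 MSS per RTT
    return history


print(cwnd_growth(8, ssthresh=8))  # [1, 2, 4, 8, 9, 10, 11, 12]
```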
TCP Reno (New)
- cwnd = 1 on timeout
- cwnd = cwnd/2 on triple duplicate ACK (TCP fast retransmit)
TCP Tahoe (Old)
- cwnd = 1 on timeout and triple duplicate ACK
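The difference between the two variants can be sketched as a single loss handler. The function name and event labels are hypothetical, cwnd is counted in MSS units, and the floor of 1 MSS on the threshold is an assumption of this sketch. Details of fast recovery in Reno are left out.

```python
def on_loss(cwnd: float, event: str, variant: str):
    """Return (new_cwnd, new_ssthresh) after a loss event, in MSS units.

    event is "timeout" or "triple_dup_ack"; variant is "reno" or "tahoe".
    """
    ssthresh = max(cwnd / 2, 1)  # threshold is halved on any loss
    if variant == "reno" and event == "triple_dup_ack":
        return ssthresh, ssthresh    # halve cwnd and continue with AIMD
    return 1, ssthresh               # restart from slow start


print(on_loss(16, "triple_dup_ack", "reno"))   # (8.0, 8.0)
print(on_loss(16, "triple_dup_ack", "tahoe"))  # (1, 8.0)
print(on_loss(16, "timeout", "reno"))          # (1, 8.0)
```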