Surviving network errors with Selective Repeat ARQ

Joyatee Datta
Nov 1, 2021

How do our end devices or network devices control the data flow?

Suppose I’m sending files or messages to my friend. Ideally they should reach him without any delay or packet loss. But the world is not as ideal as we imagine it to be: delays and packet losses are frequent in our day-to-day communication over networks. Even when data is sent at high speed, the receiver still needs time to receive and process it. Receivers with high traffic loads or little processing power are more likely to run into this problem than receivers with lighter loads.

In the OSI model, both the Data Link layer and the Transport layer are responsible for controlling the data flow. The Data Link layer handles flow control and error control hop to hop, between two directly connected nodes. The Transport layer, on the other hand, handles flow control and error control end to end, between the source and destination hosts across networks. Here we’ll discuss flow control in the Data Link layer.

When I’m sending a file to my friend, the Data Link layer looks after flow control between my device and my home router. The router then forwards the data through other routers toward its destination. All along the way, the Data Link layer controls the flow of data hop by hop.

Node to Node Networking (fig. 1)

Flow control is a set of procedures that tells the sender how much data it can transmit before it must wait for an acknowledgment from the receiver. In networking we consider both sender and receiver, but in flow control we mainly focus on the receiver. Why? Because, as I said earlier, a fast sender paired with a slow receiver always causes a problem: the receiver cannot keep up with the rate at which the sender delivers data. A slow sender with a fast receiver is not an issue at all. If the receiver’s memory fills up faster than it can process the data, then once the memory is full, newly arriving data will be lost. So the receiver must inform the sender before the limit is reached and ask it to send fewer frames or to stop temporarily.
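To make the fast-sender, slow-receiver problem concrete, here is a minimal sketch in Python. The function name `simulate_receiver`, the idea of a "tick", and all the numbers are my own assumptions for illustration; no real protocol works in such neat steps.

```python
def simulate_receiver(arrivals_per_tick, processed_per_tick, buffer_size, ticks):
    """Return (frames_processed, frames_dropped) when the sender is never throttled.

    Hypothetical model: in every tick a fixed number of frames arrive,
    then the receiver processes a fixed number of queued frames.
    """
    queued = processed = dropped = 0
    for _ in range(ticks):
        # Frames arrive; anything beyond the free buffer space is lost.
        free = buffer_size - queued
        accepted = min(arrivals_per_tick, free)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # The receiver then processes as many queued frames as it can.
        done = min(processed_per_tick, queued)
        queued -= done
        processed += done
    return processed, dropped


# Fast sender, slow receiver: the buffer fills and frames are dropped.
print(simulate_receiver(3, 1, 4, 5))
# Slow sender, fast receiver: nothing is lost.
print(simulate_receiver(1, 3, 4, 5))
```

In the first call the receiver loses frames as soon as its buffer is full, which is exactly the situation flow control is meant to prevent.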

For noisy channels, that is, for real-world network communication, the protocols we follow for flow control are:

i. Stop-and-Wait ARQ

ii. Go-Back-N ARQ

iii. Selective Repeat ARQ

Stop-and-Wait ARQ is the least popular of the three because it is the most inefficient: the sender transmits one frame and then waits for its acknowledgment before sending the next. It has other issues too. If a data frame or its acknowledgment is lost, the sender is left waiting, and without a retransmission timer it would wait indefinitely. When only the acknowledgment is lost, the sender ends up resending a frame the receiver already has.

The Go-Back-N ARQ protocol works better than Stop-and-Wait ARQ, but it runs smoothly only when errors are few. If many frames are damaged, a lot of bandwidth is wasted sending frames again. Go-Back-N ARQ simplifies the process at the receiver site: the receiver keeps track of only one variable, and there is no need to buffer out-of-order frames; they are simply discarded. However, this makes the protocol very inefficient on a noisy link.
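The "only one variable" idea at the Go-Back-N receiver can be sketched roughly like this. The function `go_back_n_receive` and its arguments are hypothetical names of mine, and the sketch ignores acknowledgments and timers entirely; it only shows which arriving frames get delivered.

```python
def go_back_n_receive(arrivals, m):
    """Return the frames a Go-Back-N receiver delivers, in order.

    arrivals: sequence numbers in the order they reach the receiver.
    m: number of bits in the sequence number (an assumption of this sketch).
    """
    modulus = 2 ** m
    expected = 0          # the single variable the receiver tracks
    delivered = []
    for seq in arrivals:
        if seq == expected:
            delivered.append(seq)
            expected = (expected + 1) % modulus
        # Any other frame is out of order and is simply discarded.
    return delivered


# Frame 1 is lost the first time, so 2 and 3 are thrown away
# and have to be sent again along with frame 1.
print(go_back_n_receive([0, 2, 3, 1, 2, 3], m=3))
```

Notice that frames 2 and 3 had already arrived intact once, yet they cross the link a second time. That wasted bandwidth is the cost of the simple receiver.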

Selective Repeat ARQ

On a noisy link, a frame has a higher chance of being damaged, which in Go-Back-N means many frames must be resent. This resending consumes bandwidth and slows the transmission down. For noisy links there is another mechanism that does not resend N frames when only one frame is damaged: only the damaged frame is sent again. This mechanism is called Selective Repeat ARQ. It is more efficient than the other two ARQs on noisy links. The Selective Repeat protocol uses two windows: a send window and a receive window.

The Selective Repeat protocol allows as many out-of-order frames as the receive window can hold to be stored until a run of in-order frames is ready to be passed to the network layer. Because the send and receive windows are the same size, all of the frames in the send window can arrive out of order and be kept until they can be delivered. However, it is important to note that the receiver never delivers packets to the network layer out of order.
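Here is a rough sketch of how a Selective Repeat receiver might buffer out-of-order frames and still deliver strictly in order. The class `SelectiveRepeatReceiver` and its fields are hypothetical names, and I am assuming m-bit sequence numbers with a receive window of 2^(m-1) slots. Acknowledgments are left out to keep it short.

```python
class SelectiveRepeatReceiver:
    """Toy receiver: stores out-of-order frames, delivers only in order."""

    def __init__(self, m):
        self.modulus = 2 ** m             # number of distinct sequence numbers
        self.window_size = 2 ** (m - 1)   # assumed receive window size
        self.base = 0                     # first slot of the receive window
        self.buffer = {}                  # seq -> payload, frames waiting for delivery
        self.delivered = []               # what the network layer has received

    def in_window(self, seq):
        # Distance from the window start, wrapping around the sequence space.
        return (seq - self.base) % self.modulus < self.window_size

    def receive(self, seq, payload):
        if not self.in_window(seq):
            return                        # outside the window: discard
        self.buffer[seq] = payload
        # Deliver the longest run of consecutive frames starting at the base.
        while self.base in self.buffer:
            self.delivered.append(self.buffer.pop(self.base))
            self.base = (self.base + 1) % self.modulus


rx = SelectiveRepeatReceiver(m=3)
for seq, data in [(0, "a"), (2, "c"), (3, "d")]:
    rx.receive(seq, data)
print(rx.delivered)    # frames 2 and 3 are held back while frame 1 is missing
rx.receive(1, "b")
print(rx.delivered)    # now everything is released in order
```

A dictionary is used for the buffer only because it is convenient in Python; a real implementation would more likely use a fixed array of window slots.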

If m is the number of bits in the sequence number, the send window can hold at most 2^(m-1) frames, which is much smaller than the 2^m available sequence numbers, and the receive window is the same size as the send window. For example,

if m = 4, the sequence numbers go from 0 to 15,

but the size of the window = 2^(m-1) = 2^(4-1) = 8.

So the size of the window is just 8, and it’s the same for both windows.
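The arithmetic can be checked with a couple of lines of Python. I am assuming here that sequence numbers use m bits, so there are 2^m of them (2 to the power m), and that the window is half of that, 2^(m-1). The function name `selective_repeat_window` is just a hypothetical helper.

```python
def selective_repeat_window(m):
    """Return (number of sequence numbers, window size) for m-bit sequence numbers."""
    seq_count = 2 ** m          # sequence numbers run from 0 to 2**m - 1
    window = 2 ** (m - 1)       # half of the sequence space
    return seq_count, window


for m in (2, 3, 4):
    seq_count, window = selective_repeat_window(m)
    print(f"m={m}: sequence numbers 0..{seq_count - 1}, window size {window}")
```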

This mechanism follows some steps to save bandwidth and make the transmission smooth.

1. Identifying Duplicates

The window size should be at most 2^(m-1). If it is greater than 2^(m-1), the receiver can no longer tell a resent frame from a new one, and Selective Repeat runs into trouble of its own. Compare this with Go-Back-N, where if a frame is lost or damaged, that frame and every frame sent after it are retransmitted.

Difference between Selective Repeat ARQ(left) and Go-Back-N ARQ(right) (fig. 2)

We can now demonstrate why the send window must be no larger than 2^(m-1). We’ll use m = 2 as an example. The sequence numbers will be 0, 1, 2, 3 (as 2^m = 2^2 = 4) and the window size is 2^(2-1) = 2. In the diagram above, a window size of 2 is compared to a window size of 3. If the window size is 2 and both acknowledgments are lost, the timer for frame 0 expires and the sender resends frame 0. Since the receiver now expects frame 2 rather than frame 0, the duplicate frame is correctly discarded.

On the other hand, if the size of the window is 3 and all three acknowledgments are lost, the sender will again send a duplicate of frame 0. This time, however, the receiver’s window has moved on to 3, 0, 1, so frame 0 falls inside it and is accepted as the first frame of the next cycle rather than as a duplicate. This is an error. Keeping the window at 2^(m-1) is what lets Selective Repeat discard the duplicate frame correctly.
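The lost-acknowledgment scenario can be replayed in a few lines. This is only a sketch under my own assumptions: the receiver has accepted a full window starting at frame 0, every acknowledgment was lost, and the sender resends frame 0. The helper name `duplicate_is_accepted` is hypothetical.

```python
def duplicate_is_accepted(m, window_size):
    """True if a resent frame 0 lands inside the receiver's advanced window."""
    modulus = 2 ** m
    # The receiver got frames 0..window_size-1 and slid its window forward,
    # but every acknowledgment was lost on the way back.
    receiver_base = window_size % modulus
    # The sender times out and resends frame 0.
    resent = 0
    # Same wrap-around window test the receiver would apply to any frame.
    return (resent - receiver_base) % modulus < window_size


print(duplicate_is_accepted(m=2, window_size=2))  # window 2: duplicate rejected
print(duplicate_is_accepted(m=2, window_size=3))  # window 3: duplicate wrongly accepted
```

With m = 2, a window of 2 keeps the old and new windows from overlapping, while a window of 3 makes the new window wrap around onto sequence number 0 again.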

2. Negative Acknowledgment (NAK)

Negative Acknowledgment in Action(fig. 3)

In this diagram, the window size is 4. Each frame sent or resent needs its own timer, so the timers are numbered 0, 1, 2, 3 as shown in the figure. The timer for frame 0 starts at the first request and stops when its ACK arrives. Next, frame 1 is lost, so the receiver doesn’t send any ACK for it.

Then frame 2 arrives and is stored and marked, but it cannot be delivered because frame 1 is missing. Frame 3 is handled the same way: it arrives, is stored, and is marked, but none of the frames can be delivered. Frame 1 is still expected by the receiver. The timer for frame 1 starts at the second request, restarts when a NAK (negative acknowledgment) arrives, and finally stops when the last ACK arrives.

What does NAK mean? The receiver got frames 2 and 3 but not frame 1, which it is still waiting for. So the receiver sends “NAK 1”, which means it is expecting frame 1, and the sender resends frame 1. If the receiver had sent “ACK 4” without sending “NAK 1”, the sender would have assumed that frames 0, 1, 2, and 3 had all been accepted and that the receiver was waiting for frame 4. The sender would then send frame 4 and skip frame 1.

This protocol follows two conditions for delivering frames to the network layer:

  • A set of consecutive frames must have arrived.
  • The set starts from the beginning of the window.

Only at the last arrival, when a copy of frame 1 finally arrives, can frames 1, 2, and 3 be delivered to the network layer. A NAK is sent after the second arrival but not after the third, although both situations look the same. The reason is that the protocol doesn’t want to crowd the network with unnecessary NAKs and unnecessary resent frames. A NAK is sent only once for each window position, and it refers to the first slot in the window.
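The receiver side of this exchange can be traced with a small sketch. Everything here is an assumption for illustration: the function `receiver_trace`, the event strings, and the choice to send one cumulative ACK whenever frames are delivered. It follows the scenario described above, where frames arrive in the order 0, 2, 3 and then the resent frame 1.

```python
def receiver_trace(arrivals, m=3):
    """Return the receiver's actions for frames arriving in the given order."""
    modulus, window = 2 ** m, 2 ** (m - 1)
    base = 0            # first slot of the receive window
    stored = set()      # frames stored and marked, waiting for delivery
    nak_sent = False    # at most one NAK per window position
    events = []
    for seq in arrivals:
        if (seq - base) % modulus >= window:
            events.append(f"discard {seq}")
            continue
        stored.add(seq)
        if seq != base:
            # Out of order: ask for the first slot, but only once.
            if not nak_sent:
                events.append(f"NAK {base}")
                nak_sent = True
            continue
        # Condition 1 and 2: consecutive frames starting at the window start.
        delivered = []
        while base in stored:
            stored.remove(base)
            delivered.append(base)
            base = (base + 1) % modulus
        nak_sent = False  # the window moved, so a new NAK is allowed again
        events.append(f"deliver {delivered}, ACK {base}")
    return events


for event in receiver_trace([0, 2, 3, 1]):
    print(event)
```

The trace shows a NAK after frame 2 arrives, silence after frame 3, and a single delivery of frames 1, 2, and 3 once the missing frame shows up.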

The efficiency of the Selective Repeat protocol = N/(1 + 2a)

Where,

a = propagation delay / transmission delay

N = window size = 2^(m-1)

m = number of bits in the sequence number

For example, if m = 3 then 2^m = 2^3 = 8. So the sequence numbers will be 0, 1, 2, 3, 4, 5, 6, 7.

In general, if the number of bits is m, the sequence numbers run from 0 to 2^m - 1.
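To get a feel for the efficiency formula N/(1 + 2a), here is a small calculation, taking N as 2^(m-1) and a as propagation delay divided by transmission delay. The function name and the sample delays are made up for illustration. Capping the result at 1 is my own addition, on the reasoning that a link cannot be more than fully utilized.

```python
def selective_repeat_efficiency(m, propagation_delay, transmission_delay):
    """Efficiency N / (1 + 2a), with N = 2**(m-1) and a = propagation / transmission."""
    a = propagation_delay / transmission_delay
    n = 2 ** (m - 1)
    # Assumption of this sketch: efficiency is capped at 1 (a fully busy link).
    return min(1.0, n / (1 + 2 * a))


# m = 3 gives a window of 4; with a = 2 the link is busy 4/5 of the time.
print(selective_repeat_efficiency(m=3, propagation_delay=2.0, transmission_delay=1.0))
# A larger window (m = 4, window 8) on a shorter link keeps it fully busy.
print(selective_repeat_efficiency(m=4, propagation_delay=1.0, transmission_delay=1.0))
```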

It is unrealistic to assume that all messages are sent without error. In the actual world, transmission errors are inescapable, especially given that any communication channel carries noise, the sum total of random signals that interfere with the transmitted signal. We need a comprehensive process to account for these inevitable transmission faults so that we can recover from them as far as feasible. Selective Repeat ARQ handles packet loss efficiently by choosing a suitable window size and resending only the frames that were lost, which keeps network lag as low as possible.

Joyatee Datta

Computer Science Engineering | IEM Kolkata | Networking Researcher & Security Nerd