Internet Packet Loss and Delay Modeling

We consider the scenario of IP communication between S and R over the Internet (see Figure 1 below). We assume that there is no exchange of information between S and the nearby router(s) on the current status of the network. That is S can make inference about the communication channel with R only through the round-trip-time (RTT) of acknowledged packets from R. The question that we raise is the following. Given a pool of IP packets already ACK-ed, how can the information they provide be exploited to make the following predictions about the current transmission: (i) Will the packet or its acknowledgement be lost? (ii) If not, how much will the response from R be delayed? Previous work on Internet channel estimation and prediction can be classified into two categories: (i) In TCP we have an exponentially weighted moving average (EWMA) filter for RTT prediction of IP packets. This model is causal as it uses RTT values of previously ACK-ed packets. It can be supplemented with another EWMA filter to estimate the variance of the RTT prediction. However, the model does not include loss prediction, i.e., it exclusively focuses on packet delay. (ii) There have been Markov chain models proposed for describing the loss process in the Internet. However, their main drawback is that they operate independently of the associated RTT process, i.e., RTT and packet loss prediction can only be considered separately.



Figure 1. Internet IP communication between S and R.

In our approach, we treat the network as a finite state machine. Each hidden state is characterized with a certain distribution of the packet loss and the RTT associated with a transmission. We train the model using the forward-backward (Baum-Welch) algorithm on real measurement data. Such a trained model can be used afterward for loss and RTT prediction in actual systems sending packets over the Internet. It could also be used as a channel simulator for testing the performance of different transmission schemes via simulations. Compared to the TCP algorithm we observe a 15% reduction of the mean-squared-error (MSE) of the RTT prediction. Furthermore, our model provides a few orders of magnitude larger likelihood for the packet loss sequence observed from the measurement data. When employed as a channel simulator the proposed framework generates a packet loss and RTT sequence which statistics matches that of the real data. The drawback of the model is that no long range dependence prospectivelly existing between the data is captured by it. This is due to the fact that HMMs are statistically stationary. In order to provide improved performance, we subsequently designed an HMM model that comprised more than one layer of hidden states. The multi-layer HMM exhibited a higher prediction accuracy than the original model thereby capturing better the characteristics of actual packet loss and delay data observed in the Internet. Finally, as another extension of this work we studied employing context-free grammars (CFGs) as another tool for Internet packet loss and delay modeling. To this end, we designed appropriate probabilistic CFGs that described the bursty nature of Internet packet loss with a high degree of accuracy.