Mongolia: An Auditory Testing Environment to Study the Importance of a VoIP Packet

Christian Hoene, Enhtuya Dulamsuren-Lalla
TU-Berlin, TKN

If highly compressed multimedia streams are transported over packet networks, the loss of individual packets can impair the perceptual quality of the received stream to different degrees, depending on the content and context of the lost packet. This web page demonstrates how differently packet losses affect the perceptual speech quality in Voice-over-IP systems. Some packet losses are hardly audible; others are far more important.

Mongolia is a web-based application that lets you force the loss of packets that share the same speech properties or the same importance. Simply try it and listen!

Press here to start the program on the TKN-server or a virtual Lycos-Server (currently down).


Mongolia Screenshot
The tool called “Mongolia” works as follows:
First, a reference sample is selected from ITU's P-series Supplement 23 database. Each sample is 8 s long and contains no background noise. If requested, the samples and their degraded versions can be played back audibly.

Next, a coding algorithm compresses the reference sample, and PESQ (Perceptual Evaluation of Speech Quality, ITU-T P.862) calculates the MOS value of the degraded sample. The tool supports three coding modes: ITU G.711 µ-law including packet loss concealment, ITU G.729, and ETSI's Adaptive Multi-Rate (AMR) speech codec.

Next, the overall frame loss rate controls how many frames are dropped, and the packet length controls the burstiness of the frame losses. The latter effect reflects the packetised transmission of speech, in which a VoIP packet can contain multiple voice frames. A random seed value controls the positions of the losses.
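The interplay of loss rate, packet length, and seed can be illustrated with a minimal Python sketch. It is not the tool's actual code: the function name `draw_frame_losses` and its parameters are hypothetical, and it assumes that whole packets are dropped uniformly at random, so that every lost packet removes a run of consecutive frames.

```python
import random


def draw_frame_losses(num_frames, loss_rate, frames_per_packet, seed):
    """Return the sorted indices of lost frames (illustrative assumption).

    Frames are grouped into packets of `frames_per_packet` frames and whole
    packets are dropped, so longer packets produce burstier frame losses.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must lie in [0, 1]")
    if frames_per_packet < 1:
        raise ValueError("frames_per_packet must be at least 1")

    rng = random.Random(seed)  # the seed fixes the loss positions
    num_packets = -(-num_frames // frames_per_packet)  # ceiling division
    num_lost_packets = round(loss_rate * num_packets)
    lost_packets = rng.sample(range(num_packets), num_lost_packets)

    lost_frames = []
    for packet in sorted(lost_packets):
        start = packet * frames_per_packet
        end = min(start + frames_per_packet, num_frames)
        lost_frames.extend(range(start, end))
    return lost_frames
```

For example, `draw_frame_losses(400, 0.05, 4, seed=1)` drops 5 of 100 packets, that is 20 frames in bursts of four, and the same seed always reproduces the same loss positions.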

The user can select whether important or less important frames are dropped. The importance of a frame is the quality degradation that the loss of that frame would cause. High values indicate more important frames.
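The definition of importance can be sketched as follows, assuming that the quality degradation is measured as the drop in MOS when exactly one frame is lost. The callable `mos_with_losses` is a hypothetical stand-in for the real chain of encoding, dropping frames, decoding, and PESQ scoring; neither function below is part of the tool.

```python
def frame_importance(num_frames, mos_with_losses):
    """Importance of each frame: MOS without loss minus MOS with that frame lost.

    `mos_with_losses` is a hypothetical callable that maps a set of lost
    frame indices to a MOS value.
    """
    reference = mos_with_losses(frozenset())
    return [reference - mos_with_losses(frozenset({i})) for i in range(num_frames)]


def pick_frames(importance, count, most_important=True):
    """Return the indices of the `count` most (or least) important frames."""
    order = sorted(range(len(importance)),
                   key=lambda i: importance[i],
                   reverse=most_important)
    return sorted(order[:count])
```

With a toy quality model in which each lost frame simply subtracts a fixed penalty from the MOS, `pick_frames` returns the frames with the largest penalties when `most_important=True` and those with the smallest penalties otherwise.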

Then, frames are selected according to their speech property: a) silence frames or b) active-voice frames, where active frames contain either c) unvoiced or d) voiced sounds.
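This page does not say how Mongolia assigns a speech property to a frame. Purely as an illustration, the sketch below uses a common textbook heuristic: frames with low energy count as silence, and active frames are split by their zero-crossing rate, which tends to be low for voiced and high for unvoiced sounds. The function name and both thresholds are assumptions, not values taken from the tool.

```python
def classify_frame(samples, silence_energy=1e-4, voiced_zcr=0.25):
    """Classify one frame as 'silence', 'voiced' or 'unvoiced'.

    `samples` holds amplitudes in [-1, 1]. Both thresholds are illustrative
    assumptions and would need tuning for real speech material.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("frame must contain at least one sample")

    # Mean energy separates silence from active speech.
    energy = sum(s * s for s in samples) / n
    if energy < silence_energy:
        return "silence"

    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / max(n - 1, 1)
    return "voiced" if zcr < voiced_zcr else "unvoiced"
```

A frame of zeros is classified as silence, a low-frequency sine as voiced, and a rapidly alternating signal as unvoiced.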

Last, the packet loss statistics and the PESQ value are displayed.
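Which statistics the tool reports is not listed here. As a hedged sketch, typical loss statistics such as the loss rate and the burst lengths could be derived from the lost frame indices as shown below; the function name `loss_statistics` and the chosen fields are assumptions.

```python
def loss_statistics(lost_frames, num_frames):
    """Summarise a loss pattern given as lost frame indices (illustrative)."""
    lost = sorted(set(lost_frames))

    # Merge consecutive indices into [start, end] bursts.
    bursts = []
    for index in lost:
        if bursts and index == bursts[-1][1] + 1:
            bursts[-1][1] = index
        else:
            bursts.append([index, index])

    lengths = [end - start + 1 for start, end in bursts]
    return {
        "loss_rate": len(lost) / num_frames,
        "bursts": len(lengths),
        "mean_burst_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "max_burst_length": max(lengths, default=0),
    }
```

For the loss pattern `[3, 4, 5, 10, 20, 21]` in a sample of 100 frames, this yields three bursts with a mean length of two frames and a maximum length of three.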
  • C. Hoene and E. Dulamsuren-Lalla, "Predicting Performance of PESQ in Case of Single Frame Losses", in Proc. MESAQIN 2004 (Measurement of Speech and Audio Quality in Networks), Prague, CZ, June 2004. (Slides) (PDF, 446.0 KB)
  • C. Hoene, B. Rathke, and A. Wolisz, "On the Importance of a VoIP Packet", in Proc. of ISCA Tutorial and Research Workshop on the Auditory Quality of Systems, Herne, Germany, April 2003. (PDF, 323.5 KB)
Please send comments and bug reports to hoene@tkn.tu-berlin.de

