Human conversation is characterized by brief pauses and so-called turn-taking behavior between the speakers. In the context of VoIP, this means that there are frequent periods where the microphone captures only background noise – or even silence whenever the microphone is muted. The bits transmitted from such silence periods introduce overhead in terms of data usage, energy consumption, and network infrastructure costs. In this paper, we contribute by shedding light on these costs for VoIP applications. We systematically measure the performance of six popular mobile VoIP applications with controlled human conversation and acoustic setup. Our analysis demonstrates that significant savings can indeed be achievable - with the best performing silence suppression technique being effective on 75% of silent pauses in the conversation in a quiet place. This results in 2-5 times data savings, and 50-90% lower energy consumption compared to the next better alternative. Even then, the effectiveness of silence suppression can be sensitive to the amount of background noise, underlying speech codec, and the device being used. The codec characteristics and performance do not depend on the network type. However, silence suppression makes VoIP traffic network friendly as much as VoLTE traffic. Our results provide new insights into VoIP performance and offer a motivation for further enhancements, such as performance-aware codec selection, that can significantly benefit a wide variety of voice assisted applications, as such intelligent home assistants and other speech codec enabled IoT devices.
|Otsikko||MMSys '20: Proceedings of the 11th ACM Multimedia Systems Conference|
|DOI - pysyväislinkit|
|Tila||Julkaistu - 2020|
|OKM-julkaisutyyppi||A4 Artikkeli konferenssijulkaisuussa|
|Tapahtuma||MMSys '20: 11th ACM Multimedia Systems Conference - Istanbul, Turkki|
Kesto: 8 kesäkuuta 2020 → 11 kesäkuuta 2020
- 113 Tietojenkäsittely- ja informaatiotieteet