Voice activity detection (VAD)

27 Aug 2012 11:09

Voice activity detection (VAD), also known as speech activity detection or speech detection, is a technique used in speech processing in which the presence or absence of human speech is detected.

VAD can facilitate speech processing and can also be used to deactivate some processes during non-speech sections of an audio session. In VoIP applications, for example, it avoids unnecessary coding and transmission of silence packets, saving both computation and network bandwidth.
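As a rough illustration of how speech presence can be detected, the sketch below classifies fixed-size frames by their RMS energy. This is only one simple approach and not necessarily what any particular product uses. The function names, the 160-sample frame (20 ms at 8 kHz) and the threshold of 500 are assumptions chosen for the example.

```python
import math

def frame_is_speech(frame, threshold_rms=500.0):
    """Return True if the frame's RMS amplitude exceeds the threshold."""
    if not frame:
        return False
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms > threshold_rms

def detect_voice_activity(samples, frame_len=160, threshold_rms=500.0):
    """Split samples into fixed-size frames and classify each one."""
    return [
        frame_is_speech(samples[i:i + frame_len], threshold_rms)
        for i in range(0, len(samples), frame_len)
    ]

# One frame of digital silence followed by one frame of a 440 Hz tone
# (assumed 8 kHz sampling, 16-bit sample range).
silence = [0] * 160
tone = [int(8000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(160)]
flags = detect_voice_activity(silence + tone)
print(flags)
```

A fixed energy threshold tends to misfire in noisy rooms, so practical detectors usually adapt the threshold to the background level or use additional features.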

In Voice over IP (VoIP), voice activity detection (VAD), also called "silence suppression", is a software function that lets a data network carrying voice traffic detect the absence of audio and conserve bandwidth by not transmitting "silent packets". Most conversations include about 50% silence. When VAD is enabled, it monitors the signal for voice activity; once silence has been detected for a specified amount of time, the application informs the Packet Voice Protocol and prevents the encoder output from being transported across the network.
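The "specified amount of time" is often implemented as a hangover counter: transmission continues for a few frames after speech stops, so that short pauses and trailing syllables are not clipped. The sketch below assumes per-frame speech flags are already available. The function name and the hangover length are hypothetical and are not taken from any specific device.

```python
def frames_to_transmit(speech_flags, hangover_frames=3):
    """Decide, frame by frame, whether the encoder output is sent.

    A frame is sent if it contains speech, or if no more than
    `hangover_frames` silent frames have passed since the last speech frame.
    """
    decisions = []
    silent_run = hangover_frames  # start in the suppressed state
    for is_speech in speech_flags:
        if is_speech:
            silent_run = 0
            decisions.append(True)
        else:
            silent_run += 1
            decisions.append(silent_run <= hangover_frames)
    return decisions

flags = [True, False, False, False, False, False, True, False]
sent = frames_to_transmit(flags, hangover_frames=2)
saved_fraction = sent.count(False) / len(sent)
print(sent, saved_fraction)
```

Here three of the eight frames are suppressed. A longer hangover reduces clipping at the cost of a smaller bandwidth saving.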

Voice activity detection can also be used to forward idle noise characteristics (sometimes called ambient or comfort noise) to a remote IP telephone or gateway. Standard digitized telephone voice runs at 64 kbps (8,000 samples per second at 8 bits per sample), a constant bit rate whether the speaker is actively speaking, pausing between thoughts, or totally silent. With silence suppression, nothing is sent during pauses, so without idle noise to give the illusion of a constant transmission stream, the listener would be likely to think the line had gone dead.
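To make the comfort noise idea concrete, the sketch below generates a frame of white noise at a given RMS level, which is roughly what a receiver could play out while no packets arrive. It assumes the sender has conveyed only a single noise level. Real systems may also convey spectral shape. All names and values here are illustrative.

```python
import math
import random

def rms(frame):
    """Root-mean-square amplitude of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def comfort_noise(level_rms, frame_len=160, rng=None):
    """Generate one frame of Gaussian white noise at roughly level_rms."""
    rng = rng or random.Random()
    frame = []
    for _ in range(frame_len):
        sample = int(round(rng.gauss(0.0, level_rms)))
        # Clamp to the 16-bit signed sample range.
        frame.append(max(-32768, min(32767, sample)))
    return frame

# Assumed background level of 100 (on a 16-bit scale); seeded for repeatability.
noise = comfort_noise(100.0, rng=random.Random(0))
print(len(noise), round(rms(noise)))
```

The generated level only approximates the target because each frame is a finite random sample.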

Voice activity detection (VAD) feature in Matrix devices

VAD is available on all Matrix ATA series devices, such as the ATA211, ATA2S, ATA1S and ATA211G, and on all gateway series devices, such as the Setu VGFX and Setu VFXTH.

When enabled, VAD (silence suppression) does not send RTP traffic during periods of silence, which saves bandwidth. At the beginning of a silence period, a single packet is sent to the distant end to inform it that a period of silence is being entered and that it should begin regenerating comfort noise into its TDM stream. Silence suppression can be enabled or disabled on an established connection.
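The sender behaviour described above can be sketched as a small state machine: voice frames are sent normally, the first silent frame triggers a single silence notification, and the rest of the silence period sends nothing. The packet labels "VOICE" and "SID" (silence insertion descriptor, a term borrowed from common codec usage) and the function name are assumptions for illustration and do not describe the Matrix firmware.

```python
def packetize(speech_flags):
    """Return the packets a sender would emit for a sequence of frames.

    Each packet is a (label, frame_index) tuple. Speech frames produce a
    "VOICE" packet. Each silence period produces exactly one "SID" packet,
    at its first frame.
    """
    packets = []
    in_silence = False
    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            packets.append(("VOICE", i))
            in_silence = False
        elif not in_silence:
            packets.append(("SID", i))
            in_silence = True
        # Further silent frames in the same period send nothing.
    return packets

flags = [True, True, False, False, False, True, False]
packets = packetize(flags)
print(packets)
```

In this run, seven frames produce five packets: three voice packets and one notification for each of the two silence periods.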