Abstract The cockpit voice recorder (CVR) in an aircraft black box records many cockpit voices, such as speaker voices, noises and background sounds with special meanings. The complexity of cockpit voices makes analysis by the traditional "differentiating and hearing" method difficult, so clean cockpit voices are not easily captured from non-stationary sounds. In this paper, after first analyzing the characteristics of cockpit voices thoroughly, we develop an improved voice activity detection scheme based on iterative spectral subtraction and double thresholds. Finally, to demonstrate the effectiveness of the proposed scheme, we run simulations with a section of speech (SNR=8) from a standard voice bank and a section of real cockpit voices, and compare the probabilities Pcs and Pcn of the three algorithms, where Pcs denotes the probability of correctly detecting speech frames and Pcn the probability of correctly detecting noise frames. The simulation results demonstrate the effectiveness of the improved algorithm.
1. Introduction
The cockpit voice recorder (CVR) in an aircraft black box records many cockpit voices, such as speaker voices, noises and background sounds with special meanings. They are complex non-stationary signals characterized by abrupt changes, transience and singularity. In some special cases, speaker voices mixed with background sounds play an important role in air accident investigation (AAI). However, while the aircraft is flying, and especially while an accident is happening, the recording conditions of the CVR deteriorate. Therefore, fast extraction of speaker voices from cockpit voices is an important task in AAI. The conventional approach in AAI is based on the "differentiating and hearing" method and simple audio processing, so accurate speaker voices cannot be obtained easily.
Voice activity detection (VAD) detects the beginning and end of a section of speech signal, and thereby distinguishes speaker voice from background sounds. VAD in AAI requires accurate detection and low computing cost, in order to leave more time for AAI. The traditional double-threshold VAD based on short-time energy and zero-crossing rate performs well at high signal-to-noise ratio (SNR), but fails when speaker voices are submerged in strong background sounds, that is, at low SNR, for example when an accident happens.
In this paper, we develop an improved scheme based on double-threshold VAD with spectral subtraction. The paper is organized as follows: Section 2 analyzes the characteristics of cockpit voices. Sections 3 and 4 describe, respectively, the traditional VAD scheme based on double thresholds and basic spectral subtraction. Sections 5 and 6 present the improved scheme, together with simulation results. Finally, some conclusions are drawn in Section 7.
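To make the traditional double-threshold VAD mentioned in the introduction more concrete, the following is a minimal sketch of the general idea, not the exact scheme evaluated in this paper. The function names, the frame length and hop, the threshold ratios and the zero-crossing-rate threshold are all illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.sign(frames)
    signs[signs == 0] = 1          # treat exact zeros as positive
    return np.mean(np.diff(signs, axis=1) != 0, axis=1)

def grow(mask, allowed):
    """Extend each True run in `mask` into adjacent frames where `allowed` holds."""
    out = mask.copy()
    for i in range(1, len(out)):             # extend to the right
        if out[i - 1] and allowed[i]:
            out[i] = True
    for i in range(len(out) - 2, -1, -1):    # extend to the left
        if out[i + 1] and allowed[i]:
            out[i] = True
    return out

def double_threshold_vad(x, frame_len=256, hop=128,
                         high_ratio=0.25, low_ratio=0.05, zcr_thresh=0.3):
    """Return a boolean array marking frames judged to contain speech.

    Thresholds are set relative to the maximum frame energy (an assumption
    made here for simplicity; classical schemes estimate them from leading
    noise-only frames and limit how far the ZCR stage may extend).
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    energy = short_time_energy(frames)
    zcr = zero_crossing_rate(frames)
    speech = energy > high_ratio * energy.max()          # stage 1: high threshold
    speech = grow(speech, energy > low_ratio * energy.max())  # stage 2: low threshold
    speech = grow(speech, zcr > zcr_thresh)              # stage 3: ZCR extension
    return speech
```

Because both thresholds are derived from frame energy alone, strong background sound raises the energy of noise-only frames above the thresholds, which is consistent with the failure at low SNR described above.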
2. Voice characteristics of CVR
The frequency range of cockpit voices recorded by the CVR is very wide, about 150 Hz to 6800 Hz, which makes sound separation difficult. To facilitate sound separation, cockpit voices are classified into three kinds: aviation noises, speaker voices and background sounds [1][2].
Aviation noises include additive and non-additive noises. Additive noises comprise periodic noise, impulse noise, broadband noise and speech interference. Non-additive noises are mainly sound residue and circuit noise; they can be converted into additive noises by means of a suitable transformation. More specifically, aviation noises include engine sound, exterior airflow noise during flight, taxiing noise during takeoff and landing, circuit noise in electrical equipment and circuits, noise from power-driven motors when the aircraft is manipulated, and so on.
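The text does not name the transformation that turns non-additive noise into additive noise. One common choice in speech processing is a homomorphic (logarithmic) transform, and the sketch below assumes that choice purely for illustration. It models the non-additive component as a convolution with a hypothetical response h and shows that the components add in the log-magnitude spectrum; the function names and test signals are assumptions.

```python
import numpy as np

def circular_convolve(x, h):
    """Circular convolution of two equal-length real signals via the FFT."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

def log_spectrum(sig):
    """Log-magnitude spectrum; assumes no spectral bin is exactly zero."""
    return np.log(np.abs(np.fft.fft(sig)))

rng = np.random.default_rng(0)
x = rng.standard_normal(512)          # stand-in for the clean signal
h = 0.1 * rng.standard_normal(512)    # hypothetical convolutional distortion
y = circular_convolve(x, h)           # observed signal: non-additive in time

# After the log transform the two components combine additively.
additive = np.allclose(log_spectrum(y), log_spectrum(x) + log_spectrum(h), atol=1e-6)
```

Once the components are additive in the transformed domain, methods designed for additive noise, such as spectral subtraction, can in principle be applied there.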