The Zoological Research Museum Alexander Koenig is a research museum of the Leibniz Association (Leibniz-Gemeinschaft)

Audio parameterization with robust frame selection

Authors: 
Ventura, T.M., de Oliveira, A.G., Ganchev, T.D., de Figueiredo, J.M., Jahn, O., Marques, M.I., Schuchmann, K.-L.
Year of publication: 
2015
Full title: 
Audio parameterization with robust frame selection for improved bird identification
Organizational classification: 
Published in: 
Expert Systems with Applications
Publication type: 
Journal article
DOI Name: 
http://dx.doi.org/10.1016/j.eswa.2015.07.002
Keywords: 
computational bioacoustics, bird identification, Hidden Markov Model (HMM), Mel Frequency Cepstral Coefficients (MFCCs), robust frame selection
Bibliographic details: 
Ventura, T.M., de Oliveira, A.G., Ganchev, T.D., de Figueiredo, J.M., Jahn, O., Marques, M.I., Schuchmann, K.-L. (2015): Audio parameterization with robust frame selection for improved bird identification. Expert Systems with Applications 42, 8463–8471. doi:10.1016/j.eswa.2015.07.002
Abstract: 

A major challenge in the automated acoustic recognition of bird species is audio segmentation, which aims to select portions of audio that contain meaningful sound events and to eliminate segments that contain predominantly background noise or sound events of other origin. Here we report on the development of an audio parameterization method with integrated robust frame selection that applies morphological filtering to the spectrogram, treated as an image. The morphological filtering makes it possible to exclude from further processing certain audio events that could otherwise cause misclassification errors. The Mel Frequency Cepstral Coefficients (MFCCs) computed for the selected audio frames offer a good representation of the spectral information for dominant vocalizations because the morphological filtering eliminates short bursts of noise and suppresses weak competing signals. Experimental validation of the proposed method on the identification of 40 bird species from Brazil demonstrated superior accuracy and faster operation than three traditional and recent approaches. This is expressed as a reduction of the relative error rate by 3.4% and of the overall operational time by 7.5% compared with the second-best result. The improved frame selection robustness, precision, and operational speed facilitate applications such as multispecies identification in real-field recordings.
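
The idea of frame selection by morphological filtering can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filter, so the fixed energy threshold, the one-dimensional structuring element along the time axis, and the function names (`binarize`, `erode_time`, `dilate_time`, `select_frames`) are all assumptions made for illustration. The sketch thresholds a small spectrogram into a binary image, applies a morphological opening (erosion followed by dilation), and keeps only the frames in which some energy survives. MFCCs would then be computed for those frames only.

```python
from typing import List

Mask = List[List[bool]]  # mask[frequency_bin][frame]


def binarize(spectrogram: List[List[float]], threshold: float) -> Mask:
    """Turn the spectrogram into a binary image (assumed fixed threshold)."""
    return [[value > threshold for value in row] for row in spectrogram]


def erode_time(mask: Mask, width: int) -> Mask:
    """Keep a cell only if the whole window of `width` frames around it is set.

    `width` is assumed to be odd; cells outside the image count as unset.
    """
    half = width // 2
    n_frames = len(mask[0])
    eroded = []
    for row in mask:
        new_row = []
        for t in range(n_frames):
            lo, hi = t - half, t + half
            inside = lo >= 0 and hi < n_frames
            new_row.append(inside and all(row[lo:hi + 1]))
        eroded.append(new_row)
    return eroded


def dilate_time(mask: Mask, width: int) -> Mask:
    """Set a cell if any cell in the window of `width` frames around it is set."""
    half = width // 2
    n_frames = len(mask[0])
    return [
        [any(row[max(0, t - half):min(n_frames, t + half + 1)])
         for t in range(n_frames)]
        for row in mask
    ]


def select_frames(spectrogram: List[List[float]],
                  threshold: float,
                  width: int = 3) -> List[int]:
    """Return indices of frames that still contain energy after the opening."""
    opened = dilate_time(erode_time(binarize(spectrogram, threshold), width), width)
    n_frames = len(opened[0])
    return [t for t in range(n_frames) if any(row[t] for row in opened)]


# Toy spectrogram: 2 frequency bins x 8 frames (hypothetical values).
spectrogram = [
    [0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],  # one-frame noise burst at frame 1
    [0.1, 0.1, 0.1, 0.8, 0.9, 0.7, 0.8, 0.1],  # sustained event, frames 3 to 6
]
selected = select_frames(spectrogram, threshold=0.5, width=3)
print(selected)  # [3, 4, 5, 6]
```

In this toy example the one-frame burst is shorter than the structuring element and is removed by the erosion, while the four-frame event survives and is restored to its full extent by the dilation. A practical system would more likely use a two-dimensional structuring element and a library routine for the morphological operations.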