Audio parameterization with robust frame selection

Authors:

Ventura, T.M., de Oliveira, A.G., Ganchev, T.D., de Figueiredo, J.M., Jahn, O., Marques, M.I., Schuchmann, K.-L.

Year of publication:

2015

Full title:

Audio parameterization with robust frame selection for improved bird identification

ZFMK authors:

Dr. Olaf Jahn, Prof. Dr. Karl-Ludwig Schuchmann

Organizational unit:

Ornithology

Published in:

Expert Systems with Applications

Publication type:

Journal article

DOI Name:

http://dx.doi.org/10.1016/j.eswa.2015.07.002

Keywords:

computational bioacoustics, bird identification, Hidden Markov Model (HMM), Mel Frequency Cepstral Coefficients (MFCCs), robust frame selection

Bibliographic details:

Ventura, T.M., de Oliveira, A.G., Ganchev, T.D., de Figueiredo, J.M., Jahn, O., Marques, M.I., Schuchmann, K.-L. (2015): Audio parameterization with robust frame selection for improved bird identification. Expert Systems with Applications 42, 8463–8471. doi:10.1016/j.eswa.2015.07.002

Abstract:

A major challenge in the automated acoustic recognition of bird species is audio segmentation, which aims to select portions of audio that contain meaningful sound events and to eliminate segments that contain predominantly background noise or sound events of other origin. Here we report on the development of an audio parameterization method with integrated robust frame selection that applies morphological filtering to the spectrogram, treated as an image. The morphological filtering makes it possible to exclude from further processing certain audio events that could otherwise cause misclassification errors. The Mel Frequency Cepstral Coefficients (MFCCs) computed for the selected audio frames offer a good representation of the spectral information for dominant vocalizations because the morphological filtering eliminates short bursts of noise and suppresses weak competing signals. Experimental validation of the proposed method on the identification of 40 bird species from Brazil demonstrated higher accuracy and faster operation than three traditional and recent approaches. This is expressed as a reduction of the relative error rate by 3.4% and of the overall operational time by 7.5% when compared to the second-best result. The improved robustness, precision, and operational speed of the frame selection facilitate applications such as multispecies identification in real-field recordings.
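
The frame selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the median-based noise floor, the 15 dB threshold, and the 1 x 5 structuring element along the time axis are all assumptions chosen for illustration. The sketch thresholds a spectrogram into a binary image, applies a morphological opening so that short bursts vanish, and keeps only frames in which something survives. MFCCs would then be computed for the selected frames only.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return a magnitude spectrogram of shape (n_bins, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def _filter_along_time(mask, width, reduce_fn):
    """Binary min (np.all) or max (np.any) filter with a 1 x width element."""
    half = width // 2
    padded = np.pad(mask, ((0, 0), (half, half)), constant_values=False)
    shifted = np.stack(
        [padded[:, k : k + mask.shape[1]] for k in range(2 * half + 1)]
    )
    return reduce_fn(shifted, axis=0)


def select_frames(spec, threshold_db=15.0, min_frames=5):
    """Return a boolean vector marking frames kept after morphological opening.

    threshold_db and min_frames are illustrative assumptions, not values
    taken from the paper.
    """
    level_db = 20.0 * np.log10(spec + 1e-10)
    # Assumed noise model: per-bin median level over time.
    noise_floor = np.median(level_db, axis=1, keepdims=True)
    mask = level_db > noise_floor + threshold_db
    # Opening = erosion followed by dilation; removes runs shorter than min_frames.
    eroded = _filter_along_time(mask, min_frames, np.all)
    opened = _filter_along_time(eroded, min_frames, np.any)
    return opened.any(axis=0)


# Synthetic test signal: weak noise, a 0.5 s tone, and a single-sample click.
rng = np.random.default_rng(0)
sr = 16000
signal = 0.01 * rng.standard_normal(2 * sr)
t = np.arange(8000) / sr
signal[8000:16000] += 0.5 * np.sin(2 * np.pi * 3000 * t)  # sustained "call"
signal[24000] += 5.0  # short burst of noise

spec = magnitude_spectrogram(signal)
raw = select_frames(spec, min_frames=1)     # thresholding only, no opening
robust = select_frames(spec, min_frames=5)  # with morphological opening
```

In this toy signal, thresholding alone marks the frames around the click as active, whereas the opened mask keeps the sustained tone and drops the click. A production version would more likely use a two-dimensional structuring element and a library routine such as `scipy.ndimage.binary_opening`.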
