by Adriana Stan
Abstract:
This article studies new methods for improving the performance of automatic speech recognition systems in the presence of noise. Instead of modifying complex recognition models, the study aims to enhance the reliability of the input data by processing the acoustic representations of speech. One of these representations, the SpectroTemporal Excitation Pattern (STEP), is used in recognition systems with missing or unreliable data. One idea behind this study was to enlarge the glimpsing areas in the STEP representations. Because the glimpsing algorithm requires prior knowledge of the noise, another idea was to estimate the noise characteristics and determine the glimpsing areas from these estimates. Preliminary tests were conducted with an HMM recognition system, but this will be the subject of a future study.
Reference:
Adriana Stan, "Linear Interpolation of Spectrotemporal Excitation Pattern Representations for Automatic Speech Recognition in the Presence of Noise", In Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, Constanta, Romania, 2009.
Bibtex Entry:
@inproceedings{SPED09,
author = {Adriana Stan},
title = {{Linear Interpolation of Spectrotemporal Excitation Pattern Representations
for Automatic Speech Recognition in the Presence of Noise}},
abstract = {This article studies new methods for improving the
performance of automatic speech recognition systems in
the presence of noise. Instead of modifying complex
recognition models, the study aims to enhance the
reliability of the input data by processing the acoustic
representations of speech. One of these representations,
the SpectroTemporal Excitation Pattern (STEP), is used in
recognition systems with missing or unreliable data. One
idea behind this study was to enlarge the glimpsing areas
in the STEP representations. Because the glimpsing algorithm
requires prior knowledge of the noise, another idea was to
estimate the noise characteristics and determine the glimpsing
areas from these estimates. Preliminary tests were conducted
with an HMM recognition system, but this will be the subject
of a future study.},
booktitle = {Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue},
year = 2009,
address = {Constanta, Romania}
}