Abstract

Many problems in voice recognition and audio processing involve feature extraction from raw waveforms. The goal of feature extraction is to reduce the dimensionality of the audio signal while preserving the informative signatures that, for example, distinguish different phonemes in speech or identify particular instruments in music. If the acoustic variability of a data set is described by a small number of continuous features, then we can imagine the data as lying on a low-dimensional manifold in the high-dimensional space of all possible waveforms. Locally linear embedding (LLE) is an unsupervised learning algorithm for feature extraction in this setting. In this paper, we present results from the exploratory analysis and visualization of speech and music by LLE.
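
To make the setting concrete, the short Python sketch below applies scikit-learn's LocallyLinearEmbedding to the magnitude spectra of synthetic sine-tone frames. This is not the authors' pipeline; the sine-tone data set, frame length, and neighborhood size are illustrative assumptions. It only shows how LLE can recover a smooth low-dimensional parameterization (here, pitch) from high-dimensional spectral features.

# A minimal sketch, assuming synthetic sine tones in place of real speech or
# music; the frame length and neighborhood size are illustrative choices.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

sr = 8000          # sample rate (Hz)
frame_len = 512    # samples per frame
rng = np.random.default_rng(0)

# Build "waveforms": short frames of sine tones whose pitch sweeps 200-800 Hz,
# each with a random phase so the raw samples vary even at the same pitch.
pitches = np.linspace(200.0, 800.0, 400)
t = np.arange(frame_len) / sr
frames = np.stack([np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
                   for f in pitches])

# High-dimensional features: windowed magnitude spectrum of each frame.
spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))

# Unsupervised feature extraction: LLE maps each spectrum to 2 coordinates.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2)
embedding = lle.fit_transform(spectra)   # shape (400, 2)

# The embedded points should trace a smooth curve ordered by pitch,
# illustrating the low-dimensional manifold structure of the data.
print(embedding[:5])

Here each magnitude spectrum plays the role of the high-dimensional waveform representation, and the two LLE coordinates are the extracted low-dimensional features; plotting them against pitch makes the manifold structure visible.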

Keywords

Feature extraction, Computer science, Visualization, Speech recognition, Embedding, Artificial intelligence, Dimensionality reduction, Pattern recognition (psychology), Feature (linguistics), Signal processing, Set (abstract data type), Waveform, Speech processing, Audio signal processing, Audio signal, Curse of dimensionality, Data visualization, Speech coding, Digital signal processing

Publication Info

Year: 2004
Type: article
Volume: 3
Pages: iii-984
Citations: 46
Access: Closed

Citation Metrics

46 (OpenAlex)

Cite This

Vaibhav Jain, L.K. Saul (2004). Exploratory analysis and visualization of speech and music by locally linear embedding. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 3, iii-984. https://doi.org/10.1109/icassp.2004.1326712

Identifiers

DOI
10.1109/icassp.2004.1326712