ABSTRACT:
machine learning techniques. In general, these are not applied directly to the audio samples, but to a limited number of time- and/or
frequency-domain features, usually computed from a short-time Fourier transform of the signal.
This talk will present a few experiments in which we try alternative, more adaptive signal processing techniques to extract information from
the audio. One such technique uses so-called "sparse representations", where the signal is modelled as a linear combination of a small
number of elementary waveforms taken from a fixed, very large set. Finding such a combination is a well-known optimization problem. Beyond
the obvious application to audio coding, we will demonstrate that this representation also rearranges the information hierarchically, which
can be used to build "scalable features". Example applications include audio similarity and audio classification experiments.
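
As a rough illustration of the sparse representation idea described above, the sketch below approximates a signal with a few waveforms picked greedily from a large set. It uses matching pursuit, a standard greedy method chosen here only as an assumption; the abstract does not say which algorithm the talk relies on. The random dictionary, the toy signal, and all names (matching_pursuit, atoms, n_iter) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse approximation of `signal` over unit-norm `atoms`.

    At each step, pick the atom most correlated with the residual,
    add its contribution to the coefficients, and subtract it from
    the residual. Returns (coeffs, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        corr = [dot(atom, residual) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corr[k]))
        coeffs[best] += corr[best]
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Hypothetical dictionary: 128 random unit-norm waveforms of 32 samples each.
# A real audio dictionary would hold structured waveforms instead.
random.seed(0)
n_samples, n_atoms = 32, 128
atoms = []
for _ in range(n_atoms):
    w = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
    norm = math.sqrt(dot(w, w))
    atoms.append([x / norm for x in w])

# Toy signal built from three atoms, so a sparse description exists.
signal = [2.0 * a - 1.5 * b + 1.0 * c
          for a, b, c in zip(atoms[3], atoms[40], atoms[99])]

coeffs, residual = matching_pursuit(signal, atoms, n_iter=8)
energy_in = dot(signal, signal)
energy_left = dot(residual, residual)
n_used = sum(1 for c in coeffs if c != 0.0)
```

Because each step removes the component of the residual along the chosen waveform, the residual energy never increases from one iteration to the next, and stopping after fewer iterations yields a coarser description of the same signal. This ordering is one way to picture the hierarchical arrangement of information mentioned above.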