PhD student under the supervision of M. Kowalski

Thesis title: Mathematics Models and Dictionaries Learning for Music Signals: Application to Harmonic/Percussive Source Separation
Thesis abstract: Many audio processing tasks, such as automatic transcription or rhythm recognition, are more effective when applied to recordings containing as few instruments as possible. Indeed, the spectral structures present in the Time-Frequency (TF) representations of the data are blurred when several instruments play at the same time. This thesis addresses the problem of Harmonic/Percussive source separation. We propose methods for decomposing the spectrograms of musical signals, based on the differences in the TF domain between the structures of harmonic instruments (generally horizontal structures) and those of percussive instruments (vertical structures). The proposed decompositions rely on Non-negative Matrix Factorization (NMF). NMF is a rank reduction technique for non-negative data. It is widely used to decompose musical spectrograms and has been applied with great success in areas such as source separation, automatic transcription and rhythm recognition. In this thesis we first used a structured NMF with orthogonal components to model the harmonic instruments and, in parallel, we decomposed the percussive part in several different ways. The first algorithm leaves the percussive part unconstrained. It is a very general method that obtains an effective decomposition on simple signals without hyperparameter optimization. However, on complex signals, the results are not satisfactory. We then constrained the percussive part with drum-specific dictionaries, trying many methods and combinations for the construction of the a priori data. Finally, we used a convolutive NMF decomposition with drum sound samples. This alternative method makes it possible to represent each hit of an element of the drum kit by a drum sound fragment taken from a database. This manuscript thus reflects work centered on source separation and proposes new decomposition methods to separate harmonic instruments from percussive instruments in musical signals.
These methods have been tested on a large database, and their performance has been evaluated in terms of the quality of the estimated signals, in comparison with state-of-the-art methods. We also applied our convolutive NMF method to transcription and audio synthesis.
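
To make the NMF idea in the abstract concrete, here is a minimal sketch of a plain, unconstrained NMF using the classic multiplicative updates for the Euclidean cost. It is not the structured, orthogonal or convolutive variant developed in the thesis. The function name `nmf`, its parameters and the toy "spectrogram" (one horizontal line for a sustained tone, one vertical line for a percussive hit) are illustrative assumptions, not taken from the manuscript.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H, with W, H >= 0.

    Generic multiplicative-update NMF (Euclidean cost); hypothetical helper,
    not the algorithm proposed in the thesis.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Random non-negative initialisation of the dictionary W and activations H.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep both factors non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy magnitude "spectrogram" (assumed data): 8 frequency bins x 10 time frames.
V = np.zeros((8, 10))
V[2, :] = 1.0    # horizontal structure: a sustained (harmonic-like) tone
V[:, 5] += 1.0   # vertical structure: a broadband (percussive-like) hit

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the columns of `W` play the role of spectral templates and the rows of `H` their activations over time. The methods described in the abstract add constraints on top of this basic model, such as orthogonality for the harmonic part or fixed drum dictionaries for the percussive part.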