Ph.D. student under the supervision of A. TENENHAUS

Thesis title: Statistical and computational framework for structured data analysis: Application to imaging-genetic data integration.
Thesis abstract: IMAGEN is a European research project whose aim is to identify and better understand the biological and environmental factors that might influence mental health in teenagers. This knowledge will support the development of better prevention strategies and therapies in the future. The IMAGEN database includes, for about two thousand 14-year-old adolescents: (i) demographic data, (ii) neuropsychological assessments, psychometry and medical questionnaires, (iii) multimodal neuroimaging (including functional, structural and diffusion-weighted MR imaging) and (iv) omics (SNP and methylation) data. All these datasets are already centralized at NeuroSpin. The IMAGEN dataset brings together all the challenges that have to be faced in modern multivariate data analysis. The first bottleneck is the high complexity of the data, which stems from (i) the variety of sources (genetics, neuroimaging, etc.), (ii) the number of neuroimaging modalities and (iii) the multi-centric nature of the data. The second bottleneck is the high number of measurements (~1M) in both the genetic and the neuroimaging data, which involves the computation of billions of associations. A successful investigation of such a dataset requires developing a computational and statistical framework that fits both the peculiar structure of the data and its heterogeneous nature. The work of the Ph.D. student, at the interface between statistical data analysis and machine learning, consists in contributing to the development of a statistical and computational framework for multiblock data analysis with application to IMAGEN.