I am a Senior Researcher at Inria (national research center in digital science), head of the Inria-Inserm (national research center in health) Idesp (Inserm – UM) team PreMeDICaL (precision medicine by data integration and causal learning), and responsible for the Causal and Missing Data group at Inria.
News: see also PreMeDICaL’s news.
– My PhD student Margaux Zaffran received the L'Oréal-UNESCO Young Talent prize! She is my second PhD student to receive this prize.
Team retreat: before launching the clinical trial to assess the decision support tool Traumatrix, we tested the models one last time with clinicians. Video to present Traumatrix. Statistical challenges slides. Presentation of the project slides.
– Talk on causal measures and transportability, NIH Biostatistics National Cancer Institute (slides)
Video at Online Causal Inference  Seminar on leveraging incomplete RCT and observational data
– Missing values: EPFL 2023, AutoML 2022 ICML slides, video. A missing values tour: 2022 slides (Les Diablerets PhD school), 2019 useR! slides, 2019 video (starts at 30′)
– Positions for researchers, interns, PhD students, postdocs and engineers. Contact me.

TrauMatrix project: a decision support tool for intensive care. Consortium: Capgemini Invent, Traumabase, EHESS, CNRS, Ecole Polytechnique, Inria
– ICUBAM: development of an app for bed allocation monitoring during COVID-19. Fork. Slides.
– R-miss-tastic website [paper], with missing values resources (lectures, workflows, tutorials, etc.). Contribute!
Causal Inference Task View to organize R packages. Contribute!
Rforwards, dedicated to widening the participation of minorities in the R community.

julie.josse[at]inria.fr – Office 231, Inria Montpellier.

My main research fields are: missing values (EM algorithms, imputation, supervised learning), causal inference (treatment effect estimation, combining RCT and observational data, survival analysis, policy learning), visualization with dimensionality reduction (PCA, correspondence analysis, questionnaire analyses, multi-block data), and low-rank matrix estimation. The main applications are in bio-sciences and health. Short CV. Detailed CV.


Selected publications:

Current research collaborators: Judith Abecassis, Gosia Bogdan, Claire Boyer, Antoine Chambaz, Yaniv Romano, Erwan Scornet, Bertrand Thirion, Shu Yang.
Current collaborations with companies: EDF, Elixir Health, Quinten Health, Sanofi, etc.
Current collaborations in health: APHP, CHU Nancy/Montpellier, Gustave Roussy, Traumabase, etc.
Associate Editor: Foundations and Trends® in Machine Learning. Past: Journal of Computational & Graphical Statistics; Journal of Statistical Software (7 years). Area chair for NeurIPS, ICLR.

– A summary of my research contributions up to 2023  can be found here.
– An overview of my research up to 2016 can be found in my Habilitation.  (slides)

Year. Authors. Title. Venue. Link.
2024. Zaffran, M., Josse, J., Romano, Y., Dieuleveut, A. Predictive Uncertainty Quantification with Missing Covariates. pdf
2024. Boughdiri, A., Josse, J. & Scornet, E. Quantifying Treatment Effects: Estimating Risk Ratios in Causal Inference. pdf
2024. Näf, J. & Josse, J. What is a Good Imputation Under MAR Missingness? pdf
2024. Sussman, H., Chambaz, A., Josse, J., Aegerter, P., Wargon, M. & Bacry, E. Probabilistic Prediction of Arrivals and Hospitalizations in Emergency Departments in Ile-de-France.
2024. Zhao, P., Gatulle, N., James, A., Josse, J. & Chambaz, A. Learning, Evaluating and Analysing an Individualized Decision Support Rule with Application to Early Intervention in Intensive Care Unit.
2023. Bénard, C., Näf, J. & Josse, J. MMD-based Variable Importance for Distributional Random Forest.
2023. Sussman, H., Chambaz, A. & Josse, J. AdaptiveConformal, an R package for adaptive conformal inference.
2023. Zhao, P., Chambaz, A., Josse, J., Yang, S. Positivity-free Policy Learning with Observational Data.
2023. Bénard, C., Josse, J. Variable importance for causal forests: breaking down the heterogeneity of treatment effects.
2023. Colnet, B., Josse, J., Varoquaux, G., Scornet, E. Risk ratio, odds ratio, risk difference... Which causal measure is easier to generalize?
2023. Zaffran, M., Josse, J., Dieuleveut, A., Romano, Y. Conformal prediction with missing values.
2023. Zhao, P., Josse, J. & Yang, S. Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data.
2022-24. Colnet, B., Josse, J., Varoquaux, G., Scornet, E. Reweighting the RCT for generalization: finite sample analysis and variable selection.
2022. Blet et al. Association between in-ICU red blood cells transfusion and one-year mortality in ICU survivors. Critical Care.
2022. Colnet, B., Josse, J., Varoquaux, G., Scornet, E. Generalizing a causal effect: sensitivity analysis and missing covariates. Journal of Causal Inference.
2022. Gauss et al. Is Early Norepinephrine Associated with 24-hour Mortality of Blunt Trauma Patients in Haemorrhagic Shock? An International Cohort Study. JAMA Network.
2022. Garaix et al. Decision-making tools for healthcare structures in times of pandemic. Anaesthesia Critical Care & Pain Medicine.
2022. Zaffran et al. Adaptive conformal prediction for time series.
2022. Perez-Lebel et al. Benchmarking missing-values approaches for predictive models on health databases.
2021. Le Morvan, M., Josse, J., Scornet, E. & Varoquaux, G. What’s a good imputation to predict with missing values? NeurIPS 2021 (Spotlight).
2021. Sportisse, A. et al. Model-based Clustering with Missing Not At Random Data.
2021. Mayer, I., Josse, J. & Traumabase. Transporting treatment effects with incomplete attributes. Biometrical Journal.
2020-2023. Colnet, B. et al. Causal inference methods for combining randomized trials and observational studies: a review. Statistical Science.
2020. Le Morvan, M., Josse, J., Moreau, T., Scornet, E. & Varoquaux, G. NeuMiss networks: differentiable programming for supervised learning with missing values. NeurIPS 2020 (Oral). pdf
2020. Sbidian et al. Hydroxychloroquine with or without azithromycin and in-hospital mortality or discharge in patients hospitalized for COVID-19 infection: a cohort study of 4,642 in-patients in France.
2020. Consortium ICUBAM. ICU Bed Availability Monitoring and analysis in the Grand Est région of France during the COVID-19 epidemic. Statistiques et Société.
2020. Sportisse, A., Boyer, C. and Josse, J. Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data.
2020. Sportisse, A., Boyer, C., Dieuleveut, A., Josse, J. Debiasing Stochastic Gradient Descent to handle missing values. NeurIPS 2020. pdf
2020. Moyer, J.D. et al. Trauma reloaded: Trauma registry in the era of data science. Anaesthesia Critical Care & Pain Medicine. pdf
2020. Muzellec, B., Josse, J., Boyer, C. & Cuturi, M. Missing Data Imputation using Optimal Transport.
2019. Josse, J., Mayer, I. & Vert, J.P. MissDeepCausal: causal inference from incomplete data using deep latent variable models.
2020. Le Morvan, M., Prost, N., Josse, J., Scornet, E. & Varoquaux, G. Linear predictor on linearly-generated data with missing values: non consistency and solutions.
2020. Descloux, P., Boyer, C., Josse, J., Sportisse, A., Sardy, S. Robust Lasso-Zero for sparse corruption and model selection with missing covariates. Scandinavian Journal of Statistics.
2022. Mayer, I., Sportisse, A., Josse, J., Vialaneix, N., Tierney, N. R-miss-tastic: a unified platform for missing values methods and workflows. R Journal.
2019-20. Mayer, I., Josse, J., Wager, S., Sverdrup, E., Moyer, J.D. and Gauss, T. Doubly robust treatment effect estimation with incomplete confounders. Annals of Applied Statistics.
2019-21. Bogdan, M., Jiang, W., Josse, J., Miasojedow, B. and Rockova, V. Adaptive Bayesian SLOPE – High dimensional Model Selection with Missing Values. Journal of Computational and Graphical Statistics.
2019-24. Josse, J., Prost, N., Scornet, E. & Varoquaux, G. On the consistency of supervised learning with missing values. Statistical Papers.
2019. Robin, G., Klopp, O., Josse, J., Moulines, E. and Tibshirani, R. Main effects and interactions in mixed and incomplete data frames. Journal of the American Statistical Association.
2019. Hamada, S. et al. Effect of Fibrinogen administration on early mortality in traumatic haemorrhagic shock: a propensity score analysis. Journal of Trauma.
2019. Sportisse, A., Boyer, C. and Josse, J. Low-rank estimation with missing non at random data. Statistics and Computing.
2018. Josse, J., Husson, F., Robin, G. and Balasubramanian, N. Imputation of mixed data with multilevel SVD. Journal of Computational and Graphical Statistics.
2018. Robin, G., Sardy, S., Moulines, E. and Josse, J. Low-rank model with covariates for count data with missing values. Journal of Multivariate Analysis.
2018. Jiang, W., Lavielle, M., Josse, J. and Gauss, T. Logistic Regression with Missing Covariates -- Parameter Estimation, Model Selection and Prediction within a Joint-Modeling Framework. Package, code
2018. Robin, G., Hoi To Wai, Josse, J., Klopp, O. and Moulines, E. Low-rank interactions and sparse additive effects model for large data frames. NeurIPS 2018.
2018. Josse, J. and Reiter, J. Introduction to the Special Section on Missing Data. Statistical Science.
2018. Seijo-Pardo, B., Alonso-Betanzos, A., P. Bennett, K. Bolón-Canedo, Josse, J., Saeed, M., Guyon, I. Feature selection in the presence of missing data. Neurocomputing, ESANN.
2017-2018. Mozharovskyi, P., Husson, F. and Josse, J. Nonparametric imputation by data depth. Journal of the American Statistical Association.
2017. Holmes, S. and Josse, J. 50 years of data-sciences, discussion. Journal of Computational and Graphical Statistics.
2017. Bollmann, S., Cook, D., Dumas, J., Fox, J., Josse, J., Keyes, O., Strobl, C., Turner, H. and Debelak, R. A First Survey on the Diversity of the R Community. R Journal.
2017. Celeux, G., Jewson, J., Josse, J., Marin, J.M. and Robert, C.P. Some discussions on the Read Paper "Beyond subjective and objective in statistics" by A. Gelman and C. Hennig.
2017. Foulley, J.L., Celeux, G. and Josse, J. Empirical Bayes approaches to PageRank type algorithms for rating scientific journals. Technical report.
2016. Sobczyk, P., Bogdan, M. and Josse, J. PCA using penalized semi-integrated likelihood. Journal of Computational and Graphical Statistics.
2016. Fithian, W. and Josse, J. Multiple Correspondence Analysis & the Multilogit Bilinear Model. Journal of Multivariate Analysis.
2016. Husson, F., Josse, J. and Saporta, G. Jan de Leeuw and the French school of data analysis. Journal of Statistical Software.
2016-2017. Josse, J., Sardy, S. and Wager, S. denoiseR: a package for low rank matrix estimation.
2016. Groenen, P. and Josse, J. Multinomial Multiple Correspondence Analysis.
2016. Fujii, H., Josse, J., Tanioka, M., Miyachi, Y., Husson, F. and Ono, M. Regulatory T cells in melanoma revisited by a computational clustering of FOXP3+ T cell subpopulations. Journal of Immunology.
2015. Audigier, V., Husson, F. and Josse, J. MIMCA: Multiple imputation for categorical variables with multiple correspondence analysis. Statistics and Computing.
2015-2016. Josse, J. and Wager, S. Bootstrap-Based Regularization for Low-Rank Matrix Estimation. Journal of Machine Learning Research.
2015. Josse, J. and Sardy, S. Adaptive Shrinkage of singular values. Statistics and Computing.
2015. Josse, J. and Husson, F. missMDA: a package to handle missing values in and with multivariate data analysis methods. Journal of Statistical Software.
2015. Audigier, V., Husson, F. and Josse, J. Multiple Imputation with Bayesian PCA. Journal of Statistical Computation and Simulation.
2015-2016. Josse, J. and Holmes, S. Measuring multivariate association. Statistics Surveys.
2014. Audigier, V., Husson, F. and Josse, J. A principal components method to impute mixed data. Advances in Data Analysis and Classification.
2014. Josse, J., Wager, S. and Husson, F. Confidence areas for fixed-effects PCA. Journal of Computational and Graphical Statistics.
2014. Dray, S. and Josse, J. Principal component analysis with missing values: a comparative survey of methods. Plant Ecology.
2014. Josse, J., van Eeuwijk, F., Piepho, H-P. and Denis, J.B. Another look at Bayesian analysis of AMMI models for genotype-environment data. Journal of Agricultural, Biological, and Environmental Statistics.
2013. Verbanck, M., Josse, J. and Husson, F. Regularized PCA to denoise and visualise data. Statistics and Computing.
2013. Josse, J., Timmerman, M.E. and Kiers, H.A.L. Missing values in multi-level simultaneous component analysis. Chemometrics and Intelligent Laboratory Systems.
2013. Husson, F. and Josse, J. Handling missing values in Multiple Factor Analysis. Food Quality and Preference.
2013. Josse, J. and Husson, F. Handling missing values in exploratory multivariate data analysis methods. Journal de la SFdS. Paper written for the best PhD thesis prize awarded by the French Statistical Society.
2012. Josse, J., Chavent, M., Liquet, B. and Husson, F. Regularized Iterative Multiple Correspondence Analysis. Journal of Classification.
2011. Josse, J. and Husson, F. Selecting the number of components in PCA using cross-validation approximations. Computational Statistics and Data Analysis.
2011. Josse, J., Husson, F. and Pagès, J. Multiple imputation in PCA. Advances in Data Analysis and Classification.
2010. Josse, J., Husson, F. and Pagès, J. Principal component methods - hierarchical clustering - partitional clustering: why would we need to choose for visualizing data? Technical report.
2009. Josse, J., Husson, F. and Pagès, J. Analyse en Composantes Principales. Journal de la SFdS.
2008. Josse, J., Husson, F. and Pagès, J. Testing the significance of the RV coefficient. Computational Statistics and Data Analysis.
2008. Lê, S., Josse, J. and Husson, F. FactoMineR: an R package for multivariate analysis. Journal of Statistical Software.


Software – R

I am involved in the R software community and have been elected a member of the R Foundation for Statistical Computing.

Development of packages:
FactoMineR: Exploratory Data Analyses  (PCA, Multiple Correspondence Analysis, Multi-tables/view data, etc.)
missMDA:  imputation (matrix completion) of continuous and categorical data, PCA with missing values, etc.
For questions on the use of these packages, we have a Google group.
denoiseR: low rank matrix estimation with regularized SVD and bootstrap

My students have also developed R packages associated with our work: misaem: logistic regression with missing values, mimi: generalized low-rank models for mixed and incomplete data frames, lori: contingency tables with missing values and covariates, AdaptiveConformal: adaptive conformal prediction for time series, etc.

Development of R-miss-tastic and of the Causal Inference Task View. For causal inference with missing values, you can use the R package grf and see the pipeline to compare different estimators (IPW, DR) and strategies (imputation, etc.).

Development of ICUBAM (ICU Bed Allocation Monitor), an open source project to visualize the availability of intensive care beds. It started as a personal initiative of an intensive care physician in the Grand-Est region who identified the need to visualize, in real time, the available COVID+ beds (with a respirator). ICUBAM is an operational tool for intensive care physicians in times of crisis to model patient flows, anticipate bed needs and welcome patients from overwhelmed areas. ICUBAM was deployed in 130 ICU wards in 40 départements and inventories more than 2,000 ICU beds. Slides application, Slides models, paper, github.

I served as an associate editor of the Journal of Statistical Software (2011-2017) and was a founding member of Rforwards, which leads the R community forwards in widening the participation of women and other under-represented groups. I am on the R Foundation conference committee and have worked on the implementation of a Code of Conduct.
We also created the « French R board group » to support the organization of Les Rencontres R.

Video presentation of Rforwards. Blog posts and multivariate studies of the R community. Help R by supporting it with a donation or through the R Consortium.


As a French professor, I taught around 160 hours/year (lectures and computer labs, mainly with the R software) and supervised master's students' projects and their internships in industry. I was the head of a Master's degree in Data Science for Business at Ecole Polytechnique in collaboration with HEC business school. In addition, I give tutorials in different institutes and at conferences. Learn more. From September 2020, I taught Causal Inference in the IPP (Institut Polytechnique de Paris) Master of Data Science at Polytechnique. For recent tutorials on missing values, see the R-miss-tastic platform.


Her first employment was in the statistics department of an agronomy university (Agrocampus Ouest), where she was trained in « the French data analysis school » and had the opportunity to work closely with researchers from various departments, which increased her interest in transversal studies. In the meantime, she prepared her PhD, which was defended in 2010 and rewarded by the French Statistical Society as the best PhD in applied statistics. She has specialized in missing data, visualization and the nonparametric analysis of complex data structures. Her work was rewarded by a Marie Curie European Union grant in 2013 to increase her research potential and to spend a year and a half at Stanford University. She spent a year as a researcher at Inria before joining Ecole Polytechnique in 2016 as a Professor of Statistics. At Polytechnique, she was responsible for a Master in data science for business in collaboration with HEC business school. She was a part-time visiting researcher at Google Brain Paris for a year in 2019. In September 2020, she joined Inria as a senior researcher, and in 2022 she created the Inria-Inserm PreMeDICaL team in data science for health, composed of clinicians and researchers in statistics and machine learning. She has published over 60 articles and written 3 books in applied statistics. Her experience in dealing with incomplete data is recognized by the community: she has organized workshops and the MissData conference, created the R-miss-tastic website, and is often invited to give lectures to share her experience. Her vocation is to push methodological innovation to bring useful applications of her research to users, in particular in bio-sciences and health. Her current research focuses on causal inference techniques for personalized medicine. She leads a project with the Traumabase group, dedicated to the management of polytraumatized patients, to help emergency doctors make decisions.
Julie Josse is dedicated to reproducible research with the R statistical software: she has developed packages including FactoMineR, denoiseR and missMDA to transfer her work, and she is a member of the R Foundation and of Rforwards, which aims to increase the participation of minorities in the community.

Personal: I grew up in Africa and French Polynesia. Then I arrived in Brittany, a magnificent French region, and I had the chance to discover Paris and now the south of France. I am passionate about statistics but also about travelling around the world (on horseback when I was younger). I am also fascinated by nature and science (fan of https://www.sciencefriday.com/, wildlife photographer of the year). I have a particular interest in humanitarian issues, and my long-term goal is to use more of my skills for these purposes.

Interview in Académie des technologies (French, English) – Interview in Montpellier – Interview in Medium.


– SFdS, the French Statistical Society – Interested in data science? Join us!
– Some historical references on the French school of data analysis and data science.

Other projects:
Distributed computation with hospital data (with Balasubramanian Narasimhan)

Conference organization (head):
Leveraging Observational Data with Machine Learning 2021. 
Artemiss workshop at ICML 2020.
– The first MissData on missing values and matrix completion, June 2015.
– Correspondence analysis related methods CARME 2011. Videos. Let the data speak…. data analysis.
– The R conference useR! 2009.