Distributed matrix completion for medical databases
This is joint work with Geneviève Robin (PhD student at Polytechnique), François Husson (Professor at Agrocampus Ouest) and Balasubramanian Narasimhan (Senior Researcher at Stanford University).
Personalized medical care relies on comparing new patients' profiles to existing medical records in order to predict each patient's treatment response or risk of disease from their individual characteristics, and to adapt medical decisions accordingly. The chances of finding profiles similar to a new patient's, and therefore of providing better treatment, increase with the number of individuals in the database. For this reason, pooling the information held in the databases of several hospitals promises better care for every patient.
However, there are technical and social barriers to aggregating medical data. The size of the combined databases often makes computation and storage intractable, and institutions are usually reluctant to share their data because of privacy concerns and proprietary attitudes. Both obstacles can be overcome with distributed computation, which consists in leaving the data on site and distributing the calculations, so that hospitals share only intermediate results instead of the raw data. This could solve the privacy problem and reduce the cost of computation by splitting one large problem into several smaller ones. The general project is described in Narasimhan et al. (2017).
As is often the case, the medical databases are incomplete. One aim of the project is to impute the missing data of one hospital using the data of the other hospitals. This could also be an incentive for hospitals to participate in the project and to share summaries of their data.
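To make the idea of sharing intermediate results rather than raw data concrete, here is a minimal Python sketch. It is not the matrix completion method of the project: as a deliberately simple stand-in, each hospital sends only per-column counts and sums of its observed values, and the pooled column means are then used to fill in the missing entries of one hospital. The function names (`site_summary`, `pooled_means`, `impute`) and the toy data are hypothetical, and missing values are assumed to be encoded as `None`.

```python
def site_summary(rows):
    """Run locally at one hospital: per-column count and sum of observed values."""
    n_cols = len(rows[0])
    counts = [0] * n_cols
    sums = [0.0] * n_cols
    for row in rows:
        for j, x in enumerate(row):
            if x is not None:  # None marks a missing entry (assumed encoding)
                counts[j] += 1
                sums[j] += x
    return counts, sums


def pooled_means(summaries):
    """Run by the coordinator: combine the site summaries into pooled column means."""
    n_cols = len(summaries[0][0])
    means = []
    for j in range(n_cols):
        total_count = sum(counts[j] for counts, _ in summaries)
        total_sum = sum(sums[j] for _, sums in summaries)
        means.append(total_sum / total_count if total_count else None)
    return means


def impute(rows, means):
    """Run locally again: replace each missing entry by the pooled column mean."""
    return [[means[j] if x is None else x for j, x in enumerate(row)] for row in rows]


# Toy data (hypothetical): two hospitals, two variables, some entries missing.
hospital_a = [[1.0, None], [3.0, 10.0]]
hospital_b = [[5.0, 20.0], [None, 30.0]]

# Only the summaries leave each site, never the patient rows.
summaries = [site_summary(hospital_a), site_summary(hospital_b)]
means = pooled_means(summaries)
completed_a = impute(hospital_a, means)
```

The same pattern (local summaries, central aggregation, local completion) would carry over to richer summaries, at the cost of more rounds of communication between the sites and the coordinator.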
Modeling of polytraumatized patients
In collaboration with Jean-Pierre Nadal, the Traumabase group, APHP (Public Assistance – Hospitals of Paris).
The management of a polytraumatized patient (a patient whose trauma threatens a vital function) occurs in several stages:
1) On site, with the ambulance: the emergency doctors make a first assessment of the severity of the patient's condition and begin the first emergency measures.
2) The patient is transferred to a trauma center and placed in a recovery room, where new measurements are taken and initial interventions are performed if needed.
3) The patient is directed either to an operating room or to a radiology room.
One aim of the project is to model the emergency doctors' decisions and the associated events, in order to help them make choices in a very stressful environment and to avoid discrepancies between the diagnosis made by the emergency doctors and the one made by the doctors when the patient arrives at the trauma center. From a statistical point of view, the challenges include fitting predictive models such as logistic regression to data with many missing values (coded in different ways: NA for Not Applicable, Imp for Impossible, NR for Not Recorded, NM for Not Made, etc.) and with both continuous and categorical variables. Multiple imputation could be a way to obtain valid inferences despite the missing values.
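As an illustration of how multiple imputation leads to a single inference, the sketch below pools one regression coefficient across several imputed datasets using Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` is hypothetical and the numbers are made-up illustrative values, not results from the Traumabase data; the imputation and model-fitting steps themselves are assumed to have been done beforehand.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient over m imputed datasets with Rubin's rules.

    estimates: coefficient estimate obtained on each imputed dataset
    variances: squared standard error of that estimate on each dataset
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Made-up values for one logistic regression coefficient on m = 3 imputed datasets.
estimates = [0.5, 0.7, 0.6]
variances = [0.04, 0.04, 0.04]

pooled_estimate, pooled_variance = pool_rubin(estimates, variances)
pooled_std_error = pooled_variance ** 0.5
```

The between-imputation term is what carries the extra uncertainty due to the missing values: a single imputation followed by a standard analysis would report only the within-imputation variance and so understate the standard error.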