ChemOddity: machine learning to predict odorant detection thresholds
Thu-P1-031
Presented by: Maxence Lalis
Electronic noses (e-noses) mimic the sense of smell using a combination of molecular sensors, each with selective affinities for particular compounds. To approximate the human sense of smell, the affinity of these sensors must be close to the detection threshold of the target molecules. However, detection thresholds are available in the scientific literature for only a subset of molecules.
Here, we propose a machine learning model that aims to determine the odour detection threshold (ODT) of any molecule. A database of more than 3500 compounds was gathered to train and evaluate the predictive model. We assessed several combinations of model architectures (Random Forest, k-Nearest Neighbors, Support Vector Machine, Graph Neural Network, Ensemble models) and molecular descriptors (3D molecular descriptors, fingerprints, molecular graphs). After optimizing the parameters of each combination, we selected the Ensemble regressor, which achieves an RMSE of 1.14 and an MAE of 0.8 in log(ODT) units, with an R² of 0.61. Our model predicts detection limits ranging from parts-per-million (ppm) to parts-per-trillion (ppt) concentrations. This allows us to custom-design a series of chemicals to calibrate the e-nose according to specific needs. The predictive model and the entirety of the data will be made available.
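To make the reported metrics concrete, the following is a minimal sketch of how RMSE, MAE and R² can be computed on log-transformed thresholds, and of one simple way to combine base models. It is not the authors' implementation: the abstract does not state how the Ensemble regressor combines its members, so the unweighted mean used in `ensemble_predict` is an assumption, as is the use of base-10 logarithms. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired sequences of log10(ODT) values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


def ensemble_predict(member_predictions):
    """Combine base-model predictions by an unweighted mean per molecule.

    Assumption: simple averaging; the abstract does not specify the scheme.
    """
    return [sum(column) / len(column) for column in zip(*member_predictions)]


# Hypothetical log10(ODT in ppm) values, invented purely for illustration.
y_true = [0.0, -2.0, -4.0, -6.0]    # 1 ppm down to 1 ppt (1 ppm = 1e6 ppt)
model_a = [0.5, -1.5, -4.5, -5.0]   # predictions of a first base model
model_b = [-0.5, -2.5, -3.5, -6.0]  # predictions of a second base model

y_pred = ensemble_predict([model_a, model_b])
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.4f}")
```

Because the errors are measured in log units, they translate into multiplicative factors on the concentration scale. If the logarithm is base 10, an RMSE of 1.14 corresponds to a typical deviation of roughly a factor of 14 in concentration, against a ppm-to-ppt range that spans six orders of magnitude.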