Dr Feng Gao
Columbia University, USA
Abstract: Assessing the fate, transport, and toxicity of contaminants is vital to evaluating potential exposures and risks to humans. Emerging machine learning and deep learning models have been used to predict chemical ecotoxicity and important fate properties such as bioconcentration, complementing time-consuming and labor-intensive experiments. The performance of these models relies heavily on the numerical representation of chemicals. Representation learning is a class of machine learning and deep learning approaches that learn representations of data for use in downstream tasks such as regression, classification, or clustering. In this talk, I’ll discuss three common ways of representing molecules: molecular fingerprints, molecular physicochemical properties, and molecular graphs. I’ll share our recent work on using both supervised and unsupervised machine learning and deep learning methods to learn chemical representations and predict the fate and toxicity of organic contaminants. Specifically, I’ll discuss our work on linking molecular substructures with bioconcentration through molecular fingerprints, and on learning chemical representations from hundreds of physicochemical properties for toxicity prediction. Finally, I’ll talk about a novel unsupervised graph learning method we developed, named the geometric scattering transform (GST). GST can learn representations from graph-structured data, and we comprehensively tested its performance on seven biochemistry datasets. Our results demonstrate that learning chemical representations can provide unique perspectives and is important for building predictive models that accurately assess the fate and toxicity of organic contaminants.
Host: Assist. Prof. Yanbin Zhao
EEH Early Career Board Member
Shanghai Jiao Tong University
Time: 9:00 am, Oct 13, 2022 (Beijing time)
Zoom ID: 816 9975 7155
Bilibili: 25002335