SlideShare ist ein Scribd-Unternehmen logo
1 von 45
Downloaden Sie, um offline zu lesen
Machine Learning for Chemistry:
Representing and Intervening
Ichigaku Takigawa
takigawa@icredd.hokudai.ac.jp
Apr 26, 2021 @ Hokkaido University
Joint Symposium of Engineering & Information Science & WPI-ICReDD
I am a graduate of School of Engineering and IST!
1995-2005 (10 years) Hokkaido Univ
School of Engineering
Grad School of Engineering
Grad School of Info Sci & Tech
2012-2019 (7 years) Hokkaido Univ
B.Eng (1999)
M.Eng (2001), PhD (2004)
Postdoc (2004-2005)
Grad School of Info Sci & Tech Tenure Track (2012-2014)
Assoc Prof (2014-2019)
KUDO Mineichi TANAKA Yuzuru
SHIMBO Masaru
MINATO Shinichi
TANAKA Yuzuru
IMAI Hideyuki
2005-2011 (7 years) Kyoto Univ
2019-present (2 years) The “Cross-Appointment System”
But when I stepped outside
Physically I’m at Kyoto
Things go interdisciplinary…
• Bioinformatics Center
Institute for Chemical Research
• Grad School of Pharmaceutical Sci
• Medical-risk Avoidance
based on iPS Cells Team
• Institute for Chemical Reaction
Design and Discovery
Assist Prof (2005-2011)
2005-2011 (7 years) Kyoto Univ
2019-present (2 years) The “Cross-Appointment System”
This talk
• Why it is needed?
• What are exciting for computer scientists?
Machine Learning (ML) for Chemistry
It’s a hot topic in Chemistry
But also in Machine Learning!
NeurIPS 2020 ICML 2020
ICLR 2020
• Self-Supervised Graph Transformer on Large-Scale
Molecular Data
• RetroXpert: Decompose Retrosynthesis Prediction
Like A Chemist
• Reinforced Molecular Optimization with
Neighborhood-Controlled Grammars
• Autofocused Oracles for Model-based Design
• Barking Up the Right Tree: an Approach to Search
over Molecule Synthesis DAGs
• On the Equivalence of Molecular Graph Convolution
and Molecular Wave Function with Poor Basis Set
• CogMol: Target-Specific and Selective Drug Design
for COVID-19 Using Deep Generative Models
• A Graph to Graphs Framework for Retrosynthesis
Prediction
• Hierarchical Generation of Molecular Graphs using
Structural Motifs
• Learning to Navigate in Synthetically Accessible
Chemical Space Using Reinforcement Learning
• Reinforcement Learning for Molecular Design Guided by
Quantum Mechanics
• Multi-Objective Molecule Generation using Interpretable
Substructures
• Improving Molecular Design by Stochastic Iterative
Target Augmentation
• A Generative Model for Molecular Distance Geometry
• Directional Message Passing for Molecular Graphs
• GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation
• Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the
Chemical Space
• A Fair Comparison of Graph Neural Networks for Graph Classification
Mixed feelings of curiosity, optimism, skepticism?
Inseparably linked to automation
“These illustrate how rapid advancements in hardware automation and machine
learning continue to transform the nature of experimentation and modeling.”
Automation is the use of technology to perform tasks with reduced
human involvement or human labor.
Towards machine autonomy in discovery
Organic synthesis in a modular robotic system. Science 363 (2019) A mobile robotic chemist. Nature 583 (2020)
Automating drug discovery. Nature Reviews Drug Discovery 17 (2018)
Automation has been impactfully changing our daily life, society,
as well as scientific experiments and computations.
This talk
• Why it is needed?
• What are exciting for computer scientists?
I’ll briefly cover these from two aspects:
2. (Experimental) Intervention
Machine Learning (ML) for Chemistry
• What are good ML-readable representations for chemistry?
• What information should be recorded and given to ML?
1. Representation
• What are essential to make real chemical discoveries?
• Any principled ways for data acquisition and experimental design?
Two pillars for scientific discovery?
In essence, ML for chemistry is metascience (the science on how
to do science) unexpectedly hitting age-old unsolved questions in
the philosophy of natural science.
Machine Learning (ML)
https://www.forbes.com/sites/forbestechcouncil/2020/02/19/
in-praise-of-boring-ai-a-k-a-machine-learning/
…
“Let’s face it:
So far, the artificial
intelligence plastered all
over PowerPoint slides
hasn’t lived up to its hype.”
The AI frenzy: hope & hype
Machine Learning (ML)
From AAAI-20 Oxford-Style Debate
https://www.forbes.com/sites/forbestechcouncil/2020/02/19/
in-praise-of-boring-ai-a-k-a-machine-learning/
…
“Let’s face it:
So far, the artificial
intelligence plastered all
over PowerPoint slides
hasn’t lived up to its hype.”
The AI frenzy: hope & hype
Machine Learning (ML)
All about statistical and algorithmic techniques for
surface-model fitting to data points by adjusting model parameters.
Random Forest Neural Networks
SVR Kernel Ridge
“Predictive Modeling”
Fitted surface used for
making predictions on
unseen data points
Variable 1
Variable 2
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
Modern aspects of ML
1. High dimensionality: Data can have many input variables.
a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array)
Modern aspects of ML
1. High dimensionality: Data can have many input variables.
a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array)
2. Multiformity and multimodality: Data take many forms + modes
Numerical values, discrete structures, networks, variable-length sequences, etc.
Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
Modern aspects of ML
1. High dimensionality: Data can have many input variables.
a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array)
3. Overrepresentation: ML models can have many parameters.
ResNet50: 26 million params
ResNet101: 45 million params
EfficientNet-B7: 66 million params
VGG19: 144 million params
12-layer, 12-heads BERT: 110 million params
24-layer, 16-heads BERT: 336 million params
GPT-2 XL: 1558 million params
GPT-3: 175 billion params
2. Multiformity and multimodality: Data take many forms + modes
Numerical values, discrete structures, networks, variable-length sequences, etc.
Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
Modern aspects of ML
1. High dimensionality: Data can have many input variables.
a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array)
3. Overrepresentation: ML models can have many parameters.
ResNet50: 26 million params
ResNet101: 45 million params
EfficientNet-B7: 66 million params
VGG19: 144 million params
12-layer, 12-heads BERT: 110 million params
24-layer, 16-heads BERT: 336 million params
GPT-2 XL: 1558 million params
GPT-3: 175 billion params
Can you imagine what would happen if we try to
fit a surface model having 175 billion parameters
to 100 million data points in 10 thousand dimension??
2. Multiformity and multimodality: Data take many forms + modes
Numerical values, discrete structures, networks, variable-length sequences, etc.
Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
Modern aspects of ML
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Prediction
Input
variables
Surface
model
Classifier or
Regressor
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Modern aspects of ML
Prediction
Input
variables
Surface
model
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Variable
transformation
Feature learning
Classifier or
Regressor
Linear
4. Representation learning: Models can have “feature learning”
blocks, and they can be “pre-trained” by different large datasets.
Prior Info
Observational data
Reported facts
Textbook knowledge
Needs and excitement around ML for Chemistry
Discovery
Representation
Model (Belief)
Intervention
Hypothesis
New Info
Prior Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
Model (Belief)
Hypothesis
Can we somehow externalize “experience and intuition” of
experienced chemists to rationalize and accelerate discoveries?
Prior Info
Observational data
Reported facts
Textbook knowledge
Needs and excitement around ML for Chemistry
Discovery
Representation
Model (Belief)
Intervention
Hypothesis
New Info
Prior Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
Model (Belief)
Hypothesis
Can we somehow externalize “experience and intuition” of
experienced chemists to rationalize and accelerate discoveries?
Representation
Reactions
Materials
Molecules
ML
computer
programs
• Observational data
• Reported facts
• Textbook knowledge
?
Identifying relevant factors and establishing any necessary and sufficient
computer-readable representations are inevitable preconditions, but this is
far from trivial and quite paradoxical since we haven’t understood the target.
Any rationalized “real” discovery only comes from understanding and discovery
of the causal relations between relevant factors.
Representation
<latexit sha1_base64="dwtAUUE0cfsFu6+2FLg7b109CNE=">AAACi3ichVG7SgNBFL1ZX/ERjdoINsGgWIW7a0iiWIgiWKoxMaASdtdJMmRf7E4CMfgDljYW2ihYiB/gB9j4AxZ+glhGsLHw7mZFLIx3mZ07Z+65c2aO5hjcE4gvEamvf2BwKDo8MjoWG5+IT04VPbvh6qyg24btljTVYwa3WEFwYbCS4zLV1Ay2r9U3/P39JnM9blt7ouWwI1OtWrzCdVUQVDoUNSbUMi/Hk5hazmWUdCaBKcSsrMh+omTTS+mETIgfSQhj244/wCEcgw06NMAEBhYIyg1QwaPvAGRAcAg7gjZhLmU82GdwCiPEbVAVowqV0Dr9q7Q6CFGL1n5PL2DrdIpBwyVmAubxGe+wg094j6/4+WevdtDD19KiWetymVOeOJvJf/zLMmkWUPth9dQsoAK5QCsn7U6A+LfQu/zmyUUnv7I7317AG3wj/df4go90A6v5rt/usN3LHno00kIvRgZ9u5D4OykqKTmTUnbSybX10KoozMIcLJIfWViDLdiGQuDDOVzClRSTlqQVabVbKkVCzjT8CmnzC0ydk0A=</latexit>
✓i
<latexit sha1_base64="tkPRNIYeS8tNgbH62CO/ULi3LDw=">AAACi3ichVHLSsNAFL2Nr/quuhHcBIviqtykoa3iQhTBZbXWFtpSkjjaaF4k04IWf8ClGxe6UXAhfoAf4MYfcOEniMsKblx4k0bEhXrDZO6cuefOmTmaaxo+R3yOCT29ff0D8cGh4ZHRsfHExOSO7zQ9nRV1x3S8sqb6zDRsVuQGN1nZ9ZhqaSYraYdrwX6pxTzfcOxtfuSymqXu28aeoaucoHKVNxhX6wf1RBJTi7mMrGRETCFmJVkKEjmrpBVRIiSIJESRdxL3UIVdcECHJljAwAZOuQkq+PRVQAIEl7AatAnzKDPCfQYnMETcJlUxqlAJPaT/Pq0qEWrTOujph2ydTjFpeMQUYQ6f8BY7+Ih3+IIfv/Zqhz0CLUc0a10uc+vjp9OF939ZFs0cGt+sPzVz2INcqNUg7W6IBLfQu/zW8XmnsLQ1157Ha3wl/Vf4jA90A7v1pt9ssq2LP/RopIVejAz6ckH8PdmRU1ImJW8qyZXVyKo4zMAsLJAfWViBDchDMfThDC7gUhgV0sKSsNwtFWIRZwp+hLD+CU69k0E=</latexit>
✓j
O
N
N
N
H
NH
N
N
N
CH3
CH3
Levels of Theory/Model Abstraction First Principle and Simulation (Quantum Chemistry)
Spatio-Temporal Flexibility, Variations, Dynamics, and Interactions
Representation
Latent
variables
Representation
learning
Reactions
Materials
Molecules
Graphs (of different size)
Node
features
Edge
features
CC1CCNO1
Graph Neural
Networks (GNNs)
NCc1ccoc1.S=(Cl)Cl>>[RX_5]S=C=NCc1ccoc1
…
Classifier or
Regressor
Diverse
Downstream
Tasks
Modular Hierarchy
Amide
Proline
Oxazoline
Compositionality
Phenyl
Carboxyl Methyl Ethyl Tert-butyl
Isoprophyl
Trifluoromethyl
Benzyl
Substituents
Graph

Coarsening
Combinatorial aspects
Representation
NB: Transformers can be considered as a special case of GNNs,
and many Transformer-type GNNs are also developed.
Transformer Core
(Multihead)
Self-attention
Feed-forward NN
Add + LayerNorm
Add + LayerNorm
<latexit sha1_base64="I4mbdBylFC3Uuk1C7RrdvvfeVHQ=">AAACqXichVFNS9xQFD2m9dvqqJtCN8GpogjDy1CqKIXBbrp01NFBI+ElvnEeky+SN0N16B+YP9CFKwUX4qa70m676R9w4U8Qlxa66cKbTEBUqjck97zz7rk57107dGWsGLvs0V687O3rHxgcGh55NTqWG5/YjINm5IiKE7hBVLV5LFzpi4qSyhXVMBLcs12xZTc+JvtbLRHFMvA31EEodj2+78uadLgiysq9DfQPuhk3PUvqJnfDOrfk7Oc5vZakZVPVheJzVi7PCiwN/TEwMpBHFqtB7jtM7CGAgyY8CPhQhF1wxPTswABDSNwu2sRFhGS6L/AFQ6RtUpWgCk5sg777tNrJWJ/WSc84VTv0F5feiJQ6ptkFO2M37Dc7Z1fs3397tdMeiZcDynZXK0JrrPN6/e+zKo+yQv1O9aRnhRoWU6+SvIcpk5zC6epbh19v1pfWptsz7IRdk/9jdsl+0Qn81h/ntCzWjp7wY5MXujEakPFwHI/BZrFgvC8Uy+/ypZVsVAN4gynM0jwWUMInrKJC/Tv4hh/4qc1rZa2qbXdLtZ5MM4l7oTm3XZydSQ==</latexit>
o =
X
i
↵i(x)fi(x; ✓)
Effective pretraining is a crucial open problem because in practice,
we can only access to limited data for each specific problem.
Pretraining with self-supervised pretext tasks have transformed NLP
Prior Info
Observational data
Reported facts
Textbook knowledge
Needs and excitement around ML for Chemistry
Discovery
Representation
Model (Belief)
Intervention
Hypothesis
New Info
Prior Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
Model (Belief)
Hypothesis
Can we somehow externalize “experience and intuition” of
experienced chemists to rationalize and accelerate discoveries?
New Info
Prior Info
Observational data
Reported facts
Textbook knowledge
Needs and excitement around ML for Chemistry
Discovery
Representation
Model (Belief)
Intervention
Hypothesis
New Info
Prior Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
Model (Belief)
Hypothesis
Can we somehow externalize “experience and intuition” of
experienced chemists to rationalize and accelerate discoveries?
New Info
(Experimental) Intervention
New Info
Hypothesis
?
Automation
Reactions
Materials
Molecules
Any rationalized “real” discovery only comes from understanding and discovery
of the causal relations between relevant factors.
Information about causal relations can be acquired by passive observation and
active intervention. Correlation does not imply causation.
ML
computer
programs
• Observational data
• Reported facts
• Textbook knowledge
(Experimental) Intervention
We need to carefully rethink how an experiment should be
performed to be informative about causal structure of targets.
(Experimental) Intervention
We need to carefully rethink how an experiment should be
performed to be informative about causal structure of targets.
• Correlation vs Causation
ML models trained over passive observational data can be trapped by
spurious correlations between variables, being totally ignorant of the
underlying causality.
(Experimental) Intervention
We need to carefully rethink how an experiment should be
performed to be informative about causal structure of targets.
• Correlation vs Causation
ML models trained over passive observational data can be trapped by
spurious correlations between variables, being totally ignorant of the
underlying causality.
• Garbage In, Garbage Out (GIGO)
ML models are just representative of the given data. If it has any bias, ML
predictions can be miserably misleading.
(Experimental) Intervention
We need to carefully rethink how an experiment should be
performed to be informative about causal structure of targets.
• Correlation vs Causation
ML models trained over passive observational data can be trapped by
spurious correlations between variables, being totally ignorant of the
underlying causality.
• Garbage In, Garbage Out (GIGO)
ML models are just representative of the given data. If it has any bias, ML
predictions can be miserably misleading.
• Unavoidable Human-Caused Biases
Always remember that “most chemical experiments are planned by human
scientists and therefore are subject to a variety of human cognitive biases,
heuristics and social influences.”
* Jia, X., Lynch, A., Huang, Y. et al. Anthropogenic biases in chemical reaction data hinder
exploratory inorganic synthesis. Nature 573, 251–255 (2019).
https://www.chemistryworld.com/news/dispute-over-reaction-prediction-puts-machine-learnings-
pitfalls-in-spotlight/3009912.article
• Main paper https://doi.org/10.1126/science.aar5169
• Erratum https://doi.org/10.1126/science.aat7648
• Negative comment paper https://doi.org/10.1126/science.aat8603
• Author's response https://doi.org/10.1126/science.aat8763
(Experimental) Intervention
Keys: fusing modern ML with first-principles, simulations, domain
knowledge, and collaboratively working with experimental experts.
Current ML is too data-hungry and vulnerable to any data bias, but
acquisition of clean representative data is often quite impractical.
(Experimental) Intervention
• Deep learning techniques thus far have proven to be data hungry, shallow, brittle, and
limited in their ability to generalize (Marcus, 2018)
• Current machine learning techniques are data-hungry and brittle—they can only make
sense of patterns they've seen before. (Chollet, 2020)
• A growing body of evidence shows that state-of-the-art models learn to exploit spurious
statistical patterns in datasets... instead of learning meaning in the flexible and
generalizable way that humans do. (Nie et al., 2019)
• Current machine learning methods seem weak when they are required to generalize
beyond the training distribution, which is what is often needed in practice. (Bengio et al.,
2019)
(Experimental) Intervention
AlphaGo
(Nature, 2016)
AlphaGo Zero
(Nature, 2017)
AlphaZero
(Science, 2018)
MuZero
(Nature, 2020)
This has reignited the old war between induction and deduction,
and we’re re-encountering the long-standing problems in AI.
• Knowledge acquisition / Principled data acquisition
Experimental design, Model-based optimization, Evolutionary computation
• Reconciliation between inductive and deductive ML
Hybrid models of causal/logical/algorithmic ML and deep learning
• Balancing exploitation and exploration
Model-based reinforcement learning or search in a combinatorial space
ML for Chemistry to me (a ML researcher)
An exciting “real” test bench for the long-standing unsolved but
attractive fundamental problems in “AI for automating discovery”,
involving many fascinating technical topics of modern ML.
Prior Info
Observational data
Reported facts
Textbook knowledge
Discovery
Representation
Model (Belief)
Intervention
Hypothesis
New Info
Prior Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
Model (Belief)
Hypothesis
Summary
• Why it is needed?
• What are exciting for computer scientists?
Two aspects:
2. (Experimental) Intervention
Machine Learning (ML) for Chemistry
• What are good ML-readable representations for chemistry?
• What information should be recorded and given to ML?
1. Representation
• What are essential to make real chemical discoveries?
• Any principled ways for data acquisition and experimental design?

Weitere ähnliche Inhalte

Was ist angesagt?

機械学習は化学研究の"経験と勘"を合理化できるか?
機械学習は化学研究の"経験と勘"を合理化できるか?機械学習は化学研究の"経験と勘"を合理化できるか?
機械学習は化学研究の"経験と勘"を合理化できるか?Ichigaku Takigawa
 
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Ichigaku Takigawa
 
How to use data to design and optimize reaction? A quick introduction to work...
How to use data to design and optimize reaction? A quick introduction to work...How to use data to design and optimize reaction? A quick introduction to work...
How to use data to design and optimize reaction? A quick introduction to work...Ichigaku Takigawa
 
A machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryA machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryIchigaku Takigawa
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Anubhav Jain
 
Scikit-learn : Machine Learning in Python
Scikit-learn : Machine Learning in PythonScikit-learn : Machine Learning in Python
Scikit-learn : Machine Learning in PythonAjay Ohri
 
Machine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesMachine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesIchigaku Takigawa
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignAnubhav Jain
 
When The New Science Is In The Outliers
When The New Science Is In The OutliersWhen The New Science Is In The Outliers
When The New Science Is In The Outliersaimsnist
 
Computational of Bioinformatics
Computational of BioinformaticsComputational of Bioinformatics
Computational of Bioinformaticsijtsrd
 
Lecture on AI and Machine Learning
Lecture on AI and Machine LearningLecture on AI and Machine Learning
Lecture on AI and Machine LearningXiaonan Wang
 
Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"ieee_cis_cyprus
 
Cao report 2007-2012
Cao report 2007-2012Cao report 2007-2012
Cao report 2007-2012Elif Ceylan
 
Introductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics StreamIntroductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics StreamSSA KPI
 
2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML modelaimsnist
 
2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinarPistoia Alliance
 
012517 ResumeJH Amex DS-ML
012517 ResumeJH Amex DS-ML012517 ResumeJH Amex DS-ML
012517 ResumeJH Amex DS-MLJeremy Hadidjojo
 
Applied Machine Learning for Chemistry II (HSI2020)
Applied Machine Learning for Chemistry II (HSI2020)Applied Machine Learning for Chemistry II (HSI2020)
Applied Machine Learning for Chemistry II (HSI2020)Ichigaku Takigawa
 

Was ist angesagt? (20)

機械学習は化学研究の"経験と勘"を合理化できるか?
機械学習は化学研究の"経験と勘"を合理化できるか?機械学習は化学研究の"経験と勘"を合理化できるか?
機械学習は化学研究の"経験と勘"を合理化できるか?
 
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
 
How to use data to design and optimize reaction? A quick introduction to work...
How to use data to design and optimize reaction? A quick introduction to work...How to use data to design and optimize reaction? A quick introduction to work...
How to use data to design and optimize reaction? A quick introduction to work...
 
A machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discoveryA machine-learning view on heterogeneous catalyst design and discovery
A machine-learning view on heterogeneous catalyst design and discovery
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
Scikit-learn : Machine Learning in Python
Scikit-learn : Machine Learning in PythonScikit-learn : Machine Learning in Python
Scikit-learn : Machine Learning in Python
 
Machine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesMachine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and Geometries
 
Artificial Intelligence in Weed Recognition Tasks
Artificial Intelligence in Weed Recognition TasksArtificial Intelligence in Weed Recognition Tasks
Artificial Intelligence in Weed Recognition Tasks
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials Design
 
When The New Science Is In The Outliers
When The New Science Is In The OutliersWhen The New Science Is In The Outliers
When The New Science Is In The Outliers
 
Computational of Bioinformatics
Computational of BioinformaticsComputational of Bioinformatics
Computational of Bioinformatics
 
Lecture on AI and Machine Learning
Lecture on AI and Machine LearningLecture on AI and Machine Learning
Lecture on AI and Machine Learning
 
Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"Xin Yao: "What can evolutionary computation do for you?"
Xin Yao: "What can evolutionary computation do for you?"
 
Cao report 2007-2012
Cao report 2007-2012Cao report 2007-2012
Cao report 2007-2012
 
International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI), International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),
 
Introductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics StreamIntroductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics Stream
 
2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model
 
2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar
 
012517 ResumeJH Amex DS-ML
012517 ResumeJH Amex DS-ML012517 ResumeJH Amex DS-ML
012517 ResumeJH Amex DS-ML
 
Applied Machine Learning for Chemistry II (HSI2020)
Applied Machine Learning for Chemistry II (HSI2020)Applied Machine Learning for Chemistry II (HSI2020)
Applied Machine Learning for Chemistry II (HSI2020)
 

Ähnlich wie Machine Learning for Chemistry: Representing and Intervening

Large scale gpu cluster for ai
Large scale gpu cluster for aiLarge scale gpu cluster for ai
Large scale gpu cluster for aiKyunam Cho
 
Spohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxSpohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxISSIP
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
State of AI Report 2022 - ONLINE.pdf
State of AI Report 2022 - ONLINE.pdfState of AI Report 2022 - ONLINE.pdf
State of AI Report 2022 - ONLINE.pdfvizologi
 
How Can AI and IoT Power the Chemical Industry?
How Can AI and IoT Power the Chemical Industry?How Can AI and IoT Power the Chemical Industry?
How Can AI and IoT Power the Chemical Industry?Xiaonan Wang
 
20210325 jim spohrer future ai v11
20210325 jim spohrer future ai v1120210325 jim spohrer future ai v11
20210325 jim spohrer future ai v11ISSIP
 
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...Willi Schroll
 
Service Provision 20221023 v3.pptx
Service Provision 20221023 v3.pptxService Provision 20221023 v3.pptx
Service Provision 20221023 v3.pptxISSIP
 
State of AI Report 2022 - ONLINE.pptx
State of AI Report 2022 - ONLINE.pptxState of AI Report 2022 - ONLINE.pptx
State of AI Report 2022 - ONLINE.pptxEithuThutun
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper ProvenancePaul Groth
 
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...Hamidreza Bolhasani
 
AI In Actuarial Science
AI In Actuarial ScienceAI In Actuarial Science
AI In Actuarial ScienceAudrey Britton
 

Ähnlich wie Machine Learning for Chemistry: Representing and Intervening (20)

Large scale gpu cluster for ai
Large scale gpu cluster for aiLarge scale gpu cluster for ai
Large scale gpu cluster for ai
 
Spohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxSpohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptx
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
State of AI Report 2022 - ONLINE.pdf
State of AI Report 2022 - ONLINE.pdfState of AI Report 2022 - ONLINE.pdf
State of AI Report 2022 - ONLINE.pdf
 
How Can AI and IoT Power the Chemical Industry?
How Can AI and IoT Power the Chemical Industry?How Can AI and IoT Power the Chemical Industry?
How Can AI and IoT Power the Chemical Industry?
 
20210325 jim spohrer future ai v11
20210325 jim spohrer future ai v1120210325 jim spohrer future ai v11
20210325 jim spohrer future ai v11
 
CSE.ppt
CSE.pptCSE.ppt
CSE.ppt
 
PowerAyupppt.ppt
PowerAyupppt.pptPowerAyupppt.ppt
PowerAyupppt.ppt
 
engineering.ppt
engineering.pptengineering.ppt
engineering.ppt
 
computer.ppt
computer.pptcomputer.ppt
computer.ppt
 
CSE.ppt
CSE.pptCSE.ppt
CSE.ppt
 
CSE.ppt
CSE.pptCSE.ppt
CSE.ppt
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
 
AI for Science
AI for ScienceAI for Science
AI for Science
 
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...
Status und Ausblick - Wie wird sich KI technisch weiterentwickeln? Münchner K...
 
Service Provision 20221023 v3.pptx
Service Provision 20221023 v3.pptxService Provision 20221023 v3.pptx
Service Provision 20221023 v3.pptx
 
State of AI Report 2022 - ONLINE.pptx
State of AI Report 2022 - ONLINE.pptxState of AI Report 2022 - ONLINE.pptx
State of AI Report 2022 - ONLINE.pptx
 
Thoughts on Knowledge Graphs & Deeper Provenance
Thoughts on Knowledge Graphs  & Deeper ProvenanceThoughts on Knowledge Graphs  & Deeper Provenance
Thoughts on Knowledge Graphs & Deeper Provenance
 
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...
Internet of Things (IoT) and Artificial Intelligence (AI) role in Medical and...
 
AI In Actuarial Science
AI In Actuarial ScienceAI In Actuarial Science
AI In Actuarial Science
 

Mehr von Ichigaku Takigawa

データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜Ichigaku Takigawa
 
機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?Ichigaku Takigawa
 
A Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesA Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesIchigaku Takigawa
 
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Ichigaku Takigawa
 
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開Ichigaku Takigawa
 
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考Ichigaku Takigawa
 
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜Ichigaku Takigawa
 
"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンスIchigaku Takigawa
 
自然科学における機械学習と機械発見
自然科学における機械学習と機械発見自然科学における機械学習と機械発見
自然科学における機械学習と機械発見Ichigaku Takigawa
 
幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro幾何と機械学習: A Short Intro
幾何と機械学習: A Short IntroIchigaku Takigawa
 
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割Ichigaku Takigawa
 
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいことIchigaku Takigawa
 
自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学Ichigaku Takigawa
 
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱Ichigaku Takigawa
 
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれからIchigaku Takigawa
 
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)Ichigaku Takigawa
 
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから (2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから Ichigaku Takigawa
 
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)Ichigaku Takigawa
 
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)Ichigaku Takigawa
 

Mehr von Ichigaku Takigawa (20)

機械学習と自動微分
機械学習と自動微分機械学習と自動微分
機械学習と自動微分
 
データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜
 
機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?
 
A Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesA Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree Ensembles
 
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
 
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
 
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
 
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
 
"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス
 
自然科学における機械学習と機械発見
自然科学における機械学習と機械発見自然科学における機械学習と機械発見
自然科学における機械学習と機械発見
 
幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro
 
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
 
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
 
自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学
 
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
 
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
 
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
 
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから (2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
 
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)
帰納バイアスと分子の組合せ的表現・幾何的表現 (本発表)
 
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)
帰納バイアスと分子の組合せ的表現・幾何的表現 (3minフラッシュトーク)
 

Kürzlich hochgeladen

Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptJoemSTuliba
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...navyadasi1992
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx023NiWayanAnggiSriWa
 

Kürzlich hochgeladen (20)

Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.ppt
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...Radiation physics in Dental Radiology...
Radiation physics in Dental Radiology...
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx
 

Machine Learning for Chemistry: Representing and Intervening

  • 1. Machine Learning for Chemistry: Representing and Intervening Ichigaku Takigawa takigawa@icredd.hokudai.ac.jp Apr 26, 2021 @ Hokkaido University Joint Symposium of Engineering & Information Science & WPI-ICReDD
  • 2. I am a graduate of School of Engineering and IST! 1995-2005 (10 years) Hokkaido Univ School of Engineering Grad School of Engineering Grad School of Info Sci & Tech 2012-2019 (7 years) Hokkaido Univ B.Eng (1999) M.Eng (2001), PhD (2004) Postdoc (2004-2005) Grad School of Info Sci & Tech Tenure Track (2012-2014) Assoc Prof (2014-2019) KUDO Mineichi TANAKA Yuzuru SHIMBO Masaru MINATO Shinichi TANAKA Yuzuru IMAI Hideyuki
  • 3. 2005-2011 (7 years) Kyoto Univ 2019-present (2 years) The “Cross-Appointment System” But when I stepped outside Physically I’m at Kyoto
  • 4. Things go interdisciplinary… • Bioinformatics Center Institute for Chemical Research • Grad School of Pharmaceutical Sci • Medical-risk Avoidance based on iPS Cells Team • Institute for Chemical Reaction Design and Discovery Assist Prof (2005-2011) 2005-2011 (7 years) Kyoto Univ 2019-present (2 years) The “Cross-Appointment System”
  • 5. This talk • Why it is needed? • What are exciting for computer scientists? Machine Learning (ML) for Chemistry
  • 6. It’s a hot topic in Chemistry
  • 7. But also in Machine Learning! NeurIPS 2020 ICML 2020 ICLR 2020 • Self-Supervised Graph Transformer on Large-Scale Molecular Data • RetroXpert: Decompose Retrosynthesis Prediction Like A Chemist • Reinforced Molecular Optimization with Neighborhood-Controlled Grammars • Autofocused Oracles for Model-based Design • Barking Up the Right Tree: an Approach to Search over Molecule Synthesis DAGs • On the Equivalence of Molecular Graph Convolution and Molecular Wave Function with Poor Basis Set • CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models • A Graph to Graphs Framework for Retrosynthesis Prediction • Hierarchical Generation of Molecular Graphs using Structural Motifs • Learning to Navigate in Synthetically Accessible Chemical Space Using Reinforcement Learning • Reinforcement Learning for Molecular Design Guided by Quantum Mechanics • Multi-Objective Molecule Generation using Interpretable Substructures • Improving Molecular Design by Stochastic Iterative Target Augmentation • A Generative Model for Molecular Distance Geometry • Directional Message Passing for Molecular Graphs • GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation • Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space • A Fair Comparison of Graph Neural Networks for Graph Classification
  • 8. Mixed feelings of curiosity, optimism, skepticism?
  • 9. Inseparably linked to automation “These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modeling.” Automation is the use of technology to perform tasks with reduced human involvement or human labor.
  • 10. Towards machine autonomy in discovery Organic synthesis in a modular robotic system. Science 363 (2019) A mobile robotic chemist. Nature 583 (2020) Automating drug discovery. Nature Reviews Drug Discovery 17 (2018) Automation has been impactfully changing our daily life, society, as well as scientific experiments and computations.
  • 11. This talk • Why it is needed? • What are exciting for computer scientists? I’ll briefly cover these from two aspects: 2. (Experimental) Intervention Machine Learning (ML) for Chemistry • What are good ML-readable representations for chemistry? • What information should be recorded and given to ML? 1. Representation • What are essential to make real chemical discoveries? • Any principled ways for data acquisition and experimental design?
  • 12. Two pillars for scientific discovery? In essence, ML for chemistry is metascience (the science on how to do science) unexpectedly hitting age-old unsolved questions in the philosophy of natural science.
  • 13. Machine Learning (ML) https://www.forbes.com/sites/forbestechcouncil/2020/02/19/ in-praise-of-boring-ai-a-k-a-machine-learning/ … “Let’s face it: So far, the artificial intelligence plastered all over PowerPoint slides hasn’t lived up to its hype.” The AI frenzy: hope & hype
  • 14. Machine Learning (ML) From AAAI-20 Oxford-Style Debate https://www.forbes.com/sites/forbestechcouncil/2020/02/19/ in-praise-of-boring-ai-a-k-a-machine-learning/ … “Let’s face it: So far, the artificial intelligence plastered all over PowerPoint slides hasn’t lived up to its hype.” The AI frenzy: hope & hype
  • 15. Machine Learning (ML) All about statistical and algorithmic techniques for surface-model fitting to data points by adjusting model parameters. Random Forest Neural Networks SVR Kernel Ridge “Predictive Modeling” Fitted surface used for making predictions on unseen data points Variable 1 Variable 2 <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2
  • 16. Modern aspects of ML 1. High dimensionality: Data can have many input variables. a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array)
  • 17. Modern aspects of ML 1. High dimensionality: Data can have many input variables. a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array) 2. Multiformity and multimodality: Data take many forms + modes Numerical values, discrete structures, networks, variable-length sequences, etc. Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
  • 18. Modern aspects of ML 1. High dimensionality: Data can have many input variables. a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array) 3. Overrepresentation: ML models can have many parameters. ResNet50: 26 million params ResNet101: 45 million params EfficientNet-B7: 66 million params VGG19: 144 million params 12-layer, 12-heads BERT: 110 million params 24-layer, 16-heads BERT: 336 million params GPT-2 XL: 1558 million params GPT-3: 175 billion params 2. Multiformity and multimodality: Data take many forms + modes Numerical values, discrete structures, networks, variable-length sequences, etc. Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
  • 19. Modern aspects of ML 1. High dimensionality: Data can have many input variables. a 100x100 pixel grayscale image = 10000 input variables (a 10000-dimensional array) 3. Overrepresentation: ML models can have many parameters. ResNet50: 26 million params ResNet101: 45 million params EfficientNet-B7: 66 million params VGG19: 144 million params 12-layer, 12-heads BERT: 110 million params 24-layer, 16-heads BERT: 336 million params GPT-2 XL: 1558 million params GPT-3: 175 billion params Can you imagine what would happen if we try to fit a surface model having 175 billion parameters to 100 million data points in 10 thousand dimension?? 2. Multiformity and multimodality: Data take many forms + modes Numerical values, discrete structures, networks, variable-length sequences, etc. Images, volumes, videos, audios, texts, point clouds, geometries, sensor signals, etc.
  • 20. Modern aspects of ML 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets. Prediction Input variables Surface model Classifier or Regressor <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . .
  • 21. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 22. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 23. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 24. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 25. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 26. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 27. Modern aspects of ML Prediction Input variables Surface model <latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit> x1 <latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit> x2 <latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit> x3 <latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit> . . . Latent variables Variable transformation Feature learning Classifier or Regressor Linear 4. Representation learning: Models can have “feature learning” blocks, and they can be “pre-trained” by different large datasets.
  • 28. Prior Info Observational data Reported facts Textbook knowledge Needs and excitement around ML for Chemistry Discovery Representation Model (Belief) Intervention Hypothesis New Info Prior Info • Identify relevant variables • Set design choices • Set experiments • Interpret results Model (Belief) Hypothesis Can we somehow externalize “experience and intuition” of experienced chemists to rationalize and accelerate discoveries?
  • 29. Prior Info Observational data Reported facts Textbook knowledge Needs and excitement around ML for Chemistry Discovery Representation Model (Belief) Intervention Hypothesis New Info Prior Info • Identify relevant variables • Set design choices • Set experiments • Interpret results Model (Belief) Hypothesis Can we somehow externalize “experience and intuition” of experienced chemists to rationalize and accelerate discoveries?
  • 30. Representation Reactions Materials Molecules ML computer programs • Observational data • Reported facts • Textbook knowledge ? Identifying relevant factors and establishing any necessary and sufficient computer-readable representations are inevitable preconditions, but this is far from trivial and quite paradoxical since we haven’t understood the target. Any rationalized “real” discovery only comes from understanding and discovery of the causal relations between relevant factors.
  • 31. Representation <latexit sha1_base64="dwtAUUE0cfsFu6+2FLg7b109CNE=">AAACi3ichVG7SgNBFL1ZX/ERjdoINsGgWIW7a0iiWIgiWKoxMaASdtdJMmRf7E4CMfgDljYW2ihYiB/gB9j4AxZ+glhGsLHw7mZFLIx3mZ07Z+65c2aO5hjcE4gvEamvf2BwKDo8MjoWG5+IT04VPbvh6qyg24btljTVYwa3WEFwYbCS4zLV1Ay2r9U3/P39JnM9blt7ouWwI1OtWrzCdVUQVDoUNSbUMi/Hk5hazmWUdCaBKcSsrMh+omTTS+mETIgfSQhj244/wCEcgw06NMAEBhYIyg1QwaPvAGRAcAg7gjZhLmU82GdwCiPEbVAVowqV0Dr9q7Q6CFGL1n5PL2DrdIpBwyVmAubxGe+wg094j6/4+WevdtDD19KiWetymVOeOJvJf/zLMmkWUPth9dQsoAK5QCsn7U6A+LfQu/zmyUUnv7I7317AG3wj/df4go90A6v5rt/usN3LHno00kIvRgZ9u5D4OykqKTmTUnbSybX10KoozMIcLJIfWViDLdiGQuDDOVzClRSTlqQVabVbKkVCzjT8CmnzC0ydk0A=</latexit> ✓i <latexit sha1_base64="tkPRNIYeS8tNgbH62CO/ULi3LDw=">AAACi3ichVHLSsNAFL2Nr/quuhHcBIviqtykoa3iQhTBZbXWFtpSkjjaaF4k04IWf8ClGxe6UXAhfoAf4MYfcOEniMsKblx4k0bEhXrDZO6cuefOmTmaaxo+R3yOCT29ff0D8cGh4ZHRsfHExOSO7zQ9nRV1x3S8sqb6zDRsVuQGN1nZ9ZhqaSYraYdrwX6pxTzfcOxtfuSymqXu28aeoaucoHKVNxhX6wf1RBJTi7mMrGRETCFmJVkKEjmrpBVRIiSIJESRdxL3UIVdcECHJljAwAZOuQkq+PRVQAIEl7AatAnzKDPCfQYnMETcJlUxqlAJPaT/Pq0qEWrTOujph2ydTjFpeMQUYQ6f8BY7+Ih3+IIfv/Zqhz0CLUc0a10uc+vjp9OF939ZFs0cGt+sPzVz2INcqNUg7W6IBLfQu/zW8XmnsLQ1157Ha3wl/Vf4jA90A7v1pt9ssq2LP/RopIVejAz6ckH8PdmRU1ImJW8qyZXVyKo4zMAsLJAfWViBDchDMfThDC7gUhgV0sKSsNwtFWIRZwp+hLD+CU69k0E=</latexit> ✓j O N N N H NH N N N CH3 CH3 Levels of Theory/Model Abstraction First Principle and Simulation (Quantum Chemistry) Spatio-Temporal Flexibility, Variations, Dynamics, and Interactions
  • 32. Representation Latent variables Representation learning Reactions Materials Molecules Graphs (of different size) Node features Edge features CC1CCNO1 Graph Neural Networks (GNNs) NCc1ccoc1.S=(Cl)Cl>>[RX_5]S=C=NCc1ccoc1 … Classifier or Regressor Diverse Downstream Tasks Modular Hierarchy Amide Proline Oxazoline Compositionality Phenyl Carboxyl Methyl Ethyl Tert-butyl Isoprophyl Trifluoromethyl Benzyl Substituents Graph
 Coarsening Combinatorial aspects
  • 33. Representation NB: Transformers can be considered as a special case of GNNs, and many Transformer-type GNNs are also developed. Transformer Core (Multihead) Self-attention Feed-forward NN Add + LayerNorm Add + LayerNorm <latexit sha1_base64="I4mbdBylFC3Uuk1C7RrdvvfeVHQ=">AAACqXichVFNS9xQFD2m9dvqqJtCN8GpogjDy1CqKIXBbrp01NFBI+ElvnEeky+SN0N16B+YP9CFKwUX4qa70m676R9w4U8Qlxa66cKbTEBUqjck97zz7rk57107dGWsGLvs0V687O3rHxgcGh55NTqWG5/YjINm5IiKE7hBVLV5LFzpi4qSyhXVMBLcs12xZTc+JvtbLRHFMvA31EEodj2+78uadLgiysq9DfQPuhk3PUvqJnfDOrfk7Oc5vZakZVPVheJzVi7PCiwN/TEwMpBHFqtB7jtM7CGAgyY8CPhQhF1wxPTswABDSNwu2sRFhGS6L/AFQ6RtUpWgCk5sg777tNrJWJ/WSc84VTv0F5feiJQ6ptkFO2M37Dc7Z1fs3397tdMeiZcDynZXK0JrrPN6/e+zKo+yQv1O9aRnhRoWU6+SvIcpk5zC6epbh19v1pfWptsz7IRdk/9jdsl+0Qn81h/ntCzWjp7wY5MXujEakPFwHI/BZrFgvC8Uy+/ypZVsVAN4gynM0jwWUMInrKJC/Tv4hh/4qc1rZa2qbXdLtZ5MM4l7oTm3XZydSQ==</latexit> o = X i ↵i(x)fi(x; ✓) Effective pretraining is a crucial open problem because in practice, we can only access to limited data for each specific problem. Pretraining with self-supervised pretext tasks have transformed NLP
  • 34. Prior Info Observational data Reported facts Textbook knowledge Needs and excitement around ML for Chemistry Discovery Representation Model (Belief) Intervention Hypothesis New Info Prior Info • Identify relevant variables • Set design choices • Set experiments • Interpret results Model (Belief) Hypothesis Can we somehow externalize “experience and intuition” of experienced chemists to rationalize and accelerate discoveries? New Info
  • 35. Prior Info Observational data Reported facts Textbook knowledge Needs and excitement around ML for Chemistry Discovery Representation Model (Belief) Intervention Hypothesis New Info Prior Info • Identify relevant variables • Set design choices • Set experiments • Interpret results Model (Belief) Hypothesis Can we somehow externalize “experience and intuition” of experienced chemists to rationalize and accelerate discoveries? New Info
  • 36. (Experimental) Intervention New Info Hypothesis ? Automation Reactions Materials Molecules Any rationalized “real” discovery only comes from understanding and discovery of the causal relations between relevant factors. Information about causal relations can be acquired by passive observation and active intervention. Correlation does not imply causation. ML computer programs • Observational data • Reported facts • Textbook knowledge
  • 37. (Experimental) Intervention We need to carefully rethink how an experiment should be performed to be informative about causal structure of targets.
  • 38. (Experimental) Intervention We need to carefully rethink how an experiment should be performed to be informative about causal structure of targets. • Correlation vs Causation ML models trained over passive observational data can be trapped by spurious correlations between variables, being totally ignorant of the underlying causality.
  • 39. (Experimental) Intervention We need to carefully rethink how an experiment should be performed to be informative about causal structure of targets. • Correlation vs Causation ML models trained over passive observational data can be trapped by spurious correlations between variables, being totally ignorant of the underlying causality. • Garbage In, Garbage Out (GIGO) ML models are just representative of the given data. If it has any bias, ML predictions can be miserably misleading.
  • 40. (Experimental) Intervention We need to carefully rethink how an experiment should be performed to be informative about causal structure of targets. • Correlation vs Causation ML models trained over passive observational data can be trapped by spurious correlations between variables, being totally ignorant of the underlying causality. • Garbage In, Garbage Out (GIGO) ML models are just representative of the given data. If it has any bias, ML predictions can be miserably misleading. • Unavoidable Human-Caused Biases Always remember that “most chemical experiments are planned by human scientists and therefore are subject to a variety of human cognitive biases, heuristics and social influences.” * Jia, X., Lynch, A., Huang, Y. et al. Anthropogenic biases in chemical reaction data hinder exploratory inorganic synthesis. Nature 573, 251–255 (2019).
  • 41. https://www.chemistryworld.com/news/dispute-over-reaction-prediction-puts-machine-learnings- pitfalls-in-spotlight/3009912.article • Main paper https://doi.org/10.1126/science.aar5169 • Erratum https://doi.org/10.1126/science.aat7648 • Negative comment paper https://doi.org/10.1126/science.aat8603 • Author's response https://doi.org/10.1126/science.aat8763 (Experimental) Intervention
  • 42. Keys: fusing modern ML with first-principles, simulations, domain knowledge, and collaboratively working with experimental experts. Current ML is too data-hungry and vulnerable to any data bias, but acquisition of clean representative data is often quite impractical. (Experimental) Intervention • Deep learning techniques thus far have proven to be data hungry, shallow, brittle, and limited in their ability to generalize (Marcus, 2018) • Current machine learning techniques are data-hungry and brittle—they can only make sense of patterns they've seen before. (Chollet, 2020) • A growing body of evidence shows that state-of-the-art models learn to exploit spurious statistical patterns in datasets... instead of learning meaning in the flexible and generalizable way that humans do. (Nie et al., 2019) • Current machine learning methods seem weak when they are required to generalize beyond the training distribution, which is what is often needed in practice. (Bengio et al., 2019)
  • 43. (Experimental) Intervention AlphaGo (Nature, 2016) AlphaGo Zero (Nature, 2017) AlphaZero (Science, 2018) MuZero (Nature, 2020) This has reignited the old war between induction and deduction, and we’re re-encountering the long-standing problems in AI. • Knowledge acquisition / Principled data acquisition Experimental design, Model-based optimization, Evolutionary computation • Reconciliation between inductive and deductive ML Hybrid models of causal/logical/algorithmic ML and deep learning • Balancing exploitation and exploration Model-based reinforcement learning or search in a combinatorial space
  • 44. ML for Chemistry to me (a ML researcher) An exciting “real” test bench for the long-standing unsolved but attractive fundamental problems in “AI for automating discovery”, involving many fascinating technical topics of modern ML. Prior Info Observational data Reported facts Textbook knowledge Discovery Representation Model (Belief) Intervention Hypothesis New Info Prior Info • Identify relevant variables • Set design choices • Set experiments • Interpret results Model (Belief) Hypothesis
  • 45. Summary • Why it is needed? • What are exciting for computer scientists? Two aspects: 2. (Experimental) Intervention Machine Learning (ML) for Chemistry • What are good ML-readable representations for chemistry? • What information should be recorded and given to ML? 1. Representation • What are essential to make real chemical discoveries? • Any principled ways for data acquisition and experimental design?