32. / 166
分類 (classification) as フィッティング
26
Random Forest
Gaussian Process Classifier
Logistic Regression
0-1の確率値を出⼒
(predict_proba)
P(class=red)
P(class=blue)
= 1 - P(class=red)
33. / 166
Boring AI (a.k.a. Machine Learning)
27
https://www.forbes.com/sites/forbestechcouncil/2020/02/19/
in-praise-of-boring-ai-a-k-a-machine-learning/
ʜ
“Let’s face it:
So far, the artificial
intelligence plastered all
over PowerPoint slides
hasn’t lived up to its hype.”
The AI frenzy: hope & hype
34. / 166
Boring AI (a.k.a. Machine Learning)
27
From AAAI-20 Oxford-Style Debate
https://www.forbes.com/sites/forbestechcouncil/2020/02/19/
in-praise-of-boring-ai-a-k-a-machine-learning/
ʜ
“Let’s face it:
So far, the artificial
intelligence plastered all
over PowerPoint slides
hasn’t lived up to its hype.”
The AI frenzy: hope & hype
35. / 166
関数フィッティングとしての機械学習
28
Prediction
Input
variables
Classifier or
Regressor
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Function
model
標準的な機械学習モデル
42. / 166
表現学習 (良い潜在特徴量のデータからの抽出)
29
Prediction
Input
variables
Function
model
<latexit sha1_base64="QFtMwnKe2I12XGZu0bNJbdnDaaE=">AAAChnichVG7TgJBFD2sL8QHqI2JDZFgrMhAVIwV0caShzwSJGR3HXHDvrK7EJH4Aya2UlhpYmH8AD/Axh+w4BOMJSY2Fl6WTYwS8W5m58yZe+6cmSuZqmI7jHV9wtj4xOSUfzowMzs3HwwtLBZso2HJPC8bqmGVJNHmqqLzvKM4Ki+ZFhc1SeVFqb7X3y82uWUrhn7gtExe0cSarhwrsugQlTutJqqhCIsxN8LDIO6BCLxIG6FHHOIIBmQ0oIFDh0NYhQibvjLiYDCJq6BNnEVIcfc5zhEgbYOyOGWIxNbpX6NV2WN1Wvdr2q5aplNUGhYpw4iyF3bPeuyZPbBX9vlnrbZbo++lRbM00HKzGrxYzn38q9JodnDyrRrp2cExtl2vCnk3XaZ/C3mgb551ermdbLS9xm7ZG/m/YV32RDfQm+/yXYZnr0f4kcgLvRg1KP67HcOgkIjFt2KJzEYkteu1yo8VrGKd+pFECvtII0/1a7jEFTqCX4gJm0JykCr4PM0SfoSQ+gJWLpCb</latexit>
x2
<latexit sha1_base64="lFhRrRrVTrFR31ebbMgRp5myJpc=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCIDPjCuiG5c8pBHgoS0dcCG0jZtISLxB0zcysKVJi6MH+AHuPEHXPAJxiUmblx4KU2MEvE20zlz5p47Z+ZKhqpYNmNdjzA2PjE55Z32zczOzfsDC4s5S2+YMs/KuqqbBUm0uKpoPGsrtsoLhsnFuqTyvFTb7+/nm9y0FF07tFsGL9XFqqZUFFm0icqcljfKgRCLMCeCwyDqghDcSOqBRxzhGDpkNFAHhwabsAoRFn1FRMFgEFdCmziTkOLsc5zDR9oGZXHKEImt0b9Kq6LLarTu17QctUynqDRMUgYRZi/snvXYM3tgr+zzz1ptp0bfS4tmaaDlRtl/sZz5+FdVp9nGybdqpGcbFew4XhXybjhM/xbyQN886/Qyu+lwe43dsjfyf8O67IluoDXf5bsUT1+P8CORF3oxalD0dzuGQS4WiW5HYqnNUGLPbZUXK1jFOvUjjgQOkESW6ldxiSt0BK8QEbaE+CBV8LiaJfwIIfEFWE6QnA==</latexit>
x3
<latexit sha1_base64="0IPXcU0UIDvzZlYURjV2A/THv9U=">AAACiXichVG7SgNBFD2ur/hM1EawEYNiFWZFNKQKprGMj0TBBNndTHR0X+xOFmLwB6zsRK0ULMQP8ANs/AELP0EsFWwsvNksiAbjXWbnzJl77pyZq7um8CVjz11Kd09vX39sYHBoeGQ0nhgbL/pOzTN4wXBMx9vWNZ+bwuYFKaTJt12Pa5Zu8i39MNfc3wq45wvH3pR1l5ctbc8WVWFokqhiKag40t9NJFmKhTHdDtQIJBFF3knco4QKHBiowQKHDUnYhAafvh2oYHCJK6NBnEdIhPscxxgkbY2yOGVoxB7Sf49WOxFr07pZ0w/VBp1i0vBIOY1Z9sRu2Rt7ZHfshX3+WasR1mh6qdOst7Tc3Y2fTG58/KuyaJbY/1Z19CxRRTr0Ksi7GzLNWxgtfXB09raRWZ9tzLFr9kr+r9gze6Ab2MG7cbPG1y87+NHJC70YNUj93Y52UFxIqUuphbXFZHYlalUMU5jBPPVjGVmsIo8C1T/AKc5xoQwpqpJWMq1UpSvSTOBHKLkvAi+SPA==</latexit>
.
.
.
Latent
variables
Learnable variable
transformation
Representation learning
Classifier or
Regressor
Linear
<latexit sha1_base64="Ill3Als4zZd947f5Xm9sW99d0QA=">AAAChnichVHLTsJAFD3UF+ID1I2JGyLBuCJTomJcEd245CGPBAlp64gNpW3aQkTiD5i4lYUrTVwYP8APcOMPuOATjEtM3LjwUpoYJeJtpnPmzD13zsyVTU21Hca6PmFsfGJyyj8dmJmdmw+GFhbzttGwFJ5TDM2wirJkc03Vec5RHY0XTYtLdVnjBbm2198vNLllq4Z+4LRMXq5LVV09VhXJISp7WhEroQiLMTfCw0D0QARepIzQIw5xBAMKGqiDQ4dDWIMEm74SRDCYxJXRJs4ipLr7HOcIkLZBWZwyJGJr9K/SquSxOq37NW1XrdApGg2LlGFE2Qu7Zz32zB7YK/v8s1bbrdH30qJZHmi5WQleLGc//lXVaXZw8q0a6dnBMbZdryp5N12mfwtloG+edXrZnUy0vcZu2Rv5v2Fd9kQ30Jvvyl2aZ65H+JHJC70YNUj83Y5hkI/HxK1YPL0RSe56rfJjBatYp34kkMQ+UshR/SoucYWO4BdiwqaQGKQKPk+zhB8hJL8AVA6Qmg==</latexit>
x1
Simple model is enough
when we have good features.
最近の深層学習では回帰・分類の前に良い潜在変数表現への変換を⾏う!
(2つのブロックの合成を関数フィッティングとして⼀気通貫で最適化する)
54. / 166
Use Case 1: Virtual Screening (QSAR/QSPR)
38
• Mutagenic potency
• Carcinogenic potency
• Endocrine disruption
• Growth inhibition
• Aqueous solubility
N
NH
O
O
H
H
H
H H
H
H
H
H
H
H
H
H
H
H
+1 +1
H
H
H
H
H
H
H
H
H
H
-1
O
O
O
O
O
O
Cl
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
-1
Br
Br O P
O
O Br
Br
O
Br
Br
H
H
H
H
H
H
H
H
H
H
H
H
H
H
H
+1
N
S
N
N
H
H
H
H
H
H
H
+1
H
H
H
H
H
H
H
H
O
N
O
O
H
+1
H
H
O
O
H
H
N
O
O
Cl
Cl
Cl
-1
H
H
H
H
H
H H
N
O
O
+1 -1 H
H
H
H
H
H
H H
H
N
O
O
-1
H
H
H
H
H
H
H
N
H
N
O
O
N
O
O
H
H
H
H
H
H
H
H
N
+1
-2.885
CH3
O
O
H
N Cl
Cl
Cl
Cl
Cl
-6.545 -4.138
H3C
O O
O
O
O
O
H3C
CH3
CH2
-5.246
O
HN
O
O
NH
-0.527
CH3
HO
OH
CH3
N
O
O
-0.816
CH3
N
N
H
N
H
H3C
N
0.739
H3C
H3C
NH
O
N
O
N
O
1.399
CH3
O N
NH2
O
CH3
Br
-1.753 CH3
N
H3C
H
N
S
N
O
CH3
N
OH
1.394
CH3
CH3
N
N
N
CH3
H3C
H2N NH2
-3.171
H
OH
O
HO
CH3
H
H
O
CH3
+1
H
O
O
H3C H
H
H
O
H3C
S
CH3
O
+1
H
H
O
CH3
CH3
O
O
HO
H3C
H
HO
F
H
O
H3C
+1
NH2
O
N
HO
H
O
O
-1 H
H
O
O
O
H3C
O
O
O
CH3
O
CH3
H
O
CH3
H
O
O
CH3
H
H
N
H
N O
H3C
-1
O
O
O -1
55. / 166
Use Case 1: Virtual Screening (QSAR/QSPR)
39
https://pubchem.ncbi.nlm.nih.gov/bioassay/1
56. / 166
Use Case 1: Virtual Screening (QSAR/QSPR)
40
input output
ML
activity: “Active”
LogGI50: -7.8811
CID 11978790
GI50: concentration required
for 50% inhibition of growth
59. / 166
Molecular Graphs: 分⼦のグラフ表現
41
Input representation (molecular graph)
1
2
1
3
explicit Hs
4
5
6
7
8
9
10
11
12
13
14
15
16
17
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Any permutation of
this numbering should
not change the results.
❗
atom features bond features
topology
Molecular Graph
read out
graph-level
output
• sum, mean or max
• attentive pooling
CID 204
• atomic_num (one-hot, 101)
• total_degree (one-hot, 7)
• formal_charge (one-hot, 6)
• chiral_tag (one-hot, 5)
• num_Hs (one-hot, 6)
• hybridization (one-hot, 6)
• is_aromatic (binary, 1)
• atomic_mass (real, 1)
17
edge(bond) features
• no_bond (binary, 1)
• is_single (binary, 1)
• is_double (binary, 1)
• is_triple (binary, 1)
• is_aromatic (binary, 1)
• is_connjugated (binary, 1)
• is_in_ring (binary, 1)
• stereo (one-hot, 7)
17
14
133
node(atom) features
133 features 14 features
e.g. Features for ChemProp (Yang et al, 2019)
atoms → nodes
bonds → edges
1. permutation
equivariance
2. permutation
invariance
60. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
61. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
62. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
<latexit sha1_base64="+I1ZH8a510AHL/VRK05INyOdDHc=">AAACi3ichVHLSsNAFL2Nr1qtrboR3BRLxVWZaFEpLooiuOzDPqAtJYnTdmheJGmhhv6ASzcu6kbBhfgBfoAbf8BFP0FcVnDjwps0IFqsN0zmzJl77pyZK+oyMy1CBj5uanpmds4/H1hYDC6FwssrBVNrGxLNS5qsGSVRMKnMVJq3mCXTkm5QQRFlWhRbR85+sUMNk2nqqdXVaVURGiqrM0mwkCpVRMVu9mqsFo6SOHEjMg54D0TBi7QWfoQKnIEGErRBAQoqWIhlEMDErww8ENCRq4KNnIGIufsUehBAbRuzKGYIyLbw38BV2WNVXDs1TVct4SkyDgOVEYiRF3JPhuSZPJBX8vlnLdut4Xjp4iyOtFSvhS7Wch//qhScLWh+qyZ6tqAO+65Xht51l3FuIY30nfOrYS6Zjdmb5Ja8of8bMiBPeAO18y7dZWi2P8GPiF7wxbBB/O92jIPCdpzfjScyiWjq0GuVH9ZhA7awH3uQghNIQ97twyX04ZoLcjtckjsYpXI+T7MKP4I7/gL6JJMZ</latexit>
hi
<latexit sha1_base64="iVE2zwSrY7uMFp5lULN8y4s7ZXg=">AAACjnichVHLSsNAFL2Nr1ofjboR3BRLxVWZSKkiiEU3XfZhH9CWksSpjs2LJC3U0B9wLy4ERcGF+AF+gBt/wEU/QVxWcOPCmzQgWqw3TObMmXvunJkrGQqzbEJ6AW5sfGJyKjgdmpmdmw/zC4tFS2+ZMi3IuqKbZUm0qMI0WrCZrdCyYVJRlRRakpr77n6pTU2L6dqB3TFoTRWPNNZgsmgjValKqkO7dYeddOt8lMSJF5FhIPggCn5kdP4RqnAIOsjQAhUoaGAjVkAEC78KCEDAQK4GDnImIubtU+hCCLUtzKKYISLbxP8Rrio+q+HarWl5ahlPUXCYqIxAjLyQe9Inz+SBvJLPP2s5Xg3XSwdnaaClRj18tpz/+Fel4mzD8bdqpGcbGrDleWXo3fAY9xbyQN8+vejnt3MxZ43ckjf0f0N65AlvoLXf5bsszV2O8COhF3wxbJDwux3DoLgRF5LxRDYRTe35rQrCCqzCOvZjE1KQhgwUvBc9hyu45nguye1wu4NULuBrluBHcOkvYEmUlg==</latexit>
eij
atom features
bond features
63. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
<latexit sha1_base64="rQGx6XMOvsCjXRp7b9gbmKjgm1M=">AAAC/HichVHNTttAEB4b2tI0lNBeKnGxiKiChKJNhVrUU1QuPaHwE0DCkbU2m3jD+kfrTVBquQ/AC3DgVNQeEOIKD9BLX4ADBx4AcUylXnroxHEVtah0LHu//Wa+8bc7dih4pAi50vSx8QcPH008zj3JTz6dKkw/24yCjnRY3QlEILdtGjHBfVZXXAm2HUpGPVuwLXtveZDf6jIZ8cDfUL2QNTza8nmTO1QhZRX2TduL3cTihilYU1Epg33DDKNsX/qdXsiZNm8FoehEVtw2TO4bpkeV61ARr2A+QZHLjVG9kaH2ELHEink7mc+ZkrdcNW8ViqRM0jDugkoGipBFLShcgAm7EIADHfCAgQ8KsQAKET47UAECIXINiJGTiHiaZ5BADrUdrGJYQZHdw28LdzsZ6+N+0DNK1Q7+ReArUWnAHLkkJ6RPvpFTckN+/rNXnPYYeOnhag+1LLSmDl6s//ivysNVgTtS3etZQROWUq8cvYcpMziFM9R3Pxz219+uzcUvyTG5Rf+fyBX5iifwu9+dL6ts7egePzZ6wRvDAVX+HsddsPmqXHldXlxdLFbfZaOagBmYhRLO4w1U4T3UoI79r7UxLa9N6h/1z/qpfjYs1bVM8xz+CP38F+WDvZo=</latexit>
hi
0
@hi,
M
j2Ni
(hi, hj, eij)
1
A
Update by “Message Passing”
<latexit sha1_base64="+I1ZH8a510AHL/VRK05INyOdDHc=">AAACi3ichVHLSsNAFL2Nr1qtrboR3BRLxVWZaFEpLooiuOzDPqAtJYnTdmheJGmhhv6ASzcu6kbBhfgBfoAbf8BFP0FcVnDjwps0IFqsN0zmzJl77pyZK+oyMy1CBj5uanpmds4/H1hYDC6FwssrBVNrGxLNS5qsGSVRMKnMVJq3mCXTkm5QQRFlWhRbR85+sUMNk2nqqdXVaVURGiqrM0mwkCpVRMVu9mqsFo6SOHEjMg54D0TBi7QWfoQKnIEGErRBAQoqWIhlEMDErww8ENCRq4KNnIGIufsUehBAbRuzKGYIyLbw38BV2WNVXDs1TVct4SkyDgOVEYiRF3JPhuSZPJBX8vlnLdut4Xjp4iyOtFSvhS7Wch//qhScLWh+qyZ6tqAO+65Xht51l3FuIY30nfOrYS6Zjdmb5Ja8of8bMiBPeAO18y7dZWi2P8GPiF7wxbBB/O92jIPCdpzfjScyiWjq0GuVH9ZhA7awH3uQghNIQ97twyX04ZoLcjtckjsYpXI+T7MKP4I7/gL6JJMZ</latexit>
hi
<latexit sha1_base64="iVE2zwSrY7uMFp5lULN8y4s7ZXg=">AAACjnichVHLSsNAFL2Nr1ofjboR3BRLxVWZSKkiiEU3XfZhH9CWksSpjs2LJC3U0B9wLy4ERcGF+AF+gBt/wEU/QVxWcOPCmzQgWqw3TObMmXvunJkrGQqzbEJ6AW5sfGJyKjgdmpmdmw/zC4tFS2+ZMi3IuqKbZUm0qMI0WrCZrdCyYVJRlRRakpr77n6pTU2L6dqB3TFoTRWPNNZgsmgjValKqkO7dYeddOt8lMSJF5FhIPggCn5kdP4RqnAIOsjQAhUoaGAjVkAEC78KCEDAQK4GDnImIubtU+hCCLUtzKKYISLbxP8Rrio+q+HarWl5ahlPUXCYqIxAjLyQe9Inz+SBvJLPP2s5Xg3XSwdnaaClRj18tpz/+Fel4mzD8bdqpGcbGrDleWXo3fAY9xbyQN8+vejnt3MxZ43ckjf0f0N65AlvoLXf5bsszV2O8COhF3wxbJDwux3DoLgRF5LxRDYRTe35rQrCCqzCOvZjE1KQhgwUvBc9hyu45nguye1wu4NULuBrluBHcOkvYEmUlg==</latexit>
eij
atom features
bond features
64. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
❶
❶ Message
Permutation
equivariant
operations
• nn.Linear
Bond features can be
used (typically in ❶)
<latexit sha1_base64="rQGx6XMOvsCjXRp7b9gbmKjgm1M=">AAAC/HichVHNTttAEB4b2tI0lNBeKnGxiKiChKJNhVrUU1QuPaHwE0DCkbU2m3jD+kfrTVBquQ/AC3DgVNQeEOIKD9BLX4ADBx4AcUylXnroxHEVtah0LHu//Wa+8bc7dih4pAi50vSx8QcPH008zj3JTz6dKkw/24yCjnRY3QlEILdtGjHBfVZXXAm2HUpGPVuwLXtveZDf6jIZ8cDfUL2QNTza8nmTO1QhZRX2TduL3cTihilYU1Epg33DDKNsX/qdXsiZNm8FoehEVtw2TO4bpkeV61ARr2A+QZHLjVG9kaH2ELHEink7mc+ZkrdcNW8ViqRM0jDugkoGipBFLShcgAm7EIADHfCAgQ8KsQAKET47UAECIXINiJGTiHiaZ5BADrUdrGJYQZHdw28LdzsZ6+N+0DNK1Q7+ReArUWnAHLkkJ6RPvpFTckN+/rNXnPYYeOnhag+1LLSmDl6s//ivysNVgTtS3etZQROWUq8cvYcpMziFM9R3Pxz219+uzcUvyTG5Rf+fyBX5iifwu9+dL6ts7egePzZ6wRvDAVX+HsddsPmqXHldXlxdLFbfZaOagBmYhRLO4w1U4T3UoI79r7UxLa9N6h/1z/qpfjYs1bVM8xz+CP38F+WDvZo=</latexit>
hi
0
@hi,
M
j2Ni
(hi, hj, eij)
1
A
Update by “Message Passing”
<latexit sha1_base64="+I1ZH8a510AHL/VRK05INyOdDHc=">AAACi3ichVHLSsNAFL2Nr1qtrboR3BRLxVWZaFEpLooiuOzDPqAtJYnTdmheJGmhhv6ASzcu6kbBhfgBfoAbf8BFP0FcVnDjwps0IFqsN0zmzJl77pyZK+oyMy1CBj5uanpmds4/H1hYDC6FwssrBVNrGxLNS5qsGSVRMKnMVJq3mCXTkm5QQRFlWhRbR85+sUMNk2nqqdXVaVURGiqrM0mwkCpVRMVu9mqsFo6SOHEjMg54D0TBi7QWfoQKnIEGErRBAQoqWIhlEMDErww8ENCRq4KNnIGIufsUehBAbRuzKGYIyLbw38BV2WNVXDs1TVct4SkyDgOVEYiRF3JPhuSZPJBX8vlnLdut4Xjp4iyOtFSvhS7Wch//qhScLWh+qyZ6tqAO+65Xht51l3FuIY30nfOrYS6Zjdmb5Ja8of8bMiBPeAO18y7dZWi2P8GPiF7wxbBB/O92jIPCdpzfjScyiWjq0GuVH9ZhA7awH3uQghNIQ97twyX04ZoLcjtckjsYpXI+T7MKP4I7/gL6JJMZ</latexit>
hi
<latexit sha1_base64="iVE2zwSrY7uMFp5lULN8y4s7ZXg=">AAACjnichVHLSsNAFL2Nr1ofjboR3BRLxVWZSKkiiEU3XfZhH9CWksSpjs2LJC3U0B9wLy4ERcGF+AF+gBt/wEU/QVxWcOPCmzQgWqw3TObMmXvunJkrGQqzbEJ6AW5sfGJyKjgdmpmdmw/zC4tFS2+ZMi3IuqKbZUm0qMI0WrCZrdCyYVJRlRRakpr77n6pTU2L6dqB3TFoTRWPNNZgsmgjValKqkO7dYeddOt8lMSJF5FhIPggCn5kdP4RqnAIOsjQAhUoaGAjVkAEC78KCEDAQK4GDnImIubtU+hCCLUtzKKYISLbxP8Rrio+q+HarWl5ahlPUXCYqIxAjLyQe9Inz+SBvJLPP2s5Xg3XSwdnaaClRj18tpz/+Fel4mzD8bdqpGcbGrDleWXo3fAY9xbyQN8+vejnt3MxZ43ckjf0f0N65AlvoLXf5bsszV2O8COhF3wxbJDwux3DoLgRF5LxRDYRTe35rQrCCqzCOvZjE1KQhgwUvBc9hyu45nguye1wu4NULuBrluBHcOkvYEmUlg==</latexit>
eij
atom features
bond features
65. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
• sum, mean or max
• attentive pooling
❷
❷ Aggregate
Permutation
invariant
operations
❶
❶ Message
Permutation
equivariant
operations
• nn.Linear
Bond features can be
used (typically in ❶)
<latexit sha1_base64="rQGx6XMOvsCjXRp7b9gbmKjgm1M=">AAAC/HichVHNTttAEB4b2tI0lNBeKnGxiKiChKJNhVrUU1QuPaHwE0DCkbU2m3jD+kfrTVBquQ/AC3DgVNQeEOIKD9BLX4ADBx4AcUylXnroxHEVtah0LHu//Wa+8bc7dih4pAi50vSx8QcPH008zj3JTz6dKkw/24yCjnRY3QlEILdtGjHBfVZXXAm2HUpGPVuwLXtveZDf6jIZ8cDfUL2QNTza8nmTO1QhZRX2TduL3cTihilYU1Epg33DDKNsX/qdXsiZNm8FoehEVtw2TO4bpkeV61ARr2A+QZHLjVG9kaH2ELHEink7mc+ZkrdcNW8ViqRM0jDugkoGipBFLShcgAm7EIADHfCAgQ8KsQAKET47UAECIXINiJGTiHiaZ5BADrUdrGJYQZHdw28LdzsZ6+N+0DNK1Q7+ReArUWnAHLkkJ6RPvpFTckN+/rNXnPYYeOnhag+1LLSmDl6s//ivysNVgTtS3etZQROWUq8cvYcpMziFM9R3Pxz219+uzcUvyTG5Rf+fyBX5iifwu9+dL6ts7egePzZ6wRvDAVX+HsddsPmqXHldXlxdLFbfZaOagBmYhRLO4w1U4T3UoI79r7UxLa9N6h/1z/qpfjYs1bVM8xz+CP38F+WDvZo=</latexit>
hi
0
@hi,
M
j2Ni
(hi, hj, eij)
1
A
Update by “Message Passing”
<latexit sha1_base64="+I1ZH8a510AHL/VRK05INyOdDHc=">AAACi3ichVHLSsNAFL2Nr1qtrboR3BRLxVWZaFEpLooiuOzDPqAtJYnTdmheJGmhhv6ASzcu6kbBhfgBfoAbf8BFP0FcVnDjwps0IFqsN0zmzJl77pyZK+oyMy1CBj5uanpmds4/H1hYDC6FwssrBVNrGxLNS5qsGSVRMKnMVJq3mCXTkm5QQRFlWhRbR85+sUMNk2nqqdXVaVURGiqrM0mwkCpVRMVu9mqsFo6SOHEjMg54D0TBi7QWfoQKnIEGErRBAQoqWIhlEMDErww8ENCRq4KNnIGIufsUehBAbRuzKGYIyLbw38BV2WNVXDs1TVct4SkyDgOVEYiRF3JPhuSZPJBX8vlnLdut4Xjp4iyOtFSvhS7Wch//qhScLWh+qyZ6tqAO+65Xht51l3FuIY30nfOrYS6Zjdmb5Ja8of8bMiBPeAO18y7dZWi2P8GPiF7wxbBB/O92jIPCdpzfjScyiWjq0GuVH9ZhA7awH3uQghNIQ97twyX04ZoLcjtckjsYpXI+T7MKP4I7/gL6JJMZ</latexit>
hi
<latexit sha1_base64="iVE2zwSrY7uMFp5lULN8y4s7ZXg=">AAACjnichVHLSsNAFL2Nr1ofjboR3BRLxVWZSKkiiEU3XfZhH9CWksSpjs2LJC3U0B9wLy4ERcGF+AF+gBt/wEU/QVxWcOPCmzQgWqw3TObMmXvunJkrGQqzbEJ6AW5sfGJyKjgdmpmdmw/zC4tFS2+ZMi3IuqKbZUm0qMI0WrCZrdCyYVJRlRRakpr77n6pTU2L6dqB3TFoTRWPNNZgsmgjValKqkO7dYeddOt8lMSJF5FhIPggCn5kdP4RqnAIOsjQAhUoaGAjVkAEC78KCEDAQK4GDnImIubtU+hCCLUtzKKYISLbxP8Rrio+q+HarWl5ahlPUXCYqIxAjLyQe9Inz+SBvJLPP2s5Xg3XSwdnaaClRj18tpz/+Fel4mzD8bdqpGcbGrDleWXo3fAY9xbyQN8+vejnt3MxZ43ckjf0f0N65AlvoLXf5bsszV2O8COhF3wxbJDwux3DoLgRF5LxRDYRTe35rQrCCqzCOvZjE1KQhgwUvBc9hyu45nguye1wu4NULuBrluBHcOkvYEmUlg==</latexit>
eij
atom features
bond features
66. / 166
Graph Neural Networks (GNNs)
42
N O
C
C
C
C
H
H
H
H
H
N O
C
C
C
C
H
H
H
H
H
GNN Layer
GNN updates
features
• sum, mean or max
• attentive pooling
❷
❷ Aggregate
Permutation
invariant
operations
❸
❸ Update
Any
• nn.Linear
❶
❶ Message
Permutation
equivariant
operations
• nn.Linear
Bond features can be
used (typically in ❶)
<latexit sha1_base64="rQGx6XMOvsCjXRp7b9gbmKjgm1M=">AAAC/HichVHNTttAEB4b2tI0lNBeKnGxiKiChKJNhVrUU1QuPaHwE0DCkbU2m3jD+kfrTVBquQ/AC3DgVNQeEOIKD9BLX4ADBx4AcUylXnroxHEVtah0LHu//Wa+8bc7dih4pAi50vSx8QcPH008zj3JTz6dKkw/24yCjnRY3QlEILdtGjHBfVZXXAm2HUpGPVuwLXtveZDf6jIZ8cDfUL2QNTza8nmTO1QhZRX2TduL3cTihilYU1Epg33DDKNsX/qdXsiZNm8FoehEVtw2TO4bpkeV61ARr2A+QZHLjVG9kaH2ELHEink7mc+ZkrdcNW8ViqRM0jDugkoGipBFLShcgAm7EIADHfCAgQ8KsQAKET47UAECIXINiJGTiHiaZ5BADrUdrGJYQZHdw28LdzsZ6+N+0DNK1Q7+ReArUWnAHLkkJ6RPvpFTckN+/rNXnPYYeOnhag+1LLSmDl6s//ivysNVgTtS3etZQROWUq8cvYcpMziFM9R3Pxz219+uzcUvyTG5Rf+fyBX5iifwu9+dL6ts7egePzZ6wRvDAVX+HsddsPmqXHldXlxdLFbfZaOagBmYhRLO4w1U4T3UoI79r7UxLa9N6h/1z/qpfjYs1bVM8xz+CP38F+WDvZo=</latexit>
hi
0
@hi,
M
j2Ni
(hi, hj, eij)
1
A
Update by “Message Passing”
<latexit sha1_base64="+I1ZH8a510AHL/VRK05INyOdDHc=">AAACi3ichVHLSsNAFL2Nr1qtrboR3BRLxVWZaFEpLooiuOzDPqAtJYnTdmheJGmhhv6ASzcu6kbBhfgBfoAbf8BFP0FcVnDjwps0IFqsN0zmzJl77pyZK+oyMy1CBj5uanpmds4/H1hYDC6FwssrBVNrGxLNS5qsGSVRMKnMVJq3mCXTkm5QQRFlWhRbR85+sUMNk2nqqdXVaVURGiqrM0mwkCpVRMVu9mqsFo6SOHEjMg54D0TBi7QWfoQKnIEGErRBAQoqWIhlEMDErww8ENCRq4KNnIGIufsUehBAbRuzKGYIyLbw38BV2WNVXDs1TVct4SkyDgOVEYiRF3JPhuSZPJBX8vlnLdut4Xjp4iyOtFSvhS7Wch//qhScLWh+qyZ6tqAO+65Xht51l3FuIY30nfOrYS6Zjdmb5Ja8of8bMiBPeAO18y7dZWi2P8GPiF7wxbBB/O92jIPCdpzfjScyiWjq0GuVH9ZhA7awH3uQghNIQ97twyX04ZoLcjtckjsYpXI+T7MKP4I7/gL6JJMZ</latexit>
hi
<latexit sha1_base64="iVE2zwSrY7uMFp5lULN8y4s7ZXg=">AAACjnichVHLSsNAFL2Nr1ofjboR3BRLxVWZSKkiiEU3XfZhH9CWksSpjs2LJC3U0B9wLy4ERcGF+AF+gBt/wEU/QVxWcOPCmzQgWqw3TObMmXvunJkrGQqzbEJ6AW5sfGJyKjgdmpmdmw/zC4tFS2+ZMi3IuqKbZUm0qMI0WrCZrdCyYVJRlRRakpr77n6pTU2L6dqB3TFoTRWPNNZgsmgjValKqkO7dYeddOt8lMSJF5FhIPggCn5kdP4RqnAIOsjQAhUoaGAjVkAEC78KCEDAQK4GDnImIubtU+hCCLUtzKKYISLbxP8Rrio+q+HarWl5ahlPUXCYqIxAjLyQe9Inz+SBvJLPP2s5Xg3XSwdnaaClRj18tpz/+Fel4mzD8bdqpGcbGrDleWXo3fAY9xbyQN8+vejnt3MxZ43ckjf0f0N65AlvoLXf5bsszV2O8COhF3wxbJDwux3DoLgRF5LxRDYRTe35rQrCCqzCOvZjE1KQhgwUvBc9hyu45nguye1wu4NULuBrluBHcOkvYEmUlg==</latexit>
eij
atom features
bond features
67. / 166
Use Case 1: Virtual Screening (QSAR/QSPR)
43
ChemProp
(Directed MPNN)
ExtraTrees
w/ ECFP6(1024)
Performance for unseen (test) data:
Standard ML GNN
Stokes et al, Cell (2020) https://doi.org/10.1016/j.cell.2020.01.021
Marchant, Nature (2020) https://doi.org/10.1038/d41586-020-00018-3
ChemProp (Yang et al, 2019)
from MIT MLPDS (Machine Learning
for Pharmaceutical Discovery
and Synthesis) Consortium
Disclaimer: This is just for a toy demo. This should be taken
as classification for ACTIVITY_OUTCOME (Active or Inactive)
95.079% (Active/Inactive) 95.604% (Active/Inactive)
• Regression for LogGI50 • Regression for LogGI50
• Classification accuracy • Classification accuracy
RMSE 0.6076
RMSE 0.7970
Activie/Inactive (Classification), LogGI50 (Regression)
72. / 166
Use Case 2: Quantum chemistry
48
https://qcarchive.molssi.org/apps/ml_datasets/
73. / 166
Use Case 2: Quantum chemistry
49
input output
gdb_21014
∼ 1000 sec
Density Functional Theory (DFT)
B3LYP/6-31G(2df, p)
<latexit sha1_base64="JI//afsBt1AdIhSgVUbVSGVXtww=">AAACmnichVHLSsNAFD2Nr1pfVREEXQSL4qpMpagIgiiC4qZVq4KVksSxHZomIZkWavAH/AEXrhRcqB/gB7jxB1z0E8SlghsX3qYBUVFvmMyZM/fcOTNXd0zhScYaEaWtvaOzK9od6+nt6x+IDw7teHbVNXjOsE3b3dM1j5vC4jkppMn3HJdrFd3ku3p5pbm/W+OuJ2xrW9YdflDRipY4EoYmiSrER/IlTfprJ2o+4wl1UV0NQCGeYEkWhPoTpEKQQBgZO36HPA5hw0AVFXBYkIRNaPDo20cKDA5xB/CJcwmJYJ/jBDHSVimLU4ZGbJn+RVrth6xF62ZNL1AbdIpJwyWlikn2yK7ZC3tgt+yJvf9ayw9qNL3UadZbWu4UBk5Ht97+VVVolih9qv70LHGE+cCrIO9OwDRvYbT0teOzl62FzUl/il2yZ/J/wRrsnm5g1V6NqyzfPP/Dj05e6MWoQanv7fgJdmaSqdlkOptOLC2HrYpiDBOYpn7MYQlryCBH9X1c4Aa3yriyrKwrG61UJRJqhvEllO0PT6aXZA==</latexit>
Ĥ = E
例) ⼀電⼦版のSchrödinger⽅程式
(Kohn‒Sham⽅程式)の求解
QMࢉܭ
74. / 166
Use Case 2: Quantum chemistry
49
input output
gdb_21014
∼ 1000 sec
Density Functional Theory (DFT)
B3LYP/6-31G(2df, p)
<latexit sha1_base64="JI//afsBt1AdIhSgVUbVSGVXtww=">AAACmnichVHLSsNAFD2Nr1pfVREEXQSL4qpMpagIgiiC4qZVq4KVksSxHZomIZkWavAH/AEXrhRcqB/gB7jxB1z0E8SlghsX3qYBUVFvmMyZM/fcOTNXd0zhScYaEaWtvaOzK9od6+nt6x+IDw7teHbVNXjOsE3b3dM1j5vC4jkppMn3HJdrFd3ku3p5pbm/W+OuJ2xrW9YdflDRipY4EoYmiSrER/IlTfprJ2o+4wl1UV0NQCGeYEkWhPoTpEKQQBgZO36HPA5hw0AVFXBYkIRNaPDo20cKDA5xB/CJcwmJYJ/jBDHSVimLU4ZGbJn+RVrth6xF62ZNL1AbdIpJwyWlikn2yK7ZC3tgt+yJvf9ayw9qNL3UadZbWu4UBk5Ht97+VVVolih9qv70LHGE+cCrIO9OwDRvYbT0teOzl62FzUl/il2yZ/J/wRrsnm5g1V6NqyzfPP/Dj05e6MWoQanv7fgJdmaSqdlkOptOLC2HrYpiDBOYpn7MYQlryCBH9X1c4Aa3yriyrKwrG61UJRJqhvEllO0PT6aXZA==</latexit>
Ĥ = E
例) ⼀電⼦版のSchrödinger⽅程式
(Kohn‒Sham⽅程式)の求解
ML
∼ 0.01 sec
≈
100,000 times faster!
QMࢉܭ
75. / 166
Use Case 2: Quantum chemistry
50
ICML 2017 https://arxiv.org/abs/1704.01212 JCTC 2017 https://doi.org/10.1021/acs.jctc.7b00577
• Googleが⾊々なGNNのバリエーションを「MPNN(Message Passing NN)」として
統⼀的に⾒直した際にターゲットにされたのがこの量⼦化学計算近似タスク
76. / 166
Use Case 2: Quantum chemistry
51
オリジナルのFCFP原⼦不変量
オリジナルのECFP原⼦不変量
• the number of immediate neighbors who are
“heavy” (non-hydrogen) atoms
• the valence minus the number of hydrogens
• the atomic number
• the atomic mass
• the atomic charge
• the number of attached hydrogens
• whether the atom is contained in at least one ring
Daylight
原⼦不変量
• hydrogen-bond acceptor or not?
• hydrogen-bond donor or not?
• negatively ionizable or not?
• positively ionizable or not?
• aromatic or not?
• halogen or not?
Rogers and Hahn, JCIM (2005) https://doi.org/10.1021/ci100050t
Faber et al, JCTC (2017) https://doi.org/10.1021/acs.jctc.7b00577
MPNNによる量⼦化学計算近似で⽤いられた頂点・辺特徴
連続量ラベル
80. / 166
Use Case 2: Quantum chemistry
53
pred vs true for SchNet (Schütt et al, 2017) pred vs true for DimeNet (Klicpera et al, 2020)
Dipole Moment Energy U
HOMO
LUMO
Heat Capacity
Enthalpy H
Dipole Moment Energy U
HOMO
LUMO
Heat Capacity
Enthalpy H
81. / 166
Use Case 2: Quantum chemistry
53
pred vs true for SchNet (Schütt et al, 2017) pred vs true for DimeNet (Klicpera et al, 2020)
Dipole Moment Energy U
HOMO
LUMO
Heat Capacity
Enthalpy H
Dipole Moment Energy U
HOMO
LUMO
Heat Capacity
Enthalpy H
100,000倍速いなら
⼗分許容できる予測誤差!
これはテストデータ(訓練時に
⾒せてないデータ)の結果
82. / 166
Use Case 2: Quantum chemistry
54
SchNet (Schütt et al, 2017) DimeNet (Klicpera et al, 2020)
Free Energy Free Energy
y_true
y_true
y_pred y_pred
ExtraTrees w/ ECFP6 LightGBM w/ ECFP6 3-Layer MLP w/ ECFP6
(without 3D geometry) (without 3D geometry) (without 3D geometry)
Free Energy Free Energy Free Energy