Deep Learning
Neural Networks Reborn
Fabio A. González
Univ. Nacional de Colombia
Some history
https://www.youtube.com/watch?v=cNxadbrN_aI
Rosenblatt’s Perceptron (1957)
• Input: 20x20 photocell array
• Weights implemented with potentiometers
• Weight updates performed by electric motors
Neural networks time line
1943 1957 1969 1986 1995 2007 2012 2016
Backpropagation
Source: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
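The worked example lives at the linked tutorial page. As a minimal sketch of the algorithm itself, assuming nothing from the slide beyond "backpropagation", here is a tiny two-layer network trained in plain numpy (all sizes and the learning rate are made up for the example):

```python
import numpy as np

# Minimal backpropagation sketch: sigmoid hidden layer, linear output,
# squared loss. Hypothetical sizes, not from the slides.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))          # 8 samples, 4 inputs
y = rng.normal(size=(8, 1))          # regression targets
W1, b1 = rng.normal(size=(4, 5)) * 0.1, np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)) * 0.1, np.zeros(1)
lr = 0.1

for step in range(1000):
    # forward pass
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))    # sigmoid hidden layer
    y_hat = h @ W2 + b2                          # linear output
    # backward pass: chain rule, layer by layer
    d_out = (y_hat - y) / len(X)                 # dL/dy_hat for 0.5*MSE
    dW2, db2 = h.T @ d_out, d_out.sum(0)
    d_h = (d_out @ W2.T) * h * (1 - h)           # through the sigmoid
    dW1, db1 = X.T @ d_h, d_h.sum(0)
    # gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```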
Colombian local neural networks history
UNNeuro (1993, UN)
Deep Learning
Deep learning boom
Deep learning time line
1943 1957 1969 1986 1995 2007 2012 2016
LETTER. Communicated by Yann Le Cun

A Fast Learning Algorithm for Deep Belief Nets
Geoffrey E. Hinton (hinton@cs.toronto.edu)
Simon Osindero (osindero@cs.toronto.edu)
Department of Computer Science, University of Toronto, Toronto, Canada M5S 3G4
Yee-Whye Teh (tehyw@comp.nus.edu.sg)
Department of Computer Science, National University of Singapore, Singapore 117543

We show how to use “complementary priors” to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.

1 Introduction
Learning is difficult in densely connected, directed belief nets that have many hidden layers because it is difficult to infer the conditional distribution of the hidden activities when given a data vector. Variational methods use simple approximations to the true conditional distribution, but the approximations may be poor, especially at the deepest hidden layer, where the prior assumes independence. Also, variational learning still requires all of the parameters to be learned together, and this makes the learning time scale poorly as the number of parameters increases.
We describe a model in which the top two hidden layers form an undirected associative memory (see Figure 1) and the remaining hidden layers […]

Neural Computation 18, 1527–1554 (2006). © 2006 Massachusetts Institute of Technology
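The greedy, layer-wise procedure the abstract describes trains a stack of restricted Boltzmann machines, one layer at a time, with contrastive divergence. Below is a minimal sketch of one CD-1 update for a single binary RBM (numpy only; the sizes and learning rate are illustrative, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# One binary RBM trained with CD-1; stacking RBMs trained this way,
# layer by layer, gives the greedy pretraining the paper describes.
n_vis, n_hid, lr = 784, 256, 0.01            # illustrative sizes
W = rng.normal(scale=0.01, size=(n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)

def cd1_update(v0):
    """One contrastive-divergence step on a batch of visible vectors."""
    global W, b_vis, b_hid
    p_h0 = sigmoid(v0 @ W + b_hid)                    # positive phase
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_vis)                  # reconstruction
    p_h1 = sigmoid(p_v1 @ W + b_hid)                  # negative phase
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
    b_vis += lr * (v0 - p_v1).mean(0)
    b_hid += lr * (p_h0 - p_h1).mean(0)

batch = (rng.random((32, n_vis)) < 0.1).astype(float)  # fake binary data
for _ in range(10):
    cd1_update(batch)
```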
Deep learning model won ILSVRC 2012 challenge
1.2 million images
1,000 concepts
Deep learning recipe
Data
HPC
Algorithms
Tricks
Feature learning
Size
Feature learning
Deep —> Bigger
Revolution of Depth: ImageNet classification top-5 error (%)
• ILSVRC'10 (shallow): 28.2
• ILSVRC'11 (shallow): 25.8
• ILSVRC'12, AlexNet (8 layers): 16.4
• ILSVRC'13 (8 layers): 11.7
• ILSVRC'14, VGG (19 layers): 7.3
• ILSVRC'14, GoogleNet (22 layers): 6.7
• ILSVRC'15, ResNet (152 layers): 3.57
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. arXiv 2015.
Data…
• Images annotated with WordNet concepts
• Concepts: 21,841
• Images: 14,197,122
• Bounding box annotations: 1,034,908
• Crowdsourcing
HPC
Algorithms
• Backpropagation
• Backpropagation through time
• Online learning (stochastic gradient descent; sketched below)
• Softmax (hierarchical)
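Stochastic gradient descent updates the parameters from one example (or mini-batch) at a time rather than from the full dataset; paired with a softmax output, it gives the standard training loop for classification nets. A minimal sketch in numpy, with illustrative shapes and the flat (not hierarchical) softmax:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Linear-softmax classifier trained with online SGD on fake data.
n_feat, n_class, lr = 20, 5, 0.1
W, b = np.zeros((n_feat, n_class)), np.zeros(n_class)
X = rng.normal(size=(100, n_feat))
y = rng.integers(0, n_class, size=100)

for epoch in range(5):
    for i in rng.permutation(len(X)):       # online: one example at a time
        p = softmax(X[i:i+1] @ W + b)       # predicted class probabilities
        p[0, y[i]] -= 1.0                   # cross-entropy gradient wrt logits
        W -= lr * X[i:i+1].T @ p
        b -= lr * p[0]
```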
Tricks
• DL is mainly an engineering problem
• DL networks are hard to train
• Several tricks, the product of years of experience (a few are sketched below):
• Layer-wise training
• RELU units
• Dropout
• Adaptive learning rates
• Initialization
• Preprocessing
• Gradient norm clipping
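Three of these tricks (RELU units, dropout, gradient norm clipping) are simple enough to show in plain numpy. A hedged illustration; the shapes and rates are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(x):
    # RELU units: cheap, non-saturating activation
    return np.maximum(0.0, x)

def dropout(h, rate=0.5, train=True):
    # Dropout: randomly zero units during training; "inverted" scaling
    # keeps the expected activation unchanged at test time
    if not train:
        return h
    mask = (rng.random(h.shape) >= rate) / (1.0 - rate)
    return h * mask

def clip_gradients(grads, max_norm=5.0):
    # Gradient norm clipping: rescale when the global norm is too large
    norm = np.sqrt(sum((g ** 2).sum() for g in grads))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grads]

h = relu(rng.normal(size=(4, 8)))
h = dropout(h, rate=0.5, train=True)
grads = clip_gradients([rng.normal(size=(8, 8)), rng.normal(size=8)])
```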
Applications
• Computer vision:
• Image: annotation, detection, segmentation, captioning
• Video: object tracking, action recognition, segmentation
• Speech recognition and synthesis
• Text: language modeling, word/text representation, text classification, translation
• Biomedical image analysis
Deep Learning at
Feature learning for cancer diagnosis

Artificial Intelligence in Medicine 64 (2015) 131–145. Journal homepage: www.elsevier.com/locate/aiim

An unsupervised feature learning framework for basal cell carcinoma image analysis
John Arevalo (a), Angel Cruz-Roa (a), Viviana Arias (b), Eduardo Romero (c), Fabio A. González (a)
(a) Machine Learning, Perception and Discovery Lab, Systems and Computer Engineering Department, Universidad Nacional de Colombia, Faculty of Engineering, Cra 30 No 45 03-Ciudad Universitaria, Building 453 Office 114, Bogotá DC, Colombia
(b) Pathology Department, Universidad Nacional de Colombia, Faculty of Medicine, Cra 30 No 45 03-Ciudad Universitaria, Bogotá DC, Colombia
(c) Computer Imaging & Medical Applications Laboratory, Universidad Nacional de Colombia, Faculty of Medicine, Cra 30 No 45 03-Ciudad Universitaria, Bogotá DC, Colombia

Article history: received 1 October 2014; received in revised form 9 April 2015; accepted 15 April 2015.
Keywords: digital pathology; representation learning; unsupervised feature learning; basal cell carcinoma.

Objective: The paper addresses the problem of automatic detection of basal cell carcinoma (BCC) in histopathology images. In particular, it proposes a framework to both learn the image representation in an unsupervised way and visualize discriminative features supported by the learned model.
Materials and methods: This paper presents an integrated unsupervised feature learning (UFL) framework for histopathology image analysis that comprises three main stages: (1) local (patch) representation learning using different strategies (sparse autoencoders, reconstruct independent component analysis, and topographic independent component analysis (TICA)), (2) global (image) representation learning using a bag-of-features representation or a convolutional neural network, and (3) a visual interpretation layer to highlight the most discriminant regions detected by the model. The integrated unsupervised feature learning framework was exhaustively evaluated on a histopathology image dataset for BCC diagnosis.
Results: The experimental evaluation produced a classification performance of 98.1%, in terms of the area under the receiver-operating-characteristic curve, for the proposed framework, outperforming by 7% the state-of-the-art discrete cosine transform patch-based representation.
Conclusions: The proposed UFL-representation-based approach outperforms state-of-the-art methods for BCC detection. Thanks to its visual interpretation layer, the method is able to highlight discriminative […]
[Figure: convolutional auto-encoder neural network architecture for histopathology image representation learning, automatic cancer detection, and visually interpretable prediction results, analogous to a digital stain identifying the image regions that are most relevant for diagnostic decisions.]
[Figure: local features learned with different unsupervised feature learning methods: sparse autoencoders (nuclei-like shapes from the histopathology dataset), reconstruct independent component analysis, and topographic independent component analysis showing translational, color, scale, and rotational invariances.]
[Figure: discriminant map of learned features, with regions related to the positive (basal cell carcinoma) class and to the negative (healthy tissue) class, and softmax weights mapped back to the topographic organization.]
[Table: outputs produced by the system for different cancer and non-cancer input images, showing the real class, the predicted class, the probability associated with the prediction, and the digitally stained image (red stain indicates cancer regions, blue stain healthy ones).]
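Among the local representation learning strategies above, the sparse autoencoder is the simplest to sketch: a one-hidden-layer autoencoder over image patches with an L1 sparsity penalty on the activations. A rough illustration in numpy (the patch size, penalty weight, and learning rate are made up, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(3)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Sparse autoencoder on flattened image patches (illustrative sizes).
n_in, n_hid = 8 * 8, 64            # 8x8 gray patches, 64 learned features
W1 = rng.normal(scale=0.1, size=(n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(scale=0.1, size=(n_hid, n_in)); b2 = np.zeros(n_in)
lr, sparsity = 0.1, 1e-3

patches = rng.random((256, n_in))   # stand-in for real patch data

for step in range(200):
    h = sigmoid(patches @ W1 + b1)          # encoder: learned features
    x_hat = h @ W2 + b2                     # linear decoder
    d_out = (x_hat - patches) / len(patches)
    dW2, db2 = h.T @ d_out, d_out.sum(0)
    # reconstruction gradient plus L1 sparsity penalty on activations
    d_h = (d_out @ W2.T + sparsity * np.sign(h)) * h * (1 - h)
    dW1, db1 = patches.T @ d_h, d_h.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```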
Deep learning for cancer diagnosis

Cascaded Ensemble of Convolutional Neural Networks and Handcrafted Features for Mitosis Detection
Haibo Wang* (1), Angel Cruz-Roa* (2), Ajay Basavanhally (1), Hannah Gilmore (1), Natalie Shih (3), Mike Feldman (3), John Tomaszewski (4), Fabio Gonzalez (2), and Anant Madabhushi (1)
(1) Case Western Reserve University, USA
(2) Universidad Nacional de Colombia, Colombia
(3) University of Pennsylvania, USA
(4) University at Buffalo School of Medicine and Biomedical Sciences, USA

ABSTRACT
Breast cancer (BCa) grading plays an important role in predicting disease aggressiveness and patient outcome. A key component of BCa grade is mitotic count, which involves quantifying the number of cells in the process of dividing (i.e., undergoing mitosis) at a specific point in time. Currently, mitosis counting is done manually by a pathologist looking at multiple high-power fields on a glass slide under a microscope, an extremely laborious and time-consuming process. The development of computerized systems for automated detection of mitotic nuclei, while highly desirable, is confounded by the highly variable shape and appearance of mitoses. Existing methods use either handcrafted features that capture certain morphological, statistical, or textural attributes of mitoses or features learned with convolutional neural networks (CNN). While handcrafted features are inspired by the domain and the particular application, the data-driven CNN models […]
[…] Including the time needed to extract handcrafted features (6.5 hours in pure MATLAB implementation), the training stage for HC+CNN was completed in less than 18 hours.

Table 2: Evaluation results for mitosis detection using HC+CNN and comparative methods on the ICPR12 dataset (Aperio scanner).

Method   TP  FP  FN  Precision  Recall  F-measure
HC+CNN   65  12  35  0.84       0.65    0.7345
HC       64  22  36  0.74       0.64    0.6864
CNN      53  32  47  0.63       0.53    0.5730
IDSIA    70   9  30  0.89       0.70    0.7821
IPAL     74  32  26  0.70       0.74    0.7184
SUTECH   72  31  28  0.70       0.72    0.7094
NEC      59  20  41  0.75       0.59    0.6592

Figure 3: Mitoses identified by HC+CNN as TP (green circles), FN (yellow circles), and FP (red circles) on the ICPR12 dataset. The TP examples have distinctive intensity, shape, and texture, while the FN examples are less distinctive in intensity and shape. The FP examples are visually more similar to mitotic figures than the FNs.

3.4 Results on AMIDA13 Dataset
On the AMIDA13 dataset, the F-measure of our approach (CCIPD/MINDLAB) is 0.319, which ranks 6 among 14 submissions […]
Deep learning for cancer
diagnosis
Cascaded Ensemble of Convolutional Neural Networks and
Handcrafted Features for Mitosis Detection
Haibo Wang *⇤1
, Angel Cruz-Roa*2
, Ajay Basavanhally1
, Hannah Gilmore1
, Natalie Shih3
, Mike
Feldman3
, John Tomaszewski4
, Fabio Gonzalez2
, and Anant Madabhushi1
1
Case Western Reserve University, USA
2
Universidad Nacional de Colombia, Colombia
3
University of Pennsylvania, USA
4
University at Buffalo School of Medicine and Biomedical Sciences, USA
[Figure 1 diagram: HPF → segmentation → handcrafted features and CNN feature learning → Classifiers 1–3 → probabilistic fusion]
Figure 1: Workflow of our methodology. Blue-ratio thresholding is first applied to segment mitosis candidates. On each segmented blob, handcrafted features are extracted and classified via a Random Forests classifier. Meanwhile, on each segmented 80 × 80 patch, convolutional neural networks (CNN) are trained with a fully connected regression model as part of the classification layer. For those candidates that are difficult to classify (ambiguous result from the CNN), we train a second-stage Random Forests classifier on the basis of combining CNN-derived and handcrafted features. The final decision is obtained via a consensus of the predictions of the three classifiers.
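To make the cascade in Figure 1 concrete, here is a minimal sketch of the decision rule, assuming the three classifiers have already been reduced to per-candidate probabilities. The ambiguity band (0.3–0.7), the averaging fusion, and all names are illustrative assumptions, not the authors' code:

    import numpy as np

    def classify_candidate(p_hc, p_cnn, p_combined, low=0.3, high=0.7):
        """Fuse the three classifier outputs described in Figure 1.

        p_hc:       P(mitosis) from the Random Forest on handcrafted features
        p_cnn:      P(mitosis) from the CNN on the 80x80 patch
        p_combined: P(mitosis) from the second-stage Random Forest on
                    combined CNN-derived + handcrafted features (consulted
                    only when the CNN output is ambiguous)
        """
        if low < p_cnn < high:
            # Ambiguous CNN result: consensus over all three classifiers.
            probs = [p_hc, p_cnn, p_combined]
        else:
            # Confident CNN result: fuse the two first-stage classifiers.
            probs = [p_hc, p_cnn]
        # Probabilistic fusion via a simple average (one possible choice).
        return float(np.mean(probs)) > 0.5

    print(classify_candidate(p_hc=0.8, p_cnn=0.55, p_combined=0.4))  # True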
Fabio A. González Universidad Nacional de Colombia
Efficient DL over whole slide pathology images
~2000x2000 pixels
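Fields of this size are far larger than a CNN's fixed input, so a common approach is to tile the field into patches, run the network per patch, and assemble a prediction map. A minimal sketch of that pattern (patch size, stride, and the dummy model are our assumptions, not the exact pipeline from the talk):

    import numpy as np

    def predict_heatmap(image, predict_patch, patch_size=100, stride=100):
        """Slide a patch window over a large field and score each patch."""
        h, w = image.shape[:2]
        heatmap = np.zeros((h // stride, w // stride))
        for i, y in enumerate(range(0, h - patch_size + 1, stride)):
            for j, x in enumerate(range(0, w - patch_size + 1, stride)):
                patch = image[y:y + patch_size, x:x + patch_size]
                heatmap[i, j] = predict_patch(patch)  # e.g. P(cancer) per patch
        return heatmap

    field = np.random.rand(2000, 2000, 3)                 # stand-in ~2000x2000 field
    heatmap = predict_heatmap(field, lambda p: p.mean())  # dummy patch model
    print(heatmap.shape)  # (20, 20)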
Fabio A. González Universidad Nacional de Colombia
Exudate detection in eye fundus images
Fabio A. González Universidad Nacional de Colombia
RNN for book genre classification
Dataset construction: the annotated dataset is built by taking the top tags assigned by users and mapping them to genre classes.

No.  Class            Tags
1    science_fiction  sci-fi, science-fiction
2    comedy           comedies, comedy, humor
...  ...              ...
9    religion         christian, religion, christianity, ...

RNN Architecture
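A minimal sketch of the two steps on this slide: labeling books from users' top tags (entries taken from the table above) and a small recurrent text classifier. The tag-to-class helper, vocabulary size, and layer sizes are illustrative assumptions, not the architecture from the talk:

    import tensorflow as tf

    TAG_TO_CLASS = {
        "sci-fi": "science_fiction", "science-fiction": "science_fiction",
        "comedies": "comedy", "comedy": "comedy", "humor": "comedy",
        "christian": "religion", "religion": "religion",
        "christianity": "religion",
    }

    def label_book(top_tags):
        """Assign the class of the first user tag that matches the table."""
        for tag in top_tags:
            if tag in TAG_TO_CLASS:
                return TAG_TO_CLASS[tag]
        return None  # book left out of the annotated dataset

    # Small RNN over token ids (vocabulary and dimensions are assumptions).
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=20000, output_dim=128),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(9, activation="softmax"),  # 9 genre classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    print(label_book(["humor", "sci-fi"]))  # -> "comedy"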
Fabio A. González Universidad Nacional de Colombia
Deep CNN-RNN for object tracking in video
Number of sequences in different public datasets:
• VOT2013: 16
• VOT2014: 25
• VOT2015: 60
• OOT: 50
• ALOV++: 315
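The CNN-RNN pattern named on this slide extracts per-frame CNN features and feeds them to a recurrent network that predicts a bounding box per frame. A generic sketch of that pattern (input resolution, layer sizes, and the box parameterization are our assumptions, not the exact model from the talk):

    import tensorflow as tf

    frames = tf.keras.Input(shape=(None, 64, 64, 3))   # (time, H, W, C)
    cnn = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
    ])
    # Apply the same CNN to every frame, then track state with an LSTM.
    feats = tf.keras.layers.TimeDistributed(cnn)(frames)      # (time, 16)
    hidden = tf.keras.layers.LSTM(32, return_sequences=True)(feats)
    boxes = tf.keras.layers.Dense(4)(hidden)                  # (x, y, w, h) per frame
    model = tf.keras.Model(frames, boxes)
    model.summary()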
Fabio A. González Universidad Nacional de Colombia
The Team
Alexis Carrillo
Andrés Esteban Paez
Angel Cruz
Andrés Castillo
Andrés Jaque
Andrés Rosso
Camilo Pino
Claudia Becerra
Fabián Paez
Felipe Baquero
Fredy Díaz
Gustavo Bula
Germán Sosa
Hugo Castellanos
Ingrid Suárez
John Arévalo
Jorge Vanegas
Jorge Camargo
Jorge Mario Carrasco
Joseph Alejandro Gallego
José David Bermeo
Juan Carlos Caicedo
Juan Sebastián Otálora
Katherine Rozo
Lady Viviana Beltrán
Lina Rosales
Luis Alejandro Riveros
Miguel Chitiva
Óscar Paruma
Óscar Perdomo
Raúl Ramos
Roger Guzmán
Santiago Pérez
Sergio Jiménez
Susana Sánchez
Sebastián Sierra
Thank you!
fagonzalezo@unal.edu.co
http://mindlaboratory.org


  • 34. Fabio A. González Universidad Nacional de Colombia Deep learning time line 1943 20161957 1969 1986 1995 2007 2012 LETTER Communicated by Yann Le Cun A Fast Learning Algorithm for Deep Belief Nets Geoffrey E. Hinton hinton@cs.toronto.edu Simon Osindero osindero@cs.toronto.edu Department of Computer Science, University of Toronto, Toronto, Canada M5S 3G4 Yee-Whye Teh tehyw@comp.nus.edu.sg Department of Computer Science, National University of Singapore, Singapore 117543 We show how to use “complementary priors” to eliminate the explaining- away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associa- tive memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive ver- sion of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribu- tion of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning al- gorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind. 1 Introduction Learning is difficult in densely connected, directed belief nets that have many hidden layers because it is difficult to infer the conditional distribu- tion of the hidden activities when given a data vector. Variational methods use simple approximations to the true conditional distribution, but the ap- proximations may be poor, especially at the deepest hidden layer, where the prior assumes independence. Also, variational learning still requires all of the parameters to be learned together and this makes the learning time scale poorly as the number of parameters increases. We describe a model in which the top two hidden layers form an undi- rected associative memory (see Figure 1) and the remaining hidden layers Neural Computation 18, 1527–1554 (2006) C⃝ 2006 Massachusetts Institute of Technology LETTER Communicated by Yann Le Cun A Fast Learning Algorithm for Deep Belief Nets Geoffrey E. Hinton hinton@cs.toronto.edu Simon Osindero osindero@cs.toronto.edu Department of Computer Science, University of Toronto, Toronto, Canada M5S 3G4 Yee-Whye Teh tehyw@comp.nus.edu.sg Department of Computer Science, National University of Singapore, Singapore 117543 We show how to use “complementary priors” to eliminate the explaining- away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associa- tive memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive ver- sion of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribu- tion of handwritten digit images and their labels. 
This generative model gives better digit classification than the best discriminative learning al- gorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind. 1 Introduction Learning is difficult in densely connected, directed belief nets that have many hidden layers because it is difficult to infer the conditional distribu- tion of the hidden activities when given a data vector. Variational methods use simple approximations to the true conditional distribution, but the ap- proximations may be poor, especially at the deepest hidden layer, where the prior assumes independence. Also, variational learning still requires all of the parameters to be learned together and this makes the learning time scale poorly as the number of parameters increases. We describe a model in which the top two hidden layers form an undi- rected associative memory (see Figure 1) and the remaining hidden layers Neural Computation 18, 1527–1554 (2006) C⃝ 2006 Massachusetts Institute of Technology
  • 35. Fabio A. González Universidad Nacional de Colombia Deep learning model won ILSVRC 2012 challenge
  • 36. Fabio A. González Universidad Nacional de Colombia Deep learning model won ILSVRC 2012 challenge 1.2 million images 1,000 concepts
  • 37. Fabio A. González Universidad Nacional de Colombia Deep learning model won ILSVRC 2012 challenge 1.2 million images 1,000 concepts
  • 38. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 39. Fabio A. González Universidad Nacional de Colombia Feature learning
  • 40. Fabio A. González Universidad Nacional de Colombia Feature learning
  • 41. Fabio A. González Universidad Nacional de Colombia Feature learning
  • 42. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 43. Fabio A. González Universidad Nacional de Colombia Deep —> Bigger. Revolution of Depth (ImageNet classification top-5 error, %): ILSVRC'10 (shallow): 28.2; ILSVRC'11 (shallow): 25.8; ILSVRC'12 AlexNet (8 layers): 16.4; ILSVRC'13 (8 layers): 11.7; ILSVRC'14 VGG (19 layers): 7.3; ILSVRC'14 GoogleNet (22 layers): 6.7; ILSVRC'15 ResNet (152 layers): 3.57. Source: Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. "Deep Residual Learning for Image Recognition". arXiv 2015.
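The chart's takeaway is architectural: the 152-layer ResNet is trainable because each block only has to learn a residual on top of an identity shortcut. A minimal Keras sketch of that idea follows (our illustration, not the paper's exact configuration; shapes and filter counts are made up for the example):

```python
# A residual block: the network learns F(x) and outputs relu(x + F(x)),
# so extra layers can default to the identity and depth stays trainable.
from tensorflow.keras import layers, models

def residual_block(x, filters=64):
    # F(x): two stacked 3x3 convolutions
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Identity shortcut: add the input back before the final nonlinearity
    return layers.Activation("relu")(layers.Add()([x, y]))

inp = layers.Input(shape=(32, 32, 64))   # channel count must match `filters`
model = models.Model(inp, residual_block(inp))
```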
  • 44. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 45. Fabio A. González Universidad Nacional de Colombia Data… • Images annotated with WordNet concepts • Concepts: 21,841 • Images: 14,197,122 • Bounding box annotations: 1,034,908 • Crowdsourcing
  • 46. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 47. Fabio A. González Universidad Nacional de Colombia HPC
  • 48. Fabio A. González Universidad Nacional de Colombia HPC
  • 49. Fabio A. González Universidad Nacional de Colombia HPC
  • 50. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 51. Fabio A. González Universidad Nacional de Colombia Algorithms • Backpropagation • Backpropagation through time • Online learning (stochastic gradient descent) • Softmax (hierarchical)
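To make the algorithm list concrete, here is a minimal NumPy sketch (our illustration, not code from the talk) of one online-learning step: a forward pass through a one-hidden-layer network with a softmax output, backpropagation of the cross-entropy gradient, and a stochastic gradient descent update on a single example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 4, 8, 3, 0.1

# Parameters of a tiny one-hidden-layer classifier
W1 = rng.normal(0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_out)); b2 = np.zeros(n_out)

x = rng.normal(size=n_in)               # one training example (online learning)
y = 1                                   # its class label

# Forward pass
h = np.tanh(x @ W1 + b1)                # hidden activations
z = h @ W2 + b2
p = np.exp(z - z.max()); p /= p.sum()   # softmax probabilities

# Backward pass: gradients of the cross-entropy loss
dz = p.copy(); dz[y] -= 1.0             # dL/dz for softmax + cross-entropy
dW2 = np.outer(h, dz); db2 = dz
dh = W2 @ dz
dpre = dh * (1 - h**2)                  # tanh derivative
dW1 = np.outer(x, dpre); db1 = dpre

# Stochastic gradient descent update, one example at a time
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```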
  • 52. Fabio A. González Universidad Nacional de Colombia Deep learning recipe Data HPC Algorithms Tricks Feature learning Size
  • 53. Fabio A. González Universidad Nacional de Colombia Tricks • DL is mainly an engineering problem • DL networks are hard to train • Several tricks, the product of years of experience
  • 54. Fabio A. González Universidad Nacional de Colombia Tricks • DL is mainly an engineering problem • DL networks are hard to train • Several tricks, the product of years of experience (a few are sketched below) • Layer-wise training • ReLU units • Dropout • Adaptive learning rates • Initialization • Preprocessing • Gradient norm clipping
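Some of these tricks are easy to show in isolation. The following NumPy sketch (ours; the constants are illustrative) implements He-style initialization, ReLU units, inverted dropout, and gradient norm clipping:

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128

# Initialization: He et al. scaling keeps ReLU activations well-conditioned
W = rng.normal(0, np.sqrt(2.0 / fan_in), (fan_in, fan_out))

def relu(x):
    return np.maximum(0.0, x)

def dropout(h, p_keep=0.8, train=True):
    # Inverted dropout: scale at train time so test time needs no change
    if not train:
        return h
    mask = rng.random(h.shape) < p_keep
    return h * mask / p_keep

def clip_grad(g, max_norm=5.0):
    # Gradient norm clipping, commonly used to stabilize training
    norm = np.linalg.norm(g)
    return g * (max_norm / norm) if norm > max_norm else g

h = dropout(relu(rng.normal(size=fan_in) @ W))  # forward through one layer
g = clip_grad(rng.normal(size=W.shape))          # a clipped (dummy) gradient
```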
  • 55. Fabio A. González Universidad Nacional de Colombia Applications • Computer vision: • Image: annotation, detection, segmentation, captioning • Video: object tracking, action recognition, segmentation • Speech recognition and synthesis • Text: language modeling, word/text representation, text classification, translation • Biomedical image analysis
  • 56. Fabio A. González Universidad Nacional de Colombia Deep Learning at
  • 57. Fabio A. González Universidad Nacional de Colombia Feature learning for cancer diagnosis. [Slide shows: Arevalo, J., Cruz-Roa, A., Arias, V., Romero, E., & González, F. A. (2015). "An unsupervised feature learning framework for basal cell carcinoma image analysis". Artificial Intelligence in Medicine, 64, 131–145. The framework has three stages: local (patch) representation learning with sparse autoencoders, reconstruct ICA, or topographic ICA (TICA); global (image) representation learning with a bag-of-features representation or a convolutional neural network; and a visual interpretation layer that highlights the most discriminant regions. It reaches 98.1% area under the ROC curve for BCC diagnosis, outperforming the state-of-the-art discrete cosine transform patch-based representation by 7%.]
  • 58. Fabio A. González Universidad Nacional de Colombia Feature learning for cancer diagnosis. [Figure from the same line of work: a convolutional auto-encoder architecture for histopathology image representation learning, automatic cancer detection, and visually interpretable predictions analogous to a digital stain identifying the image regions most relevant for diagnostic decisions.]
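For readers unfamiliar with the architecture in the figure, here is a hedged Keras sketch of a convolutional auto-encoder (ours, not the paper's exact network; the patch size and filter counts are illustrative). Training minimizes reconstruction error, so the representation is learned without labels; afterwards the encoder output serves as the feature map:

```python
from tensorflow.keras import layers, models

inp = layers.Input(shape=(80, 80, 3))                        # one RGB patch
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2)(x)                                # 80x80 -> 40x40
code = layers.Conv2D(8, 3, activation="relu", padding="same")(x)  # learned features
x = layers.UpSampling2D(2)(code)                             # 40x40 -> 80x80
out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruct the input patch
```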
  • 59. Fabio A. González Universidad Nacional de Colombia Feature learning for cancer diagnosis. [Figures from the same paper: local features learned by different unsupervised methods (left: sparse autoencoders recover features visually related to properties of the histopathology dataset, i.e. nuclei shapes; center: reconstruct independent component analysis; right: topographic independent component analysis, highlighting translational, color, scale, and rotational invariances), and a discriminant map of the learned features, with red enclosing features related to the positive (basal cell carcinoma) class and blue enclosing features related to the negative (healthy tissue) class.]
  • 60. Fabio A. González Universidad Nacional de Colombia Feature learning for cancer diagnosis. [Table 2 from the same paper: outputs produced by the system for different cancer and non-cancer input images; the rows show the real image class predicted by the model, the probability associated with the prediction, and the digitally stained image (red stain indicates cancer regions, blue stain indicates healthy regions).]
  • 61. Fabio A. González Universidad Nacional de Colombia Feature learning for cancer diagnosis. [Slide repeats the header and abstract of the same paper.]
  • 62. Fabio A. González Universidad Nacional de Colombia Deep learning for cancer diagnosis. [Slide shows: Haibo Wang*, Angel Cruz-Roa*, Ajay Basavanhally, Hannah Gilmore, Natalie Shih, Mike Feldman, John Tomaszewski, Fabio Gonzalez, & Anant Madabhushi. "Cascaded Ensemble of Convolutional Neural Networks and Handcrafted Features for Mitosis Detection". From the abstract: breast cancer grading depends on the mitotic count, i.e. the number of cells undergoing division; counting is currently done manually by a pathologist over multiple high-power fields under a microscope, an extremely laborious and time-consuming process, and automated detection is confounded by the highly variable shape and appearance of mitoses. Existing methods use either handcrafted morphological, statistical, or textural features, or features learned with convolutional neural networks (CNN).]
  • 63. Fabio A. González Universidad Nacional de Colombia Deep learning for cancer diagnosis. [Results from the same paper. Including the 6.5 hours needed to extract handcrafted features in a pure MATLAB implementation, the HC+CNN training stage completed in under 18 hours. Table 2, mitosis detection on the ICPR12 dataset (Aperio scanner):
    Method    TP  FP  FN  Precision  Recall  F-measure
    HC+CNN    65  12  35  0.84       0.65    0.7345
    HC        64  22  36  0.74       0.64    0.6864
    CNN       53  32  47  0.63       0.53    0.5730
    IDSIA     70   9  30  0.89       0.70    0.7821
    IPAL      74  32  26  0.70       0.74    0.7184
    SUTECH    72  31  28  0.70       0.72    0.7094
    NEC       59  20  41  0.75       0.59    0.6592
    Figure 3 shows mitoses identified by HC+CNN as true positives (green circles), false negatives (yellow), and false positives (red): the TPs have distinctive intensity, shape, and texture; the FNs are less distinctive in intensity and shape; the FPs look more like mitotic figures than the FNs do. On the AMIDA13 dataset the approach (CCIPD/MINDLAB) reaches an F-measure of 0.319, ranking 6th among 14 submissions.]
  • 64. Fabio A. González Universidad Nacional de Colombia Deep learning for cancer diagnosis. [Figure 1 from the same paper, the workflow: blue-ratio thresholding first segments mitosis candidates. On each segmented blob, handcrafted features are extracted and classified with a Random Forests classifier; in parallel, on each segmented 80×80 patch, a convolutional neural network (CNN) is trained with a fully connected regression model as part of the classification layer. Candidates that are difficult to classify (ambiguous CNN result) go to a second-stage Random Forests classifier trained on the combined CNN-derived and handcrafted features; the final decision is a consensus of the three classifiers' predictions.]
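The cascade itself is simple to express. The sketch below is our reading of the workflow, not the authors' code: the thresholds and the exact fusion rule are assumptions, chosen only to illustrate how an ambiguous CNN score can trigger the second-stage Random Forests classifier:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fuse(hc_prob, cnn_prob, stage2: RandomForestClassifier,
         hc_feats, cnn_feats, low=0.3, high=0.7):
    """Consensus of three classifiers for one mitosis candidate.

    hc_prob / cnn_prob: mitosis probabilities from the handcrafted-feature
    classifier and the CNN. `low`/`high` are hypothetical confidence bounds.
    """
    # Confident CNN score: average the two first-stage probabilities
    if cnn_prob < low or cnn_prob > high:
        return 0.5 * (hc_prob + cnn_prob)
    # Ambiguous CNN score: second-stage RF on the concatenated features
    x = np.concatenate([hc_feats, cnn_feats]).reshape(1, -1)
    p2 = stage2.predict_proba(x)[0, 1]
    return np.mean([hc_prob, cnn_prob, p2])   # consensus of three classifiers
```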
  • 65. Fabio A. González Universidad Nacional de Colombia Deep learning for cancer diagnosis. [Slide repeats the workflow diagram: HPF, segmentation, handcrafted features into Classifier 1, feature learning into Classifier 2, a second-stage Classifier 3, and probabilistic fusion of the three.]
  • 66. Fabio A. González Universidad Nacional de Colombia Efficient DL over whole slide pathology images ~2000x2000 pixels
  • 67. Fabio A. González Universidad Nacional de Colombia Efficient DL over whole slide pathology images ~2000x2000 pixels
  • 68. Fabio A. González Universidad Nacional de Colombia Efficient DL over whole slide pathology images ~2000x2000 pixels
  • 69. Fabio A. González Universidad Nacional de Colombia Efficient DL over whole slide pathology images ~2000x2000 pixels
  • 70. Fabio A. González Universidad Nacional de Colombia Efficient DL over whole slide pathology images ~2000x2000 pixels
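One common way to make deep models practical on ~2000x2000 slides is patch-wise evaluation. Here is a minimal NumPy sketch of that strategy (an assumption about the approach, not the authors' pipeline; `model.predict` is a hypothetical per-patch scorer):

```python
import numpy as np

def probability_map(slide, model, patch=100, stride=100):
    """Tile a large slide, score each patch, and assemble a coarse map."""
    h, w = slide.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            p = slide[i * stride:i * stride + patch,
                      j * stride:j * stride + patch]
            out[i, j] = model.predict(p)   # hypothetical per-patch scorer
    return out                             # low-resolution probability map
```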
  • 71. Fabio A. González Universidad Nacional de Colombia Exudate detection in eye fundus images
  • 72. Fabio A. González Universidad Nacional de Colombia Exudate detection in eye fundus images
  • 73. Fabio A. González Universidad Nacional de Colombia Exudate detection in eye fundus images
  • 74. Fabio A. González Universidad Nacional de Colombia Exudate detection in eye fundus images
  • 75. Fabio A. González Universidad Nacional de Colombia RNN for book genre classification. Dataset construction: the annotated dataset is built from the top tags users assign to books.
    No.  Class            Tags
    1    science_fiction  sci-fi, science-fiction
    2    comedy           comedies, comedy, humor
    ...  ...              ...
    9    religion         christian, religion, christianity, ...
  • 76. Fabio A. González Universidad Nacional de Colombia RNN for book genre classification. Dataset construction as above, plus the RNN architecture (sketched below).
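A hedged Keras sketch of the kind of RNN classifier the slide's architecture suggests (ours; the vocabulary size, embedding width, and unit counts are illustrative): word ids are embedded, an LSTM encodes the sequence, and a softmax layer picks one of the 9 genre classes:

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=20000, output_dim=128),  # word embeddings
    layers.LSTM(64),                                    # sequence encoder
    layers.Dense(9, activation="softmax"),              # 9 genre classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```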
  • 77. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video
  • 78. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video
  • 79. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video Number of sequences in different public datasets: • VOT2013: 16 • VOT2014: 25 • VOT2015: 60 • OOT: 50 • ALOV++: 315
  • 80. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video Number of sequences in different public datasets: • VOT2013: 16 • VOT2014: 25 • VOT2015: 60 • OOT: 50 • ALOV++: 315
  • 81. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video Number of sequences in different public datasets: • VOT2013: 16 • VOT2014: 25 • VOT2015: 60 • OOT: 50 • ALOV++: 315
  • 82. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video Number of sequences in different public datasets: • VOT2013: 16 • VOT2014: 25 • VOT2015: 60 • OOT: 50 • ALOV++: 315
  • 83. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video
  • 84. Fabio A. González Universidad Nacional de Colombia Deep CNN-RNN for object tracking in video
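To fix ideas on the CNN-RNN combination, here is a hedged Keras sketch (ours, not the talk's model; all sizes are illustrative) of the generic pattern: a shared CNN encodes each frame, an LSTM carries state across frames, and a dense head regresses a bounding box per frame:

```python
from tensorflow.keras import layers, models

frames = layers.Input(shape=(None, 64, 64, 3))        # (time, H, W, C)

# Per-frame CNN encoder, shared across time steps
cnn = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D(2),
    layers.Flatten(),
])

x = layers.TimeDistributed(cnn)(frames)               # per-frame features
x = layers.LSTM(64, return_sequences=True)(x)         # temporal state
boxes = layers.TimeDistributed(layers.Dense(4))(x)    # (x, y, w, h) per frame

tracker = models.Model(frames, boxes)
tracker.compile(optimizer="adam", loss="mse")         # box regression loss
```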
  • 85. Fabio A. González Universidad Nacional de Colombia The Team Alexis Carrillo Andrés Esteban Paez Angel Cruz Andrés Castillo Andrés Jaque Andrés Rosso Camilo Pino Claudia Becerra Fabián Paez Felipe Baquero Fredy Díaz Gustavo Bula Germán Sosa Hugo Castellanos Ingrid Suárez John Arévalo JorgeVanegas Jorge Camargo Jorge Mario Carrasco Joseph Alejandro Gallego José David Bermeo Juan Carlos Caicedo Juan Sebastián Otálora Katherine Rozo LadyViviana Beltrán Lina Rosales Luis Alejandro Riveros Miguel Chitiva Óscar Paruma Óscar Perdomo Raúl Ramos Roger Guzmán Santiago Pérez Sergio Jiménez Susana Sánchez Sebastián Sierra