4. 4
ARTIFICIAL NEURAL NETWORK
Each neuron is a function
A Neuron for Machine
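[Figure: a neuron with inputs x_1 … x_N, weights w_1 … w_N, bias b and output a]
$z = w_1 x_1 + w_2 x_2 + \dots + w_N x_N + b$ (b is the bias)
$a = \sigma(z)$ (activation function)
$\sigma(z) = \frac{1}{1 + e^{-z}}$ (sigmoid function)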
ANN is one type of machine learning
that's loosely based on how neurons
work in the brain, though “the actual
similarity is very minor”.
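The neuron on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the function names `sigmoid` and `neuron` and the input values are assumptions, not taken from the slides.

```python
import math

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """Weighted sum of the inputs plus the bias, passed through the sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical inputs, weights and bias, chosen only for illustration.
a = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
print(a)  # z = 0.5 - 0.5 + 0 = 0, so a = sigmoid(0) = 0.5
```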
5. 5
DEEP LEARNING (DEEP NEURAL NETWORK)
Each layer is a simple function in the production line
Cascading the neurons to form a neural network
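The 2006 paper "A fast learning algorithm for deep belief nets": if the
weights of a neural network's neurons are not assigned at random, the
network's computation time can be shortened considerably. Something
similar to unsupervised learning is used to assign the network's
initial weights.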
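Cascading neurons into layers, and layers into a network, might look roughly like the sketch below. The 3-4-2 layer sizes and the helper names are hypothetical, and the weights here are plain random values, not the unsupervised initialization described in the 2006 paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, weights, biases):
    """One layer: each neuron computes sigmoid(w . x + b) on the same input x."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, biases)]

def forward(x, params):
    """Cascade the layers: each layer's output is the next layer's input."""
    for weights, biases in params:
        x = layer(x, weights, biases)
    return x

# Hypothetical 3-4-2 network with random weights (illustration only).
random.seed(0)
sizes = [3, 4, 2]
params = [
    ([[random.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
     [0.0] * n_out)
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

y = forward([0.2, -0.1, 0.7], params)
print(len(y))  # 2 outputs, one per neuron in the last layer
```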
6. 6
COMMON DNN ARCHITECTURES
CNN – convolutional neural network, with some convolutional layers
Suited to image recognition; local connections, shared weights, pooling and the use of many layers.
RNN – recurrent neural network, LSTM (long short-term memory networks)
Suited to speech recognition and NLP (semantic understanding)
· Backpropagation
· Augment the network with an explicit memory.
Deep Learning – Review by LeCun, Bengio, and Hinton, Nature 521, 436–444 (28 May 2015)
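The CNN ideas listed on this slide (local connections, shared weights, pooling) can be illustrated with a small pure-Python sketch. The function names, the 5x5 image and the 2x2 kernel are assumptions made for this example; real toolkits implement the same operations far more efficiently.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (computed as cross-correlation).

    Local connections: each output looks at one kernel-sized patch.
    Shared weights: the same kernel is reused at every position.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; leftover border rows/columns are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# Hypothetical 5x5 image with a vertical edge, and a 2x2 edge-detecting kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d(image, kernel)   # 4x4 feature map
pooled = max_pool2d(features)      # 2x2 map after pooling
print(pooled)  # [[2, 0], [2, 0]]
```

The kernel responds only where the image changes from 0 to 1, and pooling keeps that response while shrinking the map.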
7. 7
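TOOLKITS FOR TRAINING DEEP LEARNING MODELS
Item | Use | License | Notes
Kaldi | Speech recognition | Apache 2.0 | Open Source Project
Caffe | Image recognition | BSD license | Berkeley
Torch | Image recognition, speech recognition, natural language | BSD license | Facebook
Theano | Speech recognition, natural language | BSD license | Université de Montréal
TensorFlow | Speech recognition, image recognition, natural language | Apache 2.0 | Google
CNTK | Image recognition, speech recognition, natural language | Microsoft's own license agreement | Microsoft

Item | Use | Notes
CUDA | Parallel computing architecture | NVIDIA
cuDNN | GPU-accelerated library of primitives for deep neural networks (used by Caffe, TensorFlow, Theano, Torch and CNTK) | NVIDIA
DIGITS | Integrates existing development tools (Caffe, Torch) to simplify DNN design, training and visualization | NVIDIA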
8. 8
COMPARISON OF DEEP LEARNING FRAMEWORKS
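Criterion | Caffe | CNTK | TensorFlow | Theano | Torch
Model coverage | ★★★ | ★★ | ★★★★☆ | ★★★★☆ | ★★★★★
Deployability | ★★★★★ | ★★★★☆ | ★★★★☆ | ★★★ | ★★★
Ease of modifying the architecture | ★★★ | - | ★★★★★ | ★★★ | ★★★★★
Interface usability | ★★★ | ★★☆ | ★★★★☆ | ★★★★ | ★★★★
Interfaces | C++/CMD, Python/Matlab | CMD/C++, Python/.NET | Python, C++ | Python | Lua/LuaJIT, C
Original implementation language | C++/Python | C++ | C++/Python | Python | C/Lua
Distributed support | No | Yes | Yes | No | No
License | BSD 2-Clause | Free | Apache 2.0 | BSD | BSD
Backer / original author | Berkeley | Microsoft | Google | Université de Montréal | Facebook, Twitter, Google (before)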
https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
https://github.com/zer0n/deepframeworks
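• TensorFlow performs well in both model coverage and deployment, is backed by Google, and its community and learning resources are growing quickly.
• Torch and Theano are mature, with plenty of shared online resources to learn from; Caffe has a large user base in the image domain.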
13. 13
GMM-HMM VS. DNN-HMM
GMM-HMM: Hand-crafted feature extraction is needed.
Each box is a simple function in the production line; only the GMM is
learned from data.
DNN-HMM: Construct very large context-dependent output units in the DNN.
Make decoding of such huge networks highly efficient using HMM technology.
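How a DNN and an HMM can work together at decoding time is sketched below. The slides do not spell out the recipe, so this assumes the commonly used hybrid approach: DNN state posteriors are divided by state priors to obtain scaled likelihoods, which then act as HMM emission scores in Viterbi decoding. The two-state model and all probabilities are hypothetical toy values, whereas a real system has thousands of context-dependent output units.

```python
import math

def viterbi(log_emis, log_trans, log_init):
    """Most likely HMM state path given per-frame log emission scores."""
    n_states = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    back = []
    for frame in log_emis[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical DNN outputs: P(state | frame) for 4 frames and 2 states.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
priors = [0.5, 0.5]                      # assumed state priors
trans = [[0.7, 0.3], [0.0001, 0.9999]]   # assumed, nearly left-to-right
init = [0.99, 0.01]

# Assumed hybrid step: scaled likelihood = posterior / prior (log domain).
log_emis = [[math.log(p / q) for p, q in zip(frame, priors)]
            for frame in posteriors]
log_trans = [[math.log(p) for p in row] for row in trans]
log_init = [math.log(p) for p in init]

path = viterbi(log_emis, log_trans, log_init)
print(path)  # [0, 0, 1, 1]
```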
20. 20
SCENE RECOGNITION DATA AND MODELS
• Training data
– Places365 Dataset
365 classes
Standard (1.8M images)
Challenge (8M images)
• Neural network models (recognizers)
– AlexNet (8 layers)
– VGGNet (16 layers)
[Figure: layer diagrams of the two networks.
AlexNet: Input, 5 Conv, 3 Pool, 3 FC, Softmax.
VGGNet: Input, 13 Conv, 5 Pool, 3 FC, Softmax.]
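The 8-layer and 16-layer figures quoted for AlexNet and VGGNet count only layers with learned weights (Conv and FC). A small sketch makes the arithmetic explicit. The layer orderings follow the published AlexNet and VGG-16 architectures, with activation and normalization layers left out, and the helper name `weighted_depth` is hypothetical.

```python
# Layer sequences as published for AlexNet and VGG-16 (activation and
# normalization layers omitted; an illustration, not a trainable model).
ALEXNET = ["Conv", "Pool", "Conv", "Pool", "Conv", "Conv", "Conv", "Pool",
           "FC", "FC", "FC", "Softmax"]
VGG16 = (["Conv"] * 2 + ["Pool"] + ["Conv"] * 2 + ["Pool"]
         + (["Conv"] * 3 + ["Pool"]) * 3
         + ["FC"] * 3 + ["Softmax"])

def weighted_depth(layers):
    """Count only layers with learned weights (Conv and FC)."""
    return sum(layer in ("Conv", "FC") for layer in layers)

print(weighted_depth(ALEXNET))  # 8
print(weighted_depth(VGG16))    # 16
```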
36. 36
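CONCLUSION
Key factors behind the breakthroughs in deep learning:
• Data: Amount of available (big) data from people, sensors and devices.
• Talent: Improvement in algorithms and combination of algorithms.
• Equipment: Specialized hardware (GPU/FPGA/ASIC HPC)
The license plate and speech recognition modules are being replaced step by step with DNN-based approaches.
Audio-visual scene detection continues to work toward goals such as object detection, event detection and Visual Question Answering.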