Convolutional Neural Network


APPLICATION OF CONVOLUTIONAL NEURAL NETWORK IN IMAGE
CLASSIFICATION


Name


Course


Professor's Name


Institution


Location of Institution


Date


ABSTRACT


Thanks to its broad applications in fields as diverse as smart surveillance and tracking, health and
medicine, sports and entertainment, robots, drones, and self-driving cars, computer vision has
become increasingly popular and successful in recent years. The basic building blocks of each of
these applications are image-processing tasks such as image classification, localization, and
detection. Recent advances in Convolutional Neural Networks (CNNs) have produced excellent
results in these visual recognition tasks and systems. Consequently, CNNs are now at the heart of
deep learning algorithms for computer vision. This work is intended for anyone who wants to
learn the principles behind CNNs and gain hands-on experience with CNNs in image processing.
It gives a thorough overview of CNNs, beginning with the fundamental principles of neural
networks: training, regularization, and optimization in image processing. It also demonstrates
the effectiveness of CNNs compared with other image classification algorithms such as the
support vector machine.


CONTENTS


ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 Background of the Study
CHAPTER 2 ARTIFICIAL NEURAL NETWORK
2.1 Artificial Neural Network
2.2 Artificial Neuron
2.3 Weight, Biases and activation functions
2.3.1 Weight and Structure of a Neuron
2.3.2 Bias
2.3.3 Activation function and the ReLU
2.4 Back Propagation
2.5 Loss Function
2.6 Gradient Descent
2.7 Learning Rate
CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK
3.1 Convolutional Neural Network Architecture
3.2 Convolutional Layers
3.3 Pooling Layers
3.4 Fully Connected Layers
3.5 Models for Composing CNN in Image Classification
3.5.1 Classification and Localization
3.5.2 Semantic Segmentation
3.5.3 Object Detection
3.5.4 Instance Segmentation
CHAPTER 4 CONCEPTUAL FRAMEWORK AND LITERATURE REVIEW
4.1 Literature Overview
4.2 Case Study 1. Convolutional Neural Networks for Image Processing
4.2 Case Study 2. Deep Convolutional Neural Networks for Hyperspectral Image Classification.
4.3 Case Study 3. Convolutional neural networks: an overview and application in radiology
4.4 Case Study 4. Evaluating the performance of convolutional neural networks with direct acyclic graph architectures in automatic segmentation of breast lesion in US images
4.5 Conclusion
CHAPTER 5 DESIGN
5.1 Methodology
5.2 Stages Of Development
5.2.1 Feasibility Study - Stage 0.
5.2.3 Requirements Specification - Stage 3.
5.2.4 Logical System Specification – Stage 4&5.
5.2.5 Physical Design – Stage 6.
5.3 Reasons for Choosing SSADM
5.4 Comparison of SSADM With Other Methodologies
a) Waterfall Model
b) Iterative Model
5.5 Research Methods
5.5.1 Techniques for data collection.
CHAPTER 6 IMPLEMENTATION
6.1 Hardware and Software Used
6.2 Definitions
6.2.1 Train, Validation, and Test
6.2.2 Overfitting and Underfitting
6.2.3 Batch Size
6.2.4 Epoch
6.2.5 Dropout
6.2.6 Batch Normalization
6.3 Modelling And Results
6.3.1 First Model
6.3.2 Second Model
6.3.3 Third Model
6.4 CNN and SVM comparison
CHAPTER 7 CONCLUSION
CHAPTER 8 RECOMMENDATION AND FURTHER WORK
8.1 Tune Parameters
8.2 Image Data Augmentation
8.3 Deeper Network Topology
8.4 Handle Overfitting and Overfitting Problem
APPENDICES


CHAPTER 1 INTRODUCTION


1.1 Background of the Study


Artificial intelligence (AI) has become increasingly common in recent years. One of the tasks
that AI can help with is computer vision, the ability of a machine to see: computers process and
analyze images to simulate human vision. Image recognition is one of the most important tasks
in computer vision. In image classification, for example, pictures of several items must be
sorted into "groups," such as "car," "plane," "ship," or "house."


Convolutional neural networks are a popular method for image classification. They are a form of
deep learning, which is implemented using neural networks. Deep learning is a subset of machine
learning, which is itself a subset of AI.


First, the University of Edinburgh's CINIC-10 dataset is used to show how to apply
convolutional neural networks to image classification and how to achieve more accurate results
by varying different factors. CIFAR-10 and ImageNet are two well-known image classification
datasets, and CINIC-10 is a mixture of the two.


Second, a dataset comprising the Salinas, University of Pavia, and Indian Pines scenes is used to
show the effectiveness of convolutional neural networks in image classification compared with
the support vector machine. The support vector machine was chosen because it has been known
for many years to be a very effective algorithm for image classification (Gulli 2021).


CHAPTER 2 ARTIFICIAL NEURAL NETWORK


Since artificial neurons are greatly inspired by human neurons, it is important to understand how
human neurons work.


Figure 1: A diagram of the neuron showing the structure between the axon and dendrite.


When a neuron fires, normally in response to a stimulus, signals are sent down its axon to the
dendrites of another neuron through a synapse. The receiving neuron may then fire, causing
another neuron to fire, and the process repeats throughout the system.


2.1 Artificial Neural Network


An artificial neural network (ANN) is a set of layers of neurons (referred to as units or nodes in
this context). Each unit in one layer is connected to each unit in the next layer.


Figure 2: The artificial neural network architecture


The network receives all the information it needs, in this case the images to identify, through an
input layer. Hidden layers sit between the input and output layers. Each hidden layer detects a
different set of features in an image, ranging from simple to complex. The first hidden layer, for
example, detects edges and lines, the second detects curves, and the third detects specific image
features, such as a face or a wheel.


The network makes predictions in the output layer. The predicted image categories are compared
to human-provided labels. If they are wrong, the network corrects its learning using a technique
called backpropagation (discussed later in this chapter) so that it can make better guesses in the
next iteration. After enough training, a network can make classifications on its own, without the
need for human intervention.


2.2 Artificial Neuron


In an artificial neural network, an artificial neuron is a connection point (unit or node) that
processes input signals and generates output signals.


2.3 Weight, Biases and activation functions


2.3.1 Weight and Structure of a Neuron


In a neural network, the connections between the units are weighted: each weight indicates how
much the input from a previous unit influences the output of the next unit. To compute the
output of an artificial neuron, multiply each input (x1 to xn) by its corresponding weight (w1 to
wn), add up all the products, add a bias (b), and feed the resulting value into an activation
function (f).
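
The computation described above, y = f(w1*x1 + ... + wn*xn + b), can be sketched in a few lines of Python. This is only an illustration: the function names and the input, weight, and bias values are assumptions chosen for the example, and the Sigmoid is used as the activation function simply because it is mentioned in this chapter.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Illustrative values only (not taken from the thesis).
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.75

# Weighted sum: 0.5 - 2.0 + 0.75 + 0.75 = 0.0, and sigmoid(0.0) = 0.5.
y = neuron_output(x, w, b, sigmoid)
print(y)
```

Any other activation function with the same one-argument signature can be passed in place of `sigmoid`.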




Figure 3: A diagram to show the work of a neuron: input x, weights w, bias b, activation function
f.


2.3.2 Bias


A bias (b) is an additional input to a neuron that is technically the number 1 multiplied by a
weight. The bias allows the activation function curve to be shifted left or right on the coordinate
graph, allowing the neuron to produce the desired output value.


Figure 4: A bias value allows the activation function to shift to the left or right.


As Figure 4 illustrates, when the input (x) is 2, a bias value of 5 shifts the Sigmoid activation
function so that its output is approximately 0.


2.3.3 Activation function and the ReLU


An activation function, by definition, determines whether or not a neuron should be activated
(“fired”). It makes a neuron's output nonlinear. Without activation functions, a neural network
is nothing more than a linear regression model. The ReLU, A(x) = max(0, x), is the most
common activation function for CNNs (13) and the one used in this thesis (14). When x is
positive, it outputs x; otherwise, it outputs 0.


Figure 5: The ReLU function.
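
As a minimal sketch, the ReLU definition A(x) = max(0, x) translates directly into code. The sample inputs below are arbitrary values chosen for illustration.

```python
def relu(x):
    """Rectified linear unit: A(x) = max(0, x)."""
    return max(0.0, x)


# Negative inputs are clipped to 0; positive inputs pass through unchanged.
samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
activations = [relu(v) for v in samples]
print(activations)  # [0.0, 0.0, 0.0, 0.5, 2.0]
```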


Since the mathematical operation is simpler and the activation is sparser, ReLU is less
computationally costly than other popular activation functions such as tanh and Sigmoid.
Because the function returns 0 whenever x is less than zero, there is a good chance that a given
unit will not activate at all. Sparsity also means less noise and overfitting, as well as more
concise models with higher predictive power. Neurons in a sparse network are more likely to
process useful aspects of the data. A neuron that recognizes human faces, for example, should
not be triggered if the picture actually shows a house.


Another advantage of ReLU over the other functions is that it converges more quickly. Its
linearity (when x > 0) means that the slope does not change: as x rises, the function does not
reach a plateau. As a result, ReLU does not suffer from the vanishing gradient problem that
affects other activation functions such as Sigmoid or tanh.
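
The difference in slope can be checked numerically. The sketch below uses the standard derivatives of the three functions and prints them for a few increasingly large inputs; the sample points are arbitrary. The Sigmoid and tanh derivatives shrink toward zero as x grows, while the ReLU derivative stays at 1.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the Sigmoid: s(x) * (1 - s(x)), never larger than 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (0.5, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  "
          f"tanh'={tanh_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```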


The Softmax function is another common activation function in CNNs. It is frequently used in
the output layer, where multiclass classification is performed. However, the mathematics of this
function is beyond the scope of this thesis.


2.4 Back Propagation


Backpropagation is an algorithm that helps neural networks learn their parameters, primarily
from prediction errors. This chapter uses gradient descent to illustrate backpropagation.


2.5 Loss Function


A loss function is an error measure: a way of quantifying how inaccurate a system's predictions
are. The goal of deep learning models is to minimize the value of this loss function, a process
known as optimization.
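
As a small illustration, the sketch below computes one common loss function, the mean squared error, for two sets of predictions. The choice of mean squared error and all of the numbers are assumptions made for the example; the point is only that predictions closer to the targets yield a lower loss.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Illustrative values only.
targets = [1.0, 0.0, 1.0]
good_predictions = [0.9, 0.1, 0.8]   # close to the targets
bad_predictions = [0.2, 0.9, 0.3]    # far from the targets

good_loss = mean_squared_error(targets, good_predictions)
bad_loss = mean_squared_error(targets, bad_predictions)
print(good_loss, bad_loss)
```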


2.6 Gradient Descent


Gradient descent is an optimization algorithm that changes the internal state of the system,
adjusting the weights of the neural network to minimize the loss function value. The algorithm
tries to reduce the loss by adjusting the weights after each iteration until further tweaks produce
little to no change in the value of the loss function, a state known as convergence.
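
The loop described above can be sketched on a toy problem with a single weight. The loss L(w) = (w - 3)^2, the starting point, the step size, and the tolerance are all assumptions chosen for illustration; a real network applies the same idea to many weights at once.

```python
def gradient_descent(loss, grad, w0, learning_rate=0.1, tolerance=1e-12,
                     max_iters=10_000):
    """Step against the gradient until the loss stops changing noticeably."""
    w = w0
    previous = loss(w)
    for step in range(1, max_iters + 1):
        w -= learning_rate * grad(w)          # adjust the weight
        current = loss(w)
        if abs(previous - current) < tolerance:
            return w, step                    # convergence reached
        previous = current
    return w, max_iters


def toy_loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def toy_grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w_final, steps = gradient_descent(toy_loss, toy_grad, w0=0.0)
print(w_final, steps)
```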


2.7 Learning Rate


In gradient descent or other optimization algorithms, a learning rate is the step size of each
iteration. Convergence will take a long time if the learning rate is too low, but there may be no
convergence at all if the learning rate is too high.
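
The effect of the step size can be seen on the same kind of toy loss, L(w) = (w - 3)^2. The three learning rates below are arbitrary values picked to show the three regimes: too low (slow progress), reasonable (converges), and too high (diverges).

```python
def distance_after_training(learning_rate, steps=50, w0=0.0):
    """Run gradient descent on L(w) = (w - 3)^2 and return |w - 3| at the end."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2.0 * (w - 3.0)   # gradient of the toy loss
    return abs(w - 3.0)


for lr in (0.001, 0.1, 1.1):
    distance = distance_after_training(lr)
    print(f"learning rate {lr}: distance from the minimum after 50 steps = {distance:.4g}")
```

With the smallest rate the weight has barely moved after 50 steps, with the middle rate it is very close to the minimum, and with the largest rate each step overshoots and the distance grows.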


CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK


3.1 Convolutional Neural Network Architecture


A convolutional neural network is a deep neural network used in image processing that takes
images as input and learns features from the data. Any color image is divided into three
channels (red, green, and blue), each of which is nothing more than a matrix of pixel values.
Mathematical operations such as convolutions and pooling are applied to the previous output to
create new layers. Convolutions are used to extract features, and pooling is used to reduce the
network's complexity. For classification, the output matrix is flattened into a single vector and
attached to a fully connected layer.


Figure 10: CNN Architecture


The connectivity pattern between neurons in convolutional networks was inspired by biological
processes, in that it resembles the organization of the animal visual cortex. Individual cortical
neurons respond to stimuli only in the receptive field, which is a small portion of the visual
field. The receptive fields of different neurons partly overlap, so that together they cover the
entire visual field. Our vision is based on multiple cortical layers, each of which recognizes
increasingly structured information. Single pixels are seen first, followed by basic geometric
forms and then more complex elements such as shapes, faces, human beings, animals, and so on.


3.2 Convolutional Layers


The mathematical combination of two functions to form a third function is referred to as
"convolution." When this occurs, two sets of data are combined. In CNNs, a filter (also known
as a kernel) is applied to the input data in a convolutional layer to generate a feature map.


Figure 9: A convolutional layer, in which the filter slides over the input and writes its output to
the new layer.


An element-wise multiplication is performed between a 3x3 filter matrix and a 3x3 region of the
input image's matrix. The output value (“destination pixel”) (16) on the feature map is the sum
of the elements of the resulting matrix. The filter then slides over the input matrix and
completes the feature map by repeating the same multiplication and summation for each
remaining 3x3 region.
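
The sliding-filter operation can be sketched in plain Python as follows. The 5x5 input and the 3x3 filter are illustrative values, and the sketch assumes a stride of 1 with no padding. As in most CNN libraries, the filter is not flipped, so strictly speaking this is a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the filter element-wise with the patch under it, then sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Illustrative 5x5 input and 3x3 filter.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5x5 input and a 3x3 filter give a 3x3 feature map, because the filter fits in three positions along each axis.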


3.3 Pooling Layers


Pooling layers reduce the dimensionality of feature maps, specifically the height and width,
while maintaining the depth. This is advantageous because it reduces the amount of computing
power needed to process the data while retaining the most important features in the feature
maps.


Pooling layers are divided into two categories: max pooling and average pooling.


Figure 10: Types of Pooling.


Max pooling returns the maximum value of the elements in the portion of the image covered by
the filter, while average pooling returns their average value. Max pooling is more effective at
extracting dominant features and is therefore usually preferred.
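
Both pooling types can be sketched with one small function. The example assumes non-overlapping 2x2 windows (stride equal to the window size), and the 4x4 feature map is made up for illustration.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Apply non-overlapping size x size pooling (stride equal to size)."""
    if mode == "max":
        combine = max
    elif mode == "average":
        combine = lambda values: sum(values) / len(values)
    else:
        raise ValueError("mode must be 'max' or 'average'")

    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the values covered by the current window.
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(combine(window))
        pooled.append(row)
    return pooled


# Illustrative 4x4 feature map.
fm = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 4, 3, 8],
]

max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="average")
print(max_pooled)  # [[6, 4], [7, 9]]
print(avg_pooled)  # [[3.75, 2.25], [3.5, 5.0]]
```

In both cases the height and width are halved, from 4x4 to 2x2.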


3.4 Fully Connected Layers


The classification takes place in fully connected layers. The input matrix is flattened into a
column vector and fed into a series of fully connected layers, similar to the fully connected ANN
architecture described previously. Each fully connected layer (called a Dense layer in Keras) is
followed by an activation function (such as tanh or ReLU), while the output Dense layer is
followed by Softmax. Cross-entropy (categorical cross-entropy in Keras) is the loss function
used with Softmax in multiclass classification. The Softmax function returns an N-dimensional
vector, where N is the number of classes from which the CNN must choose. Each number in this
vector represents the probability that the image belongs to the corresponding class. For example,
if the output vector is [0, 0.1, 0.1, 0.75, 0, 0, 0, 0, 0, 0.05], there is a 10% chance that the image
belongs to class 2, a 10% chance that it belongs to class 3, a 75% chance that it belongs to
class 4, and a 5% chance that it belongs to class 10.
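
For readers who want to see how such a probability vector arises, the sketch below applies the standard Softmax formula to a few raw scores and computes the cross-entropy loss for one true class. The scores and the three-class setup are assumptions made for the example and are not results from this thesis.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probabilities, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probabilities[true_class])


# Illustrative raw scores for three classes.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
predicted_class = probs.index(max(probs))     # the class with the highest probability

print(probs, predicted_class)
print(categorical_cross_entropy(probs, true_class=0))
```

The loss is small when the network assigns a high probability to the correct class and grows as that probability falls.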


3.5 Models for Composing CNN in Image Classification


To solve several complex tasks, the simple CNN architecture can be composed and expanded in
a variety of ways.


3.5.1 Classification and Localization


In the classification and localization task, you must report not only the type of object contained
in the image but also the coordinates of the bounding box where the object appears in the image.
This task assumes that an image contains only one instance of an object.


In a standard classification network, this can be accomplished by adding a "regression head" in
addition to the "classification head." Recall that, in a classification network, the final output of
the convolution and pooling operations, called the feature map, is fed into a fully connected
network that generates a vector of class probabilities. The classification head is this fully
connected network, and it is trained using a categorical loss function (Lc) such as categorical
cross-entropy (Gulli 2021).


A regression head is a fully connected network that takes the feature map and generates a vector
(x, y, w, h) that represents the top-left x and y coordinates, as well as the bounding box's width
and height. It is trained using a continuous loss function (Lr), such as mean squared error. The
entire network is trained using a linear combination of the two losses, i.e.


L = αLc + (1 - α)Lr


Here α is a hyperparameter that can take a value between 0 and 1. It can be set to 0.5 unless
some domain knowledge about the problem suggests a different value. A typical classification
and localization network architecture is depicted in the diagram below. As you can see, the only
difference from a standard CNN classification network is the additional regression head on the
top right:


Figure 4: Architecture for Classification and Localization
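
The combined loss used to train this network can be sketched as follows. The classification loss value and the two bounding boxes are made-up numbers, and mean squared error is assumed for the regression head, as suggested in the text.

```python
def combined_loss(classification_loss, regression_loss, alpha=0.5):
    """L = alpha * Lc + (1 - alpha) * Lr, with alpha between 0 and 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * classification_loss + (1.0 - alpha) * regression_loss


def box_mse(predicted_box, true_box):
    """Mean squared error over the (x, y, w, h) values of a bounding box."""
    return sum((p - t) ** 2 for p, t in zip(predicted_box, true_box)) / len(true_box)


# Illustrative values only.
loss_c = 0.40                                  # e.g. a categorical cross-entropy value
loss_r = box_mse((48.0, 52.0, 100.0, 80.0),    # predicted (x, y, w, h)
                 (50.0, 50.0, 100.0, 84.0))    # ground-truth (x, y, w, h)

total = combined_loss(loss_c, loss_r, alpha=0.5)
print(loss_r, total)
```

Raising alpha makes the network prioritize getting the class right; lowering it prioritizes the bounding box.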


3.5.2 Semantic Segmentation


The goal here is to assign a single class to each pixel of the image. A first approach might be to
build a classifier network for each pixel, with the input being a small neighborhood surrounding
that pixel. In practice, this method is inefficient, so a better alternative may be to run the image
through convolutions that increase the feature depth while keeping the image width and height
constant. Each pixel then has a feature map that can be sent through a fully connected network
to predict the pixel's class. In practice, however, this is also very costly, and it is seldom used.


A third method is to use a CNN encoder-decoder network, in which the encoder reduces the
image's width and height while increasing its depth (number of features), while the decoder uses
transposed convolution operations to increase the image's size while decreasing its depth.


Transposed convolution (a form of upsampling) is the operation that works in the opposite
direction of a normal convolution. The image is the input to this network, and the segmentation
map is the output (Gulli 2021).
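
How the encoder shrinks the image and the decoder restores it can be checked with the standard output-size formulas for convolution and transposed convolution. The 128-pixel input, the three layers per side, and the kernel, stride, and padding values are assumptions chosen for the example, not the configuration of any particular network.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def transposed_conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a transposed convolution: (size - 1) * s - 2p + k."""
    return (size - 1) * stride - 2 * padding + kernel


size = 128
encoder_sizes = [size]
for _ in range(3):                     # encoder: each layer halves width and height
    size = conv_output_size(size, kernel=4, stride=2, padding=1)
    encoder_sizes.append(size)

decoder_sizes = [size]
for _ in range(3):                     # decoder: each layer doubles width and height
    size = transposed_conv_output_size(size, kernel=4, stride=2, padding=1)
    decoder_sizes.append(size)

print(encoder_sizes)  # [128, 64, 32, 16]
print(decoder_sizes)  # [16, 32, 64, 128]
```

The decoder ends at the original resolution, which is what allows the output to be a per-pixel segmentation map.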


A common implementation of this encoder-decoder architecture is the U-Net (a good
implementation is available at https://github.com/jakeret/tf_unet), which was originally designed
for biomedical image segmentation and has additional skip-connections between corresponding
layers of the encoder and decoder. The U-Net architecture is depicted in the diagram below:




Figure 10: Semantic Segmentation


3.5.3 Object Detection


The object detection task is similar to the classification and localization task. The main
difference is that there are now several objects in the image, and we must determine the class and
bounding box coordinates for each one. Furthermore, neither the number nor the size of the
objects is known ahead of time. As you would expect, this difficult problem has prompted a
significant amount of research. A first solution might be to take several random crops of the
input image and apply the classification and localization network discussed earlier to each crop.
However, such an approach wastes a lot of computing power and is unlikely to perform well. A
more practical approach is to use a method like Selective Search, which uses conventional
computer vision techniques to identify areas in the image that may contain objects.


Figure 10: Object Detection


These areas are known as "Region Proposals," and the network built around them was called the
Region-based CNN, or R-CNN. In the original R-CNN, the regions were resized and fed into a
network to produce image vectors. The vectors were then classified using an SVM-based
classifier, and the bounding boxes suggested by the external tool were corrected using a linear
regression network over the image vectors. An R-CNN network can be represented conceptually
as follows:


Figure 10: R-CNN Network Architecture


Fast R-CNN was the next version of the R-CNN network. Instead of feeding each region
proposal through the CNN, Fast R-CNN feeds the entire image through the CNN, and the region
proposals are projected onto the resulting feature map. Each region of interest is fed through a
Region of Interest (ROI) pooling layer and then into a fully connected network, which generates
a feature vector for the ROI.


ROI pooling is a common operation in object detection with convolutional neural networks. The
ROI pooling layer uses max pooling to convert the features inside any valid region of interest
into a small feature map with a fixed spatial extent of H × W (where H and W are two
hyperparameters).
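
A simplified version of ROI max pooling can be sketched as follows. The feature map, the region coordinates, and the way the region is split into bins (floor for the start of a bin, ceiling for its end) are assumptions for illustration; library implementations differ in how they round bin boundaries.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool roi = (top, left, bottom, right), end-exclusive, to out_h x out_w."""
    top, left, bottom, right = roi
    roi_h, roi_w = bottom - top, right - left
    pooled = []
    for i in range(out_h):
        # Row range of bin i inside the region.
        r0 = top + (i * roi_h) // out_h
        r1 = top + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            # Column range of bin j inside the region.
            c0 = left + (j * roi_w) // out_w
            c1 = left + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Illustrative 6x6 feature map.
fm = [
    [1, 2, 0, 4, 3, 1],
    [5, 1, 7, 2, 0, 2],
    [0, 3, 2, 9, 1, 4],
    [6, 2, 1, 0, 8, 3],
    [1, 4, 5, 2, 2, 7],
    [3, 0, 2, 6, 1, 5],
]

pooled = roi_max_pool(fm, roi=(1, 0, 5, 6), out_h=2, out_w=3)
print(pooled)  # [[5, 9, 4], [6, 5, 8]]
```

Whatever the size of the region, the output always has H × W entries (here 2 × 3), which is what allows regions of different sizes to feed the same fully connected network.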


The feature vector is then fed into two fully connected networks, one of which predicts the class
of the ROI and the other of which corrects the bounding box coordinates of the proposal. This is
illustrated below:


Figure 10: Fast R-CNN Network Architecture


Fast R-CNN is about 25 times faster than R-CNN. The next improvement, known as Faster
R-CNN (an implementation can be found at), replaces the external region proposal mechanism
with a trainable component within the network called the Region Proposal Network (RPN). As
shown below, the output of this network is combined with the feature map and passed through a
pipeline similar to that of the Fast R-CNN network. The Faster R-CNN network is approximately
10 times faster than the Fast R-CNN network, making it roughly 250 times faster than an R-CNN
network (Gulli 2021).




Figure 10: Faster R-CNN Network Architecture


Single Shot Detectors (SSD), such as You Only Look Once (YOLO), are a slightly different type
of object detection network. In these networks, each image is divided into a predetermined
number of cells using a grid. YOLO uses a 7x7 grid, resulting in 49 subimages. A predetermined
set of crops with different aspect ratios is applied to each subimage. Given B bounding boxes
and C object classes, the output for each image is a vector of size 7 * 7 * (5B + C). Each grid
cell holds prediction probabilities for the various objects detected inside it, as well as a
confidence score and coordinates (x, y, w, h) for each bounding box.
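
The size of this output vector is easy to compute. In the sketch below, the 7x7 grid comes from the text, while B = 2 boxes per cell and C = 20 classes are assumed values used only to make the arithmetic concrete.

```python
def yolo_output_size(grid=7, boxes_per_cell=2, num_classes=20):
    """Length of the output vector: grid * grid * (5B + C)."""
    # Each box contributes 5 numbers: x, y, w, h, and a confidence score.
    per_cell = 5 * boxes_per_cell + num_classes
    return grid * grid * per_cell


size = yolo_output_size()      # 7 * 7 * (5 * 2 + 20)
print(size)                    # 1470
```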


The YOLO network is a CNN that performs this transformation. The results from this vector are
combined to find the final predictions and bounding boxes. In YOLO, the bounding boxes and
associated class probabilities are predicted by a single convolutional network. YOLO is the
fastest of these object detection approaches, but the algorithm can miss smaller objects.


3.5.4 Instance Segmentation


Instance segmentation is similar to semantic segmentation, the process of associating each pixel
of an image with a class label, with a few key differences. First, it must differentiate between
different instances of the same class in an image. Second, it is not necessary to label every pixel
in the image. In some ways, instance segmentation is also similar to object detection, except that
instead of bounding boxes we are looking for a binary mask that covers each object.


This second idea leads to the intuition behind the Mask R-CNN network. The Mask R-CNN is a
Faster R-CNN with an additional CNN in front of its regression head that takes the bounding box
coordinates reported for each ROI as input and converts them to a binary mask.


Figure 11: Mask R-CNN Network Architecture


CHAPTER 4 CONCEPTUAL FRAMEWORK AND LITERATURE REVIEW


4.1 Literature Overview


Hand-crafted feature extraction techniques, such as texture analysis, followed by traditional
machine learning classifiers, such as random forests and support vector machines, have been
used in radiomics studies for many years. When it comes to image recognition, there are a few
distinctions to be made between such approaches and CNN. First, CNN does not require
hand-crafted feature extraction. Second, CNN architectures do not necessarily require human
experts to segment tumors or organs. Third, since there are millions of learnable parameters to
estimate, CNN is much more data-hungry and computationally costly, necessitating the use of
graphics processing units (GPUs) for model training (Browne and Ghidary 2003).


4.2 Case Study 1. Convolutional Neural Networks for Image Processing


The term convolutional neural network (CNN) describes an architecture for applying neural
networks to two-dimensional arrays (usually images), based on spatially localized neural input.
The sharing of weights across processing units in the CNN architecture decreases the number of
free parameters, improving the network's generalization performance. Weights are replicated
across the spatial array, resulting in inherent insensitivity to input translations, a useful
property for image classification. In the context of image processing, CNNs have a range of
distinct advantages over fully connected and unconstrained neural network architectures.
When images are provided directly as input to the network, the number of free parameters can
easily become unmanageable unless a specialized architecture is used. Traditional neural network
applications may work around this problem by relying on extensive pre-processing of images to
bring them into a usable format. However, this results in a hybrid two-stage architecture in
which the pre-processing stage does most of the "interesting" work and is, of course, hard-wired
and non-adaptive (Browne and Ghidary 2003).


Unstructured neural networks have no built-in invariance to translations or local distortions
of the inputs. Indeed, one shortcoming of fully connected architectures is that the input
topology is completely ignored, even though images have a strong 2D local structure in which
neighboring pixels are highly correlated. In general, we argue that when input data is organized
temporally or spatially, a CNN architecture is better suited than a generic neural network
(Browne and Ghidary 2003).


CNNs perform mappings between spatially or temporally distributed arrays in any number of
dimensions. They are well suited to time series, images, and video. CNNs have the following
characteristics:


• Translation invariance (the same neural weights are applied regardless of position in the
input).


• Local connectivity (neural connections only exist between spatially local regions).


• An optional gradual reduction in spatial resolution (as the number of features is gradually
increased).
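The first two characteristics can be checked numerically. The following is a minimal NumPy sketch, not code from Browne and Ghidary (2003); the 8 x 8 input, the 3 x 3 kernel, and the function name conv2d_valid are assumptions chosen for illustration. It slides one shared kernel over an image, confirms that shifting the input shifts the response by the same amount, and compares the number of weights with a fully connected layer that produces an output of the same size.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Local connectivity: each output sees only a kh x kw patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))    # illustrative input size
kernel = rng.random((3, 3))   # illustrative kernel size

response = conv2d_valid(image, kernel)

# Shift the input one pixel to the right; the response shifts with it.
shifted_response = conv2d_valid(np.roll(image, 1, axis=1), kernel)
same_up_to_shift = np.allclose(shifted_response[:, 1:], response[:, :-1])

# Weight sharing: the kernel's weights are reused at every position, whereas a
# fully connected layer needs one weight per input-output pair.
conv_weights = kernel.size
dense_weights = image.size * response.size
```

Because the same weights are reused at every position, a feature detector learned at one location responds identically to the same pattern elsewhere in the image.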
4.2 Case Study 2. Deep Convolutional Neural Networks for Hyperspectral Image
Classification.


Hu et al. (2015) found that, in comparison to other image classification algorithms, CNNs
generally need very little pre-processing. The network learns to optimize its filters (or
kernels) automatically, as opposed to the hand-engineered filters of conventional algorithms.
This lack of reliance on prior expertise or human involvement in feature extraction is a
significant benefit.


Hyperspectral imagery is captured by remote sensors that record hundreds of observation
channels with high spectral resolution. This has inspired the application of many classification
algorithms, such as K-nearest neighbors, minimum distance, and logistic regression. However,
these algorithms have proved less effective than CNNs when applied to remote sensing data.
Neural network models such as the multilayer perceptron and radial basis function networks have
also been applied to such data. Algorithms like SVM can outperform conventional neural networks
in terms of classification accuracy and computing cost, but once deep CNN architectures are
employed, the CNN proves to be a more powerful classification model than the other algorithms
and very competitive with SVM. Deep CNNs have not only outperformed other algorithms here; over
the years they have also shown promising performance in many fields and have played a vital role
in solving vision-related problems (Hu et al. 2015).
More recently, CNNs have matched or even exceeded human performance on many vision-oriented
tasks, including image classification, object detection, scene mapping, digit classification,
and face recognition. When applied to hyperspectral image (HSI) classification, the CNN has
proven to be an effective and preferable way to learn visual representations. The figure below
represents hyperspectral data with hundreds of spectral channels. The curve for each class has
its own visual shape. Although some of these differences are hard to distinguish with the human
eye, a CNN can achieve better results than humans, and as a result the CNN has proven to be one
of the best techniques for HSI classification (Hu et al. 2015).


Figure 12: HSI classification
4.3 Case Study 3. Convolutional neural networks: an overview and application in radiology


In medical image analysis, classification with deep learning typically takes target lesions
depicted in medical images and assigns them to two or more classes. Deep learning, for example,
is commonly used to classify lung nodules on computed tomography (CT) images as benign or
malignant, as seen below. For efficient classification using a CNN, a large amount of training
data with corresponding labels is needed. CT images of lung nodules and their labels (i.e.,
benign or cancerous) are used as training data for lung nodule classification. Below are two
examples from lung nodule classification training, one for a benign lung nodule and the other
for primary lung cancer.




Figure 13: CNN in radiology
4.4 Case Study 4. Evaluating the performance of convolutional neural networks with direct
acyclic graph architectures in automatic segmentation of breast lesion in US images


In ultrasound (US) breast images, delineating lesion contours is a vital step in breast cancer
diagnosis. Because they infiltrate the underlying tissue, malignant lesions produce irregular
contours with spiculated and angulated edges, while benign lesions produce smooth contours with
an elliptical form. Costa et al. (2019) note that, in breast imaging, the majority of existing
publications focus on using convolutional neural networks (CNNs) for segmentation and
classification of lesions in mammographic images. In their study, however, the main objective is
to assess the ability of CNNs to detect contour irregularities in breast lesions in US images.


4.5 Conclusion


The studies reviewed above indicate that convolutional neural networks improve model accuracy
in image classification. Furthermore, in fields such as radiology and hyperspectral image
classification, CNNs have proven more beneficial and more advanced than the other algorithms
employed for image classification. It is therefore reasonable to conclude that the CNN is the
best method to apply to image classification tasks.
CHAPTER 5 DESIGN


5.1 Methodology


The Structured Systems Analysis and Design Method (SSADM) was used to analyze and build the
system. According to Kendall (1988), the SSADM approach involves users during the most important
and intensive period of the development process: the first stages of development. Aside from
that, in terms of its development stages, it is close to the waterfall model. It divides
development into stages and modules. The data model is the first model it creates. The following
techniques were used:


Logical Data Modelling - the process of identifying, modeling, and documenting the system's
data. The information is divided into entities and relationships.


Data Flow Modelling - involves following the flow of data through a system. Processes, data
stores, external entities, and data movements are all examined.


Entity Behavior Modelling - involves identifying and documenting the events that affect each
entity, as well as the order in which they occur.


5.2 Stages Of Development


5.2.1 Feasibility Study - Stage 0.
Its aim is to determine whether the project's course and specifications are financially, technically,
and operationally feasible.


5.2.2 Requirement Analysis - Stage 1&2.


This stage entails examining the current situation and identifying problems and areas that need
to be improved. The second stage entails creating a range of options that meet the specified
criteria and selecting the most appropriate alternative.


5.2.3 Requirements Specification - Stage 3.


The stage aims to identify the desired system data, functions, and events.


5.2.4 Logical System Specification – Stage 4&5.


This stage aims to evaluate the technical system options and produce the logical design.


5.2.5 Physical Design – Stage 6.


The physical environment in which the system will operate is taken into account.
Figure 14: SSADM Methodology.


5.3 Reasons for Choosing SSADM


Within a systems development cycle, SSADM incorporates three approaches, each of which
complements the others:


• Logical Data Modelling


• Data Flow Modelling


• Entity Event Modelling.


Its key advantages over other methodologies are as follows:


➢ Quality improvement


➢ Detailed documentation of the development stages


➢ Reusability for similar projects that follow.


Because of this thorough examination of the information system, this approach decreases the
likelihood of information misunderstandings during the project life cycle, which is why it was
chosen for this project.


5.4 Comparison of SSADM With Other Methodologies


	


Other software development methodologies that were considered but not adopted for this project
include:


a) Waterfall Model
This is a sequential design process in which progress is viewed as a waterfall that flows steadily
downward through the phases of:


• Conception


• Initiation


• Analysis


• Design


• Construction


• Testing


• Implementation and maintenance


All of these steps flow through one another, with progress appearing to flow slowly like a
waterfall.


Advantages of Waterfall


It is simple to handle since each stage is defined by rigid deliverables and a review process.


There is no overlapping since phases are processed and completed one at a time.


Disadvantages of waterfall


It is difficult to predict how long each phase of construction will take and how much it will cost.


For dynamic and object-oriented projects, this is not a suitable model.


Not ideal for projects with variable specifications.


SSADM vs Waterfall


While the two often appear identical, a subtle difference makes SSADM superior to Waterfall.
Unlike the traditional Waterfall, SSADM allows previous stages or phases to be reviewed even
after they have been completed, whereas the traditional Waterfall is static and does not allow
a phase to be revisited once it has been completed.
b) Iterative Model


This is a software development life cycle model that starts with a simple initial
implementation and gradually increases in complexity and feature set until the final system is
complete. Following the initial planning process, a small set of phases is repeated, with each
cycle refining and extending the software incrementally. These phases include:


• Planning and requirements


• Analysis and design


• Implementation (coding)


• Testing


• Evaluation


Advantages of iterative model


It adapts easily to the ever-changing requirements of computer applications.


To suit the needs of the project or organization, each stage can be broken down into smaller
chunks.


Disadvantages of iterative model


It places heavier demands on user involvement.


Users notice the changes in each iteration, so feature/requirement creep is a possibility.


5.5 Research Methods


This section explains the data collection techniques that will be used in the qualitative study of
the system. Farmers and some management are the system's most important stakeholders.


5.5.1 Techniques for data collection.
Many data collection methods are available. Interviews, observation, and review of documents and
records will be used to collect data. Existing data was primarily used during development and
testing.


5.5.1.1 Existing Data


This refers to reusing data that has already been collected in order to answer new research
questions in addition to those for which the data was originally gathered. It entails
incorporating existing measurements into a study or research project. Data sourced from an
archive is an example.


Advantages of Existing Data


The level of precision is extremely high.


Data that is easily available.


Disadvantages of Existing Data


Evaluation issues and comprehension difficulties.
CHAPTER 6 IMPLEMENTATION


6.1 Hardware and Software Used




The experiments used the free GPUs provided by Google Colab (Colaboratory). The deep learning
framework applied is TensorFlow with the Keras API.


6.2 Definitions


6.2.1 Train, Validation, and Test


The model is trained using the training dataset. In the case of neural networks, this is where the
model learns its weights and biases.


During training, the model is evaluated periodically on the validation dataset, which guides
hyperparameter tuning.


After the model has been fully trained, the test dataset is used to evaluate it.
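A three-way split of this kind can be sketched as follows. The 80/10/10 proportions, the sample count of 1,000, and the function name split_indices are assumptions for illustration; the report does not state the proportions it used.

```python
import numpy as np

def split_indices(n_samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle sample indices and split them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(n_samples * test_fraction)
    n_val = int(n_samples * val_fraction)
    test_idx = order[:n_test]                 # held back until training is finished
    val_idx = order[n_test:n_test + n_val]    # used to tune hyperparameters
    train_idx = order[n_test + n_val:]        # used to learn weights and biases
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(1000)
```

Shuffling before splitting keeps the three parts representative of the whole dataset, and keeping them disjoint ensures the test score reflects data the model has never seen.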
6.2.2 Overfitting and Underfitting


A model is said to overfit when it captures the noise in the data. Intuitively, it fits the training
data too well, or in other words, it is overly reliant on the training data.


Underfitting, on the other hand, happens when the model fails to capture the underlying pattern of the
data, that is, it does not fit the data well enough.


Overfitting and underfitting both result in poor predictions in new datasets.


6.2.3 Batch Size


In most cases, the whole dataset cannot be fed into the neural network at the same time. As a result, it
must be divided into parts or batches. The batch size specifies how many training samples are used in a
single batch.


6.2.4 Epoch


When the entire dataset (i.e. every training sample) is fed forward and backward through the neural
network only once, it is referred to as an epoch.
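The relationship between batch size and epochs comes down to simple arithmetic, sketched below. The batch size of 32 and the 32 epochs match the first model in Section 6.3.1; the sample count of 50,000 is a hypothetical value, since the report does not state the dataset size.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of batches needed to pass every training sample once."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(n_samples / batch_size)

n_samples = 50_000   # hypothetical training-set size
batch_size = 32
epochs = 32

steps = updates_per_epoch(n_samples, batch_size)
total_updates = steps * epochs
```

Increasing the batch size reduces the number of weight updates per epoch, which is one reason a larger batch size shortens the time needed for each epoch.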


6.2.5 Dropout


Dropout is a method for reducing overfitting. The word "dropout" refers to units and their connections
being dropped out at random during training.
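The idea can be sketched in a few lines of NumPy. This is the common "inverted dropout" formulation, written here as an illustrative assumption rather than the code used in this report; the rate of 0.3 matches the dropout rate quoted for the models in Section 6.3.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of units and rescale the survivors."""
    if not training or rate == 0.0:
        # At inference time all units are kept unchanged.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((1000, 256))
y = dropout(x, rate=0.3, rng=rng)
dropped_fraction = float((y == 0).mean())
```

Because a different random subset of units is removed at every training step, no single unit can be relied on exclusively, which discourages the network from memorizing the training data.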


6.2.6 Batch Normalization


Overfitting can also be reduced by using batch normalization. It normalizes the inputs to a layer by
re-centering and re-scaling the activations over each batch. The mathematics of batch normalization is
beyond the scope of this thesis.
6.3 Modelling And Results


6.3.1 First Model


This model is based on TensorFlow's Convolutional Neural Network (CNN) tutorial, with some tweaks.
There are three convolutional layers, each followed by a max-pooling layer, and two dropout layers with a
dropout rate of 0.3 to avoid overfitting. Following that, there are two dense layers with 256 and 10
units, respectively (10 is the number of classes for classification). A dropout layer with a dropout rate
of 0.2 sits between the two dense layers. The batch size is 32 and the number of epochs is 32. The
optimizer is Adam with a learning rate of 0.0001.


Below is the code and the model summary


All programs are implemented in Python using TensorFlow with the Keras API.


Figure 15: Two layer model CNN code


Here are the results concerning the accuracies and losses




Figure 16: Test loss: 1.25/ Test accuracy: 0.58


The training accuracy continues to improve, but the validation accuracy quickly reaches a
plateau. As a result, despite several dropout layers, the model overfits severely.
6.3.2 Second Model


This model is based on TensorFlow's Convolutional Neural Network (CNN) tutorial (33), with some
tweaks. Three convolutional layers follow each other, followed by a max-pooling layer and two dropout
layers with a dropout rate of 0.3 to avoid overfitting. Following that, there are two dense layers with
256 and 10 units, respectively (10 is the number of classes for classification). A dropout layer with a
dropout rate of 0.2 sits between the two dense layers. The number of epochs is 32, and the batch size is
32. The optimizer is Adam with a learning rate of 0.0001.


Below is the code and the model summary
Figure 17: Three layer CNN model code


Here are the results concerning the accuracies and losses


Figure 18: Test loss: 1.16 / Test accuracy: 0.58


The model overfits less, but the accuracy is still not high enough (just over 60%).


6.3.3 Third Model


The third model has the same structure as the second, but batch normalization is applied after
each convolutional layer. To reduce training time, the batch size has been increased to 128 and
the number of epochs has been reduced to 27. Adam is the optimizer again, but this time the
learning rate has been increased to 0.001, also to reduce training time.


Below is the code and the model summary


Figure 19: Batch Normalization CNN code




Figure 20: Test accuracy: 0.71 / Test loss: 0.83


The model now performs considerably better. It does not overfit, and the training, validation,
and test accuracies are all reasonably high: 75%, 71%, and 71%, respectively.
6.4 CNN and SVM comparison


The Data Set: Three hyperspectral data sets, namely the Salinas, University of Pavia, and Indian
Pines scenes, are used to test the effectiveness of CNNs in image classification compared to the
SVM algorithm. CNN is compared to SVM because SVM has been known as one of the most effective
algorithms for image classification. For each data set, 200 labeled pixels per class are
randomly selected as training data, while all remaining pixels are used as test data.
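The sampling scheme described above can be sketched as follows. This is an illustrative reconstruction, not the code used for the experiments: the function name sample_per_class and the synthetic label array are assumptions, as is the handling of classes with fewer than 200 labeled pixels (all of their pixels go to training).

```python
import numpy as np

def sample_per_class(labels, n_train_per_class, seed=0):
    """Pick n_train_per_class random pixels per class; the rest become test data."""
    rng = np.random.default_rng(seed)
    picked = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        # Assumption: a class with fewer pixels contributes all of them.
        n_take = min(n_train_per_class, len(cls_idx))
        picked.append(rng.choice(cls_idx, size=n_take, replace=False))
    train_idx = np.concatenate(picked)
    test_mask = np.ones(len(labels), dtype=bool)
    test_mask[train_idx] = False
    return train_idx, np.flatnonzero(test_mask)

# Synthetic labels for four classes with made-up pixel counts.
labels = np.repeat(np.arange(4), [500, 800, 300, 1200])
train_idx, test_idx = sample_per_class(labels, 200)
```

Sampling a fixed number of pixels per class gives a balanced training set even when the classes in the scene differ greatly in size.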
Table 1: Number of training and test samples used in the Indian Pines data set.


The second data set was provided by the University of Pavia.


Table 2: Number of training and test samples used in University of Pavia data set




Table 3: Number of training and test samples used in the Salinas data set.


Results and Comparison
The tables and figure below compare SVM and CNN when employed in HSI classification.


Table 4: Comparison of SVM and CNN in HSI classification


Result of comparison with different neural networks on the Indian Pines data set.


Table 5: Comparison of SVM and CNN in HSI classification


Figure 21: Comparison of SVM and CNN in HSI classification


CHAPTER 7 CONCLUSION


First, the models with four convolutional layers (the second and third models) outperform the
model with three convolutional layers (the first model) by a large margin, with slightly less
overfitting.


In these experiments, increasing the number of convolutional layers increased the model's
accuracy in image classification.


Secondly, the CNN algorithm is more effective than other image classification algorithms such
as SVM and KNN. As such, it is worth concluding that CNN, when employed effectively, is the
best algorithm for image classification.
CHAPTER 8 RECOMMENDATION AND FURTHER WORK


The research that has been performed for this report has highlighted several topics that suggest
further research and improvement.


8.1 Tune Parameters


Das (2021) found that CNN model performance can be improved by tuning parameters such as the
number of epochs and the learning rate. The number of epochs affects performance: accuracy
tends to improve as the number of epochs grows. However, some experimentation is needed when
deciding on epochs and learning rates. After a certain number of epochs there is no further
reduction in training loss and no further increase in training accuracy, and the number of
epochs can be set accordingly. A dropout layer can also be used in the CNN model. During model
compilation, an appropriate optimizer must be chosen based on the application. Various
optimizers, such as SGD, can be tried when fine-tuning the model. All of these factors have an
impact on the CNN's results.
8.2 Image Data Augmentation


“Deep learning is only useful when there is a lot of data.” This is not incorrect. CNNs learn
features automatically from data, which is typically only possible when a large amount of
training data is available.


When less training data is available, image augmentation offers a solution.


Zoom, shear, rotation, a preprocessing function, and other image augmentation parameters are
commonly used to increase the number of data samples. When these parameters are used during the
training of a deep learning model, transformed copies of the images with these attributes are
generated. The number of data samples can grow to nearly 3x to 4x the original when image
samples are produced using image augmentation.


Another benefit of data augmentation is that, since a CNN is not rotation invariant, we can use
it to add rotated images to the dataset. This can improve the system's accuracy.
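A minimal sketch of the idea is given below, using only NumPy. Horizontal flips and rotations by multiples of 90 degrees stand in for the zoom, shear, and free rotation mentioned above; the function names augment and expand_dataset, the image size, and the number of copies are assumptions for illustration. Square images are assumed so that rotated copies keep their shape.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped and rotated copy of a square H x W x C image."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)          # mirror left-right
    k = int(rng.integers(0, 4))       # rotate by k * 90 degrees
    return np.rot90(out, k)

def expand_dataset(images, copies, seed=0):
    """Append `copies` augmented versions of every image to the dataset."""
    rng = np.random.default_rng(seed)
    augmented = [augment(img, rng) for img in images for _ in range(copies)]
    return np.concatenate([images, np.stack(augmented)])

rng = np.random.default_rng(1)
images = rng.random((10, 32, 32, 3))          # ten synthetic RGB images
expanded = expand_dataset(images, copies=3)   # four times the original count
```

With three augmented copies per image, the dataset grows to four times its original size, in line with the 3x to 4x increase described above.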


8.3 Deeper Network Topology


In principle, a very wide, shallow neural network can be trained to memorize an output for
every possible input value. As a result, such networks excel at memorization but struggle with
generalization. This is the main drawback of using a very large, shallow network: in practical
applications, we will never have every possible input value available for training.
Deeper networks capture the natural "hierarchy" that can be seen throughout the world. Consider a
convnet: it captures low-level features in the first layer, slightly more complex but still
low-level features in the second layer, and object parts and basic structures in higher layers.
Multiple layers have the advantage of being able to learn features at different levels of
abstraction.


That explains why a deep network may be preferable to a wide but shallow network.


However, why not a very deep, very wide network?


The answer is that, to achieve good results, we want the network to be as small as possible. A
wider network takes longer to train, and deep networks need a lot of computing power to train.
As a result, make networks wide and deep enough to work, but no wider or deeper (Gulli 2021).


8.4 Handling the Overfitting and Underfitting Problem


To discuss overfitting and underfitting, let us start with a basic definition: the model. A
model is a system that converts input into output. For example, we can create an image
classification model that takes an input image and predicts a class label for it.


To build a model, we split the dataset into training and testing sets. We train the model, for
example a CNN classifier, on the training set. We can then use the trained model to predict
outputs for the test data.


Overfitting: A model overfits when it fits the training data too closely. In practice, an
overfitting model has a high level of accuracy on the training data but a low level of accuracy
on the test data. This means that an overfitting model has strong memorization but poor
generalization abilities: it does not generalize well from the training data to unseen data.


Underfitting: Underfitting refers to a model that performs poorly on both training and test
data. The model does not fit even the training data well.


In technical terms, an overfitting model has a low bias and a high variance. A model that
underfits has a low variance and a high bias. There will always be a tradeoff between bias and
variance in every model, and we strive to strike the best balance when we design models.
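The tradeoff can be observed in a small experiment. The sketch below is not a CNN; it uses polynomial curve fitting as a simple stand-in, and the data, noise level, and polynomial degrees are all assumptions chosen for illustration. A low-degree polynomial is too rigid for the data (high bias), while a high-degree polynomial has enough freedom to chase the noise in the small training set (high variance).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of a smooth curve (synthetic data)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(3.0 * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    # Store (training error, test error) for each model complexity.
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
```

The training error can only shrink as the degree grows, so a low training error alone says little; it is the gap between the training error and the test error that reveals overfitting.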


What are bias and variance?


Bias is the error of a model on the training set. The variance of a model refers to how much
it changes in response to changes in the training data. High variance shows up as low accuracy
on test data.


How can overfitting and underfitting be prevented?


Underfitting is exemplified by a model with 50% accuracy on training data and 80% accuracy on
test data.


Why does underfitting occur?


Underfitting happens when a model is too simplistic, for example because it is based on too few
features or is too heavily regularized, which makes it inflexible when learning from a dataset.


Solution
If there is underfitting, I would recommend concentrating on the model's depth. It may be
necessary to add layers to extract more comprehensive features. To avoid underfitting, the
parameters must also be tuned, as mentioned earlier.


Overfitting:


Overfitting is exemplified by the model's accuracy of 99 percent on train data and 60 percent on
test data.


In machine learning, overfitting is a common issue.


There are a few options for avoiding overfitting.


1. Training with more data


2. Early stopping


3. Cross-validation


Train with more data


Increase the amount of data you train with to improve model accuracy. Overfitting can be avoided
by using a large amount of training data. To increase the size of the training set for a CNN, we
can use data augmentation.


Early stopping
The model is trained through a series of iterations, and each iteration improves it up to a
point. After a certain number of iterations, however, the model begins to overfit the training
data, which can harm its capacity to generalize. Early stopping addresses this: it refers to
stopping the training phase before the learner reaches that point.
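One common way to decide when to stop is to monitor the validation loss and halt once it has not improved for a fixed number of epochs. The sketch below assumes this "patience" rule, which the text above does not prescribe; the function name early_stopping_epoch and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (epoch at which training stops, epoch with the best validation loss)."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Validation loss improved: remember this epoch.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    return len(val_losses), best_epoch

# Made-up validation losses: improving at first, then drifting upward.
val_losses = [1.60, 1.32, 1.21, 1.17, 1.19, 1.18, 1.22, 1.25]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

In this made-up run the best validation loss occurs at epoch 4 and training stops at epoch 7, so the weights saved at epoch 4 would be the ones to keep.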


Cross Validation


Cross-validation is illustrated here by k-fold cross-validation (where k is any integer greater
than 1).


Divide the original training data set into k subsets of equal size. Each subset is referred to as a
fold. Let's call the folds f1, f2,..., fk.


For i = 1 to i = k


• Hold out the fold fi as the validation set, and keep the remaining k-1 folds as the
cross-validation training set.


• Using the cross-validation training set, train your machine learning algorithm and
measure the accuracy of your model by validating the predicted results against the
validation set.


Averaging the accuracies obtained in all k rounds of cross-validation gives an estimate of
the accuracy of your machine learning model.
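The procedure can be sketched in NumPy as follows. To keep the example small and fast, a nearest-centroid classifier on synthetic two-dimensional data stands in for the CNN; the classifier, the data, and all function names are assumptions for illustration.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k (nearly) equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_validate(features, labels, k, fit, predict):
    """Return the mean validation accuracy and the accuracy of each fold."""
    folds = k_fold_indices(len(labels), k)
    accuracies = []
    for i in range(k):
        val_idx = folds[i]                      # fold f_i is held out
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(features[train_idx], labels[train_idx])
        predictions = predict(model, features[val_idx])
        accuracies.append(float(np.mean(predictions == labels[val_idx])))
    return float(np.mean(accuracies)), accuracies

def fit_centroids(x, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    classes = np.unique(y)
    return classes, np.stack([x[y == c].mean(axis=0) for c in classes])

def predict_centroids(model, x):
    """Assign each sample to the class with the nearest centroid."""
    classes, centroids = model
    distances = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(distances, axis=1)]

# Two well-separated synthetic classes of 50 samples each.
rng = np.random.default_rng(1)
features = np.concatenate([rng.normal(0.0, 1.0, (50, 2)),
                           rng.normal(4.0, 1.0, (50, 2))])
labels = np.repeat([0, 1], 50)

mean_accuracy, fold_accuracies = cross_validate(
    features, labels, 5, fit_centroids, predict_centroids)
```

Because every sample is used for validation exactly once, the averaged accuracy is a less optimistic estimate than the accuracy measured on the training data itself.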
APPENDICES
Figure 22: CNN Algorithm
Figure 23: CNN vs SVM Algorithm


List of References
Gulli, A., 2021. Using the CNN Architecture in Image Processing. [online] Medium. Available at:
<https://medium.com/@ODSC/using-the-cnn-architecture-in-image-processing-65b9eb032bdc>
[Accessed 22 April 2021].


Hu, W., Huang, Y., Wei, L., Zhang, F. and Li, H., 2015. Deep Convolutional Neural Networks for
Hyperspectral Image Classification. Journal of Sensors, [online] 2015, pp.1-12. Available at:
<https://doi.org/10.1155/2015/258619> [Accessed 22 April 2021].


Schwartzman, A., Kagan, M., Mackey, L., Nachman, B. and De Oliveira, L., 2016. Image
Processing, Computer Vision, and Deep Learning: new approaches to the analysis and physics
interpretation of LHC events. Journal of Physics: Conference Series, [online] 762, p.012035.
Available at: <https://doi.org/10.1088/1742-6596/762/1/012035> [Accessed 22 April 2021].


Costa, M., Campos, J., de Aquino e Aquino, G., de Albuquerque Pereira, W. and Costa Filho, C.,
2019. Evaluating the performance of convolutional neural networks with direct acyclic graph
architectures in automatic segmentation of breast lesion in US images. BMC Medical Imaging,
[online] 19(1). Available at: <https://doi.org/10.1186/s12880-019-0389-2> [Accessed 22 April
2021].


Browne, M. and Ghidary, S., 2003. Convolutional Neural Networks for Image Processing: An
Application in Robot Vision. Lecture Notes in Computer Science, [online] pp.641-652. Available
at: <https://doi.org/10.1007/978-3-540-24581-0_55> [Accessed 22 April 2021].
Das, A., 2021. Convolution Neural Network for Image Processing — Using Keras. [online]
Medium. Available at: <https://towardsdatascience.com/convolution-neural-network-for-image-processing-using-keras-dc3429056306>
[Accessed 22 April 2021].

Weitere ähnliche Inhalte

Was ist angesagt?

LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)
LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)
LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)Journal For Research
 
Machine Learning for Disease Prediction
Machine Learning for Disease PredictionMachine Learning for Disease Prediction
Machine Learning for Disease PredictionMustafa Oğuz
 
Facial Emotion Recognition: A Deep Learning approach
Facial Emotion Recognition: A Deep Learning approachFacial Emotion Recognition: A Deep Learning approach
CNN Image Classification Using Deep Learning

  • 1. Convolutional Neural Network 1 APPLICATION OF CONVOLUTIONAL NEURAL NETWORK IN IMAGE CLASSIFICATION Name Course Professor's Name Institution Location of Institution Date
  • 2. ABSTRACT
Thanks to its broad applications in fields as diverse as smart surveillance and tracking, health and medicine, sports and entertainment, robots, drones, and self-driving cars, computer vision has become increasingly popular and successful in recent years. The basic building blocks of each of these applications are image-processing tasks such as image classification, localization, and detection. Recent advances in Convolutional Neural Networks (CNNs) have produced excellent results in these visual recognition tasks and systems. Consequently, CNNs are now at the heart of deep learning algorithms for computer vision. This article is useful to anyone who wants to learn the principles behind CNNs and gain hands-on experience applying them to image processing. It gives a thorough overview of CNNs, beginning with the fundamental principles of neural networks: preparation, regularization, and optimization in image processing. It also demonstrates the effectiveness of CNNs compared with other image classification algorithms such as the support vector machine.
  • 3. CONTENTS
ABSTRACT .... 2
CHAPTER 1 INTRODUCTION .... 5
1.1 Background of the Study .... 5
CHAPTER 2 ARTIFICIAL NEURAL NETWORK .... 6
2.1 Artificial Neural Network .... 7
2.2 Artificial Neuron .... 8
2.3 Weight, Biases and activation functions .... 8
2.3.1 Weight and Structure of a Neuron .... 8
2.3.2 Bias .... 9
2.3.3 Activation function and the ReLU .... 10
2.4 Back Propagation .... 11
2.5 Loss Function .... 12
2.6 Gradient Descent .... 12
2.7 Learning Rate .... 12
CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK .... 13
3.1 Convolutional Neural Network Architecture .... 13
3.2 Convolutional Layers .... 14
3.3 Pooling Layers .... 15
3.4 Fully Connected Layers .... 16
3.5 Models for Composing CNN in Image Classification .... 16
3.5.1 Classification and Localization .... 17
3.5.2 Semantic Segmentation .... 18
3.5.3 Object Detection .... 19
3.5.4 Instance Segmentation .... 23
CHAPTER 4 CONCEPTUAL FRAMEWORK AND LITERATURE REVIEW .... 24
4.1 Literature Overview .... 24
4.2 Case Study 1. Convolutional Neural Networks for Image Processing .... 24
4.2 Case Study 2. Deep Convolutional Neural Networks for Hyperspectral Image Classification .... 26
  • 4. 4.3 Case Study 3. Convolutional neural networks: an overview and application in radiology .... 28
4.4 Case Study 4. Evaluating the performance of convolutional neural networks with direct acyclic graph architectures in automatic segmentation of breast lesion in US images .... 29
4.5 Conclusion .... 29
CHAPTER 5 DESIGN .... 30
5.1 Methodology .... 30
5.2 Stages Of Development .... 30
5.2.1 Feasibility Study - Stage 0 .... 30
5.2.3 Requirements Specification - Stage 3 .... 31
5.2.4 Logical System Specification - Stage 4&5 .... 31
5.2.5 Physical Design - Stage 6 .... 31
5.3 Reasons for Choosing SSADM .... 32
5.4 Comparison of SSADM With Other Methodologies .... 32
a) Waterfall Model .... 32
b) Iterative Model .... 34
5.5 Research Methods .... 34
5.5.1 Techniques for data collection .... 34
CHAPTER 6 IMPLEMENTATION .... 36
6.1 Hardware and Software Used .... 36
6.2 Definitions .... 36
6.2.1 Train, Validation, and Test .... 36
6.2.2 Overfitting and Underfitting .... 37
6.2.3 Batch Size .... 37
6.2.4 Epoch .... 37
6.2.5 Dropout .... 37
6.2.6 Batch Normalization .... 37
6.3 Modelling And Results .... 38
6.3.1 First Model .... 38
6.3.2 Second Model .... 40
6.3.3 Third Model .... 42
6.4 CNN and SVM comparison .... 44
CHAPTER 7 CONCLUSION .... 47
CHAPTER 8 RECOMMENDATION AND FURTHER WORK .... 48
  • 5. 8.1 Tune Parameters .... 48
8.2 Image Data Augmentation .... 49
8.3 Deeper Network Topology .... 49
8.4 Handle Overfitting and Underfitting Problem .... 50
APPENDICES .... 54
CHAPTER 1 INTRODUCTION
1.1 Background of the Study
Artificial intelligence (AI) has become increasingly common in recent years. Computer vision, the ability of a machine to see, is one of the tasks that AI can help with.
  • 6. Computers process and analyze images to simulate human vision. Image recognition is one of the most important tasks in computer vision. Image classification, for example, sorts pictures of objects into classes such as "car," "plane," "ship," or "house." Convolutional neural networks (CNNs) are a popular method for image classification. They are a form of deep learning, which is implemented using neural networks. Deep learning is a subset of machine learning, which is a subset of AI.
First, the University of Edinburgh's CINIC-10 dataset was used to show how to apply convolutional neural networks to image classification and how varying different factors leads to more accurate results. CIFAR-10 and ImageNet are two well-known image classification datasets, and CINIC-10 is a mixture of the two. Second, a dataset comprising the Salinas, University of Pavia, and Indian Pines scenes was used to show the effectiveness of convolutional neural networks in image classification compared with the support vector machine. The support vector machine was chosen because it has been regarded for many years as a very effective algorithm for image classification (Gulli 2021).
CHAPTER 2 ARTIFICIAL NEURAL NETWORK
  • 7. Because artificial neurons are modeled on human neurons, it helps to understand how human neurons work.
Figure 1: A diagram of the neuron showing the structure between the axon and dendrite.
When a neuron fires, normally in response to a stimulus, it sends signals down its axon and across a synapse to the dendrites of another neuron. The receiving neuron may then fire in turn, causing further neurons to fire and repeating the process through the system.
2.1 Artificial Neural Network
An artificial neural network (ANN) is a set of layers of neurons (referred to as units or nodes in this context). Each unit in one layer is connected to each unit in the next layer.
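As a rough illustration of this fully connected structure, the following Python sketch lists the connections between two adjacent layers and counts the connections in a small network. The layer sizes (3, 4, 2) and the function names are hypothetical and are not taken from the models in this report.

```python
from itertools import product
from typing import List, Tuple


def layer_connections(n_from: int, n_to: int) -> List[Tuple[int, int]]:
    """Return every (source unit, target unit) pair between two adjacent layers.

    Each unit in the first layer is connected to each unit in the next layer,
    so the result has n_from * n_to pairs.
    """
    return list(product(range(n_from), range(n_to)))


def connection_count(layer_sizes: List[int]) -> int:
    """Count the connections in a network whose adjacent layers are fully connected."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical network: 3 input units, 4 hidden units, 2 output units.
layer_sizes = [3, 4, 2]
total = connection_count(layer_sizes)  # 3*4 + 4*2 connections
```

Because every pair of adjacent layers contributes the product of their sizes, the number of connections grows quickly as layers get wider.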
  • 8. Figure 2: The artificial neural network architecture The network takes in all the information it needs, in this case the images to identify, through an input layer. Hidden layers exist between the input and output layers. Each hidden layer detects a different set of features in an image, ranging from simple to complex. The first hidden layer, for example, detects edges and lines, the second detects curves, and the third detects specific image features, such as a face or a wheel. The network makes predictions in the output layer. The predicted image categories are compared to human-provided labels. If they are wrong, the network corrects its learning using a technique called backpropagation (discussed later in this chapter) so that it can make better guesses in the next iteration. After enough training, a network can make classifications on its own, without the need for human intervention. 2.2 Artificial Neuron In an artificial neural network, an artificial neuron is a connection point (unit or node) that can process input signals and generate output signals. 2.3 Weights, Biases and Activation Functions 2.3.1 Weight and Structure of a Neuron In a neural network, the connections between the units are weighted: each weight shows how much the input from a previous unit influences the output of the next unit. To
  • 9. compute the output of an artificial neuron mathematically, sum the products of all the inputs (x1 to xn) and their corresponding weights (w1 to wn), add a bias (b), and feed the resulting value into an activation function (f) to form the output: output = f(w1x1 + ... + wnxn + b). Figure 3: A diagram to show the work of a neuron: input x, weights w, bias b, activation function f. 2.3.2 Bias A bias (b) is an additional input to a neuron that is technically the number 1 multiplied by a weight. The bias allows the activation function curve to be moved left or right on the coordinate graph, allowing the neuron to produce the desired output value. Figure 4: A bias value allows the activation function to shift to the left or right.
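The neuron computation and the effect of the bias described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the thesis: the function names and the input, weight and bias values are hypothetical, and the convention assumed is output = f(w·x + b) with a Sigmoid activation.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical values chosen only for illustration: one input x = 2, weight 1.
x = [2.0]
w = [1.0]
for b in (-5.0, 0.0, 5.0):
    print(f"bias={b:+.0f} -> output={neuron_output(x, w, b):.3f}")
```

Under this convention a negative bias shifts the Sigmoid curve to the right, so the same input produces an output closer to 0, while a positive bias shifts it to the left and pushes the output toward 1.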
  • 10. To illustrate Figure 4: when the input (x) is 2, a bias that shifts the Sigmoid curve 5 units to the right (b = -5 under the convention f(wx + b) with w = 1) brings the output close to 0. 2.3.3 Activation Function and the ReLu An activation function, by definition, determines whether or not a neuron should be activated ("fired"). It makes a neuron's output nonlinear. Without activation functions, a neural network is nothing more than a linear regression model. The ReLu, A(x) = max(0, x), is the most common activation function for CNNs (13) and the one used in this thesis (14). When x is positive, it outputs x; otherwise, it outputs 0.
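As a small sketch of the definition A(x) = max(0, x), the following plain-Python snippet applies ReLu to a handful of made-up pre-activation values (the list is hypothetical, chosen only to show both branches of the function).

```python
def relu(x):
    """ReLu activation: returns x for positive inputs and 0 otherwise."""
    return max(0.0, x)

# Hypothetical pre-activation values for five units.
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
outputs = [relu(v) for v in pre_activations]

# Fraction of units that are switched off (output exactly 0).
inactive_fraction = outputs.count(0.0) / len(outputs)
print(outputs, inactive_fraction)
```

Units with non-positive input contribute nothing to the next layer, which is the sparsity effect that makes ReLu cheap to compute.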
  • 11. Figure 5: The ReLu function. Since the mathematical operation is simpler and the activation is sparser, ReLu is less computationally costly than some other popular activation functions like tanh and Sigmoid. Since the function returns 0 when x is less than zero, there is a good chance that a given unit will not activate at all. Sparsity also means less noise and overfitting, as well as more concise models with higher predictive capacity. Neurons in a sparse network are more likely to process useful data. A neuron that can recognize human faces, for example, should not be triggered if the picture is actually of a house. Another advantage of ReLu over the others is that it converges more quickly. Linearity (when x ≥ 0) means that the slope of the line does not change: as x rises, the function does not reach a plateau. As a result, ReLu does not have the vanishing gradient problem from which other activation functions, such as Sigmoid or tanh, suffer. The Softmax function is another common activation function in CNNs. It is frequently used in the output layer, where multiclass classification is performed. However, this function's mathematical calculation is beyond the scope of this thesis. 2.4 Back Propagation Backpropagation is an algorithm that helps neural networks learn their parameters, primarily from prediction errors. This chapter uses gradient descent to illustrate backpropagation.
  • 12. 2.5 Loss Function A loss function is an error measure, a method of calculating the degree of inaccuracy in a system's predictions. The goal of deep learning models is to minimize this loss function value, and this process is known as optimization. 2.6 Gradient Descent Gradient descent is an optimization algorithm that changes the internal weights of the neural network in order to minimize the loss function value. The algorithm tries to reduce the loss function value by adjusting the weights after each iteration until further tweaks produce little to no change in the value of the loss function, which is known as convergence. 2.7 Learning Rate In gradient descent or other optimization algorithms, the learning rate is the step size of each iteration. Convergence will take a long time if the learning rate is too low, but there may be no convergence at all if the learning rate is too high.
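The interplay between gradient descent and the learning rate can be illustrated on a toy problem. The sketch below is an assumption-laden example rather than the thesis's training code: it minimizes the hypothetical one-parameter loss L(w) = (w - 3)^2 and compares three arbitrarily chosen learning rates.

```python
def gradient_descent(lr, steps=50, w0=0.0):
    """Minimise the toy loss L(w) = (w - 3)^2, whose gradient is 2 * (w - 3)."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        w -= lr * grad  # step against the gradient, scaled by the learning rate
    return w

# Too low, reasonable, and too high learning rates (illustrative values only).
for lr in (0.001, 0.1, 1.1):
    w = gradient_descent(lr)
    print(f"lr={lr}: w={w:.4g}, loss={(w - 3.0) ** 2:.4g}")
```

With the smallest rate the weight has barely moved toward the minimum at w = 3 after 50 steps, with the middle rate it has essentially converged, and with the largest rate each step overshoots and the loss grows instead of shrinking.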
  • 13. CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK 3.1 Convolutional Neural Network Architecture A convolutional neural network is a deep neural network used in image processing that takes images as input and learns features from the data. Any color image is divided into three layers: red, green, and blue, each of which is nothing more than a matrix of pixel values. Mathematical operations such as convolutions and pooling are applied to the previous output to create new layers. Convolutions are used to extract features, and pooling is used to reduce the network's complexity. For classification, the output matrix is flattened to one layer and attached to a fully connected layer.
  • 14. Figure 10: CNN Architecture The connectivity pattern between neurons in convolutional networks was inspired by biological processes in that it resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in the receptive field, which is a small portion of the visual field. The receptive fields of different neurons partly overlap, allowing them to cover the entire visual field. Our vision is based on multiple cortical layers, each of which recognizes increasingly structured information. Single pixels are seen first, followed by basic geometric forms and then more complex elements such as shapes, faces, human beings, animals, and so on. 3.2 Convolutional Layers The mathematical combination of two functions to form a third function is referred to as "convolution." It merges two sets of data. In CNNs, a convolution filter (also known as a kernel) is applied to the input data to generate a feature map.
  • 15. Figure 9: Convolutional layer: the filter slides over the input and writes its output to the new layer. An element-wise multiplication is performed between a 3x3 filter matrix and a 3x3 region of the input image's matrix. The output value ("destination pixel") on the feature map is the sum of the elements of the resulting matrix. The filter then slides over the input matrix and completes the feature map by repeating this operation for each remaining 3x3 region. 3.3 Pooling Layers Pooling layers reduce the dimensionality of feature maps, specifically the height and width, while maintaining the depth. This is advantageous because it reduces the amount of computing power needed to process the data while retaining the most important features in the feature maps. Pooling layers are divided into two categories: max pooling and average pooling.
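The sliding-filter operation of Section 3.2 and the pooling step of Section 3.3 can be sketched in plain Python. This is a minimal illustration under stated assumptions: stride 1 and no padding for the convolution, non-overlapping 2x2 windows for pooling, and a made-up 6x6 image with a hypothetical vertical-edge filter. As in most deep learning libraries, the filter is applied without flipping (strictly speaking a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the current patch and sum the products.
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        feature_map.append(row)
    return feature_map

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling: keep the max or the mean of each size x size window."""
    agg = max if mode == "max" else (lambda values: sum(values) / len(values))
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(agg(window))
        pooled.append(row)
    return pooled

# Hypothetical 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

fmap = convolve2d(image, kernel)          # 4x4 feature map
print(fmap)
print(pool2d(fmap, mode="max"))           # 2x2 after max pooling
print(pool2d(fmap, mode="avg"))           # 2x2 after average pooling
```

The feature map is largest where the filter straddles the dark-to-bright boundary, and pooling halves its height and width while keeping that response.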
  • 16. Figure 10: Types of Pooling. Max pooling returns the maximum value of the elements in the portion of the image covered by the filter, while average pooling returns their average value. Max pooling is more effective at extracting dominant features and is therefore generally considered to perform better. 3.4 Fully Connected Layers The classification takes place in the fully connected layers. The input matrix is converted to a column vector and fed into a series of fully connected layers, similar to the fully connected ANN architecture mentioned previously. Each fully connected layer (called a Dense layer) goes through an activation function (such as tanh or ReLu), but the output Dense layer goes through Softmax. Cross-entropy (categorical cross-entropy in Keras) is the loss function used in Softmax multiclass classification. The Softmax function returns an N-dimensional vector, where N is the number of classes from which the CNN must choose. Each number in this N-dimensional vector represents the probability that the image belongs to the corresponding class, so the numbers sum to 1. For example, if the output vector is [0, 0.1, 0.1, 0.75, 0, 0, 0, 0, 0, 0.05], there is a 10% chance that this image belongs to class 2, a 10% chance that it belongs to class 3, a 75% chance that it belongs to class 4, and a 5% chance that it belongs to class 10. 3.5 Models for Composing CNN in Image Classification To solve several complex tasks, the simple CNN architecture can be composed and expanded in a variety of ways.
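To make the Softmax output and the cross-entropy loss of Section 3.4 concrete, here is a minimal plain-Python sketch. The three raw scores are hypothetical, and the loss shown is the per-sample categorical cross-entropy for a one-hot label; it is not taken from the thesis's Keras code.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, true_class):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_class])

logits = [2.0, 1.0, 0.1]  # hypothetical raw scores for three classes
probs = softmax(logits)
print([round(p, 3) for p in probs])
print(categorical_cross_entropy(probs, true_class=0))  # small loss: likely class
print(categorical_cross_entropy(probs, true_class=2))  # larger loss: unlikely class
```

The loss is small when the network assigns high probability to the correct class and grows as that probability approaches 0, which is what training tries to drive down.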
  • 17. 3.5.1 Classification and Localization In the classification and localization task, you must report not only the type of object contained in the image but also the coordinates of the bounding box where the object appears. This task assumes that an image contains only one instance of an object. In a standard classification network, this can be accomplished by adding a "regression head" alongside the "classification head." Recall that in a classification network, the final output of the convolution and pooling operations, called the feature map, is fed into a fully connected network that generates a vector of class probabilities. The classification head is a fully connected network that is trained using a categorical loss function (Lc) such as categorical cross-entropy (Gulli 2021). The regression head is a fully connected network that takes the feature map and generates a vector (x, y, w, h) representing the top-left x and y coordinates of the bounding box, as well as its width and height. It is trained using a continuous loss function (Lr), such as mean squared error. The entire network is trained using a linear combination of the two losses, i.e. L = αLc + (1 - α)Lr, where α is a hyperparameter that can take a value between 0 and 1. It can be set to 0.5 unless domain knowledge about the problem suggests another value. A typical classification and
  • 18. localization network architecture is depicted in the diagram below. As you can see, the only difference from a standard CNN classification network is the additional regression head on the top right: Figure 4: Architecture for Classification and Localization 3.5.2 Semantic Segmentation The goal here is to assign a class to each pixel of the image. A first approach may be to build a classifier network for each pixel, with the input being a small neighborhood surrounding that pixel. In practice, this method is inefficient, so a better alternative may be to run the image through convolutions that increase the feature depth while keeping the image width and height constant. Each pixel then has a feature vector that can be sent through a fully connected network to predict the pixel's class. In practice, however, this is also very costly, and it is seldom used. A third method is a CNN encoder-decoder network, in which the encoder reduces the image's width and height while increasing its depth (number of features), and the decoder uses transposed convolution operations to increase the image's size while decreasing its depth. Transposed convolution (or upsampling) is the process of going in the opposite direction of a normal convolution. The image is the input to this network, and the segmentation map is the output (Gulli 2021). The U-Net (a good implementation is available at https://github.com/jakeret/tf_unet), which was originally designed for biomedical image segmentation and has additional skip-connections
  • 19. between corresponding layers of the encoder and decoder, is a common implementation of this encoder-decoder architecture. The U-Net architecture is depicted in the diagram below: Figure 10: Semantic Segmentation 3.5.3 Object Detection The object detection task is similar to the classification and localization task. The main difference is that there are now several objects in the image, and we must determine the class and
  • 20. bounding box coordinates for each one. Furthermore, neither the number nor the size of the objects is known ahead of time. As you would expect, this difficult problem has prompted a significant amount of research. A first solution might be to make several random crops of the input image and apply the classification and localization network discussed earlier to each crop. However, such an approach wastes a lot of computing power and is unlikely to be very successful. A more practical approach is to use a tool such as Selective Search, which applies conventional computer vision techniques to identify areas in the image that may contain objects. Figure 10: Object Detection These areas are known as "Region Proposals," and the network that processes them was called the "Region-based CNN," or R-CNN. In the original R-CNN, the regions were resized and fed into a network to produce image vectors. These vectors were then classified using an SVM-based classifier, and the bounding boxes suggested by the external tool were corrected using a linear regression network over the image vectors. An R-CNN network can be represented conceptually as follows:
  • 21. Figure 10: R-CNN Network Architecture The Fast R-CNN was the next version of the R-CNN network. Instead of feeding each region proposal through the CNN, the Fast R-CNN feeds the entire image through the CNN, and the region proposals are projected onto the resulting feature map. Each region of interest is fed through a Region of Interest (ROI) pooling layer and then into a fully connected network, which generates an ROI feature vector. ROI pooling is a common operation in object detection tasks that use convolutional neural networks. The ROI pooling layer uses max pooling to transform the features within any valid region of interest into a small feature map with a fixed spatial extent of H × W (where H and W are two hyperparameters). The feature vector is then fed into two fully connected networks, one of which predicts the class of the ROI and the other of which corrects the proposal's bounding box coordinates. This is illustrated below:
  • 22. Figure 10: Fast R-CNN Network Architecture The Fast R-CNN is about 25 times faster than the R-CNN. The next upgrade, known as the Faster R-CNN (an implementation can be found at), replaces the external region proposal mechanism with a trainable component within the network called the Region Proposal Network (RPN). As shown below, the output of this network is combined with the feature map and passed through a pipeline similar to that of the Fast R-CNN network. The Faster R-CNN network is approximately 10 times faster than the Fast R-CNN network, making it roughly 250 times faster than an R-CNN network (Gulli 2021). Figure 10: Faster R-CNN Network Architecture Single Shot Detectors (SSD), such as You Only Look Once (YOLO), are a slightly different class of object detection network. In these networks, each image is divided into a predetermined number of sections using a grid. A 7x7 grid is used in the case of YOLO, resulting in 49 subimages. A predetermined set of crops with different aspect ratios is applied to each subimage. The output for each image is a vector of size 7 × 7 × (5B + C), given B bounding boxes and C object
  • 23. classes. Each grid cell has prediction probabilities for the various objects detected inside it, as well as a confidence score and coordinates (x, y, w, h) for each bounding box. This transformation is carried out by the YOLO network, which is a CNN. The results from this vector are combined to find the final predictions and bounding boxes. In YOLO, the bounding boxes and associated class probabilities are predicted by a single convolutional network. YOLO is a faster solution for object detection, but the algorithm can miss smaller objects. 3.5.4 Instance Segmentation Instance segmentation is similar to semantic segmentation (the process of associating each pixel of an image with a class label), with a few key differences. First, it must differentiate between different instances of the same class in an image. Second, it is not necessary to label every pixel in the image. In some ways, instance segmentation is similar to object detection, but instead of bounding boxes we are looking for a binary mask that covers each object. The second definition provides the intuition behind the Mask R-CNN network. The Mask R-CNN is a Faster R-CNN with an additional CNN in front of its regression head that takes the bounding box coordinates reported for each ROI as input and converts them to a binary mask.
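Returning to the YOLO output vector of size 7 × 7 × (5B + C) described in Section 3.5.3, the arithmetic can be checked with a short sketch. The function name is hypothetical, and the default values B = 2 and C = 20 are example settings (those commonly quoted for the original YOLO on a 20-class dataset), not values stated in this document.

```python
def yolo_output_size(grid=7, boxes_per_cell=2, num_classes=20):
    """Length of the YOLO output vector: grid * grid * (5 * B + C).

    Each bounding box contributes 5 numbers (x, y, w, h and a confidence score),
    and each grid cell additionally carries one probability per class.
    """
    per_cell = 5 * boxes_per_cell + num_classes
    return grid * grid * per_cell

print(yolo_output_size())                                  # example: B = 2, C = 20
print(yolo_output_size(boxes_per_cell=1, num_classes=10))  # example: B = 1, C = 10
```

The output length grows linearly with the number of boxes and classes but quadratically with the grid resolution, which is one reason the grid is kept coarse.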
  • 24. Figure 11: Mask R-CNN Network Architecture CHAPTER 4 CONCEPTUAL FRAMEWORK AND LITERATURE REVIEW 4.1 Literature Overview Hand-crafted feature extraction techniques, such as texture analysis, followed by traditional machine learning classifiers, such as random forests and support vector machines, have been used in radiomics studies for many years. When it comes to image recognition, there are a few distinctions between these approaches and CNN. First, CNN does not require hand-crafted feature extraction. Second, CNN architectures do not necessarily require human experts to segment tumors or organs. Third, since there are millions of learnable parameters to estimate, CNN is much more data-hungry and computationally costly, necessitating the use of graphics processing units (GPUs) for model training (Browne and Ghidary 2003). 4.2 Case Study 1. Convolutional Neural Networks for Image Processing The term convolutional neural network (CNN) is used to describe an architecture for applying neural networks to two-dimensional arrays (usually images), based on spatially localized neural input. The sharing of weights across processing units in the CNN architecture decreases the number of free parameters, improving the network's generalization performance. Weights are replicated across the spatial array, resulting in inherent insensitivity to translations of the input, a useful property for image classification. In the context of image processing, CNNs have a range of distinct advantages over fully connected and unconstrained neural network architectures.
  • 25. When images are provided directly as input to the network, the number of free parameters can easily become unmanageable unless a specialized architecture is used. Traditional neural network applications may get around this problem by relying on extensive pre-processing of the images to bring them into a usable format. However, this results in a hybrid two-stage architecture in which the pre-processing stage, which is hard-wired and non-adaptive, does most of the "interesting" work (Browne and Ghidary 2003). Unstructured neural networks have no built-in invariance to translations or local distortions of the inputs. Indeed, one shortcoming of fully connected architectures is that the input topology is completely ignored, even though images are strongly correlated and have a strong 2D local structure. In general, we argue that when input data is organized temporally or spatially, a CNN architecture is better than a generic neural network (Browne and Ghidary 2003). CNNs perform mappings between spatially/temporally distributed arrays in any number of dimensions. They appear to be suitable for use with time series, images, or video. CNNs have the following characteristics: • Translation invariance (neural weights remain constant regardless of spatial translation). • Local connectivity (neural connections only exist between spatially local regions). • An optional gradual reduction in spatial resolution (as the number of features is gradually increased).
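The effect of weight sharing and local connectivity on the number of free parameters can be illustrated with a quick count. The sketch below is an assumption-based example, not a figure from the cited study: it compares a fully connected layer and a convolutional layer applied to a hypothetical 32x32 RGB image, each producing 64 units or filters.

```python
def dense_params(n_inputs, n_units):
    """Fully connected layer: one weight per input-unit pair, plus one bias per unit."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Convolutional layer: each filter shares its weights across all image positions."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters

# Hypothetical input: a 32x32 image with 3 colour channels.
n_inputs = 32 * 32 * 3
dense = dense_params(n_inputs, 64)   # every pixel connected to every unit
conv = conv_params(3, 3, 3, 64)      # 64 filters of size 3x3 over 3 channels
print(dense, conv, round(dense / conv, 1))
```

Because the convolutional layer's parameter count does not depend on the image size, the gap widens further for larger inputs.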
  • 26. 4.2 Case Study 2. Deep Convolutional Neural Networks for Hyperspectral Image Classification. Huang et al. (2015) found that, in comparison to other image classification algorithms, CNNs generally need very little pre-processing. The network learns to optimize its filters (or kernels) automatically, whereas in conventional algorithms the filters are hand-engineered. This lack of reliance on prior expertise or human involvement in feature extraction is a significant benefit. Hyperspectral imagery is captured by remote sensors with hundreds of observation channels at high spectral resolution. It has inspired the application of many classification algorithms, such as K-nearest neighbors, minimum distance, and logistic regression. Over the years, however, these algorithms have proved less effective than CNNs on remote sensing data. Neural networks offer models, such as the multilayer perceptron and radial basis function networks, that these other algorithms lack. Algorithms like SVM are admittedly more efficient than the conventional CNN in terms of classification accuracy and computing cost, but when deep CNN architectures are employed, CNN proves to be a more powerful classification model than the other algorithms and very competitive with SVM. Beyond outperforming other algorithms, deep CNNs have over the years shown promising performance in many fields and have played a vital role in solving vision-related problems (Huang et al 2015).
  • 27. More recently, CNNs have even outperformed other leading methods, and in some cases human performance, on many vision-oriented tasks, including image classification, object detection, scene labeling, digit classification, and face recognition. When CNNs are applied to hyperspectral image (HSI) classification, the structure of the CNN has gradually proven to be the most effective and preferable way to learn visual representations. The figure below represents hyperspectral data with hundreds of spectral channels. The curve for each class has its own visual shape. Although some of these differences are hard to distinguish with the human eye, CNNs can achieve better results than humans, and as a result CNN has proven to be one of the best techniques for HSI classification (Huang et al 2015). Figure 12: HSI classification
  • 28. 4.3 Case Study 3. Convolutional neural networks: an overview and application in radiology In medical image analysis, classification using deep learning typically targets lesions depicted in medical images and assigns these lesions to two or more classes. Deep learning, for example, is commonly used to classify lung nodules on computed tomography (CT) images as benign or malignant, as seen below. For effective classification using CNN, a large amount of training data with corresponding labels is needed. CT images of lung nodules and their labels (i.e., benign or cancerous) are used as training data for lung nodule classification. Shown below are two examples of training data for lung nodule classification, one for a benign lung nodule and the other for primary lung cancer. Figure 13: CNN in radiology
  • 29. 4.4 Case Study 4. Evaluating the performance of convolutional neural networks with direct acyclic graph architectures in automatic segmentation of breast lesion in US images In ultrasound (US) breast images, delineating lesion contours is a vital step in breast cancer diagnosis. Because they infiltrate the surrounding tissue, malignant lesions produce irregular contours with spiculated and angulated margins, while benign lesions produce smooth contours with an elliptical shape. The study notes that in breast imaging, the majority of existing publications focus on using Convolutional Neural Networks (CNNs) for segmentation and classification of lesions in mammographic images. The main objective of this study, however, is to assess the ability of CNNs to detect contour irregularities in breast lesions in US images. 4.5 Conclusion The case studies indicate that using convolutional neural networks increases model accuracy in image classification. In addition, CNN has proven to be more beneficial and advanced than other image classification algorithms in fields such as radiology and hyperspectral image classification. As such, it is reasonable to conclude that CNN is the best method to apply to image classification.
  • 30. CHAPTER 5 DESIGN 5.1 Methodology The Structured Systems Analysis and Design Method (SSADM) was used to analyze and build the system. According to Kendall (1988), the SSADM approach involves users during the most important and intensive period of the development process: the first stages of development. Aside from that, in terms of its development stages, it is close to the waterfall model. It divides development into stages and modules. The data model is the first model it creates. The following techniques were used: Logical Data Modelling - the process of identifying, modelling, and documenting data. The information is then divided into entities and relationships. Data Flow Modelling - involves following the flow of data in a computer system. Processes, data stores, external entities, and data movement are all thoroughly examined. Entity Behavior Modelling - involves identifying and documenting the events that affect each entity, as well as the order in which they occur. 5.2 Stages Of Development 5.2.1 Feasibility Study - Stage 0.
  • 31. Its aim is to determine whether the project's direction and requirements are financially, technically, and operationally feasible. 5.2.2 Requirement Analysis - Stage 1&2. The first stage entails looking at the current situation and finding issues and areas that need to be improved. The second stage entails creating a range of options that meet the specified requirements and selecting the most appropriate alternative. 5.2.3 Requirements Specification - Stage 3. This stage aims to identify the desired system data, functions, and events. 5.2.4 Logical System Specification - Stage 4&5. This stage aims to evaluate the technical system options and produce the logical design. 5.2.5 Physical Design - Stage 6. The physical environment in which the system will operate is taken into account.
  • 32. Figure 14: SSADM Methodology. 5.3 Reasons for Choosing SSADM Within a systems development cycle, SSADM incorporates three techniques, each of which complements the others: • Logical Data Modelling • Data Flow Modelling • Entity Event Modelling. Its key advantages over other methodologies are as follows: ➢ Quality improvement ➢ Detailed documentation of the development stages ➢ Reusability for similar projects that follow. Because it examines the information system so thoroughly, this approach decreases the likelihood of information being misunderstood during the project life cycle, which is why it was chosen for this project. 5.4 Comparison of SSADM With Other Methodologies Other software development methodologies that were considered but not adopted for this project include: a) Waterfall Model
  • 33. This is a sequential design process in which progress is viewed as flowing steadily downward, like a waterfall, through the phases of: • Conception • Initiation • Analysis • Design • Construction • Testing • Implementation and maintenance Each of these phases flows into the next. Advantages of Waterfall It is simple to manage since each stage is defined by rigid deliverables and a review process. There is no overlap since phases are processed and completed one at a time. Disadvantages of Waterfall It is difficult to predict how long each phase of development will take and how much it will cost. It is not a suitable model for complex and object-oriented projects. It is not ideal for projects with changing requirements. SSADM vs Waterfall While the two may seem identical, a subtle difference makes SSADM superior to Waterfall: SSADM allows previous stages/phases to be reviewed even after they have been completed, whereas the traditional Waterfall is static and a step cannot be revisited once it has been completed.
  • 34. b) Iterative Model This is a version of the software development life cycle that focuses on a simple initial implementation that gradually increases in complexity and feature set until the final system is complete. Following the initial planning process, a limited number of steps are repeated, with each cycle refining the software incrementally. These phases include: • Planning and requirements • Analysis and design • Implementation (coding) • Testing • Evaluation Advantages of iterative model It adapts easily to the ever-changing requirements of software applications. Each stage can be broken down into smaller chunks to suit the needs of the project or organization. Disadvantages of iterative model It places greater demands on user involvement. Users see the changes in each iteration, so feature/requirement creep is a possibility. 5.5 Research Methods This section explains the data collection techniques that will be used in the qualitative study of the system. Farmers and some management are the system's most important stakeholders. 5.5.1 Techniques for data collection.
  • 35. Many data collection methods are available. Interviews, observation, and review of documents and records will be used to collect data. Existing data was mainly used during development and testing. 5.5.1.1 Existing Data This refers to asking new investigative questions of data beyond those for which it was originally collected. It entails incorporating existing measurements into a study or research project. Data sourced from an archive is an example. Advantages of Existing Data The level of precision is extremely high. The data is easily available. Disadvantages of Existing Data The data can be difficult to evaluate and to interpret.
  • 36. CHAPTER 6 IMPLEMENTATION 6.1 Hardware and Software Used This work used free GPUs from Google Colab (Colaboratory). The deep learning framework applied is TensorFlow with the Keras API. 6.2 Definitions 6.2.1 Train, Validation, and Test The model is trained using the training dataset. In the case of neural networks, the model learns its weights and biases. After each round of training, the model is evaluated on the validation dataset, which aids hyperparameter tuning. After the model has been fully trained, the test dataset is used to assess its final performance.
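The three roles described in Section 6.2.1 can be sketched as a simple split of a list of samples. This is only an illustration: the function name, the 80/10/10 proportions and the fixed seed are hypothetical choices, and real datasets often ship with predefined splits that should be used instead.

```python
import random

def split_dataset(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle the samples and cut them into train, validation and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical dataset of 1000 sample identifiers.
train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
```

Keeping the three subsets disjoint matters: a sample that appears in both training and test data would make the test score look better than the model's real performance on unseen images.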
  • 37. 6.2.2 Overfitting and Underfitting When a model captures the noise in the data, it is said to overfit. Intuitively, it fits the training data too well, or in other words, it is overly reliant on the training data. Underfitting, on the other hand, happens when the model fails to capture the underlying pattern of the data, that is, it does not fit the data well enough. Overfitting and underfitting both result in poor predictions on new data. 6.2.3 Batch Size In most cases, the whole dataset cannot be fed into the neural network at the same time. As a result, it must be divided into parts or batches. The batch size specifies how many training samples are used in a single batch. 6.2.4 Epoch One epoch is completed when the entire dataset (i.e. every training sample) has been fed forward and backward through the neural network exactly once. 6.2.5 Dropout Dropout is a method for reducing overfitting. The word "dropout" refers to units and their connections being dropped at random during training. 6.2.6 Batch Normalization Overfitting can also be reduced by using batch normalization. It shifts and scales the activations so that the inputs to a layer are normalized. The mathematics of batch normalization is beyond the scope of this thesis.
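To make the relationship between batch size and epochs in 6.2.3 and 6.2.4 concrete, the sketch below counts how many weight updates a training run performs. The training-set size `N_TRAIN` is a hypothetical value chosen only for the arithmetic; the two batch-size and epoch settings are the ones reported for the models in Section 6.3.

```python
import math


def iter_batches(samples, batch_size):
    """Yield consecutive batches; the last batch may be smaller than batch_size."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def updates_per_run(n_samples, batch_size, epochs):
    """Return (steps per epoch, total weight updates) for one training run."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


N_TRAIN = 50_000  # hypothetical training-set size, not stated in the report

print(updates_per_run(N_TRAIN, batch_size=32, epochs=32))   # (1563, 50016)
print(updates_per_run(N_TRAIN, batch_size=128, epochs=27))  # (391, 10557)
```

Under this assumption, the larger batch size combined with fewer epochs cuts the number of updates to roughly a fifth, which is consistent with the motivation of saving training time.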
  • 38. 6.3 Modelling And Results 6.3.1 First Model This model is based on TensorFlow’s Convolutional Neural Network (CNN) tutorial, with some tweaks. The model has three convolutional layers, each followed by a max-pooling layer, together with two dropout layers with a dropout rate of 0.3 to reduce overfitting. Following that, there are two dense layers with 256 and 10 units respectively (10 is the number of classes for classification). A dropout layer with a dropout rate of 0.2 sits between the two dense layers. The batch size is 32 and the number of epochs is 32. The optimizer is Adam with a learning rate of 0.0001. Below are the code and the model summary. All programs are implemented using the Python language and the Theano library.
  • 39. Figure 15: Two layer model CNN code Here are the results concerning the accuracies and losses Figure 16: Test loss: 1.25 / Test accuracy: 0.58 The training accuracy continues to improve, but the validation accuracy quickly reaches a plateau. As a result, despite several dropout layers, the model overfits severely.
  • 40. 6.3.2 Second Model This model is based on TensorFlow's Convolutional Neural Network (CNN) tutorial (33), with some tweaks. Three convolutional layers follow each other, followed by a max-pooling layer and two dropout layers with a dropout rate of 0.3 to reduce overfitting. Following that, there are two dense layers with 256 and 10 units respectively (10 is the number of classes for classification). A dropout layer with a dropout rate of 0.2 sits between the two dense layers. The number of epochs is 32, and the batch size is 32. Adam is the optimizer, and its learning rate is 0.0001. Below are the code and the model summary
  • 41. Figure 17: Three layer CNN model code Here are the results concerning the accuracies and losses
  • 42. Figure 18: Test loss: 1.16 / Test accuracy: 0.58 The model overfits less, but the accuracy is not high enough (just over 60%). 6.3.3 Third Model The third model has the same structure as the second, but batch normalization is applied after each convolutional layer. To save training time, the batch size has been increased to 128 and the number of epochs has been reduced to 27. Adam is the optimizer again, but this time the learning rate has been increased to 0.001 to reduce training time. Below are the code and the model summary
  • 43. Figure 19: Batch Normalization CNN code Figure 20: Test accuracy: 0.71 / Test loss: 0.83 The model now performs considerably better. It is still not overfitting; the training, validation, and test accuracies are all reasonably high: 75%, 71%, and 71%, respectively.
  • 44. 6.4 CNN and SVM comparison The Data Set: Three hyperspectral data sets, namely the Salinas, University of Pavia, and Indian Pines scenes, are used to test the effectiveness of CNN in image classification compared with the SVM algorithm. CNN is compared with SVM because SVM has been regarded as the most effective algorithm for image classification. For each data set, 200 labeled pixels are randomly selected per class as the training set, while all remaining pixels are used as the test set.
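The sampling scheme described above (a fixed number of labeled pixels per class for training, everything else for testing) could look roughly like the following sketch. The function name `split_per_class` and the toy label list are hypothetical; the real class sizes are those listed in the tables for each data set.

```python
import random
from collections import defaultdict


def split_per_class(labels, n_train_per_class=200, seed=0):
    """Return (train_idx, test_idx): n random pixels per class for training,
    all remaining pixels for testing. A class with fewer than
    n_train_per_class pixels would end up entirely in the training set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random selection within the class
        train_idx.extend(indices[:n_train_per_class])
        test_idx.extend(indices[n_train_per_class:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels for three classes (made-up sizes, for illustration only).
labels = [0] * 500 + [1] * 350 + [2] * 260
train_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(test_idx))  # 600 510
```

Sampling the same number of pixels from every class keeps the training set balanced even when the classes differ greatly in size.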
  • 45. Table 1: Number of training and test samples used in the Indian Pines data set. The second data set was provided by the University of Pavia. Table 2: Number of training and test samples used in the University of Pavia data set. Table 3: Number of training and test samples used in the Indian Pines data set. Results and Comparison
  • 46. The tables below provide the comparison between SVM and CNN when employed in hyperspectral image (HSI) classification. Table 4: Comparison of SVM and CNN in HSI classification. Result of comparison with different neural networks on the Indian Pines data set. Table 5: Comparison of SVM and CNN in HSI classification
  • 47. Figure 21: Comparison of SVM and CNN in HSI classification CHAPTER 7 CONCLUSION First, the models with four convolutional layers (the second and third models) outperform the model with three convolutional layers (the first model) by a large margin, with slightly less overfitting. This suggests that increasing the number of convolutional layers increases the model's accuracy in image classification. Secondly, the CNN algorithm is more effective than other image classification algorithms such as SVM and KNN. As such, it is worth concluding that CNN, when employed effectively, is the best algorithm for image classification.
  • 48. CHAPTER 8 RECOMMENDATION AND FURTHER WORK The research performed for this report has highlighted several topics that suggest further research and improvement. 8.1 Tune Parameters Das (2021) found that CNN model performance can be improved by tuning parameters such as the number of epochs and the learning rate. The number of epochs affects the performance: performance improves over a large number of epochs. However, some experimentation is needed when deciding on the number of epochs and the learning rate. After a certain number of epochs, there is no further reduction in training loss and no further increase in training accuracy, and the number of epochs can be chosen accordingly. In the CNN model, we can also use a dropout layer. During model compilation, the appropriate optimizer must be chosen based on the application. Various optimizers, such as SGD, can be tried when fine-tuning the model. All of these factors have an impact on the CNN's results.
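One way to turn the observation in 8.1 (no further improvement after a certain number of epochs) into a rule is to scan a recorded loss curve and stop once the loss has not improved for a fixed number of epochs. The sketch below is a minimal illustration: the function name `choose_epochs`, the `patience` and `min_delta` parameters, and the loss values are all made up for the example. The same rule is commonly applied to the validation loss rather than the training loss.

```python
def choose_epochs(losses, patience=3, min_delta=0.0):
    """Scan per-epoch losses and return (stop_epoch, best_epoch), both 1-based.

    Training is considered finished once the loss has failed to improve by
    more than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(losses), best_epoch


# Made-up loss curve: improves for four epochs, then stalls.
losses = [1.9, 1.5, 1.3, 1.2, 1.21, 1.22, 1.25, 1.3]
print(choose_epochs(losses))  # (7, 4)
```

Here the scan stops at epoch 7 and reports epoch 4 as the best one, so about four epochs would have been enough for this hypothetical curve.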
  • 49. 8.2 Image Data Augmentation “Deep learning is only useful when there is a lot of data.” This is not incorrect. A CNN learns features automatically from data, which is typically only possible when a large amount of training data is available. What can be done when less training data is available? One solution is image augmentation. Zoom, shear, rotation, a preprocessing function, and other image augmentation parameters are commonly used to increase the data sample count. When these parameters are used during the training of a deep learning model, images with these attributes are generated. The number of data samples increased nearly 3x to 4x when image samples were produced using image augmentation. Another benefit of data augmentation is that, since a CNN is not rotation invariant, rotated images can be added to the dataset so that the model takes rotation into account. This can improve the system's accuracy. 8.3 Deeper Network Topology A wide neural network can be trained with every possible input value. As a result, such networks excel at memorization but struggle with generalization. There are, however, a few drawbacks to using a very wide, shallow network. Although a wide neural network can accept every possible input value, in practical applications we will not have every possible value for training.
  • 50. Deeper networks capture the inherent "hierarchy" that can be seen throughout the world. Consider a ConvNet: it captures low-level features in the first layer, slightly better but still low-level features in the second layer, and object parts and basic structures in higher layers. Multiple layers have the advantage of being able to learn features at different levels of abstraction. That explains why a deep network may be preferable to a wide but shallow network. However, why not a very deep, very wide network? The answer is that we want our network to be as small as possible while still achieving good results. A wider network takes longer to train, and deep networks need a lot of computing power to train. As a result, make them wide and deep enough to work, but no wider or deeper (Gulli 2021). 8.4 Handle Overfitting and Underfitting Problem To discuss overfitting and underfitting, let us start with a basic definition: the model. What exactly is a model? It is a system that converts input into output. For example, we can create an image classification model that takes a test input image and predicts a class label for it. To build a model, we split the dataset into training and testing sets. On the training set, we train our model with a classifier, such as a CNN. Then we can use the trained model to predict the outputs for the test data. Overfitting: A model is said to overfit when it fits the training data too closely. What exactly does that mean? Put simply, an overfitting model has a high level of accuracy on training data but a
  • 51. low level of accuracy on test data. This means that an overfitting model has strong memorization but poor generalization abilities: the model does not generalize well from the training data to unseen data. Underfitting: Underfitting refers to a model that performs poorly on both training and test data. This is a serious problem: the model does not even fit the training data well. In technical terms, an overfitting model has a low bias and a high variance. A model that underfits has a low variance and a high bias. There will always be a tradeoff between bias and variance in every model, and we strive to strike the best balance when we design models. What, then, are bias and variance? Bias is the error on the training set. The variance of a model refers to how much it varies in response to the training data. High variance shows up as low accuracy on test data. How to prevent overfitting and underfitting? Suppose your model has a 50% accuracy on train data and an 80% accuracy on test data. Is this an example of underfitting? Why does underfitting occur? Underfitting happens when a model is too simplistic (based on too few features or too heavily regularized), making it inflexible when learning from a dataset. Solution
  • 52. If there is underfitting, I would recommend concentrating on the model's depth. It is possible that layers need to be added to capture more comprehensive features. To avoid underfitting, parameters must also be tuned, as mentioned earlier. Overfitting: Overfitting is exemplified by a model with an accuracy of 99 percent on train data and 60 percent on test data. In machine learning, overfitting is a common issue. There are a few options for avoiding overfitting: 1. Train with more data 2. Early stopping 3. Cross-validation Train with more data Increase the amount of data you train with to improve model accuracy. Overfitting can be avoided by using a large amount of training data. To increase the size of the training set for a CNN, we can use data augmentation. Early stopping
  • 53. The model is trained over a series of iterations, and up to a point each iteration improves it. After a certain number of iterations, however, the model begins to overfit the training data, and its capacity to generalize can be harmed. Early stopping addresses this: it refers to stopping the training phase before the learner reaches that point. Cross Validation What is cross validation? Let us start with k-fold cross validation (where k is a positive integer). Divide the original training data set into k subsets of equal size. Each subset is referred to as a fold. Let us call the folds f1, f2,..., fk. For i = 1 to i = k • Keep the fold fi as the validation set, and use the remaining k-1 folds as the cross validation training set. • Using the cross validation training set, train your machine learning algorithm and measure the accuracy of your model by validating the predicted results against the validation set. • Averaging the accuracies obtained in all k cases of cross validation gives an estimate of the accuracy of your machine learning model.
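The k-fold procedure listed above can be sketched as follows. This is a minimal illustration, not the code used in the report: `k_fold_indices`, `cross_validate`, and the `train_and_score` callback are hypothetical names, and `dummy_score` is only a stand-in for a function that would actually train a CNN on the training indices and return its accuracy on the validation indices.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) for each of the k folds f1..fk."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]  # fold fi is held out for validation
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, k, train_and_score):
    """Average the score returned by train_and_score over all k folds."""
    scores = [train_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


def dummy_score(train_idx, val_idx):
    """Hypothetical stand-in for 'train the model, return validation accuracy'."""
    return len(train_idx) / (len(train_idx) + len(val_idx))


print(cross_validate(100, 5, dummy_score))
```

Because every sample is used for validation exactly once, the averaged score is a less noisy estimate than a single train/validation split, at the cost of training the model k times.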
  • 56. Figure 23: CNN vs SVM Algorithm List of References
  • 57. Gulli, A., 2021. Using the CNN Architecture in Image Processing. [online] Medium. Available at: <https://medium.com/@ODSC/using-the-cnn-architecture-in-image-processing-65b9eb032bdc> [Accessed 22 April 2021]. Hu, W., Huang, Y., Wei, L., Zhang, F. and Li, H., 2015. Deep Convolutional Neural Networks for Hyperspectral Image Classification. Journal of Sensors, [online] 2015, pp.1-12. Available at: <https://doi.org/10.1155/2015/258619> [Accessed 22 April 2021]. Schwartzman, A., Kagan, M., Mackey, L., Nachman, B. and De Oliveira, L., 2016. Image Processing, Computer Vision, and Deep Learning: new approaches to the analysis and physics interpretation of LHC events. Journal of Physics: Conference Series, [online] 762, p.012035. Available at: <https://doi.org/10.1088/1742-6596/762/1/012035> [Accessed 22 April 2021]. Costa, M., Campos, J., de Aquino e Aquino, G., de Albuquerque Pereira, W. and Costa Filho, C., 2019. Evaluating the performance of convolutional neural networks with direct acyclic graph architectures in automatic segmentation of breast lesion in US images. BMC Medical Imaging, [online] 19(1). Available at: <https://doi.org/10.1186/s12880-019-0389-2> [Accessed 22 April 2021]. Browne, M. and Ghidary, S., 2003. Convolutional Neural Networks for Image Processing: An Application in Robot Vision. Lecture Notes in Computer Science, [online] pp.641-652. Available at: <https://doi.org/10.1007/978-3-540-24581-0_55> [Accessed 22 April 2021].
  • 58. Das, A., 2021. Convolution Neural Network for Image Processing — Using Keras. [online] Medium. Available at: <https://towardsdatascience.com/convolution-neural-network-for-image-processing-using-keras-dc3429056306> [Accessed 22 April 2021].