Project Title: Bank Loan Approval Analysis
Presented by: Shiva G Waghe
Project Contents
1. Introduction
2. Library Import
3. Loading Data
4. Data Exploration (EDA)
5. Data Cleaning
6. Data Visualization
7. Data Preprocessing
8. Train and Test Split
9. Model Building and Evaluation
10. Model Comparison
11. Power BI Dashboard
12. Observation
Introduction to Bank Loan Approval Analysis
Finance companies deal with several kinds of home loans and may
operate across urban, semi-urban, and rural areas. A customer first
applies for a home loan, and the company then validates the customer's
eligibility for the loan.
Companies generally want to automate the loan eligibility process
(in real time) based on the details customers provide when filling in
the online application form. These details include Gender, Marital
Status, Education, Number of Dependents, Income, Loan Amount, Credit
History, and others. To automate this process, a data set is provided
to identify the customer segments that are eligible for a loan, so that
the company can specifically target these customers.
Library Import
• Import the libraries required for data processing and visualization.
• Read the CSV file from the provided directory and assign it to the pandas DataFrame
'data'.
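The loading step can be sketched as follows. The slides do not give the file name, path, or column names, so the inline CSV text and its columns are hypothetical stand-ins, and the plotting libraries are left out to keep the sketch small.

```python
import io

import pandas as pd

# Hypothetical stand-in for the project's CSV file; the real file name,
# path, and column names are not given in the slides.
SAMPLE_CSV = """Loan_ID,Gender,Married,Education,ApplicantIncome,LoanAmount,Credit_History,Property_Area,Loan_Status
LP001,Male,No,Graduate,5849,,1,Urban,Y
LP002,Male,Yes,Graduate,4583,128,1,Rural,N
LP003,Female,Yes,Not Graduate,3000,66,,Semiurban,Y
"""

# With a real file on disk this would be pd.read_csv("path/to/file.csv").
data = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(data.shape)
```

Empty fields in the CSV are read as missing values (NaN), which is what the null-count check later picks up.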
EDA
• data.shape: this attribute of a DataFrame returns a tuple
(rows, columns) describing its dimensionality.
• data.isnull().sum() returns the count of null values in
each column of the DataFrame.
• The head() function displays the first five rows of a DataFrame, providing a quick overview of its
structure and content.
• The tail() function shows the last few rows of a DataFrame (five by default).
• data.duplicated().sum() returns the number of duplicate rows in the data set. There
are no duplicate rows in this dataset.
• The unique() function, applied to a column, returns the distinct values in that
column.
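The checks listed above can be tried on a small frame. This is a minimal sketch: the column names and values are hypothetical, and `nunique()` is added here as a per-column companion to `unique()`; the slides do not say whether it was used.

```python
import pandas as pd

# Small hypothetical frame standing in for the loan data.
data = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", None],
    "LoanAmount": [128.0, None, 66.0, 120.0],
    "Loan_Status": ["Y", "N", "Y", "Y"],
})

n_rows, n_cols = data.shape                   # attribute, so no parentheses
nulls_per_column = data.isnull().sum()        # Series: null count per column
first_rows = data.head()                      # first five rows by default
last_rows = data.tail(2)                      # last two rows
n_duplicates = data.duplicated().sum()        # number of fully duplicated rows
status_values = data["Loan_Status"].unique()  # distinct values in one column
distinct_counts = data.nunique()              # Series: distinct count per column

print(n_rows, n_cols, n_duplicates)
print(nulls_per_column)
```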
Data Cleaning
• For better readability, we convert Y to Yes and N to No in the
Loan Status column using the replace function.
• Null values are filled using the fillna function.
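A minimal sketch of these two cleaning steps follows. The slides do not state which fill values were used, so the mode for a categorical column and the median for a numeric column are assumptions, as are the column names.

```python
import pandas as pd

# Hypothetical sample with one missing value per column to fill.
data = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "LoanAmount": [128.0, None, 66.0, 120.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})

# Spell out the target labels.
data["Loan_Status"] = data["Loan_Status"].replace({"Y": "Yes", "N": "No"})

# Assumed imputation strategy: mode for categorical, median for numeric.
data["Gender"] = data["Gender"].fillna(data["Gender"].mode()[0])
data["LoanAmount"] = data["LoanAmount"].fillna(data["LoanAmount"].median())

print(data)
```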
Data Visualization
Data Preprocessing
• The Loan ID column carries no predictive information, so we drop it.
• Most machine learning models require numeric input, so we convert the categorical
columns into numerical form.
• We then split the data into independent features (x) and the dependent feature (y).
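The preprocessing steps above might look like the sketch below. The encoding method is not named in the slides; one-hot encoding with `pd.get_dummies` is an assumption here (a label encoder would be another option), and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample; the real column names are not given in the slides.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004"],
    "Education": ["Graduate", "Not Graduate", "Graduate", "Graduate"],
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban"],
    "ApplicantIncome": [5849, 4583, 3000, 2583],
    "Loan_Status": ["Yes", "No", "Yes", "Yes"],
})

# Drop the identifier column.
data = data.drop(columns=["Loan_ID"])

# Dependent feature: map the target labels to 1/0.
y = data["Loan_Status"].map({"Yes": 1, "No": 0})

# Independent features: one-hot encode the categorical columns.
x = pd.get_dummies(data.drop(columns=["Loan_Status"]), drop_first=True, dtype=int)

print(x.columns.tolist())
```

Setting `drop_first=True` drops one level per categorical column, which avoids perfectly collinear indicator columns in linear models.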
Splitting Data into Training and Testing
• This code randomly splits the dataset x (features) and y (labels) into two separate sets: the
training set (x_train and y_train) and the testing set (x_test and y_test). The split uses
a test size of 0.3, meaning that 30% of the data is allocated for testing, while the
remaining 70% is used for training. The random_state parameter is set to 0 to
ensure the split is reproducible.
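The split described here corresponds to scikit-learn's `train_test_split`. The sketch below uses a small placeholder array rather than the loan data, which is an assumption made to keep it self-contained.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 20 samples with 2 features and a binary label.
x = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# 30% of the rows go to the test set; random_state fixes the shuffle.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

print(x_train.shape, x_test.shape)
```

Calling the function again with the same `random_state` returns the same partition, which is what makes results comparable across runs.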
Models used:
1. Logistic Regression: Logistic regression is a standard method for binary
classification problems. For this dataset, it can be used to predict a binary
outcome: whether a loan is approved or not.
2. Support Vector Classifier: SVC (Support Vector Classification) is the
classification form of the SVM (Support Vector Machine) model. SVC handles
both binary and multiclass classification tasks. For this dataset, we can use
SVC to predict a categorical result, such as whether a customer's loan was
approved or not.
3. K-Nearest Neighbors (KNN): KNN is a simple, instance-based learning method
used in classification and regression. It classifies a data point according to how its
neighbors are classified. In classification, the data point is assigned to the class
that is most common among its k nearest neighbors.
Model Building and Evaluation
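As a minimal sketch of building and evaluating one of these models, the example below fits a logistic regression and reports test accuracy and a confusion matrix. The data is synthetic (generated with `make_classification`) and `max_iter=1000` is an assumed setting; neither comes from the slides, so the printed numbers say nothing about the loan dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for the encoded loan features.
x, y = make_classification(n_samples=300, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

# Fit on the training set only.
model = LogisticRegression(max_iter=1000)
model.fit(x_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(x_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(f"test accuracy: {test_accuracy:.2f}")
print(cm)
```

The diagonal of the confusion matrix counts correct predictions, so accuracy equals the diagonal sum divided by the total.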
Model Comparison
Selection of Model:
• After evaluating three models (Logistic Regression, SVC, and KNN), it is
clear that Logistic Regression outperforms the others, with an accuracy of 79%
on the training set and 82% on the test set.
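A comparison of this kind can be tabulated by fitting each model on the same split and recording train and test accuracy. The sketch below uses synthetic data and default hyperparameters (both assumptions), so its scores will not match the 79% and 82% reported for the loan data.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic binary data standing in for the encoded loan features.
x, y = make_classification(n_samples=300, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit every model on the same split and record both accuracies.
rows = []
for name, model in models.items():
    model.fit(x_train, y_train)
    rows.append({
        "model": name,
        "train_accuracy": model.score(x_train, y_train),
        "test_accuracy": model.score(x_test, y_test),
    })

comparison = pd.DataFrame(rows).sort_values("test_accuracy", ascending=False)
print(comparison.to_string(index=False))
```

Reporting train and test accuracy side by side also makes overfitting visible: a large gap between the two suggests the model has memorized the training data.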
Observation
• The majority of customers (68.7%) have their loan approved (Yes).
• Applicants who are educated are more likely to have their loans approved.
• The majority of customers whose loans are approved are located
in semi-urban areas.
• Married applicants take loans more often than unmarried
applicants.
• The majority of graduates come from semi-urban areas.
• Applicants whose salary is above 5446 have a strong
chance of getting a loan approved.
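Figures like these are typically read off normalized value counts and cross-tabulations. The sketch below shows the pattern on a six-row hypothetical sample; the column names and values are invented for illustration and do not reproduce the percentages above.

```python
import pandas as pd

# Hypothetical six-row sample, not the project's data.
data = pd.DataFrame({
    "Education": ["Graduate", "Graduate", "Not Graduate",
                  "Graduate", "Not Graduate", "Graduate"],
    "Property_Area": ["Semiurban", "Urban", "Rural",
                      "Semiurban", "Semiurban", "Rural"],
    "Loan_Status": ["Yes", "Yes", "No", "Yes", "No", "No"],
})

# Overall share of approved and rejected loans, in percent.
approval_share = data["Loan_Status"].value_counts(normalize=True) * 100

# Approval rate within each education level (each row sums to 100).
approval_by_education = pd.crosstab(
    data["Education"], data["Loan_Status"], normalize="index"
) * 100

# Where the approved customers are located.
approved_by_area = data.loc[
    data["Loan_Status"] == "Yes", "Property_Area"
].value_counts()

print(approval_share)
print(approval_by_education)
print(approved_by_area)
```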
THANK YOU !
