Tf itpbapm

Intro to Python: Build a Predictive Model

Introductions
➔ What's your name?
➔ What brought you here today?
➔ What is your programming experience?

We train developers and data scientists through 1-on-1 mentorship and project-based learning. Guaranteed.
About Thinkful

Learn by Doing
➔ Why is Data Science a thing?
➔ What is Python?
➔ How do we use it in a real-world project?
➔ How do I learn more?

“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know anyone. So
you just stand in the corner sipping your drink —
and you probably leave early.”
— LinkedIn Manager, June 2006
Example: LinkedIn 2006

➔ Joined LinkedIn in 2006, when it had only 8M users (450M in 2016)
➔ Started experiments to predict people’s networks
➔ Engineers were dismissive: “you can already import your address book”
Enter: Data Scientist

➔ Frame the question
➔ Collect the raw data
➔ Process the data
➔ Explore the data
➔ Communicate results
The Process: LinkedIn Example

➔ What questions do we want to answer?
◆ Who?
◆ What?
◆ When?
◆ Where?
◆ Why?
◆ How?
Case: Frame the Question

➔ What connections (type and number) lead to higher user engagement?
➔ Which connections do people want to make but are currently limited from making?
➔ How might we predict these types of connections with limited data from the user?
Case: Frame the Question

➔ What data do we need to answer these questions?
Case: Collect the Data

➔ Connection data (who is connected to whom?)
➔ Demographic data (what is the profile of the connection?)
➔ Engagement data (how do they use the site?)
Case: Collect the Data

➔ How is the data “dirty” and how can we clean it?
Case: Process the Data

➔ User input
➔ Redundancies
➔ Feature changes
➔ Data model changes
Case: Process the Data
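As a rough illustration of cleaning "dirty" user input and redundancies, here is a minimal sketch in plain Python. The records, field names, and the `normalize` and `deduplicate` helpers are hypothetical and only stand in for whatever real data a site would hold.

```python
# Hypothetical raw records: free-text user input with inconsistent casing,
# stray whitespace, and one redundant entry.
raw_records = [
    {"name": "  Ada Lovelace ", "company": "ACME corp"},
    {"name": "ada lovelace", "company": "Acme Corp"},
    {"name": "Grace Hopper", "company": "  US Navy"},
]


def normalize(record):
    """Collapse whitespace and lowercase every field so equivalent entries match."""
    return {key: " ".join(value.split()).lower() for key, value in record.items()}


def deduplicate(records):
    """Keep the first occurrence of each normalized record."""
    seen = set()
    cleaned = []
    for record in map(normalize, records):
        fingerprint = tuple(sorted(record.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(record)
    return cleaned


cleaned_records = deduplicate(raw_records)
print(cleaned_records)
```

Real cleaning usually needs fuzzier matching than exact equality after normalization; this only shows the general shape of the step.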

➔ What are the meaningful patterns in the data?
Case: Explore the Data

➔ Triangle closing
➔ Time Overlaps
➔ Geographic Overlaps
Case: Explore the Data
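Triangle closing can be sketched as: if two people share a connection but are not yet connected to each other, suggest that they connect. The example below is a minimal illustration on a made-up graph; the member names and the `suggest_connections` helper are assumptions, not LinkedIn's actual method.

```python
from collections import Counter

# Hypothetical connection graph: each member maps to the set of people
# they are already connected to (connections are mutual).
connections = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee"},
    "dee": {"ben", "cy", "eli"},
    "eli": {"dee"},
}


def suggest_connections(graph, member):
    """Rank people the member is not yet connected to by number of mutual connections."""
    mutual_counts = Counter()
    for friend in graph[member]:
        for friend_of_friend in graph[friend]:
            already_connected = friend_of_friend in graph[member]
            if friend_of_friend != member and not already_connected:
                mutual_counts[friend_of_friend] += 1
    return mutual_counts.most_common()


# "ana" and "dee" share two connections ("ben" and "cy"), so "dee" is suggested.
print(suggest_connections(connections, "ana"))
```

Time and geographic overlaps could be scored in a similar way, by counting shared employers, schools, or locations instead of shared connections.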

➔ How do we communicate this?
➔ To whom?
Case: Communicate Findings

➔ Marketing - sell X more ad space, results in X more impressions per day
➔ Product - build X more features
➔ Development - grow our team by X
➔ Sales - attract X more premium accounts
➔ C-Level - more revenue, growth from 8M to 450M users in 10 years
Case: Communicate Findings

Python for Programming
➔ Great for Data Science
➔ Robotics
➔ Web Development (Python/Django)
➔ Automation
Let’s Learn Python

➔ Our model is going to be a Decision Tree
➔ Decision Trees predict the most likely outcome based on input
➔ Like a computer building a version of 20 questions
The Model
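To make the "20 questions" idea concrete, here is a hand-written stand-in for a decision tree: each `if` is one question. The function name and the weight threshold are made up for illustration; a real decision tree learns its own questions and thresholds from the training data.

```python
def guess_label(height_cm, weight_kg):
    """Hand-written stand-in for a decision tree (thresholds are illustrative only)."""
    # Question 1: is the person taller than 177 cm?
    if height_cm > 177:
        return "male"
    # Question 2 (hypothetical threshold): is the person heavier than 72 kg?
    if weight_kg > 72:
        return "male"
    # No question matched, so fall back to the remaining label.
    return "female"


print(guess_label(183, 76))
print(guess_label(160, 60))
```

Writing these rules by hand does not scale, which is why the rest of the session lets the library build them.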

➔ We’ll build this model in Colaboratory, a Google-hosted Python notebook
➔ Go to: colab.research.google.com
➔ Click New Python 3 Notebook
The Notebook

from sklearn import tree
➔ Import the tree module from the scikit-learn (sklearn) Python package
➔ bit.ly/sklearn-python
Code Block 1

X = [[181,80], [177,70], [160,60], [154,54], [166,65],
[190,90], [175,64], [177,70], [159,55], [171,75], [181,85]]
Y = ['male','female','female','female','male','male','male','female',
'male','female','male']
➔ Load in our seed data
➔ X is an array of inputs, each input is itself an array that contains Height (in cm) and Weight (in kg)
➔ Y is an array of strings that map to the inputs in X so we can train the model
Code Block 2

clf = tree.DecisionTreeClassifier()
clf = clf.fit(X,Y)
#print(tree.export_graphviz(clf, out_file=None))
➔ We create an untrained DecisionTreeClassifier and assign it to the variable clf
➔ We fit the decision tree with our X and Y seed data
➔ SKLearn automatically creates our Decision Tree questions for us (Example: Is height > 177? Yes - Male)
➔ Uncomment the last line and paste the printed string into: webgraphviz.com
Code Block 3
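If you would rather read the learned questions directly in the notebook, scikit-learn also offers `tree.export_text`. The sketch below repeats the seed data so it runs on its own; the feature names `height_cm` and `weight_kg` and the `random_state=0` setting (for repeatable output) are assumptions added here, not part of the original slides.

```python
from sklearn import tree

# Seed data from Code Block 2: [height in cm, weight in kg] and matching labels.
X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

# random_state is fixed only so repeated runs build the same tree.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, Y)

# Render the learned questions as indented plain text.
rules = tree.export_text(clf, feature_names=["height_cm", "weight_kg"])
print(rules)
```

Each indented line is one question the tree asks, and each `class:` line is the answer it gives at the end of that path.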

prediction = clf.predict([[183,76]])
print(prediction)
➔ Now we give our inputs, in the same format
➔ Height (cm), Weight (kg)
➔ Print our prediction
Code Block 4
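`predict` accepts a list of inputs, so several people can be classified in one call. This is a minimal end-to-end sketch that repeats the earlier blocks so it runs on its own; the `new_people` values and `random_state=0` are illustrative assumptions.

```python
from sklearn import tree

# Seed data from Code Block 2: [height in cm, weight in kg] and matching labels.
X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, Y)

# Each inner list is one person, in the same [height, weight] format as X.
new_people = [[183, 76], [154, 54], [190, 90]]
predictions = clf.predict(new_people)

for person, label in zip(new_people, predictions):
    print(person, "->", label)

# The model can only ever answer with labels it saw during training.
print(clf.classes_)
```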

Our model has a few weaknesses:
➔ Limited inputs (only 11 samples and two features: height and weight)
➔ Assumptions
Shortcomings
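One way to see the effect of limited inputs is to compare how the tree scores on the data it was trained on with how it does on a sample it has never seen. The sketch below uses a simple leave-one-out loop; the loop and variable names are assumptions added for illustration, and with only 11 samples the held-out result is a rough indication at best.

```python
from sklearn import tree

# Seed data from Code Block 2: [height in cm, weight in kg] and matching labels.
X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

# Accuracy on the training data itself: an unrestricted tree memorizes it.
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)
training_accuracy = clf.score(X, Y)

# Leave-one-out: train on all samples but one, then test on the one left out.
held_out_correct = 0
for i in range(len(X)):
    train_X = X[:i] + X[i + 1:]
    train_Y = Y[:i] + Y[i + 1:]
    model = tree.DecisionTreeClassifier(random_state=0).fit(train_X, train_Y)
    if model.predict([X[i]])[0] == Y[i]:
        held_out_correct += 1

print("Training accuracy:", training_accuracy)
print("Held-out correct:", held_out_correct, "of", len(X))
```

A gap between the two numbers suggests the tree is memorizing the seed data rather than learning a rule that generalizes.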

➔ Start with Python and Statistics
➔ Personal Program Manager
➔ Unlimited Q&A Sessions
➔ Student Slack Community
➔ bit.ly/freetrial-ds
Thinkful Two-Week Free Trial

The Student Experience
Marnie Boyer, Thinkful Graduate
Capstone
Wolfgang Hall, Thinkful Graduate
Capstone

➔ bit.ly/tf-event-feedback
Survey
