data science @ The New York Times
chris.wiggins@columbia.edu
chris.wiggins@nytimes.com
chris.wiggins@hackNY.org
@chrishwiggins
data natives, berlin, 2020-11-18
data science @ The New York Times
modern history:
2009
in tf
in gcp
data science: mindset & toolset
- drew conway, 2010
data science: mindset & toolset
develop + deploy
machine learning solutions
to
newsroom + business problems
data science @ The New York Times:
1851
news: 20th century
church state
every publisher is now a startup
2014
ad -v- subs crossover in 2011;
post-election data not shown
2017
news: 21st century
church state
data
learnings
- descriptive modeling
- predictive modeling
- prescriptive modeling
(actually ML, shhhh…)
- (unsupervised learning)
- (supervised learning)
- (reinforcement learning)
2012; h/t michael littman
learnings
- descriptive modeling
- predictive modeling
- prescriptive modeling
Readerscope
In the course of our global expansion, we
realized we needed to have much more
sophisticated, real-time insight into what’s
happening across our site. 

Who is reading what? And where?
[Screenshot: Readerscope interface (nytreaderscope) with tabs for Topics, Audience Segments, Locations, and FAQ, and a search box for audience segments, e.g. "High-tech Lifestyle", "Parents", "Media - Comedy Films". Illustration by Clara Nguyen.]
Tool: Readerscope
AUDIENCE INSIGHTS ENGINE
[Screenshot: Readerscope audience-segment search. Typing "C-Suite" suggests "C-Suite, Executives and BDMs - Entertainment" and "C-Suite, Executives and BDMs - Media".]
learnings
- descriptive modeling
- predictive modeling
- prescriptive modeling
predictive modeling, e.g.,
“the funnel”
interpretable predictive modeling
supercoolstuff
optimization & learning, e.g.,
“How The New York Times Works,” Popular Mechanics, 2015
optimization & prediction, e.g.,
(some models)
(some moneys)
“newsvendor problem,” literally (+prediction+experiment)
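The classic newsvendor problem can be sketched as follows. This is a minimal illustration of the textbook formulation, not the model used at The New York Times: it assumes unsold copies have zero salvage value, there is no penalty for lost sales beyond the missed revenue, and demand is represented by a list of forecast samples. The function names, prices, and demand figures are all hypothetical.

```python
import math


def newsvendor_quantity(demand_samples, price, cost):
    """Return the order quantity that maximizes expected profit.

    Under the assumptions above, the optimum is the smallest quantity q
    whose empirical CDF reaches the critical ratio (price - cost) / price.
    """
    critical_ratio = (price - cost) / price
    ordered = sorted(demand_samples)
    # Index of the smallest sample whose empirical CDF >= critical_ratio.
    k = max(math.ceil(critical_ratio * len(ordered)), 1)
    return ordered[k - 1]


def expected_profit(q, demand_samples, price, cost):
    """Average profit over the demand samples when ordering q copies."""
    total = sum(price * min(q, d) - cost * q for d in demand_samples)
    return total / len(demand_samples)


# Hypothetical demand forecast for one outlet (copies per day).
demand = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
q = newsvendor_quantity(demand, price=3.0, cost=1.0)
print(q, expected_profit(q, demand, price=3.0, cost=1.0))
```

In practice the demand samples would come from a predictive model, and the resulting allocation would be checked by experiment, which is the "+prediction+experiment" part of the slide.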
deep prediction w/ cloud AI
ming zhao: convolutional neural nets, in google
cloud platform, to help “tone” images for print
things: cloud AI
thousands of images, daily
deep prediction w/ cloud AI
Project Feels
We conducted user research to gather millions of observations about how different articles made people feel.
DATA COLLECTION
When reading this article, did you feel…
Anger Sadness Happiness Despair
Hurt No Emotion Jealousy Frustration
Anxiety Hope Hate Interest
Guilt Contentment Contempt Love
Compassion Shame Amusement Stress
Irritation Fear Boredom Surprise
Confusion Disgust Irony Pride
Disappointment
Adventurous 98
Interest 42
Happiness 96
Self Confident 39
Love 97
Hate
Inspired 100
Amused 100
Sadness 27
*units based on 100th percentile
A Record First Year
Throughout the year, NYT began running more and more perspective targeting campaigns every month. And performance kept breaking boundaries and setting new benchmarks for success.
[Chart: Perspective Targeting Impression Volume By Month, April through December. Sources: Google DFP, NYT Ad Performance Data, Sizmek]
learnings
- descriptive modeling
- predictive modeling
- prescriptive modeling
…two examples
predicting engagement
w/“audience development” team
leverage methods which are predictive yet performant
driving question: which content
should we promote, where and when?
NB: data informed, not data-driven
learnings
- descriptive modeling
- predictive modeling
- prescriptive modeling
… recommendation as prescription
2018: algos for *highly editorially curated* content pools
- smarter living
- midterms
- editors picks
2019-now: all of the above,
plus:
- For You Tab
- stay tuned…
slow: Randomized controlled trial
[Figure: bandit vs. RCT, plotted against time]
old (1933) idea: do the best you can
Lihong Li (YHOO->MSFT->GOOG), 2011
thompson sampling & “bandits”
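A minimal sketch of Thompson sampling for a Bernoulli bandit, to make the contrast with a fixed-split randomized controlled trial concrete. It assumes each arm (say, a candidate recommendation) has a hidden click rate and a uniform Beta(1, 1) prior. The arm rates, round count, and function name are hypothetical, and this is not presented as the production algorithm.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible click rate per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate the reader's response (hypothetical environment).
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.05, 0.10, 0.30])
print(pulls)
```

Unlike an RCT, which keeps the traffic split fixed until the test ends, the sampler shifts traffic toward the better-performing arm as evidence accumulates, while still occasionally exploring the others.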
common requirements in
data science:
1. people
2. ideas
3. things
cf. John Boyd, USAF
monica rogati, Aug 1 2017 hackernoon.com
things: de>da>ds/ml/ai
data science: ideas
Reporting
Learning
Test
Optimizing
Explore
descriptive:
predictive:
prescriptive:
- DARPA XAI
watch this space: NYT+AI
physics
math/fin, p chem, app math
cog sci, EE
people.. so far (we’re hiring!!!!)
astrophys, math/fin
pure math, app math
cog sci, EE
biophys, seismology, neuro
physics
mech eng, CS, mol bio, ?, CS
data science @ The New York Times
chris.wiggins@columbia.edu
chris.wiggins@nytimes.com
chris.wiggins@hackNY.org
@chrishwiggins
more info: data-ppf.github.io

Weitere ähnliche Inhalte

Ähnlich wie data science at the new york times

Data Science at The New York Times: what industry can learn from us; what we ...
Data Science at The New York Times: what industry can learn from us; what we ...Data Science at The New York Times: what industry can learn from us; what we ...
Data Science at The New York Times: what industry can learn from us; what we ...chris wiggins
 
Digital Leadership Series : Shawn O'Neal
Digital Leadership Series : Shawn O'Neal Digital Leadership Series : Shawn O'Neal
Digital Leadership Series : Shawn O'Neal Capgemini
 
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019webwinkelvakdag
 
2008 ANA Masters of Marketing Speech
2008 ANA Masters of Marketing Speech2008 ANA Masters of Marketing Speech
2008 ANA Masters of Marketing Speechpaulnprice
 
AMES 2016 - The Human Side of Analytics
AMES 2016 - The Human Side of AnalyticsAMES 2016 - The Human Side of Analytics
AMES 2016 - The Human Side of AnalyticsStephen Tracy
 
The Agency of the Future
The Agency of the FutureThe Agency of the Future
The Agency of the FutureLeslie Bradshaw
 
Advertising is Dead
Advertising is DeadAdvertising is Dead
Advertising is DeadStefan Kolle
 
State of Drupal keynote, DrupalCon Portland
State of Drupal keynote, DrupalCon PortlandState of Drupal keynote, DrupalCon Portland
State of Drupal keynote, DrupalCon PortlandDries Buytaert
 
Winning consumers on their brand journey using data and technology_Zawacki_St...
Winning consumers on their brand journey using data and technology_Zawacki_St...Winning consumers on their brand journey using data and technology_Zawacki_St...
Winning consumers on their brand journey using data and technology_Zawacki_St...National Retail Federation
 
Capturing the real customer experience
Capturing the real customer experienceCapturing the real customer experience
Capturing the real customer experiencenativeye
 
The Soft Side of Software Development / Devoxx 2019
The Soft Side of Software Development / Devoxx 2019The Soft Side of Software Development / Devoxx 2019
The Soft Side of Software Development / Devoxx 2019🎤 Hanno Embregts 🎸
 
Make Your UX Ideas Stick
Make Your UX Ideas StickMake Your UX Ideas Stick
Make Your UX Ideas StickJohn H Douglass
 
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...LaneTerralever
 
Joanna Lord - How to Operationalize Growth & Drive Revenue
Joanna Lord - How to Operationalize Growth & Drive RevenueJoanna Lord - How to Operationalize Growth & Drive Revenue
Joanna Lord - How to Operationalize Growth & Drive RevenueTuring Fest
 
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...Colin Pye
 
Formal Presentation Template.pptx
Formal Presentation Template.pptxFormal Presentation Template.pptx
Formal Presentation Template.pptxBOGORSURVEY
 

Ähnlich wie data science at the new york times (20)

Data Science at The New York Times: what industry can learn from us; what we ...
Data Science at The New York Times: what industry can learn from us; what we ...Data Science at The New York Times: what industry can learn from us; what we ...
Data Science at The New York Times: what industry can learn from us; what we ...
 
Digital Leadership Series : Shawn O'Neal
Digital Leadership Series : Shawn O'Neal Digital Leadership Series : Shawn O'Neal
Digital Leadership Series : Shawn O'Neal
 
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019
ORGANISING YOUR ADVANCED ANALYTICS PROJECTS FOR SUCCESS - Big Data Expo 2019
 
12.10.13
12.10.1312.10.13
12.10.13
 
2008 ANA Masters of Marketing Speech
2008 ANA Masters of Marketing Speech2008 ANA Masters of Marketing Speech
2008 ANA Masters of Marketing Speech
 
AMES 2016 - The Human Side of Analytics
AMES 2016 - The Human Side of AnalyticsAMES 2016 - The Human Side of Analytics
AMES 2016 - The Human Side of Analytics
 
Digital Transformation
Digital TransformationDigital Transformation
Digital Transformation
 
The Agency of the Future
The Agency of the FutureThe Agency of the Future
The Agency of the Future
 
Advertising is Dead
Advertising is DeadAdvertising is Dead
Advertising is Dead
 
Big Data for HR
Big Data for HRBig Data for HR
Big Data for HR
 
State of Drupal keynote, DrupalCon Portland
State of Drupal keynote, DrupalCon PortlandState of Drupal keynote, DrupalCon Portland
State of Drupal keynote, DrupalCon Portland
 
Winning consumers on their brand journey using data and technology_Zawacki_St...
Winning consumers on their brand journey using data and technology_Zawacki_St...Winning consumers on their brand journey using data and technology_Zawacki_St...
Winning consumers on their brand journey using data and technology_Zawacki_St...
 
Capturing the real customer experience
Capturing the real customer experienceCapturing the real customer experience
Capturing the real customer experience
 
The Soft Side of Software Development / Devoxx 2019
The Soft Side of Software Development / Devoxx 2019The Soft Side of Software Development / Devoxx 2019
The Soft Side of Software Development / Devoxx 2019
 
Make Your UX Ideas Stick
Make Your UX Ideas StickMake Your UX Ideas Stick
Make Your UX Ideas Stick
 
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...
(Content) Marketing ROI for Fun and Profit (mostly Profit)Digital summit denv...
 
Joanna Lord - How to Operationalize Growth & Drive Revenue
Joanna Lord - How to Operationalize Growth & Drive RevenueJoanna Lord - How to Operationalize Growth & Drive Revenue
Joanna Lord - How to Operationalize Growth & Drive Revenue
 
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...
2019 09 19 AdWeek - Machine marketing: How AI is powering the next generation...
 
Formal Presentation Template.pptx
Formal Presentation Template.pptxFormal Presentation Template.pptx
Formal Presentation Template.pptx
 
Aim First, Shoot Second: Mastering the Complexity of B2B Digital Marketing
Aim First, Shoot Second: Mastering the Complexity of B2B Digital MarketingAim First, Shoot Second: Mastering the Complexity of B2B Digital Marketing
Aim First, Shoot Second: Mastering the Complexity of B2B Digital Marketing
 

Mehr von chris wiggins

"data hum: a core approach to the ethics of data"
"data hum: a core approach to the ethics of data""data hum: a core approach to the ethics of data"
"data hum: a core approach to the ethics of data"chris wiggins
 
"data: past, present, and future" day 1 lecture 2020-01-20
"data: past, present, and future" day 1 lecture 2020-01-20"data: past, present, and future" day 1 lecture 2020-01-20
"data: past, present, and future" day 1 lecture 2020-01-20chris wiggins
 
history and ethics of data
history and ethics of datahistory and ethics of data
history and ethics of datachris wiggins
 
"data: past, present, and future" lecture 1 (intro) 1/22/19
"data: past, present, and future" lecture 1 (intro) 1/22/19"data: past, present, and future" lecture 1 (intro) 1/22/19
"data: past, present, and future" lecture 1 (intro) 1/22/19chris wiggins
 
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Joneschris wiggins
 
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...chris wiggins
 
Data: Past, Present, and Future (Lecture 1, Spring 2018)
Data: Past, Present, and Future (Lecture 1, Spring 2018)Data: Past, Present, and Future (Lecture 1, Spring 2018)
Data: Past, Present, and Future (Lecture 1, Spring 2018)chris wiggins
 
data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...chris wiggins
 
Machine Learning Summer School 2016
Machine Learning Summer School 2016Machine Learning Summer School 2016
Machine Learning Summer School 2016chris wiggins
 
lean + design thinking in building data products
lean + design thinking in building data productslean + design thinking in building data products
lean + design thinking in building data productschris wiggins
 
data science @NYT ; inaugural Data Science Initiative Lecture
data science @NYT ; inaugural Data Science Initiative Lecturedata science @NYT ; inaugural Data Science Initiative Lecture
data science @NYT ; inaugural Data Science Initiative Lecturechris wiggins
 
data history / data science @ NYT
data history / data science @ NYTdata history / data science @ NYT
data history / data science @ NYTchris wiggins
 
data science history / data science @ NYT
data science history / data science @ NYTdata science history / data science @ NYT
data science history / data science @ NYTchris wiggins
 
data science: past, present, and future
data science: past, present, and futuredata science: past, present, and future
data science: past, present, and futurechris wiggins
 
Chris Wiggins: "engagement & reality"
Chris Wiggins: "engagement & reality"Chris Wiggins: "engagement & reality"
Chris Wiggins: "engagement & reality"chris wiggins
 
intro data science at NYT 2015-01-22
intro data science at NYT 2015-01-22intro data science at NYT 2015-01-22
intro data science at NYT 2015-01-22chris wiggins
 
data science in academia and the real world
data science in academia and the real worlddata science in academia and the real world
data science in academia and the real worldchris wiggins
 
Lean workbench 2013-07-24
Lean workbench 2013-07-24Lean workbench 2013-07-24
Lean workbench 2013-07-24chris wiggins
 
variational bayes in biophysics
variational bayes in biophysicsvariational bayes in biophysics
variational bayes in biophysicschris wiggins
 

Mehr von chris wiggins (20)

"data hum: a core approach to the ethics of data"
"data hum: a core approach to the ethics of data""data hum: a core approach to the ethics of data"
"data hum: a core approach to the ethics of data"
 
"data: past, present, and future" day 1 lecture 2020-01-20
"data: past, present, and future" day 1 lecture 2020-01-20"data: past, present, and future" day 1 lecture 2020-01-20
"data: past, present, and future" day 1 lecture 2020-01-20
 
history and ethics of data
history and ethics of datahistory and ethics of data
history and ethics of data
 
"data: past, present, and future" lecture 1 (intro) 1/22/19
"data: past, present, and future" lecture 1 (intro) 1/22/19"data: past, present, and future" lecture 1 (intro) 1/22/19
"data: past, present, and future" lecture 1 (intro) 1/22/19
 
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones
"data: past, present, and future" lab 2 (EDA) notes by Prof. Matt Jones
 
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...
Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literac...
 
Data: Past, Present, and Future (Lecture 1, Spring 2018)
Data: Past, Present, and Future (Lecture 1, Spring 2018)Data: Past, Present, and Future (Lecture 1, Spring 2018)
Data: Past, Present, and Future (Lecture 1, Spring 2018)
 
data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...data science: past present & future [American Statistical Association (ASA) C...
data science: past present & future [American Statistical Association (ASA) C...
 
Machine Learning Summer School 2016
Machine Learning Summer School 2016Machine Learning Summer School 2016
Machine Learning Summer School 2016
 
lean + design thinking in building data products
lean + design thinking in building data productslean + design thinking in building data products
lean + design thinking in building data products
 
data science @NYT ; inaugural Data Science Initiative Lecture
data science @NYT ; inaugural Data Science Initiative Lecturedata science @NYT ; inaugural Data Science Initiative Lecture
data science @NYT ; inaugural Data Science Initiative Lecture
 
data history / data science @ NYT
data history / data science @ NYTdata history / data science @ NYT
data history / data science @ NYT
 
data science history / data science @ NYT
data science history / data science @ NYTdata science history / data science @ NYT
data science history / data science @ NYT
 
data science: past, present, and future
data science: past, present, and futuredata science: past, present, and future
data science: past, present, and future
 
Chris Wiggins: "engagement & reality"
Chris Wiggins: "engagement & reality"Chris Wiggins: "engagement & reality"
Chris Wiggins: "engagement & reality"
 
intro data science at NYT 2015-01-22
intro data science at NYT 2015-01-22intro data science at NYT 2015-01-22
intro data science at NYT 2015-01-22
 
data science in academia and the real world
data science in academia and the real worlddata science in academia and the real world
data science in academia and the real world
 
Lean workbench 2013-07-24
Lean workbench 2013-07-24Lean workbench 2013-07-24
Lean workbench 2013-07-24
 
Wiggins 2013 05-29
Wiggins 2013 05-29Wiggins 2013 05-29
Wiggins 2013 05-29
 
variational bayes in biophysics
variational bayes in biophysicsvariational bayes in biophysics
variational bayes in biophysics
 

Kürzlich hochgeladen

Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Coursebim.edu.pl
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm Systemirfanmechengr
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
Immutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfImmutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfDrew Moseley
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Erbil Polytechnic University
 
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMMchpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMM
chpater16.pptxMMMMMMMMMMMMMMMMMMMMMMMMMMMNanaAgyeman13
 
Risk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectRisk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectErbil Polytechnic University
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 

Kürzlich hochgeladen (20)

Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Course
 
Main Memory Management in Operating System

data science at the new york times

  • 1. data science @ The New York Times chris.wiggins@columbia.edu chris.wiggins@nytimes.com chris.wiggins@hackNY.org @chrishwiggins data natives, berlin, 2020-11-18
  • 2. data science @ The New York Times
  • 8. data science: mindset & toolset - drew conway, 2010
  • 9. data science: mindset & toolset develop + deploy machine learning solutions to newsroom + business problems
  • 10. data science @ The New York Times:
  • 11. data science @ The New York Times
  • 12. 1851
  • 17. every publisher is now a startup 2014
  • 18. ad -v- subs crossover in 2011; post-election data not shown 2017
  • 20. learnings - descriptive modeling - predictive modeling - prescriptive modeling
  • 21. (actually ML, shhhh…) - (unsupervised learning) - (supervised learning) - (reinforcement learning)
  • 22. (actually ML, shhhh…) 2012; h/t michael littman
  • 23. learnings - descriptive modeling - predictive modeling - prescriptive modeling
  • 24. Readerscope: In the course of our global expansion, we realized we needed to have much more sophisticated, real-time insight into what’s happening across our site. Who is reading what? And where?
  • 25. Tool: Readerscope, audience insights engine. [Screenshot: Readerscope page with LOCATIONS, AUDIENCE SEGMENTS, TOPICS, and FAQ tabs, placeholder FAQ text, and an audience segment search (e.g., High-tech Lifestyle, Parents, Media - Comedy Films). Illustration by Clara Nguyen.]
  • 26. Tool: Readerscope, audience insights engine. [Screenshot: audience segment search for "C-Suite", with suggestions C-Suite; C-Suite, Executives and BDMs - Entertainment; C-Suite, Executives and BDMs - Media. Illustration by Clara Nguyen.]
  • 27. learnings - descriptive modeling - predictive modeling - prescriptive modeling
  • 30. optimization & learning, e.g., “How The New York Times Works,” popular mechanics, 2015
  • 31. optimization & prediction, e.g., (some models) (some moneys) “newsvendor problem,” literally (+prediction+experiment)
  • 32. deep prediction w/ cloud AI ming zhao: convolutional neural nets, in google cloud platform, to help “tone” images for print
  • 33. deep prediction w/ cloud AI: thousands of images, daily (things: cloud AI)
  • 35. We conducted user research to gather millions of observations about how different articles made people feel. DATA COLLECTION When reading this article, did you feel… Anger Sadness Happiness Despair Hurt No Emotion Jealousy Frustration Anxiety Hope Hate Interest Guilt Contentment Contempt Love Compassion Shame Amusement Stress Irritation Fear Boredom Surprise Confusion Disgust Irony Pride Disappointment
  • 36. [Chart: emotion scores, *units based on 100th percentile] Adventurous 98; Interest 42; Happiness 96; Self-Confident 39; Love 97; Hate; Inspired 100; Amused 100; Sadness 27
  • 38. A Record First Year. Throughout the year, NYT began running more and more perspective targeting campaigns every month. And performance kept breaking boundaries and setting new benchmarks for success. [Chart: Perspective Targeting Impression Volume By Month, April through December. Sources: Google DFP, NYT Ad Performance Data, Sizmek]
  • 39. learnings - descriptive modeling - predictive modeling - prescriptive modeling
  • 40. learnings - descriptive modeling - predictive modeling - prescriptive modeling …two examples
  • 43. leverage methods which are predictive yet performant w/“audience development” team
  • 44. driving question: which content should we promote, where and when?
  • 45. NB: data informed, not data-driven
  • 46. learnings - descriptive modeling - predictive modeling - prescriptive modeling … recommendation as prescription
  • 47. 2018: algos for *highly editorially curated* content pools - smarter living - midterms - editors picks
  • 48. 2019-now: all of the above, plus: - For You Tab - stay tuned…
  • 53. slow: randomized controlled trial (RCT) [figure: time axis comparing bandit and RCT]
  • 54. old (1933) idea: do the best you can
  • 55. Lihong Li (YHOO->MSFT->GOOG), 2011 thompson sampling & “bandits”
  • 57. common requirements in data science: 1. people 2. ideas 3. things cf. John Boyd, USAF
  • 58. monica rogati, Aug 1 2017 hackernoon.com things: de>da>ds/ml/ai
  • 64. watch this space: NYT+AI physics math/fin p chem app math cog sci EE
  • 65. people.. so far (we’re hiring!!!!) astrophys math/fin pure math app math cog sci EE biophys seismology neuro physics mech eng CS mol bio ?CS
  • 66. data science @ The New York Times chris.wiggins@columbia.edu chris.wiggins@nytimes.com chris.wiggins@hackNY.org @chrishwiggins more info: data-ppf.github.io
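
The bandit slides (53 to 55) name Thompson sampling but do not show it. What follows is a minimal sketch of the textbook Beta-Bernoulli version, not the NYT implementation: the arms, their click rates, the uniform Beta(1, 1) prior, and the function name `thompson_sampling` are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling.

    true_rates: hypothetical per-arm success probabilities (unknown to the
    learner; used here only to simulate rewards).
    Returns (pulls, successes, failures), one list entry per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Sample a plausible success rate for each arm from its posterior,
        # Beta(1 + successes, 1 + failures), i.e. a uniform prior plus data.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update that arm's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.05, 0.5], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Unlike a fixed-split randomized controlled trial, the allocation shifts toward the better-performing arm as evidence accumulates, which is the contrast the "slow: RCT" slide draws.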