Atul Butte's presentation for the FDA 5th Annual Scientific Computing Days
1. Translating a Trillion Points of Data into
Diagnostics, Therapies and New Insights
in Health and Disease
atul.butte@ucsf.edu
@atulbutte
Atul Butte, MD, PhD
Director, Institute for Computational
Health Sciences
University of California, San Francisco
12. 227 million substances x
1.3 million assays
More than a billion measurements
within a grid of 300 trillion cells
71 million meet Lipinski 5
1.2 million active substances
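The grid size quoted on this slide follows from the two counts, assuming one cell per substance-assay pair. The sketch below checks that arithmetic and adds a small rule-of-five helper using the commonly cited Lipinski thresholds. The function names and the example molecule values are illustrative assumptions, not taken from the talk.

```python
# Counts quoted on the slide.
SUBSTANCES = 227_000_000
ASSAYS = 1_300_000
MEASUREMENTS = 1_000_000_000  # "more than a billion", used here as a lower bound


def grid_cells(substances: int, assays: int) -> int:
    """Number of substance x assay cells if every pair were tested."""
    return substances * assays


def fill_fraction(measurements: int, cells: int) -> float:
    """Share of the grid that actually holds a measurement."""
    return measurements / cells


def lipinski_violations(mol_weight: float, log_p: float,
                        h_donors: int, h_acceptors: int) -> int:
    """Count rule-of-five violations (commonly cited thresholds)."""
    return sum([
        mol_weight > 500,   # molecular weight in daltons
        log_p > 5,          # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ])


cells = grid_cells(SUBSTANCES, ASSAYS)
print(f"{cells:.3e} cells in the grid")
print(f"{fill_fraction(MEASUREMENTS, cells):.2e} of cells measured (lower bound)")

# Hypothetical small molecule, values chosen only for illustration.
print(lipinski_violations(mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4))
```

The product comes out just under 300 trillion, so the slide's figure is a rounding of the same calculation, and the grid is extremely sparse.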
16. Preeclampsia: large cause of maternal and fetal death
• Incidence
• 5-8% of all pregnancies in the U.S. and worldwide
• 4.1 million births in the U.S. in 2009
• Up to 300K cases of preeclampsia annually in the U.S.
• Mortality
• Responsible for 18% of all maternal deaths in the U.S.
• Maternal death in 56 out of every 100,000 live births in the U.S.
• Neonatal death in 71 out of every 100,000 live births in the U.S.
• Cost
• $20 billion in direct costs in the U.S. annually
• Average hospital stay of 3.5 days
Linda Liu
Bruce Ling
Matt Cooper
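The incidence bullets on this slide can be cross-checked with simple arithmetic. This is a minimal sketch assuming the 5-8% incidence applies uniformly to the 2009 U.S. birth count. The names are illustrative.

```python
BIRTHS_2009 = 4_100_000          # U.S. births in 2009, from the slide
INCIDENCE_RANGE = (0.05, 0.08)   # 5-8% of pregnancies, from the slide


def expected_cases(births: int, incidence: float) -> int:
    """Expected annual cases for a given incidence rate."""
    return round(births * incidence)


low, high = (expected_cases(BIRTHS_2009, rate) for rate in INCIDENCE_RANGE)
print(f"Expected cases per year: {low:,} to {high:,}")
```

The slide's "up to 300K cases" sits inside the range this calculation produces.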
20. New blood markers for preeclampsia
Linda Liu
Bruce Ling
Matt Cooper
@MarchofDimes
bit.ly/preeclamp
21. Need a
diagnostic for
preeclampsia
Public big data
available
March of Dimes
Center for
Prematurity
Research
Data analyzed,
diagnostic
designed
SPARK grant
($50k)
Life Science
Angels, other
seed investors
($2 million)
@CarmentaBio
progenity.com
bit.ly/carm_prog
24. Credit: Russ Altman and team
Human genome sequence can be used to predict drug adverse events
25. • Study published in 2008 in
Inflammatory Bowel Disease
• Crohn’s Disease and Ulcerative
Colitis
• Investigated 9 loci in 700 Finnish
IBD patients
• We record 100+ items
– GWAS, non-GWAS papers
– Disease, Phenotype
– Population, Gender
– Alleles and Genotypes
– p-value (and confidence)
– Odds ratio (and confidence)
– Technology, Study design
– Genetic model
• Mapped to UMLS concepts
Rong Chen
Optra Systems
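The curated items listed on this slide suggest a record structure along the following lines. This is a hypothetical schema for illustration only: the class name, field names, and placeholder values are assumptions, and the real curation records hold 100+ items.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CuratedAssociation:
    """One curated variant-disease finding (hypothetical schema)."""
    paper_id: str                     # identifier of the source publication
    is_gwas: bool                     # GWAS vs. non-GWAS paper
    disease: str
    population: str
    snp_id: str
    alleles: Tuple[str, ...]
    phenotype: Optional[str] = None
    gender: Optional[str] = None
    genotype: Optional[str] = None
    p_value: Optional[float] = None
    odds_ratio: Optional[float] = None
    odds_ratio_ci: Optional[Tuple[float, float]] = None
    technology: Optional[str] = None
    study_design: Optional[str] = None
    genetic_model: Optional[str] = None
    umls_concepts: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of optional items the curator has not filled in yet."""
        return [name for name, value in vars(self).items() if value is None]


record = CuratedAssociation(
    paper_id="example-paper",   # placeholder, not a real identifier
    is_gwas=False,              # illustrative value
    disease="Crohn's disease",
    population="Finnish",
    snp_id="rs0000000",         # placeholder, not a real SNP
    alleles=("C", "T"),
)
print(record.missing_fields())
```

Keeping the statistical fields optional reflects that not every paper reports every item.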
27. • Study published in 2009
in Rheumatology
• Ankylosing spondylitis
• Investigated 8 SNPs in IL23R in 2000 UK case-control patients
• Tables can be rotated
• NLP is hard
32. Alleles for rs1004819 are C and T
~11% of records reported genotypes in the negative strand
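When a paper reports genotypes on the negative strand, they need to be complemented before they can be compared with forward-strand alleles. The following is a minimal sketch of that normalization, assuming the forward-strand alleles are known for the SNP. The function names are illustrative, and this is not the curation pipeline's actual code.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(genotype: str) -> str:
    """Base-wise complement of a genotype string such as 'AG'."""
    return "".join(COMPLEMENT[base] for base in genotype.upper())


def to_forward_strand(genotype: str, forward_alleles: set) -> str:
    """Return the genotype expressed in forward-strand alleles.

    Raises ValueError when the strand cannot be decided, e.g. for A/T or
    C/G SNPs whose complement is the same allele pair.
    """
    if {COMPLEMENT[a] for a in forward_alleles} == set(forward_alleles):
        raise ValueError("strand-ambiguous SNP; alleles alone cannot decide")
    reported = genotype.upper()
    if set(reported) <= set(forward_alleles):
        return reported
    flipped = complement(reported)
    if set(flipped) <= set(forward_alleles):
        return flipped
    raise ValueError(f"genotype {genotype!r} matches neither strand")


# rs1004819 has alleles C and T, as stated on the slide.
for reported in ["CT", "AG", "GG"]:
    print(reported, "->", to_forward_strand(reported, {"C", "T"}))
```

C/T SNPs are unambiguous because their complement (G/A) shares no allele with the forward pair. The explicit check guards the cases where that is not true.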
33. Credit: Rong Chen, Optra Systems, and Personalis, Inc.
Important genome differences “locked up” in publications
34. Credit: Rong Chen, Optra Systems, and Personalis, Inc.
Collect the “big data” of findings across publications to
analyze the “big data” of the genome
37. Number of papers curated: ~19,000
Number of records: ~1.6 million
Distinct SNPs: ~473,000
Diseases and phenotypes: ~7,400
Rong Chen
Optra Systems
Personalis
VARIMED: Variants Informing Medicine
Chen R, Davydov EV, Sirota M, Butte AJ.
PLoS One.
2010 October: 5(10): e13574.
44. Need to use
genomes to
predict
disease
Publications
available for
curation
Stanford
donor
funding
Science
curated,
methods
designed
Company
launched,
Stanford
license
MDV,
Lightspeed,
Abingworth
($20 million)
Same 3 plus
Wellington
Shields ($22
million)
Series C ($33
million)
51. Cancer Discovery 2013, 3:1.
Psychiatric Drug Imipramine Shows Significant Activity
Against Small Cell Lung Cancer
Vehicle control Imipramine
p53/Rb/p130
triple knockout
model of SCLC
Mice dosed after
tumor formation
Joel Dudley
Nadine Jahchan
Julien Sage
Alejandro Sweet-Cordero
Joel Neal
@NuMedii
52. Bin Chen
Wei Wei
Li Ma
Bin Yang
Mei-Sze Chua
Samuel So
Gastroenterology, 2017
53. Need more drugs
for more diseases
Public big data
available
NIH funding
Data analyzed,
method designed
Company launched,
ARRA, StartX,
Stanford license,
first deal
Claremont Creek,
Lightspeed ($3.5
million)
@NuMedii
54. The next big open data: clinical trials
Download 100+ studies today
Drug repositioning, new patient subsets,
digital comparative effectiveness, more!
immport.org
Sanchita Bhattacharya
Jeff Wiser
55. Reanalyzing RAVE
• Rituximab in ANCA-Associated Vasculitis (RAVE) trial of new approach to
the induction of remission
– randomized
– double-blind
– double-dummy
– active-controlled
– non-inferiority
58. Reproduce CD19+ B-cell depletion
using publicly released clinical trials data
Nasrallah M, …, Butte AJ. Arthritis Research & Therapy (2015) 17:262.
59. RAVE re-analysis
• 63 of the 99 patients in the rituximab group (64%) reached the
primary end point, as compared with 52 of 98 in the control group
(53%).
• The treatment difference of 11 percentage points between the groups met the
criterion for non-inferiority (P<0.001).
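The response rates and the difference quoted above can be reproduced from the raw counts. The sketch below also computes a simple Wald 95% confidence interval for the difference. That interval method and the example margin are assumptions made for illustration, since the slide does not state how the non-inferiority test was constructed.

```python
import math


def risk_difference(success_a: int, n_a: int, success_b: int, n_b: int,
                    z: float = 1.96):
    """Return both proportions, their difference, and a Wald interval."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return p_a, p_b, diff, (diff - z * se, diff + z * se)


def non_inferior(ci_lower: float, margin: float) -> bool:
    """Non-inferiority holds when the lower bound stays above -margin."""
    return ci_lower > -margin


# Counts from the slide: rituximab 63/99, control 52/98.
p_rtx, p_ctl, diff, ci = risk_difference(63, 99, 52, 98)
print(f"rituximab {p_rtx:.0%}, control {p_ctl:.0%}, difference {diff:.1%}")
print(f"Wald 95% CI: {ci[0]:.1%} to {ci[1]:.1%}")

# The margin below is an illustrative value, not taken from the slide.
print(non_inferior(ci[0], margin=0.20))
```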
In retrospect, do any measured factors
predict response?
Mazen Nasrallah
Nasrallah M, …, Butte AJ. Arthritis Research & Therapy (2015) 17:262.
60. Nasrallah M, …, Butte AJ. Arthritis Research & Therapy (2015) 17:262.
Granularity index higher in rituximab-treated
subjects with remission
Granulocyte Subpopulations and Treatment Outcomes
Panel A: representative bi-dimensional dot-plot of granulocyte sub-
populations identified by ImmPortFLOCK on the basis of FSC and SSC.
A1: Hypogranular granulocytes with an SSC of low or positive (2 or 3).
A2: Hypergranular granulocytes with an SSC of high (4).
Panel B: granularity index at day 0 among patients receiving rituximab or
cyclophosphamide, stratified by treatment outcome (failure: red,
success: blue). Data distribution is shown as a boxplot, with mean ±
SEM represented by dots and small error bars. CYC: cyclophosphamide,
RTX: rituximab. A Welch two-sided t-test was used to calculate
significance.
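The caption mentions a Welch two-sided t-test. As a minimal sketch of what that test computes, the block below derives the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The granularity-index values are made up for illustration and are not trial data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group variance of the mean (sample variance divided by n).
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical day-0 granularity index values (percent), not real data.
remission = [35.0, 52.5, 41.0, 60.2, 48.7]
failure = [10.1, -5.3, 22.4, 3.8, 15.0, -12.6]

t_stat, dof = welch_t(remission, failure)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Turning the statistic into a two-sided p-value requires the t distribution's CDF. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test in one call.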
61. Nasrallah M, …, Butte AJ. Arthritis Research & Therapy (2015) 17:262.
ANCA-associated vasculitis
Current method
• Non-profiled therapy: 100% of patients
• Treat with either rituximab or cyclophosphamide according to best clinical judgement
• Average remission rate ~60%
Proposed method
• Measure the Granularity Index (GI)
• Is GI ≤ -9.25% OR GI ≥ 47.6%?
– YES: profiled therapy, ~54% of patients
– NO: non-profiled therapy, ~46% of patients
• Profiled therapy branches (GI ≤ -9.25%; GI ≥ 47.6%)
– Treat with rituximab: ~30% of patients, remission rate ~83%; do not treat with cyclophosphamide (failure rate ~67%)
– Treat with cyclophosphamide: ~24% of patients, remission rate ~66%; do not treat with rituximab (failure rate ~70%)
Mazen Nasrallah
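The first branch of the proposed method, deciding whether a patient falls into the profiled group, can be sketched as a threshold check. This illustrates the branching logic only, using the two cutoffs quoted on the slide. The function names and GI values are hypothetical, and the sketch does not encode which drug each tail maps to.

```python
LOW_CUTOFF = -9.25   # percent, from the slide
HIGH_CUTOFF = 47.6   # percent, from the slide


def is_profiled(gi_percent: float) -> bool:
    """True when GI falls in either tail, i.e. the YES branch."""
    return gi_percent <= LOW_CUTOFF or gi_percent >= HIGH_CUTOFF


def split_cohort(gi_values):
    """Partition GI values into profiled and non-profiled groups."""
    profiled = [g for g in gi_values if is_profiled(g)]
    non_profiled = [g for g in gi_values if not is_profiled(g)]
    return profiled, non_profiled


# Patient shares quoted on the slide (percent); they should sum to 100.
RITUXIMAB_SHARE, CYCLOPHOSPHAMIDE_SHARE, NON_PROFILED_SHARE = 30, 24, 46
assert RITUXIMAB_SHARE + CYCLOPHOSPHAMIDE_SHARE + NON_PROFILED_SHARE == 100

# Made-up GI values for illustration only.
made_up_gi = [-20.0, -9.25, 0.0, 12.5, 47.6, 60.0]
profiled, non_profiled = split_cohort(made_up_gi)
print(len(profiled), "profiled;", len(non_profiled), "non-profiled")
```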
63. • Founded 2015
• 38 affiliated faculty members from UCSF’s four
top-ranked schools
– 5 in National Academy of Medicine
– 1 in National Academy of Sciences
– 2 in the American Society for Clinical Investigation
– 3 NIH Director’s Awards
– 2 Sloan Foundation fellows
– 1 HHMI faculty scholar
– 1 MacArthur Foundation fellow
– 1 Chan/Zuckerberg faculty fellow
64. Build the strongest team in the world in
biomedical computation and health data analytics
• Academic affinity home for faculty and staff
• Research and development (and spin out technologies)
• Develop new educational plans
• Bring the best new computational and informatics faculty members
to UCSF
• Organize infrastructure and operations
• Build and use our new data assets for precision medicine
66. Combining healthcare data from across the
six University of California medical schools and systems
Clinical Data Warehouse
A Big UC Healthcare Data Analytics Platform
69. What could we do with clinical data?
• Clinical researcher at UCLA can run a genome-wide association study across UC Health
• Mobile health researcher at UCSD can enable patients to contribute data for research
• Community activist and researcher at UC Merced can study environmental factors
contributing to health and disease
• Transplant patient at UC Irvine can download all their data across UC Health
• Data scientist at UC Santa Barbara can model development of Alzheimer's disease and build
a multi-modal predictor
• App designer at UC Riverside can show patients their choices with chronic disease
• CMO at UCSF can build predictive models for readmission, test, share across UC Health
• AI researcher at UC Berkeley can build deep-learning models for image-based diagnostics
• Health services researcher at UC Davis can build predictive models for drug efficacy, and
maybe enable pay-for-performance
• Cancer genomics researcher at UCSC can study all our clinical cancer genomes
71. Take home points:
• Plenty of high-quality data already available:
some public, some private
• Don’t wait for perfection; data always
getting better
• Use and intersect data to ask new questions,
to innovate new diagnostics and drugs
• Academia and industry are compatible: the science
can and will continue in industry
72. UC Clinical Data Warehouse Team
Executive Team
• Atul Butte
• Joe Bengfort
• Michael Pfeffer
• Tom Andriola
• Chris Longhurst
Steering Committee
• Irfan Chaudhry
• Mohammed Mahbouba
• Lisa Dahm
• David Dobbs
• Kent Andersen
• Ralph James
• Jennifer Holland
• Eugene Lee
ETL Team
• Albert Dugan
• Tony Choe
• Michael Sweeney
• Timothy Satterwhite
• Ayan Patel
• Niranjan Wagle
• Ralph James
• Joseph Dalton
Data Harmonization
• Dana Ludwig
• Daniella Meeker
Data Quality
• Momeena Ali
• Jodie Nygaard
Epic
• Kevin Ames
• Ben Jenkins
• Steve Gesualdo
Business Analyst
• Ankeeta Shukla
Hardware
• Sandeep Chandra
• Jeff Love
• Scott Bailey
• Kwong Law
• Pallav Saxena
Support
• Jack Stobo
• Michael Blum
• Sam Hawgood
73. Collaborators
• Jeff Wiser, Patrick Dunn, Mike Atassi /
Northrop Grumman
• Ashley Xia and Quan Chen / NIAID
• Takashi Kadowaki, Momoko Horikoshi, Kazuo
Hara, Hiroshi Ohtsu / U Tokyo
• Kyoko Toda, Satoru Yamada, Junichiro Irie /
Kitasato Univ and Hospital
• Shiro Maeda / RIKEN
• Jeff Olgin / Cardiology
• Alejandro Sweet-Cordero, Julien Sage /
Pediatric Oncology
• Mark Davis, C. Garrison Fathman /
Immunology
• Russ Altman, Steve Quake / Bioengineering
• Euan Ashley, Joseph Wu, Tom Quertermous /
Cardiology
• Mike Snyder, Carlos Bustamante, Anne Brunet
/ Genetics
• Jay Pasricha / Gastroenterology
• Rob Tibshirani, Brad Efron / Statistics
• Hannah Valantine, Kiran Khush / Cardiology
• Ken Weinberg / Pediatric Stem Cell
Therapeutics
• Mark Musen, Nigam Shah / National Center for
Biomedical Ontology
• Minnie Sarwal / Nephrology
• David Miklos / Oncology
74. Support
Admin and Tech Staff
• Mary Lyall
• Mounira Kenaani
• Kevin Kaier
• Boris Oskotsky
• Mae Moredo
• Ada Chen
• University of California, San Francisco
• NIH: NIAID, NLM, NIGMS, NCI, NHLBI, OD, NIDDK, NHGRI, NIA, NCATS
• March of Dimes
• Juvenile Diabetes Research Foundation
• Hewlett Packard
• Howard Hughes Medical Institute
• California Institute for Regenerative Medicine
• Luke Evnin and Deann Wright (Scleroderma Research Foundation)
• Clayville Research Fund
• PhRMA Foundation
• Stanford Cancer Center, Bio-X, SPARK
• Tarangini Deshpande
• Kimayani Butte
• Sam Hawgood
• Keith Yamamoto
• Isaac Kohane