Good Biostatistical Report Practices
Being Honest in Data Analysis
Nicholas P. Jewell
Departments of Statistics &
School of Public Health (Biostatistics)
University of California, Berkeley
March 26, 2011
Nicholas Jewell MedicReS World Congress 2011
4. References
• Bailar, J.C. and Mosteller, F. “Guidelines for
statistical reporting in articles for medical
journals: Amplifications and explanations,”
Annals of Internal Medicine, 108, 1988, 266-273.
• Lang, T.A. and Secic, M. “How to Report
Statistics in Medicine: Annotated Guidelines for
Authors, Editors, and Reviewers,” 2nd Edition,
American College of Physicians, 2006.
• Altman, D. “Statistical reviewing for medical
journals,” Statistics in Medicine, 17, 1998,
2661-2674.
© Nicholas P. Jewell, 2011
5. Overview
• Opportunistic Data Torturing
  – Multiple comparisons
  – Multiple analyses
  – Subgroup analyses
• Procrustean Data Torturing
  – Subgroup analyses (or lack thereof)
  – Meta-analyses
• Sensitivity Analyses/Proxies/Assumptions/Honesty
• Blinding the Statistician (who funds the statistician?)
• Pre-specified Data Driven Analyses
• Dissemination and Publication
6. What is the Recipe for a Good
Biostatistical Review?
The Four Hs
• What basic questions should be asked?
  – GBRS Questions provide an excellent checklist
    (guidelines etc.)
  – How were the data collected? (e.g. design/sample size
    issues/loss to follow-up)
  – How were the variables measured?
  – How were the data analyzed? (e.g. methods/assumptions,
    software, pre-specified methods)
  – How were the results reported? (e.g. sensitivity,
    interpretation, uncertainty, conflicts of interest)
7. Deeper Questions
The Five Ws
• What is the population of interest?
• What is the data-generating mechanism? (e.g.
assumptions, missingness or data filtering)
• What are the parameters of interest that describe the
data-generating mechanism? (e.g. causal inference)
• Which of these parameters are identifiable and how can
they be estimated effectively?
• Where can I find the data and the software?
8. The MIRA Trial
• Gates Foundation study to determine the
effectiveness of a latex diaphragm in the reduction
of heterosexual acquisition of HIV among women
• Two-arm, randomized, controlled trial
• Primary intervention: diaphragm and gel provision
to diaphragm arm (nothing to control arm)
• Secondary intervention: intensive condom provision
and counseling given to both arms, plus treatment
of STIs
• Trial is not blinded
• 5000 women seen for 18 months in three sites in
Zimbabwe and South Africa
9. MIRA Trial: Basic Intention to
Treat Results
• Basic Intent-to-Treat Analysis:
  – 158 new HIV infections in Diaphragm Arm
  – 151 new HIV infections in Control Arm
• ITT estimate of Relative Risk is 1.05 with a
95% CI of (0.84, 1.30)
• End of story...?
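The relative risk and interval above can be approximated from the two infection counts. The following is a minimal sketch, assuming 2500 women per arm (the slides give only a total of about 5000, so the even split is an assumption) and a Wald interval on the log scale, which may not be the method the trial analysts used. The function name `relative_risk_ci` is illustrative.

```python
import math
from statistics import NormalDist


def relative_risk_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Relative risk of arm A vs. arm B with a Wald interval on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Infection counts are from the slide; 2500 per arm is an ASSUMPTION.
N_PER_ARM = 2500
rr, lower, upper = relative_risk_ci(158, N_PER_ARM, 151, N_PER_ARM)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the sketch gives values close to the reported 1.05 (0.84, 1.30). The interval includes 1, which is why the basic ITT analysis alone shows no clear effect.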
10. MIRA Trial: Basic Intention to
Treat Results
• However, condom use differed between the
two arms:
  – 53.5% in Diaphragm Arm (by visit)
  – 85.1% in Control Arm (by visit)
• Could this mean that the diaphragm was more
effective than it appeared from the basic
analysis?
• To make sense of this, we’d like to
understand the role of condom use in
mediating the effect of treatment assignment
on HIV infection.
11. Most Important Public Health Questions
1. What is the effectiveness of providing study product in
environment of country-level standard condom
counseling?
(in environment of no condom counseling?)
2. How does providing study product alone compare to
consistent condom use alone in reducing HIV
transmission?
3. How does providing the study product alone compare
to unprotected sex, in terms of risk of HIV infection?
None of these questions are answered by basic ITT analysis
12. Reasons effectiveness may be different in environment
of intensive condom counseling vs. environment of
country-level standard condom counseling
A. In trials without blinding, effect of intensive
condom counseling may be different for
treatment arm and control arm.
B. Effects of intensive condom counseling may be
different for those who adhere to assigned
treatment and those who don’t.
C. Intensive condom counseling may itself affect
adherence to assigned treatment.
13. Reason A: In unblinded trials, effect of
intensive condom counseling may be
different for treatment arm and control arm.
Assume: Condoms 80% protective, Study Product 20% efficacy,
and Intensive Condom Counseling less effect in treatment arm.
(Cell entries are infections; the original figure shades cells as
Study Product (or placebo) Users and Condom Users.)

No Condom Counseling
                 Non-adherers (25%)   Adherers (75%)   Total
Treatment Arm    50                   40  40  40       170
Control Arm      50                   50  50  50       200
RR: 170/200 = 0.85

Intensive Condom Counseling
                 Non-adherers (25%)   Adherers (75%)   Total
Treatment Arm    50                   40   8   8       106
Control Arm      50                   10  10  10        80
RR: 106/80 = 1.33
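The two relative risks on this slide follow from simple arithmetic. The sketch below reproduces them under one reading of the figure, which is an assumption: each arm is split into four equal groups with 50 baseline infections each, and intensive counseling leads three of the four control groups but only two of the four treatment groups to use condoms. The names `expected_infections`, `PRODUCT_EFFICACY`, and `CONDOM_EFFICACY` are illustrative.

```python
PRODUCT_EFFICACY = 0.20  # from the slide: study product 20% efficacy
CONDOM_EFFICACY = 0.80   # from the slide: condoms 80% protective
BASELINE = 50.0          # ASSUMPTION: infections per group with no protection


def expected_infections(groups):
    """Sum infections over groups given as (uses_product, uses_condoms) pairs."""
    total = 0.0
    for uses_product, uses_condoms in groups:
        infections = BASELINE
        if uses_product:
            infections *= 1 - PRODUCT_EFFICACY
        if uses_condoms:
            infections *= 1 - CONDOM_EFFICACY
        total += infections
    return total


# One non-adherent group and three adherent groups per arm (25% / 75%).
# Placebo gives no protection, so control groups never count as product users.
no_counseling = {
    "treatment": [(False, False), (True, False), (True, False), (True, False)],
    "control": [(False, False)] * 4,
}
# ASSUMPTION: counseling moves 3 control groups but only 2 treatment groups to condoms.
intensive_counseling = {
    "treatment": [(False, False), (True, False), (True, True), (True, True)],
    "control": [(False, False), (False, True), (False, True), (False, True)],
}

rr_none = expected_infections(no_counseling["treatment"]) / expected_infections(
    no_counseling["control"]
)
rr_intensive = expected_infections(
    intensive_counseling["treatment"]
) / expected_infections(intensive_counseling["control"])

print(f"No counseling RR:        {rr_none:.2f}")
print(f"Intensive counseling RR: {rr_intensive:.2f}")
```

With these inputs the treatment arm looks protective without counseling (170/200) and harmful with it (106/80), even though the product's efficacy is the same 20% in both scenarios.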
15. Estimating Direct Effects: Adjusting for
a Mediator (condom use)
[Diagram: Treatment Group (Diaphragm use) → Condom Use → HIV Infection;
the arrow from Treatment Group straight to HIV Infection is labeled DIRECT EFFECT]
• We want to estimate the direct effect of diaphragm provision, at a set
level of condom use (Petersen et al. 2006, Robins and Greenland 1992,
Pearl 2000, Rosenblum et al. 2009).
• Still ITT interpretation
• Requires stronger assumptions than basic ITT
16. Estimating Direct Effects: Stratify on
condom use?
(danger: condom use is not randomized)
[Diagram: Treatment Group (Diaphragm use), Condom Use, HIV Infection,
and Confounders]
After stratification on condom use:
[Diagram: Treatment Group (Diaphragm use), HIV Infection, and Confounders]
Randomization hasn’t ruled out confounding of the direct effect!
Now we have to adjust for confounders (but we are still ITT).
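A small numerical sketch can show why stratifying on condom use is not enough. It uses a fully hypothetical population with one binary confounder U (say, risk-taking) that affects both condom use and infection risk, and sets the true direct effect of arm to exactly 1. All probabilities below are invented for illustration and are not MIRA data.

```python
P_U1 = 0.5       # HYPOTHETICAL: P(U = 1), share of risk-takers
DIRECT_RR = 1.0  # true direct effect of arm on infection (none, by construction)

# HYPOTHETICAL: P(condom use | arm, U); arm 1 = diaphragm, arm 0 = control.
P_CONDOM = {(0, 0): 0.9, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.2}
# HYPOTHETICAL: infection risk without condoms, by U; condoms cut risk by 80%.
BASE_RISK = {0: 0.05, 1: 0.20}
CONDOM_MULTIPLIER = 0.2


def p_u(u):
    return P_U1 if u == 1 else 1 - P_U1


def p_condom(condom, arm, u):
    p = P_CONDOM[(arm, u)]
    return p if condom == 1 else 1 - p


def p_infection(arm, condom, u):
    risk = BASE_RISK[u] * (CONDOM_MULTIPLIER if condom == 1 else 1.0)
    return risk * (DIRECT_RR if arm == 1 else 1.0)


def stratified_risk(arm, condom):
    """P(infection | arm, condom): U is averaged over those IN the stratum."""
    num = sum(p_u(u) * p_condom(condom, arm, u) * p_infection(arm, condom, u)
              for u in (0, 1))
    den = sum(p_u(u) * p_condom(condom, arm, u) for u in (0, 1))
    return num / den


def standardized_risk(arm, condom):
    """Risk with condom use fixed: U is averaged over the WHOLE population."""
    return sum(p_u(u) * p_infection(arm, condom, u) for u in (0, 1))


naive_rr = {c: stratified_risk(1, c) / stratified_risk(0, c) for c in (0, 1)}
adjusted_rr = {c: standardized_risk(1, c) / standardized_risk(0, c) for c in (0, 1)}

for c, label in ((0, "never"), (1, "always")):
    print(f"condom use {label}: naive RR = {naive_rr[c]:.2f}, "
          f"adjusted RR = {adjusted_rr[c]:.2f}")
```

Within each condom-use stratum the two arms contain different mixes of U, so the naive stratified relative risks move away from 1 even though the true direct effect is 1. Standardizing over U recovers it, which is only possible when the confounders are measured.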
17. Why Stratification/Regression Gives
Biased Estimates When There Are
Confounders as Causal Intermediates
[Diagram nodes: Treatment Group (R), Diaphragm Use (D) (confounder),
Condom Use (C), HIV Status (H)]
Using regression, if we control for D, we don’t get the direct effect that we want.
18. Results of Direct Effects Analysis
• Relative Risk of HIV infection between
Diaphragm arm and Control arm by end of Trial,
with Condom Use Fixed at “Never”: 0.59 (95%
CI: 0.26, 4.56)
• Relative Risk of HIV infection between
Diaphragm arm and Control arm by end of Trial,
with Condom Use Fixed at “Always”: 0.96 (95%
CI: 0.59, 1.45)
Conclusion: No definitive evidence from direct
effects analysis that diaphragms prevent (or
don’t prevent) HIV.
19. Pain Trials with Self-Reporting
• Pain is the most disturbing symptom of peripheral
neuropathy among diabetic patients
• As many as 45% of patients with diabetes develop
peripheral neuropathies
• Gabapentin was suggested as a treatment option
• To evaluate the effect of gabapentin, a randomized,
double-blind, placebo-controlled trial was conducted
• 165 patients with a 1- to 5-year history of pain attributed
to diabetic neuropathy enrolled at 20 different sites
20. Backonja et al. Trial in JAMA
• The main outcome was daily pain severity as measured on an
11-point Likert scale (0 = no pain, 10 = worst possible pain)
• Eighty-four patients received gabapentin, 81 received placebo
• By intention-to-treat analysis, gabapentin-treated patients'
mean daily pain score (baseline 6.4, end point 3.9) was
significantly lower (P<.001) than the placebo-treated patients'
score (baseline 6.5, end point 5.1)
• Concluded that gabapentin appears to be efficacious for the
treatment of pain associated with diabetic peripheral
neuropathy
21. Handling Treatment-Related
Side Effects
• Side effects were handled one at a time:
the individuals reporting a given side effect
were removed from the data analysis to
see whether that changed the
results
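The procedure on this slide amounts to a leave-one-side-effect-out sensitivity analysis. Here is a minimal sketch on invented records; the patients, scores, and side-effect names are hypothetical and are not the trial's data, and `leave_side_effect_out` is an illustrative name.

```python
from statistics import mean

# HYPOTHETICAL records: (arm, end-point pain score, reported side effects).
patients = [
    ("gabapentin", 3.0, {"dizziness"}),
    ("gabapentin", 4.0, set()),
    ("gabapentin", 3.5, {"somnolence"}),
    ("gabapentin", 5.0, set()),
    ("placebo", 5.0, set()),
    ("placebo", 5.5, {"dizziness"}),
    ("placebo", 4.5, set()),
    ("placebo", 6.0, set()),
]


def mean_difference(records):
    """Mean pain score in the gabapentin arm minus the mean in the placebo arm."""
    treated = [score for arm, score, _ in records if arm == "gabapentin"]
    control = [score for arm, score, _ in records if arm == "placebo"]
    return mean(treated) - mean(control)


def leave_side_effect_out(records):
    """Re-estimate the difference after dropping everyone with each side effect."""
    side_effects = sorted(set().union(*(effects for _, _, effects in records)))
    results = {"all patients": mean_difference(records)}
    for effect in side_effects:
        kept = [record for record in records if effect not in record[2]]
        results[f"excluding {effect}"] = mean_difference(kept)
    return results


results = leave_side_effect_out(patients)
for label, difference in results.items():
    print(f"{label}: {difference:+.2f}")
```

Dropping patients because of an event that happens after randomization means the remaining groups are no longer the randomized groups, so a stable estimate across these exclusions does not by itself rule out unmasking.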
22. Perception Effect
• Patients have a perception about the treatment they
receive
• In general we may think of the patients assigning a
degree of certainty (probability) to receiving the active
treatment, measured by a variable P
• In most cases we do not observe the patient’s perception
on a continuous scale
\[
P =
\begin{cases}
1 & \text{convinced on treatment} \\
0 & \text{not sure if on treatment or placebo} \\
-1 & \text{convinced on placebo}
\end{cases}
\]
23. Application to Unmasking of Blinded
Treatment Assignment Due to Side Effects
[Diagram nodes: Treatment Group, Side effects, Confounders, Outcome (pain score)]
• Side effects are treatment-related and therefore potentially unmasking
• Outcome is measured subjectively
• Actual use of the drug (adherence) is a potential confounder
24. Data
• Outcome: mean pain score for the last 7
diary entries
• Baseline covariates: age, sex, race, height, weight,
baseline pain, baseline sleep
• Treatment: gabapentin, placebo
• Perception: changes from 0 to 1 when a
treatment-related side effect
occurs
25. Parameters of Interest
• What is the treatment effect if all the patients
remained unaware of their
treatment? (Perception fixed at 0)
• What is the treatment effect if all the
patients thought they were receiving the
active treatment? (Perception fixed at 1)
• (The difference between these two parameters can be
thought of as a perception bias)
[Slide callout: preferred ITT parameter]
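One way to write these parameters down is in counterfactual notation. The symbols below are assumptions introduced for illustration and do not appear on the slides: Y(a, p) denotes the pain score a patient would report if assigned to arm a (1 = gabapentin, 0 = placebo) with perception held at p.

```latex
\begin{align*}
\psi_0 &= E[Y(1, 0)] - E[Y(0, 0)]
  && \text{treatment effect with perception fixed at } 0 \\
\psi_1 &= E[Y(1, 1)] - E[Y(0, 1)]
  && \text{treatment effect with perception fixed at } 1 \\
\text{perception bias} &= \psi_1 - \psi_0
\end{align*}
```

Both parameters compare arms as assigned, so they keep an intention-to-treat flavor. They differ only in the perception level that is held fixed.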
31. Reviewer Comments
• “Figure 4 - The steroid finding is worthy of comment, even if the interaction is not
statistically significant (the power for interactions is low). It may well be that the
new agent has little benefit for those not treated with steroids,”
• “Figure 4 has been deleted as suggested by the editor. While there was not a
significant reduction in relative risk of confirmed PUBS in the subgroup not
taking steroids at baseline, VIGOR was only designed to detect a significant
reduction overall, and not within any specific subgroup. There were, however,
reductions in relative risk in this subgroup albeit not significant. Whenever many
subgroups are examined, it becomes likely that there will be no apparent effect
in some of them purely by chance. This is why the test for interaction, although
underpowered as you rightly pointed out, is the appropriate way to assess
subgroup effects. When there is no treatment-by-subgroup interaction, the more
reliable estimate of treatment effect within a subgroup is the effect in the entire
cohort [1].”
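The claim that some subgroups will show no apparent effect purely by chance can be put in rough numbers. The sketch below rests on assumptions not in the slides: subgroup estimates of the log relative risk are independent and normally distributed around a common true value, and "no apparent effect" means a point estimate at or above the null. The true relative risk of 0.5 and the standard error of 0.5 are hypothetical, not VIGOR values.

```python
import math
from statistics import NormalDist


def prob_no_apparent_effect(true_log_rr, se):
    """P(subgroup point estimate of log RR >= 0) given the true log RR."""
    return 1 - NormalDist(mu=true_log_rr, sigma=se).cdf(0.0)


def prob_any_subgroup_null(true_log_rr, standard_errors):
    """P(at least one of several independent subgroups shows no apparent benefit)."""
    p_all_show_benefit = 1.0
    for se in standard_errors:
        p_all_show_benefit *= 1 - prob_no_apparent_effect(true_log_rr, se)
    return 1 - p_all_show_benefit


# HYPOTHETICAL: true RR of 0.5 in every subgroup, ten subgroups, SE of 0.5 each.
true_log_rr = math.log(0.5)
p_single = prob_no_apparent_effect(true_log_rr, 0.5)
p_any = prob_any_subgroup_null(true_log_rr, [0.5] * 10)

print(f"One subgroup:           {p_single:.3f}")
print(f"Any of ten subgroups:   {p_any:.3f}")
```

In this setup a single subgroup rarely points the wrong way, but across ten subgroups it becomes more likely than not that at least one does, despite a uniform true benefit.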
32. Meta-Analyses
• Random or Fixed Effects
  – Heterogeneity
  – Measure of association (RR or ER)
• Zero-event trials/use of all evidence
  – Drug safety studies (Vioxx, Celebrex, Avandia, etc.)
  – Continuity corrections
  – Study duration
33. Meta-Analyses and No Event
Studies
• One study: 30 events under Tx, 3 under
placebo (5000 individuals in each arm)
  – Risk difference of 0.0054 with exact 95%
    confidence interval of (0.0032, 0.0076), p < 0.0001
• Now add three (short duration) studies, all with
5000 in each arm, no events in either arm in
all cases
  – exact 95% confidence interval of (-0.0044, 0.0001),
    p = 0.46
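The single-study risk difference is easy to check, and the same arithmetic shows why zero-event studies are awkward to pool. The sketch below uses a Wald interval and a Mantel-Haenszel pooled risk difference. Both are stand-ins chosen for simplicity and are not the exact method behind the intervals quoted on the slide, so the pooled interval is not reproduced here.

```python
import math
from statistics import NormalDist


def risk_difference_ci(events_tx, n_tx, events_pl, n_pl, level=0.95):
    """Risk difference (Tx minus placebo) with a Wald interval and its SE."""
    p_tx, p_pl = events_tx / n_tx, events_pl / n_pl
    rd = p_tx - p_pl
    se = math.sqrt(p_tx * (1 - p_tx) / n_tx + p_pl * (1 - p_pl) / n_pl)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rd, rd - z * se, rd + z * se, se


def mantel_haenszel_rd(studies):
    """Pooled risk difference with Mantel-Haenszel weights n_tx * n_pl / N."""
    num = sum((e1 / n1 - e0 / n0) * n1 * n0 / (n1 + n0) for e1, n1, e0, n0 in studies)
    den = sum(n1 * n0 / (n1 + n0) for _, n1, _, n0 in studies)
    return num / den


# (events_tx, n_tx, events_pl, n_pl): the first study and three zero-event studies.
studies = [(30, 5000, 3, 5000)] + [(0, 5000, 0, 5000)] * 3

rd, lower, upper, se = risk_difference_ci(*studies[0])
print(f"Single study: RD = {rd:.4f}, Wald 95% CI = ({lower:.4f}, {upper:.4f})")

# A zero-event study has a Wald standard error of exactly 0, so
# inverse-variance weights are undefined without some correction.
zero_event_se = risk_difference_ci(*studies[1])[3]
print(f"Wald SE of a zero-event study: {zero_event_se}")

pooled_rd = mantel_haenszel_rd(studies)
print(f"Mantel-Haenszel pooled RD over four studies: {pooled_rd:.5f}")
```

For the single study the Wald interval agrees with the slide's interval to four decimals. Adding the three zero-event studies pulls the pooled estimate to a quarter of its original size, which illustrates how short studies with no events can dilute a safety signal.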
35. References
• Ioannidis, J.P.A. “Why most published research findings are false,” PLoS Med, 2(8),
2005, e124.
• Siegfried, T. “Odds are, it’s wrong,” Science News, March 27, 2010, 26-29.
• Bailar, J.C. III. “Science, statistics, and deception,” Annals of Internal Medicine, 104,
1986, 259-260.
• Mills, J.L. “Data torturing,” New England Journal of Medicine, 329, 1993, 1196-1199.
• Young, J., Hubbard, A.E., Eskenazi, B. and Jewell, N.P. “A machine-learning
algorithm for estimating and ranking the impact of environmental risk factors in
exploratory epidemiological studies,” to appear, 2011.
• Hubbard, A.E., Ahern, J., Fleischer, N.L., Lippman, S.A., Jewell, N.P., Bruckner, T.,
Satariano, W.A. “To GEE or not to GEE: Comparing estimating function and
likelihood-based methods for estimating the associations between neighborhood risk
factors and health,” Epidemiology, 21, 2010, 467-474.
• Van der Laan, M., Hubbard, A., Jewell, N.P. “Semi-parametric models versus
faith-based inference,” Epidemiology, 21, 2010, 479-481.