Sample Size Estimation for Diagnostic Accuracy Studies

Haldun Akoglu, MD, Prof.

Email: drhaldun@gmail.com | Twitter: @istanbulemdoc

This spreadsheet is an online supplement to the review article published in the Turkish Journal of Emergency Medicine. Please cite the journal article if you use this spreadsheet in your own work.

Akoglu, H. "Sample Size Estimation For Diagnostic Accuracy Studies". Turk J Emerg Med. 2022, 22(4):XX-XX. doi: PMID:

When to use

This formula is used to estimate the sample size when a new diagnostic test is compared with the reference standard in a cohort where the true disease status and prevalence are known.

Comments

Sens or Spec: pre-determined value taken from previously published data or from clinician experience/judgment.

Marginal error (d): maximum error of the estimate at a confidence level of 95%.

Prevalence: disease prevalence in the study population.

Estimated sample sizes will differ for the same sensitivity and specificity if the disease prevalence is not 50%, that is, when the numbers of subjects with and without the disease are not equal.

Equation
\[n_{se} = {{Z_{\alpha \over 2}^2 \times Se (1 - Se)} \over d^2}\] \[n_{sp} = {{Z_{\alpha \over 2}^2 \times Sp (1 - Sp)} \over d^2}\]
Adjusting for disease prevalence
\[N_{se} = {n_{se} \over Prevalence}\] \[N_{sp} = {n_{sp} \over (1 - Prevalence)}\]
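The two steps above (precision-based sample size, then adjustment for prevalence) can be sketched in Python as follows. This is a minimal illustration, not the spreadsheet's own implementation: the function names, the rounding up to whole subjects, and the input values (Se = 0.90, Sp = 0.85, d = 0.05, prevalence = 0.20) are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def n_for_precision(p, d, confidence=0.95):
    """Subjects needed to estimate a proportion p (Se or Sp) within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    return z ** 2 * p * (1 - p) / d ** 2


def totals_for_prevalence(n_se, n_sp, prevalence):
    """Total subjects to recruit so that enough diseased and
    non-diseased subjects end up in the sample (rounded up)."""
    total_se = math.ceil(n_se / prevalence)
    total_sp = math.ceil(n_sp / (1 - prevalence))
    return total_se, total_sp


# Hypothetical inputs, for illustration only.
n_se = n_for_precision(0.90, 0.05)   # diseased subjects needed
n_sp = n_for_precision(0.85, 0.05)   # non-diseased subjects needed
total_se, total_sp = totals_for_prevalence(n_se, n_sp, 0.20)
print(math.ceil(n_se), math.ceil(n_sp), total_se, total_sp)
```

With these inputs the two totals differ considerably because the prevalence is far from 50%. One common choice, assumed here rather than taken from the spreadsheet, is to recruit the larger of the two totals so that both estimates reach the desired precision.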
Outputs: diseased, not-diseased, and total subjects.
When to use

This formula is used to estimate the sample size when the diagnostic test is compared with the reference standard in a cohort where the true disease status and prevalence are unknown.

After applying the equation, the calculated values should be adjusted for disease prevalence.

Comments

Sens or Spec: pre-determined value taken from previously published data or from clinician experience/judgment.

Marginal error: maximum error of the estimate at a confidence level of 95%.

Prevalence: disease prevalence in the study population.

Estimated sample sizes will differ for the same sensitivity and specificity if the disease prevalence is not 50%, that is, when the numbers of subjects with and without the disease are not equal.

Equation
\[n = { \left[ {Z_{\alpha \over 2} } \sqrt{{P_0(1-{P_0})}} + {Z_{\beta} \sqrt{{P_1(1-{P_1})}}} \right]^2 \over ({P_1} - {P_0})^2 }\]
Adjusting for disease prevalence
\[n_{se} = {n \over Prevalence}\] \[n_{sp} = {n \over (1 - Prevalence)}\]
Yates’ Continuity Correction
\[n_{c} = { n \over 4 }{\left(1 + \sqrt{1 + 4/(n|P_1 - P_0|)} \right)} ^2 \]
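The calculation on this sheet can be sketched in Python as follows. This is a hedged illustration only: it assumes the first proportion in the equation is the null (reference) value and the second is the expected value, uses a two-sided alpha of 0.05 and a power of 0.80, and the function names and inputs (0.70, 0.80, prevalence 0.25) are hypothetical.

```python
import math
from statistics import NormalDist


def n_single_test(p_null, p_expected, alpha=0.05, power=0.80):
    """Sample size for comparing one proportion against a null value."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Z_{alpha/2}
    z_b = NormalDist().inv_cdf(power)          # Z_{beta}
    numerator = (z_a * math.sqrt(p_null * (1 - p_null))
                 + z_b * math.sqrt(p_expected * (1 - p_expected))) ** 2
    return numerator / (p_expected - p_null) ** 2


def yates_corrected(n, difference):
    """Yates' continuity correction applied to an uncorrected n."""
    return n / 4 * (1 + math.sqrt(1 + 4 / (n * abs(difference)))) ** 2


# Hypothetical inputs, for illustration only.
n = n_single_test(0.70, 0.80)
n_corrected = yates_corrected(n, 0.80 - 0.70)

prevalence = 0.25
n_se = math.ceil(n / prevalence)        # total subjects, sensitivity
n_sp = math.ceil(n / (1 - prevalence))  # total subjects, specificity
print(round(n, 1), round(n_corrected, 1), n_se, n_sp)
```

The continuity-corrected value is always larger than the uncorrected one. The same division by prevalence (or by 1 - prevalence) can be applied to the corrected value; whether the spreadsheet corrects before or after the prevalence adjustment is not stated here, so the order in this sketch is an assumption.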
Outputs: lowest and highest probability of disagreement; sample size for each group and for the study.
When to use
Unpaired (between-subjects) design

Participants are randomly assigned to either the index test or the comparator test.

Paired (within-subjects) design

The two tests being compared are applied to all subjects, along with the reference standard.

Comments

A one-sided N is preferred, since we want to test whether one of the paths is different from the other.

Ψ (min): the value at which the disagreement is minimum (P2 - P1).

Ψ (max): the value at which the agreement is by chance, i.e., the disagreement is maximum.

Cont. Correction: the values obtained after Yates' continuity correction is applied.

Equation
Unpaired
\[ n = { \left[ {Z_{\alpha} } \sqrt{2 \times {\overline{P}(1-\overline{P})}} + {Z_{\beta} \sqrt{ {P_1(1-{P_1})} + {P_2(1-{P_2})}}} \right]^2 \over ({P_1} - {P_2})^2 }\]
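A minimal Python sketch of the unpaired formula is shown below. It assumes that the averaged proportion in the equation is the simple mean of P1 and P2 (equal group sizes), that the result is the number of subjects per group, and that alpha is one-sided at 0.05 with a power of 0.80. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_unpaired(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent proportions."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided Z_{alpha}
    z_b = NormalDist().inv_cdf(power)      # Z_{beta}
    p_bar = (p1 + p2) / 2                  # assumption: equal group sizes
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p1 - p2) ** 2


# Hypothetical accuracies of the comparator and index tests.
n_per_group = n_unpaired(0.70, 0.80)
print(math.ceil(n_per_group), 2 * math.ceil(n_per_group))
```

Under these assumptions the second printed number is the total for a two-arm study, obtained by doubling the per-group value.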
Paired
\[ n = {{\left[Z_{\alpha}\sqrt{\psi} + Z_{\beta}\sqrt{\psi - (P_2-P_1)^2} \right]^2} \over {(P_1 - P_2)^2}} \] \[ \psi_{min} = P_2 - P_1 \] \[ \psi_{max} = P_1 \times (1 - P_2) + P_2 \times (1 - P_1) \]
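The paired formula and the two bounds on the probability of disagreement can be sketched in Python as below. This is an illustration under stated assumptions: a one-sided alpha of 0.05, a power of 0.80, and the absolute difference used for the lower bound so that the order of P1 and P2 does not matter. The function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def psi_bounds(p1, p2):
    """Lowest and highest probability of disagreement between the tests."""
    psi_min = abs(p2 - p1)                     # assumption: absolute value
    psi_max = p1 * (1 - p2) + p2 * (1 - p1)
    return psi_min, psi_max


def n_paired(p1, p2, psi, alpha=0.05, power=0.80):
    """Sample size for a paired design at a given disagreement psi."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided Z_{alpha}
    z_b = NormalDist().inv_cdf(power)      # Z_{beta}
    diff_sq = (p2 - p1) ** 2
    numerator = (z_a * math.sqrt(psi)
                 + z_b * math.sqrt(psi - diff_sq)) ** 2
    return numerator / diff_sq


# Hypothetical accuracies of the two tests.
psi_min, psi_max = psi_bounds(0.70, 0.80)
n_low = n_paired(0.70, 0.80, psi_min)
n_high = n_paired(0.70, 0.80, psi_max)
print(math.ceil(n_low), math.ceil(n_high))
```

The two results bracket the required sample size: the true value depends on how often the two tests actually disagree, which is usually not known in advance.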
Yates’ Continuity Correction
\[n_{c} = { n \over 4 }{\left(1 + \sqrt{1 + 4/(n|P_1 - P_2|)} \right)} ^2 \]