 # Tutorial on Introduction to biostatistics

Sensitivity and Specificity

Diagnostic tests used in clinical practice have certain operating characteristics. Clinicians need to be aware of these characteristics when they interpret test results and when they choose a testing strategy to reach an accurate diagnosis or assign an appropriate prognosis. Sensitivity, specificity, positive predictive value and negative predictive value are the key parameters used to evaluate the properties of diagnostic tests. Diagnostic tests are compared to a “gold standard”, that is, the best single test or combination of tests relevant to the particular diagnosis.

Sensitivity is the chance that the diagnostic test will indicate the presence of disease when the disease is actually present.

Specificity is the chance that the diagnostic test will indicate the absence of disease when the disease is actually absent.

Positive predictive value is the chance that a positive test result actually means that the disease is present.

Negative predictive value is the chance that a negative test result actually means that the disease is absent.

Note that sensitivity depends only on the distribution of positive and negative test results within the diseased population, and specificity depends only on the distribution of results within the non-diseased population. Neither depends on the ratio of diseased to non-diseased individuals, so both are considered independent of disease prevalence, whereas positive and negative predictive values are functions of disease prevalence and pre-test probability.

| | Disease present (+) | Disease absent (−) |
|---|---|---|
| Test positive | True Positive (TP) | False Positive (FP) |
| Test negative | False Negative (FN) | True Negative (TN) |

Sensitivity = TP/(TP + FN)

Specificity = TN/(TN + FP)

PPV = TP/(TP + FP)

NPV = TN/(TN + FN)

Efficiency = (TP + TN)/(TP + FP + FN + TN)
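The five formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the function name `diagnostic_metrics` and the counts passed to it are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the summary measures defined above from 2x2 table counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # diseased subjects who test positive
        "specificity": tn / (tn + fp),   # non-diseased subjects who test negative
        "ppv": tp / (tp + fp),           # positive results that are correct
        "npv": tn / (tn + fn),           # negative results that are correct
        "efficiency": (tp + tn) / total, # all results that are correct
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the test looks strong on sensitivity and specificity, yet one in four positive results is still a false positive, which is why the predictive values are reported separately.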

The mnemonics “Spin” and “Snout” (adapted from those originally suggested by Sackett and colleagues) are a useful way to remember the properties of specificity and sensitivity. A highly specific (Sp) test, if positive (p), rules “in” the disease, giving us Spin. A highly sensitive (Sn) test, if negative (n), rules “out” the disease, and there you have Snout.

# Bayes’ Theorem

Bayes’ theorem implies that the predictive value of a test depends on the prevalence of the disease. For a test with fixed sensitivity and specificity, the positive predictive value increases as prevalence rises and decreases as prevalence falls. The negative predictive value moves in the opposite direction: it decreases as prevalence rises. If a researcher uses a diagnostic test in a high prevalence setting, a positive result is more likely to be a true positive than in a low prevalence setting.
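The dependence on prevalence can be seen by applying Bayes’ theorem directly. The sketch below is illustrative only: the function name `predictive_values`, the 90% sensitivity and specificity, and the three prevalence values are all assumptions, not figures from a real test.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fraction of the whole population in each cell of the 2x2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Same hypothetical test, three hypothetical prevalence settings.
results = {}
for prevalence in (0.01, 0.10, 0.50):
    results[prevalence] = predictive_values(0.90, 0.90, prevalence)
    ppv, npv = results[prevalence]
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Running it shows the PPV climbing and the NPV falling as prevalence increases, even though the test itself has not changed.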

# ROC curves

ROC (receiver operating characteristic) curves illustrate the trade-off between sensitivity and specificity as the cut-off for a positive result is varied. The greater the area under the ROC curve, the better the overall trade-off between sensitivity and specificity. The curve offers a more sophisticated way to choose the optimal cut-off, since increasing sensitivity generally decreases specificity, and vice versa.
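As a rough sketch of how such a curve is built, the code below sweeps a cut-off over a set of test scores, records sensitivity against 1 − specificity at each cut-off, and estimates the area under the curve with the trapezoid rule. The scores, labels and function names are hypothetical, and the trapezoid rule is just one simple way to approximate the area.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is called positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a list of (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores; label 1 = diseased, 0 = non-diseased.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(f"AUC = {auc:.3f}")
```

Lowering the cut-off moves along the curve toward higher sensitivity and lower specificity; an area of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation.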

Relative Risk (RR):

The probability of the disease if the risk factor is present divided by the probability of the disease if the risk factor is absent. Example: a study to evaluate the relationship between a food habit and diabetes might compare a group of people with the specific food habit to a group without it and follow both groups for the development of diabetes. If 10% of the people with the food habit developed diabetes and 0.5% of the people without it did, the relative risk would be 10/0.5 = 20.
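The arithmetic in this example can be written out as a short sketch. The group sizes of 1,000 are an assumption made only so that the percentages translate into whole counts; the source gives percentages, not counts.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Assumed group sizes: 1,000 people each. 10% -> 100 cases, 0.5% -> 5 cases.
rr = relative_risk(exposed_cases=100, exposed_total=1000,
                   unexposed_cases=5, unexposed_total=1000)
print(f"RR = {rr:.1f}")
```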

Relative risk of 1: no effect

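Relative risk >1: positive effect (the risk factor is associated with an increased risk of disease)

Relative risk <1: negative effect (the risk factor is associated with a decreased risk of disease)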

Relative risk should be presented with a confidence interval (CI). For the finding to be statistically significant, the CI should not include an RR of 1. Conversely, if the CI does include 1, the RR is not statistically significant.

In the food habit/diabetes example, if the significance level was 0.05 and the 95% confidence interval for the relative risk of 20 was 0.7 to 25, then statistical significance would not be achieved, since the range of values includes 1.
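The text does not say how the interval is computed. One common choice, assumed here, is the log-scale normal approximation, in which the standard error of ln(RR) is sqrt(1/a − 1/n1 + 1/c − 1/n0). The sketch below applies it to two hypothetical data sets and checks whether each interval includes 1; the function name and all counts are illustrative assumptions.

```python
import math


def relative_risk_ci(a, n1, c, n0, z=1.96):
    """Return (RR, lower, upper) using the log-scale normal approximation.

    a / n1: cases and group size among the exposed
    c / n0: cases and group size among the unexposed
    z=1.96 gives an approximate 95% interval.
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def includes_one(lower, upper):
    """True when the interval contains RR = 1 (not statistically significant)."""
    return lower <= 1 <= upper


# Two hypothetical studies, for illustration only.
large = relative_risk_ci(a=100, n1=1000, c=5, n0=1000)
small = relative_risk_ci(a=4, n1=20, c=2, n0=20)

for label, (rr, lower, upper) in (("large study", large), ("small study", small)):
    verdict = "not significant" if includes_one(lower, upper) else "significant"
    print(f"{label}: RR={rr:.1f}, 95% CI {lower:.2f} to {upper:.2f} ({verdict})")
```

The small study illustrates the point made above: an RR well above 1 can still fail to reach significance when the interval is wide enough to include 1.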

Odds Ratio (OR): similar to relative risk, but used for case-control studies. The OR is the odds of having the risk factor if the disease is present divided by the odds of having the risk factor if the disease is absent.
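As a minimal sketch of this definition, the function below takes the four counts of a case-control table and returns the OR. The function name and the counts are hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical case-control study: 100 cases and 100 controls.
or_value = odds_ratio(cases_exposed=40, cases_unexposed=60,
                      controls_exposed=20, controls_unexposed=80)
print(f"OR = {or_value:.2f}")
```

The same value is obtained from the cross-product of the table, (40 × 80)/(60 × 20), which is how the OR is often computed by hand.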

Likelihood Ratio (LR)

Likelihood ratios are very useful in that they are an indication of the degree to which a test result will change the pre-test probability of disease.

A likelihood ratio can be calculated in two ways: one for a positive result and one for a negative result.

The positive likelihood ratio is the probability of a positive test result if the disease is present divided by the probability of a positive test result if the disease is absent.

+LR = sensitivity/(1-specificity)

The negative likelihood ratio is the probability of a negative test result if the disease is present divided by the probability of a negative test result if the disease is absent.

-LR = (1-sensitivity)/specificity
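Both ratios follow directly from sensitivity and specificity, as this short sketch shows. The function name and the 90%/80% figures are assumptions for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from a test's sensitivity and specificity."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


# Hypothetical test: 90% sensitive, 80% specific.
positive_lr, negative_lr = likelihood_ratios(sensitivity=0.90, specificity=0.80)
print(f"+LR = {positive_lr:.3f}")
print(f"-LR = {negative_lr:.3f}")
```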

LR=1: no effect on pre-test probability

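LR>1:  positive effect (the result increases the probability of disease)

LR<1:  negative effect (the result decreases the probability of disease)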

LR=1-2 or 0.5-1:  minimal effect

LR=2-5 or 0.2-0.5:  small effect

LR=5-10 or 0.1-0.2:  moderate effect

LR>10 or <0.1:  large effect
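To see how a likelihood ratio changes the pre-test probability, the probability is converted to odds, multiplied by the LR, and converted back. The sketch below applies this to a hypothetical pre-test probability of 20% with three illustrative LR values; the function name is an assumption.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability using post-test odds = pre-test odds x LR."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical pre-test probability of 20%, updated with three LR values.
pre_test = 0.20
updated = {lr: post_test_probability(pre_test, lr) for lr in (0.1, 1, 10)}
for lr, probability in updated.items():
    print(f"LR={lr}: post-test probability = {probability:.3f}")
```

An LR of 1 leaves the probability unchanged, while the large LRs at either end of the scale above move it substantially in opposite directions.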