Tutorial on Introduction to biostatistics
Inferential data analysis
Because a researcher draws scientific conclusions from a sample rather than from the whole population, those conclusions must be justified with the tools of statistical inference. The principal concepts involved in statistical inference are the theory of estimation and hypothesis testing.
Theory of estimation
Point Estimation:
A single value is used to provide the best estimate of the parameter of interest.
Interval Estimation:
An interval estimate shows the estimate of the parameter and also gives an idea of the confidence that the researcher has in that estimate. This leads us to the consideration of confidence intervals.
Confidence interval (CI)
A confidence interval estimate of a parameter consists of an interval, along with a level of confidence that the interval contains the unknown parameter. The level of confidence represents the percentage of intervals that will contain the parameter if a large number of repeated samples are obtained. The level of confidence is denoted (1 − α) × 100%.
The narrower the confidence interval, the more precise the point estimate it contains. The sample size, the sample variance and the level of confidence all affect the width of the confidence interval.
· If the sample size increases, the width of the confidence interval decreases.
· If the level of confidence increases, the width increases.
· If the variation in the sample increases, the width of the confidence interval increases.
Confidence intervals can be computed for estimating a single mean or proportion and also for comparing the difference between two means or proportions. Confidence intervals are widely used to present the main clinical outcomes instead of p-values because they offer several advantages (such as giving information about effect size, variability and possible range). The most commonly used confidence interval is the 95% CI. Increasingly, medical journals and publications require authors to calculate and report the 95% CI wherever appropriate since it gives a measure of the range of effect sizes possible – information that is of great relevance to clinicians. Informally, the 95% CI is the interval within which we can be 95% confident that the true population value lies; more precisely, if the study were repeated many times, about 95% of the intervals computed in this way would contain the true value. The remaining 5% of the time, the true value falls outside the interval. The estimate, which is the effect size observed in the particular study, is the single best guess for the true value, though the true value can theoretically lie at any point within the confidence interval (or even outside it, as just noted).
Example:
A study is conducted to estimate the average glucose level in patients admitted with diabetic ketoacidosis. A sample of 100 patients was selected and the mean was found to be 500 mg/dL with a 95% confidence interval of 320–780 mg/dL. This means that we can be 95% confident that the true mean of all such patients lies between 320 and 780 mg/dL.
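The effect of sample size, variability and confidence level on the width of the interval can be checked numerically. The sketch below assumes a sample large enough for the normal approximation to apply and uses a hypothetical standard deviation of 100 mg/dL, since the example does not report one. The names `mean_ci` and `width` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when confidence is 0.95
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


def width(ci):
    """Width of an interval given as a (lower, upper) pair."""
    return ci[1] - ci[0]


# Hypothetical inputs: mean 500 mg/dL, SD 100 mg/dL, n = 100.
base = mean_ci(500, 100, 100)
print("95% CI:", base)

# Larger sample -> narrower interval.
print(width(mean_ci(500, 100, 400)) < width(base))
# Higher confidence level -> wider interval.
print(width(mean_ci(500, 100, 100, confidence=0.99)) > width(base))
# More variation in the sample -> wider interval.
print(width(mean_ci(500, 200, 100)) > width(base))
```

All three comparisons print `True`, matching the three rules for the width of the interval listed earlier.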
Hypothesis testing vs. Estimation
Similarity: Both use sample data to infer something about a population
Difference: Designed to answer different questions
Does a new drug lower cholesterol levels?
Measure the cholesterol of 25 patients before and after the drug; the mean change in cholesterol is 15 mg/dL (225 before; 210 after)
Hypothesis test: Did the drug alter cholesterol levels?
Yes/no decision. Reject or fail to reject H_{0}
Estimation: By how much did the drug alter cholesterol levels?
Hypothesis testing
Setting up the Hypotheses:
The basic concept used in hypothesis testing is that it is far easier to show that something is false than to prove that it is true.
a) Two mutually exclusive & competing hypotheses:
Let us consider a situation where we want to test whether a new drug has superior efficacy to one of the standard drugs currently on the market for the treatment of tuberculosis. We will have to construct a null hypothesis and an alternative hypothesis for this experiment as below:
1. The “null” hypothesis (H_{0})
The null hypothesis indicates a neutral position (or the status quo in an interventional trial) in the given study or experiment. Typically the investigator hopes to prove this hypothesis wrong so that the alternative hypothesis (which encompasses the concept of interest to the investigator) can be accepted.
Example:
In the situation given above, though we actually want to prove the new drug to be effective, we should proceed with a neutral attitude while doing the experiment, so our null hypothesis will be stated as follows:
H_{0}: There is no difference between the effect of the new drug and the standard drug in treating tuberculosis.
2. The “alternative” hypothesis (H_{1})
This is the hypothesis we believe or hope is true.
Example: In the above situation if we want to prove the new drug is superior then our alternative hypothesis will be:
H_{1}: New drug’s effect is superior to that of the standard drug.
Based on the alternative hypothesis the test will be a one-tailed test or a two-tailed test. A two-tailed test is used when the researcher wants to test in both directions for the population parameter specified in the null hypothesis (i.e. either greater or lesser). If the researcher wants to test the parameter of the null hypothesis in only one direction (greater or lesser), it becomes a one-tailed test.
In the above example the researcher framed the alternative hypothesis in only one direction (the new drug is superior to the standard drug), so the test becomes a one-tailed test.
b) Selecting a “significance level”: α
The significance level is the probability of rejecting the null hypothesis when it is actually true (Type I error). It is usually set at 5%, i.e. α = .05.
c) Calculate the test statistic and p-value
Test statistic
The calculation of the test statistic depends on our null hypothesis. It may be testing a single mean or proportion, or it may be comparing two means or proportions.
p-value
A p-value gives the probability of observing an effect at least as large as the study effect, given that the null hypothesis is true. For example, a p-value of .03 means that, assuming that the treatment has no effect, and given the sample size, an effect as large as the observed effect would be seen in only 3% of studies.
In other words, it gives the chance of observing such a difference (effect) in the sample when the null hypothesis is true. For example, if we get a p-value of 0.02, then there is only a 2% chance of observing a difference (effect) this large in the sample if we assume the null hypothesis is true.
The p-value obtained in the study is evaluated against the significance level α. If α is set at .05, then a p-value below .05 is required to reject the null hypothesis and establish statistical significance.
d) Decision rule:
We can reject H_{0} if the p-value < α.
Most statistical packages calculate the p-value for a two-tailed test. If we are conducting a one-tailed test and the observed effect lies in the direction stated in the alternative hypothesis, we must divide the p-value by 2 before comparing it with α. (In SPSS output, the p-value is labeled “Sig. (2-tailed)”.)
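The relation between one-tailed and two-tailed p-values can be illustrated for a test statistic that follows the standard normal distribution. This is a minimal sketch with hypothetical z values; `z_to_p_values` is an illustrative helper, not a function from any statistical package.

```python
from statistics import NormalDist


def z_to_p_values(z):
    """Return (two-tailed p, upper-tailed p) for a standard normal test statistic."""
    std_normal = NormalDist()
    p_upper = 1 - std_normal.cdf(z)                   # H1: parameter is greater
    p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))   # H1: parameter differs
    return p_two_tailed, p_upper


alpha = 0.05
for z in (1.8, -1.8):  # hypothetical test statistics
    p_two, p_upper = z_to_p_values(z)
    print(f"z = {z:+.1f}: two-tailed p = {p_two:.4f}, "
          f"upper-tailed p = {p_upper:.4f}, "
          f"reject H0 (one-tailed): {p_upper < alpha}")
```

For z = +1.8 the upper-tailed p-value is exactly half the two-tailed one. For z = -1.8 the effect lies in the opposite direction, so halving the two-tailed p-value would be misleading.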
Table 1: Step by step guide to applying hypothesis testing in research
1. Formulate a research question
2. Formulate a research/alternative hypothesis
3. Formulate the null hypothesis
4. Collect data
5. Reference a sampling distribution of the particular statistic assuming that H_{0} is true (in the cases so far, a sampling distribution of the mean)
6. Decide on a significance level (α), typically .05
7. Compute the appropriate test statistic
8. Calculate the p-value
9. Reject H_{0} if the p-value is less than the set level of significance; otherwise fail to reject H_{0}
Hypothesis Testing for different Situations
Testing for Single mean – Large Samples: Z-test
The Z-test for a single mean is useful when we want to test a sample mean against the population mean when the sample size is large (i.e. more than 30).
Example:
A researcher wants to test the statement that the mean level of dopamine is greater than 36 in individuals with schizophrenia. He collects a sample of 54 patients with schizophrenia.
The researcher can test the hypothesis using the Z-test for a single mean.
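A minimal sketch of this test follows. The example gives only the hypothesized mean (36) and the sample size (54), so the sample mean and standard deviation used here are hypothetical, and `one_sample_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sd, n):
    """Upper-tailed Z-test of H0: mu = mu0 against H1: mu > mu0."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical summary statistics: mean 38.5, SD 9.0 in n = 54 patients.
z, p_value = one_sample_z_test(sample_mean=38.5, mu0=36, sd=9.0, n=54)
print(f"z = {z:.2f}, one-tailed p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```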
Testing for Two means – Large Samples: Z-test for comparing two means.
The Z-test for comparing two means is useful when we want to compare two sample means when the sample size is large (i.e. more than 30).
Example:
Past studies show that Indian men have higher cholesterol levels than Indian women. A sample of 100 males and females was taken and their cholesterol levels measured – males were found to have a mean cholesterol level of 188 mg/dL and females a mean level of 164 mg/dL. Is there sufficient evidence to conclude that males indeed have a higher cholesterol level?
Here we can test the hypothesis using the Z-test for comparing two sample means.
Testing for Single mean – t-test.
The t-test for a single mean is useful when we want to test a sample mean against the population mean when the sample size is small (i.e. less than 30).
Example:
A researcher wants to test the statement that the mean age of diabetic patients in his district is greater than 60 years. He draws a sample of 25 persons.
Here we can test the hypothesis using the t-test for a single mean.
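As a sketch, the t statistic for this example can be computed from summary statistics and compared with a tabulated critical value. The sample mean (63 years) and standard deviation (8 years) are hypothetical, and the critical value is the one-tailed value for α = .05 with 24 degrees of freedom from a standard t table.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, sd, n):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    t = (sample_mean - mu0) / (sd / sqrt(n))
    return t, n - 1


# Hypothetical summary statistics: mean age 63 years, SD 8 years, n = 25.
t, df = one_sample_t(sample_mean=63, mu0=60, sd=8, n=25)

# One-tailed critical value for alpha = 0.05 and 24 degrees of freedom,
# read from a standard t table.
T_CRITICAL = 1.711

print(f"t = {t:.3f} with {df} degrees of freedom")
print("Reject H0" if t > T_CRITICAL else "Fail to reject H0")
```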
Independent Sample t-test for two means.
The t-test for comparing two means is appropriate when we want to compare two independent sample means when the sample size is small (i.e. less than 30).
Example:
A study was conducted to compare males and females in terms of average years of education with a sample of 9 females and 13 males. It was found that males had an average of 17 years of formal education while females had 14. Can it be concluded that males have a higher level of education than females within this population?
Here we can test the hypothesis using the t-test for comparing two sample means.
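The pooled (equal-variance) form of this test can be sketched as follows. The group means and sizes come from the example; the two standard deviations (3.0 and 2.5 years) are hypothetical, and equal population variances are assumed.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Males: mean 17, n = 13; females: mean 14, n = 9 (SDs are hypothetical).
t, df = pooled_t(17, 3.0, 13, 14, 2.5, 9)

# One-tailed critical value for alpha = 0.05 and 20 degrees of freedom,
# read from a standard t table.
T_CRITICAL = 1.725

print(f"t = {t:.3f} with {df} degrees of freedom")
print("Reject H0" if t > T_CRITICAL else "Fail to reject H0")
```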
Paired t-test for two means.
The paired t-test is useful when we want to compare two sample means when the two measurements are taken from the same subject, such as pre- and post-treatment measurements.
Example:
A study was conducted to assess the effect of a drug in treating hypertension by administering it to 20 patients. BP was recorded immediately before and one hour after the drug was given. The question of interest: is the drug effective in reducing blood pressure?
A paired t-test can be used for hypothesis testing and comparing two paired sample means.
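The paired t-test works on the within-subject differences rather than on the two sets of readings separately. The sketch below uses hypothetical readings for five patients to keep the arithmetic visible; `paired_t` is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic and degrees of freedom for the mean within-subject difference."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical systolic BP (mmHg) for five patients; the study above had 20.
before = [150, 160, 155, 165, 158]
after = [145, 150, 150, 160, 150]
t, df = paired_t(before, after)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

A positive t here indicates a fall in blood pressure after the drug. The statistic is then compared with the t table entry for the returned degrees of freedom.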
Testing for Single proportion: Binomial test for proportion
If we want to test a sample proportion against the population proportion we can use the binomial test for single proportion.
Example:
A random sample of patients is recruited for a clinical study. The researcher wants to establish that the proportion of female patients is not equal to 0.5.
The binomial test for proportion is the appropriate statistical method here.
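An exact two-sided binomial test can be sketched with the standard library alone. The counts (38 women among 60 patients) are hypothetical. The two-sided p-value is defined here as the total probability of all outcomes that are no more likely than the observed one, which is one common convention among several.

```python
from math import comb


def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value for k successes in n trials under H0: p = p0."""
    def pmf(i):
        return comb(n, i) * p0**i * (1 - p0) ** (n - i)

    observed = pmf(k)
    # Sum the probabilities of all outcomes no more likely than the observed one
    # (the small factor guards against floating-point ties).
    return sum(pmf(i) for i in range(n + 1) if pmf(i) <= observed * (1 + 1e-9))


# Hypothetical sample: 38 women among 60 recruited patients.
p_value = binomial_two_sided_p(38, 60)
print(f"two-sided p = {p_value:.4f}")
```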
Testing for Two proportions: Z-test for two proportions
If we want to compare two sample proportions we can use the Z-test for two proportions when the sample size is large (i.e. more than 30).
Example:
Two types of hypodermic needles, the old type and a new type, are used for giving injections. It is hoped that the new design will lead to less painful injections. The patients are allocated at random to two groups, one to receive the injections using a needle of the old type, the other to receive injections with needles of the new type.
Does the information support the belief that the proportion of patients having severe pain with injections using needles of the old type is greater than the proportion of patients with severe pain in the group getting injections using the new type?
Here we can test the hypothesis using the Z-test for comparing two sample proportions.
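A sketch of the two-proportion Z-test with a pooled standard error is given below. The example reports no counts, so the figures used (30 of 100 patients with severe pain in the old-needle group, 18 of 100 in the new-needle group) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Upper-tailed Z-test of H0: p1 = p2 against H1: p1 > p2 (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    return z, 1 - NormalDist().cdf(z)


# Hypothetical counts of severe pain: 30 of 100 (old needle), 18 of 100 (new needle).
z, p_value = two_proportion_z_test(30, 100, 18, 100)
print(f"z = {z:.2f}, one-tailed p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```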
Chi-square test (χ^{2})
It is a statistical procedure used to analyze categorical data.
We will explore two different types of χ^{2} tests:
1. One categorical variable: Goodness-of-fit test
2. Two categorical variables: Contingency table analysis
One categorical variable: Goodness-of-fit test
A test for comparing observed frequencies with theoretically predicted frequencies.
Two categorical variables: Contingency table analysis
Defined: a statistical procedure to determine if the distribution of one categorical variable is contingent on a second categorical variable
· Allows us to see if two categorical variables are independent of one another or are related
· Conceptually, it allows us to determine if two categorical variables are associated
Note:
If the expected frequencies in the cells are “too small,” the χ^{2} test may not be valid
A conservative rule is that you should have expected frequencies of at least 5 in all cells
Example:
We want to test the association between cancer and smoking habit in 250 patients. The chi-square test would be appropriate.
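For a 2 × 2 table the chi-square statistic and its p-value can be computed with the standard library, since a chi-square variable with one degree of freedom is the square of a standard normal variable. The cell counts below are hypothetical (the example gives only the total of 250), and no continuity correction is applied.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value, expected


# Hypothetical counts for 250 patients: rows = smoker / non-smoker,
# columns = cancer / no cancer.
observed = [[60, 65], [35, 90]]
statistic, p_value, expected = chi_square_2x2(observed)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
print("All expected counts >= 5:", all(e >= 5 for row in expected for e in row))
```

The last line checks the rule of thumb on expected frequencies stated in the note above.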
Analysis of Variance (ANOVA)
When we want to compare more than two means we will have to use an analysis of variance test.
Example:
A researcher has assembled three groups of psychology students. He teaches the same topic to each group using three different educational methodologies. The researcher wishes to determine if the three modalities are giving equivalent results. He tests all the students and records the marks obtained.
ANOVA can be used to test the hypothesis.
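One-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic for hypothetical marks from three groups of five students; the critical value is the α = .05 entry for (2, 12) degrees of freedom from a standard F table.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """F statistic and (between, within) degrees of freedom for one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, (df_between, df_within)


# Hypothetical marks for three teaching methods (five students each).
method_a = [70, 72, 68, 75, 71]
method_b = [78, 80, 76, 82, 79]
method_c = [65, 67, 70, 66, 64]
f, dfs = one_way_anova_f(method_a, method_b, method_c)

# Critical value for alpha = 0.05 with (2, 12) degrees of freedom, from an F table.
F_CRITICAL = 3.89

print(f"F = {f:.2f} with {dfs} degrees of freedom")
print("Reject H0" if f > F_CRITICAL else "Fail to reject H0")
```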
Repeated Measures ANOVA
Repeated measures ANOVA is useful when we want to compare more than two sample means when the measurements are taken from the same subjects enrolled in the study.
Example:
A trial was conducted to assess the effect of a drug in treating hypertension by administering it to 20 patients. BP was recorded immediately before and one, two and four hours after the drug was administered.
Is the drug effective in reducing blood pressure?
Repeated measures ANOVA would be the right way to get an answer.
Parametric Tests
Statistical hypothesis tests such as the Z-test, t-test and ANOVA assume that the variables being assessed come from a parametrized probability distribution. The parameters usually used are the mean and standard deviation. For example, the t-test assumes that the variable comes from a normal population, and analysis of variance assumes that the underlying distributions are normally distributed and that the variances are similar.
Parametric techniques are more powerful than nonparametric tests at detecting differences or similarities when their assumptions are met.
Nonparametric/Distribution-free tests
Nonparametric tests: statistical tests that do not involve population parameters and do not make assumptions about the shape of the population(s) from which sample(s) originate.
They are used in the following circumstances:
1. Useful when statistical assumptions have been violated
2. Ideal for nominal (categorical) and ordinal (ranked) data
3. Useful when sample sizes are small (as this is often when assumptions are violated)
What are the disadvantages of Nonparametric/Distribution-free tests?
1. Tend to be less powerful than their parametric counterparts
2. H_{0} & H_{1} not as precisely defined
There is a nonparametric/distribution-free counterpart to many parametric tests.
· The Mann-Whitney U Test: The nonparametric counterpart of the independent samples t-test
· The Wilcoxon Signed Rank Test: The nonparametric counterpart of the related (paired) samples t-test
· The Kruskal-Wallis Test: The nonparametric counterpart of one-way ANOVA
· Kolmogorov-Smirnov Test: A nonparametric test used to test whether the distributions of two data sets are the same
· Run Test: A run is a series of similar values followed by a different value. The run test is used to test whether the runs in a data set occurred randomly
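As an illustration of how a rank-based test avoids distributional assumptions, the Mann-Whitney U statistic can be computed by simply counting, over all pairs, how often a value from one sample exceeds a value from the other. The sketch below uses hypothetical data and a normal approximation for the p-value with no correction for ties; the approximation is rough for small samples, where exact tables are preferable.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """U statistic for sample x and a two-sided normal-approximation p-value."""
    # Count pairs where a value from x exceeds a value from y; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no correction for ties
    z = (u - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value


# Hypothetical measurements from two independent groups.
group_1 = [12, 15, 11, 18, 14, 16, 13, 17]
group_2 = [9, 10, 8, 13, 7, 11, 6, 12]
u, p_value = mann_whitney_u(group_1, group_2)
print(f"U = {u}, two-sided p (normal approximation) = {p_value:.4f}")
```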
Table 2: Statistical tests at a glance
Type of variable in the study | Parameters to be tested | Number of variables | Sample size | Test
Ratio variables | Mean | One | >30 | Z-test
Ratio variables | Mean | Two | >30 | Z-test
Ratio variables | Mean | One | <30 | t-test
Ratio variables | Mean | Two | <30 | Independent sample t-test
Ratio variables | Mean (same subject) | Two | <30 | Paired sample t-test
Ratio variables | Proportion | One | | Binomial
Ratio variables | Proportion | Two | >30 | Z-test
Ratio variables | Mean | More than two | >30 | ANOVA
Ratio variables | Mean (same subject) | More than two | >30 | Repeated measures ANOVA
Nominal/Categorical variables | Association | Two or more | - | Chi-square
Ratio variables | Mean | Two | When normality assumption violated | Mann-Whitney test
Ratio variables | Mean (same subject) | Two | When normality assumption violated | Wilcoxon signed rank test
Ratio variables | Mean | More than two | When normality assumption violated | Kruskal-Wallis test