Conviction Intervals for Sample Size Less Than xxx

In the preceding discussion we have been using southward, the population standard deviation, to compute the standard error. However, we don't really know the population standard deviation, since we are working from samples. To get around this, we have been using the sample standard departure (s) equally an estimate. This is not a trouble if the sample size is 30 or greater because of the central limit theorem. Notwithstanding, if the sample is minor (<30) , we take to accommodate and use a t-value instead of a Z score in social club to account for the smaller sample size and using the sample SD.

Therefore, if n<30, employ the appropriate t score instead of a z score, and note that the t-value will depend on the degrees of freedom (df) as a reflection of sample size. When using the t-distribution to compute a confidence interval, df = northward-one.

Calculation of a 95% confidence interval when northward<30 will then utilize the appropriate t-value in place of Z in the formula:

The T-distribution

One manner to think about the t-distribution is that information technology is actually a large family unit of distributions that are similar in shape to the normal standard distribution, but adapted to account for smaller sample sizes. A t-distribution for a small sample size would look similar a squashed down version of the standard normal distribution, but as the sample size increment the t-distribution will get closer and closer to approximating the standard normal distribution.

The tabular array below shows a portion of the table for the t-distribution. Notice that sample size is represented by the "degrees of liberty" in the outset column. For determining the confidence interval df=northward-i. Notice likewise that this table is prepare a lot differently than the table of Z scores. Here, but five levels of probability are shown in the column titles, whereas in the tabular array of Z scores, the probabilities were in the interior of the table. Consequently, the levels of probability are much more limited here, because t-values depend on the degrees of freedom, which are listed in the rows.

Conviction Level

80%

90%

95%

98%

99%

Two-sided test p-values

.20

.ten

.05

.02

.01

Ane-sided exam p-values

.ten

.05

.025

.01

.005

Degrees of Freedom (df)

1

three.078

6.314

12.71

31.82

63.66

2

ane.886

2.920

iv.303

vi.965

nine.925

3

1.638

ii.353

3.182

4.541

5.841

4

1.533

2.132

ii.776

3.747

4.604

5

one.476

two.015

2.571

three.365

iv.032

half dozen

1.440

1.943

2.447

3.143

3.707

7

1.415

ane.895

2.365

2.998

iii.499

viii

1.397

i.860

2.306

two.896

3.355

9

ane.383

1.833

2.262

2.821

3.250

10

1.372

1.812

two.228

2.764

three.169

11

1.362

1.796

2.201

2.718

3.106

12

1.356

1.782

ii.179

ii.681

3.055

13

1.350

1.771

ii.160

2.650

3.012

14

i.345

ane.761

2.145

2.624

2.977

15

i.341

1.753

2.131

2.602

2.947

16

1.337

1.746

2.120

2.583

two.921

17

1.333

1.740

2.110

2.567

2.898

eighteen

1.330

1.734

2.101

two.552

2.878

nineteen

1.328

ane.729

2.093

two.539

2.861

xx

1.325

ane.725

two.086

2.528

ii.845

Notice that the value of t is larger for smaller sample sizes (i.east., lower df). When we use "t" instead of "Z" in the equation for the confidence interval, it will outcome in a larger margin of fault and a wider confidence interval reflecting the smaller sample size.

With an infinitely large sample size the t-distribution and the standard normal distribution will be the aforementioned, and for samples greater than 30 they volition be like, but the t-distribution will exist somewhat more conservative. Consequently, one can e'er use a t-distribution instead of the standard normal distribution. However, when yous want to compute a 95% confidence interval for an estimate from a large sample, it is easier to just use Z=i.96.

Because the t-distribution is, if anything, more conservative, R relies heavily on the t-distribution.

Test Yourself

Problem #1

Using the table above, what is the critical t score for a 95% confidence interval if the sample size (n) is 11?

Answer

Trouble #2

A sample of n=10 patients free of diabetes accept their trunk mass index (BMI) measured. The hateful is 27.26 with a standard difference of two.10. Generate a 90% confidence interval for the mean BMI among patients free of diabetes.

Link to Answer in a Discussion file

Conviction Intervals for a Mean Using R

Instead of using the table, you tin apply R to generate t-values. For case, to generate t values for computing a 95% conviction interval, apply the office qt(ane-tail surface area,df).

For case, if the sample size is 15, then df=xiv, nosotros tin calculate the t-score for the lower and upper tails of the 95% conviction interval in R:

> qt(0.025,14)
[one] -2.144787
>
qt(0.975,xiv )
[ane] 2.144787

So, to compute the 95% confidence interval nosotros could plug t=two.144787 into the equation:

Confidence Intervals from Raw Data Using R

Information technology is besides easy to compute the point approximate and 95% conviction interval from a raw data set using the " t.test " office in R. For example, in the data fix from the Weymouth Health Survey I could compute the mean and 95% confidence interval for BMI as follows. Commencement, I would load the data set and requite it a short nickname. And then I would attach the data set, and then utilize the post-obit command:

> t.test(bmi)

The output would look like this:

1 Sample t-test

information:  bmi
t = 228.5395, df = 3231, p-value < 2.2e-sixteen
alternative hypothesis: true mean is not equal to 0
95 pct conviction interval:
26.66357 27.12504

sample estimates:
mean of x
26.8943

R defaults to calculating a 95% conviction interval, but you can specify the confidence interval as follows:

> t.test(bmi,conf.level=.90)

This would compute a xc% confidence interval.

Test Yourself

Lozoff and colleagues compared developmental outcomes in children who had been anemic in infancy to those in children who had not been anemic. Some of the information are shown in the table below.

Mean + SD

Anemia in Infancy

(due north=30)

Non-anemic in Infancy

(n=133)

Gross Motor Score

52.four+14.3

58.7+12.v

Verbal IQ

101.4+13.2`

102.nine+12.4

Source: Lozoff et al.: Long-term Developmental Consequence of Infants with Iron Deficiency, NEJM, 1991

Compute the 95% confidence interval for verbal IQ using the t-distribution

Link to the Answer in a Word file