**1 Estimation of sample size and testing power of difference test for quantitative and qualitative data with the single-group design**
The single-group design refers to a study that observes the value of a quantitative or qualitative index in a random sample under specific conditions. It involves only one specific level of an experimental factor, and the research subjects are not divided into groups by any experimental factor or block factor. If the observed index is quantitative, the population mean or a standard value must be provided for hypothesis testing; if it is qualitative, the population rate or a standard rate is needed^{[1]}.
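**1.1 Testing of the mean**

**1.1.1 Formulas for sample size estimation**

For one-sided test:

$$n=\left[\frac{(t_{1-\alpha,df}+t_{1-\beta,df})\,S}{\theta}\right]^{2} \qquad (1)$$

For two-sided test:

$$n=\left[\frac{(t_{1-\alpha/2,df}+t_{1-\beta,df})\,S}{\theta}\right]^{2} \qquad (2)$$
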
In formulas (1) and (2)^{[2]}, *α* and *β* are the allowed probabilities of making a type Ⅰ error and a type Ⅱ error, respectively; *S* is the standard deviation; *θ* is the difference between the population mean of the sample group and that of the standard group, which can be replaced by its estimate if unknown; *t*_{1-α,df}, *t*_{1-α/2,df} and *t*_{1-β,df} are the 1-*α*, 1-*α*/2 and 1-*β* quantiles of the central *t* distribution with *df*=*n*-1 degrees of freedom. Because *df* depends on *n*, the formulas are solved iteratively. First, let *df*=∞ and denote the resulting *n* as *n*_{1}; then let *df*=*n*_{1}-1 and denote the resulting *n* as *n*_{2}; and so on. The process continues until two successive values of *n* are stable.
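The iteration described above can be sketched in Python using only the standard library. This is a minimal sketch, not the authors' program: the function names (`t_pdf`, `t_cdf`, `t_ppf`, `sample_size_mean`) are hypothetical, the *t* quantile is obtained by numerical integration and bisection rather than by a statistics package, and "stable" is taken to mean that two successive values of *n* differ by at most 1. The inputs in the last line (*S*=5.1, *θ*=0.2, *α*=0.05, *β*=0.25) are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the central t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) via Simpson's rule on [0, x]; a negative x gives a negative step."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_ppf(p, df):
    """Quantile of the central t distribution by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_mean(s, theta, alpha, beta, sides=1):
    """Iterate formula (1) (sides=1) or formula (2) (sides=2) until n is stable."""
    p1, p2 = 1 - alpha / sides, 1 - beta
    w = s / theta
    z = NormalDist()
    # First pass: df = infinity, so the t quantiles are standard normal quantiles.
    n_prev = ((z.inv_cdf(p1) + z.inv_cdf(p2)) * w) ** 2
    n_next = ((t_ppf(p1, n_prev - 1) + t_ppf(p2, n_prev - 1)) * w) ** 2
    # Assumed stopping rule: successive values differ by at most 1.
    while abs(n_prev - n_next) > 1:
        n_prev = n_next
        n_next = ((t_ppf(p1, n_prev - 1) + t_ppf(p2, n_prev - 1)) * w) ** 2
    return math.ceil(max(n_prev, n_next))

n_example = sample_size_mean(s=5.1, theta=0.2, alpha=0.05, beta=0.25)
print(n_example)
```

The bisection bracket of [-50, 50] is adequate for the probabilities and degrees of freedom used here; it would need widening for extreme tail probabilities with very few degrees of freedom.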
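**1.1.2 Formulas for testing power estimation**

First, compute the value of *t*_{1-β,df} by formula (3) or (4)^{[2]}, then compute the corresponding value of 1-*β*, namely, the testing power.

For one-sided test:

$$t_{1-\beta,df}=\frac{\theta\sqrt{n}}{S}-t_{1-\alpha,df} \qquad (3)$$

For two-sided test:

$$t_{1-\beta,df}=\frac{\theta\sqrt{n}}{S}-t_{1-\alpha/2,df} \qquad (4)$$
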
The symbols in formulas (3) and (4) have the same meanings as those in formulas (1) and (2).
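A minimal Python sketch of this two-step computation follows, using only the standard library. The helper names are hypothetical, and the *t* distribution is evaluated by numerical integration and bisection rather than by a statistics package. The inputs in the last lines (*S*=5.1, *θ*=0.2, *n*=36, *α*=0.05, one-sided) are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of the central t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) via Simpson's rule on [0, x]; a negative x gives a negative step."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_ppf(p, df):
    """Quantile of the central t distribution by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def power_mean(s, theta, n, alpha, sides=1):
    """Testing power from formula (3) (sides=1) or formula (4) (sides=2)."""
    df = n - 1
    t_crit = t_ppf(1 - alpha / sides, df)
    # Step 1: the quantile t_{1-beta,df} implied by n, theta and S.
    t_beta = theta * math.sqrt(n) / s - t_crit
    # Step 2: convert that quantile back into the probability 1 - beta.
    return t_cdf(t_beta, df)

power_example = power_mean(s=5.1, theta=0.2, n=36, alpha=0.05)
print(round(power_example, 3))
```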
**1.1.3 Examples**

**1.1.3.1 Example 1**

The mean closing time of the anterior fontanel of children in northeast China is known to be 14.1 months. A researcher planned to select a certain number of children with calcium deficiency in the northeast to find out whether the closing time of the anterior fontanel of these children is longer. A pilot test showed that the mean closing time of the anterior fontanel of 36 children selected from the northeast was 14.3 months, and the standard deviation was 5.1 months. Suppose that the significance level *α* was set to 0.05 and the testing power was required to be 75%. How many children were needed so that the researcher could conclude that the closing time of the anterior fontanel of the children in the northeast was longer?

Analysis: Example 1 deals with sample size estimation for the one-sided test for quantitative data with the single-group design. The sample mean, the standard mean, the sample standard deviation, the significance level and the testing power are all given, so formula (1) can be used for sample size estimation. The needed SAS program is as follows:

    %let alpha=0.05; %let beta=0.25; %let miu_0=14.1; %let miu_1=14.3;
    %let s=5.1; %let side=1;
    data a1;
      theta=&miu_1-&miu_0;
      if &side=1 then p1=1-&alpha; else p1=1-&alpha/2;
      p2=1-&beta;
      df=10**9;  /* a very large df stands in for df=infinity */
      t1=tinv(p1,df); t2=tinv(p2,df);
      w=&s/theta;
      n1=((t1+t2)*w)**2;
      df=n1-1;
      t1=tinv(p1,df); t2=tinv(p2,df);
      n2=((t1+t2)*w)**2;
      do while(ABS(n1-n2)>1);
        df=n2-1; n1=n2;
        t1=tinv(p1,df); t2=tinv(p2,df);
        n2=((t1+t2)*w)**2;
      end;
      n=CEIL(MAX(n1,n2));
      file print;
      if &side=1 then PUT #3 @10 ' mean comparison of the single-group design (the one-sided test), the needed sample size' n=;
      else PUT #3 @10 ' mean comparison of the single-group design (the two-sided test), the needed sample size ' n=;
    run; quit;

Program explanation: The six “%let” statements at the beginning specify the allowed probabilities of making a type Ⅰ error and a type Ⅱ error, the standard mean, the sample mean, the sample standard deviation and the number of sides of the test (1 for the one-sided test); they can be modified for other situations. The output: the needed sample size is *n*=3500. The output of the program may differ slightly from the result of a manual computation, because the *t* quantiles computed by the software are more accurate than those read from a *t* table. The corresponding advanced SAS program is as follows:

    proc power;
      onesamplemeans test=t sides=1 mean=0.2 stddev=5.1 ntotal=. power=0.75;
    run; quit;

Program explanation: The option “test=t” requests the *t* test of the mean; “sides=1” requests the one-sided test; “mean=0.2” specifies the difference between the population mean corresponding to the sample and the standard mean; “stddev=5.1” specifies the sample standard deviation; “ntotal=.” asks for the sample size to be solved; “power=0.75” specifies the testing power. The values in the program can be modified for other situations. The output is that 3500 children were needed.
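As a rough hand check of this result, the first pass of the iteration in formula (1) can be worked out directly. With *df*=∞ the *t* quantiles equal the standard normal quantiles, taken here as the rounded values *z*_{0.95}≈1.6449 and *z*_{0.75}≈0.6745; these two rounded quantiles are the only inputs not given in the example.

$$n_{1}=\left[\frac{(1.6449+0.6745)\times 5.1}{0.2}\right]^{2}=(2.3194\times 25.5)^{2}\approx 3498$$

Later passes use *df*=*n*-1, which enlarges the *t* quantiles slightly, so the final value is a little above this first approximation, consistent with the reported 3500.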
**1.1.3.2 Example 2**

In Example 1, suppose that the researcher did not estimate the sample size but instead used the data of the 36 children for the one-sided test, and concluded that there was no statistically significant difference between the population mean corresponding to the sample and the standard mean. Estimate the testing power. The needed program based on formula (3) is as follows:

    %let miu_0=14.1; %let miu_1=14.3; %let n=36; %let s=5.1;
    %let alpha=0.05; %let side=1;
    data a2;
      length sig $8 mean $40;  /* long enough for either message */
      theta=&miu_1-&miu_0;
      df=&n-1;
      if &side=1 then p1=1-&alpha; else p1=1-&alpha/2;
      t_1_beta_df=theta*sqrt(&n)/&s-tinv(p1,df);
      power=probt(t_1_beta_df,df);
      if power<0.75 then do; sig='<0.75,'; mean="the testing power is insufficient"; end;
      else do; sig='>=0.75,'; mean="the testing power is sufficient"; end;
      file print;
      PUT #3 @15 ' power= ' power 7.3 sig mean;
    run;

Program explanation: The six “%let” statements in the program specify the standard mean, the sample mean, the sample size, the sample standard deviation, the allowed probability of making type Ⅰ error and the one-sided test. The values in the program can be changed in specific situations. The output: Power=0.077<0.75, the testing power is insufficient. The needed advanced SAS program is as follows:

    proc power;
      onesamplemeans test=t sides=1 mean=0.2 stddev=5.1 ntotal=36 power=.;
    run; quit;

Program explanation: The option “test=t” requests the *t* test of the mean; “sides=1” specifies the one-sided test; “mean=0.2” specifies the difference between the population mean corresponding to the sample and the standard mean; “stddev=5.1” specifies the sample standard deviation; “ntotal=36” specifies the sample size; “power=.” asks for the testing power to be estimated. The output: power=0.079<0.75. The testing power is insufficient.

The result of the advanced SAS program is slightly higher than that of the SAS program based on formulas (3) and (4) because the two programs use different algorithms. The program based on formulas (3) and (4) evaluates a central *t* distribution shifted by *θ*√*n*/*S*, which is an approximation, whereas PROC POWER computes the power of the *t* test from the noncentral *t* distribution with noncentrality parameter *θ*√*n*/*S*. The difference between the two results is noticeable when the sample size is small and becomes negligible as the sample size grows.
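**1.2 Testing of the rate**

**1.2.1 Formulas for sample size estimation**

For one-sided test:

$$n=\left[\frac{t_{1-\alpha,df}\sqrt{p_{0}(1-p_{0})}+t_{1-\beta,df}\sqrt{p(1-p)}}{\theta}\right]^{2} \qquad (5)$$

For two-sided test:

$$n=\left[\frac{t_{1-\alpha/2,df}\sqrt{p_{0}(1-p_{0})}+t_{1-\beta,df}\sqrt{p(1-p)}}{\theta}\right]^{2} \qquad (6)$$
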
In formulas (5) and (6)^{[2]}, *p*_{0} and *p* stand for the standard rate and the sample rate, respectively; *θ* stands for the difference between the population rate of the sample group and that of the standard group, which can be replaced by its estimate if unknown; *df*=*n*-1. The iterative algorithm is also adopted for sample size estimation based on formulas (5) and (6). First, let *df*=∞ and denote the resulting *n* as *n*_{1}; then let *df*=*n*_{1}-1 and denote the resulting *n* as *n*_{2}; and so on. The process continues until two successive values of *n* are stable.
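**1.2.2 Formulas for testing power estimation**

First, compute the value of *t*_{1-β,df} by formula (7) or (8)^{[2]}, then compute the corresponding value of 1-*β*, namely, the testing power.

For one-sided test:

$$t_{1-\beta,df}=\frac{|\theta|\sqrt{n}-t_{1-\alpha,df}\sqrt{p_{0}(1-p_{0})}}{\sqrt{p(1-p)}} \qquad (7)$$

For two-sided test:

$$t_{1-\beta,df}=\frac{|\theta|\sqrt{n}-t_{1-\alpha/2,df}\sqrt{p_{0}(1-p_{0})}}{\sqrt{p(1-p)}} \qquad (8)$$
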
The symbols in formulas (7) and (8) have the same meanings as those in formulas (5) and (6).
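The sample size iteration for formulas (5) and (6) and the power computation for formulas (7) and (8) can be sketched together in Python with the standard library only. The function names are hypothetical, the *t* distribution is evaluated by numerical integration and bisection, and "stable" is taken to mean that two successive values of *n* differ by at most 1. The inputs at the end (*p*_{0}=0.60, *p*=0.75, *α*=0.05, *β*=0.25, and *n*=50 for the power) are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the central t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) via Simpson's rule on [0, x]; a negative x gives a negative step."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_ppf(p, df):
    """Quantile of the central t distribution by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_rate(p0, p, alpha, beta, sides=1):
    """Iterate formula (5) (sides=1) or formula (6) (sides=2) until n is stable."""
    theta = p - p0
    p1, p2 = 1 - alpha / sides, 1 - beta
    s0, s1 = math.sqrt(p0 * (1 - p0)), math.sqrt(p * (1 - p))

    def n_for(q1, q2):
        return ((q1 * s0 + q2 * s1) / theta) ** 2

    z = NormalDist()
    n_prev = n_for(z.inv_cdf(p1), z.inv_cdf(p2))   # first pass: df = infinity
    n_next = n_for(t_ppf(p1, n_prev - 1), t_ppf(p2, n_prev - 1))
    while abs(n_prev - n_next) > 1:                 # assumed stopping rule
        n_prev = n_next
        n_next = n_for(t_ppf(p1, n_prev - 1), t_ppf(p2, n_prev - 1))
    return math.ceil(max(n_prev, n_next))

def power_rate(p0, p, n, alpha, sides=1):
    """Testing power from formula (7) (sides=1) or formula (8) (sides=2)."""
    df = n - 1
    t_crit = t_ppf(1 - alpha / sides, df)
    t_beta = ((abs(p - p0) * math.sqrt(n) - t_crit * math.sqrt(p0 * (1 - p0)))
              / math.sqrt(p * (1 - p)))
    return t_cdf(t_beta, df)

n_rate = sample_size_rate(p0=0.60, p=0.75, alpha=0.05, beta=0.25)
power_rate_example = power_rate(p0=0.60, p=0.75, n=50, alpha=0.05)
print(n_rate, round(power_rate_example, 3))
```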
**1.2.3 Examples**

**1.2.3.1 Example 3**

According to past experience, the curative rate of traditional drug A in the treatment of a certain disease is 60%. A researcher wished to test the curative rate of a new drug B. A pilot test showed that the curative rate of drug B was 75%. Suppose that the significance level *α* was 0.05 and the testing power was set to 0.75. How many patients were needed in order to conclude that drug B was better than traditional drug A?

Analysis: Example 3 deals with sample size estimation for the one-sided test for qualitative data with the single-group design. The known information includes the sample rate, the standard rate, the significance level and the testing power, so formula (5) can be adopted for sample size estimation. The SAS program based on formula (5) is as follows:

    %let p=0.75; %let p0=0.60; %let theta=0.15; %let alpha=0.05; %let beta=0.25; %let side=1;
    data a3;
      if &side=1 then p1=1-&alpha; else p1=1-&alpha/2;
      p2=1-&beta;
      s0=sqrt(&p0*(1-&p0));  /* standard deviation under the standard rate */
      s1=sqrt(&p*(1-&p));    /* standard deviation under the sample rate */
      df=10**9;              /* a very large df stands in for df=infinity */
      n1=((tinv(p1,df)*s0+tinv(p2,df)*s1)/&theta)**2;
      df=n1-1;
      n2=((tinv(p1,df)*s0+tinv(p2,df)*s1)/&theta)**2;
      do while(ABS(n1-n2)>1);
        df=n2-1; n1=n2;
        n2=((tinv(p1,df)*s0+tinv(p2,df)*s1)/&theta)**2;
      end;
      n=CEIL(MAX(n1,n2));
      file print;
      PUT #3 @15 n 'patients were needed';
    run; quit;

Program explanation: The six “%let” statements in the program set the sample rate, the standard rate, the rate difference, the probability of making a type Ⅰ error, the probability of making a type Ⅱ error and the number of sides of the test (1 for the one-sided test). In similar situations, readers only need to change the values of the parameters in the six “%let” statements to get the result. The output: A total of 56 patients were needed. There is no available advanced SAS program yet.
**1.2.3.2 Example 4**

In Example 3, suppose that the researcher did not estimate the sample size. Instead, he performed the one-sided test using the data of 50 patients and concluded that drug B was not better than traditional drug A. Estimate the testing power. Below is the corresponding SAS program based on formula (7).

    %let p=0.75; %let p0=0.60; %let theta=0.15; %let alpha=0.05; %let n=50;
    %let side=1;
    data a4;
      length sig $8 mean $40;  /* long enough for either message */
      df=&n-1;
      if &side=1 then p1=1-&alpha; else p1=1-&alpha/2;
      t_1_beta_df=(abs(&theta)*sqrt(&n)-tinv(p1,df)*sqrt(&p0*(1-&p0)))/sqrt(&p*(1-&p));
      power=probt(t_1_beta_df,df);
      if power<0.75 then do; sig='<0.75,'; mean="the testing power is insufficient."; end;
      else do; sig='>=0.75,'; mean="the testing power is sufficient."; end;
      file print;
      PUT #3 @15 'power= ' power 7.3 sig mean;
    run;

Program explanation: The six “%let” statements specify the sample rate, the standard rate, the rate difference, the probability of making a type Ⅰ error, the sample size and the number of sides of the test (1 for the one-sided test). In similar situations, readers only need to change the values of the parameters in the six “%let” statements to get the result. The output: power=0.709<0.75, the testing power is insufficient. The needed advanced SAS program is as follows:

    proc power;
      onesamplefreq test=exact nullproportion=0.60 proportion=0.75 sides=U ntotal=50 power=.;
    run;
    proc power;
      onesamplefreq test=adjz nullproportion=0.60 proportion=0.75 sides=U ntotal=50 power=.;
    run;
    proc power;
      onesamplefreq test=z nullproportion=0.60 proportion=0.75 sides=U ntotal=50 power=.;
    run;

Program explanation: The option “test=exact | adjz | z” requests the exact binomial test, the continuity-adjusted Z test or the Z test, respectively; “nullproportion=” and “proportion=” set the standard rate and the sample rate; “sides=U | L | 1 | 2” requests the upper one-sided test, the lower one-sided test, a one-sided test in the direction of the effect or the two-sided test, respectively; “ntotal=50” specifies the sample size; “power=.” asks for the testing power to be estimated. The values of the parameters in the program can be altered in similar situations.

The output of the first procedure: power=0.637<0.75. The testing power is insufficient. The output of the second procedure: power=0.637<0.75. The testing power is insufficient. The output of the third procedure: power=0.748<0.75. The testing power is insufficient.

The three results are not exactly the same, and they differ slightly from the result of the program based on formulas (7) and (8). The first two procedures give the same result and are relatively conservative. The result of the third procedure may be slightly higher than the real value. The differences are caused by different underlying assumptions and different algorithms. When the sample size is small, the results of these algorithms differ more, and the conservative result should be used. When the sample size is large, the results differ only slightly, and any of the algorithms is acceptable.
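The three PROC POWER results can be cross-checked with a short Python sketch that sums binomial probabilities over each test's rejection region. This rests on an assumption that is not stated in the text: that each test's power is evaluated against the exact binomial distribution of the number of cured patients. The function names are hypothetical, and the continuity adjustment is taken to be a shift of 0.5 in the count.

```python
import math
from statistics import NormalDist

def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def critical_counts(n, p0, alpha):
    """Smallest count that rejects H0: rate = p0 in favour of a higher rate."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    se0 = math.sqrt(n * p0 * (1 - p0))
    # Z test: reject when (x - n*p0) / se0 >= z_crit.
    c_z = min(x for x in range(n + 1) if (x - n * p0) / se0 >= z_crit)
    # Continuity-adjusted Z test (assumed adjustment of 0.5 in the count).
    c_adjz = min(x for x in range(n + 1) if (x - n * p0 - 0.5) / se0 >= z_crit)
    # Exact binomial test: smallest c whose upper tail under p0 is at most alpha.
    c_exact = min(c for c in range(n + 2) if upper_tail(n, p0, c) <= alpha)
    return c_z, c_adjz, c_exact

n, p0, p, alpha = 50, 0.60, 0.75, 0.05
c_z, c_adjz, c_exact = critical_counts(n, p0, alpha)
power_z = upper_tail(n, p, c_z)
power_adjz = upper_tail(n, p, c_adjz)
power_exact = upper_tail(n, p, c_exact)
print(c_z, c_adjz, c_exact)
print(round(power_z, 3), round(power_adjz, 3), round(power_exact, 3))
```

Under this reading, the exact and continuity-adjusted tests share the same critical count, which would explain why the first two procedures report the same power, while the unadjusted Z test rejects at one fewer cured patient and therefore reports a higher power.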