
Chapter 07 - Basic statistics (Part 4: t-tests and nonparametric tests of group differences)

2013-08-23 22:38
I. t-tests

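This part uses the UScrime dataset distributed with the MASS package. It describes the effect of punishment regimes on crime rates in 47 US states in 1960. The variables used here are: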

Prob: the probability of imprisonment;

U1: the unemployment rate of urban males aged 14 to 24;

U2: the unemployment rate of urban males aged 35 to 39;

So:an indicator variable for Southern states
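
Before running any test, it can help to look at these four variables. The following is a minimal sketch, assuming the MASS package is installed (it ships with standard R distributions); the name `vars` is only illustrative.

```r
library(MASS)  # provides the UScrime data frame

# Keep only the variables used in this part
vars <- UScrime[c("Prob", "U1", "U2", "So")]

str(vars)       # one row per state
table(vars$So)  # number of non-Southern (0) and Southern (1) states
```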

1. Independent t-test

t.test(y~x,data)

t.test(y1,y2)

Example 01:

> library(MASS)
> t.test(Prob~So,data=UScrime)

Welch Two Sample t-test

data:  Prob by So
t = -3.8954, df = 24.925, p-value = 0.0006506
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.03852569 -0.01187439
sample estimates:
mean in group 0 mean in group 1
0.03851265      0.06371269

Note: we can reject the hypothesis that Southern and non-Southern states have the same probability of imprisonment, since p < 0.001.
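
The formula form and the two-vector form of t.test() should give the same result on the same data. The sketch below checks this on UScrime; the variable names (`prob_nonsouth`, `prob_south`, `res_formula`, `res_vectors`) are illustrative and not part of the original example.

```r
library(MASS)

# Formula interface: outcome ~ two-level grouping variable
res_formula <- t.test(Prob ~ So, data = UScrime)

# Two-vector interface: one numeric vector per group
prob_nonsouth <- UScrime$Prob[UScrime$So == 0]
prob_south    <- UScrime$Prob[UScrime$So == 1]
res_vectors   <- t.test(prob_nonsouth, prob_south)

# Both calls run the Welch test (var.equal = FALSE is the default)
res_formula$statistic
res_vectors$statistic

# Individual pieces of the result can be extracted by name
res_formula$p.value
res_formula$conf.int
```

Passing var.equal=TRUE would request the classical pooled-variance test instead of the Welch version.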

2. Dependent (paired) t-test

t.test(y1,y2,paired=TRUE)

·y1 and y2 are numeric vectors for the two dependent groups.

Example 02:

> library(MASS)
> sapply(UScrime[c("U1","U2")],function(x)(c(mean=mean(x),sd=sd(x))))
U1       U2
mean 95.46809 33.97872
sd   18.02878  8.44545
> with(UScrime,t.test(U1,U2,paired=TRUE))

Paired t-test

data:  U1 and U2
t = 32.4066, df = 46, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
57.67003 65.30870
sample estimates:
mean of the differences
61.48936
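
A paired t-test is a one-sample t-test on the within-pair differences, which is why the output reports the "mean of the differences". A minimal sketch of that equivalence on the same data (the names `paired` and `one_smp` are illustrative):

```r
library(MASS)

# Paired test on the two unemployment rates
paired <- with(UScrime, t.test(U1, U2, paired = TRUE))

# One-sample test on the differences (null hypothesis: mean difference = 0)
one_smp <- with(UScrime, t.test(U1 - U2))

paired$statistic
one_smp$statistic

# The estimate reported as "mean of the differences"
mean(UScrime$U1 - UScrime$U2)
```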


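II. Nonparametric tests of group differences

1. Comparing two groups

If the two groups are independent, use the Wilcoxon rank sum test (also known as the Mann-Whitney U test) to assess whether the observations in the two groups are sampled from the same probability distribution.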

wilcox.test(y~x,data)

wilcox.test(y1,y2)

Example 03:

> with(UScrime,by(Prob,So,median))
So: 0
[1] 0.038201
--------------------------------------------------------
So: 1
[1] 0.055552
> wilcox.test(Prob~So,data=UScrime)

Wilcoxon rank sum test

data:  Prob by So
W = 81, p-value = 8.488e-05
alternative hypothesis: true location shift is not equal to 0
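
To see where W comes from, it can be recomputed from the ranks: pool the observations, rank them, sum the ranks of the first group (So = 0), and subtract the smallest value that sum could take. This is a sketch under the assumption that no further adjustment is applied, and it should reproduce the W reported above; the names `r`, `n0` and `w_manual` are illustrative.

```r
library(MASS)

prob <- UScrime$Prob
so   <- UScrime$So

r  <- rank(prob)     # ranks over the pooled sample
n0 <- sum(so == 0)   # size of the first group (So = 0)

# Rank sum of the first group minus its minimum possible value
w_manual <- sum(r[so == 0]) - n0 * (n0 + 1) / 2

w_test <- wilcox.test(Prob ~ So, data = UScrime)

w_manual
unname(w_test$statistic)
```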

Example 04 (dependent groups, using paired=TRUE):

> sapply(UScrime[c("U1","U2")],median)
U1 U2
92 34
> with(UScrime,wilcox.test(U1,U2,paired=TRUE))

Wilcoxon signed rank test with continuity correction

data:  U1 and U2
V = 1128, p-value = 2.464e-09
alternative hypothesis: true location shift is not equal to 0


2. Comparing more than two groups

Kruskal-Wallis test:

kruskal.test(y~A,data)

·A: a grouping variable with two or more levels; with just two levels, the test is equivalent to the Mann-Whitney test;

·y:a numeric outcome variable;
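
The two-level equivalence can be checked on UScrime. In the sketch below, kruskal.test() is compared with wilcox.test() using the normal approximation without continuity correction (exact=FALSE, correct=FALSE); the two p-values should agree, although both differ slightly from the exact p-value in Example 03. The names `kw` and `mw` are illustrative.

```r
library(MASS)

# Kruskal-Wallis test with a two-level grouping variable
kw <- kruskal.test(Prob ~ So, data = UScrime)

# Rank sum test using the normal approximation, no continuity correction
mw <- wilcox.test(Prob ~ So, data = UScrime, exact = FALSE, correct = FALSE)

kw$p.value
mw$p.value
```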

Friedman test:

friedman.test(y~A|B,data)

·B: a blocking variable that identifies matched observations.
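
The formula y~A|B expects long-format data with one row per combination of group and block. As a sketch only, the UScrime unemployment rates can be reshaped so that the age group (U1 or U2) is the grouping variable and the state is the block. The data frame `long` and its column names are assumptions made for illustration, and with only two groups this merely demonstrates the call syntax.

```r
library(MASS)

n <- nrow(UScrime)

# Long format: one row per (state, age group) combination
long <- data.frame(
  rate  = c(UScrime$U1, UScrime$U2),
  group = factor(rep(c("U1", "U2"), each = n)),
  state = factor(rep(seq_len(n), times = 2))
)

fr_formula <- friedman.test(rate ~ group | state, data = long)

# Equivalent matrix form: rows are blocks, columns are groups
fr_matrix <- friedman.test(as.matrix(UScrime[c("U1", "U2")]))

fr_formula$statistic
fr_matrix$statistic
```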

The npmc() function in the npmc package expects input with two columns, named var (the dependent variable) and class (the grouping variable).