
Frequency Distribution

2015-11-03 16:12
Data set: faithful, a data frame with 272 observations of 2 variables: the duration of each eruption of the Old Faithful geyser (eruptions) and the waiting time until the next eruption (waiting), both in minutes.

> head(faithful)
  eruptions waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
Checking the range of the eruption durations.
> duration <- faithful$eruptions
> range(duration)
[1] 1.6 5.1
Setting the breaks of the distribution. Half-unit break points from 1.5 to 5.5 cover the observed range [1.6, 5.1].

> breaks <- seq(1.5, 5.5, by = 0.5)

Cutting duration into classes.

The cut() function divides the range of x into intervals and codes the values in x according to the interval in which they fall.

Arguments: x - a numeric vector

      breaks - either a numeric vector of two or more unique cut points or a single number giving the number of intervals into which x is to be cut.

      right - logical, indicating whether the intervals should be closed on the right (and open on the left) or vice versa. The default is TRUE, meaning open on the left and closed on the right.
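
To make the effect of the right argument concrete, here is a minimal sketch on made-up data. The vector x and the break points brk are hypothetical values chosen for illustration, not part of the faithful data; the value 2 sits exactly on a break, so it changes class depending on which side of the interval is closed.

```r
# Toy data (hypothetical): the value 2 lies exactly on a break point
x   <- c(1.5, 2, 2.5)
brk <- c(1, 2, 3)

# Default right = TRUE: intervals are (a, b], so 2 falls in (1,2]
closed.right <- cut(x, brk)

# right = FALSE: intervals are [a, b), so 2 falls in [2,3)
closed.left <- cut(x, brk, right = FALSE)

print(as.character(closed.right))
print(as.character(closed.left))
```

With right = FALSE, as used for the eruption durations, a value equal to a break point is counted in the class that starts at that break.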

> duration.cut <- cut(duration, breaks, right = FALSE)
> duration.cut
[1] [3.5,4) [1.5,2) [3,3.5) [2,2.5) [4.5,5) [2.5,3) [4.5,5) [3.5,4) [1.5,2) [4,4.5)
[11] [1.5,2) [3.5,4) [4,4.5) [1.5,2) [4.5,5) [2,2.5) [1.5,2) [4.5,5) [1.5,2) [4,4.5)
......
Computing the frequency of each class.
> duration.freq <- table(duration.cut)
> duration.freq
duration.cut
[1.5,2) [2,2.5) [2.5,3) [3,3.5) [3.5,4) [4,4.5) [4.5,5) [5,5.5)
     51      41       5       7      30      73      61       4
> cbind(duration.freq)
        duration.freq
[1.5,2)            51
[2,2.5)            41
[2.5,3)             5
[3,3.5)             7
[3.5,4)            30
[4,4.5)            73
[4.5,5)            61
[5,5.5)             4

Computing the cumulative frequency of each class.

> duration.comfreq <- cumsum(duration.freq)
> duration.comfreq
[1.5,2) [2,2.5) [2.5,3) [3,3.5) [3.5,4) [4,4.5) [4.5,5) [5,5.5)
     51      92      97     104     134     207     268     272
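
As a sanity check on the cumulative frequencies, the following sketch rebuilds them from scratch and compares them with direct counts. It is only an illustration under the same breaks as above: because the classes are closed on the left, each cumulative value should equal the number of durations strictly below the class's upper break, and the last one should equal the number of observations.

```r
# Rebuild the frequency table with the same breaks as in the text
duration         <- faithful$eruptions
breaks           <- seq(1.5, 5.5, by = 0.5)
duration.freq    <- table(cut(duration, breaks, right = FALSE))
duration.comfreq <- cumsum(duration.freq)

# Direct count: durations strictly below each upper break point
direct <- sapply(breaks[-1], function(b) sum(duration < b))

# Both checks should print TRUE
print(all(duration.comfreq == direct))
print(unname(tail(duration.comfreq, 1)) == length(duration))
```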

Plotting the cumulative frequency graph. A leading 0 is prepended so that there is one cumulative value for each break point.

> comfreq0 <- c(0, duration.comfreq)
> comfreq0
        [1.5,2) [2,2.5) [2.5,3) [3,3.5) [3.5,4) [4,4.5) [4.5,5) [5,5.5)
      0      51      92      97     104     134     207     268     272
> plot(breaks, comfreq0)
> lines(breaks, comfreq0)
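
The two plotting calls can also be combined into one, with a title and axis labels added. This is just one possible sketch: the title and label strings are assumptions made here for readability, not taken from the original plot.

```r
# Recompute the values used in the text
duration         <- faithful$eruptions
breaks           <- seq(1.5, 5.5, by = 0.5)
duration.comfreq <- cumsum(table(cut(duration, breaks, right = FALSE)))

# 9 break points but only 8 classes, so prepend 0 for the first break
comfreq0 <- c(0, duration.comfreq)

# type = "o" draws the points and the connecting lines in a single call
plot(breaks, comfreq0, type = "o",
     main = "Old Faithful Eruptions",   # assumed title
     xlab = "Duration (minutes)",       # assumed axis label
     ylab = "Cumulative frequency")     # assumed axis label
```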


ecdf() computes an empirical cumulative distribution function.

> round(duration)
[1] 4 2 3 2 5 3 5 4 2 4 2 4 4 2 5 2 2 5 2 4 2 2 3 3 5 4 2 4 4 4 4 4 3 4 4 2 2 5 2 5 4
[42] 2 5 2 5 3 4 2 5 2 5 5 2 5 2 5 4 2 5 4 2 4 2 5 2 4 4 5 2 5 4 2 4 4 2 5 2 5 4 4 4 4
[83] 4 3 4 5 4 5 2 4 2 4 2 5 2 4 5 4 2 5 2 4 2 4 4 2 5 2 5 4 5 2 5 4 2 5 2 5 2 4 3 4 4
[124] 2 5 4 2 4 2 5 2 4 3 4 2 4 2 5 2 4 4 2 5 5 4 2 5 2 5 2 5 4 2 5 4 4 4 4 2 4 2 4 2 4
[165] 4 5 2 5 2 5 2 2 5 3 4 4 4 2 4 4 2 5 4 4 2 4 4 2 4 2 5 2 5 4 4 4 4 4 2 5 2 4 4 2 5
[206] 2 4 4 2 4 2 5 2 4 3 4 2 5 2 4 2 4 2 4 4 4 4 4 4 5 4 2 4 2 4 2 2 4 4 2 4 2 5 3 5 4
[247] 2 4 2 4 2 4 4 4 4 4 4 4 2 4 5 5 2 4 2 2 5 4 2 4 2 4
> Fn <- ecdf(duration)
> Fn
Empirical CDF
Call: ecdf(duration)
x[1:126] = 1.6, 1.667, 1.7, ..., 5.067, 5.1
> plot(Fn)
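
The object returned by ecdf() is itself a function, so it can be evaluated at any point. The sketch below evaluates it at 3 minutes (an arbitrary point chosen here for illustration) and compares the result with the proportion computed by hand. Note that the ECDF counts values less than or equal to its argument, whereas the left-closed classes above count values strictly below each upper break.

```r
duration <- faithful$eruptions
Fn       <- ecdf(duration)

# Proportion of eruptions lasting at most 3 minutes
p3 <- Fn(3)

# The same proportion computed directly from the data
manual <- mean(duration <= 3)

print(p3)
print(isTRUE(all.equal(p3, manual)))

# Scaling by the sample size gives a cumulative count
print(p3 * length(duration))
```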
Tags: R, statistics