
R Data Structures 5: factor

2014-01-27 02:29
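Categorical data comes in two kinds: nominal variables and ordered categorical (ordinal) variables. In R both are called factors. The function factor() stores the category values as a vector of integers in the range [1, k], where k is the number of unique values in the variable, and keeps an internal vector of character strings (the original values) that is mapped to these integers.
For example, suppose we have the vector:
diabetes <- c("type1", "type2", "type1", "type1")
The statement diabetes <- factor(diabetes) stores this vector as (1, 2, 1, 1) and internally associates 1 = type1 and 2 = type2 (by default the levels are assigned in alphabetical order). Any analysis performed on the vector diabetes then treats it as a nominal variable and selects statistical methods appropriate for that level of measurement.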
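The integer codes and the level labels described above can be inspected directly. The following is a minimal sketch using only base R functions (as.integer(), levels(), nlevels()); the name diabetes.factor is introduced here for illustration and does not appear in the original text.

```r
# Original character vector from the example above
diabetes <- c("type1", "type2", "type1", "type1")

# diabetes.factor is an illustrative name (assumption, not from the text)
diabetes.factor <- factor(diabetes)

# Integer codes stored internally, one per element
codes <- as.integer(diabetes.factor)

# Labels that the codes refer to; sorted by default
labels <- levels(diabetes.factor)

# k, the number of distinct categories
k <- nlevels(diabetes.factor)

print(codes)   # 1 2 1 1
print(labels)  # "type1" "type2"
print(k)       # 2

# Indexing the labels by the codes recovers the original values
recovered <- labels[codes]
print(recovered)
```

Keeping a separate name for the factor, rather than overwriting diabetes, makes it easy to compare the raw strings with their coded form.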

# Create a factor
gender.vector <- c("Male", "Female", "Female", "Male", "Male")

factor.gender.vector <- factor(gender.vector)
factor.gender.vector

> factor.gender.vector
[1] Male   Female Female Male   Male
Levels: Female Male

hair.color.vector <- c("Blonde", "Blonde", "Brunette", "Ginger", "Grey", "Brunette")
temperature.vector <- c("High", "Low", "High", "Low", "Medium")

# Nominal factor: levels have no inherent order
factor.hair.color.vector <- factor(hair.color.vector)
# Ordinal factor: levels are listed from lowest to highest
factor.temperature.vector <- factor(temperature.vector,
                                    ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

factor.temperature.vector
factor.hair.color.vector

> factor.temperature.vector
[1] High   Low    High   Low    Medium
Levels: Low < Medium < High
> factor.hair.color.vector
[1] Blonde   Blonde   Brunette Ginger   Grey     Brunette
Levels: Blonde Brunette Ginger Grey

survey.vector <- c("M","F","F","M","M")
factor.survey.vector <- factor( survey.vector )
factor.survey.vector

levels(factor.survey.vector) <- c("Female","Male")

factor.survey.vector

> factor.survey.vector # Before renaming the levels
[1] M F F M M
Levels: F M
> factor.survey.vector # After renaming the levels
[1] Male   Female Female Male   Male
Levels: Female Male
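The levels(...) <- assignment above relies on the new names being given in the same order as the existing levels ("F", "M"). As a hedged alternative, the sketch below pairs each raw value with its label explicitly through the levels and labels arguments of factor(); the names factor.survey.labelled and factor.survey.renamed are illustrative only.

```r
survey.vector <- c("M", "F", "F", "M", "M")

# Pair each raw value with its label explicitly
factor.survey.labelled <- factor(survey.vector,
                                 levels = c("F", "M"),
                                 labels = c("Female", "Male"))

# The two-step approach used above, for comparison
factor.survey.renamed <- factor(survey.vector)
levels(factor.survey.renamed) <- c("Female", "Male")

# Both approaches give the same values and the same levels
same.values <- identical(as.character(factor.survey.labelled),
                         as.character(factor.survey.renamed))
same.levels <- identical(levels(factor.survey.labelled),
                         levels(factor.survey.renamed))
print(same.values)
print(same.levels)
```

Writing the pairs out makes a mix-up visible in the code itself, whereas assigning c("Male", "Female") to levels() would silently swap the two labels.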

> survey.vector <- c("M", "F", "F", "M", "M")
> factor.survey.vector <- factor(survey.vector)
> levels(factor.survey.vector) <- c("Female", "Male")
> factor.survey.vector
[1] Male   Female Female Male   Male
Levels: Female Male
> # Summary of the character vector: only length, class and mode
> summary(survey.vector)
   Length     Class      Mode
        5 character character
> # Summary of the factor: number of observations per level
> summary(factor.survey.vector)
Female   Male
     2      3
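As the output above shows, summary() on a factor counts observations per level. The sketch below, under the assumption of a hypothetical extra level "U" (unknown) that never occurs in the data, illustrates that the counts follow the declared levels rather than the observed values, and that droplevels() removes unused levels.

```r
survey.vector <- c("M", "F", "F", "M", "M")

# "U" is a hypothetical level with no observations (assumption)
factor.survey.extra <- factor(survey.vector, levels = c("F", "M", "U"))

# Counts per declared level, including the empty one
counts <- summary(factor.survey.extra)
print(counts)

# table() reports the same counts
tab <- table(factor.survey.extra)
print(tab)

# droplevels() keeps only the levels that actually occur
dropped <- droplevels(factor.survey.extra)
print(levels(dropped))
```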

speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")
factor.speed.vector <- factor(speed.vector, ordered = TRUE, levels = c("Slow", "Fast", "Ultra-fast"))
factor.speed.vector
summary(factor.speed.vector)

> factor.speed.vector
[1] Fast       Slow       Slow       Fast       Ultra-fast
Levels: Slow < Fast < Ultra-fast
> summary(factor.speed.vector)
      Slow       Fast Ultra-fast
         2          2          1

speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")
speed.factor.vector <- factor(speed.vector, ordered = TRUE, levels = c("Slow", "Fast", "Ultra-fast"))

speed.factor.vector
compare.them <- speed.factor.vector[2] > speed.factor.vector[5]

# Is data analyst 2 faster than data analyst 5?
compare.them

> speed.factor.vector
[1] Fast       Slow       Slow       Fast       Ultra-fast
Levels: Slow < Fast < Ultra-fast
> compare.them
[1] FALSE
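The comparison above works because the factor is ordered. The sketch below contrasts this with an unordered factor built from the same data, where > is not meaningful and R returns NA with a warning; the names speed.ordered and speed.unordered are illustrative only.

```r
speed.vector <- c("Fast", "Slow", "Slow", "Fast", "Ultra-fast")

# Ordered factor: comparisons follow the declared level order
speed.ordered <- factor(speed.vector, ordered = TRUE,
                        levels = c("Slow", "Fast", "Ultra-fast"))

# Unordered factor: levels carry no ranking
speed.unordered <- factor(speed.vector)

# Element 2 ("Slow") against element 5 ("Ultra-fast")
ordered.result <- speed.ordered[2] > speed.ordered[5]

# For an unordered factor the result is NA; the warning is suppressed here
unordered.result <- suppressWarnings(speed.unordered[2] > speed.unordered[5])

print(ordered.result)    # FALSE
print(unordered.result)  # NA

# Ordered factors also support comparison against a level name, max() and sort()
slower.than.fast <- speed.ordered < "Fast"
fastest <- as.character(max(speed.ordered))
sorted.speeds <- as.character(sort(speed.ordered))
print(slower.than.fast)
print(fastest)
print(sorted.speeds)
```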