R Data Structures 5: factor
2014-01-27 02:29
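R distinguishes two kinds of categorical variables: nominal (unordered categories) and ordinal (ordered categories). Both are called factors in R. The factor() function stores the category values as an integer vector whose values lie in the range [1...k], where k is the number of unique values in the variable, and an internal vector of strings (the original values) is mapped to these integers.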
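For example, suppose we have the vector:
diabetes <- c("type1", "type2", "type1", "type1")
The statement diabetes <- factor(diabetes) stores this vector as (1, 2, 1, 1) and internally associates 1 = type1 and 2 = type2 (the codes are assigned in alphabetical order of the values). Any analysis performed on diabetes then treats it as a nominal variable, and R's modeling and summary functions handle it with methods suited to that level of measurement.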
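To make the integer storage described above concrete, here is a minimal sketch that pulls the codes and the labels apart. The variable names codes, labels and recovered are introduced only for this illustration.

```r
# Illustration only: inspect how a factor is stored internally.
diabetes <- c("type1", "type2", "type1", "type1")
diabetes <- factor(diabetes)

codes <- as.integer(diabetes)   # the integer codes: 1 2 1 1
labels <- levels(diabetes)      # the level labels: "type1" "type2"

# Indexing the labels by the codes recovers the original strings.
recovered <- labels[codes]

print(codes)
print(labels)
print(recovered)
```

The codes are positions in levels(diabetes), so changing the level order changes the codes but not the values the factor displays.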
# Create a factor
gender.vector <- c("Male", "Female", "Female", "Male",
"Male")
factor.gender.vector <- factor(gender.vector)
factor.gender.vector
> factor.gender.vector
[1] Male Female Female Male Male
Levels: Female Male
hair.color.vector <- c("Blonde", "Blonde", "Brunette",
"Ginger", "Grey", "Brunette")
temperature.vector <- c("High", "Low", "High", "Low",
"Medium")
factor.hair.color.vector <- factor(hair.color.vector)
factor.temperature.vector <- factor(temperature.vector,
ordered = TRUE, levels = c("Low", "Medium", "High"))
factor.temperature.vector
factor.hair.color.vector
> factor.temperature.vector
[1] High Low High Low Medium
Levels: Low < Medium < High
> factor.hair.color.vector
[1] Blonde Blonde Brunette Ginger
Grey Brunette
Levels: Blonde Brunette Ginger Grey
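The two factors above differ in whether their levels carry an order. The sketch below checks this with is.ordered() and shows why the levels argument matters for the temperature data: without it, the levels would fall back to alphabetical order. The names temperature, hair and default.order are assumptions made for this example.

```r
# Illustration only: ordered versus unordered factors.
temperature.vector <- c("High", "Low", "High", "Low", "Medium")
hair.color.vector <- c("Blonde", "Blonde", "Brunette",
                       "Ginger", "Grey", "Brunette")

temperature <- factor(temperature.vector, ordered = TRUE,
                      levels = c("Low", "Medium", "High"))
hair <- factor(hair.color.vector)

print(is.ordered(temperature))   # TRUE
print(is.ordered(hair))          # FALSE

# With explicit levels, the codes follow Low = 1, Medium = 2, High = 3.
print(as.integer(temperature))   # 3 1 3 1 2

# Without explicit levels, the order is alphabetical: High < Low < Medium,
# which is not the intended ranking.
default.order <- factor(temperature.vector, ordered = TRUE)
print(levels(default.order))
```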
survey.vector <- c("M","F","F","M","M")
factor.survey.vector <- factor( survey.vector )
factor.survey.vector
levels(factor.survey.vector) <- c("Female","Male")
factor.survey.vector
> factor.survey.vector #Print to console
[1] M F F M M
Levels: F M
> factor.survey.vector
[1] Male Female Female Male Male
Levels: Female Male
> survey.vector <- c("M", "F", "F",
"M", "M")
> factor.survey.vector <- factor(survey.vector)
> levels(factor.survey.vector) <- c("Female",
"Male")
> factor.survey.vector
[1] Male Female Female Male Male
Levels: Female Male
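Assigning to levels() renames by position, so the new names must be listed in the same order as the existing levels ("F" then "M" here). As a hedged alternative, the sketch below spells out the mapping with the levels and labels arguments of factor(), and shows what happens if the positional names are given in the wrong order. The names explicit and wrong are hypothetical.

```r
# Illustration only: renaming levels safely.
survey.vector <- c("M", "F", "F", "M", "M")

# Explicit mapping: "F" -> "Female", "M" -> "Male".
explicit <- factor(survey.vector,
                   levels = c("F", "M"),
                   labels = c("Female", "Male"))
print(as.character(explicit))   # "Male" "Female" "Female" "Male" "Male"

# Positional renaming in the wrong order silently swaps the categories,
# because the existing levels are c("F", "M").
wrong <- factor(survey.vector)
levels(wrong) <- c("Male", "Female")
print(as.character(wrong))      # "Female" "Male" "Male" "Female" "Female"
```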
> # Summary of the character vector survey.vector
> summary(survey.vector)
Length Class Mode
5 character character
> # Summary of the factor factor.survey.vector
> summary(factor.survey.vector)
Female Male
2 3
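As the output above shows, summary() on a factor counts the observations in each level, whereas on a character vector it only reports length and type. The following sketch compares summary() with table() and adds a level with no observations to show that it is still counted. The level "O" and the variable names are assumptions made for illustration.

```r
# Illustration only: counting observations per level.
survey.vector <- c("M", "F", "F", "M", "M")
factor.survey.vector <- factor(survey.vector,
                               levels = c("F", "M"),
                               labels = c("Female", "Male"))

counts <- summary(factor.survey.vector)   # named counts per level
tab <- table(factor.survey.vector)        # same counts as a table
print(counts)
print(tab)

# A declared level with no observations is reported with a count of 0.
with.unused <- factor(survey.vector, levels = c("F", "M", "O"))
unused.counts <- summary(with.unused)
print(unused.counts)
```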
speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")
factor.speed.vector <- factor(speed.vector, ordered = TRUE,
levels = c("Slow", "Fast", "Ultra-fast"))
factor.speed.vector
summary(factor.speed.vector)
> factor.speed.vector
[1] Fast Slow Slow Fast
Ultra-fast
Levels: Slow < Fast < Ultra-fast
> summary(factor.speed.vector)
Slow Fast Ultra-fast
2 2 1
speed.vector <- c("Fast","Slow","Slow","Fast","Ultra-fast")
speed.factor.vector <- factor(speed.vector, ordered = TRUE,
levels = c("Slow", "Fast", "Ultra-fast"))
speed.factor.vector
compare.them <- speed.factor.vector[2] > speed.factor.vector[5]
# Is data analyst 2 faster than data analyst 5?
compare.them
> speed.factor.vector
[1] Fast Slow Slow Fast
Ultra-fast
Levels: Slow < Fast < Ultra-fast
> compare.them
[1] FALSE
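The comparison above works only because the factor is ordered. The sketch below repeats it, adds a few other operations that rely on the level order, and contrasts them with an unordered factor, where the comparison yields NA with a warning. The names ordered.speed, unordered.speed, fastest, at.least.fast and result are introduced for this example.

```r
# Illustration only: comparisons on ordered and unordered factors.
speed.vector <- c("Fast", "Slow", "Slow", "Fast", "Ultra-fast")
ordered.speed <- factor(speed.vector, ordered = TRUE,
                        levels = c("Slow", "Fast", "Ultra-fast"))

print(ordered.speed[2] > ordered.speed[5])   # FALSE: Slow is not above Ultra-fast

# max() and comparison against a level name use the level order.
fastest <- max(ordered.speed)
at.least.fast <- which(ordered.speed >= "Fast")
print(as.character(fastest))   # "Ultra-fast"
print(at.least.fast)           # 1 4 5

# An unordered factor has no ranking, so ">" is not meaningful:
# R returns NA and issues a warning (suppressed here).
unordered.speed <- factor(speed.vector)
result <- suppressWarnings(unordered.speed[2] > unordered.speed[5])
print(result)                  # NA
```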