
ISLR Chapter 2: R Basics

2015-07-25 05:36
2.3.1 Basic Commands

Create a vector

> x=c(1,6,2)

Create a matrix

> x=matrix(data=c(1,2,3,4),nrow=2,ncol=2)

> x=matrix(c(1,2,3,4),2,2)

> x[1,2]
[1] 3


With byrow=TRUE the matrix is populated by rows instead of by columns (the default):

> matrix(c(1,2,3,4),2,2,byrow=TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
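
The difference between the two fill orders can be checked directly. This is a minimal sketch; the names by_col, by_row and same are made up for the illustration.

```r
# Same four values, filled in two different orders.
by_col <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)                # default: column by column
by_row <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)  # row by row

# For this 2 x 2 case, filling by rows equals the transpose of filling by columns.
same <- identical(by_row, t(by_col))

print(by_col)
print(by_row)
print(same)
```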

The ls() function allows us to look at a list of all of the objects we have saved so far.

> ls()
[1] "x" "y"
> rm(x,y)
> ls()
character(0)


Remove all objects at once

> rm(list=ls())


Get help for a function

> ?matrix

Correlation coefficient

> x=rnorm(50)
> y=x+rnorm(50,mean=50,sd=.1)
> cor(x,y)
[1] 0.995
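
To see what cor() is computing, the Pearson correlation can be written out from its definition and compared. A rough sketch: the seed value is an assumption added only so the run is repeatable, and manual is a made-up name.

```r
set.seed(1)  # assumed seed, not from the lab; only makes the draws repeatable
x <- rnorm(50)
y <- x + rnorm(50, mean = 50, sd = 0.1)

# Pearson correlation written out from its definition.
manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

print(cor(x, y))
print(manual)
```

Because y is x plus a small amount of noise, the correlation comes out close to 1.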


Random quantities

Use set.seed() throughout the labs whenever we perform calculations involving random quantities. In general this should allow the user to reproduce our results.

> set.seed(3)
> y=rnorm(100)
> mean(y)
[1] 0.0110
> var(y)
[1] 0.7329
> sqrt(var(y))
[1] 0.8561
> sd(y)
[1] 0.8561
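
A small sketch of why the seed matters: resetting it reproduces the same draws, and sd() is the square root of var(). The variable names are made up for the illustration.

```r
# Resetting the seed replays exactly the same random numbers.
set.seed(3)
first <- rnorm(5)
set.seed(3)
second <- rnorm(5)
same_draws <- identical(first, second)

# sd() is the square root of var().
y <- rnorm(100)
sd_matches <- isTRUE(all.equal(sqrt(var(y)), sd(y)))

print(same_draws)
print(sd_matches)
```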


2.3.2 Graphics

Plot

> x=rnorm(100)
> y=rnorm(100)
> plot(x,y)
> plot(x,y,xlab="this is the x-axis",ylab="this is the y-axis",
  main="Plot of X vs Y")


Make a sequence

> x=seq(-1,2,length=50)
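
A few seq() calls side by side, as a quick sketch. The full argument name length.out is used here; length= in the note above works through R's partial argument matching.

```r
a <- seq(0, 1, length.out = 5)    # five equally spaced points from 0 to 1
b <- seq(3, 11)                   # same values as 3:11
x <- seq(-1, 2, length.out = 50)  # the sequence used in the note above

print(a)
print(b)
print(length(x))
```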

Contour plot

> y=x
> f=outer(x,y,function(x,y)cos(y)/(1+x^2))
> contour(x,y,f)
> contour(x,y,f,nlevels=45,add=T)
> fa=(f-t(f))/2
> contour(x,y,fa,nlevels=15)

Heatmap with image() and 3D view with persp(); theta and phi set the viewing angles

> image(x,y,fa)
> persp(x,y,fa)
> persp(x,y,fa,theta=30)
> persp(x,y,fa,theta=30,phi=20)
> persp(x,y,fa,theta=30,phi=70)
> persp(x,y,fa,theta=30,phi=40)
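
What outer() builds, and what fa is, can be checked without drawing anything. A minimal sketch; entry_ok and antisym_ok are made-up names.

```r
x <- seq(-1, 2, length.out = 50)
y <- x
f <- outer(x, y, function(x, y) cos(y) / (1 + x^2))

# Entry [i, j] of f is the function evaluated at (x[i], y[j]).
entry_ok <- isTRUE(all.equal(f[2, 3], cos(y[3]) / (1 + x[2]^2)))

# (f - t(f)) / 2 is the antisymmetric part of f: transposing it flips the sign.
fa <- (f - t(f)) / 2
antisym_ok <- isTRUE(all.equal(fa, -t(fa)))

print(dim(f))
print(entry_ok)
print(antisym_ok)
```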

2.3.3 Indexing Data

Matrix indices start at 1

> A=matrix(1:16,4,4)
> A
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

> A[2,3]
[1] 10


A rather odd-looking selection

> A[c(1,3),c(2,4)]
     [,1] [,2]
[1,]    5   13
[2,]    7   15

That is, rows 1 and 3 crossed with columns 2 and 4:
A[1,2] A[1,4]
A[3,2] A[3,4]


> A[1:3,2:4]
     [,1] [,2] [,3]
[1,]    5    9   13
[2,]    6   10   14
[3,]    7   11   15

> A[1:2,]
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14

> A[,1:2]
     [,1] [,2]
[1,]    1    5
[2,]    2    6
[3,]    3    7
[4,]    4    8

> A[1,]
[1]  1  5  9 13

A negative index keeps everything except the listed rows or columns

> A[-c(1,3),]
     [,1] [,2] [,3] [,4]
[1,]    2    6   10   14
[2,]    4    8   12   16
> A[-c(1,3),-c(1,3,4)]
[1] 6 8


dim() gives the number of rows followed by the number of columns

> dim(A)
[1] 4 4
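
The indexing rules above in one place, as a sketch. The drop = FALSE line is an addition not in the lab: it keeps a single remaining column as a matrix instead of collapsing it to a vector.

```r
A <- matrix(1:16, 4, 4)

sub  <- A[c(1, 3), c(2, 4)]                     # rows 1 and 3, columns 2 and 4
kept <- A[-c(1, 3), ]                           # everything except rows 1 and 3
vec  <- A[-c(1, 3), -c(1, 3, 4)]                # one column left: collapses to a vector
mat  <- A[-c(1, 3), -c(1, 3, 4), drop = FALSE]  # same selection kept as a 2 x 1 matrix

print(sub)
print(dim(kept))
print(vec)
print(dim(mat))
```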

2.3.4 Loading Data

1) Change the working directory

In RStudio, browse to the folder in the Files pane (bottom right), then use More > Set As Working Directory.

2) Read the data file

If the file is read without the header option, R assumes that the variable names are part of the data and so includes them in the first row.

Using the option header=T (or header=TRUE) in the read.table() function tells R that the first line of the file contains the variable names, and using the option na.strings tells R that any time it sees a particular character or set of characters (such as a question mark), it should be treated as a missing element of the data matrix.

> Auto=read.table("Auto.data",header=T,na.strings="?")
> fix(Auto)

> dim(Auto)
[1] 397   9
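
The effect of header and na.strings can be tried on a tiny file without needing Auto.data. This is only a sketch: the three-row file and its values are made up for the illustration.

```r
# Hypothetical three-row file written to a temporary location.
path <- tempfile(fileext = ".data")
writeLines(c("mpg horsepower name",
             "18.0 130 chevrolet",
             "25.0 ? ford",
             "22.0 95 toyota"), path)

raw   <- read.table(path)                                   # names read as a data row
clean <- read.table(path, header = TRUE, na.strings = "?")  # names used, "?" becomes NA

print(dim(raw))    # one extra row: the variable names
print(dim(clean))
print(clean)

unlink(path)       # remove the temporary file
```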


3) Missing values

Use the na.omit() function to simply remove these rows.

> Auto=na.omit(Auto)
> dim(Auto)
[1] 392   9
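
A small sketch of what na.omit() does, on a made-up data frame: any row with at least one NA is dropped.

```r
# Hypothetical data frame with two incomplete rows.
df <- data.frame(mpg        = c(18, 25, 22, NA),
                 horsepower = c(130, NA, 95, 110))

complete <- na.omit(df)              # keep only rows without any NA
n_dropped <- sum(!complete.cases(df))

print(dim(df))
print(dim(complete))
print(n_dropped)
```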


4) Use names() to check the variable names.

These are the column titles from the header row of the file.

> names(Auto)
[1] "mpg"          "cylinders"    "displacement" "horsepower"
[5] "weight"       "acceleration" "year"         "origin"
[9] "name"

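2.3.5 Additional Graphical and Numerical Summaries

Categorical variables and boxplots

Both calls below draw the same scatterplot; attach() makes the columns of Auto available by name.

> plot(Auto$cylinders,Auto$mpg)
> attach(Auto)
> plot(cylinders,mpg)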


Since there are only a small number of possible values for cylinders, one may prefer to treat it as a qualitative variable.
The as.factor() function converts quantitative variables into qualitative variables.

> cylinders=as.factor(cylinders)

When the variable on the x-axis is a factor, plot() draws boxplots.

> plot(cylinders,mpg,col="red",varwidth=T,xlab="cylinders",
  ylab="MPG")
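
The conversion can be inspected without the Auto data. This sketch swaps in R's built-in mtcars data set, whose cyl column plays the role of cylinders; that substitution is an assumption made only so the example runs anywhere.

```r
cyl_num <- mtcars$cyl          # numeric column with values 4, 6 and 8
cyl_fac <- as.factor(cyl_num)  # same values, now treated as categories

print(class(cyl_num))
print(class(cyl_fac))
print(levels(cyl_fac))
print(table(cyl_fac))          # number of cars in each category
```

Passing cyl_fac as the first argument of plot() would then give one box per level.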




The hist() function can be used to plot a histogram.

> hist(mpg,col=2,breaks=15)
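
The bins behind a histogram can be looked at with plot = FALSE. A sketch using the built-in mtcars data in place of Auto (an assumption for portability); note that a single number passed as breaks is treated by R as a suggestion, so the actual number of bins may differ from 15.

```r
# Compute the histogram without drawing it.
h <- hist(mtcars$mpg, breaks = 15, plot = FALSE)

print(h$breaks)  # bin edges actually chosen by R
print(h$counts)  # number of observations in each bin
```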




The pairs() function creates a scatterplot matrix, i.e. a scatterplot for every pair of variables for any given data set. We can also produce scatterplots for just a subset of the variables.

> pairs(~ mpg + displacement + horsepower + weight +
  acceleration, Auto)




Identify each point

> plot(horsepower,mpg)
> identify(horsepower,mpg,name)

Then clicking on a given point in the plot will cause R to print the value of the variable of interest.

Summary

> summary(Auto)
> summary(mpg)

For a numeric variable, summary() gives the minimum, first quartile, median, mean, third quartile and maximum.
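
A quick sketch of the six values summary() returns for a numeric vector, again on the built-in mtcars data rather than Auto (an assumption so it runs without the data file).

```r
s <- summary(mtcars$mpg)

print(s)
print(names(s))  # labels of the six reported statistics
```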


Save

Once we have finished using R, we type q() in order to shut it down, or quit. When exiting R, we have the option to save the current workspace so that all objects (such as data sets) that we have created in this R session will be available next time. Before exiting R, we may want to save a record of all of the commands that we typed in the most recent session; this can be accomplished using the savehistory() function. Next time we enter R, we can load that history using the loadhistory() function.
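
Objects can also be written out and restored explicitly with save() and load(), which is a related but different route from the workspace prompt at exit. A minimal sketch; the object name scores and the temporary file are made up for the illustration.

```r
path <- tempfile(fileext = ".RData")  # hypothetical file location

scores <- c(1, 6, 2)
save(scores, file = path)  # write the object to disk
rm(scores)                 # remove it from the session

load(path)                 # bring it back under the same name
restored <- exists("scores") && identical(scores, c(1, 6, 2))
print(restored)

unlink(path)               # remove the temporary file
```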