
Time Series Clustering and Classification

2014-12-23 23:58
http://www.rdatamining.com/examples/time-series-clustering-classification



This page shows R code examples of time series clustering and classification.

Time Series Clustering

Time series clustering partitions time series data into groups based on similarity or distance, so that time series in the same cluster are similar. For time series clustering with R, the first step is to work out an appropriate distance/similarity metric, and the second step is to use existing clustering techniques, such as k-means, hierarchical clustering, density-based clustering or subspace clustering, to find clustering structures.
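
The two steps can be sketched with base R alone. This is a minimal illustration on made-up data, not the dataset used on this page: the toy series, the Euclidean distance and the choice of k = 2 are all assumptions made for the sketch.

```r
# Toy data: 3 noisy sine waves and 3 noisy upward trends, 60 points each
set.seed(42)
time_idx <- 1:60
waves  <- t(sapply(1:3, function(i) sin(2 * pi * time_idx / 15) + rnorm(60, sd = 0.1)))
trends <- t(sapply(1:3, function(i) time_idx / 20 + rnorm(60, sd = 0.1)))
series <- rbind(waves, trends)   # one series per row

# Step 1: pairwise distances (Euclidean here; any "dist" object would do)
d <- dist(series, method = "euclidean")

# Step 2: hand the distances to an existing clustering method
toy_hc <- hclust(d, method = "average")
clusters <- cutree(toy_hc, k = 2)   # cut the dendrogram into 2 groups
print(clusters)
```

Because the second step only sees the distance object, swapping in another metric changes nothing in the clustering call itself.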

Dynamic Time Warping (DTW) finds an optimal alignment between two time series, and the DTW distance is used as the distance metric in the example below.
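
To make the idea of alignment concrete, here is a small base R sketch of the classic dynamic-programming recurrence for DTW. The function name dtw_distance and the absolute-difference local cost are assumptions for illustration; the dtw package offers several step patterns, so its distances may differ from the values this simple version produces.

```r
# Classic O(n*m) dynamic-programming DTW, absolute difference as local cost
dtw_distance <- function(x, y) {
  n <- length(x)
  m <- length(y)
  cost <- matrix(Inf, n + 1, m + 1)   # extra row/column holds the boundary
  cost[1, 1] <- 0
  for (i in 1:n) {
    for (j in 1:m) {
      local <- abs(x[i] - y[j])
      cost[i + 1, j + 1] <- local + min(cost[i, j + 1],   # advance in x only
                                        cost[i + 1, j],   # advance in y only
                                        cost[i, j])       # advance in both
    }
  }
  cost[n + 1, m + 1]
}

a <- c(0, 1, 2, 3, 2, 1, 0)
b <- c(0, 0, 1, 2, 3, 2, 1, 0)   # same shape as a, delayed by one step

dtw_distance(a, b)     # 0: warping absorbs the delay
sum(abs(a - b[1:7]))   # 6: point-by-point comparison is penalised by it
```

The contrast between the two results is the reason DTW is attractive for series that have similar shapes but are shifted in time.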

A dataset of Synthetic Control Chart Time Series is used here, which contains 600 examples of control charts. Each control chart is a time series with 60 values. There are six classes, stored in row order: 1) 1-100 Normal, 2) 101-200 Cyclic, 3) 201-300 Increasing trend, 4) 301-400 Decreasing trend, 5) 401-500 Upward shift, and 6) 501-600 Downward shift. The dataset is downloadable from the UCI KDD Archive.
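
Since the class of a chart is determined purely by its row number, a label can be derived with integer division. The helper below is a hypothetical convenience, not part of the original example; class_names and row_class are names introduced here.

```r
class_names <- c("Normal", "Cyclic", "Increasing trend",
                 "Decreasing trend", "Upward shift", "Downward shift")

# Rows 1-100 map to class 1, rows 101-200 to class 2, and so on
row_class <- function(row) class_names[(row - 1) %/% 100 + 1]

row_class(c(1, 100, 101, 350, 600))
# "Normal" "Normal" "Cyclic" "Decreasing trend" "Downward shift"
```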

> sc <- read.table("E:/Rtmp/synthetic_control.data", header=FALSE, sep="")

# randomly sample n cases from each class, to make plotting easier
> n <- 10
> s <- sample(1:100, n)
> idx <- c(s, 100+s, 200+s, 300+s, 400+s, 500+s)
> sample2 <- sc[idx,]
> observedLabels <- c(rep(1,n), rep(2,n), rep(3,n), rep(4,n), rep(5,n), rep(6,n))

# compute DTW distances
> library(dtw)
> distMatrix <- dist(sample2, method="DTW")

# hierarchical clustering
> hc <- hclust(distMatrix, method="average")
> plot(hc, labels=observedLabels, main="")






Time Series Classification

Time series classification builds a classification model from labelled time series and then uses the model to predict the labels of unlabelled time series. The approach to time series classification with R is to extract and build features from the time series data first, and then apply existing classification techniques, such as SVM, k-NN, neural networks, regression and decision trees, to the feature set.
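
The feature-then-classify workflow can be sketched in base R on made-up data. Everything here is an assumption for illustration: the toy series, the three summary features (mean, standard deviation, fitted slope) and the hand-written 1-nearest-neighbour rule. It is not the feature set or classifier used in the example on this page.

```r
set.seed(1)
time_idx <- 1:60
make_series <- function(slope) slope * time_idx + rnorm(60)

# Labelled training series: 20 flat and 20 increasing, one per row
train <- rbind(t(replicate(20, make_series(0))),
               t(replicate(20, make_series(0.5))))
train_labels <- rep(c("flat", "increasing"), each = 20)

# Step 1: turn each series into a small feature vector
extract_features <- function(x) {
  c(mean = mean(x), sd = sd(x),
    slope = unname(coef(lm(x ~ time_idx))[2]))
}
train_features <- t(apply(train, 1, extract_features))   # 40 x 3 matrix

# Step 2: classify a new series by its nearest neighbour in feature space
predict_1nn <- function(x) {
  f <- extract_features(x)
  d <- sqrt(rowSums(sweep(train_features, 2, f)^2))   # Euclidean distances
  train_labels[which.min(d)]
}

predict_1nn(make_series(0.5))
predict_1nn(make_series(0))
```

The classifier never sees the raw 60 values, only the features, so the quality of the feature extraction step largely determines how well any downstream model can do.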

Discrete Wavelet Transform (DWT) provides a multi-resolution representation using wavelets and is used in the example below. Another popular feature extraction technique is Discrete Fourier Transform (DFT).
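
As a rough picture of what "multi-resolution" means, the sketch below implements a plain Haar DWT in base R: each level splits the signal into pairwise differences (detail) and pairwise averages (smooth), and the next level repeats on the smooth part. The names haar_step and haar_dwt are hypothetical, the input length is assumed to be divisible by 2^levels, and sign and ordering conventions may differ from those of the wavelets package.

```r
# One Haar level: scaled pairwise differences and sums of neighbouring values
haar_step <- function(x) {
  odd  <- x[seq(1, length(x), by = 2)]
  even <- x[seq(2, length(x), by = 2)]
  list(detail = (even - odd) / sqrt(2),   # wavelet coefficients (local change)
       smooth = (even + odd) / sqrt(2))   # scaling coefficients (local level)
}

# Repeat on the smooth part to get coarser and coarser resolutions
haar_dwt <- function(x, levels) {
  details <- list()
  for (lev in seq_len(levels)) {
    step <- haar_step(x)
    details[[lev]] <- step$detail
    x <- step$smooth
  }
  list(W = details, V = x)   # details per level, plus the final smooth part
}

x <- c(4, 6, 10, 12, 8, 6, 5, 5)
res <- haar_dwt(x, levels = 3)

# Flatten all coefficients into one feature vector
features <- unlist(c(res$W, res$V))
length(features)   # 8 coefficients for 8 input values
```

The transform is orthonormal, so sum(features^2) equals sum(x^2); the coefficients re-express the series rather than discard information.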

# extract DWT coefficients (with Haar filter)
> library(wavelets)
> wtData <- NULL
> for (i in 1:nrow(sc)) {
+   a <- t(sc[i,])
+   wt <- dwt(a, filter="haar", boundary="periodic")
+   wtData <- rbind(wtData, unlist(c(wt@W, wt@V[[wt@level]])))
+ }
> wtData <- as.data.frame(wtData)


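# convert class labels to categorical values
> classId <- c(rep("1",100), rep("2",100), rep("3",100),
+              rep("4",100), rep("5",100), rep("6",100))
# store the labels as a factor explicitly (since R 4.0.0, data.frame()
# no longer converts strings to factors by default)
> wtSc <- data.frame(classId=factor(classId), wtData)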


# build a decision tree with ctree() in package party
> library(party)
> ct <- ctree(classId ~ ., data=wtSc,
+             controls=ctree_control(minsplit=30, minbucket=10, maxdepth=5))
> pClassId <- predict(ct)


# check predicted classes against original class labels
> table(classId, pClassId)





# accuracy
> (sum(classId==pClassId)) / nrow(wtSc)
[1] 0.8716667
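
Note that predict(ct) is called without new data, so the accuracy above is measured on the same 600 series that were used to build the tree. A hold-out estimate would need a train/test split. The sketch below only shows how such a split could be set up in base R; the names testIdx, trainIdx and accuracy, and the choice of 30 test rows per class, are assumptions, and the commented lines at the end indicate how the split might be used with the objects from this page.

```r
set.seed(123)
classId <- rep(as.character(1:6), each = 100)

# Stratified hold-out: reserve 30 of the 100 rows of every class for testing
testIdx <- unlist(lapply(split(seq_along(classId), classId),
                         function(rows) sample(rows, 30)))
trainIdx <- setdiff(seq_along(classId), testIdx)

table(classId[testIdx])   # 30 test rows in each class

# Share of predictions that match the observed labels
accuracy <- function(observed, predicted) mean(observed == predicted)
accuracy(c("1", "2", "3", "3"), c("1", "2", "3", "1"))   # 0.75

# With wtSc from the example, one could then fit on the training rows only
# and score the held-out rows (not run here):
# ct2 <- ctree(classId ~ ., data = wtSc[trainIdx, ])
# accuracy(classId[testIdx], predict(ct2, newdata = wtSc[testIdx, ]))
```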


> plot(ct, ip_args=list(pval=FALSE), ep_args=list(digits=0))






More examples on time series analysis and mining with R and other data mining techniques can be found in my book "R and Data Mining: Examples and Case Studies", which is downloadable as a .PDF file at the link.