
PCA and K-Means: MATLAB implementation

2015-07-05 21:16

MATLAB basics

- imread: reads image data from a file.

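- axis: axis scaling. axis([xmin xmax ymin ymax zmin zmax cmin cmax]) sets the x-, y-, and z-axis limits and the color scaling limits (see caxis).

v = axis returns a row vector containing the limits of the x-, y-, and z-axes. It has 4 or 6 elements, depending on whether the current axes are 2-D or 3-D. The returned values are the XLim, YLim, and ZLim properties of the current axes.

axis auto restores the MATLAB default behavior, in which the limits of the current axes are computed automatically from the minimum and maximum of the x, y, and z data. The automatic behavior can be restricted to particular axes: for example, axis 'auto x' computes only the x-axis limits automatically, and axis 'auto yz' computes the y- and z-axis limits automatically.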

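- bsxfun

Applies an element-wise binary (two-operand) operation to two arrays, with singleton expansion enabled.

For example, use bsxfun to subtract from each element of the matrix A the mean of its column:

A = [1 2 10; 1 4 20; 1 6 15];

C = bsxfun(@minus, A, mean(A))

C =

     0    -2    -5
     0     0     5
     0     2     0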

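@plus	Addition
@minus	Subtraction
@times	Array (element-wise) multiplication
@rdivide	Array right division
@ldivide	Array left division
@power	Array power
@max	Maximum of the two operands
@min	Minimum of the two operands
@rem	Remainder after division
@mod	Modulus after division
@atan2	Four-quadrant inverse tangent; result in radians
@atan2d	Four-quadrant inverse tangent; result in degrees
@hypot	Square root of the sum of squares
@eq	Equal
@ne	Not equal
@lt	Less than
@le	Less than or equal to
@gt	Greater than
@ge	Greater than or equal to
@and	Element-wise logical AND
@or	Element-wise logical OR
@xor	Logical exclusive OR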

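- Singular value decomposition:

The svd command computes the singular value decomposition of a matrix.

s = svd(X) returns a vector of the singular values.

[U,S,V] = svd(X) produces a diagonal matrix S of the same dimensions as X (with nonnegative diagonal elements in descending order) and unitary matrices U and V such that X = U*S*V'.

[U,S,V] = svd(X,0) produces the economy-size decomposition. If X is m×n with m > n, svd computes only the first n columns of U, and S is n×n.

[U,S,V] = svd(X,'econ') also produces the economy-size decomposition. If X is m×n with m >= n, it is equivalent to svd(X,0). For m < n, only the first m columns of V are computed and S is m×m.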

A. Example

For the matrix

X =

     1     2
     3     4
     5     6
     7     8

the statement

[U,S,V] = svd(X)

produces

U =

   -0.1525   -0.8226   -0.3945   -0.3800
   -0.3499   -0.4214    0.2428    0.8007
   -0.5474   -0.0201    0.6979   -0.4614
   -0.7448    0.3812   -0.5462    0.0407

S =

   14.2691         0
         0    0.6268
         0         0
         0         0

V =

   -0.6414    0.7672
   -0.7672   -0.6414
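
The relation X = U*S*V' and the sizes of the economy-size factors can be checked numerically. The following is a minimal sketch, assuming X = [1 2; 3 4; 5 6; 7 8], a matrix consistent with the U, S and V printed above; the variable names err_full, err_econ, Ue, Se and Ve are chosen here for illustration.

```matlab
% Reconstruct X from its SVD and compare the full and economy-size forms.
X = [1 2; 3 4; 5 6; 7 8];            % assumed 4-by-2 example matrix

[U, S, V] = svd(X);                   % full form: U is 4x4, S is 4x2, V is 2x2
[Ue, Se, Ve] = svd(X, 'econ');        % economy size: Ue is 4x2, Se is 2x2

err_full = norm(X - U * S * V');      % both errors should be near machine precision
err_econ = norm(X - Ue * Se * Ve');

s = svd(X);                           % singular values only, in descending order
fprintf('singular values: %.4f %.4f\n', s(1), s(2));
fprintf('reconstruction error: %.2e (full), %.2e (econ)\n', err_full, err_econ);
```

Individual columns of U and V may come out with flipped signs on other platforms; the product U*S*V' is unaffected.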

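II. K-MEANS

Clustering is an unsupervised method: the data carry no labels.

The basic idea of K-Means is to start from K randomly chosen cluster centers and assign each sample to the cluster whose center is nearest. The centroid of each cluster is then recomputed as the mean of its members, which gives the new centers. This is repeated until the centers move less than a given threshold.

The K-Means clustering algorithm has three main steps:
(1) Choose initial cluster centers for the points to be clustered.
(2) Compute the distance from every point to every cluster center and assign each point to the nearest cluster.
(3) Compute the mean of the coordinates of all points in each cluster and use that mean as the new cluster center.
Repeat (2) and (3) until the cluster centers no longer move significantly or the maximum number of iterations is reached.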
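
The three steps can be sketched as a short script. This is an illustrative sketch on made-up 2-D data, not the exercise code listed later: the data, the choice of initial centers, the tolerance tol and the iteration cap are all assumptions made here.

```matlab
% Minimal K-Means sketch on deterministic toy data (three tight blobs).
offsets = [-0.2 -0.2; -0.2 0.2; 0.2 -0.2; 0.2 0.2; 0 0];  % 5 points per blob
centers = [0 0; 5 5; 0 5];                                % assumed blob centers
X = [];
for c = 1:3
    X = [X; bsxfun(@plus, offsets, centers(c, :))];
end

K = 3;
centroids = X([1 6 11], :);   % step 1: one starting center per blob (assumption)
tol = 1e-6;                   % stop when no center moves more than this
max_iters = 100;

for iter = 1:max_iters
    % step 2: squared distance from every point to every center (m-by-K)
    D = zeros(size(X, 1), K);
    for k = 1:K
        d = bsxfun(@minus, X, centroids(k, :));
        D(:, k) = sum(d .^ 2, 2);
    end
    [~, idx] = min(D, [], 2);          % index of the nearest center per point

    % step 3: each center becomes the mean of its assigned points
    new_centroids = centroids;
    for k = 1:K
        members = (idx == k);
        if any(members)                % leave an empty cluster where it is
            new_centroids(k, :) = mean(X(members, :), 1);
        end
    end

    % stopping rule: largest distance any center moved in this iteration
    shift = max(sqrt(sum((new_centroids - centroids) .^ 2, 2)));
    centroids = new_centroids;
    if shift < tol
        break;
    end
end
disp(centroids);
```

Starting each center inside a different blob makes the toy run deterministic; with random starting points the result can depend on the draw.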

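The K-Means procedure on the data, step by step:

1: Set the center points (initialization).

2: Find the nearest cluster for each point:

function idx = findClosestCentroids(X, initial_centroids)

3: Group the training data by cluster, assigning each point to its cluster i, and recompute the center of each cluster.

4: Iterate until the centers essentially stop changing; the iterations yield the final centers.

Then validate on the test and validation sets.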
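
findClosestCentroids is one of the functions the exercise leaves to the reader, and the article does not show its body. One possible vectorized body is sketched below as a script; it is an assumption about how the step could be written, not the author's solution. The centroids are the initial values used in the listing further down; the four data points are made up.

```matlab
% Assign each row of X to its nearest centroid (squared Euclidean distance).
X = [1 1; 6 1; 7 5; 1.2 0.8];         % toy examples, one per row
centroids = [3 3; 6 2; 8 5];          % K-by-n matrix of centers

% ||x - c||^2 = ||x||^2 + ||c||^2 - 2*x*c', evaluated for all pairs at once
D = bsxfun(@plus, sum(X .^ 2, 2), sum(centroids .^ 2, 2)') - 2 * X * centroids';

[~, idx] = min(D, [], 2);             % idx(i) is the nearest centroid of example i
disp(idx');
```

The expansion avoids an explicit loop over examples; for very large inputs the m-by-K matrix D may have to be computed in chunks.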

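III. PCA dimensionality reduction:

Applications include face recognition and data compression (reducing 100-dimensional data to 10 dimensions cuts the data size by 90%).

- Suppose the dimension of the data is to be reduced from R^N to R^3. The PCA steps are as follows:

Data compression

- Mean normalization, the step that comes before computing the covariance matrix S of the samples in X:

[X_norm, mu, sigma] = featureNormalize(X);

This computes the mean and the standard deviation of each feature:

mu = mean(X);                               % mean of each feature
X_norm = bsxfun(@minus, X, mu);             % subtract the mean
sigma = std(X_norm);                        % standard deviation of each feature (not the covariance)
X_norm = bsxfun(@rdivide, X_norm, sigma);   % scale to unit standard deviation

- Singular value decomposition of the covariance matrix (here X is the normalized data and m is the number of samples):

sigma = X' * X / m;
[U, S, V] = svd(sigma);

- Find a one-dimensional subspace (a line) in the plane:

Z = X * U(:, 1:K);

Project all the data orthogonally onto that line and plot the result:

hold on;
plot(X_rec(:, 1), X_rec(:, 2), 'ro');
for i = 1:size(X_norm, 1)
    drawLine(X_norm(i, :), X_rec(i, :), '--k', 'LineWidth', 1);
end
hold off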
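
The steps above (normalize, decompose the covariance matrix, project, recover) can be run end to end on a small example. The sketch below is illustrative: the five data points are made up, and the name Sigma for the covariance matrix is chosen here to avoid reusing sigma for two different things.

```matlab
% PCA on toy 2-D data: normalize, SVD of the covariance, project, recover.
X = [1 1.1; 2 1.9; 3 3.2; 4 3.8; 5 5.0];    % toy data lying close to the line y = x
m = size(X, 1);

mu = mean(X);                               % per-feature mean
X_norm = bsxfun(@minus, X, mu);
sigma = std(X_norm);                        % per-feature standard deviation
X_norm = bsxfun(@rdivide, X_norm, sigma);

Sigma = (X_norm' * X_norm) / m;             % n-by-n covariance matrix
[U, S, V] = svd(Sigma);                     % columns of U are the principal directions

K = 1;
Z = X_norm * U(:, 1:K);                     % m-by-K coordinates in the reduced space
X_rec = Z * U(:, 1:K)';                     % orthogonal projection back in n dimensions

fprintf('first principal direction: %f %f\n', U(1, 1), U(2, 1));
```

Because both features are scaled to unit variance and are positively correlated, the first direction comes out along the diagonal, (0.7071, 0.7071) up to sign.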

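Extracting and reducing the data from the bird image

1: Read the data and normalize it:

A = double(imread('bird_small.png'));
A = A / 255;
img_size = size(A);
X = reshape(A, img_size(1) * img_size(2), 3);

2: Run K-Means on the pixel colors, then project the 3-D data onto a 2-D plane:

sel = floor(rand(1000, 1) * size(X, 1)) + 1;
palette = hsv(K);
colors = palette(idx(sel), :);

3: Plot the 2-D and 3-D figures.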

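Face recognition:

- Basic principle:

Principal component analysis (PCA) takes a high-dimensional vector x and, using a matrix U whose columns are selected eigenvectors, projects it into a low-dimensional space, where it is represented by a low-dimensional vector y. Only minor information is lost: from y and U the original high-dimensional vector can be approximately reconstructed.

In face recognition, the space spanned by the columns of U is called the eigenface space, because each eigenvector $u_i$, once rescaled to pixel intensities and displayed as an image, shows the outline of a face. The experiment below shows this.

Face recognition serves as the example of applying PCA.

Suppose there are N training face images. Each sample is a vector $x_i$ of its gray-level pixel values, so the dimension of $x_i$ equals the number of pixels, $M = \text{width} \times \text{height}$, and the training set is $\{x_1, x_2, \dots, x_N\}$.

The mean vector of the sample set is:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The mean vector is also called the average face.

The covariance matrix of the sample set is:

$$S = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^T$$

Compute the eigenvectors $u_i$ of the covariance matrix and the corresponding eigenvalues $\lambda_i$. The eigenvectors form an orthogonal basis of the face space, and any face image in the sample set can be reconstructed as a linear combination of them (see point 2 of the summary below). The image information is concentrated in the eigenvectors with large eigenvalues, so discarding those with small eigenvalues costs little image quality.

Sort the eigenvalues of the covariance matrix in descending order: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M$. The eigenvectors belonging to the $d$ largest eigenvalues form the principal components, and the transformation matrix built from them is:

$$U = (u_1, u_2, \dots, u_d)$$

Every face image can then be projected into the eigenface subspace spanned by the columns of U, which is an M×d matrix. Projecting an image into this subspace, $y = U^T (x - \mu)$, yields a set of coordinates: the low-dimensional vector y of dimension d×1, called the K-L decomposition coefficients. These coefficients give the position of the image in the subspace and can therefore serve as the basis for recognition.

Some readers may wonder why the first part, on the K-L transform, computed the eigenvectors and eigenvalues of the correlation matrix, while here it is the covariance matrix.

The covariance matrix can also be written as

$$S = E\left[(x - \mu)(x - \mu)^T\right]$$

so it is simply the correlation matrix $R = E[x x^T]$ with $x - \mu$ substituted for $x$. Subtracting the mean vector from every sample changes nothing essential, and the covariance matrix is likewise real and symmetric.

Summary:

1. During recognition, for an input test sample x, compute its deviation from the average face, $\bar{x} = x - \mu$. The projection of $\bar{x}$ into the eigenface space U is the coefficient vector y:

$$y = U^T \bar{x}$$

U is M×d, $\bar{x}$ is M×1, and y is d×1. If M = 200 × 200 = 40000 and 200 principal components (200 eigenvectors) are kept, the projected coefficient vector y has only 200 dimensions.

2. From the formula in point 1 it follows that:

$$\hat{x} = \mu + U y$$

Here $\hat{x}$ is the face image reconstructed from the coefficient vector y. Some image information is lost, but the visible image quality is largely preserved.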
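
The article does not say how the number of retained components d is chosen. One common heuristic, offered here as an assumption rather than the author's method, is to keep the smallest d whose eigenvalues account for a target fraction of the total variance. The eigenvalues and the 95% target below are made-up illustration values.

```matlab
% Pick d as the smallest number of components retaining a target variance share.
lambda = [5.0 2.5 1.5 0.6 0.3 0.1];        % hypothetical eigenvalues, descending
target = 0.95;                             % assumed fraction of variance to keep

retained = cumsum(lambda) / sum(lambda);   % share kept by the first 1, 2, ... components
d = find(retained >= target, 1);           % first index that reaches the target

fprintf('keep d = %d components (%.0f%% of the variance)\n', d, 100 * retained(d));
```

With the diagonal of S returned by svd of the covariance matrix in place of lambda, the same two lines give d for real data.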

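Basic functions (OpenCV C++ counterparts):

Some of the functions are described below. Note that these belong to the OpenCV C++ API, not to MATLAB.

Mat Mat::reshape(int cn, int rows=0) const

　　This function changes the shape of a Mat while keeping the total size (rows × columns × channels) unchanged. The first parameter is the number of channels after the change; 0 means the number of channels stays the same. The second parameter is the number of rows after the change; 0 means the number of rows stays the same. The function does not copy the data.

　　void Mat::convertTo(OutputArray m, int rtype, double alpha=1, double beta=0) const

　　This function applies a linear transformation to every value of the source Mat. Parameter 1 is the destination matrix, parameter 2 is the type of the destination matrix, and parameters 3 and 4 are the coefficients of the transformation:

　　m(x,y) = saturate_cast<rtype>(alpha * (*this)(x,y) + beta)

　　PCA::PCA(InputArray data, InputArray mean, int flags, int maxComponents=0)

　　The first parameter of this constructor is the input Mat to which PCA is applied. Parameter 2 is the mean vector of that Mat. Parameter 3 describes how the input is laid out: CV_PCA_DATA_AS_ROW means each row of the input Mat is one sample, and CV_PCA_DATA_AS_COL means each column is one sample. The last parameter is the maximum number of principal components to keep; with the default value, all components are kept.

　　Mat PCA::project(InputArray vec) const

　　This function projects the input data vec (the original data from which PCA features are to be extracted) into the principal component space and returns a matrix holding the principal component features of every sample. PCA lowers the dimension of the data, so every sample in the original set changes dimension; the set of transformed samples is the return value.

　　Mat PCA::backProject(InputArray vec) const

　　project() is normally called before backProject(), because the argument vec of backProject() is the matrix dst produced by the PCA projection. backProject() uses vec to reconstruct the original data set; in essence it applies the formula in point 2 of the summary above.

　　The PCA class also has several member variables: mean, eigenvectors and eigenvalues hold the mean of the original data and the eigenvectors and eigenvalues of the covariance matrix, respectively.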

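Step by step:

1) Load the images:

load('ex7faces.mat')

displayData(X(1:100, :))

2) Reduce the dimension:

[X_norm, mu, sigma] = featureNormalize(X);

% Run PCA

[U, S] = pca(X_norm);

3) Cluster the reduced features; faces whose features fall together are recognized as the same person.

Repeat the procedure to classify. The faces can be recovered from the reduced data with:

X_rec = Z * U(:, 1:K)';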

1:
%% Machine Learning Online Class
%  Exercise 7 | Principal Component Analysis and K-Means Clustering
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  exercise. You will need to complete the following functions:
%
%     pca.m
%     projectData.m
%     recoverData.m
%     computeCentroids.m
%     findClosestCentroids.m
%     kMeansInitCentroids.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%

%% Initialization
clear; close all; clc

%% ================= Part 1: Find Closest Centroids ====================
%  To help you implement K-Means, we have divided the learning algorithm
%  into two functions -- findClosestCentroids and computeCentroids. In this
%  part, you should complete the code in the findClosestCentroids function.
%
fprintf('Finding closest centroids.\n\n');

% Load an example dataset that we will be using
load('ex7data2.mat');

% Select an initial set of centroids
K = 3; % 3 centroids (three clusters)
initial_centroids = [3 3; 6 2; 8 5]; % initial center of each cluster

% Find the closest centroids for the examples using the
% initial_centroids
idx = findClosestCentroids(X, initial_centroids);

fprintf('Closest centroids for the first 3 examples: \n')
fprintf(' %d', idx(1:3));
fprintf('\n(the closest centroids should be 1, 3, 2 respectively)\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ===================== Part 2: Compute Means =========================
%  After implementing the closest centroids function, you should now
%  complete the computeCentroids function.
%
fprintf('\nComputing centroids means.\n\n');

% Compute means based on the closest centroids found in the previous part.
centroids = computeCentroids(X, idx, K);

fprintf('Centroids computed after initial finding of closest centroids: \n')
fprintf(' %f %f \n', centroids');
fprintf('\n(the centroids should be\n');
fprintf('   [ 2.428301 3.157924 ]\n');
fprintf('   [ 5.813503 2.633656 ]\n');
fprintf('   [ 7.119387 3.616684 ]\n\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: K-Means Clustering ======================
%  After you have completed the two functions computeCentroids and
%  findClosestCentroids, you have all the necessary pieces to run the
%  kMeans algorithm. In this part, you will run the K-Means algorithm on
%  the example dataset we have provided.
%
fprintf('\nRunning K-Means clustering on example dataset.\n\n');

% Load an example dataset
load('ex7data2.mat');

% Settings for running K-Means
K = 3;
max_iters = 10;

% For consistency, here we set centroids to specific values
% but in practice you want to generate them automatically, such as by
% setting them to be random examples (as can be seen in
% kMeansInitCentroids).
initial_centroids = [3 3; 6 2; 8 5];

% Run K-Means algorithm. The 'true' at the end tells our function to plot
% the progress of K-Means: on each iteration it finds the nearest centroid
% for every training example and then moves the centroids.
[centroids, idx] = runkMeans(X, initial_centroids, max_iters, true);
fprintf('\nK-Means Done.\n\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: K-Means Clustering on Pixels ===============
%  In this exercise, you will use K-Means to compress an image. To do this,
%  you will first run K-Means on the colors of the pixels in the image and
%  then you will map each pixel onto its closest centroid.
%
%  You should now complete the code in kMeansInitCentroids.m
%

fprintf('\nRunning K-Means clustering on pixels from an image.\n\n');

%  Load an image of a bird
A = double(imread('bird_small.png'));

% If imread does not work for you, you can try instead
%   load('bird_small.mat');

A = A / 255; % Divide by 255 so that all values are in the range 0 - 1
             % (this normalizes the data)

% Size of the image
img_size = size(A);

% Reshape the image into an Nx3 matrix where N = number of pixels.
% Each row will contain the Red, Green and Blue pixel values
% This gives us our dataset matrix X that we will use K-Means on.
X = reshape(A, img_size(1) * img_size(2), 3);

% Run your K-Means algorithm on this data
% You should try different values of K and max_iters here
K = 16;
max_iters = 10;

% When using K-Means, it is important to initialize the centroids
% randomly.
% You should complete the code in kMeansInitCentroids.m before proceeding
initial_centroids = kMeansInitCentroids(X, K);

% Run K-Means
[centroids, idx] = runkMeans(X, initial_centroids, max_iters);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ================= Part 5: Image Compression ======================
%  In this part of the exercise, you will use the clusters of K-Means to
%  compress an image. To do this, we first find the closest clusters for
%  each example. After that, we replace each pixel with the value of its
%  assigned centroid.

fprintf('\nApplying K-Means to compress an image.\n\n');

% Find closest cluster members
idx = findClosestCentroids(X, centroids);

% Essentially, now we have represented the image X in terms of the
% indices in idx.

% We can now recover the image from the indices (idx) by mapping each pixel
% (specified by its index in idx) to the centroid value
X_recovered = centroids(idx, :);

% Reshape the recovered image into proper dimensions
X_recovered = reshape(X_recovered, img_size(1), img_size(2), 3);

% Display the original image
subplot(1, 2, 1);
imagesc(A);
title('Original');

% Display compressed image side by side
subplot(1, 2, 2);
imagesc(X_recovered)
title(sprintf('Compressed, with %d colors.', K));

fprintf('Program paused. Press enter to continue.\n');
pause;
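
How much the K-color representation saves can be estimated by counting bits. The sketch below is a back-of-the-envelope calculation under stated assumptions: the image is taken to be 128 × 128 pixels with 24 bits per pixel (the article does not give the size of bird_small.png), and the compressed form is assumed to store a palette of K colors plus one palette index per pixel.

```matlab
% Storage estimate for an image reduced to K colors.
img_size = [128 128 3];                    % assumed image size (not stated in the article)
K = 16;                                    % number of colors, as in the listing above

n_pixels = img_size(1) * img_size(2);
bits_original = n_pixels * 24;             % 8 bits for each of R, G, B per pixel
bits_palette = K * 24;                     % the K centroid colors
bits_indices = n_pixels * ceil(log2(K));   % one index per pixel, 4 bits when K = 16
bits_compressed = bits_palette + bits_indices;

fprintf('original: %d bits, compressed: %d bits, ratio: %.2f\n', ...
        bits_original, bits_compressed, bits_original / bits_compressed);
```

Under these assumptions the image shrinks by roughly a factor of six; the palette itself is negligible next to the per-pixel indices.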

2:
%% Machine Learning Online Class
%  Exercise 7 | Principal Component Analysis and K-Means Clustering
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  exercise. You will need to complete the following functions:
%
%     pca.m
%     projectData.m
%     recoverData.m
%     computeCentroids.m
%     findClosestCentroids.m
%     kMeansInitCentroids.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%

%% Initialization
clear; close all; clc

%% ================== Part 1: Load Example Dataset  ===================
%  We start this exercise by using a small dataset that is easy to
%  visualize
%
fprintf('Visualizing example dataset for PCA.\n\n');

%  The following command loads the dataset. You should now have the
%  variable X in your environment
load('ex7data1.mat');

%  Visualize the example dataset
plot(X(:, 1), X(:, 2), 'bo');
axis([0.5 6.5 2 8]);
axis square;

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =============== Part 2: Principal Component Analysis ===============
%  You should now implement PCA, a dimension reduction technique. You
%  should complete the code in pca.m
%
fprintf('\nRunning PCA on example dataset.\n\n');

%  Before running PCA, it is important to first normalize X
[X_norm, mu, sigma] = featureNormalize(X);

%  Run PCA
[U, S] = pca(X_norm);

%  Compute mu, the mean of each feature

%  Draw the eigenvectors centered at mean of data. These lines show the
%  directions of maximum variations in the dataset.
hold on;
drawLine(mu, mu + 1.5 * S(1,1) * U(:,1)', '-k', 'LineWidth', 2);
drawLine(mu, mu + 1.5 * S(2,2) * U(:,2)', '-k', 'LineWidth', 2);
hold off;

fprintf('Top eigenvector: \n');
fprintf(' U(:,1) = %f %f \n', U(1,1), U(2,1));
fprintf('\n(you should expect to see -0.707107 -0.707107)\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Dimension Reduction ===================
%  You should now implement the projection step to map the data onto the
%  first k eigenvectors. The code will then plot the data in this reduced
%  dimensional space. This will show you what the data looks like when
%  using only the corresponding eigenvectors to reconstruct it.
%
%  You should complete the code in projectData.m
%
fprintf('\nDimension reduction on example dataset.\n\n');

%  Plot the normalized dataset (returned from pca)
plot(X_norm(:, 1), X_norm(:, 2), 'bo');
axis([-4 3 -4 3]); axis square

%  Project the data onto K = 1 dimension
K = 1;
Z = projectData(X_norm, U, K);
fprintf('Projection of the first example: %f\n', Z(1));
fprintf('\n(this value should be about 1.481274)\n\n');

X_rec = recoverData(Z, U, K);
fprintf('Approximation of the first example: %f %f\n', X_rec(1, 1), X_rec(1, 2));
fprintf('\n(this value should be about  -1.047419 -1.047419)\n\n');

%  Draw lines connecting the projected points to the original points
hold on;
plot(X_rec(:, 1), X_rec(:, 2), 'ro');
for i = 1:size(X_norm, 1)
    drawLine(X_norm(i, :), X_rec(i, :), '--k', 'LineWidth', 1);
end
hold off

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =============== Part 4: Loading and Visualizing Face Data =============
%  We start the exercise by first loading and visualizing the dataset.
%  The following code will load the dataset into your environment
%
fprintf('\nLoading face dataset.\n\n');

%  Load Face dataset
load('ex7faces.mat')

%  Display the first 100 faces in the dataset
displayData(X(1:100, :));

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =========== Part 5: PCA on Face Data: Eigenfaces  ===================
%  Run PCA and visualize the eigenvectors which are in this case eigenfaces
%  We display the first 36 eigenfaces.
%
fprintf(['\nRunning PCA on face dataset.\n' ...
         '(this might take a minute or two ...)\n\n']);

%  Before running PCA, it is important to first normalize X by subtracting
%  the mean value from each feature
[X_norm, mu, sigma] = featureNormalize(X);

%  Run PCA
[U, S] = pca(X_norm);

%  Visualize the top 36 eigenvectors found
displayData(U(:, 1:36)');

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 6: Dimension Reduction for Faces =================
%  Project images to the eigen space using the top k eigenvectors
%  If you are applying a machine learning algorithm
fprintf('\nDimension reduction for face dataset.\n\n');

K = 100;
Z = projectData(X_norm, U, K);

fprintf('The projected data Z has a size of: ')
fprintf('%d ', size(Z));

fprintf('\n\nProgram paused. Press enter to continue.\n');
pause;

%% ==== Part 7: Visualization of Faces after PCA Dimension Reduction ====
%  Project images to the eigen space using the top K eigenvectors and
%  visualize only using those K dimensions
%  Compare to the original input, which is also displayed

fprintf('\nVisualizing the projected (reduced dimension) faces.\n\n');

K = 100;
X_rec = recoverData(Z, U, K);

% Display normalized data
subplot(1, 2, 1);
displayData(X_norm(1:100, :));
title('Original faces');
axis square;

% Display reconstructed data from only k eigenfaces
subplot(1, 2, 2);
displayData(X_rec(1:100, :));
title('Recovered faces');
axis square;

fprintf('Program paused. Press enter to continue.\n');
pause;

%% === Part 8(a): Optional (ungraded) Exercise: PCA for Visualization ===
%  One useful application of PCA is to use it to visualize high-dimensional
%  data. In the last K-Means exercise you ran K-Means on 3-dimensional
%  pixel colors of an image. We first visualize this output in 3D, and then
%  apply PCA to obtain a visualization in 2D.

close all; clc

% Re-load the image from the previous exercise and run K-Means on it
% For this to work, you need to complete the K-Means assignment first
A = double(imread('bird_small.png'));

% If imread does not work for you, you can try instead
%   load('bird_small.mat');

A = A / 255;
img_size = size(A);
X = reshape(A, img_size(1) * img_size(2), 3);
K = 16;
max_iters = 10;
initial_centroids = kMeansInitCentroids(X, K);
[centroids, idx] = runkMeans(X, initial_centroids, max_iters);

%  Sample 1000 random indexes (since working with all the data is
%  too expensive). If you have a fast computer, you may increase this.
sel = floor(rand(1000, 1) * size(X, 1)) + 1;

%  Setup Color Palette
palette = hsv(K);
colors = palette(idx(sel), :);

%  Visualize the data and centroid memberships in 3D
figure;
scatter3(X(sel, 1), X(sel, 2), X(sel, 3), 10, colors);
title('Pixel dataset plotted in 3D. Color shows centroid memberships');
fprintf('Program paused. Press enter to continue.\n');
pause;

%% === Part 8(b): Optional (ungraded) Exercise: PCA for Visualization ===
% Use PCA to project this cloud to 2D for visualization

% Subtract the mean to use PCA
[X_norm, mu, sigma] = featureNormalize(X);

% PCA and project the data to 2D
[U, S] = pca(X_norm);
Z = projectData(X_norm, U, 2);

% Plot in 2D
figure;
plotDataPoints(Z(sel, :), idx(sel), K);
title('Pixel dataset plotted in 2D, using PCA for dimensionality reduction');
fprintf('Program paused. Press enter to continue.\n');
pause;

3:
function [centroids, idx] = runkMeans(X, initial_centroids, ...
                                      max_iters, plot_progress)
%RUNKMEANS runs the K-Means algorithm on data matrix X, where each row of X
%is a single example
%   [centroids, idx] = RUNKMEANS(X, initial_centroids, max_iters, ...
%   plot_progress) runs the K-Means algorithm on data matrix X, where each
%   row of X is a single example. It uses initial_centroids as the
%   initial centroids. max_iters specifies the total number of iterations
%   of K-Means to execute. plot_progress is a true/false flag that
%   indicates if the function should also plot its progress as the
%   learning happens. This is set to false by default. runkMeans returns
%   centroids, a K x n matrix of the computed centroids and idx, a m x 1
%   vector of centroid assignments (i.e. each entry in range [1..K])
%

% Set default value for plot progress
if ~exist('plot_progress', 'var') || isempty(plot_progress)
    plot_progress = false;
end

% Plot the data if we are plotting progress
if plot_progress
    figure;
    hold on;
end

% Initialize values
[m n] = size(X);
K = size(initial_centroids, 1);
centroids = initial_centroids;
previous_centroids = centroids;
idx = zeros(m, 1);

% Run K-Means
for i = 1:max_iters

    % Output progress
    fprintf('K-Means iteration %d/%d...\n', i, max_iters);
    if exist('OCTAVE_VERSION')
        fflush(stdout);
    end

    % For each example in X, assign it to the closest centroid
    % (the nearest-centroid function is called on every iteration until
    % training ends)
    idx = findClosestCentroids(X, centroids);

    % Optionally, plot progress here
    if plot_progress
        plotProgresskMeans(X, centroids, previous_centroids, idx, K, i);
        previous_centroids = centroids;
        fprintf('Press enter to continue.\n');
        pause;
    end

    % Given the memberships, compute new centroids
    centroids = computeCentroids(X, idx, K);
end

% Hold off if we are plotting progress
if plot_progress
    hold off;
end

end

4:
function centroids = computeCentroids(X, idx, K)
%COMPUTECENTROIDS returns the new centroids by computing the means of the
%data points assigned to each centroid.
%   centroids = COMPUTECENTROIDS(X, idx, K) returns the new centroids by
%   computing the means of the data points assigned to each centroid. It is
%   given a dataset X where each row is a single data point, a vector
%   idx of centroid assignments (i.e. each entry in range [1..K]) for each
%   example, and K, the number of centroids. You should return a matrix
%   centroids, where each row of centroids is the mean of the data points
%   assigned to it.
%

% Useful variables
[m n] = size(X);

% You need to return the following variables correctly.
centroids = zeros(K, n);

% ====================== YOUR CODE HERE ======================
% Instructions: Go over every centroid and compute mean of all points that
%               belong to it. Concretely, the row vector centroids(i, :)
%               should contain the mean of the data points assigned to
%               centroid i.
%
% Note: You can use a for-loop over the centroids to compute this.
%

for i = 1:K
    k = find(idx == i);   % use ==, not a single = (a mistake I made the first time)
    num = size(k, 1);     % number of points assigned to cluster i
    if num > 0            % an empty cluster keeps its zero row instead of becoming NaN
        centroids(i, :) = sum(X(k, :), 1) / num;   % mean of the points in cluster i
    end
end

% =============================================================

end
5:
function [h, display_array] = displayData(X, example_width)
%DISPLAYDATA Display 2D data in a nice grid
%   [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
%   stored in X in a nice grid. It returns the figure handle h and the
%   displayed array if requested.

% Set example_width automatically if not passed in
if ~exist('example_width', 'var') || isempty(example_width)
    example_width = round(sqrt(size(X, 2)));
end

% Gray Image
colormap(gray);

% Compute rows, cols
[m n] = size(X);
example_height = (n / example_width);

% Compute number of items to display
display_rows = floor(sqrt(m));
display_cols = ceil(m / display_rows);

% Between images padding
pad = 1;

% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
                       pad + display_cols * (example_width + pad));

% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
    for i = 1:display_cols
        if curr_ex > m,
            break;
        end
        % Copy the patch

        % Get the max value of the patch
        max_val = max(abs(X(curr_ex, :)));
        display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
                      pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
                        reshape(X(curr_ex, :), example_height, example_width) / max_val;
        curr_ex = curr_ex + 1;
    end
    if curr_ex > m,
        break;
    end
end

% Display Image
h = imagesc(display_array, [-1 1]);

% Do not show axis
axis image off

drawnow;

end

6:
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

mu = mean(X);
X_norm = bsxfun(@minus, X, mu);

sigma = std(X_norm);
X_norm = bsxfun(@rdivide, X_norm, sigma);

% ============================================================

end

8:
function centroids = kMeansInitCentroids(X, K)
%KMEANSINITCENTROIDS This function initializes K centroids that are to be
%used in K-Means on the dataset X
%   centroids = KMEANSINITCENTROIDS(X, K) returns K initial centroids to be
%   used with the K-Means on the dataset X
%

% You should return this value correctly
centroids = zeros(K, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: You should set centroids to randomly chosen examples from
%               the dataset X
%

% =============================================================

end
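
The body of kMeansInitCentroids is left empty in the listing above. One way to follow its instructions (set the centroids to randomly chosen examples) is to shuffle the row indices and take the first K. The script below is a sketch of that idea on made-up data, not the author's solution.

```matlab
% Pick K distinct rows of X at random as the initial centroids.
X = magic(6);                          % toy data: 6 examples with 6 features each
K = 3;

randidx = randperm(size(X, 1));        % random permutation of the row indices
centroids = X(randidx(1:K), :);        % the first K shuffled rows become the centroids

disp(centroids);
```

Sampling without replacement guarantees K different examples, so no two centroids start at the same row.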
