[Python-ML] Performance Evaluation Metrics for Clustering
2018-01-29 17:05
Reference: http://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation
1. Adjusted Rand index
from sklearn import metrics
labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]
# 1.0 means the two labelings agree perfectly; random labelings score close to 0.
print(metrics.adjusted_rand_score(labels_true, labels_pred))
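2. Adjusted mutual information
from sklearn import metrics
labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]
# Mutual information between the two labelings, adjusted for chance agreement.
print(metrics.adjusted_mutual_info_score(labels_true, labels_pred))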
3. Homogeneity, completeness and V-measure
Homogeneity: each cluster contains only members of a single class.
Completeness: all members of a given class are assigned to the same cluster.
V-measure: the harmonic mean of homogeneity and completeness.
from sklearn import metrics
labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]
print(metrics.homogeneity_score(labels_true, labels_pred))
print(metrics.completeness_score(labels_true, labels_pred))
print(metrics.v_measure_score(labels_true, labels_pred))
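The harmonic-mean relationship can be checked by hand. The sketch below reuses the same toy labels and assumes the default weighting (`beta=1.0`) of `v_measure_score`; the variable names `h`, `c`, `v` and `v_manual` are only illustrative.

```python
from sklearn import metrics

labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]

h = metrics.homogeneity_score(labels_true, labels_pred)
c = metrics.completeness_score(labels_true, labels_pred)
v = metrics.v_measure_score(labels_true, labels_pred)

# Harmonic mean of homogeneity and completeness (assumes the default beta=1.0).
v_manual = 2 * h * c / (h + c)

print(h, c, v, v_manual)
```

With the default weighting the last two printed values should coincide up to floating-point error.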
4. Fowlkes-Mallows score
from sklearn import metrics
labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [0, 0, 1, 1, 2, 2]
# Geometric mean of pairwise precision and recall; ranges from 0 to 1.
print(metrics.fowlkes_mallows_score(labels_true, labels_pred))
5. Silhouette Coefficient
Reference:
http://blog.csdn.net/fjssharpsword/article/details/79161570
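Unlike the metrics above, the silhouette coefficient needs no ground-truth labels: it is computed from the data and the predicted clusters alone. The following is a minimal sketch, not taken from the referenced post; the synthetic `make_blobs` data, the choice of `KMeans`, and all parameter values are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 2-D data with three groups (an assumption for this example).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

# Cluster the data; only the predicted labels are passed to the metric.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Mean silhouette over all samples, in [-1, 1]; higher means better-separated clusters.
score = silhouette_score(X, labels)
print(score)
```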
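6. Calinski-Harabasz Index
The score is the ratio of between-cluster dispersion to within-cluster dispersion: the smaller the covariance of the data within each cluster and the larger the covariance between clusters, the higher the Calinski-Harabasz score.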
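A small sketch of this behaviour, under stated assumptions: the data are synthetic blobs around three hand-picked centers, clustered with `KMeans`, and the helper `ch_for_spread` is hypothetical. The metric is called `calinski_harabasz_score` in recent scikit-learn releases; older releases spelled it `calinski_harabaz_score`.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

# Hand-picked cluster centers (an assumption for this example).
centers = [[0, 0], [5, 5], [-5, 5]]

def ch_for_spread(cluster_std):
    """Hypothetical helper: cluster blobs of the given spread and score the result."""
    X, _ = make_blobs(n_samples=300, centers=centers,
                      cluster_std=cluster_std, random_state=0)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    return calinski_harabasz_score(X, labels)

ch_tight = ch_for_spread(0.5)  # compact, well-separated clusters
ch_loose = ch_for_spread(3.0)  # wide, overlapping clusters

print(ch_tight, ch_loose)
```

The compact configuration should produce the larger score, since its within-cluster dispersion is much smaller relative to the distance between clusters.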