Speech Classification with Various Classification Algorithms (Age-Group Recognition) (Continued)
2016-08-19 10:12
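Speech Classification with Various Classification Algorithms (Age-Group Recognition)
Corpus extraction, followed by classification with standard classification algorithms
Corpus extraction and class labels
TIMIT/DOC/SPKRINFO.TXT holds the speaker information, which is used to define the classes. Define the function
def initspeakerinfo(speakerinfo), which builds a speaker:age dictionary: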
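def initspeakerinfo(speakerinfo):
    # Map speaker name (sex + ID, i.e. linelist[1] + linelist[0]) to age in years.
    ages = {}
    with open(speakerinfo, 'r') as f:
        for line in f:
            # split() tolerates runs of spaces between columns
            linelist = line.split()
            # skip blank lines, short lines and ';' comment lines
            if len(linelist) < 6 or linelist[0].startswith(';'):
                continue
            recorddate = linelist[4].split('/')   # recording date, MM/DD/YY
            birthdate = linelist[5].split('/')    # birth date, MM/DD/YY
            if recorddate[2] == "??" or birthdate[2] == "??":
                age = 0   # unknown year: 0 marks a missing age
            else:
                # approximate day counts (365-day years, 30-day months);
                # the birth total must be subtracted as a whole
                recorddays = int(recorddate[2])*365 + int(recorddate[0])*30 + int(recorddate[1])
                birthdays = int(birthdate[2])*365 + int(birthdate[0])*30 + int(birthdate[1])
                age = (recorddays - birthdays) / 365.0
            ages[linelist[1] + linelist[0]] = age
    return ages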
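For example, for three-class or two-class classification:
def getclass(filename, ages):
    # ages is the dictionary returned by initspeakerinfo
    age = ages[filename]
    if age == 0:
        return "0"    # unknown age; note that this shares the label of the middle class
    if age <= 25:
        return "-1"
    elif age <= 45:
        return "0"
    else:
        return "+1"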
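The function above covers the three-class case only. A two-class variant might look like the sketch below. The name getclass2, the 35-year threshold and the speaker names in the toy dictionary are assumptions for illustration, not values taken from the original experiment.

```python
def getclass2(speaker, ages, threshold=35.0):
    """Return "-1" up to `threshold` years, "+1" above it, None if the age is unknown."""
    age = ages[speaker]
    if age == 0:
        return None          # 0 marks a missing age in the speaker:age dictionary
    return "-1" if age <= threshold else "+1"


# Hypothetical speaker:age dictionary, shaped like the output of initspeakerinfo
ages = {"MABC0": 25.7, "FXYZ0": 52.3, "MQRS0": 0}
labels = {speaker: getclass2(speaker, ages) for speaker in ages}
```

Returning None for unknown ages lets the caller drop those speakers instead of folding them into one of the classes.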
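Feature representation
MFCC and i-vector features were extracted earlier. The MFCCs form a 38*n matrix, where 38 is the MFCC dimension and n is the number of frames in an utterance; the i-vector is a 1*200 matrix. To classify on MFCCs they must first be reduced to a fixed-length vector. The simplest approach is to average the 38*n matrix over its frames and then normalize the result. Define the function
def initavgmfcc(avgmfccname,mfccpath), which reads the MFCC files under mfccpath, applies the averaging and normalization, and writes everything to a single file: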
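import os

def initavgmfcc(avgmfccname, mfccpath):
    dimen = 13   # number of leading coefficients taken from each frame
    with open(avgmfccname, 'w') as f:
        for filename in os.listdir(mfccpath):
            avgmfcc = [0.0] * dimen
            length = 0   # number of frames read (starting at 1 would bias the mean)
            with open(os.path.join(mfccpath, filename), 'r') as fo:
                for line in fo:
                    linelist = line.split()
                    if len(linelist) < dimen:
                        continue   # skip blank or short lines
                    for i in range(dimen):
                        avgmfcc[i] += float(linelist[i])
                    length += 1
            if length == 0:
                continue   # no frames, nothing to average
            # mean over frames
            avgmfcc = [v / length for v in avgmfcc]
            # min-max scaling across the coefficients of this one vector
            listmin = min(avgmfcc)
            listmax = max(avgmfcc)
            span = (listmax - listmin) or 1.0   # avoid division by zero
            f.write(filename + " " + " ".join(str((v - listmin) / span) for v in avgmfcc) + "\n")
            print(filename + " avg over")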
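The averaging and scaling step can be checked on a toy matrix. The sketch below is pure Python; the helper names mean_over_frames and minmax and the sample numbers are made up for illustration, with one row per frame and one column per coefficient.

```python
def mean_over_frames(frames):
    """Average a list of frames (rows) into one vector, column by column."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def minmax(vec):
    """Scale one vector to [0, 1] using its own minimum and maximum."""
    lo, hi = min(vec), max(vec)
    span = (hi - lo) or 1.0   # a constant vector maps to all zeros
    return [(v - lo) / span for v in vec]


# Two frames with three coefficients each (made-up values)
frames = [[1.0, 4.0, 7.0],
          [3.0, 6.0, 11.0]]
avg = mean_over_frames(frames)   # [2.0, 5.0, 9.0]
scaled = minmax(avg)             # smallest coefficient -> 0.0, largest -> 1.0
```

Note that this scaling works within a single vector, as in initavgmfcc. Scaling each coefficient across all utterances is a different, also common, choice.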
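Define the function
def initiv(ivname,ivpath), which reads the i-vector files under ivpath and writes them to a single file (a normalized copy goes to ivname+"avg"):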
import os

def initiv(ivname, ivpath):
    dimen = 200   # i-vector dimension
    # f receives the raw i-vectors, avgf the min-max normalized ones
    with open(ivname, 'w') as f, open(ivname + "avg", 'w') as avgf:
        for filename in os.listdir(ivpath):
            with open(os.path.join(ivpath, filename), 'r') as fo:
                for line in fo:
                    linelist = line.strip().split(' ')
                    if len(linelist) != dimen:
                        continue   # only lines holding a full i-vector are used
                    f.write(filename + " " + " ".join(linelist) + "\n")
                    # float() replaces eval() for parsing the numbers
                    values = [float(v) for v in linelist]
                    listmin = min(values)
                    listmax = max(values)
                    span = (listmax - listmin) or 1.0   # avoid division by zero
                    avgiv = [str((v - listmin) / span) for v in values]
                    avgf.write(filename + " " + " ".join(avgiv) + "\n")
PS: see https://www.zhihu.com/question/20455227 for an explanation of normalization.
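Classification with LIBSVM
Installation
Follow http://blog.csdn.net/lqhbupt/article/details/8599295 to install LIBSVM. PS: the 64-bit build takes a little more work, but it can also be handled with nmake.
LIBSVM format
http://blog.csdn.net/kobesdu/article/details/8944851 describes the LIBSVM format and how to generate it. In short, the format is: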
+1 1:0.533355514244 2:0.225956771932 3:0.551555751325 4:0.448831840291 5:0.732958158188 6:0.516967914119 ...
-1 1:0.723092649707 2:0.352547706883 3:0.524416372722 4:0.683881004712 5:0.464490812227 6:0.70279542324 ...
...
In fact, a few lines of Python are enough to produce it.
Finally, define the function
def initFormat(formatname,avgmfccname,dict,dimen), which generates the LIBSVM-format files:
FormatData-iv-train
FormatData-iv-test
FormatData-mfcc-train
FormatData-mfcc-test
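The body of initFormat is not shown here. As a rough sketch of the core conversion, the helper below (to_libsvm_line, a hypothetical name) turns one label and one feature vector into a LIBSVM-format line. It assumes the values arrive as the strings written by initavgmfcc or initiv.

```python
def to_libsvm_line(label, values):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...'; feature indices start at 1."""
    feats = " ".join("%d:%s" % (i, v) for i, v in enumerate(values, start=1))
    return "%s %s" % (label, feats)


# Made-up sample: class "+1" with three feature values
line = to_libsvm_line("+1", ["0.5", "0.25", "0.75"])
# line == "+1 1:0.5 2:0.25 3:0.75"
```

A full initFormat would presumably loop over the lines of the feature file, look up each label with getclass, and write train and test samples to separate files.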
Parameter search
libsvm-3.21/tools/grid.py can be used for parameter search:
E:\libsvm-3.21\tools>grid.py
Usage: grid.py [grid_options] [svm_options] dataset

grid_options :
-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2)
    begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end}
    "null"         -- do not grid with c
-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2)
    begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end}
    "null"         -- do not grid with g
-v n : n-fold cross validation (default 5)
-svmtrain pathname : set svm executable path and name
-gnuplot {pathname | "null"} :
    pathname -- set gnuplot executable path and name
    "null"   -- do not plot
-out {pathname | "null"} : (default dataset.out)
    pathname -- set output file path and name
    "null"   -- do not output file
-png pathname : set graphic output file path and name (default dataset.png)
-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)
    This is experimental. Try this option only if some parameters have been checked for the SAME data.
The options are listed above.
The script is used to find the parameters C and gamma.
http://m.blog.csdn.net/article/details?id=46386201
Parameter search is based on cross-validation.
-v n splits the data into n folds.
Each fold in turn serves as the test set while the other n-1 folds form the training set. C and gamma take their values from the ranges given by -log2c (default -5,15,2, i.e. 2^-5, 2^-3, ..., 2^15) and -log2g (default 3,-15,-2, i.e. 2^3, 2^1, ..., 2^-15).
The search itself is a plain enumeration: if the C range holds numC values and the gamma range holds numG values, a total of
numC*numG*n train/test runs are performed. grid.py reports the cross-validation accuracy for each (C, gamma) pair, and the pair with the highest accuracy is taken as the result of the search.
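The size of the search can be worked out directly from the begin,end,step triples. The sketch below is pure Python and only counts the grid; it does not call grid.py, and the helper name log2_range is hypothetical.

```python
def log2_range(begin, end, step):
    """Return 2**begin, 2**(begin+step), ... up to and including 2**end."""
    values = []
    k = begin
    while (step > 0 and k <= end) or (step < 0 and k >= end):
        values.append(2.0 ** k)
        k += step
    return values


c_range = log2_range(-5, 15, 2)    # default -log2c range
g_range = log2_range(3, -15, -2)   # default -log2g range
n_folds = 5                        # default -v

# numC * numG * n train/test runs in total
total_runs = len(c_range) * len(g_range) * n_folds
```

With the defaults this gives 11 values of C, 10 values of gamma and 550 runs, which is why narrowing the ranges or enlarging the step shortens the search considerably.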
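LIBSVM training and prediction
from svmutil import *

train_y, train_x = svm_read_problem('../FormatData-train')
test_y, test_x = svm_read_problem('../FormatData-test')
# train an RBF-kernel SVM with the chosen C (-c) and gamma (-g)
model = svm_train(train_y, train_x, '-c 112.0 -g 0.000125')
# p_acc holds (accuracy, mean squared error, squared correlation coefficient)
p_label, p_acc, p_val = svm_predict(test_y, test_x, model)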
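Classification with scikit-learn
scikit-learn is a third-party Python library that offers many classification methods behind a simple API. Some familiarity with the classification methods, Python and numpy is needed beforehand.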
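LDA/PLDA/PCA processing
scikit-learn also provides LDA, so the earlier LIBSVM pipeline can be upgraded to:
from svmutil import *
from sklearn.datasets import load_svmlight_file
# in scikit-learn 0.16 this class was sklearn.lda.LDA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# read the data (mfcc / i-vector) as dense arrays, which LDA requires
train_x, train_y = load_svmlight_file('../FormatData-mfcc-train')
test_x, test_y = load_svmlight_file('../FormatData-mfcc-test', n_features=train_x.shape[1])
train_x, test_x = train_x.toarray(), test_x.toarray()

# LDA keeps at most (number of classes - 1) components, so the original
# n_components=100 cannot be reached with two or three age classes;
# the default keeps all available components
clf = LDA(solver='eigen')
clf.fit(train_x, train_y)            # fit once, on the training set only
train_x2 = clf.transform(train_x)
test_x2 = clf.transform(test_x)

# train on the projected features (the original referenced an undefined train_y2)
model = svm_train(train_y.tolist(), train_x2.tolist(), '-c 8192.0 -g 0.05')
p_label, p_acc, p_val = svm_predict(test_y.tolist(), test_x2.tolist(), model)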
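One point worth keeping in mind when choosing n_components: LDA can project onto at most min(number of classes - 1, number of features) directions. The small helper below (max_lda_components, a hypothetical name) makes the limit explicit for the setups used in this post.

```python
def max_lda_components(n_classes, n_features):
    """Upper bound on the number of discriminant directions LDA can produce."""
    return min(n_classes - 1, n_features)


# Three age classes on 200-dimensional i-vectors
iv_limit = max_lda_components(3, 200)
# Two age classes on 13 averaged MFCC coefficients
mfcc_limit = max_lda_components(2, 13)
```

So after LDA the three-class task is left with at most two features per utterance, and the two-class task with one. It may be worth re-running the C and gamma search on the projected data.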
scikit-learn classification
Algorithms such as GMM, KNN and GBDT can be tried; see scikit-learn.user_guide_0.16.1.pdf.
The following is adapted from the example at http://www.cnblogs.com/nsnow/p/5026673.html:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time
import pickle

import numpy as np
from sklearn import metrics


# Multinomial Naive Bayes Classifier
def naive_bayes_classifier(train_x, train_y):
    from sklearn.naive_bayes import MultinomialNB
    model = MultinomialNB(alpha=0.01)
    model.fit(train_x, train_y)
    return model


# KNN Classifier
def knn_classifier(train_x, train_y):
    from sklearn.neighbors import KNeighborsClassifier
    model = KNeighborsClassifier()
    model.fit(train_x, train_y)
    return model


# Logistic Regression Classifier
def logistic_regression_classifier(train_x, train_y):
    from sklearn.linear_model import LogisticRegression
    model = LogisticRegression(penalty='l2')
    model.fit(train_x, train_y)
    return model


# Random Forest Classifier
def random_forest_classifier(train_x, train_y):
    from sklearn.ensemble import RandomForestClassifier
    model = RandomForestClassifier(n_estimators=8)
    model.fit(train_x, train_y)
    return model


# Decision Tree Classifier
def decision_tree_classifier(train_x, train_y):
    from sklearn import tree
    model = tree.DecisionTreeClassifier()
    model.fit(train_x, train_y)
    return model


# GBDT (Gradient Boosting Decision Tree) Classifier
def gradient_boosting_classifier(train_x, train_y):
    from sklearn.ensemble import GradientBoostingClassifier
    model = GradientBoostingClassifier(n_estimators=200)
    model.fit(train_x, train_y)
    return model


# SVM Classifier
def svm_classifier(train_x, train_y):
    from sklearn.svm import SVC
    model = SVC(kernel='rbf', probability=True)
    model.fit(train_x, train_y)
    return model


# SVM Classifier using cross validation
def svm_cross_validation(train_x, train_y):
    # in scikit-learn 0.16 GridSearchCV lived in sklearn.grid_search
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC
    model = SVC(kernel='rbf', probability=True)
    param_grid = {'C': [1e-3, 1e-2, 1e-1, 1, 10, 100, 1000],
                  'gamma': [0.001, 0.0001]}
    grid_search = GridSearchCV(model, param_grid, n_jobs=1, verbose=1)
    grid_search.fit(train_x, train_y)
    best_parameters = grid_search.best_estimator_.get_params()
    for para, val in best_parameters.items():
        print('%s %s' % (para, val))
    # retrain on the full training set with the best C and gamma
    model = SVC(kernel='rbf', C=best_parameters['C'],
                gamma=best_parameters['gamma'], probability=True)
    model.fit(train_x, train_y)
    return model


def read_file(path):
    # each line: the label followed by space-separated feature values
    x, y = [], []
    with open(path) as f:
        for line in f:
            values = [float(v) for v in line.split()]
            if not values:
                continue   # skip blank lines
            x.append(values[1:])
            y.append(values[0])
    return np.array(x), np.array(y)


def read_data(data_file):
    train_x, train_y = read_file(data_file + "-train")
    test_x, test_y = read_file(data_file + "-test")
    return train_x, train_y, test_x, test_y


if __name__ == '__main__':
    data_file = "./data/FormatData-mfcc"
    model_save_file = None   # set to a path to pickle the trained models
    model_save = {}

    test_classifiers = ['KNN', 'LR', 'RF', 'DT', 'SVM', 'GBDT']
    classifiers = {  # 'NB': naive_bayes_classifier,
        'KNN': knn_classifier,
        'LR': logistic_regression_classifier,
        'RF': random_forest_classifier,
        'DT': decision_tree_classifier,
        'SVM': svm_classifier,
        'SVMCV': svm_cross_validation,
        'GBDT': gradient_boosting_classifier,
    }

    print('reading training and testing data...')
    train_x, train_y, test_x, test_y = read_data(data_file)
    num_train, num_feat = train_x.shape
    num_test, num_feat = test_x.shape
    is_binary_class = (len(np.unique(train_y)) == 2)
    print(is_binary_class)
    print('******************** Data Info *********************')
    print('#training data: %d, #testing_data: %d, dimension: %d' % (num_train, num_test, num_feat))

    for classifier in test_classifiers:
        print('******************* %s ********************' % classifier)
        start_time = time.time()
        model = classifiers[classifier](train_x, train_y)
        print('training took %fs!' % (time.time() - start_time))
        predict = model.predict(test_x)
        if model_save_file is not None:
            model_save[classifier] = model
        # precision and recall are only reported for the two-class case
        if is_binary_class:
            precision = metrics.precision_score(test_y, predict)
            recall = metrics.recall_score(test_y, predict)
            print('precision: %.2f%%, recall: %.2f%%' % (100 * precision, 100 * recall))
        accuracy = metrics.accuracy_score(test_y, predict)
        print('accuracy: %.2f%%' % (100 * accuracy))

    if model_save_file is not None:
        with open(model_save_file, 'wb') as fout:
            pickle.dump(model_save, fout)
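The script reports precision and recall only in the two-class case. For the three age classes, per-class values can be computed by hand. The sketch below is pure Python on made-up labels, and the function name per_class_precision_recall is hypothetical.

```python
def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for every class seen in either list."""
    result = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)   # samples predicted as c
        actual = sum(1 for t in y_true if t == c)      # samples truly in c
        precision = tp / float(predicted) if predicted else 0.0
        recall = tp / float(actual) if actual else 0.0
        result[c] = (precision, recall)
    return result


# Made-up labels for the three age classes -1, 0, +1
y_true = [-1, -1, 0, 0, 1, 1]
y_pred = [-1, 0, 0, 0, 1, -1]
scores = per_class_precision_recall(y_true, y_pred)
```

Per-class numbers like these show which age group the errors concentrate on, which a single overall accuracy figure hides.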