
Logistic-regression-based prediction of Yancheng vehicle registration counts (Tianchi): Python code

2018-01-28 16:53
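I used logistic regression to make a regression-style prediction. The dataset is the Yancheng vehicle registration counts from the Tianchi competition. The final results are not pretty: the predicted registration count essentially keeps rising as the date increases.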

Here is just part of the results:



The code is below (I'm a beginner):

import numpy as np


def sigmoid(inX):
    # Logistic function: maps any real input into (0, 1)
    return 1.0 / (1 + np.exp(-inX))

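# Improved stochastic gradient ascent
def stocGradAscent1(dataMatrix, classLabels, numIter=150):
    m, n = np.shape(dataMatrix)
    weights = np.ones(n)
    for j in range(numIter):
        # Indices of the samples not yet used in this pass
        dataIndex = list(range(m))
        for i in range(m):
            # Step size decays with each update but never reaches zero
            alpha = 4 / (1.0 + j + i) + 0.0001
            # Pick a random position among the remaining indices,
            # then map it back to the actual row so no sample repeats in a pass
            randIndex = int(np.random.uniform(0, len(dataIndex)))
            sampleIndex = dataIndex[randIndex]
            h = sigmoid(np.sum(dataMatrix[sampleIndex] * weights))
            error = classLabels[sampleIndex] - h
            weights = weights + alpha * error * dataMatrix[sampleIndex]
            del dataIndex[randIndex]
    return weights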
# Make a prediction
def prediction(inX, weights):
    # Linear score: dot product of features and weights (no sigmoid is applied)
    prob = np.dot(inX, weights)
    return prob
# Train, then predict on the test file
def colicTest():
    trainningSet = []
    trainningLabels = []
    # Training file: the first three tab-separated columns are features,
    # the last column is the label (registration count)
    with open('train_20171215.txt', 'r', encoding='utf-8') as frTrain:
        for line in frTrain:
            currLine = line.strip().split('\t')
            lineArr = [float(currLine[i]) for i in range(3)]
            trainningSet.append(lineArr)
            trainningLabels.append(float(currLine[-1]))
    trainWeights = stocGradAscent1(np.array(trainningSet), trainningLabels, 500)
    with open('testA.txt') as frTest:
        for line in frTest:
            currLine = line.strip().split('\t')
            lineArr = [float(currLine[i]) for i in range(3)]
            predict = prediction(np.array(lineArr), trainWeights)
            print(predict)


colicTest()
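
One plausible explanation for predictions that keep rising with the date (an interpretation, not something stated in the original post) is that the labels are counts much larger than 1, while the sigmoid output never exceeds 1. The error term `label - h` is then always positive, so with positive features every weight only grows, and the linear score grows with the date. The sketch below illustrates this on made-up data; the toy feature layout (date index, day of week, brand id), the helper name `sgd_ascent`, and the label values are all assumptions chosen for illustration, not the competition data.

```python
import numpy as np


def sigmoid(x):
    # Clip the input so np.exp does not overflow for very large |x|
    return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))


def sgd_ascent(X, y, num_iter=20, seed=0):
    # Same update rule as stocGradAscent1, visiting each row once per pass
    rng = np.random.default_rng(seed)
    m, n = X.shape
    w = np.ones(n)
    for j in range(num_iter):
        for i, idx in enumerate(rng.permutation(m)):
            alpha = 4 / (1.0 + j + i) + 0.0001
            error = y[idx] - sigmoid(X[idx] @ w)  # positive whenever y[idx] > 1
            w = w + alpha * error * X[idx]
    return w


# Hypothetical toy data: (date index, day of week 1..7, constant brand id)
dates = np.arange(1, 31, dtype=float)
X = np.column_stack([dates, (dates - 1) % 7 + 1, np.ones_like(dates)])
y = 100.0 + 5.0 * dates  # made-up counts, all far above 1

w = sgd_ascent(X, y)
preds = X @ w  # linear score, as in prediction()

print("weights:", w)
print("same weekday, one week apart:", preds[0], preds[7])
```

Under these assumptions, every weight ends above its starting value of 1, and for a fixed weekday and brand a later date always gets a larger score. If that is what is happening with the real data, a model suited to count targets (for example plain linear regression on scaled features) may be worth trying instead.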
