
Column processing: finding and handling invalid values

2016-01-05 20:17
# The input is a list named legislators; its first two elements look like this:
# [['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'], ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', '']]
# 'Bassett', 'Richard' is the name
# '1745-04-02' is the birthdate
# 'M' is the gender


Here we process the gender column.

First, pull that field out on its own:

genders_list = []
for row in legislators:
    # Index 3 is the gender column.
    genders_list.append(row[3])


Then get every distinct value the field takes:

# Converting to a set, so we get the unique values
unique_genders = set(genders_list)
# We can't index sets, so we need to convert back into a list first.
unique_genders_list = list(unique_genders)
print(unique_genders_list)
# Output: ['', 'M', 'F'] (sets are unordered, so the order may differ)


This shows that, besides the expected M and F, the column also contains empty values.
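Knowing how often each value occurs helps when deciding how to treat the empty ones. A minimal sketch using `collections.Counter` from the standard library, on the two sample rows shown at the top of this post (the variable name `gender_counts` is just illustrative):

```python
from collections import Counter

# The two sample rows from the top of the post.
legislators = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

# Count how many times each gender value (column index 3) appears.
gender_counts = Counter(row[3] for row in legislators)
print(gender_counts)
```

The keys of `gender_counts` are the same unique values that the set gives, and the counts show how many rows would be affected by any clean-up.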

The pandas way to get the same thing:

# recent_grads['Major'].value_counts() returns a Series; its index holds the unique values
majors = recent_grads['Major'].value_counts().index
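For the gender column, the same idea might look like the sketch below. The DataFrame `df` and its column name `gender` are made up for illustration and are not part of the original data set.

```python
import pandas as pd

# Hypothetical sample data; the column name "gender" is an assumption.
df = pd.DataFrame({"gender": ["M", "", "F", "M", ""]})

# value_counts() returns a Series: the index holds the values, the data holds the counts.
counts = df["gender"].value_counts()
unique_from_counts = list(counts.index)

# Series.unique() returns the distinct values in order of first appearance.
unique_in_order = list(df["gender"].unique())

print(counts)
print(unique_in_order)
```

`value_counts()` is handy when the frequencies matter too; `unique()` is enough if only the list of distinct values is needed.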


Handling the invalid values

for row in legislators:
    # Replace an empty gender with 'M' (the fill value chosen for this example).
    if row[3] == '':
        row[3] = 'M'
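The loop above only catches the empty string. A slightly more general sketch checks each value against a set of allowed values and replaces anything else with a default. The names `VALID_GENDERS` and `fill_invalid` are hypothetical helpers introduced here, and `'M'` as the default simply follows the choice made above.

```python
# Allowed values for the gender column (an assumption based on the output seen earlier).
VALID_GENDERS = {'M', 'F'}

def fill_invalid(rows, column, valid, default):
    """Replace, in place, every value in `column` that is not in `valid`.

    Returns the number of rows that were changed.
    """
    changed = 0
    for row in rows:
        if row[column] not in valid:
            row[column] = default
            changed += 1
    return changed

# The two sample rows from the top of the post.
legislators = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

n_fixed = fill_invalid(legislators, 3, VALID_GENDERS, 'M')
print(n_fixed, [row[3] for row in legislators])
```

Returning the number of changed rows makes it easy to check how much of the data was affected by the fill.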