Python Week 7: Study Notes and Quiz Assignment

2017-08-02 12:53

Dictionaries and Sets

1. Dictionaries (key-value pairs)

Keys must be immutable (hashable) and unique; values can be of any type.
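
To make the rule about keys concrete, here is a minimal sketch. It is written in Python 3 syntax (the notes themselves use Python 2), and the names `point_names` and `list_key_rejected` are made up for illustration.

```python
# Immutable (hashable) objects such as strings, numbers and tuples work as keys.
point_names = {(0, 0): 'origin', (1, 0): 'unit x'}  # illustrative data
point_names[(0, 1)] = 'unit y'

# Assigning to an existing key overwrites its value, so keys stay unique.
point_names[(0, 0)] = 'start'

# A list is mutable and therefore unhashable: using it as a key raises TypeError.
try:
    point_names[[2, 2]] = 'bad key'
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

print(len(point_names), point_names[(0, 0)], list_key_rejected)  # 3 start True
```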

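for key in my_dict:

Iterates over the keys of the dictionary. Note: the keys are unordered in Python 2; only Python 3.7 and later guarantee insertion order.

my_dict.items() – all key-value pairs

my_dict.keys() – all keys

my_dict.values() – all values

my_dict.clear() – empties the dictionary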
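
The operations listed above can be tried out as follows. This is a small sketch in Python 3 syntax, where `items()`, `keys()` and `values()` return views rather than the lists Python 2 returns; `sorted()` is used only to make the printed order independent of the dictionary's internal order. The sample data is made up.

```python
my_dict = {'apple': 3, 'pear': 5}  # illustrative data

# Iterating over a dict yields its keys.
keys_seen = [key for key in my_dict]

# items(), keys() and values() expose the pairs, the keys and the values.
pairs = sorted(my_dict.items())
keys = sorted(my_dict.keys())
values = sorted(my_dict.values())

print(pairs)   # [('apple', 3), ('pear', 5)]
print(keys)    # ['apple', 'pear']
print(values)  # [3, 5]

# clear() empties the dictionary in place.
my_dict.clear()
print(len(my_dict))  # 0
```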

A simple application of dictionaries:

Read a string and count how many times each letter occurs.
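① Using a list of 26 counters
s = raw_input().lower()  # lowercase first so 'A' and 'a' share one counter
count = [0] * 26         # count[0] is 'a', ..., count[25] is 'z'

for i in s:
    if i.isalpha():  # skip spaces and other non-letters
        count[ord(i) - 97] += 1  # ord('a') == 97

print count

Letters are converted to lowercase first; otherwise an uppercase letter would give a negative index in ord(i) - 97 and either hit the wrong counter or raise IndexError.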
>>>
single is simple,double is trouble
[0, 2, 0, 1, 4, 0, 1, 0, 4, 0, 0, 4, 1, 1, 2, 1, 0, 1, 4, 1, 2, 0, 0, 0, 0, 0]

② Using a dictionary
s = raw_input()
s = s.lower()
dic = {}
for i in s:
    if i.isalpha():
        if i in dic:
            dic[i] += 1
        else:
            dic[i] = 1

print dic

>>>
AabBCCefdg
{'a': 2, 'c': 2, 'b': 2, 'e': 1, 'd': 1, 'g': 1, 'f': 1}
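
The "already seen or not" branch in the dictionary version can also be folded into a single line with `dict.get`. The sketch below is in Python 3 syntax; the function name `count_letters` is hypothetical and not part of the original exercise.

```python
def count_letters(text):
    """Return a dict mapping each lowercase letter in text to its count.

    Hypothetical helper, written only to illustrate dict.get.
    """
    counts = {}
    for ch in text.lower():
        if ch.isalpha():
            # get() returns 0 when the letter has not been seen yet.
            counts[ch] = counts.get(ch, 0) + 1
    return counts

print(count_letters('AabBCCefdg'))
```

The result holds the same counts as the run shown above, although the order in which the keys are printed may differ.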

Word count

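word_freq = {}

# Count every whitespace-separated word in the file.
with open('emma.txt') as f:
    for line in f:
        words = line.strip().split()
        for word in words:
            if word in word_freq:
                word_freq[word] += 1
            else:
                word_freq[word] = 1

# Build (freq, word) pairs so that sorting orders them by frequency.
freq_word = []
for word, freq in word_freq.items():
    freq_word.append((freq, word))

freq_word.sort(reverse=True)

# Print the ten most frequent words.
for freq, word in freq_word[:10]:
    print word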

>>>
to
the
and
of
a
I
was
in
not
her
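
As an alternative to the hand-written counting and sorting, the standard library's `collections.Counter` can do both. This is a different technique from the one in the notes, sketched here in Python 3 syntax; the function name `top_words` and the sample lines are assumptions for illustration, and the sketch does not read emma.txt.

```python
from collections import Counter

def top_words(lines, n=10):
    """Return the n most frequent whitespace-separated words in lines.

    Hypothetical helper; lines can be any iterable of strings, such as a file.
    """
    word_freq = Counter()
    for line in lines:
        word_freq.update(line.strip().split())
    # most_common(n) returns (word, count) pairs, highest count first.
    return word_freq.most_common(n)

sample = ['the cat and the dog', 'the dog barked']  # illustrative data
print(top_words(sample, 2))  # [('the', 3), ('dog', 2)]
```

Words with equal counts may come out in a different order than with the (freq, word) sort used above, because `most_common` does not break ties alphabetically.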

Python split()

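Splits a string on the given separator. If maxsplit is given, at most maxsplit splits are made, so the result has at most maxsplit + 1 elements.

Syntax of split():

str.split([sep[, maxsplit]])

Parameters

sep – the separator. If omitted or None, any run of whitespace (spaces, newlines \n, tabs \t, etc.) counts as one separator, and leading and trailing whitespace is ignored.

maxsplit – the maximum number of splits.

Returns the list of substrings produced by the split.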
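
A few calls make the behavior of split() concrete. This is a minimal sketch in Python 3 syntax (the same calls behave the same way on Python 2 strings); the sample strings are made up.

```python
line = '  single is\tsimple,\ndouble is trouble  '  # illustrative data

# No separator: split on runs of whitespace, ignoring leading/trailing whitespace.
by_whitespace = line.split()
print(by_whitespace)
# ['single', 'is', 'simple,', 'double', 'is', 'trouble']

# Explicit separator: an empty string appears between adjacent separators.
by_comma = 'a,,b'.split(',')
print(by_comma)  # ['a', '', 'b']

# A limit of 1 performs at most one split, giving at most two pieces.
limited = 'a,b,c'.split(',', 1)
print(limited)  # ['a', 'b,c']
```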

Python strip()

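Removes the given characters from both ends of a string (whitespace by default).

Syntax of strip(): str.strip([chars])

Parameters

chars – the set of characters to remove from the head and tail of the string.

Returns a new string with those characters removed from both ends.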
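
The point that chars is a set of characters rather than a prefix or suffix is easy to miss, so here is a short sketch in Python 3 syntax. The variable names and sample strings are made up for illustration.

```python
# Default: whitespace (spaces, tabs, newlines) is removed from both ends.
trimmed = '  hello \n'.strip()
print(repr(trimmed))  # 'hello'

# Only the ends are affected; the 'x' in the middle stays.
no_x = 'xxhixyx'.strip('x')
print(no_x)  # hixy

# chars is treated as a set: any of 'w', '.', 'm', 'o', 'c' is removed from the ends.
host = 'www.example.com'.strip('w.moc')
print(host)  # example
```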

2. Sets (unordered collections of unique keys)
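
Before the exercise, a brief sketch of the two properties the exercise relies on: duplicates collapse, and membership tests use `in`. It is written in Python 3 syntax with made-up sample data.

```python
words = ['to', 'the', 'to', 'and']  # illustrative data

# Building a set drops duplicates.
word_set = set(words)
print(len(word_set))      # 3

# Membership test.
print('the' in word_set)  # True
print('of' in word_set)   # False

# add() inserts an element; adding it again has no effect.
word_set.add('of')
word_set.add('of')
print(len(word_set))      # 4
```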

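Problem:

Implement the backward maximum matching word segmentation algorithm: scan from right to left, find the longest word, and cut it off. For example, for the sentence “研究生命的起源”, backward maximum matching outputs “研究 生命 的 起源”.

Input format:

The first line is the vocabulary, encoded in UTF-8, with words separated by spaces.

It is followed by several lines of Chinese sentences, encoded in UTF-8.

Output format:

The backward maximum matching segmentation, encoded in UTF-8, with words separated by spaces. Each input sentence produces one line of output.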

Sample input:

你 我 他 爱 北京 天安门 研究 研究生 命 生命 的 起源

研究生命的起源

我爱北京天安门

Sample output:

研究 生命 的 起源

我 爱 北京 天安门

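def load_dict():
    # The first input line is the vocabulary: words separated by spaces.
    line = unicode(raw_input(), 'utf-8')
    word_dict = set()
    max_length = 1
    words = line.split()
    for word in words:
        if len(word) > max_length:
            max_length = len(word)
        word_dict.add(word)
    return max_length, word_dict

def bmm_word_seg(sentence, word_dict, max_length):
    # Backward maximum matching; the words are collected from right to left.
    words = []
    sentence = unicode(sentence, 'utf-8')
    right = len(sentence)
    while right > 0:
        # Try the longest candidate ending at right first, then shorter ones.
        for left in range(max(right - max_length, 0), right, 1):
            word = sentence[left:right]
            # A single character is always accepted as a fallback.
            if word in word_dict or right == left + 1:
                words.append(word)
                break
        right = left
    return words

max_length, word_dict = load_dict()
results = []  # renamed from list to avoid shadowing the built-in
while True:
    try:
        seginput = raw_input()
    except EOFError:  # end of input without a trailing empty line
        break
    if seginput == '':
        break
    words = bmm_word_seg(seginput, word_dict, max_length)
    words.reverse()  # restore left-to-right order
    results.append(words)

for words in results:
    for word in words:
        print word.encode('utf-8'),
    print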
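
For readers on Python 3, the same idea can be sketched as a pure function without `unicode()` or `raw_input()`. The function name `bmm_segment` is hypothetical, and the vocabulary below is the sample word list with its word boundaries assumed as written here.

```python
def bmm_segment(sentence, word_dict, max_length):
    """Backward maximum matching; returns the words in left-to-right order.

    Hypothetical Python 3 port of the idea, not the original submission.
    """
    words = []
    right = len(sentence)
    while right > 0:
        # Try the longest candidate ending at right first, then shorter ones.
        for left in range(max(right - max_length, 0), right):
            word = sentence[left:right]
            # Accept a dictionary word, or fall back to a single character.
            if word in word_dict or right == left + 1:
                words.append(word)
                break
        right = left
    words.reverse()
    return words

# Sample vocabulary (word boundaries are an assumption).
vocab = set('你 我 他 爱 北京 天安门 研究 研究生 命 生命 的 起源'.split())
longest = max(len(w) for w in vocab)

print(' '.join(bmm_segment('研究生命的起源', vocab, longest)))  # 研究 生命 的 起源
print(' '.join(bmm_segment('我爱北京天安门', vocab, longest)))  # 我 爱 北京 天安门
```

Storing the vocabulary in a set keeps each `word in word_dict` test at roughly constant cost, which matters because the inner loop tries up to `max_length` candidates per position.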
