
Python Question List

2017-09-15 22:11

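Contents

Q: How do you tell whether an object is iterable?

Q: How do you tell whether an object is an Iterator?

Q: How do you turn a list or other non-Iterator object into an Iterator?

Q: String data preprocessing

Topic: Path operations
Q: How do you join a directory and a file name into an absolute path?

Q: How do you split a path into two parts: the directory, and the last component (a file name or the last-level directory)?

Q: How do you extract a file's extension?

Topic: Files
Q: How do you delete a file?

Q: How do you write to a file?

Q: How do you read a small file (one that fits in memory)?

Q: How do you read a large file (one that does not fit in memory)?

Q: Using cPickle to store a Python object on disk in binary form

Q: Using Keras to extract all the words in a text and build a dictionary

Q: Performing math operations on the elements of a list

Q: How do you distinguish class attributes from instance attributes, and get or set the value of a class (or instance) attribute?

Topic: Measuring time
Q: Printing the difference between two times

Topic: Python reference books

References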

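Q: How do you tell whether an object is iterable?

How it works

An iterable object is an instance of the abstract base class Iterable.

Use isinstance(objectName, Iterable) to check. If it returns True, the object is iterable; otherwise it is not.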

Q: How do you tell whether an object is an Iterator?

How it works

Use isinstance(objectName, Iterator) to check. If it returns True, the object is an Iterator; otherwise it is not.

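Q: How do you turn a list or other non-Iterator object into an Iterator?

Call iter(iterableObjectName). The benefit is that you can call next() on the resulting Iterator to get the next value.

Example

>>> from collections.abc import Iterator, Iterable ## on Python 2, import from collections instead
>>> ll = [1,2,3]
>>> isinstance(ll, Iterable) ## check whether ll is iterable
>>> gene = (x*x for x in [1,2,3])
>>> isinstance(gene, Iterable) ## check whether gene is an Iterable
>>> llIter = iter(ll) ## turn ll, which is not an Iterator, into an Iterator
>>> isinstance(llIter, Iterator) ## returns True: llIter, built from the Iterable ll, is an Iterator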
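
To make the difference concrete, here is a minimal sketch, assuming Python 3 (the variable names are only illustrative). It checks a list against both abstract base classes, converts it with iter(), and steps through it with next() until StopIteration is raised.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]

# A list is iterable, but it is not itself an iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# iter() returns an iterator over the list.
numbers_iter = iter(numbers)
iter_is_iterator = isinstance(numbers_iter, Iterator)

# next() yields one value at a time.
first = next(numbers_iter)
rest = list(numbers_iter)  # consumes whatever is left

# Once exhausted, next() raises StopIteration.
try:
    next(numbers_iter)
    exhausted = False
except StopIteration:
    exhausted = True

print(list_is_iterable, list_is_iterator, iter_is_iterator)
print(first, rest, exhausted)
```

An iterator can be consumed only once; to walk the list again, call iter() on it again.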

Q: String data preprocessing

Goal: among the printable characters, keep letters and digits, and convert every other printable character to a space.

Version 1: plain Python

import string

def clearStr(char):
    newChar = "" ## non-printable characters are dropped
    if char in string.printable: ## check whether char is a printable character
        if char.isalnum() or char.isspace():
            newChar = char
        else:
            newChar = " "
    return newChar

def change():
    ll = "Linksys EtherFast 8-Port 10/100 Switch (New/Workgroup)"
    newll = map(clearStr, ll)
    print("".join(newll))

Version 2: Keras

Recommended: when an open-source tool already does the job, use it.

Use Keras's text_to_word_sequence function.

import keras
from keras.preprocessing import text

def cleanStr():
    s = "abc-df 1323/dfd"
    newText = keras.preprocessing.text.text_to_word_sequence(s,
        filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n', lower=True, split=" ")
    print(newText, type(newText))
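
If Keras is not installed, the same filter, lowercase, and split idea can be sketched with the standard library alone. This is only an illustration, not Keras's implementation; the function name simple_word_sequence and its defaults are made up for this example.

```python
# Characters treated as separators (the same set passed as `filters` above).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'

def simple_word_sequence(text, filters=FILTERS, lower=True, split=" "):
    """Lowercase the text, turn filtered characters into the split
    character, and return the non-empty pieces as a list of words."""
    if lower:
        text = text.lower()
    # Map every filtered character to the split character.
    table = str.maketrans({ch: split for ch in filters})
    text = text.translate(table)
    # Drop the empty strings produced by consecutive separators.
    return [word for word in text.split(split) if word]

words = simple_word_sequence("abc-df 1323/dfd")
print(words)
```

Unlike Version 1, which returns a cleaned string, this returns a list of words.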

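Topic: Path operations

Principle

* Do not build file paths by string concatenation. Reason: with os.path, your code runs on different operating systems without modification.

Q: How do you join a directory and a file name into an absolute path?

Method: call os.path.join.

import os
dirV1 = '/home/xuyl/PycharmProjects/StudyPython' ## directory, case I (no trailing slash)
dirV2 = '/home/xuyl/PycharmProjects/StudyPython/' ## directory, case II (trailing slash)
file = 'abtbuy.inconsistentPair.txt' ## file name
pathV1 = os.path.join(dirV1, file) ## join the directory and the file name
pathV2 = os.path.join(dirV2, file) ## on Linux/macOS, the same result as pathV1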
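
A quick check of the two cases, as a sketch. It uses posixpath, the module behind os.path on Linux and macOS, so the output is the same on any machine; the directory and file names are placeholders.

```python
import posixpath  # os.path is this module on Linux and macOS

dir_no_slash = "/home/user/project"      # placeholder directory
dir_with_slash = "/home/user/project/"   # same directory with a trailing slash
file_name = "data.txt"                   # placeholder file name

path_a = posixpath.join(dir_no_slash, file_name)
path_b = posixpath.join(dir_with_slash, file_name)

# join() inserts exactly one separator, so both spellings agree.
print(path_a)
print(path_a == path_b)

# Plain concatenation gets the first case wrong.
naive = dir_no_slash + file_name
print(naive)

# Caveat: an absolute second argument discards everything before it.
reset = posixpath.join(dir_no_slash, "/etc/hosts")
print(reset)
```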

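Q: How do you split a path into two parts: the directory, and the last component (a file name or the last-level directory)?

Method: call os.path.split.

import os
absPath = '/home/xuyl/123.txt'
rootDir, fn = os.path.split(absPath) ## case I: the first part is the directory, the second part is the file name
path = '/home/xuyl'
other, lastDirName = os.path.split(path) ## case II: the first part is the directory without its last level, the second part is the last-level directory name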
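
The sketch below shows what split() returns in both cases, plus the trailing-slash case, which is easy to trip over. It uses posixpath so the results do not depend on the operating system; the paths are placeholders.

```python
import posixpath  # os.path is this module on Linux and macOS

# Case I: path ending in a file name.
head_file, tail_file = posixpath.split("/home/user/notes.txt")

# Case II: path ending in a directory name (no trailing slash).
head_dir, tail_dir = posixpath.split("/home/user")

# A trailing slash leaves the second part empty.
head_slash, tail_slash = posixpath.split("/home/user/")

# dirname() and basename() return the two halves separately.
same_head = posixpath.dirname("/home/user/notes.txt")
same_tail = posixpath.basename("/home/user/notes.txt")

print(head_file, tail_file)
print(head_dir, tail_dir)
print(head_slash, repr(tail_slash))
```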

Q: How do you extract a file's extension?

Method: call os.path.splitext.

Example

import os
absPath = '/home/xuyl/123.txt'
other, extendName = os.path.splitext(absPath) ## the second part of the return value is '.txt'
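
A few edge cases are worth knowing. This sketch again uses posixpath for OS-independent results, with made-up file names.

```python
import posixpath  # os.path is this module on Linux and macOS

# The usual case: the extension includes the leading dot.
root, ext = posixpath.splitext("/home/user/report.txt")

# Only the last suffix is split off.
tar_root, tar_ext = posixpath.splitext("archive.tar.gz")

# A leading dot marks a hidden file, not an extension.
hidden_root, hidden_ext = posixpath.splitext(".bashrc")

# No dot at all: the extension is an empty string.
plain_root, plain_ext = posixpath.splitext("README")

print(root, ext)
print(tar_root, tar_ext)
print(hidden_root, repr(hidden_ext))
print(plain_root, repr(plain_ext))
```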

Topic: Files

Q: How do you delete a file?

Function: os.remove()

import os
os.remove('/home/test.txt') ## the path must be passed as a string
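
os.remove() raises an exception when the file does not exist. A minimal sketch of handling that, assuming Python 3 (where the exception is FileNotFoundError); the helper name remove_if_exists is hypothetical, and the file is a temporary one created just for the demonstration.

```python
import os
import tempfile

def remove_if_exists(path):
    """Delete path and report whether a file was actually removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # The file was already gone; treat this as "nothing to do".
        return False

# Create a throwaway file so the example does not touch real data.
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

first_try = remove_if_exists(tmp_path)   # file exists, gets deleted
second_try = remove_if_exists(tmp_path)  # file is already gone
print(first_try, second_try, os.path.exists(tmp_path))
```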

Q: How do you write to a file?

Function: call write() on the file object.

Example

with open("/home/log.txt", 'a') as f: ## 'a' appends to the file
    f.write("Hello world!\n")
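
The mode string decides what happens to existing content. The sketch below contrasts 'w' and 'a', writing to a temporary directory so nothing real is touched; the file name is a placeholder.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "log.txt")  # throwaway location

# 'w' creates the file, or truncates it if it already exists.
with open(log_path, "w") as f:
    f.write("first line\n")

# 'a' keeps the existing content and writes at the end.
with open(log_path, "a") as f:
    f.write("second line\n")

with open(log_path, "r") as f:
    after_append = f.read()

# Opening with 'w' again discards what was there.
with open(log_path, "w") as f:
    f.write("fresh start\n")

with open(log_path, "r") as f:
    after_truncate = f.read()

print(repr(after_append))
print(repr(after_truncate))
```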

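Q: How do you read a small file (one that fits in memory)?

Approach: read the whole file into memory at once.

Method 1: read() + close(). Not recommended: it is more verbose, and you must remember to call f.close().

import os
dirPath = '/home/xuyl/PycharmProjects/'
file = 'abtbuyP.txt'
path = os.path.join(dirPath, file)
f = open(path, 'r') ## open before try, so that f is always defined in finally
try:
    content = f.read() ## read the whole file into memory at once
finally:
    f.close()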

Method 2: read() + with statement

import os
dirPath = '/home/xuyl/PycharmProjects/'
file = 'abtbuyP.txt'
path = os.path.join(dirPath, file)
with open(path, 'r') as f: ## the with statement closes the file automatically when the block ends
    print(f.read())

Method 3: read() + codecs.open()

Example

import codecs
with codecs.open('/home/xuyl/test.txt', 'r', encoding='utf-8') as f: ## decode the file as UTF-8
    content = f.read()
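
On Python 3 the built-in open() accepts an encoding argument directly, so codecs.open() is mainly useful for Python 2 code. A minimal round trip, assuming Python 3 and a temporary file created just for the demonstration:

```python
import os
import tempfile

text_path = os.path.join(tempfile.mkdtemp(), "utf8_demo.txt")  # throwaway file
original = "Python 问题列表\n"  # includes non-ASCII characters

# Write the text, encoding it as UTF-8 on the way out.
with open(text_path, "w", encoding="utf-8") as f:
    f.write(original)

# Read it back, decoding from UTF-8.
with open(text_path, "r", encoding="utf-8") as f:
    decoded = f.read()

# Binary mode returns the raw bytes without decoding.
with open(text_path, "rb") as f:
    raw = f.read()

print(decoded == original)
print(len(original), len(raw))  # characters versus bytes
```

The two lengths differ because each Chinese character occupies several bytes in UTF-8.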

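Q: How do you read a large file (one that does not fit in memory)?

Approach: read it line by line.

How: iterate over the file object itself, which yields one line at a time. Avoid f.readlines() here: it loads every line into a list at once, which defeats the purpose.

Example: as with read(), there are three ways to open the file; this one uses try/finally.

import os
dirPath = '/home/xuyl/PycharmProjects/'
file = 'abtbuyP.txt'
path = os.path.join(dirPath, file)
f = open(path, 'r')
try:
    for line in f: ## read the file line by line
        print(line.strip())
finally:
    f.close()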
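
The same idea with a with statement, plus a chunked variant for files that have no line breaks. This is a sketch: the generated file stands in for a large one, and CHUNK_SIZE is an arbitrary value chosen for the example.

```python
import os
import tempfile

# Build a stand-in for a large file in a temporary directory.
big_path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(big_path, "w") as f:
    for i in range(1000):
        f.write("line %d\n" % i)

# Line by line: only the current line is held in memory.
line_count = 0
with open(big_path, "r") as f:
    for line in f:
        line_count += 1

# Fixed-size chunks: useful for binary files or files without line breaks.
CHUNK_SIZE = 4096  # bytes per read; an arbitrary choice for this sketch
byte_count = 0
with open(big_path, "rb") as f:
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:  # an empty bytes object signals end of file
            break
        byte_count += len(chunk)

print(line_count, byte_count == os.path.getsize(big_path))
```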

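Q: Using cPickle to store a Python object on disk in binary form

pickle and cPickle expose the same functions, but cPickle is reportedly faster because it is implemented in C. In Python 3, cPickle no longer exists as a separate module; pickle uses the C implementation automatically. The code is based on the pickle example in reference [3].

try:
    import cPickle as pickle ## Python 2
except ImportError:
    import pickle ## Python 3

info_dict = {'1':1, '2': 1, '3': 3}
f = open('info.pkl', 'wb')
pickle.dump(info_dict, f) ## store the Python object info_dict on disk
f.close()

f = open('info.pkl', 'rb') ## pickle files must be opened in binary mode
info2_dict = pickle.load(f) ## load the Python object from info.pkl back into memory
f.close()
print(info2_dict)

PS: The .pkl extension is only a convention; pickle does not require it.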
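
A compact variant of the round trip, as a sketch assuming Python 3. It uses with statements so the files are always closed, and it also shows the in-memory functions dumps() and loads(). The file name and its ".bin" extension are arbitrary choices for the example.

```python
import os
import pickle
import tempfile

record = {"1": 1, "2": 1, "3": 3}

# In-memory round trip: dumps() returns bytes, loads() rebuilds the object.
blob = pickle.dumps(record)
from_blob = pickle.loads(blob)

# File round trip in a temporary directory; both opens use binary mode.
pkl_path = os.path.join(tempfile.mkdtemp(), "record.bin")
with open(pkl_path, "wb") as f:
    pickle.dump(record, f)
with open(pkl_path, "rb") as f:
    from_file = pickle.load(f)

print(type(blob).__name__, from_blob == record, from_file == record)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.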

Q: Using Keras to extract all the words in a text and build a dictionary

import keras
from keras.preprocessing import text

def extractDictionary():
    texts=[]
    texts.append("Kenwood KDC-C669 Car CD Changer - KDCC669")
    texts.append("Cordless Solo S2 Radar/Laser Detector - 010ES20A")

    tokenizer = keras.preprocessing.text.Tokenizer()
    tokenizer.fit_on_texts(texts) ## feed texts to the Tokenizer

    word_index = tokenizer.word_index ## dictionary of all words in texts: the key is the word, the value is its index
    print("word_index:", word_index)

    texts_seq = tokenizer.texts_to_sequences(texts) ## convert each text to a sequence of its words' index values from word_index
    print("texts_seq", texts_seq) ## PS: I don't understand why there are 15 words when the dictionary has only 14 (possibly the matrix width below: index 0 appears to be reserved, giving len(word_index) + 1 columns)

    texts_matrix = tokenizer.texts_to_matrix(texts, mode="tfidf") ## convert each text to a vector of tf-idf weights
    texts_matrix = tokenizer.texts_to_matrix(texts, mode="binary") ## overwrites the tf-idf result with a 0/1 vector per text
    print(texts_matrix, "shape of texts_matrix", texts_matrix.shape)
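
To see what a word index and the resulting sequences look like without installing Keras, here is a standard-library sketch of the idea. It is not Keras's implementation: the helper names are made up, and starting the indices at 1 is an assumption made to mirror the convention of keeping 0 free.

```python
from collections import Counter

FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'  # characters treated as separators

def tokenize(text):
    """Lowercase, replace filtered characters with spaces, split on whitespace."""
    table = str.maketrans({ch: " " for ch in FILTERS})
    return text.lower().translate(table).split()

def build_word_index(texts):
    """Map each word to an index starting at 1, most frequent words first."""
    counts = Counter(word for t in texts for word in tokenize(t))
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_sequences(texts, word_index):
    """Replace every known word with its index."""
    return [[word_index[w] for w in tokenize(t) if w in word_index] for t in texts]

texts = [
    "Kenwood KDC-C669 Car CD Changer - KDCC669",
    "Cordless Solo S2 Radar/Laser Detector - 010ES20A",
]
word_index = build_word_index(texts)
sequences = texts_to_sequences(texts, word_index)
print(len(word_index), sequences)
```

With these two texts, every word occurs once, so the dictionary has 14 entries and each sequence has 7 indices.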

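Q: Performing math operations on the elements of a list

"""
Task: compute the sum, mean, maximum, and so on of the elements of a list.
Core idea: convert the list to a numpy ndarray (multi-dimensional array), then use array operations.
Benefits: first, for large arrays it is usually faster than an equivalent pure-Python loop (the underlying implementation is in C); second, it saves time, since you call an existing interface instead of writing and checking the logic yourself.
"""
import numpy as np

def mathOperation():
    ll = [1,2,3]
    lla = np.asarray(ll)
    print("sum", lla.sum())
    print("max", lla.max())
    print("average", lla.mean())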
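
For a short list, the built-in functions and the statistics module (Python 3.4+) give the same answers without numpy. A minimal sketch:

```python
import statistics

values = [1, 2, 3]

total = sum(values)                    # sum of the elements
largest = max(values)                  # maximum element
average = statistics.mean(values)      # arithmetic mean
spread = statistics.pstdev(values)     # population standard deviation

print("sum", total)
print("max", largest)
print("average", average)
print("pstdev", spread)
```

numpy pays off once the data is large or multi-dimensional; for a handful of numbers the standard library is enough.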

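Q: How do you distinguish class attributes from instance attributes, and get or set the value of a class (or instance) attribute?

Notes:

* Name the attribute __xxx (two leading underscores) to mark it as private;

* Use the @property decorator to make the attribute readable;

* Use the @xx.setter decorator to make the attribute writable (its value can be changed);

* The difference between a class attribute and an instance attribute: an instance attribute is defined inside __init__(self) and carries the prefix "self".

Example

class CatII(object):
    __type = "Br" ## class attribute
    def __init__(self):
        self.__name = "" ## instance attribute

    ### instance attribute and its accessors
    @property
    def name(self): ## get the instance attribute
        return self.__name

    @name.setter
    def name(self, value): ## set the instance attribute to value
        self.__name = value

    ### class attribute and its accessors
    ### a property always receives the instance, so the parameter is named self, not cls
    @property
    def type(self): ## get the class attribute
        'get the class attribute value'
        return self.__class__.__type
    @type.setter
    def type(self, value): ## set the class attribute
        self.__class__.__type = value ## assigning to self.__type would create an instance attribute instead

PS: The code can be downloaded from InstanceVSClass on git.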
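
The difference is easiest to see with public attributes. In the sketch below (the class Cat and its attribute names are made up for illustration), assigning on the class changes the value for every instance, while assigning through an instance only shadows it for that one instance.

```python
class Cat(object):
    kind = "Br"                 # class attribute: one value shared by all instances

    def __init__(self, name):
        self.name = name        # instance attribute: one value per instance

first = Cat("Tom")
second = Cat("Kitty")

# Assigning on the class changes what every instance sees.
Cat.kind = "Persian"
shared_after_class_set = (first.kind, second.kind)

# Assigning through an instance creates a new instance attribute
# that shadows the class attribute for that instance only.
first.kind = "Siamese"
after_instance_set = (first.kind, second.kind, Cat.kind)

# vars() lists only the attributes stored on the instance itself.
first_own = sorted(vars(first))
second_own = sorted(vars(second))

print(shared_after_class_set)
print(after_instance_set)
print(first_own, second_own)
```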

Topic: Measuring time

Q: Printing the difference between two times

"""
output time difference in seconds.
"""
import datetime
import time

def getTimeDiff():
    start = datetime.datetime.now()
    time.sleep(5)
    end = datetime.datetime.now()
    diff = end - start
    print("diff:", diff)
    print("diff in seconds:", diff.total_seconds()) ## time difference in seconds; diff.seconds alone ignores whole days
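
The distinction between .seconds and .total_seconds() only shows up for differences of a day or more. The sketch below uses fixed datetimes (chosen for the example) so the result is reproducible, and also shows time.perf_counter(), a common choice for timing code on Python 3.

```python
import datetime
import time

# Fixed values, one day and five seconds apart.
start = datetime.datetime(2017, 9, 15, 22, 11, 0)
end = datetime.datetime(2017, 9, 16, 22, 11, 5)

diff = end - start
seconds_field = diff.seconds      # seconds component only, the days are left out
total = diff.total_seconds()      # whole duration in seconds, as a float

print(diff.days, seconds_field, total)

# For timing code, perf_counter() is a monotonic, high-resolution clock,
# so it is not affected by adjustments to the system clock.
t0 = time.perf_counter()
time.sleep(0.01)
elapsed = time.perf_counter() - t0
print("elapsed:", elapsed)
```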

Topic: Python reference books

Python Essential Reference, 4th Edition

References

[1] Liao Xuefeng's official site: https://www.liaoxuefeng.com/wiki/0014316089557264a6b348958f449949df42a6d3a2e542c000

[2] Keras Chinese documentation: https://keras-cn.readthedocs.io/en/latest/preprocessing/text/

[3] Storing objects locally with pickle: https://blog.linuxeye.cn/369.html
