
[Part 1] Using os.walk

2014-10-11 12:43
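os.walk(directoryPath) takes a directory path (a string) and returns a generator. For every directory in the tree rooted at that path it yields a 3-tuple: root (the path of the directory currently being visited, starting with the top directory), dirs (a list of the names of its subdirectories), and files (a list of the names of the files it contains).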
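
To make the shape of the yielded tuples concrete, here is a minimal sketch. It is written for Python 3 (the listings in this article are Python 2), and the directory layout, `build_demo_tree` and `walk_to_list` are hypothetical names made up for the illustration.

```python
import os
import tempfile


def walk_to_list(top):
    """Collect the (root, dirs, files) triples yielded by os.walk(top)."""
    result = []
    for root, dirs, files in os.walk(top):
        # Sorting dirs in place also makes the traversal order deterministic.
        dirs.sort()
        # Store root relative to top so the result does not depend on the temp path.
        result.append((os.path.relpath(root, top), list(dirs), sorted(files)))
    return result


def build_demo_tree(base):
    # Hypothetical layout, used only for this demonstration.
    os.makedirs(os.path.join(base, "season5", "extras"))
    for rel in ("readme.txt",
                os.path.join("season5", "e01.mkv"),
                os.path.join("season5", "extras", "notes.txt")):
        with open(os.path.join(base, rel), "w") as fh:
            fh.write("demo")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        build_demo_tree(base)
        for triple in walk_to_list(base):
            print(triple)
```

Each directory in the tree produces exactly one triple, and a directory without subdirectories (here `season5/extras`) reports an empty `dirs` list.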

Code 1-1.

import os

for root, dirs, files in os.walk('e://HIMYM//HIMYM-S5'):
    print root, dirs, files, '\n'

Output:

e://HIMYM//HIMYM-S5 [] ['How I Met Your Mother S05E01 Definitions 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E02 Double Date 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother
S05E03 Robin 101 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E04 The Sexless Innkeeper 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E05 Duel Citizenship 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E06 Bagpipes 720p WEB-DL DD5.1.mkv', 'How
I Met Your Mother S05E07 The Rough Patch 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E08 The Playbook 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E10 The Window 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E11 The Last Cigarette Ever 720p
WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E12 Girls VS. Suits 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E13 Jenkins 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E14 The Perfect Week 720p WEB-DL DD5.1 H264-PeeWee.mkv', 'How I Met Your Mother
S05E15 Rabbit or Duck 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E16 Hooked 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E17 Of Course 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E18 Say Cheese 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother
S05E19 Zoo or False 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E20 Home Wreckers 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E21 Twin Beds 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E22 Robots vs. Wrestlers 720p WEB-DL DD5.1.mkv', 'How
I Met Your Mother S05E23 The Wedding Bride 720p WEB-DL DD5.1.mkv', 'How I Met Your Mother S05E24 Doppelgangers 720p WEB-DL DD5.1-PeeWee.mkv'] 

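In Code 1-1, the argument 'e://HIMYM//HIMYM-S5' passed to os.walk is a directory that contains only files and no subdirectories, so dirs == [].

os.walk often runs into encoding problems with Chinese names in Python 2: when a directory or file name contains Chinese characters, printing the lists shows escaped bytes instead of readable text, as below:

Code 1-2.

import os

for root, dirs, files in os.walk('E:\\WORK_FILE\\Python\\Python2'):
    print root, dirs, files, '\n'

Output: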

>>> 

E:\WORK_FILE\Python\Python2 [] ['bkjw.py', 'calculator.py', 'cdclog.txt', 'cdctools.py', 'cdctools.pyc', 'class_login.py', 'class_test01.py', 'eight_queen.py', 'hehe.py', 'pycdc-v0.5.py',
'pyre_ebb9ce1c-e5e8-4219-a8ae-7ee620d5f9f1.png', 'renren.html', 'renren.py', 're_match.py', 're_test.py', 'szhxy\xd0\xde\xb8\xc4\xb0\xe6.py', 'szhxy\xd4\xad\xb0\xe6.py', 'table.html', 'test
(2).py', 'test.py', 'test0.py', 'test1.py', 'YaYa', 'YaYa.html', 'YaYa.txt', 'yy1.py', 'yy2.py', '\xd5\xbb.py', '\xc0\xe0\xb5\xc4\xbc\xcc\xb3\xd0.py', '\xb1\xe0\xc2\xeb\xce\xca\xcc\xe2.py', '\xbc\xc7\xca\xc2\xb1\xbe.py'] 

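Solution: printing dirs and files directly, as in the code above, produces the garbled output, because printing a list shows the repr() of each element, and the repr() of a Python 2 byte string escapes every non-ASCII byte. If you iterate over dirs and files and print each item on its own, the raw bytes go to the console and the names are displayed correctly:

Code 1-3:

import os

for root, dirs, files in os.walk('E:\\WORK_FILE\\Python\\Python2'):
    print 'root:', root, '\n'
    print 'directory:\n'
    for directory in dirs:
        print directory, '\n'
    print 'file:\n'
    for f in files:
        print f, '\n'

Partial output: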

>>> 

root: E:\WORK_FILE\Python\Python2 

directory:

file:

....

szhxy修改版.py 

szhxy原版.py 

...

栈.py 

类的继承.py 

编码问题.py 

记事本.py 
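
The difference between the two ways of printing can be reproduced without touching the file system. The sketch below is Python 3, where byte strings are a separate `bytes` type. The bytes are copied from the Code 1-2 listing, and GBK is an assumption about the encoding used by the Chinese Windows machine that produced them. `show_as_list` and `show_each` are hypothetical helper names.

```python
# Bytes taken from the Code 1-2 output; GBK is assumed to be the
# encoding of the machine that produced that listing.
RAW_NAMES = [b"test.py", b"\xd5\xbb.py"]


def show_as_list(names):
    """Mimic `print names`: a list is rendered with the repr() of each item."""
    return str(names)


def show_each(names, encoding="gbk"):
    """Mimic printing the items one by one on a console that reads the bytes as `encoding`."""
    return [name.decode(encoding) for name in names]
```

`show_as_list(RAW_NAMES)` contains the escaped form `\xd5\xbb`, while `show_each(RAW_NAMES)` gives back `栈.py`, the readable name seen in the Code 1-3 output.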

Practical code 1-3:



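# -*- coding: utf-8 -*-
# Python 2 code: the names returned by os.walk are byte strings here.
import os
import chardet  # third-party package, guesses the encoding of a byte string
import re


def list2str(file_list):
    """Format a list of strings as a single string.

    Names are separated by '|#|', with a line break after every five names.
    Returns 'null' for an empty list.
    """
    if not file_list:
        return 'null'
    tmp_file = ''
    for i, name in enumerate(file_list):
        if i % 5 == 0 and i != 0:
            tmp_file += '\n'
        tmp_file += name + '|#|'
    return tmp_file


def fileWalker(directory, save_file):
    """Walk `directory` and save the result of the traversal in `save_file`."""
    with open(save_file, 'w') as fp:
        for root, dirs, files in os.walk(directory):
            dir_text = list2str(dirs)
            file_text = list2str(files)
            fp.write('rootdir:' + root + '\n' +
                     'dirs----' + dir_text + '\n' +
                     'files----' + file_text + '\n')
            fp.write('+' * 20 + '\n' + '+' * 20 + '\n')


def Grep(directory, keyword):
    """Search `directory` for names that contain `keyword`.

    directory: the directory to search
    keyword:   the search keyword, a UTF-8 byte string
    Returns every matching directory and file below `directory`, one per
    line, prefixed with 'd:' for a directory and 'f:' for a file.
    """
    tmp_dir = ''
    # The keyword never changes, so its encoding is detected only once.
    is_ascii = chardet.detect(keyword)['encoding'] == 'ascii'
    for root, dirs, files in os.walk(directory):
        # dirs = list2str(dirs)
        # files = list2str(files)
        # re_find = re.compile(keyword)
        # re_find.findall(dirs)
        if not is_ascii:
            # Non-ASCII keyword: decode both sides and compare as unicode.
            ukeyword = keyword.decode('utf8')
            for dir_name in dirs:
                if chardet.detect(dir_name)['encoding'] == 'GB2312':
                    if ukeyword in dir_name.decode('GB2312'):
                        tmp_dir += 'd:' + os.path.join(root, dir_name) + '\n'
            for file_name in files:
                code = chardet.detect(file_name)['encoding']
                try:
                    if ukeyword in file_name.decode(code):
                        tmp_dir += 'f:' + os.path.join(root, file_name) + '\n'
                except (TypeError, LookupError, UnicodeDecodeError):
                    # Encoding not detected, not known, or wrong: skip the name.
                    pass
        else:
            for dir_name in dirs:
                if keyword in dir_name:
                    tmp_dir += 'd:' + os.path.join(root, dir_name) + '\n'
            for file_name in files:
                if keyword in file_name:
                    tmp_dir += 'f:' + os.path.join(root, file_name) + '\n'
    return tmp_dir


if __name__ == "__main__":
    # directory = "E:\\BaiduYunDownload"
    # fileWalker(directory, "E:\\WORK_FILE\\Python\\Python2\\cdclog.txt")
    dirs = Grep('E:\\BaiduYunDownload', '韩寒')
    print dirs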
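
For comparison, here is a rough Python 3 sketch of the same search. It relies on the fact that in Python 3 os.walk yields already-decoded `str` names when it is given a `str` path, so no encoding detection is needed. `find_names` is a hypothetical name, and returning a list of pairs instead of one long string is a design choice of this sketch, not part of the code above.

```python
import os
import tempfile


def find_names(directory, keyword):
    """Return (kind, path) pairs for every name under `directory` containing `keyword`.

    kind is 'd' for a directory and 'f' for a file, mirroring the prefixes used by Grep.
    """
    matches = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # deterministic traversal order
        for name in dirs:
            if keyword in name:
                matches.append(("d", os.path.join(root, name)))
        for name in sorted(files):
            if keyword in name:
                matches.append(("f", os.path.join(root, name)))
    return matches


if __name__ == "__main__":
    # Hypothetical demo tree; replace it with a real directory to search.
    with tempfile.TemporaryDirectory() as base:
        os.makedirs(os.path.join(base, "books", "hanhan_essays"))
        open(os.path.join(base, "books", "hanhan.txt"), "w").close()
        for kind, path in find_names(base, "hanhan"):
            print(kind + ":" + path)
```

Because the names are text rather than bytes, a keyword such as '韩寒' can be passed as an ordinary string and compared with `in` directly.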



                                            