您的位置:首页 > 编程语言 > Python开发

python标准库之文本

2013-04-27 18:58 309 查看
对于python来说,最显而易见的文本处理工具就所string类,不过除此之外,标准库中还提供了大量的其他工具,可以帮大家轻松的完成高级文本处理。

最近在看《python标准库》一书,所以下面就介绍集中常见的方法:

string——文本常量

  目前string模块中还有两个函数未移除:capwords()和maketrans(). capwords()的作用是将一个字符串中所有的单词的首字母大写。例如:

import string

s = "hello! this is test...."

print s
print string.capwords(s)


  其运行结果为:
       hello! this is test...

       Hello! This Is Test...

  maketrans()函数将创建转换表,可以用来结合translate()方法将一组字符修改为另一组字符,这种做法比反复调用replace()更为高效。例如:

import string

pass_code = string.maketrans("abcdefghi","123456789")
s = "the quick brown fox jumped over the lazy dog."

print s
print s.translate(pass_code)


  其运行结果为:

       the quick brown fox jumped over the lazy dog.

       t85 qu93k 2rown 6ox jump54 ov5r t85 l1zy 4o7.
  在这个例子中,一些字母被替换为相应的“火星文”数字(事先准备好的对应字母与数字:"abcdefghi","123456789")
  

备注:  translate()是字符的一一映射. 每个字符只要出现都会被替换为对应的字符.
     replace()是字符串替换, 字符串完整出现后被整体替换.replace的两个字符串参数长度可以不同.

textwrap——格式化文本段落

  以下例子中用到的范本:

    sample_text = """the textwrap module can be used to format text for otput in situations where pretty-printing s dsired, it offers                  programmatic functionality similar to the paragraph wrapping or filling features found in many text editors"""

  示例代码为:

import textwrap

sample_text = """the textwrap module   can be used to format text for  otput     in situations where pretty-printing s dsired,   it offers programmatic functionality similar to the paragraph wrapping or filling features found in many text editors
"""
print "Test:\n"
print textwrap.fill(sample_text,width=50)


  输出内容为:

  Test:

   the textwrap module can be used to format text
  for otput in situations where pretty-printing s
  dsired, it offers programmatic functionality
  similar to the paragraph wrapping or filling
  features found in many text editors

  fill()函数取文本作为输入,生成格式化的文本作为输出。

  虽然文本最后的结果为左对齐,不过只有第一行保留了缩进,其余各行前面的空格则嵌入到段落中。

  下面修改一下代码即可(dedent()函数执行去除缩进):

print textwrap.dedent(sample_text)


  输出内容为:

  Test:

  the textwrap module can be used to format text
  for otput in situations where pretty-printing s
  dsired, it offers programmatic functionality
  similar to the paragraph wrapping or filling
  features found in many text editors

除以上方法外还可以结合dedent和fill使用,可以去除缩进的文本传入fill(),并提供一组不同的width值。按照指定宽带结合dedent去除缩进现实。另外还可以进行悬挂缩进处理。示例代码分别为:

import textwrap

sample_text = """the textwrap module   can be used to format text for  otput     in situations where pretty-printing s dsired,   it offers programmatic functionality similar to the paragraph wrapping or filling features found in many text editors
"""
dedent_text = textwrap.dedent(sample_text).strip()
for width in [45,70]:
print "%d columns:\n" % width
print textwrap.fill(dedented_text,width=width)
print


import textwrap

sample_text = """the textwrap module   can be used to format text for  otput     in situations where pretty-printing s dsired,   it offers programmatic functionality similar to the paragraph wrapping or filling features found in many text editors
"""
dedent_text = textwrap.dedent(sample_text).strip()
print textwrap.fill(dedent_text,
initial_indent=" ",
subsequent_indent= " " *4,
width=50,
)


备注:strip()去除字符串开始及结束的空格符号等。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: