
Python Quick Start

2013-05-03 23:32

Python

Reference

diveintopython.org

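Python syntax highlights

Indentation is no longer a matter of looks; it is part of the syntax!

Function arguments: keyword arguments are supported, so argument order no longer matters!

Help text embedded in the code: DocStrings

Triple-quoted strings

while and for loops can take an else block

Swap assignment: a,b = b,a

The first parameter of a method in a class is special: it must be declared (self), but it is not supplied in the call (Python adds it automatically).

The class constructor is named __init__(self, ...)

Class variables and object variables

Everything is an object: even strings, variables, and functions are objects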

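Getting help

How do you get help?

1. Start the python command line

2. Import the module you want to look up, e.g.: import sys

3. List the attributes the module contains. Command: dir(sys)

4. Read the module's help, e.g.: help(sys)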

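Source file character set

To support Chinese, add a specially formatted comment on the first or second line of the source file (usually the second) to declare the file's character set. The default is 7-bit ASCII.

Format: # -*- coding: <encoding-name> -*-

See: http://www.python.org/dev/peps/pep-0263/

For example, to set the gbk encoding: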

#!/usr/bin/python

# -*- coding: gbk -*-

For example, to set the utf-8 encoding:

#!/usr/bin/python

# -*- coding: utf-8 -*-

Note: Emacs also recognizes this syntax, while Vim recognizes # vim:fileencoding=<encoding-name>

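Constants and variables

Variables

Variable naming rules are similar to C's

Valid variable names include: __my_name, name_23, a1b2_c3

Reserved keywords (names you cannot reuse)

and        def        exec       if         not        return

assert     del        finally    import     or         try

break      elif       for        in         pass       while

class      else       from       is         print      yield

continue   except     global     lambda     raise

There are no type declarations; a variable is simply used

Type overview / inspecting types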

int

>>> type(17)

<type 'int'>

float

>>> type(3.2)

<type 'float'>

long

>>> type(1L)

<type 'long'>

>>> type(long(1))

<type 'long'>

bool

True and False; note the capitalization

>>> type(True)

<type 'bool'>

>>> type(1>2)

<type 'bool'>

string

>>> type("Hello, World!")

<type 'str'>

>>> type("WorldHello"[0])

<type 'str'>

In other words, Python has no char type

list

>>> type(['a','b','c'])

<type 'list'>

>>> type([])

<type 'list'>

tuple

>>> type(('a','b','c'))

<type 'tuple'>

>>> type(())

<type 'tuple'>

dict

>>> type({'color1':'red','color12':'blue'})

<type 'dict'>

>>> type({})

<type 'dict'>

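Strings

Triple quotes

Triple quotes, ''' or """, can enclose text that spans several lines, and quotes inside need no escaping (the content may contain both newlines and quote characters).

For example: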

'''This is a multi-line string. This is the first line.

This is the second line.

"What's your name?," I asked.

He said "Bond, James Bond." '''

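Both single and double quotes can be used to create strings.

Note that single and double quotes behave identically, unlike in PHP or Perl

\ is the escape character; at the end of a line, \ continues the line

An r or R prefix introduces a raw string

For example: r"Newlines are indicated by \n."

When working with regular expressions, prefer raw strings to avoid doubling backslashes. For example, r'\1' is equivalent to '\\1'.

A u or U prefix introduces a Unicode string

For example: u"This is a Unicode string."

u and r can be combined, with u before r

For example, ur"\u0062\n" contains three characters: b (written as \u0062, which is still interpreted in a raw Unicode string), a backslash, and n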

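String concatenation: two string literals placed side by side are joined together

'What\'s ' "your name?" is automatically converted to "What's your name?" .

Benefit 1: fewer uses of \ as a line continuation.

Benefit 2: each piece of text can carry its own comment. For example: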

re.compile("[A-Za-z_]" # letter or underscore

"[A-Za-z0-9_]*" # letter, digit or underscore

)

Multi-line strings inside parentheses

>>> test= ("case 1: something;" # test case 1

... "case 2: something;" #test case 2

... "case 3: something." #test case 3

... )

>>> test

'case 1: something;case 2: something;case 3: something.'

sprintf-style string formatting

header1 = "Dear %s," % name

header2 = "Dear %(title)s %(name)s," % vars()
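The two formatting forms above can be sketched as follows. The values of title and name are hypothetical, and an explicit dictionary stands in for vars() so that the lookup is visible. print is called with a single argument so the sketch should run on both Python 2 and Python 3.

```python
# Hypothetical values used only for illustration.
title = "Dr."
name = "Bond"

# Positional form: one %s placeholder, one value on the right.
header1 = "Dear %s," % name

# Named form: each %(key)s is looked up in the mapping on the right.
header2 = "Dear %(title)s %(name)s," % {"title": title, "name": name}

print(header1)
print(header2)
```

Passing vars() instead of the explicit dictionary works the same way, because vars() returns a mapping of the names currently in scope.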

String operations

String slices

[n] : the (n+1)-th character of the string

print "WorldHello"[0]

str="WorldHello"

print str[len(str)-1]

[n:m] : returns the substring that starts at n and ends at m, including n but excluding m

>>> s = "0123456789"

>>> print s[0:5]

01234

>>> print s[3:5]

34

>>> print s[7:21]

789

>>> print s[:5]

01234

>>> print s[7:]

789

>>> print s[21:]

len : string length

len("WorldHello")

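String comparison

==, >, < can be used to compare strings

The string module

Warning: strings in Python cannot be modified; they are immutable

# Wrong! Strings cannot be modified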

greeting = "Hello, world!"

greeting[0] = 'J' # ERROR!

print greeting

# Can be rewritten as:

greeting = "Hello, world!"

newGreeting = 'J' + greeting[1:]

print newGreeting

Numbers

Integers and long integers

Floating-point numbers

Type conversion

int("32")

int(-2.3)

float(32)

float("3.14159")

str(3.14149)

ord('A') : returns the ASCII value of the letter 'A'

Compound types such as list, tuple, and dict are covered in later sections

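Local and global variables

A function can read the value of a global variable directly, without declaring it. Assigning to that name inside the function, however, creates a local variable, so the change is confined to the function.

Variables in a function that are not declared with global are local and do not affect the global variable's value

global declares a global variable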

#!/usr/bin/python

def func1():
    print "func1: local x is", x

def func2():
    x = 2
    print 'func2: local x is', x

def func3():
    global x
    print "func3: before change, x is", x
    x = 2
    print 'func3: changed x to', x

x = 1

print 'Global x is', x

func1()

print 'Global x is', x

func2()

print 'Global x is', x

func3()

print 'Global x is', x

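locals() and globals() are two special functions that return the local and the global variables

locals() returns a copy of the local variables; changes made to it are not written back

globals() returns the global namespace itself, so global variables can be modified through it

vars() with no arguments is equivalent to locals(). vars()['key'] = 'value' adds a variable dynamically only at module level, where the local and global namespaces are the same; inside a function the assignment has no reliable effect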
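A minimal sketch of the difference between globals() and locals(). The names counter, bump_with_globals and try_locals are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
counter = 0  # module-level (global) name


def bump_with_globals():
    # globals() returns the real module namespace, so the write is visible.
    globals()["counter"] = globals()["counter"] + 1


def try_locals():
    value = 1
    # Inside a function, writing to the dict returned by locals()
    # does not change the real local variable.
    locals()["value"] = 99
    return value


bump_with_globals()
print(counter)
print(try_locals())
```

The first print shows the updated global, while the second still shows the original local value.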

Compound types

string/unicode (strings)

list (lists)

Lists built with square brackets

[10, 20, 30, 40]

["spam", "bungee", "swallow"]

["hello", 2.0, 5, [10, 20]]

Lists built with the range function

>>> range(1,5)

[1, 2, 3, 4]

From 1 to 5, including 1 but excluding 5. (The implicit step is 1.)

>>> range(10)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

From 0 to 10, including 0 but excluding 10. (The implicit step is 1.)

>>> range(1, 10, 2)

[1, 3, 5, 7, 9]

A step of 2

Accessing list elements

Similar to array subscripts

print numbers[0]

numbers[1] = 5

The print statement displays lists

vocabulary = ["ameliorate", "castigate", "defenestrate"]

numbers = [17, 123]

empty = []

print vocabulary, numbers, empty

['ameliorate', 'castigate', 'defenestrate'] [17, 123] []

List operations

List length

The len() function

+ (concatenation)

>>> a = [1, 2, 3]

>>> b = [4, 5, 6]

>>> c = a + b

>>> print c

[1, 2, 3, 4, 5, 6]

* (repetition)

>>> [0] * 4

[0, 0, 0, 0]

>>> [1, 2, 3] * 3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

List slices

See String slices

Lists are mutable

Unlike str strings, a list can be modified

>>> fruit = ["banana", "apple", "quince"]

>>> fruit[0] = "pear"

>>> fruit[-1] = "orange"

>>> print fruit

['pear', 'apple', 'orange']

>>> list = ['a', 'b', 'c', 'd', 'e', 'f']

>>> list[1:3] = ['x', 'y']

>>> print list

['a', 'x', 'y', 'd', 'e', 'f']

Adding elements to a list

>>> list = ['a', 'd', 'f']

>>> list[1:1] = ['b', 'c']

>>> print list

['a', 'b', 'c', 'd', 'f']

>>> list[4:4] = ['e']

>>> print list

['a', 'b', 'c', 'd', 'e', 'f']

Removing elements from a list

Removing by assigning an empty list

>>> list = ['a', 'b', 'c', 'd', 'e', 'f']

>>> list[1:3] = []

>>> print list

['a', 'd', 'e', 'f']

Using the del keyword

>>> a = ['one', 'two', 'three']

>>> del a[1]

>>> a

['one', 'three']

>>> list = ['a', 'b', 'c', 'd', 'e', 'f']

>>> del list[1:5]

>>> print list

['a', 'f']

Viewing a list's id

>>> a = [1, 2, 3]

>>> b = [1, 2, 3]

>>> print id(a), id(b)

418650444 418675820

>>> b = a

>>> print id(a), id(b)

418650444 418650444

>>> b = a[:]

>>> print id(a), id(b)

418650444 418675692

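References and copy/clone

With b = a, both variables point to the same object, so a change made through one is visible through the other

With b = a[:], a clone is created; b and a point to different objects and are independent of each other

A list passed as a function argument is passed by reference, so changes the function makes to the list affect the list object itself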
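The three points above can be sketched in a few lines. The names alias, clone and add_zero are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
a = [1, 2, 3]
alias = a      # second name for the same list object
clone = a[:]   # shallow copy: a new list holding the same items

a.append(4)    # visible through alias, not through clone


def add_zero(items):
    # The function receives a reference to the caller's list,
    # so the append changes the caller's object.
    items.append(0)


add_zero(clone)

print(alias)
print(clone)
print(alias is a)
print(clone is a)
```

Note that a[:] copies only the outer list; nested lists inside it would still be shared.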

Nested lists and matrices

Nesting

>>> list = ["hello", 2.0, 5, [10, 20]]

>>> list[3][1]

20

Matrices

>>> matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

>>> matrix[1]

[4, 5, 6]

>>> matrix[1][1]

5

Strings and lists

The string.split function

>>> import string

>>> song = "The rain in Spain..."

>>> string.split(song)

['The', 'rain', 'in', 'Spain...']

>>> string.split(song, 'ai')

['The r', 'n in Sp', 'n...']

The string.join function

>>> list = ['The', 'rain', 'in', 'Spain...']

>>> string.join(list)

'The rain in Spain...'

>>> string.join(list, '_')

'The_rain_in_Spain...'

>>> list = ['The', 'rain', 'in', 'Spain...']

>>> '|'.join(list)

'The|rain|in|Spain...'

Tuples

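Creating tuples with parentheses

Wrap the values in parentheses

>>> type((1,2,3))

<type 'tuple'>

The values must be separated by commas; a single-element tuple needs a trailing comma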

>>> type((1))

<type 'int'>

>>> type((1,))

<type 'tuple'>

>>> type(('WorldHello'))

<type 'str'>

>>> type(('WorldHello',))

<type 'tuple'>

Tuple vs list

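The difference between a tuple and a list: a tuple cannot be modified, whereas a list can

A single element in square brackets makes a list, but a single element in parentheses is not a tuple unless it is followed by a comma, as in (1,)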

>>> type([1])

<type 'list'>

>>> type((1))

<type 'int'>

Dictionaries (hash tables)

Creating a dictionary with curly braces

Perl calls this type a hash or an associative array: an array whose subscripts can be strings

>>> eng2sp = {}

>>> eng2sp['one'] = 'uno'

>>> eng2sp['two'] = 'dos'

>>> print eng2sp

{'one': 'uno', 'two': 'dos'}

Accessing dictionary elements: the subscript is a string

>>> print eng2sp

{'one': 'uno', 'three': 'tres', 'two': 'dos'}

>>> print eng2sp['two']

dos

Dictionary operations

keys() method: returns a list of the keys

>>> eng2sp.keys()

['one', 'three', 'two']

values() method: returns a list of the values

>>> eng2sp.values()

['uno', 'tres', 'dos']

items() method: returns a list of key-value tuples

>>> eng2sp.items()

[('one','uno'), ('three', 'tres'), ('two', 'dos')]

from MoinMoin.util.chartypes import _chartypes

for key, val in _chartypes.items():
    if not vars().has_key(key):
        vars()[key] = val

has_key() method: returns a Boolean

>>> eng2sp.has_key('one')

True

>>> eng2sp.has_key('deux')

False

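get() method

Returns the value for a key in the dictionary

For example, eng2sp.get('one')

is equivalent to eng2sp['one'] when the key exists; for a missing key, get() returns None instead of raising KeyError

get() accepts a default value, which is returned when the key is not defined

For example, eng2sp.get('none', 0) returns 0 rather than None when 'none' is not defined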
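A short sketch of the three lookup styles, reusing the eng2sp dictionary from the examples above. The variable names found, missing, fallback and raised are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
eng2sp = {'one': 'uno', 'two': 'dos'}

found = eng2sp.get('one')          # key exists: same as eng2sp['one']
missing = eng2sp.get('none')       # key missing, no default: None
fallback = eng2sp.get('none', 0)   # key missing, default supplied: 0

# Plain subscripting raises KeyError for a missing key.
try:
    eng2sp['none']
    raised = False
except KeyError:
    raised = True

print(found)
print(missing)
print(fallback)
print(raised)
```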

References and copy/clone

Cloning a dictionary: the copy() method

>>> opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}

>>> copy = opposites.copy()

Iterators

The type function returns a variable's type

isinstance(varname, type({}))

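Statements

A statement on its own line needs no terminating semicolon!

Multiple statements on one line must be separated by semicolons;

Use "\" to join lines explicitly

For example: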

i=10

print \

i

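Implicit line joining

Content inside square brackets, parentheses, or curly braces can span several lines without \; the lines are joined implicitly

For example: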

month_names = ['Januari', 'Februari', 'Maart', # These are the

'April', 'Mei', 'Juni', # Dutch names

'Juli', 'Augustus', 'September', # for the months

'Oktober', 'November', 'December'] # of the year

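Indentation

Whitespace (spaces, tabs) at the start of a statement is significant!

Statements with the same indentation form one logical block

Wrong indentation causes an error!

Indentation is measured in spaces. A tab is converted to 1-8 spaces so that the total number of spaces is a multiple of 8.

The empty statement pass

def someFunction():
    pass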

Operators and expressions

** is exponentiation

3 ** 4 gives 81 (i.e. 3 * 3 * 3 * 3)

// is floor division

4 // 3.0 gives 1.0

% is the remainder

-25.5 % 2.25 gives 1.5 .

<< left shift

>> right shift

<, >, <=, >=, ==, != work as in C

Comparisons can be chained. For example:

if 0 < x < 10:
    print "x is a positive single digit."

~, &, |, ^ are the same as in C

5 & 3 gives 1.

5 | 3 gives 7.

5 ^ 3 gives 6

~5 gives -6

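Bitwise inversion: ~x is equivalent to -(x+1)

and, or, not are logical AND, OR, NOT

if 0 < x and x < 10:
    print "x is a positive single digit."

is and is not test whether two names refer to the same object

Two references denote the same object only when their IDs are equal.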

is: id(obj1) == id(obj2)

is not: id(obj1) != id(obj2)
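To make the distinction between equality and identity concrete, here is a minimal sketch. The lists a, b and c are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # another name for the same object

print(a == b)            # compares values
print(a is b)            # compares identity
print(a is c)
print(id(a) == id(c))    # what "is" checks under the hood
```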

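in, not in test membership

'a' in ['a', 'b', 'c'] # True

Swap assignment a,b = b,a

To swap the values of a and b, other languages may need a temporary variable

temp=a

a=b

b=temp

Python has a swap-assignment idiom: a,b = b,a

Control statements

The if statement

if ... elif ... else, example: (note the colons and indentation)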

#!/usr/bin/python

# Filename : if.py

number = 23

guess = int(raw_input('Enter an integer : '))

if guess == number:
    print 'Congratulations, you guessed it.' # new block starts here
    print "(but you don't win any prizes!)" # new block ends here
elif guess < number:
    print 'No, it is a little higher than that.' # another block
    # You can do whatever you want in a block ...
else:
    print 'No, it is a little lower than that.'
    # you must have guess > number to reach here

print 'Done'

# This last statement is always executed, after the if statement

# is executed.

Note: there is no switch ... case statement!

The while loop

while ... [else ...], example: (else is optional)

#!/usr/bin/python

# Filename : while.py

number = 23

stop = False

while not stop:
    guess = int(raw_input('Enter an integer : '))
    if guess == number:
        print 'Congratulations, you guessed it.'
        stop = True # This causes the while loop to stop
    elif guess < number:
        print 'No, it is a little higher than that.'
    else: # you must have guess > number to reach here
        print 'No, it is a little lower than that.'
else:
    print 'The while loop is over.'
    print 'I can do whatever I want here.'

print 'Done.'

The break and continue statements

break exits the loop without running the else block

The for loop

for ... else ..., example: (else is optional)

#!/usr/bin/python

# Filename : for.py

for i in range(1, 5):
    print i
else:
    print 'The for loop is over.'

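range(1,5) is equivalent to range(1,5,1); the third argument is the step

range stops at the second argument but does not include it

The break and continue statements

break exits the loop without running the else block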
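A small sketch of how break interacts with the else block of a loop. The function first_negative is hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break            # leaving via break skips the else block
    else:
        result = None        # runs only when the loop ended without break
    return result


print(first_negative([3, -1, 2]))
print(first_negative([1, 2]))
```

The else block acts as the "nothing found" branch of a search loop, which avoids a separate flag variable.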

Trailing for clauses (list comprehensions)

[ name for name in wikiaction.__dict__ ]

actions = [name[3:] for name in wikiaction.__dict__ if name.startswith('do_')]
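The wikiaction module in the snippet above is not shown here, so the sketch below uses a hypothetical class Actions as a stand-in to show what the comprehension produces. It is written to run on both Python 2 and Python 3.

```python
class Actions:
    # Hypothetical stand-in for the wikiaction module used above.
    def do_edit(self):
        pass

    def do_save(self):
        pass

    def helper(self):
        pass


# Keep only the names that start with 'do_', sorted for a stable order.
names = sorted(name for name in Actions.__dict__ if name.startswith('do_'))

# Strip the 3-character 'do_' prefix from each name.
actions = [name[3:] for name in names]

print(actions)
```

The if clause filters the items, and the expression before for transforms each item that passes the filter.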

Example

Characters in a string

prefixes = "JKLMNOPQ"

suffix = "ack"

for letter in prefixes:
    print letter + suffix

Functions

Function declaration

The def keyword

The function name

Parentheses and parameters

A colon

For example:

#!/usr/bin/python

# Filename : func_param.py

def printMax(a, b):
    if a > b:
        print a, 'is maximum'
    else:
        print b, 'is maximum'

printMax(3, 4) # Directly give literal values

Default parameter values

Just as in C++

#!/usr/bin/python

# Filename : func_default.py

def say(s, times = 1):
    print s * times

say('Hello')

say('World', 5)

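Keyword arguments

Languages such as C++ have this annoyance: with a long list of parameters that all have defaults, changing just one of the later ones still requires supplying all the earlier ones. Python calls this positional argument passing.

A distinctive Python feature is passing arguments by keyword

For example: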

#!/usr/bin/python

# Filename : func_key.py

def func(a, b=5, c=10):
    print 'a is', a, 'and b is', b, 'and c is', c

func(3, 7)

func(25, c=24)

func(c=50, a=100)

Variable arguments

A parameter prefixed with * or ** receives the extra arguments as a tuple or a dictionary, respectively

Example 1

#!/usr/bin/python

def sum(*args):
    '''Return the sum of the args.'''
    total = 0
    for i in range(0, len(args)):
        total += args[i]
    return total

print sum(10, 20, 30, 40, 50)
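The example above covers only the * form. The following sketch adds the ** form; the function describe and its arguments are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
def describe(*args, **kwargs):
    '''Summarize the extra positional and keyword arguments.'''
    # args is a tuple of the positional arguments,
    # kwargs is a dict of the keyword arguments.
    return (type(args).__name__, len(args), sorted(kwargs.keys()))


result = describe(10, 20, color='red', size=3)
print(result)
```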

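Function return values

The return statement supplies a function's return value

Without return, the function returns None

DocStrings

DocStrings provide help for a function

A string that starts on the first line inside a function is its DocString

DocStrings usually span several lines

A DocString is a multi-line string enclosed in triple quotes

The first line is a summary

The second line is blank

The detailed description starts on the third line

The existence of DocStrings shows that functions are objects too

A function's __doc__ attribute holds its DocString

For example, print printMax.__doc__ prints the DocString of the printMax function

help() displays help by reading the function's DocString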
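A minimal sketch of the DocString layout described above: a summary line, a blank line, then the details. The function larger is hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
def larger(a, b):
    '''Return the larger of two values.

    The values are compared with the > operator, so they
    must be of types that can be ordered against each other.'''
    if a > b:
        return a
    return b


# The DocString is stored on the function object itself.
print(larger.__doc__)
```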

Lambda Forms

A lambda form creates and returns a new, anonymous function, which makes it useful for building function factories

Example
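No example follows in the text, so here is a small sketch of a function factory built with lambda. The names make_repeater and twice are illustrative, and the code is written to run on both Python 2 and Python 3.

```python
def make_repeater(n):
    # Build and return a new function that multiplies its argument by n.
    return lambda s: s * n


twice = make_repeater(2)

print(twice('word'))
print(twice(5))
```

Because * works on both strings and numbers, the same generated function repeats a string or doubles a number.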

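Built-in functions and objects

Help: import __builtin__; help (__builtin__)

Functions

Math / logic / algorithms

abs(number) : absolute value

cmp(x,y) : compares x and y. Returns a negative number, 0, or a positive number when x is less than, equal to, or greater than y

divmod(x, y) -> (div, mod) : returns the quotient and the remainder

pow(x, y[, z]) -> number

round(number[, ndigits]) -> floating point number : rounds to ndigits decimal places

sum(sequence, start=0) -> value : sum of the sequence

hex(number) -> string : hexadecimal representation

oct(number) -> string : octal representation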

len(object) -> integer

max(sequence) -> value

min(sequence) -> value

range([start,] stop[, step]) -> list of integers

>>> range(10)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

filter(function or None, sequence) -> list, tuple, or string

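function is applied to each element of sequence, and the elements for which it returns true are kept. The result has the same type as sequence when that is a string or tuple, and is a list otherwise.

If function is None, the elements that are themselves true are returned

map(function, sequence[, sequence, ...]) -> list

Applies the function to each element of sequence and builds a list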

>>> map(lambda x : x*2, [1,2,3,4,5])

[2, 4, 6, 8, 10]

reduce(function, sequence[, initial]) -> value

Applies the function to the sequence from left to right, reducing the sequence to a single value.

>>> reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])

15

Equivalent to ((((1+2)+3)+4)+5)

sorted(iterable, cmp=None, key=None, reverse=False) : sorting

zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

>>> zip('1234','789')

[('1', '7'), ('2', '8'), ('3', '9')]

coerce(x, y) -> (x1, y1)

Return a tuple consisting of the two numeric arguments converted to a common type, using the same rules as used by arithmetic operations. If coercion is not possible, raise TypeError.

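Strings

chr(i) : 0<=i<256, returns the character whose ASCII code is i

unichr(i) -> Unicode character : returns a Unicode character. 0 <= i <= 0x10ffff

ord(c) : returns the ASCII code of character c

Object-related

delattr(object,name) : deletes the attribute name from object

delattr(x, 'y') is equivalent to del x.y

getattr(object, name[, default]) -> value

getattr(x, 'y') is equivalent to x.y

The default is the value returned when the object lacks the attribute

hasattr(object, name) -> bool

id(object) -> integer : returns the object's ID, which corresponds to its memory address

hash(object) -> integer : two objects with equal values have the same hash, but the converse does not necessarily hold.

setattr(object, name, value) : equivalent to the assignment x.y = v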

isinstance(object, class-or-type-or-tuple) -> bool

issubclass(C, B) -> bool

globals() -> dictionary

locals() -> dictionary

vars([object]) -> dictionary

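With no argument, equivalent to locals()

With an object as argument, equivalent to object.__dict__

dir([object]) : lists the object's attributes

repr(object) -> string : the canonical string representation of object

reload(module) -> module : reloads module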

iter

iter(collection) -> iterator

Get an iterator from an object. In the first form, the argument must

supply its own iterator, or be a sequence.

iter(callable, sentinel) -> iterator

In the second form, the callable is called until it returns the sentinel.

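Input and output

input([prompt]) -> value : input. Equivalent to eval(raw_input(prompt)).

raw_input([prompt]) -> string : returns the input unprocessed, as a string

Other

__import__(name, globals, locals, fromlist) -> module : loads a module dynamically

The module in import module cannot be a variable. To load a module whose name is held in a variable, use the approach below.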

def importName(modulename, name):
    """ Import name dynamically from module

    Used to do dynamic import of modules and names that you know their
    names only in runtime.

    Any error raised here must be handled by the caller.

    @param modulename: full qualified module name, e.g. x.y.z
    @param name: name to import from modulename
    @rtype: any object
    @return: name from module
    """
    module = __import__(modulename, globals(), {}, [name])
    return getattr(module, name)

callable(object) : whether the object can be called, such as a function. Instances can be callable too.

compile(source, filename, mode[, flags[, dont_inherit]]) -> code object

eval(source[, globals[, locals]]) -> value

Evaluates code; source can be an expression given as a string, or a code object returned by compile

execfile(filename[, globals[, locals]])

intern(string) -> string

Objects

basestring

str

unicode

buffer

classmethod

complex

dict

enumerate

file

float

frozenset

int

bool

list

long

property

reversed

set

slice

staticmethod

super

tuple

type

xrange

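Input and output

Input: raw_input vs input

Prefer raw_input

name = raw_input ("What...is your name? ")

input evaluates what is typed as a Python expression, so it is only suitable for entering numbers and other literals

age = input ("How old are you? ")

If the text entered is not a valid expression, an exception is raised and the program exits!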

Files

Opening a file

Read

>>> f = open("test.dat","r")

Write

>>> f = open("test.dat","w")

>>> print f

<open file 'test.dat', mode 'w' at fe820>

write("content"): writes to the file

>>> f.write("Now is the time")

>>> f.write("to close the file")

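read(count): reads from the file

Read all data (the file opened for writing must first be closed and reopened for reading)

>>> f.close()

>>> f = open("test.dat","r")

>>> text = f.read()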

>>> print text

Now is the timeto close the file

Read a fixed amount of data

>>> f = open("test.dat","r")

>>> print f.read(5)

Now i

Detecting end of file: the data read is empty

readline(): reads one line, including the trailing newline

readlines(): reads all lines into a list

Closing a file

>>> f.close()

Example

def copyFile(oldFile, newFile):
    f1 = open(oldFile, "r")
    f2 = open(newFile, "w")
    while True:
        text = f1.read(50)
        if text == "":
            break
        f2.write(text)
    f1.close()
    f2.close()
    return

% formatted output

Between numbers, % is the remainder operator.

When the left operand of % is a string, it performs formatting like C's printf.

Example

>>> cars = 52

>>> "In July we sold %d cars." % cars

'In July we sold 52 cars.'

>>> "In %d days we made %f million %s." % (34,6.1,'dollars')

'In 34 days we made 6.100000 million dollars.'

pickle and cPickle

Comparable to serialization in C++

Example

>>> import pickle

>>> f = open("test.pck","w")

>>> pickle.dump(12.3, f)

>>> pickle.dump([1,2,3], f)

>>> f.close()

>>> f = open("test.pck","r")

>>> x = pickle.load(f)

>>> x

12.3

>>> type(x)

<type 'float'>

>>> y = pickle.load(f)

>>> y

[1, 2, 3]

>>> type(y)

<type 'list'>

Using cPickle

cPickle is implemented in C and is faster

Comparing the running time of the two

bash$ x=1; time while [ $x -lt 20 ]; do x=`expr $x + 1`; ./pickle.py ; done

real 0m5.743s

user 0m2.368s

sys 0m2.932s

bash$ x=1; time while [ $x -lt 20 ]; do x=`expr $x + 1`; ./cpickle.py ; done

real 0m3.826s

user 0m2.194s

sys 0m1.958s

cPickle example

#!/usr/bin/python

# Filename: pickling.py

import cPickle

shoplistfile = 'shoplist.data' # The name of the file we will use

shoplist = ['apple', 'mango', 'carrot']

# Write to the storage

f = file(shoplistfile, 'w')

cPickle.dump(shoplist, f) # dump the data to the file

f.close()

del shoplist # Remove shoplist

# Read back from storage

f = file(shoplistfile)

storedlist = cPickle.load(f)

print storedlist

Pipes

os.popen('ls /etc').read()

os.popen('ls /etc').readlines()

About Python

Python links

http://www.python.org

wxPython

Boa

Eclipse

Python version

2.4.3

About this article

References

A Byte of Python, by Swaroop C H

How to Think Like a Computer Scientist: Learning with Python

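Object orientation: programming with classes

Even strings, variables, and functions are objects

Concepts

class and object

A class is a new type created with the class keyword

An object is an instance of that class

fields and methods

Variables defined in a class are called fields

Functions defined in a class are called methods

Two kinds of fields

instance variables

Belong to each individual instance of the class

class variables

Belong to the class itself

How a method differs from a function

The first parameter of a method is special

It must be declared in the method definition, but it must not be supplied in the call

This parameter refers to the object itself and is conventionally named self

Python supplies it automatically at call time

For example, given MyObject, an instance of MyClass:

For the call MyObject.method(arg1, arg2), Python automatically calls MyClass.method(MyObject, arg1,arg2).

Class variables and object variables

Suppose variables var1 and var2 are defined in class ClassName

If var1 is accessed as ClassName.var1, it is a class variable, shared by all instances of the class

If var2 is accessed as self.var2, it is an object variable, separate from those of other objects

Example

Class Person: each time a person is added, the class variable population is incremented

Code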

#!/usr/bin/python

# Filename: objvar.py

class Person:
    '''Represents a person.'''
    population = 0

    def __init__(self, name):
        '''Initializes the person.'''
        self.name = name
        print '(Initializing %s)' % self.name

        # When this person is created,
        # he/she adds to the population
        Person.population += 1

    def sayHi(self):
        '''Greets the other person.

        Really, that's all it does.'''
        print 'Hi, my name is %s.' % self.name

    def howMany(self):
        '''Prints the current population.'''
        # There will always be at least one person
        if Person.population == 1:
            print 'I am the only person here.'
        else:
            print 'We have %s persons here.' % \
                  Person.population

swaroop = Person('Swaroop')

swaroop.sayHi()

swaroop.howMany()

kalam = Person('Abdul Kalam')

kalam.sayHi()

kalam.howMany()

swaroop.sayHi()

swaroop.howMany()

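Constructors and destructors

The __init__ constructor

Runs automatically when the object is created.

The __del__ destructor

Runs automatically when the object is destroyed, that is, once no references to it remain; del only removes one reference.

Other class methods

Most are related to operator overloading

__lt__(self, other)

Overloads <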

__getitem__(...)

x.__getitem__(y) <==> x[y]

Overloads [key]

__len__(self)

Overloads the len() function

__str__(self)

The text printed by print object

__iter__(self)

Supports iteration by returning an object that has a next() method. If the class itself defines next(), __iter__ can simply return self

__getattribute__(...)

x.__getattribute__('name') <==> x.name
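A compact sketch that exercises several of the special methods listed above. The class Deck and its contents are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
class Deck(object):
    '''Hypothetical container used to illustrate special methods.'''

    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):              # called by len(deck)
        return len(self.cards)

    def __getitem__(self, index):   # called by deck[index]
        return self.cards[index]

    def __str__(self):              # called by str(deck) and print
        return 'Deck(%s)' % ', '.join(self.cards)

    def __lt__(self, other):        # called by deck1 < deck2
        return len(self) < len(other)


small = Deck(['A', 'K'])
big = Deck(['A', 'K', 'Q'])

print(len(small))
print(big[2])
print(str(small))
print(small < big)
```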

Class inheritance

Syntax: put the base class in parentheses in the subclass declaration

Example

# Filename: inheritance.py

class SchoolMember:
    '''Represents any school member.'''
    def __init__(self, name, age):
        self.name = name
        self.age = age
        print '(Initialized SchoolMember: %s)' % self.name

    def tell(self):
        print 'Name:"%s" Age:"%s" ' % (self.name, self.age),

class Teacher(SchoolMember):
    '''Represents a teacher.'''
    def __init__(self, name, age, salary):
        SchoolMember.__init__(self, name, age)
        self.salary = salary
        print '(Initialized Teacher: %s)' % self.name

    def tell(self):
        SchoolMember.tell(self)
        print 'Salary:"%d"' % self.salary

class Student(SchoolMember):
    '''Represents a student.'''
    def __init__(self, name, age, marks):
        SchoolMember.__init__(self, name, age)
        self.marks = marks
        print '(Initialized Student: %s)' % self.name

    def tell(self):
        SchoolMember.tell(self)
        print 'Marks:"%d"' % self.marks

t = Teacher('Mrs. Abraham', 40, 30000)
s = Student('Swaroop', 21, 75)

print # prints a blank line

members = [t, s]
for member in members:
    member.tell()

# Works for instances of Student as well as Teacher

Exception handling

Try..Except

In the Python interpreter, enter s = raw_input('Enter something --> '),

then press Ctrl-D or Ctrl-C and see what is displayed.

Use Try..Except to catch the bad input. Example

#!/usr/bin/python

# Filename: try_except.py

import sys

try:
    s = raw_input('Enter something --> ')
except EOFError:
    print '\nWhy did you do an EOF on me?'
    sys.exit() # Exit the program
except:
    print '\nSome error/exception occurred.'
    # Here, we are not exiting the program

print 'Done'

Try..Finally

finally: introduces a block that is executed no matter what
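A minimal sketch showing that the finally block runs both when the try block succeeds and when it raises. The names log and read_value are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
log = []


def read_value(text):
    try:
        log.append('start')
        return int(text)        # may raise ValueError
    finally:
        # Runs on normal return and on exception alike.
        log.append('cleanup')


read_value('42')

try:
    read_value('oops')
except ValueError:
    log.append('error')

print(log)
```

This is the usual place for cleanup work such as closing a file, because it cannot be skipped by an early return or an error.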

Raising Exceptions

To define your own exception, create a subclass of Exception

Example of creating your own exception class, ShortInputException

#!/usr/bin/python

# Filename: raising.py

class ShortInputException(Exception):
    '''A user-defined exception class.'''
    def __init__(self, length, atleast):
        self.length = length
        self.atleast = atleast

Raising and catching the exception

try:
    s = raw_input('Enter something --> ')
    if len(s) < 3:
        raise ShortInputException(len(s), 3)
    # Other work can go as usual here.
except EOFError:
    print '\nWhy did you do an EOF on me?'
except ShortInputException, x:
    print '\nThe input was of length %d, it should be at least %d' \
          % (x.length, x.atleast)
else:
    print 'No exception was raised.'

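Modules and packages

Example

a.py example

# -*- python -*-

version = "0.1.a"

author = "unknown" # placeholder value; b.py below reads a.author

b.py uses a.py as a module

a.py and b.py are in the same directory

Direct import

Variables and functions defined in a.py are referenced through the namespace of module a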

import a

print "version:%s, author:%s" % (a.version, a.author)

Using the from module import syntax

Variables and functions defined in a.py behave as if they were defined in b.py

from a import *

print "version:%s, author:%s" % (version, author)

from a import author

# import only one variable from module a

print "author:", author

Copy a.py into the directory dir_a

Modify sys.path so that it includes dir_a

import sys

sys.path.insert(0, "dir_a")

import a

print "author:", a.author

import sys

sys.path.insert(0, "dir_a")

from a import *

print "version:%s, author:%s" % (version, author)

Using dir_a as a package

See: python.org > Doc > Essays > Packages

Create the file __init__.py in the dir_a directory (an empty file is enough)

from dir_a import a

# import module a from the package dir_a

print "author:", a.author

# b.py

from dir_a.a import *

print "version:%s, author:%s" % (version, author)

Notes

Module files are *.py files

Module files are located in the directories listed in PYTHONPATH; use print sys.path to view them

import sys

print sys.path

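After a module is first imported, it is compiled to a *.pyc bytecode file to speed up later loading

The import statement brings in modules

Syntax 1: "import" module ["as" name] ( "," module ["as" name] )*

Syntax 2: "from" module "import" identifier ["as" name] ( "," identifier ["as" name] )*

The __name__ variable

Every module has a name, and statements in the module can obtain it through the __name__ attribute.

When a module is run directly, __name__ is set to __main__

For example, with the following statements in a module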

#!/usr/bin/python

# Filename: using_name.py

if __name__ == '__main__':
    print 'This program is being run by itself'
else:
    print 'I am being imported from another module'

__dict__

Modules, classes, and class instances all have __dict__ attributes that holds the namespace contents for that object.

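The dir() function

Lists the names defined in a module

About packages

Packages organize modules more effectively.

The __init__.py file marks a directory as a Python package rather than an ordinary directory

__init__.py can be empty

__init__.py can contain the __all__ variable

A package is a directory that contains *.py module files together with an __init__.py file

One problem: because Mac, Windows, and other systems do not distinguish case in file names, from package import * cannot reliably map file names to module names

The __all__ variable is one solution

For the example above, define in __init__.py

__all__ = ["a"]

Then from dir_a import * imports the modules listed in __all__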

sys, os: Python core libraries

Python library

sys

System information: sys.platform, sys.version_info, sys.maxint

>>> import sys

>>> sys.version

'2.4.1 (#1, May 27 2005, 18:02:40) \n[GCC 3.3.3 (cygwin special)]'

>>> sys.version_info

(2, 4, 1, 'final', 0)

>>> sys.platform, sys.maxint

('linux2', 9223372036854775807)

Python's module search path: sys.path

Show the search path: sys.path

Add a directory to the module search path: sys.path.append( '/home/user')

Exception information: sys.exc_info(), sys.exc_type

>>> try:
...     raise IndexError
... except:
...     print sys.exc_info()

try:
    raise TypeError, "Bad Thing"
except:
    print sys.exc_info()
    print sys.exc_type, sys.exc_value

Command-line arguments: sys.argv

Number of arguments: len(sys.argv), which counts the program name itself

sys.argv[0] is the program name, sys.argv[1] is the first argument, and so on

Example 1

def main(arg1, arg2):
    """main entry point"""
    ... ...

if __name__ == '__main__':
    if len(sys.argv) < 3:
        sys.stderr.write("Usage: %s ARG1 ARG2\n" % (sys.argv[0]))
    else:
        main(sys.argv[1], sys.argv[2])

Example 2

#!/usr/bin/python

# Filename : using_sys.py

import sys

print 'The command line arguments used are:'

for i in sys.argv:
    print i

print '\n\nThe PYTHONPATH is', sys.path, '\n'

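Exiting: sys.exit

Standard input, output, and error: sys.stdin, sys.stdout, sys.stderr

os

Separators: os.sep, os.pathsep, os.linesep

Get the process ID: os.getpid()

Get the current directory: os.getcwd()

Change directory: os.chdir(r'c:\temp')

Split a path into directory and file name: os.path.split(), os.path.dirname()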

>>> os.path.split('/home/swaroop/poem.txt')

('/home/swaroop', 'poem.txt')

os.path.dirname('/etc/init.d/apachectl')

os.path.basename('/etc/init.d/apachectl')

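Test for file or directory: os.path.isdir(r'c:\temp'), os.path.isfile(r'c:\temp'), which return True or False

Test whether a file or directory exists: os.path.exists('/etc/passwd')

Run a system command: os.system('ls -l /etc')

Run a system command and open a pipe to it: os.popen(command [, mode='r' [, bufsize]])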

os.popen('ls /etc').read()

os.popen('ls /etc').readlines()

string (string handling)

Help: help('string')

Example

import string

fruit = "banana"

index = string.find(fruit, "a")

print index

math (mathematical functions)

For example

import math

x = math.cos(angle + math.pi/2)

x = math.exp(math.log(10.0))

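re (regular expressions)

Help

Regular expressions. Reference: http://docs.python.org/lib/module-re.html

>>> help('re')

Regular expression syntax

^, $ match the start and end of the string. In re.MULTILINE mode, ^ and $ also match the start and end of each line

[ ] a character set; a leading ^ inside it means "not"

*, +, ?, {m,n} : quantifiers (greedy by default, matching as much as possible)

For example, the expression "<.*>" applied to '<H1>title</H1>' matches the whole string rather than '<H1>'

>>> re.match('<.*>', '<H1>title</H1>').group()

'<H1>title</H1>'

*?, +?, ?? : non-greedy quantifiers

For example, the expression "<.*?>" applied to '<H1>title</H1>' matches only '<H1>'

>>> re.match('<.*?>', '<H1>title</H1>').group()

'<H1>'

{m,n}? : likewise matches as little as possible (non-greedy)

>>> re.match('<.{,20}>', '<H1>title</H1>').group()

'<H1>title</H1>'

>>> re.match('<.{,20}?>', '<H1>title</H1>').group()

'<H1>'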

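\( \) and [(] [)]

( and ) group patterns; to match literal parentheses, use \( , \) or [(] , [)]

( ) : groups an expression, which can be referred to later

(?iLmsux)

(? followed by any of the letters iLmsux sets the corresponding flag: re.I, re.L, re.M, re.S, re.U, re.X

See re options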

>>> re.search('(?i)(T[A-Z]*)','<h1>title</h1>').groups()

('title',)

(?P<name>pattern) : refers to a match by name

>>> re.match('(?P<p>.*?)(?::\s*)(?P<msg>.*)', 'prompt: enter your name').group('p')

'prompt'

>>> re.match('(?P<p>.*?)(?::\s*)(?P<msg>.*)', 'prompt: enter your name').group('msg')

'enter your name'

>>> re.match('(?P<p>.*?)(?::\s*)(?P<msg>.*)', 'prompt: enter your name').group(0)

'prompt: enter your name'

>>> re.match('(?P<p>.*?)(?::\s*)(?P<msg>.*)', 'prompt: enter your name').group(1)

'prompt'

>>> re.match('(?P<p>.*?)(?::\s*)(?P<msg>.*)', 'prompt: enter your name').group(2)

'enter your name'

Referring to a match with r'\1'

>>> re.sub ( 'id:\s*(?P<id>\d+)', 'N:\\1', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

>>> re.sub ( 'id:\s*(?P<id>\d+)', r'N:\1', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

Referring to a match with r'\g<name>'

>>> re.sub ( 'id:\s*(?P<id>\d+)', r'N:\g<id>', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

(?P=name) : refers to an earlier named match

>>> re.findall ( 'id:\s*(?P<id>\d+)', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

['001', '002', '003']

>>> re.findall ( 'id:\s*(?P<id>\d+),\s*user(?P=id):', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

['001', '003']

(?#...) : a comment

(?:pattern)

Groups an expression without counting it as a group

Compare the following two examples:

>>> re.match('(.*?:\s*)(.*)', 'prompt: enter your name').group(1)

'prompt: '

>>> re.match('(?:.*?:\s*)(.*)', 'prompt: enter your name').group(1)

'enter your name'

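(?=pattern) positive lookahead assertion

Matches if pattern matches next, but doesn't consume any of the string.

For example (in Perl):

Change only the foo in foobar, leaving the foo in words such as fool and foolish unchanged

$line="foobar\nfool";

## foo must be followed by bar, and bar itself is not part of the replacement.

$line =~ s/foo(?=bar)/something/gm;

print "$line\n";

Output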

somethingbar

fool

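(?!pattern) negative lookahead assertion

The opposite of (?=pattern). Matches if ... doesn't match next. This is a negative lookahead assertion.

For example (in Perl): change the foo in every word except foobar, such as the foo in fool and foolish.

$line="foobar\nfool";

## foo must not be followed by bar, and the content of (?!..) is not part of the replacement.

$line =~ s/foo(?!bar)/something/gm;

print "$line\n";

Output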

foobar

somethingl

(?<=pattern) positive lookbehind assertion

Matches if the current position in the string is preceded by a match for ... that ends at the current position.

For example (in Perl):

$line="foobar\nbarfoo\nbar foo\na fool";

## Replace the foo that follows bar; (bar) itself is not part of the replacement.

$line =~ s/(?<=bar)foo/something/gm;

print "$line\n";

Output

foobar

barsomething

bar foo

a fool

(?<!pattern) negative lookbehind assertion

Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion.

For example (in Perl):

$line="foobar\nbarfoo\nbar foo\na fool";

## Replace foo, but only where it is not preceded by bar.

$line =~ s/(?<!bar)foo/something/gm;

print "$line\n";

Output

somethingbar

barfoo

bar something

a somethingl
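The four assertion examples above are written in Perl. A Python equivalent can be sketched with re.sub, which replaces every match by default, so no counterpart of Perl's /g flag is needed. The variable names are hypothetical, and the code is written to run on both Python 2 and Python 3.

```python
import re

line1 = "foobar\nfool"
line2 = "foobar\nbarfoo\nbar foo\na fool"

# Positive lookahead: foo only when followed by bar.
ahead = re.sub(r'foo(?=bar)', 'something', line1)

# Negative lookahead: foo only when not followed by bar.
not_ahead = re.sub(r'foo(?!bar)', 'something', line1)

# Positive lookbehind: foo only when preceded by bar.
behind = re.sub(r'(?<=bar)foo', 'something', line2)

# Negative lookbehind: foo only when not preceded by bar.
not_behind = re.sub(r'(?<!bar)foo', 'something', line2)

for text in (ahead, not_ahead, behind, not_behind):
    print(text)
    print('--')
```

In each case the asserted text (bar) is checked but not consumed, so it is left untouched by the substitution.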

Regular expression special characters

\A Matches only at the start of the string.

\b Matches the empty string, but only at the beginning or end of a word

\B Matches the empty string, but only when it is not at the beginning or end of a word.

\d When the UNICODE flag is not specified, matches any decimal digit; this is equivalent to the set [0-9]. With UNICODE, it will match whatever is classified as a digit in the Unicode character properties database.

\D When the UNICODE flag is not specified, matches any non-digit character; this is equivalent to the set [^0-9]. With UNICODE, it will match anything other than character marked as digits in the Unicode character properties database.

\s When the LOCALE and UNICODE flags are not specified, matches any whitespace character; this is equivalent to the set [ \t\n\r\f\v]. With LOCALE, it will match this set plus whatever characters are defined as space for the current locale. If UNICODE is set, this will match the characters [ \t\n\r\f\v] plus whatever is classified as space in the Unicode character properties database.

\S When the LOCALE and UNICODE flags are not specified, matches any non-whitespace character; this is equivalent to the set [^ \t\n\r\f\v] With LOCALE, it will match any character not in this set, and not defined as space in the current locale. If UNICODE is set, this will match anything other than [ \t\n\r\f\v] and characters marked as space in the Unicode character properties database.

\w When the LOCALE and UNICODE flags are not specified, matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_]. With LOCALE, it will match the set [0-9_] plus whatever characters are defined as alphanumeric for the current locale. If UNICODE is set, this will match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database.

\W When the LOCALE and UNICODE flags are not specified, matches any non-alphanumeric character; this is equivalent to the set [^a-zA-Z0-9_]. With LOCALE, it will match any character not in the set [0-9_], and not defined as alphanumeric for the current locale. If UNICODE is set, this will match anything other than [0-9_] and characters marked as alphanumeric in the Unicode character properties database.

\Z Matches only at the end of the string.

re flags

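re.I, re.IGNORECASE: ignore case

re.L, re.LOCALE: \w, \W, \b, \B, \s and \S follow the current locale

re.M, re.MULTILINE: treat the string as multiple lines; ^ and $ also match just after and just before each newline in the string. By default they match only at the start and end of the whole string ($ also matches just before a trailing newline).

re.S, re.DOTALL: . matches any character, including a newline. By default it matches any character except a newline

re.U, re.UNICODE: \w, \W, \b, \B, \d, \D, \s and \S follow the Unicode character properties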

>>> re.compile(ur'----(-)*\r?\n.*\b(网页类)\b',re.U).search("--------\r\nCategoryX 网页类 CategoryY".decode('utf-8')).groups()

(u'-', u'\u7f51\u9875\u7c7b')

>>> re.compile(ur'----(-)*\r?\n.*\b(网页类)\b',re.U).search(u"--------\r\nCategoryX 网页类 CategoryY").groups()

(u'-', u'\u7f51\u9875\u7c7b')

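re.X, re.VERBOSE: allows # comments inside the pattern to make the expression more readable.

Whitespace in the pattern is ignored, and # starts a comment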
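The effect of the flags listed above can be checked with a few short calls. This is a minimal sketch written so that it also runs on Python 3; the sample strings and variable names are illustrative assumptions, not taken from the original text.

```python
import re

# re.VERBOSE: whitespace in the pattern is ignored and '#' starts a comment.
number = re.compile(r"""
    (\d+)     # integer part
    \.        # literal dot
    (\d*)     # fractional part
""", re.VERBOSE)
verbose_groups = number.match("3.14").groups()

text = "first line\nSecond line"

# re.IGNORECASE: letters match regardless of case.
ignore_case = re.findall(r"line", "Line line LINE", re.I)

# re.MULTILINE: ^ matches at the start of every line, not only the string.
line_starts_default = re.findall(r"^\w+", text)
line_starts_multiline = re.findall(r"^\w+", text, re.M)

# re.DOTALL: '.' also matches the newline character.
dot_default = re.match(r".*", text).group()
dot_all = re.match(r".*", text, re.S).group()

print(verbose_groups, ignore_case)
print(line_starts_default, line_starts_multiline)
print(repr(dot_default), repr(dot_all))
```

Flags can be combined with the | operator, for example re.M | re.I.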

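Note the difference between match and search

re.match( pattern, string[, flags]) matches only at the beginning of the string. It behaves as if the pattern were anchored at the start of the string (like \A; even with re.M it does not match at the start of later lines).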

>>> p.match("")

>>> print p.match("")

None

p = re.compile( ... )
m = p.match( 'string goes here' )
if m:
    print 'Match found: ', m.group()
else:
    print 'No match'

re.search( pattern, string[, flags]) scans the whole string for a match
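The difference between match and search is easiest to see on one string. A small sketch, assuming an illustrative pattern and input (it also runs on Python 3):

```python
import re

pattern = re.compile(r"\d+")
text = "order 66"

# match only tries position 0, so it fails: the string starts with a letter.
at_start = pattern.match(text)

# search scans forward and finds the digits later in the string.
anywhere = pattern.search(text)

print(at_start)
print(anywhere.group(), anywhere.span())
```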

re.compile( pattern[, flags])

For an expression that is used repeatedly, compiling it once with re.compile is more efficient

prog = re.compile(pat)

result = prog.match(str)

is equivalent to

result = re.match(pat, str)

re.split( pattern, string[, maxsplit = 0]) splits a string

>>> re.split('\W+', 'Words, words, words.')

['Words', 'words', 'words', '']

>>> re.split('(\W+)', 'Words, words, words.')

['Words', ', ', 'words', ', ', 'words', '.', '']

>>> re.split('\W+', 'Words, words, words.', 1)

['Words', 'words, words.']

re.findall( pattern, string[, flags])

Finds all matches and returns them as a list

>>> p = re.compile('\d+')

>>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')

['12', '11', '10']

re.finditer( pattern, string[, flags])

Finds all matches and returns an iterator

>>> p = re.compile('\d+')

>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')

>>> iterator

<callable-iterator object at 0x401833ac>

>>> for match in iterator:

...     print match.span()

...

(0, 2)

(22, 24)

(29, 31)

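re.sub(pattern, repl, string[, count]) replaces the matches of pattern in string with repl. In repl, a group can be referenced as \1 (written '\\1' or r'\1') or by name as \g<name>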

>>> re.sub ( 'id:\s*(?P<id>\d+)', 'N:\\1', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

>>> re.sub ( 'id:\s*(?P<id>\d+)', r'N:\1', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

>>> re.sub ( 'id:\s*(?P<id>\d+)', r'N:\g<id>', 'userlist\nid:001,user001:jiangxin\nid:002,user003:tom\nid:003,user003:jerry\n')

'userlist\nN:001,user001:jiangxin\nN:002,user003:tom\nN:003,user003:jerry\n'

re.subn( pattern, repl, string[, count]) works like re.sub but returns a different value

Return value: a tuple (new_string, number_of_subs_made).

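re.escape(string): escapes the string so that any special characters in it are matched literally instead of being interpreted as regular expression syntax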
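A short sketch of re.subn and re.escape, using made-up input strings (it also runs on Python 3):

```python
import re

# re.subn returns the new string together with the number of replacements.
new_text, count = re.subn(r"id:\s*(\d+)", r"N:\1", "id:001,id:002")

# Without escaping, '+' is a quantifier, so "1+1=2" does not match itself.
literal = "1+1=2"
unescaped = re.search(literal, "so 1+1=2")
escaped = re.search(re.escape(literal), "so 1+1=2")

print(new_text, count)
print(unescaped, escaped.group())
```

Escaping matters most when the text to search for comes from user input or a file rather than from a hand-written pattern.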

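Compiled pattern objects

Each method of the pattern object returned by re.compile has a corresponding re function; only the parameters differ.

The re functions take a flags parameter, whereas a pattern object already received its flags when it was created,

so its match, search, findall and finditer methods take pos and endpos (start and end positions) in place of flags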

match( string[, pos[, endpos]])

search( string[, pos[, endpos]])

split( string[, maxsplit = 0])

findall( string[, pos[, endpos]])

finditer( string[, pos[, endpos]])

sub( repl, string[, count = 0])

subn( repl, string[, count = 0])
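The pos and endpos parameters restrict the part of the string that is examined. A minimal sketch with an illustrative string (it also runs on Python 3):

```python
import re

digits = re.compile(r"\d+")
text = "12 drummers, 11 pipers, 10 lords"

everything = digits.findall(text)        # whole string
from_pos = digits.findall(text, 2)       # start looking at index 2
window = digits.findall(text, 2, 20)     # only look at text[2:20]

# match() with pos tries the pattern exactly at that index.
at_13 = digits.match(text, 13)
at_3 = digits.match(text, 3)

print(everything, from_pos, window)
print(at_13.group(), at_13.span(), at_3)
```

The spans reported by the match object are still relative to the full string, not to pos.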

Match objects

expand( template)

Expands the template using the results of the match

Supports "\1", "\2", "\g<1>", "\g<name>"
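As a sketch of expand, the following reuses the (?P<int>\d+)\.(\d*) pattern from this section; the template strings are illustrative assumptions (it also runs on Python 3):

```python
import re

m = re.match(r"(?P<int>\d+)\.(\d*)", "3.14")

# Numbered references: \1 is the integer part, \2 the fractional part.
swapped = m.expand(r"\2.\1")

# \g<name> and \g<number> references can be mixed in one template.
labelled = m.expand(r"int=\g<int> frac=\g<2>")

print(swapped)
print(labelled)
```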

group( [group1, ...])

Example

m = re.match(r"(?P<int>\d+)\.(\d*)", '3.14')

Result

m.group(1) is '3', as is m.group('int'), and m.group(2) is '14'.

>>> p = re.compile('(a(b)c)d')

>>> m = p.match('abcd')

>>> m.group(0)

'abcd'

>>> m.group(1)

'abc'

>>> m.group(2)

'b'

>>> m.groups()

('abc', 'b')

groups( [default])

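Returns a tuple containing all groups, starting from group 1

groupdict( [default])

Returns a dictionary containing all named groups

start( [group]) and end( [group])

Return the start and end positions in the string of the text matched by group

span( [group])

Returns the 2-tuple (start, end)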
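The match object methods above can be tried on a single match. This sketch extends the earlier pattern with a hypothetical optional group named exp to show what the default argument does (it also runs on Python 3):

```python
import re

m = re.match(r"(?P<int>\d+)\.(?P<frac>\d*)(?P<exp>e\d+)?", "3.14")

groups = m.groups()              # unmatched groups are reported as None
groups_default = m.groups("")    # ... or as the given default
named = m.groupdict()

frac_start = m.start("frac")
frac_end = m.end("frac")
int_span = m.span("int")
exp_span = m.span("exp")         # (-1, -1) for a group that did not match

print(groups, groups_default)
print(named)
print(frac_start, frac_end, int_span, exp_span)
```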

getopt (command-line processing)

getopt.getopt( args, options[, long_options])

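args is the argument list without the program name, i.e. sys.argv[1:]

options is the string of supported short option letters. An option that takes a value is followed by a colon ":". See Unix getopt()

long_options is the list of supported long option names. An option that takes a value is followed by an equals sign "=".

Return value: two elements

1. a list of (option, value) pairs

2. the list of remaining arguments

Exception: GetoptError, also available under the alias error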

Example:

>>> import getopt

>>> args = '-a -b -cfoo -d bar a1 a2'.split()

>>> args

['-a', '-b', '-cfoo', '-d', 'bar', 'a1', 'a2']

>>> optlist, args = getopt.getopt(args, 'abc:d:')

>>> optlist

[('-a', ''), ('-b', ''), ('-c', 'foo'), ('-d', 'bar')]

>>> args

['a1', 'a2']
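Long options work the same way. A minimal sketch with a made-up argument list, which also shows the exception raised for an unknown option (it also runs on Python 3):

```python
import getopt

# Hypothetical command line: one flag, two long options with values, one file.
argv = ["-v", "--port=8080", "--name", "demo", "input.txt"]
opts, rest = getopt.getopt(argv, "hv", ["help", "port=", "name="])

# An option that was not declared raises getopt.GetoptError.
try:
    getopt.getopt(["--bogus"], "h", ["help"])
    error_text = None
except getopt.GetoptError as err:
    error_text = str(err)

print(opts)
print(rest)
print(error_text)
```

A long option's value may be attached with "=" or given as the next argument; both forms appear in the list above.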

"""Module docstring.

This serves as a long usage message.

"""

import sys
import getopt

def main():
    # parse command line options
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hp:", ["help", "port="])
    except getopt.error, msg:
        print msg
        print "for help use --help"
        sys.exit(2)
    # process options
    for o, a in opts:
        if o in ("-h", "--help"):
            print __doc__
            sys.exit(0)
        elif o in ("-p", "--port"):
            # option values are strings, so convert before formatting with %d
            print "port is %d" % int(a)
    # process arguments
    for arg in args:
        process(arg) # process() is defined elsewhere

if __name__ == "__main__":
    main()

Databases

See: http://mysql-python.sourceforge.net/MySQLdb.html

LDAP

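time (time functions)

time.time(): returns the Unix epoch time in seconds, as a floating-point number

time.clock(): processor time used by the process in seconds (a floating-point number) on Unix; on Windows, wall-clock seconds since the first call

gmtime(): returns the UTC time as a tuple

localtime(): returns the local time as a tuple

asctime(): converts a time tuple to a string

ctime(): converts a number of seconds to a string

mktime(): converts a local time tuple to epoch seconds

strftime(): formats a time tuple according to a format string

strptime(): parses a string according to a format into a time tuple

tzset(): sets the time zone (re-reads the TZ environment variable; Unix only)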
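The conversions listed above chain together. A small sketch, with the format string chosen only for illustration (it also runs on Python 3):

```python
import time

fmt = "%Y-%m-%d %H:%M:%S"

# Epoch second 0 as a UTC time tuple, then tuple -> string -> tuple.
epoch_utc = time.gmtime(0)
epoch_text = time.strftime(fmt, epoch_utc)
parsed = time.strptime(epoch_text, fmt)

# localtime() and mktime() are inverses, up to whole seconds.
now = time.time()
local = time.localtime(now)
round_trip = time.mktime(local)

print(epoch_text)
print(time.asctime(epoch_utc))
print(int(now), int(round_trip))
```

Note that mktime expects a local time tuple, so it pairs with localtime, not gmtime.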

logging

logging levels

Level Numeric value

CRITICAL 50

ERROR 40

WARNING 30

INFO 20

DEBUG 10

NOTSET 0

getLogger()

The default is the root logger; getLogger(name) creates (or returns) a logger with the given name

logging.basicConfig()

logging.getLogger("").setLevel(logging.DEBUG)

ERR = logging.getLogger("ERR")

ERR.setLevel(logging.ERROR)

#These should log

logging.log(logging.CRITICAL, nextmessage())

logging.debug(nextmessage())

ERR.log(logging.CRITICAL, nextmessage())

ERR.error(nextmessage())

#These should not log

ERR.debug(nextmessage())

basicConfig sets the log level, the message format and similar options

logging.basicConfig(level=logging.DEBUG,
                    format="%(levelname)s : %(asctime)-15s > %(message)s")
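To see how a logger's level filters messages, the sketch below sends output to an in-memory stream instead of the console. The logger name "demo.err", the messages and the format string are illustrative assumptions, and io.StringIO is used as it behaves on Python 3.

```python
import io
import logging

# Collect log output in memory so that it can be inspected afterwards.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s : %(name)s > %(message)s"))

err = logging.getLogger("demo.err")   # hypothetical logger name
err.addHandler(handler)
err.setLevel(logging.ERROR)
err.propagate = False                 # keep the root logger out of this demo

err.error("disk full")                # logged: ERROR (40) >= ERROR (40)
err.critical("shutting down")         # logged: CRITICAL (50) >= ERROR (40)
err.warning("low on space")           # dropped: WARNING (30) < ERROR (40)
err.debug("details")                  # dropped: DEBUG (10) < ERROR (40)

output = stream.getvalue()
print(output)
```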

Python in practice

Help framework

__doc__

'''PROGRAM INTRODUCTION

Usage: %(PROGRAM)s [options]

Options:

    -h|--help
        Print this message and exit.
'''

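The usage function

def usage(code, msg=''):
    # errors go to stderr, normal help output goes to stdout
    if code:
        fd = sys.stderr
    else:
        fd = sys.stdout
    # _() is assumed to be a gettext-style translation function defined elsewhere
    print >> fd, _(__doc__)
    if msg:
        print >> fd, msg
    sys.exit(code)

Note: code is the exit status; msg is an optional additional error message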

Command-line processing

Command-line skeleton

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
import getopt

def main(argv=None):
    if argv is None:
        argv = sys.argv
    try:
        opts, args = getopt.getopt(
                argv[1:], "hn:",
                ["help", "name="])
    except getopt.error, msg:
        return usage(1, msg)

    for opt, arg in opts:
        if opt in ('-h', '--help'):
            return usage(0)
        #elif opt in ('--more_options'):

if __name__ == "__main__":
    sys.exit(main())

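Notes

Use the __name__ attribute to guard the entry-point code

See also: sys.argv

main takes a default argument so that it can be called with explicit arguments from interactive mode

def main(argv=None):
    if argv is None:
        argv = sys.argv
    # etc., replacing sys.argv with argv in the getopt() call.

To keep a call to sys.exit() inside main from terminating an interactive session, main uses return statements instead of sys.exit; only the entry-point guard calls sys.exit

if __name__ == "__main__":
    sys.exit(main())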
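The pattern described above can be sketched as a self-contained function that returns its exit status, so it can be called repeatedly from an interactive session. This version is written for Python 3 syntax; the USAGE text, the --name option and the greeting are hypothetical additions for illustration.

```python
import getopt
import sys

USAGE = """Usage: %(PROGRAM)s [options]

Options:
    -h|--help       Print this message and exit.
    -n|--name NAME  Name to greet (hypothetical option).
"""

def main(argv=None):
    """Parse argv and return an exit status instead of calling sys.exit."""
    if argv is None:
        argv = sys.argv
    try:
        opts, args = getopt.getopt(argv[1:], "hn:", ["help", "name="])
    except getopt.GetoptError as err:
        # Usage errors go to stderr and yield a non-zero status.
        sys.stderr.write("%s\n" % err)
        sys.stderr.write(USAGE % {"PROGRAM": argv[0]})
        return 2
    name = "world"
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            sys.stdout.write(USAGE % {"PROGRAM": argv[0]})
            return 0
        elif opt in ("-n", "--name"):
            name = arg
    sys.stdout.write("hello, %s\n" % name)
    return 0

# Calling main with an explicit argument list, as one would interactively.
status_ok = main(["prog", "--name", "Python"])
status_bad = main(["prog", "--bogus"])
```

In a script file, the last lines would be the usual guard that passes the returned status to sys.exit.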

File reading and writing

unicode

In Python, encoding and decoding are simply conversions between the two forms unicode and str.

Encoding is unicode -> str; conversely, decoding is str -> unicode

Understanding unicode

# The current locale uses utf-8, so str literals typed here are utf-8 encoded

>>> '中文'

'\xe4\xb8\xad\xe6\x96\x87'

>>> isinstance('中文', unicode)

False

>>> isinstance('中文', str)

True

# decode converts str to unicode

>>> '中文'.decode('utf-8')

u'\u4e2d\u6587'

>>> isinstance('中文'.decode('utf-8'), unicode)

True

>>> isinstance('中文'.decode('utf-8'), str)

False

# The u prefix defines a unicode string

>>> u'中文'

u'\u4e2d\u6587'

>>> isinstance(u'中文', unicode)

True

>>> isinstance(u'中文', str)

False

# encode converts unicode to str

>>> u'中文'.encode('utf-8')

'\xe4\xb8\xad\xe6\x96\x87'

>>> isinstance(u'中文'.encode('utf-8'), unicode)

False

>>> isinstance(u'中文'.encode('utf-8'), str)

True

>>> len(u'中文')

2

>>> len(u'中文'.encode('utf-8'))

6

>>> len(u'中文'.encode('utf-8').decode('utf-8'))

2

Typical Unicode error 1

>>> "str1: %s, str2: %s" % ('中文', u'中文')

Traceback (most recent call last):

File "<stdin>", line 1, in ?

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 6: ordinal not in range(128)

Solution

>>> "str1: %s, str2: %s" % ('中文', '中文')

'str1: \xe4\xb8\xad\xe6\x96\x87, str2: \xe4\xb8\xad\xe6\x96\x87'

>>> "str1: %s, str2: %s" % (u'中文', u'中文')

u'str1: \u4e2d\u6587, str2: \u4e2d\u6587'

Typical Unicode error 2

mystr = '中文'

mystr.encode('gb18030')

Error:

Traceback (most recent call last):

File "<stdin>", line 1, in ?

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

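Explanation:

mystr.encode('gb18030') re-encodes mystr as gb18030, that is, it performs a unicode -> str conversion. Because mystr is itself a str, Python first decodes it to unicode automatically and only then encodes the result as gb18030.

Because this decoding step is implicit and no codec was named, Python uses the encoding reported by sys.getdefaultencoding(). That default is usually ASCII, so the decode fails whenever mystr contains non-ASCII bytes.

In the case above the default encoding is ascii, while mystr uses the same encoding as the source file, utf-8, hence the error.

Setting the default string encoding with sys.setdefaultencoding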
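The implicit step described in this explanation can be reproduced by doing the ASCII decode explicitly. This is a sketch that runs on Python 3 as well, where the byte string type is called bytes; the variable names are illustrative.

```python
# -*- coding: utf-8 -*-

# The utf-8 bytes of the two-character example string used in this section.
data = u"\u4e2d\u6587".encode("utf-8")

# Python 2 performs this decode silently, using the default codec (ascii),
# before it can encode a byte string to another charset.
try:
    data.decode("ascii")
    reason = None
except UnicodeDecodeError as exc:
    reason = str(exc)

print(reason)
```

The message names byte 0xe4 at position 0, the first byte of the utf-8 sequence, which is the same failure shown in the traceback.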

#! /usr/bin/env python

# -*- coding: utf-8 -*-

import sys

reload(sys) # Python 2.5 deletes sys.setdefaultencoding after initialization, so sys has to be reloaded

sys.setdefaultencoding('utf-8')

mystr = '中文'

# The str is first decoded to unicode with the default encoding set above,

# and then encoded as gb18030

mystr.encode('gb18030')

Explicitly convert str to unicode, then encode

#! /usr/bin/env python

# -*- coding: gb2312 -*-

s = '中文'

s.decode('gb2312').encode('big5')

#! /usr/bin/env python

# -*- coding: utf-8 -*-

s = '中文'

# Even though the file is encoded as utf-8, the default encoding of sys is still ascii, so the utf-8 codec must be named explicitly when decoding

print s.decode('utf-8')

print s.decode('utf-8').encode('gb18030')

The unicode function

A Python built-in function. It converts a string in the 'charset' encoding to unicode

unicode (message, charset)

unicode('中文字符串', 'gbk')

encode handles unicode --> str

unicode('中文字符串', 'gbk').encode('gb18030')
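The same decode-then-encode round trip can be written with explicit codecs only, which avoids any dependence on the default encoding. This sketch uses calls that exist on both Python 2 and Python 3 (where the two types are named bytes and str); the variable names are illustrative.

```python
# -*- coding: utf-8 -*-

text = u"\u4e2d\u6587"                  # two characters of text
utf8_bytes = text.encode("utf-8")       # text -> bytes, 3 bytes per character

# Transcoding always goes through text: decode with the source charset,
# then encode with the target charset.
gb_bytes = utf8_bytes.decode("utf-8").encode("gb18030")
round_trip = gb_bytes.decode("gb18030")

print(len(text), len(utf8_bytes), len(gb_bytes))
print(round_trip == text)
```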

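Debugging

Debugging functions by hand

Start the python command line (the interactive interpreter)

Load the program with import; the module name is the program's file name

Call the function as program_name.function_name(arguments) to debug it

Syntax checking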