
Python Regular Expressions (Google Python Course)

2016-10-10 09:33

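I have been using regular expressions for a long time, but always by copy-pasting or asking on forums. I worked through tutorials along the way and never really got it. After going through the Python Course from Google for Education, I finally understand at least the basics.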

Python functions

There are two main functions:

import re
match = re.search(pattern, string, flags=0)     # first match as a match object, or None
matches = re.findall(pattern, string, flags=0)  # list of all non-overlapping matches

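pattern

The pattern is usually written with an r prefix, which makes it a raw string, so backslashes reach the regex engine unchanged.

flags

The default is 0, meaning no flags are set, so matching is case-sensitive. Pass re.IGNORECASE to ignore case.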
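As a quick check of how flags changes the result, here is a minimal sketch. The sample string and variable names are made up for illustration; re.IGNORECASE is the standard library flag for case-insensitive matching.

```python
import re

text = "Hello World"  # made-up sample input

# Default flags=0: matching is case-sensitive, so lowercase "hello" is not found.
default_match = re.search(r"hello", text)

# re.IGNORECASE makes letters match regardless of case.
ignorecase_match = re.search(r"hello", text, flags=re.IGNORECASE)

print(default_match)             # None
print(ignorecase_match.group())  # Hello
```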

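For output, the result of search can be read with match.group(), which returns the whole matched text, or match.group(1), which returns the first group. search returns None when nothing matches. findall returns a list: a list of strings when the pattern has at most one group, or a list of tuples when it has several groups.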
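The difference between the two return types is easiest to see side by side. The following is a small sketch with a made-up input string; only re.search, re.findall, and match.group come from the standard library.

```python
import re

text = "an example word:cat and word:dog!!"  # made-up sample input

# re.search stops at the first match and returns a match object (or None).
match = re.search(r"word:(\w\w\w)", text)
if match:
    whole = match.group()    # the entire matched text
    first = match.group(1)   # only the part inside the first parentheses
else:
    whole = first = None

# re.findall collects every match into a list of strings.
all_matches = re.findall(r"word:\w\w\w", text)

# When nothing matches, re.search returns None, so test before calling .group().
no_match = re.search(r"word:\d", text)

print(whole)        # word:cat
print(first)        # cat
print(all_matches)  # ['word:cat', 'word:dog']
print(no_match)     # None
```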

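Pattern symbols


Symbol	Meaning
a, X, 9	Ordinary characters match themselves exactly
.	(period) Any single character except \n
\w	A single "word" character: a letter, digit, or underscore, i.e. [a-zA-Z0-9_] for ASCII text. It matches one character that can appear in a word, not a whole word
\W	Uppercase inverts the lowercase meaning: any single non-word character
\b	Boundary between a word character and a non-word character; it matches a position, not a character
\s	(space) A single whitespace character: space, \n, \r, \t, \f
\S	Uppercase: any single non-whitespace character
\d	(decimal) A single digit, [0-9]
^, $	Start and end of the string
\	Escape character: removes the special meaning of the next character, e.g. \. matches a literal period
/	No special meaning in Python regular expressions; matches a literal slash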
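To try the symbols from the table, the sketch below runs each one against a single made-up string. The string and the variable names are assumptions for illustration only.

```python
import re

text = "Order 66 shipped on day_3"  # made-up sample input

two_digits = re.search(r"\d\d", text).group()  # two consecutive digits
first_word = re.search(r"\w+", text).group()   # first run of word characters
spaces = re.findall(r"\s", text)               # every whitespace character
dot_match = re.search(r"d.y", text).group()    # . stands for any one character
at_start = re.search(r"^Order", text)          # ^ anchors at the start
at_end = re.search(r"day_\d$", text).group()   # $ anchors at the end

# \b needs a word/non-word transition. "_" is a word character,
# so there is no boundary between "day" and "_3" and nothing matches.
whole_word = re.findall(r"\bday\b", text)

# An escaped period matches only a literal period.
literal_dots = re.findall(r"\.", "a.b.c")

print(two_digits)    # 66
print(first_word)    # Order
print(len(spaces))   # 4
print(dot_match)     # day
print(at_end)        # day_3
print(whole_word)    # []
print(literal_dots)  # ['.', '.']
```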

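Repetition


Symbol	Meaning
+	1 or more occurrences of the pattern to its left
*	0 or more occurrences of the pattern to its left
?	0 or 1 occurrence of the pattern to its left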
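A short sketch of the three repetition symbols, using made-up inputs, shows how each one treats the character to its left:

```python
import re

# + requires at least one "i" after "p".
plus_match = re.search(r"pi+", "piiig").group()
plus_fail = re.search(r"pi+", "pg")  # no "i" at all, so no match

# * also accepts zero occurrences, so "p" alone is enough.
star_match = re.search(r"pi*", "pg").group()

# ? makes the "u" optional: both spellings match.
american = re.search(r"colou?r", "color").group()
british = re.search(r"colou?r", "colour").group()

print(plus_match)  # piii
print(plus_fail)   # None
print(star_match)  # p
print(american)    # color
print(british)     # colour
```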

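Repetition is greedy: the engine finds the leftmost position where a match can start, then extends the match as far as possible, e.g.: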

import re

str = "<b>foo</b> and <i>so on</i>"
match = re.search(r'<.*>', str)
if match:
    print(match.group())

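This does not print <b>. It prints the entire string, because .* is greedy and extends to the farthest position that still allows a match, so everything from the first < to the last > is consumed, including the "b>foo...</i" in the middle. To stop at the first > instead, use the non-greedy form <.*?>.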
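The difference between greedy and non-greedy matching can be checked directly. This sketch reuses the same HTML-like string; the variable names are only for illustration.

```python
import re

html = "<b>foo</b> and <i>so on</i>"

# Greedy: .* runs to the last ">" in the string.
greedy = re.search(r"<.*>", html).group()

# Non-greedy: .*? stops at the first ">" that completes the match.
lazy = re.search(r"<.*?>", html).group()

# With findall, the non-greedy pattern picks out each tag separately.
tags = re.findall(r"<.*?>", html)

print(greedy)  # <b>foo</b> and <i>so on</i>
print(lazy)    # <b>
print(tags)    # ['<b>', '</b>', '<i>', '</i>']
```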

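Square brackets

Square brackets denote a set of characters, e.g.:


Symbol	Meaning
[abc]	a or b or c
[\w.-]	A word character, a period, or a hyphen (inside brackets, . is a literal period)
[^ab]	Any single character except a and b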
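Each row of the table can be tried on a short input. The strings below, including the address, are made up for illustration.

```python
import re

# [abc] matches one character that is a, b, or c.
abc_chars = re.findall(r"[abc]", "cabbage")

# [\w.-]+ matches a run of word characters, periods, and hyphens.
# The "@" is not in the set, so the run stops there.
name_part = re.search(r"[\w.-]+", "alice-b.smith@example.com").group()

# [^ab] matches one character that is neither a nor b.
not_ab = re.findall(r"[^ab]", "cabbage")

print(abc_chars)  # ['c', 'a', 'b', 'b', 'a']
print(name_part)  # alice-b.smith
print(not_ab)     # ['c', 'g', 'e']
```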

Group extraction

Parentheses mark groups within the match so that each part can be retrieved afterwards.

import re

str = 'purple alice-b@google.com monkey dishwasher'
match = re.search(r'([\w.-]+)@([\w.-]+)', str)
if match:
    print(match.group())   ## 'alice-b@google.com' (the whole match)
    print(match.group(1))  ## 'alice-b' (the username, group 1)
    print(match.group(2))  ## 'google.com' (the host, group 2)

import re

str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'

## Here re.findall() returns a list of all the found email strings
emails = re.findall(r'[\w\.-]+@[\w\.-]+', str) ## ['alice@google.com', 'bob@abc.com']
for email in emails:
    # do something with each found email string
    print(email)

import re

str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
## With two groups in the pattern, findall() returns a list of (username, host) tuples
tuples = re.findall(r'([\w\.-]+)@([\w\.-]+)', str)
print(tuples)  ## [('alice', 'google.com'), ('bob', 'abc.com')]
for tuple in tuples:
    print(tuple[0])  ## username
    print(tuple[1])  ## host

Substitution

In the replacement string, \1 and \2 refer to the text matched by group 1 and group 2.

import re

str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
## re.sub(pat, replacement, str) -- returns new string with all replacements,
## \1 is group(1), \2 group(2) in the replacement
print(re.sub(r'([\w\.-]+)@([\w\.-]+)', r'\1@yo-yo-dyne.com', str))
## purple alice@yo-yo-dyne.com, blah monkey bob@yo-yo-dyne.com blah dishwasher
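The example above only uses \1. As a further sketch, the snippet below uses both \1 and \2 in the replacement; the shortened input string and the variable names are assumptions for illustration.

```python
import re

text = "alice@google.com, bob@abc.com"  # shortened sample input
pattern = r"([\w.-]+)@([\w.-]+)"        # group 1 = username, group 2 = host

# Swap the two groups around the "@".
swapped = re.sub(pattern, r"\2@\1", text)

# Reuse both groups inside a longer replacement.
spelled_out = re.sub(pattern, r"\1 at \2", text)

print(swapped)      # google.com@alice, abc.com@bob
print(spelled_out)  # alice at google.com, bob at abc.com
```

The replacement is written as a raw string for the same reason as the pattern: without the r prefix, Python would interpret \1 as an escape sequence before re.sub ever sees it.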
