Recursion in Perl regular expressions
2011-02-22 12:32
Today I came across a post on chinaunix that asked:
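Taking the outermost parentheses as level 1, how can one pair of level-2 parentheses be removed while all other parentheses are kept?
For example:
(((1,2),3),4) => ((1,2),3,4)
((1,2),(3,4)) => ((1,2),3,4) or (1,2,(3,4))
(1,(2,(3,4))) => (1,2,(3,4))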
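Solution 1:
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

while (my $str = <DATA>) {
    chomp $str;
    print "$str => ";
    my @stack;
    # Match twice: the first match is the level-1 pair, the second match
    # is the first level-2 pair. Record where each pair opens and closes.
    foreach (0 .. 1) {
        $str =~ /(\()(?=((?:[^()]|(?1)(?2))+(\))))/g;
        push(@stack, [$-[1], $-[3]]);
    }
    # Delete the closing parenthesis first so the opening offset stays valid.
    substr($str, $stack[1][1], 1) = "";
    substr($str, $stack[1][0], 1) = "";
    print "$str\n";
}
__DATA__
(((1,2),3),4)
((1,2),(3,4))
(1,(2,(3,4)))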
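Solution 2:
# One or more balanced groups in a row; (?-1) recurses into the group itself.
my $balance = qr/(\((?:[^()]++|(?-1))*+\))*/;
# Contents of a level-2 pair: plain text interleaved with balanced groups.
my $innerRe = qr/(?:[^()]*?$balance)*/;

while( <DATA> ){
    chomp;
    print;
    # $1 is everything up to the first level-2 "(", $2 is what that pair encloses.
    if( s/^(\([^()]*?)\(($innerRe)\)/$1$2/ ){
        print " => $_";
    }
    print "\n";
}
__DATA__
(((1,2),3),4)
((1,2),(3,4))
(1,(2,(3,4)))
(((((1,2),3),4),(5,6)))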
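As a cross-check for the two regex solutions, the same task can be sketched without recursion by tracking nesting depth. This is only an illustrative sketch: the name remove_first_level2_pair is hypothetical, and it assumes "one pair" means the first level-2 pair, which is also the pair both solutions remove.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove the first
# pair of level-2 parentheses by counting nesting depth.
sub remove_first_level2_pair {
    my ($str) = @_;
    my $depth = 0;
    my ($open, $close);
    for my $i (0 .. length($str) - 1) {
        my $ch = substr($str, $i, 1);
        if ($ch eq '(') {
            $depth++;
            # Remember only the first "(" that opens level 2.
            $open = $i if $depth == 2 && !defined $open;
        }
        elsif ($ch eq ')') {
            # The first ")" that closes level 2 belongs to that "(".
            $close = $i if $depth == 2 && defined $open && !defined $close;
            $depth--;
        }
    }
    return $str unless defined $open && defined $close;
    # Delete the closing parenthesis first so $open stays valid.
    substr($str, $close, 1) = '';
    substr($str, $open, 1)  = '';
    return $str;
}

print remove_first_level2_pair($_), "\n"
    for '(((1,2),3),4)', '((1,2),(3,4))', '(1,(2,(3,4)))';
```

For the three sample inputs this prints ((1,2),3,4), (1,2,(3,4)) and (1,2,(3,4)), which matches the expected results in the question.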
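The regex from solution 1, expanded with /x and annotated:
$str =~ /
    (\()                # group 1: the opening parenthesis
    (?=                 # everything else is a single lookahead, so the first
                        # successful match starts at the first opening
                        # parenthesis, the second at the second, and so on
        (               # group 2: the contents plus the closing parenthesis
            (?:         # non-capturing group
                [^()]   # either a character that is not a parenthesis
                |
                (?1)(?2)    # or a recursion into group 1, then group 2
            )+
            (\))        # group 3: the closing parenthesis
        )
    )
/xg;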
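To see why the lookahead matters, here is a small sketch that runs the same regex in a loop and records every balanced pair. Because only the opening parenthesis is consumed, nested pairs are still found by later matches. The helper name paren_spans is hypothetical; the offsets come from Perl's built-in @- array.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: return [open_offset, close_offset] for every
# balanced parenthesis pair in $str, ordered by opening position.
sub paren_spans {
    my ($str) = @_;
    my @spans;
    # Only group 1 (the "(") is consumed; the rest is checked by lookahead,
    # so the next iteration can start inside the pair just found.
    while ($str =~ /(\()(?=((?:[^()]|(?1)(?2))+(\))))/g) {
        push @spans, [ $-[1], $-[3] ];
    }
    return @spans;
}

say join ' ', map { "$_->[0]-$_->[1]" } paren_spans('((1,2),(3,4))');
# prints: 0-12 1-5 7-11
```

The second entry (1-5) is the first level-2 pair, which is the pair solution 1 deletes.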
--------------------
http://perldoc.perl.org/perlre.html describes the regular expression features that are new in Perl 5.10 and later:
(?PARNO)
(?-PARNO)
(?+PARNO)
(?R)
(?0)
Similar to (??{ code }) except it does not involve compiling any code,
instead it treats the contents of a capture buffer as an independent
pattern that must match at the current position. Capture buffers
contained by the pattern will have the value as determined by the
outermost recursion.
PARNO is a sequence of digits (not starting with 0) whose value reflects
the paren-number of the capture buffer to recurse to. (?R) recurses to
the beginning of the whole pattern. (?0) is an alternate syntax for
(?R). If PARNO is preceded by a plus or minus sign then it is assumed
to be relative, with negative numbers indicating preceding capture buffers
and positive ones following. Thus (?-1) refers to the most recently
declared buffer, and (?+1) indicates the next buffer to be declared.
Note that the counting for relative recursion differs from that of
relative backreferences, in that with recursion unclosed buffers are
included.
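The three addressing forms described above can be tried side by side. The following is a minimal sketch; the helper names and sample strings are made up for illustration and are not part of perlre.

```perl
use strict;
use warnings;
use 5.010;

# (?R) recurses into the whole pattern: find the first balanced group.
sub first_balanced {
    my ($str) = @_;
    return $str =~ /(\((?:[^()]++|(?R))*\))/ ? $1 : undef;
}

# (?+1) recurses into the NEXT group to be declared, here (\d+).
sub second_number {
    my ($str) = @_;
    return $str =~ /^(?+1),(\d+)$/ ? $1 : undef;
}

# (?-1) counts unclosed groups, so inside group 1 it refers to group 1 itself.
sub is_balanced_group {
    my ($str) = @_;
    return $str =~ /^(\((?:[^()]++|(?-1))*\))$/ ? 1 : 0;
}

say first_balanced('x(a(b)c)y');       # prints: (a(b)c)
say second_number('12,34');            # prints: 34
say is_balanced_group('((1,2),3)');    # prints: 1
say is_balanced_group('((1,2),3');     # prints: 0
```

In second_number the recursion only matches the first number; group 1 is captured afterwards, at the outermost level.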
The following pattern matches a function foo() which may contain
balanced parentheses as the argument.
$re = qr{
    (                       # paren group 1 (full function)
        foo
        (                   # paren group 2 (parens)
            \(
            (               # paren group 3 (contents of parens)
                (?:
                    (?> [^()]+ )    # Non-parens without backtracking
                    |
                    (?2)            # Recurse to start of paren group 2
                )*
            )
            \)
        )
    )
}x;
If the pattern was used as follows
'foo(bar(baz)+baz(bop))' =~ /$re/
    and print "\$1 = $1\n",
              "\$2 = $2\n",
              "\$3 = $3\n";
the output produced should be the following:
$1 = foo(bar(baz)+baz(bop))
$2 = (bar(baz)+baz(bop))
$3 = bar(baz)+baz(bop)
If there is no corresponding capture buffer defined, then it is a
fatal error. Recursing deeper than 50 times without consuming any input
string will also result in a fatal error. The maximum depth is compiled
into perl, so changing it requires a custom build.
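The first of these errors is easy to observe. The sketch below builds the pattern from a string so that the failure happens inside eval rather than when the script itself is compiled. The helper name compile_error is hypothetical, and the exact wording of the message may differ between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: try to compile a pattern given as a string and
# return the error message, or '' if it compiled cleanly.
sub compile_error {
    my ($pattern) = @_;
    my $ok = eval { qr/$pattern/; 1 };
    return $ok ? '' : $@;
}

# (?1) with no capture group 1 anywhere in the pattern is fatal.
print "error: ", compile_error('(?1)');

# The same recursion is fine once group 1 exists.
print "compiled fine\n" if compile_error('(a)(?1)') eq '';
```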
The following shows how using negative indexing can make it
easier to embed recursive patterns inside of a qr// construct
for later use:
my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;
if (/foo $parens \s+ \+ \s+ bar $parens/x) {
    # do something here...
}
Note that this pattern does not behave the same way as the equivalent
PCRE or Python construct of the same form. In Perl you can backtrack into
a recursed group, in PCRE and Python the recursed into group is treated
as atomic. Also, modifiers are resolved at compile time, so constructs
like (?i:(?1)) or (?:(?i)(?1)) do not affect how the sub-pattern will
be processed.
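The difference in backtracking can be reproduced inside Perl alone. In the sketch below, the second pattern wraps the recursion in (?>...) purely to imitate an engine that treats recursion as atomic; that wrapper is an assumption made for illustration, not how any other engine is actually implemented. The function names are hypothetical.

```perl
use strict;
use warnings;
use 5.010;

# Plain Perl recursion: the engine may backtrack into (?1) and make the
# recursed a+ give back a character so the final literal "a" can match.
sub matches_with_backtracking {
    my ($str) = @_;
    return $str =~ /^(a+)(?1)a$/ ? 1 : 0;
}

# Same pattern, but the recursion is forced to be atomic with (?>...),
# so whatever (?1) matched first can never be shortened afterwards.
sub matches_with_atomic_recursion {
    my ($str) = @_;
    return $str =~ /^(a+)(?>(?1))a$/ ? 1 : 0;
}

say matches_with_backtracking('aaa');        # prints: 1
say matches_with_atomic_recursion('aaa');    # prints: 0
```

With 'aaa' the only successful split is one "a" for each of the three parts, which requires the recursed a+ to give up a character it had already taken.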