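Pattern matching (grep)
string operator /pattern/modifiers
Operators: =~ is true when the string matches the pattern; !~ is true when it does not.
Modifiers: i ignores case; x ignores whitespace inside the pattern; g continues matching from where the previous match ended, like "find next".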
#!/usr/bin/perl
$file="/home/macg/perltest/gogo";
&gotest($file);
sub gotest{
    my(@tmp)=@_;
    open (MYFILE, $tmp[0]) || die ("Could not open file");
    my($line,$newline);
    while ($line=<MYFILE>) {
        if($newline=($line=~/want/)) {    # look for lines that contain "want"
            print "found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
        }
        else {
            print "not found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
        }
    }
    close(MYFILE);
}
[macg@localhost perltest]$ ./tip.pl
not found
$line is:I glad to be Los angle
$newline is:
found
$line is:I want to be Los angle
$newline is:1
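By default the pattern delimiter is the forward slash /, but with the m prefix you can choose a different delimiter, for example:
m!/u/jqpublic/perl/prog1! is equivalent to /\/u\/jqpublic\/perl\/prog1/
Once another delimiter is used, / is no longer special inside the pattern and does not need to be written as \/.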
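A minimal sketch of the delimiter rule above, using the same path as the example. The variable names are made up for illustration; the paired-brace form m{...} is an extra variant not shown in the original notes.

```perl
use strict;
use warnings;

my $path = "/u/jqpublic/perl/prog1";

# Default delimiter: every / inside the pattern has to be escaped.
my $with_slashes = ($path =~ /\/u\/jqpublic\/perl\/prog1/) ? 1 : 0;

# Delimiter chosen with m: / is now an ordinary character.
my $with_bang    = ($path =~ m!/u/jqpublic/perl/prog1!)    ? 1 : 0;

# Paired delimiters such as { } work the same way.
my $with_braces  = ($path =~ m{^/u/jqpublic/perl/prog1$})  ? 1 : 0;

print "slashes=$with_slashes bang=$with_bang braces=$with_braces\n";
```

All three tests match the same string; only the amount of escaping differs.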
pattern
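\d or \d+ any digit, the same as [0-9] (\d+ means one or more digits)
\D or \D+ any character that is not a digit
/[\da-z]/ is the same as /[0-9a-z]/
^ anchors the match at the start of the string: /^def/ matches only strings that begin with def
$ anchors the match at the end of the string
/\\/ matches a literal backslash; \ is the escape character
/\*+/ matches one or more literal * characters
Punctuation that has a special meaning in a pattern (for example . * + ? [ ] ( ) \ and the delimiter) must be escaped with \ to be matched literally.
[] matches any one character from a set
* + ? are quantifiers; . matches any single character
* and + cannot be the first character of a pattern because they need something to repeat, so "any character" has to be written out explicitly, for example [0-9a-zA-Z]+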
$line=~ |
syntax error at ./address.pl line 6, near "out @int_hwaddress"
Quantifier follows nothing in regex; marked by <-- HERE in m at ./address.pl line 41
$line=~/[0-9a-zA-Z]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+/
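The "Quantifier follows nothing" error can be reproduced safely by compiling the pattern from a string at run time. The sketch below is an illustration only: the helper names compiles and hwaddr and the sample ifconfig-style line are assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Returns 1 if the pattern text compiles, 0 if Perl rejects it.
# A pattern held in a string is compiled at run time, so eval can catch the error.
sub compiles {
    my ($pattern) = @_;
    my $ok = eval { my $re = qr/$pattern/; 1 };
    return $ok ? 1 : 0;
}

# Extracts six colon-separated hex groups, or returns undef.
sub hwaddr {
    my ($line) = @_;
    return $line =~ /([0-9a-fA-F]+(?::[0-9a-fA-F]+){5})/ ? $1 : undef;
}

print "leading +  compiles: ", compiles('+:[0-9a-fA-F]+'), "\n";
print "explicit set compiles: ", compiles('[0-9a-fA-F]+:[0-9a-fA-F]+'), "\n";

# Hypothetical input line, used only for this example.
my $sample = "eth0 Link encap:Ethernet HWaddr 00:0C:29:3E:5F:1A";
print "hardware address: ", hwaddr($sample), "\n";
```

The first pattern fails because + has nothing before it to repeat; the second gives the quantifier an explicit character set.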
.
A space in a pattern is written as a literal space " ".
if(($input=~/^ping$/i)||($input=~/^ping $/i))
macg>ping
command:[ping xxx.xxx.xxx.xxx]
macg>ping (followed by one space)
command:[ping xxx.xxx.xxx.xxx]
IEI-nTracker>ping (followed by two spaces)
Use of uninitialized value in concatenation (.) or string at /nettracker/ntshell/ntshell
Two trailing spaces are not covered by either pattern. Using + accepts one or more spaces:
if(($input=~/^ping$/i)||($input=~/^ping +$/i))
IEI-nTracker>ping (followed by several spaces)
command:[ping xxx.xxx.xxx.xxx]
if($_[0]=~/^ping +[0-9a-zA-Z\.]+$/i)
This matches "ping xxxxx" with one or more spaces between ping and its argument.
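The two cases above (a bare ping with optional trailing spaces, and ping followed by an argument) can be folded into one small helper. This is a sketch under assumptions: parse_ping is a hypothetical name, and using " *" for the trailing spaces is a simplification of the two-pattern test shown above.

```perl
use strict;
use warnings;

# Returns the ping target, '' for a bare "ping", or undef for anything else.
sub parse_ping {
    my ($input) = @_;
    return ''  if $input =~ /^ping *$/i;                    # "ping" plus any number of spaces
    return $1  if $input =~ /^ping +([0-9a-zA-Z\.]+) *$/i;  # "ping <target>"
    return undef;
}

for my $input ("ping", "ping  ", "PING   10.0.0.1", "pingx") {
    my $target = parse_ping($input);
    printf "[%s] -> %s\n", $input, defined $target ? "'$target'" : "not a ping command";
}
```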
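Does =~ /want/ mean "equals" or "contains"?
It means "contains".
"Equals" is an exact match, written with both anchors: /^want$/
Format matching (as opposed to a "contains" match) is typically used to validate input that has a fixed format, such as an IP address.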
#!/usr/bin/perl
$file="/home/macg/perltest/gogo";
&gotest($file);
sub gotest{
    my(@tmp)=@_;
    open (MYFILE, $tmp[0]) || die ("Could not open file");
    my($line,$newline);
    while ($line=<MYFILE>) {
        if ($line =~ /\d+\.\d+\.\d+\.\d+/) {
            print "$line";
            print "the ip add is good\n";
        }
        else {
            print "$line";
            print "the ip add is a error\n";
        }
    }
    close(MYFILE);
}
[macg@localhost perltest]$ cat gogo
202.106.0.20
10.0.0.as
[macg@localhost perltest]$ ./tip.pl
202.106.0.20
the ip add is good
10.0.0.as
the ip add is a error
[macg@localhost perltest]$ cat gogo
202.106.0.20
10.0.0.as
10.0.0.1 as
Without anchors, the third line would pass because it contains an address. Change the test to:
if ($line =~ /^\d+\.\d+\.\d+\.\d+$/) {
[macg@localhost perltest]$ ./tip.pl
202.106.0.20
the ip add is good
10.0.0.as
the ip add is a error
10.0.0.1 as
the ip add is a error
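The difference between the unanchored and the anchored test can be checked side by side. In this sketch the helper names ip_contains and ip_exact are illustrative assumptions; the three inputs are the lines of the gogo file above.

```perl
use strict;
use warnings;

# "Contains" test: an address anywhere in the string is enough.
sub ip_contains {
    my ($s) = @_;
    return $s =~ /\d+\.\d+\.\d+\.\d+/ ? 1 : 0;
}

# "Format" test: the whole string must be the address.
sub ip_exact {
    my ($s) = @_;
    return $s =~ /^\d+\.\d+\.\d+\.\d+$/ ? 1 : 0;
}

for my $line ("202.106.0.20", "10.0.0.as", "10.0.0.1 as") {
    printf "%-14s contains=%d exact=%d\n", $line, ip_contains($line), ip_exact($line);
}
```

Note that $ also matches just before a trailing newline, which is why the anchored pattern still works on lines read from a file without chomp. Neither pattern checks that each number is at most 255.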
Difference between /want/g and /want/: with g the search position moves forward after each match, like "find next".
$line="inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";    # $& holds the matched text
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";
[root@nm testpl]# ./tip.pl
192.168.10.17
192.168.10.17
192.168.10.17
All three searches return the same result, because each one goes back to the start of the string and searches once. As soon as one match is found, the operator returns 1 and stops.
$line="inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";
[root@nm testpl]# ./tip.pl
192.168.10.17
192.168.10.255
255.255.255.0
With g, the match position behaves like a file pointer: each search starts where the previous match ended.
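Rather than repeating the statement three times, the "find next" behaviour of g is usually driven by a loop, or by list context, which returns every match at once. A short sketch using the same input line; the helper names are illustrative assumptions.

```perl
use strict;
use warnings;

# Scalar context: each pass through the loop finds the next match.
sub ips_by_loop {
    my ($line) = @_;
    my @found;
    while ($line =~ /(\d+\.\d+\.\d+\.\d+)/g) {
        push @found, $1;
    }
    return @found;
}

# List context: m//g returns all matches in one go.
sub ips_by_list {
    my ($line) = @_;
    my @found = $line =~ /\d+\.\d+\.\d+\.\d+/g;
    return @found;
}

my $line = "inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";
print join(" | ", ips_by_loop($line)), "\n";
print join(" | ", ips_by_list($line)), "\n";
```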
When matches are combined with "or", the order of the tests matters, especially when one pattern contains the other: put the more specific pattern first.
if (($ret=($line=~/eth[0-5]/))||($ret=($line=~/eth[0-5]:[0-5]/))) {
    print "$&:";
    $found=1;
}
elsif ($found) {
    print $line;
    $found=0;
}
}
[mac@nm testpl]$ ./address.pl
eth0: inet addr:10.4.3.117 Bcast:10.4.255.255 Mask:255.255.0.0
eth0: inet addr:192.168.10.117 Bcast:192.168.10.255 Mask:255.255.255.0
eth0: inet addr:192.168.1.142 Bcast:192.168.1.255 Mask:255.255.255.0
eth1: BROADCAST MULTICAST MTU:1500 Metric:1
The aliases eth0:1 and eth0:2 are all reported as eth0, because /eth[0-5]/ succeeds first and the second test is never tried. With the specific pattern first:
if (($ret=($line=~/eth[0-5]:[0-5]/))||($ret=($line=~/eth[0-5]/)))
[mac@nm testpl]$ ./address.pl
eth0: inet addr:10.4.3.117 Bcast:10.4.255.255 Mask:255.255.0.0
eth0:1: inet addr:192.168.10.117 Bcast:192.168.10.255 Mask:255.255.255.0
eth0:2: inet addr:192.168.1.142 Bcast:192.168.1.255 Mask:255.255.255.0
eth1: BROADCAST MULTICAST MTU:1500 Metric:1
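The same ordering rule holds inside a single pattern that uses | for alternation, because Perl tries the alternatives from left to right and takes the first one that succeeds. The sketch below shows this with hypothetical helper names and sample interface lines made up for the example.

```perl
use strict;
use warnings;

# General alternative first: "eth0" wins even when ":1" follows.
sub iface_general_first {
    my ($line) = @_;
    return $line =~ /(eth[0-5]|eth[0-5]:[0-5])/ ? $1 : undef;
}

# Specific alternative first: the alias form is tried before the plain name.
sub iface_specific_first {
    my ($line) = @_;
    return $line =~ /(eth[0-5]:[0-5]|eth[0-5])/ ? $1 : undef;
}

for my $line ("eth0:1    Link encap:Ethernet", "eth1      Link encap:Ethernet") {
    printf "general first: %-7s specific first: %s\n",
        iface_general_first($line), iface_specific_first($line);
}
```

An optional group, /eth[0-5](?::[0-5])?/, is another way to write the specific-first form.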
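When $line =~ /want/ finishes, it returns only 1 (true) or an empty value (false).
In other words, matching does not modify the string on the left of =~.
Substitution is completely different: it does modify the string on the left of =~.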
my($line,$newline);
while ($line=<MYFILE>) {
    if($newline=($line=~/want/)) {
        print "found\n";
        print "\$line is:$line";
        print "\$newline is:$newline \n";
    }
    else {
        print "not found\n";
        print "\$line is:$line";
        print "\$newline is:$newline \n";
    }
}
close(MYFILE);
}
[macg@localhost perltest]$ ./tip.pl
not found
$line is:I glad to be Los angle
$newline is:
found
$line is:I want to be Los angle
$newline is:1
How do you get the text that matched? Use $&.
while ($line=<MYFILE>) {
    if ($line =~ /want/) {
        print "$line";
        print "\$\& is $&\n";
        print "good\n";
    }
    else {
        print "$line";
        print " error\n";
    }
}
[macg@localhost perltest]$ ./tip.pl
I want go to LA. and I also want to be NY. ($line is unchanged)
$& is want ($& holds the matched text)
good
But I glad to be D.C.
error
Note that if ($line =~ /want/) has nothing to do with assignment, so the value of $line never changes. $line is simply the left operand of =~, and nothing is returned into it.
The meaning of $& $` $'
while ($line=<MYFILE>) {
    if ($line =~ /want/g) {
        print "good\n";
        print "$line";
        print $& . "\n";
        print $` . "\n";
        print $' . "\n";
        print "good\n";
    }
    else {
        print " error\n";
        print "$line";
    }
}
[macg@localhost perltest]$ ./tip.pl
good
I want go to LA. and I also want to be NY.
want ($& is the text of the last successful match)
I ($` is everything before the match)
 go to LA. and I also want to be NY. ($' is everything after the match)
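A compact way to see the three variables together is to copy them right after the match, before any other pattern match overwrites them. The helper name split_around is a hypothetical choice for this sketch.

```perl
use strict;
use warnings;

# Returns (text before, matched text, text after) for the first "want",
# or an empty list when there is no match.
sub split_around {
    my ($line) = @_;
    if ($line =~ /want/) {
        return ($`, $&, $');   # copy immediately: the next match resets them
    }
    return ();
}

my ($before, $match, $after) = split_around("I want go to LA. and I also want to be NY.");
print "before: [$before]\n";
print "match:  [$match]\n";
print "after:  [$after]\n";
```

The three pieces always join back into the original string, which confirms that matching leaves $line untouched.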
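!~ is true when the string does not match the pattern (it is rarely needed, since if ( =~ ) ... else does the same job).
Perl can split a pattern into sub-patterns and expose each sub-match as $1, $2, $3, $4.
Steps:
1. Wrap each sub-pattern you want to extract in ( ).
2. Refer to the captured text as $1, $2, and so on.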
if ($line =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) {
    print "good \n";
    print $line;
    print $& . "\n";
    print $1,"\n";
    print $2,"\n";
    print $3,"\n";
    print $4,"\n";
}
[macg@localhost perltest]$ ./tip.pl
good
202.106.0.20
202.106.0.20
202
106
0
20
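Captured groups can also be collected in list context, which makes it easy to go one step beyond the format check and test each number. This is a sketch only: parse_ip is a hypothetical helper, and the 0 to 255 range check is an addition that the original script does not perform.

```perl
use strict;
use warnings;

# Returns the four octets of a dotted address, or an empty list if the
# text has the wrong format or a number is out of range.
sub parse_ip {
    my ($text) = @_;
    my @octets = $text =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/;  # ($1, $2, $3, $4)
    return () unless @octets;                # format did not match
    return () if grep { $_ > 255 } @octets;  # assumption: reject values above 255
    return @octets;
}

for my $text ("202.106.0.20", "10.0.0.as", "300.1.1.1") {
    my @octets = parse_ip($text);
    print "$text -> ", (@octets ? join(" ", @octets) : "rejected"), "\n";
}
```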
Modifier i: ignore case
if ($line =~ /want/i) {
    print "good \n";
    print $line;
    print $& . "\n";
[macg@localhost perltest]$ ./tip.pl
good
I WANT TO go to
WANT
Modifier x: ignore whitespace in the pattern
/\d{2} ([\W]) \d{2} \1 \d{2}/x is equivalent to /\d{2}([\W])\d{2}\1\d{2}/
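Because x ignores whitespace, the same pattern can be spread over several lines with a comment on each part. The sketch below compares the spaced and compact forms; the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Spaced form: whitespace and comments inside the pattern are ignored under x.
my $spaced = qr/
    \d{2}      # two digits
    ([\W])     # one non-word separator, captured as group 1
    \d{2}      # two digits
    \1         # the same separator again
    \d{2}      # two digits
/x;

# Compact form from the text above.
my $compact = qr/\d{2}([\W])\d{2}\1\d{2}/;

for my $text ("12-34-56", "12 34 56", "12-34:56") {
    my $a = $text =~ $spaced  ? "match" : "no match";
    my $b = $text =~ $compact ? "match" : "no match";
    print "$text: spaced=$a compact=$b\n";
}
```

To match a literal space under x, escape it or put it in a character class such as [ ].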
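Substitution syntax
The operators and modifiers are largely the same as for matching:
string operator s/pattern/replacement/modifiers
The operator is the same as for matching: =~ applies the substitution to the string on its left. !~ still performs the substitution and only negates the return value, so it is rarely useful with s///.
Basic substitution (the second part replaces the first)
$line =~s/want/hate/i;
print "good \n";
print "\$line is :$line";
print "\$\& is : $&", "\n";
[macg@localhost perltest]$ ./tip.pl
good
$line is :I hate TO go to (unlike matching, substitution modifies the string on the left of =~)
$& is : WANT ($& after a substitution is the same as $& after a match)
Modifier i: ignore case
$line =~s/want/hate/i; replaces the first want, WANT or Want in $line with hate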
Deletion (substituting with nothing)
Deleting a single occurrence is rarely useful; in practice deletion is almost always global (g).
$line =~s/want//i;
print "\$line is :$line";
[macg@localhost perltest]$ ./tip.pl
$line is :I TO go to
g: global substitution replaces every match. By default a substitution replaces only the first match and then stops.
$line =~s/want/hate/ig;    # modifiers can be combined
print "good \n";
print "\$line is :$line";
[macg@localhost perltest]$ cat gogo
I WANT TO go to NY. And I also want to be DC.
I glad to go to
[macg@localhost perltest]$ ./tip.pl
$line is :I hate TO go to NY. And I also hate to be DC.
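How g in a substitution differs from g in a match
- In a match, g means "find next", so it is normally used together with a loop such as while.
- A substitution needs no loop: one statement replaces every match. It finds and replaces all occurrences in one pass instead of find, then find next.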
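The two points above can be compared directly: counting matches takes a loop around m//g, while s///g does all its work in one statement and returns the number of replacements. The variable names in this sketch are illustrative.

```perl
use strict;
use warnings;

my $text = "I WANT TO go to NY. And I also want to be DC.";

# Matching with g: one match per evaluation, so a loop is needed.
my $match_count = 0;
$match_count++ while $text =~ /want/ig;

# Substitution with g: a single statement replaces every match
# and returns how many replacements were made.
my $copy = $text;
my $replaced = ($copy =~ s/want/hate/ig);

print "matches found by loop: $match_count\n";
print "replacements made:     $replaced\n";
print "result: $copy\n";
```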
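The e modifier treats the replacement part as an expression and evaluates it before substituting.
$string = "0abc1";
$string =~ s/[a-zA-Z]+/$& x 2/e;    # repeat the run of letters (the non-digits in the middle) twice
Now $string is "0abcabc1".
$& is the matched text.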
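Two small cases of the e modifier, sketched with a capture group in place of $&. The second case (adding 1 to every number) and its input string are made up for illustration.

```perl
use strict;
use warnings;

# Repeat the run of letters twice: the replacement "$1 x 2" is evaluated as code.
my $doubled = "0abc1";
$doubled =~ s/([a-zA-Z]+)/$1 x 2/e;

# Combine e with g: evaluate the expression for every match.
my $bumped = "a1 b22";
$bumped =~ s/(\d+)/$1 + 1/ge;

print "$doubled\n";
print "$bumped\n";
```

Without e, the replacement would be inserted as the literal text "abc x 2".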
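Transliteration syntax
string operator tr/search list/replacement list/modifiers
string operator y/search list/replacement list/modifiers
Operators: =~ !~
Modifiers:
d delete the characters in the search list that have no counterpart in the replacement list
s squeeze each run of the same translated character into a single character
c complement: apply the translation to the characters that are NOT in the search list (including newline)
The most basic transliteration: lowercase to uppercase
$line =~tr/a-z/A-Z/;
print "\$line is :$line";
[macg@localhost perltest]$ cat gogo
I WANT TO go to NY. And I also want to be DC.
[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANT TO GO TO NY. AND I ALSO WANT TO BE DC.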
Deletion: =~ tr/characters to delete//d
Deleting with a global substitution and deleting with tr are equivalent:
Global substitution: $line =~ s/\t//g;
Transliteration: $line =~ tr/\t//d;
$line =~tr/\t//;    # without d, an empty replacement list leaves the tabs in place; tr only counts them
print "\$line is :$line";
[macg@localhost perltest]$ ./tip.pl
$line is :I WANT TO go to NY. And I also want to be DC.
$line =~tr/\t//d;    # with d, every tab is deleted
print "\$line is :$line";
[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANT TO go to NY. And I also want to be DC.
Squeezing repeated characters: =~ tr/a-zA-Z//s; (this is rarely useful in practice)
$line=~ tr/a-zA-Z//s;
print "\$line is :$line";
[macg@localhost perltest]$ cat gogo
I WANTWANT TO go to NNYY. And I also wWant to be DC.
[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANTWANT TO go to NY. And I also wWant to be DC.
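!~ is of little use with tr, since it only negates the returned count. To work on the characters outside a set, use the c modifier instead.
$text="1 abc 23 PID";
$text =~ tr/0-9//c;    # with c, the search list 0-9 stands for every non-digit; with an empty replacement list this only counts them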
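The tr behaviours described in this section can be checked in one place. This is a sketch with made-up input strings; combining c with d to strip every non-digit is an extra variation that the original notes do not show.

```perl
use strict;
use warnings;

# Plain translation: lowercase to uppercase.
my $upper = "go to ny";
$upper =~ tr/a-z/A-Z/;

# No modifier and an empty replacement list: nothing changes, tr returns the count.
my $line = "I\tWANT\tTO go";
my $tab_count = ($line =~ tr/\t//);

# d: delete every character in the search list.
my $no_tabs = $line;
$no_tabs =~ tr/\t//d;

# s: squeeze runs of the same letter into one.
my $squeezed = "NNYY  wWant";
$squeezed =~ tr/a-zA-Z//s;

# c plus d (assumption: an added variation): delete everything that is not a digit.
my $digits = "1 abc 23 PID";
$digits =~ tr/0-9//cd;

print "$upper\n$tab_count\n$no_tabs\n$squeezed\n$digits\n";
```

Note that [ and ] have no special meaning in tr; a range is written as 0-9, not [0-9].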
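A sample program that decodes the value of a CGI form field:
$value="%A4T%A4K%21";
$value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
The s operator finds each % sequence and captures the two hexadecimal digits in $1. The e modifier then evaluates pack("C",hex($1)) to decode them. hex($1) converts the hexadecimal digits to a number, and pack("C", ...) turns that number into the corresponding character code. C stands for an unsigned char value.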
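The decoding statement above can be wrapped in a small function so that it is easy to try on other inputs. The name url_decode and the extra sample strings are assumptions for this sketch; it decodes %XX sequences only and does not handle other form-encoding rules.

```perl
use strict;
use warnings;

# Replaces every %XX sequence with the byte whose hexadecimal value is XX.
sub url_decode {
    my ($value) = @_;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $value;
}

print url_decode("Hello%21"), "\n";   # %21 is "!"
print url_decode("%41%42%43"), "\n";  # 0x41 0x42 0x43 are "ABC"

# The value from the example: five bytes, two of which are above 0x7F.
my $decoded = url_decode("%A4T%A4K%21");
print "length: ", length($decoded), "\n";
```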