
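Pattern matching (grep)

Format: string operator /pattern/modifiers

Operators
=~ true if the string matches the pattern
!~ true if the string does not match the pattern

Modifiers
i ignore case
x ignore whitespace inside the pattern
g keep matching from where the previous match ended, like "find next"

Example: scan the file gogo for lines that contain "want"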

#!/usr/bin/perl

$file="/home/macg/perltest/gogo";
&gotest($file);

sub gotest{
my(@tmp)=@_;

open (MYFILE, $tmp[0]) || die ("Could not open file");
my($line,$newline);
while ($line=<MYFILE>) {
if($newline=($line=~/want/)) {          # look for "want" in the line
  print "found\n";
  print "\$line is:$line";
  print "\$newline is:$newline \n";
} else {
  print "not found\n";
  print "\$line is:$line";
  print "\$newline is:$newline \n";
    }
  }
close(MYFILE);
}

[macg@localhost perltest]$ ./tip.pl
not found
$line is:I glad to be Los angle
$newline is:
found
$line is:I want to be Los angle
$newline is:1
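
The same scan can be sketched with a lexical filehandle and the three-argument open. This is only an illustration: the helper name lines_with is hypothetical, and an in-memory string stands in for the gogo file so the sketch runs without it.

```perl
use strict;
use warnings;

# Return the lines read from $fh that contain $word.
sub lines_with {
    my ($fh, $word) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # \Q...\E quotes $word so it is matched literally.
        push @hits, $line if $line =~ /\Q$word\E/;
    }
    return @hits;
}

# In-memory stand-in for the gogo file (assumption, for illustration only).
my $data = "I glad to be Los angle\nI want to be Los angle\n";
open(my $fh, '<', \$data) or die "Could not open: $!";
my @hits = lines_with($fh, "want");
close($fh);

print "found: $_" for @hits;
```

Returning the matching lines instead of printing inside the loop keeps the match logic separate from the output.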

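    By default the pattern delimiter is the forward slash /, but with the m prefix you can choose another one, for example:
m!/u/jqpublic/perl/prog1!    is equivalent to    /\/u\/jqpublic\/perl\/prog1/
Once a different delimiter is used, / is no longer special inside the pattern and does not need to be written as \/.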
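
A small sketch comparing the delimiters, using the path from the example above. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $path = "/u/jqpublic/perl/prog1";

# The same pattern written three ways: escaped slashes, m!...!, and m{...}.
my $with_slashes = ($path =~ /\/u\/jqpublic\/perl\/prog1/) ? 1 : 0;
my $with_bang    = ($path =~ m!/u/jqpublic/perl/prog1!)    ? 1 : 0;
my $with_braces  = ($path =~ m{/u/jqpublic/perl/prog1})    ? 1 : 0;

print "$with_slashes $with_bang $with_braces\n";
```

All three forms match the same string; the alternate delimiters only make the pattern easier to read.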

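  pattern
\d or \d+    one digit [0-9], or one or more digits
\D or \D+    one non-digit character, or one or more of them
/[\da-z]/    same as /[0-9a-z]/
^            start of string: /^def/ matches only strings that begin with def
$            end of string: /def$/ matches only strings that end with def
/\\/         a literal backslash (\ is the escape character)
/\*+/        one or more literal asterisks
             metacharacters such as . * + ? ( ) [ ] | ^ $ \ and the delimiter must be escaped with \ to match literally
[]           matches any one character from a set
* + ?        quantifiers: zero or more, one or more, zero or one of the preceding item
.            any single character except newline

   * and + cannot be the first character of a pattern because a quantifier needs something to repeat, so spell out the character class explicitly, e.g. [0-9a-zA-Z]+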

$line=~

syntax error at ./address.pl line 6, near "out @int_hwaddress"
Quantifier follows nothing in regex; marked by <-- HERE in m at ./address.pl line 41

Change it to:
$line=~/[0-9a-zA-Z]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+/
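
The following sketch first reproduces the "Quantifier follows nothing" error with a deliberately broken pattern, then shows one possible way to pick a hardware address out of a line. The broken pattern, the helper name find_hwaddr and the sample line are all assumptions made for illustration; they are not the original address.pl.

```perl
use strict;
use warnings;

# A pattern that starts with a quantifier does not compile.
my $bad = '+:[0-9a-fA-F]+';
my $compiled = eval { qr/$bad/ };
my $error = $@;
print "compile failed\n" unless defined $compiled;

# Return the first MAC-like address in $text, or undef if there is none.
sub find_hwaddr {
    my ($text) = @_;
    # Two hex digits, followed by five more ":"-separated pairs.
    return $text =~ /([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})/ ? $1 : undef;
}

# Hypothetical ifconfig-style line.
my $sample = "eth0 Link encap:Ethernet HWaddr 00:1A:2B:3C:4D:5E";
print find_hwaddr($sample), "\n";
```

Using {2} and {5} instead of a chain of + also rejects groups that are too long or too short.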

A space in a pattern is written as a literal space " "

if(($input=~/^ping$/i)||($input=~/^ping $/i))

macg>ping
command:[ping xxx.xxx.xxx.xxx]
macg>ping       <- typed with one trailing space
command:[ping xxx.xxx.xxx.xxx]
IEI-nTracker>ping       <- typed with two trailing spaces
Use of uninitialized value in concatenation (.) or string at /nettracker/ntshell/ntshell

if(($input=~/^ping$/i)||($input=~/^ping +$/i))

IEI-nTracker>ping       <- typed with several trailing spaces
command:[ping xxx.xxx.xxx.xxx]

if($_[0]=~/^ping +[0-9a-zA-Z\.]+$/i)

This accepts "ping xxxxx" with one or more spaces between ping and the argument

Does =~/want/ mean "equals" or "contains"?
It means "contains".
"Equals" is an exact match, written /^want$/
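
A minimal sketch of the difference. The three helper names are hypothetical. One detail worth knowing: $ also matches just before a trailing newline, so \A and \z are the stricter anchors.

```perl
use strict;
use warnings;

sub contains_want { my ($s) = @_; return $s =~ /want/     ? 1 : 0 }
sub equals_want   { my ($s) = @_; return $s =~ /^want$/   ? 1 : 0 }
sub equals_want_z { my ($s) = @_; return $s =~ /\Awant\z/ ? 1 : 0 }

for my $s ("I want to be", "want", "want\n") {
    (my $shown = $s) =~ s/\n/\\n/;   # make the newline visible in the output
    printf "%-14s contains=%d equals=%d strict=%d\n",
        "'$shown'", contains_want($s), equals_want($s), equals_want_z($s);
}
```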

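Format matching (the whole string must fit the pattern, not merely contain it) is typically used to validate input in a specific format, such as an IP address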

#!/usr/bin/perl

$file="/home/macg/perltest/gogo";
&gotest($file);

sub gotest{
my(@tmp)=@_;

open (MYFILE, $tmp[0]) || die ("Could not open file");
my($line,$newline);
while ($line=<MYFILE>) {
if ($line =~ /\d+\.\d+\.\d+\.\d+/) {
  print "$line";
  print "the ip add is good\n";
} else {
  print "$line";
  print "the ip add is a error\n";
    }
  }
close(MYFILE);
}

[macg@localhost perltest]$ cat gogo
202.106.0.20
10.0.0.as

[macg@localhost perltest]$ ./tip.pl
202.106.0.20
the ip add is good
10.0.0.as
the ip add is a error

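The example above is still not right: an unanchored pattern also accepts lines that merely contain an address, as with the input below, so ^ and $ should be added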
[macg@localhost perltest]$ cat gogo
202.106.0.20
10.0.0.as
10.0.0.1 as

Change it to: if ($line =~ /^\d+\.\d+\.\d+\.\d+$/) {

[macg@localhost perltest]$ ./tip.pl
202.106.0.20
the ip add is good
10.0.0.as
the ip add is a error
10.0.0.1 as
the ip add is a error
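
The anchored pattern still accepts values such as 999.1.1.1. A stricter check could look like the sketch below, which captures the four parts and also tests their range. The helper name is_ipv4 is hypothetical, and the input is assumed to be already chomped.

```perl
use strict;
use warnings;

# Return 1 if $s is four dot-separated numbers, each in 0..255, else 0.
sub is_ipv4 {
    my ($s) = @_;
    # In list context the match returns the four captured groups.
    my @octets = $s =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/;
    return 0 unless @octets == 4;
    my $too_big = grep { $_ > 255 } @octets;
    return $too_big ? 0 : 1;
}

for my $line ("202.106.0.20", "10.0.0.as", "10.0.0.1 as", "999.1.1.1") {
    printf "%-14s %s\n", $line, is_ipv4($line) ? "good" : "error";
}
```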

Difference between /want/g and /want/: with g the search position moves forward, like "find next"

$line="inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";     # $& holds the matched text
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
print "$&\n";

[root@nm testpl]# ./tip.pl
192.168.10.17
192.168.10.17
192.168.10.17
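All three searches give the same result, because each one starts again from the beginning of the string and looks for a single match
As soon as a match is found, the operator returns 1 and stops

With g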

$line="inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";
$line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
print "$&\n";

[root@nm testpl]# ./tip.pl
192.168.10.17
192.168.10.255
255.255.255.0
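With g (in scalar context) the match behaves like a file pointer: each successful match moves the position forward, and the next match resumes from there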
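
In practice the "find next" behaviour is usually driven by a while loop, or the match is evaluated in list context to collect everything at once. A short sketch with the same sample line (the variable names are illustrative):

```perl
use strict;
use warnings;

my $line = "inet addr:192.168.10.17 Bcast:192.168.10.255 Mask:255.255.255.0";

# Scalar context: each iteration resumes where the previous match ended.
my @found;
while ($line =~ /(\d+\.\d+\.\d+\.\d+)/g) {
    push @found, $1;
}

# List context: all captures are returned in one go.
my @all = $line =~ /(\d+\.\d+\.\d+\.\d+)/g;

print "$_\n" for @found;
```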

When matching with "or", the order of the alternatives matters, especially for containment matches: put the more specific pattern first

if (($ret=($line=~/eth[0-5]/))||($ret=($line=~/eth[0-5]:[0-5]/)))
{
print "$&:";
$found=1;
} elsif ($found)
  {
  print $line;
  $found=0;
  }
}

[mac@nm testpl]$ ./address.pl
eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
eth0:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
eth0:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1

Put the more specific pattern first:

if (($ret=($line=~/eth[0-5]:[0-5]/))||($ret=($line=~/eth[0-5]/)))

[mac@nm testpl]$ ./address.pl
eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
eth0:1:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
eth0:2:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1

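When the specific pattern is tried first, a line that matches it never falls through to the loose (partial) one; otherwise the loose pattern wins and hides the specific one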
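
The same ordering rule applies to alternation inside a single pattern, and an optional group avoids the ordering question altogether. The sketch below is only an illustration; the three helper names and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Loose alternative first: "eth0" wins even when ":1" follows.
sub loose_first    { my ($l) = @_; return $l =~ /(eth[0-5]|eth[0-5]:[0-5])/ ? $1 : undef }
# Specific alternative first: the alias is kept.
sub specific_first { my ($l) = @_; return $l =~ /(eth[0-5]:[0-5]|eth[0-5])/ ? $1 : undef }
# One pattern with an optional ":N" suffix.
sub with_optional  { my ($l) = @_; return $l =~ /(eth[0-5](?::[0-5])?)/ ? $1 : undef }

my $alias = "eth0:1    inet addr:192.168.10.117";
my $plain = "eth1      BROADCAST MULTICAST";

print loose_first($alias), " ", specific_first($alias), " ", with_optional($alias), "\n";
print loose_first($plain), " ", specific_first($plain), " ", with_optional($plain), "\n";
```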

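$line =~ /want/ only returns 1 on success or an empty string on failure
In other words, matching does not modify the string on the left of =~
Substitution is completely different: it does modify the string on the left of =~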

my($line,$newline);
while ($line=<MYFILE>) {
if($newline=($line=~/want/)) {
  print "found\n";
  print "\$line is:$line";
  print "\$newline is:$newline \n";
} else {
  print "not found\n";
  print "\$line is:$line";
  print "\$newline is:$newline \n";
    }
  }
close(MYFILE);
}

[macg@localhost perltest]$ ./tip.pl
not found
$line is:I glad to be Los angle
$newline is:
found
$line is:I want to be Los angle
$newline is:1

How do you get the text that matched? Use $&

while ($line=<MYFILE>) {
if ($line =~ /want/) {
  print "$line";
  print "\$\& is $&\n";
  print "good\n";
} else {
  print "$line";
  print " error\n";
    }
  }

[macg@localhost perltest]$ ./tip.pl
I want go to LA. and I also want to be NY.       <- $line is unchanged
$& is want                         <- $& is the matched text
good
But I glad to be D.C.
error

Note that if ($line=~/want/) has nothing to do with assignment, so $line is never changed by it. $line is just the operand on the left of =~, and no result is written back to it

Meaning of $& $` $'

while ($line=<MYFILE>) {
if ($line =~ /want/g) {
  print "good\n";
  print "$line";
  print  $& . "\n";
  print  $` . "\n";
  print  $' . "\n";
  print "good\n";
} else {
  print " error\n";
  print "$line";
    }
  }

[macg@localhost perltest]$ ./tip.pl
good
I want go to LA. and I also want to be NY.
want                  $&      the text of the last successful match
I                      $`       everything before the match
go to LA. and I also want to be NY.    $'      everything after the match
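
The three variables can be read right after a successful match. A small sketch that returns them as a list (the helper name split_on_match is hypothetical):

```perl
use strict;
use warnings;

# Return (text before, matched text, text after) for the first "want",
# or an empty list if there is no match.
sub split_on_match {
    my ($s) = @_;
    return unless $s =~ /want/;
    return ($`, $&, $');
}

my ($before, $match, $after) =
    split_on_match("I want go to LA. and I also want to be NY.");
print "[$before] [$match] [$after]\n";
```

Note that the text before and after the match keeps its surrounding spaces.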

!~ is true when the string does not match the pattern (rarely needed, since if ( =~ ) ... else does the same job)

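Perl can break a pattern into sub-patterns and expose what each one matched as $1, $2, $3, $4
Steps:
1. Put ( ) around each sub-pattern you want to extract
2. Read the results from $1, $2, ...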

if ($line =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) {
  print "good \n";
  print $line;
  print  $& . "\n";
  print $1,"\n";
  print $2,"\n";
  print $3,"\n";
  print $4,"\n";

[macg@localhost perltest]$ ./tip.pl
good
202.106.0.20
202.106.0.20
202
106
0
20

Modifier i: ignore case

if ($line =~ /want/i) {
  print "good \n";
  print $line;
  print  $& . "\n";

[macg@localhost perltest]$ ./tip.pl
good
I WANT TO go to
WANT

Modifier x: ignore whitespace in the pattern
/\d{2} ([\W]) \d{2} \1 \d{2}/x is equivalent to /\d{2}([\W])\d{2}\1\d{2}/
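
Because x also allows # comments inside the pattern, it is mostly used to document longer expressions. A sketch of the same pattern spread over several lines (the name $date_re and the sample strings are illustrative):

```perl
use strict;
use warnings;

my $date_re = qr/
    \d{2}      # two digits
    ([\W])     # one non-word separator, captured as group 1
    \d{2}      # two digits
    \1         # the same separator again
    \d{2}      # two digits
/x;

for my $s ("12-05-92", "12-05.92") {
    print "$s: ", ($s =~ $date_re ? "match" : "no match"), "\n";
}
```

The backreference \1 requires both separators to be identical, so the second sample does not match.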

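Substitution format

The operators and modifiers are largely the same as for matching

Format: string operator s/pattern/replacement/modifiers

Operators
=~ performs the substitution and returns true if anything was replaced
!~ also performs the substitution but negates the return value, so it is rarely useful here

Basic substitution (the second part replaces what the first part matched)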

$line =~s/want/hate/i;
  print "good \n";
  print "\$line is :$line";
  print "\$\& is : $&", "\n";

[macg@localhost perltest]$ ./tip.pl
good
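$line is :I hate TO go to        <- unlike matching, substitution modifies the string on the left of =~
$& is : WANT        <- $& after a substitution is the same as after a match: the text that matched

    Modifier i: ignore case
$line =~s/want/hate/i;    # replaces the first want, WANT, Want, ... in $line with hate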

    Deletion (replace with nothing)
Deleting a single occurrence is rarely useful; in practice you almost always delete globally (g)

$line =~s/want//i;
print "\$line is :$line";

[macg@localhost perltest]$ ./tip.pl
$line is :I TO go to

g replaces globally, i.e. every occurrence. By default a substitution replaces the first match and then stops

$line =~s/want/hate/ig;       # modifiers can be combined
  print "good \n";
  print "\$line is :$line";

[macg@localhost perltest]$ cat gogo
I WANT TO go to NY. And I also want to be DC.
I glad to go to

[macg@localhost perltest]$ ./tip.pl
$line is :I hate TO go to NY. And I also hate to be DC.

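How g differs between substitution and matching

With matching, g means "find next", so it has to be combined with while or a similar loop
Substitution needs no loop: a single statement replaces everything. It does not work as "find, then find next"; it finds and replaces all occurrences in one pass.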
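
A sketch of the contrast, using the sample line from the gogo file. The variable names are illustrative. As a side note, s///g returns the number of replacements it made.

```perl
use strict;
use warnings;

my $line = "I WANT TO go to NY. And I also want to be DC.";

# Matching with g: a loop is needed to visit every occurrence.
my $hits = 0;
$hits++ while $line =~ /want/ig;

# Substitution with g: one statement replaces them all.
my $replaced = ($line =~ s/want/hate/ig);

print "matches: $hits, replacements: $replaced\n";
print "$line\n";
```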

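    The e modifier treats the replacement as an expression and evaluates it before substituting
$string = "0abc1";
$string =~ s/[a-zA-Z]+/$& x 2/e;  # repeat the letters (non-digits) in the middle twice
# now $string is "0abcabc1"
# $& is the matched text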
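
A runnable sketch of e, first with the example above and then combined with g and a capture group. The second string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $string = "0abc1";
# Copy first, then substitute on the copy, so $string stays intact.
(my $doubled = $string) =~ s/[a-zA-Z]+/$& x 2/e;

# e combined with g: every number is replaced by twice its value.
my $prices = "apple 3, pear 12";
(my $twice = $prices) =~ s/(\d+)/$1 * 2/ge;

print "$doubled\n$twice\n";
```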


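    Transliteration format

string operator tr/search list/replacement list/modifiers
string operator y/search list/replacement list/modifiers

Operators: =~ !~
Modifiers:
d delete characters that are found but have no counterpart in the replacement list
s squeeze runs of the same translated character into one
c complement: operate on the characters (newline included) that are not in the search list

The most basic transliteration: lower case to upper case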

$line =~tr/a-z/A-Z/;
print "\$line is :$line";

[macg@localhost perltest]$ cat gogo
I WANT TO go to NY. And I also want to be DC.

[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANT TO GO TO NY. AND I ALSO WANT TO BE DC.

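Like substitution, transliteration modifies the string

    Deletion:     =~tr/characters to delete//d

    Deleting with a global substitution and deleting with tr are equivalent
Global substitution    $line =~s/\t//g;
Transliteration        $line =~tr/\t//d;

$line =~tr/\t//;     # an attempt to delete all TABs by translating them to nothing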
print "\$line is :$line";

[macg@localhost perltest]$ ./tip.pl
$line is :I WANT TO go to NY. And I also want to be DC.

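The TABs are still there. Without the d modifier, an empty replacement list means "same as the search list", so tr/\t// changes nothing and only returns the number of TABs it found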

$line =~tr/\t//d;
print "\$line is :$line";

[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANT TO go to NY. And I also want to be DC.
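
The difference between counting and deleting can be sketched as follows. The sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = "I\tWANT\tTO go";

# Without d: nothing is changed, the return value is the number of TABs.
my $count = ($line =~ tr/\t//);

# With d: the TABs are removed (done on a copy here).
(my $no_tabs = $line) =~ tr/\t//d;

# The equivalent global substitution.
(my $no_tabs_s = $line) =~ s/\t//g;

print "tabs: $count\n$no_tabs\n";
```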

    Squeezing repeated characters:   =~ tr/a-zA-Z//s;      rarely needed in practice
$line=~ tr/a-zA-Z//s;
  print "\$line is :$line";
[macg@localhost perltest]$ cat gogo
I WANTWANT TO go to NNYY. And I also wWant to be DC.
[macg@localhost perltest]$ ./tip.pl
good
$line is :I WANTWANT TO go to NY. And I also wWant to be DC.

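    !~ is of no practical use with tr (it only negates the returned count); use =~ and, to act on the characters outside the set, the c modifier
$text="1 abc 23 PID";
$text =~ tr/0-9/ /c;      # 0-9 with c means "every non-digit"; here each one becomes a space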
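
A sketch of three common uses of the c modifier on the sample string. The replacement lists chosen here are assumptions for illustration. Note that tr takes plain character lists, so square brackets are not needed around 0-9.

```perl
use strict;
use warnings;

my $text = "1 abc 23 PID";

# Replace every non-digit with a space.
(my $spaced = $text) =~ tr/0-9/ /c;

# Delete every non-digit.
(my $digits = $text) =~ tr/0-9//cd;

# Count the non-digits without changing the string.
my $non_digits = ($text =~ tr/0-9//c);

print "[$spaced] [$digits] $non_digits\n";
```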

A sample program that decodes a CGI form field value:

$value="%A4T%A4K%21";
$value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;

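s finds each % escape sequence,
captures its two hex digits in $1,
and, because of e, evaluates pack("C",hex($1)) to decode it
hex($1) converts the hexadecimal digits to a number, and pack("C",...) turns that number into the character with that code
C stands for an unsigned char value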
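
Wrapped in a function, the decoding step might look like the sketch below. The name url_decode is hypothetical, and the extra step that turns + into a space is an assumption based on how HTML forms are usually encoded; it is not part of the original example.

```perl
use strict;
use warnings;

# Decode a form-encoded value: "+" becomes a space, "%XX" becomes one byte.
sub url_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;    # assumption: form encoding writes spaces as "+"
    $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;
    return $value;
}

print url_decode("x+y%3D1%21"), "\n";
```

The + must be translated before the % escapes are decoded, otherwise a literal plus sign encoded as %2B would be turned into a space as well.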

posted on 2012-03-10 15:44 by xzc    category: linux/unix
