
           Pattern matching    (grep)

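          string operator /pattern/modifiers
          Operators
          =~          true if the string matches the pattern
          !~          true if the string does not match the pattern
          Modifiers
          i  ignore case
          x  ignore whitespace in the pattern
          g  global match: keep searching from where the last match ended, like "find next"
             Example: scan the file gogo for lines that contain "want"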
          #!/usr/bin/perl

          $file="/home/macg/perltest/gogo";
          &gotest($file);

          sub gotest{
          my(@tmp)=@_;

          open (MYFILE, $tmp[0]) || die ("Could not open file");
          my($line,$newline);
          while ($line=<MYFILE>) {
          if($newline=($line=~/want/)) {          # look for "want" in the line
            print "found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
           } else {
            print "not found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
              }
            }
          close(MYFILE);
          }    
           
          [macg@localhost perltest]$ ./tip.pl
          not found
          $line is:I glad to be Los angle
          $newline is:
          found
          $line is:I want to be Los angle
          $newline is:1
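
The same scan can be written as a small reusable routine. The following is only a sketch: the helper name lines_containing is made up for illustration, and the text is read through an in-memory filehandle so that it runs without the file gogo.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the lines of $text that contain $word (substring match, like grep).
# lines_containing is a hypothetical helper, not part of the original script.
sub lines_containing {
    my ($text, $word) = @_;
    my @hits;
    # Open the string as an in-memory file so no real file is needed.
    open(my $fh, '<', \$text) or die "Could not open in-memory file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        # \Q...\E makes any special characters in $word match literally.
        push @hits, $line if $line =~ /\Q$word\E/;
    }
    close($fh);
    return @hits;
}

my $sample = "I glad to be Los angle\nI want to be Los angle\n";
print "$_\n" for lines_containing($sample, "want");
```

With a real file, the open call would take the file name instead of a reference to a string.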


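              By default the pattern delimiter is the forward slash /, but with the m prefix you can choose your own, e.g.:
          m!/u/jqpublic/perl/prog1!    is equivalent to    /\/u\/jqpublic\/perl\/prog1/
          Once another delimiter is used, / is no longer special inside the pattern and does not need to be written as \/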
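
A quick way to convince yourself that the delimiter is only notation is to run the same pattern with several delimiters. This is a minimal sketch; match_three_ways is a hypothetical name used only here.

```perl
use strict;
use warnings;

# Match the same path with three different delimiters and return the results.
# match_three_ways is a hypothetical helper for illustration.
sub match_three_ways {
    my ($path) = @_;
    my $slash  = $path =~ /\/u\/jqpublic\/perl\/prog1/ ? 1 : 0;  # every / escaped
    my $bang   = $path =~ m!/u/jqpublic/perl/prog1!    ? 1 : 0;  # no escaping needed
    my $braces = $path =~ m{/u/jqpublic/perl/prog1}    ? 1 : 0;  # paired delimiters
    return ($slash, $bang, $braces);
}

print join(" ", match_three_ways("/u/jqpublic/perl/prog1")), "\n";   # prints "1 1 1"
```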


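            Pattern elements
          \d or \d+    a digit [0-9]; \d+ is one or more digits
          \D or \D+    any character that is not a digit; \D+ is one or more of them
          /[\da-z]/   same as /[0-9a-z]/
          ^    /^def/  matches only strings that start with def
          $    /def$/  matches only strings that end with def
          /\\/         a literal backslash; \ is the escape character
          /\*+/        one or more literal asterisks
                       characters with a special meaning in a pattern, such as . * + ? ( ) [ ] { } ^ $ | \ and the delimiter, must be escaped with \ to match literally
          []           matches any one character from a set
          * + ?        quantifiers (zero or more, one or more, zero or one); . matches any single character except newline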


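             * and + cannot start a pattern (a quantifier needs something before it to repeat), so spell out the characters to match explicitly, e.g. [0-9a-zA-Z]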
          $line=~
          syntax error at ./address.pl line 6, near "out @int_hwaddress"
          Quantifier follows nothing in regex; marked by <-- HERE in m at ./address.pl line 41
          Changed to
          $line=~/[0-9a-zA-Z]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+/
          .
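
The "Quantifier follows nothing" error can be reproduced safely by compiling a pattern inside eval. This is a sketch under the assumption that the pattern arrives as a string; pattern_ok is a made-up helper name.

```perl
use strict;
use warnings;

# Compile a pattern held in a string and report whether it is valid.
# pattern_ok is a hypothetical helper used only for this illustration.
sub pattern_ok {
    my ($pattern) = @_;
    # qr// dies on an invalid pattern; eval turns that into a false result.
    return eval { my $re = qr/$pattern/; 1 } ? 1 : 0;
}

print pattern_ok('+:[0-9a-fA-F]+'), "\n";             # 0: the leading + has nothing to repeat
print pattern_ok('[0-9a-fA-F]+:[0-9a-fA-F]+'), "\n";  # 1: each + follows a character class
print pattern_ok(quotemeta('*+')), "\n";              # 1: quotemeta escapes * and +
```

quotemeta (or \Q...\E inside a pattern) is the built-in way to escape punctuation that should match literally.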

              A space in a pattern is written as a literal space " "
          if(($input=~/^ping$/i)||($input=~/^ping $/i))
          macg>ping
          command:[ping xxx.xxx.xxx.xxx]
          macg>ping                      (with one trailing space)
          command:[ping xxx.xxx.xxx.xxx]
          IEI-nTracker>ping               (with two trailing spaces)
          Use of uninitialized value in concatenation (.) or string at /nettracker/ntshell/ntshell
           
          if(($input=~/^ping$/i)||($input=~/^ping +$/i))
          IEI-nTracker>ping     (with several trailing spaces)
          command:[ping xxx.xxx.xxx.xxx]

          if($_[0]=~/^ping +[0-9a-zA-Z\.]+$/i)
          Matches "ping xxxxx" or "ping        xxxx"
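
The two ping tests can also be folded into single patterns by using \s* and \s+, which cover "no blanks", "one blank" and "many blanks" at once. The sketch below assumes a helper called parse_ping and return strings that are invented for illustration; they are not part of the original shell.

```perl
use strict;
use warnings;

# Classify one line of shell input. parse_ping and its return values
# are hypothetical, chosen only to show the patterns.
sub parse_ping {
    my ($input) = @_;
    # Bare "ping" followed by any amount of whitespace, including none.
    return "usage"     if $input =~ /^ping\s*$/i;
    # "ping", at least one blank, then a host made of letters, digits and dots.
    return "target:$1" if $input =~ /^ping\s+([0-9a-zA-Z.]+)\s*$/i;
    return "unknown";
}

print parse_ping("ping"), "\n";              # usage
print parse_ping("ping   "), "\n";           # usage
print parse_ping("PING   10.0.0.1"), "\n";   # target:10.0.0.1
```

Note that \s also matches TAB and newline, which is usually what interactive input needs.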


              Does =~/want/ mean "equals" or "contains"?
          Contains, of course
          "Equals" is an exact match, written /^want$/


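              Format matching (the whole string must fit, not just contain the pattern), typically used to validate specially formatted input such as an IP address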
          #!/usr/bin/perl

          $file="/home/macg/perltest/gogo";
          &gotest($file);

          sub gotest{
          my(@tmp)=@_;

          open (MYFILE, $tmp[0]) || die ("Could not open file");
          my($line,$newline);
          while ($line=<MYFILE>) {
          if ($line =~ /\d+\.\d+\.\d+\.\d+/) {
            print "$line";
            print "the ip add is good\n";
           } else {
            print "$line";
            print "the ip add is a error\n";
              }
            }
          close(MYFILE);
          }   
           
          [macg@localhost perltest]$ cat gogo
          202.106.0.20
          10.0.0.as

          [macg@localhost perltest]$ ./tip.pl
          202.106.0.20
          the ip add is good
          10.0.0.as
          the ip add is a error 
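          The example above is still wrong. Without anchors the pattern only has to match somewhere in the line, so a line with extra text around a valid address is accepted too. Add ^ and $:
          [macg@localhost perltest]$ cat gogo
          202.106.0.20
          10.0.0.as
          10.0.0.1 as   
          Change the test to  if ($line =~ /^\d+\.\d+\.\d+\.\d+$/) {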
          [macg@localhost perltest]$ ./tip.pl
          202.106.0.20
          the ip add is good
          10.0.0.as
          the ip add is a error
          10.0.0.1 as
          the ip add is a error   
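
Even the anchored pattern accepts values such as 999.999.999.999, because \d+ does not limit the range. A stricter check could look like the following sketch; is_ipv4 is a hypothetical helper name, and the 0 to 255 range test is an addition that the original script does not perform.

```perl
use strict;
use warnings;

# Return 1 if $text is a dotted quad whose four numbers are all in 0..255.
# is_ipv4 is a hypothetical helper; the input is expected to be chomped.
sub is_ipv4 {
    my ($text) = @_;
    # \z anchors at the very end of the string, so a trailing newline is rejected too.
    my @octets = $text =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/;
    return 0 unless @octets == 4;
    for my $octet (@octets) {
        return 0 if $octet > 255;
    }
    return 1;
}

for my $candidate ("202.106.0.20", "10.0.0.as", "10.0.0.1 as", "999.1.1.1") {
    printf "%-14s %s\n", $candidate, is_ipv4($candidate) ? "good" : "error";
}
```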


                How /want/g differs from /want/: the search position moves forward, like find next
          $line="inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
          print "$&\n";       # $& holds the matched text
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
          print "$&\n";
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/;
          print "$&\n";
          [root@nm testpl]# ./tip.pl
          192.168.10.17
          192.168.10.17
          192.168.10.17 
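          All three searches return the same result, because each one starts again from the beginning of the string and finds the first match
          As soon as one match is found, the operator returns 1 and stops
          With g added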
          $line="inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
          print "$&\n";
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
          print "$&\n";
          $line=~/(\d+)\.(\d+)\.(\d+)\.(\d+)/g;
          print "$&\n";
          [root@nm testpl]# ./tip.pl
          192.168.10.17
          192.168.10.255
          255.255.255.0  
          With g, matching works like a file pointer: each successful match moves the search position to just past that match
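
In practice the repeated statements are usually replaced by a while loop, and pos() shows where the next search will start. In list context /g returns every match at once. A minimal sketch, with find_ips as a made-up helper name:

```perl
use strict;
use warnings;

my $line = "inet addr:192.168.10.17  Bcast:192.168.10.255  Mask:255.255.255.0";

# Scalar context: every pass resumes just after the previous match.
while ($line =~ /(\d+\.\d+\.\d+\.\d+)/g) {
    print "found $1, next search starts at offset ", pos($line), "\n";
}

# List context: /g returns all matches in one call.
# find_ips is a hypothetical helper name.
sub find_ips {
    my ($text) = @_;
    return $text =~ /(\d+\.\d+\.\d+\.\d+)/g;
}

my @all = find_ips($line);
print scalar(@all), " addresses: @all\n";
```

When the while loop finally fails to match, the search position is reset, so the later call starts from the beginning of the string again.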



             With "or", the order of the alternatives matters, especially for substring matches: put the more specific pattern first
          if (($ret=($line=~/eth[0-5]/))||($ret=($line=~/eth[0-5]:[0-5]/)))
           {
          print "$&:";
          $found=1;
           } elsif ($found)
            {
            print $line;
            $found=0;
            }
          }
          [mac@nm testpl]$ ./address.pl
          eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
          eth0:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
          eth0:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
          eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1
          Put the more specific pattern first
          if (($ret=($line=~/eth[0-5]:[0-5]/))||($ret=($line=~/eth[0-5]/)))
          [mac@nm testpl]$ ./address.pl
          eth0:          inet addr:10.4.3.117  Bcast:10.4.255.255  Mask:255.255.0.0
          eth0:1:          inet addr:192.168.10.117  Bcast:192.168.10.255  Mask:255.255.255.0
          eth0:2:          inet addr:192.168.1.142  Bcast:192.168.1.255  Mask:255.255.255.0
          eth1:          BROADCAST MULTICAST  MTU:1500  Metric:1 
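          When the specific pattern is tried first and matches, the looser (partial) pattern is skipped; in the other order the looser pattern matches first and the specific one is never tried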
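
The ordering problem can be avoided altogether by making the alias suffix an optional part of one pattern. This is a sketch only; iface_name is a hypothetical helper and the sample lines are invented in the style of ifconfig output.

```perl
use strict;
use warnings;

# Return the interface name at the start of a line, or undef if there is none.
# iface_name is a hypothetical helper name.
sub iface_name {
    my ($line) = @_;
    # (?::[0-5])? is an optional, non-capturing ":N" suffix. The ? is greedy,
    # so the suffix is included whenever an alias is present.
    return $line =~ /^(eth[0-5](?::[0-5])?)/ ? $1 : undef;
}

for my $line ("eth0      Link encap:Ethernet", "eth0:1    Link encap:Ethernet") {
    print iface_name($line), "\n";
}
```

The same rule applies inside a single pattern with |: in /eth[0-5]:[0-5]|eth[0-5]/ the alternatives are tried left to right, so the longer form has to come first.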


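              $line =~ /want/ only returns 1 (match) or an empty string (no match)
          Matching does not modify the string on the left of =~
          Substitution is the opposite: it does modify the string on the left of =~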
          my($line,$newline);
          while ($line=<MYFILE>) {
          if($newline=($line=~/want/)) {
            print "found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
           } else {
            print "not found\n";
            print "\$line is:$line";
            print "\$newline is:$newline \n";
              }
            }
          close(MYFILE);
          }   
          [macg@localhost perltest]$ ./tip.pl
          not found
          $line is:I glad to be Los angle
          $newline is:
          found
          $line is:I want to be Los angle
          $newline is:1 


              How do you get the text that matched?    Use $&
          while ($line=<MYFILE>) {
          if ($line =~ /want/) {
            print "$line";
            print "\$\& is $&\n";
            print "good\n";
           } else {
            print "$line";
            print " error\n";
              }
            }   
          [macg@localhost perltest]$ ./tip.pl
          I want go to LA. and I also want to be NY.       $line is unchanged
          $& is want                         $& is the matched text
          good
          But I glad to be D.C.
           error   

             Note that if ($line=~/want/) has nothing to do with assignment, so the value of $line cannot change. $line is just the left operand of =~, and nothing is returned into $line

              What $&  $`  $' mean
          while ($line=<MYFILE>) {
          if ($line =~ /want/g) {
            print "good\n";
            print "$line";
            print  $& . "\n";
            print  $` . "\n";
            print  $' . "\n";
            print "good\n";
           } else {
            print " error\n";
            print "$line";
              }
            }  
          [macg@localhost perltest]$ ./tip.pl
          good
          I want go to LA. and I also want to be NY.
          want                  $&      the text of the last successful match, i.e. the result
          I                      $`       everything before the match
           go to LA. and I also want to be NY.    $'      everything after the match
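
Since Perl 5.10 the same three pieces are also available as ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} when the match uses the p modifier. The sketch below assumes a helper named split_around, which is invented for this example.

```perl
use strict;
use warnings;

# Split $line into the text before, at, and after the first occurrence of $word.
# split_around is a hypothetical helper; it returns an empty list on no match.
sub split_around {
    my ($line, $word) = @_;
    if ($line =~ /\Q$word\E/p) {   # p keeps the pre-match, match and post-match copies
        return (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    }
    return;
}

my ($before, $match, $after) =
    split_around("I want go to LA. and I also want to be NY.", "want");
print "<${before}><${match}><${after}>\n";
# prints "<I ><want>< go to LA. and I also want to be NY.>"
```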

           

             !~ is true when the string does not match the pattern      (rarely needed, because if ( =~ ) ... else covers the same case)


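              Perl lets you divide a pattern into sub-patterns and refer to the sub-matches as $1, $2, $3, $4
          Steps:
          1. Put ( ) around each sub-pattern you want to extract
          2. Refer to the captured text as $1, $2, ...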
          if ($line =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) {
            print "good \n";
            print $line;
            print  $& . "\n";
            print $1,"\n";
            print $2,"\n";
            print $3,"\n";
            print $4,"\n";
          [macg@localhost perltest]$ ./tip.pl
          good
          202.106.0.20
          202.106.0.20
          202
          106
          0
          20  
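
In list context a match returns its captured groups directly, which avoids touching $1 to $4 one by one. A minimal sketch, with split_ip as a hypothetical helper name:

```perl
use strict;
use warnings;

# Return the four numeric fields of a dotted quad, or an empty list.
# split_ip is a hypothetical helper name.
sub split_ip {
    my ($ip) = @_;
    # In list context the match returns ($1, $2, $3, $4), or () on failure.
    my @parts = $ip =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/;
    return @parts;
}

my @fields = split_ip("202.106.0.20");
print join(",", @fields), "\n";   # prints "202,106,0,20"
```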
           

             Modifier i          ignore case
          if ($line =~ /want/i) {
            print "good \n";
            print $line;
            print  $& . "\n"; 
          [macg@localhost perltest]$ ./tip.pl
          good
          I WANT TO go to
          WANT  

           
              Modifier x: ignore whitespace in the pattern
          /\d{2} ([\W]) \d{2} \1 \d{2}/x    is equivalent to    /\d{2}([\W])\d{2}\1\d{2}/
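
The x modifier also allows # comments inside the pattern, which is its main practical use. The sketch below spreads the pattern from the text over several lines; the names $date_like and matches_date_like are assumptions made for the example.

```perl
use strict;
use warnings;

# The pattern from the text, one element per line, with comments.
my $date_like = qr{
    \d{2}      # two digits
    ([\W])     # one non-word separator, remembered as group 1
    \d{2}      # two digits
    \1         # the same separator again
    \d{2}      # two digits
}x;

# matches_date_like is a hypothetical helper name.
sub matches_date_like {
    my ($text) = @_;
    return $text =~ $date_like ? 1 : 0;
}

print matches_date_like("12-03-10"), "\n";   # 1: both separators are "-"
print matches_date_like("12-03/10"), "\n";   # 0: the separators differ
```

Under x a literal space has to be written as "\ " or "[ ]", and comments should avoid $ and @ because variables are still interpolated there.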

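           Substitution

          The operators and modifiers are basically the same as for matching
          Format: string operator  s/pattern/replacement/modifiers
          Operators
          =~          substitute where the pattern matches; returns the number of substitutions made
          !~          performs the same substitution but returns the negated result, so it is rarely useful with s///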


             Basic substitution (the second part replaces the first)
          $line =~s/want/hate/i;
            print "good \n";
            print "\$line is :$line";
            print "\$\& is : $&", "\n";  
          [macg@localhost perltest]$ ./tip.pl
          good
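          $line is :I hate TO go to    unlike matching, substitution modifies the string on the left of =~
          $& is : WANT      $& after a substitution is the same as after a match: the text that matched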
           

              Modifier i: ignore case
          $line =~s/want/hate/i;    # replaces the first want, WANT or Want in $line with hate


              Deletion (substitute with nothing)
          Deleting a single occurrence is rarely useful; in practice deletion is almost always global (g)

          $line =~s/want//i;
            print "\$line is :$line";
          [macg@localhost perltest]$ ./tip.pl
          $line is :I  TO go to 
            
           
              g: global substitution, replaces every match. By default a substitution replaces the first match and stops
          $line =~s/want/hate/ig;       # modifiers can be combined
            print "good \n";
            print "\$line is :$line";
          [macg@localhost perltest]$ cat gogo
          I WANT TO go to NY. And I also want to be DC.
          I glad to go to  

          [macg@localhost perltest]$ ./tip.pl
          $line is :I hate TO go to NY. And I also hate to be DC.
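
Two details are worth knowing about global substitution: s///g returns the number of replacements, and from Perl 5.14 on the r modifier returns the new string instead of modifying the original. The sketch below uses a hypothetical helper named replace_all.

```perl
use strict;
use warnings;

# Return the rewritten text and how many replacements were made.
# replace_all is a hypothetical helper name.
sub replace_all {
    my ($text) = @_;
    # s///g returns the number of replacements, or an empty string if none.
    my $count = ($text =~ s/want/hate/ig) || 0;
    return ($text, $count);
}

my $orig = "I WANT TO go to NY. And I also want to be DC.";
my ($changed, $count) = replace_all($orig);
print "$count replacements: $changed\n";

# With r (Perl 5.14 or later) the result is returned and $orig stays untouched.
my $copy = $orig =~ s/want/hate/igr;
print "copy:     $copy\n";
print "original: $orig\n";
```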



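          How g differs between substitution and matching
          • In a match, g means find next, so it is normally used with a loop such as while
          • A substitution needs no loop: one statement replaces every match, i.e. find all in one go instead of find and find next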


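              The e modifier treats the replacement as an expression and evaluates it before substituting
          $string = "0abc1";
          $string =~ s/[a-zA-Z]+/$& x 2/e;  # doubles the run of letters in the middle
          now $string = "0abcabc1"
          $& is the matched text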
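
The e modifier is often combined with g and a capture group to compute each replacement from the matched text. A small sketch, with double_numbers as a made-up helper name:

```perl
use strict;
use warnings;

# Replace every run of digits with twice its numeric value.
# double_numbers is a hypothetical helper name.
sub double_numbers {
    my ($text) = @_;
    $text =~ s/(\d+)/$1 * 2/ge;   # the replacement is Perl code, run once per match
    return $text;
}

print double_numbers("3 apples and 10 pears"), "\n";   # prints "6 apples and 20 pears"
```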
              

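              Transliteration
          string operator tr/search list/replacement list/modifiers
          string operator y/search list/replacement list/modifiers
          Operators: =~  !~
          Modifiers:
          d delete characters that are found but have no replacement
          s squeeze runs of the same translated character into one
          c complement: the characters NOT in the search list (including newline) are the ones translated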


             The most basic transliteration: lower case to upper case
          $line =~tr/a-z/A-Z/;
            print "\$line is :$line"; 
          [macg@localhost perltest]$ cat gogo
          I WANT TO go to NY. And I also want to be DC.
           
          [macg@localhost perltest]$ ./tip.pl
          good
          $line is :I WANT TO GO TO NY. AND I ALSO WANT TO BE DC.   
          Like substitution, transliteration modifies the string
           

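              Deletion:     =~tr/characters to delete//d

              Global substitution with an empty replacement and transliteration with d are equivalent
          global substitution    $line =~s/\t//g;
          transliteration        $line =~tr/\t//d;
            $line =~tr/\t//;            # first attempt at deleting every TAB, without d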
            print "\$line is :$line";
          [macg@localhost perltest]$ ./tip.pl
          $line is :I WANT TO              go to NY. And   I also want to          be DC.            
          The TABs are still there. Without d, tr with an empty replacement list changes nothing and only counts the TABs, so d is required
            $line =~tr/\t//d;
            print "\$line is :$line";
          [macg@localhost perltest]$ ./tip.pl
          good
          $line is :I WANT TO go to NY. And I also want to be DC. 
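
The three forms can be compared side by side. In this sketch tab_report is a hypothetical helper that returns the TAB count together with the results of both deletion styles.

```perl
use strict;
use warnings;

# Count the TABs in $text and delete them in two equivalent ways.
# tab_report is a hypothetical helper name.
sub tab_report {
    my ($text) = @_;
    my $tabs = ($text =~ tr/\t//);        # no d: counts the TABs, changes nothing
    (my $via_tr = $text) =~ tr/\t//d;     # transliteration with d deletes them
    (my $via_s  = $text) =~ s/\t//g;      # global substitution deletes them too
    return ($tabs, $via_tr, $via_s);
}

my ($tabs, $via_tr, $via_s) = tab_report("I WANT TO\tgo to NY.\tAnd");
print "$tabs TABs\n";
print "tr: $via_tr\n";
print "s:  $via_s\n";
```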


              Squeezing repeated characters:   =~ tr/a-zA-Z//s;      (rarely useful in practice)
          $line=~ tr/a-zA-Z//s;
            print "\$line is :$line";  
          [macg@localhost perltest]$ cat gogo
          WANTWANT TO go to NNYY. And I also wWant to be DC.  
          [macg@localhost perltest]$ ./tip.pl
          good
          $line is :I WANTWANT TO go to NY. And I also wWant to be DC.  
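
Squeezing only collapses runs of the same character, and upper and lower case count as different characters. One place where it is handy is collapsing runs of spaces. The helper names below are invented for this sketch.

```perl
use strict;
use warnings;

# Collapse runs of the same letter into one. Hypothetical helper name.
sub squeeze_letters {
    my ($text) = @_;
    $text =~ tr/a-zA-Z//s;
    return $text;
}

# Collapse runs of spaces into a single space. Hypothetical helper name.
sub squeeze_spaces {
    my ($text) = @_;
    $text =~ tr/ //s;
    return $text;
}

print squeeze_letters("go to NNYY. wWant"), "\n";   # prints "go to NY. wWant"
print squeeze_spaces("a   b  c"), "\n";             # prints "a b c"
```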


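              With tr, use =~ rather than !~; the c modifier already plays the role of "not"
          $text="1 abc 23 PID";
          $text =~ tr/0-9//c;      # 0-9 with c means every non-digit; without d this only counts them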
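
A short sketch of the c modifier in both of its common roles, counting and deleting. The helper names are made up for illustration. Note that the search list is written 0-9 without brackets, because [ and ] are ordinary characters in tr.

```perl
use strict;
use warnings;

# Count the characters that are not digits. Hypothetical helper name.
sub count_non_digits {
    my ($text) = @_;
    return ($text =~ tr/0-9//c);      # c complements the set; nothing is changed
}

# Keep only the digits. Hypothetical helper name.
sub digits_only {
    my ($text) = @_;
    $text =~ tr/0-9//cd;              # c plus d deletes everything except digits
    return $text;
}

print count_non_digits("1 abc 23 PID"), "\n";   # prints 9
print digits_only("1 abc 23 PID"), "\n";        # prints 123
```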
           
           
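          A sample that decodes a CGI form value:
          $value="%A4T%A4K%21";
          $value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
          s replaces every %XX sequence
          and captures the two hex digits in $1
          The e modifier evaluates pack("C",hex($1)) to decode $1
          hex($1) converts the hex digits to a number, and pack("C", ...) turns that number into the single byte with that value
          C stands for an unsigned char value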
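
Wrapped in a subroutine, the decoding step might look like the sketch below. The name url_decode is an assumption, and the translation of + into a space is an addition that the sample above does not include; it reflects the usual encoding of form data, where spaces are sent as +.

```perl
use strict;
use warnings;

# Decode a form-encoded value. url_decode is a hypothetical helper name.
sub url_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                                      # added step: + stands for a space
    $value =~ s/%([0-9a-fA-F]{2})/pack("C", hex($1))/eg;    # %XX becomes the byte 0xXX
    return $value;
}

print url_decode("a%21b+c"), "\n";   # prints "a!b c"
```

The result is a string of raw bytes; interpreting those bytes as text in a particular character set is a separate step.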





          posted on 2012-03-10 15:44 by xzc    Category: linux/unix