Chan Chen Coding...

          Perl Print Duplicate Line

          # Find out duplicate line, if yes, print it out.
          AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY
          LC_MESSAGES
          AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
          AcceptEnv LC_IDENTIFICATION LC_ALL
           
          # Example of overriding settings on a per-user basis
          #Match User anoncvs
          # X11Forwarding no
          # AllowTcpForwarding no
          # ForceCommand cvs server

          AcceptEnv LC_IDENTIFICATION LC_ALL

          #! /usr/bin/perl
          use strict;

          open(FH, 'dupLine.sample');
          my %seen;
          while (<FH>) {
            $seen{$_}++;
          }
           
          while (my ($line$count) = each %seen) {
            print "$count: $line" if $count > 1;
          }


          Using the standard Perl shorthands:

          my %seen;
          while ( <> ) {
             
          print if $seen{$_}++;
          }

          As a "one-liner":

          perl -ne 'print if $seen{$_}++'

          More data? This prints <file name>:<line number>:<line>:

          perl -ne 'print ( $ARGV eq "-" ? "" : "$ARGV:" ), "$.:$_" if $seen{$_}++'

          Explanation on %seen:

          • %seen declares a hash. For each unique line in the input $seen{$_} is a scalar slot in the hash named by the the text of the line.
          • Using the postfix increment operator (x++) we take the value for our expression, remembering toincrement it after the expression. So, if we haven't "seen" the line $seen{$_} is undefined--but when forced into an numeric "context" like this, it's taken as 0--and false.
          • Then it's incremented to 1.

          So the first time we see a line, we take the undefined value which fails the if. It increments the count at the slot to 1. Thus, it is 1 for any future occurrences at which point it passes the if condition.

          Now as I said above, %seen declares a hash, but with strict turned off, any variable expression can be created on the spot. So the first time perl sees $seen{$_} it knows that I'm looking for %seen, it doesn't have it, so it creates it.

          An added neat thing about this is that at the end, if you care to use it, you have a count of how many times each line was repeated.



          -----------------------------------------------------
          Silence, the way to avoid many problems;
          Smile, the way to solve many problems;

          posted on 2012-04-26 11:01 Chan Chen 閱讀(298) 評論(0)  編輯  收藏 所屬分類: Linux

          主站蜘蛛池模板: 大田县| 安龙县| 怀安县| 贡嘎县| 迁安市| 确山县| 靖安县| 囊谦县| 南漳县| 库尔勒市| 洪湖市| 天津市| 南京市| 茶陵县| 衡山县| 汉源县| 兴山县| 盐边县| 郧西县| 辽宁省| 丰城市| 平泉县| 神木县| 荆州市| 清徐县| 特克斯县| 清镇市| 固镇县| 栾城县| 汕尾市| 湖州市| 乌兰县| 凤阳县| 牟定县| 泾源县| 中阳县| 宿迁市| 社旗县| 秀山| 德惠市| 建瓯市|