weidagang2046's Column


          Text-Processing Commands: sed

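          The test environment is Red Hat 6.0.
          In fact, the sed command on Solaris is more powerful than the one on Linux, but
          since I have no Solaris test environment, only usages tested on Linux are given here.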

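          Contents:

          ★ Command-line options
          ★ First, assume we have a text file named sedtest.txt
          ★ Printing a range of lines: p
          ★ Adding a tab (^I) at the start of every line
          ★ Appending --end to every line
          ★ Showing the line numbers of lines matching a pattern: [/pattern/]=
          ★ Appending text after matching lines: [/pattern/]a\ or [address]a\
          ★ Deleting matching lines: [/pattern/]d or [address1][,address2]d
          ★ Replacing matching lines: [/pattern/]c\ or [address1][,address2]c\
          ★ Inserting text before matching lines: [/pattern/]i\ or [address]i\
          ★ Replacing the matched string (no longer the whole line): [addr1][,addr2]s/old/new/g
          ★ Pattern matching within a restricted range
          ★ Replacing the Nth occurrence of a match on each line
          ★ & stands for the matched text
          ★ Modifying the PATH environment variable with sed
          ★ Measuring and improving sed performance
          ★ Writing to an output file: [address1][,address2]w outputfile
          ★ Reading an input file: [address]r inputfile
          ★ Transliterating characters: [address1][,address2]y/old/new/
          ★ Using !
          ★ Using \cregexpc
          ★ The complexity of regular expressions in sed commands
          ★ Converting a man page to plain text (new)
          ★ The sed man page (produced with the method above)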

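          ★ Command-line options

          sed
          -e script      the sed editing commands are given in script
          -f scriptfile  the sed editing commands are read from the file scriptfile
          -n             quiet mode: suppresses sed's automatic printing of every line, so
                         that only explicitly requested output (for example, from the p
                         command) is shown.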
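
          A minimal sketch of what -n changes, assuming a POSIX shell. The two-line
          input is made-up sample data, not sedtest.txt, and the variable names are
          chosen here for illustration:

```shell
# Made-up two-line input (not the article's sedtest.txt).
input=$(printf 'alpha\nbeta\n')

# Without -n, sed auto-prints every line, so the line selected by p shows up twice.
with_autoprint=$(printf '%s\n' "$input" | sed -e '/beta/p')

# With -n, only what p prints explicitly is output.
without_autoprint=$(printf '%s\n' "$input" | sed -n -e '/beta/p')

printf '%s\n--\n%s\n' "$with_autoprint" "$without_autoprint"
```

          The first result contains beta twice; the second contains it once.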

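          Not clear yet? Never mind; set these details aside and follow along. Note, however,
          that the walkthrough below does not explain regular expressions, so it may be hard
          going if you are not familiar with them.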

          ★ First, assume we have a text file named sedtest.txt

          cat > sedtest.txt
          Sed is a stream editor
          ----------------------
          A stream editor is used to perform basic text transformations on an input stream
          --------------------------------------------------------------------------------
          While in some ways similar to an editor which permits scripted edits (such as ed),
          ----------------------------------------------------------------------------------
          sed works by making only one pass over the input(s), and is consequently more
          -----------------------------------------------------------------------------
          efficient. But it is sed's ability to filter text in a pipeline which particularly
          ----------------------------------------------------------------------------------
          distinguishes it from other types of editors.
          ---------------------------------------------
          ^D

          ★ Printing a range of lines: p

          sed -e "1,4p" -n sedtest.txt
          sed -e "/from/p" -n sedtest.txt
          sed -e "1,/from/p" -n sedtest.txt
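
          The three address forms above (a line-number range, a pattern, and a
          number-to-pattern range) can be tried on a small input. A sketch, in which
          the five-line sample text is made up for illustration:

```shell
# Made-up five-line input standing in for sedtest.txt.
sample=$(printf 'one\ntwo\nthree from here\nfour\nfive\n')

# Lines 2 through 3.
by_number=$(printf '%s\n' "$sample" | sed -n -e '2,3p')

# Every line containing "from".
by_pattern=$(printf '%s\n' "$sample" | sed -n -e '/from/p')

# Line 1 through the first line containing "from".
mixed=$(printf '%s\n' "$sample" | sed -n -e '1,/from/p')

printf '%s\n--\n%s\n--\n%s\n' "$by_number" "$by_pattern" "$mixed"
```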

          ★ Adding a tab (^I) at the start of every line

          sed "s/^/^I/g" sedtest.txt

          Note that ^I is typed as ctrl-v ctrl-i.

          A lone ^ denotes the beginning of the line.
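
          When typing a literal ^I is awkward (in a script file, for example), one
          alternative is to let printf produce the tab. A sketch; the variable name
          tab and the two-line input are chosen here for illustration:

```shell
# Let printf generate a real tab character instead of typing ctrl-v ctrl-i.
tab=$(printf '\t')

# Insert the tab at the start of every line of a made-up two-line input.
indented=$(printf 'a\nb\n' | sed "s/^/${tab}/")
printf '%s\n' "$indented"
```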

          ★ Appending --end to every line

          sed "s/$/--end/g" sedtest.txt

          A lone $ denotes the end of the line.
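
          Inside double quotes the shell leaves a $ that is followed by / alone, which
          is why the command above works. Single quotes avoid having to think about
          it. A small sketch on made-up input:

```shell
# Single quotes keep the shell from ever treating $ as the start of a variable.
suffixed=$(printf 'a\nb\n' | sed 's/$/--end/')
printf '%s\n' "$suffixed"
```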

          ★ Showing the line numbers of lines matching a pattern: [/pattern/]=

          sed -e '/is/=' sedtest.txt

          1
          Sed is a stream editor
          ----------------------
          3
          A stream editor is used to perform basic text transformations on an input stream
          --------------------------------------------------------------------------------
          While in some ways similar to an editor which permits scripted edits (such as ed),
          ----------------------------------------------------------------------------------
          7
          sed works by making only one pass over the input(s), and is consequently more
          -----------------------------------------------------------------------------
          9
          efficient. But it is sed's ability to filter text in a pipeline which particularly
          ----------------------------------------------------------------------------------
          11
          distinguishes it from other types of editors.
          ---------------------------------------------

          This scans sedtest.txt and shows the line number of every line containing the string is.
          Note that line 11 also contains is (inside "distinguishes").
          The output goes to stdout; unless it is redirected, the original sedtest.txt is not affected.
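
          To get only the line numbers, without the lines themselves, = can be
          combined with -n. A sketch on a made-up three-line input:

```shell
# "this" and "his" contain "is"; "that" does not.
line_numbers=$(printf 'this\nthat\nhis\n' | sed -n '/is/=')
printf '%s\n' "$line_numbers"
```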

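          ★ Appending text after matching lines: [/pattern/]a\ or [address]a\

          cat > sedadd.script
          /from/a\
          +++++++++
          ^D

          sed -f sedadd.script sedtest.txt

          Sed is a stream editor
          ----------------------
          A stream editor is used to perform basic text transformations on an input stream
          --------------------------------------------------------------------------------
          While in some ways similar to an editor which permits scripted edits (such as ed),
          ----------------------------------------------------------------------------------
          sed works by making only one pass over the input(s), and is consequently more
          -----------------------------------------------------------------------------
          efficient. But it is sed's ability to filter text in a pipeline which particularly
          ----------------------------------------------------------------------------------
          distinguishes it from other types of editors.
          +++++++++
          ---------------------------------------------

          This finds every line containing the string from and adds +++++++++ on the line after it.
          The output goes to stdout; unless it is redirected, the original sedtest.txt is not affected.

          Many people would rather do this directly on the command line, without an extra
          sedadd.script file. Unfortunately, that requires the line-continuation character \:

          [scz@ /home/scz/src]> sed -e "/from/a\\
          > +++++++++" sedtest.txt

          [scz@ /home/scz/src]> sed -e "a\\
          > +++++++++" sedtest.txt

          The command above adds a new line +++++++++ after every line.

          [scz@ /home/scz/src]> sed -e "1 a\\
          > +++++++++" sedtest.txt

          These two-line commands can also be copied and pasted onto a shell command line
          (without the "> " continuation prompt); the effect is the same.

          [address]a\ accepts only a single address.

          Quoting matters for the a command. The examples above use double quotes, inside
          which the shell reduces \\ to a single \ before sed sees it. Inside single quotes
          the shell leaves backslashes alone, so the doubled form a\\ reaches sed unchanged
          and does not behave the same way; write a single backslash (a\) followed by the
          newline instead. Commands that contain no backslash, such as d, can be quoted
          either way.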
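
          A sketch of the single-quoted form, assuming a POSIX shell: inside single
          quotes the backslash reaches sed untouched, so one backslash before the
          newline is enough. The three-line input is made up for illustration:

```shell
# Single-quoted script: a\ followed by a real newline, then the text to append.
appended=$(printf 'one\ntwo from\nthree\n' | sed -e '/from/a\
+++++++++')
printf '%s\n' "$appended"
```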


          ★ Deleting matching lines: [/pattern/]d or [address1][,address2]d

          sed -e '/---------------------------------------------/d' sedtest.txt

          Sed is a stream editor
          ----------------------
          A stream editor is used to perform basic text transformations on an input stream
          While in some ways similar to an editor which permits scripted edits (such as ed),
          sed works by making only one pass over the input(s), and is consequently more
          efficient. But it is sed's ability to filter text in a pipeline which particularly
          distinguishes it from other types of editors.

          sed -e '6,10d' sedtest.txt
          Deletes lines 6-10, inclusive.

          sed -e "2d" sedtest.txt
          Deletes line 2.

          sed "1,/^$/d" sedtest.txt
          Deletes everything from the first line through the first blank line.
          Note that this command can easily produce surprising results: if sedtest.txt contains
          no blank line after the first line, sed deletes every line through the end of the file.

          sed "1,/from/d" sedtest.txt
          Deletes everything from the first line through the first line containing the string
          from, including that line.
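
          The surprise with a range whose end pattern never matches is easy to
          reproduce. A sketch on made-up input:

```shell
# No blank line anywhere: the range never closes, so every line is deleted.
no_blank=$(printf 'a\nb\nc\n' | sed '1,/^$/d')

# Blank line on line 2: only lines 1-2 are deleted.
with_blank=$(printf 'a\n\nb\nc\n' | sed '1,/^$/d')

printf 'no_blank=[%s]\nwith_blank=[%s]\n' "$no_blank" "$with_blank"
```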

          ★ Replacing matching lines: [/pattern/]c\ or [address1][,address2]c\

          sed -e "/is/c\\
          **********" sedtest.txt

          Finds every line containing the string is and replaces the whole line with **********:

          **********
          ----------------------
          **********
          --------------------------------------------------------------------------------
          While in some ways similar to an editor which permits scripted edits (such as ed),
          ----------------------------------------------------------------------------------
          **********
          -----------------------------------------------------------------------------
          **********
          ----------------------------------------------------------------------------------
          **********
          ---------------------------------------------

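          sed -e "1,11c\\
          **********" sedtest.txt
          Replaces lines 1-11 with the text **********.

          ★ Replacing the matched string (no longer the whole line): [addr1][,addr2]s/old/new/g

          sed "1,12s/from/****/g" sedtest.txt
          Within lines 1-12, finds every occurrence of the string from and replaces each with ****.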
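
          A sketch of address-limited substitution on a made-up three-line input: only
          lines inside the address range are touched, and the g flag replaces every
          occurrence on those lines:

```shell
# Lines 1-2 are in range; line 3 is left alone.
replaced=$(printf 'from from\nfrom\nfrom\n' | sed '1,2s/from/****/g')
printf '%s\n' "$replaced"
```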

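          ★ Pattern matching within a restricted range

          sed "/But/s/is/are/g" sedtest.txt
          On lines containing the string But, replaces every is with are.

          sed "/is/s/t/T/" sedtest.txt
          On lines containing the string is, replaces the first t on each line with T.

          sed "/While/,/from/p" sedtest.txt -n
          Prints everything between the two matching lines, inclusive.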
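
          Keep in mind that the pattern is matches substrings, not whole words. A
          sketch on a made-up two-line input makes this visible:

```shell
# Only the line containing "But" is edited; "this" becomes "thare" because
# the substring "is" inside it matches too.
only_but=$(printf 'this is\nBut this is\n' | sed '/But/s/is/are/g')
printf '%s\n' "$only_but"
```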

          ★ Replacing the Nth occurrence of a match on each line

          sed "s/is/are/5" sedtest.txt
          Replaces the 5th occurrence of the string is on each line with are.
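
          A sketch with a smaller count on made-up input: a line with fewer
          occurrences than the number given is left unchanged.

```shell
# Replace only the 2nd "is" on each line; the second line has just one, so it stays.
second_only=$(printf 'is is is\nis\n' | sed 's/is/are/2')
printf '%s\n' "$second_only"
```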

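          ★ & stands for the matched text

          sed "s/^$/(&)/" sedtest.txt
          Puts a pair of () on every empty line.

          sed "s/is/(&)/g" sedtest.txt
          Wraps every occurrence of the string is in ().

          sed "s/.*/(&)/" sedtest.txt
          Wraps every line in a pair of ().

          sed "/is/s/.*/(&)/" sedtest.txt
          Wraps every line containing the string is in a pair of ().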
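
          A sketch combining two of the & substitutions above on a made-up three-line
          input (the middle line is empty):

```shell
# First wrap every "is" in (), then turn each empty line into ().
wrapped=$(printf 'this is\n\nend\n' | sed -e 's/is/(&)/g' -e 's/^$/(&)/')
printf '%s\n' "$wrapped"
```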

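          ★ Modifying the PATH environment variable with sed

          First, look at the PATH environment variable:
          [scz@ /home/scz/src]> echo $PATH
          /usr/bin:/usr/bin:/bin:/usr/local/bin:/sbin:/usr/sbin:/usr/X11R6/bin:.

          Remove the trailing { :/usr/X11R6/bin:. }
          [scz@ /home/scz/src]> echo $PATH | sed "s/^\(.*\):\/usr[/]X11R6\/bin:[.]$/\1/"
          /usr/bin:/usr/bin:/bin:/usr/local/bin:/sbin:/usr/sbin

          Remove the { :/bin: } in the middle
          [scz@ /home/scz/src]> echo $PATH | sed "s/^\(.*\):\/bin:\(.*\)$/\1\2/"
          /usr/bin:/usr/bin/usr/local/bin:/sbin:/usr/sbin:/usr/X11R6/bin:.
          Note that both colons are removed, so /usr/bin and /usr/local/bin end up joined
          without a separator; use \1:\2 as the replacement to keep one colon.

          [/] makes / lose its special meaning
          \/ likewise makes / lose its special meaning
          \1 refers to the text matched by the first sub-expression
          \2 refers to the text matched by the second sub-expression
          \(.*\) marks a sub-expression

          Remove the trailing : and then append a new path:
          PATH=`echo $PATH | sed 's/\(.*\):$/\1/'`:$HOME/src
          Note the difference between the backquote ` and the single quote '.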
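
          sed accepts delimiters other than / in the s command, which avoids escaping
          every slash in a path. A sketch using | as the delimiter and a made-up
          sample_path variable instead of the real PATH:

```shell
# Made-up value standing in for $PATH, with a trailing colon.
sample_path='/usr/bin:/bin:/usr/local/bin:'

# Drop the trailing colon.
trimmed=$(printf '%s\n' "$sample_path" | sed 's|\(.*\):$|\1|')

# Remove the middle entry /bin but keep one colon as the separator.
without_bin=$(printf '%s\n' "$trimmed" | sed 's|:/bin:|:|')

printf '%s\n%s\n' "$trimmed" "$without_bin"
```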

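          ★ Measuring and improving sed performance

          time sed -n "1,12p" webkeeper.db > /dev/null
          time sed 12q webkeeper.db > /dev/null
          The second command is more efficient than the first: it quits after line 12,
          whereas the first still reads the rest of the file.

          [address]q quits sed when the specified line is reached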
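
          Both forms print the same lines; only the amount of input read differs. A
          sketch on a made-up five-line input:

```shell
# Quit after line 3 (line 3 is still auto-printed before sed exits).
first_three_q=$(printf '1\n2\n3\n4\n5\n' | sed 3q)

# Print lines 1-3 but keep reading to the end of the input.
first_three_p=$(printf '1\n2\n3\n4\n5\n' | sed -n '1,3p')

printf '%s\n--\n%s\n' "$first_three_q" "$first_three_p"
```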

          ★ Writing to an output file: [address1][,address2]w outputfile

          sed "1,10w sed.out" sedtest.txt -n
          Writes lines 1-10 of sedtest.txt to the file sed.out.
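
          A sketch of the w command that works in a temporary directory so nothing in
          the current directory is touched. The file names in.txt and out.txt are
          made up for illustration:

```shell
workdir=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$workdir/in.txt"

# Write lines 2-3 to out.txt; -n keeps sed from also printing the input.
sed -n "2,3w $workdir/out.txt" "$workdir/in.txt"

written=$(cat "$workdir/out.txt")
rm -rf "$workdir"
printf '%s\n' "$written"
```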

          ★ Reading an input file: [address]r inputfile

          sed "1r sedappend.txt" sedtest.txt
          Inserts the contents of sedappend.txt after the first line of sedtest.txt in the output.
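
          A sketch of the r command, again in a temporary directory; main.txt and
          extra.txt are made-up file names:

```shell
workdir=$(mktemp -d)
printf 'line1\nline2\n' > "$workdir/main.txt"
printf 'inserted\n' > "$workdir/extra.txt"

# Read extra.txt into the output right after line 1 of main.txt.
merged=$(sed "1r $workdir/extra.txt" "$workdir/main.txt")

rm -rf "$workdir"
printf '%s\n' "$merged"
```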

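          ★ Transliterating characters: [address1][,address2]y/old/new/

          sed "y/abcdef/ABCDEF/" sedtest.txt
          Replaces each of the lowercase letters a, b, c, d, e, f in sedtest.txt with the
          corresponding uppercase letter A, B, C, D, E, F.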
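
          y maps characters one by one rather than matching the string abcdef as a
          whole. A sketch on made-up input:

```shell
# Each of a-f is mapped individually; g and the space are left alone.
upper=$(printf 'abcdefg face\n' | sed 'y/abcdef/ABCDEF/')
printf '%s\n' "$upper"
```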

          ★ Using !

          sed -e '3,7!d' sedtest.txt
          Deletes every line outside lines 3-7.

          sed -e '1,/from/!d' sedtest.txt
          Finds the first line containing the string from and deletes every line after it.
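
          A sketch of both negated forms on made-up input. Single quotes are used so
          that an interactive shell does not try to interpret the ! itself:

```shell
# Keep only lines 2-4 by deleting everything outside that range.
kept_range=$(printf '1\n2\n3\n4\n5\n' | sed '2,4!d')

# Keep line 1 through the first "from" line; delete the rest.
kept_head=$(printf 'a\nb from\nc\n' | sed '1,/from/!d')

printf '%s\n--\n%s\n' "$kept_range" "$kept_head"
```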

          ★ Using \cregexpc

          sed -e "\:from:d" sedtest.txt
          is equivalent to sed -e "/from/d" sedtest.txt
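
          The alternative delimiter pays off when the pattern itself contains
          slashes. A sketch on made-up path-like input:

```shell
# With : as the delimiter, the slashes in /usr/bin need no escaping.
remaining=$(printf '/usr/bin\n/opt\n' | sed -e '\:/usr/bin:d')
printf '%s\n' "$remaining"
```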

          ★ The complexity of regular expressions in sed commands

          cat > sedtest.txt
          ^\/[}]{.*}[\(]$\)
          ^D

          How can this line be replaced with
          \(]$\)\/[}]{.*}^[
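
          One possible answer, offered as a sketch rather than the author's own
          solution: treat the line as ^ + first part + [ + second part, capture the
          two parts, and emit them in swapped order. Single quotes keep the shell out
          of the way, and the variable name swapped is chosen here for illustration:

```shell
# \^ and \[ match a literal ^ and [; \\( matches a literal backslash followed by (.
# Group 1 is everything between the leading ^ and the [ that precedes \( ;
# group 2 is the rest of the line starting at \( .
swapped=$(printf '%s\n' '^\/[}]{.*}[\(]$\)' |
  sed 's/^\^\(.*\)\[\(\\(.*\)$/\2\1^[/')
printf '%s\n' "$swapped"
```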

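          ★ Converting a man page to plain text (new)

          man sed | col -b > sed.txt
          sed -e "s/^H//g" -e "/^$/d" -e "s/^^I/        /g" -e "s/^I/ /g" sed.txt > sedman.txt
          Deletes all backspace characters and blank lines, replaces a tab at the start of a
          line with 8 spaces, and replaces every other tab with a single space.
          As before, ^H and ^I are literal control characters, typed as ctrl-v ctrl-h and
          ctrl-v ctrl-i.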
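
          The same clean-up can be written without typing control characters by
          letting printf produce them. A sketch on a made-up three-line input; the
          variable names bs and tab are chosen here for illustration:

```shell
bs=$(printf '\b')    # a literal backspace (^H)
tab=$(printf '\t')   # a literal tab (^I)

# Input: a tab-indented line, an empty line, and a line with a stray backspace.
cleaned=$(printf '\tNAME\n\n\tsed\ta\bb\n' |
  sed -e "s/${bs}//g" -e '/^$/d' -e "s/^${tab}/        /" -e "s/${tab}/ /g")
printf '%s\n' "$cleaned"
```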

          ★ The sed man page (produced with the method above)

          NAME
          sed - a Stream EDitor
          SYNOPSIS
          sed [-n] [-V] [--quiet] [--silent] [--version] [--help]
          [-e script] [--expression=script]
          [-f script-file] [--file=script-file]
          [script-if-no-other-script]
          [file...]
          DESCRIPTION
          Sed is a stream editor. A stream editor is used to per-
          form basic text transformations on an input stream (a file
          or input from a pipeline). While in some ways similar to
          an editor which permits scripted edits (such as ed), sed
          works by making only one pass over the input(s), and is
          consequently more efficient. But it is sed's ability to
          filter text in a pipeline which particularly distinguishes
          it from other types of editors.
          OPTIONS
          Sed may be invoked with the following command-line
          options:
          -V
          --version
          Print out the version of sed that is being run and
          a copyright notice, then exit.
          -h
          --help Print a usage message briefly summarizing these
          command-line options and the bug-reporting address,
          then exit.
          -n
          --quiet
          --silent
          By default, sed will print out the pattern space at
          the end of each cycle through the script. These
          options disable this automatic printing, and sed
          will only produce output when explicitly told to
          via the p command.
          -e script
          --expression=script
          Add the commands in script to the set of commands
          to be run while processing the input.
          -f script-file
          --file=script-file
          Add the commands contained in the file script-file
          to the set of commands to be run while processing
          the input.
          If no -e,-f,--expression, or --file options are given on
          the command-line, then the first non-option argument on
          the command line is taken to be the script to be executed.
          If any command-line parameters remain after processing the
          above, these parameters are interpreted as the names of
          input files to be processed. A file name of - refers to
          the standard input stream. The standard input will pro-
          cessed if no file names are specified.
          Command Synopsis
          This is just a brief synopsis of sed commands to serve as
          a reminder to those who already know sed; other documenta-
          tion (such as the texinfo document) must be consulted for
          fuller descriptions.
          Zero-address ``commands''
          : label
          Label for b and t commands.
          #comment
          The comment extends until the next newline (or the
          end of a -e script fragment).
          } The closing bracket of a { } block.
          Zero- or One- address commands
          = Print the current line number.
          a \
          text Append text, which has each embedded newline pre-
          ceeded by a backslash.
          i \
          text Insert text, which has each embedded newline pre-
          ceeded by a backslash.
          q Immediately quit the sed script without processing
          any more input, except that if auto-print is not
          diabled the current pattern space will be printed.
          r filename
          Append text read from filename.
          Commands which accept address ranges
          { Begin a block of commands (end with a }).
          b label
          Branch to label; if label is omitted, branch to end
          of script.
          t label
          If a s/// has done a successful substitution since
          the last input line was read and since the last t
          command, then branch to label; if label is omitted,
          branch to end of script.
          c \
          text Replace the selected lines with text, which has
          each embedded newline preceeded by a backslash.
          d Delete pattern space. Start next cycle.
          D Delete up to the first embedded newline in the pat-
          tern space. Start next cycle, but skip reading
          from the input if there is still data in the pat-
          tern space.
          h H Copy/append pattern space to hold space.
          g G Copy/append hold space to pattern space.
          x Exchange the contents of the hold and pattern
          spaces.
          l List out the current line in a ``visually unambigu-
          ous'' form.
          n N Read/append the next line of input into the pattern
          space.
          p Print the current pattern space.
          P Print up to the first embedded newline of the cur-
          rent pattern space.
          s/regexp/replacement/
          Attempt to match regexp against the pattern space.
          If successful, replace that portion matched with
          replacement. The replacement may contain the spe-
          cial character & to refer to that portion of the
          pattern space which matched, and the special
          escapes \1 through \9 to refer to the corresponding
          matching sub-expressions in the regexp.
          w filename Write the current pattern space to file-
          name.
          y/source/dest/
          Transliterate the characters in the pattern space
          which appear in source to the corresponding charac-
          ter in dest.
          Addresses
          Sed commands can be given with no addresses, in which case
          the command will be executed for all input lines; with one
          address, in which case the command will only be executed
          for input lines which match that address; or with two
          addresses, in which case the command will be executed for
          all input lines which match the inclusive range of lines
          starting from the first address and continuing to the sec-
          ond address. Three things to note about address ranges:
          the syntax is addr1,addr2 (i.e., the addresses are sepa-
          rated by a comma); the line which addr1 matched will
          always be accepted, even if addr2 selects an earlier line;
          and if addr2 is a regexp, it will not be tested against
          the line that addr1 matched.
          After the address (or address-range), and before the com-
          mand, a ! may be inserted, which specifies that the com-
          mand shall only be executed if the address (or address-
          range) does not match.
          The following address types are supported:
          number Match only the specified line number.
          first~step
          Match every step'th line starting with line first.
          For example, ``sed -n 1~2p'' will print all the
          odd-numbered lines in the input stream, and the
          address 2~5 will match every fifth line, starting
          with the second. (This is a GNU extension.)
          $ Match the last line.
          /regexp/
          Match lines matching the regular expression regexp.
          \cregexpc
          Match lines matching the regular expression regexp.
          The c may be any character.
          Regular expressions
          POSIX.2 BREs should be supported, but they aren't com-
          pletely yet. The \n sequence in a regular expression
          matches the newline character. There are also some GNU
          extensions. [XXX FIXME: more needs to be said. At the
          very least, a reference to another document which
          describes what is supported should be given.]
          Miscellaneous notes
          This version of sed supports a \<newline> sequence in all
          regular expressions, the replacement part of a substitute
          (s) command, and in the source and dest parts of a
          transliterate (y) command. The \ is stripped, and the
          newline is kept.
          SEE ALSO
          awk(1), ed(1), expr(1), emacs(1), perl(1), tr(1), vi(1),
          regex(5) [well, one ought to be written... XXX], sed.info,
          any of various books on sed, the sed FAQ
          (http://www.wollery.demon.co.uk/sedtut10.txt,
          http://www.ptug.org/sed/sedfaq.htm).
          BUGS
          E-mail bug reports to bug-gnu-utils@gnu.org. Be sure to
          include the word ``sed'' somewhere in the ``Subject:''
          field.

          Reposted from: http://www.1717du.com/web/Article/os/unix/200505/6033.html

          posted on 2005-08-03 11:20 weidagang2046  Category: Linux
