捕風之巢


DOM4J Usage Summary

DOM4J is an open-source XML parsing package from dom4j.org. Its website defines it as follows:

          Dom4j is an easy to use, open source library for working with XML, XPath and XSLT on the Java platform using the Java Collections Framework and with full support for DOM, SAX and JAXP.

Download: http://sourceforge.net/project/showfiles.php?group_id=16035


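DOM4J is very simple to use. If you understand the basic XML DOM model, you can use it. Its bundled guide, however, is only a single short HTML page, although it covers quite a lot, and there is very little material in Chinese. So I wrote this short tutorial to make things easier. It only covers basic usage; for anything deeper, please explore on your own or look for other resources.

I previously read articles on IBM developerWorks (see the references) that compared the performance of several XML parsing packages. DOM4J performed very well, ranking near the top in many of the tests. (In fact, DOM4J's official documentation also cites this comparison.) That is why I chose DOM4J as the XML parsing tool for this project.

JDOM is the more popular parser in China. Each has its strengths, but DOM4J's most distinctive feature is its heavy use of interfaces, which is the main reason it is considered more flexible than JDOM. As the masters say, "program to an interface." More and more people are using DOM4J. If you are comfortable with JDOM, keep using it and read this article just for comparison; if you are about to pick a parser, you might as well use DOM4J.

Its main interfaces are all defined in the org.dom4j package: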

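Attribute

Attribute defines an XML attribute.

Branch

Branch defines common behavior for nodes that can contain child nodes, such as XML elements (Element) and documents (Document).

CDATA

CDATA defines an XML CDATA section.

CharacterData

CharacterData is a marker interface that identifies character-based nodes, such as CDATA, Comment, and Text.

Comment

Comment defines the behavior of an XML comment.

Document

Document defines an XML document.

DocumentType

DocumentType defines an XML DOCTYPE declaration.

Element

Element defines an XML element.

ElementHandler

ElementHandler defines a handler for Element objects.

ElementPath

ElementPath is used by ElementHandler to obtain the path-level information for the element currently being processed.

Entity

Entity defines an XML entity.

Node

Node defines the polymorphic behavior of all XML nodes in dom4j.

NodeFilter

NodeFilter defines the behavior of a filter or predicate applied to dom4j nodes.

ProcessingInstruction

ProcessingInstruction defines an XML processing instruction.

Text

Text defines an XML text node.

Visitor

Visitor is used to implement the Visitor pattern.

XPath

XPath represents an XPath expression, produced by parsing a string.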

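The names largely tell you what each one means.

To really understand this set of interfaces, the key is to understand how they inherit from one another: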

* interface java.lang.Cloneable
    o interface org.dom4j.Node
        + interface org.dom4j.Attribute
        + interface org.dom4j.Branch
            # interface org.dom4j.Document
            # interface org.dom4j.Element
        + interface org.dom4j.CharacterData
            # interface org.dom4j.CDATA
            # interface org.dom4j.Comment
            # interface org.dom4j.Text
        + interface org.dom4j.DocumentType
        + interface org.dom4j.Entity
        + interface org.dom4j.ProcessingInstruction

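With the hierarchy laid out, a lot becomes clear: most of the interfaces derive from Node. Once you know these relationships, you can avoid ClassCastException in your own code.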
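To make the hierarchy concrete, here is a minimal sketch that checks a node's interface with instanceof before any cast. It assumes dom4j is on the classpath; the class name HierarchySketch and the sample XML are made up for illustration.

```java
import org.dom4j.Attribute;
import org.dom4j.Branch;
import org.dom4j.CharacterData;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

final class HierarchySketch {

    // Names the interface family a node belongs to, using instanceof instead of a blind cast.
    static String kindOf(Node node) {
        if (node instanceof Branch) {
            // Only Document and Element extend Branch, so only they can hold children.
            return (node instanceof Document) ? "Branch/Document" : "Branch/Element";
        }
        if (node instanceof CharacterData) {
            return "CharacterData"; // CDATA, Comment, or Text
        }
        if (node instanceof Attribute) {
            return "Attribute";
        }
        return "Node";
    }

    // Parses a hypothetical sample and reports the kind of four different nodes.
    static String describeSample() {
        try {
            Document doc = DocumentHelper.parseText("<root id=\"1\">hello</root>");
            Element root = doc.getRootElement();
            return kindOf(doc) + ", " + kindOf(root) + ", "
                    + kindOf(root.attribute("id")) + ", " + kindOf(root.node(0));
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should be well-formed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeSample());
    }
}
```

Casting the attribute or the text child to Element here is exactly the kind of mistake that ends in a ClassCastException.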

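Below are some examples (some taken from the documentation bundled with DOM4J) that briefly show how to use it.

1. Reading and parsing an XML document:

Reading and writing XML documents relies mainly on the org.dom4j.io package, which offers two readers: SAXReader, which parses XML through SAX, and DOMReader, which builds a dom4j tree from an existing W3C DOM document. Both hand back the same Document interface, which is the benefit of relying on interfaces.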

// Read XML from a file: takes a file name, returns the XML document
public Document read(String fileName) throws MalformedURLException, DocumentException {
    SAXReader reader = new SAXReader();
    Document document = reader.read(new File(fileName));
    return document;
}

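The reader's read method is overloaded and can read from several kinds of source, such as an InputStream, a File, or a URL. The Document object you get back represents the entire XML document.

In my experience, the character encoding used when reading follows the encoding declared in the XML header. If you run into garbled characters, make sure the encoding name is consistent everywhere.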
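The two points above can be tried out with a small sketch: it reads once from bytes (where the declared encoding matters) and once from a Reader (where the text is already decoded). It assumes dom4j is on the classpath; the class name ReadSourcesSketch and the sample documents are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

final class ReadSourcesSketch {

    // Reads from raw bytes; the parser takes the charset from the XML declaration.
    static String rootTextFromBytes(byte[] xmlBytes) {
        try {
            Document doc = new SAXReader().read(new ByteArrayInputStream(xmlBytes));
            return doc.getRootElement().getText();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Reads from characters that are already decoded, so no byte encoding is involved.
    static String rootTextFromString(String xml) {
        try {
            Document doc = new SAXReader().read(new StringReader(xml));
            return doc.getRootElement().getText();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        // Encode with the same charset the declaration names; a mismatch is what garbles text.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootTextFromBytes(bytes));
        System.out.println(rootTextFromString("<name>plain</name>"));
    }
}
```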

2. Getting the root node

The second step after reading is to get the root node. Anyone familiar with XML knows that all XML processing starts from the root element.

public Element getRootElement(Document doc) {
    return doc.getRootElement();
}

3. Traversing the XML tree

DOM4J provides at least three ways to traverse nodes:

1) Iterator

// Iterate over all child elements
for (Iterator i = root.elementIterator(); i.hasNext();) {
    Element element = (Element) i.next();
    // do something
}

// Iterate over child elements named "foo"
for (Iterator i = root.elementIterator("foo"); i.hasNext();) {
    Element foo = (Element) i.next();
    // do something
}

// Iterate over attributes
for (Iterator i = root.attributeIterator(); i.hasNext();) {
    Attribute attribute = (Attribute) i.next();
    // do something
}

2) Recursion

Recursion can also use an Iterator to enumerate the children, but the documentation offers another approach:

public void treeWalk(Document document) {
    treeWalk(document.getRootElement());
}

public void treeWalk(Element element) {
    for (int i = 0, size = element.nodeCount(); i < size; i++) {
        Node node = element.node(i);
        if (node instanceof Element) {
            // Recurse into child elements
            treeWalk((Element) node);
        } else {
            // do something with text, comments, and other non-element nodes
        }
    }
}
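As a runnable variant of the recursive walk, the sketch below fills in the "do something" branch by collecting the text nodes it meets. It assumes dom4j is on the classpath; TreeWalkSketch, collectText, and textOf are hypothetical names, not part of dom4j.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.Text;

final class TreeWalkSketch {

    // Depth-first walk by index: recurse into elements, record the content of text nodes.
    static void collectText(Element element, List<String> out) {
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                collectText((Element) node, out);
            } else if (node instanceof Text) {
                out.add(node.getText());
            }
        }
    }

    // Parses the given XML and returns its text nodes in document order.
    static List<String> textOf(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            List<String> out = new ArrayList<>();
            collectText(doc.getRootElement(), out);
            return out;
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf("<a><b>one</b><c><d>two</d></c></a>"));
    }
}
```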

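3) The Visitor pattern

The most exciting part is DOM4J's support for Visitor, which cuts the amount of code considerably and keeps it clear and readable. Anyone who knows design patterns knows that Visitor is one of the GoF patterns. Its main idea is that the two kinds of classes know about each other: a visitable object accepts a visitor and calls back the visit method that matches its own type, so a single Visitor can visit many Visitables. Let's look at the Visitor pattern in DOM4J (the quick-start document does not cover it).

You only need to define your own class that implements the Visitor interface.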

public class MyVisitor extends VisitorSupport {

    public void visit(Element element) {
        System.out.println(element.getName());
    }

    public void visit(Attribute attr) {
        System.out.println(attr.getName());
    }
}

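Invocation: root.accept(new MyVisitor());

The Visitor interface provides many overloads of visit(), so each kind of XML object is visited in its own way. The example above gives simple implementations for Element and Attribute, which are the two most commonly used. VisitorSupport is the default adapter that DOM4J provides for the Visitor interface (the Default Adapter pattern): it supplies empty implementations of every visit(*) method to simplify your code.

Note that this Visitor traverses all child nodes automatically. Calling root.accept(new MyVisitor()) visits the root and all of its descendants. The first time I used it, I assumed I had to traverse the tree myself and called the Visitor from inside a recursion; you can imagine the result.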
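The automatic traversal is easy to confirm with a counting visitor. This is a minimal sketch, assuming dom4j is on the classpath; CountingVisitor and its count helper are hypothetical names chosen for illustration.

```java
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.VisitorSupport;

final class CountingVisitor extends VisitorSupport {

    int elements;   // number of Element nodes visited
    int attributes; // number of Attribute nodes visited

    @Override
    public void visit(Element element) {
        elements++;
    }

    @Override
    public void visit(Attribute attribute) {
        attributes++;
    }

    // A single accept() call on the root is enough; dom4j walks the descendants itself.
    static CountingVisitor count(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            CountingVisitor visitor = new CountingVisitor();
            doc.getRootElement().accept(visitor);
            return visitor;
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        CountingVisitor v = count("<a x=\"1\"><b y=\"2\"/><c/></a>");
        System.out.println(v.elements + " elements, " + v.attributes + " attributes");
    }
}
```

Calling accept() again on each child from your own recursion would visit the nested nodes more than once, which is the mistake described above.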

4. XPath support

DOM4J has good XPath support; to reach a node, you can select it directly with an XPath expression.

public void bar(Document document) {
    List list = document.selectNodes("//foo/bar");
    Node node = document.selectSingleNode("//foo/bar/author");
    String name = node.valueOf("@name");
}

For example, if you want to find all the hyperlinks in an XHTML document, the following code does it:

public void findLinks(Document document) throws DocumentException {
    List list = document.selectNodes("//a/@href");
    for (Iterator iter = list.iterator(); iter.hasNext();) {
        Attribute attribute = (Attribute) iter.next();
        String url = attribute.getValue();
    }
}
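Here is a self-contained version of the link search that returns the URLs it finds. It is only a sketch: it assumes both dom4j and its XPath engine (the jaxen library) are on the classpath, and XPathSketch, hrefs, and the sample markup are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

final class XPathSketch {

    // Selects every href attribute of an <a> element, in document order.
    static List<String> hrefs(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            List<String> urls = new ArrayList<>();
            // Iterating as Object works whether selectNodes returns a raw or a typed List.
            for (Object o : doc.selectNodes("//a/@href")) {
                urls.add(((Attribute) o).getValue());
            }
            return urls;
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><a href=\"http://www.dom4j.org\">dom4j</a>"
                + "<p><a href=\"/local\">local</a></p></body></html>";
        System.out.println(hrefs(page));
    }
}
```

One caveat: in XPath 1.0 an unprefixed name such as a only matches elements in no namespace, so a document that declares the XHTML namespace would need a prefix mapping before //a finds anything.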

5. Converting between strings and XML

You will often need to convert a string to XML, or the reverse:

// XML to string
Document document = ...;
String text = document.asXML();

// String to XML
String text = "<person> <name>James</name> </person>";
Document document = DocumentHelper.parseText(text);
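The two conversions combine naturally into a round trip: parse a string, change the tree, and serialize it again. The sketch below assumes dom4j is on the classpath; RoundTripSketch, addAge, and the age element are made up for illustration.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

final class RoundTripSketch {

    // Parses the XML, appends an <age> child to the root, and returns the root as a string.
    static String addAge(String xml, String age) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            doc.getRootElement().addElement("age").addText(age);
            // asXML() on the element omits the XML declaration that document.asXML() adds.
            return doc.getRootElement().asXML();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addAge("<person><name>James</name></person>", "42"));
    }
}
```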

6. Transforming XML with XSLT

public Document styleDocument(Document document, String stylesheet) throws Exception {
    // load the transformer using JAXP
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));

    // now let's style the given document
    DocumentSource source = new DocumentSource(document);
    DocumentResult result = new DocumentResult();
    transformer.transform(source, result);

    // return the transformed document
    Document transformedDoc = result.getDocument();
    return transformedDoc;
}
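To show what the JAXP half of this does on its own, here is a sketch that uses only the JDK: it applies an inline stylesheet to an XML string with stream sources and results. With dom4j, DocumentSource and DocumentResult take the place of the input StreamSource and the StreamResult. The class name XsltSketch, the stylesheet, and the sample input are all hypothetical. Note also that StreamSource(String) treats its argument as a system ID (a URL), not as stylesheet text, which is why the sketch wraps the text in a StringReader.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSketch {

    // A tiny stylesheet that turns <person><name>X</name></person> into <greeting>Hello, X</greeting>.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"/\">"
            + "<greeting>Hello, <xsl:value-of select=\"person/name\"/></greeting>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a string.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            // Leave out the <?xml ...?> header so only the transformed element is returned.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<person><name>James</name></person>", STYLESHEET));
    }
}
```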

7. Creating XML

Creating XML is usually what you do just before writing a file, and it is as easy as using a StringBuffer.

public Document createDocument() {
    Document document = DocumentHelper.createDocument();
    Element root = document.addElement("root");

    Element author1 = root.addElement("author")
            .addAttribute("name", "James")
            .addAttribute("location", "UK")
            .addText("James Strachan");

    Element author2 = root.addElement("author")
            .addAttribute("name", "Bob")
            .addAttribute("location", "US")
            .addText("Bob McWhirter");

    return document;
}

8. Writing to a file

A simple way to produce output is to write a Document, or any Node, with its write method:

FileWriter out = new FileWriter("foo.xml");
document.write(out);
// Close the writer so the buffered content actually reaches the file
out.close();

If you want to change the output format, for example to pretty-print or compact it, use the XMLWriter class:

public void write(Document document) throws IOException {
    // Write to a file
    XMLWriter writer = new XMLWriter(new FileWriter("output.xml"));
    writer.write(document);
    writer.close();

    // Pretty-print to System.out
    OutputFormat format = OutputFormat.createPrettyPrint();
    writer = new XMLWriter(System.out, format);
    writer.write(document);

    // Compact format to System.out
    format = OutputFormat.createCompactFormat();
    writer = new XMLWriter(System.out, format);
    writer.write(document);
}
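One detail worth a sketch is the output encoding. A FileWriter always encodes with the JVM's default charset, so if you set an encoding on the OutputFormat, it is safer to hand XMLWriter an OutputStream and let it do the encoding itself. The example below writes to memory so that it is self-contained; it assumes dom4j is on the classpath, and WriteEncodingSketch, writeLatin1, and demo are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

final class WriteEncodingSketch {

    // Serializes the document as ISO-8859-1 bytes, with a matching XML declaration.
    static byte[] writeLatin1(Document document) {
        try {
            OutputFormat format = OutputFormat.createPrettyPrint();
            format.setEncoding("ISO-8859-1");
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Given an OutputStream, XMLWriter encodes with the format's encoding.
            XMLWriter writer = new XMLWriter(bytes, format);
            writer.write(document);
            writer.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-element document and decodes the written bytes for inspection.
    static String demo() {
        Document doc = DocumentHelper.createDocument();
        doc.addElement("name").addText("caf\u00e9");
        return new String(writeLatin1(doc), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

For a real file, a FileOutputStream would stand in for the ByteArrayOutputStream.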


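So, DOM4J is simple enough, isn't it? Of course, there are more advanced uses that I have not covered, such as ElementHandler. If you are tempted, come and use DOM4J.

DOM4J official website: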

          http://www.dom4j.org

DOM4J download (SourceForge); the latest version at the time of writing was 1.4

          http://sourceforge.net/projects/dom4j

References:

DOM4J documentation

XML in Java: Document models, Part 1: Performance

http://www-900.ibm.com/developerWorks/cn/xml/x-injava/index.shtml

XML in Java: Java document model usage

http://www-900.ibm.com/developerWorks/cn/xml/x-injava2/index.shtml

Java XML API 漫談 (Notes on Java XML APIs), by robbin

http://www.hibernate.org.cn:8000/137.html

Appendix:

package org.test;

import java.io.File;
import java.io.FileOutputStream;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class TestDom4J {

    private Document document = null;

    private String filePath = null;

    public TestDom4J() {
    }

    public TestDom4J(String filePath) {
        this.filePath = filePath;
    }

    private Document parse() throws DocumentException {
        SAXReader reader = new SAXReader();
        return reader.read(new File(filePath));
    }

    public void backup() throws Exception {
        document = parse();
        Element rootElement = document.getRootElement();
        rootElement.addElement("friend_list").addText("nihao");
        saveToFile();
    }

    private void saveToFile() {
        try {
            // Pretty-print the output
            OutputFormat format = OutputFormat.createPrettyPrint();
            // Set the character encoding
            format.setEncoding("GB2312");
            // Pass an OutputStream so XMLWriter encodes the bytes as GB2312;
            // a FileWriter would use the JVM's default charset instead.
            XMLWriter writer = new XMLWriter(new FileOutputStream(filePath), format);
            writer.write(document);
            writer.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        TestDom4J test = new TestDom4J("first.xml");
        // Create first.xml first, so that backup() has a file to parse
        test.createXMLFile("first.xml");
        test.backup();
        test.ModiXMLFile("first.xml", "firstNew.xml");
    }

    /**
     * Creates an XML document; the file name is given by the argument.
     *
     * @param filename
     *            name of the file to create
     *
     * @return the result of the operation: 0 for failure, 1 for success
     */
    public int createXMLFile(String filename) {
        // Result of the operation: 0 for failure, 1 for success
        int returnValue = 0;
        // Create the document object
        Document document = DocumentHelper.createDocument();
        // Create the root element, books
        Element booksElement = document.addElement("books");
        // Add a comment (GB2312 covers simplified Chinese, so the text is kept in that form)
        booksElement.addComment("这是dom4j的一个测试, 2004.9.11");
        // Add the first book element
        Element bookElement = booksElement.addElement("book");
        // Add the show attribute
        bookElement.addAttribute("show", "yes");
        // Add the title element
        Element titleElement = bookElement.addElement("title");
        // Set the content of title
        titleElement.setText("Dom4j Tutorials");
        // Build the other two books the same way
        bookElement = booksElement.addElement("book");
        bookElement.addAttribute("show", "yes");
        titleElement = bookElement.addElement("title");
        titleElement.setText("Lucene Studing");
        bookElement = booksElement.addElement("book");
        bookElement.addAttribute("show", "no");
        titleElement = bookElement.addElement("title");
        titleElement.setText("Lucene in Action");
        // Add the owner element
        Element ownerElement = booksElement.addElement("owner");
        ownerElement.setText("张华平");
        try {
            // Pretty-print the output
            OutputFormat format = OutputFormat.createPrettyPrint();
            // Set the character encoding
            format.setEncoding("GB2312");
            // Write the document to the file; the OutputStream lets XMLWriter apply GB2312
            XMLWriter writer = new XMLWriter(
                    new FileOutputStream(new File(filename)), format);
            writer.write(document);
            writer.close();
            // Success: return 1
            returnValue = 1;
        } catch (Exception ex) {
            ex.printStackTrace();
            returnValue = 0;
        }
        return returnValue;
    }

    /**
     * Modifies the content of an XML file and saves it as a new file.
     *
     * The points to focus on are how to add, modify, and remove nodes in dom4j.
     *
     * @param filename
     *            the file to modify
     *
     * @param newfilename
     *            the file to save the modified document as
     *
     * @return the result of the operation: 0 for failure, 1 for success
     */
    public int ModiXMLFile(String filename, String newfilename) {
        int returnValue = 0;
        try {
            SAXReader saxReader = new SAXReader();
            Document document = saxReader.read(new File(filename));

            // Change 1: if a book element's show attribute is "yes", change it to "no".
            // Find the attributes with XPath first.
            List list = document.selectNodes("/books/book/@show");
            Iterator iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("yes")) {
                    attribute.setValue("no");
                }
            }

            // Change 2: set the content of owner to Tshinghua, and add a date element
            // under owner with the content 2004-09-11 and an attribute named type.
            list = document.selectNodes("/books/owner");
            iter = list.iterator();
            if (iter.hasNext()) {
                Element ownerElement = (Element) iter.next();
                ownerElement.setText("Tshinghua");
                Element dateElement = ownerElement.addElement("date");
                dateElement.setText("2004-09-11");
                dateElement.addAttribute("type", "Gregorian calendar");
            }

            // Change 3: if a title's content is "Dom4j Tutorials", remove that title element.
            list = document.selectNodes("/books/book");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element bookElement = (Element) iter.next();
                Iterator iterator = bookElement.elementIterator("title");
                while (iterator.hasNext()) {
                    Element titleElement = (Element) iterator.next();
                    if (titleElement.getText().equals("Dom4j Tutorials")) {
                        bookElement.remove(titleElement);
                    }
                }
            }

            try {
                // Pretty-print the output
                OutputFormat format = OutputFormat.createPrettyPrint();
                // Set the character encoding
                format.setEncoding("GB2312");
                // Write the document to the new file; the OutputStream lets XMLWriter apply GB2312
                XMLWriter writer = new XMLWriter(
                        new FileOutputStream(new File(newfilename)), format);
                writer.write(document);
                writer.close();
                // Success: return 1
                returnValue = 1;
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return returnValue;
    }
}


posted on 2006-10-12 10:23 by 捕風. Category: XML applications
