
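飛云小俠之風兒吹過

Birdsong and fragrant flowers fill the valley, and the stream murmurs along

Analyzing the use of XML's CDATA type in RSS

Unless otherwise noted, the copyright of articles on this site belongs to the JScud Develop team or the original authors.
Please credit the author and the source when reposting. scud(飛云小俠) welcomes you to JScud Develop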

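According to the XML specification for the CDATA type, "&" and "<" need not, and cannot, be escaped. When ">" appears in the string "]]>" in content rather than marking the end of a CDATA section, it must be escaped as "&gt;".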

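This raises a problem. If I need to enter "]]>", the correct handling is to store it as "]]&gt;". But if I actually want to enter the literal text "]]&gt;", how should that be stored? I thought about it for a long time, and short of adding a space or resorting to some special trick, I see no way to solve it.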
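One "special trick" that is commonly used here is to split the text across two adjacent CDATA sections, so that the three characters "]]>" never sit together inside one section. The sketch below assumes you are building the XML string yourself; `CdataWriter` and `wrapInCdata` are hypothetical names, not part of JDom or Dom4j. Note that XML 1.0 does not recognize entity references inside a CDATA section, so a literal "]]&gt;" passes through this helper unchanged and reads back as the same seven characters.

```java
// Hypothetical helper for illustration; not part of JDom, Dom4j, or the JDK.
final class CdataWriter {

    private CdataWriter() {
    }

    /**
     * Wraps text in CDATA markup. Every "]]>" in the text is split so that
     * "]]" ends one section and ">" starts the next one.
     */
    static String wrapInCdata(String text) {
        // "]]>" becomes "]]" + "]]>" (section end) + "<![CDATA[" (section start) + ">"
        String safe = text.replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }

    public static void main(String[] args) {
        // Text containing the closing delimiter is split across two sections.
        System.out.println(wrapInCdata("hello ]]> world"));
        // Text containing the entity spelling needs no change at all.
        System.out.println(wrapInCdata("hello ]]&gt; world"));
    }
}
```

A parser that concatenates adjacent CDATA sections, as DOM-style text accessors usually do, returns the original text for both inputs, which distinguishes "]]>" from "]]&gt;" without adding spaces.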

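1. Setting aside the question of entering the literal "]]&gt;", let us consider how "]]>" itself is handled, and see what various XML parsers do with it.

The parser test has two parts: setting CDATA data and reading CDATA data.

First we write a test example. The plan is to test with JDom 1.0 and Dom4j: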

package com.jscud.test;

public class XmlTestBase
{
     public static String xmlpart =
      "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"+
         "<xml>" +
         "<test>"+
         "<hello><![CDATA[ hello ]]&gt; ]]></hello>" +
         "</test>" +
         "</xml>";

     public static void print(String str)
     {
         System.out.println(str);
     }
}

The JDom test example is as follows:

package com.jscud.test;

import java.io.*;
import org.jdom.*;
import org.jdom.input.SAXBuilder;
import org.jdom.output.*;

//@author scud http://www.jscud.com

public class JDomXmlFileTest extends XmlTestBase
{

     public static void main(String[] args) throws Exception
     {
         readDocument();
         print("===========================");
         createDocument();
     }

     public static void readDocument() throws Exception
     {
         Reader reader = new StringReader(xmlpart);
         SAXBuilder builder = new SAXBuilder();

         Document doc = builder.build(reader);

         Element aRoot = doc.getRootElement();

         Element anode = aRoot.getChild("test").getChild("hello");

         print(anode.getText());
     }

     public static void createDocument() throws Exception
     {
         Document doc = new Document();

         doc.setRootElement(new Element("root"));

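         // Accepted: the text uses the entity spelling, not a literal "]]>"
         CDATA node = new CDATA("hello alt=]]&gt;");

         // Throws an exception: JDom rejects CDATA text that contains "]]>"
         //node.setText("hello]]>");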

         Element ele = new Element("hello");

         ele.setContent(node);

         Element root = doc.getRootElement();

         root.getChildren().add(ele);

         Format aFormat = Format.getCompactFormat();
         aFormat.setEncoding("GB2312");
         // Hand the compact format and encoding to the outputter
         XMLOutputter outputter = new XMLOutputter(aFormat);

         String sResult = outputter.outputString(doc.getRootElement().getChildren());

        print(sResult);

     }
}

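Compiling and running the code above shows that JDom cannot set a CDATA value that contains "]]>"; it throws an exception. Reading the CDATA back from the XML string also does not translate the string "]]&gt;" into "]]>".

Next, let us test Dom4J: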

package com.jscud.test;

import java.io.StringReader;

import org.dom4j.*;
import org.dom4j.io.SAXReader;
import org.dom4j.tree.DefaultCDATA;

/**
* Tests the XML CDATA data type.
*
* @author scud http://www.jscud.com
*
*/

public class Dom4jXmlTest extends XmlTestBase
{

     public static void main(String[] args) throws Exception
     {
         readDocument();
         print("===========================");
         createDocument();
     }

     public static void createDocument()
     {
         Document document = DocumentHelper.createDocument();
         Element root = document.addElement( "root" );

         DefaultCDATA cdata = new DefaultCDATA("sample]]&gt;");
         DefaultCDATA cdata2 = new DefaultCDATA("sample]]>");

         Element anode = root.addElement("cdata");
         anode.add(cdata);

         print(anode.getText());
         print(anode.asXML());

         Element anode2 = root.addElement("cdata2");
         anode2.add(cdata2);

         print(anode2.getText());
         print(anode2.asXML());
     }

     public static void readDocument() throws Exception
     {
         StringReader strreader = new StringReader(xmlpart);

         SAXReader reader = new SAXReader();
         Document document = reader.read(strreader);

         Node node = document.selectSingleNode( "//test/hello" );

         print(node.getText());

         print(node.getStringValue());
     }

}

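We can see that Dom4j does no processing either: it performs no conversion on input and writes the text out exactly as given, which inevitably produces malformed XML. It performs no conversion when reading either.

From the tests above we can conclude that many XML parsers do not parse CDATA data correctly (and JDom and Dom4j both have plenty of users), so do not trust these parsers too much.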
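As a cross-check that needs neither JDom nor Dom4j, the sketch below reads the same kind of document with the JDK's built-in DOM parser. The class and method names are hypothetical. It compares two inputs: one stores "]]&gt;" inside the CDATA section, the other splits "]]>" across two adjacent CDATA sections. It may be worth keeping in mind that XML 1.0 (section 2.7) recognizes only the closing delimiter as markup inside a CDATA section, so a parser that leaves "&gt;" untouched there may simply be following the specification.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical cross-check using only JDK classes.
final class JdkCdataReadCheck {

    private JdkCdataReadCheck() {
    }

    /** Parses the XML string and returns the text of the first hello element. */
    static String readHello(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getTextContent concatenates all text and CDATA children
            return doc.getElementsByTagName("hello").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse test document", e);
        }
    }

    public static void main(String[] args) {
        // Entity spelling inside one CDATA section
        String withEntity =
            "<xml><test><hello><![CDATA[ hello ]]&gt; ]]></hello></test></xml>";
        // The same intent written as two adjacent CDATA sections
        String withSplit =
            "<xml><test><hello><![CDATA[ hello ]]]]><![CDATA[> ]]></hello></test></xml>";

        // Brackets make the leading and trailing spaces visible
        System.out.println("[" + readHello(withEntity) + "]");
        System.out.println("[" + readHello(withSplit) + "]");
    }
}
```

With the first input the element text still contains the characters "&gt;"; with the second it contains a real "]]>".</xml>").equals(" hello ]]&gt; ");
assert JdkCdataReadCheck.readHello("<xml><test><hello><![CDATA[ hello ]]]]><![CDATA[> ]]></hello></test></xml>").equals(" hello ]]> ");
</test>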

2. Now let us look at RSS readers, for example FeedDemon and POTU. We prepared a description field of CDATA type to test them.

Content:

<?xml version="1.0" encoding="GB2312" ?>
<rss version="2.0">
<channel>
<title>Some Where</title>
<link>http://www.jscud.com/</link>
<description />
<item>
<title>Test</title>
<link>http://www.jscud.com</link>
<author>scud</author>
<pubDate>Mon, 22 Aug 2005 10:22:22 GMT</pubDate>
<description><![CDATA[
<hr>
]]&gt;
]]></description>
</item>
</channel>
</rss>

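Results:
1. POTU did no processing at all.
2. FeedDemon did process it, but it also translated every other "&gt;", "&lt;" and so on, which is even more wrong.

I had originally planned to use a CDATA description field in RSS. After several rounds of experiments and tests, I finally decided to go with a plain description field and stop using CDATA.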
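For a plain (non-CDATA) description field, the markup characters in the content have to be escaped as entities before they are written into the feed. A minimal sketch of such an escaping step follows; `RssText` and `escapeForXml` are hypothetical names, and the sketch covers only the three characters discussed in this article, not quotes in attribute values.

```java
// Hypothetical helper for illustration; covers element content only.
final class RssText {

    private RssText() {
    }

    /** Escapes "&", "<" and ">" so the text can be placed in element content. */
    static String escapeForXml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same body used in the RSS test above, written as a plain field
        String body = "<hr>\n]]>\n";
        System.out.println("<description>" + escapeForXml(body) + "</description>");
    }
}
```

Because every ">" is escaped, the sequence "]]>" can never appear in the output, and a literal "]]&gt;" in the input becomes "]]&amp;gt;", so the two cases stay distinguishable.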

CDATA: more trouble than it is worth? Heh.

Posted on 2005-08-22 18:49 by Scud(飛云小俠). Category: Java
