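Abstract:
The pluggability of JAXP (originally the Java API for XML Parsing) caused quite a stir in the developer community, and it is the essence of JAXP. Developers can write their own XML processor and, as long as it conforms to the JAXP APIs, swap one underlying XML processor for another without changing application code.
Copyright notice: this article may be freely reproduced, provided the reproduction indicates the original source, the author information, and this notice with a hyperlink.
Author:
Rahul Srivastava;SJTUer
Original article: http://www.xml.com/pub/a/2005/07/06/jaxp.html
Chinese version: http://www.matrix.org.cn/resource/article/43/43893_JAXP.html
Keywords: JAXP XML
Introduction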
XML became popular soon after the W3C XML 1.0 Recommendation was published in 1998. Sun was formalizing the Java Community Process (JCP) around the same time, and the first version of JAXP (JSR-05) was released in early 2000. That release was backed by many industry groups, including (in alphabetical order) BEA Systems, Fujitsu Limited, Hewlett-Packard, IBM, Netscape Communications, Oracle, and Sun Microsystems, Inc.
The pluggability of JAXP caused quite a stir in the developer community, and it is the essence of JAXP. Developers can write their own XML processor and, as long as it conforms to the JAXP APIs, swap one underlying XML processor for another without changing application code.
So what exactly is JAXP? To begin with, the P is a little confusing: does it stand for Parsing or for Processing?
JAXP 1.0 supported only parsing, so at that point the name stood for Java API for XML Parsing.
With JAXP 1.1, XSLT became the recommended way to transform XML. Unfortunately, the W3C XSLT specification of the time did not define any API for performing transformations, so the JAXP 1.1 expert group introduced a set of APIs called the Transformation API for XML (TrAX).
From then on JAXP has stood for Java API for XML Processing. JAXP has kept evolving and now supports much more than parsing XML files: for example, validating a document against a schema while parsing it, validating a document against a preparsed schema, and evaluating XPath expressions.
Because the pluggable processor that actually handles the XML can be written by anyone, as long as it conforms to the JAXP specification, JAXP itself is a lightweight API for processing XML documents. (Translator's note: JAXP is only an API specification; the underlying implementation is interchangeable, as described in detail later.)
Parsing XML with JAXP
JAXP supports both object-based and event-based parsing. For object-based parsing, only W3C DOM is supported so far; the JAXP expert group may add support for the JDOM specification in a future version. For event-based parsing, only SAX is supported. Another event-based model, pull parsing, should arguably have been part of JAXP, but it is covered by a separate JSR (#173), known as the Streaming API for XML (StAX), so nothing more can be done about that for now.

Figure 1: Various mechanisms of parsing XML
Parsing XML with SAX
The SAX APIs were proposed by David Megginson in early 1998 with the goal of becoming the standard API for event-driven XML parsing (the history of SAX is documented elsewhere). SAX is still not a W3C Recommendation, but it is undoubtedly the de facto industry standard for parsing XML.
SAX is an event-based, push-parsing model. While parsing a document, SAX generates an event whenever it encounters an <opening> tag, a </closing> tag, character data, and so on. A SAX parser treats the XML document as a stream and reports the events, in order, to the registered content handler, org.xml.sax.ContentHandler; errors are reported to the error handler, org.xml.sax.ErrorHandler.
If you do not register an error handler, you will never know whether any errors occurred while parsing, or what they were. Registering an error handler is therefore extremely important when parsing with SAX.
If the application needs to know which events occur (and to process them), you must implement the org.xml.sax.ContentHandler interface and register it with the SAX parser. A typical sequence of events is startDocument, startElement, characters, endElement, endDocument.
startDocument is fired exactly once, before any other event. Likewise, endDocument is fired exactly once, after the whole document has been parsed successfully. The SAX javadocs give more detail.

Figure 2: SAX Parsing XML
A code snippet that parses an XML document with SAX through JAXP:
    import java.io.File;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.helpers.DefaultHandler;

    SAXParserFactory spfactory = SAXParserFactory.newInstance();
    spfactory.setNamespaceAware(true);
    SAXParser saxparser = spfactory.newSAXParser();

    // write your handler for processing events and handling errors
    DefaultHandler handler = new MyHandler();

    // parse the XML and report events and errors (if any) to the handler
    saxparser.parse(new File("data.xml"), handler);
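
The snippet above leaves MyHandler undefined. As a rough illustration only, here is one way such a handler might look: it extends DefaultHandler, records each content event as a string, and also overrides the ErrorHandler callbacks so that problems are not silently lost. The class names SaxEventsDemo and RecordingHandler and the in-memory sample document are assumptions made for this sketch; they do not come from the article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEventsDemo {

    // Hypothetical handler: stores every SAX event it receives as a string.
    static final class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override public void startDocument() { events.add("startDocument"); }
        @Override public void endDocument() { events.add("endDocument"); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("startElement:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("endElement:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only chunks to keep the event list readable.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) events.add("characters:" + text);
        }

        // ErrorHandler callbacks: record recoverable problems, rethrow fatal ones.
        @Override public void warning(SAXParseException e) { events.add("warning:" + e.getMessage()); }
        @Override public void error(SAXParseException e) { events.add("error:" + e.getMessage()); }
        @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
    }

    // Parses an in-memory document and returns the recorded event sequence.
    static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        parse("<a><b>hi</b></a>").forEach(System.out::println);
    }
}
```

Because DefaultHandler implements both ContentHandler and ErrorHandler, passing a single subclass to parse() registers both roles at once.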
Parsing with the Document Object Model (DOM)
DOM parsing is object-based: parsing an XML document with DOM builds a tree structure in memory that represents the document, and each node of the tree represents a node of the XML document. If the DOM parser conforms to the W3C standard, the DOM it produces is a W3C DOM, which can be traversed and modified with the org.w3c.dom APIs.
Most DOM parsers also let you build a DOM tree from just part of an XML document instead of building the tree for the whole document in memory.

Figure 3: DOM Parsing XML
A code snippet that parses an XML document with DOM through JAXP:
    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setNamespaceAware(true);
    DocumentBuilder domparser = dbfactory.newDocumentBuilder();

    // parse the XML and create the DOM
    Document doc = domparser.parse(new File("data.xml"));

    // to create a new DOM from scratch:
    // Document doc = domparser.newDocument();

    // once you have the Document handle, you can use
    // the org.w3c.dom.* APIs to traverse or modify the DOM

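
The last comment in the DOM snippet mentions traversing and modifying the tree with the org.w3c.dom APIs. The following is a minimal sketch of what that could look like, assuming a small in-memory document; the class name DomWalkDemo and its helper methods are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomWalkDemo {

    // Builds a W3C DOM from an in-memory XML string.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Depth-first walk that collects element names in document order.
    static void collect(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, names);
        }
    }

    static List<String> elementNames(Document doc) {
        List<String> names = new ArrayList<>();
        collect(doc.getDocumentElement(), names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<employees><employee><name>e1</name></employee></employees>");
        System.out.println(elementNames(doc));

        // Modify the tree: append a second employee element to the root.
        Element extra = doc.createElement("employee");
        extra.setTextContent("e2");
        doc.getDocumentElement().appendChild(extra);
        System.out.println(elementNames(doc));
    }
}
```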


Parsing in validating mode
Validating against a DTD
A DTD is a grammar for XML documents. People often find DTDs a little odd because their syntax differs from XML syntax, but DTDs are an integral part of W3C XML 1.0. If an XML document has a DOCTYPE declaration and you want it validated against the DTD during parsing, you must enable the validation feature on the appropriate factory. For example:
    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setValidating(true);

    // OR

    SAXParserFactory spfactory = SAXParserFactory.newInstance();
    spfactory.setValidating(true);

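Note that if an XML document declares a DTD, the parser always tries to read that DTD even when validation is not enabled. This ensures that entity references in the document are expanded correctly; otherwise the document could turn out not to be well-formed. An external DTD is ignored completely only when the standalone attribute in the XML declaration is set to "yes". For example:
    <?xml version="1.1" encoding="UTF-8" standalone="yes"?>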
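
To make the effect of setValidating(true) concrete, here is a small sketch that validates in-memory documents against an internal DTD subset and collects the reported problems through an ErrorHandler. The note/to vocabulary, the class name DtdValidationDemo, and the use of an internal subset (chosen so the example needs no external files) are all assumptions of this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class DtdValidationDemo {

    // Hypothetical grammar: a <note> must contain exactly one <to> element.
    static final String DTD =
          "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "]>\n";

    // Parses DTD + body in validating mode and returns the validation messages.
    static List<String> validate(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(DTD + body)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<note><to>x</to></note>"));
        System.out.println(validate("<note><from>x</from></note>"));
    }
}
```

Validity violations arrive through error() rather than fatalError(), so the parse still completes and the caller decides what to do with the collected messages.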

Validating against a W3C XML Schema (WXS)
XML Schema is another grammar for describing XML documents. It is very popular because it uses the same syntax as XML itself and offers rich features for defining validation constraints. If an XML document points to a schema through "schemaLocation" or "noNamespaceSchemaLocation" and you want it validated against that schema, take the following steps:
1. As described above, enable validation by calling setValidating on the SAXParserFactory or DocumentBuilderFactory.
2. Set the property "http://java.sun.com/xml/jaxp/properties/schemaLanguage" to "http://www.w3.org/2001/XMLSchema".
Note that in this case the processor does not validate the document against the DTD even if the document has a DOCTYPE declaration. As mentioned earlier, though, the DTD is still loaded so that every entity reference is expanded correctly.
Because "schemaLocation" and "noNamespaceSchemaLocation" are only hints, you can override them by supplying schemas externally through the property "http://java.sun.com/xml/jaxp/properties/schemaSource".
Acceptable values for this property are:
· A string holding the URI of the schema.
· A java.io.InputStream with the contents of the schema.
· An org.xml.sax.InputSource.
· A java.io.File.
· An array of java.lang.Object whose elements are each one of the types listed above.
For example:
    SAXParserFactory spfactory = SAXParserFactory.newInstance();
    spfactory.setNamespaceAware(true);

    // turn the validation on
    spfactory.setValidating(true);

    // create the parser; the properties below are set on the parser
    SAXParser saxparser = spfactory.newSAXParser();

    // set the validation to be against WXS
    saxparser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                          "http://www.w3.org/2001/XMLSchema");

    // set the schema against which the validation is to be done
    saxparser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
                          new File("myschema.xsd"));
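
As a self-contained variation on the listing above, the sketch below supplies the schema as a java.io.InputStream (one of the accepted schemaSource values) and collects validation errors in a handler. The one-element schema, the class name SchemaPropertyDemo, and the assumption that the parser in use honors these two JAXP properties (the JDK's bundled parser is expected to) are all specific to this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class SchemaPropertyDemo {

    static final String SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE = "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Hypothetical schema: the root element <age> must hold an xs:int.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='age' type='xs:int'/>"
        + "</xs:schema>";

    // Validates the document against XSD and returns the error messages.
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();

        // The schema language is set first, then the schema source.
        parser.setProperty(SCHEMA_LANGUAGE, "http://www.w3.org/2001/XMLSchema");
        parser.setProperty(SCHEMA_SOURCE,
                new ByteArrayInputStream(XSD.getBytes(StandardCharsets.UTF_8)));

        List<String> errors = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<age>42</age>"));
        System.out.println(validate("<age>old</age>"));
    }
}
```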

Transforming XML with JAXP's TrAX APIs
W3C XSLT defines transformation rules that turn a source tree into a result tree. In XSLT, the file holding the transformation instructions is called a stylesheet. To transform an XML document with JAXP, you create a transformer that uses the stylesheet. Once created, the transformer takes the XML to be transformed as a JAXP source and returns the transformed output as a JAXP result. JAXP currently provides three kinds of source and result:
StreamSource, SAXSource, DOMSource and StreamResult, SAXResult, DOMResult, which can be used in any combination.

Figure 4: XML Transformation
Generating SAX events from a DOM:
    // parse the XML file to a W3C DOM
    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setNamespaceAware(true);

    DocumentBuilder domparser = dbfactory.newDocumentBuilder();
    Document doc = domparser.parse(new File("data.xml"));

    // prepare the DOM source
    Source xmlsource = new DOMSource(doc);

    // create a content handler to handle the SAX events
    ContentHandler handler = new MyHandler();

    // prepare a SAX result using the content handler
    Result result = new SAXResult(handler);

    // create a transformer factory
    TransformerFactory xfactory = TransformerFactory.newInstance();

    // create a transformer
    Transformer xformer = xfactory.newTransformer();

    // transform to raise the SAX events from the DOM
    xformer.transform(xmlsource, result);
In the example above, no XSL was used when creating the Transformer. This means the transformer does not transform the XML at all: the source and the result are the same. When you actually want to transform using XSL, create a transformer that uses the stylesheet, like this:
    // create the XSL source
    Source xslsource = new StreamSource(new File("mystyle.xsl"));

    // create the transformer using the XSL source
    Transformer xformer = xfactory.newTransformer(xslsource);
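
Putting the pieces together, the following sketch runs a complete XSLT transformation with a StreamSource for both the stylesheet and the input and a StreamResult for the output. The inline stylesheet (which simply lists employee names as text), the class name XsltDemo, and the sample input are assumptions made for this example, not material from the article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltDemo {

    // Hypothetical stylesheet: emits each employee name followed by ';' as plain text.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='employees/employee'>"
        + "<xsl:value-of select='name'/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies XSL to the given XML string and returns the transformed text.
    static String transform(String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSL)));

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees>"
                + "<employee><name>e1</name></employee>"
                + "<employee><name>e2</name></employee>"
                + "</employees>";
        System.out.println(transform(xml)); // prints "e1;e2;"
    }
}
```

Swapping the StreamSource for a DOMSource, or the StreamResult for a SAXResult or DOMResult, requires no other change to the transform call.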
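What's new in JAXP 1.3?
Besides supporting SAX parsing, DOM parsing, validation against DTD/XML Schema, and transformation with XSLT, as earlier versions did, JAXP 1.3 adds support for:
1. XML 1.1 and Namespaces in XML 1.1
2. XML Inclusions - XInclude 1.0
3. Validating documents against a preparsed schema.
4. Evaluating XPath expressions.
5. Java mappings for those datatypes of XML Schema 1.0, XPath 2.0, and XQuery 1.0 that previously could not be mapped to Java datatypes.
Using JAXP 1.3
The main features of XML 1.1 are:
1. Forward compatibility with the ever-growing Unicode character set.
2. The addition of NEL (#x85) and the Unicode line separator (#x2028) to the set of line-end characters.
The changes in XML 1.1 are not backward compatible: some well-formedness rules of XML 1.0 may not apply in XML 1.1. That is why XML 1.1 is a new specification rather than an upgrade of the XML 1.0 specification.
To use XML 1.1 and Namespaces in XML 1.1, you must set the version attribute in the XML declaration to "1.1". For example:
    <?xml version="1.1" encoding="UTF-8" standalone="yes"?>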
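
A quick way to see the version switch in action is to parse the same content under both declarations. The sketch below reads back the declared version from the DOM and checks one well-formedness difference: a character reference to a C0 control character such as &#x1;, which the XML 1.1 specification permits and XML 1.0 forbids. That particular difference is chosen here as an illustration and is not taken from the article; the class name Xml11Demo is likewise hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class Xml11Demo {

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors and keeps the parser from logging to stderr.
        builder.setErrorHandler(new DefaultHandler());
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // True if the parser accepts the document, false if it reports a fatal error.
    static boolean isWellFormed(String xml) throws Exception {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<?xml version=\"1.1\"?><a/>").getXmlVersion());

        String body = "<a>&#x1;</a>"; // reference to a C0 control character
        System.out.println("as XML 1.1: " + isWellFormed("<?xml version=\"1.1\"?>" + body));
        System.out.println("as XML 1.0: " + isWellFormed("<?xml version=\"1.0\"?>" + body));
    }
}
```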

XInclude lets one XML document include another. For example:
    <myMainXMLDoc xmlns:xi="http://www.w3.org/2001/XInclude">
        <xi:include href="fragment.xml"/>
    </myMainXMLDoc>

To use XML inclusions, you must set the XInclude feature to true on the appropriate factory, as the following code shows:
    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setXIncludeAware(true);
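
For an end-to-end view, the sketch below writes a main document and a fragment into a temporary directory, parses the main document with XInclude processing switched on, and returns the merged DOM. The temporary files, the <item> fragment content, and the class name XIncludeDemo are assumptions of this example; namespace awareness is also enabled because XInclude elements are recognized by their namespace.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XIncludeDemo {

    // Creates main.xml and fragment.xml in a temp directory and parses main.xml.
    static Document parseWithInclude() throws Exception {
        Path dir = Files.createTempDirectory("xinclude-demo");
        Path fragment = dir.resolve("fragment.xml");
        Path main = dir.resolve("main.xml");
        try {
            Files.write(fragment, "<item>included</item>".getBytes(StandardCharsets.UTF_8));
            Files.write(main,
                    ("<myMainXMLDoc xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                   + "<xi:include href=\"fragment.xml\"/>"
                   + "</myMainXMLDoc>").getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setXIncludeAware(true);

            // Parsing from a File gives the parser a base URI for resolving href.
            return factory.newDocumentBuilder().parse(main.toFile());
        } finally {
            Files.deleteIfExists(fragment);
            Files.deleteIfExists(main);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithInclude();
        System.out.println(doc.getElementsByTagName("item").item(0).getTextContent());
    }
}
```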

Validating a JAXP source against a preparsed schema
The javax.xml.validation package provides the ability to parse a schema and to validate XML documents against the preparsed schema. A DOMSource or a SAXSource can be validated against a preparsed schema. The preparsed schema can be cached for optimization if needed. Note that in JAXP 1.3 a StreamSource is not supported as the input to be validated, and that the schema can be either a W3C XML Schema or an OASIS RELAX NG schema. For example:
    import java.io.File;
    import javax.xml.XMLConstants;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;
    import javax.xml.validation.Validator;
    import org.w3c.dom.Document;

    // parse an XML document in non-validating mode and create a DOMSource
    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setNamespaceAware(true);
    dbfactory.setXIncludeAware(true);

    DocumentBuilder parser = dbfactory.newDocumentBuilder();
    Document doc = parser.parse(new File("data.xml"));

    DOMSource xmlsource = new DOMSource(doc);

    // create a SchemaFactory for loading W3C XML Schemas
    SchemaFactory wxsfactory =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    // set the error handler for handling errors in the schema itself
    // (schemaErrorHandler is an org.xml.sax.ErrorHandler supplied by the application)
    wxsfactory.setErrorHandler(schemaErrorHandler);

    // load a W3C XML Schema
    Schema schema = wxsfactory.newSchema(new File("myschema.xsd"));

    // create a validator from the loaded schema
    Validator validator = schema.newValidator();

    // set the error handler for handling validation errors
    // (validationErrorHandler is likewise supplied by the application)
    validator.setErrorHandler(validationErrorHandler);

    // validate the XML instance
    validator.validate(xmlsource);
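
The remark about caching the preparsed schema can be illustrated with a short sketch: the Schema object is compiled once and then reused to validate several DOM sources, each with its own Validator. The inline one-element schema, the class name PreparsedSchemaDemo, and the choice to report validity as a boolean by catching SAXException (no ErrorHandler is set, so the validator throws on the first error) are assumptions of this example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class PreparsedSchemaDemo {

    // Hypothetical schema: the root element <age> must hold an xs:int.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='age' type='xs:int'/>"
        + "</xs:schema>";

    // Compiles the schema once; the returned Schema can be cached and shared.
    static Schema compile() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(XSD)));
    }

    // Parses the XML into a DOM and validates it against the preparsed schema.
    static boolean isValid(Schema schema, String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        Validator validator = schema.newValidator(); // one validator per validation run
        try {
            validator.validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false; // without an ErrorHandler, validation errors are thrown
        }
    }

    public static void main(String[] args) throws Exception {
        Schema schema = compile();
        System.out.println(isValid(schema, "<age>42</age>"));  // prints "true"
        System.out.println(isValid(schema, "<age>old</age>")); // prints "false"
    }
}
```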
Evaluating XPath expressions
The javax.xml.xpath package provides the ability to evaluate XPath expressions against an XML document. If an expression is going to be reused, it can be compiled for better performance.
Incidentally, the XPath APIs in JAXP are designed to be stateless, which means you must pass in the XML document every time you evaluate an XPath expression. Often many XPath expressions are evaluated against a single XML document. In that case it would have been better if the JAXP XPath APIs were stateful, so that the XML document had to be passed in only once. That would also give the underlying implementation an optimization option: it could store the XML source in a form that allows faster evaluation of XPath expressions.
An example of evaluating XPath expressions against an XML document:
    <?xml version="1.0"?>
    <employees>
        <employee>
            <name>e1</name>
        </employee>
        <employee>
            <name>e2</name>
        </employee>
    </employees>


    import java.io.File;
    import javax.xml.parsers.*;
    import javax.xml.xpath.*;
    import org.w3c.dom.*;

    // parse an XML document to get a DOM to query
    DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
    dbfactory.setNamespaceAware(true);
    dbfactory.setXIncludeAware(true);

    DocumentBuilder parser = dbfactory.newDocumentBuilder();
    Document doc = parser.parse(new File("data.xml"));

    // get an XPath processor
    XPathFactory xpfactory = XPathFactory.newInstance();
    XPath xpathprocessor = xpfactory.newXPath();

    // set the namespace context for resolving prefixes of the QNames
    // to namespace URIs, if the XPath expression uses QNames. An XPath
    // expression would use QNames if the XML document uses namespaces.
    //xpathprocessor.setNamespaceContext(NamespaceContext nsContext);

    // create XPath expressions
    String xpath1 = "/employees/employee";
    XPathExpression employeesXPath = xpathprocessor.compile(xpath1);

    String xpath2 = "/employees/employee[1]";
    XPathExpression employeeXPath = xpathprocessor.compile(xpath2);

    String xpath3 = "/employees/employee[1]/name";
    XPathExpression empnameXPath = xpathprocessor.compile(xpath3);

    // execute the XPath expressions
    System.out.println("XPath1=" + xpath1);
    NodeList employees = (NodeList) employeesXPath.evaluate(doc,
        XPathConstants.NODESET);

    for (int i = 0; i < employees.getLength(); i++) {
        System.out.println(employees.item(i).getTextContent());
    }

    System.out.println("XPath2=" + xpath2);
    Node employee = (Node) employeeXPath.evaluate(doc, XPathConstants.NODE);
    System.out.println(employee.getTextContent());

    System.out.println("XPath3=" + xpath3);
    String empname = empnameXPath.evaluate(doc);
    System.out.println(empname);
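
The listing above leaves the setNamespaceContext call commented out. When the document does use namespaces, a NamespaceContext is needed so that prefixes in the expression can be resolved. The following is a minimal sketch of that case; the namespace URI urn:example:employees, the prefix e, and the class name XPathNamespaceDemo are invented for this illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class XPathNamespaceDemo {

    static final String NS = "urn:example:employees"; // hypothetical namespace URI

    // Returns the name of the first employee in a namespaced document.
    static String firstEmployeeName(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Map the prefix "e" used in the expression to the document's namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "e".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return NS.equals(uri) ? "e" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                        ? Collections.singletonList("e").iterator()
                        : Collections.<String>emptyIterator();
            }
        });

        return xpath.evaluate("/e:employees/e:employee[1]/e:name", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees xmlns='" + NS + "'>"
                + "<employee><name>e1</name></employee>"
                + "<employee><name>e2</name></employee>"
                + "</employees>";
        System.out.println(firstEmployeeName(xml)); // prints "e1"
    }
}
```

The prefix in the expression does not have to match the prefix (or default namespace) used in the document; only the namespace URI has to agree.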
Mapping between XML and Java datatypes
The datatypes of XML Schema 1.0 became popular and are used by many other XML specifications, such as XPath, XQuery, and WSDL. Most of these datatypes map to Java primitive or wrapper types. The others, such as dateTime and duration, map to new Java types: javax.xml.datatype.XMLGregorianCalendar, javax.xml.datatype.Duration, and javax.xml.namespace.QName. As a result, every datatype in XML Schema 1.0, XPath 2.0, and XQuery 1.0 now has a corresponding Java type.
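
As a small illustration of these mappings, the sketch below uses javax.xml.datatype.DatatypeFactory to turn the lexical forms of an xs:dateTime and an xs:duration into their Java counterparts. The sample values and the class name DatatypeDemo are assumptions made for this example.

```java
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.Duration;
import javax.xml.datatype.XMLGregorianCalendar;
import javax.xml.namespace.QName;

final class DatatypeDemo {

    // Maps an xs:dateTime (or xs:date, xs:time, ...) lexical value to Java.
    static XMLGregorianCalendar parseDateTime(String lexical) throws Exception {
        return DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
    }

    // Maps an xs:duration lexical value to Java.
    static Duration parseDuration(String lexical) throws Exception {
        return DatatypeFactory.newInstance().newDuration(lexical);
    }

    public static void main(String[] args) throws Exception {
        XMLGregorianCalendar cal = parseDateTime("2005-07-06T10:30:00Z");
        QName type = cal.getXMLSchemaType(); // which schema type the value corresponds to
        System.out.println(type.getLocalPart() + " in year " + cal.getYear());

        Duration d = parseDuration("P1Y2M3D");
        System.out.println(d.getYears() + "y " + d.getMonths() + "m " + d.getDays() + "d");

        cal.add(d); // calendar arithmetic with a schema duration
        System.out.println(cal.toXMLFormat());
    }
}
```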
From a usability point of view, it would be very nice if DatatypeFactory had methods that produce a Java object for a given WXS datatype, and if that object had methods to constrain the datatype with facets and to validate a value against the datatype.
An example using Oracle's XDK:
    import oracle.xml.parser.schema.*;
    . . .

    // create a simpleType object
    XSDSimpleType st = XSDSimpleType.getPrimitiveType(XSDSimpleType.iSTRING);

    // set a constraining facet on the simpleType
    st.setFacet(XSDSimpleType.LENGTH, "5");

    // validate value
    st.validateValue("hello");
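Switching between underlying implementations
A JAXP implementation usually ships with a default parser, transformer, XPath engine, and schema validator. But, as mentioned at the beginning of this article, JAXP is a pluggable API: we can plug in our own processor to replace the default one. To switch, set the javax.xml.xxx.yyyFactory property to the name of a compliant factory implementation class. When yyyFactory.newInstance() is called, JAXP looks for the concrete implementation class to load in the following order:
1. Use the value of the javax.xml.xxx.yyyFactory system property.
2. Use the lib/jaxp.properties file in the JRE directory. The jaxp.properties file is read only once, and its values are cached for later use. If the file does not exist the first time JAXP tries to read it, JAXP makes no further attempt to check for its existence. Values in jaxp.properties cannot be changed after they have been read for the first time.
3. Use the Services API, if available (described in detail in the JAR specification), to determine the class to load. The Services API looks for the class name in the file META-INF/services/javax.xml.xxx.yyyFactory in the JARs available at runtime.
4. Use the platform default javax.xml.xxx.yyyFactory instance.
javax.xml.xxx.yyyFactory can be one of the following: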
javax.xml.parsers.SAXParserFactory
javax.xml.parsers.DocumentBuilderFactory
javax.xml.transform.TransformerFactory
javax.xml.xpath.XPathFactory
javax.xml.validation.SchemaFactory:schemaLanguage (where schemaLanguage is the argument passed to SchemaFactory's newInstance method)
For example, to use the Apache Xerces SAX parser through JAXP, set javax.xml.parsers.SAXParserFactory to org.apache.xerces.jaxp.SAXParserFactoryImpl using any of the four ways listed above.
One of those ways is:
    java -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl MyApplicationProgram
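
The same system property can also be set programmatically, which makes the first step of the lookup order easy to observe. The sketch below reports which implementation class SAXParserFactory.newInstance() selects for a given property value. It deliberately avoids depending on Xerces being on the classpath: it re-selects whatever class the platform picks by default and then tries a class name that does not exist. The class name FactoryLookupDemo and the bogus provider name no.such.FactoryImpl are hypothetical.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.SAXParserFactory;

final class FactoryLookupDemo {

    static final String KEY = "javax.xml.parsers.SAXParserFactory";

    // Sets the system property to the given class name (null clears it), asks JAXP
    // for a factory, and returns the implementation class name or an error marker.
    // The previous property value is restored afterwards.
    static String resolve(String factoryClassName) {
        String previous = System.getProperty(KEY);
        try {
            if (factoryClassName == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, factoryClassName);
            }
            return SAXParserFactory.newInstance().getClass().getName();
        } catch (FactoryConfigurationError e) {
            return "error: " + e.getMessage();
        } finally {
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
        }
    }

    public static void main(String[] args) {
        String platformDefault = resolve(null);
        System.out.println("default:  " + platformDefault);
        System.out.println("explicit: " + resolve(platformDefault));
        System.out.println("missing:  " + resolve("no.such.FactoryImpl"));
    }
}
```

A property value that names a missing class is not silently skipped in favor of the later lookup steps; the factory lookup fails, which is why the sketch catches FactoryConfigurationError.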
