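Out of boredom I wanted to write a small RSS reader. The plan was fairly clear:
1. Use ajax to fetch the XML data from a URL;
2. Parse the data and extract the useful information;
3. Display the extracted data in a sensible way.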
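I expected this to be simple, but I got stuck at the very first step. First, ajax is subject to the cross-domain restriction, so the page is not permitted to fetch the feed directly. The only option was to fetch it on the server side with URL.
But whenever the XML that came back contained Chinese, the text was garbled. Decoding it with new String(string.getBytes(), "UTF-8") still left part of the Chinese garbled. The fetch code is shown below.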
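A brief summary of the cause: an XML byte stream has to be read with the encoding it was written in. XML is usually encoded as UTF-8, while the system default used for reading is probably GBK (Simplified Chinese). That last point is only a guess, based on the fact that new String(string.getBytes(), "UTF-8"), which goes through the default GBK, recovers part of the Chinese.
So the encoding should be specified when the XML is read; see comment 1 in the code below.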
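The effect described above can be reproduced without a servlet. The following is a minimal sketch, not the code from this post: the class and method names are made up for illustration, and GBK is assumed to be the platform default, as guessed above. It decodes UTF-8 bytes with the wrong charset, applies the getBytes()/new String() repair, and compares the result with decoding correctly from the start.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /** Decodes UTF-8 bytes with the wrong charset, then attempts the getBytes()/new String() repair. */
    static String decodeWronglyThenRepair(byte[] utf8Bytes, Charset wrong) {
        String garbled = new String(utf8Bytes, wrong);   // what a reader using the wrong charset produces
        byte[] reencoded = garbled.getBytes(wrong);      // the step hidden inside string.getBytes()
        return new String(reencoded, StandardCharsets.UTF_8);
    }

    /** Decodes with the matching charset up front, so nothing needs repairing. */
    static String decodeCorrectly(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: GBK stands in for the platform default mentioned in the post.
        Charset gbk = Charset.forName("GBK");
        String original = "\u4e2d"; // one Chinese character, three bytes in UTF-8
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("decoded correctly: " + original.equals(decodeCorrectly(utf8)));
        System.out.println("repaired after wrong decode: " + original.equals(decodeWronglyThenRepair(utf8, gbk)));
    }
}
```

The repair can only succeed when every byte survives the wrong decoding step. GBK consumes bytes in pairs, so a three-byte UTF-8 character leaves a dangling byte that is replaced during decoding and cannot be recovered afterwards, which would explain why only part of the Chinese comes back.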
If the decoded text is passed straight back to the ajax call, no XML object comes back. Looking closely at the XML that was finally output, I found an extra "?" at the very end. My first guess was that this question mark marks the end of the XML, but it actually comes from the read loop: in.read() returns -1 at end of stream, and the original loop appended (char) -1, which is '\uFFFF', before checking for it. XML 1.0 does not allow that character, and the DOM is very strict about XML well-formedness, so sending the text back as-is never yields a responseXML.documentElement object. Once that character is gone (comment 2 in the code), the response parses normally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.MalformedURLException;
import java.net.URL;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.log4j.Logger; // assumed: log4j 1.x, which matches Logger.getLogger(Class) and logger.error(...)

public class GetRss extends HttpServlet
{
    private static final Logger logger = Logger.getLogger(GetRss.class);

    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {
        String url = request.getParameter("RssUrl");
        logger.info(url);
        try
        {
            URL rssUrl = new URL(url);
            StringBuilder sb = new StringBuilder();
            // 1: name the charset on the InputStreamReader; without it the platform default is used
            try (BufferedReader in = new BufferedReader(new InputStreamReader(rssUrl.openStream(), "UTF-8")))
            {
                int i;
                // 2: test for -1 before appending, so no stray end-of-stream character lands at the end of the XML
                while ((i = in.read()) != -1)
                {
                    sb.append((char) i);
                }
            }
            String out = sb.toString();
            response.setContentType("text/xml;charset=utf-8");
            response.setHeader("Cache-Control", "no-cache");

            PrintWriter pw = new PrintWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-8"));
            pw.write(out);
            pw.close();
        }
        catch (MalformedURLException e)
        {
            logger.error("GetRss.execute Error. " + e.getMessage());
        }
        catch (IOException e)
        {
            logger.error("GetRss.execute Error. " + e.getMessage());
        }
    }
}
```
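The extra character at the end of the output can be reproduced in isolation. The sketch below is only an illustration with made-up class and method names: one method mirrors the shape of a loop that appends before testing for -1, the other tests first. It relies only on Reader.read() returning -1 at end of stream, which is documented behaviour.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadLoopDemo {

    /** Appends before testing for -1, so the end-of-stream marker itself ends up in the text. */
    static String readAppendingEndMarker(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c = 0;
        while (c != -1) {
            c = in.read();
            sb.append((char) c); // on the last pass this appends (char) -1
        }
        return sb.toString();
    }

    /** Tests for -1 before appending, so nothing has to be trimmed afterwards. */
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String xml = "<rss></rss>";
        String withMarker = readAppendingEndMarker(new StringReader(xml));
        String clean = readAll(new StringReader(xml));

        System.out.println("extra chars: " + (withMarker.length() - xml.length()));
        System.out.println("last char is (char) -1: " + (withMarker.charAt(withMarker.length() - 1) == (char) -1));
        System.out.println("clean read matches input: " + clean.equals(xml));
    }
}
```

Testing for -1 before appending avoids trimming the last character afterwards, which would otherwise cut off a real character if the loop were ever corrected.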