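Solving Garbled-Text (Character Encoding) Problems in Java
Garbled text in Java has long been a stumbling block for beginners. Drawing on my own experience, this post walks through a fix. I also wrote a demo web application in which the problem is solved.
1 Source of the problem
Garbled text in Java comes from inconsistent encodings across the operating system, the database (MySQL), the web server (Tomcat), and the pages (JSP). For example, if MySQL uses latin1 while the characters on the page are encoded as GBK, the text will be garbled.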
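The mismatch described above can be reproduced in a few lines of plain Java. This is only a minimal sketch: the class and method names are made up for illustration, and it uses UTF-8 and ISO-8859-1 instead of GBK and latin1 because both are guaranteed to exist in every JDK (MySQL's latin1 is really cp1252, so ISO-8859-1 is an approximation).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: text written in one charset and read back in another. */
final class EncodingMismatchDemo {

    /** Encodes the text with one charset, then decodes the same bytes with another. */
    static String recode(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the source
        // file's own encoding cannot affect the demo.
        String original = "\u4e2d\u6587";

        // Writer and reader disagree: each UTF-8 byte becomes one wrong character.
        String garbled = recode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("mismatch keeps the text intact: " + garbled.equals(original));
        System.out.println("garbled length: " + garbled.length());

        // Writer and reader agree: the text survives unchanged.
        String intact = recode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("match keeps the text intact: " + intact.equals(original));
    }
}
```

The two characters take three bytes each in UTF-8, so the mismatched read produces six unrelated characters, which is exactly the kind of garbling seen on a page.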
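2 Solution
Once the cause is understood, the fix follows. As long as every link in the chain uses the same encoding, no garbling occurs, so setting every layer to UTF-8 removes the problem (UTF-8 is recommended throughout because it also supports internationalization).
3 Setting the MySQL database encoding (using MySQL 5.0.41 as an example)
- List the encodings the database supports: show character set;
This lists every character set the MySQL server supports; utf8 is among them.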
+----------+-----------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |
| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| tis620 | TIS620 Thai | tis620_thai_ci | 1 |
| euckr | EUC-KR Korean | euckr_korean_ci | 2 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| binary | Binary pseudo charset | binary | 1 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |
+----------+-----------------------------+---------------------+--------+
36 rows in set (0.00 sec)
- Check the database's default encodings: show variables like '%character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | E:\mysql-5.0.41-win32\share\charsets\ |
+--------------------------+---------------------------------------+
8 rows in set (0.00 sec)
This shows MySQL's current character-set variables. Of these, character_set_server is the default encoding used when a database is created, and it needs to be changed to utf8.
- Change the database's default encoding: set character_set_server='utf8';
Query OK, 0 rows affected (0.00 sec)
After running this command, the default encoding shown is utf8:
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | E:\mysql-5.0.41-win32\share\charsets\ |
+--------------------------+---------------------------------------+
8 rows in set (0.00 sec)
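From now on, databases and tables created without an explicit character set will also use utf8. Note that SET without GLOBAL changes the variable only for the current session; to make the change permanent, put it in the server's configuration file.
- Check the encoding of a schema and a table:
show create database database_name;
show create table table_name;
For example, suppose there is a database mydbdefault that contains a table test: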
Database changed
mysql> show create database mydbdefault;
+-------------+------------------------------------------------------------------------+
| Database    | Create Database                                                        |
+-------------+------------------------------------------------------------------------+
| mydbdefault | CREATE DATABASE `mydbdefault` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+-------------+------------------------------------------------------------------------+
1 row in set (0.00 sec)
The database mydbdefault uses the latin1 encoding.
mysql> show create table test;
+-------+------------------------------------------------------------------------------------------+
| Table | Create Table                                                                             |
+-------+------------------------------------------------------------------------------------------+
| test  | CREATE TABLE `test` (
  `id` int(20) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
The table also uses the latin1 encoding.
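- Change the encoding of a schema and a table:
alter database database_name character set utf8;
alter table table_name character set utf8;
Since neither the database nor the table uses utf8, change the character set of both to utf8:
mysql> alter database mydbdefault character set utf8;
Query OK, 1 row affected (0.00 sec)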
mysql> show create database mydbdefault;
+-------------+----------------------------------------------------------------------+
| Database    | Create Database                                                      |
+-------------+----------------------------------------------------------------------+
| mydbdefault | CREATE DATABASE `mydbdefault` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+-------------+----------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> alter table test character set utf8;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table test;
+-------+----------------------------------------------------------------------------------------+
| Table | Create Table                                                                           |
+-------+----------------------------------------------------------------------------------------+
| test  | CREATE TABLE `test` (
  `id` int(20) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |
+-------+----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
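- If you do not know the default encoding, it is best to specify utf8 explicitly when creating databases and tables:
create database database_name character set utf8;
create table table_name (...) character set utf8;
For more on MySQL's character-set commands, see the MySQL reference manual: http://dev.mysql.com/doc/refman/5.1/zh/charset.html
4 Tomcat encoding settings
In the Tomcat installation directory, open TOMCAT_HOME\conf\server.xml, find the Connector element whose last attributes are shown below, and add URIEncoding="UTF-8" after them: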
connectionTimeout="20000"
redirectPort="8443" URIEncoding="UTF-8"/>
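To see why the URI encoding matters, the sketch below percent-encodes a string the way a browser would send it in a URL and then decodes it with two different charsets. It uses only java.net.URLEncoder and java.net.URLDecoder from the JDK; the class and helper names are hypothetical, and it merely imitates what a server does with URL parameters rather than calling Tomcat itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical demo: decoding percent-encoded URL text with the right and wrong charset. */
final class UriEncodingDemo {

    /** Percent-encodes text using the named charset. */
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    /** Decodes percent-encoded text using the named charset. */
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";

        // What a UTF-8 page would put into the URL.
        String sent = encode(original, "UTF-8");
        System.out.println(sent); // prints %E4%B8%AD%E6%96%87

        // Server decodes as UTF-8: the parameter is recovered.
        System.out.println(decode(sent, "UTF-8").equals(original));

        // Server decodes as ISO-8859-1: the parameter is garbled.
        System.out.println(decode(sent, "ISO-8859-1").equals(original));
    }
}
```

Setting URIEncoding="UTF-8" on the Connector corresponds to the first decode call: the server then interprets the percent-encoded bytes with the same charset the page used to produce them.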
5 Encoding handling in the web application
To make sure strings submitted to the web application are decoded as UTF-8, you can write a filter. It is configured in web.xml as follows:
<filter>
    <filter-name>Set Character Encoding</filter-name>
    <filter-class>com.fengmanfei.util.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>Set Character Encoding</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
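The filter code is as follows:
package com.fengmanfei.util;

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

/** Sets the request character encoding before any parameter is read. */
public class SetCharacterEncodingFilter implements Filter {

    // Used when web.xml does not supply the "encoding" init-param.
    private String encoding = "UTF-8";

    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        // Must run before the first getParameter() call to take effect.
        request.setCharacterEncoding(encoding);
        chain.doFilter(request, response);
    }

    public void init(FilterConfig filterConfig) throws ServletException {
        // Read the encoding configured in web.xml, if any.
        String s = filterConfig.getInitParameter("encoding");
        if (s != null) {
            encoding = s;
        }
    }

    public void destroy() {
        // Nothing to release.
    }
}
The key point is that the filter sets the request encoding to UTF-8 before any parameter is read. This covers the request body (for example, POST form data); parameters carried in the URL are decoded according to the Connector's URIEncoding setting in server.xml.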
6 Database connection URL
jdbc:mysql://localhost:3306/mydb?characterEncoding=UTF-8
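As a rough sketch of how this URL might be used from Java, the class below builds the same connection string and passes it to DriverManager. The class name, the helper methods, and the user and password arguments are all hypothetical; actually opening the connection assumes the MySQL JDBC driver is on the classpath and a server is listening on localhost:3306 with a database named mydb.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper: opens a MySQL connection that exchanges text as UTF-8. */
final class Utf8ConnectionDemo {

    /** Builds a JDBC URL with the characterEncoding parameter from the article. */
    static String utf8Url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?characterEncoding=UTF-8";
    }

    /** Opens the connection; requires a running server and the MySQL driver. */
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(utf8Url("localhost", 3306, "mydb"), user, password);
    }

    public static void main(String[] args) {
        // Only prints the URL so the sketch runs without a database.
        System.out.println(utf8Url("localhost", 3306, "mydb"));
    }
}
```

Keeping the URL construction in one place makes it harder for one part of the application to connect without the encoding parameter.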
7 Encoding handling in pages
Each JSP page must also declare its page encoding:
<%@ page language="java" import="java.util.*" pageEncoding="UTF-8"%>
8 Encoding settings in Eclipse
Finally, when using Eclipse, set the workspace encoding to UTF-8 as well: choose "Window" | "Preferences" | "General" | "Workspace" and change "Text file encoding" to UTF-8.
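The workspace setting decides which charset is used when text files are saved and reopened. The sketch below imitates that with a temporary file: it saves text in one charset and reads it back in another. The class and method names are invented for illustration, and ISO-8859-1 simply stands in for whatever non-UTF-8 default a workspace might have.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical demo: a file saved in one charset and opened in another. */
final class FileEncodingDemo {

    /** Writes text to a temporary file with one charset and reads it back with another. */
    static String writeThenRead(String text, Charset savedAs, Charset openedAs) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                Files.write(file, text.getBytes(savedAs));
                return new String(Files.readAllBytes(file), openedAs);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";

        // Saved and opened as UTF-8: the text is unchanged.
        System.out.println(writeThenRead(original,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(original));

        // Saved as UTF-8 but opened as ISO-8859-1: the text is garbled.
        System.out.println(writeThenRead(original,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```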
Posted on 2008-08-25 11:35 by Janet. Category: JAVA Tips