Ginew.Z 的博客

一切，為了讓生活更簡單、更自然

BlogJava :: 首頁 :: 聯(lián)系 :: 聚合

:: 管理

21 Posts :: 0 Stories :: 14 Comments :: 0 Trackbacks

常用鏈接

留言簿(6)

我參與的團隊

隨筆分類

隨筆檔案

搜索

閱讀排行榜

評論排行榜

2006年4月11日 #

href=# vs href=void(0)

href="#" vs. href="javascript:void(0)"

開發(fā)的時候有時需要用link(<a>)來觸發(fā)一些javascript事件，所以常常可以看到如下的代碼：

<a href="javascript:void(0)" onclick="doSomething();returnfalse;">Link</a>

這是一個曾經(jīng)被多次討論過的問題，長期以來，我也一直是這樣寫的。讀了 a href=”javascript:void(0);” — avoid the void 之后，我認(rèn)同了作者的意見。下面的寫法確實更合理：

<a href="#" onclick="doSomething();returnfalse;">Link</a>

或者

<script type="javascript">
function doSomething() {
  //doSomething
  returnfalse;
}
</script>
<a href="#" onclick="return doSomething();">Link</a>

以往大家不使用"#"的問題是，這將導(dǎo)致點擊鏈接時頁面跳回頁面頂部，但通過 return false 語句將使得瀏覽器忽略鏈接的默認(rèn)動作，從而避免了此問題。

youngpup 更有意思，他在How to Create Pop-Up Windows 中言辭激烈的倡導(dǎo)大家永遠(yuǎn)永遠(yuǎn)永遠(yuǎn)不要使用 javascript: 偽協(xié)議：

Never, ever, ever use the javascript: pseudo-protocol for anything, ever ever ever ever again. Please. Pretty please.

他的解決方案是：

<a 
  href="http://google.com/" 
  onclick="window.open(this.href, 'popupwindow', 
  'width=400,height=300,scrollbars,resizable'); 
  returnfalse;">

這樣的好處就是可以保存到書簽或者收藏夾，可以左鍵單擊，也可以右鍵使用！

posted @ 2006-11-17 12:15 無風(fēng)之雨閱讀(1106) | 評論 (2) | 編輯收藏

resin使用security-manager的問題

???? 我們打算為用戶架設(shè)單獨的虛擬主機服務(wù)器，可以讓企業(yè)自主上傳jsp、htm、php等程序。其中resin用來做jsp的容器。
???? 由于是用戶自主通過FTP上傳程序，我們必須控制這些程序可以執(zhí)行的權(quán)限，不能讓用戶隨便瀏覽硬盤上的文件，但又要能讓resin可以正常運行。比如：/data/user_a目錄中的程序，只能在/data/user_a目錄及其子目錄中讀寫，如果想要訪問其他目錄，就沒有權(quán)限。
???? 通過研究resin的文檔以及JAVA的機制，我認(rèn)為要實現(xiàn)以上構(gòu)想，可以通過使用java權(quán)限管理器來構(gòu)建一個resin的沙箱來對java的具體操作進行授權(quán)。
參考文檔：http://www.caucho.com/resin-3.0/security/securitymanager.xtp ，http://www.jscud.com/srun/news/viewhtml/3_2005_10/148.htm

???? 當(dāng)我認(rèn)為勝利在望的時候，發(fā)現(xiàn)resin好像不支持grant codeBase "file:xxxx 。

grant codeBase "file:/data/ftpdata/user01.test.com/-" {
?permission java.io.FilePermission "/data/ftpdata/user01.test.com/-", "read,write,delete";
};
???? 上面的語句，語法上沒有問題，但就是不起作用。那個codebase目錄下的文件，對本目錄沒有任何權(quán)限。

??????? resin的官方論壇里面，有人在2001年，針對resin1.2.5就提出了和我一摸一樣的疑問（http://www.caucho.com/support/resin-interest/0105/0106.html），作者發(fā)現(xiàn)問題是由于resin的classloader是非安全的，因此改了resin原文件后解決了問題（http://www.caucho.com/support/resin-interest/0105/0112.html），但是我看resin3的源代碼，里面已經(jīng)基于java.security.SecureClassLoader，因此應(yīng)該不是這個原因了。
???? 以下是我的resin.policy文件：

grant codeBase "file:${java.home}/lib/-" {
?permission java.security.AllPermission;
};

grant codeBase "file:${java.home}/jre/lib/-" {
?permission java.security.AllPermission;
};

grant codeBase "file:${resin.home}/lib/-" {
?permission java.security.AllPermission;
};

grant {
?permission java.util.PropertyPermission "*", "read";
?permission java.io.SerializablePermission "enableSubstitution";
?permission java.lang.reflect.ReflectPermission "suppressAccessChecks";?
?permission java.lang.RuntimePermission "accessClassInPackage.*";
?permission java.lang.RuntimePermission "getClassLoader";
?permission java.lang.RuntimePermission "accessDeclaredMembers";
?permission java.lang.RuntimePermission "modifyThreadGroup";
?permission java.lang.RuntimePermission "setContextClassLoader";
?permission java.lang.RuntimePermission "setIO";
?permission java.lang.RuntimePermission "stopThread";
?permission java.lang.RuntimePermission "createClassLoader";
?permission java.lang.RuntimePermission "getProtectionDomain";
?permission java.lang.RuntimePermission "defineClassInPackage";
?permission java.security.SecurityPermission "putProviderProperty.SunJCE";
?permission java.security.SecurityPermission "insertProvider.SunJCE";
?permission java.util.logging.LoggingPermission "control";
?permission java.lang.RuntimePermission "getAttribute";
?permission java.util.PropertyPermission "jaxp.debug", "read";
?permission ognl.OgnlInvokePermission "invoke.*";
?permission java.net.SocketPermission "localhost:3306","connect";
?permission java.io.FilePermission "${resin.home}/-", "read";
?permission java.io.FilePermission "${java.home}/-", "read";
?permission java.io.FilePermission "/tmp/-","read,write,delete";
?permission java.io.FilePermission "/tmp","read,write,delete";
?permission java.io.FilePermission ".","read";
?permission java.io.FilePermission "/home/apps/java/jdk/lib/tools.jar","read";
?permission java.io.FilePermission "/bin/sh", "read,execute";
};

//以下語句沒有任何作用，/data/ftpdata/user01.test.com/下的jsp對這個目錄沒有讀的權(quán)限
grant codeBase "file:/data/ftpdata/user01.test.com/-" {
?permission java.io.FilePermission "/data/ftpdata/user01.test.com/-", "read,write,delete";
};

posted @ 2006-06-09 11:00 無風(fēng)之雨閱讀(693) | 評論 (2) | 編輯收藏

GPL協(xié)議簡述

要了解GPL，一般地，您沒有必要耐心閱讀原版的GPL協(xié)議，因為 GPL 無非交待了幾個原則：

確保軟件自始至終都以開放源代碼形式發(fā)布，保護開發(fā)成果不被竊取用作商業(yè)發(fā)售。任何一套軟件，只要其中使用了受 GPL 協(xié)議保護的第三方軟件的源程序，并向非開發(fā)人員發(fā)布時，軟件本身也就自動成為受 GPL 保護并且約束的實體。也就是說，此時它必須開放源代碼。
GPL 大致就是一個左側(cè)版權(quán)（Copyleft，或譯為“反版權(quán)”、“版權(quán)屬左”、“版權(quán)所無”、“版責(zé)”等）的體現(xiàn)。你可以去掉所有原作的版權(quán) 信息，只要你保持開源，并且隨源代碼、二進制版附上 GPL 的許可證就行，讓后人可以很明確地得知此軟件的授權(quán)信息。GPL 精髓就是，只要使軟件在完整開源的情況下，盡可能使使用者得到自由發(fā)揮的空間，使軟件得到更快更好的發(fā)展。
無論軟件以何種形式發(fā)布，都必須同時附上源代碼。例如在 Web 上提供下載，就必須在二進制版本（如果有的話）下載的同一個頁面，清楚地提供源代碼下載的鏈接。如果以光盤形式發(fā)布，就必須同時附上源文件的光盤。
開發(fā)或維護遵循 GPL 協(xié)議開發(fā)的軟件的公司或個人，可以對使用者收取一定的服務(wù)費用。但還是一句老話——必須無償提供軟件的完整源代碼，不得將源代碼與服務(wù)做捆綁或任何變相捆綁銷售。

posted @ 2006-05-16 16:50 無風(fēng)之雨閱讀(687) | 評論 (0) | 編輯收藏

最想要的車

posted @ 2006-05-14 11:28 無風(fēng)之雨閱讀(283) | 評論 (0) | 編輯收藏

關(guān)于IE無法打開INTERNET站點的錯

??????? 今天新頁面上線，很多同事報告說頁面打開到一半，經(jīng)常跳出無法打開Internet站點的錯誤，然后頁面會跳轉(zhuǎn)到DNS錯誤的頁面。

?????

????????
??????? 這個問題我以前遇到過，一直沒有詳細(xì)的去深究原因，只是以為是服務(wù)器關(guān)閉連接太快的原因。今天發(fā)現(xiàn)這個問題出的很頻繁，服務(wù)器方面沒有改什么，只是上傳了新的頁面程序而已，應(yīng)該不會和服務(wù)器有關(guān)。在對頁面進行分析，并搜索了一下網(wǎng)上，發(fā)現(xiàn)原來是js在document還沒完全load完的時候就試圖改變其值導(dǎo)致。

??????? 因此對js做如下改變：

原js：

???? window.settimeout("go()",500);
???? function go(){
??? .......
???? }

改成：

var go_i=window.setInterval("go()",500);
function go(){
???if(document.readyState=="complete"){
????? window.clearInterval(go2_i);
??? }
????else return;
??? ........
}
目的就是讓他一定要在document完成后才執(zhí)行那個操作

posted @ 2006-04-19 18:14 無風(fēng)之雨閱讀(6924) | 評論 (6) | 編輯收藏

document.domain與域名的問題

以前如果要使iframe里面的腳本能訪問parent的內(nèi)容，但iframe和parent的二級域名相同，那一般都會在兩者都寫上document.domain="xxx.com" 以放寬訪問權(quán)限。

今天發(fā)現(xiàn)，如果iframe和parent在同一個三級域名下，比如都是aa.bb.com，那設(shè)了document.domain反而會造成訪問拒絕。

查了下MSDN，有如下解釋：

Remarks

The property initially returns the host name of the server from which the page is served. The property can be assigned the domain suffix to allow sharing of pages across frames. For example, a page in one frame from home.microsoft.com and a page from www.microsoft.com initially would not be able to communicate with each other. However, by setting the domain property of both pages to the suffix "microsoft.com", you ensure that both pages are considered secure and access is available between the pages.

When you set the domain property, use the domain name determined by the server rather than by the client browser.

All the pages on different hosts must have the domain property explicitly set to the same value to communicate successfully with each other. For example, the value of the domain property of a page on the host microsoft.com would be "microsoft.com" by default. It might seem logical that if you set the domain property of a page on another host named msdn.microsoft.com to "microsoft.com," that the two pages could communicate with each other. However, this is not the case unless you have also explicitly set the domain property of the page on microsoft.com to "microsoft.com".

Furthermore, this property cannot be used to allow cross-frame communication among frames with different domain suffixes. For example, a page in one frame from www.microsoft.com and a page in another frame from www.msn.com would not be able to communicate with each other even if the domain property of both pages was set to the suffix "microsoft.com".

Security Alert??Using this property incorrectly can compromise the security of your Web site. Set the domain property only if you must allow cross-domain scripting. Use a value determined on the server. Setting this property to a value determined on the client (like through the location object) could expose your site to attack from another site through Domain Name System (DNS) manipulation. For more information, see Security Considerations: Dynamic HTML.

For more information on domain security, see About Cross-Frame Scripting and Security.

posted @ 2006-04-13 11:54 無風(fēng)之雨閱讀(9359) | 評論 (3) | 編輯收藏

域名和COOKIE的問題

今天發(fā)現(xiàn)，在IE里面，當(dāng)一個域名包含_的時候，IE不會給這個網(wǎng)站發(fā)送COOKIE，真變態(tài)。同事調(diào)試了半天，才發(fā)現(xiàn)還有這個問題

posted @ 2006-04-11 17:54 無風(fēng)之雨閱讀(278) | 評論 (0) | 編輯收藏

我認(rèn)為最好的MYSQL備份方式

??? 要備份MYSQL，很多人用mysqldump，其實這種方式，導(dǎo)出的文件是最大的，導(dǎo)入的時間是最久的。命令是方便的，但真正發(fā)生錯誤的時候，恢復(fù)效率很低。
??? 我主張，另外找一臺比較空閑的機器，來做數(shù)據(jù)庫的備份。這臺機器作以下用途：

?? 它是主數(shù)據(jù)庫帶的slave數(shù)據(jù)庫群里面的一臺，每天凌晨定時啟動同步數(shù)據(jù)，等追上bin-log并全部執(zhí)行后，停止同步，并用select * into outfile將數(shù)據(jù)全部導(dǎo)出成文件，并且在每周的某一天，清除掉主數(shù)據(jù)庫上已經(jīng)同步好的bin-log，以確保硬盤空間不被log占滿。

?? 為此，我寫了3個腳本，分別執(zhí)行1、啟動mysql，追log，然后停止slave；2、導(dǎo)出全部數(shù)據(jù)庫全部文件到文件；3、刪除主數(shù)據(jù)庫的log

---------------------------------------------------------------------------------------
#!/bin/bash
#readMasterMysql.sh
CHECK_MYSQL=0
/home/mysql/bin/mysqld_safe &
until [ "$CHECK_MYSQL" = "1" ]
do
? sleep 10
? CHECK_MYSQL=`/home/mysql/bin/mysql -uroot -e"show slave status"|awk '{if($14==$21)print "1"}'|tail -n1`
done
/home/mysql/bin/mysql -uroot -e"slave stop"
/home/script/backupMysql.sh
/home/mysql/bin/mysqladmin shutdown
WEEK=`date "+%w"`
if [ $WEEK = "5" ]
then
??? /home/script/purgeLog.sh
fi

------------------------------------------------------------------------------
#!/bin/bash
#purgeLog.sh
LOG_FILE=/home/mysql/data/master.info
DB_SERVER=`sed -n '4p' $LOG_FILE`
DB_USER=`sed -n '5p' $LOG_FILE`
DB_PASS=`sed -n '6p' $LOG_FILE`
DB_LOGFILE=`sed -n '2p' $LOG_FILE`
/home/mysql/bin/mysql -h$DB_SERVER -u$DB_USER -p"$DB_PASS" -e"purge master logs to '$DB_LOGFILE'"

------------------------------------------------------------------------------
#!/bin/bash
#backupMysql.sh
database=$1
table=$2
MYSQL_CLIENT="/home/mysql/bin/mysql -uroot --default-character-set=gbk"
MYSQL_DUMP="/home/mysql/bin/mysqldump -d -uroot --default-character-set=gbk"
OUTPUT_PATH=/date/backup
for databases in `$MYSQL_CLIENT -e "show databases"|grep -v Database`
do
if [ "$#" = "0" -o "$database" = "$databases" ] ; then
??????? mkdir -p -m777 $OUTPUT_PATH/$databases/
??????? $MYSQL_DUMP $databases > $OUTPUT_PATH/$databases/$databases.sql
??????? for tables in `$MYSQL_CLIENT -e "show tables" $databases|grep -v Tables_in_`
??????? do
??????? if [ "$#" = "0" -o "$#" = "1" -o "$table" = "$tables" ] ; then
??????????????? mv -f $OUTPUT_PATH/$databases/$tables $OUTPUT_PATH/$databases/$tables.old
??????????????? $MYSQL_CLIENT -e "select * into outfile '$OUTPUT_PATH/$databases/$tables' from $tables" $databases
??????? fi
??????? done
fi
done

posted @ 2006-04-11 12:47 無風(fēng)之雨閱讀(667) | 評論 (0) | 編輯收藏

如何讓referer和user-agent同時為空時記錄日志

一般情況下，Referer和User-Agent同時為空的時候，可以認(rèn)為是其他網(wǎng)站在批量采集本站數(shù)據(jù)，我打算deny掉這種請求。不過apache文檔里面沒有提到有兩個環(huán)境變量的與操作。最后chinaunix上有大俠回答了我的問題：

SetEnv?? log_flag=1
SetEnvIf Referer !"^$"? log_flag=0
SetEnvIf user-agent !"^$" log_flag=0
...

看字面上，就是如果用兩個非的或來代替與
這樣，只要兩個條件有一個不滿足，就log_flag就會變掉，只要它變掉了，就說明不符合我屏蔽的規(guī)則。
高手就是高手，為什么非要苛求一定要有“與”呢，兩個“非”的“或”，不一樣達到要求？

學(xué)習(xí)了。

posted @ 2006-04-11 12:25 無風(fēng)之雨閱讀(843) | 評論 (0) | 編輯收藏

oracle到mysql的遷移，函數(shù)部分

to_date(?, 'YYYY-MM-DD HH24:MI:SS')"
STR_TO_DATE('2003-15-10 00:00:00','%Y-%m-%d %H:%i:%s');???? //格式不對，會返回NULL

to_char(create_time,'yyyy-MM-dd')
DATE_FORMAT(create_time,'%Y-%m-%d')

sysdate
now()或者CURRENT_TIMESTAMP?//'1997-12-15 23:50:26',建表的時候，timestamp類型可以指定default CURRENT_TIMESTAMP

sysdate - 7?? //7天前
now()-INTERVAL 7 DAY??

select * from (select .... where rownum<end) where rownum>start
limit [start,] length

substr(productInfor,1,20)
SUBSTRING('Quadratically',5,6)???? //SUBSTRING(str,pos,len)

instr(str,substr,pos,index)
instr(str,substr) 或者 locate(substr,str,pos)
// 沒有相對應(yīng)的語法，但一般情況，這個是和substr結(jié)合起來用的。
//如果是str="2005-10-01"取中間的10這樣的需要，oracle是substr(str,instr(str,'-',1,1)+1,instr(str,'-',1,2)-instr(str,'-',1,1)-1)
那在mysql里面，可以試試這樣SUBSTRING_INDEX(SUBSTRING_INDEX(str,'-',2),'-',-1),意思就是取第二個-之前的str后（2005-10），再取倒數(shù)第一個-之后的內(nèi)容

oracle的nvl（ss,dd）函數(shù)在mysql中怎么實現(xiàn)？
答：ifnull(ss,dd)

posted @ 2006-04-11 12:16 無風(fēng)之雨閱讀(855) | 評論 (0) | 編輯收藏

Mysql到oracle程序遷移的注意事項

Mysql到oracle程序遷移的注意事項(摘抄)
有很多應(yīng)用項目, 剛起步的時候用MYSQL數(shù)據(jù)庫基本上能實現(xiàn)各種功能需求，隨著應(yīng)用用戶的增多，
數(shù)據(jù)量的增加，MYSQL漸漸地出現(xiàn)不堪重負(fù)的情況：連接很慢甚至宕機，于是就有把數(shù)據(jù)從MYSQL遷到
ORACLE的需求，應(yīng)用程序也要相應(yīng)做一些修改。本人總結(jié)出以下幾點注意事項，希望對大家有所幫助。

1．自動增長的數(shù)據(jù)類型處理
MYSQL有自動增長的數(shù)據(jù)類型，插入記錄時不用操作此字段，會自動獲得數(shù)據(jù)值。
ORACLE沒有自動增長的數(shù)據(jù)類型，需要建立一個自動增長的序列號，插入記錄時要把序列號的下一個
值賦于此字段。

CREATE SEQUENCE 序列號的名稱 (最好是表名+序列號標(biāo)記) INCREMENT BY 1 START WITH 1
MAXVALUE 99999 CYCLE NOCACHE;
其中最大的值按字段的長度來定, 如果定義的自動增長的序列號 NUMBER(6) , 最大值為999999
INSERT 語句插入這個字段值為: 序列號的名稱.NEXTVAL

2. 單引號的處理
MYSQL里可以用雙引號包起字符串，ORACLE里只可以用單引號包起字符串。在插入和修改字符串
前必須做單引號的替換：把所有出現(xiàn)的一個單引號替換成兩個單引號。

3. 翻頁的SQL語句的處理
MYSQL處理翻頁的SQL語句比較簡單，用LIMIT 開始位置, 記錄個數(shù)；PHP里還可以用SEEK定位到結(jié)果
集的位置。
ORACLE處理翻頁的SQL語句就比較繁瑣了。每個結(jié)果集只有一個ROWNUM字段標(biāo)明它的位置, 并且只能
用ROWNUM<100, 不能用ROWNUM>80。
以下是經(jīng)過分析后較好的兩種ORACLE翻頁SQL語句( ID是唯一關(guān)鍵字的字段名 )：
語句一：
SELECT ID, [FIELD_NAME,...] FROM TABLE_NAME WHERE ID IN ( SELECT ID FROM (SELECT
ROWNUM AS NUMROW, ID FROM TABLE_NAME WHERE 條件1 ORDER BY 條件2) WHERE NUMROW > 80 AND
NUMROW < 100 ) ORDER BY 條件3;

語句二：
SELECT * FROM (( SELECT ROWNUM AS NUMROW, c.* from (select [FIELD_NAME,...] FROM
TABLE_NAME WHERE 條件1 ORDER BY 條件2) c) WHERE NUMROW > 80 AND NUMROW < 100 ) ORDER BY 條件3;

4．長字符串的處理
長字符串的處理ORACLE也有它特殊的地方。INSERT和UPDATE時最大可操作的字符串長度小于等于
4000個單字節(jié), 如果要插入更長的字符串, 請考慮字段用CLOB類型，方法借用ORACLE里自帶的DBMS_LOB程序
包。插入修改記錄前一定要做進行非空和長度判斷，不能為空的字段值和超出長度字段值都應(yīng)該提出警告,
返回上次操作。

5. 日期字段的處理
MYSQL日期字段分DATE和TIME兩種，ORACLE日期字段只有DATE，包含年月日時分秒信息，用當(dāng)前數(shù)據(jù)庫
的系統(tǒng)時間為SYSDATE, 精確到秒，或者用字符串轉(zhuǎn)換成日期型函數(shù)TO_DATE(‘2001-08-01’,’YYYY-MM-DD’)
年-月-日 24小時:分鐘:秒的格式Y(jié)YYY-MM-DD HH24:MI:SS TO_DATE()還有很多種日期格式, 可以參看
ORACLE DOC.
日期型字段轉(zhuǎn)換成字符串函數(shù)TO_CHAR(‘2001-08-01’,’YYYY-MM-DD HH24:MI:SS’)

日期字段的數(shù)學(xué)運算公式有很大的不同。
MYSQL找到離當(dāng)前時間7天用
DATE_FIELD_NAME > SUBDATE（（NOW（），INTERVAL 7 DAY）
ORACLE找到離當(dāng)前時間7天用
DATE_FIELD_NAME >SYSDATE - 7;

6. 空字符的處理
MYSQL的非空字段也有空的內(nèi)容，ORACLE里定義了非空字段就不容許有空的內(nèi)容。
按MYSQL的NOT NULL來定義ORACLE表結(jié)構(gòu), 導(dǎo)數(shù)據(jù)的時候會產(chǎn)生錯誤。因此導(dǎo)數(shù)據(jù)時要對空字符進行判
斷，如果為NULL或空字符，需要把它改成一個空格的字符串。

7. 字符串的模糊比較
MYSQL里用字段名 like '%字符串%'
ORACLE里也可以用字段名 like '%字符串%' 但這種方法不能使用索引, 速度不快
用字符串比較函數(shù) instr(字段名,'字符串')>0 會得到更精確的查找結(jié)果

8. 程序和函數(shù)里，操作數(shù)據(jù)庫的工作完成后請注意結(jié)果集和指針的釋放。

有興趣可以看MYSQL管理員指南

posted @ 2006-04-11 12:15 無風(fēng)之雨閱讀(270) | 評論 (0) | 編輯收藏

MYSQL特殊函數(shù)

1、用mysql內(nèi)置函數(shù)轉(zhuǎn)換ip地址和數(shù)字
利用兩個內(nèi)置函數(shù)
inet_aton:將ip地址轉(zhuǎn)換成數(shù)字型
inet_ntoa:將數(shù)字型轉(zhuǎn)換成ip地址

2、用Mysql內(nèi)置函數(shù)來轉(zhuǎn)化unix時間(秒值)和字符串時間
from_unixtime():1144728462 -> "2006-04-11 12:07:42"
unix_timestamp():"2006-04-11 12:07:42" -> 1144728462

posted @ 2006-04-11 12:13 無風(fēng)之雨閱讀(263) | 評論 (0) | 編輯收藏

MYSQL5.1的表分區(qū)功能

Improving Database Performance with Partitioning

A few years ago, I wrote an article entitled "The Foundation of Excellent Performance" (still available at http://www.tdan.com/i016fe03.htm) where I argued against the notion that SQL code was the number one contributor to performance in a database-driven system. Instead, I stated in the article that I firmly believed how good physical database design was far and away the leading component of superior database performance. In addition, I showed that Oracle's own research illustrated how poor design was the main culprit behind database downtime (planned or unplanned). In the years since then, I've not changed my stance and still think that any DBA who wants a high-performance database has got to invest in intelligent and savvy physical design to produce the kind of response times that make end users smile instead of scream.

One of the reasons I'm very excited about the release of MySQL 5.1 is that it contains a potent new weapon for designing supercharged databases that any MySQL DBA should quickly learn how to use and exploit. By smartly using the new 5.1 partitioning feature, a DBA can oftentimes dramatically improve the performance of most any VLDB or data warehouse they happen to be in charge of.
What is Partitioning?

Partitioning is a physical database design technique that many data modelers and DBAs are quite familiar with. Although partitioning can be used to accomplish a number of various objectives, the main goal is to reduce the amount of data read for particular SQL operations so that overall response time is reduced.

There are two major forms of partitioning:

1. Horizontal Partitioning - this form of partitioning segments table rows so that distinct groups of physical row-based datasets are formed that can be addressed individually (one partition) or collectively (one-to-all partitions). All columns defined to a table are found in each set of partitions so no actual table attributes are missing. An example of horizontal partitioning might be a table that contains ten years worth of historical invoice data being partitioned into ten distinct partitions, where each partition contains a single year's worth of data.
2. Vertical Partitioning - this partitioning scheme is traditionally used to reduce the width of a target table by splitting a table vertically so that only certain columns are included in a particular dataset, with each partition including all rows. An example of vertical partitioning might be a table that contains a number of very wide text or BLOB columns that aren't addressed often being broken into two tables that has the most referenced columns in one table and the seldom-referenced text or BLOB data in another.

Before database vendors began building partitioning (mainly horizontal) into their engines, DBAs and data modelers had to physically design separate table structures to hold the desired partitions, which either held redundant data (separate tables with data that were based off a live parent table) or were linked together to form one logical parent object (usually via a view). This practice has since been made obsolete for the most part for horizontal partitioning, although it is sometimes still done for vertical partitioning.
Partitioning in MySQL 5.1

One of the great new features in MySQL 5.1 is support for horizontal partitioning. The really good news about MySQL and the new 5.1 partitioning feature is all the major forms of partitioning are supported:

1. Range - this partitioning mode allows a DBA to specify various ranges for which data is assigned. For example, a DBA may create a partitioned table that is segmented by three partitions that contain data for the 1980's, 1990's, and everything beyond and including the year 2000.
2. Hash - this partitioning mode allows a DBA to separate data based on a computed hash key that is defined on one or more table columns, with the end goal being an equal distribution of values among partitions. For example, a DBA may create a partitioned table that has ten partitions that are based on the table's primary key.
3. Key - a special form of Hash where MySQL guarantees even distribution of data through a system-generated hash key.
4. List - this partitioning mode allows a DBA to segment data based on a pre-defined list of values that the DBA specifies. For example, a DBA may create a partitioned table that contains three partitions based on the years 2004, 2005, and 2006.
5. Composite - this final partitioning mode allows a DBA to perform sub-partitioning where a table is initially partitioned by, for example range partitioning, but then each partition is segmented even further by another method (for example, hash).

There are a number of benefits that come with partitioning, but the two main advantages are:

Increased performance - during scan operations, the MySQL optimizer knows what partitions contain the data that will satisfy a particular query and will access only those necessary partitions during query execution. For example, a million row table may be broken up into ten different partitions in range style so that each partition contains 100,000 rows. If a query is issued that only needs data from one of the partitions, and a table scan operation is necessary, only 100,000 rows will be accessed instead of a million. Obviously, it is much quicker for MySQL to sample 100,000 rows than one million so the query will complete much sooner. The same benefit is derived should index access be possible as local partitioned indexes are created for partitioned tables. Finally, it is possible to stripe a partitioned table across different physical drives by specifying different file system/directory paths for specific partitions. This allows physical I/O contention to be reduced when multiple partitions are accessed at the same time.
Simplified data management - partitioning allows a DBA to have more control over how data is managed inside of the database. By intelligently creating partitions, a DBA can simplify how certain data operations are performed. For example, a DBA can drop specific partitions in a partitioned table while the remaining partitions remain intact (as opposed to crafting a fragmentation-producing mass delete operation for the whole table). Further, partitions are automatically maintained by MySQL so the DBA doesn't have to manually separate and maintain a horizontal partitioning scheme for a table. For example, a DBA can create a history table that holds data for customers that are partitioned across various year ranges, and have those partitioned automatically enforced by the database server with no DBA intervention being necessary.

From a design-for-performance standpoint, we're mainly interested in point one above. By smartly using partitioning and matching the design to properly coded queries, a dramatic performance impact can be realized. Let's take a quick test drive of partitioning in MySQL 5.1 to see this in action. Note that all tests below were done on a Dell Optiplex box with a Pentium 4 3.00GHz processor, 1GB of RAM, running Fedora Core 4 and MySQL 5.1.6 alpha.
Partitioning in Action

To see the positive benefit partitioning can have on a database, let's create identical MyISAM tables that contain date sensitive information, but let's partition one and leave the other a standard heap table. For our partitioned table, we'll partition based on range and use a function that segments the data based on year:

mysql> CREATE TABLE part_tab
-> ( c1 int default NULL,
-> c2 varchar(30) default NULL,
-> c3 date default NULL
->
-> ) engine=myisam
-> PARTITION BY RANGE (year(c3)) (PARTITION p0 VALUES LESS THAN (1995),
-> PARTITION p1 VALUES LESS THAN (1996) , PARTITION p2 VALUES LESS THAN (1997) ,
-> PARTITION p3 VALUES LESS THAN (1998) , PARTITION p4 VALUES LESS THAN (1999) ,
-> PARTITION p5 VALUES LESS THAN (2000) , PARTITION p6 VALUES LESS THAN (2001) ,
-> PARTITION p7 VALUES LESS THAN (2002) , PARTITION p8 VALUES LESS THAN (2003) ,
-> PARTITION p9 VALUES LESS THAN (2004) , PARTITION p10 VALUES LESS THAN (2010),
-> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.00 sec)

Notice that we designed partitions for a particular year and finished with one catch-all partition to get anything that doesn't fall into any of the specific date partitions. Now let's create a mirror MyISAM table that's not partitioned:

mysql> create table no_part_tab
-> (c1 int(11) default NULL,
-> c2 varchar(30) default NULL,
-> c3 date default NULL) engine=myisam;
Query OK, 0 rows affected (0.02 sec)

Now let's create a procedure (thanks to Peter Gulutzan for the code?) that will fill our partitioned table with 8 million rows that distributes data fairly evenly across the various partitions. Once filled, we'll then insert the same data into our non-partitioned MyISAM clone table:

mysql> delimiter //
mysql> CREATE PROCEDURE load_part_tab()
-> begin
-> declare v int default 0;
-> while v < 8000000
-> do
-> insert into part_tab
-> values (v,'testing partitions',adddate('1995-01-01',(rand(v)*36520) mod 3652));
-> set v = v + 1;
-> end while;
-> end
-> //
Query OK, 0 rows affected (0.00 sec)
mysql> delimiter ;
mysql> call load_part_tab();
Query OK, 1 row affected (8 min 17.75 sec)
mysql> insert into no_part_tab select * from part_tab;
Query OK, 8000000 rows affected (51.59 sec)
Records: 8000000 Duplicates: 0 Warnings: 0

With our tables now ready, let's issue a simple date range query on both tables - the non-partitioned table first and then the partitioned table - followed by EXPLAIN's, and see what MySQL does:

mysql> select count(*) from no_part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (38.30 sec)

mysql> select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (3.88 sec)

mysql> explain select count(*) from no_part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: no_part_tab
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 8000000
Extra: Using where
1 row in set (0.00 sec)

mysql> explain partitions select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: part_tab
partitions: p1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 798458
Extra: Using where
1 row in set (0.00 sec)

The power of proper partition and query design can easily be seen as the partitioned table access delivers a whopping 90% response time reduction over the non-partitioned table. The EXPLAIN plans showcase why this is (notice the new EXPLAIN syntax for partitioned objects) as only the first partition in the partitioned table is accessed with all others being skipped.

As a MySQL DBA, it's easy to get excited about the potential benefits that partitioning can provide, but you always want to make sure that the tool you use for database design matches the requirements and scenario of your particular application. Partitioning is best suited for VLDB's that contain a lot of query activity that targets specific portions/ranges of one or more database tables. Of course, other situations lend themselves to partitioning as well (e.g. data archiving, etc.)
A Quick Side Note on Vertical Partitioning

Although MySQL 5.1 automates horizontal partitioning, don't lose sight of vertical partitioning schemes when designing your databases. Although you have to do vertical partitioning manually, you can benefit from the practice in certain circumstances. For example, let's say you didn't normally need to reference or use the VARCHAR column defined in our previously shown partitioned table. Would the elimination of this column help query speed? Let's find out:

mysql> desc part_tab;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1 | int(11) | YES | | NULL | |
| c2 | varchar(30) | YES | | NULL | |
| c3 | date | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
3 rows in set (0.03 sec)

mysql> alter table part_tab drop column c2;
Query OK, 8000000 rows affected (42.20 sec)
Records: 8000000 Duplicates: 0 Warnings: 0

mysql> desc part_tab;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| c1 | int(11) | YES | | NULL | |
| c3 | date | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> select count(*) from part_tab where
-> c3 > date '1995-01-01' and c3 < date '1995-12-31';
+----------+
| count(*) |
+----------+
| 795181 |
+----------+
1 row in set (0.34 sec)

By removing the VARCHAR column from the design, you actually get another 90+% reduction in query response time. Beyond partitioning, this speaks to the effect wide tables can have on queries and why you should always ensure that all columns defined to a table are actually needed.
Wrap Up

A short article like this can't possibly cover all the benefits and mechanics of MySQL 5.1's partitioning, but a few notes of interest include:

* All storage engines support partitioning (MyISAM, Archive, InnoDB, etc.)
* Indexing support for partitioned tables include local indexes, which mirror each partition in a one-to-one scheme. In other words, if a partitioned table has ten partitions, then a local index for that table would also contain ten partitions.
* Metadata regarding partitioned tables can be found in the INFORMATION_SCHEMA database, with a new PARTITIONS table being available.
* All SHOW commands support the return of partitioned table and index metadata.
* Maintenance functions and a number of other operations can be performed on partitions (rather than acting on a full table), including:
o ADD PARTITION
o DROP PARTITION
o COALESCE PARTITION
o REORGANIZE PARTITION
o ANALYZE PARTITION
o CHECK PARTITION
o OPTIMIZE PARTITION
o REBUILD PARTITION
o REPAIR PARTITION

From a performance standpoint, the main take-away is that MySQL 5.1 partitioning is a powerful new tool that can be used in many physical database designs to dramatically improve performance and ease DBA management burdens. For more information on MySQL partitioning, you can visit out the online reference manual at http://dev.mysql.com/doc/refman/5.1/en/partitioning.html and visit the MySQL forums as there is a forum section devoted to partitioning, which can be referenced at http://forums.mysql.com/list.php?106.

Download a copy of MySQL 5.1 (which is now is beta) today and give partitioning a try. I think you will be pleased with all the new possibilities partitioning provides when it comes to creating a top-notch physical database design, which is the number one contributor to overall database performance.

posted @ 2006-04-11 11:47 無風(fēng)之雨閱讀(782) | 評論 (0) | 編輯收藏