
April 1, 2009

An Introduction to Hibernate Shards

Hibernate Shards
    A horizontal partitioning solution for multiple databases.

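1. Introduction
     An extension to Hibernate for architectures that partition data horizontally across multiple databases.
     Contributed to the Hibernate community by Google engineers in 2007.
     http://www.hibernate.org/414.html
     Current version: 3.0.0 beta2; no GA release yet.
     Requirements: Hibernate Core 3.2, JDK 5.0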

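2. How horizontal partitioning works
     A table such as Order exists on several database instances. Its rows are distributed across those instances according to a specific partitioning rule, and the primary key (PK) of a record must not be repeated on any other instance.

    Horizontal partitioning is widely used by large websites and large enterprise applications, for example www.sina.com.cn, www.163.com, www.bt285.cn and www.guihua.org.
    The goal is to spread the storage, updating and querying of very large data sets over several databases, which raises both the volume of data that can be handled and overall processing performance.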
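    The routing idea above can be sketched as follows. This is only an illustration under an assumed rule (primary key modulo the number of instances); Hibernate Shards does not prescribe this rule, and the class ModuloShardRouter and its methods are hypothetical names, not part of the library.

```java
// Hypothetical illustration of a partitioning rule; not part of Hibernate Shards.
class ModuloShardRouter {
    private final int shardCount;

    ModuloShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Maps a primary key to a shard index in the range [0, shardCount).
    // floorMod keeps the result non-negative even for negative keys.
    int shardFor(long primaryKey) {
        return (int) Math.floorMod(primaryKey, (long) shardCount);
    }

    public static void main(String[] args) {
        ModuloShardRouter router = new ModuloShardRouter(3);
        for (long orderId = 100; orderId < 106; orderId++) {
            System.out.println("Order " + orderId + " -> shard " + router.shardFor(orderId));
        }
    }
}
```

    Because every key maps to exactly one shard, a given primary key can only ever be stored on one instance, which is the uniqueness requirement stated above.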

    Usage:
      The Google engineers' design is very good: it is fully compatible with Hibernate's own main interfaces.

Java code

org.hibernate.Session
org.hibernate.SessionFactory
org.hibernate.Criteria
org.hibernate.Query


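     As a result, little changes for developers, who may not even need to know that a partitioned database sits behind the application. Migrating existing code is not much of a problem, and the configuration is fairly simple.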

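3. Three strategies:
   1) ShardAccessStrategy determines how a query is executed across the shards.
      Two implementations are provided by default:
      Sequential strategy: SequentialShardAccessStrategy runs each query on all shards, one after another.
      Parallel strategy: ParallelShardAccessStrategy runs each query on all shards concurrently, using multiple threads. This strategy needs a thread pool, java.util.concurrent.ThreadPoolExecutor, sized to meet the application's performance requirements.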
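      The difference between the two access patterns can be sketched with plain java.util.concurrent. This is not the Hibernate Shards implementation: ShardFanOut, its methods, and the in-memory lists standing in for databases are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration of sequential vs. parallel shard access.
class ShardFanOut {

    // Sequential: run the query on each shard in turn and concatenate the results.
    static <S, R> List<R> sequential(List<S> shards, Function<S, List<R>> query) {
        List<R> merged = new ArrayList<>();
        for (S shard : shards) {
            merged.addAll(query.apply(shard));
        }
        return merged;
    }

    // Parallel: submit one task per shard to a thread pool, then collect in shard order.
    static <S, R> List<R> parallel(List<S> shards, Function<S, List<R>> query,
                                   ExecutorService pool) {
        List<Future<List<R>>> futures = new ArrayList<>();
        for (S shard : shards) {
            Callable<List<R>> task = () -> query.apply(shard);
            futures.add(pool.submit(task));
        }
        List<R> merged = new ArrayList<>();
        try {
            for (Future<List<R>> future : futures) {
                merged.addAll(future.get()); // blocks until that shard has answered
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a shard", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("query failed on a shard", e.getCause());
        }
        return merged;
    }

    public static void main(String[] args) {
        // Each "shard" is an in-memory list standing in for a database.
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c"), Arrays.asList("d", "e"));
        Function<List<String>, List<String>> selectAll = rows -> rows;

        System.out.println(sequential(shards, selectAll));
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            System.out.println(parallel(shards, selectAll, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

      Both methods return the same rows; the parallel version trades extra threads for lower latency when each shard is slow to answer.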

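   2) ShardSelectionStrategy decides which shard a newly created object is stored on.
         The framework provides a round-robin strategy, RoundRobinShardSelectionStrategy, by default, but it is rarely used that way in practice.
        The usual approach is "attribute-based sharding". Typically the user implements a ShardSelectionStrategy for their own tables that picks the shard from an attribute of the object. For example, the following strategy selects the shard for a WeatherReport from its continent attribute: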

Java code

import org.hibernate.shards.ShardId;
import org.hibernate.shards.strategy.selection.ShardSelectionStrategy;

public class WeatherReportShardSelectionStrategy implements ShardSelectionStrategy {
    // Picks the shard for a new object from the continent of the report.
    public ShardId selectShardIdForNewObject(Object obj) {
        if (obj instanceof WeatherReport) {
            return ((WeatherReport) obj).getContinent().getShardId();
        }
        // This strategy only knows how to place WeatherReport objects.
        throw new IllegalArgumentException();
    }
}

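   3) ShardResolutionStrategy determines which shard or shards to look on when a single object is looked up.
      AllShardsShardResolutionStrategy is used by default; a custom strategy can be written, for example: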

Java code

import java.util.Collections;
import java.util.List;
import org.hibernate.shards.ShardId;
import org.hibernate.shards.strategy.resolution.AllShardsShardResolutionStrategy;
import org.hibernate.shards.strategy.selection.ShardResolutionStrategyData;

public class WeatherReportShardResolutionStrategy extends AllShardsShardResolutionStrategy {
    public WeatherReportShardResolutionStrategy(List<ShardId> shardIds) {
        super(shardIds);
    }

    public List<ShardId> selectShardIdsFromShardResolutionStrategyData(
            ShardResolutionStrategyData srsd) {
        if (srsd.getEntityName().equals(WeatherReport.class.getName())) {
            // The report ID identifies the continent, and therefore a single shard.
            ShardId shardId = Continent.getContinentByReportId(srsd.getId()).getShardId();
            return Collections.singletonList(shardId);
        }
        // Any other entity: fall back to searching all shards.
        return super.selectShardIdsFromShardResolutionStrategyData(srsd);
    }
}

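4. Queries under horizontal partitioning

   Hibernate Shards handles simple queries well.

   Querying across several databases is the hard part of horizontal partitioning. The difficulty lies mainly in three operations:
   1) distinct
         All shards have to be visited and the results merged so that duplicate records can be detected.
   2) order by
         Similar to 1).
   3) aggregation
         Aggregates such as count, sum and avg are first executed on each shard, and the partial results are then combined.
         Rather like MapReduce, isn't it?

   At present Hibernate Shards does not support 1) or 2), and supports 3) only partially.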
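   Since distinct and order by are not handled by the framework, an application would have to merge the per-shard results itself. A minimal sketch of that merge step, assuming each shard's rows have already been fetched into memory and are Comparable; ShardResultMerger is a hypothetical helper, not a Hibernate Shards class:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical application-side merge for "distinct" plus "order by" across shards.
class ShardResultMerger {

    // Merges the result lists of all shards into one sorted list without duplicates.
    static <T extends Comparable<T>> List<T> distinctSorted(List<List<T>> perShardResults) {
        TreeSet<T> merged = new TreeSet<>(); // sorted set: orders and de-duplicates
        for (List<T> shardRows : perShardResults) {
            merged.addAll(shardRows);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<List<Integer>> perShard = Arrays.asList(
                Arrays.asList(3, 1), Arrays.asList(2, 3));
        System.out.println(distinctSorted(perShard));
    }
}
```

   This only works when the combined result fits in memory, which is one reason the operations are hard to support generically.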

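    Hibernate Shards currently offers fairly good support for aggregation through its implementation of the Criteria interface, because Criteria specifies Projection operations through an API, which keeps the logic relatively simple.

    HQL and native SQL do not yet support these operations.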
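    The "execute on each shard, then combine" step for aggregates can be sketched as below. It is an illustration only, with hypothetical names (ShardAggregation, Partial), and does not show how Hibernate Shards implements projections. The point worth noting is that avg has to be rebuilt from the total sum and total count.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of combining per-shard aggregates.
class ShardAggregation {

    // Partial result computed independently on one shard.
    static final class Partial {
        final long count;
        final double sum;

        Partial(long count, double sum) {
            this.count = count;
            this.sum = sum;
        }
    }

    // count over all shards = sum of the per-shard counts.
    static long totalCount(List<Partial> partials) {
        long total = 0;
        for (Partial p : partials) {
            total += p.count;
        }
        return total;
    }

    // sum over all shards = sum of the per-shard sums.
    static double totalSum(List<Partial> partials) {
        double total = 0.0;
        for (Partial p : partials) {
            total += p.sum;
        }
        return total;
    }

    // avg = total sum / total count. Averaging the per-shard averages would be
    // wrong whenever shards hold different numbers of rows. Returns NaN if empty.
    static double average(List<Partial> partials) {
        long count = totalCount(partials);
        return count == 0 ? Double.NaN : totalSum(partials) / count;
    }

    public static void main(String[] args) {
        // Shard 0 holds 2 rows summing to 10.0; shard 1 holds 3 rows summing to 50.0.
        List<Partial> partials = Arrays.asList(new Partial(2, 10.0), new Partial(3, 50.0));
        System.out.println("count = " + totalCount(partials));
        System.out.println("sum   = " + totalSum(partials));
        System.out.println("avg   = " + average(partials));
    }
}
```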


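5. Resharding and virtual shards
      When the database grows and the partitioning logic and data placement need to change, the data has to be resharded.
      There are two ways to do this: 1) migrate data to other shards; 2) change the mapping between records and shards. Both are troublesome, especially changing the record-to-shard mapping, which requires modifying the ShardResolutionStrategy.

     Hibernate Shards provides a layer of virtual shards. When the partitioning needs to change, only the mapping between virtual shards and physical shards has to be adjusted. The following shows how the configuration is built when virtual shards are used: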

Java code

// Requires java.util.HashMap and java.util.Map.
// Key: virtual shard ID; value: physical shard ID.
Map<Integer, Integer> virtualShardMap = new HashMap<Integer, Integer>();
virtualShardMap.put(0, 0);  // virtual shards 0 and 1 live on physical shard 0
virtualShardMap.put(1, 0);
virtualShardMap.put(2, 1);  // virtual shards 2 and 3 live on physical shard 1
virtualShardMap.put(3, 1);
ShardedConfiguration shardedConfig =
    new ShardedConfiguration(
        prototypeConfiguration,
        configurations,
        strategyFactory,
        virtualShardMap);
return shardedConfig.buildShardedSessionFactory();
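Why the extra layer helps can be shown with a small stand-alone sketch. It assumes, purely for illustration, that virtual shard IDs run from 0 to n-1 and that a record is assigned to a virtual shard by key modulo n; VirtualShardLookup is a hypothetical class, not part of Hibernate Shards.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a virtual-to-physical shard mapping.
class VirtualShardLookup {
    private final Map<Integer, Integer> virtualToPhysical;

    VirtualShardLookup(Map<Integer, Integer> virtualToPhysical) {
        this.virtualToPhysical = new HashMap<>(virtualToPhysical);
    }

    // Assumed rule: the virtual shard is the key modulo the number of virtual shards.
    // This never changes when data is moved between physical databases.
    int virtualShardFor(long recordKey) {
        return (int) Math.floorMod(recordKey, (long) virtualToPhysical.size());
    }

    // The physical shard is looked up through the mapping table.
    int physicalShardFor(long recordKey) {
        Integer physical = virtualToPhysical.get(virtualShardFor(recordKey));
        if (physical == null) {
            throw new IllegalStateException("no physical shard mapped for key " + recordKey);
        }
        return physical;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> mapping = new HashMap<>();
        mapping.put(0, 0);
        mapping.put(1, 0);
        mapping.put(2, 1);
        mapping.put(3, 1);
        VirtualShardLookup before = new VirtualShardLookup(mapping);

        // Resharding: move virtual shard 1 to a new physical shard 2.
        mapping.put(1, 2);
        VirtualShardLookup after = new VirtualShardLookup(mapping);

        long key = 5; // 5 mod 4 = virtual shard 1 in both cases
        System.out.println("before: physical shard " + before.physicalShardFor(key));
        System.out.println("after:  physical shard " + after.physicalShardFor(key));
    }
}
```

Only the mapping table changes during resharding; the rule that assigns a record to a virtual shard stays the same, so the resolution logic does not have to be rewritten.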

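6. Limitations:
    1) Hibernate Shards does not support vertical partitioning, or mixed vertical and horizontal partitioning.

    2) Query functionality is restricted under horizontal partitioning, and some features are not supported. In practice the application layer has to put more thought into the horizontal partitioning algorithm.
    3) Relationships across shards are not supported. For example, when deleting from table s on shard A, referential integrity cannot be checked against the rows of the related child table t on shard B. (This is actually a much smaller challenge than cross-shard queries, so perhaps the Google engineers will support it in the next version.)

    4) The resolution strategy interface seems somewhat at odds with the global uniqueness of object IDs:
AllShardsShardResolutionStrategy returns a set of shard IDs on which the object with the given ID may reside, whereas by rights it should be exactly one shard ID.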

References: the Hibernate Shards reference guide.

posted @ 2009-04-01 18:49 王
