Change Dir

          先知cd——熱愛生活是一切藝術的開始

          統計

          留言簿(18)

          積分與排名

          “牛”們的博客

          各個公司技術

          我的鏈接

          淘寶技術

          閱讀排行榜

          評論排行榜

          如何高效的實現一個計數器map

          這本是多年前一個stackoverflow上的一個討論,回答中涉及到了多種計數方法。對于一個key-value結構的map,我們在編程時會經常涉及到key是對象,而value是一個integer或long來負責計數,從而統計多個key的頻率。

          面對這樣一個基本需求,可能有很多種實現。比如最基本的使用jdk的map直接實現——value是一個integer或者long。其基本代碼型如下:

             1: final Map<String, Integer> freq = new HashMap<String, Integer>();
             2: int count = freq.containsKey(word) ? freq.get(word) : 0;
             3: freq.put(word, count + 1);

          邏輯簡單,判斷是否存在,是則get取值,否則為0,再put進去一個加1后的值。總共要contain判斷,get,put做三次方法調用。

          當然進一步我們可以把contain判斷去掉,代碼如下:

             1: final Map<String, Integer> freq = new HashMap<String, Integer>();
             2: Integer count = freq.get(word);
             3: if (count == null) {
             4:     freq.put(word, 1);
             5: } else {
             6:     freq.put(word, count + 1);
             7: }

          一般情況,我們做到這個地步,多數人對其邏輯已經滿足,簡單性能也能接受,試著想一下,難道不是這樣嗎?get加put,解決了。

          當然這樣的實現還不夠高效,于是我們開始去嘗試實現或尋找更高效的方法,看看開源的集合類庫是否有需要的:

          有個Trove,可以讓我們參考一下:

             1: final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
             2: freq.adjustOrPutValue(word, 1, 1);

          這樣做,非常優雅啊,性能如何呢?不知道,需要看源碼了解細節。那再看看大名鼎鼎的guava如何呢?

             1: AtomicLongMap<String> map = AtomicLongMap.create();
             2: map.getAndIncrement(word);

          實現依然優雅,但是,但是看這名字,再看源碼,好吧,線程安全的,支持并發,這不好搞了,我們場景需要嗎?不需要的話直覺告訴我們這肯定是“慢”的。再找:

             1: Multiset<String> bag = HashMultiset.create();
             2: bag.add(word);

          這個看上去合適了,bag的實現明顯好很多,而且從語義理解上,這樣的接口更容易讓人理解。

          那么這些方法,性能如何呢?做了個簡單的比較,將26個英文字母做key,均勻循環若干次比較各個方法的效率(單純時間效率),而且時間不統計構建開銷。外加了一個線程安全版的concurrentMap實現,而其實google的guava里的AtomicLongMap也是包裝了juc的concurrentMap而已。里面有最終極的MutableInt方法,找找看吧,性能最好的就是它了。

             1: /**
             2:  * 
             3:  */
             4:
             5:  
             6: import gnu.trove.map.hash.TObjectIntHashMap;
             7:  
             8: import java.util.HashMap;
             9: import java.util.Map;
            10: import java.util.concurrent.ConcurrentHashMap;
            11: import java.util.concurrent.ConcurrentMap;
            12: import java.util.concurrent.atomic.AtomicLong;
            13:  
            14: import com.google.common.collect.HashMultiset;
            15: import com.google.common.collect.Multiset;
            16: import com.google.common.util.concurrent.AtomicLongMap;
            17:  
            18: /**
            19:  * @author Administrator
            20:  * 
            21:  */
            22: public class IntMapTest {
            23:  
            24:     /**
            25:      * @param args
            26:      */
            27:     public static void main(String[] args) {
            28:         // TODO Auto-generated method stub
            29:         int cycles[] = { 100, 1000, 10000, 100000 };
            30:         Tester baseLine = new BaseLine();
            31:         Tester testForNull = new UseNullTest();
            32:         Tester useAtomicLong = new UseAtomicLong();
            33:         Tester useTrove = new UseTrove();
            34:         Tester useMutableInt = new UseMutableInt();
            35:         Tester useGuava = new UseGuava();
            36:         Tester useGuava2 = new UseGuava2();
            37:  
            38:         for (int i = 0; i < cycles.length; i++) {
            39:             System.out.println("-----With " + cycles[i] + " cycles-----");
            40:             baseLine.test(cycles[i]);
            41:             testForNull.test(cycles[i]);
            42:             useAtomicLong.test(cycles[i]);
            43:             useTrove.test(cycles[i]);
            44:             useMutableInt.test(cycles[i]);
            45:             useGuava.test(cycles[i]);
            46:             useGuava2.test(cycles[i]);
            47:             System.out.println("------------------------");
            48:         }
            49:  
            50:     }
            51:  
            52: }
            53:  
            54: abstract class Tester {
            55:     long ms;
            56:     static String[] strs = "abcdefghijklmnopqrstuvwxyz".split("");
            57:  
            58:     void pre() {
            59:         System.out.println("====" + this.getName() + "Test Case ");
            60:         ms = System.currentTimeMillis();
            61:         System.out.println("start at " + ms);
            62:     }
            63:  
            64:     void post() {
            65:         ms = System.currentTimeMillis() - ms;
            66:         System.out.println("Time used: " + ms + " ms");
            67:     }
            68:  
            69:     abstract void doAction(int cycles);
            70:  
            71:     public void test(int cycles) {
            72:         pre();
            73:         doAction(cycles);
            74:         post();
            75:     }
            76:  
            77:     abstract String getName();
            78: }
            79:  
            80: class BaseLine extends Tester {
            81:     final Map<String, Integer> freq = new HashMap<String, Integer>();
            82:  
            83:     @Override
            84:     void doAction(int cycles) {
            85:         for (int i = 0; i < cycles; i++) {
            86:             for (String word : strs) {
            87:                 int count = freq.containsKey(word) ? freq.get(word) : 0;
            88:                 freq.put(word, count + 1);
            89:             }
            90:         }
            91:     }
            92:  
            93:     @Override
            94:     String getName() {
            95:         return "BaseLine";
            96:     }
            97:  
            98: }
            99:  
           100: class UseNullTest extends Tester {
           101:     final Map<String, Integer> freq = new HashMap<String, Integer>();
           102:  
           103:     @Override
           104:     void doAction(int cycles) {
           105:         for (int i = 0; i < cycles; i++) {
           106:             for (String word : strs) {
           107:                 Integer count = freq.get(word);
           108:                 if (count == null) {
           109:                     freq.put(word, 1);
           110:                 } else {
           111:                     freq.put(word, count + 1);
           112:                 }
           113:             }
           114:         }
           115:     }
           116:  
           117:     @Override
           118:     String getName() {
           119:         return "TestForNull";
           120:     }
           121:  
           122: }
           123:  
           124: class UseAtomicLong extends Tester {
           125:     final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
           126:  
           127:     @Override
           128:     void doAction(int cycles) {
           129:         for (int i = 0; i < cycles; i++) {
           130:             for (String word : strs) {
           131:                 map.putIfAbsent(word, new AtomicLong(0));
           132:                 map.get(word).incrementAndGet();
           133:             }
           134:         }
           135:     }
           136:  
           137:     @Override
           138:     String getName() {
           139:         return "AtomicLong";
           140:     }
           141:  
           142: }
           143:  
           144: class UseTrove extends Tester {
           145:     final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
           146:  
           147:     @Override
           148:     void doAction(int cycles) {
           149:         for (int i = 0; i < cycles; i++) {
           150:             for (String word : strs) {
           151:                 freq.adjustOrPutValue(word, 1, 1);
           152:             }
           153:         }
           154:     }
           155:  
           156:     @Override
           157:     String getName() {
           158:         return "Trove";
           159:     }
           160:  
           161: }
           162:  
           163: class MutableInt {
           164:     int value = 1; // note that we start at 1 since we're counting
           165:  
           166:     public void increment() {
           167:         ++value;
           168:     }
           169:  
           170:     public int get() {
           171:         return value;
           172:     }
           173: }
           174:  
           175: class UseMutableInt extends Tester {
           176:     Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
           177:  
           178:     @Override
           179:     void doAction(int cycles) {
           180:         for (int i = 0; i < cycles; i++) {
           181:             for (String word : strs) {
           182:                 MutableInt count = freq.get(word);
           183:                 if (count == null) {
           184:                     freq.put(word, new MutableInt());
           185:                 } else {
           186:                     count.increment();
           187:                 }
           188:             }
           189:         }
           190:     }
           191:  
           192:     @Override
           193:     String getName() {
           194:         return "MutableInt";
           195:     }
           196:  
           197: }
           198:  
           199: class UseGuava extends Tester {
           200:     AtomicLongMap<String> map = AtomicLongMap.create();
           201:  
           202:     @Override
           203:     void doAction(int cycles) {
           204:         for (int i = 0; i < cycles; i++) {
           205:             for (String word : strs) {
           206:                 map.getAndIncrement(word);
           207:             }
           208:         }
           209:     }
           210:  
           211:     @Override
           212:     String getName() {
           213:         return "Guava AtomicLongMap";
           214:     }
           215:  
           216: }
           217:  
           218: class UseGuava2 extends Tester {
           219:     Multiset<String> bag = HashMultiset.create();
           220:  
           221:     @Override
           222:     void doAction(int cycles) {
           223:         for (int i = 0; i < cycles; i++) {
           224:             for (String word : strs) {
           225:                 bag.add(word);
           226:             }
           227:         }
           228:     }
           229:  
           230:     @Override
           231:     String getName() {
           232:         return "Guava HashMultiSet";
           233:     }
           234:  
           235: }

          輸出結果如下:

             1: -----With 100 cycles-----
             2: ====BaseLineTest Case 
             3: start at 1358655702729
             4: Time used: 7 ms
             5: ====TestForNullTest Case 
             6: start at 1358655702736
             7: Time used: 3 ms
             8: ====AtomicLongTest Case 
             9: start at 1358655702739
            10: Time used: 14 ms
            11: ====TroveTest Case 
            12: start at 1358655702753
            13: Time used: 2 ms
            14: ====MutableIntTest Case 
            15: start at 1358655702755
            16: Time used: 2 ms
            17: ====Guava AtomicLongMapTest Case 
            18: start at 1358655702757
            19: Time used: 4 ms
            20: ====Guava HashMultiSetTest Case 
            21: start at 1358655702761
            22: Time used: 7 ms
            23: ------------------------
            24: -----With 1000 cycles-----
            25: ====BaseLineTest Case 
            26: start at 1358655702768
            27: Time used: 17 ms
            28: ====TestForNullTest Case 
            29: start at 1358655702785
            30: Time used: 7 ms
            31: ====AtomicLongTest Case 
            32: start at 1358655702792
            33: Time used: 44 ms
            34: ====TroveTest Case 
            35: start at 1358655702836
            36: Time used: 17 ms
            37: ====MutableIntTest Case 
            38: start at 1358655702853
            39: Time used: 5 ms
            40: ====Guava AtomicLongMapTest Case 
            41: start at 1358655702858
            42: Time used: 9 ms
            43: ====Guava HashMultiSetTest Case 
            44: start at 1358655702868
            45: Time used: 50 ms
            46: ------------------------
            47: -----With 10000 cycles-----
            48: ====BaseLineTest Case 
            49: start at 1358655702918
            50: Time used: 16 ms
            51: ====TestForNullTest Case 
            52: start at 1358655702934
            53: Time used: 14 ms
            54: ====AtomicLongTest Case 
            55: start at 1358655702948
            56: Time used: 29 ms
            57: ====TroveTest Case 
            58: start at 1358655702977
            59: Time used: 10 ms
            60: ====MutableIntTest Case 
            61: start at 1358655702988
            62: Time used: 5 ms
            63: ====Guava AtomicLongMapTest Case 
            64: start at 1358655702993
            65: Time used: 15 ms
            66: ====Guava HashMultiSetTest Case 
            67: start at 1358655703009
            68: Time used: 77 ms
            69: ------------------------
            70: -----With 100000 cycles-----
            71: ====BaseLineTest Case 
            72: start at 1358655703086
            73: Time used: 124 ms
            74: ====TestForNullTest Case 
            75: start at 1358655703210
            76: Time used: 118 ms
            77: ====AtomicLongTest Case 
            78: start at 1358655703329
            79: Time used: 240 ms
            80: ====TroveTest Case 
            81: start at 1358655703569
            82: Time used: 102 ms
            83: ====MutableIntTest Case 
            84: start at 1358655703671
            85: Time used: 45 ms
            86: ====Guava AtomicLongMapTest Case 
            87: start at 1358655703716
            88: Time used: 126 ms
            89: ====Guava HashMultiSetTest Case 
            90: start at 1358655703842
            91: Time used: 98 ms
            92: ------------------------

          一般結論:單線程使用MutableInt,多線程使用guava的AtomicLongMap,其實可以看看guava對addAndGet的實現,循環,很有趣。

          最后總結一下,我們在對這個問題做優化的時候,明顯的思路就是減少方法調用,而MutableInt效率最高,明顯的是它將方法調用減少到最小——1次get,指針的威力頓時顯現。當然實際業務代碼實現的時候還要考慮到多個因素,比如代碼可讀性,與業務結合等等,我們現實中不一定要追求如此的效率,但是也要避免毫無思考的寫下baseline里的代碼,因為明顯是可優化的,why not?

          注:文中單個實現代碼來自stackoverflow的各個回答,測試代碼本人編寫。

          ref:

          本文提到的so的原帖:http://stackoverflow.com/questions/81346/most-efficient-way-to-increment-a-map-value-in-java

          trove:http://trove.starlight-systems.com/

          guava:http://code.google.com/p/guava-libraries/

          posted on 2013-01-20 12:40 changedi 閱讀(4662) 評論(0)  編輯  收藏 所屬分類: Java技術

          主站蜘蛛池模板: 巧家县| 嘉定区| 栖霞市| 安龙县| 隆回县| 兴安县| 商南县| 紫阳县| 尼玛县| 榕江县| 嘉祥县| 图片| 泽库县| 成都市| 东莞市| 汤原县| 称多县| 连江县| 凤山市| 中山市| 永寿县| 饶河县| 嘉黎县| 肃北| 滁州市| 赫章县| 瓦房店市| 岳阳市| 美姑县| 镇坪县| 通州区| 高平市| 克山县| 大名县| 滦平县| 措勤县| 沙洋县| 呈贡县| 福清市| 宣武区| 竹北市|