Change Dir

          先知cd——熱愛(ài)生活是一切藝術(shù)的開(kāi)始

          統(tǒng)計(jì)

          留言簿(18)

          積分與排名

          “牛”們的博客

          各個(gè)公司技術(shù)

          我的鏈接

          淘寶技術(shù)

          閱讀排行榜

          評(píng)論排行榜

          如何高效的實(shí)現(xiàn)一個(gè)計(jì)數(shù)器map

          這本是多年前一個(gè)stackoverflow上的一個(gè)討論,回答中涉及到了多種計(jì)數(shù)方法。對(duì)于一個(gè)key-value結(jié)構(gòu)的map,我們?cè)诰幊虝r(shí)會(huì)經(jīng)常涉及到key是對(duì)象,而value是一個(gè)integer或long來(lái)負(fù)責(zé)計(jì)數(shù),從而統(tǒng)計(jì)多個(gè)key的頻率。

          面對(duì)這樣一個(gè)基本需求,可能有很多種實(shí)現(xiàn)。比如最基本的使用jdk的map直接實(shí)現(xiàn)——value是一個(gè)integer或者long。其基本代碼型如下:

             1: final Map<String, Integer> freq = new HashMap<String, Integer>();
             2: int count = freq.containsKey(word) ? freq.get(word) : 0;
             3: freq.put(word, count + 1);

          邏輯簡(jiǎn)單,判斷是否存在,是則get取值,否則為0,再put進(jìn)去一個(gè)加1后的值。總共要contain判斷,get,put做三次方法調(diào)用。

          當(dāng)然進(jìn)一步我們可以把contain判斷去掉,代碼如下:

             1: final Map<String, Integer> freq = new HashMap<String, Integer>();
             2: Integer count = freq.get(word);
             3: if (count == null) {
             4:     freq.put(word, 1);
             5: } else {
             6:     freq.put(word, count + 1);
             7: }

          一般情況,我們做到這個(gè)地步,多數(shù)人對(duì)其邏輯已經(jīng)滿足,簡(jiǎn)單性能也能接受,試著想一下,難道不是這樣嗎?get加put,解決了。

          當(dāng)然這樣的實(shí)現(xiàn)還不夠高效,于是我們開(kāi)始去嘗試實(shí)現(xiàn)或?qū)ふ腋咝У姆椒ǎ纯撮_(kāi)源的集合類庫(kù)是否有需要的:

          有個(gè)Trove,可以讓我們參考一下:

             1: final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
             2: freq.adjustOrPutValue(word, 1, 1);

          這樣做,非常優(yōu)雅啊,性能如何呢?不知道,需要看源碼了解細(xì)節(jié)。那再看看大名鼎鼎的guava如何呢?

             1: AtomicLongMap<String> map = AtomicLongMap.create();
             2: map.getAndIncrement(word);

          實(shí)現(xiàn)依然優(yōu)雅,但是,但是看這名字,再看源碼,好吧,線程安全的,支持并發(fā),這不好搞了,我們場(chǎng)景需要嗎?不需要的話直覺(jué)告訴我們這肯定是“慢”的。再找:

             1: Multiset<String> bag = HashMultiset.create();
             2: bag.add(word);

          這個(gè)看上去合適了,bag的實(shí)現(xiàn)明顯好很多,而且從語(yǔ)義理解上,這樣的接口更容易讓人理解。

          那么這些方法,性能如何呢?做了個(gè)簡(jiǎn)單的比較,將26個(gè)英文字母做key,均勻循環(huán)若干次比較各個(gè)方法的效率(單純時(shí)間效率),而且時(shí)間不統(tǒng)計(jì)構(gòu)建開(kāi)銷。外加了一個(gè)線程安全版的concurrentMap實(shí)現(xiàn),而其實(shí)google的guava里的AtomicLongMap也是包裝了juc的concurrentMap而已。里面有最終極的MutableInt方法,找找看吧,性能最好的就是它了。

             1: /**
             2:  * 
             3:  */
             4:
             5:  
             6: import gnu.trove.map.hash.TObjectIntHashMap;
             7:  
             8: import java.util.HashMap;
             9: import java.util.Map;
            10: import java.util.concurrent.ConcurrentHashMap;
            11: import java.util.concurrent.ConcurrentMap;
            12: import java.util.concurrent.atomic.AtomicLong;
            13:  
            14: import com.google.common.collect.HashMultiset;
            15: import com.google.common.collect.Multiset;
            16: import com.google.common.util.concurrent.AtomicLongMap;
            17:  
            18: /**
            19:  * @author Administrator
            20:  * 
            21:  */
            22: public class IntMapTest {
            23:  
            24:     /**
            25:      * @param args
            26:      */
            27:     public static void main(String[] args) {
            28:         // TODO Auto-generated method stub
            29:         int cycles[] = { 100, 1000, 10000, 100000 };
            30:         Tester baseLine = new BaseLine();
            31:         Tester testForNull = new UseNullTest();
            32:         Tester useAtomicLong = new UseAtomicLong();
            33:         Tester useTrove = new UseTrove();
            34:         Tester useMutableInt = new UseMutableInt();
            35:         Tester useGuava = new UseGuava();
            36:         Tester useGuava2 = new UseGuava2();
            37:  
            38:         for (int i = 0; i < cycles.length; i++) {
            39:             System.out.println("-----With " + cycles[i] + " cycles-----");
            40:             baseLine.test(cycles[i]);
            41:             testForNull.test(cycles[i]);
            42:             useAtomicLong.test(cycles[i]);
            43:             useTrove.test(cycles[i]);
            44:             useMutableInt.test(cycles[i]);
            45:             useGuava.test(cycles[i]);
            46:             useGuava2.test(cycles[i]);
            47:             System.out.println("------------------------");
            48:         }
            49:  
            50:     }
            51:  
            52: }
            53:  
            54: abstract class Tester {
            55:     long ms;
            56:     static String[] strs = "abcdefghijklmnopqrstuvwxyz".split("");
            57:  
            58:     void pre() {
            59:         System.out.println("====" + this.getName() + "Test Case ");
            60:         ms = System.currentTimeMillis();
            61:         System.out.println("start at " + ms);
            62:     }
            63:  
            64:     void post() {
            65:         ms = System.currentTimeMillis() - ms;
            66:         System.out.println("Time used: " + ms + " ms");
            67:     }
            68:  
            69:     abstract void doAction(int cycles);
            70:  
            71:     public void test(int cycles) {
            72:         pre();
            73:         doAction(cycles);
            74:         post();
            75:     }
            76:  
            77:     abstract String getName();
            78: }
            79:  
            80: class BaseLine extends Tester {
            81:     final Map<String, Integer> freq = new HashMap<String, Integer>();
            82:  
            83:     @Override
            84:     void doAction(int cycles) {
            85:         for (int i = 0; i < cycles; i++) {
            86:             for (String word : strs) {
            87:                 int count = freq.containsKey(word) ? freq.get(word) : 0;
            88:                 freq.put(word, count + 1);
            89:             }
            90:         }
            91:     }
            92:  
            93:     @Override
            94:     String getName() {
            95:         return "BaseLine";
            96:     }
            97:  
            98: }
            99:  
           100: class UseNullTest extends Tester {
           101:     final Map<String, Integer> freq = new HashMap<String, Integer>();
           102:  
           103:     @Override
           104:     void doAction(int cycles) {
           105:         for (int i = 0; i < cycles; i++) {
           106:             for (String word : strs) {
           107:                 Integer count = freq.get(word);
           108:                 if (count == null) {
           109:                     freq.put(word, 1);
           110:                 } else {
           111:                     freq.put(word, count + 1);
           112:                 }
           113:             }
           114:         }
           115:     }
           116:  
           117:     @Override
           118:     String getName() {
           119:         return "TestForNull";
           120:     }
           121:  
           122: }
           123:  
           124: class UseAtomicLong extends Tester {
           125:     final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
           126:  
           127:     @Override
           128:     void doAction(int cycles) {
           129:         for (int i = 0; i < cycles; i++) {
           130:             for (String word : strs) {
           131:                 map.putIfAbsent(word, new AtomicLong(0));
           132:                 map.get(word).incrementAndGet();
           133:             }
           134:         }
           135:     }
           136:  
           137:     @Override
           138:     String getName() {
           139:         return "AtomicLong";
           140:     }
           141:  
           142: }
           143:  
           144: class UseTrove extends Tester {
           145:     final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
           146:  
           147:     @Override
           148:     void doAction(int cycles) {
           149:         for (int i = 0; i < cycles; i++) {
           150:             for (String word : strs) {
           151:                 freq.adjustOrPutValue(word, 1, 1);
           152:             }
           153:         }
           154:     }
           155:  
           156:     @Override
           157:     String getName() {
           158:         return "Trove";
           159:     }
           160:  
           161: }
           162:  
           163: class MutableInt {
           164:     int value = 1; // note that we start at 1 since we're counting
           165:  
           166:     public void increment() {
           167:         ++value;
           168:     }
           169:  
           170:     public int get() {
           171:         return value;
           172:     }
           173: }
           174:  
           175: class UseMutableInt extends Tester {
           176:     Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
           177:  
           178:     @Override
           179:     void doAction(int cycles) {
           180:         for (int i = 0; i < cycles; i++) {
           181:             for (String word : strs) {
           182:                 MutableInt count = freq.get(word);
           183:                 if (count == null) {
           184:                     freq.put(word, new MutableInt());
           185:                 } else {
           186:                     count.increment();
           187:                 }
           188:             }
           189:         }
           190:     }
           191:  
           192:     @Override
           193:     String getName() {
           194:         return "MutableInt";
           195:     }
           196:  
           197: }
           198:  
           199: class UseGuava extends Tester {
           200:     AtomicLongMap<String> map = AtomicLongMap.create();
           201:  
           202:     @Override
           203:     void doAction(int cycles) {
           204:         for (int i = 0; i < cycles; i++) {
           205:             for (String word : strs) {
           206:                 map.getAndIncrement(word);
           207:             }
           208:         }
           209:     }
           210:  
           211:     @Override
           212:     String getName() {
           213:         return "Guava AtomicLongMap";
           214:     }
           215:  
           216: }
           217:  
           218: class UseGuava2 extends Tester {
           219:     Multiset<String> bag = HashMultiset.create();
           220:  
           221:     @Override
           222:     void doAction(int cycles) {
           223:         for (int i = 0; i < cycles; i++) {
           224:             for (String word : strs) {
           225:                 bag.add(word);
           226:             }
           227:         }
           228:     }
           229:  
           230:     @Override
           231:     String getName() {
           232:         return "Guava HashMultiSet";
           233:     }
           234:  
           235: }

          輸出結(jié)果如下:

             1: -----With 100 cycles-----
             2: ====BaseLineTest Case 
             3: start at 1358655702729
             4: Time used: 7 ms
             5: ====TestForNullTest Case 
             6: start at 1358655702736
             7: Time used: 3 ms
             8: ====AtomicLongTest Case 
             9: start at 1358655702739
            10: Time used: 14 ms
            11: ====TroveTest Case 
            12: start at 1358655702753
            13: Time used: 2 ms
            14: ====MutableIntTest Case 
            15: start at 1358655702755
            16: Time used: 2 ms
            17: ====Guava AtomicLongMapTest Case 
            18: start at 1358655702757
            19: Time used: 4 ms
            20: ====Guava HashMultiSetTest Case 
            21: start at 1358655702761
            22: Time used: 7 ms
            23: ------------------------
            24: -----With 1000 cycles-----
            25: ====BaseLineTest Case 
            26: start at 1358655702768
            27: Time used: 17 ms
            28: ====TestForNullTest Case 
            29: start at 1358655702785
            30: Time used: 7 ms
            31: ====AtomicLongTest Case 
            32: start at 1358655702792
            33: Time used: 44 ms
            34: ====TroveTest Case 
            35: start at 1358655702836
            36: Time used: 17 ms
            37: ====MutableIntTest Case 
            38: start at 1358655702853
            39: Time used: 5 ms
            40: ====Guava AtomicLongMapTest Case 
            41: start at 1358655702858
            42: Time used: 9 ms
            43: ====Guava HashMultiSetTest Case 
            44: start at 1358655702868
            45: Time used: 50 ms
            46: ------------------------
            47: -----With 10000 cycles-----
            48: ====BaseLineTest Case 
            49: start at 1358655702918
            50: Time used: 16 ms
            51: ====TestForNullTest Case 
            52: start at 1358655702934
            53: Time used: 14 ms
            54: ====AtomicLongTest Case 
            55: start at 1358655702948
            56: Time used: 29 ms
            57: ====TroveTest Case 
            58: start at 1358655702977
            59: Time used: 10 ms
            60: ====MutableIntTest Case 
            61: start at 1358655702988
            62: Time used: 5 ms
            63: ====Guava AtomicLongMapTest Case 
            64: start at 1358655702993
            65: Time used: 15 ms
            66: ====Guava HashMultiSetTest Case 
            67: start at 1358655703009
            68: Time used: 77 ms
            69: ------------------------
            70: -----With 100000 cycles-----
            71: ====BaseLineTest Case 
            72: start at 1358655703086
            73: Time used: 124 ms
            74: ====TestForNullTest Case 
            75: start at 1358655703210
            76: Time used: 118 ms
            77: ====AtomicLongTest Case 
            78: start at 1358655703329
            79: Time used: 240 ms
            80: ====TroveTest Case 
            81: start at 1358655703569
            82: Time used: 102 ms
            83: ====MutableIntTest Case 
            84: start at 1358655703671
            85: Time used: 45 ms
            86: ====Guava AtomicLongMapTest Case 
            87: start at 1358655703716
            88: Time used: 126 ms
            89: ====Guava HashMultiSetTest Case 
            90: start at 1358655703842
            91: Time used: 98 ms
            92: ------------------------

          一般結(jié)論:?jiǎn)尉€程使用MutableInt,多線程使用guava的AtomicLongMap,其實(shí)可以看看guava對(duì)addAndGet的實(shí)現(xiàn),循環(huán),很有趣。

          最后總結(jié)一下,我們?cè)趯?duì)這個(gè)問(wèn)題做優(yōu)化的時(shí)候,明顯的思路就是減少方法調(diào)用,而MutableInt效率最高,明顯的是它將方法調(diào)用減少到最小——1次get,指針的威力頓時(shí)顯現(xiàn)。當(dāng)然實(shí)際業(yè)務(wù)代碼實(shí)現(xiàn)的時(shí)候還要考慮到多個(gè)因素,比如代碼可讀性,與業(yè)務(wù)結(jié)合等等,我們現(xiàn)實(shí)中不一定要追求如此的效率,但是也要避免毫無(wú)思考的寫下baseline里的代碼,因?yàn)槊黠@是可優(yōu)化的,why not?

          注:文中單個(gè)實(shí)現(xiàn)代碼來(lái)自stackoverflow的各個(gè)回答,測(cè)試代碼本人編寫。

          ref:

          本文提到的so的原帖:http://stackoverflow.com/questions/81346/most-efficient-way-to-increment-a-map-value-in-java

          trove:http://trove.starlight-systems.com/

          guava:http://code.google.com/p/guava-libraries/

          posted on 2013-01-20 12:40 changedi 閱讀(4662) 評(píng)論(0)  編輯  收藏 所屬分類: Java技術(shù)

          主站蜘蛛池模板: 益阳市| 长垣县| 彭阳县| 辰溪县| 文水县| 沂源县| 务川| 册亨县| 佛教| 喜德县| 沐川县| 沛县| 谢通门县| 娱乐| 淅川县| 湘西| 沁阳市| 临夏市| 长丰县| 美姑县| 长春市| 岚皋县| 彰武县| 竹山县| 施甸县| 交口县| 亳州市| 黎川县| 炉霍县| 科技| 宣武区| 永定县| 井冈山市| 泗阳县| 铜陵市| 富阳市| 霸州市| 巴南区| 金门县| 文安县| 瑞丽市|