
          GP-GPU Reading Notes (5)

          Posted on 2008-02-15 19:51 by ZelluX. Category: Laboratory
          Mars: A MapReduce Framework on Graphics Processors
          by Bingsheng He @ Hong Kong Univ. of Sci. & Tech. 
              Naga K. Govindaraju @ Microsoft Corp.
              Qiong Luo, Tuyong Wang @ Sina Corp.

          Some key points:
          1. Introduction
          Three challenges in implementing the MapReduce framework on the GPU:
          First, the synchronization overhead in the run-time system of the framework must be low.
          Second, a fine-grained load balancing scheme is required.
          Third, the core tasks of MapReduce, including string processing, file manipulation and concurrent reads and writes, are unconventional to GPUs and must be handled efficiently.
          Each thread is responsible for a Map or a Reduce task with a small number of key/value pairs as input.
          Performance improvement: 1.5-16 times
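To make the "one thread, a few key/value pairs" idea concrete, here is a minimal sketch that splits the input records evenly over a fixed number of threads. The function name `partition` and the contiguous, near-equal split are my own assumptions for illustration, not necessarily the scheme Mars uses.

```python
def partition(num_records, num_threads):
    """Split record indices [0, num_records) into contiguous, near-equal chunks.

    Hypothetical helper: chunk sizes differ by at most one record, so no
    thread gets noticeably more work than another.
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    base, extra = divmod(num_records, num_threads)
    chunks = []
    start = 0
    for tid in range(num_threads):
        # The first `extra` threads take one additional record each.
        size = base + (1 if tid < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    for tid, chunk in enumerate(partition(10, 4)):
        print(tid, list(chunk))
```

With many more threads than processors, each chunk stays small, which is what makes the balancing fine-grained.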

          2. Preliminary and Related Work
          2.1. Graphics Processors
          It is desirable to schedule the tasks between the CPU and the GPU to fully exploit their computation power.
          Given a kernel program, the occupancy of the GPU is the ratio of active schedule units to the maximum number of schedule units supported on the GPU.
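The occupancy definition above is just a ratio, which a few lines of Python can spell out. The function name and the example numbers (16 active units out of 24) are assumptions for illustration, not figures from the paper.

```python
def occupancy(active_units, max_units):
    """Occupancy = active schedule units / maximum schedule units supported."""
    if max_units <= 0:
        raise ValueError("max_units must be positive")
    if not 0 <= active_units <= max_units:
        raise ValueError("active_units must lie in [0, max_units]")
    return active_units / max_units


if __name__ == "__main__":
    # Hypothetical kernel: 16 schedule units active, hardware supports 24.
    print(occupancy(16, 24))
```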
          The GPU has a hardware feature called coalesced access to exploit the spatial locality of memory accesses among threads.
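A rough way to see why coalescing matters is to count how many aligned memory segments a group of threads touches in one access. The sketch below is a simplified model under my own assumptions (a 64-byte segment, one address per thread); real hardware rules differ between GPU generations.

```python
def thread_addresses(num_threads, stride_bytes, base=0):
    """Byte address accessed by each thread, for a fixed stride between threads."""
    return [base + tid * stride_bytes for tid in range(num_threads)]


def segments_touched(addresses, segment_bytes=64):
    """Number of distinct aligned segments covered by the given addresses.

    Fewer segments means the accesses can be served by fewer memory
    transactions in this simplified model.
    """
    return len({addr // segment_bytes for addr in addresses})


if __name__ == "__main__":
    # Neighbouring threads read neighbouring 4-byte words: good spatial locality.
    print(segments_touched(thread_addresses(16, 4)))
    # Each thread jumps 64 bytes ahead of the previous one: scattered accesses.
    print(segments_touched(thread_addresses(16, 64)))
```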

          2.2. GPGPU
          2.3. MapReduce
          Map: (k1, v1) -> (k2, v2)*
          Reduce: (k2, v2*) -> v3*
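The two signatures above can be illustrated with the usual word-count example. This is a minimal sequential sketch in Python, not Mars code; `map_fn`, `reduce_fn` and `run_mapreduce` are names I made up for the illustration.

```python
from collections import defaultdict


def map_fn(k1, v1):
    """Map: (k1, v1) -> (k2, v2)*. Emits (word, 1) for every word in the line."""
    return [(word, 1) for word in v1.split()]


def reduce_fn(k2, values):
    """Reduce: (k2, v2*) -> v3*. Sums the counts collected for one word."""
    return [sum(values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Apply map to every record, group the results by k2, then reduce each group."""
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    return {k2: reduce_fn(k2, v2s) for k2, v2s in sorted(groups.items())}


if __name__ == "__main__":
    lines = [(0, "a b a"), (1, "b c")]
    print(run_mapreduce(lines, map_fn, reduce_fn))
```

The grouping step in the middle is where the framework, rather than the user, does the work; the user only supplies the two functions.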

          3. Design and Implementation
          3.1. Design Goals
          3.2. System Workflow and Configuration
          3.3. APIs
          3.4. Implementation Techniques
          Based on this compilation information and the total computation resources on the GPU, we set the number of threads per thread group and the number of thread groups to achieve a high occupancy at run time.
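A sketch of how such a configuration step could work: take the per-thread register count reported by the compiler, try each candidate group size, and keep the one with the highest occupancy. The hardware limits below (768 threads, 8 groups and 8192 registers per multiprocessor, group sizes in multiples of 32) are illustrative assumptions, and the search itself is my guess at the idea, not the paper's actual procedure.

```python
# Hypothetical per-multiprocessor limits, for illustration only.
MAX_THREADS = 768
MAX_GROUPS = 8
REGISTERS = 8192
WARP_SIZE = 32
MAX_GROUP_SIZE = 512


def resident_groups(group_size, regs_per_thread):
    """How many thread groups fit on one multiprocessor at the same time."""
    by_threads = MAX_THREADS // group_size
    by_registers = REGISTERS // (regs_per_thread * group_size)
    return min(MAX_GROUPS, by_threads, by_registers)


def choose_group_size(regs_per_thread):
    """Return (threads per group, occupancy) with the highest occupancy.

    Ties are resolved in favour of the smaller group size.
    """
    best_size, best_occupancy = None, -1.0
    for size in range(WARP_SIZE, MAX_GROUP_SIZE + 1, WARP_SIZE):
        active_threads = resident_groups(size, regs_per_thread) * size
        occ = active_threads / MAX_THREADS
        if occ > best_occupancy:
            best_size, best_occupancy = size, occ
    return best_size, best_occupancy


if __name__ == "__main__":
    for regs in (10, 16):
        print(regs, choose_group_size(regs))
```

A kernel that needs more registers per thread leaves room for fewer resident threads, so the best reachable occupancy drops even when the group size is tuned.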

          4. Evaluation
          4.1. Experimental Setup

