
          GP-GPU Reading Notes (5)

          Posted on 2008-02-15 19:51 by ZelluX. Category: Laboratory
          Mars: A MapReduce Framework on Graphics Processors
          by Bingsheng He @ Hong Kong Univ. of Sci. & Tech. 
              Naga K. Govindaraju @ Microsoft Corp.
              Qiong Luo, Tuyong Wang @ Sina Corp.

          Some key excerpts:
          1. Introduction
          Three challenges in implementing the MapReduce framework on the GPU:
          First, the synchronization overhead in the run-time system of the framework must be low.
          Second, a fine-grained load balancing scheme is required.
          Third, the core tasks of MapReduce, including string processing, file manipulation and concurrent reads and writes, are unconventional to GPUs and must be handled efficiently.
          Each thread is responsible for a Map or a Reduce task with a small number of key/value pairs as input.
          Performance improvement: 1.5-16 times faster than the CPU-based counterpart.
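
The idea that each thread handles one Map or Reduce task over a small number of key/value pairs can be sketched as an even split of pair indices across threads. This is only an illustration of fine-grained load balancing; the function name `assign_pairs` and the contiguous-chunk policy are assumptions, not the partitioning scheme Mars itself uses.

```python
def assign_pairs(num_pairs, num_threads):
    """Split pair indices 0..num_pairs-1 into one contiguous chunk per thread.

    Chunk sizes differ by at most one, so no thread gets much more work
    than another (hypothetical policy, for illustration only).
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    base, extra = divmod(num_pairs, num_threads)
    chunks = []
    start = 0
    for thread_id in range(num_threads):
        # The first `extra` threads take one additional pair each.
        size = base + (1 if thread_id < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks
```

With 10 pairs and 4 threads, the chunks have sizes 3, 3, 2 and 2.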

          2. Preliminary and Related Work
          2.1. Graphics Processors
          It is desirable to schedule the tasks between the CPU and the GPU to fully exploit their computation power.
          Given a kernel program, the occupancy of the GPU is the ratio of active schedule units to the maximum number of schedule units supported on the GPU.
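
The occupancy definition above is a plain ratio, which a few lines make concrete. The numbers in the usage note are made-up examples, not measurements from the paper.

```python
def occupancy(active_units, max_units):
    """Ratio of active schedule units to the maximum the GPU supports."""
    if max_units <= 0:
        raise ValueError("max_units must be positive")
    if not 0 <= active_units <= max_units:
        raise ValueError("active_units must lie between 0 and max_units")
    return active_units / max_units
```

For instance, a kernel that keeps 16 schedule units active on hardware that supports 24 has an occupancy of 16/24, about 0.67.
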
          The GPU has a hardware feature called coalesced access to exploit the spatial locality of memory accesses among threads.
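
A rough way to see why coalescing rewards spatial locality is to count how many memory segments a group of threads touches in one access. The sketch below is a simplified model, not the hardware rule: the 16-thread group, 4-byte words and 64-byte segments are assumed values, and real coalescing conditions differ between GPU generations.

```python
def segments_touched(word_indices, word_bytes=4, segment_bytes=64):
    """Count distinct memory segments hit by one access per thread.

    word_indices[t] is the index of the word that thread t reads.
    Fewer segments means the accesses can be served by fewer transactions.
    """
    return len({(i * word_bytes) // segment_bytes for i in word_indices})


threads = range(16)                    # assumed group size
contiguous = [t for t in threads]      # thread t reads word t
strided = [t * 16 for t in threads]    # thread t reads word 16 * t
```

Under these assumptions the contiguous pattern falls in a single segment, while the strided pattern touches one segment per thread.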

          2.2. GPGPU
          2.3. MapReduce
          Map: (k1, v1) -> (k2, v2)*
          Reduce: (k2, v2*) -> v3*
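
The two signatures above can be read off a small sequential word count. This is a sketch of the programming model only, with hypothetical names (`map_fn`, `reduce_fn`, `map_reduce`); it says nothing about how Mars schedules the work on the GPU.

```python
from collections import defaultdict


def map_fn(k1, v1):
    """Map: (k1, v1) -> (k2, v2)*. Emits (word, 1) for every word in the line."""
    for word in v1.split():
        yield word, 1


def reduce_fn(k2, values):
    """Reduce: (k2, v2*) -> v3*. Emits the total count for one word."""
    yield sum(values)


def map_reduce(records, map_fn, reduce_fn):
    """Run map, group intermediate values by key, then reduce each group."""
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    return {k2: list(reduce_fn(k2, vs)) for k2, vs in sorted(groups.items())}
```

Here the input key k1 is a line number and v1 the line text; the output maps each word to a one-element list, matching the v3* in the Reduce signature.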

          3. Design and Implementation
          3.1. Design Goals
          3.2. System Workflow and Configuration
          3.3. APIs
          3.4. Implementation Techniques
          Based on this compilation information and the total computation resources on the GPU, we set the number of threads per thread group and the number of thread groups to achieve a high occupancy at run time.
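
One way to picture this tuning step: take the per-thread register count reported by the compiler, check how many thread groups fit on a multiprocessor for each candidate group size, and keep the size that leaves the most threads resident. The sketch below is a simplified model under stated assumptions, not the Mars implementation. The `GpuLimits` values, the candidate sizes and the one-task-per-thread rule for the group count are all illustrative, and register allocation granularity is ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuLimits:
    # Assumed per-multiprocessor limits, for illustration only.
    max_threads: int = 768
    max_groups: int = 8
    registers: int = 8192
    shared_mem_bytes: int = 16384


def resident_threads(threads_per_group, regs_per_thread, smem_per_group, limits):
    """Threads that can be resident on one multiprocessor at once."""
    by_threads = limits.max_threads // threads_per_group
    by_regs = limits.registers // (regs_per_thread * threads_per_group)
    by_smem = (limits.shared_mem_bytes // smem_per_group
               if smem_per_group > 0 else limits.max_groups)
    groups = min(limits.max_groups, by_threads, by_regs, by_smem)
    return groups * threads_per_group


def launch_config(num_tasks, regs_per_thread, smem_per_group=0,
                  limits=GpuLimits(), candidates=(64, 128, 192, 256, 384, 512)):
    """Return (threads per group, number of groups, estimated occupancy)."""
    # Prefer the most resident threads; break ties with the smaller group size.
    threads = max(
        candidates,
        key=lambda t: (resident_threads(t, regs_per_thread, smem_per_group, limits), -t),
    )
    resident = resident_threads(threads, regs_per_thread, smem_per_group, limits)
    groups = math.ceil(num_tasks / threads)  # assumes one task per thread
    return threads, groups, resident / limits.max_threads
```

With these assumed limits, a kernel using 10 registers per thread can reach full occupancy with 128 threads per group, while one using 16 registers per thread tops out at two thirds.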

          4. Evaluation
          4.1. Experimental Setup

