
          GP-GPU Reading Notes (5)

          Posted on 2008-02-15 19:51 by ZelluX. Category: Laboratory
          Mars: A MapReduce Framework on Graphics Processors
          by Bingsheng He @ Hong Kong Univ. of Sci. & Tech. 
              Naga K. Govindaraju @ Microsoft Corp.
              Qiong Luo, Tuyong Wang @ Sina Corp.

          Some key excerpts:
          1. Introduction
          Three challenges in implementing the MapReduce framework on the GPU:
          First, the synchronization overhead in the run-time system of the framework must be low.
          Second, a fine-grained load balancing scheme is required.
          Third, the core tasks of MapReduce, including string processing, file manipulation and concurrent reads and writes, are unconventional to GPUs and must be handled efficiently.
          Each thread is responsible for a Map or a Reduce task with a small number of key/value pairs as input.
          Performance improvement: 1.5 to 16 times over the CPU-based counterpart
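
To make the "one thread, a small number of key/value pairs" idea concrete, here is a minimal sketch in Python. It is not the scheme used by Mars; the function name `partition_records` and the even contiguous split are assumptions chosen only for illustration. Each simulated thread gets a chunk of record indices, and chunk sizes differ by at most one.

```python
def partition_records(num_records, num_threads):
    """Split record indices into contiguous chunks, one chunk per thread.

    Chunk sizes differ by at most one record, so no thread receives
    noticeably more input than another. (Hypothetical helper, not Mars code.)
    """
    base, extra = divmod(num_records, num_threads)
    chunks = []
    start = 0
    for t in range(num_threads):
        # The first `extra` threads take one additional record.
        size = base + (1 if t < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


chunks = partition_records(10, 4)
print([list(c) for c in chunks])
# prints [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

An even split by record count only balances the load when records cost about the same to process; with variable-length keys and values, a real system would probably need to balance by bytes or estimated work instead.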

          2. Preliminary and Related Work
          2.1. Graphics Processors
          It is desirable to schedule the tasks between the CPU and the GPU to fully exploit their computation power.
          Given a kernel program, the occupancy of the GPU is the ratio of active schedule units to the maximum number of schedule units supported on the GPU.
          The GPU has a hardware feature called coalesced access to exploit the spatial locality of memory accesses among threads.
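
The effect of coalesced access can be approximated with a toy model: count how many aligned memory segments a group of threads touches in one step. The sketch below is a simplification, not a description of any specific GPU. The 64-byte segment size, the group of 16 threads, and the function name `count_transactions` are all assumptions, and the actual coalescing rules differ between hardware generations.

```python
def count_transactions(addresses, segment_bytes=64):
    """Count the distinct aligned memory segments touched by a thread group.

    Fewer segments roughly means fewer memory transactions.
    (Toy model; the segment size is an assumed value.)
    """
    return len({addr // segment_bytes for addr in addresses})


THREADS = 16  # assumed size of the group of threads served together
WORD = 4      # bytes per element

# Thread i reads element i: neighbouring threads hit neighbouring addresses.
contiguous = [i * WORD for i in range(THREADS)]

# Thread i reads element i * 16: addresses are 64 bytes apart.
strided = [i * 16 * WORD for i in range(THREADS)]

print(count_transactions(contiguous))  # prints 1
print(count_transactions(strided))     # prints 16
```

In this model the contiguous pattern is served by a single segment, while the strided pattern needs one segment per thread, which is why data layouts that let neighbouring threads read neighbouring addresses tend to be preferred.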

          2.2. GPGPU
          2.3. MapReduce
          Map: (k1, v1) -> (k2, v2)*
          Reduce: (k2, v2*) -> v3*
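
The two signatures above can be read off a small sequential word count. This is a plain Python sketch of the programming model only, not of the Mars API; `map_fn`, `reduce_fn`, and `run_mapreduce` are names made up for this example.

```python
from collections import defaultdict


def map_fn(k1, v1):
    """Map: (k1, v1) -> (k2, v2)*. Here: (line number, line) -> (word, 1)*."""
    return [(word, 1) for word in v1.split()]


def reduce_fn(k2, values):
    """Reduce: (k2, v2*) -> v3*. Here: (word, counts) -> [total]."""
    return [sum(values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Apply map_fn to every record, group by k2, then apply reduce_fn."""
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    return {k2: reduce_fn(k2, v2s) for k2, v2s in groups.items()}


records = [(0, "gpu map reduce"), (1, "map on gpu")]
result = run_mapreduce(records, map_fn, reduce_fn)
print(result)
# prints {'gpu': [2], 'map': [2], 'reduce': [1], 'on': [1]}
```

The grouping step between Map and Reduce is what turns the `(k2, v2)*` stream into the `(k2, v2*)` input expected by Reduce.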

          3. Design and Implementation
          3.1. Design Goals
          3.2. System Workflow and Configuration
          3.3. APIs
          3.4. Implementation Techniques
          Based on this compilation information and the total computation resources on the GPU, we set the number of threads per thread group and the number of thread groups to achieve a high occupancy at run time.
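
One way to picture this configuration step is a small search over candidate group sizes, scoring each by the occupancy it would give. The sketch below is an assumption about how such a search could look, not the procedure from the paper. The hardware limits (768 resident threads, 8 resident groups, 8192 registers per multiprocessor, 32 threads per schedule unit) are illustrative defaults, and `occupancy` and `pick_threads_per_group` are hypothetical helpers. Registers per thread stand in for the "compilation information".

```python
WARP_SIZE = 32  # assumed number of threads per schedule unit


def occupancy(threads_per_group, regs_per_thread,
              max_threads=768, max_groups=8, total_regs=8192):
    """Ratio of active schedule units to the maximum supported.

    All hardware limits are assumed example values for one multiprocessor.
    """
    regs_per_group = threads_per_group * regs_per_thread
    groups_by_regs = total_regs // regs_per_group
    groups_by_threads = max_threads // threads_per_group
    # The tightest resource limit decides how many groups are resident.
    resident_groups = min(groups_by_regs, groups_by_threads, max_groups)
    active_warps = resident_groups * (threads_per_group // WARP_SIZE)
    max_warps = max_threads // WARP_SIZE
    return active_warps / max_warps


def pick_threads_per_group(regs_per_thread,
                           candidates=(64, 128, 192, 256, 384, 512)):
    """Return the first candidate group size with the highest occupancy."""
    return max(candidates, key=lambda t: occupancy(t, regs_per_thread))


print(occupancy(128, 10))          # prints 1.0
print(pick_threads_per_group(10))  # prints 128
print(pick_threads_per_group(16))  # prints 64
```

With 10 registers per thread, several group sizes reach full occupancy under these assumed limits and the search returns the smallest; with 16 registers per thread, the register file becomes the bottleneck and no candidate exceeds two thirds.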

          4. Evaluation
          4.1. Experimental Setup

