
          GP-GPU Reading Notes (5)

          Posted on 2008-02-15 19:51 by ZelluX | Category: Laboratory
          Mars: A MapReduce Framework on Graphics Processors
          by Bingsheng He @ Hong Kong Univ. of Sci. & Tech. 
              Naga K. Govindaraju @ Microsoft Corp.
              Qiong Luo, Tuyong Wang @ Sina Corp.

          Key excerpts:
          1. Introduction
          Three challenges in implementing the MapReduce framework on the GPU:
          First, the synchronization overhead in the run-time system of the framework must be low.
          Second, a fine-grained load balancing scheme is required.
          Third, the core tasks of MapReduce, including string processing, file manipulation and concurrent reads and writes, are unconventional to GPUs and must be handled efficiently.
          Each thread is responsible for a Map or a Reduce task with a small number of key/value pairs as input.
          Reported performance improvement over the CPU-based counterpart: 1.5-16 times
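
          The one-task-per-thread idea above can be illustrated with a small sketch. These notes do not record how Mars actually divides the input, so the even contiguous split below and the name assign_tasks are assumptions made only for illustration.

```python
def assign_tasks(num_pairs, num_threads):
    """Split num_pairs key/value pairs into contiguous chunks, one per thread.

    Hypothetical helper: returns a list of (start, end) index ranges,
    where end is exclusive and chunk sizes differ by at most one pair.
    """
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    base, extra = divmod(num_pairs, num_threads)
    ranges = []
    start = 0
    for tid in range(num_threads):
        # The first `extra` threads take one additional pair each.
        size = base + (1 if tid < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

          For example, assign_tasks(10, 4) gives [(0, 3), (3, 6), (6, 8), (8, 10)], so every thread handles only two or three pairs.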

          2. Preliminary and Related Work
          2.1. Graphics Processors
          It is desirable to schedule the tasks between the CPU and the GPU to fully exploit their computation power.
          Given a kernel program, the occupancy of the GPU is the ratio of active schedule units to the maximum number of schedule units supported on the GPU.
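
          The occupancy definition is a plain ratio, which the following sketch spells out. The function name and the sample numbers are illustrative assumptions, not values from the paper.

```python
def occupancy(active_units, max_units):
    """Ratio of active schedule units to the maximum the GPU supports."""
    if max_units <= 0:
        raise ValueError("max_units must be positive")
    if not 0 <= active_units <= max_units:
        raise ValueError("active_units must lie in [0, max_units]")
    return active_units / max_units
```

          With 12 active schedule units out of a hypothetical maximum of 24, occupancy(12, 24) is 0.5.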
          The GPU has a hardware feature called coalesced access to exploit the spatial locality of memory accesses among threads.
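
          A rough way to see why spatial locality matters is to count how many aligned memory segments a group of threads touches in one access. The model below is a simplification: the 64-byte segment size, the group of 16 threads, and both function names are assumptions chosen for illustration, not hardware facts taken from these notes.

```python
def thread_addresses(num_threads, stride_bytes, base=0):
    """Byte address read by each thread when thread i reads base + i * stride."""
    return [base + tid * stride_bytes for tid in range(num_threads)]


def segments_touched(addresses, segment_bytes=64):
    """Count the distinct aligned segments covered by the given addresses.

    Fewer segments suggests the accesses could be combined into fewer
    memory transactions in this simplified model.
    """
    return len({addr // segment_bytes for addr in addresses})
```

          In this model, 16 threads reading consecutive 4-byte words (stride 4) touch a single segment, while the same threads with a stride of 64 bytes touch 16 separate segments.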

          2.2. GPGPU
          2.3. MapReduce
          Map: (k1, v1) -> (k2, v2)*
          Reduce: (k2, v2*) -> v3*
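
          The two signatures can be made concrete with a sequential word-count sketch. This is ordinary single-threaded Python, not the Mars API; the names map_fn, reduce_fn and map_reduce are hypothetical.

```python
from collections import defaultdict


def map_fn(k1, v1):
    """Map: (k1, v1) -> (k2, v2)*. Here (doc id, text) -> (word, 1) pairs."""
    return [(word, 1) for word in v1.split()]


def reduce_fn(k2, values):
    """Reduce: (k2, v2*) -> v3*. Here the counts of one word are summed."""
    return [sum(values)]


def map_reduce(inputs, map_fn, reduce_fn):
    """Run Map on every input pair, group by k2, then run Reduce per key."""
    groups = defaultdict(list)
    for k1, v1 in inputs:
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    return {k2: reduce_fn(k2, vs) for k2, vs in sorted(groups.items())}
```

          Calling map_reduce([(0, "a b a"), (1, "b c")], map_fn, reduce_fn) returns {"a": [2], "b": [2], "c": [1]}.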

          3. Design and Implementation
          3.1. Design Goals
          3.2. System Workflow and Configuration
          3.3. APIs
          3.4. Implementation Techniques
          Based on this compilation information and the total computation resources on the GPU, we set the number of threads per thread group and the number of thread groups to achieve a high occupancy at run time.
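
          One plausible way to turn compilation information into a launch configuration is sketched below. It assumes the only compile-time input is registers per thread and that the hardware limits are known; the function names, the candidate sizes and all numbers in the example are hypothetical, and the notes do not say that Mars uses this exact procedure.

```python
def pick_threads_per_group(regs_per_thread, regs_per_sm, max_threads_per_sm,
                           max_groups_per_sm, candidates=(64, 128, 256, 512)):
    """Pick the group size that keeps the most threads resident.

    Returns (threads_per_group, resident_groups, occupancy) for one
    multiprocessor; ties are resolved in favour of the smaller group size.
    """
    best = None
    for tpg in candidates:
        by_regs = regs_per_sm // (regs_per_thread * tpg)   # register limit
        by_threads = max_threads_per_sm // tpg             # thread limit
        groups = min(by_regs, by_threads, max_groups_per_sm)
        occ = groups * tpg / max_threads_per_sm
        if best is None or occ > best[2]:
            best = (tpg, groups, occ)
    return best


def num_groups_for(num_tasks, threads_per_group):
    """Number of thread groups needed so every task gets its own thread."""
    return -(-num_tasks // threads_per_group)  # ceiling division
```

          With made-up limits of 8192 registers, 768 threads and 8 groups per multiprocessor, and a kernel using 10 registers per thread, pick_threads_per_group(10, 8192, 768, 8) returns (128, 6, 1.0), and num_groups_for(100000, 128) returns 782.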

          4. Evaluation
          4.1. Experimental Setup

