Includes various papers, surveys, and workshop talks.
To read:
Memory Resource Management in VMware ESX Server, Carl A. Waldspurger, OSDI 02
A Performance Study of General Purpose Applications on Graphics Processors, First Workshop on General Purpose Processing on GPU
The Google File System, Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, SOSP 2003
Parallelization:
1. Automatic Parallelization for Graphic Processing Units, Alan Leung, 6th Workshop on Compiler-Driven Performance
Automatically identifies parallelizable code in Java programs and runs it on the GPU; uses the RapidMind library.
2. Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping, Nathan Clark et al., University of Michigan & Cambridge, HPCA 07
Skimmed.
3. Parallel Computing: What has changed lately?, David B. Kirk, NVIDIA Corporation 2007
CUDA is much easier to use than CTM.
4. Data-Parallel Programming on the Cell BE and the GPU using the RapidMind Development Platform, GSPx Multicore Applications Conference
Skimmed.
5. Mars: A MapReduce Framework on Graphics Processors
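A MapReduce library developed jointly by HKUST, Microsoft, and Yahoo. Running the map operations on the GPU improves performance considerably, though I suspect the reduce operations would be much slower.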
6. A Survey of General-Purpose Computation on Graphics Hardware, EUROGRAPHICS 2005
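Very comprehensive: it covers the general flow of GPU computing, related algorithms, debugging tools, and more. Some of the GPU shortcomings it mentions have since been addressed.
7. HMPP: A Hybrid Multi-core Parallel Programming Environment, CAPS entreprise
Skimmed. Quite similar to OpenMP; it has a set of HMPP directives.
8. Streamware: Programming General-Purpose Multicore Processors Using Streams, Jayanth Gummaraju et al., Stanford University, ASPLOS 08
Read closely. A programming platform for several parallel architectures (multicore CPUs, GPGPU) that manages the cache hierarchy very well (through this platform, some programs speed up even when run on a single processor).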
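The map/reduce split mentioned in the note on Mars (item 5) can be sketched as follows. This is only a CPU-side illustration of the general MapReduce pattern, not Mars's API; every function name here is hypothetical. The point it shows is that each map call is independent of the others (which is what makes that phase a natural fit for a GPU), while the reduce phase first needs all values for a key gathered together.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to each record independently and collect (key, value) pairs."""
    pairs = []
    for record in records:
        # No call depends on any other, so these could all run in parallel.
        pairs.extend(map_fn(record))
    return pairs


def group_by_key(pairs):
    """Gather all values emitted for the same key (the 'shuffle' step)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Combine the values of each key into a single result."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical word-count job used as the example workload.
def word_count_map(line):
    return [(word, 1) for word in line.split()]


def word_count_reduce(key, values):
    return sum(values)


def word_count(lines):
    pairs = map_phase(lines, word_count_map)
    return reduce_phase(group_by_key(pairs), word_count_reduce)
```

Calling `word_count(["a b a", "b c"])` counts each word across both lines. The grouping step is the part that requires coordination between records, which is one plausible reason to expect reduce to benefit less from a GPU than map does.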
Compilers:
1. Open Research Compiler (ORC): Beyond Version 1.0, PACT 02
Mainly read the parts on Loop Nest Optimization and Interprocedural Optimization.
Virtualization:
1. SubVirt: Implementing malware with virtual machines, Samuel T. King et al., IEEE SP 06
Reading notes:
http://www.aygfsteel.com/zellux/archive/2008/05/05/198564.html
http://www.aygfsteel.com/zellux/archive/2008/05/06/198693.html
2. Compatibility is Not Transparency: VMM Detection Myths and Realities, Stanford University, VMware, UBC/XenSource, Carnegie Mellon, HotOS 07
Talks attended:
1. Accelerating Two-Dimensional PageWalks for Virtualized Systems, Ravi Bhargava et al.
Presented by CC at our group meeting. I was lost from the Shadow Page Table part onward =_=. Listing it here anyway as a placeholder.
2. Inter-domain Socket Communications Supporting High Performance and Full Binary Compatibility on Xen, Kangho Kim et al., VEE 08
Uses a new mechanism, XWAY, to improve network transfer rates between different domains on Xen. It is limited to TCP, though.
3. Dispersing Proprietary Applications as Benchmarks through Code Mutation, Luk Van Ertvelde & Lieven Eeckhout, ASPLOS 08
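A really fun paper. The programs used in benchmarks today are usually open-source or free software, whereas commercial software raises copyright issues.
The paper describes how mutating the binary code can change what the software does without affecting the time it takes when run as a benchmark.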
4. Threads Cannot Be Implemented As a Library, ACM SIGPLAN 2005
Uses examples to illustrate some of the flaws in implementing threads as a library (e.g. pthread).