USENIX ‘09
Best Paper:
Tolerating File-System Mistakes with EnvyFS
Lakshmi N. Bairavasundaram, NetApp, Inc.; Swaminathan Sundararaman, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, University of Wisconsin—Madison
NSDI ‘09
Best Paper:
TrInc: Small Trusted Hardware for Large Distributed Systems
Dave Levin, University of Maryland; John R. Douceur, Jacob R. Lorch, and Thomas Moscibroda, Microsoft Research
Best Paper:
Sora: High Performance Software Radio Using General Purpose Multi-core Processors
Kun Tan and Jiansong Zhang, Microsoft Research Asia; Ji Fang, Beijing Jiaotong University; He Liu, Yusheng Ye, and Shen Wang, Tsinghua University; Yongguang Zhang, Haitao Wu, and Wei Wang, Microsoft Research Asia; Geoffrey M. Voelker, University of California, San Diego
FAST ‘09
Best Paper:
CA-NFS: A Congestion-Aware Network File System
Alexandros Batsakis, NetApp and Johns Hopkins University; Randal Burns, Johns Hopkins University; Arkady Kanevsky, James Lentini, and Thomas Talpey, NetApp
OSDI ‘08
Jay Lepreau Best Paper:
Difference Engine: Harnessing Memory Redundancy in Virtual Machines
Diwaker Gupta, University of California, San Diego; Sangmin Lee, University of Texas at Austin; Michael Vrable, Stefan Savage, Alex C. Snoeren, George Varghese, Geoffrey M. Voelker, and Amin Vahdat, University of California, San Diego
Jay Lepreau Best Paper:
DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language
Yuan Yu, Michael Isard, Dennis Fetterly, and Mihai Budiu, Microsoft Research Silicon Valley; Úlfar Erlingsson, Reykjavík University, Iceland, and Microsoft Research Silicon Valley; Pradeep Kumar Gunda and Jon Currey, Microsoft Research Silicon Valley
Jay Lepreau Best Paper:
KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs
Cristian Cadar, Daniel Dunbar, and Dawson Engler, Stanford University
LISA ‘08
Best Paper:
ENAVis: Enterprise Network Activities Visualization
Qi Liao, Andrew Blaich, Aaron Striegel, and Douglas Thain, University of Notre Dame
Best Student Paper:
Automatic Software Fault Diagnosis by Exploiting Application Signatures
Xiaoning Ding, The Ohio State University; Hai Huang, Yaoping Ruan, and Anees Shaikh, IBM T.J. Watson Research Center; Xiaodong Zhang, The Ohio State University
USENIX Security ‘08
Best Paper:
Highly Predictive Blacklisting
Jian Zhang and Phillip Porras, SRI International; Johannes Ullrich, SANS Institute
Best Student Paper:
Lest We Remember: Cold Boot Attacks on Encryption Keys
J. Alex Halderman, Princeton University; Seth D. Schoen, Electronic Frontier Foundation; Nadia Heninger and William Clarkson, Princeton University; William Paul, Wind River Systems; Joseph A. Calandrino and Ariel J. Feldman, Princeton University; Jacob Appelbaum; Edward W. Felten, Princeton University
USENIX ‘08
Best Paper:
Decoupling Dynamic Program Analysis from Execution in Virtual Environments
Jim Chow, Tal Garfinkel, and Peter M. Chen, VMware
Best Student Paper:
Vx32: Lightweight User-level Sandboxing on the x86
Bryan Ford and Russ Cox, Massachusetts Institute of Technology
NSDI ‘08
Best Paper:
Remus: High Availability via Asynchronous Virtual Machine Replication
Brendan Cully, Geoffrey Lefebvre, Dutch Meyer, Mike Feeley, and Norm Hutchinson, University of British Columbia; Andrew Warfield, University of British Columbia and Citrix Systems, Inc.
Best Paper:
Consensus Routing: The Internet as a Distributed System
John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson, University of Washington; Arun Venkataramani, University of Massachusetts Amherst
LEET ‘08
Best Paper:
Designing and Implementing Malicious Hardware
Samuel T. King, Joseph Tucek, Anthony Cozzie, Chris Grier, Weihang Jiang, and Yuanyuan Zhou, University of Illinois at Urbana-Champaign
FAST ‘08
Best Paper:
Portably Solving File TOCTTOU Races with Hardness Amplification
Dan Tsafrir, IBM T.J. Watson Research Center; Tomer Hertz, Microsoft Research; David Wagner, University of California, Berkeley; Dilma Da Silva, IBM T.J. Watson Research Center
LISA ‘07
Best Paper:
Application Buffer-Cache Management for Performance: Running the World’s Largest MRTG
David Plonka, Archit Gupta, and Dale Carder, University of Wisconsin—Madison
Best Paper:
PoDIM: A Language for High-Level Configuration Management
Thomas Delaet and Wouter Joosen, Katholieke Universiteit Leuven, Belgium
16th USENIX Security Symposium
Best Paper:
Towards Automatic Discovery of Deviations in Binary Implementations with Applications to Error Detection and Fingerprint Generation
David Brumley, Juan Caballero, Zhenkai Liang, James Newsome, and Dawn Song, Carnegie Mellon University
Best Student Paper:
Keep Your Enemies Close: Distance Bounding Against Smartcard Relay Attacks
Saar Drimer and Steven J. Murdoch, Computer Laboratory, University of Cambridge
USENIX ‘07
Best Paper:
Hyperion: High Volume Stream Archival for Retrospective Querying
Peter Desnoyers and Prashant Shenoy, University of Massachusetts Amherst
Best Paper:
SafeStore: A Durable and Practical Storage System
Ramakrishna Kotla, Lorenzo Alvisi, and Mike Dahlin, The University of Texas at Austin
NSDI ‘07
Best Paper:
Life, Death, and the Critical Transition: Finding Liveness Bugs in Systems Code
Charles Killian, James W. Anderson, Ranjit Jhala, and Amin Vahdat, University of California, San Diego
Best Student Paper:
Do Incentives Build Robustness in BitTorrent?
Michael Piatek, Tomas Isdal, Thomas Anderson, and Arvind Krishnamurthy, University of Washington; Arun Venkataramani, University of Massachusetts Amherst
FAST ‘07
Best Paper:
Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?
Bianca Schroeder and Garth A. Gibson, Carnegie Mellon University
Best Paper:
TFS: A Transparent File System for Contributory Storage
James Cipar, Mark D. Corner, and Emery D. Berger, University of Massachusetts Amherst
LISA ‘06
Best Paper:
A Platform for RFID Security and Privacy Administration
Melanie R. Rieback, Vrije Universiteit Amsterdam; Georgi N. Gaydadjiev, Delft University of Technology; Bruno Crispo, Rutger F.H. Hofman, and Andrew S. Tanenbaum, Vrije Universiteit Amsterdam
Honorable Mention:
A Forensic Analysis of a Distributed Two-Stage Web-Based Spam Attack
Daniel V. Klein, LoneWolf Systems
OSDI ‘06
Best Paper:
Rethink the Sync
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn, University of Michigan
Best Paper:
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, Google, Inc.
15th USENIX Security Symposium
Best Paper:
Evaluating SFI for a CISC Architecture
Stephen McCamant, Massachusetts Institute of Technology; Greg Morrisett, Harvard University
Best Student Paper:
Keyboards and Covert Channels
Gaurav Shah, Andres Molina, and Matt Blaze, University of Pennsylvania
2006 USENIX Annual Technical Conference
Best Paper:
Optimizing Network Virtualization in Xen
Aravind Menon, EPFL; Alan L. Cox, Rice University; Willy Zwaenepoel, EPFL
Best Paper:
Replay Debugging for Distributed Applications
Dennis Geels, Gautam Altekar, Scott Shenker, and Ion Stoica, University of California, Berkeley
NSDI ‘06
Best Paper:
Experience with an Object Reputation System for Peer-to-Peer Filesharing
Kevin Walsh and Emin Gün Sirer, Cornell University
FAST ‘05
Best Paper:
Ursa Minor: Versatile Cluster-based Storage
Michael Abd-El-Malek, William V. Courtright II, Chuck Cranor, Gregory R. Ganger, James Hendricks, Andrew J. Klosterman, Michael Mesnier, Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen, John D. Strunk, Eno Thereska, Matthew Wachs, and Jay J. Wylie, Carnegie Mellon University
Best Paper:
On Multidimensional Data and Modern Disks
Steven W. Schlosser, Intel Research Pittsburgh; Jiri Schindler, EMC Corporation; Stratos Papadomanolakis, Minglong Shao, Anastassia Ailamaki, Christos Faloutsos, and Gregory R. Ganger, Carnegie Mellon University
LISA ‘05
Best Paper:
Toward a Cost Model for System Administration
Alva L. Couch, Ning Wu, and Hengky Susanto, Tufts University
Best Student Paper:
Toward an Automated Vulnerability Comparison of Open Source IMAP Servers
Chaos Golubitsky, Carnegie Mellon University
Best Student Paper:
Reducing Downtime Due to System Maintenance and Upgrades
Shaya Potter and Jason Nieh, Columbia University
IMC 2005
Best Student Paper:
Measurement-based Characterization of a Collection of On-line Games
Chris Chambers and Wu-chang Feng, Portland State University; Sambit Sahu and Debanjan Saha, IBM Research
Security ‘05
Best Paper:
Mapping Internet Sensors with Probe Response Attacks
John Bethencourt, Jason Franklin, and Mary Vernon, University of Wisconsin, Madison
Best Student Paper:
Security Analysis of a Cryptographically-Enabled RFID Device
Steve Bono, Matthew Green, and Adam Stubblefield, Johns Hopkins University; Ari Juels, RSA Laboratories; Avi Rubin, Johns Hopkins University; Michael Szydlo, RSA Laboratories
MobiSys ‘05
Best Paper:
Reincarnating PCs with Portable SoulPads
Ramón Cáceres, Casey Carter, Chandra Narayanaswami, and Mandayam Raghunath, IBM T.J. Watson Research Center
NSDI ‘05
Best Paper:
Detecting BGP Configuration Faults with Static Analysis
Nick Feamster and Hari Balakrishnan, MIT Computer Science and Artificial Intelligence Laboratory
Best Student Paper:
Botz-4-Sale: Surviving Organized DDoS Attacks That Mimic Flash Crowds
Srikanth Kandula and Dina Katabi, Massachusetts Institute of Technology; Matthias Jacob, Princeton University; Arthur Berger, Massachusetts Institute of Technology/Akamai
2005 USENIX Annual Technical Conference
General Track
Best Paper:
Debugging Operating Systems with Time-Traveling Virtual Machines
Samuel T. King, George W. Dunlap, and Peter M. Chen, University of Michigan
Best Student Paper:
Itanium—A System Implementor’s Tale
Charles Gray, University of New South Wales; Matthew Chapman and Peter Chubb, University of New South Wales and National ICT Australia; David Mosberger-Tang, Hewlett-Packard Labs; Gernot Heiser, University of New South Wales and National ICT Australia
At Google, we've learned through experience to treat everything with healthy skepticism. We expect that servers, racks, shared GFS cells, and even entire datacenters will occasionally go down, sometimes with little or no warning. This has led us to try as hard as possible to design our products to run on multiple servers, multiple cells, and even multiple datacenters simultaneously, so that they keep running even if any one (or more) redundant underlying parts go down. We call this multihoming. It's a term that usually applies narrowly, to networking alone, but we use it much more broadly in our internal language.
Multihoming is straightforward for read-only products like web search, but it's more difficult for products that allow users to read and write data in real time, like Gmail, Google Calendar, and App Engine. I've personally spent a while thinking about how multihoming applies to the App Engine datastore. I even gave a talk about it at this year's Google I/O.
While I've got you captive, I'll describe how multihoming currently works in App Engine, and how we're going to improve it with a release next week. I'll wrap things up with more detail about App Engine's maintenance schedule.
When we launched App Engine, the datastore served each application's data out of one datacenter at a time. Data was replicated to other datacenters in the background, using Bigtable replication.
For example, if the datastore was serving data for some apps from datacenter A, and we needed to switch to serving their data from datacenter B, we simply flipped the datastore to read-only mode, waited for Bigtable replication to flush any remaining writes from A to B, then flipped the switch back and started serving in read/write mode from B. This generally works well, but it depends on the Bigtable cells in both A and B being healthy. Of course, we wouldn't want to move to B if it was unhealthy, but we definitely would if B was healthy and A wasn't.

Planning for trouble

Google continuously monitors the overall health of App Engine's underlying services, like GFS and Bigtable, in all of our datacenters. However, unexpected problems can crop up from time to time. When that happens, having backup options available is crucial.

You may remember the unplanned outage we had a few months ago. We published a detailed postmortem; in a nutshell, the shared GFS cell we use went down hard, which took us down as well, and it took a while to get the GFS cell back up.

The GFS cell is just one example of the extent to which we use shared infrastructure at Google. Shared infrastructure is one of our greatest strengths, in my opinion, but it has its drawbacks. One of the most noticeable drawbacks is loss of isolation: when a piece of shared infrastructure has problems or goes down, it affects everything that uses it.

In the example above, if the Bigtable cell in A is unhealthy, we're in trouble. Bigtable replication is fast, but it runs in the background, so it's usually at least a little behind, which is why we wait for that final flush before switching to B. If A is unhealthy, some of its data may be unavailable for extended periods of time. We can't get to it, so we can't flush it, we can't switch to B, and we're stuck in A until its Bigtable cell recovers enough to let us finish the flush.

In extreme cases like this, we might not know how soon the data in A will become available. Rather than waiting indefinitely for A to recover, we'd like to have the option to cut our losses and serve out of B instead of A, even if it means a small, bounded amount of disruption to application data. Following our example, that extreme recovery scenario would go something like this:

We give up on flushing the most recent writes in A that haven't replicated to B, and switch to serving the data that is in B. Thankfully, there isn't much data in A that hasn't replicated to B, because replication is usually quite fast. It depends on the nature of the failure, but the window of unreplicated data usually only includes a small fraction of apps, and is often as small as a few thousand recent puts, deletes, and transaction commits, across all affected apps.

Naturally, when A comes back online, we can recover that unreplicated data, but if we've already started serving from B, we can't automatically copy it over from A, since there may have been conflicting writes in B to the same entities. If your app had unreplicated writes, we can at least provide you with a full dump of those writes from A, so that your data isn't lost forever. We can also provide you with tools to apply those unreplicated writes to your current datastore serving out of B relatively easily.

Unfortunately, Bigtable replication on its own isn't quite enough for us to implement the extreme recovery scenario above. We use Bigtable single-row transactions, which let us do read/modify/write operations on multiple columns in a row, to make our datastore writes transactional and consistent. Unfortunately, Bigtable replication operates at the column-value level, not the row level. This means that after a Bigtable transaction in A that updates two columns, one of the new column values could be replicated to B but not the other. If that happened, and we switched to B without flushing the other column value, the datastore would be internally inconsistent and difficult to recover to a consistent state without the data in A.
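To make the planned switch concrete, here is a minimal Python sketch of the procedure described above. The objects and method names (datastore.set_read_only, replication.pending_bytes, and so on) are hypothetical stand-ins, not App Engine or Bigtable internals.

```python
import time

def planned_datacenter_switch(datastore, replication, src="A", dst="B"):
    """Planned move of datastore serving from datacenter src to dst.

    Mirrors the procedure described above: flip to read-only, wait for the
    background Bigtable replication flush to drain, then serve from dst.
    All objects and methods here are hypothetical stand-ins.
    """
    datastore.set_read_only(True)               # stop accepting writes in src
    while replication.pending_bytes(src, dst) > 0:
        time.sleep(1)                           # wait for remaining writes to flush
    datastore.set_primary(dst)                  # start serving from dst
    datastore.set_read_only(False)              # back to read/write mode
```

The extreme recovery path is essentially this sketch with the flush loop abandoned, which is exactly where the window of unreplicated writes comes from.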
In our July 2nd outage, it was partly this expectation of internal inconsistency that prevented us from switching to datacenter B when A became unhealthy.

Megastore replication saves the day!

Thankfully, there's a solution to our consistency problem: Megastore replication. Megastore is an internal library on top of Bigtable that supports declarative schemas, multi-row transactions, secondary indices, and, recently, consistent replication across datacenters. The App Engine datastore uses Megastore liberally. We don't need all of its features - declarative schemas, for example - but we've been following the consistent replication feature closely during its development.

Megastore replication is similar to Bigtable replication in that it replicates data across multiple datacenters, but it replicates at the level of entire entity group transactions, not individual Bigtable column values. Furthermore, transactions on a given entity group are always replicated in order. This means that if Bigtable in datacenter A becomes unhealthy and we must take the extreme option to switch to B before all of the data in A has flushed, B will be consistent and usable. Some writes may be stuck in A and unavailable in B, but B will always be a consistent recent snapshot of the data in A. Some scattered entity groups may be stale, i.e., they may not reflect the most recent updates, but we'd at least be able to start serving from B immediately, as opposed to waiting for A to recover.

To Paxos or not to Paxos

Megastore replication was originally intended to replicate across multiple datacenters synchronously and atomically, using Paxos. Unfortunately, as I described in my Google I/O talk, the latency of Paxos across datacenters is simply too high for a low-level, developer-facing storage system like the App Engine datastore.

Because of that, we've been working with the Megastore team on an alternative: asynchronous, background replication similar to Bigtable's. This system maintains the write latency our developers expect, since it doesn't replicate synchronously (with Paxos or otherwise), but it's still consistent and fast enough that we can switch datacenters at a moment's notice with a minimum of unreplicated data.

Onward and upward

We've had a fully functional version of asynchronous Megastore replication for a while. We've been testing it heavily, working out the kinks, and stressing it to make sure it's as robust as possible. We've also been using it in our internal version of App Engine for a couple of months. I'm excited to announce that we'll be migrating the public App Engine datastore to use it in a couple of weeks, on September 22nd.

This migration does require some datastore downtime. First, we'll switch the datastore to read-only mode for a short period, probably around 20-30 minutes, while we do our normal data replication flush and roll forward any transactions that have been committed but not fully applied. Then, since Megastore replication uses a new transaction log format, we need to take the entire datastore down while we drop and recreate our transaction log columns in Bigtable. We expect this to take only a few minutes. After that, we'll be back up and running on Megastore replication!

As described, Megastore replication will make App Engine much more resilient to hiccoughs and outages in individual datacenters and significantly reduce the likelihood of extended outages. It also opens the door to two new options that will give developers more control over how their data is read and written.
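The difference in replication granularity is the crux of the argument, so here is a toy illustration in Python. It is not Bigtable or Megastore code; it only models "ship individual column values" versus "ship whole entity-group transactions in order," using made-up column names.

```python
def apply_txn_log(replica, txn_log):
    """Apply whole transactions, in commit order (Megastore-style model).

    The replica may lag behind the primary, but it never holds half of a
    transaction, so it is always a consistent (possibly stale) snapshot.
    """
    for txn in txn_log:            # each txn: dict of column -> new value
        replica.update(txn)        # all columns of one txn land together
    return replica

def apply_cell_updates(replica, cell_updates):
    """Apply individual column values as they arrive (Bigtable-style model).

    If replication stops mid-stream, the replica can hold one column of a
    transaction but not another: an internally inconsistent row.
    """
    for column, value in cell_updates:
        replica[column] = value
    return replica

# A transaction in A updated both columns, but only one value reached B.
primary = {"balance": 90, "last_txn": "t42"}
torn_b  = apply_cell_updates({"balance": 100, "last_txn": "t41"},
                             [("balance", 90)])   # {'balance': 90, 'last_txn': 't41'} - inconsistent
stale_b = apply_txn_log({"balance": 100, "last_txn": "t41"},
                        [])                       # lagging, but still consistent
```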
First, we're exploring allowing reads from the non-primary datastore if the primary datastore is taking too long to respond, which could decrease the likelihood of timeouts on read operations. Second, we're exploring full Paxos for write operations on an opt-in basis, guaranteeing that data is always synchronously replicated across datacenters, which would increase availability at the cost of additional write latency. Both of these features are speculative right now, but we're looking forward to allowing developers to make the decisions that fit their applications best!

Planning for scheduled maintenance

Finally, a word about our maintenance schedule. App Engine's scheduled maintenance periods usually correspond to shifts in primary application serving between datacenters. Our maintenance periods usually last for about an hour, during which application serving is continuous, but access to the Datastore and memcache may be read-only or completely unavailable.

We've recently developed better visibility into when we expect to shift datacenters. This information isn't perfect, but we've heard from many developers that they'd like more advance notice from App Engine about when these maintenance periods will occur. Therefore, we're happy to announce the preliminary maintenance schedule for the rest of 2009. We don't expect this information to change, but if it does, we'll notify you (via the App Engine Downtime Notify Google Group) as soon as possible.

The App Engine team members are personally dedicated to keeping your applications serving without interruption, and we realize that weekday maintenance periods aren't ideal for many of you. However, we've selected the day of the week and time of day for maintenance to balance disruption to App Engine developers with the availability of the full engineering teams of the services App Engine relies upon, like GFS and Bigtable. In the coming months, we expect features like Megastore replication to help reduce the length of our maintenance periods.
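As a rough idea of what the first speculative option might look like from an application's point of view, here is a hypothetical sketch. None of these names are real App Engine APIs, and the feature itself is described above as exploratory.

```python
# Hypothetical sketch of "fall back to a non-primary replica when the
# primary read is too slow." ReadTimeout, primary, and secondary are
# made-up stand-ins, not real App Engine datastore objects.

class ReadTimeout(Exception):
    pass

def read_with_fallback(key, primary, secondary, deadline_ms=200):
    """Try the primary datastore first; fall back to a possibly stale replica."""
    try:
        return primary.get(key, deadline_ms=deadline_ms)
    except ReadTimeout:
        # With asynchronous replication the replica is consistent but may lag
        # slightly, so callers must tolerate reading a slightly older value.
        return secondary.get(key, deadline_ms=deadline_ms)
```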
Source: http://hi.baidu.com/knuthocean/blog/item/12bb9f3dea0e400abba1673c.html